Abstract
Despite empirical mode decomposition (EMD) becoming a de facto standard for time-frequency analysis of nonlinear and non-stationary signals, its multivariate extensions are only emerging; yet, they are a prerequisite for direct multichannel data analysis. An important step in this direction is the computation of the local mean, as the concept of local extrema is not well defined for multivariate signals. To this end, we propose to use real-valued projections along multiple directions on hyperspheres (n-spheres) in order to calculate the envelopes and the local mean of multivariate signals, leading to multivariate extension of EMD. To generate a suitable set of direction vectors, unit hyperspheres (n-spheres) are sampled based on both uniform angular sampling methods and quasi-Monte Carlo-based low-discrepancy sequences. The potential of the proposed algorithm to find common oscillatory modes within multivariate data is demonstrated by simulations performed on both hexavariate synthetic and real-world human motion signals.
1. Introduction
The empirical mode decomposition (EMD) algorithm is a fully data-driven method designed for multiscale decomposition and time-frequency analysis of real-world signals (Huang et al. 1998), whereby the original signal is modelled as a linear combination of intrinsic oscillatory modes, called intrinsic mode functions (IMFs). The IMFs are defined so as to exhibit locality in time and to represent a single oscillatory mode; the subsequent application of Hilbert transform provides meaningful instantaneous frequency estimates (the so-called Hilbert–Huang transform) (Huang & Shen 2005). Owing to no a priori assumptions regarding the data nature, EMD has found applications in the analysis of nonlinear and non-stationary signals (e.g. Duffy 2004; Gautama et al. 2004; Janosi & Muller 2005; Wu & Hu 2006; Huang & Wu 2008; Lin et al. 2009).
The recent advances in physics and engineering have brought to light new problems dealing with complexity, uncertainty, nonlinearity and multichannel dynamics (Gautama et al. 2004; Mandic & Goh 2009); these signals are, however, almost invariably processed channel-wise (e.g. Duffy 2004; Janosi & Muller 2005; Wu & Hu 2006; Huang & Wu 2008). Extensions of standard EMD to multivariate signals are, therefore, a prerequisite for the accurate data-driven time-frequency analysis of such processes. In addition, joint analysis of multiple oscillatory components within a higher dimensional signal also helps to circumvent the mode alignment problem^{1} (Looney & Mandic 2009) experienced with standard EMD; in the complex domain, this has proven to facilitate, e.g., the synchronization of multichannel EEG signals and image fusion (Mandic et al. 2008; Looney & Mandic 2009). Recent multivariate extensions of EMD include those suitable for the processing of bivariate (e.g. Tanaka & Mandic 2006; Altaf et al. 2007; Rilling et al. 2007) and trivariate (Rehman & Mandic in press) signals; however, general original n-variate extensions of EMD are still lacking, and are subject of this work.
The key issue in EMD algorithm is the computation of the local mean of the original signal, a step which depends critically on finding the local extrema. However, for multivariate signals, this is not straightforward; for instance, the complex and quaternion fields are not ordered (Mandic & Goh 2009). We propose to alleviate this problem by using multiple real-valued projections of the signal; the extrema of such projected signals are then interpolated component-wise to yield the desired multidimensional envelopes of the signal. In our proposed multivariate extension of EMD, we choose a suitable set of direction vectors in n-dimensional spaces by using: (i) uniform angular coordinates and (ii) low-discrepancy pointsets stemming from quasi-Monte Carlo methods. It is shown that a set of direction vectors based on uniform sampling in the angular coordinate system is convenient to deal with; however, it yields non-uniformly distributed direction vectors. The approach based on low-discrepancy pointsets (Niederreiter 1992) provides a more uniform distribution of direction vectors (Cui & Freeden 1997), and hence more accurate local mean estimates in n-dimensional spaces.
This paper is organized as follows: bivariate and trivariate extensions of EMD are first discussed in §2. Section 3 introduces the proposed multivariate EMD method and analyses choices for a set of direction vectors in n-dimensional spaces. Section 4 illustrates the mode alignment property of the proposed method on a synthetic hexavariate signal and on multivariate processing of real-world orientation data.
2. Existing multivariate extensions of EMD
EMD is a fully data-driven method for the multiscale analysis of nonlinear and non-stationary real-world signals (Huang et al. 1998). It decomposes the original signal into a finite set of amplitude- and/or frequency-modulated (AM/FM) components, termed IMFs, which represent its inherent oscillatory modes. More specifically, for a real-valued signal x(k), the standard EMD finds a set of N IMFs , and a monotonic residue signal r(k), so that 2.1 To ensure well-behaved intrinsic oscillations, IMFs c_{i}(k) are defined so as to have symmetric upper and lower envelopes, with the number of zero crossings and the number of extrema differing at most by one. An iterative process called the sifting algorithm is employed to extract IMFs; for illustration, a sifting procedure for obtaining the first IMF from the signal is outlined in algorithm 1.
Algorithm 1. The standard EMD algorithm. |
---|
1. Find the locations of all the extrema of . |
2. Interpolate (using cubic spline interpolation) between all the minima (respectively maxima) to obtain the lower signal envelope, e_{min}(k) (respectively e_{max}(k)). |
3. Compute the local mean m(k)=[e_{min}(k)+e_{max}(k)]/2. |
4. Subtract the mean from the signal to obtain the ‘oscillatory mode’ . |
5. If s(k) obeys the stopping criteria, then we define d(k)=s(k) as an IMF, otherwise set and repeat the process from step 1. |
Once the first IMF is obtained, the same procedure is applied iteratively to the residual r(k)=x(k)−d(k) to extract the remaining IMFs. The standard stopping criterion terminates the sifting process only after the above condition for an IMF is met for S consecutive times (Huang et al. 2003).
(a) Bivariate/complex extensions of EMD
The first complex extension of EMD was proposed by Tanaka & Mandic (2006); it employed the concept of analytical signal and subsequently applied standard EMD to analyse complex/bivariate data; however, this method cannot guarantee an equal number of real and imaginary IMFs, thus limiting its applications. An extension of EMD which operates fully in the complex domain was first proposed by Altaf et al. (2007), termed rotation-invariant EMD (RI-EMD). The extrema of a complex/bivariate signal are chosen to be the points where the angle of the derivative of the complex signal becomes zero, that is, based on the change in the phase of the signal. The signal envelopes are produced by using component-wise spline interpolation, and the local maxima and minima are then averaged to obtain the local mean of the bivariate signal.
The RI-EMD algorithm uses effectively only the extrema of the imaginary part of the complex signal, which results in envelopes based on only two projected directions. An algorithm which gives more accurate values of the local mean is the bivariate EMD (BEMD) (Rilling et al. 2007), where the envelopes corresponding to multiple directions in the complex plane are generated, and then averaged to obtain the local mean. The set of direction vectors for projections are chosen as equidistant points along the unit circle. The zero mean rotating components embedded in the input bivariate signal then become bivariate/complex-valued IMFs. The RI-EMD and BEMD algorithms are equivalent for K=4 direction vectors.
(b) Trivariate EMD
An extension of EMD to trivariate signals has been recently proposed by Rehman & Mandic (in press); the estimation of the local mean and envelopes of a trivariate signal is performed by taking projections along multiple directions in three-dimensional spaces. To generate a set of multiple direction vectors in a three-dimensional space, a lattice is created by taking equidistant points on multiple longitudinal lines on the sphere (obtaining the so-called ‘equi-longitudinal lines’). The three-dimensional rotating components are thus embedded within the input signal as pure quaternion IMFs, thus benefitting from the desired rotation and orientation modelling capability of quaternion algebra.
3. The proposed n-variate EMD algorithm
In real-valued EMD, the local mean is computed by taking an average of upper and lower envelopes, which in turn are obtained by interpolating between the local maxima and minima. However, in general, for multivariate signals, the local maxima and minima may not be defined directly.^{2} Moreover, the notion of ‘oscillatory modes’ defining an IMF is rather confusing for multivariate signals. To deal with these problems, we propose to generate multiple n-dimensional envelopes by taking signal projections along different directions in n-dimensional spaces; these are then averaged to obtain the local mean. This idea of mapping an input multivariate signal into multiple real-valued ‘projected’ signals, to generate multidimensional envelopes, is a generalization of the concept employed in existing bivariate (Rilling et al. 2007) and trivariate (Rehman & Mandic in press) extensions of EMD, yielding n-dimensional rotational modes via the corresponding multivariate IMFs. However, the issue of choosing a suitable set of direction vectors for taking signal projections in n-dimensional spaces needs special attention.
(a) Direction vectors on an n-sphere
The calculation of the local mean can be considered an approximation of the integral of all the envelopes along multiple directions in an n-dimensional space, and the accuracy of this approximation is dependent on how uniformly the direction vectors are chosen, especially for a limited number of direction vectors. As the direction vectors in n-dimensional spaces can be equivalently represented as points on the corresponding unit (n−1) spheres,^{3} therefore, the problem of finding a suitable set of direction vectors can be treated as that of finding a uniform sampling scheme on an n sphere.
(i) Uniform angular sampling
A simple and practically convenient choice for a set of direction vectors is to employ uniform angular sampling of a unit sphere in an n-dimensional hyperspherical coordinate system. The resulting set of direction vectors covers the whole (n−1) sphere, as shown in figure 1a for a particular example of a two-sphere. For the generation of a pointset on an (n−1) sphere, consider the n sphere with centre point C and radius R, given by 3.1 A coordinate system in an n-dimensional Euclidean space can then be defined to serve as a pointset (and the corresponding set of direction vectors) on an (n−1) sphere. Let {θ_{1},θ_{2},…,θ_{(n−1)}} be the (n−1) angular coordinates, then an n-dimensional coordinate system having as the n coordinates on a unit (n−1) sphere is given by 3.2
The pointset corresponding to the n-dimensional coordinate system is now very convenient to generate; however, for n>1, it does not provide a uniform sampling distribution, as illustrated by a higher density of the points when approaching the poles of a two-sphere (figure 1a).
(ii) Sampling based on low-discrepancy pointsets
We next employ the concept of discrepancy to generate a uniform pointset on an n sphere. Discrepancy can be regarded as a quantitative measure for the irregularity (non-uniformity) of a distribution, and may be used for the generation of the so-called ‘low discrepancy pointset’, leading to a more uniform distribution on the n sphere. It belongs to the class of quasi-Monte Carlo methods (Niederreiter 1992), which are particularly important in numerical integration problems where they ensure smaller error bounds as compared with the standard Monte Carlo methods.^{4} As the computation of local mean via envelope averaging can be seen as a numerical integration problem, we can use quasi-Monte Carlo techniques to estimate the local mean within multivariate extension of EMD. This way, the low-discrepancy sequences yield a more uniformly distributed set of direction vectors for generating signal projections and the corresponding signal envelopes ensuring accurate local mean estimates.
A convenient method for generating multidimensional ‘low-discrepancy’ sequences involves the family of Halton and Hammersley sequences, which are proven to show considerable improvement, in terms of error bounds, over standard Monte Carlo methods (Niederreiter 1992). Moreover, the set of direction vectors generated by the Hammersley sequence also yields improved generalized discrepancy estimates as compared with other sampling methods, and hence are uniformly distributed on a sphere (Cui & Freeden 1997). To generate the Hammersley sequence, let x_{1},x_{2},…,x_{n} be the first n prime numbers, then the ith sample of a one-dimensional Halton sequence, denoted by , is given by 3.3 where base-x representation of i is given by 3.4 Starting from i=0, the ith sample of the Halton sequence then becomes 3.5 The Hammersley sequence is used when the total number of samples n is known a priori; in this case, the ith sample within the Hammersley sequence is calculated as 3.6 For illustration, figures 1b and 2 show, respectively, the pointsets on the surface of the sphere (two-sphere) and hypersphere (three-sphere) generated by the low-discrepancy Hammersley sequence. Observe that, as desired, the points generated by the low-discrepancy method are more uniformly distributed. In figure 2, ideally, the pointset should be plotted on a three-sphere; however, for visualization purposes, we can only use three two-spheres.
The Halton and Hammersley sequence-based pointsets are convenient to generate; however, their performance may decrease with an increase in the number of dimensions. To alleviate this problem, (t,s) sequences and (t,m,s) nets (Niederreiter 1992) may be used. Once a suitable set of direction vectors on the n sphere is generated (by using any of the above methods), projections of the input signal are calculated along this set; the extrema of such projected signals are interpolated component-wise to yield the desired multidimensional envelopes of the signal. The multiple envelope curves, each corresponding to a particular direction vector, are then averaged to obtain the multivariate signal mean.
Consider a sequence of n-dimensional vectors which represents a multivariate signal with n components, and denoting a set of direction vectors along the directions given by angles on an (n−1) sphere. Then, the proposed multivariate extension of EMD suitable for operating on general nonlinear and non-stationary n-variate time series is summarized in algorithm 2.
Algorithm 2. Multivariate extension of EMD. |
---|
1. Choose a suitable pointset for sampling on an (n−1) sphere. |
2. Calculate a projection, denoted by , of the input signal along the direction vector x^{θk}, for all k (the whole set of direction vectors), giving as the set of projections. |
3. Find the time instants corresponding to the maxima of the set of projected signals . |
4. Interpolate to obtain multivariate envelope curves . |
5. For a set of K direction vectors, the mean m(t) of the envelope curves is calculated as |
6. Extract the ‘detail’ d(t) using d(t)=x(t)−m(t). If the ‘detail’ d(t) fulfills the stoppage criterion for a multivariate IMF, apply the above procedure to x(t)−d(t), otherwise apply it to d(t). |
The stoppage criterion for multivariate IMFs is similar to that proposed by Huang et al. (2003), the difference being that the condition for equality of the number of extrema and zero crossings is not imposed, as extrema cannot be properly defined for multivariate signals (Mandic & Goh 2009).
4. Simulations
Simulations were conducted on both a synthetic signal and a real-world multivariate inertial body motion recording. For all the signals, the low-discrepancy Hammersley sequence was used to generate a set of K=512 direction vectors for taking signal projections.
(a) Mode alignment using multivariate IMFs
Similarly to bivariate (Rilling et al. 2007) and trivariate (Rehman & Mandic in press) extensions of EMD, we will now show that the proposed n-variate extension of EMD has the ability to align ‘common scales’ present within multivariate data. Each ‘common scale’ is manifested in the common oscillatory modes in all the variates within an n-variate IMF. Such mode alignment property helps to make use of similar scales in different data sources, and hence, can be used for data fusion purposes (Mandic et al. 2008). To illustrate the mode alignment property of the proposed method, we analysed a synthetic hexavariate time series; each component (variate), shown in the top row of figure 3 (denoted by U,V,W,X,Y and Z), was constructed from a set of four sinusoids. One sinusoid was made common to all the components, whereas the remaining three sinusoidal components were combined so that the resulting signal had a common frequency mode in each UVWY , UVX and UWXZ components. The proposed n-variate EMD extension was then applied to the resulting hexavariate signal yielding multiple IMFs, as shown in figure 3. Observe that the sinusoid common to all the components of the input is the third IMF, whereas the remaining three frequency modes were also accurately extracted in the respective IMFs. Notice the similar separability of the tones as with standard EMD; for more detail, see Rilling & Flandrin (2008). Such mode alignment cannot be achieved by the real-valued EMD applied component-wise, as it generally does not yield the same number of IMFs per component.
(b) Rotational modes extraction from real-world signals
To illustrate the ability of the proposed method to extract common modes within multivariate real-world signals, we next considered body motion data recorded in a Tai Chi sequence. The data were captured using two inertial three-dimensional sensors attached to the left hand and the left ankle; these were combined to form a single hexavariate signal. The common rotational modes were found within multiple hexavariate IMFs, and the components corresponding to the hand and the ankle were plotted separately as three-dimensional plots in figure 4. Observe that each such IMF represents a unique rotational mode embedded within the original trivariate signal. Unlike when applying the trivariate EMD method separately on each three-dimensional signal, our proposed method guarantees the extraction of common rotational modes, as the direct analysis of a hexavariate signal results in matched IMFs (both in number and in frequency scale).
5. Conclusions
An extension of EMD has been proposed to cater for a general class of n-variate signals. The critical step of envelope interpolation is performed by taking projections of the multivariate signal along multiple directions on an n sphere. In addition, the use of low discrepancy pointset gives uniformly distributed direction vectors on an n sphere and makes the resulting method accurate and computationally efficient. It has been shown that the proposed method has the ability to extract common rotational modes across the signal components, making it suitable, for example, for fusion of information from multiple sources. Simulations on synthetic and real-world hexavariate data support the analysis.
Footnotes
↵1 Mode alignment in multivariate data corresponds to finding a set of common scales/modes across different components (variates) of a multivariate signal, thus ensuring that the IMFs are matched both in the number and in scale properties.
↵2 For instance, the fields of complex numbers and quaternions are not ordered, and relations such as ‘<’ and ‘>’ do not make sense.
↵3 An n sphere (hypersphere) is an extension of the ordinary sphere to an arbitrary dimension and is represented mathematically in equation (3.1). We adopt the terminology that an n sphere resides in an (n+1)-dimensional Euclidean coordinate system.
↵4 Standard Monte Carlo methods, using independent random samples, can also be used to develop multivariate extensions of EMD. However, this would only result in a probabalistic error bound; thus, any two applications of the algorithm with similar input and parameters would, in general, yield different decompositions.
- Received September 25, 2009.
- Accepted November 20, 2009.
- © 2009 The Royal Society