## Abstract

The Gaussian smoothing method is shown to have a wide transition zone around the cut-off frequency selected to filter a given dataset. We proposed two iterative Gaussian smoothing methods to tighten the transition zone: one being approximately diffusive and the other being strictly diffusive. The first version smoothes repeatedly the remaining high-frequency parts and the second version requires an additional step to further smooth the resulting smoothed response in each of the smoothing operation. Based on the choice of the criterion for accuracy, the smoothing factor and the number of iterations are derived for an infinite data length in both methods. By contrast, for a finite-length data string, results of the interior points (sufficiently away from the two endpoints) obtained by both methods can be shown to exhibit an approximate diffusive property. The upper bound of the distance affected by the error propagation inward due to the lack of data beyond the two ends is numerically estimated. Numerical experiments also show that results of employing the iterative Gaussian smoothing method are almost the same as those obtained by the strict diffusive version, except that the error propagation distance induced by the latter is slightly deeper than that of the former. The proposed method has been successfully applied to decompose the wave formation of a number of test cases including two sets of real experimental data.

## 1. Introduction

The least-squares method is widely used in many mathematical, physical and engineering applications (Marr & Hildreth 1980; Lancaster & Salkauskas 1986; Jeng 2000; Fasshauer 2002; Liew *et al*. 2002). When the dataset to be analysed is complicated, a lower degree polynomial approximation is not sufficient to capture the profile trends and a high-degree polynomial is frequently suggested. However, a high-degree polynomial approximation is more complicated to implement and it can sometimes give rise to undesirable oscillations.

If the least-squares method is defined locally at every point, the method is called the moving least-squares method (Lancaster & Salkauskas 1986). The method can be applied in both discrete and continuous datasets, but in this work we shall consider only the former cases. Moreover, the moving least-squares methods can be used in conjunction with a weighting kernel of Gaussian exponential function to eliminate unnecessary spurious oscillation (Marr & Hildreth 1980; Mackworth & Mokhtarian 1984; Mokhtarian & Mackworth 1986; Lowe 1989; Weiss 1994; Jeng 2000; Fasshauer 2002; Liew *et al*. 2002; Jeng & Cheng 2004). If the approximated polynomial is of the zeroth degree, the method becomes the conventional Gaussian smoothing method (Mackworth & Mokhtarian 1984; Mokhtarian & Mackworth 1986; Lowe 1989). Among many available weighting kernels for the zeroth-order moving least-squares method, the Gaussian exponential function has the following two useful properties. First, by a careful selection of an appropriate smoothing factor, *σ*, as defined in the Gaussian function exp [−*x*^{2}/(2*σ*^{2})], there exists a desired threshold frequency above which the waveforms can be effectively filtered out. Second, the relation between the original data and resulting smoothed data is akin to that observed between the initial temperature distribution and the exact solution at a corresponding instant of the equation for unsteady one-dimensional heat conduction (Carslaw & Jaeger 1957). In other words, the Gaussian smoothing method is closely related to the spontaneous diffusion process of nature and therefore obeys the second law of thermodynamics (Morse & Feshbach 1953).

The signal decomposition using the wavelet transform has been successfully applied to many fields (Hess-Nielsen & Wickerhauser 1996; Goswami & Chan 1999; Sonka *et al*. 1999; Gonzalez & Woods 2002; Misiti *et al*. 2007). The wavelet transform employs a weighting kernel function defined within a finite interval to resolve the local spectrum. Although many filters using the wavelet kernels perform very well, there is no evidence to show that they are non-diffusive. The other class of signal decomposition involves the Laplacian and Gaussian pyramids that are widely employed in the field of image processing (Burt & Adelson 1983; Sonka *et al*. 1999; Gonzalez & Woods 2002). While these methods aim to provide a significant data reduction for imagining procession, it is possible that the present approach can be used as a basic tool for the data reduction. In addition, there are a number of other methods proposed to decompose a time-series dataset consisting of many waves with different and variable wavelengths; for example, the matrix pencil method (Hua & Sarkar 1989; Ruscio 1997) and empirical mode decomposition method (Huang *et al*. 1998, 1999). However, except for the wavelet filters, a complete discussion of these topics is not within the scope of the present paper.

In §2*a*, the Gaussian smoothing method in terms of the zeroth-order moving least-squares method is illustrated. The diffusive property and transition zone of the resulting response are also discussed. In §2*b*,*c*, two iterative Gaussian smoothing methods are proposed, one with and one without strict diffusive properties, respectively. Their resulting smoothed part, which is defined as the difference between the original data and final high-frequency part, is shown to have a narrower transition zone than that obtained by employing the original Gaussian smoothing method. The applications of these iterative Gaussian smoothing methods for discrete finite-length data string are discussed in §2*d*. Error estimations by applying the smoothing methods discussed in §2*b* to finite-length data string are also shown in §2*d*. A list of the iterative procedure is shown in §2*e*. Finally, numerical tests of iterative Gaussian methods for a number of cases are included in §3.

## 2. Theoretical development

### (a) Gaussian smoothing method for a discrete data string of infinite range

Consider a set of infinite data (*x*_{i}, *y*_{i}), *x*_{i}=*i*Δ*x*, −*∞*<*i*<*∞*, to be approximated by a polynomial. The error targeting function of the weighted moving least-squares method is defined at every data point *x*_{j}, −*∞*<*j*<*∞* as(2.1)The moving least-squares method requires that *I*_{j} is a minimum with respect to *A*_{0,j}, …, *A*_{M,j}. Thus, a set of simultaneous algebraic equations will have to be solved at every point *x*=*x*_{j} (Marr & Hildreth 1980; Jeng 2000; Fasshauer 2002; Liew *et al*. 2002). If the polynomial contains only the zero-degree term (*M*=0), the result becomes the well-known Gaussian smoothing method (Mackworth & Mokhtarian 1984; Mokhtarian & Mackworth 1986; Lowe 1989), which takes the following form explicitly:(2.2)As Δ*x*→0, *k* can be replaced by the factor of and the above formula becomes the explicit form shown in Marr & Hildreth (1980), Mackworth & Mokhtarian (1984), Mokhtarian & Mackworth (1986), Lowe (1989), Weiss (1994), Jeng (2000), Fasshauer (2002) and Liew *et al*. (2002).

Assume *y*_{j} can be expressed in the following discrete form:(2.3)where *λ*_{l} is the wavelength of the *l*th mode. For a finite data string with a period *T*, *λ*_{l}=*T*/*l*, the upper limit of the summation is a finite integer and equation (2.3) is just a discrete Fourier series expansion. It should be pointed out that although we use equation (2.3) as an example in the following discussion, it is not always possible to identify such a period, *T*, for a real data string because not all data are periodical by their nature. Therefore, filters defined on the spectral domain may introduce both amplitude attenuation and phase error because it may not be possible to filter a real dataset on the spectral domain precisely. By contrast, it will be shown that the proposed filtering procedure, which is operated on the time domain, works very well without knowing the exact form of the spectrum, involving a prior knowledge of the values of *c*_{l}'s, *d*_{l}'s and *λ*_{l}'s.

After applying the Gaussian smoothing method, it can be shown that the response is(2.4)where(2.5)By approximating the summation over all *i*'s of equation (2.5) with a proper integration formula and making use of the definition of *k*, equation (2.2), one can deduce the following relationship:(2.6)which satisfies the following inequalities:(2.7)where *ϵ* is a positive machine round-off error. The second inequality of equation (2.7) appears to be obvious because a cosine function is always less than 1. The first inequality of equation (2.7) can be verified only by extensive numerical tests. It reflects the possibility that, for modes having *σ*≫*λ*_{l}, the exponential function of equation (2.6) is a positive value close to 0 but the error term, *O*(Δ*x*^{2}), may take a small negative value.

If the data are continuous, the diffusive property of the Gaussian smoothing method is(2.8)After applying the smoothing step to the low-frequency modes, the original *y*_{i} can be cast into a sum of the smooth part and the high-frequency part . This implies that the smoothing operation can be considered as a filter. An ideal filter is infinitely sharp such that, there exists a cut-off wavelength, *λ*_{c}, whereby does not contain any Fourier mode whose wavelength *λ* is less than *λ*_{c} and does not have any mode whose wavelength *λ* is greater than *λ*_{c}. Unfortunately, most time-domain filter operations are not ideal and therefore there exists an open interval, , where one cannot separate and into two distinct regions. This open interval is referred to as the transition zone of the filter. Considering the Gaussian smoothing method as an example, one may identify a region, , being the corresponding transition zone, where *ϵ*_{a} is a small positive constant. Unfortunately, as shown in figure 1*a*, the transition zone of the Gaussian smoothing can be too wide to be useful. A careful inspection of this figure reveals that(2.9)This implies that, when *σ*>1.7*λ*, the wave can be almost completely filtered; when *σ*<0.025*λ*, the wave can be completely retained; when *σ* has a value in between 1.7*λ* and 0.025*λ*, it is a transition region where the wave can only be partially retained (or filtered). For example, figure 1*b* shows that the response of a single sine function with *λ*=1 via the Gaussian smoothing method with *σ*=0.2 has led to a smoothed profile showing *a*(0.2, 1)≈0.45404. Unfortunately, the transition zone of the Gaussian smoothing method is rather wide, and it is only useful to decompose two wave forms with a wavelength ratio larger than 40 : 0.6. This makes the method impractical for filtering or decomposing any real data. In order to derive a smoothing method having a narrower transition zone, the following iterative scheme is proposed.

### (b) Iterative Gaussian smoothing method for a discrete data string of infinite range

By denoting *S*{*y*_{j}} as an operator for applying the Gaussian smoothing method to *y*_{j} and renaming on the l.h.s. of equation (2.2) as , where the subscript 1 denotes the smoothed part of the first cycle, one yields(2.10)where is the remaining high-frequency part of the first cycle. Likewise, the remaining high-frequency part is smoothed again to obtain yet again the smooth and high-frequency parts of the second cycle, say and , respectively. Repeating the procedure up to *m*th cycle, the corresponding smooth and high-frequency parts are, respectively,(2.11)The accumulated smooth parts, , are related to the final high-frequency part through the following relation:(2.12)For convenience, we rewrite the high-frequency part of equation (2.11) as(2.13)where(2.14)It can be shown that(2.15)Figure 2 shows the variation of with respect to the ratio for a range of iteration steps, *m*. The line with *m*=1 shows the attenuation factor of the low-frequency part generated by applying the Gaussian smoothing method once. As the iteration step, *m*, increases, the whole transition zone shifts to the left and its width becomes narrower and narrower. Since the lower bound of is ±*ϵ*, as shown in equation (2.7), the upper bound of will have a value of where . This may cause some problems that the high-frequency wave may manifest itself during iterations and hence lead to a divergence of the filtering procedure. We shall discuss a fix to this problem in §2*b*.

Suppose we applied the iterative Gaussian smoothing method to a data string having two distinct wavelengths, *λ*_{1} and *λ*_{2}, where *λ*_{1}>*λ*_{2}. If we would like to retain *λ*_{1}-wave while filter *λ*_{2}-wave, it is intuitive to require that(2.16)where the parameter *δ* can take an arbitrarily small value. The solution of this set of simultaneous equations will give the value of the smoothing factor *σ* and the number of cycles, *m*, required to perform the decomposition of the two waves. A plot of these solutions is depicted in figure 3*a*,*b*, for the value of *σ*/*λ*_{1} and *m*, respectively. In §3 we will show how to apply the iterative Gaussian smoothing method to separate two distinct waves.

Note that, as the value of *λ*_{2}/*λ*_{1} becomes smaller than 2 (whose corresponding parameters are *δ*=0.001, *σ*/*λ*_{1}≈0.7715 and *m*≈127), the required number of iteration cycles increases exponentially. For example, when *λ*_{2}/*λ*_{1}=1.5, 8134 iteration steps are needed to separate the two waves while when *λ*_{2}/*λ*_{1}=1.25, *m*≈4 615 005. These numbers can also be affected by the choice of the value of *δ*. As it can be seen from table 1, if the accuracy increases by an order of magnitude, the required value of *σ*/*λ*_{1} increases slightly while the iteration steps *m* increase approximately three to four orders of magnitude.

It is interesting to see that the underlying principle of the proposed iteration method is not very different from the classical iteration methods. For example, the filtering is operated on the smooth part in the current method while in the Ray & Ray iterative Gaussian scheme (Ray & Ray 1995) and the Gaussian pyramid (Burt & Adelson 1983; Sonka *et al*. 1999; Gonzalez & Woods 2002; Misiti *et al*. 2007), it is the high-frequency part that is repeatedly smoothed. Owing to the fact that the result of applying the Gaussian smoothing method can be looked upon as the exact solution of the one-dimensional time-dependent heat conduction problem, the result of one smoothing step can be interpreted as the corresponding solution of the one-dimensional time-dependent heat conduction problem at the same time instant. The method is therefore akin to a time-marching scheme for solving a partial differential equation by repeatedly reducing the residue to zero in a sequential manner.

### (c) An iterative Gaussian smoothing method with a strict diffusive property

As shown in equation (2.15), the upper bound of has a value of which may cause some problems. In order to make the iterative procedure satisfy a strict diffusive property such that , the Gaussian smoothing method is employed to smooth of equation (2.2) once more and denotes the resulting smoothed part as ,(2.17)where *S*^{2}{*y*_{j}} denotes the application of the Gaussian smoothing method, firstly to *y*_{j} and then to once again. The *m*th cycle's results are(2.18)Owing to the strict non-negative property of *a*^{2}, the factor satisfies(2.19)In other words, this iterative procedure is a strict diffusive smoothing method, and the corresponding variation of this new attenuation factor, , with respect to *λ*/*σ* is exactly the same as the one shown in figure 2, provided that *σ* is replaced by .

The estimated smoothing factor *σ*_{d} and iteration *m*_{d} (the subscript ‘d’ stands for the double smoothing that leads to the strict diffusive method) are solved from equation (2.16) with the factor *a*(*σ*, *λ*_{l}) replaced by *a*^{2}(*σ*, *λ*_{l}) and the factor 2 of the exponential function replaced by 4, respectively. The corresponding parameters and *m*_{d} for different values of *δ* are exactly the same as shown in figure 3*a*,*b*, respectively. It is noted that the required computing time of employing the strict diffusive iterative Gaussian smoothing method is twice that for the iterative Gaussian smoothing method.

### (d) Filtering a discrete data string with finite length

In practice, data are not periodic and therefore they cannot be interpreted as infinite data strings. The extension of the two proposed iterative Gaussian smoothing methods to a data string of finite length is discussed in this section.

For a finite-range data string, say (*x*_{j}, *y*_{j}), *j*=0, 1, 2, …, *n*, the corresponding discrete Fourier expansion is(2.20)The application of the Gaussian smoothing method, which is the zeroth-order moving least-squares method for a finite data length, will give(2.21)Unlike *k* of equation (2.2), is not a constant but changes with respect to *x*_{j} now. Following the similar transform to obtain equation (2.4), the following relations are obtained:(2.22)If *x*_{m}>5*σ* where , the magnitude of the Gaussian functions, , is of the order of *O*(−6) and is of no significance and hence equation (2.22) can be considered to be unaffected by the finite-length assumption of the data(2.23)As a result, equation (2.22) can be written as(2.24)On the other hand when data points are close to the boundary, the upper bound of the difference between smoothing for finite data points and infinite data points, and , respectively, can be estimated as(2.25)where *Φ* is the error function and in which and the index, *i*≈0, implies the points in the vicinity of region where .

When the iterative Gaussian smoothing method is employed, the error estimation at the *m*-step can be approximated by the following recurrent equation:(2.26)where *e*_{j,m−1} is the estimated error generated in the previous iteration cycle and *e*_{j,0}=0 initially. The resulting upper bound distributions with respect to iteration cycles are shown in figure 4. For the diffusive iterative Gaussian smoothing method, the error bound estimation is exactly the same provided that the iteration number *m* in equation (2.26) is interpreted as the accumulated number of smoothing. The penetration distances for different error levels estimated for both the approximate and strict diffusive methods are depicted in figure 4 and table 2. It can be seen that the penetration distance increases with the increasing number of iterations in an approximately exponential manner. It is noted that figure 4 and table 2 give the maximum possible errors and therefore the estimated distance of influence can be much larger than real numerical tests. This exercise only serves to give an indication of how the error propagates with iterations.

### (e) The algorithm for the iterative smoothing procedure

The algorithm for the proposed iterative procedure can be summarized as follows.

Select the scheme of the iterative Gaussian smoothing method—be it a diffusive or a strict diffusive procedure.

Determine the transition zone, say

*λ*_{1}and*λ*_{2}. It is recommended to allow the transition zone to satisfy*λ*_{2}/*λ*_{1}≥2 as discussed in §2*b*.Determine values of

*σ*and*m*by solving equation (2.16). If the strict diffusive version is employed, the resulting*σ*is replaced by .Repeatedly apply the corresponding iterative method with the parameter

*σ*for*m*iteration cycles.The resulting high-frequency part is the desired high-frequency part and the difference between the original data and the high-frequency part becomes the smooth part.

## 3. Results and discussions

In order to examine the proposed iteration procedure, the following composite waveform is made up of two independent waves having wavelengths 0.5 and 0.25, respectively:(3.1)If *δ*=0.001 and the iterative Gaussian smoothing method are chosen, table 1 suggests that the critical *σ* and *m* are 0.193 and 127, respectively. In order to simulate the infinite-domain limit, the calculation is taken in the range of 0≤*x*_{j}≤10 and −10≤*x*_{i}≤20 with Δ*x*=0.005 where *x*_{i} and *x*_{j} are defined in equations (2.1) and (2.2), respectively. After all the data at *x*_{j}'s are evaluated in every cycle, they are periodically mapped to those points *x*_{i} in the range of −10<*x*_{i}<0 and 10<*x*_{i}<20, respectively. These conditions ensure that for all |*x*_{i}−*x*_{j}|>10 and 0≤*x*_{j}≤10 simultaneously. Figure 5 shows the resulting accumulated smooth part of the 1st, 5th, 10th, 20th and 50th iteration cycles. It is clear that the smooth part gradually approaches the original long-wave mode. The converging history is shown as the dotted line in figure 6*a*, which shows an exponential decay of the *ℓ*_{2} error and confirms the estimation of the attenuation factor *a*(*σ*, *λ*_{l}) in equation (2.6). The final *ℓ*_{2} error is 0.0009794 while the *ℓ*_{∞} error of 0.001726 shows that the maximum error will be slightly larger than *δ*=0.001. Figure 6*a* also shows the convergent histories for several different *δ* values, all of which show similar exponential convergent behaviour. In figure 6*b*, there is a solid line that shows a non-convergent behaviour with Δ*x*=1/7 and all the other parameters are the same as that of the dotted line (fine-grid solution). Numerical experiments show that if Δ*x*<1/8, the convergent history is almost the same as that of the dotted line of figure 6*b*. It seems that a coarse resolution of the original data will affect the accuracy of the wave decomposition procedure.

If the strict diffusive iterative Gaussian smoothing method is chosen, keeping all the rest parameters except *σ*=0.13647, the corresponding result coincides with that of figure 5 and the converging history is almost equal to the dotted line shown in figure 6*a*. The final *ℓ*_{2} error is 0.0009795 and the *ℓ*_{∞} error is 0.001726, which are almost equal to that of the single smoothing per cycle. Table 3 shows the comparison of both versions for different values of *δ*, provided that the grid size is fine enough. It seems that both the *ℓ*_{2} and *ℓ*_{∞} errors are the same for most *δ*'s, except a small difference for the case with *δ*=0.000001. Besides, all the *ℓ*_{2} and *ℓ*_{∞} errors are also approximately equal to the corresponding values of *δ*.

Now the finite-domain wave decomposition for the waveform of equation (3.1) is examined. The data range of 0≤*x*_{i}, *x*_{j}≤20 is employed where data in the ranges of *x*<0 and *x*>20 are not considered. Figure 7*a*,*b* shows detailed results of 20th and 50th cycles for both versions of employing the iterative Gaussian smoothing method (*σ*=0.193) and diffusive version (*σ*=0.13746), respectively, where other parameters use values of Δ*x*=0.005 *δ*=0.001, *m*=127, etc. Obviously, except at the region around the endpoint where *x*=0, both solutions coincide with each other. The corresponding final error plots of two versions are plotted in figure 8*a*,*b*, respectively. Again, their differences are insignificantly small in most regions.

Table 4 shows the region affected by the endpoints for different values of *δ*. It is clear that the diffusive iterative Gaussian smoothing method results in a slightly deeper influence region (for cases with *δ*=0.01–0.00001 with moderate accuracy) than that using the iterative Gaussian smoothing method. For the highly accurate one with *δ*=0.000001, the diffusive version is approximately 20% deeper than the approximate diffusive version.

The second test case adds random number to the function of equation (3.1) so that it becomes(3.2)where the function *r*′(*x*) is the high-frequency part of *r*(*x*) and *r*(*x*) is the random number generated by the random number function of the Microsoft F-77 software. At first, a long enough data of *r*(*x*) are filtered by the iterative Gaussian smoothing method with *σ*=0.0175428 and *m*≈17 to drop those modes whose *λ*>0.025. That *r*′(*x*) employed in equation (3.2) is taken from the central part of the resulting high-frequency part. A typical result shown in figure 9 is corresponding to that of figure 5. All the other data are almost the same as those shown in figures 6 and 7 and tables 3 and 4. Since the spectrum of the random number is non-zero for all high-frequency modes, the consistent results between that of equations (3.1) and (3.2) verify that the proposed filter works.

The third test case uses the following composite waveform that involves a non-sinusoidal part.(3.3)In figure 10*a*,*b*, the original data are shown as a thin solid line, the original low-frequency and non-sinusoidal parts (the first line of equation (3.3)) are shown as a thick solid line, and the estimated low-frequency and non-sinusoidal parts are shown as a dashed line. The error around the two ends is again obvious. The corresponding log-error plots are shown in figure 11*a*,*b*, respectively. The error around the two ends attenuates in an exponential manner as points move towards the interior region such that the diffusive version has a slightly longer penetration length (approx. 1.8 for both versions) than the iterative Gaussian smoothing method.

The above comparisons show that, in the interior point far from the two ends, the difference between the results of employing the iterative and diffusive iterative Gaussian smoothing methods is insignificant for both the finite- and infinite-domain data. Note that the latter version should remove a longer segment around the two ends than that of the former version. Moreover, the required computing time of employing the strict diffusive iterative Gaussian smoothing method is about twice that obtained using the iterative Gaussian smoothing method. The authors have examined many other test cases with iteration step *m*≤10 000 and similar conclusions for the original and diffusive smoothing versions were obtained. Therefore, although the iterative Gaussian smoothing method cannot be proven to be a strict diffusive filter, it can be applied to the cases with *m*<10 000 for both finite- and infinite-domain data so as to save the computing resource. Based on this argument, the following practical applications merely employ the iterative Gaussian smoothing method.

According to the above-mentioned tests and many other cases not shown here, it seems that the penetration distance of employing the iterative Gaussian smoothing method confirms the estimation given in table 2. As the number of iterations increases, the distance may be scaled down by dividing by an empirical factor of . Since the resulting and *y*′ in regions of and may involve large errors, it is recommended to discard them if the data within these regions are not important and if the data length is long enough.

A comparison between the present smoothing method and five well-known wavelet filters has been performed. Since all these filters are subject to the errors caused by the end effects induced by the finite data range, only the interior points (4<*x*<6) are considered. Table 5 shows maximum absolute errors deduced from the estimated smooth parts as shown in figure 10. The schemes selected for comparison are the Gaussian smoothing method, the iterative Gaussian smoothing method and the following five well-known wavelet filters: Daubechies family wavelet (dbN); Symlet wavelet (symN); Coiflet wavelet (coifN); biorthogonal wavelet pair (biorNr.Nd); and reverse biorthogonal wavelet pair (rbioNr.Nd). In deducing the data using the five wavelet filters, we employed MatLab (Misiti *et al*. 2007) toolbox by applying the optimal tuning of the control parameters to suppress error. As shown in table 5, even under the optimal conditions, the performance of these five wavelet filters is not better than the current iterative Gaussian smoothing method and not surprisingly the conventional Gaussian smoothing method is worse than the other approaches. If one opts for a smaller value of *δ* under a moderate data length, the proposed iterative Gaussian smoothing method may turn out to be better than the wavelet filters.

The next case involves the decomposition of a real electrocardiogram (ECG) dataset downloadable from www.physionet.org website. The data record represents an ECG of a healthy adult—data record number edb/e0103. The total number of points is 16 500 and the duration of measurement is 60 s. Since this dataset is a real-time measurement, it contains both the high-frequency noise and the low-frequency baseline wander. The inter-beat (RR) interval of a healthy human at rest is of the order of 1 s. Therefore, we selected *σ*=1.4 s to filter the low-frequency baseline wander and *σ*=0.05 s to filter the high-frequency noise. The critical iterations for both smoothings were fixed at 127 implying that the cut-off transition zone for the low-frequency baseline wander is between wavelengths of 1.82 and 3.64 s, and the cut-off transition zone for the high-frequency noise is between wavelengths of 0.065 and 0.13 s. These choices were selected by trial and error aiming at eliminating both the baseline wander and the high-frequency noise while preserving the primary wave signatures.

Figure 12 shows the comparison of the raw and the smoothed ECG data in which both the low-frequency baseline wander and the high-frequency noise are filtered. It can be seen that the current smoothed data clearly display signatures of a healthy ECG pattern—it starts with a P wave, followed by the QRS complex and the T wave. The baseline wander of the raw data of the ECG data examined is fairly gradual. To create a challenge, we have altered the ECG data by adding a measured baseline wander, which is also downloadable from the same website. As shown in figure 13, the magnitude of this baseline wander is of the same order as that of the ECG signals and it is also accompanied by additional random noise. After the iterative Gaussian smoothing method is applied to the dataset to filter the low-frequency baseline wander and the high-frequency noise, the results are shown in figure 14. As shown in the figure, the current method provides excellent smoothing of the raw data, and the repeating pattern of a healthy ECG is clearly observed in the smoothed data.

The next case involves some experimental data from NASA Langley Research Center Workshop on Computational Fluid Dynamics (CFD) Validation of Synthetic Jets and Turbulent Separation Control (cfdval2004.larc.nasa.gov). One of the cases in the workshop involves a zero-mass-flux, or synthetic, jet issuing from a circular orifice. The zero-mass jet is driven by the rapid motion of a piston-driven deformable wall inside a cavity. The piston is driven electromechanically using a sinusoidal voltage input with a prescribed frequency. The phase-averaged data for the actuator's displacement about its neutral position are provided and the derivative of these data must be taken to obtain the velocity boundary condition of the diaphragm for later CFD calculations. Figure 15 shows the phase-averaged data for the piston displacement. Although the profile seems smooth, the best fit of a single sine wave reveals that the data profile is not exactly a sine wave and the data may consist of a combination of several different waves. Since these data are phase averaged with the sinusoidal voltage frequency, the data would contain only wave combinations of the voltage frequency and its subharmonics. Indeed, a Fourier power spectrum analysis has revealed that only waveforms with wavelengths of 360°, 180°, 120° and 90° of the sinusoidal voltage frequency are visible in the data. The wavelength ratios of two adjacent waveforms are therefore 2, 1.5 and 1.3333. The solutions of equation (2.16) provide choices of *m* and *σ* to divide these waveforms: a critical pair of *σ*=139° and *m*=127 is needed to divide the 360° and 180° waves; a critical pair of *σ*=108° and *m*=8134 is needed to divide 180° and 120° waves, and a critical pair of *σ*=91° and *m*=596 134 is needed to divide the 120° and 90° waves.

The wave decomposition is performed using the iterative Gaussian method discussed with the periodic assumption such that the computing range is extended to the point *t*, where , *t*_{e}=0 or 2*π*. The predominant waveform is the 360° one shown as the solid line in figure 16. The 360° waveform consists of two waves. The primary wave can be approximated by fitting a cosine function as 0.076 cos (2*π*/*T*)−0.051, where *T* is the time period of the piston cycle. The secondary wave can be obtained by subtracting the primary wave from the smoothed 360° wave and the difference is shown as symbols in the figure. We can further fit the secondary wave with a sine function as 0.066 sin (2*π*/*T*), as shown in figure 16.

The decomposed 180°, 120° and 90° waves are depicted in figure 17. These waves can be further approximated by three cosine functions with variations in phase shifts and magnitudes: 0.43 cos (4*π*(*t*/*T*−12.45/360)), 0.0177 cos (6*π*(*t*/*T*+24.53/360)) and 0.074 cos (8*π*(*t*/*T*+0.44/360)) for the 180°, 120° and 90° waveforms, respectively.

Finally, the velocity of the piston movement can be obtained analytically by differentiating the sum of five piston displacement equations with respect to time. The analytical form of the velocity is depicted as a solid line in figure 18. Also, a dashed line is presented showing the velocity derived from the best-fit single sine wave to the piston displacement data. These results are then compared with experimental velocity data, which are approximated by the slopes of the displacement data at every two adjacent points. It can be seen that the comparison of the experimental data and the current composite profile is excellent.

## 4. Conclusions

Two iterative algorithms based on the Gaussian smoothing method to filter and decompose time-series data are developed: one is approximately diffusive and the other is strictly diffusive. While these algorithms are general, they are used here in conjunction with the moving least-squares method to handle the finite data length. The connection between the frequency domain and the physical space can be quantified by the selection of three parameters: the accuracy criterion; the Gaussian smoothing factor; and the number of iteration performed. The distance of the error penetration from the two data ends into the interior region is studied. It is proven that, in the interior point remote from the two ends, the result is the same as that of the infinite data length. The method has been successfully tested in a number of problems.

## Acknowledgments

The first author is supported by the National Science Council of Taiwan under the grant no. NSC-94-2212-E006-084.

## Footnotes

- Received May 15, 2007.
- Accepted February 15, 2008.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

- Copyright © 2008 The Royal Society