## Abstract

Before structural health monitoring (SHM) technologies can be reliably implemented on structures outside laboratory conditions, the problem of environmental variability in monitored features must be first addressed. Structures that are subjected to changing environmental or operational conditions will often exhibit inherently non-stationary dynamic and quasi-static responses, which can mask any changes caused by the occurrence of damage. The current work introduces the concept of *cointegration*, a tool for the analysis of non-stationary time series, as a promising new approach for dealing with the problem of environmental variation in monitored features. If two or more monitored variables from an SHM system are cointegrated, then some linear combination of them will be a stationary residual purged of the common trends in the original dataset. The stationary residual created from the cointegration procedure can be used as a damage-sensitive feature that is independent of the normal environmental and operational conditions.

## 1. Introduction

A major issue prohibiting the extension of structural health monitoring (SHM) technologies to structures in operation in the real world is the fact that many features monitored for their sensitivity to damage are also sensitive to changes caused in the structure by environmental and operational conditions (Worden *et al.* 2007). A successful SHM system must be able to distinguish between changes caused by innocent ambient variations, such as temperature fluctuations, and those caused by damage. Commonly, temperature is found to be the most dominant factor affecting structural response as it directly affects the stiffness of a structure, although it can additionally change the boundary conditions of a structure if foundations, etc. become frozen (see Sohn *et al.* 1999; Alampalli 2000; Peeters & De Roeck 2001).

A review of the relevant literature reveals a number of potential options already explored for dealing with the problem of environmentally induced variations in structural response. A dominant approach is to attempt to model the monitored parameters in question with respect to those environmental/operational factors considered to have an effect (Peeters *et al.* 2001; Sohn *et al.* 2001; Worden *et al.* 2002; Ni *et al.* 2009). If a model can predict the value of a monitored feature given the conditions affecting it, the error of the model would be suitable as a damage-sensitive feature. Often these approaches have employed a simple regression of the damage-sensitive feature (normally natural frequencies) onto measured structural temperature. With this approach, to have reasonable confidence in any model prediction capability, all factors affecting the monitored parameter must also be monitored, and importantly, understood—which is an obvious limiting factor in situations where only small sensor networks are available.

Other approaches, where perhaps measurements of the environmental/operational conditions are not available have also been explored. A simple approach to the problem is to use a long span of response data to define the normal condition of a system, an idea explored by Surace & Worden (1997), this could be, for example, data collected over a whole year where all ranges of environmental/operational conditions have occurred. New measurements may then be compared in some way with the defined normal condition. Evidently, this approach requires storage of a large amount of data, and a further drawback is that using a large normal condition set may reduce feature sensitivity to damage (Worden *et al.* 2002). A different approach is discussed by Manson (2002), where the proposed idea is to decompose the features used for SHM using principal component analysis (PCA) and to trap the high-variance signatures of environmental change in the first few principal component scores; the remaining minor component scores then constitute a temperature-independent feature set. These ideas of linear projection and trapping of environmental variation were also independently proposed through the use of factor analysis (FA); the work by Kullaa (2004) is a slightly later exposition of the approach. As both PCA and FA admit nonlinear generalizations, there is also scope for using the projection approach on more complex datasets. However, it transpires that the projection approach had actually been anticipated, but in the literature of econometrics, as described by Stock & Watson (1988, 1993). In fact, the PCA approach of Stock and Watson actually belonged to a larger class of algorithms being developed in the econometrics literature; these algorithms are associated with the concept of *cointegration*.

In the current work, cointegration is suggested as a suitable methodology for removing environmental trends from SHM data. Recently, the idea has been applied to statistical process control (SPC) by Chen *et al.* (2009). Cointegration is a property of some non-stationary multi-variate time series; briefly, two or more non-stationary variables are *cointegrated* if some linear combination of them is stationary. Econometricians traditionally test for cointegration between two or more economic variables as a means of establishing whether there is a statistically significant relation between them. Although engineers may well be interested in problems of a similar nature, they may find the stationary linear combination created during the cointegration process of more practical interest. If a number of variables from some process under investigation are cointegrated, the stationary linear combination of them found during the cointegration process will be purged of all common trends in the original datasets, leaving a residual equivalent to the long-run dynamic equilibrium of the process (Chen *et al.* 2009). In terms of SHM data, the common trends removed by the cointegration process will be those caused by the latent variables driving the response of the structure, i.e. the environmental and operational conditions.

In theory then, the cointegration process is ideally suited to remove environmental and operational trends from SHM data. This idea is explored using a simulated scenario in the next section. Subsequently, the sometimes complex mathematics behind the cointegration process will be introduced in a comprehensive manner with the aim of providing a self-contained paper tailored for structural dynamicists. Finally, the paper will conclude with the application of cointegration to a benchmark SHM problem, where Lamb waves are used to detect the presence of damage in a composite plate subjected to cyclic temperature fluctuations.

## 2. Illustrative example

The current section is intended as an illustrative and intuitive example of the approach taken towards removing environmental trends from damage-sensitive monitored variables in this work. In essence, the following section describes how cointegration can be used for SHM; any formal reference to the cointegration procedure is, however, left to §3. This example uses simulated natural frequencies as damage-sensitive features as they are easily accessible. Natural frequencies would not commonly be used as damage indicators in a state of the art SHM system, but are used here for illustrative purposes. To begin, the dynamic response of a 10 degree of freedom system, shown in figure 1, to a random excitation is simulated.

To imitate a dependency on environmental conditions and introduce non-stationarity to the dynamic response of the system, the spring stiffnesses were allowed to vary with time. A simulated temperature field was then added with the highest temperature at the first mass and affecting the subsequent masses successively less.

Each stiffness was set as a linearly decreasing function of temperature, and over the duration of the simulation, this temperature was steadily decreased. To serve as damage-sensitive features, the natural frequencies of the system were extracted by solving the eigenvalue problem at each time instant in the simulation, as shown in figure 2, a small amount of Gaussian noise has also been added to simulate instrument noise.

In their current state, each natural frequency is dependent on temperature and, therefore, less than ideal as a choice for a damage-sensitive feature to monitor. However, as each frequency here is driven by the same temperature field, the correct parameter choice for a simple linear combination of a number of the frequencies would result in a stationary residual, purged of any dependence on that temperature field. In this example, the first two natural frequencies are combined as follows: 2.1

where *ω*_{i} denotes the *i*th natural frequency, *ε* the residual sequence and *α*, *β* are constants to be determined. On the correct choice of the parameters *α*, *β* the residual sequence will become stationary, as shown in figure 3*a* for this example. Now, regardless of the temperature, this residual will continue to be stationary all the time the system is operating in its normal condition. Upon the occurrence of an event that changes the relationship between the variables included in the combination, the residual will become non-stationary. In other words, the residual sequence will become non-stationary if the system begins to operate outside of its normal condition. For SHM then the residual sequence seems a sensible choice for a damage-sensitive feature.

For the current example, it remains to test the created stationary residual's sensitivity to damage. In order to do this, the system was re-simulated, again with temperature variation but with a different excitation sample; in this case, the second spring stiffness was abruptly reduced to 50 per cent of the healthy value mid-way through the simulation. Figure 3*b* illustrates the projection of the newly simulated features onto the established combination (equation (2.1)). Clearly, the residual becomes non-stationary at the midway point, corresponding to the introduction of damage to the system. The upper and lower limits are computed from the undamaged residual and represent the mean±3 s.d. This plot is essentially an SPC ‘X-chart’ (Montgomery 2009), which clearly indicates that damage has occurred shortly after the mid-point of the time record.

As mentioned previously, the point of this example has been to illustrate the suggested approach in this work to removing unwanted environmental trends from damage-sensitive features. The problem has been reduced to finding the correct parameters with which to combine the non-stationary variables into one stationary residual sequence. This approach, however, is evidently valid only for variables that share common trends that can be removed by a linear combination of the original variables. From econometrics, this is in fact the definition of cointegrated variables; one can therefore draw on the considerable and sophisticated research carried out in that field to discover the best way to approach the problem of finding the parameters that will create the most stationary residual sequence.

## 3. The theory of cointegration

### (a) Overview

The current section aims to introduce the concept of cointegration in a more rigorous manner than attempted in the previous two sections. All of the following theory is known from the econometrics literature; however, it is considered useful here to present it in a form tailored to structural dynamicists and in appropriate notation. In the following, some familiarity with auto-regressive (AR) and vector auto-regressive (VAR) models is assumed, however, appendix A provides a short introduction to the points considered important. As noted above, this section draws from a number of key texts from the econometrics literature by Johansen (1995), Fuller (1996), Maddala & Kim (1998) and Juselius (2006), which may be referred to for a more mathematically rigorous treatment of the material. Here, to begin with, a simple definition of cointegration is introduced.

### Definition 3.1

Two or more non-stationary time series are cointegrated if a linear combination of them is stationary.

In the following equation, where the non-stationary time series are modelled as a VAR process {*y*_{i}} (in keeping with common practice in structural dynamics, vectors will be denoted with curved brackets {−}, and matrices with square brackets [−]), the series are cointegrated if a vector {*β*} exists such that *z*_{i} is stationary, where
3.1

If this is the case, {*β*}^{T} is called a *cointegrating vector*. If {*y*_{i}} includes a total of *n* variables, there may be as many as *n*−1 linearly independent cointegrating vectors. Clearly, for the time series to be cointegrated, to begin with, they must have shared/common trends. There is one further restriction, which is that all times series must be *integrated* of the same order.

### Definition 3.2

If a non-stationary process variable *y* becomes stationary after differencing *d* times, it is said to be integrated of order *d*, which is denoted *y*∼*I*(*d*).

In other words, the time series must have the same ‘degree of non-stationarity’ if they are cointegrated.

For the purposes of SHM, the intent would be to use monitored variables that are cointegrated and find the cointegrating vector to create a stationary residual sequence for damage detection. From an engineering point of view, monitored variables from the same process or system are more than likely to share common trends on account that each process variable will be driven by the same latent influences. This cannot be said, however, of the order of integration of each monitored variable, this must be ascertained before any attempt is made to find the cointegrating vector.

The order of integration of a time series is ascertained in econometrics by employing a stationarity test, which is often analogous to testing for a unit root in a time-series model. The stationarity test employed here is called the augmented Dickey–Fuller (ADF) test and will be described later in this section.

Once it has been ascertained to what order all process variables of interest are integrated to, it remains to find the cointegrating vector that will result in the most stationary combination of the variables. There are two common approaches to this problem in econometrics: the first is the Engle–Granger two-step estimation procedure (Engle & Granger 1987) often employed when there are only two process variables included in the analysis, and the second is the Johansen procedure (Johansen 1988), a more complex maximum-likelihood multi-variate estimation procedure. Owing to its increased sophistication, the Johansen procedure will be employed here, its sometimes complex mathematics will be described, but a summary of the process will also be provided for quick reference.

### (b) The augmented Dickey–Fuller test

The first step is to test that all variables under consideration are integrated of the same order, this is achieved by the ADF test (Dickey & Fuller 1979, 1981). Like many econometric stationarity tests, the ADF test is based on a unit root test for a time-series model. If a time-series model has a unit root, it will be inherently non-stationary. The idea is perhaps best illustrated by looking at a first-order AR model (AR(1)), which takes the form
3.2where *ε*_{i} can be considered to be a Gaussian white noise process. In this case, the value of *a*_{1} defines the root of the characteristic equation of the process (see appendix A for more details). The roots of the characteristic equation of any process determine its stability and, therefore, its stationarity. In this example, the process *y*_{i} will be stationary if *a*_{1} is less than one in magnitude and non-stationary if it is larger or equal to one in magnitude. In the case that *a*_{1} is equal to one, the process will have a *unit root*, and equation (3.2) becomes
The process will be non-stationary but its first difference will be stationary; in econometrics terminology, it will be integrated order one, denoted *y*_{i}∼*I*(1).

When fitting a process to an AR(1) model then, information on the stationarity of the process is obtained from the parameters defining the characteristic root. This is normally achieved by testing a null hypothesis of *a*_{1}=1. The most obvious way of going about this would be to carry out a *t*-test on the parameter *a*_{1}; however, under the assumption of non-stationarity, the least-squares estimate of the parameter will not be distributed around unity. Rather than carrying out a traditional *t*-test, the *t*-test statistic will normally be compared with critical values constructed by and found in Dickey & Fuller (1979).

The ADF test follows the same premise as described above, but involves fitting the data to a more complex time-series model as described in
3.3Here, the difference operator Δ is defined as Δ*y*_{i−j}=*y*_{i−j}−*y*_{i−j−1}. A suitable number of lags *p* should be included to insure that *ε*_{i} becomes a white noise process (Perman 1993). To convert from the more traditional AR(*p*) model to the model form (equation (3.3)), the following substitutions should be made: let *a*_{1}=1+*ρ*+*b*_{1}, *a*_{n}=−*b*_{n−1}+*b*_{n}, for *n*=2…*p*−1, and *a*_{p}=−*b*_{p−1}, where *a*_{j} are the AR model coefficients.

Using these substitutions, the characteristic equation of (3.3) can readily be obtained from the characteristic equation of an AR(*p*) process as
3.4where *λ* are the roots of the characteristic equation. With this more complex form of time-series model, there could be as many as *p* independent roots. As an explosive process would be obvious to the analyst from the outset, the scenarios that remain of interest here are those where all roots are smaller than or equal to unity. If at least one root of the characteristic equation is unity, it follows from equation (3.4) that *ρ* must equal zero. Assume for the moment that there is a single unit root and consider the remaining (*p*−1) roots of the characteristic equation. With *ρ*=0, equation (3.4) becomes
3.5

It is clear that if *y*_{i} has one unit root, all remaining roots are smaller in magnitude than one. Furthermore, on closer inspection, equation (3.5) is the characteristic equation of the AR process of the differenced time series,
3.6As equation (3.5) must have all roots smaller than one in magnitude, the first difference Δ*y*_{i} must be stationary. If *y*_{i} is non-stationary, but its first difference is stationary, as in the case above, the process is integrated order one.

To summarize, for *y*_{i} to be integrated of order one, it is necessary that *ρ*=0 in equation (3.3). The ADF test procedure is, therefore, to estimate the parameters in equation (3.3) by the least-squares methods and then test the null hypothesis *ρ*=0. The test statistic
3.7where is the least-squares estimate of *ρ* and *σ*_{ρ} is the variance of the parameter, which should be compared with the critical values from Dickey–Fuller (DF) tables (an example of the relevant DF table for this model type will be provided as electronic supplementary material, otherwise, see Fuller (1996)). The hypothesis is rejected at level *α* if *t*_{ρ}<*t*_{α}. If the hypothesis is accepted, the time series has a unit root and is *I*(1). If the hypothesis is rejected, the test is repeated for Δ*y*_{i}, if the hypothesis is then accepted *y*_{i} is an *I*(2) non-stationary sequence. This can be continued until the integrated order of the time series is obtained.

Additional hypotheses and test statistics are needed if the model form used is extended to include shifts or deterministic trends (or both). For the extended time-series model form,
3.8the null hypothesis for the time series to be integrated order one should be extended to include *μ*,*ν*=0. More details for these specific cases can be found in Dickey & Fuller (1981) and Fuller (1996).

Having ascertained the degree of non-stationarity of each process variable of interest, an attempt to create a stationary residual through combination of those variables integrated to the same order can be made. The Johansen procedure is outlined below for this purpose.

### (c) The Johansen procedure

The Johansen procedure is traditionally used to test if a number of *I*(1) economic variables are cointegrated, and if they are, to establish the number of independent cointegrating vectors and also determine which of the cointegrating vectors will create the most stationary linear combination of the variables in question. For completeness, the majority of the theory behind the procedure will be laid out here, although the real interest to SHM practitioners will most probably be in how to find the best cointegrating vector for a given set of monitored variables.

The premise of the Johansen procedure is to use a maximum-likelihood approach to estimate the parameters of a vector error-correction model (VECM) of the variables under consideration. A VECM takes the form
3.9where {*y*_{i}} denotes an *n*-vector including all *n* variables to be analysed, with the subscript *i* relating to time, *i*=1,…,*N*, *p* represents the model order, or the number of lags to be included in the model and {*ε*_{i}} is a normally distributed noise process, {*ε*_{i}}∼*N*(0,[*Σ*]). A term to describe a deterministic trend {*D*(*t*)} has also been included. Equation (3.9) is the multi-variate analogue of equation (3.8).

Error-correction models are common in econometrics and are closely linked with the idea of cointegration. In fact, the existence of an error-correction model implies that the included variables are cointegrated and vice versa, this is called the *Granger representation theorem* (Engle & Granger 1987). If a true error-correction model exists (i.e. where {*ε*_{i}}∼*N*(0,[*Σ*])), the parameters in [*Π*] would describe the long-run equilibrium between variables, and the parameters [*B*_{j}] would account for short-run adjustments needed to return the process to equilibrium after any drifts.

The Johansen procedure uses the maximum likelihood of observing the correct {*ε*_{i}} to estimate the parameter matrix [*Π*]. A summary of the necessary points for calculation of the best cointegrating vector will be provided in §4. The derivations of these points are provided below.

Under the assumption that equation (3.9) is a true error-correction model and that the variables under consideration are *I*(1) (which implies that {Δ*y*_{i}} and {Δ*y*_{i−j}} are stationary), the parameter matrix [*Π*] must be rank-deficient, say of rank *r*, and can therefore be decomposed into two matrices
3.10where [*α*] and [*β*] are both *n*×*r* matrices. From basic linear algebra theory, the *r* rows of [*β*]^{T} will span the row space of [*Π*]. Now as the original matrix [*Π*] described the long-run equilibrium relations between the variables, [*β*] can be taken as the desired cointegrating vector to be found.

As previously indicated, parameter estimation is achieved by maximizing the likelihood of observing the correct {*ε*_{i}}. If {*ε*_{i}}∼*N*(0,[*Σ*]), its probability density function will be
3.11where |*Σ*| is the determinant of the estimated covariance of {*ε*_{i}}. It follows that the likelihood of observing the entire correct sequence of {*ε*_{i}} will equal .

On closer inspection of equation (3.11), each individual term is bounded above by the fractional term preceding the exponent, therefore the likelihood function is bounded above by ((2*π*)^{n}|*Σ*|)^{−(N/2)}, and so
3.12In other words, the maximum-likelihood parameter estimates will correspond to the parameters that maximize |*Σ*|. This point will be returned to later; however, for now, the effort will be focused on manipulating the maximized-likelihood function in order to express all parameter estimates in terms of [*β*]. Before going any further, however, following Johansen (1995), some new notation will be introduced to simplify the VECM expression (equation (3.9)).

Let {*z*_{0i}}={Δ*y*_{i}}, {*z*_{1i}}={*y*_{i−1}}, and [*Ψ*]=[[*B*_{1}],[*B*_{2}],…,[*B*_{p−1}],[*ϕ*]], then equation (3.9) will take on the simplified form
3.13Referring to this simplified form the log-likelihood function *L*, where , is first used to estimate [*Ψ*] by calculating
3.14After the necessary matrix calculus and some careful rearrangement, the estimate for [*Ψ*] can expressed as
3.15where [*M*_{nm}] are product moment matrices defined by
3.16Substituting equation (3.15) back into equation (3.13), {*ε*_{i}} may now be expressed as
3.17This expression can be further simplified by defining the residuals {*R*_{0i}} and {*R*_{1i}} from the following regressions:
3.18where the coefficient matrices are found by ordinary least squares: [*C*_{1}]=[*M*_{02}][*M*_{22}]^{−1} and [*C*_{2}]=[*M*_{12}][*M*_{22}]^{−1}. Finally, equation (3.17) becomes
3.19{*ε*_{i}} has now been expressed in terms of the residuals of regressions of {*z*_{0i}} and {*z*_{1i}} on {*z*_{2i}} and [*α*],[*β*], which are still to be found. To use econometrics terminology, the term has been ‘concentrated out’.

It now remains to find the maximum-likelihood estimates of [*α*] and [*Σ*] in terms of [*β*]. Assuming a fixed [*β*], these are found to be
3.20and
3.21where, similar to equation (3.16), [*S*_{nm}] are product moment matrices defined by
3.22All free parameters of {*ε*_{i}} have now been expressed in terms of [*β*], which is still to be estimated. The estimation of [*β*] is achieved using the previously ascertained fact that the maximum-likelihood parameter estimates will correspond to the parameters that maximize |*Σ*| (equation (3.12)). From equation (3.21) then, the maximum-likelihood estimate of [*β*] corresponds to the [*β*] that maximizes
3.23Using a matrix lemma from Johansen (1995), this can be re-expressed as
3.24Now, since [*S*_{00}] is fixed, |*Σ*| is maximized by maximizing
3.25where [*M*]=[*S*_{11}]−[*S*_{10}][*S*_{00}]^{−1}[*S*_{01}] and [*N*]=[*S*_{11}]. Using a second lemma from Johansen (1995) (lemma A8), for [*M*] and [*N*] to be symmetric and positive definite, the ratio (equation (3.25)) is maximized by , with the maximal value equal , where {*v*_{i}} and *λ*_{i} are the solutions of the generalized eigenvalue problem
3.26From the same lemma, any can be chosen as the maximizing argument of equation (3.26), where [NS] is any non-singular *r*×*r* matrix. This allows normalization of the cointegrating vectors found.

By substituting the relevant [*M*] and [*N*] into equation (3.26), the final generalized eigenvalue problem to be solved is
3.27where . It is also interesting to note that with appropriate normalization, {*v*_{i}}[*S*_{11}]{*v*_{j}}=*δ*_{ij}. With this appropriate normalization for , the following hold true:
3.28and
3.29Upon solving equation (3.27), the relevant cointegrating vectors are found. It turns out that the choice for the ‘best’ cointegrating vector corresponds to the largest eigenvalue of equation (3.27). The eigenvalues *λ*_{i} can be interpreted as the squared canonical correlations between {*R*_{0i}} and {*R*_{1i}}, so between the cointegrated combinations and a linear combination of the (stationary) differences {*w*_{i}}{Δ*y*_{i−1}}. In other words, the eigenvalues *λ*_{i} measure how strongly the cointegrated relation correlates with the stationary part of the process. The larger the eigenvalue the ‘more stationary’ the cointegrated relation.

### (d) The Johansen test statistic

Upon solving equation (3.27), the required cointegrating vector has been found. The final step, however, is Johansen's test for cointegration. For econometricians, this test is the key point to the procedure, as it is that which verifies whether the variables under consideration are in fact cointegrated or not. From an engineering perspective, the relationships between a set of monitored variables are often much better understood, which means that the question of whether they are cointegrated or not is less important than the one of ‘will the residual created stay within a set of limits while in normal condition’ or not, which can more often than not be verified visually. Having said that, it is always useful to have a way to quantify and clarify any analysis carried out.

Johansen's test for cointegration, then, depends upon the rank of the matrix [*Π*] in equation (3.9). Recall that, for *I*(1) variables, the matrix [*Π*] in equation (3.9) is required to be rank deficient if the error-correction model is to hold true. If [*Π*] has full rank, the *I*(1) variables cannot be cointegrated. Therefore, a test for cointegration can be based on the rank of [*Π*]. Johansen's approach to this is to use a likelihood ratio test where the hypothesis *H*(*r*), of [*Π*] having rank *r* is tested against the hypothesis *H*(*n*), of [*Π*] having full rank. Recall from equation (3.12) that the maximized likelihood depends on the estimated covariance matrix of the error residual in the error-correction model. Using relations (3.28) and (3.29), and lemma (A8) from Johansen (1995), equation (3.12) can be re-expressed as
3.30The likelihood ratio test, *Q*, then takes the form
3.31This leads directly to the ‘trace statistic’, *λ*_{trace}, used to test the hypothesis that there are at most *r* cointegrating vectors,
3.32The asymptotic distribution of this test statistic depends on the type of deterministic trend in the model (equation (3.9)). The relevant table with critical values for a process not including a shift or deterministic trend is provided as electronic supplementary material, otherwise, see Johansen (1995).

Juselius (2006) recommends using the trace test statistic in the following way: to begin, the hypothesis of *r*=0 is tested against one of [*Π*] having full rank. If the test statistic is smaller than the relevant critical value, accept the hypothesis of *r*=0, i.e. no cointegration. If it is larger, reject the hypothesis and move on to test *r*=1. In general, if the test statistic is larger than the relevant critical value, one should reject that there are as few as *r* cointegrating vectors, and move on to test the next largest possible rank *r*. In this way, the hypothesis of *r*=0 up to *r*=*n* (i.e. full rank) is tested.

This concludes the cointegration theory section. The next section provides a summary of this theory that omits any derivation, and only includes the steps needed if one were to implement the cointegration test procedure for SHM purposes.

## 4. Summary of the cointegration process for structural health monitoring

The previous sections have outlined the sometimes complex mathematics around the theory of cointegration. For the purposes of this paper, the Johansen procedure is used to find the most stationary linear combination of a set of monitored variables. Although many steps are necessary in the derivation of the Johansen procedure, as outlined in the preceding section, its application to the kinds of problem in question can be achieved in just a couple of steps. The procedure and the application of cointegration for the creation of damage-sensitive features independent of the effects of the environmental and operational conditions are summarized below.

— The suitability of the monitored variables to application of the cointegration process should first be assessed. To be included in the analysis, each monitored variables should be integrated of the same order, in other words they should have the same ‘degree’ of non-stationarity. Furthermore, for application of the Johansen procedure, each monitored variable should be integrated order one

*I*(1), i.e. a non-stationary variable with first difference stationary. Information of this kind is obtained by an ADF test on each variable. To carry out an ADF test:Fit each variable in question to the following time-series model using a least-squares approach, 4.1where the difference operator Δ is defined as Δ

*y*_{i−j}=*y*_{i−j}−*y*_{i−j−1}. A suitable number of lags*p*should be included to ensure that*ε*_{i}becomes a white noise process (Perman 1993).The least-squares estimate of the parameter

*ρ*is used to infer the degree of non-stationarity of the variable. If*ρ*is statistically close to zero, the process will be non-stationary and integrated order one. The null hypothesis of*ρ*=0 is tested by comparing the value of the test statistic 4.2where is the least-squares estimate of*ρ*and*σ*_{ρ}is the variance of the parameter, with the critical values from the DF tables. The hypothesis is rejected at a level*α*if*t*_{ρ}<*t*_{α}. If the hypothesis is accepted, the time series has a unit root and is*I*(1). If the hypothesis is rejected, the test should be repeated for Δ*y*_{i}, if the hypothesis is then accepted,*y*_{i}is an*I*(2) non-stationary sequence. This can be continued until the integrated order of the time series is ascertained. Additional hypotheses and test statistics are needed if the model form needs to be extended to include shifts or deterministic trends (or both).

— The Johansen procedure can now be applied to monitored variables found to be integrated order one. If the variables are cointegrated, the Johansen procedure will find the linear combinations of them that will result in a stationary residual sequence purged of the common trends shared in the variable set.

Fit the variables in question to a VAR model, 4.3and determine for those variables the most suitable model order

*p*using the Akaike information criterion (AIC) or similar (see Anderson 1971).The best linear combination of the variables, or cointegrating vector, is found as the parameter [

*β*] in the VECM of the variable set that takes the form 4.4where {*z*_{0i}}={Δ*y*_{i}}, {*z*_{1i}}={*y*_{i−1}}, ,*p*is model order ascertained previously in (i) and {*D*(*t*)} is a deterministic trend. To find [*β*], the cointegrating vector, first establish the residuals {*R*_{0i}} and {*R*_{1i}} of the following regressions: 4.5Next, define the product moment matrices 4.6Now, the required cointegrating vectors are found as the eigenvectors of the generalized eigenvalue problem 4.7The cointegrating vector that will result in the most stationary combination of the original variables will be the eigenvector with the corresponding largest eigenvalue.

— Once a suitable cointegrating vector has been found, new data from the monitored variables should be projected on to it. If the cointegrating vector was established on data from the normal condition of the structure, the residual sequence from the linear combination will continue to be stationary all the time the structure continues to operate in its normal condition. The residual sequence should therefore be continually monitored and deviations from stationarity taken to indicate a deviation from the normal condition of the structure.

— Finally, a trace test can be carried out to indicate the number of cointegrating vectors possible for a set of variables. To test the hypothesis that there are at most

*r*cointegrating vectors, the trace test statistic is used, 4.8Normal procedure is to first test the hypothesis of*r*=0 (no cointegration). If the test statistic is smaller than the relevant critical value, accept the hypothesis of*r*=0. If it is larger, reject the hypothesis and move on to test*r*=1. In general, if the test statistic is larger than the relevant critical value, one should reject that there are as few as*r*cointegrating vectors, and move on to test the next largest possible rank*r*. In this way, the hypothesis of*r*=0 up to*r*=*n*(i.e. full rank) is tested.

## 5. Applications to real structural health monitoring problems

The suitability of cointegration for environmentally sensitive SHM problems is assessed here through application to a benchmark study, originating from the Brite–Euram project entitled DAMASCOS, which attempted to use Lamb waves to detect damage in a composite plate subjected to cyclic temperature variations. The DAMASCOS test set-up will be briefly described below.

Data collected by the DAMASCOS consortium originate from a series of tests that measured the propagation of Lamb waves through a composite plate. The plate material was a carbon fibre reinforced plastic (CFRP) laminate with a 0^{°}/90^{°} lay-up. Two identical piezoceramic discs bonded to the midpoint of the edges of the plate were used to transmit and receive fundamental symmetric and antisymmetric Lamb-waves. In the particular test that this paper concerns, the instrumented composite plate was placed in an environmental chamber while Lamb waves were recorded every minute. The test composed of three different parts. In the first part, the temperature was held constant at 25^{°}C. In the second part, the temperature was cycled between 10^{°}C and 30^{°}C every 9 h. In the final part of the testing, the temperature continued to be cycled, however, damage was introduced by drilling a 10 mm hole in the plate between the two sensors.

Previously, Manson (2002) used the DAMASCOS benchmark data to investigate the feasibility of discovering features for damage detection from the response spectrum that are insensitive to environmental variations. This study also looked at projecting data onto the minor components from PCA to remove any temperature dependency from the data.

Following the work of Manson, the features chosen for analysis are 50 spectral lines from the frequency spectrum (numbers 46–95) that occur around the peak. Figure 4 illustrates the variation of amplitude of the 50 spectral lines under investigation for the duration of the testing.

The aim here is to use cointegration to create a feature that is insensitive to temperature-induced variation but still sensitive to damage. For this applied example, the procedure would be to test the features under analysis for cointegration, those that are cointegrated could then be combined using the Johansen procedure to create a stationary residual. This stationary linear combination should be established using a training set of data that includes points from the second phase of testing, where temperature variations were introduced. Projecting all new data onto the linear combination will produce a single feature. A successful feature will remain stationary all the time the plate remains undamaged, regardless of the temperature fluctuations, and should become non-stationary on the occurrence (or introduction, in this case) of damage.

To begin, all 50 features considered were included in a cointegration analysis. For the training set, 1000 data points were used, which incorporated 355 points of data from the first part of the test where the temperature of the plate was held constant, and 645 data points from the second period of the test, where the temperature was cycled between 10^{°}C and 30^{°}C. Figure 5 shows all 50 features tracked over the whole period of the test, including the time where damage was introduced, projected onto the linear combination created by the Johansen procedure from the training set. Error bars have been added at ±3 s.d. of the training set residual. Studying figure 5 shows that the residual remains stationary for the majority of the test but becomes non-stationary around data point 2500, which was when damage was introduced to the plate, a successful detection of damage. The only other time the residual exceeds the error bars is just below data point 1500, in fact at data point 1364, which coincides with when the temperature of the environmental chamber holding the plate begins to be cycled. At this point, a large discontinuity appears in the original features and so it seems natural that some small indication of this should appear in the cointegrated residual. Importantly, the residual returns to stationarity quickly and the created feature maintains its credibility as a damage detector, giving a very clear indication of a departure from the normal condition upon the introduction of damage.

On closer inspection of figure 5, however, in the time before damage was introduced visually the residual appears to drift over short time periods, despite the fact that it is ‘statistically’ stationary. Although this has posed no problem to the functionality of the feature, it is interesting to note.

The issues of the small drift and the discontinuity that occur in the cointegrated residual shown in figure 5 can, in this case, be remedied by considering a smaller subset of features in the cointegration analysis. If only the first 20 features of the set are considered, no discontinuity or drift occurs in the cointegrated residual and the sensitivity to damage still remains. This is most probably owing to the fact that these spectral lines demonstrate less temperature dependency than those lines from around the spectral peak. As this has not stopped the residual being sensitive to damage, it is an encouraging indication of the power of using cointegration to create damage-sensitive features.

## 6. Conclusions

Cointegration has been put forward as a new way to attempt to deal with the problem of environmentally induced variation in measured structural response. The idea, which originates from econometrics, is to linearly combine response variables that are cointegrated to create a stationary residual whose stationarity represents the structure's normal condition. When monitoring this residual, a departure from stationarity will indicate that the structure is no longer operating under its normal condition. This paper has introduced the sometimes complex mathematics of the Johansen procedure, which finds the most stationary linear combination of a set of variables under scrutiny. The ADF test has also been introduced as a stationarity test. Finally, the idea of using cointegration for SHM has been tested on a benchmark SHM problem from the DAMASCOS consortium, where the stationary residual from a combination of monitored process variables, which included environmental variability, was able to very successfully detect the introduction of damage in a composite plate.

Although not used for its intended purpose, cointegration has provided a useful tool for inclusion in an SHM analysis/system. Its advantages lay in the simplicity of the idea, the huge background of sophisticated research already carried out in the field of econometrics, and the fact that in essence no information is being lost as features are created through combinations of monitored variables. The suggested cointegration procedure can also be implemented where no measurement of the environment/operational conditions are available, the only stipulation being that the residual should be trained on data coming from the normal condition of a structure. It should be said, however, that the training data should be chosen carefully: if any normal environmental/operational condition is known to cause a dramatic change in monitored response, data from events such as these must be included in the training period, so as to avoid a false-positive damage indication on the next occurrence of a similar event.

Having stressed these advantages, some thought must also be given to possible limitations. One seemingly obvious limitation is that the Johansen procedure is only designed to work for variables integrated of order one, approximately *I*(1). This, however, is not really a problem in itself, as the large majority of the types of variables monitored in SHM will not possess/exhibit more than one unit root. The next is in the fact that the method only works for linearly related variables. An extension of the current theory is presently under study by the authors; however, it should be noted that the Johansen procedure is still valid for variables that are nonlinearly dependent on some outside driving force.

## Acknowledgements

The authors would like to express their gratitude to Dr Graeme Manson for providing the DAMASCOS data, also to Prof. James Brownjohn, Dr Ki-Young Koo and Russell Sturgeon, and finally to EPSRC for funding this research.

## Appendix A

Appendix A serves as a short introduction to AR and VAR models and their properties relating to stationarity.

#### (a) Auto-regressive models

The first step of the cointegration procedure involves the generation of an AR model for each non-stationary variable. An AR model is one that describes the evolution of a time series by a combination of its previous values. Once each variable is represented in an AR form, it should be determined whether that AR model is stationary or non-stationary. Consider the AR model of order *p* (AR(*p*)),
A 1where *ε*_{i} can be considered to be a Gaussian white noise process driving the model, *ε*_{i}∼*N*(0,1). Equation (A1) can be considered to be a randomly forced difference equation. With this is mind, the general solution of equation (A1) can be considered in the normal way as one would consider a difference equation
A 2where is the ‘particular integral’ and is a solution of equation (A1), and is the ‘complementary function’ and is a solution of the form
A 3

Assuming a solution to equation (A3) of the form, *y*_{i}=*Aλ*^{i}, equation (A3) becomes
A 4where *z*=1/*λ*. This form is called the characteristic equation of the process and implies that there are *p* possible *λ* satisfying equation (A4), which in turn leads to a general solution of
A 5where *A*_{1},…,*A*_{p} are fixed by *p* initial conditions. The general solution (A5) can be used to indicate the stability/stationarity of the time series. Looking at equation (A5), this solution will remain stable as long as |*λ*_{i}|<1 for all *i*, or alternatively as long as |*z*_{i}|>1. If any |*z*_{i}|<1, the process will behave explosively. Now if any *λ*_{i}=1, which is called a *unit root*, then and the process has marginal stability. In terms of statistics, then the following properties hold true:
A 6

#### (b) Vector auto-regressive models

VAR models are an extension of AR models to include multiple time series. Now, the evolution of two or more time series is described by combinations of past outputs from each series. As before, the stability conditions for a VAR model are of interest. A general VAR process of order *p* takes the form
A 7where each *ε*_{i} can be considered to be a Gaussian white noise process driving the model, *ε*_{i}∼*N*(0,1). Here, the [*A*_{i}] are *n*×*n* matrices and {*y*_{i}} are *n*-vectors. Again, at the complementary function is studied, which is a solution of
A 8

Considering a trial solution of {*y*_{i}}={*α*}*λ*^{i}, then as before the characteristic equation of the process can be obtained as
A 9with *z*=1/*λ*. From this form, it is clear that the stability conditions of the process are the same as for the general AR process as described in equation (A6).

- Received January 10, 2011.
- Accepted March 21, 2011.

- This journal is © 2011 The Royal Society