## Abstract

The stability of a dynamical system can be indicated by the eigenvalues of its underlying mathematical model. However, eigenvalue analysis of a complicated system (e.g. the heart) may be extremely difficult because full models may be intractable or unavailable. We develop data-driven statistical techniques, independent of any underlying dynamical model, that use principal components and maximum-likelihood methods to estimate the dominant eigenvalues and their standard errors from the time series of one or a few measurable quantities, e.g. transmembrane voltages in cardiac experiments. The techniques are applied to the prediction of cardiac alternans, which is characterized by an eigenvalue approaching −1. Cardiac alternans signals a vulnerability to ventricular fibrillation, the leading cause of death in the USA.

## 1. Introduction

Stability analysis of dynamical systems is of great importance because loss of stability often leads to sudden, drastic effects in a system. For example, when the propagating electrical wave in the heart becomes destabilized, abnormal heartbeats arise, which lead to various cardiac arrhythmias. Of particular interest to this work is an unstable pattern (called cardiac alternans) that is characterized by long-short, beat-to-beat variation in the electrical impulses. Cardiac alternans has been recognized as a marker of ventricular fibrillation (Rosenbaum *et al.* 1994), a heart rhythm disorder that kills hundreds of thousands of people in the USA alone each year. Other investigations on stability have considered the ‘swinging heart’ (Bayly & Virgin 1993), the respiratory system (Batzel & Tran 2000) and issues in manufacturing processes (Balachandran 2001; Mann & Young 2006), for example.

Stability of a dynamical system (or loss of stability) can be indicated by its eigenvalues. For example, alternans results from a period-doubling bifurcation that occurs when a negative eigenvalue approaches −1. When the underlying model of a dynamical system is explicitly available (e.g. a simple pendulum), stability analysis is a straightforward exercise (Strogatz 1994; Nayfeh & Balachandran 1995); for example, Li & Otani (2003) and Otani *et al.* (2005) successfully analyse the stability of a mathematical model of a cardiac cell. Unfortunately, to fully describe the dynamics of a real, complex system, e.g. cardiac tissue undergoing periodic stimulation or any system that incorporates time delays (see Mann & Patel 2010 and references therein), one may need equations with a very large or possibly infinite number of degrees of freedom. Specifying such a model may not be possible, so these traditional techniques for finding the eigenvalues are inapplicable. Thus, we seek a model-free, data-driven framework. A standard approach in stability analysis that relies only on experimental data is the 'empirical technique'. This procedure applies the principles of Floquet theory to a time series of measurements of the state variables and has been used to estimate eigenvalues of a simple forced pendulum (Bayly & Virgin 1993) and of cutting machines (Mann & Young 2006).

However, one difficulty with the 'empirical technique' is that the state variables may be ill defined (what set of quantities fully characterizes paced cardiac tissue?) and may not be physically measurable or recordable in real time (the stability analyses of Li & Otani (2003) and Otani *et al.* (2005) require knowledge of unmeasurable internal model variables). The difficulty of estimating eigenvalues from physical measurements is thus twofold: the appropriate state variables governing the dynamical system are unknown (what do we measure?) and/or approximations to the state space may consist of quantities that cannot be readily measured. Another drawback of the empirical technique is that it provides only point estimates of eigenvalues and no standard errors.

This study illustrates a technique that overcomes these problems and can estimate the most dominant eigenvalues and their standard errors from a time series of one or more measurable quantities. Our technique is based on the principle that when a system is mildly perturbed from the steady state, the resultant dynamics may involve only a small number of dominant eigenvectors (Strogatz 1994; Nayfeh & Balachandran 1995). We first assume that the potentially infinite-dimensional dynamical system is well described by only a few of the dominant eigenvectors. Second, we make a linear projection of the (unknown) state space of the system onto a pseudo-state space of measurable quantities. The eigenvalues of the projected variables are approximately equal to those of the true underlying model, with inaccuracies arising from using a simple linear projection rather than a more accurate nonlinear one.

Dominant eigenvalues can be found via a principal component analysis (Jackson 1991) performed on measurements collected as a time series. We also illustrate a novel method that imposes a statistical model (e.g. a multivariate autoregressive time series) that combines the effects of measurement error and model misspecification error (incurred by ignoring non-dominant eigenvalues and by assuming a linear projection is appropriate) on the time series. Using standard maximum-likelihood techniques, one may then obtain both point estimates and approximate standard errors for the dominant eigenvalues.

We frame our discussion in the context of predicting cardiac alternans from a set of easily measurable quantities called action potentials (APs) (Plonsey & Barr 2007). A complementary line of analysis has recently been introduced by Lemay *et al.* (2012), who predict the onset of alternans from a time series of pacing cycle lengths, action potential durations (APDs) and diastolic intervals. The main differences between their innovative work and ours are as follows: they consider random variation in the pacing intervals of the tissue, whereas we assume constant pacing; they use transfer functions and apply signal processing in the frequency domain, whereas our analysis stays in the time domain; and they provide only point estimates of eigenvalues, whereas we also give standard errors of the estimates.

Paced cardiac tissue also provides an excellent general testbed for our techniques, because models of it are too complex to solve exactly. Comparisons of eigenvalues estimated using our procedures with the actual values from a numerical model of alternans show excellent agreement, indicating that our approaches have promise not only for applications involving the prediction of alternans, but also for general dynamical systems. Thus, our methods could be quite useful in fields concerned with stability analysis, such as economics, robotics, biology and manufacturing. Most importantly, if a prosthetic cardiac pacemaker could be outfitted with a control algorithm using estimates of eigenvalues from our technique, it might help to prevent fatal ventricular fibrillation (Hall *et al.* 1994).

This study is organized as follows. In §2, we review stability dynamics and develop the equations from which we estimate dominant eigenvalues using physical measurements. In §3, we review the electrophysiology of the heart. In §4, we describe the principal component and maximum-likelihood methods used to estimate the number and magnitude of the dominant eigenvalues. In §5, we illustrate these methods using a mathematical model of alternans. In §6, we summarize our findings and discuss directions for further research.

## 2. Stability analysis of dynamical systems

When a dynamical system is perturbed from steady state, it may be driven back towards or away from equilibrium, or may oscillate around steady state with increasing or decreasing amplitude. Which path the system follows is indicated by the eigenvalues of the system of differential equations describing the dynamics. We review how one can obtain these eigenvalues when there is explicit knowledge of the model, and then present a technique to find eigenvalues when the model is too complex to be specified and only a time series of measurements is available.

### (a) Continuous case when model is specified

Consider a *k*-dimensional dynamical system
$$\frac{\mathrm{d}\mathbf{x}}{\mathrm{d}t}=\mathbf{f}(\mathbf{x}),\tag{2.1}$$
where **x** is the state vector of length *k* that characterizes the system and **f**(**x**) is a vector function. In the context of paced cardiac tissue, the components of **x** might be the numerical values of the size of the heart, its volume, electrical capacitance, etc. For the time being, let us assume that the components of the state vector **x** and the function **f** are known and well defined. We can denote the solution of equation (2.1) as a 'flow' function **Φ**(**x**,*t*), which represents a trajectory that starts from **x** and 'flows' in the state space for time *t*. Further, let us assume that the system possesses a periodic motion with period *T* and let **x**_{*} be a point on this periodic motion. It then follows that a trajectory starting from **x**_{*} at *t*=0 will return to **x**_{*} at *t*=*T*; i.e. **Φ**(**x**_{*},0)=**x**_{*} and **Φ**(**x**_{*},*T*)=**x**_{*}.

Let **Φ**(**x**_{*},*t*) represent the periodic solution. The variational equation (Nayfeh & Balachandran 1995) is then
$$\frac{\mathrm{d}}{\mathrm{d}t}\,\frac{\partial\boldsymbol{\Phi}}{\partial\mathbf{x}}(\mathbf{x}_{*},t)=\nabla\mathbf{f}\big(\boldsymbol{\Phi}(\mathbf{x}_{*},t)\big)\cdot\frac{\partial\boldsymbol{\Phi}}{\partial\mathbf{x}}(\mathbf{x}_{*},t),\tag{2.2}$$
which has initial condition (∂/∂**x**)**Φ**(**x**_{*},0)=**I**. The stability of the periodic solution can be analysed by computing the eigenvalues of the so-called monodromy matrix (∂/∂**x**)**Φ**(**x**_{*},*T*).

### (b) Discrete case when full model is unavailable

The aforementioned procedure is a continuous approach and relies on explicit knowledge of the underlying model of the system. If we wish to analyse a dynamical system by taking a time series of measurements, stability analysis requires a discrete description. This can be performed by introducing a Poincaré map (Nayfeh & Balachandran 1995). As an example, we assume that the system behaves continuously and introduce a stroboscopic section to sample the system in time steps of *T*. Thus, a trajectory starting from **x**_{0} will intersect the stroboscopic section in time intervals of *T*, and the subsequent intersecting points can be denoted by **x**_{1}, **x**_{2}, etc. Using the definition of the flow function, it follows that **x**_{n}=**Φ**(**x**_{0},*nT*). It then follows that there exists a mapping relation between **x**_{n−1} and **x**_{n} such that **x**_{n}=**Φ**(**x**_{n−1},*T*). Defining **P**(**x**)=**Φ**(**x**,*T*), the map can be written in the more familiar form **x**_{n}=**P**(**x**_{n−1}). One can show that **x**_{*} is a fixed point of the map **P**. Then, stability of the periodic solution is equivalent to the stability of the fixed point **x**_{*}.

For dynamics near **x**_{*}, one can linearize the Poincaré map (measuring **x** relative to the fixed point **x**_{*}): **x**_{n}≈∇**P**⋅**x**_{n−1}, i.e. each component of **x**_{n} is approximately given by a linear combination of the components of **x**_{n−1}. If this is the case, we can also write **x**_{n−1}≈∇**P**⋅**x**_{n−2}, etc., so that **x**_{n}≈(∇**P**)^{n}⋅**x**_{0}.

Performing an eigenvalue decomposition of ∇**P**, we can write ∇**P**=*Q*⋅*Λ*⋅*Q*^{−1}, where *Q*=[**q**_{1},**q**_{2},…,**q**_{k}] is a matrix of eigenvectors and *Λ*=diag(*λ*_{1},…,*λ*_{k}) is a diagonal matrix of eigenvalues. Then, (∇**P**)^{n}=*Q*⋅*Λ*^{n}⋅*Q*^{−1}. If we define the vector **c**=*Q*^{−1}**x**_{0} (the values of this vector will be unimportant), then
$$\mathbf{x}_n=\sum_{i=1}^{k}c_i\lambda_i^n\mathbf{q}_i+\boldsymbol{\delta}_n,\tag{2.3}$$
where *δ*_{n} is a vector of disturbances that incorporates the error in linearizing the Poincaré map. Transient dynamics of a system tend to be dominated by less stable eigenvalues. Furthermore, perturbations may disturb motion more in the direction of the least stable eigenvectors (Murphy *et al.* 1994). Near equilibrium, a good approximation is that the dynamics lie on a low-dimensional space (Bayly & Virgin 1993), which is true for the transients of paced cardiac tissue (Kalb *et al.* 2004), implying that only a dominant few (*m* of *k*) eigenvalues matter. Thus, we truncate the series in equation (2.3) after the *m* most important eigenvalues and write
$$\mathbf{x}_n=\sum_{i=1}^{m}c_i\lambda_i^n\mathbf{q}_i+\boldsymbol{\psi}_n,\tag{2.4}$$
where *ψ*_{n} is a *k*-dimensional error vector that now captures both the error (*δ*_{n}) in linearizing the Poincaré map and the error in truncating the series after a few dominant eigenvalues. If the components of the state vector **x**_{n} were known and measurable, one could estimate the eigenvectors and eigenvalues by least squares, for example. Unfortunately, the appropriate state vector will not be known for complex systems, so we introduce a pseudo-state vector to get around this complication.

Let **y**_{n} be a vector of *r* measurements (e.g. transmembrane voltage readings in cardiac tissue or calcium concentrations), observed at the *n*th time step. Then, we can consider **y**_{n} to be some projection of the true state vector **x**_{n}. If we use a linear approximation to this projection, which will of course introduce another source of error, then **y**_{n}−**y**_{*}≈*M*⋅**x**_{n} for some *r*×*k* projection matrix *M*, where **y**_{*} is the steady-state measurement vector.

Assuming that the matrix of projected eigenvectors [*M***q**_{1},…,*M***q**_{m}] has independent columns, the dominant eigenvalues can be estimated using the measured data **y**:
$$\mathbf{y}_n=\mathbf{y}_{*}+\sum_{j=1}^{m}\mathbf{v}_j\lambda_j^n+\boldsymbol{\xi}_n.\tag{2.5}$$
Here, the components of the vector **y**_{*} represent the steady-state values approached as *n*→∞, and each vector **v**_{j}=*c*_{j}*M***q**_{j} is the projection of the *j*th (scaled) dominant eigenvector into the measurement space. The *r*-dimensional error term *ξ*_{n} now represents measurement error plus 'model misspecification error', i.e. the errors introduced from three approximations: the linearization of the Poincaré map, truncating the series to only a few eigenvectors, and the linear projection of **x** onto **y**. After a brief discussion of alternans, the remainder of this study shows our approach for estimating the eigenvalues (*λ*_{j}) in equation (2.5) along with their standard errors.
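As a concrete illustration of the measurement model in equation (2.5), the following sketch simulates a short multivariate time series from a small set of dominant eigenvalues. All numbers (steady-state values, projected vectors **v**_{j} and eigenvalues) are invented for demonstration, and simple white noise stands in for the error term *ξ*_{n}.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: r = 3 measured quantities, m = 2 dominant eigenvalues.
y_star = np.array([200.0, 180.0, 150.0])   # steady-state values (e.g. APDs, ms)
V = np.array([[5.0, 1.0],                  # columns are the projected v_j
              [4.0, 0.5],
              [3.0, 0.2]])
lam = np.array([-0.8, 0.3])                # made-up dominant eigenvalues

N = 30
Y = np.empty((N, 3))
for n in range(1, N + 1):
    # y_n = y_* + sum_j v_j * lambda_j^n + xi_n (white noise stands in for xi)
    Y[n - 1] = y_star + V @ lam**n + 0.01 * rng.standard_normal(3)
# The transient decays geometrically toward y_*; the slowest mode has |lam| = 0.8.
```

With the negative eigenvalue dominant, successive beats overshoot the steady state from alternating sides, which is precisely the long-short signature of incipient alternans.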

## 3. Electrophysiology of the heart and cardiac alternans

Cardiac alternans is a marker of the vulnerability to ventricular fibrillation (Rosenbaum *et al.* 1994; Pastore *et al.* 1999), a lethal heart rhythm disorder that kills 350 000 Americans each year and is the leading cause of death in the USA according to the American Heart Association (see http://www.americanheart.org/). To develop an understanding of cardiac rhythm instability, we briefly review the electrophysiology of the heart. Cardiac cells respond to an electrical stimulus (i.e. pacing) by eliciting an AP (Nolasco & Dahlen 1968; Guevara *et al.* 1984). The response consists of a rapid depolarization of the transmembrane voltage followed by a much slower repolarization process before returning to the resting value (figure 1). The time interval during which the voltage is elevated is called the APD. The time interval between two consecutive stimuli is called the basic cycle length (BCL) or simply the pacing period.

Under a periodic train of stimuli, the action potential is a phase-locked 1:1 response when the pacing period is sufficiently long. However, when the BCL is short, the response may become a phase-locked 2:2 pattern, a phenomenon called cardiac alternans in the medical literature. During alternans, the APD alternates between short and long values (figure 2). Experimental evidence has established a connection between alternans and ventricular fibrillation (Pastore *et al.* 1999), so its prediction is a crucial step in detection and prevention of ventricular fibrillation and thus in saving lives.

From the point of view of nonlinear dynamics, alternans is mediated by a period-doubling bifurcation that occurs when a characteristic eigenvalue leaves the unit circle through −1. Thus, following this eigenvalue as the BCL decreases allows one to predict the pacing frequency at which the bifurcation will occur. Current methods to estimate these eigenvalues have shortcomings. On the basis of the hypothesis that cardiac dynamics can be described by a one-dimensional mapping model, the characteristic eigenvalue is traditionally associated with the slope of the so-called dynamic restitution curve (Nolasco & Dahlen 1968). However, this assumption is oversimplified and thus may not accurately predict alternans (Cherry & Fenton 2004; Lemay *et al.* 2012). In a recent paper, Weiss *et al.* (2006) present a more advanced mechanism for the onset of alternans based on intracellular calcium cycling. Although this mechanism provides insight into the dynamics of alternans, the stability criterion is hard to apply in experiments because it involves complicated intrinsic functions that cannot be measured in real time. Further, the innovative method of Lemay *et al.* (2012) fails to provide standard errors of estimated eigenvalues and considers only the case in which the pacing is stochastic.

A complex system such as paced cardiac tissue may require equations with an inordinate number of degrees of freedom. In fact, the appropriate state space to represent paced cardiac tissue is unknown. To estimate eigenvalues, we follow the assumptions in §2 and take the essential dynamics of the tissue to involve only a few dominant eigenvectors when the system is near a steady state. To continue analysis, we must construct a pseudo-state vector of measurable quantities whose dimension we believe is higher than the (unknown) number of dominant eigenvalues. We take the components of this pseudo-state vector to be measurements of transmembrane voltages (e.g. APD_{90}, APD_{50}, etc., see figure 1 for definitions), though they could also be measurements of calcium concentrations in cardiac experiments, etc. Using a set of APD readings is novel in itself, because most research (Lemay *et al.* 2012) considers only one APD at a time. We treat this pseudo-state vector as (approximately) some projection of the dominant eigenvectors of the true state vector so that their eigenvalues are nearly identical.

Once these quantities are recorded for the desired number of beats at a given level of BCL, the system of equations (equation (2.5)) describing their time evolution can be used to obtain point estimates and standard errors for the eigenvalues. As the BCL is varied and the dominant eigenvalues for each level of BCL are found, one can follow the trajectory of the most negative eigenvalue with BCL to predict at what pacing frequency the eigenvalue will cross −1 and when alternans will occur.
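The tracking idea described above can be sketched numerically: estimate the most negative eigenvalue at several BCLs, fit its trend, and extrapolate to the BCL at which it crosses −1. The eigenvalue values below are invented for illustration, and a simple linear trend is assumed, which need not hold close to the bifurcation.

```python
import numpy as np

# Hypothetical estimates of the most negative eigenvalue at several BCLs (ms).
bcl = np.array([520.0, 495.0, 470.0, 445.0, 420.0])
lam = np.array([-0.55, -0.63, -0.71, -0.79, -0.87])   # invented values

# Fit a linear trend lam = a*bcl + b and solve for the BCL where lam = -1.
a, b = np.polyfit(bcl, lam, 1)
bcl_onset = (-1.0 - b) / a   # predicted BCL at which alternans sets in
```

In practice, one would also propagate the standard errors of the eigenvalue estimates through this extrapolation to attach an uncertainty to the predicted onset.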

## 4. Eigenvalue estimation

Given a complicated dynamical system whose state vector is unknown or not measurable, stability analysis must proceed using only a time series of physical measurements via equation (2.5). Two quantities must be estimated simultaneously: the number of dominant eigenvalues *m* and their numerical values. For example, we wish to estimate the most dominant (most negative) of an unknown number of eigenvalues from the beat-to-beat evolution of observed APD values. These are difficult tasks because the equations are nonlinear and contain many unknowns.

The principal component approach (PCA) ignores measurement errors and misspecification errors (from the linearization of the Poincaré map, dropping small eigenvalues, and the linear projection of the true state space onto the vector of observable quantities) and obtains point estimates using least squares. We also introduce a maximum-likelihood method that proposes a model for the measurement and model misspecification error *ξ*_{n} in equation (2.5) and obtains maximum-likelihood estimates (MLEs) of the eigenvalues as well as their standard errors.

### (a) Principal component approach

Least-squares analysis can estimate the eigenvalues if we know the correct number of dominant eigenvalues. For purposes of demonstration, let us assume that the transient dynamics of the heart are governed by three dominant eigenvalues, denoted *λ*_{1}, *λ*_{2} and *λ*_{3} (finding the number of eigenvalues will be discussed later), and that measurement and misspecification errors are negligible. Thus, the values of APD^{90}, APD^{70} and APD^{50} on the *n*th beat can be approximated using equation (2.5):
$$\begin{aligned}\mathrm{APD}^{90}_n&\approx\mathrm{APD}^{90}_{*}+v_{11}\lambda_1^n+v_{12}\lambda_2^n+v_{13}\lambda_3^n,\\ \mathrm{APD}^{70}_n&\approx\mathrm{APD}^{70}_{*}+v_{21}\lambda_1^n+v_{22}\lambda_2^n+v_{23}\lambda_3^n,\\ \mathrm{APD}^{50}_n&\approx\mathrm{APD}^{50}_{*}+v_{31}\lambda_1^n+v_{32}\lambda_2^n+v_{33}\lambda_3^n.\end{aligned}\tag{4.1}$$
Here, $\mathrm{APD}^{90}_{*}$ denotes the steady-state value of APD^{90}, etc. Note that in the above expressions every quantity on the right-hand side, with the exception of *n*, is unknown and must be estimated. Let us introduce the vector of deviations from steady state, $\mathbf{z}_n=[\mathrm{APD}^{90}_n-\mathrm{APD}^{90}_{*},\ \mathrm{APD}^{70}_n-\mathrm{APD}^{70}_{*},\ \mathrm{APD}^{50}_n-\mathrm{APD}^{50}_{*}]'$. It then follows that
$$\mathbf{z}_n=C\cdot\boldsymbol{\lambda}^n,\tag{4.2}$$
where $\boldsymbol{\lambda}^n=[\lambda_1^n,\lambda_2^n,\lambda_3^n]'$ and
$$C=\begin{bmatrix}v_{11}&v_{12}&v_{13}\\ v_{21}&v_{22}&v_{23}\\ v_{31}&v_{32}&v_{33}\end{bmatrix}.\tag{4.3}$$
If APD^{50}, APD^{70} and APD^{90} are not perfectly correlated, then the matrix *C* is invertible. In that case, a rearrangement of equation (4.2) leads to
$$\mathbf{z}_n=D\cdot\mathbf{z}_{n-1},\tag{4.4}$$
where
$$D=C\cdot\Lambda\cdot C^{-1},\qquad \Lambda=\mathrm{diag}(\lambda_1,\lambda_2,\lambda_3).\tag{4.5}$$
By definition of similar matrices, the eigenvalues of matrix *D* are *λ*_{1}, *λ*_{2} and *λ*_{3}. Thus, an alternative task is to estimate the matrix *D* using a sequence of measurements $\mathbf{z}_1,\ldots,\mathbf{z}_N$. To do so, we introduce the following matrices
$$Y_1=[\mathbf{z}_1,\mathbf{z}_2,\ldots,\mathbf{z}_{N-1}]\tag{4.6}$$
and
$$Y_2=[\mathbf{z}_2,\mathbf{z}_3,\ldots,\mathbf{z}_N].\tag{4.7}$$
It follows from equation (4.4) that *Y*_{2}=*D*⋅*Y*_{1}. This matrix equation is the same as in least-squares regression, so the solution for *D* is
$$D=Y_2\cdot Y_1'\,(Y_1\cdot Y_1')^{-1}.\tag{4.8}$$
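A minimal numerical sketch of this least-squares recipe, using made-up eigenvalues and a random (hypothetical) coefficient matrix *C*: build noise-free deviation vectors **z**_{n}, form *Y*_{1} and *Y*_{2} as in equations (4.6) and (4.7), solve for *D* as in equation (4.8), and recover the eigenvalues of *D*.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up dominant eigenvalues and a random (hypothetical) coefficient
# matrix C playing the role of the v_ij in equation (4.3); m = r = 3.
lam_true = np.array([-0.85, 0.4, 0.1])
C = rng.standard_normal((3, 3))

# Noise-free deviation vectors z_n = C * lambda^n for beats n = 1..25.
z = np.array([C @ lam_true**n for n in range(1, 26)]).T   # columns z_1..z_25

Y1, Y2 = z[:, :-1], z[:, 1:]                    # equations (4.6) and (4.7)
D = Y2 @ Y1.T @ np.linalg.inv(Y1 @ Y1.T)        # equation (4.8)
est = np.sort(np.linalg.eigvals(D).real)        # eigenvalues of D
```

In this noise-free setting the recovered eigenvalues match the inputs essentially to machine precision; the point of the statistical machinery later in the section is to cope with the noisy case.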

A similar empirical technique has been discussed in Mann & Young (2006), who apply the analysis to cutting machines, and in Bayly & Virgin (1993), with applications to a simple spring-forced pendulum. In general, the number of dominant eigenvalues and their values must be estimated simultaneously, and this is where PCA (or Karhunen–Loève decomposition, as Bayly & Virgin (1993) explore) can be useful. To begin, assume that we have collected time-series data on *r* variables *y*^{(1)},…,*y*^{(r)}, where *r* exceeds the number of dominant eigenvalues *m*. For instance, we may have collected six different APD values, e.g. APD^{90}, APD^{80}, …, and APD^{40}, for more than six consecutive beats. We formulate a matrix using the collected data as follows:
$$Z=\begin{bmatrix}y^{(1)}_1-y^{(1)}_{*}&\cdots&y^{(r)}_1-y^{(r)}_{*}\\ \vdots& &\vdots\\ y^{(1)}_N-y^{(1)}_{*}&\cdots&y^{(r)}_N-y^{(r)}_{*}\end{bmatrix},\tag{4.9}$$
where row *n* holds the deviations of the *r* measurements from their steady-state values on the *n*th beat.

As in PCA, we decompose the data matrix *Z* using the singular value decomposition
$$Z=U\cdot S\cdot V',\tag{4.10}$$
where *S* is a diagonal matrix of singular values and both *U* and *V* are unitary matrices, i.e. *U*′⋅*U*=*I* and *V*′⋅*V*=*I*. If measurement errors are small compared with the signal contributed by the dominant eigenvalues, then the number of significant principal components in this decomposition is determined by the number of dominant eigenvalues; *Z* will then have *m*<*r* principal components. Keeping only the dominant components of *Z* yields the score matrix $\hat{Z}=U_m\cdot S_m$ (the first *m* columns of *U* scaled by the first *m* singular values), where $\hat{Z}$ has *N* rows and *m* columns. We denote the *i*th row of $\hat{Z}$ by $\hat{\mathbf{z}}_i'$ and let $\hat{Y}_1=[\hat{\mathbf{z}}_1,\ldots,\hat{\mathbf{z}}_{N-1}]$ and $\hat{Y}_2=[\hat{\mathbf{z}}_2,\ldots,\hat{\mathbf{z}}_N]$. Then, one can formulate a *D* matrix according to equation (4.8). The eigenvalues of *D* are approximately the dominant eigenvalues of the system.
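The SVD-truncation step can be sketched as follows, with *r* = 6 synthetic series driven by *m* = 2 made-up eigenvalues plus a tiny amount of measurement noise; the dominant eigenvalues are then recovered from the reduced score matrix.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: r = 6 measured series driven by m = 2 made-up eigenvalues.
m, r, N = 2, 6, 40
lam_true = np.array([-0.8, 0.35])
V = rng.standard_normal((r, m))                 # hypothetical projected vectors
Z = np.array([V @ lam_true**n for n in range(1, N + 1)])  # N x r deviations
Z += 1e-6 * rng.standard_normal(Z.shape)        # tiny measurement noise

# Singular value decomposition of the data matrix, as in equation (4.10).
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

# Two singular values dominate; keep the m-dimensional score matrix.
Zhat = U[:, :m] * s[:m]                         # N x m
Y1, Y2 = Zhat[:-1].T, Zhat[1:].T                # lagged score matrices
D = Y2 @ Y1.T @ np.linalg.inv(Y1 @ Y1.T)        # least squares, as in (4.8)
est = np.sort(np.linalg.eigvals(D).real)        # recovered dominant eigenvalues
```

Because the scores are an invertible linear transformation of the dominant modes, the reduced dynamics share the eigenvalues of the full system, which is why the same *D*-matrix recipe applies after truncation.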

### (b) Maximum-likelihood estimate approach

The techniques in Bayly & Virgin (1993) and Mann & Young (2006) and the principal component approach do not provide standard errors of the estimated eigenvalues. To obtain standard errors, we must assume some model for the measurement and misspecification error *ξ*_{n} in equation (2.5). Using maximum-likelihood techniques, we can both estimate each unknown parameter and obtain approximate standard errors by considering the Hessian of the likelihood function.

Because we imagine there will be substantial autocorrelation in the misspecification errors, we assume that they follow some stationary multivariate vector autoregressive (VAR) structure. The simplest model, in which the current error depends only on the one preceding it, is a VAR(1) model:
$$\boldsymbol{\xi}_n=A\cdot\boldsymbol{\xi}_{n-1}+\boldsymbol{\epsilon}_n\tag{4.11}$$
for some matrix *A*. Here, *ϵ*_{n} is a sequence of independent multivariate normals with mean zero and unknown covariance matrix *Σ*. Although a more complicated VAR model may be appropriate for other applications, we show in §5 that a VAR(1) is sufficient for our purposes.

Let the vector *λ*=[*λ*_{1},*λ*_{2},…,*λ*_{m}]′ be an *m*×1 column vector. Let *λ*^{n} be the vector in which each entry of *λ* is raised to the *n*th power, i.e. *λ*^{n}=[*λ*_{1}^{n},…,*λ*_{m}^{n}]′. Finally, let *V* be an *r*×*m* matrix of eigenvectors with column *i* being the column vector **v**_{i}. Then, we can rewrite equation (2.5) to model **y**_{n} as
$$\mathbf{y}_n=\mathbf{y}_{*}+V\cdot\boldsymbol{\lambda}^n+\boldsymbol{\xi}_n,\qquad \boldsymbol{\xi}_n=A\cdot\boldsymbol{\xi}_{n-1}+\boldsymbol{\epsilon}_n.\tag{4.12}$$
The likelihood of the time series of measurements under the VAR(1) model can be written as
$$L=f(\mathbf{y}_1)\prod_{i=2}^{N}f(\mathbf{y}_i\,|\,\mathbf{y}_{i-1}).\tag{4.13}$$
Thus, we must find an expression for the conditional distributions *f*(**y**_{i}|**y**_{i−1}) and the unconditional distribution *f*(**y**_{1}). Using equation (4.12), we can rewrite **y**_{i} as a function of **y**_{i−1}:
$$\mathbf{y}_i=(I-A)\mathbf{y}_{*}+A\,\mathbf{y}_{i-1}+V\cdot\boldsymbol{\lambda}^i-A\cdot V\cdot\boldsymbol{\lambda}^{i-1}+\boldsymbol{\epsilon}_i.\tag{4.14}$$
We see that *f*(**y**_{i}|**y**_{i−1}) is a multivariate normal with mean vector (*I*−*A*)**y**_{*}+*A***y**_{i−1}+*V*⋅*λ*^{i}−*A*⋅*V*⋅*λ*^{i−1} and covariance matrix *Σ*.

The unconditional distribution of **y**_{1} unfortunately cannot be written in closed form. While **y**_{1} is some multivariate normal with mean vector *E*[**y**_{1}]=**y**_{*}+*V*⋅*λ* (our model takes *E*[*ξ*_{i}]=**0**), its covariance matrix *C* cannot be written in closed form. To see this, write *C*=cov(**y**_{1})=cov(*ξ*_{1}). Because *ξ*_{1}=*A*⋅*ξ*_{0}+*ϵ*_{1}, we have that cov(*ξ*_{1})=*A*⋅cov(*ξ*_{0})⋅*A*′+*Σ*. Because the process *ξ* is stationary, cov(*ξ*_{1})=cov(*ξ*_{0}), so the covariance matrix of the distribution of **y**_{1} solves the equation *C*=*A*⋅*C*⋅*A*′+*Σ*. Only if *r*=1 (meaning *A*, *C* and *Σ* are scalars) do we find a closed-form expression for *C*, namely the familiar AR(1) result *C*=*Σ*/(1−*A*^{2}).
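Although *C*=*A*⋅*C*⋅*A*′+*Σ* has no closed form for *r*>1, it is a discrete Lyapunov equation and is easy to solve numerically by vectorization, since vec(*C*)=(*I*−*A*⊗*A*)^{−1}vec(*Σ*). A small sketch with made-up *A* and *Σ*:

```python
import numpy as np

# Solve the stationary covariance equation C = A C A' + Sigma of the VAR(1)
# error process by vectorization: vec(C) = (I - kron(A, A))^{-1} vec(Sigma).
# A and Sigma below are made-up values for an r = 2 system.
A = np.array([[0.5, 0.1],
              [0.0, 0.3]])
Sigma = np.array([[1.0, 0.2],
                  [0.2, 0.5]])

r = A.shape[0]
vecC = np.linalg.solve(np.eye(r * r) - np.kron(A, A), Sigma.reshape(-1))
C = vecC.reshape(r, r)
# In the scalar case (r = 1), with Sigma the noise variance, this reduces to
# the familiar C = Sigma / (1 - A**2).
```

The solve succeeds whenever the VAR(1) process is stationary, i.e. when all eigenvalues of *A* lie inside the unit circle.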

For simplicity, we condition on **y**_{1} and maximize the conditional log-likelihood
$$\log L=\sum_{i=2}^{N}\log f(\mathbf{y}_i\,|\,\mathbf{y}_{i-1}).\tag{4.15}$$
An exact likelihood for the VAR(1) model alone (and for more complicated models of the measurement and misspecification error) can be found in Mauricio (1995).

#### (i) Maximizing the likelihood

The log-likelihood in equation (4.15) can be maximized using many standard optimization techniques. We exploit the hierarchical structure of our model as written in equation (4.12) to find maximum-likelihood estimates of each parameter using an expectation–maximization type algorithm (McLachlan & Thriyambakam 1996) treating each *ξ*_{i} as ‘missing data’.

First, we take *A*=0 and *Σ*=*I* and use a trust-region method (Nocedal & Wright 1999) to find the maximum-likelihood estimates of **y**_{*}, *V* and *λ* (in other words, initially we take each *ξ*_{i} to be an independent standard multivariate normal random variable). The residuals (i.e. the differences between **y**_{i} and the fitted values **y**_{*}+*V*⋅*λ*^{i}) give estimates of each *ξ*_{i}. Using this estimated series of *ξ*_{i}, we then find maximum-likelihood estimates of *A* and *Σ*. Conditioning on these estimates of *A* and *Σ*, we update our estimates of **y**_{*}, *V* and *λ* using the second half of equation (4.12). Finally, we update the estimates of *A* and *Σ* by finding the maximum-likelihood estimates from the updated series of *ξ*_{i}. We repeat this process until the sequence of estimates converges, which we take to be when the difference in the estimated eigenvalues in sequential iterations is less than 5×10^{−5}.
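The first pass of this iteration can be sketched in a simplified scalar setting (*r*=*m*=1): fit **y**_{*}, *v* and *λ* assuming white errors (profiling over *λ* on a grid, since the model is linear in the other two parameters once *λ* is fixed), then estimate the AR(1) coefficient from the residuals. All 'true' parameter values below are invented, and a full implementation would iterate these steps to convergence as described above.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simplified scalar setting: one measured series (r = 1), one dominant
# eigenvalue (m = 1), AR(1) errors. All 'true' values below are invented.
y_star, v, lam, a, s = 210.0, 12.0, -0.75, 0.4, 0.05
N = 200
xi, y = 0.0, np.empty(N)
for n in range(1, N + 1):
    xi = a * xi + s * rng.standard_normal()   # xi_n = a*xi_{n-1} + eps_n
    y[n - 1] = y_star + v * lam**n + xi       # first half of equation (4.12)

n_idx = np.arange(1, N + 1)

def fit_given_lam(l, w):
    # For fixed lambda, y_n = y_* + v*lam^n is linear in (y_*, v), so the
    # profile fit reduces to ordinary least squares.
    X = np.column_stack([np.ones(N), l**n_idx])
    coef, *_ = np.linalg.lstsq(X, w, rcond=None)
    return coef, np.sum((w - X @ coef) ** 2)

# Step 1: with A = 0 (white errors), profile over a grid of candidate lambdas.
grid = np.linspace(-0.99, -0.01, 99)
lam_hat = grid[int(np.argmin([fit_given_lam(l, y)[1] for l in grid]))]
(c0, c1), _ = fit_given_lam(lam_hat, y)

# Step 2: the residuals estimate xi_n; fit the AR(1) coefficient to them.
xi_hat = y - (c0 + c1 * lam_hat**n_idx)
a_hat = np.sum(xi_hat[1:] * xi_hat[:-1]) / np.sum(xi_hat[:-1] ** 2)
```

The grid search is used here purely for transparency; a gradient-based optimizer such as the trust-region method cited above would serve the same role in a production implementation.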

To check the appropriateness of our model (equation (4.12)), we examine the residuals, i.e. prediction errors. If the residuals are free of any correlation structure (implying that the VAR(1) model for misspecification and measurement error is adequate), if the variance of the residuals is a constant for the duration of the time series (implying that *Σ* does not change over the course of the series), and if the residuals are normally distributed, then equation (4.12) provides a good description of the time series under analysis.

#### (ii) Standard errors

Once the maximum-likelihood estimates are obtained (and the residuals pass all tests), approximate standard errors for each parameter can be found by calculating the Hessian matrix, whose components are the second partial derivatives of the log-likelihood function. The negative of the Hessian is the observed, sample-based version of the Fisher information matrix, and asymptotic theory states that the approximate standard error for a parameter is the square root of the corresponding diagonal element of the inverse of this observed information matrix. Numerical methods provide accurate values for these derivatives should analytical expressions not exist.

However, because our model in equation (4.12) is still most likely misspecified (the likelihood is maximized but the underlying model is still incorrect owing to the approximations we have taken), the estimated standard errors from the Hessian may be inaccurate. One approach to find a standard error that is robust against misspecification is to use a ‘Huber sandwich estimator’ (Freedman 2006), though we prefer to obtain standard errors through a parametric bootstrap (Efron & Tibshirani 1993).

For a parametric bootstrap, we start with the maximum-likelihood estimates of **y**_{*}, *V*, *λ*, *A* and *Σ*. Using these, we can create a 'new' multivariate time series using equation (4.12) by taking *ξ*_{0}=**0** and by letting each *ϵ*_{i} be a random multivariate normal with mean vector zero and the estimated covariance matrix. We then find the maximum-likelihood estimates of each of the parameters for this bootstrapped series, and repeat the process numerous times to get the approximate joint sampling distribution of the estimators. The approximate standard error of a parameter is the standard deviation of its marginal sampling distribution found via these bootstrap series (we use 200). In our analysis, we find that these bootstrap standard errors are quite close to the ones calculated from the Hessian.
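The parametric bootstrap can be sketched as follows in the same simplified scalar setting. For brevity, the simulated series use white noise rather than the VAR(1) errors of equation (4.12), the fitted parameter values are invented, and the estimator profiles the likelihood over a grid of candidate eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(4)

# Fitted values from a (hypothetical) maximum-likelihood fit. For brevity the
# simulated errors are white noise, not the VAR(1) errors of equation (4.12).
y_star, v, lam, sigma, N, B = 210.0, 12.0, -0.75, 2.0, 60, 100
n_idx = np.arange(1, N + 1)

def fit_lam(y):
    # Profile fit: for each candidate lambda, solve the linear LS in (y_*, v)
    # and keep the lambda with the smallest residual sum of squares.
    best, best_sse = None, np.inf
    for l in np.linspace(-0.99, -0.01, 99):
        X = np.column_stack([np.ones(N), l**n_idx])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = np.sum((y - X @ coef) ** 2)
        if sse < best_sse:
            best, best_sse = l, sse
    return best

# Generate B 'new' series from the fitted model and re-estimate lambda in each.
boot = np.array([
    fit_lam(y_star + v * lam**n_idx + sigma * rng.standard_normal(N))
    for _ in range(B)
])
se_lam = boot.std(ddof=1)   # bootstrap standard error of the eigenvalue
```

The spread of the bootstrap replicates around the fitted value is exactly the marginal sampling distribution described above, and its standard deviation is the reported standard error.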

#### (iii) Choosing a model

Because the number of dominant eigenvalues is unknown, it must also be estimated from the data. A likelihood ratio test provides a convenient way to choose between models with *m* and *m*+1 eigenvalues. Standard asymptotic results show that twice the difference in maximized log-likelihoods, 2(log*L*_{m+1}−log*L*_{m}), has a *χ*^{2} distribution with degrees of freedom equal to the difference in the number of free parameters in the two models (in our case *r*+1). For example, a model with two eigenvalues would be preferred to a model with a single eigenvalue if 2(log*L*_{2}−log*L*_{1}) exceeds 15.5 (taking *r*=7) using standard decision criteria.

Another common method to choose between multiple models is Akaike's information criterion (AIC). In general, the AIC is defined to be 2*q*−2 log*L*, where log*L* is the maximized log-likelihood and *q* is the number of free parameters in the model (in our case *q*=*m*+*rm*+*r*^{2}+*r*(*r*+1)/2, corresponding to the number of unknown eigenvalues, elements of *V*, elements of *A*, and elements of *Σ*). The model with the lowest AIC value is the one chosen. In effect, the AIC selects the model that best explains the data (maximizes the likelihood) with a minimum number of free parameters.

An alternative method to choose between multiple models is the Schwarz or Bayesian information criterion (BIC), which penalizes free parameters more harshly than the AIC: the BIC is defined to be *q* log*N*−2 log*L*, where *N* is the length of the time series. Not surprisingly, the BIC chooses more parsimonious models.

Our preferred strategy for choosing a model is to fit models with *m*=1,2,… dominant eigenvalues and to choose the one with the lowest BIC, though in our experiments this yields the same model as the one chosen by likelihood ratio tests.
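The model-selection arithmetic can be sketched directly; the log-likelihood values below are invented, and the parameter count follows the expression for *q* given above.

```python
import numpy as np

# BIC-based choice of the number of dominant eigenvalues m, for r = 7
# measured series and N = 18 beats; the log-likelihoods are invented.
r, N = 7, 18
loglik = {1: -250.0, 2: -180.0, 3: -176.0}

def n_params(m, r):
    # q = m + r*m + r^2 + r(r+1)/2: eigenvalues, V, A and Sigma entries.
    return m + r * m + r**2 + r * (r + 1) // 2

bic = {m: n_params(m, r) * np.log(N) - 2.0 * ll for m, ll in loglik.items()}
best_m = min(bic, key=bic.get)   # the model with the lowest BIC
```

With these (hypothetical) numbers, the large likelihood gain from one to two eigenvalues outweighs the parameter penalty, while the small gain from two to three does not.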

## 5. Experimental results: cardiac membrane model

Numerous models of cardiac membranes have been developed for different species and cell types (Fenton & Cherry 2008). We adopt the Shiferaw–Fox model (Fox *et al.* 2002; Shiferaw *et al.* 2003) for numerical studies because it integrates a more detailed description of ionic activities and intracellular calcium handling than historic models. Further, alternans of the Shiferaw–Fox model has been studied by numerous authors (Sato *et al.* 2006; Weiss *et al.* 2006; Zhao 2008; Tolkacheva & Zhao 2012). This model is based on canine physiology; resting heart rates of dogs are between 60 and 160 beats per minute. Future work will extend the demonstration of our technique to other animals and humans.

The Shiferaw–Fox model contains 16 state variables, is highly nonlinear, and is known to exhibit alternans. Figure 3*a* shows data generated from the model for the variable APD_{90}. In this model, the BCL is gradually decreased from 520 to 395 ms in steps of 25 ms. When the BCL is changed, the heart undergoes some transient dynamics (spikes in the APD_{90} values) and then settles into steady state. We analyse the beats near steady state (e.g. the last 18 beats at pacing period of 520 ms, the last 18 beats at pacing period of 495 ms, etc.). Figure 3*b* shows the steady-state APD values at each BCL, revealing the transition to alternans (cf. figure 2) that occurs at a BCL of 397 ms.

Instead of deriving the variational equation analytically, we numerically calculate the Poincaré map using finite difference methods as described in Nayfeh & Balachandran (1995). The most dominant negative eigenvalue of this Poincaré map is then used to test the estimates from the PCA and MLE methods.
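The finite-difference approach can be sketched as follows. This is only an illustrative toy: the two-dimensional linear map and its fixed point stand in for the 16-variable Shiferaw–Fox Poincaré map, which is far too large for a short example, and the step size `eps` is an arbitrary choice.

```python
import numpy as np

def jacobian_fd(P, x_star, eps=1e-6):
    """Finite-difference Jacobian of a Poincare map P at a fixed point x_star."""
    n = len(x_star)
    J = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        # Central difference: perturb one state variable, map once, compare
        J[:, j] = (P(x_star + dx) - P(x_star - dx)) / (2 * eps)
    return J

# Toy linear map standing in for the cardiac Poincare map; its fixed
# point is the origin and its true eigenvalues are -0.9 and 0.3
A = np.array([[-0.9, 0.1],
              [0.0,  0.3]])
P = lambda x: A @ x
x_star = np.zeros(2)

eigs = np.linalg.eigvals(jacobian_fd(P, x_star))
dominant = eigs[np.argmax(np.abs(eigs))]  # approaching -1 would signal alternans
```

For the cardiac model, `P` would be one period of the paced dynamics (integrating the 16 ODEs over one BCL), and the recovered dominant negative eigenvalue serves as the ground truth against which the PCA and MLE estimates are compared.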

By varying its parameters, the Shiferaw–Fox model can describe different mechanisms of alternans: voltage-driven, positive Ca to APD coupling and negative Ca to APD coupling (Sato *et al.* 2006; Zhao 2008; Tolkacheva & Zhao 2012; Lemay *et al.* 2012). Our first model analyses time series from negative Ca to APD coupling (where alternans occurs at a BCL of 397 ms), while the second looks at time series from positive Ca to APD coupling (where alternans occurs at a BCL of 315 ms). To find eigenvalues using the MLE method, we fit each BCL period with models having up to four eigenvalues and choose the one with the lowest BIC. In all cases, models with two eigenvalues are picked (in agreement with a likelihood ratio test). We validate our model (equation (4.12)) by checking the residuals of the fit once the maximum-likelihood estimates have been obtained.

### (a) Model 1 (negative Ca to action potential duration coupling, alternans at 397 ms)

Eigenvalues are estimated at BCLs from 520 down to 420 ms in steps of 25 ms. A total of 18 sequential beats is analysed at each BCL using a set of seven APDs. With the exception of BCL=520, the autocorrelation and partial autocorrelation plots of the residuals yield no hint of any correlation structure not captured in the VAR(1) model, and indeed the residuals pass the Durbin–Watson test for (first-order) autocorrelation. At BCL=520, there is a minor hint of additional negative autocorrelation in the residuals, with a Durbin–Watson *p*-value of about 0.015.

Further, with the exception of BCL=495, time plots of the residuals show no obvious indication that the variance of the residuals changes as time progresses, and each series passes the Breusch–Pagan test for equal variance. For BCL=495, the residuals at the second beat are particularly large, which causes the equal-variance test to fail with a *p*-value of about 0.025. In all cases, the residuals pass a standard Kolmogorov–Smirnov test for normality. Thus, with very minor exceptions, our model in equation (4.12) appears to be adequate. We present estimated eigenvalues and standard errors in table 1 using the PCA and MLE approaches.
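Two of the residual diagnostics used here are easy to reproduce. The sketch below is a generic illustration on stand-in residuals, not the paper's data: the Durbin–Watson statistic is computed from its definition, and SciPy's one-sample Kolmogorov–Smirnov test checks normality.

```python
import numpy as np
from scipy import stats

def durbin_watson(res):
    # DW statistic: sum of squared successive differences over the sum of
    # squares; values near 2 indicate no first-order autocorrelation
    d = np.diff(res)
    return np.sum(d**2) / np.sum(res**2)

# Stand-in residuals: exact standard-normal quantiles in a seeded random
# order, mimicking well-behaved residuals from the VAR(1) fit
res = stats.norm.ppf((np.arange(1, 201) - 0.5) / 200)
np.random.default_rng(0).shuffle(res)

dw = durbin_watson(res)                  # close to 2 -> no autocorrelation
ks_p = stats.kstest(res, 'norm').pvalue  # high p -> normality not rejected
```

In practice one would run these checks on each column of the VAR(1) residual matrix, alongside the Breusch–Pagan test and the autocorrelation plots mentioned above.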

The estimated eigenvalues from both approaches show very good agreement with the actual eigenvalues. With the exception of BCL=445, the MLE approach yields estimates within about one standard error of the actual value (though all but one show a negative bias). For BCL=445, the MLE is still within 1 per cent of the true value, but the estimated standard error is small enough for the difference to be statistically significant. The eigenvalues found using the PCA method are positively biased, though typically by only a small amount.

### (b) Model 2 (positive Ca to action potential duration coupling, alternans at 315 ms)

Eigenvalues are estimated at BCLs from 430 down to 330 ms in steps of 25 ms using a set of seven APDs. For BCLs of 380–430 ms, a total of nine sequential beats is analysed, while for BCLs of 355 and 330 ms a total of 17 sequential beats is analysed. In all cases, the residuals pass the Durbin–Watson test for autocorrelation, the Breusch–Pagan test for equal variance and the Kolmogorov–Smirnov test for normality, though this may be due to the relatively small number of beats analysed. Each estimated eigenvalue from the MLE method is within a standard error of the true value. The PCA estimates still show a small positive bias (table 1).

### (c) Discussion: standard errors and predicting alternans

Using either the PCA or the MLE approach yields reliable estimates of the most relevant negative eigenvalue of the Shiferaw–Fox model. Because the PCA approach is computationally much less intensive, it should be preferred unless standard errors are desired.

For our analysis, having standard errors is critical: it is essential to know how close the dominant negative eigenvalue is to −1, because alternans occurs at this value. Over the range of BCLs considered in the cardiac model, we see that there is no danger of alternans occurring, something of which we could not have been confident had standard errors not been available (e.g. if we had only point estimates from PCA).

Having standard errors is also useful when predicting the BCL at which alternans should occur, because they allow more appropriate confidence intervals. We can estimate when alternans occurs by fitting the predicted eigenvalues to BCL using a linear regression, then finding the BCL at which the eigenvalue is predicted to be −1. Confidence intervals for the estimate can be obtained by using a parametric bootstrap. For the PCA-based estimates, we add to the estimated eigenvalues a random normal amount of noise (with standard deviation equal to the residual standard deviation of the original linear fit), make a new linear regression, and re-predict the BCL at which the eigenvalue would equal −1. The limits of the confidence interval are the 0.025 and 0.975 quantiles of the predictions after completing this procedure 1000 times. For the MLE-based estimates, the amount of noise added is normally distributed with a standard deviation equal to the standard errors presented in table 1. Thus, knowledge of the standard errors is implicitly used in the construction of the confidence interval for the MLE method.
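The PCA-flavoured bootstrap described above can be sketched as follows. The eigenvalue values below are hypothetical, not those of table 1; the MLE variant would differ only in drawing each perturbation with the per-BCL standard error instead of the residual standard deviation.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical estimated dominant eigenvalues at each BCL (ms); in the
# paper these would come from the PCA fits (cf. table 1)
bcl = np.array([520., 495., 470., 445., 420.])
lam = np.array([-0.42, -0.50, -0.57, -0.66, -0.73])

def bcl_at_minus_one(bcl, lam):
    # Linear fit lam = a + b*bcl, then solve lam = -1 for bcl
    b, a = np.polyfit(bcl, lam, 1)
    return (-1.0 - a) / b

point = bcl_at_minus_one(bcl, lam)  # point prediction of alternans onset

# Parametric bootstrap: perturb the eigenvalues by the residual standard
# deviation of the original fit, refit, and re-predict the onset BCL
resid_sd = np.std(lam - np.polyval(np.polyfit(bcl, lam, 1), bcl))
boot = [bcl_at_minus_one(bcl, lam + rng.normal(0.0, resid_sd, lam.size))
        for _ in range(1000)]
lo, hi = np.quantile(boot, [0.025, 0.975])  # 95% confidence interval
```

The 0.025 and 0.975 quantiles of the 1000 re-predictions form the 95% confidence interval, exactly as in the procedure described above.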

For model 1 (where alternans occurs at 397 ms), PCA predicts alternans to occur at a BCL of 394.4 ms (95% CI: 386.1–401.3 ms), whereas MLE predicts alternans to occur at a BCL of 397.8 ms (95% CI: 393.0–401.8 ms). For model 2 (where alternans occurs at 315 ms), PCA predicts alternans to occur at a BCL of 312.6 ms (95% CI: 309.2–315.6 ms), whereas MLE predicts alternans to occur at a BCL of 314.1 ms (95% CI: 309.4–318.1 ms). We note that the MLE estimates are closer to the true values, and for model 1 the width of the confidence interval is significantly smaller than that of PCA.

## 6. Conclusion and further considerations

We have presented two procedures to analyse the stability of a dynamical system by estimating its dominant eigenvalues from a time series of measurable quantities. The procedures implemented use principal components and maximum-likelihood methods. The latter has not been previously considered in the literature and has the added advantage of providing estimated standard errors as well. A major advantage of both of these techniques is that the analysis does not require specification of the underlying dynamical model nor the state vector, so they are appropriate for complex dynamical systems whose exact analysis would be intractable.

To illustrate these techniques, we have considered the problem of predicting cardiac alternans based on a time series of action potentials of paced cardiac tissue. We construct a pseudo state vector with these measurements and treat it as an approximate projection of the true state vector of the (unknown) dynamical model of the heart. The resulting system of equations describing the time evolution of the measured quantities allows us to estimate the dominant eigenvalues of the system and to predict at what pacing period alternans may arise (when an eigenvalue becomes −1). We note that the techniques here are based on a local stability analysis that does not directly infer global stability issues.

Because we assume that only a few eigenvalues are dominant, that the projection from the true state vector to our pseudo-state vector of measurements is linear, and that the evolution function is only approximately linear near steady state, the system of equations we use only approximates the true time evolution of the action potentials. Taking the errors introduced by this ‘model misspecification’ and any additional measurement error as realizations from a multivariate vector autoregressive process, we maximize the likelihood of the data and find both estimates and standard errors of the eigenvalues. These eigenvalues, along with those found using a PCA, show excellent agreement when analysing artificial data created by a mathematical model of cardiac membrane dynamics. Further, the BCL at which alternans is expected to occur is predicted quite accurately.

Although our application has been cardiac alternans, a similar line of analysis can be applied to any complex dynamical system whose eigenvalues must be estimated from a time series of data. The quality of the estimates and standard errors may vary between applications and depend on which quantities are measured, but we believe our approaches have great potential.

One avenue for improving these techniques is to focus on more efficient ways to maximize the likelihood function once the appropriate error model has been specified, as well as to consider a richer category of error models besides a vector autoregressive process of order one. Although this error model looks to be quite suitable for our data, in other applications a higher-order model or one that incorporates moving average terms or non-normal errors may be necessary.

## Acknowledgements

This work was supported in part by the NSF under grant no. CMMI-0845753.

- Received February 13, 2012.
- Accepted June 6, 2012.

- This journal is © 2012 The Royal Society