## Abstract

Model reduction for complex systems is a rather active area of research. For many real-world systems, constructing an accurate reduced model is prohibitively expensive. The main difficulty stems from the tremendous range of spatial and temporal scales present in the solution of such systems. This leads to the need to develop reduced models where, inevitably, the resolved variables do *not* exhibit (spatial and/or temporal) scale separation from the unresolved ones. We present a brief survey of recent results on the construction of Mori–Zwanzig-reduced models for such systems. The construction is inspired by the concepts of scale dependence and renormalization which first appeared in the context of high-energy and statistical physics.

## 1. Introduction

The advent of powerful computers has enhanced our ability to study complex systems and understand their dynamics, yet there are many problems that are too complex for computer solution. The task facing the numerical analyst working on such problems is to reduce their complexity to something manageable, yet preserve, perhaps only in a statistical sense, their salient features. Atmospheric flows and, more generally, flows at very high Reynolds numbers provide a prime example of a problem where direct solution is impossible, and some simplification is needed. The representation of the solution involves activity at a range of scales which is much larger than the range of scales that can be presently (or in the foreseeable future) handled by direct numerical simulations. When the solution develops activity at a scale smaller than the smallest scale available to the simulation, the numerically computed solution becomes under-resolved. This leads to a rapid deterioration of the accuracy of the simulation.

The notion of propagation of activity to smaller and smaller scales depends on the physical context of the partial differential equation (PDE). In some cases, e.g. the three-dimensional Euler or Navier–Stokes equations [1], this could mean a cascade of energy to smaller and smaller scales. In other cases, e.g. the nonlinear focusing Schrödinger equation [2], this could mean a cascade of mass to smaller and smaller scales. Irrespective of the specific physical context, the problem is how to use a simulation with smaller resolution than the one needed and yet prevent the computed solution from suffering a loss of accuracy. In other words, how to construct a numerical method which reproduces correctly the features of the solution of the original equation at the length scales that are available numerically. This is the motivation behind the construction of reduced models [3,4].

By construction, a reduced model must allow for energy (mass) to escape from the scales that are accessible to the simulation (called resolved scales or modes) to the inaccessible scales (called unresolved). The main difficulty in constructing an accurate reduced model is the need to estimate the *correct* rate at which activity is propagated from the resolved to the unresolved scales.

For problems which exhibit simplifying features such as time- and/or length-scale separation the construction of reduced models is easier. This simplifying assumption lies behind several methods that have been proposed in the past 20 years, e.g. the inertial manifold method [5], the stochastic mode reduction method [6] and the equation-free (heterogeneous multi-scale) method [7,8]. However, for problems such as the prediction of atmospheric flows, the situation is considerably more complicated owing to the absence of (time- and/or length-) scale separation. The reason is that the absence of scale separation leads to long memory effects which are difficult to model and rather expensive to compute.

The Mori–Zwanzig (MZ) formalism originally developed in the context of irreversible statistical mechanics [9,10] is a formalism that allows the construction, in principle, of *exact* reduced models. The MZ formalism was reformulated and applied to the construction of reduced models for general systems of ordinary differential equations (ODEs) in [11–14]. The formalism proceeds by dividing the *available* resolution into resolved and unresolved parts. Then, it constructs a reduced model for the resolved part and uses the unresolved part to effect the drain of energy (mass) out of the resolved part.

Although the MZ formalism allows for the construction, in principle, of an exact reduced model it has two drawbacks (which are also shared by *any* other reduction formalism). First, the reduced model can be, in general, prohibitively expensive to calculate. The reason is that one must obtain an accurate representation of the behaviour of the unresolved scales before they can be safely eliminated. Obtaining this representation can be rather costly.

The second drawback is more subtle and has not been adequately appreciated by the scientific computing community. It is specific to the case of constructing reduced models for systems of differential equations which are larger than any available numerical resolution. Suppose that you have to construct a reduced model of a full system which is larger than any available numerical simulation. Let us call this system S1. Exactly because S1 is larger than any available numerical simulation, if we want to construct a reduced model we have to use as a starting point a system, call it S2, whose size is smaller than the size of S1. Suppose that you start with S2 and use the MZ formalism (or any other reduction formalism for that matter) and construct an *exact* reduced model S3 for a subset of S2. An exact reduced model means that if one evolves S2 and S3 separately, then the behaviour of the scales resolved by S3 will be the same as the behaviour of the scales in S3 predicted by the simulation of the system S2. However, and this is the heart of the problem, because S2 itself will become eventually under-resolved, the exact reduced model S3 will also become under-resolved. In other words, the predictions of the exact reduced model S3 can only be trusted for as long as the predictions of the system S2 can be trusted. As a result, *any reduced model that has any chance of being accurate for longer times cannot be exact.*

There are examples of *inexact* reduced models, e.g. the *t*-model, coming from the MZ formalism. These have been applied to various systems including singular (or near-singular) time-dependent PDEs [12,15–19]. The models have been shown numerically to be relatively accurate for long time intervals. However, the *t*-model's accuracy is difficult to assess beforehand, and the reason for its relative success has remained a mystery. In order to construct better reduced models, we need to incorporate dynamic information from the full system which will help us decide which of the terms appearing in the exact reduced model are the ones that are most important. In this way, we can construct an inexact but accurate reduced model by keeping the important terms and disregarding the unimportant ones.

One way to address the problem of constructing better reduced models is to embed the MZ-reduced models in a larger class of reduced models which share the same functional form as the MZ-reduced models but have different coefficients in front of the various terms that appear in the reduced models. Then, one can estimate these coefficients ‘on the fly’, whereas the original system of equations is still valid. ‘On the fly’ means that the computation of the coefficients is performed during the actual simulation of the system. One starts the simulation with the full system and uses it to estimate the appropriate values of the coefficients for the reduced model. This is done at a point in time *before* the full system becomes under-resolved. Exactly when this will happen is controlled by a user-specified tolerance. The coefficients are computed *once* and after that one switches to the simulation of the reduced model for the remaining time. ‘On the fly’ also means that the coefficients computed are the appropriate ones for the given initial condition. This means that, for a different initial condition, the coefficients are allowed to change.

The estimation of the coefficients is achieved by requiring that certain integral quantities (e.g. rates of change of *l*_{p} norms) involving only resolved scales should acquire the same values when computed from the original system and the reduced model [20]. The constraints used to obtain the coefficients are the analogue of the ‘matching conditions’ used in effective field theory [21]. Note that, in our case, the constraints are time dependent and this creates an extra complication as to when is the right time to estimate the coefficients from the constraints. This is addressed through a user-specified tolerance that monitors the transfer of activity from the resolved to the unresolved scales (for more details, see §3c). In addition, the approach is the time-dependent analogue of the process of renormalization used in high-energy and condensed matter physics [22]. Before the original system ceases to be valid, one reverts to the reduced model with the various coefficients having their estimated values [20]. We have called this approach the renormalized Mori–Zwanzig (rMZ) formalism (see [20] for more details).

## 2. The Mori–Zwanzig formalism

Here, we provide a brief presentation of the MZ formalism (see [4,11,12] and references therein for more details).

Suppose we are given the full system
*u*=({*u*_{k}}), *k*∈*F*∪*G* with initial condition *u*(0)=*u*_{0}. We have deliberately written the set of variables of the full system as the union of two sets, *F* and *G*. The reason for this is that we use the MZ formalism to construct a reduced model for the set of variables in *F* (called the resolved variables). The variables in *G* are called unresolved and are the variables of the full system which will be eliminated and accounted for by the different terms that appear in the reduced model (please see below).

The system of ODEs we are asked to solve can be transformed into a system of *linear* PDEs. The idea behind this transformation is that, while the system (2.1) evolves a specific initial condition *u*_{0}, the solution of the system of PDEs provides the solution for *all* possible initial conditions. The advantage of the PDE formulation is the linearity of the equations, which facilitates the use of projection operators (see below). After the reduction is performed on the level of the *linear* PDEs, one can extract the ordinary differential form of the reduced model. In particular, we have [11]
*u*_{k}(*u*_{0},*t*)=*ϕ*_{k}(*u*_{0},*t*). Using semigroup notation, we can rewrite (2.2) as
*P* be an orthogonal projection on the space of functions of *Q*=*I*−*P*. We delay the specification of the projection operator until §4, where we present applications to specific PDEs.

Equation (2.2) can be rewritten as
*u*_{k}, *k*∈*F*. The first term in (2.3) is usually called Markovian because it depends only on the values of the variables at the current instant, the second is called ‘noise’ and the third ‘memory’.

If we write
*w*_{k}(*u*_{0},*t*) satisfies the equation
*PQ*=0. In addition, for the initial condition
*P*. We call (2.5) the orthogonal dynamics equation. Because the solutions of the orthogonal dynamics equation remain orthogonal to the range of *P*, we can project the MZ equation (2.3) and find
*e*^{tQL}. In fact, it is the presence of this operator which makes, in general, the computation of MZ-reduced models prohibitively expensive (see Chorin & Stinis [4] for a thorough discussion).

### (a) A class of simplified models

One can start from (2.6) and based on assumptions derive simplified reduced models that are easier to calculate [12,15–17]. However, it has proven difficult to justify the assumptions underlying the derivation of the simplified models. In addition, it is hard to work with the expression for the memory term given in (2.6).

Through Dyson's formula (2.4) and the linearity of *e*^{tL}, the memory term can be written as
*I*=*P*+*Q* and the Baker–Campbell–Hausdorff (BCH) series for *e*^{−tL}*e*^{tQL} [23]. The BCH formula reads *e*^{−tL}*e*^{tQL}=*e*^{C(t,u0)}, where
*tL*,*tQL*]=−*tLtQL*−*tQL*(−*tL*). In addition, the last equality comes from noting that [−*tL*,*tQL*]=[*tL*,*tPL*]=[*tQL*,*tPL*]=−[*tPL*,*tQL*]. We find that
*tPL*. It is very helpful computationally if we can keep this term and discard the higher order ones because it involves only the projected dynamics (see also (2.9)). We want to examine when is the approximation *C*(*t*,*u*_{0})≈−*tPL* acceptable. From the BCH series, we have
*PL*,*QL*] may be small and thus allow the simplification of the memory term expression.

If we assume that [*PL*,*QL*]≈0 and thus *C*(*t*,*u*_{0})≈−*tPL*, then from (2.7) we obtain
*e*^{−tPL} in Taylor series around *t*=0 gives
*j*. In particular, if we omit all the terms after the first one, we obtain the *t*-model, which has been studied thoroughly [12,15,16]. Note that the representation of the memory offered by (2.9) does not involve any integrals. Thus, the resulting reduced models are differential and not integrodifferential equations. This leads to minimal storage requirements and also allows the efficient evaluation of the memory terms.

The series representation of the memory term in (2.9) is based on the assumption of analyticity in time of the operator *e*^{−tPL}. This assumption may be true for small *t* but it does not have to hold for larger *t*. In other words, the Taylor expansion of the operator *e*^{−tPL} has, in general, only a *finite* radius of convergence. Insisting on using the Taylor expansion of the operator *e*^{−tPL} as is for later times is dangerous and can lead to the instability of the reduced model [17]. In fact, the breakdown of the Taylor expansion of the operator *e*^{−tPL} is related to the onset of under-resolution on the part of the full system. This can be established by studying numerically the radius of convergence of the Taylor series, which turns out to be comparable to the time it takes for the full system to become under-resolved.

As advertised in section Introduction, we can address the instability of the MZ-reduced models by embedding the MZ models in a larger class of models which share the same functional form but may have different coefficients. These generalized MZ models are presented in the §3.

## 3. Generalized Mori–Zwanzig models

In order to present the generalized MZ models, we need to rewrite a reduced model in an abstract way which does not make reference to any specific reduction formalism. Afterwards, we see how the MZ formalism can be generalized to fit in this framework.

### (a) Reformulation of the full and reduced systems

Suppose that we want to construct a reduced model for the PDE
*H* is a, in general nonlinear, operator and *F*∪*G* denote the set of Fourier modes retained in the series, where we have split the Fourier modes into two sets, *F* and *G*. We call the modes in *F* resolved and the modes in *G* unresolved. The reduced model involving only the resolved modes *F* will be called the reduced system and the system involving both the resolved *and* unresolved modes *F*∪*G* will be called the full system.

The main idea behind the algorithm is that the evolution of moments of the reduced set of modes, for example *l*_{p} norms of the modes in *F*, should be the same whether computed from the full or the reduced system. This requirement will eventually allow us to compute the coefficients appearing in the reduced model (see §3c).

The full system of equations for the modes *F*∪*G* is given by
*u*=({*u*_{k}}), *k*∈*F*∪*G* is the vector of Fourier coefficients of *u*, and *R* is the Fourier transform of the operator *H*. The system should be supplemented with an initial condition *u*(0)=*u*_{0}. The vector of Fourier coefficients can be written as *F*) and *G*). Similarly, for the right-hand sides (RHSs), we have

In general, when one constructs a reduced model, additional terms appear on the RHS of the equations of the reduced model (see §2 for more details). The role of these additional terms is to account for the interactions between the resolved and unresolved modes, because the unresolved modes no longer appear explicitly in the reduced model. As is standard in renormalization theory [24], one can augment the RHSs of the equations in the full system by including such additional terms. That is accomplished by multiplying each of these additional terms by a zero coefficient. In this way, the reduced and full systems' RHSs have the same functional form. In particular, for each mode *u*_{k}, *k*∈*F*∪*G*, we can rewrite *i*=2,…,*m* are of the same functional form as the additional terms which appear in the reduced model. This is easy to do by taking *i*=2,…,*m*. Thus, the equation for the mode *u*_{k}, *k*∈*F*∪*G* is written as
*u*′_{k},*k*∈*F* is given by
*u*_{k}′(0)=*u*_{0k}. We repeat that the functions *i*=1,…,*m*, *k*∈*F*, have the same form as the functions *i*=1,…,*m*, *k*∈*F*, but they depend *only* on the reduced set of modes *F*. Dimensional reduction transforms the vector *a*^{(0)} to *a*^{(1)}. In addition, the vectors *a*^{(0)} and *a*^{(1)} do not have to be constant in time.

### (b) Generalization of the Mori–Zwanzig models

To proceed, we need to put the MZ model given by (2.6) and (2.9) in the framework of the previous section. To do that, we set
*a*^{(1)}=(1,1,1,…).

The original MZ models may suffer from instabilities. This means that if one evolves the models their solution will blow up in finite time even though the solution of the original system does not. On the other hand, the new models can be made stable by *assigning to each term in the reduced model the appropriate coefficient*. The magnitude of the coefficient of a term reflects the importance of the term in the reduced model. The assignment of the proper magnitude for each coefficient is akin to the ‘renormalization’ of the expansion (2.9). For this reason, we have called the resulting reduced models renormalized MZ (rMZ) models.

### (c) Estimation of the coefficients of the reduced model

In order to implement the generalized-reduced models, we need to estimate the vector of coefficients *a*^{(1)}. In [20], this was accomplished by requiring that the reduced model reproduces the rate of change of certain *l*_{p} norms of the solution of the full system, before the full system becomes under-resolved. These *l*_{p} norms are computed using only the resolved variables, because the reduced model does not have access to the values of the unresolved variables. If one uses different *l*_{p} norms, one obtains a *linear* system of equations for the vector *a*^{(1)}. As was found in [20], possible dependencies between the different *l*_{p} norms lead to difficulties in the numerical solution of the linear system of equations for *a*^{(1)}.

We now discuss an alternative way to compute the coefficients of the reduced model which is based on the following observation. By construction, all the terms appearing in MZ models have the correct dimensions. This means that the coefficients appearing in front of the terms must be dimensionless. As is usually the case, dimensionless coefficients depend not only on one quantity, but also on ratios of quantities of the same dimensions.

The numerical results in [20] suggest that the coefficient of the renormalized *t*-model exhibits a scaling law dependence on the ratio of the highest wavenumber present in the initial condition to the highest wavenumber allowed in the reduced model. Inspired by this, we have explored the possibility that the coefficients for the higher-order terms also exhibit scaling law dependencies on the same ratio. For the numerical results presented in §4, it was found that this is almost but not quite the case. The situation is more complicated but, still, the scaling law ansatz is a sound guiding principle. If we assume certain scaling laws for the coefficients, we need to have a way to compute the necessary prefactors and scaling exponents. This can be done by enforcing the agreement between the reduced and full model on the estimate of one integral quantity, e.g. the rate of change of the *l*_{2} norm, for a collection of different resolutions. We should enforce the agreement for as many different resolutions as the total number of prefactors and scaling exponents. One then obtains a system of *nonlinear* algebraic equations which can be solved by standard methods (please see §4b for a concrete example).

For both ways of estimating the coefficients, it is necessary to know when is the right time to compute the coefficients. This must be done at a time *before* the full system becomes under-resolved. A safe way to estimate such a time is by monitoring the contribution of the lowest order memory term, i.e. the *t*-model term, to the rate of change of the *l*_{2} norm. Recall that the rate of change of the *l*_{2} norm measures the transfer of mass (energy) from the resolved to the unresolved scales. We can prescribe a tolerance TOL and when the contribution of the *t*-model term to the rate of change of the *l*_{2} norm exceeds TOL we can record the quantities needed in the calculation of the coefficients.

After the calculation of the coefficients for the reduced model, one can stop evolving the full system (which was bound to become under-resolved anyway) and switch to the reduced model.

## 4. An illustrative example

To illustrate the renormalized-reduced models discussed in §3b, we present here results for the one-dimensional Burgers equation.

### (a) Set-up of the reduced model

We use the one-dimensional inviscid Burgers equation as an instructive example for the constructions presented in this section. The equation is given by
*u*(*x*,0)=*u*_{0}(*x*) and boundary conditions. We solve (4.1) in the interval [0,2*π*] with periodic boundary conditions. This allows us to expand the solution in Fourier series
*F*∪*G*=[−*M*/2,*M*/2−1]. We have written the set of Fourier modes as the union of two sets in anticipation of the construction of the reduced model comprising only the modes in *F*=[−*N*/2,*N*/2−1], where *N*<*M*. The equation of motion for the Fourier mode *u*_{k} becomes
*k*∈*F*∪*G*. The system (4.3) is supplemented by the initial condition *L* by
*Lu*_{0k}=*R*_{k}(*u*_{0}).

We also need to define a projection operator *P*. For a function *h*(*u*_{0}) of all the variables, the projection operator we will use is defined by *h*(*u*_{0}) by zero. The choice of the projection comes from the fact that we are using Fourier expansions and initial conditions which involve only a few resolved Fourier modes. So, we choose a projection operator which sets to zero the unresolved Fourier modes, as they are also zero in the exact initial condition.

Also, we define the Markovian term
*F*. The full system conserves the energy *not* alter the energy content of the resolved modes. The necessary energy transfer out of the resolved modes rests on the memory terms. Based on our choice of projection operator and the scaling symmetries of the Burgers equation, we set *N*=*M*/2.

With the definition of *P* given above, we find for *QLu*_{0k}
*QLu*_{0k} contains three terms that involve at least one wavenumber in the unresolved range *G*. The terms in the Taylor expansion of the memory term are given by
*j*th term, we have

For equations with polynomial nonlinearities, the terms (*PL*)^{ j−1}*QLu*_{0k} can be computed recursively using a construction that involves Pascal triangles. The use of a Pascal triangle for Burgers arises as follows. On the one hand, the quantity *QLu*_{0k} is *quadratic* in Fourier modes. On the other hand, each application of the operator *PL* involves a differentiation. So, the number of terms after each application of *PL* increases according to the coefficients of the Pascal triangle (like the coefficients in the binomial theorem) [20].

### (b) Numerical results

Figure 1 shows the evolution of the energy

The initial condition is *T*=1 [26]. All the reduced models use *n*=16 Fourier modes, whereas the full system has *M*=32 modes. The results of the reduced models are compared with a converged solution of the random choice method with *n*=4096 points. The energy of the random choice method solution was computed using only *n*=16 modes. However, note that practically all the energy of the random choice method solution is concentrated in the first few Fourier modes, so even if we had computed the energy for all *n*=4096 Fourier methods the results would not have changed. This is to be expected, because, for the initial condition we are using, a standing shock forms at time *T*=1 and, thus, by time *T*=100 the only Fourier modes having some energy left in them are the first few.

All the calculations are done in double precision. The tolerance TOL which is used to decide when it is time to switch to the reduced model is set to TOL=10^{−12}. In order to renormalize the coefficients, one needs to know what is the rate of transfer of activity from the resolved to the unresolved modes (from the modes in *F* to the modes in *G*). If this transfer is zero, then there is no meaningful information that can be used. On the other hand, if we wait too long, our calculation can become under-resolved. The parameter TOL is used to monitor how activity is transferred. It has to be larger than the precision used (double in our case) but also not too large. We have used the value TOL=10^{−12}, which allows for a few digits of accuracy when computing the renormalized coefficients. The systems of ODEs for the different reduced models were solved using the Runge–Kutta–Fehlberg method with the stepsize control tolerance set to 10^{−10} [27].

The renormalized coefficients were computed using the scaling ansatz outlined in §3c. In particular, for the highest order model considered here (third order) we need to estimate the renormalized coefficients

We see that we have a total of four parameters that need to be determined, namely *α*,*β*_{1},*β*_{2} and *β*_{3}. To do that we used four different reduced model resolutions *n*=6,8,10,12 (with the corresponding full system resolutions *M*=12,16,20,24) and we obtained, for each resolution, one equation involving the four unknown parameters. Each one of these equations was obtained by requiring that, at the instant when the TOL is reached, the rate of change of the *l*_{2} norm of the resolved Fourier modes computed by the full system is reproduced by the reduced model. Thus, we obtained a system of four nonlinear algebraic equations which was solved by Newton's method (converged to double precision within a few iterations). The values we found were *α*=1.532, *β*_{1}=0.452, *β*_{2}=−0.661 and *β*_{3}=0.728. Note that the numerical results we have presented in figure 1 are for *n*=16. In order to obtain the values of the coefficients for *n*=16, we used the estimated scaling laws.

As shown in figure 1, the renormalized MZ models of first and second order give practically the same results as the *t*-model. However, the energy evolution predicted by the third-order model is practically identical to the correct energy evolution of the resolved modes predicted by the random choice method. Note that what we have obtained is a reduced model that is constructed directly in Fourier space, that is based solely on the MZ formalism, that does *not* assume time-scale separation and thus allows for long memory effects, that is stable and that is accurate in predicting the energy decay rate of a singular solution.

## 5. Conclusion

The construction of reduced models for systems exhibiting absence of time- and/or length-scale separation is a difficult task. The difficulty stems from the fact that the absence of scale separation leads to long memory effects. The accurate representation of long memory effects can be computationally expensive and can also lead to reduced models which are themselves expensive to integrate.

The MZ formalism offers the ability to construct reduced models of arbitrary precision. However, the construction of reduced models based on the original MZ formalism can be quite expensive to obtain and implement. In addition, the reduced models may be unstable [17]. The generalized MZ models which were briefly presented in this article can alleviate these difficulties. To accomplish that one needs to incorporate dynamic information from the evolution of the full system. This additional information allows one to compute the necessary coefficients appearing in the generalized models. This procedure is akin to the renormalization procedure developed first in the context of high-energy and statistical physics [28]. We have presented numerical results for the one-dimensional Burgers equation, which is a standard test case for the development of reliable reduced models. The construction presented here can be straightforwardly extended (at least for the case of spectral methods) to physically relevant systems such as the Navier–Stokes equations (see [20] for an application to the three-dimensional Euler equations of incompressible fluid flow).

## Data accessibility

The source code of the programs is available from the author upon request.

## Funding statement

This material is based upon work supported by the US Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program, Collaboratory on Mathematics for Mesoscopic Modeling of Materials (CM4), under award no. DE-SC0009280.

## Conflict of interests

The author has no competing interests.

## Acknowledgements

The author thanks Prof. J. Weare for many useful discussions and comments.

## Footnotes

One contribution to a special feature on ‘Climate dynamics: multiple scales, memory effects and control perspectives’.

- Received June 7, 2014.
- Accepted January 21, 2015.

- © 2015 The Author(s) Published by the Royal Society. All rights reserved.