## Abstract

Dynamic systems involving convolution integrals with decaying kernels, of which fractionally damped systems form a special case, are non-local in time and hence infinite dimensional. Straightforward numerical solution of such systems up to time *t* needs $O(t^2)$ computations owing to the repeated evaluation of integrals over intervals that grow like *t*. Finite-dimensional and local approximations are thus desirable. We present here an approximation method which first rewrites the evolution equation as a coupled infinite-dimensional system with no convolution, and then uses Galerkin approximation with finite elements to obtain linear, finite-dimensional, constant coefficient approximations for the convolution. This paper is a broad generalization, based on a new insight, of our prior work with fractional order derivatives (Singh & Chatterjee 2006 *Nonlinear Dyn.* **45**, 183–206). In particular, the decaying kernels we can address are now generalized to the Laplace transforms of known functions; of these, the power law kernel of fractional order differentiation is a special case. The approximation can be refined easily. The local nature of the approximation allows numerical solution up to time *t* with $O(t)$ computations. Examples with several different kernels show excellent performance. A key feature of our approach is that the dynamic system in which the convolution integral appears is itself approximated using another system, as distinct from numerically approximating just the solution for the given initial values; this allows non-standard uses of the approximation, e.g. in stability analyses.

## 1. Introduction

Initial value integrodifferential equations involving convolution integrals appear in diverse areas of science and engineering, including viscoelasticity (Bagley & Torvik 1983), biological tissue models (Miller & Chinzei 2002), image processing algorithms (Yaroslavsky 2004), Basset's problem in fluid mechanics (Basset 1910) and in control systems (Podlubny 1999; Sabatier *et al.* 2002). Many important convolution integrals involve decaying kernels.

To fix ideas and motivate what follows, let us consider a system of the form
$$\dot{x}(t) = F\!\left(x(t),\, t,\, \int_0^t f(\tau)\, x(t-\tau)\, \mathrm{d}\tau\right),$$
where our aim is to emphasize that *F*, which may be nonlinear, depends on a convolution integral involving a rather arbitrary kernel function *f* (restrictions on *f* will be discussed below). In the convolution integral, *x*(*t*−*τ*) may conceivably be replaced by some arbitrary function thereof, as in *h*(*x*(*t*−*τ*)).

This paper is about the efficient numerical evaluation of the convolution integral above, which for arbitrary nonlinear problems must proceed hand in hand with the numerical solution of the whole system. In particular, our proposed method identifies a corresponding ordinary differential equation (ODE)–partial differential equation (PDE) system, develops a Galerkin approximation and uses a finite element discretization on the unit interval to construct a simple and accurate approximation that is easy to implement and refine. The approximation is local in time: what Agrawal (2009) calls a memory free formulation.

To clarify the contribution of this paper, it is worthwhile to first consider routine approaches to the same problem. The most direct method involves the numerical evaluation of the convolution integral at each time step; essentially, the time step Δ*t* after time *t* requires an $O(t)$ computation (we will use this technique in a numerical example in §6*a*). This direct method requires $O(T^2)$ computations for obtaining a solution up to time *T*, which makes long-time simulations difficult.

Computational costs can be reduced by noting that, owing to the decaying nature of the kernel, the integrand in the convolution is smaller deep into the past (near time 0) and larger at more recent times (close to the present time ‘*t*’). Accordingly, numerically accurate integration methods can be designed based on non-uniform meshes. Along these lines, algorithms for fractional order kernels (and also kernels with asymptotic decay like power laws) may be found in Ford & Simpson (2001) and Diethelm & Freed (2006). Such non-uniform-remeshing approaches involve explicit attention to the past that we will be able to avoid.

Computational costs can also be reduced by making a finite-dimensional and local approximation of the convolution integral (e.g. the discussion in Golberg 1979). Some such approximation techniques are mentioned below.

One technique approximates the convolution integral with rational functions in the Laplace domain, thereby obtaining a finite set of ODEs in the time domain. This method requires extra differentiability conditions on dependent variables, extra initial conditions, and added analytical complications if the convolution is with a nonlinear function of the dependent variable. For example, if the Laplace transform $F(s)$ of *f* in the convolution
$$y(t) = \int_0^t f(\tau)\, x(t-\tau)\, \mathrm{d}\tau$$
is approximated as a ratio of polynomials,
$$F(s) \approx \frac{P_n(s)}{Q_m(s)},$$
with $P_n$ and $Q_m$ of degrees *n* and *m*, then the time domain approximation has *m* derivatives of *y* and *n* of *x*, which we wish to avoid. If the convolution is with, e.g. *h*(*x*) instead of just *x*, then higher derivatives of *h*(*x*) will arise (tedious, even if tractable).

Alternatively, the finite-dimensional approximation may be obtained by the direct approximation of the kernel in the time domain. Approximation by various types of polynomials has been proposed (Bownds & Wood 1976, 1977; Bownds & Ariz 1982). These methods have the shortcomings of approximating a decaying function using a polynomial, in addition to having to keep track of many terms arising out of multiple derivatives.

A better approach than polynomials is to approximate decaying kernels using exponentially decaying functions (Rice 1962; Evans *et al.* 1980; Liu 2001). A key difficulty here is the selection of the exponential *rates*. Some authors, such as Baxter & Iserles (1997) and Beylkin & Monzón (2005), have tackled these issues within the context of approximating a function over a finite interval. We consider this approach reasonable, especially for decaying kernels that do not fall within the purview of the present paper.

In the present work, we *implicitly* approximate the kernel using decaying exponentials. As outlined above, our approach is based on approximating solutions to an ODE–PDE system that has an exact correspondence with the convolution integral. The present work is a broad generalization of our prior work (Singh & Chatterjee 2006) with fractional order systems, where the decay kernels were of the form *t*^{−α} with 0 < *α* < 1. Here, on the basis of a new insight, we extend our method to kernels that are Laplace transforms of other, analytically known, functions, and which, possibly after a finite number of slope reversals, monotonically approach some finite limit (in which case we subtract the limit, so that the kernel approaches zero). See figure 1 for some typical examples of such kernels.

An important feature of our approach is that it approximates dynamic systems with convolutions using other systems, which are then numerically tackled using routine methods. This is distinct from numerically approximating just the solution for the given initial values. A consequent advantage of this feature will be discussed later.

The papers most relevant to our approach are Singh & Chatterjee (2006), Yuan & Agrawal (2002), and more recently Agrawal (2009). All three papers were specifically directed to power law (decaying) kernels. The ODE–PDE system we refer to above was only implicit in Yuan & Agrawal (2002). Here, we offer a single unifying idea and present a more general formulation than in these three papers.

Incidentally, the approach of Yuan & Agrawal (2002) has been critiqued by Schmidt & Gaul (2006) who pointed to inaccuracies, incurred owing to discretization, for the extreme limits of time or frequency. Agrawal (2009) has responded to them; and we, too, have addressed their comments elsewhere (Singh & Chatterjee 2006). Here, we note that these inaccuracies remain small for practical purposes, and accept such errors as part and parcel of discretization of any genuinely infinite-dimensional dynamics. In the end, our numerical results will speak for themselves.

We now present the basic idea of our approximation scheme and demonstrate it for 10 different arbitrarily chosen kernels given in figure 1.

## 2. An equivalent of the kernel

Let *f*(*t*) be the kernel of interest, and let *f* also be the Laplace transform of *g*; that is,
$$f(t) = \int_0^\infty g(\xi)\, \mathrm{e}^{-\xi t}\, \mathrm{d}\xi,$$
so that *g* is the inverse Laplace transform of *f* (see Widder 1946; Doetsch 1974). Now, consider the following PDE, which can also be viewed as an ODE with *t* as independent variable and *ξ* a parameter:
$$\frac{\partial u(\xi,t)}{\partial t} = -\xi\, u(\xi,t) + g(\xi)\, \delta(t), \quad u(\xi,0^-) = 0, \tag{2.1}$$
where *δ*(*t*) is the Dirac delta function. The solution of the above equation is
$$u(\xi,t) = g(\xi)\, \mathrm{e}^{-\xi t}. \tag{2.2}$$
On integrating the above w.r.t. *ξ*,
$$\int_0^\infty u(\xi,t)\, \mathrm{d}\xi = \int_0^\infty g(\xi)\, \mathrm{e}^{-\xi t}\, \mathrm{d}\xi = f(t), \tag{2.3}$$
because *f* is the Laplace transform of *g* (think momentarily of *t* as *s*, and *ξ* as *t* in table 1).

In this way, we first solve equation (2.1), then integrate its solution with respect to *ξ* to obtain *f*(*t*), which may be thought of as the impulse response function of a single-input, single-output system as given above. Numerical issues aside, the convolution integral is now easy.
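As a quick numerical check of this correspondence, the sketch below (Python, using `scipy`) verifies that integrating *g*(*ξ*)e^{−ξt} over *ξ* reproduces the kernel. The particular choice *g*(*ξ*) = e^{−ξ}, whose Laplace transform is *f*(*t*) = 1/(1 + *t*), is our illustrative assumption, not from the paper.

```python
import numpy as np
from scipy.integrate import quad

def f_via_g(t, g):
    # Equation (2.3): f(t) = integral_0^inf g(xi) * exp(-xi*t) dxi,
    # i.e. f is the Laplace transform of g (with t playing the role of s).
    val, _ = quad(lambda xi: g(xi) * np.exp(-xi * t), 0.0, np.inf)
    return val

# Illustrative (assumed) example: g(xi) = e^{-xi}, so f(t) = 1/(1+t).
g = lambda xi: np.exp(-xi)
f_exact = lambda t: 1.0 / (1.0 + t)

for t in [0.0, 0.5, 2.0, 10.0]:
    assert abs(f_via_g(t, g) - f_exact(t)) < 1e-7
```

Any kernel whose inverse Laplace transform *g* is analytically known can be slotted into the same check.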

For illustration, replace *δ*(*t*) in equation (2.1) with some arbitrary *x*(*t*) to get
$$\frac{\partial u(\xi,t)}{\partial t} = -\xi\, u(\xi,t) + g(\xi)\, x(t), \quad u(\xi,0) = 0. \tag{2.4}$$
It follows that
$$u(\xi,t) = \int_0^t g(\xi)\, \mathrm{e}^{-\xi (t-\tau)}\, x(\tau)\, \mathrm{d}\tau. \tag{2.5}$$
Now, integrating,
$$\int_0^\infty u(\xi,t)\, \mathrm{d}\xi = \int_0^\infty \int_0^t g(\xi)\, \mathrm{e}^{-\xi (t-\tau)}\, x(\tau)\, \mathrm{d}\tau\, \mathrm{d}\xi. \tag{2.6}$$
Assuming that we are permitted to change the order of integration, using the routine trick of replacing *τ* with *t*−*τ* in the integrand, pulling *x*(*t*−*τ*) out of the *ξ* integral, and using equation (2.3), we get
$$\int_0^\infty u(\xi,t)\, \mathrm{d}\xi = \int_0^t f(\tau)\, x(t-\tau)\, \mathrm{d}\tau,$$
which is the convolution integral we want.

Thus, we have indirectly evaluated the convolution integral with the following two operations.

Solve
$$\frac{\partial u(\xi,t)}{\partial t} = -\xi\, u(\xi,t) + g(\xi)\, x(t), \quad u(\xi,0) = 0. \tag{2.7}$$

Then, integrate to find
$$y(t) = \int_0^\infty u(\xi,t)\, \mathrm{d}\xi = \int_0^t f(\tau)\, x(t-\tau)\, \mathrm{d}\tau. \tag{2.8}$$

There is no approximation so far. The advantage gained by replacing the convolution $f(t) * x(t)$ with the above two operations is that we can now use Galerkin projections on equation (2.7) to get good finite-dimensional approximations.
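The two operations can be seen in action before any Galerkin machinery. The sketch below discretizes *ξ* with Gauss–Laguerre quadrature (our choice for this illustration; the paper instead uses finite elements, introduced in §4) and time-steps each decoupled *ξ*-mode ODE exactly for *x* held constant over a step. The choices *g*(*ξ*) = e^{−ξ} (so *f*(*t*) = 1/(1 + *t*)) and *x*(*t*) = sin *t* are illustrative assumptions. The result matches direct quadrature of the convolution integral:

```python
import numpy as np
from scipy.integrate import quad

g = lambda xi: np.exp(-xi)   # assumed g; its Laplace transform is f(t) = 1/(1+t)
f = lambda t: 1.0 / (1.0 + t)
x = np.sin                   # assumed forcing x(t)

# Gauss-Laguerre rule: integral_0^inf h(xi) dxi ~ sum_i w_i e^{xi_i} h(xi_i)
xi, w = np.polynomial.laguerre.laggauss(40)
wq = w * np.exp(xi)

T, nsteps = 2.0, 2000
dt = T / nsteps
decay = np.exp(-xi * dt)
gain = g(xi) * (1.0 - decay) / xi     # exact one-step response to constant x
u = np.zeros_like(xi)                 # u(xi_i, t), starting from rest

for k in range(nsteps):
    xm = x((k + 0.5) * dt)            # midpoint value of x over the step
    # exact update of du/dt = -xi*u + g(xi)*x with x frozen at xm
    u = decay * u + gain * xm

y_ode = wq @ u                        # equation (2.8): integrate u over xi
y_direct = quad(lambda tau: f(tau) * x(T - tau), 0.0, T)[0]
assert abs(y_ode - y_direct) < 1e-3
```

Note that the per-step cost here is independent of *t*: the past is carried entirely in the vector `u`, which is the point of the reformulation.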

## 3. Definitions

In this section, we put down some (routine) definitions that will be used in what follows.

We will consider integrals with respect to *ξ* on the domain $[0,\infty)$. Functions referred to as square integrable and/or absolutely integrable will be so over this domain. The inner product of functions *p* and *q*, denoted by (*p*,*q*), is defined as
$$(p, q) = \int_0^\infty p(\xi)\, q(\xi)\, \mathrm{d}\xi,$$
which is bounded for square integrable *p* and *q*. We will also use the *ξ*-weighted inner product (*p*,*q*)_{ξ}, defined as
$$(p, q)_\xi = \int_0^\infty \xi\, p(\xi)\, q(\xi)\, \mathrm{d}\xi.$$

Finally, to quantify our approximation errors, we will measure the absolute difference between two functions *x*_{1} and *x*_{2} over a finite time interval [0,*T*] using
$$\int_0^T |x_1(t) - x_2(t)|\, \mathrm{d}t.$$

## 4. Galerkin projection

For our Galerkin projection (Knabner & Angermann 2003; Gockenbach 2006), we assume that equation (2.7) is satisfied by
$$u(\xi,t) \approx \sum_{i=1}^{n} a_i(t)\, \phi_i(\xi), \tag{4.1}$$
where *n* is finite, the *ϕ*_{i} are to be chosen by us and the *a*_{i} are to be solved for. The choice of shape functions *ϕ*_{i} will be discussed later. Below, we outline the Galerkin procedure for equation (2.7).

Substituting the approximation for *u*(*ξ*,*t*) in equation (2.7), we define
$$R(\xi,t) = \sum_{i=1}^{n} \dot{a}_i(t)\, \phi_i(\xi) + \xi \sum_{i=1}^{n} a_i(t)\, \phi_i(\xi) - g(\xi)\, x(t),$$
where *R*(*ξ*,*t*) is called the residual. This residual is made orthogonal to the shape functions, yielding *n* equations
$$\big(R(\xi,t),\, \phi_m(\xi)\big) = 0, \quad m = 1, 2, \ldots, n. \tag{4.2}$$
Equation (4.2) constitutes *n* ODEs, which can be written in the form
$$\mathbf{A}\, \dot{\mathbf{a}}(t) + \mathbf{B}\, \mathbf{a}(t) = \mathbf{c}\, x(t), \quad \mathbf{a}(0) = \mathbf{0}, \tag{4.3}$$
where **A** and **B** are *n*×*n* matrices; **a** is an *n* × 1 vector containing *a*_{i}, *i* = 1,2,…,*n*; and **c** is an *n* × 1 vector. The entries of **A**, **B** and **c** are
$$A_{mi} = (\phi_i, \phi_m), \quad B_{mi} = (\phi_i, \phi_m)_\xi \quad \text{and} \quad c_m = (g, \phi_m). \tag{4.4}$$
It is clear that each shape function *ϕ*, by itself as well as upon multiplication by $\sqrt{\xi}$, needs to be square integrable. The integrals involved in *c*_{m} also need to exist. We note these restrictions here, and will satisfy them later in our numerical implementation, though we will not discuss them formally in what follows.

Notice that the zero initial conditions used in equation (4.3) derive from the fact that the system starts from rest in equation (2.7); enforcing the same initial conditions on the assumed solution in equation (4.1) gives **a**(0) = **0**. The solution of equation (4.3) can be obtained using any good routine for the numerical integration of ODEs (we have uniformly used Matlab’s ‘ode23t,’ which is designed for stiff systems). Finally, by equations (2.7), (2.8) and (4.3), our numerical approximation to the convolution integral is
$$y(t) \approx \mathbf{d}^{\mathrm{T}} \mathbf{a}(t), \tag{4.5}$$
where T denotes matrix transpose and **d** is an *n* × 1 vector with *m*th entry
$$d_m = \int_0^\infty \phi_m(\xi)\, \mathrm{d}\xi. \tag{4.6}$$
From the above, each *ϕ* needs to be integrable as well, and we will choose only such functions.

Now, it is easy to show that our approximation of the convolution integral given by equation (4.5) is in fact equivalent to approximating *f*(*t*) using a linear combination of decaying exponentials, as in
$$f(t) \approx \sum_{i=1}^{n} p_i\, \mathrm{e}^{-\lambda_i t}, \tag{4.7}$$
where the rates *λ*_{i} are the eigenvalues of $\mathbf{A}^{-1}\mathbf{B}$ and the weights *p*_{i} follow from **c** and **d**.
However, the above approximation is implicit because we do not actually compute these exponentials; nor do we explicitly choose the exponential decay rates, which in our case come out of the Galerkin projection. Advantages of the approximation proposed here, as we will demonstrate with examples, are that (i) by suitable choice of shape functions we can obtain good performance over a given time interval and (ii) refinement of the approximation involves straightforward calculations.

We mention here two related papers by Diethelm (2008, 2009) where he emphasizes the distinction between the initial solution of ODEs and the subsequent integration with respect to the auxiliary parameter (here, *ξ*), specifically in the context of Yuan & Agrawal (2002). Diethelm shows that the details of the second integration step can dramatically affect accuracy. Here, within our finite element discretization strategy, we address the solution of *ξ*-parameterized ODEs and the subsequent integration with respect to *ξ* on the same footing. It seems conceivable that future work along the lines of Diethelm’s identification of two steps may further refine the accuracy of the method proposed here, though our straightforward finite element approach gives good results in any case.

### (a) Choice of *ϕ*_{i}

For Galerkin projections, we will use one-dimensional finite element shape functions, because of their familiarity and simplicity. To this end, we will first map $[0,\infty)$ (the *ξ* domain) to [0,1] (the *η* domain) using the following invertible mapping:
$$\eta = \frac{\xi}{1+\xi}, \quad \text{or equivalently} \quad \xi = \frac{\eta}{1-\eta}. \tag{4.8}$$
We will then define finite element shape functions on the *η* domain. In particular, here we will use ‘hat’ functions on a non-uniform grid, as described below.

We use (briefly borrowing Matlab notation)
$$y = \mathrm{logspace}(-\beta_1,\, \beta_2,\, n-1), \tag{4.9}$$
where ‘logspace’ is shorthand for *n*−1 points that are logarithmically equispaced between *y*_{1} = 10^{−β1} and *y*_{n−1} = 10^{β2}. We then set
$$\eta_k = \frac{y_k}{1+y_k}, \quad k = 1, 2, \ldots, n-1, \tag{4.10}$$
to get an (*n*−1) × 1 array of non-uniformly spaced points in the interval (0,1). We finally add two more nodes, at 0 and 1, to get an (*n*+1) × 1 array of nodal locations.
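The node construction above can be sketched in a few lines (Python here rather than the paper's Matlab; the mapping *η* = *y*/(1 + *y*) is our reading of equations (4.8) and (4.10)):

```python
import numpy as np

def make_nodes(beta1=2.0, beta2=2.0, n=30):
    # Equation (4.9): n-1 logarithmically equispaced points in [10^-beta1, 10^beta2].
    y = np.logspace(-beta1, beta2, n - 1)
    # Equation (4.10) (assumed form, via the mapping eta = xi/(1+xi) of eq. (4.8)):
    eta = y / (1.0 + y)
    # Add end nodes at 0 and 1 for an (n+1)-point non-uniform grid on [0, 1].
    return np.concatenate(([0.0], eta, [1.0]))

nodes = make_nodes()
assert len(nodes) == 31 and np.all(np.diff(nodes) > 0)
# Refinement clusters near both ends, helping long and short time scales.
assert nodes[1] < 0.02 and nodes[-2] > 0.98
```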

Notice that the above choice of non-uniformly spaced nodal points in the *η* domain leads to relatively higher refinement near 0 and 1, improving performance for long and short time scales, respectively, as may be anticipated on observing that $\eta \to 1$ is equivalent to $\xi \to \infty$ (and $\eta \to 0$ to $\xi \to 0$), and from the inverse relationship between *ξ* and *t* observed in the exponential (see equation (2.5)).

Now, using the above-defined nodal points, we define the ‘hat’ shape functions in equations (4.11a), (4.11b) and (4.11c) (see figure 2): each *ϕ*_{i} rises from 0 at the node to its left to 1 at its own node and falls back to 0 at the node to its right, with the falling branch of the last shape function raised to the power 3/2, as discussed below.

In using the *η* domain above, we note that
$$\xi = \frac{\eta}{1-\eta} \quad \text{and} \quad \mathrm{d}\xi = \frac{\mathrm{d}\eta}{(1-\eta)^2}.$$
In this way, all *ξ*-domain calculations are actually done in the *η* domain. It is now clear that in the last shape function above (equation (4.11c)), the power of 3/2 is higher than needed: anything greater than unity seems enough. We use the 3/2 here as a conservative choice, leaving finer examination to future work.

### (b) Implementation

Consider again the nonlinear integrodifferential equation
$$\dot{x}(t) = F\!\left(x(t),\, t,\, \int_0^t f(\tau)\, x(t-\tau)\, \mathrm{d}\tau\right), \tag{4.12}$$
where *F* may be nonlinear. In our scheme, we replace the above by the following set of ODEs:
$$\dot{x}(t) = F\big(x(t),\, t,\, \mathbf{d}^{\mathrm{T}} \mathbf{a}(t)\big) \tag{4.13}$$
and
$$\mathbf{A}\, \dot{\mathbf{a}}(t) + \mathbf{B}\, \mathbf{a}(t) = \mathbf{c}\, x(t), \quad \mathbf{a}(0) = \mathbf{0}, \tag{4.14}$$
where matrices **A, B, c, d** and the state vector **a**(*t*) are as defined in §4. It is thus clear that our method can be used in a straightforward fashion to solve a fairly wide class of nonlinear integrodifferential equations with convolution integrals. Some non-trivial extensions, which involve the use of differential algebraic equations (DAEs), are possible along the lines discussed in Singh & Chatterjee (2008).
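A minimal sketch of this replacement follows. The test equation here is hypothetical (not from the paper): $\dot{x} = -x - \int_0^t f(\tau)\, x(t-\tau)\, \mathrm{d}\tau + 1$ with *x*(0) = 0 and *f*(*t*) = 1/(1 + *t*). To stay self-contained we use a diagonal modal stand-in for the Galerkin system (Gauss–Laguerre nodes in *ξ*) rather than the paper's finite element matrices; the coupling structure of equations (4.13)–(4.14) is the same. The result is compared against direct $O(N^2)$ time stepping of the convolution:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical test equation (an assumption for this sketch):
#   xdot(t) = -x(t) - integral_0^t f(tau) x(t-tau) dtau + 1,  x(0) = 0,
# with f(t) = 1/(1+t), i.e. g(xi) = e^{-xi}.
g = lambda xi: np.exp(-xi)
f = lambda t: 1.0 / (1.0 + t)

# Diagonal stand-in for eq. (4.14): a_i' = -xi_i a_i + g(xi_i) x(t), with
# Gauss-Laguerre weights approximating y = integral_0^inf u dxi.
xi, w = np.polynomial.laguerre.laggauss(30)
wq = w * np.exp(xi)

def rhs(t, z):
    xval, a = z[0], z[1:]
    y = wq @ a                                  # approximate convolution value
    return np.concatenate(([-xval - y + 1.0], -xi * a + g(xi) * xval))

T = 2.0
sol = solve_ivp(rhs, (0.0, T), np.zeros(len(xi) + 1),
                method="BDF", rtol=1e-8, atol=1e-10)
x_modes = sol.y[0, -1]

# Reference: direct O(N^2) Euler stepping with trapezoidal convolution.
dt = 1e-3
N = int(T / dt)
xs = np.zeros(N + 1)
ts = np.arange(N + 1) * dt
for k in range(N):
    if k == 0:
        conv = 0.0
    else:
        fx = f(ts[k] - ts[:k + 1]) * xs[:k + 1]
        conv = dt * (fx[1:-1].sum() + 0.5 * (fx[0] + fx[-1]))
    xs[k + 1] = xs[k] + dt * (-xs[k] - conv + 1.0)

assert abs(x_modes - xs[-1]) < 0.02
```

The local formulation carries the memory in the vector `a`, so each step costs the same regardless of how long the simulation has run.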

Note that matrices **A**, **B** and **d** are mesh-dependent but not kernel-dependent. Matrix **c** is both mesh and kernel dependent. These matrices, for 30 elements and *β*_{1} = *β*_{2} = 2, are provided in Matlab format for all 10 kernels of figure 1 in the electronic supplementary material.

## 5. Numerical results for 10 kernels

In this section, we present numerical examples of our approximation. Instead of looking at specific solutions to dynamic systems, we directly compare in the time domain the approximated kernel obtained from equation (4.7) with the exact kernel, for each of the 10 cases of figure 1. We also present some numerical convergence results.

First, notice that the approximate kernel obtained by equation (4.7) can be thought of as arising from the solution of
$$\mathbf{A}\, \dot{\mathbf{a}}(t) + \mathbf{B}\, \mathbf{a}(t) = \mathbf{c}\, \delta(t), \quad \mathbf{a}(0^-) = \mathbf{0},$$
where the matrices **A, B, c** and **d** are as defined in §4, and *δ* is the Dirac delta function. The above is equivalent to
$$\dot{\mathbf{a}}(t) = -\mathbf{A}^{-1}\mathbf{B}\, \mathbf{a}(t), \quad \mathbf{a}(0) = \mathbf{A}^{-1}\mathbf{c},$$
which is the only thing we solve in this section. Finally, **a** is multiplied from the left by **d**^{T} to obtain the approximated kernel.
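The computation just described can be sketched end to end. In the code below (Python), the hat functions follow the description in §4*a*, but the 3/2-power tail is our reading of equation (4.11c), the mesh is coarser than the paper's 30 elements, and the kernel *f*(*t*) = 1/(1 + *t*) (with *g*(*ξ*) = e^{−ξ}) is an illustrative choice; treat the whole thing as a sketch rather than a reproduction of the paper's numbers. The matrices **A**, **B**, **c**, **d** are assembled by element-wise quadrature, and the impulse response **d**^{T}**a**(*t*) is compared with the exact kernel:

```python
import numpy as np
from scipy.integrate import quad
from scipy.linalg import expm, solve

g = lambda xi: np.exp(-xi)             # assumed g; exact kernel f(t) = 1/(1+t)
f_exact = lambda t: 1.0 / (1.0 + t)

# Nodes per eqs. (4.9)-(4.10), beta1 = beta2 = 2, with a coarser mesh.
y = np.logspace(-2.0, 2.0, 21)
eta = np.concatenate(([0.0], y / (1.0 + y), [1.0]))
n = len(eta) - 2                       # one hat per interior node
xi_b = np.append(eta[:-1] / (1.0 - eta[:-1]), np.inf)   # element edges in xi

def phi(j, xi):
    """Hat centred at interior node j (1-based), evaluated at xi. The last
    hat's falling branch uses a 3/2 power (our guess at eq. (4.11c)) so that
    the xi-weighted integrals converge."""
    if not np.isfinite(xi):
        return 0.0
    e = xi / (1.0 + xi)                # map to the eta domain, eq. (4.8)
    l, c_, r = eta[j - 1], eta[j], eta[j + 1]
    if e <= l or e >= r:
        return 0.0
    if e <= c_:
        return (e - l) / (c_ - l)
    s = (r - e) / (r - c_)
    return s ** 1.5 if j == n else s

def elem_int(fun, k):                  # integrate over element k (last is infinite)
    return quad(fun, xi_b[k], xi_b[k + 1], limit=200)[0]

A = np.zeros((n, n)); B = np.zeros((n, n)); c = np.zeros(n); d = np.zeros(n)
for i in range(1, n + 1):
    for j in (i, i + 1):
        if j > n:
            continue
        ks = range(j - 1, i + 1)       # elements shared by phi_i and phi_j
        A[i-1, j-1] = A[j-1, i-1] = sum(elem_int(lambda s: phi(i, s) * phi(j, s), k) for k in ks)
        B[i-1, j-1] = B[j-1, i-1] = sum(elem_int(lambda s: s * phi(i, s) * phi(j, s), k) for k in ks)
    c[i-1] = sum(elem_int(lambda s: g(s) * phi(i, s), k) for k in (i - 1, i))
    d[i-1] = sum(elem_int(lambda s: phi(i, s), k) for k in (i - 1, i))

M = solve(A, B)                        # A a' + B a = c delta(t) => a(0) = A^{-1} c
a0 = solve(A, c)
f_hat = lambda t: d @ expm(-M * t) @ a0

for t in [0.5, 1.0, 2.0, 5.0]:
    assert abs(f_hat(t) - f_exact(t)) < 0.1 * f_exact(t)
```

Since **A** and **B** are Gram matrices (plain and *ξ*-weighted), the eigenvalues of $\mathbf{A}^{-1}\mathbf{B}$ are real and positive, which is the implicit exponential-sum structure of equation (4.7).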

In each of the 10 cases presented below, for Galerkin projections, we use 30 finite elements; and we use, unless stated otherwise, *y*_{1} = 0.01 and *y*_{n−1} = 100 (i.e. *β*_{1} = *β*_{2} = 2; see equations (4.9) and (4.10)). Results are given in figures 3–6. The exact and approximate curves are visually indistinguishable in figures 3 and 6*b*,*c*. In figures 4, 5 and 6*a*, there is a visible mismatch over a very short time interval near *t* = 0, and the subsequent match is very good.

The mismatch observed near *t* = 0 is studied more closely in figure 5 (for kernel 6 of figure 1). Figure 5*a* shows the results from the calculations described so far; and figure 5*c* shows an enlarged picture close to *t* = 0. For the same kernel, we can improve the approximation for small *t* by refining the mesh for *η* close to 1. To this end, we repeated the above calculations with *y*_{1} = 0.02 and *y*_{n−1} = 400 (and all else held fixed, including 30 elements as before). The results obtained are shown in figure 5*b*,*d*; and comparison with the respective subplots on the left shows significant improvement. Figure 5*e* shows the *L*_{1} error (numerically evaluated through separate numerical integration) for these two meshes, and a roughly 40-fold reduction is seen.

A detailed kernel-by-kernel study of the most appropriate choices of *β*_{1} and *β*_{2}, numbers of mesh points, etc. is not conducted here; such a study would be more appropriate for specific applications where the additional effort was considered worthwhile. In this paper, we focus on the main idea behind the approximation.

Returning to our numerical results, we consider figure 6*d*, which shows the errors for a short time near *t* = 0 in our approximation of *t*^{−1/4} (familiar from the fractional order integrals and derivatives). The unboundedness of the kernel is not captured by the approximation, but the *L*_{1} error is small (we found it to be comparable to figure 5; details are not shown here), and the pointwise error for larger *t* is small as well. Note that this particular kernel is essentially the subject of Singh & Chatterjee (2006, 2008), and so we discuss it no further in this paper.

Finally, we present an informal numerical convergence study for three kernels that are well behaved near *t* = 0: numbers 1, 2 and 9. We use 10, 20 and 30 finite elements, but uniformly use *y*_{1} = 0.01 and *y*_{n−1} = 100 for each case. Results are presented in figure 7. It is clear that the accuracy improves as we increase the number of finite elements (from 10 elements to 30 elements, the improvement is more than 200-fold). Note that merely increasing the number of elements will eventually stop giving better accuracy, unless *y*_{1} is decreased and *y*_{n−1} is increased. In our experience with 16 digits of precision, very large numbers of finite elements lead to difficulties in the sufficiently accurate evaluation of the integrals involved in the finite element calculation, but we view this as a reflection of the widely disparate time scales present in the problem, and in this sense a feature of the problem rather than a shortcoming of the solution technique.

## 6. Two applications

In this section, to demonstrate implementation and accuracy, we discuss two of the several potential applications of our method. In the first application, we use our method to solve an integrodifferential equation involving a decaying kernel. The second application involves locating the Hopf bifurcation point of a similar integrodifferential equation.

### (a) Application 1

Consider
6.1
To solve the above with our method, we use
6.2
and
6.3
where the matrices **A**, **B**, **c** and **d** are obtained as explained in §4*b*. As mentioned earlier, these matrices for all 10 kernels of figure 1 are provided in Matlab format in the electronic supplementary material.

We solved the above set of ODEs using Matlab function ‘ode23t’ with tight error tolerances (10^{−8}). For comparison, we also solved equation (6.1) by the direct numerical evaluation of the integral term at each time step (we used a constant-step fourth-order Runge–Kutta solver, and Simpson’s rule; and refined the time step until convergence in the computed error plot was obtained). Results are shown in figure 8*a*,*b*, where a good match is observed; and in figure 8*c*,*d*, where the actual absolute pointwise error is seen to be small. The tight error tolerance used for ‘ode23t’ along with the refinement-until-convergence criterion used for the Simpson’s-rule-based convolution integrals implies that the errors plotted are essentially due to our approximation scheme alone.

### (b) Application 2

An essential feature of our algorithm is that we approximate the system using another system, and not merely the solution for specific initial conditions using some numerical discretization strategy. This allows us to consider situations outside the reach of usual numerical solvers of initial value problems.

Consider
6.4
where *α* and *β* are parameters. We will locate Hopf bifurcation points of this equation using our method, and obtain a stability boundary in the (*α*,*β*) plane. This boundary will be compared with that obtained from the direct analysis of the original system.

As explained in §4*b*, we first replace equation (6.4) with the following system of ODEs (we use 30 elements as before):
6.5
and
6.6
which together make a standard finite-dimensional linear constant coefficient system. For given *α* and *β*, we can numerically find the eigenvalues of the system. Here, for given *α*, we iteratively adjust *β* until the largest of the real parts of all the eigenvalues is as close as possible to zero. That value of *β*, which we treat as a function of *α*, gives a point (*α*,*β*) on the stability boundary in the parameter plane. The approximated stability boundary is obtained easily by repeating the above calculation for many values of *α*.

We now determine the stability boundary through the direct treatment of equation (6.4). Although the general solution cannot be written in a closed form as far as we know, we assume that, at the stability boundary, after a sufficiently long time passes for the transients to decay, a pure sinusoidal solution exists (this much is essential for a Hopf bifurcation). Seeking such sinusoidal solutions, we substitute $x(t) = \mathrm{e}^{\mathrm{i}\lambda t}$ (*λ* > 0) in equation (6.4) and simplify (using MAPLE) to get
where Ei is the exponential integral. As discussed above, we let $t \to \infty$, observe that the transient terms decay to zero, and obtain
In the above, for any *α*, on equating real and imaginary parts individually to zero, we get two equations to be solved for *β* and *λ*; a stability boundary in the (*α*,*β*) plane is thereby obtained. While the solution of these two equations is numerical, the roots obtained are highly accurate and so we consider the resulting stability boundary to be exact. Note that all the numerical root finding for this example was done using MAPLE, because of its facility with the exponential integral.

Figure 9 shows the results obtained for the stability boundary. Figure 9*a* shows the stability boundaries (visually indistinguishable). Pointwise absolute error is shown in figure 9*b*, and is seen to be less than about 0.4 per cent.

## 7. Conclusions

We have developed a novel technique for the numerical solution of integrodifferential equations involving convolution integrals with decaying kernels, for such cases where the kernel itself has an analytically known inverse Laplace transform. Decaying power law kernels, important in fractional order derivatives, are a special case.

The present work is a significant generalization of our earlier work on fractional order derivatives.

The advantage gained from our finite-dimensional approximation is the low computational cost within a simple algorithm. The original infinite-dimensional system’s direct solution to time *t* requires $O(t^2)$ computations; and solution using algorithms such as in Diethelm & Freed (2006) requires $O(t \log t)$ computations at the cost of using a more complicated algorithm. By contrast, an accurate solution is now obtained with only $O(t)$ computations.

The accuracy of our scheme has been demonstrated through several examples and preliminary studies of mesh refinement. We emphasize that the accuracy demonstrated here is by no means optimal; specific convolution kernels of particular interest may justify further work towards optimizing the discretization for those kernels. Nevertheless, accuracy is seen to be very good, particularly in light of the simplicity of the algorithm.

Consequently, our approach allows useful insights into systems where such convolutions appear, allowing (for example) stability analyses that are difficult to conduct with the usual numerical methods.

## Footnotes

- Received July 19, 2009.
- Accepted October 2, 2009.

- © 2009 The Royal Society