## Abstract

We study solutions to nonlinear stochastic differential systems driven by a multi-dimensional Wiener process. A useful algorithm for strongly simulating such stochastic systems is the Castell–Gaines method, which is based on the exponential Lie series. When the diffusion vector fields commute, it has been proved that, at low orders, this method is more accurate in the mean-square error than corresponding stochastic Taylor methods. However, it has also been shown that when the diffusion vector fields do not commute, this is not true for strong order one methods. Here, we prove that when there is *no* drift, and the diffusion vector fields do not commute, the exponential Lie series is usurped by the sinh-log series. In other words, the mean-square error associated with a numerical method based on the sinh-log series is always smaller than the corresponding stochastic Taylor error, in fact to *all* orders. Our proof uses the underlying Hopf algebra structure of these series, and a two-alphabet associative algebra of shuffle and concatenation operations. We illustrate the benefits of the proposed series in numerical studies.

## 1. Introduction

We are interested in designing strong series solutions of nonlinear Stratonovich stochastic differential systems of the form
Here, (*W*^{1},…,*W*^{d}) is a *d*-dimensional Wiener process and for some and all . We suppose that , *i*=1,…,*d*, are smooth non-commuting vector fields that, in coordinates, are . Repeated iteration of the chain rule reveals the stochastic Taylor expansion for the solution to the stochastic differential system. Indeed, for any smooth function , we have the formal stochastic Taylor series expansion
Here, is the collection of non-empty words over the alphabet . We adopt the standard notation for Stratonovich integrals; if *w*=*a*_{1}…*a*_{n}, then we set . We have also written the composition of the vector fields as *V*_{w}≡*V*_{a1}○*V*_{a2}○···○*V*_{an}. We define the flow map *φ*_{t} as the map such that *φ*_{t}○*f*○*y*_{0}=*f*○*y*_{t}. It has a formal stochastic Taylor expansion of the form
Note that *φ*_{t}○*φ*_{s}=*φ*_{t+s} for all non-negative *t*,*s* and *φ*_{0}=id, the identity mapping.

Classical strong numerical methods are based on truncating the stochastic Taylor expansion for the flow map and applying the resulting approximate flow map over successive small subintervals of the global interval of integration required (see Kloeden & Platen 1999 or Milstein 1995). An important and expensive ingredient in all numerical methods is the strong simulation/approximation of the required retained multiple integrals *J*_{w}(*t*), on each integration step. Here, we will take their suitable approximation as granted (e.g. Wiktorsson 2001 for their practical simulation). Now let be a smooth function.

We can also construct flow approximations from *φ*_{t} via the following procedure:

(i) construct the new series

*ψ*_{t}=*F*(*φ*_{t}),(ii) truncate this series to produce the finite expansion ,

(iii) reconstruct an approximate flow map as ,

(iv) the ‘flow error’ is the flow remainder ,

(v) an approximate solution is given by , and

(vi) the mean-square error in this approximation is .

For the special case , i.e. the logarithm function, this procedure was outlined by Castell & Gaines (1996). The resulting series is the exponential Lie series, which lies in , the non-commutative algebra of formal series generated by the vector fields *V*_{1},…,*V*_{d}. Indeed, any truncation , with multiple integrals replaced by suitable approximations, also lies in and is therefore a vector field. Hence, and an approximation to the solution can be constructed by solving the ordinary differential system for *u*=*u*(*τ*), for *τ*∈[0,1] with *u*(0)=*y*_{0}. The solution to this system at time *τ*=1, itself approximated by an ordinary differential numerical method, is .

So far, what has been proved for the Castell–Gaines method? Castell & Gaines (1995, 1996) proved that the strong order one-half method constructed in this way is always more accurate than the Euler–Maruyama method. Indeed, they proved that this method is *asymptotically efficient* in the sense of Newton (1991). Further in the case of a single driving Wiener process (*d*=1), they proved that the same is true for the strong order one Castell–Gaines method. By asymptotically efficient we mean, quoting from Newton (1991), that they ‘minimize the leading coefficient in the expansion of mean-square errors as power series in the sample step size’. Lord *et al*. (2008) and Malham & Wiese (2008) proved that when the diffusion vector fields *commute*, but not necessarily with the drift vector field, then the strong order one and also three-halves Castell–Gaines methods have a mean-square error that is smaller than the mean-square error for the corresponding stochastic Taylor method. However, Lord *et al*. (2008) also proved that, for linear diffusion vector fields that do not commute, the strong order one Castell–Gaines method does not necessarily have a smaller mean-square error than the corresponding stochastic Taylor method. Indeed, there are regions in the phase space where the local error of the stochastic Taylor method is smaller.

Hence, we are left with the following natural question when the diffusion vector fields do not commute. Is there a solution series ansatz *ψ*_{t}=*F*(*φ*_{t}) for some function *F*, for which the mean-square error of the numerical method so constructed is always smaller than the corresponding stochastic Taylor method? In this paper, we answer this question, indeed under the assumption that there is no *drift*, we

(i) prove the mean-square error of an approximate solution constructed from the sinh-log expansion (with

*F*=sinh-log above) is smaller than that for the stochastic Taylor expansion,*to all orders*,(ii) prove that a numerical method based on the sinh-log expansion has a global error that is smaller than that for the corresponding stochastic Taylor method,

(iii) use the Hopf shuffle algebra of words underlying such expansions; in fact, we retract to a

*new*associative algebra of concatenation and shuffle operators, which acts on the Hopf shuffle algebra of words, and(iv) underpin our theoretical results with concrete numerical simulations.

We examine and interpret these statements in detail next, in §2, where we answer the following immediate questions. First, what precisely is the sinh-log approximation and the properties we prove for it? Second, how do we prove the result; what is the connection with Hopf shuffle algebras and the concatenation–shuffle operator algebra mentioned? In §3, we provide the technical specification of the concatenation–shuffle operator algebra and prove some polynomial identities important for our main result. In §4, we present our main result. Then in §5, we discuss the global error result above, and perform numerical simulations confirming our results. We give some concluding remarks in §6.

## 2. Principal ideas

The goal of this section is to motivate and make precise statements about the sinh-log approximation we propose.

### (a) Stochastic series expansion approximations

We begin by outlining the approximation procedure presented in the introduction in more detail. Suppose the smooth function *F* has a real-series expansion of the form , with some finite radius of convergence about *x*=1, and for some coefficient set with *C*_{1}=1. Given the flow map *φ*_{t}, we construct the series

We substitute, into this series expansion, the stochastic Taylor series for the flow map *φ*_{t}. After rearrangement, we get
where

We truncate the series, dropping all terms *V*_{w} with words *w* of length . This generates the approximation , once we have replaced all retained multiple integrals *J*_{u}(*t*) by suitable approximations. Then, in principle, we construct the solution approximation from , where . Performing this reconstruction is non-trivial in general (see §5).

For example, to construct the exponential Lie series approximation of Castell & Gaines (1995), we take and construct the series
where *C*_{k}=1/*k*(−1)^{k−1} for . Substituting the stochastic Taylor series for *φ*_{t}, the series expansion for *ψ*_{t} above becomes the exponential Lie series (see Strichartz 1987; Ben Arous 1989; Castell 1993 or Baudoin 2004 for the full series)
where, for *w*=*a*_{1}…*a*_{n}, we have *V*_{[w]}=[*V*_{a1},[*V*_{a2},…,[*V*_{an−1},*V*_{an}],…,]] and

Here, is the group of permutations of the index set {1,…,|*w*|}, *e*(*σ*) is the cardinality of the set and is the combinatorial number: |*w*|−1 choose *e*(*σ*). Truncating this series and using suitable approximations for the retained *J*_{σ−1○w} produces . We then reconstruct the solution approximately using . The actual solution approximation is then computed by solving the ordinary differential equation generated by the vector field .

To construct the sinh-log approximation, we take *F*=sinh-log so that
where *C*_{1}=1 and *C*_{k}=1/2(−1)^{k−1} for . Again, substituting the stochastic Taylor series for *φ*_{t}, we get the series expansion for *ψ*_{t} shown above with terms *K*_{w}(*t*)*V*_{w}, where the coefficients *K*_{w}(*t*) now explicitly involve the sinh-log coefficients *C*_{k}. Then, in principle, we can reconstruct the solution approximately using

### Remark 2.1

Suppose the vector fields *V*_{i}, *i*=1,…,*d* are sufficiently smooth and *t* sufficiently small (but finite). Then, the approximation constructed using the sinh-log expansion, as just described, is square-integrable. Further, if *y* is the exact solution of the stochastic differential equation, and includes all terms *K*_{w}*V*_{w} involving words of length , then there exists a constant *C*(*n*,|*y*|) such that ; here, |·| is the Euclidean norm. This follows by arguments exactly analogous to those for the exponential Lie series given in Malham & Wiese (2008, theorem 7.1 and appendix A).

### Remark 2.2

Naturally, the exponential Lie series *ψ*_{t} and and its truncation lie in the Lie algebra of vector fields generated by *V*_{1},…,*V*_{d}. Hence, is simply the ordinary flow map associated with the autonomous vector field .

### Remark 2.3

The exponential Lie series originates with Magnus (1954) and Chen (1957); see Iserles (2002). In stochastic approximation, it appears in Kunita (1980), Fleiss (1981), Azencott (1982), Strichartz (1987), Ben Arous (1989), Castell (1993), Castell & Gaines (1995, 1996), Lyons (1998), Burrage & Burrage (1999), Baudoin (2004), Lord *et al*. (2008) and Malham & Wiese (2008).

### (b) Hopf algebra of words

Examining the coefficients *K*_{w} in the series expansion for *ψ* above, we see that they involve linear combinations of products of multiple Stratonovich integrals (we suspend explicit *t* dependence momentarily). The question is, can we determine *K*_{w} explicitly? Our goal here is to reduce this problem to a pure combinatorial one. This involves the Hopf algebra of words (see Reutenauer 1993).

Let be a commutative ring with unit. In our applications, we take or , the ring generated by multiple Stratonovich integrals and the constant random variable 1, with pointwise multiplication and addition. Let denote the set of all non-commutative polynomials and formal series on the alphabet over . With the concatenation product, is the associative *concatenation algebra*. For any two words with lengths |*u*| and |*v*|, we define the *shuffle product* to be the sum of the words of length |*u*|+|*v*| created by shuffling all the letters in *u* and *v* while preserving their original order. The shuffle product is extended to by bilinearity. It is associative, and distributive with respect to addition, and we obtain on a commutative algebra called the *shuffle algebra* (note for any word *w* where 1 is the empty word). The linear signed reversal mapping : *α*○*w*=(−1)^{n}*a*_{n}…*a*_{1}, for any word *w*=*a*_{1}…*a*_{n}, is the *antipode* on . There are two *Hopf algebra* structures on , namely and , where *η* and *ϵ* are unit and co-unit elements, and δ and δ′ are the respective co-products (Reutenauer 1993, p. 27). We define the associative algebra using the complete tensor product
with the shuffle product on the left and the concatenation product on the right (Reutenauer 1993, p. 29). The product of elements is given by , and formally extended to infinite linear combinations in via linearity. As a tensor product of two Hopf algebra structures, itself acquires a Hopf algebra structure.

### (c) Pullback to Hopf shuffle algebra

Our goal is to pullback the flow map *φ* and also *ψ* to (with ). Let be the set of all vector fields on ; it is an -module over (see Varadarajan 1984, p. 6). We know that, for the stochastic Taylor series, the flow map (with vector field composition as product). As , where is the subset of of the compositions of vector fields of length *n*, we can write

The linear *word-to-vector field map* given by *κ*:*ω*↦*V*_{ω} is a concatenation homomorphism, i.e. *κ*(*u**v*)=*κ*(*u*)*κ*(*v*) for any . And the linear *word-to-integral map* given by *μ*:*ω*↦*J*_{ω} is a shuffle homomorphism, i.e. for any (e.g. Lyons *et al*. 2007, p. 35 or Reutenauer 1993, p. 56). Hence, the map is a Hopf algebra homomorphism. The pullback of the flow map *φ* by *μ*⊗*κ* is

All the relevant information about the stochastic flow is encoded in this formal series; it is essentially Lyons’ *signature* (see Baudoin 2004; Lyons *et al*. 2007). Hence, by direct computation, formally we have
where *K*○*w* is defined by
and corresponds to *K*_{w} (indeed it is the pullback *μ***K*_{w} to ; see §3). Having reduced the problem of determining *K*○*w* to the algebra of shuffles, a further simplifying reduction is now possible.

### Remark 2.4

The use of Hopf shuffle algebras in stochastic expansions can be traced through, for example, Strichartz (1987), Reutenauer (1993), Gaines (1994), Li & Liu (2000), Kawski (2002), Baudoin (2004), Ebrahimi-Fard & Guo (2006), Murua (2006), Lyons *et al*. (2007) and Manchon & Paycha (2007), to name a few. The paper by Munthe-Kaas & Wright (2008) on the Hopf algebraic of Lie group integrators actually instigated the Hopf algebra direction adopted here. A useful outline on the use of Hopf algebras in numerical analysis can be found therein, as well as the connection to the work by Connes & Miscovici (1998) and Connes & Kreimer (1998) in renormalization in perturbative quantum field theory.

### (d) Retraction to concatenations and shuffles

For any given word *w*=*a*_{1}…*a*_{n+1}, we now focus on the coefficients *K*○*w*. We observe that it is the concatenation and shuffle operations encoded in the structural form of the sum for *K*○*w* that carry all the relevant information. Indeed, each term is a partition of *w* into subwords that are shuffled together. Each subword *u*_{i} is a concatenation of |*u*_{i}| letters, and so we can reconstruct each term of the form from the following sequence applied to the word *w*:
where the power of the letter *c* indicates the number of concatenations between letters concatenated together in each subword *u*_{i} and the letter *s* denotes the shuffle product between the subwords. In other words, if we factor out the word *w*, we can replace *K*○*w* by a polynomial *K* of the letters *c* and *s*. In fact, in lemma 3.5, we show that
Thus, we are left with the task of simplifying this polynomial into two variables (lying in the real associative algebra of concatenation and shuffle operations).

### Remark 2.5

We devote §3 to the rigorous justification of this retraction, the result above, and those just following. A key ingredient is to identify the correct action of this algebra over .

### Remark 2.6

There is a natural right action by the symmetric group on , the subspace of spanned by words of length *n* (Reutenauer 1993, ch. 8). This action is transitive and extends by linearity to a right action of the group algebra on . We are primarily concerned with shuffles and multi-shuffles, a subclass of operations in , and in particular, we want a convenient structure that enables us to combine single shuffles to produce multi-shuffles.

### (e) Stochastic sinh-log series coefficients

The coefficient set determines the form of the function *F*. Our ultimate goal is to show order by order that the stochastic sinh-log expansion guarantees superior accuracy. Hence, order by order, we allow a more general coefficient set , and show that the sinh-log choice provides the guarantee we seek.

### Definition 2.7 (partial sinh-log coefficient set)

Define the partial sinh-log coefficient sequence
where , and the *C*_{k} are zero for *k*>*n*+1.

With the choice of coefficients {*C*_{k}}, we see that
This has an even simpler form.

### Lemma 2.8

*With the partial sinh-log coefficients* {*C*_{k}}, *the coefficient K is given by K*=1/2(*c*^{n}−*α*_{n})+ϵ *s*^{n}, *where α*_{n}*is the antipode for words of length n*+1.

### Proof.

We think of (*c*−*s*)^{n}, with expansion by concatenation, as the generator for the polynomial (in ; see §3): . Then, by lemma 3.4 in §3, we have the following identity (*c*−*s*)^{n}≡−*α*_{n}. Hence, using the sinh-log coefficients and splitting the first term, we have . ▪

### Corollary 2.9

*For any word w*=*a*_{1}…*a*_{n+1}, *we have*, *and thus*

*Here, we use J*_{α○w}≡(−1)^{|w|} *J*_{ρ○w}, *where ρ is the unsigned reversal mapping, i.e. if w*=*a*_{1}…*a*_{n}, *then ρ*○*w*=*a*_{n}…*a*_{1}.

### Remark 2.10

For the stochastic sinh-log expansion, the coefficients *K*_{w} thus have an extremely simple form. There are several strategies to prove this form. The result can be proved directly in terms of multiple Stratonovich integrals by judicious use of their properties, the partial integration formula and induction—the proof is long but straightforward. That this strategy works is also revealed by the strategy we have adopted in this paper, which we believe is shorter and more insightful.

## 3. Concatenation–shuffle operator algebra

### (a) Algebra and action

With , let denote the set of all non-commutative polynomials and formal series on over . We can endow with the concatenation product or shuffle product, and also generate an associative *concatenation–shuffle operator algebra* on as follows.

### Definition 3.1 (shuffle gluing product)

The map , the associative and bilinear shuffle gluing product, is defined by *g*:*b*_{1}⊗*b*_{2}↦*b*_{1}*s**b*_{2}, i.e. we concatenate the element *b*_{1} with *s* and the element *b*_{2} in as shown.

Endowed with the shuffle gluing product, is an associative algebra with unit element *s*^{−1} (see Reutenauer 1993, p. 26, for the definition of *s*^{−1}). We define the graded associative tensor algebra by , with the shuffle gluing product on the left in and concatenation product on the right in ; here, and denote the subspaces of and , respectively, spanned by words of length *n* and *n*+1, respectively. Thus, if *b*_{1}⊗*u*_{1} and *b*_{2}⊗*u*_{2} are in , then their product is (*b*_{1}⊗*u*_{1})(*b*_{2}⊗*u*_{2})=(*b*_{1}*s**b*_{2})⊗(*u*_{1}*u*_{2}), with extension to by bilinearity.

We now define the homomorphism as follows. Any word , for some and , can be expressed in the form

There are (*k*−1) occurrences of the symbol ‘*s*’ in *b*, and *n*_{1}+*n*_{2}+···+*n*_{k}+*k*−1=|*b*|. Here, *c*^{n} represents the word consisting of *c* multiplied by concatenation *n* times, *c*^{0}=1; similarly for *s*^{n} and *s*^{0}. Then we define
where *w*=*u*_{n1}*u*_{n2}…*u*_{nk} and the successive subwords *u*_{n1},*u*_{n2},…,*u*_{nk} have respective lengths *n*_{1}+1, *n*_{2}+1,…, *n*_{k}+1. Note that the sum of the lengths of the subwords is *n*+1. The map ζ extends by linearity to . The *c*-symbol indicates a concatenation product and the *s*-symbol a shuffle product in the appropriate *n* slots between the *n*+1 letters in any word *w* on of length *n*+1. That ζ is a homomorphism from to follows from .

### Definition 3.2 (partition orbit)

We define the (shuffle) partition orbit, , of a word to be the subset of whose elements are linear combinations of words constructed by concatenating and shuffling the letters of *w*=*a*_{1}…*a*_{n},

For any , there exists a such that *u*=ζ(*b*⊗*w*). Hence, we can consider the preimage of under ζ in given by . Thus, any element in can be identified with an element for a unique and there is a natural projection map .

### (b) Polynomial identities

Here, we prove a sequence of lemmas that combine to prove our main results. The aim of the first two lemmas is to establish a form for the antipode *α* as a polynomial in the concatenation–shuffle operator algebra . We shall denote the antipode in by *α*_{n}; it sign reverses any word .

### Lemma 3.3 (partial integration formula)

*The partial integration formula applied repeatedly to the multiple Stratonovich integral J*_{w}, *where w*=*a*_{1}…*a*_{n+1}, *pulled back to**is given by*.

### Proof.

Repeated partial integration on the integral *J*_{w} with *w*=*a*_{1}…*a*_{n+1}, pulled back to via the word-to-integral map *μ*, generates the identity: . After rearrangement, the projection of this identity in onto via *π*, using the definition for *α*_{n}, generates the identity shown. ▪

### Lemma 3.4 (antipode polynomial)

*The antipode**and polynomial**are the same linear endomorphism on*,

### Proof.

The statement of the lemma is trivially true for *n*=1,2. We assume it is true for *k*=1,2,…,*n*−1. Direct expansion reveals that
By induction, using our assumption in this expansion and comparing with the partial integration formula in lemma 3.3 proves the statement for *k*=*n*. ▪

### Lemma 3.5

*The projection of**onto**via π generates*

### Proof.

For any with |*w*|=*n*+1, we have
Projecting this onto establishes the result. ▪

## 4. Mean-square sinh-log remainder is smaller

Suppose the *flow remainder* associated with a flow approximation is . The remainder associated with the approximation is thus *R*_{t}○*y*_{0}. We measure the error in this approximation, for each , in mean-square by

If we truncate *ψ*=*F*(*φ*) to , including all terms *V*_{w} with words of length , suppose the remainder is *r*, i.e. we have . Then, the sinh-log remainder *R*^{sl} to the corresponding approximate flow , taking the difference with the exact stochastic Taylor flow *φ*^{st}, is given by
where we can ignore the and terms in this section; we comment on their significance in §5. We compare this with the stochastic Taylor flow remainder , where is the stochastic Taylor flow series truncated to include all terms *V*_{w} with words of length . Indeed, we set , and use the *L*^{2} norm to measure the remainder. Hence, for any data *y*_{0}, we have
where the *mean-square excess*
If *E* is positive, then *R*^{sl}○*y*_{0} is smaller than *R*^{st}○*y*_{0} in the *L*^{2} norm.

## Theorem 4.1

*Suppose we construct the finite sinh-log expansion ψ using the partial sequence of sinh-log coefficients* {*C*_{k}} *and truncate ψ, producing*, *which only includes terms with words w, with*. *Then, the flow remainders for the sinh-log and the corresponding stochastic Taylor approximations are such that, for any data y*_{0}*and order**the mean-square excess is given by E*=*E*_{0}−ϵ*E*_{1}−ϵ^{2}*E*_{2}, *where E*_{0}>0,*E*_{2}>0 *and*
*where, for n even, we have*
*and*

## Proof.

If we truncate the sinh-log series flow map including all integrals associated with words *w* of length *n*, the remainder is given by
where henceforth we will ignore integrals in the remainder with (we will return to their possible significance in §5). Recall that from corollary 2.9, we have . The corresponding stochastic Taylor flow-map remainder is . The difference between the two is , where . The mean-square excess to the sinh-log remainder is *E*, which at leading order is

We need to determine whether this quantity is positive definite or not. We refer to as the *cross-correlation* terms and as the *auto-correlation* terms. The forms for *K*_{u} and imply that

Consider the zero-order ϵ^{0} term. Using lemma 4.3, the expectation of the cross-correlation term therein (the first term on the right above) is zero. Hence, we have
which is strictly positive. At the next order ϵ^{1}, the terms shown are solely from cross-correlations, with the auto-correlation terms cancelling with other cross-correlation terms. When *n* is odd, the expectation of this term is zero, again using lemma 4.3. When *n* is even, we get the expression for *E*_{1} stated in the theorem. Finally, at order ϵ^{2}, the coefficient shown is the auto-correlation term multiplied by minus one. Explicitly, we see that
which again is strictly positive. Combining these results generates the form for *E* stated. ▪

## Corollary 4.2

*When n is odd, E is positive and maximized when ϵ*=0. *When n is even, it is positive at ϵ*=0, *but maximized at a different value of ϵ; the maximizing value will depend on the vector fields*.

## Lemma 4.3

*For any pair**we have that*.

## Proof.

Every Stratonovich integral *J*_{w} is a linear combination of Itô integrals , where the set consists of *w* and all multi-indices *u* obtained by successively replacing any two adjacent (non-zero) equal indices in *w* by 0, see Kloeden & Platen (1999, eqn (5.2.34)). As all indices in *w* are non-zero by assumption, the constant *c*_{u} is given by *c*_{u}=(1/2)^{n(u)}, where *n*(*u*) denotes the number of zeros in *u*. Because adjacency is retained when reversing an index, it follows that . Lemma 5.7.2 in Kloeden & Platen (1999) implies that the expected value of the product of two multiple Itô integrals is of the form
for some function *f*. Here, ℓ(*u*) denotes the number of non-zero indices in *u*, while *k*_{0}(*u*) denotes the number of zero components preceding the first non-zero component of *u*, and *k*_{i}(*u*), for *i*=1,…,ℓ(*u*), denotes the number of components of *u* between the *i*th and (*i*+1)th non-zero components, or the end of *u* if *i*=ℓ(*u*). It follows that *k*_{i}(*u*)=*k*_{ℓ(u)−i}(*ρ*○*u*). As all other operations in the arguments of *f* are unchanged by permutations in *u* and *v*, we deduce that , and consequently, . ▪

## 5. Practical implementation

### (a) Global error

We define the *strong global error* associated with an approximate solution over the global interval [0,*T*] as . Suppose the exact *y*_{T} and approximate solution are constructed by successively applying the exact and approximate flow-maps *φ*_{tm,tm+1} and on *M* successive intervals [*t*_{m},*t*_{m+1}], with *t*_{m}=*m**h* for *m*=0,1,…,*M*−1 and *h*=*T*/*M* as the fixed step size, to the initial data *y*_{0}. A straightforward calculation shows that, up to higher order terms, we have
where and (see Lord *et al*. 2008 or Malham & Wiese 2008). Note that the flow remainder *R*_{tm,tm+1} always has the form
where for the sinh-log series , for the exponential Lie series (for each term in the linear combination *V*_{[w]}) and for the stochastic Taylor series . Substituting this into , we see that has the form
where

This formula outlines the contribution of the standard accumulation of local errors, over successive subintervals of the global interval of integration, to the global error. Note that, to leading order, we have

For the term , we focus for the moment on the case *m*<ℓ (our final conclusions below are true irrespective of this). At leading order, we have
and

Substituting these expressions into the form for above, we get

This breakdown allows us to categorize the different mechanisms through which local errors contribute to the global error at leading order. Indeed, in the local flow remainder *R*, we distinguish terms as follows.

(i)

*Zero expectation*. Terms with |*w*|=*n*+1 of zero expectation generate terms of order*h*^{n}in through two routes, through and the last term in the expression for just above. In , they generate order*h*^{n+1}terms, and the single sum over*m*means that their contribution to the global error is order . In the last term in , they generate, when the expectations of the products indicated are non-zero, terms of order*h*^{n+2}; the double sum over ℓ and*m*is then order*h*^{n}. Higher order terms with zero expectation simply generate a higher order contribution to the global error.(ii)

*Non-zero expectation*. Terms with |*w*|=*n*+1 of non-zero expectation will generate, through the first term in , terms of order*h*^{n−1}—not consistent with an order*n*/2 integrator with global mean-square error of order*h*^{n}. If any such terms exist in*R*, their expectation must be included (which is a cheap additional computational cost) in the integrator, i.e. we should include in . This will mean that the term left in*R*is , which has zero expectation and contributes to the global error through mechanism (i) above. Further, terms with |*w*|=*n*+2 of non-zero expectation will generate terms of order*h*^{n}in , i.e. they will also contribute at leading order through the first term in .

The terms of non-zero expectation in *R*, which contribute at leading order to , appear either as natural next order terms or through the higher order terms mentioned in the last section. We will see this explicitly in §5*b* presently. We can, with a cheap additional computational cost, include them through their expectations in the integrators, so that when we compare their global errors, the corresponding terms left in the remainders have zero expectation and are higher order (and thus not involved in the comparison at leading order). Further, fortuitously, the terms of zero expectation that contribute in (i) through the last term in are exactly the same for the stochastic Taylor and sinh-log integrators. This is true at all orders and is a result of the following lemma, and that for the stochastic Taylor and sinh-log expansions when |*a*|=1, then .

### Lemma 5.1

*Suppose**and* |*a*|=1, *then*.

### Proof.

If |*w*| is even, the expectations on both sides are zero. If |*w*| is odd, in the shuffle products and , pair off terms where the letter *a* appears in the same position in an individual term of and the reverse of an individual term of . The pair, with the one-half factor, will have the same expectation as the corresponding term in shuffle product . ▪

Lastly, in the sinh-log remainder, the terms *K*_{w} with |*w*|=*n*+1 all have zero expectation. Similarly, in the stochastic Taylor remainder, the terms *J*_{w} with |*w*|=*n*+1 odd also all have zero expectation. However, when |*w*|=*n*+1 is even, then some terms may have non-zero expectation, and as described above, their expectation would be included in the integrator, leaving the terms in the stochastic Taylor remainder. In these cases, the argument in §4 comparing the local errors needs to be modified slightly, with the terms in replaced by . This results in the term in *E*_{0} in the proof of theorem 4.1 being replaced, when |*u*| and |*v*| are even, by . This still implies that *E*_{0} is strictly positive. Contributions through the last term in by the terms are the same as *J*_{w}*V*_{w}.

### (b) Simulations

We will demonstrate the properties we proved for the sinh-log series for numerical integration schemes of strong orders one and three halves. We will consider a stochastic differential system with no drift, two driving Wiener processes and non-commuting governing *linear* vector fields *V*_{i}○*y*≡*a*_{i}*y* for *i*=1,2.

We focus on the strong order one case first, and extend the analytical computations in Lord *et al*. (2008). With *n*=2,*C*_{1}=1 and *C*_{2}=−1/2+ϵ, the mean-square excess *E*, for general , is given by

Here, and are both 4*N*×*N* real matrices and the 4*N*×4*N* real matrix *B*(ϵ) consists of *N*×*N* blocks of the form *b*(ϵ)⊗*I*_{N} (here ⊗ denotes the Kronecker product), where *I*_{N} is the *N*×*N* identity matrix and *b*(ϵ) is

Each of the eigenvalues of *b*(ϵ) are *N* multiple eigenvalues for *B*(ϵ). The eigenvalues of *b*(ϵ) are shown in figure 1. The sinh-log expansion corresponds to ϵ=0, while the exponential Lie series corresponds to ϵ=1/6. The eigenvalues of *b*(0) are 5/24 and zero (thrice), confirming our general result for the sinh-log expansion. However, the eigenvalues of *b*(1/6) are 5/24, 0.5264, 0.1667 and −0.0264. One negative eigenvalue means that there are matrices *a*_{1} and *a*_{2} and initial conditions *y*_{0} for which the order one stochastic Taylor method is more accurate, in the mean-square sense, than the exponential Lie series method (for linear vector fields, we also call this the Magnus method). From figure 1, we deduce that, for any small values of ϵ away from zero, we cannot guarantee *E*>0 for all possible governing vector fields and initial data. The strong order one sinh-log integrator is optimal in this sense. This is also true at the next order from corollary 4.2.

For our simulations, we take *N*=2 and set the coefficient matrices to be
In figure 2, using these matrices, we plot the mean-square excess *E*^{ls} for the exponential Lie series and *E*^{sl} for the sinh-log series, as a function of the two components of *y*_{0}. We see there are regions of the phase space where *E*^{ls} is negative—of course *E*^{sl} is positive everywhere. Hence, if the solution *y*_{t} of the stochastic differential system governed by the vectors fields *V*_{i}○*y*=*a*_{i}*y*, *i*=1,2, remains in the region where *E*^{ls} is negative, then the numerical scheme based on the order one exponential Lie series is less accurate than the stochastic Taylor method. Note that for the stochastic Taylor method, we need to include the terms
in the integrator. These are the expectation of terms with |*w*|=4 that contribute at leading order in the global error (only), and which can be cheaply included in the stochastic Taylor integrator. For the exponential Lie series, we include the terms
in . These are non-zero expectation terms with |*w*|=4 that contribute at leading order in the global error through , where *C*_{2}=−1/2. In the same vein, for the sinh-log integrator, we include in the terms

Figure 3 shows the global error versus time for all three integrators for the linear system. We used the global interval of integration [0,0.0002], starting with *y*_{0}=(19.198,28.972)^{T}, and step size *h*=2.5×10^{−5}. With these initial data, the small global interval of integration means that all 10 000 paths simulated stayed within the region of the phase space where *E*^{ls} is negative in figure 2.

The error for the exponential Lie series integrator, we see in figure 3*a*, is larger than that for the stochastic Taylor integrator. The error for the sinh-log integrator is smaller, though only marginally so. In fact, it is hardly discernible from the stochastic Taylor plot, so figure 3*b* shows a magnification of the left region of the plot in figure 3*a*. We plot the differences between the errors in figure 3*c* to confirm the better performance of the sinh-log integrator over the global interval. Further, estimates for the local errors for the sinh-log and Lie series integrators from the data in figure 3, of course, quantitatively match analytical estimates for the mean-square excess *E* above.

### Remark 5.2

Generically, the Castell–Gaines method of strong order one markedly outperforms the sinh-log method (which itself outperforms the stochastic Taylor method more markedly). However, as we have seen, there are pathological cases for which this is not true.

In figure 3*d*, we compare the global errors for the strong order three-halves sinh-log and stochastic Taylor methods, with governing linear vector fields with coefficient matrices
and initial data *y*_{0}=(1,1/2)^{T}. Again, as expected, we see that the stochastic Taylor method is less accurate than the sinh-log method for sufficiently small step sizes.

### Remark 5.3

There is one caveat we have not mentioned thus far. Constructing the approximation from is, in general, non-trivial. For linear vector fields *V*_{i}○*y*= *a*_{i} *y*, we know that is simply a matrix, and we can straightforwardly construct using the matrix square root. For general nonlinear vector fields, we have not, as yet, found a superior method to simply expanding the square root shown to a sufficient number of high degree terms.

## 6. Concluding remarks

We have shown that the mean-square remainder associated with the sinh-log series is always smaller than the corresponding stochastic Taylor mean-square remainder, when there is no drift, to all orders. As the order one-half sinh-log numerical method is the same as the order one-half Castell–Gaines method, it trivially inherits the asymptotically efficient property as well (indeed, if we include a drift term as well). We have not endeavoured to prove asymptotic efficiency more generally. However, in §5, we demonstrated that for two driving Wiener processes, the order one sinh-log numerical method is optimal in the following sense. From figure 1, we see that any deviation of ϵ from zero will generate a negative eigenvalue for *b*(ϵ). Consequently, there exist vector fields such that the mean-square excess will be negative in regions of the phase space. Further, from corollary 4.2, the order three-halves sinh-log integrator is also optimal in this sense. These results are only true when there is no drift, and it could be argued that our simulations and demonstrations are somewhat academic. However, the results we proved have application in splitting methods for stochastic differential equations. For example, in stochastic volatility simulation in finance, Ninomiya & Victoir (2006) and Halley *et al*. (2008) simulated the Heston model for financial derivative pricing, and used splitting to preserve positivity for the volatility numerically. They employed a Strang splitting that separates the diffusion flow from the drift flow and requires a distinct simulation of the purely diffusion governed flow.

Why is the sinh-log expansion the answer? This result is intimately tied to the mean-square error measure we chose. The terms in the remainder of any truncation contain multiple Stratonovich integrals. Associated with each one is a mean-square error. There is a structure underlying these expectations. The sinh-log expansion somehow encodes this structure in an optimal form, it emulates the stochastic Taylor information more concisely. The next question is of course, what is the best structure when we include drift? Answering this is our next plan of action.

## Acknowledgements

We would like to thank Peter Friz, Terry Lyons and Hans Munthe-Kaas for interesting and useful discussions related to this work. We would also like to thank the anonymous referees for their critique and suggestions that helped to improve the original paper.

## Footnotes

Dedicated to Mrs Jeanne Anne Malham.

- Received April 16, 2009.
- Accepted August 18, 2009.

- © 2009 The Royal Society