## Abstract

The aim of this paper is to introduce a new numerical algorithm for solving the continuous time nonlinear filtering problem. In particular, we present a particle filter that combines the Kusuoka–Lyons–Victoir (KLV) cubature method on Wiener space, used to approximate the law of the signal, with a minimal variance ‘thinning’ method, called the tree-based branching algorithm (TBBA), used to keep the size of the cubature tree constant in time. The novelty of our approach resides in the adaptation of the TBBA algorithm to *simultaneously* control the computational effort and incorporate the observation data into the system. We provide the rate of convergence of the approximating particle filter in terms of the computational effort (number of particles) and the discretization grid mesh. Finally, we test the performance of the new algorithm on a benchmark problem (the Beneš filter).

## 1. Introduction

The main goal of stochastic filtering is to estimate the state of a dynamical system based on partial observations. We model the dynamical system by a stochastic process *X*={*X*_{t}}_{t≥0}, called the *signal*. We do not observe the signal directly. Instead, we make use of the information provided by observing another process *Y*={*Y*_{t}}_{t≥0}, called the *observation process*. The observation process is, at each instant of time, a functional of the signal up to that time and of some measurement noise, i.e. *Y*_{t}=*Γ*({*X*_{s}}_{s≤t},*W*_{t}), where {*W*_{t}}_{t≥0} is another stochastic process modelling the noise. In mathematical terms, the problem reduces to computing the conditional expectation *π*_{t}(*φ*)=E[*φ*(*X*_{t})∣𝒴_{t}], where *φ* belongs to a suitable space of test functions, and 𝒴={𝒴_{t}}_{t≥0} is the filtration generated by the observation process. In other words, we are interested in computing the conditional distribution of the signal *X*_{t} given 𝒴_{t}, which can be viewed as a probability measure-valued process *π*={*π*_{t}}_{t≥0}. With notable exceptions (such as the Kalman–Bucy filter and the Beneš filter), *π* is an infinite-dimensional process. One cannot find an analytically computable expression for *π* and has to rely on numerical approximations for inference purposes. In the numerical experiments below, we will make use of the explicit solution of the Beneš filter to test the accuracy of our algorithm. There are a number of different numerical methods for solving the filtering problem, ranging from the solution of partial differential equations to Wiener chaos expansions [1, ch. 8]. One of the most successful approaches, widely used in practice, is the class of particle approximations. In this approach, the conditional distribution *π*_{t} is approximated by the empirical distribution of a system of randomly weighted particles.
The classical particle filter, first introduced by Gordon *et al.* [2], uses a correction mechanism that eliminates, at particular times, the particles with small weights and multiplies those with bigger weights, maintaining the total number of particles in the system constant. However, this procedure adds some randomness to the system, diminishing the accuracy of the approximation. Hence, it is appropriate to use a technique that minimizes this undesired effect. Crisan & Lyons [3] introduced the tree-based branching algorithm (TBBA), which satisfies a minimal variance property. Using the TBBA, the correction in the particle system can be performed optimally. Another aspect of these branching particle filters is the choice of the resampling times. Most of the theoretical results assume fixed, deterministic times of resampling and this is the approach that we will follow. Nevertheless, in practice, these times are randomly selected in terms of some overall characteristics of the particle systems, see Del Moral *et al.* [4] and Crisan & Obanubi [5] for a theoretical study of this problem.

In the standard particle filter, the component particles follow the law of the signal. Usually, we model the signal by means of a stochastic differential equation (SDE) driven by a Brownian motion. A classical result tells us that *π*_{t}(*φ*) can be expressed as the expected value of a functional of the signal parametrized by the given observation path. Naturally, an efficient approximation of the law of the signal would give a good approximation of *π*_{t}(*φ*). In recent years, Kusuoka [6], Lyons & Victoir [7] and Kusuoka [8] among others have introduced high-order schemes for solving SDEs, known as cubature on Wiener space or Kusuoka–Lyons–Victoir (KLV) methods. Surprisingly, these methods are essentially deterministic. They involve the construction of a discrete (deterministic) measure with support given by leaves of an *n*-ary tree, with the nodes being obtained by solving ordinary differential equations (ODEs). Unfortunately, this tree-like structure makes the number of ODEs to be solved increase exponentially. In order to counter this feature, the KLV cubature methods can be combined with a partial sampling procedure, particularly useful when the dimension of the SDE to solve is high or the final time is large. The use of cubature methods for solving the stochastic filtering problem has been suggested in Crisan & Ghazali [9] and Litterer & Lyons [10]. Moreover, the area of application of these methods is expanding continuously, see, for instance, Crisan & Manolarakis [11], where they are used to solve backward SDEs.

In this paper, we present a new numerical algorithm to solve the nonlinear stochastic filtering problem. This algorithm is based on a combination of the KLV method and the TBBA. The KLV method is used to compute a high-order approximation of the law of the signal *X*, whereas the TBBA is used to partially prune the KLV tree in a coherent manner. In our approach, the weights of the TBBA are computed taking into account both the cubature weights (the weights of the discrete measure) and the likelihood weights. In this way, we can simultaneously control the computational effort at each time step and mitigate the sample degeneracy.

The paper is organized as follows. In §2, we introduce some basic notation on multi-indices and vector fields necessary to present the cubature on Wiener space. In §3, we introduce the basic results on cubature; in particular, we give the main bound on the local and global error of the method. Section 4 is devoted to the detailed description of the filtering problem. In addition, we also introduce the Crisan–Ghazali approach to apply cubature methods to filtering. In §5, we recall the TBBA algorithm, recall the basic properties of the random variables generated by the method and describe in detail the construction of the associated trees. In §6, we introduce the new algorithm and prove the convergence of the particle approximation. A variation of the algorithm where the likelihood weights are not taken into account when pruning the KLV tree is also introduced. Finally, in §7, we test the new algorithm on the Beneš filter.

## 2. Basic notation and preliminaries

Here, we introduce some basic notation on vector fields and multi-indices used to present the Stratonovich–Taylor expansion and the main results on cubature.

Let *p*∈ℕ, and let 𝒜 be the set of all multi-indices with values in {0,…,*p*}, i.e. 𝒜={∅}∪⋃_{k≥1}{0,…,*p*}^{k}. For any (non-empty) multi-index *α*=(*α*_{1},…,*α*_{k})∈𝒜, define its length by |*α*|=*k*, its degree by ∥*α*∥=|*α*|+card{1≤*i*≤*k*:*α*_{i}=0}, the truncated index of length *i*=1,…,*k* by *α*[*i*]=(*α*_{1},…,*α*_{i}) and *α*−=*α*[*k*−1]. We also define the set 𝒜_{m}={*α*∈𝒜:∥*α*∥≤*m*}.

Let C^{∞}_{b}(ℝ^{d};ℝ^{d}) denote the space of ℝ^{d}-valued infinitely differentiable bounded functions defined on ℝ^{d} whose derivatives of any order are bounded. Recall that *V*∈C^{∞}_{b}(ℝ^{d};ℝ^{d}) can be viewed as a vector field (or a first-order differential operator) on ℝ^{d}. Given a family of vector fields *V*_{α1},…,*V*_{αk}∈C^{∞}_{b}(ℝ^{d};ℝ^{d}), *V*_{α} will stand for the usual composition of vector fields, i.e. *V*_{α}=*V*_{α1}∘⋯∘*V*_{αk}.

Consider the probability space (*Ω*,ℱ,ℙ), where *Ω*=C_{0}([0,*T*];ℝ^{p}) is the space of ℝ^{p}-valued continuous functions starting at 0, ℱ its Borel *σ*-algebra and ℙ the Wiener measure. Also consider the coordinate mapping process *B*_{t}(*ω*)=*ω*(*t*), which under ℙ is a Brownian motion starting at 0. For *ω*∈*Ω*, we make the convention *ω*^{0}(*t*)=*t* and *B*^{0}_{t}=*t*. Let *X*_{t,x} be the unique solution of the following *d*-dimensional SDE written in Stratonovich form
*X*_{t,x}=*x*+∑_{i=0}^{p}∫_{0}^{t}*V*_{i}(*X*_{s,x})∘d*B*^{i}_{s}, (2.1)
where *V*_{0},…,*V*_{p}∈C^{∞}_{b}(ℝ^{d};ℝ^{d}).

Given *f*, a sufficiently smooth function, and *X*_{t,x}, the solution of (2.1), we can expand *f*(*X*_{t,x}) in terms of iterated Stratonovich integrals. The precise statement is as follows (see proposition 2.1 in Lyons & Victoir [7], and Kloeden & Platen [12] for more details on stochastic Taylor expansions).

### Lemma 2.1 (Stratonovich–Taylor expansion)

*Let* . *Then*
*where* . *The remainder process R*_{m}(*t*,*x*,*f*) *satisfies*
*where C*=*C*(*m*) *is a positive constant that only depends on m*.

## 3. Cubature method on Wiener space

The cubature method on Wiener space is an infinite-dimensional extension of cubature methods on . In this framework, the role of polynomials is played by iterated Stratonovich integrals, and the role of Taylor expansions is played by Stratonovich–Taylor expansions.

### (a) One-step cubature measure

Let *X*_{t,x} be the unique solution of equation (2.1). We choose a version of *X*_{t,x} that coincides on C_{bv}([0,*T*];ℝ^{p}), the subspace of *Ω* of functions of bounded variation, with the pathwise solution.

### Definition 3.1

A measure assigning positive weights *λ*_{1},…,*λ*_{c_d} to paths *ω*_{1},…,*ω*_{c_d} of bounded variation is a cubature measure of degree *m* for the Wiener measure, if for all *α*=(*α*_{1},…,*α*_{k})∈𝒜_{m},
𝔼[∫_{0<t_1<⋯<t_k<T}∘d*B*^{α_1}_{t_1}⋯∘d*B*^{α_k}_{t_k}]=∑_{j=1}^{c_d}*λ*_{j}∫_{0<t_1<⋯<t_k<T}d*ω*^{α_1}_{j}(*t*_{1})⋯d*ω*^{α_k}_{j}(*t*_{k}).
The constant *c*_{d}=*c*_{d}(*m*,*p*) only depends on the degree and the dimension of the Brownian motion.

Lyons & Victoir [7] proved that one can always find a cubature measure supported on at most card(𝒜_{m}) continuous paths of bounded variation, see theorem 2.4 therein. They also gave an explicit expression of a degree-five cubature measure. Gyurkó & Lyons [13] have constructed cubature formulae of higher degrees and for various dimensions of the driving Brownian motion.
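To make definition 3.1 concrete, consider the classical degree-3 example for a one-dimensional Brownian motion on [0,*T*]: two bounded-variation paths ending at ±√*T*, each carrying weight 1/2, match every iterated integral (equivalently, every moment of *B*_{T}) up to degree 3. The snippet below is an illustrative Python sketch of this moment matching (the paper's experiments are in C++; the names here are ours).

```python
import math

T = 2.0
# Degree-3 cubature for one-dimensional Brownian motion on [0, T]:
# two bounded-variation paths ending at +/- sqrt(T), each with weight 1/2.
weights = [0.5, 0.5]
endpoints = [math.sqrt(T), -math.sqrt(T)]

def cubature_moment(k):
    """k-th moment of B_T under the cubature measure."""
    return sum(w * x**k for w, x in zip(weights, endpoints))

# Moments of B_T ~ N(0, T): E[B_T] = 0, E[B_T^2] = T, E[B_T^3] = 0.
assert abs(cubature_moment(1) - 0.0) < 1e-12
assert abs(cubature_moment(2) - T) < 1e-12
assert abs(cubature_moment(3) - 0.0) < 1e-12
# The fourth moment is NOT matched (T^2 here versus 3*T^2 for the Gaussian),
# which is why the rule has degree 3 and not 5.
print(cubature_moment(2), cubature_moment(4), 3 * T**2)
```

The mismatch in the fourth moment illustrates why higher-degree cubature measures, such as the degree-five formula of Lyons & Victoir, require more paths.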

### Remark 3.2

Assume ℚ_{1} is a cubature measure of degree *m* on [0,1] with weights *λ*_{j} and paths *ω*_{j}. Then, for any *T*>0, by the scaling property of Brownian motion, ℚ_{T} is a cubature measure of degree *m* on [0,*T*], where ℚ_{T} assigns the weights *λ*_{j} to the rescaled paths *ω*^{T}_{j}(*t*)=√*T* *ω*_{j}(*t*/*T*), 0≤*t*≤*T*, with the time component rescaled as *ω*^{T,0}_{j}(*t*)=*t*.

### Definition 3.3

We define the cubature approximation of degree *m* of *P*_{T}*f*(*x*)=𝔼[*f*(*X*_{T,x})] by ℚ_{T}*f*(*x*)=∑_{j=1}^{c_d}*λ*_{j}*f*(*X*_{T,x}(*ω*_{j})).

### Remark 3.4

Let *u*(*t*,*x*) be the solution at time *t* of ∂*u*/∂*t*(*t*,*x*)=*Lu*(*t*,*x*), *u*(0,*x*)=*f*(*x*), where the operator *L* is defined by *L*=*V*_{0}+½∑_{i=1}^{p}*V*^{2}_{i}. Then, *u*(*t*,*x*)=*P*_{t}*f*(*x*)=𝔼[*f*(*X*_{t,x})]. Hence, ℚ_{T}*f*(*x*) is an approximation of the semigroup *P*_{T}*f*(*x*), which has infinitesimal generator *L*. In other words, cubature on Wiener space can be used to produce a high-order approximation method for solving second-order parabolic differential equations.

The main tool to bound the approximation error relies on the Stratonovich–Taylor expansion and is stated in lemma 3.5, which is proved in Litterer & Lyons [10, lemma 27.2].

### Lemma 3.5

*Let* . *Then, the remainder process R*_{m}(*T*,*x*,*f*) *in the Stratonovich–Taylor expansion of f*(*X*_{T,x}) *satisfies*
*where* *is a positive constant that only depends on p*,*m and* .

Using lemma 3.5, lemma 2.1 and the triangle inequality, one obtains a bound for the error of the one-step cubature approximation, see Lyons & Victoir [7, proposition 3.2].

### Proposition 3.6

*Let* *be a degree-m cubature measure. Then*
*where* *is a positive constant that only depends on p*,*m and* .

### (b) Iterated cubature measure

In general, the bound obtained in proposition 3.6 does not directly yield a good approximation of *P*_{T}*f*(*x*) when *T* is large. To overcome this difficulty, one iterates the cubature measure along a partition *Π*^{n,T}={0=*t*_{0}<*t*_{1}<⋯<*t*_{n}=*T*} of [0,*T*]. We will denote by *Π*^{j,T}={*t*_{0},…,*t*_{j}}, *j*=1,…,*n*, the subpartitions of *Π*^{n,T}.

### Definition 3.7

Let the measure define a cubature formula of degree *m* on [0,1], and *Π*^{n,T} be a partition of [0,*T*]. The global cubature measure of degree *m* is defined by
where denotes the concatenation of the paths *ω* and .

It is also useful to view the cubature formulae on Wiener space as Markov operators acting on discrete measures on . This interpretation justifies the following definition introduced by Litterer & Lyons [10].

### Definition 3.8

Given a positive measure on and a cubature measure of degree *m* on [0,1], we define the KLV_{m} operation of degree *m* with respect to *μ* over a time step *s* by , where *X*_{s,xi}(〈*s*,*ω*〉_{j}) is the solution, at time *s*, of the following ODE

Note that one can identify the measure *μ* in definition 3.8 as a discrete distribution over all initial conditions . One can also iterate the KLV_{m} operation along a partition *Π*^{n,T}.

### Definition 3.9

Let be a cubature measure of degree *m* on [0,1] and let *Π*^{n,T} be a partition of [0,*T*]. The KLV_{m} operation of degree *m* along *Π*^{n,T} is defined recursively by
and .

Some remarks are in order.

### Remark 3.10

Let denote the set of multi-indices . For any define and points by setting inductively
and . Then, the global cubature measure along *Π*^{n,T} can be written as the following discrete measure on paths , whereas the KLV_{m} operation along *Π*^{n,T} can be written as the following discrete measure on , . Moreover, .

### Remark 3.11

The iterative procedure to generate can be viewed as a *c*_{d}-ary tree, which we will call the cubature tree. Hence, the support of the measure (and of KLV_{m}(*Π*^{n,T},*x*)) grows exponentially with the number of subintervals of the partition. In particular, we have to solve ODEs to obtain the points in the support of KLV_{m}(*Π*^{n,T},*x*). When *n* is large, the computational cost associated with solving these ODEs cannot be ignored, and some mechanism to control the size of the support of is needed. The basic approach is to allow the size of the tree to grow only up to a constant decided by the user and then to keep it constant by culling the branches with small weights. The procedure can be random, as in Ninomiya [14], where it is proposed to use the TBBA algorithm of Crisan and Lyons. Litterer & Lyons [10] have recently introduced a deterministic recombination procedure that essentially allows one to replace the original cubature measure with a measure with smaller support, without increasing the error.
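To fix ideas, the following Python sketch (our own illustration, not the paper's implementation) iterates a degree-3 KLV step for a scalar SDE d*X*=*V*_{0}(*X*)d*t*+*V*_{1}(*X*)∘d*B*: each node spawns *c*_{d}=2 children obtained by solving the ODEs driven by the two cubature paths with slopes ±1/√δ, so the support after *n* steps has 2^{n} points, exhibiting the exponential growth discussed above.

```python
import math

def rk4(f, x, h, steps):
    """Classical RK4 integrator for the autonomous ODE x' = f(x)."""
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

def klv_step(nodes, V0, V1, delta):
    """One KLV step of degree 3 in dimension one: every weighted node
    (x, lam) spawns c_d = 2 children along the ODEs driven by the two
    cubature paths with slopes +/- 1/sqrt(delta); weights are halved."""
    children = []
    for x, lam in nodes:
        for sign in (+1.0, -1.0):
            f = lambda y, s=sign: V0(y) + s * V1(y) / math.sqrt(delta)
            children.append((rk4(f, x, delta / 20, 20), 0.5 * lam))
    return children

# Example: Ornstein-Uhlenbeck-type signal dX = -X dt + 0.5 dB, X_0 = 1.
V0 = lambda x: -x
V1 = lambda x: 0.5
n, T = 6, 1.0
tree = [(1.0, 1.0)]
for _ in range(n):
    tree = klv_step(tree, V0, V1, T / n)

print(len(tree))                      # 2**6 = 64 leaves: exponential growth
print(sum(lam for _, lam in tree))    # total mass stays 1
```

For this linear example the cubature mean reproduces E[*X*_{T}]=e^{−T}*X*_{0} essentially exactly, while the number of ODE solves doubles at every step, which is precisely the growth that the TBBA or recombination is meant to control.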

### Remark 3.12

As our results will be based on the cubature approximation to Picard's filter in Crisan & Ghazali [9], see theorem 4.4, we refrain from presenting a detailed study of the error for the global cubature measure. For further details and developments (e.g. the application of cubature to Lipschitz test functions), we refer the reader to Lyons & Victoir [7], Kusuoka [8], Crisan & Ghazali [9], Cass & Litterer [15] and Nee [16].

## 4. Cubature applied to filtering

Here, we introduce the set-up for the filtering problem [1, ch. 3]. We also present the approach by Crisan & Ghazali [9] to the application of cubature on Wiener space to filtering.

### (a) Stochastic filtering set-up

Let (*Ω*,ℱ,ℙ) be the probability space defined in §2, assumed to accommodate a *k*-dimensional Wiener process *W* independent of *B*. Let {ℱ_{t}}_{t≥0} be a filtration satisfying the usual conditions of completeness and right continuity. On this probability space, we consider a partially observed system (*X*,*Y*)={(*X*_{t},*Y* _{t})}_{0≤t≤T}. The unobserved process *X*={*X*_{t}}_{0≤t≤T}, called *the signal*, is the solution of the *d*-dimensional Stratonovich SDE (2.1). The observed component *Y* ={*Y* _{t}}_{0≤t≤T}, called *the observation process*, is given by the following *k*-dimensional process
*Y*_{t}=∫_{0}^{t}*h*(*X*_{s}) d*s*+*W*_{t}, *t*∈[0,*T*],
where *h*:ℝ^{d}→ℝ^{k} is a bounded measurable function, and *W* is a *k*-dimensional ℱ_{t}-Brownian motion independent of *B*. Let 𝒴={𝒴_{t}}_{t≥0} be the usual augmentation of the filtration generated by the process *Y*, i.e. 𝒴_{t}=*σ*(*Y*_{s}, *s*≤*t*)∨𝒩, where 𝒩 are the *P*-null sets of ℱ. The stochastic filtering problem consists in determining the conditional distribution *π*_{T} of the signal *X* at time *T* given the information accumulated from observing *Y* in the time interval [0,*T*]; that is, for *φ* bounded Borel measurable, it consists in computing *π*_{T}(*φ*)=E[*φ*(*X*_{T})∣𝒴_{T}]. The process
*Z*_{t}=exp(−∑_{i=1}^{k}∫_{0}^{t}*h*^{i}(*X*_{s}) d*W*^{i}_{s}−½∫_{0}^{t}∥*h*(*X*_{s})∥^{2} d*s*), *t*∈[0,*T*],
is an ℱ_{t}-martingale. For a fixed 0≤*t*≤*T*, we can define a new probability measure on ℱ_{t} via dP̃_{t}/dP=*Z*_{t}. By the martingale properties of *Z*, the family of probability measures {P̃_{t}}_{0≤t≤T} is consistent and this property allows us to define a new probability P̃ which is equivalent to P on ℱ_{t}, 0≤*t*≤*T*. By means of Girsanov's theorem, *Y* becomes, under P̃, a Brownian motion independent of the signal *X*. Note also that the law of *X* is invariant under this change of probability measure. In order to construct numerical algorithms to approximate *π*_{T} one relies, crucially, on the Kallianpur–Striebel (KS) formula, see Kallianpur & Striebel [17], *π*_{T}(*φ*)=*ρ*_{T}(*φ*)/*ρ*_{T}(**1**), where *ρ*_{t}, called the un-normalized conditional distribution, is given by
*ρ*_{t}(*φ*)=Ẽ[*φ*(*X*_{t})*Z̃*_{t}∣𝒴_{t}], with *Z̃*_{t}=*Z*^{−1}_{t}=exp(∑_{i=1}^{k}∫_{0}^{t}*h*^{i}(*X*_{s}) d*Y*^{i}_{s}−½∫_{0}^{t}∥*h*(*X*_{s})∥^{2} d*s*).
Thanks to the KS formula, the problem reduces to finding an approximation of the above functional.

### Remark 4.1

*ρ*_{T}(*φ*) is the expected value of a functional of the signal *X*, which is parametrized by the observation process *Y*. This representation shows that the signal *X* enters the problem only through the evolution of its law, whereas its path properties are not relevant. On the other hand, the observed path of *Y* determines the functional to be integrated, and the distribution of *Y* plays only a secondary role. In practice, we will know the values of *Y* only along the points of a partition of [0,*T*], and we may not know the law of *X*. Hence, we will need to approximate the *Y*-dependent functional of *X* as well as the law of *X*. Therefore, the filtering problem can be viewed as a particular case within the theory of weak approximations of SDEs.

It follows from remark 4.1 that the design of an approximating scheme for *ρ*_{T}(*φ*) should contain the following three components:

— the discretization of the *Y*-dependent functional of *X*;

— the approximation of the law of the signal *X*;

— the control of the computational effort.

### (b) Picard's filter

Here, we introduce the discretization of the *Y* -dependent functional of *X* to be integrated. This discretization was first introduced by Picard [18] and we shall call it Picard's filter; see also Clark [19] and Talay [20]. Assume that we have a uniform partition *Π*^{n,T} of the interval [0,*T*] and that we know {*Y* _{ti}}_{i=0,…,n}, the values of the observation process *Y* on *Π*^{n,T}. For any , we can define the function by , where , are the following functions and . We define Picard's filter by .
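The explicit form of the weight functions has not survived in the text above; for orientation, a standard Girsanov-type discretization (an assumption on our part, consistent with the change of measure in §4a) weights a particle position *x* over a step of length δ=*T*/*n* by exp(*h*(*x*)·(*Y*_{t_{i+1}}−*Y*_{t_i})−½|*h*(*x*)|²δ). A minimal Python sketch of one such factor:

```python
import math

def likelihood_weight(h_x, dy, delta):
    """One-step Girsanov-type weight exp(h(x).dY - 0.5*|h(x)|^2 * delta);
    h_x is the sensor value h(x) and dy the observation increment, both
    given as lists of the same length k."""
    dot = sum(a * b for a, b in zip(h_x, dy))
    sq = sum(a * a for a in h_x)
    return math.exp(dot - 0.5 * sq * delta)

# Sanity checks: a vanishing sensor function gives weight 1, and the
# weight is always strictly positive.
print(likelihood_weight([0.0], [0.3], 0.1))        # 1.0
print(likelihood_weight([1.2], [0.3], 0.1) > 0.0)  # True
```

Multiplying these factors along the partition and averaging over the law of the signal yields the Picard-type approximation of *ρ*_{T}(*φ*).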

The following result was proved in theorem 1 of Picard [18]. See also Crisan [21] for an updated account on the discretization of the continuous time filtering problem.

### Theorem 4.2

*Let φ be a bounded and Lipschitz continuous function. Then, there exists a constant* *independent of n such that* *.*

### Remark 4.3

Theorem 4.2 shows that, for uniform partitions, is a first-order approximation of *ρ*_{T}. As the algorithms we are going to develop will be based on the Picard discretization, the error of these algorithms when approximating *ρ*_{T} will not be better than *C*/*n*.

### (c) The cubature approximation

The second step is to approximate the law of the signal *X*. We define the cubature approximation of Picard's filter by . In order to analyse the error when approximating by , it is convenient to introduce alternative representations for Picard's filter and its cubature approximation. We define operators and for , and *t*∈(0,*T*] by
To simplify the notation, we also define and for 1≤*i*<*j*≤*n*. Then, we have that

The main result concerning the cubature approximation of Picard's filter is theorem 4.4, proved in Crisan & Ghazali [9, theorem 4.3]. The result basically says that is an approximation of order (*m*−1)/2 of , where *m* is the degree of the cubature measure.

### Theorem 4.4

*There is a positive constant C=C(T,m,p) such that for all* *, we have* *, where*

### Corollary 4.5

*There is a positive constant* *such that for all* *we have* .

### Proof.

Using the triangle inequality and the estimates in theorems 4.2 and 4.4, we have that . The result follows from applying the Cauchy–Schwarz inequality to the following inequality
and the fact that ∥*ρ*_{T}(**1**)^{−1}∥_{p} is finite for any *p*≥1. □

We will need the following lemmas regarding the cubature approximation of Picard's filter.

### Lemma 4.6

*Given the notation in remark 3.10, we have*
4.1*where* *are called the filtering weights and by convention x*_{β[0]}=*x*.

### Proof.

Note that and
Set *δ*=*T*/*n*. We have that where we have used the notation . Applying to , we obtain
where we have used the notation . Iterating this procedure we obtain the result. □

### Remark 4.7

From lemma 4.6, it follows that the computation of the cubature approximation of Picard's filter requires knowledge of all intermediate nodes in the cubature tree, in contrast to the typical use of cubature methods, where knowledge of the leaves is sufficient to compute the approximation. Obviously, this is due to the particular form of the functional to be integrated, which depends explicitly on the values of *X*_{t} along the points of the partition and not just on the terminal value.

### Lemma 4.8

*For any p*≥1, *we have that* .

### Proof.

Lemma 4.6 and Jensen's inequality yield that . By the definition of the filtering weights, we can write
As *Y* is a *k*-dimensional standard Brownian motion under , we have that
Hence,
□

## 5. The control of the computational effort

The TBBA is a method that assigns a number of particles to different sites according to a probability distribution with finite support on the sites. The computational effort is controlled, as it is proportional to the number of particles. One can see the TBBA as a method to generate rational-valued random distributions that are unbiased estimators of the original probability distribution. The interesting feature of the method is that the assignment is carried out so that it satisfies a certain minimal variance property. The results presented here can be extended to probability distributions with infinite support, see Crisan & Lyons [3].

Let {*x*_{1},…,*x*_{k}} be a given set of sites and {*γ*_{1},…,*γ*_{k}} be a set of non-negative numbers adding up to 1, which can be seen as a probability distribution with support {*x*_{1},…,*x*_{k}}. The problem is to generate a family of random variables {*n*_{1},…,*n*_{k}}, defined on some probability space, with values in {0,…,*N*} and such that
5.1 E[*n*_{i}]=*Nγ*_{i}, *i*=1,…,*k*,
5.2 ∑_{i=1}^{k}*n*_{i}=*N*
and
5.3 E[(*n*_{i}−*Nγ*_{i})^{2}]=min_{ñ_{i}∈𝒩} E[(*ñ*_{i}−*Nγ*_{i})^{2}], where 𝒩 denotes the set of all random variables with values in {0,…,*N*} satisfying (5.1). Let [*x*] denote the integer part of *x*∈ℝ, and {*x*}=*x*−[*x*] denote the fractional part of *x*∈ℝ. It is immediate that any family of random variables with marginal distributions given by
P(*n*_{i}=[*Nγ*_{i}]+1)={*Nγ*_{i}} and P(*n*_{i}=[*Nγ*_{i}])=1−{*Nγ*_{i}}
satisfies the minimal variance property (5.3) and the unbiasedness condition (5.1). It can be helpful to use a ‘particle’ picture to describe the random variables in the set 𝒩. Essentially, one can think of ∑_{i=1}^{k}*n*_{i}*δ*_{x_i} as the (un-normalized) empirical measure associated with a set of *N* particles that are allocated to the sites *x*_{1},…,*x*_{k}. Hence, *n*_{i} represents the number of particles allocated to site *x*_{i}. This number is random and its mean is given by *Nγ*_{i} (which is not necessarily an integer). However, its generation is not straightforward, as condition (5.2) makes the random variables corresponding to different sites *x*_{i} correlated. The TBBA precisely allows the construction of a family of random variables satisfying (5.1), (5.3) and the additional condition (5.2). The name of the algorithm comes from the fact that it can be described using a binary tree structure, see Crisan & Lyons [3], Ninomiya [14] and Ninomiya [22]. The description is as follows.

1. We start with a *k*-ary tree. This tree has a root node initially storing *N* particles and *k* leaves that represent the sites where the particles have to be allocated.

2. We embed the *k*-ary tree into a binary tree satisfying the following rules.

(a) The set of all leaves of the tree is {*x*_{1},…,*x*_{k}}.

(b) Each node *z* of the tree has a positive weight *γ*_{z}.

(c) If two different nodes share the same parent, then their weights add up to the weight of the parent.

3. We move the *N* particles down along the tree until they reach the leaves, using the following TBBA rules:

(a) We start by allocating all *N* particles to the root node (the corresponding weight of the root is *γ*_{root}=1).

(b) We then proceed recursively as follows: let *z* be a node with *n*_{z} particles and weight *γ*_{z}. If *z* has two child nodes *z*_{1} and *z*_{2}, then *γ*_{z}=*γ*_{z1}+*γ*_{z2} and we will split the *n*_{z} particles associated to *z* into *n*_{z1} particles associated to *z*_{1} and *n*_{z2} particles associated to *z*_{2}, i.e. *n*_{z}=*n*_{z1}+*n*_{z2}, according to the following two possible cases.

— If [*Nγ*_{z}]=[*Nγ*_{z1}]+[*Nγ*_{z2}], then *n*_{z1}=[*Nγ*_{z1}]+(*n*_{z}−[*Nγ*_{z}])*u*_{m}, where *u*_{m}∼Bernoulli({*Nγ*_{z1}}/{*Nγ*_{z}}).

— If [*Nγ*_{z}]=[*Nγ*_{z1}]+[*Nγ*_{z2}]+1, then *n*_{z1}=[*Nγ*_{z1}]+1−([*Nγ*_{z}]+1−*n*_{z})*u*_{m}, where *u*_{m}∼Bernoulli((1−{*Nγ*_{z1}})/(1−{*Nγ*_{z}})).

In both cases, *n*_{z2}=*n*_{z}−*n*_{z1}.

Note that for each intermediate node in the tree, we need to generate a random variable *u*_{m}. These random variables are independent of each other. Assume that we have an *n*-times iterated *k*-ary tree such that at the first level of the tree, we have a probability distribution {*γ*_{1},…,*γ*_{k}}. Moreover, assume that the probability distributions in the next levels are generated by iterating the distribution in the first level. The TBBA provides an approximation of the probability distribution not just at the final level, but also at all intermediate levels. Let *z* be a node in the iterated *k*-ary tree with *n*_{z} particles assigned and weight *γ*_{z}. The algorithm that allocates the particles in *z* to its *k* direct descendants according to the probability law {*γ*_{z1}/*γ*_{z},…,*γ*_{zk}/*γ*_{z}} is as follows:
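The displayed pseudocode of this allocation step did not survive in the text; as a stand-in, here is a minimal Python implementation of the TBBA splitting rules (our own sketch; exact fractions are used so that the floor and fractional-part arithmetic is not polluted by rounding):

```python
import math
import random
from fractions import Fraction

def tbba(N, gammas, rng):
    """Tree-based branching algorithm: allocate N particles to k sites with
    probabilities `gammas` (Fractions summing to 1).  Returns offspring
    numbers n_i with sum(n) = N and n_i in {[N g_i], [N g_i] + 1}."""
    def split(n, gs):
        # distribute n particles over the sites with weights gs
        if len(gs) == 1:
            return [n]
        mid = len(gs) // 2
        a = N * sum(gs[:mid])          # mean offspring of the left child
        c = N * sum(gs)                # mean offspring of this node
        fa, fc = a - math.floor(a), c - math.floor(c)
        if math.floor(c) == math.floor(a) + math.floor(c - a):
            # case 1: n1 = [a] + (n - [c]) * u,  u ~ Bernoulli({a}/{c})
            u = 1 if fc > 0 and rng.random() < float(fa / fc) else 0
            n1 = math.floor(a) + (n - math.floor(c)) * u
        else:
            # case 2: n1 = [a] + 1 - ([c] + 1 - n) * u,
            #         u ~ Bernoulli((1 - {a})/(1 - {c}))
            u = 1 if rng.random() < float((1 - fa) / (1 - fc)) else 0
            n1 = math.floor(a) + 1 - (math.floor(c) + 1 - n) * u
        return split(n1, gs[:mid]) + split(n - n1, gs[mid:])
    return split(N, list(gammas))

N = 10
gammas = [Fraction(3, 10), Fraction(1, 4), Fraction(1, 4), Fraction(1, 5)]
offspring = tbba(N, gammas, random.Random(0))
print(offspring, sum(offspring))   # the total is always exactly N
```

On each run the offspring numbers differ, but the structural properties hold pathwise: they sum to *N*, each *n*_{i} equals [*Nγ*_{i}] or [*Nγ*_{i}]+1, and E[*n*_{i}]=*Nγ*_{i}.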

Using this notation, the approximation to the probability measure *Γ* with support {*x*_{1},…,*x*_{k}} is given by *Γ*^{N}=(1/*N*)∑_{i=1}^{k}*n*_{i}*δ*_{x_i}. The algorithm generates a (random) measure supported on at most *N* of the original *k* sites, as it is the empirical distribution of *N* particles. Some of the properties satisfied by the random variables are stated in the following proposition, see proposition 9.3 in Bain & Crisan [1].

### Proposition 5.1

*The random variables* *satisfy* (5.1)–(5.3) *with* . *Moreover*, *are negatively correlated, i.e.* .

Note that for any bounded function , we obtain that
where we have used that the *n*_{i} are negatively correlated, the bound on the variance of *n*_{i} stated in proposition 5.1 and the fact that {*γ*_{i}} is a probability distribution. The previous bound entails that *n*_{i}/*N* is an unbiased approximation to *γ*_{i} in *L*^{2}.

## 6. Convergence results for the Kusuoka–Lyons–Victoir particle filter

Here, we present the main results of the paper. Let ℚ_{1} be a cubature measure of degree *m*. Assume that *Π*^{n,T} is the uniform partition of the interval [0,*T*] with *n*+1 points, i.e. *t*_{i}=*iT*/*n* for *i*=0,…,*n*, and that we know the values of the observation process *Y* in *Π*^{n,T}. Let ℬ be the total set of multi-indices *β* with values in {1,…,*c*_{d}}, where *c*_{d} is the number of paths in the support of ℚ_{1}. For *j*=1,…,*n*, define ℬ_{j}={1,…,*c*_{d}}^{j}. Let {*x*_{β}}_{β∈ℬ_{j}} be the support of KLV_{m}(*Π*^{j,T},*x*), the discrete measure obtained by the *j*-fold iteration of the KLV operation or cubature measure along the partition *Π*^{n,T}. According to remark 3.10, we can see the global cubature measure along the partition *Π*^{n,T} as a discrete measure with points indexed by ℬ_{n}. Recall also the expression for the cubature approximation of Picard's filter given in lemma 4.6. In the following two sections, we are going to introduce two different approximation procedures for it. Both procedures will be based on the TBBA and equation (4.1). The first approximation will only use the cubature weights to allocate the particles along the cubature tree, whereas the second one will combine both the cubature and the filtering weights. Hence, the second approximation incorporates a correction mechanism similar to the one in the classical particle filters, where the law of the signal is approximated using the Euler scheme. We will assume that the probability space is rich enough to carry the auxiliary random variables needed to apply the TBBA. As *Y* is also defined on this space, in what follows E will denote the expectation with respect to the TBBA random variables.

### (a) Computational control of the Kusuoka–Lyons–Victoir particle filter based only on the cubature weights

We define a collection of random variables according to the following recursion. Let be the cubature weights. Define . For any *j*∈{2,…,*n*}, define , where, for . Note that TBBA(*N*,0,*λ*_{β−},*Γ*^{1}) returns all the random variables equal to zero. Moreover, for *j*=1,…,*n*, has the same distribution as TBBA(*N*,*N*,1,*Γ*^{j}), where . In particular, , satisfy the TBBA properties stated in proposition 5.1. Finally, we define

### Theorem 6.1

*There exists a positive random variable* *such that*
*for all* *. Moreover,* *is integrable with respect to* *.*

### Proof.

We have that where we have used the TBBA properties of . The previous computation yields that , and the integrability of is deduced using arguments similar to those in lemma 4.8. □

### Definition 6.2

We define by 6.1

### Corollary 6.3

*There exists a positive random variable* *such that*
*for all* . *Moreover* *is integrable with respect to* .

### Proof.

We have that By theorem 6.1, we obtain Hence, . Finally, the result follows, using the Cauchy–Schwarz inequality, from the integrability of and lemma 4.8. □

### (b) Computational control of the Kusuoka–Lyons–Victoir particle filter based on the cubature and filtering weights

We define a collection, , of random variables according to the following recursion. Let and be defined by
Next, define . For any *j*∈{2,…,*n*}, let and
Also, define . By construction, and the TBBA weights are recursively defined and random. Set . Then, conditionally on , satisfy the TBBA properties stated in proposition 5.1, *j*=1,…,*n*. Finally, we define
where and .
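To summarize parts (a)–(b) operationally, the sketch below is a Python illustration of one step of the variant (b) filter under our own simplifying assumptions: scalar signal, degree-3 cubature, Euler integration of the cubature ODEs, and systematic resampling as a compact stand-in for the TBBA (it also keeps exactly *N* particles and has the same floor/ceil marginal offspring law, though not the TBBA's binary-tree structure). The step expands every particle by cubature, reweights by cubature × likelihood, and resamples back to *N* particles.

```python
import math
import random

def klv_children(x, lam, V0, V1, delta, steps=40):
    """Degree-3 KLV step in dimension one: the node (x, lam) spawns two
    children along the ODEs x' = V0(x) +/- V1(x)/sqrt(delta), weight lam/2."""
    out = []
    for sign in (+1.0, -1.0):
        y, h = x, delta / steps
        for _ in range(steps):                      # explicit Euler for the ODE
            y += h * (V0(y) + sign * V1(y) / math.sqrt(delta))
        out.append((y, 0.5 * lam))
    return out

def filter_step(particles, dy, delta, V0, V1, h, N, rng):
    """One step of the sketched KLV particle filter, variant (b): the
    resampling probabilities combine cubature and likelihood weights."""
    nodes = []
    for x, lam in particles:
        for y, mu in klv_children(x, lam, V0, V1, delta):
            # Girsanov-type likelihood factor for the observation increment dy
            w = mu * math.exp(h(y) * dy - 0.5 * h(y) ** 2 * delta)
            nodes.append((y, w))
    total = sum(w for _, w in nodes)
    # systematic resampling down to N particles (stand-in for the TBBA)
    u, cum, assigned, out = rng.random(), 0.0, 0, []
    for y, w in nodes:
        cum += w / total
        m = math.ceil(N * cum - u)                  # particles assigned so far
        out.extend([(y, 1.0 / N)] * (m - assigned))
        assigned = m
    while len(out) < N:                             # guard against round-off
        out.append(out[-1])
    return out[:N]

# toy run: Ornstein-Uhlenbeck-type signal, linear sensor h(x) = x
V0, V1, h = (lambda x: -x), (lambda x: 0.3), (lambda x: x)
rng = random.Random(0)
particles = [(0.5, 1.0 / 50)] * 50                  # N = 50 equally weighted
for dy in (0.05, -0.02, 0.04):                      # synthetic observation increments
    particles = filter_step(particles, dy, 0.1, V0, V1, h, 50, rng)
print(len(particles), sum(w for _, w in particles))
```

Replacing the systematic resampling call by the full TBBA recursion along the cubature tree recovers the allocation analysed in theorem 6.4, with the same particle count and marginal offspring law.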

### Theorem 6.4

*There exists a positive random variable* *such that* *for all* *. Moreover* *is integrable with respect to* *.*

### Proof.

For *j*=1,…,*n*, define
and note that and . We shall proceed by induction.

*Case* **j=1**. By the TBBA properties of , we obtain
Note that, . Moreover,
Hence,

*Case* **j=n**. We can write
and . Consider the auxiliary random measure
Then
because
by the definition of . Note that
and by the induction hypothesis, we obtain
Hence,
because . For the term *A*_{2}, we can write
where we have used the measurability of and the fact that the random variables, conditionally on , are negatively correlated. Using again the TBBA properties of , we have that
Note that
because . By taking iteratively conditional expectations with respect to , it is easy to see that . Hence, using again the induction hypothesis, we obtain that
which yields
Combining the bounds for *A*_{1} and *A*_{2}, we obtain that
and, therefore, . The integrability of follows by using similar arguments as in the proof of lemma 4.8. □

### Definition 6.5

We define by 6.2

Theorem 6.4 yields the following result.

### Corollary 6.6

*There exists a positive random variable* *such that*
*for all* . *Moreover* *is integrable with respect to* .

### Proof.

The proof is the same as the one for corollary 6.3, making the obvious changes and using theorem 6.4. □

### (c) Main result for the Kusuoka–Lyons–Victoir particle filter

The main result of the paper is the following.

### Theorem 6.7

*There exists a positive constant* *and* *such that, for all* *, we have* *, where* *and* *are defined by equations* (6.1) *and* (6.2), *respectively.*

### Proof.

Using the triangle inequality, we have that By corollary 4.5, it follows that , and by corollary 6.3, we have that which yields the result. The proof for is analogous. □

## 7. Numerical simulations

Here, we present some numerical experiments to test the performance of the algorithms we have introduced. The model chosen is a particular case of the Beneš problem. This is a stochastic filtering problem with nonlinear dynamics for the signal and linear dynamics for the observation process. This problem has an analytical finite-dimensional solution, known as the Beneš filter. In particular, the conditional distribution of the signal given the observation process is a Gaussian mixture. Although the model does not satisfy some of the conditions for which our theory holds (the function *h* and the derivatives of *f* are not bounded), we believe that it is a good benchmark to test the new algorithms, as it allows for sufficiently rich (nonlinear) behaviour while still having a closed-form expression for its solution.

### (a) The model and its exact solution

The dynamics of the signal is given by the following one-dimensional SDE
7.1where , and the observation process is given by the one-dimensional process , where *h*(*x*)=*h*_{1}*x*+*h*_{2}, *V* _{t} and *W*_{t} are two independent one-dimensional Brownian motions, *x*_{0},*μ*,*h*_{1} and *σ*>0. The conditional law of *X*_{t} given in the previous problem has an explicit expression. It is a weighted mixture of two normal distributions [1, ch. 6]. That is, given a realization *Y* ={*Y* _{s}}_{0≤s≤t} of the observation process, we have the following equality in law for the conditional distribution of *X*_{t} given ,
where
and denotes a normal distribution with mean *μ* and variance *σ*^{2}. Recall that, in practice, we observe *Y* only at a finite partition *Π*^{n,T}={0=*t*_{0}<*t*_{1}<⋯<*t*_{n−2}<*t*_{n−1}=*T*} of the interval [0,*T*]. Hence, we have to approximate the integral in the definition of *Ψ*_{t} by a Riemann–Stieltjes sum.
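As a concrete illustration of this approximation, a stochastic integral against *Y* can be replaced by a left-point Riemann–Stieltjes sum over the observed grid values. The sketch below is illustrative only: the integrand `g` and the function name are ours, not the paper's.

```cpp
#include <functional>
#include <vector>

// Left-point Riemann-Stieltjes sum approximating int_0^T g(s) dY_s by
// sum_i g(t_i) * (Y_{t_{i+1}} - Y_{t_i}) over a partition t of [0, T].
// t and Y must have the same length and hold the grid times and the
// observed values of Y on that grid, respectively.
double riemann_stieltjes(const std::vector<double>& t,
                         const std::vector<double>& Y,
                         const std::function<double(double)>& g) {
    double sum = 0.0;
    for (std::size_t i = 0; i + 1 < t.size(); ++i)
        sum += g(t[i]) * (Y[i + 1] - Y[i]);
    return sum;
}
```

As the mesh of the partition shrinks, the left-point sum converges to the Itô integral; for the Beneš model, the relevant integrand is built from the linear observation function *h*.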

### (b) Numerical experiments

In the numerical experiments, we set values for the parameters *μ*,*σ*,*h*_{1},*h*_{2},*x*_{0} and *T*, and we compute a realization of *X*_{t} and *Y* _{t} using the Euler scheme on an equidistant partition with *m*=10^{6} steps. We will call this realization of *X*_{t} the seed particle. All the other simulations will be done assuming that we are given the fixed path *Y* _{t}, computed from the seed particle path. Usually, we will consider partitions *Π*^{n,T}, with *n*<*m*, and the values of *Y* _{t} on these partitions will be obtained by linear interpolation of the values of *Y* _{t} on *Π*^{m,T}.
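The simulation step just described can be sketched as follows, assuming the generic SDE form d*X*=*f*(*X*)d*t*+*σ*d*V*, d*Y* =*h*(*X*)d*t*+d*W*. The function and variable names are ours; the drift *f* is passed in by the caller (the experiments use the Beneš drift of equation (7.1)).

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <random>
#include <vector>

struct Paths { std::vector<double> X, Y; };

// Euler scheme for the signal dX = f(X) dt + sigma dV and the observation
// dY = h(X) dt + dW on an equidistant partition of [0, T] with m steps,
// driven by two independent Gaussian increment streams.
Paths simulate_euler(std::function<double(double)> f,
                     std::function<double(double)> h,
                     double sigma, double x0, double T,
                     std::size_t m, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    const double dt = T / static_cast<double>(m), sdt = std::sqrt(dt);
    Paths p;
    p.X.assign(m + 1, x0);
    p.Y.assign(m + 1, 0.0);
    for (std::size_t i = 0; i < m; ++i) {
        p.X[i + 1] = p.X[i] + f(p.X[i]) * dt + sigma * sdt * gauss(gen);
        p.Y[i + 1] = p.Y[i] + h(p.X[i]) * dt + sdt * gauss(gen);
    }
    return p;
}

// Linear interpolation of the observation path at an arbitrary time s,
// as used to evaluate Y on coarser partitions than Pi^{m,T}.
double interp_Y(const Paths& p, double T, double s) {
    const std::size_t m = p.Y.size() - 1;
    const double u = s / T * static_cast<double>(m);
    const std::size_t i = std::min(static_cast<std::size_t>(u), m - 1);
    const double frac = u - static_cast<double>(i);
    return p.Y[i] * (1.0 - frac) + p.Y[i + 1] * frac;
}
```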

Using the previously simulated discrete path of *Y* , we can approximate the integral *Ψ*_{t} and compute the values of and for *t*∈*Π*^{m,T}, obtaining the parameters of the normal mixtures at each *t*∈*Π*^{m,T}.

The simulations in this paper were performed on a laptop with an Intel Core i5-450M processor (clock speed of 2.44 GHz) and 8 GB of RAM. The algorithms are implemented in C++.

#### (i) Convergence of the Kusuoka–Lyons–Victoir particle filter

Here, we investigate numerically the convergence of the KLV filter , equation (6.2) in definition 6.5, in terms of the number of time steps in the partition and the number of particles. We use the following set of parameter values: *μ*=0.5,*h*_{1}=0.4,*h*_{2}=0.0,*σ*=0.8,*x*_{0}=1.0 and *T*=10.0.

We will estimate *π*_{T}(*φ*), where *φ*(*x*)=*x*, that is, the conditional expectation of the signal at time *T* given the observation process up to time *T*. We take as the exact value for *π*_{T}(*φ*) the value given by the Beneš filter . First, we estimate *π*_{T}(*φ*) using where the number of particles *N*=10^{6} is fixed and we choose various values for *n*, the number of steps in the discretization partition. We compute using cubature formulae of degree 3 and 5.

In figure 1*a*, we plot the absolute error in the estimation of *π*_{T}(*φ*) by against the number of time steps, with both axes in logarithmic scale. Both the degree 3 and the degree 5 cubature formulae give a rate of convergence of one. Hence, it is clear that the discretization error of Picard's filter is a lower bound for the discretization error of the method, even when one uses a cubature formula of degree 5 (which is of order 2 when approximating the law of the signal).
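The reported rate of one can be read off the plotted errors in the usual way: if the error behaves like *Cn*^{−*p*}, then the error halves when the number of steps doubles precisely when *p*=1. A small helper illustrating the computation (the function name is ours):

```cpp
#include <cmath>

// Empirical convergence order from absolute errors on grids with n and
// 2n steps: if err(n) ~ C * n^(-p), then p = log2(err(n) / err(2n)).
double empirical_order(double err_n, double err_2n) {
    return std::log2(err_n / err_2n);
}
```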

In figure 1*b*, we plot the absolute error obtained using with *n*=256 fixed and a varying number of particles *N*, with both axes in logarithmic scale. Apparently, with 2^{16}=65 536 particles, we already obtain a good partial sampling of the cubature tree (we get close to the discretization error) and there is no significant improvement from using a larger number of particles. In accordance with the theory, one observes that the part of the error owing to the partial sampling decreases as *N*^{−1/2}, until the discretization error dominates the overall error. Note that we present here the results of just one run to illustrate the almost sure convergence of the method. However, it is beneficial to average over several independent runs, each containing a smaller number of particles, as can be seen in the following comparison.
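This behaviour is consistent with a heuristic two-term error model: a discretization term of order 1/*n* plus a partial-sampling term of order *N*^{−1/2}. The constants *C*_{1}, *C*_{2} below are purely illustrative, not quantities from the paper; once *C*_{2}/√*N* falls below *C*_{1}/*n*, adding particles no longer improves the overall error.

```cpp
#include <cmath>

// Heuristic total-error model for the KLV particle filter with n time
// steps and N particles: discretization error of order 1/n plus
// partial-sampling error of order 1/sqrt(N).
double error_model(double C1, double C2, double n, double N) {
    return C1 / n + C2 / std::sqrt(N);
}
```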

#### (ii) Comparison with the classical particle filter based on Euler approximation

Here, we compare the performance of our algorithm against the classical particle filter implemented using the Euler scheme to approximate the signal and the TBBA to perform the resampling at each time step (see Bain & Crisan [1, section 9.6]). We will denote by the classical particle filter. In what follows, we add a subindex *r* indicating the result of the *r*th independent launch of both algorithms, *r*=1,…,*M*. Hence, our approximations will be
We set the same number of launches *M*=10 and the same number of steps in the partition for both estimators. However, the number of particles used in our algorithm will be just *N*=100, whereas we set *N*=10 000 for the classical particle filter. We use the following set of parameter values *μ*=0.05,*h*_{1}=0.8,*h*_{2}=0.0,*σ*=1.0,*x*_{0}=0.0 and *T*=20.0.
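The averaged estimators above amount to taking the sample mean of the *M* independent launches and comparing it with the exact Beneš value; a minimal sketch (names are ours):

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Sample mean of M independent launches of a filter estimate of pi_T(phi).
double average_launches(const std::vector<double>& runs) {
    return std::accumulate(runs.begin(), runs.end(), 0.0)
           / static_cast<double>(runs.size());
}

// Absolute error of the averaged estimate against a reference value
// (here, the exact value given by the Benes filter).
double absolute_error(const std::vector<double>& runs, double exact) {
    return std::fabs(average_launches(runs) - exact);
}
```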

In figure 2, we plot the absolute error of the estimates obtained using the classical particle filter (Euler) and the new algorithm with cubature formulae of degree 3 and 5 against the CPU time (seconds) used in each computation.

One can see from figure 2 that the KLV particle filter achieves smaller errors in less computational time. In addition, the KLV particle filter appears to be more robust to the resampling procedure than the classical particle filter, in the sense that the additional randomness introduced by resampling does not increase the error when a large number of time steps is used.
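For completeness, the minimal-variance branching rule underlying the TBBA resampling can be sketched as follows: a particle whose target offspring mean is *w* receives either ⌊*w*⌋ or ⌊*w*⌋+1 offspring, with probabilities chosen so that the expectation is exactly *w*. This is only the per-particle marginal; the full TBBA additionally couples these draws along a binary tree so that the total population stays constant, and that coupling is omitted here.

```cpp
#include <cmath>
#include <random>

// Minimal-variance branching number: for a target mean w >= 0, return
// floor(w) plus a Bernoulli draw with success probability equal to the
// fractional part of w, so that E[offspring] = w and the variance is
// the smallest possible for an integer-valued random variable.
int branching_number(double w, std::mt19937& gen) {
    const double fl = std::floor(w);
    const double frac = w - fl;               // fractional part of w
    std::bernoulli_distribution extra(frac);  // P(one extra offspring) = frac
    return static_cast<int>(fl) + (extra(gen) ? 1 : 0);
}
```

When *w* is an integer, the branching number is deterministic, which is why particles with weight exactly 1/*N* are simply kept, with no extra resampling noise.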

## Acknowledgements

The work of the first author was partially supported by the EPSRC (grant no. EP/H0005500/1). S.O-L. acknowledges the support of the Commission for Universities and Research of the Department of Innovation, Universities and Enterprise of the Government of Catalonia (grant no. BP-DGR 2009). The work of the second author was conducted during a postdoctoral stay at the Department of Mathematics, Imperial College London, UK. The authors thank the referees for their careful reviews and useful suggestions.

- Received February 18, 2013.
- Accepted April 26, 2013.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.