Signal processing with Lévy information

Dorje C. Brody, Lane P. Hughston, Xun Yang

Abstract

Lévy processes, which have stationary independent increments, are ideal for modelling the various types of noise that can arise in communication channels. If a Lévy process admits exponential moments, then there exists a parametric family of measure changes called Esscher transformations. If the parameter is replaced with an independent random variable, the true value of which represents a ‘message’, then under the transformed measure the original Lévy process takes on the character of an ‘information process’. In this paper we develop a theory of such Lévy information processes. The underlying Lévy process, which we call the fiducial process, represents the ‘noise type’. Each such noise type is capable of carrying a message of a certain specification. A number of examples are worked out in detail, including information processes of the Brownian, Poisson, gamma, variance gamma, negative binomial, inverse Gaussian and normal inverse Gaussian type. Although in general there is no additive decomposition of information into signal and noise, one is led nevertheless for each noise type to a well-defined scheme for signal detection and enhancement relevant to a variety of practical situations.

1. Introduction

The idea of filtering the noise out of a noisy message as a way of increasing its information content is illustrated by Norbert Wiener in his book Cybernetics [1] by means of the following example. The true message is represented by a variable X which has a known probability distribution. An agent wishes to determine as best as possible the value of X, but owing to the presence of noise the agent can only observe a noisy version of the message of the form ξ=X+ϵ, where ϵ is independent of X. Wiener shows how, given the observed value of the noisy message ξ, the original distribution of X can be transformed into an improved a posteriori distribution that has a higher information content. The a posteriori distribution can then be used to determine a best estimate for the value of X.

The theory of filtering was developed in the 1940s, when the inefficiency of anti-aircraft fire made it imperative to introduce effective filtering-based devices [2,3]. A breakthrough came with the work of Kalman, who reformulated the theory in a manner better suited to dynamical state-estimation problems [4,5]. This period coincided with the emergence of the modern control theory of Bellman & Pontryagin [6,7]. Owing to the importance of its applications, much work has been carried out since then. According to an estimate of Kalman [8], over 200 000 articles and monographs have been published on applications of the Kalman filter alone. The theory of stochastic filtering, in its modern form, is not much different conceptually from the elementary example described by Wiener in the 1940s. The message, instead of being represented by a single variable, in the general setup can take the form of a time series (the 'signal' or 'message' process). The information made available to the agent also takes the form of a time series (the 'observation' or 'information' process), typically given by the sum of two terms, the first being a functional of the signal process, and the second being a noise process. The nature of the signal process can be rather general, but in most applications the noise is chosen to be a Wiener process [9–11]. There is no reason, however, why an information process should necessarily be 'additive', or even why it should be given as a functional of a signal process and a noise process. From a mathematical perspective, it seems that the often proposed ansatz of an additive decomposition of the observation process is well adapted to the situation where the noise is Gaussian, but is not so natural when the noise is discontinuous. Thus, while a good deal of recent research has been carried out on the problem of filtering noisy information containing jumps [12–17], such work has usually been pursued under the assumption of an additive relation between signal and noise. It is not unreasonable to ask whether a more systematic treatment of the problem might be available that involves no presumption of additivity and that is more naturally adapted to the mathematics of the situation.

The purpose of the present paper is to introduce a broad class of information processes suitable for modelling situations involving discontinuous signals, discontinuous noise and discontinuous information. No assumption is made to the effect that information can be expressed as a function of signal and noise. Instead, information processes are classified according to their ‘noise type’. Information processes of the same noise type are then distinguished from one another by the messages that they carry. Each noise type is associated to a Lévy process, which we call the fiducial process. The fiducial process is the information process that results for a given noise type in the case of a null message, and can be thought of as a ‘pure noise’ process of that noise type. Information processes can then be classified by the characteristics of the associated fiducial processes. To keep the discussion elementary, we consider the case of a one-dimensional fiducial process and examine the situation where the message is represented by a single random variable. The goal is to construct the optimal filter for the class of information processes that we consider in the form of a map that takes the a priori distribution of the message to an a posteriori distribution that depends on the information that has been made available. A number of examples will be presented. The results vary remarkably in detail and character for the different types of filters considered, and yet there is an overriding unity in the general scheme, which allows for the construction of a multitude of examples and applications.

A synopsis of the main ideas, which we develop more fully in the remainder of the paper, can be presented as follows. We recall the idea of the Esscher transform as a change of probability measure on a probability space (Ω, F, P) that supports a Lévy process {ξt}t≥0 that possesses P-exponential moments. The space of admissible moments is the set A = {w∈R : E[exp(wξt)] < ∞}. The associated Lévy exponent ψ(α) = t⁻¹ ln E[exp(αξt)] then exists for all α∈AC := {w∈C : Re w ∈ A}, and does not depend on t. A parametric family of measure changes P → Pλ, commonly called Esscher transformations, can be constructed by use of the exponential martingale family {ρt(λ)}, defined for each λ∈A by ρt(λ) = exp(λξt − ψ(λ)t). If {ξt} is a P-Brownian motion, then {ξt} is Pλ-Brownian with drift λ; if {ξt} is a P-Poisson process with intensity m, then {ξt} is Pλ-Poisson with intensity e^λ m; if {ξt} is a P-gamma process with rate parameter m and scale parameter κ, then {ξt} is Pλ-gamma with rate parameter m and scale parameter κ/(1−λ). Each case is different in character. A natural generalization of the Esscher transform results when the parameter λ in the measure change is replaced by a random variable X. From the perspective of the new measure PX, the process {ξt} retains the 'noisy' character of its P-Lévy origin, but also carries information about X. In particular, if one assumes that X and {ξt} are P-independent, and that the support of X lies in A, then we say that {ξt} defines a Lévy information process under PX carrying the message X. Thus, the change of measure inextricably intertwines signal and noise. More abstractly, we say that on a probability space (Ω, F, P) a random process {ξt} is a Lévy information process with message (or 'signal') X and noise type (or 'fiducial exponent') ψ0(α) if {ξt} is conditionally P-Lévy given X, with Lévy exponent ψ0(α+X) − ψ0(X) for α∈CI := {w∈C : Re w = 0}. We are thus able to classify Lévy information processes by their noise type, and for each noise type we can specify the class of random variables that are admissible as signals that can be carried in the environment of such noise. We consider a number of different noise types, and construct explicit representations of the associated information processes. We also derive an expression for the optimal filter in the general situation, which transforms the a priori distribution of the signal to the improved a posteriori distribution that can be inferred on the basis of received information.

The plan of the paper is as follows. In §2, after recalling some facts about processes with stationary and independent increments, we define Lévy information, and in proposition 2.2 we show that the signal carried by a Lévy information process is effectively ‘revealed’ after the passage of sufficient time. In §3, we present in proposition 3.1 an explicit construction using a change of measure technique that ensures the existence of Lévy information processes, and in proposition 3.2 we prove a converse to the effect that any Lévy information process can be obtained in this way. In proposition 3.3 we construct the optimal filter for general Lévy information processes, and in proposition 3.4 we show that such processes have the Markov property. In proposition 3.5 we establish a result that indicates in more detail how the information content of the signal is coded into the structure of an information process. Then in proposition 3.6 we present a general construction of the so-called innovations process associated with Lévy information. Finally, in §4, we proceed to examine a number of specific examples of Lévy information processes, for which explicit representations are constructed in propositions 4.1–4.8.

2. Lévy information

We assume that the reader is familiar with the theory of Lévy processes [18–23]. For an overview of some of the specific Lévy processes considered later in this paper, we refer the reader to [24]. A real-valued process {ξt}t≥0 on a probability space (Ω, F, P) is a Lévy process if: (i) ξ0 = 0, (ii) {ξt} has stationary and independent increments, (iii) lim t→s P(|ξt − ξs| > ϵ) = 0 for every ϵ > 0, and (iv) {ξt} is almost surely càdlàg. For a Lévy process {ξt} to give rise to a class of information processes, we require that it should possess exponential moments. Let us consider the set defined for some (equivalently for all) t>0 by

A = {w∈R : E[exp(wξt)] < ∞}.   (2.1)

If A contains points other than w=0, then we say that {ξt} possesses exponential moments. We define a function ψ : A → R called the Lévy exponent (or cumulant function), such that

E[exp(αξt)] = exp(ψ(α)t)   (2.2)

for α∈A. If a Lévy process possesses exponential moments, then an exercise shows that ψ(α) is convex on A, that the mean and variance of ξt are given, respectively, by ψ′(0)t and ψ′′(0)t, and that as a consequence of the convexity of ψ(α) the marginal exponent ψ′(α) possesses a unique inverse I(y) such that I(ψ′(α)) = α for α∈A. The Lévy exponent extends to a function ψ : AC → C, where AC = {w∈C : Re w ∈ A}, and it can be shown ([19], theorem 25.17) that ψ(α) admits a Lévy–Khintchine representation of the form

ψ(α) = pα + ½qα² + ∫_{R\{0}} (e^{αz} − 1 − αz 1{|z|<1}) ν(dz),   (2.3)

with the property that (2.2) holds for all α∈AC. Here, 1{⋅} denotes the indicator function, p∈R and q≥0 are constants, and the so-called Lévy measure ν(dz) is a positive measure defined on R\{0} satisfying

∫_{R\{0}} min(1, z²) ν(dz) < ∞.   (2.4)

If the Lévy process possesses exponential moments, then for α∈A we also have

∫_{|z|>1} e^{αz} ν(dz) < ∞.   (2.5)

The Lévy measure has the following interpretation: if B is a measurable subset of R\{0}, then ν(B) is the rate at which jumps arrive for which the jump size lies in B. Consider the sets defined for n∈N by Bn = {z∈R : 1/n ≤ |z| ≤ 1}. If ν(Bn) tends to infinity for large n, we say that {ξt} is a process of infinite activity, meaning that the rate of arrival of small jumps is unbounded. If lim n→∞ ν(Bn) < ∞, one says that {ξt} has finite activity. We refer to the data K = (p, q, ν) as the characteristic triplet (or 'characteristic') of the associated Lévy process. Thus, we can classify a Lévy process abstractly by its characteristic K, or equivalently by its exponent ψ(α). This means that one can speak of a 'type' of Lévy noise by reference to the associated characteristic or exponent.
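As a concrete illustration of these definitions, the following Python sketch (ours, not part of the original analysis; it uses the gamma exponent ψ(α) = −m ln(1 − κα) introduced in §4, with arbitrary parameter values) checks by Monte Carlo that the mean and variance of ξt are ψ′(0)t and ψ′′(0)t, and that the inverse marginal exponent recovers α from ψ′(α).

```python
# Illustrative sketch (not from the paper): gamma-type Levy noise with
# exponent psi(a) = -m*log(1 - kappa*a), defined for a < 1/kappa.
import numpy as np

m, kappa, t = 2.0, 1.0, 5.0
rng = np.random.default_rng(1)

def psi_prime(a):                 # marginal exponent psi'(a) = m*kappa/(1 - kappa*a)
    return m * kappa / (1.0 - kappa * a)

def inv_marginal(y):              # inverse I(y), satisfying I(psi'(a)) = a
    return (1.0 - m * kappa / y) / kappa

# xi_t for a gamma process with rate m and scale kappa has the Gamma(m*t, kappa) law
xi_t = rng.gamma(shape=m * t, scale=kappa, size=200_000)
print(xi_t.mean(), psi_prime(0.0) * t)      # mean = psi'(0)*t  = m*kappa*t
print(xi_t.var(), m * kappa**2 * t)         # var  = psi''(0)*t = m*kappa^2*t
print(inv_marginal(psi_prime(0.3)))         # recovers 0.3
```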

Now suppose we fix a measure P0 on a measurable space (Ω, F), and let {ξt} be P0-Lévy, with exponent ψ0(α). There exists a parametric family of probability measures Pλ, λ∈A, on (Ω, F) such that for each choice of λ the process {ξt} is Pλ-Lévy. The changes of measure arising in this way are called Esscher transformations [25–29]. Under an Esscher transformation, the characteristics of a Lévy process are transformed from one type to another, and one can speak of a family of Lévy processes interrelated by Esscher transformations. The relevant change of measure can be specified by use of the process {ρt(λ)} defined for λ∈A by

ρt(λ) = exp(λξt − ψ0(λ)t).   (2.6)

One can check that {ρt(λ)} is an ({Ft}, P0)-martingale, where {Ft} denotes the filtration generated by {ξt}: indeed, as a consequence of the fact that {ξt} has stationary and independent increments, we have

E0[ρt(λ) | Fs] = exp(λξs − ψ0(λ)t) E0[exp(λ(ξt − ξs))] = ρs(λ)   (2.7)

for s≤t, where E0[− | Fs] denotes conditional expectation under P0 with respect to Fs. It is straightforward to show that {ξt} has Pλ-stationary and independent increments, and that the Pλ-exponent of {ξt}, which is defined on the set AλC := {w∈C : Re w + λ ∈ A}, is given by

ψλ(α) = ψ0(α + λ) − ψ0(λ),   (2.8)

from which by use of the Lévy–Khintchine representation (2.3) one can work out the characteristic triplet Kλ of {ξt} under Pλ. We observe that if the Esscher martingale (2.6) is expanded as a power series in λ, then the resulting coefficients, which are given by polynomials in ξt and t, form a so-called Sheffer set [30], each element of which defines an ({Ft}, P0)-martingale. The first three of these polynomials take the form Q1(x,t) = x − ψ′t, Q2(x,t) = ½((x − ψ′t)² − ψ′′t), and Q3(x,t) = ⅙((x − ψ′t)³ − 3ψ′′t(x − ψ′t) − ψ′′′t), where ψ′ = ψ0′(0), ψ′′ = ψ0′′(0), and ψ′′′ = ψ0′′′(0). The corresponding polynomial Lévy–Sheffer martingales are given by Q1(ξt, t), Q2(ξt, t), and Q3(ξt, t).
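The effect of an Esscher transformation can be checked numerically by importance weighting: reweighting samples of ξt by ρt(λ) reproduces expectations under Pλ. The sketch below (our illustration, with arbitrary parameter values) does this for the Poisson case, where the intensity transforms as m → e^λ m.

```python
# Sketch (not from the paper): Esscher tilt of Poisson noise via reweighting.
import numpy as np

m, lam, t = 3.0, 0.7, 2.0
rng = np.random.default_rng(0)
xi_t = rng.poisson(m * t, size=500_000)          # xi_t under P_0
psi0 = lambda a: m * (np.exp(a) - 1.0)           # Poisson Levy exponent
rho = np.exp(lam * xi_t - psi0(lam) * t)         # Esscher martingale at time t
print((rho * xi_t).mean() / rho.mean())          # tilted mean E_lam[xi_t]
print(m * np.exp(lam) * t)                       # = e^lam * m * t, as claimed
```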

In what follows, we use the terms 'signal' and 'message' interchangeably. We write CI = {w∈C : Re w = 0}. For any random variable Z on (Ω, F, P) we write σ(Z) for the σ-algebra generated by Z, and when it is convenient we write E[− | Z] for E[− | σ(Z)]. For processes, we use both of the notations {Zt} and {Z(t)}, depending on the context.

With these background remarks in mind, we are in a position to define a Lévy information process. We confine the discussion to the case of a ‘simple’ message, represented by a random variable X. In the situation when the noise is Brownian motion, the information admits a linear decomposition into signal and noise. In the general situation, the relation between signal and noise is more subtle, and has the character of a fibre space, where one thinks of the points of the base space as representing the different noise types, and the points of the fibres as corresponding to the different information processes that one can construct in association with a given noise type. Alternatively, one can think of the base as being the convex space of Lévy characteristics, and the fibre over a given point of the base as the convex space of messages that are compatible with the associated noise type.

We fix a probability space (Ω, F, P), and an Esscher family of Lévy characteristics Kλ, λ∈A, with associated Lévy exponents ψλ(α), α∈AλC. We refer to K0 as the fiducial characteristic, and ψ0(α) as the fiducial exponent. The intuition here is that the abstract Lévy process of characteristic K0 and exponent ψ0(α), which we call the 'fiducial' process, represents the noise type of the associated information process. Thus, we can use K0, or equivalently ψ0(α), to label the noise type.

Definition 2.1

By a Lévy information process with fiducial characteristic K0, carrying the message X, we mean a random process {ξt}, together with a random variable X, such that {ξt} is conditionally KX-Lévy given σ(X).

Thus, given σ(X) we require {ξt} to have conditionally independent and stationary increments under P, and to possess a conditional exponent of the form

ψX(α) := t⁻¹ ln E[exp(αξt) | σ(X)] = ψ0(α + X) − ψ0(X)   (2.9)

for α∈CI, where ψ0(α) is the fiducial exponent of the specified noise type. It is implicit in the statement of definition 2.1 that a certain compatibility condition holds between the message and the noise type. For any random variable X, we define its support SX to be the smallest closed set F with the property that P(X∈F) = 1. Then we say that X is compatible with the fiducial exponent ψ0(α) if SX ⊂ A. Intuitively speaking, the compatibility condition ensures that we can use X to make a random Esscher transformation. In the theory of signal processing, it is advantageous to require that the variables to be estimated should be square integrable. This condition ensures that the conditional expectation exists and admits the interpretation of a best estimate in the sense of least squares. For our purpose, it will suffice to assume throughout the paper that the information process is square integrable under P. This in turn implies that ψ0′(X) is square integrable, and that ψ0′′(X) is integrable. Note that we do not require that the Lévy information process should possess exponential moments under P, but a sufficient condition for this to be the case is that there should exist a nonvanishing real number ϵ such that λ + ϵ ∈ A for all λ∈SX.

To gain a better understanding of the sense in which the information process {ξt} actually ‘carries’ the message X, it will be useful to investigate its asymptotic behaviour. We write I0(y) for the inverse marginal fiducial exponent.

Proposition 2.2

Let {ξt} be a Lévy information process with fiducial exponent ψ0(α) and message X. Then for every ϵ>0 we have

lim t→∞ P(|I0(t⁻¹ξt) − X| ≥ ϵ) = 0.   (2.10)

Proof.

It follows from (2.9) that ψX′(0) = ψ0′(X), and hence that at any time t the conditional mean of the random variable t⁻¹ξt is given by

E[t⁻¹ξt | σ(X)] = ψ0′(X).   (2.11)

A calculation then shows that the conditional variance of t⁻¹ξt takes the form

Var[t⁻¹ξt | σ(X)] = t⁻¹ψ0′′(X),   (2.12)

which allows us to conclude that

E[(t⁻¹ξt − ψ0′(X))²] = t⁻¹ E[ψ0′′(X)],   (2.13)

and hence that

lim t→∞ E[(t⁻¹ξt − ψ0′(X))²] = 0.   (2.14)

On the other hand, for all ϵ>0 we have

P(|t⁻¹ξt − ψ0′(X)| ≥ ϵ) ≤ ϵ⁻² E[(t⁻¹ξt − ψ0′(X))²]   (2.15)

by Chebyshev's inequality, from which we deduce that

lim t→∞ P(|t⁻¹ξt − ψ0′(X)| ≥ ϵ) = 0,   (2.16)

and, since the inverse marginal exponent I0 is continuous, it follows that I0(t⁻¹ξt) converges to X in probability.

Thus, we see that the information process does indeed carry information about the message, and in the long run ‘reveals’ it. The intuition here is that as more information is gained, we improve our estimate of X to the point that the value of X eventually becomes known with near certainty.
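The content of proposition 2.2 is easy to visualize in simulation. The following sketch (ours, not from the paper; it uses the gamma noise type with rate m and the representation ξt = γt/(1 − X) established in proposition 4.1 below) shows that I0(t⁻¹ξt) is close to the message X once t is large.

```python
# Sketch (not from the paper): message revelation for gamma-type information.
import numpy as np

m, t = 1.0, 400.0
rng = np.random.default_rng(7)
X = rng.uniform(-1.0, 0.5, size=8)                 # sample messages, support < 1
gamma_t = rng.gamma(shape=m * t, scale=1.0, size=8)
xi_t = gamma_t / (1.0 - X)                         # gamma information at time t
I0 = lambda y: 1.0 - m / y                         # inverse of psi0'(a) = m/(1-a)
print(np.c_[X, I0(xi_t / t)])                      # the two columns nearly agree
```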

3. Properties of Lévy information

It will be useful if we present a construction that ensures the existence of Lévy information processes. First, we select a noise type by specification of a fiducial characteristic K0. Next, we introduce a probability space (Ω, F, P0) that supports the existence of a P0-Lévy process {ξt} with the given fiducial characteristic, together with an independent random variable X that is compatible with K0.

Write {Ft} for the filtration generated by {ξt}, and {Gt} for the filtration generated by {ξt} and X jointly: Gt := σ({ξs}0≤s≤t, X). Let ψ0(α) be the fiducial exponent associated with K0. One can check that the process {ρt} defined by

ρt = exp(Xξt − ψ0(X)t)   (3.1)

is a ({Gt}, P0)-martingale. We are thus able to introduce a change of measure P0 → P on {Gt} by setting

dP/dP0 |Gt = ρt.   (3.2)

It should be evident that {ξt} is conditionally P-Lévy given σ(X), since for fixed X the measure change is an Esscher transformation. In particular, a calculation shows that the conditional exponent of ξt under P is given by

t⁻¹ ln E[exp(αξt) | σ(X)] = ψ0(α + X) − ψ0(X)   (3.3)

for α∈CI, which shows that the conditions of definition 2.1 are satisfied, allowing us to conclude the following:

Proposition 3.1

The P0-Lévy process {ξt} is, under the measure P, a Lévy information process with message X and noise type ψ0(α).

In fact, the converse also holds: if we are given a Lévy information process, then by a change of measure we can find a Lévy process and an independent ‘message’ variable. Here follows a more precise statement.

Proposition 3.2

Let {ξt} be a Lévy information process on a probability space (Ω, F, P) with message X and noise type ψ0(α). Then there exists a change of measure P → P0 such that {ξt} and X are P0-independent, {ξt} is P0-Lévy with exponent ψ0(α), and the probability law of X under P0 is the same as the probability law of X under P.

Proof.

First, we establish that the process {ρt⁻¹} defined by ρt⁻¹ = exp(−Xξt + ψ0(X)t) is a ({Gt}, P)-martingale, where {Gt} is the filtration generated by {ξt} and X jointly. We have

E[ρt⁻¹ | Gs] = exp(−Xξs + ψ0(X)t) E[exp(−X(ξt − ξs)) | Gs]   (3.4)

by virtue of the fact that {ξt} is σ(X)-conditionally Lévy under P. By use of (2.9), we deduce that ψX(−X) = −ψ0(X), and hence that E[exp(−X(ξt − ξs)) | Gs] = exp(−ψ0(X)(t − s)), as required. Then we use {ρt⁻¹} to define a change of measure P → P0 on {Gt} by setting

dP0/dP |Gt = ρt⁻¹.   (3.5)

To show that ξt and X are P0-independent for all t, it suffices to show that their joint characteristic function under P0 factorizes. Letting α, β∈CI, we have

E0[exp(αξt + βX)] = E[exp(βX + ψ0(X)t) E[exp((α − X)ξt) | σ(X)]] = E[exp(βX)] exp(ψ0(α)t),   (3.6)

where the last step follows from (2.9). This argument can be extended to show that {ξt} and X are P0-independent. Next we observe that

E0[exp(α(ξu − ξt) + βξt)] = exp(ψ0(α)(u − t)) exp(ψ0(β)t)   (3.7)

for u ≥ t ≥ 0, and it follows that ξu − ξt and ξt are independent. This argument can be extended to show that {ξt} has P0-independent increments. Finally, if we set α=0 in (3.6), it follows that the probability laws of X under P0 and P are identical; if we set β=0 in (3.6), it follows that the P0-exponent of {ξt} is ψ0(α); and if we set β=0 in (3.7), it follows that the increments of {ξt} are P0-stationary.

Going forward, we adopt the convention that P always denotes the 'physical' measure in relation to which an information process with message X is defined, and that P0 denotes the transformed measure with respect to which the information process and the message decouple; henceforth we write P0 for the measure constructed in proposition 3.2. In addition to establishing the existence of Lévy information processes, the results of proposition 3.2 provide useful tools for calculations, allowing us to work out properties of information processes by referring the calculations back to P0. We consider as an example the problem of working out the Ft-conditional expectation under P of a σ(X)-measurable integrable random variable Z. The Ft-conditional expectation of Z can be written in terms of P0-expectations, and is given by a 'generalized Bayes formula' [31] of the form

E[Z | Ft] = E0[ρt Z | Ft] / E0[ρt | Ft],   (3.8)

where {ρt} is the change-of-measure martingale (3.1). This formula can be used to obtain the Ft-conditional probability distribution function for X, defined for y∈R by

Ft(y) := P(X ≤ y | Ft).   (3.9)

In the Bayes formula, we set Z = 1{X≤y}, and the result is

Ft(y) = ∫_{−∞}^{y} exp(xξt − ψ0(x)t) π(dx) / ∫_{−∞}^{∞} exp(xξt − ψ0(x)t) π(dx),   (3.10)

where π(dx) is the a priori distribution of X. It is useful for some purposes to work directly with the conditional probability measure πt(dx) induced on R, defined by ∫B πt(dx) = P(X∈B | Ft) for measurable B ⊂ R. In particular, when X is a continuous random variable with a density function p(x), one can write πt(dx) = pt(x)dx, where pt(x) is the conditional density function.

Proposition 3.3

Let {ξt} be a Lévy information process under P with noise type ψ0(α), and let the a priori distribution of the associated message X be π(dx). Then the Ft-conditional a posteriori distribution of X is

πt(dx) = exp(xξt − ψ0(x)t) π(dx) / ∫ exp(xξt − ψ0(x)t) π(dx).   (3.11)

It is straightforward to establish by use of a variational argument that for any function f : R → R such that the random variable Y = f(X) is square integrable, the best estimate for Y conditional on the information Ft is given by

Ŷt := E[Y | Ft] = ∫ f(x) πt(dx).   (3.12)

By the 'best estimate' for Y, we mean the Ft-measurable random variable Ŷt that minimizes the quadratic error E[(Y − Ŷt)²].
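For a message taking finitely many values, the filter (3.11)–(3.12) amounts to a one-line reweighting of the prior. The following generic sketch (our illustration, not part of the original analysis; the subtraction of the maximal log-weight is a standard numerical stabilization, not part of the theory) applies to any noise type once ψ0 is supplied.

```python
# Sketch (not from the paper): the optimal filter for a discrete prior.
import numpy as np

def posterior(x, p, xi_t, t, psi0):
    """Posterior weights pi_t over message values x with prior p, eq. (3.11)."""
    logw = x * xi_t - psi0(x) * t
    w = p * np.exp(logw - logw.max())     # subtract max for numerical stability
    return w / w.sum()

def best_estimate(f, x, p, xi_t, t, psi0):
    """E[f(X) | F_t] as in eq. (3.12)."""
    return np.sum(f(x) * posterior(x, p, xi_t, t, psi0))

# usage: Brownian noise type, uniform three-point prior, observed xi_t = 2.3
psi0 = lambda a: 0.5 * a**2
x = np.array([-1.0, 0.0, 1.0])
p = np.ones(3) / 3.0
print(best_estimate(lambda v: v, x, p, xi_t=2.3, t=2.0, psi0=psi0))
```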

It will be observed that at any given time t the best estimate can be expressed as a function of ξt and t, and does not involve values of the information process at times earlier than t. That this should be the case can be seen as a consequence of the following:

Proposition 3.4

The Lévy information process {ξt} has the Markov property.

Proof.

For the Markov property, it suffices to establish that for a∈R we have

P(ξt ≤ a | ξs, ξs1, …, ξsk) = P(ξt ≤ a | ξs)   (3.13)

for any t ≥ s ≥ s1 ≥ s2 ≥ ⋯ ≥ sk > 0 and any k∈N. We write

P(ξt ≤ a | ξs, ξs1, …, ξsk) = E0[ρt 1{ξt≤a} | ξs, ξs1, …, ξsk] / E0[ρt | ξs, ξs1, …, ξsk],   (3.14)

where {ρt} is defined as in equation (3.1). It follows that

P(ξt ≤ a | ξs, ξs1, …, ξsk) = E0[ρt 1{ξt≤a} | ξs] / E0[ρt | ξs] = P(ξt ≤ a | ξs),   (3.15)

since {ξt} has the Markov property under the transformed measure P0, and since X is P0-independent of {ξt}.

We note that since X is F∞-measurable, which follows from proposition 2.2, the Markov property implies that if Y = f(X) is integrable, we have

E[Y | Ft] = E[Y | ξt].   (3.16)

This identity allows one to work out the optimal filter for a Lévy information process by direct use of the Bayes formula. It should be apparent that simulation of the dynamics of the filter is readily approachable on account of this property.

We remark briefly on what might appropriately be called a 'time consistency' property satisfied by Lévy information processes. It follows from (3.11) that, given the conditional distribution πs(dx) at time s ≤ t, we can express πt(dx) in the form

πt(dx) = exp(x(ξt − ξs) − ψ0(x)(t − s)) πs(dx) / ∫ exp(x(ξt − ξs) − ψ0(x)(t − s)) πs(dx).   (3.17)

Then, if for fixed s ≥ 0 we introduce a new time variable u := t − s, and define ηu = ξu+s − ξs, we find that {ηu}u≥0 is an information process with fiducial exponent ψ0(α) and message X with a priori distribution πs(dx). Thus, given up-to-date information, we can 're-start' the information process at that time to produce a new information process of the same type, with an adjusted message distribution.

Further insight into the nature of Lévy information can be gained by examination of expression (2.9) for the conditional exponent of an information process. In particular, as a consequence of the Lévy–Khintchine representation (2.3) we are able to deduce that

ψ0(α + X) − ψ0(X) = (p + qX + ∫_{R\{0}} z(e^{Xz} − 1) 1{|z|<1} ν(dz)) α + ½qα² + ∫_{R\{0}} (e^{αz} − 1 − αz 1{|z|<1}) e^{Xz} ν(dz)   (3.18)

for α∈CI, which leads to the following:

Proposition 3.5

The randomization of the P0-Lévy process {ξt} achieved through the change of measure generated by the randomized Esscher martingale (3.1) induces two effects on the characteristics of the process: (i) a random shift in the drift term, given by

p → p + qX + ∫_{R\{0}} z(e^{Xz} − 1) 1{|z|<1} ν(dz),   (3.19)

and (ii) a random rescaling of the Lévy measure, given by ν(dz) → e^{Xz}ν(dz).

The integral appearing in the shift in the drift term is well defined because the term z(e^{Xz} − 1) vanishes to second order at the origin. It follows from proposition 3.5 that in sampling an information process an agent is in effect trying to detect a random shift in the drift term, and a random 'tilt' and change of scale in the Lévy measure, altering the overall rate as well as the relative rates at which jumps of various sizes occur. It is from these data, within which the message is encoded, that the agent attempts to estimate the value of X. It is interesting to note that randomized Esscher martingales arise in the construction of pricing kernels in the theory of finance [32,33].

We turn to examine the properties of certain martingales associated with Lévy information. We establish the existence of a so-called innovations representation for Lévy information. In the case of the Brownian filter, the ideas involved are rather well understood [9], and the matter has also been investigated in the case of Poisson information [34]. These examples arise as special cases in the general theory of Lévy information. Throughout the discussion that follows, we fix a probability space (Ω, F, P).

Proposition 3.6

Let {ξt} be a Lévy information process with fiducial exponent ψ0(α) and message X, let {Ft} denote the filtration generated by {ξt}, let Y = ψ0′(X), where ψ0′(α) is the marginal fiducial exponent, and set Ŷt = E[Y | Ft]. Then the process {Mt} defined by

Mt = ξt − ∫_0^t Ŷs ds   (3.20)

is an ({Ft}, P)-martingale.

Proof.

We recall that {ξt} is by definition σ(X)-conditionally P-Lévy. It follows therefore from (2.11) that E[ξt | σ(X)] = ψ0′(X)t, where Y = ψ0′(X). As before, we let {Gt} denote the filtration generated jointly by {ξt} and X. First, we observe that the process {mt} defined for t≥0 by mt = ξt − Y t is a ({Gt}, P)-martingale. This assertion can be checked by consideration of the one-parameter family of ({Gt}, P0)-martingales defined by

ρt(ϵ) = exp((X + ϵ)ξt − ψ0(X + ϵ)t)   (3.21)

for ϵ∈CI. Expanding this expression to first order in ϵ, we deduce that the process defined for t≥0 by ρt(ξt − ψ0′(X)t) is a ({Gt}, P0)-martingale, where {ρt} is given by (3.1). Thus, we have

E0[ρt(ξt − ψ0′(X)t) | Gs] = ρs(ξs − ψ0′(X)s).   (3.22)

Then using {ρt} to make a change of measure from P0 to P we obtain

E[ξt − ψ0′(X)t | Gs] = ξs − ψ0′(X)s,   (3.23)

and the result follows if we set Y = ψ0′(X). Next, we introduce the 'projected' process {m̂t} defined by m̂t = E[mt | Ft]. We note that since {mt} is a ({Gt}, P)-martingale we have

E[m̂t | Fs] = E[mt | Fs] = E[E[mt | Gs] | Fs] = E[ms | Fs] = m̂s,   (3.24)

and thus {m̂t} is an ({Ft}, P)-martingale. Finally, we observe that

Mt = ξt − ∫_0^t Ŷs ds = m̂t + (Ŷt t − ∫_0^t Ŷs ds),   (3.25)

where we have made use of the fact that m̂t = E[ξt − Y t | Ft] = ξt − Ŷt t, the term ξt being Ft-measurable. The fact that {m̂t} and {Ŷt} are both ({Ft}, P)-martingales implies that

E[Ŷt t − ∫_0^t Ŷu du | Fs] = Ŷs t − ∫_0^s Ŷu du − Ŷs(t − s) = Ŷs s − ∫_0^s Ŷu du,   (3.26)

from which it follows that E[Mt | Fs] = m̂s + Ŷs s − ∫_0^s Ŷu du = Ms, which is what we set out to prove.

Although the general information process does not admit an additive decomposition into signal and noise, it does admit a linear decomposition into terms representing (i) information already received and (ii) new information. The random variable Y entering via its conditional expectation into the first of these terms is itself in general a nonlinear function of the message variable X. It follows on account of the convexity of the fiducial exponent that the marginal fiducial exponent is invertible, which ensures that X can be expressed in terms of Y by the relation X=I0(Y), which is linear if and only if the information process is Brownian. Thus, signal and noise are deeply intertwined in the case of general Lévy information. Vestiges of linearity remain, and these suffice to provide an overall element of tractability.

4. Examples of Lévy information processes

In a number of situations one can construct explicit examples of information processes, categorized by noise type. The Brownian and Poisson constructions, which are familiar in other contexts, can be seen as belonging to a unified scheme that brings out their differences and similarities. We then proceed to construct information processes of the gamma, the variance gamma, the negative binomial, the inverse Gaussian, and the normal inverse Gaussian type. It is interesting to take note of the diverse nature of noise, and to observe the many different ways in which messages can be conveyed in a noisy environment.

Example 1. Brownian information

On a probability space (Ω, F, P), let {Bt} be a Brownian motion, let X be an independent random variable, and set

ξt = Xt + Bt.   (4.1)

The random process {ξt} thereby defined, which we call the Brownian information process, is σ(X)-conditionally KX-Lévy, with conditional characteristic KX = (X, 1, 0) and conditional exponent ψX(α) = Xα + ½α². The fiducial characteristic is K0 = (0, 1, 0), the fiducial exponent is ψ0(α) = ½α², and the associated fiducial process or 'noise type' is standard Brownian motion. In the case of Brownian information, there is a linear separation of the process into signal and noise. This model, considered by Wonham [35], is perhaps the simplest continuous-time generalization of the example described by Wiener [1]. The message is given by the value of X, but X can only be observed indirectly, through {ξt}. The observations of X are obscured by the noise represented by the Brownian motion {Bt}. Because the signal term grows linearly in time, whereas the standard deviation of Bt grows only as √t, it is intuitively plausible that observations of {ξt} will asymptotically reveal the value of X, and a direct calculation using properties of the normal distribution function confirms that t⁻¹ξt converges in probability to X; this is consistent with proposition 2.2 if we note that ψ0′(α) = α and I0(y) = y in the Brownian case.

The best estimate for X conditional on Ft can be derived by use of the generalized Bayes formula (3.8). In the Brownian case, there is an elementary method leading to the same result, worth mentioning briefly because it is of interest. First, we present an alternative proof of proposition 3.4 in the Brownian case that uses a Brownian bridge argument.

We recall that if s>s1>0, then Bs and Bs1 − (s1/s)Bs are independent. More generally, we observe that if s>s1>s2, then Bs, Bs1 − (s1/s)Bs, and Bs2 − (s2/s1)Bs1 are independent, and that ξs1 − (s1/s)ξs = Bs1 − (s1/s)Bs. Extending this line of reasoning, we see that for any a∈R we have

P(ξt ≤ a | ξs, ξs1, …, ξsk) = P(ξt ≤ a | ξs, Bs1 − (s1/s)Bs, …, Bsk − (sk/sk−1)Bsk−1) = P(ξt ≤ a | ξs),   (4.2)

since ξt and ξs are independent of Bs1 − (s1/s)Bs, …, Bsk − (sk/sk−1)Bsk−1, and that gives us the Markov property (3.13). Since we have established that X is F∞-measurable, it follows that (3.16) holds. As a consequence, the a posteriori distribution of X can be worked out by use of the standard Bayes formula, and for the best estimate of X, we obtain

X̂t := E[X | Ft] = ∫ x exp(xξt − ½x²t) π(dx) / ∫ exp(xξt − ½x²t) π(dx).   (4.3)

The innovations representation (3.20) in the case of a Brownian information process can be derived by the following argument. We observe that the ({Ft}, P0)-martingale {Φt} given by the denominator in (3.8), namely Φt = E0[ρt | Ft], is a 'space–time' function of the form

Φt = Φ(ξt, t) = ∫ exp(xξt − ½x²t) π(dx).   (4.4)

By use of the Ito calculus together with (4.3), we deduce that dΦt = X̂t Φt dξt, and thus by integration we obtain

Φt = exp(∫_0^t X̂s dξs − ½∫_0^t X̂s² ds).   (4.5)

Since {ξt} is a P0-Brownian motion, it follows from (4.5) by the Girsanov theorem that the process {Mt} defined by

Mt = ξt − ∫_0^t X̂s ds   (4.6)

is an ({Ft}, P)-Brownian motion, which we call the innovations process (see [36]). The increments of {Mt} represent the arrival of new information.
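A short simulation makes the Brownian filter and its innovations concrete. The sketch below (ours, not from the paper; an arbitrary two-point prior is assumed) generates a path of ξt = Xt + Bt, evaluates (4.3) along the path, and forms the innovations (4.6); the estimate X̂t settles on the realized value of X.

```python
# Sketch (not from the paper): Brownian information, filter (4.3), innovations (4.6).
import numpy as np

rng = np.random.default_rng(3)
T, n = 5.0, 5000
dt = T / n
x = np.array([0.0, 1.0])                   # possible messages
p = np.array([0.5, 0.5])                   # prior
X = rng.choice(x, p=p)                     # realized message
xi = np.cumsum(X * dt + np.sqrt(dt) * rng.standard_normal(n))
t = dt * np.arange(1, n + 1)

logw = np.outer(xi, x) - 0.5 * np.outer(t, x**2)       # x*xi_t - x^2*t/2
w = p * np.exp(logw - logw.max(axis=1, keepdims=True))
Xhat = (w * x).sum(axis=1) / w.sum(axis=1)             # E[X | F_t], eq. (4.3)
M = xi - np.cumsum(Xhat) * dt                          # innovations, eq. (4.6)
print(X, Xhat[-1])                                     # filter converges to X
```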

We conclude our discussion of Brownian information with the following remarks. In problems involving prediction and valuation, it is not uncommon that the message is revealed after the passage of a finite amount of time. This is often the case in applications to finance, where the message takes the form of a random cash flow at some future date, or, more generally, a random factor that affects such a cash flow. There are also numerous examples coming from the physical sciences, economics, and operations research, where the goal of an agent is to form a view concerning the outcome of a future event by monitoring the flow of information relating to it. How does one handle problems involving the revelation of information over finite time horizons?

One way of modelling finite time horizon scenarios in the present context is by use of a time change. If {ξt} is a Lévy information process with message X and a specified fiducial exponent, then a generalization of proposition 2.2 shows that the process {ξtT} defined over the time interval 0 ≤ t < T by

ξtT = ((T − t)/T) ξ(tT/(T − t))   (4.7)

reveals the value of X in the limit as t → T, and one can check that the associated conditional distribution of X takes the form

πtT(dx) = exp((T/(T − t))(xξtT − ψ0(x)t)) π(dx) / ∫ exp((T/(T − t))(xξtT − ψ0(x)t)) π(dx).   (4.8)

In the case where {ξt} is a Brownian information process represented as above in the form ξt = Xt + Bt, the time-changed process (4.7) takes the form ξtT = Xt + βtT, where {βtT} is a Brownian bridge over the interval [0, T]. Such processes have had applications in physics [37–40] and in finance [41–45]. It seems reasonable to conjecture that time-changed Lévy information processes of the more general type proposed above may be similarly applicable.

Example 2. Poisson information

Consider a situation in which an agent observes a series of events taking place at a random rate, and the agent wishes to determine the rate as best as possible because its value conveys an important piece of information. One can model the information flow in this situation by a modulated Poisson process for which the jump rate is an independent random variable. Such a scenario arises in many real-world situations, and has been investigated in the literature [34,46–49]. The resulting scheme can be seen to emerge naturally as an example of our general model for Lévy information.

As in the Brownian case, one can construct the relevant information process directly. On a probability space (Ω, F, P), let {N(t)}t≥0 be a standard Poisson process with jump rate m>0, let X be an independent random variable, and set

ξt = N(e^X t).   (4.9)

Thus, {ξt} is a time-changed Poisson process, and the effect of the signal is to randomly modulate the rate at which the process jumps. It is evident that {ξt} is σ(X)-conditionally Lévy and satisfies the conditions of definition 2.1. In particular,

E[exp(αξt) | σ(X)] = exp(me^X(e^α − 1)t),   (4.10)

and for fixed X one obtains a Poisson process with rate me^X. It follows that (4.9) is an information process. The fiducial characteristic is given by K0 = (0, 0, mδ1(dz)), that of a Poisson process with unit jumps at the rate m, where δ1(dz) is the Dirac measure with unit mass at z=1, and the fiducial exponent is ψ0(α) = m(e^α − 1). A calculation using (2.9) shows that KX = (0, 0, me^X δ1(dz)), and that ψX(α) = me^X(e^α − 1). The relation between signal and noise in the case of Poisson information is rather subtle. The noise is associated with the random fluctuations of the inter-arrival times of the jumps, whereas the message determines the average rate at which the jumps occur.

It will be instructive in this example to work out the conditional distribution of X by elementary methods. Since X is F∞-measurable and {ξt} has the Markov property, we have

P(X ≤ y | Ft) = P(X ≤ y | ξt)   (4.11)

for y∈R. It follows then from the Bayes law for an information process taking values in N0 that

P(X ≤ y | ξt = n) = ∫_{−∞}^{y} P(ξt = n | X = x) π(dx) / ∫_{−∞}^{∞} P(ξt = n | X = x) π(dx).   (4.12)

In the case of Poisson information, the relevant conditional distribution is

P(ξt = n | X = x) = ((me^x t)^n / n!) exp(−me^x t).   (4.13)

After some cancellation, we deduce that

P(X ≤ y | ξt = n) = ∫_{−∞}^{y} exp(nx − me^x t) π(dx) / ∫_{−∞}^{∞} exp(nx − me^x t) π(dx),   (4.14)

and hence

πt(dx) = exp(xξt − me^x t) π(dx) / ∫ exp(xξt − me^x t) π(dx),   (4.15)

and thus

E[f(X) | Ft] = ∫ f(x) exp(xξt − me^x t) π(dx) / ∫ exp(xξt − me^x t) π(dx),   (4.16)

which we can see is consistent with (3.11) if we recall that in the case of noise of the Poisson type the fiducial exponent is given by ψ0(α) = m(e^α − 1).
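The Poisson filter is equally simple to implement. In the sketch below (ours, not from the paper; a discretized uniform prior and arbitrary parameters are assumed), the posterior mean computed from (4.15) concentrates near the true message as the observation interval grows.

```python
# Sketch (not from the paper): filtering a Poisson information process.
import numpy as np

rng = np.random.default_rng(5)
m, t, X = 2.0, 10.0, 0.4                         # rate, horizon, true message
xi_t = rng.poisson(m * np.exp(X) * t)            # observed count N(e^X t)
x = np.linspace(-1.0, 1.0, 201)                  # discretized prior support
p = np.ones_like(x) / x.size
logw = x * xi_t - m * np.exp(x) * t              # weights from eq. (4.15)
w = p * np.exp(logw - logw.max())
pi_t = w / w.sum()
print((x * pi_t).sum())                          # posterior mean, close to X
```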

If a Geiger counter is monitored continuously in time, the sound that it produces provides a nice example of a Poisson information process. The crucial message (proximity to radioactivity) carried by the noisy sputter of the instrument is represented by the rate at which the clicks occur.

Example 3. Gamma information

It will be convenient first to recall a few definitions and conventions [50–52]. Let m and κ be positive numbers. By a gamma process with rate m and scale κ on a probability space (Ω, F, P) we mean a Lévy process {γt}t≥0 with exponent

ψ(α) = −m ln(1 − κα)   (4.17)

for α∈AC = {w∈C : Re w < κ⁻¹}. The probability density for γt is

P(γt ∈ dx) = 1{x>0} (x^{mt−1} e^{−x/κ} / κ^{mt} Γ[mt]) dx,   (4.18)

where Γ[a] is the gamma function. A short calculation making use of the functional equation Γ[a+1] = aΓ[a] shows that E[γt] = κmt and Var[γt] = κ²mt. Clearly, the mean and variance determine the rate and scale. If κ=1, we say that {γt} is a standard gamma process with rate m. If κ≠1, we say that {γt} is a scaled gamma process. The Lévy measure associated with the gamma process is

ν(dz) = 1{z>0} m z⁻¹ e^{−z/κ} dz.   (4.19)

It follows that ν(R⁺) = ∞, and hence that the gamma process has infinite activity. Now let {ξt} be a standard gamma process with rate m on a probability space (Ω, F, P0), and let λ∈R satisfy λ<1. Then the process {ρt} defined by

ρt = (1 − λ)^{mt} exp(λξt)   (4.20)

is an ({Ft}, P0)-martingale. If we let {ρt} act as a change of measure density for the transformation P0 → Pλ, then we find that {ξt} is a scaled gamma process under Pλ, with rate m and scale 1/(1−λ). Thus, we see that the effect of an Esscher transformation on a gamma process is to alter its scale. With these facts in mind, one can establish the following:

Proposition 4.1

Let {γt} be a standard gamma process with rate m on a probability space (Ω, F, P), and let the independent random variable X satisfy X<1 almost surely. Then the process {ξt} defined by

ξt = (1/(1 − X)) γt   (4.21)

is a Lévy information process with message X and gamma noise, with fiducial exponent ψ0(α) = −m ln(1 − α) for α∈{w∈C : Re w < 1}.

Proof.

It is evident that {ξt} is σ(X)-conditionally a scaled gamma process. As a consequence of (4.17), we have

E[exp(αξt) | σ(X)] = (1 − α/(1 − X))^{−mt}   (4.22)

for α∈CI. Then we note that

(1 − α/(1 − X))^{−mt} = ((1 − X)/(1 − X − α))^{mt} = exp((ψ0(X + α) − ψ0(X))t).   (4.23)

It follows that the σ(X)-conditional P-exponent of {ξt} is ψ0(X+α)−ψ0(X). □

The gamma filter arises as follows. An agent observes a process of accumulation. Typically, there are many small increments, but now and then there are large increments. The unknown factor X appearing in the overall rate m/(1−X) at which the process is growing is the figure that the agent wishes to estimate as accurately as possible. The accumulation can be modelled by gamma information, and the associated filter can be used to estimate X. It has long been recognized that the gamma process is useful in describing phenomena such as the water level of a dam or the totality of the claims made in a large portfolio of insurance contracts [53–55]. Use of the gamma information process and related bridge processes, with applications in finance and insurance, is pursued in Brody et al. [51], Hoyle [56] and Hoyle et al. [57]. We draw the reader's attention to Yor [50] and references cited therein, where it is shown how certain additive properties of Brownian motion have multiplicative analogues in the case of the gamma process. One notes in particular the remarkable property that γt and γs/γt are independent for t ≥ s ≥ 0. Making use of this relation, it will be instructive if we present an alternative derivation of the optimal filter for gamma noise. We begin by establishing that the process defined by (4.21) has the Markov property. We observe first that for any times t ≥ s ≥ s1 ≥ s2 ≥ ⋯ ≥ sk > 0 the variables γs1/γs, γs2/γs1, and so on, are independent of one another and are independent of γs and γt. It follows that

P(ξt ≤ a | ξs, ξs1, …, ξsk) = P(ξt ≤ a | ξs, γs1/γs, …, γsk/γsk−1) = P(ξt ≤ a | ξs),   (4.24)

since {γt} and X are independent, and this gives us (3.13). In working out the distribution of X given Ft, it suffices therefore to work out the distribution of X given ξt. We note that the Bayes formula implies that

P(X ∈ dx | ξt) = π(dx) ρ(ξt | X = x) / ∫ π(dx) ρ(ξt | X = x),   (4.25)

where π(dx) is the unconditional distribution of X, and ρ(ξ | X = x) is the conditional density for the random variable ξt, which can be calculated as follows:

ρ(ξ | X = x) = 1{ξ>0} (1 − x)^{mt} ξ^{mt−1} exp(−(1 − x)ξ) / Γ[mt].   (4.26)

It follows that the optimal filter in the case of gamma noise is given by

E[f(X) | Ft] = ∫ f(x)(1 − x)^{mt} exp(xξt) π(dx) / ∫ (1 − x)^{mt} exp(xξt) π(dx).   (4.27)

We conclude with the following observation. In the case of Brownian information, it is well known (and implicit in Wiener's example [1]) that if the signal is Gaussian, then the optimal filter is a linear function of the observation ξt. One might therefore ask in the case of a gamma information process if some special choice of the signal distribution gives rise to a linear filter. The answer is affirmative. Let U be a gamma-distributed random variable with the distribution

P(U ∈ du) = 1{u>0} θ^r u^{r−1} e^{−θu} / Γ[r] du,   (4.28)

where r>1 and θ>0 are parameters, and set X = 1 − U. Let {ξt} be a gamma information process carrying the message X, let Y = ψ0′(X) = m/(1−X), and set τ = (r−1)/m. Then the optimal filter for Y is given by

Ŷt = E[Y | Ft] = (ξt + θ)/(t + τ).   (4.29)
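The linearity of (4.29) is readily confirmed by Monte Carlo. The sketch below (ours, not from the paper; parameter values are arbitrary) draws the pair (X, ξt), conditions on paths whose ξt falls in a narrow bin, and compares the empirical mean of Y = m/(1−X) on that bin with the linear formula.

```python
# Sketch (not from the paper): Monte Carlo check of the linear gamma filter (4.29).
import numpy as np

rng = np.random.default_rng(11)
m, r, theta, t = 1.0, 3.0, 2.0, 4.0
n = 400_000
U = rng.gamma(shape=r, scale=1.0 / theta, size=n)     # U = 1 - X, prior (4.28)
xi_t = rng.gamma(shape=m * t, scale=1.0, size=n) / U  # xi_t = gamma_t / (1 - X)
lo, hi = 2.0, 2.1                                     # condition on xi_t near 2.05
sel = (xi_t > lo) & (xi_t < hi)
tau = (r - 1.0) / m
print((m / U[sel]).mean())                            # empirical E[Y | xi_t]
print((0.5 * (lo + hi) + theta) / (t + tau))          # linear filter (4.29)
```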

Example 4. Variance-gamma information

The so-called variance-gamma or VG process [58–60] was introduced in the theory of finance. The relevant definitions and conventions are as follows. By a VG process with drift μ∈R, volatility σ≥0, and rate m>0, we mean a Lévy process with exponent

ψ(α) = −m ln(1 − μα/m − σ²α²/2m).   (4.30)

The VG process admits representations in terms of simpler Lévy processes. Let {γt} be a standard gamma process on (Ω, F, P), with rate m, as defined in the previous example, and let {Bt} be a standard Brownian motion, independent of {γt}. We call the scaled process {Γt} defined by Γt = m⁻¹γt a standard gamma subordinator with rate m. Note that Γt has dimensions of time and that E[Γt] = t. A calculation shows that the Lévy process {Vt} defined by

Vt = μΓt + σB(Γt)   (4.31)

has the exponent (4.30). The VG process thus takes the form of a Brownian motion with drift, time-changed by a gamma subordinator. If μ=0 and σ=1, we say that {Vt} is a 'standard' VG process, with rate parameter m. If μ≠0, we say that {Vt} is a 'drifted' VG process. One can always choose units of time such that m=1, but for applications it is better to choose conventional units of time (seconds for physics, years for economics), and treat m as a model parameter. In the limit σ→0, we obtain a gamma process with rate m and scale μ/m. In the limit m→∞, we obtain a Brownian motion with drift μ and volatility σ.

An alternative representation of the VG process results if we let {γt(1)} and {γt(2)} be independent standard gamma processes on (Ω, F, P), with rate m, and set

Vt = κ1γt(1) − κ2γt(2),   (4.32)

where κ1 and κ2 are nonnegative constants. A calculation shows that the exponent is of the form (4.30). In particular, we have

ψ(α) = −m ln(1 − (κ1 − κ2)α − κ1κ2α²),   (4.33)

where μ = m(κ1 − κ2) and σ² = 2mκ1κ2, or equivalently

ψ(α) = −m ln(1 − κ1α) − m ln(1 + κ2α),   (4.34)

where α∈{w∈C : −1/κ2 < Re w < 1/κ1}. Now let {ξt} be a standard VG process on (Ω, F, P0), with exponent ψ0(α) = −m ln(1 − α²/2m) for α∈{w∈C : −√(2m) < Re w < √(2m)}. Under the transformed measure Pλ defined by the change-of-measure martingale (2.6), one finds that {ξt} is a drifted VG process, with

ψλ(α) = −m ln(1 − (2λα + α²)/(2m − λ²))   (4.35)

for α∈{w∈C : −√(2m) − λ < Re w < √(2m) − λ}. Thus, in the case of the VG process, an Esscher transformation affects both the drift and the volatility. Note that for large m the effect on the volatility is insignificant, whereas the effect on the drift reduces to that of an ordinary Girsanov transformation.
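The subordination representation (4.31) gives a direct way to simulate VG noise. The sketch below (ours, not from the paper; arbitrary parameters) draws Vt = μΓt + σB(Γt) and checks the first two cumulants against ψ′(0)t = μt and ψ′′(0)t = (σ² + μ²/m)t implied by (4.30).

```python
# Sketch (not from the paper): VG by gamma subordination, cumulant check.
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, m, t = 0.3, 1.0, 4.0, 2.0
n = 300_000
Gamma_t = rng.gamma(shape=m * t, scale=1.0 / m, size=n)  # standard gamma subordinator
V_t = mu * Gamma_t + sigma * np.sqrt(Gamma_t) * rng.standard_normal(n)
print(V_t.mean(), mu * t)                                # psi'(0)  = mu
print(V_t.var(), (sigma**2 + mu**2 / m) * t)             # psi''(0) = sigma^2 + mu^2/m
```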

With these facts in hand, we are now in a position to construct the VG information process. We fix a probability space (Ω, F, P) and a number m>0.

Proposition 4.2

Let {Γt} be a standard gamma subordinator with rate m, let {Bt} be an independent Brownian motion, and let the independent random variable X satisfy −√(2m) < X < √(2m) almost surely. Then the process {ξt} defined by

ξt = (2mX/(2m − X²)) Γt + (2m/(2m − X²))^{1/2} B(Γt)   (4.36)

is a Lévy information process with message X and VG noise, with fiducial exponent

ψ0(α) = −m ln(1 − α²/2m)   (4.37)

for α∈{w∈C : −√(2m) < Re w < √(2m)}.

Proof.

Observe that {ξt} is σ(X)-conditionally a drifted VG process of the form

ξt = μX Γt + σX B(Γt),   (4.38)

where the drift and volatility coefficients are

μX = 2mX/(2m − X²)  and  σX = (2m/(2m − X²))^{1/2}.   (4.39)

The σ(X)-conditional P-exponent of {ξt} is by (4.30) thus given for α∈CI by

−m ln(1 − μXα/m − σX²α²/2m) = −m ln(1 − (2Xα + α²)/(2m − X²)),   (4.40)

which is evidently by (4.37) of the form ψ0(X+α)−ψ0(X), as required. □

An alternative representation for the VG information process can be established by the same method if one randomly rescales the gamma subordinator appearing in the time-changed Brownian motion. The result is as follows.

Proposition 4.3

Let {Γt} be a gamma subordinator with rate m, let {Bt} be an independent standard Brownian motion, and let the independent random variable X satisfy −√(2m) < X < √(2m) almost surely. Write {ΓtX} for the subordinator defined by

ΓtX = (2m/(2m − X²)) Γt.   (4.41)

Then the process {ξt} defined by ξt = XΓtX + B(ΓtX) is a VG information process with message X.

A further representation of the VG information process arises as a consequence of the representation of the VG process as the asymmetric difference between two independent standard gamma processes. In particular, we have:

Proposition 4.4

Let {γt(1)} and {γt(2)} be independent standard gamma processes, each with rate m, and let the independent random variable X satisfy −√(2m) < X < √(2m) almost surely. Then the process {ξt} defined by

ξt = (1/(√(2m) − X)) γt(1) − (1/(√(2m) + X)) γt(2)   (4.42)

is a VG information process with message X.

Example 5. Negative-binomial information

By a negative binomial process with rate parameter m and probability parameter q, where m>0 and 0<q<1, we mean a Lévy process with exponent

ψ(α) = m ln((1 − q)/(1 − qe^α))   (4.43)

for α∈{w∈C : Re w < −ln q}. There are two representations for the negative binomial process [61,52]. The first of these is a compound Poisson process for which the jump size J∈N has a logarithmic distribution,

P(J = n) = −(1/ln(1 − q)) qⁿ/n,   (4.44)

and the intensity of the Poisson process determining the timing of the jumps is given by λ = −m ln(1 − q). One finds that the characteristic function of J is

E[exp(αJ)] = ln(1 − qe^α)/ln(1 − q)   (4.45)

for α∈{w∈C : Re w < −ln q}. Then if we set

ξt = Σ_{k=1}^{Nt} Jk,   (4.46)

where {Nt} is a Poisson process with rate λ, and {Jk}k∈N denotes a collection of independent identical copies of J, representing the jumps, one deduces that

E[exp(αξt)] = exp(λt(E[exp(αJ)] − 1)),   (4.47)

and that the resulting exponent is given by (4.43). The second representation of the negative binomial process makes use of the method of subordination. We take a Poisson process with rate Λ = mq/(1 − q), and time-change it using a gamma subordinator {Γt} with rate parameter m. The moment-generating function thus obtained, in agreement with (4.43), is

E[exp(αN(Γt))] = E[exp(Λ(e^α − 1)Γt)] = ((1 − q)/(1 − qe^α))^{mt}.   (4.48)

With these results in mind, we fix a probability space (Ω, F, P) and find:

Proposition 4.5

Let {Γt} be a gamma subordinator with rate m, let {Nt} be an independent Poisson process with rate m, let the independent random variable X satisfy qe^X < 1 almost surely, and set

ΓtX = (qe^X/(1 − qe^X)) Γt.   (4.49)

Then the process {ξt} defined by

ξt = N(ΓtX)   (4.50)

is a Lévy information process with message X and negative binomial noise, with fiducial exponent (4.43).

Proof.

This can be verified by direct calculation. For α∈CI, we have

E[exp(αξt) | σ(X)] = (1 − (qe^X/(1 − qe^X))(e^α − 1))^{−mt} = ((1 − qe^X)/(1 − qe^{X+α}))^{mt},   (4.51)

which by (4.43) shows that the conditional exponent is ψ0(X+α)−ψ0(X). □
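The conditional law (4.51) can be verified numerically. The sketch below (ours, not from the paper; parameters are arbitrary, and the test point is chosen so that qe^{X+α} < 1) simulates the subordinated construction of proposition 4.5 for a fixed value of X and compares the sample moment-generating function with the right-hand side of (4.51).

```python
# Sketch (not from the paper): checking the conditional exponent (4.51).
import numpy as np

rng = np.random.default_rng(9)
m, q, t, X = 2.0, 0.3, 3.0, 0.2                       # fixed message value X
n = 400_000
qX = q * np.exp(X)                                    # tilted probability parameter
Gamma_t = rng.gamma(shape=m * t, scale=1.0 / m, size=n)
xi_t = rng.poisson(m * qX / (1.0 - qX) * Gamma_t)     # xi_t = N(Gamma_t^X)
a = 0.4                                               # test point, q*e^(X+a) < 1
print(np.exp(a * xi_t).mean())
print(((1.0 - qX) / (1.0 - qX * np.exp(a)))**(m * t)) # rhs of eq. (4.51)
```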

There is also a representation for negative binomial information based on the compound Poisson process. This can be obtained by an application of proposition 3.5, which shows how the Lévy measure transforms under a random Esscher transformation. In the case of a negative binomial process with parameters m and q, the Lévy measure is given by

ν(dz) = m Σ_{n=1}^{∞} (qⁿ/n) δn(dz),   (4.52)

where δn(dz) denotes the Dirac measure with unit mass at the point z=n. The Lévy measure is finite in this case, and we have ν(R) = −m ln(1 − q), which is the overall rate at which the compound Poisson process jumps. If one normalizes the Lévy measure with the overall jump rate, one obtains the probability measure (4.44) for the jump size. With these facts in mind, we fix a probability space (Ω, F, P) and specify the constants m and q, where m>0 and 0<q<1. Then as a consequence of proposition 3.5, we have the following:

Proposition 4.6

Let the random variable X satisfy qe^X < 1 almost surely, let the random variable JX have the conditional distribution

P(JX = n | σ(X)) = −(1/ln(1 − qe^X)) (qe^X)ⁿ/n,   (4.53)

let {JkX}k∈N be a collection of conditionally independent identical copies of JX, and let {Nt} be an independent Poisson process with rate m. Then the process {ξt} defined by

ξt = Σ_{k=1}^{N(−ln(1 − qe^X)t)} JkX   (4.54)

is a Lévy information process with message X and negative binomial noise, with fiducial exponent (4.43).

Example 6. Inverse Gaussian information

The inverse Gaussian (IG) distribution appears in the study of the first exit time of Brownian motion with drift [62]. The name 'inverse Gaussian' was introduced by Tweedie [63], and a Lévy process whose increments have the IG distribution was introduced by Wasan [64]. By an IG process with parameters a>0 and b>0, we mean a Lévy process with exponent

ψ(α) = a(b − (b² − 2α)^{1/2})   (4.55)

for α∈{w∈C : Re w < ½b²}. Let us write {Gt} for the IG process. The probability density function for Gt is

P(Gt ∈ dx) = 1{x>0} (at/√(2π)) e^{abt} x^{−3/2} exp(−½(a²t²x⁻¹ + b²x)) dx,   (4.56)

and we find that E[Gt] = at/b and that Var[Gt] = at/b³. It is straightforward to check that under the Esscher transformation P → Pλ induced by (2.6), where 0 < λ < ½b², the parameter a is left unchanged, whereas b → (b² − 2λ)^{1/2}. With these facts in mind, we are in a position to introduce the associated information process. We fix a probability space (Ω, F, P) and find the following:

Proposition 4.7

Let {G(t)} be an inverse Gaussian process with parameters a and b, let X be an independent random variable satisfying 0 < X < ½b² almost surely, and set Z = b⁻¹(b² − 2X)^{1/2}. Then the process {ξt} defined by

ξt = Z⁻² G(Zt)   (4.57)

is a Lévy information process with message X and inverse Gaussian noise, with fiducial exponent (4.55).

Proof.

It should be evident by inspection that {ξt} is σ(X)-conditionally Lévy. Let us therefore work out the conditional exponent. For α∈CI, we have

E[exp(αξt) | σ(X)] = E[exp(αZ⁻²G(Zt)) | σ(X)] = exp(aZt(b − (b² − 2Z⁻²α)^{1/2})) = exp(at((b² − 2X)^{1/2} − (b² − 2X − 2α)^{1/2})),   (4.58)

which shows that the conditional exponent is of the form ψ0(α+X)−ψ0(X). □
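Sampling is straightforward because the IG law is available in numpy as the Wald distribution: with parameters a and b, the value Gt has mean at/b and Wald shape parameter (at)². The sketch below (ours, not from the paper; arbitrary parameter values) uses this to confirm the Esscher effect b → (b² − 2λ)^{1/2} by exponential reweighting.

```python
# Sketch (not from the paper): IG sampling and the Esscher tilt b -> sqrt(b^2 - 2*lam).
import numpy as np

rng = np.random.default_rng(4)
a, b, t, lam = 1.5, 2.0, 3.0, 0.8            # require 0 < lam < b^2/2
n = 400_000
G_t = rng.wald(mean=a * t / b, scale=(a * t)**2, size=n)
psi = lambda al: a * (b - np.sqrt(b * b - 2.0 * al))   # IG exponent (4.55)
rho = np.exp(lam * G_t - psi(lam) * t)                 # Esscher weight
b_lam = np.sqrt(b * b - 2.0 * lam)
print((rho * G_t).mean() / rho.mean())                 # tilted mean
print(a * t / b_lam)                                   # = a*t/b_lam, a unchanged
```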

Example 7. Normal inverse Gaussian information

By a normal inverse Gaussian (NIG) process [65,66] with parameters a, b and m, such that a>0, |b|<a and m>0, we mean a Lévy process with an exponent of the form

ψ(α) = m((a² − b²)^{1/2} − (a² − (b + α)²)^{1/2})   (4.59)

for α∈{w∈C : −a−b < Re w < a−b}. Let us write {It} for the NIG process. The probability density for its value at time t is given by

P(It ∈ dx) = (amt/π) exp(mt(a² − b²)^{1/2} + bx) (x² + m²t²)^{−1/2} K1(a(x² + m²t²)^{1/2}) dx,   (4.60)

where K1 is the modified Bessel function of the third kind with index 1 [67]. The NIG process can be represented as a Brownian motion subordinated by an IG process. In particular, let {Bt} be a standard Brownian motion, let {Gt} be an independent IG process with parameters a′ and b′, and set a′=1 and b′ = m(a² − b²)^{1/2}. Then the characteristic function of the process {It} defined by

It = m²bGt + mB(Gt)   (4.61)

is given by (4.59). The associated information process is constructed as follows. We fix a probability space (Ω, F, P) and the parameters a, b and m.

Proposition 4.8

Let the random variable X satisfy −a−b < X < a−b almost surely, let {GtX} be σ(X)-conditionally IG, with parameters a′=1 and b′ = m(a² − (b + X)²)^{1/2}, and let {Bt} be an independent standard Brownian motion. Then the process {ξt} defined by

ξt = m²(b + X)GtX + mB(GtX)   (4.62)

is a Lévy information process with message X and NIG noise, with fiducial exponent (4.59).

Proof.

We observe that the condition on {GtX} is that

E[exp(αGtX) | σ(X)] = exp(t(b′ − (b′² − 2α)^{1/2}))   (4.63)

for α∈CI. Thus, if we set βα = m²(b + X)α + ½m²α² for α∈CI, it follows that

E[exp(αξt) | σ(X)] = E[exp(βα GtX) | σ(X)] = exp(mt((a² − (b + X)²)^{1/2} − (a² − (b + X + α)²)^{1/2})),   (4.64)

which shows that the conditional exponent is of the required form.
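As a numerical cross-check of the subordinated representation used above, the sketch below (ours, not from the paper; arbitrary admissible parameters, with a test point α in (−a−b, a−b)) simulates It = m²bGt + mB(Gt) and compares the sample moment-generating function with exp(ψ(α)t) from (4.59).

```python
# Sketch (not from the paper): NIG via IG-subordinated Brownian motion.
import numpy as np

rng = np.random.default_rng(8)
a, b, m, t = 2.0, 0.5, 1.0, 1.5
n = 400_000
b_prime = m * np.sqrt(a * a - b * b)                  # IG parameter b', with a' = 1
G_t = rng.wald(mean=t / b_prime, scale=t**2, size=n)  # IG values at time t
I_t = m * m * b * G_t + m * np.sqrt(G_t) * rng.standard_normal(n)
al = 0.6                                              # test point in (-a-b, a-b)
psi = m * (np.sqrt(a * a - b * b) - np.sqrt(a * a - (b + al)**2))
print(np.exp(al * I_t).mean(), np.exp(psi * t))       # agree to Monte Carlo error
```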

Similar arguments lead to the construction of information processes based on various other Lévy processes related to the IG distribution, including, for example, the generalized hyperbolic process [68], for which the information process can be shown to take the form

ξt = (b + X)GtX + B(GtX).   (4.65)

Here, the random variable X is taken to be independent of the standard Brownian motion {B(t)}, and {GtX} is σ(X)-conditionally a generalized inverse Gaussian process with parameters (δ, (a² − (b + X)²)^{1/2}, ν). It would be of interest to determine whether models can be found for information processes based on the Meixner process [30] and the CGMY process [69,70].

We conclude this study of Lévy information with the following remarks. Recent developments in the phenomenological representation of physical [38] and economic [42] time series have highlighted the idea that signal-processing techniques may have far-reaching applications to the identification, characterization and categorization of phenomena, both in the natural and in the social sciences, and that beyond the conventional remits of prediction, filtering, and smoothing there is a fourth and important new domain of applicability: the description of phenomena in science and in society. It is our hope therefore that the theory of signal processing with Lévy information herein outlined will find a variety of interesting and exciting applications.

Acknowledgements

The research reported in this paper has been supported in part by Shell Treasury Centre Limited, London, and by the Fields Institute, University of Toronto. The authors are grateful to N. Bingham, M. Davis, E. Hoyle, M. Grasselli, T. Hurd, S. Jaimungal, E. Mackie, A. Macrina, P. Parbhoo and M. Pistorius for helpful comments and discussions.

  • Received July 19, 2012.
  • Accepted September 20, 2012.
