## Abstract

An asymmetric information model is introduced for the situation in which there is a small agent who is more susceptible to the flow of information in the market than the general market participant, and who tries to implement strategies based on the additional information. In this model market participants have access to a stream of noisy information concerning the future return of an asset, whereas the informed trader has access to a further information source which is obscured by an additional noise that may be correlated with the market noise. The informed trader uses the extraneous information source to seek statistical arbitrage opportunities, while at the same time accommodating the additional risk. The amount of information available to the general market participant concerning the asset return is measured by the mutual information of the asset price and the associated cash flow. The worth of the additional information source is then measured in terms of the difference of mutual information between the general market participant and the informed trader. This difference is shown to be non-negative when the signal-to-noise ratio of the information flow is known in advance. Explicit trading strategies leading to statistical arbitrage opportunities, taking advantage of the additional information, are constructed, illustrating how excess information can be translated into profit.

## 1. Introduction

There are many different approaches to the modelling of so-called insider trading strategies. Starting with the work of Kyle (1985) and Back (1992), a number of investigations have been carried out (to name a few, Seyhun 1988; Amendinger *et al*. 1998; Föllmer *et al*. 1999; Back *et al*. 2000; León *et al*. 2003; Corcuera *et al*. 2004; Biagini & Øksendal 2005, 2006; Ankirchner *et al*. 2006; Campi & Çetin 2007). It is sometimes assumed in the literature that the ‘insider’ has direct access to the values of future asset prices. While such a scenario may indeed occasionally prevail, the more common situation is that informed agents do not have advance access to the exact values of future asset prices. How, then, do agents with high information susceptibility, when compared with the average market participant, use their strengths in reality? An increasingly popular strategy adopted by some large hedge funds is to make use of publicly available information in addition to the high-frequency price data. What gives these funds an edge is their significant computational power for data and text mining, which allows them to extract useful information from publicly available sources *faster* than their competitors. Against this background it is natural to ask how much added information an extra source provides, how it affects trading strategies and, more generally, in what way such information-based strategies can be modelled mathematically.

The purpose of the present paper is: (i) to introduce a phenomenological approach to model the agent susceptive to information, (ii) to quantify the amount of added information, and (iii) to derive trading strategies that lead to statistical arbitrage opportunities for such an agent. Our analysis is carried out within the information-based asset pricing framework of Brody, Hughston and Macrina (Macrina 2006; Brody *et al*. 2007, 2008*a*,*b*; Rutkowski & Yu 2007; Hughston & Macrina 2008). In this framework—hereafter referred to as the BHM framework—the price process of an asset is derived from the specification of (a) the future cash flows associated with the asset, and (b) the flow of information to market participants. The price is then given by the discounted risk-adjusted expectation of the cash flows conditional on the available information.

The simplest model that arises in the BHM framework is briefly reviewed in §2. In this set-up the asset is characterized by a contract that delivers a single random cash flow at a predesignated time. Such an asset can be interpreted as having the structure of a credit-risky discount bond. In §3 we consider the problem of quantifying the amount of information contained in the bond price concerning the value of the future bond payout. To this end we determine the mutual information (Yaglom & Yaglom 1983; Cover & Thomas 1991) of these two random variables. Initially the market has no information, beyond that already implicit in the asset price, concerning the value of the cash flow. However, as time goes by, the market gathers information. When the amount of information reaches the level of the initial entropy of the cash flow, the market finally ‘learns’ what the value of the cash flow is. The information-theoretic analysis is extended in §4, where we show that the mutual information at time *t* is given by the initial uncertainty, less the expected uncertainty that remains at that time.

In §5 a simple model for an informed trader is introduced. In our approach the informed trader is more susceptive to the flow of market information than other market participants, and thus on average is able to estimate the value of the impending cash flow more quickly and accurately than the other market participants. Simulation studies show a comparison of the sample paths for the market price process and the corresponding valuations made by the informed trader, revealing various intuitive as well as counter-intuitive properties of these processes. The dynamics of the valuations estimated by the informed trader are worked out in §6, where we obtain the associated innovations representation. With a basic model for an informed trader at hand we are able to quantify the amount of added information held by the trader. This is worked out in §7, where we construct an elementary trading strategy making use of the additional information, demonstrating the existence of statistical arbitrage opportunities in such circumstances.

## 2. Information and asset pricing

The simplest model arising in the BHM framework can be summarized thus. We fix a probability space $(\Omega,\mathcal{F},\mathbb{Q})$, where $\mathbb{Q}$ denotes the risk-neutral measure. We write $\mathbb{E}$ for expectation with respect to $\mathbb{Q}$. The market is not assumed to be complete, but we assume the absence of arbitrage and the existence of an established pricing kernel; these assumptions ensure the existence of a preferred pricing measure (cf. Cochrane 2005). Let *X*_{T} denote the random cash flow associated with the asset and paid at time *T* (for example, the payout of a credit-risky discount bond). Before time *T* market participants do not have direct access to the value of the cash flow. We assume, nevertheless, that partial information concerning the value of *X*_{T}, obscured by market noise, can be obtained before *T*. This noisy ‘observation’ of *X*_{T} generates the market filtration $\{\mathcal{F}_t\}$, and the price at time *t* is given by the risk-neutral expectation of the discounted cash flow, conditional on $\mathcal{F}_t$.

We assume that the ‘signal’ of the noisy observation concerning *X*_{T} is revealed to the market at a constant rate *σ*, that the ‘noise’ is generated by an independent Brownian bridge process {*β*_{tT}}, and that the market filtration is generated by an information process {*ξ*_{t}} defined for 0≤*t*≤*T* by

$$\xi_t=\sigma tX_T+\beta_{tT}.\tag{2.1}$$

In other words, $\mathcal{F}_t=\sigma(\{\xi_s\}_{0\le s\le t})$. The use of a bridge process for the noise is motivated by the idea that at time 0 all the available information about *X*_{T} is incorporated into the *a priori* distribution, and that at time *T* the value of *X*_{T} is revealed and there is no remaining noise: the choice of a Brownian bridge is made for simplicity and tractability. If we further assume that the default-free system of interest rates is deterministic and let {*P*_{tT}} denote the price at time *t* of a discount bond that matures at *T*, then the price of the credit-risky discount bond at time *t* is given by $B_{tT}=P_{tT}\,\mathbb{E}[X_T|\mathcal{F}_t]$. In the case where *X*_{T} takes the discrete values {*x*_{i}}_{i=1,…,n} with the *a priori* probabilities {*p*_{i}}_{i=1,…,n}, a calculation shows that

$$B_{tT}=P_{tT}\,\frac{\sum_i p_ix_i\exp\left[\frac{T}{T-t}\left(\sigma x_i\xi_t-\tfrac12\sigma^2x_i^2t\right)\right]}{\sum_i p_i\exp\left[\frac{T}{T-t}\left(\sigma x_i\xi_t-\tfrac12\sigma^2x_i^2t\right)\right]}.\tag{2.2}$$

This follows from the fact that the conditional risk-neutral probability defined by $\pi_{it}=\mathbb{Q}(X_T=x_i|\mathcal{F}_t)$ takes the form

$$\pi_{it}=\frac{p_i\exp\left[\frac{T}{T-t}\left(\sigma x_i\xi_t-\tfrac12\sigma^2x_i^2t\right)\right]}{\sum_j p_j\exp\left[\frac{T}{T-t}\left(\sigma x_j\xi_t-\tfrac12\sigma^2x_j^2t\right)\right]}.\tag{2.3}$$

By taking the stochastic differential of (2.2) one finds that the dynamical equation satisfied by the bond price is

$$dB_{tT}=r_tB_{tT}\,dt+\frac{\sigma T}{T-t}\,P_{tT}V_{tT}\,dW_t,\tag{2.4}$$

where *r*_{t}=−∂ ln *P*_{0t}/∂*t*, and where the process {*W*_{t}} defined by the expression

$$W_t=\xi_t-\int_0^t\frac{1}{T-s}\left(\sigma TX_{sT}-\xi_s\right)ds\tag{2.5}$$

turns out to be a standard $\{\mathcal{F}_t\}$-Brownian motion. Here, $X_{tT}=\mathbb{E}[X_T|\mathcal{F}_t]$ denotes the conditional expectation of the cash flow *X*_{T}, and

$$V_{tT}=\mathbb{E}\left[(X_T-X_{tT})^2\,|\,\mathcal{F}_t\right]=\sum_i(x_i-X_{tT})^2\pi_{it}\tag{2.6}$$

is the conditional variance of *X*_{T} (see Macrina 2006 and Brody *et al*. 2007 for derivations of the foregoing results).
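The construction above can be illustrated numerically. The following is a minimal sketch, assuming the standard BHM forms $\xi_t=\sigma tX_T+\beta_{tT}$ for (2.1) and the exponential-weight expression for (2.3); the flat short rate `r`, the function names and the time grid are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def brownian_bridge(n_steps, T, rng):
    """Sample a Brownian bridge on [0, T] at n_steps + 1 equally spaced times."""
    t = np.linspace(0.0, T, n_steps + 1)
    increments = rng.normal(0.0, np.sqrt(T / n_steps), size=n_steps)
    B = np.concatenate(([0.0], np.cumsum(increments)))
    return t, B - (t / T) * B[-1]  # pin the path to zero at time T


def conditional_probs(t, xi, x, p, sigma, T):
    """Conditional probabilities pi_it of (2.3); valid for t < T."""
    expo = (T / (T - t)) * (sigma * x * xi - 0.5 * sigma**2 * x**2 * t)
    w = p * np.exp(expo - expo.max())  # shift exponents for numerical stability
    return w / w.sum()


def bond_price(t, xi, x, p, sigma, T, r=0.0):
    """Price (2.2), assuming a flat short rate r so that P_tT = exp(-r (T - t))."""
    P_tT = np.exp(-r * (T - t))
    return P_tT * float(np.dot(x, conditional_probs(t, xi, x, p, sigma, T)))


def simulate_price_path(x, p, sigma, T, n_steps=200, seed=0):
    """Simulate one path of the information process and the bond price on [0, T)."""
    rng = np.random.default_rng(seed)
    X_T = rng.choice(x, p=p)              # outcome of the cash flow
    t, beta = brownian_bridge(n_steps, T, rng)
    xi = sigma * t * X_T + beta           # information process (2.1)
    prices = np.array(
        [bond_price(t[k], xi[k], x, p, sigma, T) for k in range(n_steps)]
    )
    return t[:-1], xi[:-1], prices, X_T
```

For a digital bond one might call `simulate_price_path(np.array([0.0, 1.0]), np.array([0.3, 0.7]), sigma=1.0, T=1.0)`; the terminal time is excluded from the grid because the factor $T/(T-t)$ is singular there.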

We observe that in the information-based framework it is possible to *deduce* the diffusive dynamics (2.4) for the price process, starting from the specification of the cash flow *X*_{T} and the information process {*ξ*_{t}}. The theme that underlies this framework is that the market acts as a ‘signal processor’ for future cash flows so as to generate the dynamics of asset prices. This point of view is natural as a basis for understanding the elements of price formation, since investment decisions are often made in accordance with the perceptions of market participants concerning the future cash flows associated with a given asset.

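As far as the market filtration is concerned, the information contained in {*ξ*_{t}} is equivalent to that in {*B*_{tT}}: we have *σ*({*ξ*_{s}}_{0≤s≤t})=*σ*({*B*_{sT}}_{0≤s≤t}). This follows from the fact that one can write *B*_{tT}=*B*(*t*, *ξ*_{t}), where

$$B(t,x)=P_{tT}\,\frac{\sum_i p_ix_i\exp\left[\frac{T}{T-t}\left(\sigma x_ix-\tfrac12\sigma^2x_i^2t\right)\right]}{\sum_i p_i\exp\left[\frac{T}{T-t}\left(\sigma x_ix-\tfrac12\sigma^2x_i^2t\right)\right]},\tag{2.7}$$

from which by differentiation we deduce that

$$\frac{\partial B(t,x)}{\partial x}=\frac{\sigma T}{T-t}\,P_{tT}\,V(t,x),\tag{2.8}$$

where *V*(*t*, *x*) denotes the conditional variance of *X*_{T} regarded as a function of the value *x* taken by *ξ*_{t}. This expression is positive. Therefore, *B*(*t*, *x*) is monotonically increasing in *x*, and hence invertible. It follows that from knowledge of the trajectory {*ξ*_{s}}_{0≤s≤t}, one can construct {*B*_{sT}}_{0≤s≤t}; conversely from knowledge of the trajectory {*B*_{sT}}_{0≤s≤t} one can construct {*ξ*_{s}}_{0≤s≤t}.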

## 3. Amount of information about the future cash flow contained in the price process

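We would like to quantify how much information regarding the value of the cash flow *X*_{T} is contained in the value at time *t* of the information process (2.1). A reasonable measure for such quantification is given by the mutual information *J*(*ξ*_{t}, *X*_{T}) between the two random variables (Shannon & Weaver 1949; Gel'fand *et al*. 1956; Gel'fand & Yaglom 1957), which in the present context is given by the expression

$$J(\xi_t,X_T)=\sum_{i=1}^n\int_{-\infty}^{\infty}\rho_{\xi X}(x,i)\,\ln\frac{\rho_{\xi X}(x,i)}{\rho_\xi(x)\,\rho_X(i)}\,dx,\tag{3.1}$$

where

$$\rho_{\xi X}(x,i)=\frac{d}{dx}\,\mathbb{Q}\big[(\xi_t<x)\cap(X_T=x_i)\big]\tag{3.2}$$

is the joint density function of the random variables (*ξ*_{t}, *X*_{T}), and where *ρ*_{ξ}(*x*) and *ρ*_{X}(*i*) are the respective marginal probabilities. By use of the relation

$$\mathbb{Q}\big[(\xi_t<x)\cap(X_T=x_i)\big]=\mathbb{Q}(\xi_t<x\,|\,X_T=x_i)\,\mathbb{Q}(X_T=x_i)\tag{3.3}$$

we deduce that

$$\rho_{\xi X}(x,i)=p_i\sqrt{\frac{T}{2\pi t(T-t)}}\,\exp\left[-\frac{T(x-\sigma x_it)^2}{2t(T-t)}\right],\tag{3.4}$$

since conditional on *X*_{T}=*x*_{i} the random variable *ξ*_{t} is normally distributed with mean *σx*_{i}*t* and variance *t*(*T*−*t*)/*T*. From (3.4) the marginal densities

$$\rho_\xi(x)=\sum_{i=1}^n\rho_{\xi X}(x,i),\qquad\rho_X(i)=\int_{-\infty}^{\infty}\rho_{\xi X}(x,i)\,dx\tag{3.5}$$

can be deduced at once. In particular, we have *ρ*_{X}(*i*)=*p*_{i}, as it should.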

An alternative way of deriving the mutual information in this context is to make use of the identity

$$J(\xi_t,X_T)=H(\xi_t)-H(\xi_t|X_T).\tag{3.6}$$

Here $H(\xi_t)=-\mathbb{E}[\ln\rho_\xi(\xi_t)]$ is the entropy of the random variable *ξ*_{t} (Wiener 1948; Khintchine 1953) and $H(\xi_t|X_T)=-\mathbb{E}[\ln\rho_\xi(\xi_t|X_T)]$ is the entropy of *ξ*_{t} conditional on *X*_{T}. The conditional density function $\rho_\xi(x|X_T)$, $x\in\mathbb{R}$, is defined by $\rho_\xi(x|X_T)=d\,\mathbb{Q}(\xi_t<x|X_T)/dx$. Since conditional on *X*_{T} both of the random variables *ξ*_{t} and *β*_{tT} are normally distributed, with the same variance *t*(*T*−*t*)/*T*, and since the entropy of a normally distributed random variable is independent of its mean, we find that *H*(*ξ*_{t}|*X*_{T})=*H*(*β*_{tT}). In other words, the mutual information in the present context is given by the difference of the two entropies:

$$J(\xi_t,X_T)=H(\xi_t)-H(\beta_{tT})=-\int_{-\infty}^{\infty}\rho_\xi(x)\ln\rho_\xi(x)\,dx-\tfrac12\ln\left[\frac{2\pi e\,t(T-t)}{T}\right].\tag{3.7}$$

As a consequence, the information about the cash flow *X*_{T} contained in *ξ*_{t} can be determined (a different approach to extracting information concerning the asymptotic dividend stream from option price data is considered in Geman *et al*. 2007).

From an information-theoretic point of view, a pair of processes related through an invertible smooth function, and thus sharing the *same* filtration, in general possess *different* information content (entropy). On the other hand, since what is directly observed in the market is the price *B*_{tT}, which is an invertible function of *ξ*_{t}, one might argue that it is more relevant to determine the mutual information *J*(*B*_{tT}, *X*_{T}), that is, the amount of information about the future cash flow contained in the market price. However, since mutual information is given by a difference of entropies, and since changes in the two entropies resulting from the transform cancel, we have *J*(*B*_{tT}, *X*_{T})=*J*(*ξ*_{t}, *X*_{T}). Therefore, the amount of information about *X*_{T} contained in *B*_{tT} is given by (3.7).

In figure 1 we plot the mutual information *J*(*B*_{tT}, *X*_{T}) as a function of *t*∈[0, *T*] for three values of the information flow-rate parameter *σ*. The information gained by market participants increases more rapidly as *σ* is raised. On the other hand, the dynamical relation (2.4) shows that the value of *σ* determines the overall magnitude of the price volatility. Thus, it is possible to quantify the market information gain and to compare this with the price volatility.
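A curve of the kind described for figure 1 can be approximated by evaluating the entropy difference $H(\xi_t)-H(\beta_{tT})$ by quadrature. The sketch below assumes that the marginal density of *ξ*_{t} is the Gaussian mixture implied by the conditional normality stated in §3; the grid width of ten standard deviations, the grid size and the function names are illustrative choices rather than anything taken from the paper.

```python
import numpy as np


def trapezoid(values, grid):
    """Composite trapezoidal rule on a one-dimensional grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


def mutual_information(t, x, p, sigma, T, n_grid=20001):
    """J(xi_t, X_T) = H(xi_t) - H(beta_tT) in nats, for 0 < t < T."""
    var = t * (T - t) / T                  # variance of the bridge noise at time t
    sd = np.sqrt(var)
    means = sigma * x * t                  # conditional means of xi_t given X_T = x_i
    grid = np.linspace(means.min() - 10.0 * sd, means.max() + 10.0 * sd, n_grid)
    components = p[:, None] * np.exp(-(grid - means[:, None]) ** 2 / (2.0 * var))
    density = components.sum(axis=0) / np.sqrt(2.0 * np.pi * var)
    H_xi = trapezoid(-density * np.log(density), grid)
    H_beta = 0.5 * np.log(2.0 * np.pi * np.e * var)  # entropy of a Gaussian
    return H_xi - H_beta
```

Evaluating `mutual_information` on a grid of times for several values of `sigma` gives curves that start near zero and rise towards the initial entropy of the cash flow, more rapidly for larger `sigma`.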

To see how entropy transforms under a nonlinear invertible map, suppose that *X* is a random variable with density *p*(*x*), and that *Y* is another random variable given by *Y*=*f*(*X*), where *f*(*x*) is smooth, increasing and invertible. Then the density function for *Y* is given by *q*(*y*)=*p*(*f*^{−1}(*y*))/*f*′(*f*^{−1}(*y*)). Substituting this into $H(Y)=-\int q(y)\ln q(y)\,dy$ and changing variables by setting *y*=*f*(*x*), we find that $H(Y)=H(X)+\mathbb{E}[\ln f'(X)]$. As a consequence, we have

$$H(B_{tT})=H(\xi_t)+\mathbb{E}\left[\ln B'(t,\xi_t)\right],\tag{3.8}$$

where *B*′(*t*, *x*)=∂*B*(*t*, *x*)/∂*x*. The advantage of this expression is that we need not determine the inverse of the function *B*(*t*, *x*) defined in (2.7) in order to calculate *H*(*B*_{tT}). From (2.8) it follows that

$$H(B_{tT})-H(\xi_t)=\mathbb{E}\left[\ln\left(\frac{\sigma T}{T-t}\,P_{tT}V_{tT}\right)\right].\tag{3.9}$$

In other words, the difference between *H*(*B*_{tT}) and *H*(*ξ*_{t}) is the average log volatility of the price process at time *t*.

## 4. Analysis of information measures

We proceed in this section to consider the Shannon–Wiener entropy associated with the conditional risk-neutral probabilities. The analysis of the entropy leads to insights into the qualitative behaviour of the asset price volatility. The Shannon–Wiener entropy is defined by the expression

$$H_t=-\sum_{i=1}^n\pi_{it}\ln\pi_{it}.\tag{4.1}$$

We shall demonstrate that the mutual information (3.1) and the entropy (4.1) are related as follows:

$$J(\xi_t,X_T)=H_0-\mathbb{E}[H_t].\tag{4.2}$$

Thus, the mutual information at time *t* is given by the initial uncertainty, less the expected uncertainty that still remains at that time.

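The derivation of (4.2) proceeds in two steps. First we shall show that

$$\mathbb{E}[H_t]=H_0-\tfrac12\sigma^2T^2\int_0^t\frac{\mathbb{E}[V_{sT}]}{(T-s)^2}\,ds,\tag{4.3}$$

and then we show that

$$J(\xi_t,X_T)=\tfrac12\sigma^2T^2\int_0^t\frac{\mathbb{E}[V_{sT}]}{(T-s)^2}\,ds.\tag{4.4}$$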

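Let us begin by deriving the dynamical equation satisfied by the Shannon–Wiener entropy. From (2.3) and (2.5) we find that the conditional probability satisfies

$$d\pi_{it}=\frac{\sigma T}{T-t}\,(x_i-X_{tT})\,\pi_{it}\,dW_t.\tag{4.5}$$

It follows, by an application of Ito's lemma to (4.1), that

$$dH_t=-\frac{\sigma^2T^2}{2(T-t)^2}\,V_{tT}\,dt-\frac{\sigma T}{T-t}\sum_{i=1}^n(x_i-X_{tT})\,\pi_{it}\ln\pi_{it}\,dW_t.\tag{4.6}$$

Integrating (4.6) and taking the expectation of both sides, we obtain (4.3), as desired. ▪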
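The Ito step leading from (4.5) to (4.6) can be spelled out as follows. This is a sketch under the assumption that (4.5) has the form $d\pi_{it}=\nu_t(x_i-X_{tT})\pi_{it}\,dW_t$, where $X_{tT}$ denotes the conditional expectation of *X*_{T} and the shorthand $\nu_t=\sigma T/(T-t)$ is borrowed from later in this section.

```latex
\begin{aligned}
d(\pi_{it}\ln\pi_{it})
  &= (1+\ln\pi_{it})\,d\pi_{it}+\frac{1}{2\pi_{it}}\,(d\pi_{it})^2 \\
  &= (1+\ln\pi_{it})\,\nu_t\,(x_i-X_{tT})\,\pi_{it}\,dW_t
     +\tfrac12\,\nu_t^2\,(x_i-X_{tT})^2\,\pi_{it}\,dt .
\end{aligned}

% Sum over i and use \sum_i (x_i - X_{tT}) \pi_{it} = 0 together with
% \sum_i (x_i - X_{tT})^2 \pi_{it} = V_{tT}:

dH_t = -\,d\sum_i \pi_{it}\ln\pi_{it}
     = -\tfrac12\,\nu_t^2\,V_{tT}\,dt
       -\nu_t\sum_i (x_i-X_{tT})\,\pi_{it}\ln\pi_{it}\,dW_t .
```

The drift is non-positive, so the expected entropy decreases in time, and the stochastic integral drops out on taking expectations provided it is a true martingale.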

In Gel'fand & Yaglom (1957) it is shown that the mutual information can be expressed as the expectation of the log density of the joint measure *μ*_{ξX} with respect to the product measure *μ*_{ξ}⊗*μ*_{X}:

$$J(\xi,X)=\mathbb{E}\left[\ln\frac{d\mu_{\xi X}}{d(\mu_\xi\otimes\mu_X)}\right].\tag{4.7}$$

We are thus required to determine the relevant Radon–Nikodym derivative. We shall follow the line of argument presented in Davis (1978); see also Duncan (1970). To proceed, we require the introduction of an auxiliary measure $\mathbb{B}$ introduced in Macrina (2006) and Brody *et al*. (2007, 2008*a*). This is the so-called bridge measure under which the information process {*ξ*_{t}} becomes a Brownian bridge. The argument goes as follows. We fix the probability space $(\Omega,\mathcal{F},\mathbb{Q})$ with a filtration $\{\mathcal{F}_t\}_{0\le t<\infty}$, and introduce a $\mathbb{Q}$-Brownian motion {*B*_{t}} such that the Brownian bridge {*β*_{tT}} appearing in (2.1) is given by

$$\beta_{tT}=(T-t)\int_0^t\frac{dB_s}{T-s}.\tag{4.8}$$

This is the standard integral representation for a Brownian bridge (Hida 1980; Protter 2005). Setting *ν*_{t}=*σT*/(*T*−*t*) we define

$$\Lambda_t=\exp\left(-X_T\int_0^t\nu_s\,dB_s-\tfrac12X_T^2\int_0^t\nu_s^2\,ds\right),\tag{4.9}$$

where *X*_{T} is $\mathbb{Q}$-independent of {*B*_{t}} and is $\mathcal{F}_0$-measurable. For fixed *u*<*T* we introduce the measure $\mathbb{B}$ on $\mathcal{F}_u$ by writing

$$\left.\frac{d\mathbb{B}}{d\mathbb{Q}}\right|_{\mathcal{F}_u}=\Lambda_u.\tag{4.10}$$

Then the process $\{B^*_t\}$ defined by

$$B^*_t=B_t+X_T\int_0^t\nu_s\,ds\tag{4.11}$$

is a $\mathbb{B}$-Brownian motion, since *ν*_{s}*X*_{T} is bounded for any *s*≤*u*.

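Under $\mathbb{B}$ we find that the distribution of *X*_{T} is the same as it is under $\mathbb{Q}$, that {*ξ*_{t}} and *X*_{T} are independent, and that {*ξ*_{t}} is a $\mathbb{B}$-Brownian bridge (cf. Brody *et al*. 2008*a*). To see these, recall that, since *X*_{T} and {*β*_{tT}}, and hence *X*_{T} and {*B*_{t}}, are independent, the probability law of {*B*_{t}} conditional on *X*_{T} remains that of a Brownian motion. Now take a bounded function *f*(*x*) and consider

$$\mathbb{E}^{\mathbb{B}}[f(X_T)]=\mathbb{E}^{\mathbb{Q}}[\Lambda_uf(X_T)]=\mathbb{E}^{\mathbb{Q}}\left[f(X_T)\,\mathbb{E}^{\mathbb{Q}}[\Lambda_u|X_T]\right].\tag{4.12}$$

Conditional on *X*_{T}, $\Lambda_u$ takes the form of a Girsanov exponential, since {*B*_{t}} is a $\mathbb{Q}$-Brownian motion. Therefore, the inner expectation equals unity, and we find

$$\mathbb{E}^{\mathbb{B}}[f(X_T)]=\mathbb{E}^{\mathbb{Q}}[f(X_T)]\tag{4.13}$$

for every bounded function *f*(*x*). In other words, *X*_{T} has the same probability law under $\mathbb{B}$ and $\mathbb{Q}$. In a similar manner, for any sequence of times *t*_{1}, … ,*t*_{n}∈[0,*u*] and any bounded function $g:\mathbb{R}^n\to\mathbb{R}$, we wish to calculate

$$\mathbb{E}^{\mathbb{B}}\left[g(B^*_{t_1},\ldots,B^*_{t_n})f(X_T)\right]=\mathbb{E}^{\mathbb{Q}}\left[f(X_T)\,\mathbb{E}^{\mathbb{Q}}\left[\Lambda_u\,g(B^*_{t_1},\ldots,B^*_{t_n})\,|\,X_T\right]\right].\tag{4.14}$$

By the same argument as above, for each *X*_{T} the process $\{B^*_t\}$ is Brownian under the measure whose density is $\Lambda_u$. Since $\{B_t\}$ itself is Brownian under $\mathbb{Q}$, we have

$$\mathbb{E}^{\mathbb{Q}}\left[\Lambda_u\,g(B^*_{t_1},\ldots,B^*_{t_n})\,|\,X_T\right]=\mathbb{E}^{\mathbb{Q}}\left[g(B_{t_1},\ldots,B_{t_n})\right]\tag{4.15}$$

and hence

$$\mathbb{E}^{\mathbb{B}}\left[g(B^*_{t_1},\ldots,B^*_{t_n})f(X_T)\right]=\mathbb{E}^{\mathbb{Q}}\left[g(B_{t_1},\ldots,B_{t_n})\right]\mathbb{E}^{\mathbb{Q}}[f(X_T)].\tag{4.16}$$

However, $\mathbb{B}$ and $\mathbb{Q}$ coincide on *X*_{T} so that $\mathbb{E}^{\mathbb{Q}}[f(X_T)]=\mathbb{E}^{\mathbb{B}}[f(X_T)]$, from which it follows that *X*_{T} and $\{B^*_t\}$, and hence *X*_{T} and {*ξ*_{t}}, are $\mathbb{B}$-independent. By combining (4.8) and (4.11) we get

$$d\xi_t=-\frac{1}{T-t}\left(\sigma tX_T+\beta_{tT}\right)dt+dB^*_t.\tag{4.17}$$

Eliminating *β*_{tT} by the use of *β*_{tT}=*ξ*_{t}−*σtX*_{T} we obtain the relation

$$d\xi_t=-\frac{\xi_t}{T-t}\,dt+dB^*_t,\tag{4.18}$$

which is the dynamical equation satisfied by a Brownian bridge in the $\mathbb{B}$-measure.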

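Let *Ψ* be the map *ω*→{{*ξ*_{t}(*ω*)}_{0≤t≤u},*X*_{T}(*ω*)}. Then the joint sample space measure of {{*ξ*_{t}(*ω*)},*X*_{T}(*ω*)} is $\mu_{\xi X}(A)=\mathbb{Q}(\Psi^{-1}(A))$ for any measurable set *A*, and the sample space measure of *X*_{T} is $\mu_X(A')=\mathbb{Q}(X_T^{-1}(A'))$ for any measurable set *A*′. However, since {*ξ*_{t}} and *X*_{T} are independent under $\mathbb{B}$, and {*ξ*_{t}} is a $\mathbb{B}$-Brownian bridge, we have $\mu_X\otimes\mu_\beta(A)=\mathbb{B}(\Psi^{-1}(A))$. It follows from (4.9) that

$$\frac{d\mu_{\xi X}}{d(\mu_X\otimes\mu_\beta)}=\exp\left(X_T\int_0^u\nu_s\,dB_s+\tfrac12X_T^2\int_0^u\nu_s^2\,ds\right).\tag{4.19}$$

Substituting (4.11) in here we thus deduce that

$$\frac{d\mu_{\xi X}}{d(\mu_X\otimes\mu_\beta)}=\exp\left(X_T\int_0^u\nu_s\,dB^*_s-\tfrac12X_T^2\int_0^u\nu_s^2\,ds\right).\tag{4.20}$$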

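Turning to the innovations representation (2.5) we find, along with (4.18), that {*W*_{t}} and $\{B^*_t\}$ are related according to

$$dB^*_t=dW_t+\nu_tX_{tT}\,dt.\tag{4.21}$$

Thus, following a similar line of argument we obtain

$$\frac{d\mu_\xi}{d\mu_\beta}=\exp\left(\int_0^u\nu_sX_{sT}\,dB^*_s-\tfrac12\int_0^u\nu_s^2X_{sT}^2\,ds\right),\tag{4.22}$$

which is a version of the likelihood ratio formula of Kailath (1971). The measure *μ*_{ξ} thus corresponds to the ‘signal present’ situation, while the ‘signal absent’ case corresponds to {*ξ*_{t}} being pure bridge noise with measure *μ*_{β}. Combining (4.20) and (4.22), and making use of (4.11), we deduce that

$$\frac{d\mu_{\xi X}}{d(\mu_\xi\otimes\mu_X)}=\exp\left(\int_0^u\nu_s(X_T-X_{sT})\,dB_s+\tfrac12\int_0^u\nu_s^2(X_T-X_{sT})^2\,ds\right).\tag{4.23}$$

Taking the expectation of the logarithm of this, bearing in mind that {*B*_{t}} is a $\mathbb{Q}$-Brownian motion, we recover (4.4), as claimed. ▪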
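The relation between mutual information and expected entropy can be checked numerically without simulation, since both sides reduce to one-dimensional integrals against the marginal density of *ξ*_{t}. The sketch below assumes the Gaussian-mixture form of the joint density given in §3 and the definition of the entropy as $-\sum_i\pi_{it}\ln\pi_{it}$; the function names, grid width and parameter values are illustrative.

```python
import numpy as np


def trapezoid(values, grid):
    """Composite trapezoidal rule on a one-dimensional grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


def entropy_identity_terms(t, x, p, sigma, T, n_grid=20001):
    """Return (J, H_0, E[H_t]) for 0 < t < T, all computed by quadrature."""
    var = t * (T - t) / T
    sd = np.sqrt(var)
    means = sigma * x * t
    grid = np.linspace(means.min() - 10.0 * sd, means.max() + 10.0 * sd, n_grid)
    # Joint density of (xi_t, X_T = x_i): one row per outcome x_i.
    joint = (
        p[:, None]
        * np.exp(-(grid - means[:, None]) ** 2 / (2.0 * var))
        / np.sqrt(2.0 * np.pi * var)
    )
    marginal = joint.sum(axis=0)
    posterior = joint / marginal           # conditional probabilities given xi_t = x
    safe = np.where(posterior > 0.0, posterior, 1.0)  # treat 0 ln 0 as 0
    H_given_xi = -(posterior * np.log(safe)).sum(axis=0)
    expected_H_t = trapezoid(marginal * H_given_xi, grid)
    H_0 = -float(np.sum(p * np.log(p)))
    J = trapezoid(-marginal * np.log(marginal), grid) - 0.5 * np.log(
        2.0 * np.pi * np.e * var
    )
    return J, H_0, expected_H_t
```

For any admissible parameters the returned values should satisfy `J ≈ H_0 - expected_H_t` up to quadrature error, which is the content of the identity stated at the start of this section.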

The entropy process {*H*_{t}}_{0≤t<T} has the property that lim_{t→T}*H*_{t}=0. This follows from the fact that the conditional probability process {*π*_{it}}_{0≤t<T} has the limiting behaviour

$$\lim_{t\to T}\pi_{it}=\mathbb{1}\{X_T=x_i\}\tag{4.24}$$

for *i*=1, …, *n*.

Let *ω*∈*Ω* be fixed, and suppose that *X*_{T}(*ω*)=*x*_{k} for some *k*. For this realization of *ω* the information process is given by *ξ*_{t}=*σtx*_{k}+*β*_{tT}. Substituting this expression for *ξ*_{t} into (2.3) and dividing the denominator and the numerator by the exponential factor appearing in the numerator, we deduce that

$$\pi_{kt}=\frac{p_k}{p_k+\sum_{i\neq k}p_i\exp\left[-\frac{T}{T-t}\left(\tfrac12\sigma^2(x_i-x_k)^2t-\sigma(x_i-x_k)\beta_{tT}\right)\right]}.\tag{4.25}$$

Observe that all of the terms in the sum in the denominator vanish as *t* approaches *T*, and therefore lim_{t→T}*π*_{kt}=1. It follows that lim_{t→T}*π*_{it}=0 for *i*≠*k*, and thus (4.24). Finally, since $H_t=-\sum_i\pi_{it}\ln\pi_{it}$ by (4.1) and since $\lim_{\pi\to0}\pi\ln\pi=0$, we deduce that lim_{t→T}*H*_{t}=0. ▪

If we let *t* approach *T* in (4.3), we find the following relation:

$$H_0=\tfrac12\sigma^2T^2\int_0^T\frac{\mathbb{E}[V_{sT}]}{(T-s)^2}\,ds.\tag{4.26}$$

Since *H*_{0} is bounded by ln *n*, where *n* is the number of values *X*_{T} can take, and since the coefficient of the conditional variance {*V*_{sT}} in the integrand diverges quadratically as *s* approaches *T*, this relationship shows that the variance process has to decay sufficiently rapidly to ensure the existence of the right-hand side of (4.26). On the other hand, the conditional variance also generates the random movement of the asset price volatility in (2.4). As a consequence, we are able to obtain a crude estimate of the magnitude of the cumulative volatility. It is worth noting that in the models based on Brownian noise, the entropy and mutual information are closely related to the prices of variance or volatility derivatives. A related observation has been made by Soklakov (2008).

It should be remarked that the limiting behaviour lim_{t→T}*H*_{t}=0 is specific to the case in which *X*_{T} takes discrete values. If *X*_{T} is a continuous random variable, then the associated entropy has the property that lim_{t→T}*H*_{t}=−∞, which can be seen from the Hirschman inequality in Fourier analysis (Beckner 1975):(4.27)It follows from (4.6) that the variance process {*V*_{sT}} in this case does not vanish sufficiently rapidly to ensure finiteness of the right-hand side of (4.3) as *t* approaches *T*. In other words, there is a qualitative difference in the behaviour of the volatility process, depending on whether the cash flow is a continuous or discrete random variable. In particular, the volatility products may be overpriced in models based on continuous cash flows, since the real market cash flows are not continuous.

## 5. A model for an informed trader

In the previous sections we have examined the BHM framework from an information-theoretic perspective. In particular, we have been able to quantify the amount of information about the future cash flow of an asset contained in its price. We turn now to consider a model for an informed trader who has access to an additional information source, apart from the price process itself, concerning the future return of an asset. We assume that the informed trader is ‘small’ in the sense that access to the additional information is limited, and that the actions of the informed trader will not impact the price process. In other words, the model is not for a large number of small agents; rather, it is for a single agent, or a highly restricted number of agents, who carefully execute their trading strategies, taking advantage of the additional information.

One might expect that the use of additional information gives a *definite* advantage for the trader. This, however, is not necessarily the case: additional information is in general obscured by additional noise. As a consequence, the valuation process estimated by the informed trader can entail higher volatility than the actual market price movements. It follows that any strategy making use of additional information will tend to embody additional risk. Nevertheless, on the whole such strategies are expected to outperform the market, and this is the idea behind some of the statistical arbitrage schemes adopted by hedge funds.

The BHM framework is sufficiently flexible to model this kind of scenario. Indeed, the use of this approach as a basis for the development of insider-trading models was recognized early on (Macrina 2006). Our intention here is to apply such ideas to describe the disparity in the ability of processing publicly available information, and to illustrate how statistical arbitrage opportunities can be seen to arise in a simple model. It should be emphasized that many of the simplifying assumptions—that the asset entails a single cash flow, that the information flow rates are constants, that the interest rate is deterministic, and that the noises are modelled by Brownian bridges—can be relaxed without affecting the main qualitative features of the model.

We assume the set-up for the market outlined in §2. There is additionally, however, an informed trader who has a further noisy information source represented by the information process

$$\xi'_t=\sigma'tX_T+\beta'_{tT}.\tag{5.1}$$

The noise term $\{\beta'_{tT}\}$ may or may not be correlated with the market noise {*β*_{tT}}. We let {*B*_{t}} and $\{B'_t\}$ be a pair of Brownian motions with correlation *ρ*, and define the associated Brownian bridges by

$$\beta_{tT}=B_t-\frac{t}{T}B_T,\qquad\beta'_{tT}=B'_t-\frac{t}{T}B'_T.\tag{5.2}$$

In this way we can model the two noise terms with a fair amount of generality (we may use alternatively the integral representation (4.8) to construct the bridge processes; however, the choice (5.2) is more suitable for simulation purposes), since *ρ* determines the correlation of {*β*_{tT}} and $\{\beta'_{tT}\}$. In particular, if |*ρ*|=1 then the informed trader has two linear equations (2.1) and (5.1) for the two unknowns *X*_{T} and *β*_{tT}; hence the value of the future cash flow *X*_{T} will become instantly accessible to the trader (assuming |*σ*|≠|*σ*′|). This situation corresponds to the fully-informed insider often considered in the literature. The other extreme, for which |*ρ*|≪1, is of interest, since the informed trader must choose a strategy optimally to avoid being overwhelmed by the additional noise.
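The construction of the two correlated bridge noises lends itself to direct simulation. The following is a minimal sketch, assuming the bridges are obtained from correlated Brownian paths by subtracting the linear interpolant of the terminal value, $\beta_{tT}=B_t-(t/T)B_T$; the function name and the equally spaced time grid are illustrative assumptions.

```python
import numpy as np


def correlated_bridges(n_steps, T, rho, rng):
    """Sample two Brownian bridges on [0, T] whose driving motions have correlation rho."""
    t = np.linspace(0.0, T, n_steps + 1)
    dt = T / n_steps
    dW1 = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    dW2 = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    dB = dW1
    dB_prime = rho * dW1 + np.sqrt(1.0 - rho**2) * dW2  # correlated increments
    B = np.concatenate(([0.0], np.cumsum(dB)))
    B_prime = np.concatenate(([0.0], np.cumsum(dB_prime)))
    beta = B - (t / T) * B[-1]                # market noise
    beta_prime = B_prime - (t / T) * B_prime[-1]  # noise in the extra source
    return t, beta, beta_prime
```

Given an outcome `X_T`, the two information processes are then `sigma * t * X_T + beta` and `sigma_prime * t * X_T + beta_prime`. With `rho = 1` the two bridges coincide path by path, which corresponds to the fully informed case discussed above.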

Let $\{\mathcal{F}'_t\}$ denote the filtration generated by $\{\xi'_t\}$. If *σ*′>*σ*, then knowledge of the value of *X*_{T} is revealed at a faster rate in the ‘primed’ filtration. This, however, does not mean that $\{\mathcal{F}_t\}$ is contained in $\{\mathcal{F}'_t\}$ even if *ρ*=1; the two filtrations are merely inequivalent. On the other hand, since the informed trader also has access to the price process, which is adapted to $\{\mathcal{F}_t\}$, it is reasonable to assume that the information source is given by $\{\mathcal{G}_t\}$, where $\mathcal{G}_t=\mathcal{F}_t\vee\mathcal{F}'_t$. We assume that the additional information commences at time *t*=0; hence, the *a priori* probabilities {*p*_{i}} for *X*_{T} to take the values {*x*_{i}} remain the same. This assumption may seem limiting; however, it is not unreasonable since the ‘lifetime’ of the extra information, i.e. the period over which extra information has value, is often short in practice.

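The informed trader will use the extra information to work out the price that the market would have made had $\{\mathcal{G}_t\}$, the filtration generated jointly by the two information processes, been accessible to general market participants. We shall now calculate the valuation process made by the informed trader on this basis. From the Markov property of the joint information process we find that the informed valuation process is given by

$$\tilde B_{tT}=P_{tT}\,\mathbb{E}[X_T|\xi_t,\xi'_t]=P_{tT}\sum_ix_i\tilde\pi_{it},\tag{5.3}$$

where $\tilde\pi_{it}=\mathbb{Q}(X_T=x_i|\xi_t,\xi'_t)$. By the use of the Bayes formula

$$\tilde\pi_{it}=\frac{p_i\,\rho(\xi_t,\xi'_t|X_T=x_i)}{\sum_jp_j\,\rho(\xi_t,\xi'_t|X_T=x_j)},\tag{5.4}$$

where the conditional density is given by the bivariate normal density function

$$\rho(x,y|X_T=x_i)=\frac{T}{2\pi t(T-t)\sqrt{1-\rho^2}}\exp\left[-\frac{T\left[(x-\sigma x_it)^2-2\rho(x-\sigma x_it)(y-\sigma'x_it)+(y-\sigma'x_it)^2\right]}{2(1-\rho^2)\,t(T-t)}\right],\tag{5.5}$$

we deduce that

$$\tilde B_{tT}=P_{tT}\,\frac{\sum_ip_ix_i\exp\left[\frac{T}{\varrho(T-t)}\left((\sigma_1\xi_t+\sigma_2\xi'_t)x_i-\tfrac12(\sigma^2-2\rho\sigma\sigma'+\sigma'^2)x_i^2t\right)\right]}{\sum_ip_i\exp\left[\frac{T}{\varrho(T-t)}\left((\sigma_1\xi_t+\sigma_2\xi'_t)x_i-\tfrac12(\sigma^2-2\rho\sigma\sigma'+\sigma'^2)x_i^2t\right)\right]}.\tag{5.6}$$

Here we set *σ*_{1}=*σ*−*ρσ*′, *σ*_{2}=*σ*′−*ρσ*, and *ϱ*=1−*ρ*^{2}.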

The process (5.6) is our model for the valuation made by the informed trader. It is straightforward to simulate the informed valuation process and compare this against the uninformed market price process {*B*_{tT}}. By suitably adjusting the values for *σ*, *σ*′ and *ρ*, we are able to confirm various intuitive aspects of the behaviour of these processes; some examples are displayed in figure 2. With respect to any given sample path, the valuation of the informed trader at time *t*∈[0,*T*] may be less accurate (by ‘accurate’ we mean close to the true value) than the market price. However, on average the valuation determined by the informed trader converges more rapidly to the true value of the bond. This is illustrated in figure 3 where we plot the averaged sample paths conditional on the given outcome.

Figure 3*b* indicates that the performance of the informed trader is high even if the signal-to-noise ratio *σ*′ of the additional information source is set to zero (and hence $\{\xi'_t\}$ is pure noise). In fact the quality of the estimate decreases as the value of *σ*′ is raised from zero, until it reaches the critical level *σ*′=*ρσ*. Putting the matter differently, we observe that the quality of the estimate made by an informed trader is not monotonic in the signal-to-noise ratio of the additional information source. This feature may appear counter-intuitive, but it can be understood by rearrangement of (5.6) into a form analogous to (4.25). Conditional on *X*_{T}=*x*_{k}, the informed conditional probability $\tilde\pi_{kt}$ of the true outcome takes the form

$$\tilde\pi_{kt}=\frac{p_k}{p_k+\sum_{i\neq k}p_i\exp\left[-\frac{T}{\varrho(T-t)}\left(\tfrac12(\sigma^2-2\rho\sigma\sigma'+\sigma'^2)\,\omega_{ik}^2\,t-(\sigma_1\beta_{tT}+\sigma_2\beta'_{tT})\,\omega_{ik}\right)\right]}.\tag{5.7}$$

Here we write *ω*_{ik}=*x*_{i}−*x*_{k}. This expression shows that the exponential rate of convergence for the informed valuation to approach the ‘true’ value *x*_{k} is governed by the ratio $(\sigma^2-2\rho\sigma\sigma'+\sigma'^2)/(1-\rho^2)$. In particular, for fixed *σ* and *ρ* this ratio takes a minimum value at *σ*′=*ρσ*. When *σ*′=*ρσ*, the linear equations *ξ*_{t}=*σtX*_{T}+*β*_{tT} and $\xi'_t=\sigma'tX_T+\beta'_{tT}$ become the closest to being degenerate, and hence the value of the additional information is minimized.
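The non-monotonic dependence on *σ*′ can be seen from a one-line computation. The sketch below assumes that the convergence rate is governed by the ratio $(\sigma^2-2\rho\sigma\sigma'+\sigma'^2)/(1-\rho^2)$, which is what one obtains by completing the square in a bivariate normal likelihood with correlation *ρ*; the function name is an illustrative choice.

```python
import numpy as np


def effective_rate_squared(sigma, sigma_prime, rho):
    """Squared effective information flow rate of the combined sources (requires |rho| < 1)."""
    return (sigma**2 - 2.0 * rho * sigma * sigma_prime + sigma_prime**2) / (
        1.0 - rho**2
    )


# Scan sigma' for fixed sigma and rho and locate the least informative choice.
sigma, rho = 1.0, 0.5
grid = np.linspace(0.0, 2.0, 201)
rates = effective_rate_squared(sigma, grid, rho)
worst_sigma_prime = grid[np.argmin(rates)]
```

In this example the minimum sits at `sigma_prime = rho * sigma`, where the ratio reduces to `sigma**2`, so the extra source adds nothing. At `sigma_prime = 0` the ratio is `sigma**2 / (1 - rho**2)`, which exceeds `sigma**2`: a pure-noise source that is correlated with the market noise still helps, because it allows part of that noise to be cancelled.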

## 6. Innovations and the dynamics of informed valuations

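Our objective now is to obtain an innovations representation for the valuations made by the informed trader. For this purpose it suffices to derive the dynamical equation satisfied by the ‘insider’ valuation $\tilde B_{tT}$, or equivalently, by the conditional probability $\tilde\pi_{it}$. The calculation simplifies if we express (5.6) in the form

$$\tilde B_{tT}=P_{tT}\,\frac{\sum_ip_ix_i\exp\left[\frac{T}{T-t}\left(\hat\sigma x_i\hat\xi_t-\tfrac12\hat\sigma^2x_i^2t\right)\right]}{\sum_ip_i\exp\left[\frac{T}{T-t}\left(\hat\sigma x_i\hat\xi_t-\tfrac12\hat\sigma^2x_i^2t\right)\right]},\tag{6.1}$$

where $\hat\sigma^2=(\sigma^2-2\rho\sigma\sigma'+\sigma'^2)/\varrho$. Here the process

$$\hat\xi_t=\hat\sigma tX_T+\hat\beta_{tT}\tag{6.2}$$

represents the ‘enhanced’ information being effectively received by the informed trader, with the modified bridge noise

$$\hat\beta_{tT}=\frac{\sigma_1\beta_{tT}+\sigma_2\beta'_{tT}}{\varrho\,\hat\sigma}.\tag{6.3}$$

Applying Ito's lemma to (5.4) and making use of (6.2), we find that

$$d\tilde\pi_{it}=\frac{\hat\sigma T}{T-t}\,(x_i-\tilde X_{tT})\,\tilde\pi_{it}\,dZ_t,\tag{6.4}$$

where $\tilde X_{tT}=\mathbb{E}[X_T|\mathcal{G}_t]$. The process {*Z*_{t}} appearing in (6.4) is defined by

$$Z_t=\hat\xi_t-\int_0^t\frac{1}{T-s}\left(\hat\sigma T\tilde X_{sT}-\hat\xi_s\right)ds.\tag{6.5}$$

By following a line of argument similar to that presented in Brody *et al*. (2007), it can be shown that {*Z*_{t}} is a standard $\{\mathcal{G}_t\}$-Brownian motion. We deduce that the valuation process of the informed trader obeys the following dynamical equation:

$$d\tilde B_{tT}=r_t\tilde B_{tT}\,dt+\Gamma_t\,dZ_t,\tag{6.6}$$

where the volatility process {*Γ*_{t}} is given by

$$\Gamma_t=\frac{\hat\sigma T}{T-t}\,P_{tT}\,\tilde V_{tT},\tag{6.7}$$

and $\tilde V_{tT}$ denotes the variance of *X*_{T} conditional on $\mathcal{G}_t$.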

The fact that the information accessible to the informed trader can be ‘compactified’ into a single enhanced information can be understood as follows. Since the noise processes {*β*_{tT}} and have correlation *ρ*, one can write(6.8)where the Brownian bridge process is taken to be independent of {*β*_{tT}}. Similarly, we can decompose the extra information in the form(6.9)where and . It should be evident that the filtration generated jointly by is equivalent to that generated jointly by . However, the information processes {*ξ*_{t}} and have independent noises. To proceed, we note that the process {*δ*_{t}} defined by(6.10)is purely noise, and is independent of *X*_{T}. We can construct a new Brownian bridge that is independent of this noise. A standard orthogonalization shows that this is given by the bridge process defined by (6.3). It follows that the enhanced information process defined by (6.2) is independent of {*δ*_{t}}. Furthermore, the filtration generated jointly by is equivalent to that generated jointly by . Since {*δ*_{t}} provides no useful information about *X*_{T}, i.e. , the informed trader in effect has as the primary basis for valuation.

This line of argument, making use of the orthogonalization procedure, extends more generally to the case where there are multiple information processes relating to the cash flow *X*_{T}: starting with a family of processes with signal-to-noise ratios , one orthogonalizes the associated noises. The result is a new set of information processes with signal-to-noise ratios . Then the information process that the informed trader uses as a basis for valuation can be represented by a single *effective* information process with the enhanced signal-to-noise ratio .

## 7. Additional information held by the informed trader and statistical arbitrage strategies exploiting this

We are in a position to quantify the amount of excess information held by the informed trader above that of the market. This is measured by the difference of the mutual information:

$$\Delta J=J(\hat\xi_t,X_T)-J(\xi_t,X_T),\tag{7.1}$$

where $\hat\xi_t$ denotes the enhanced information process (6.2), with bridge noise $\hat\beta_{tT}$. By the argument in §3, the mutual information of the informed trader is given by an entropy difference of the form

$$J(\hat\xi_t,X_T)=H(\hat\xi_t)-H(\hat\beta_{tT}).\tag{7.2}$$

The entropy of the ‘insider’ information is determined by the marginal density of $\hat\xi_t$, whereas the conditional entropy is the entropy of a Brownian bridge.

Following the line of argument presented in §4 we are able to represent the mutual information difference in terms of the expected entropy differences:

$$\Delta J=\mathbb{E}[H_t]-\mathbb{E}[\tilde H_t],\tag{7.3}$$

where $\tilde H_t=-\sum_i\tilde\pi_{it}\ln\tilde\pi_{it}$ is the entropy associated with the conditional probabilities $\tilde\pi_{it}$ of the informed trader. This expression makes it apparent that Δ*J* is non-negative, since the entropy characterizes the amount of uncertainty concerning the value of the cash flow *X*_{T}, and for any *t*∈(0,*T*) this uncertainty is greater on average for the general market participant than for the informed trader. In figure 4 we plot an example of Δ*J*, indicating the way in which the excess information held by the informed trader changes in time.

Given that the informed trader is in general ‘more knowledgeable’ than the general market participant, it is natural to ask how this advantage can be turned into profit. One of the issues that can be addressed in this connection is the derivation of optimal trading strategies. For such an analysis, one may need to introduce additional structure into the problem in the form of a suitable criterion for optimality and a specification of the market price of risk. In the present investigation, we confine the discussion to a demonstration, supported by simulation studies, of how even very simple strategies can yield statistical arbitrage opportunities by outperforming the market.

For example, suppose we consider a strategy such that at some designated time *t*∈(0,*T*) a market trader purchases a digital bond if and only if at that time the bond price *B*_{tT} is greater than *KP*_{tT} for some specified threshold *K*. The value of *K* can be regarded as the risk aversion level of the trader. An informed trader follows the same rule, but makes a better estimate for the value of the bond, and hence purchases the bond if and only if . In either case a bond that is purchased will be held until maturity. That such a strategy leads to a statistical arbitrage opportunity for the informed trader can be seen as follows. We assume that the initial position of the trader is zero, i.e. purchase of a digital bond at *t* requires borrowing the amount *B*_{tT} at that time, and repaying the amount at *T*. Thus, the value of the market trader's portfolio at *T* is (7.4), whereas the terminal value of the informed trader's portfolio is (7.5). Consider now the present value of a security that delivers a cash flow equal to the excess profit and loss (P&L) generated by the strategy of the informed trader. By use of the tower property we have ; but (7.6), since the random variables *B*_{tT} and are both _{t}-measurable. If , then is non-negative, whereas if then is non-positive. It follows that is a non-negative random variable, and hence , since with probability greater than zero. Therefore, the informed trader can execute a transaction at zero cost that has positive value, and this is what we mean by ‘statistical arbitrage’.
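The sign argument can be made explicit in the following sketch. The notation is assumed for illustration: $\hat{B}_{tT}$ denotes the informed trader's valuation of the bond, $\mathcal{G}_t$ the informed trader's filtration, and $V_T$, $\hat{V}_T$ the terminal portfolio values of the market trader and the informed trader. It is further assumed that the discount factor $P_{tT}$ is deterministic and that $\hat{B}_{tT} = P_{tT}\,\mathbb{E}[X_T \mid \mathcal{G}_t]$.

```latex
\begin{align*}
\hat{V}_T - V_T
  &= \left( \mathbf{1}\{\hat{B}_{tT} > K P_{tT}\} - \mathbf{1}\{B_{tT} > K P_{tT}\} \right)
     \left( X_T - P_{tT}^{-1} B_{tT} \right), \\
\mathbb{E}\!\left[ \hat{V}_T - V_T \,\middle|\, \mathcal{G}_t \right]
  &= \left( \mathbf{1}\{\hat{B}_{tT} > K P_{tT}\} - \mathbf{1}\{B_{tT} > K P_{tT}\} \right)
     P_{tT}^{-1} \left( \hat{B}_{tT} - B_{tT} \right) \;\ge\; 0 .
\end{align*}
```

The two factors in the second line always share a sign. If $\hat{B}_{tT} > B_{tT}$, the informed trader buys whenever the market trader does, so the indicator difference is non-negative. If $\hat{B}_{tT} < B_{tT}$, the reverse holds and the indicator difference is non-positive.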

We have examined the P&L profile, both for a general market trader and for an informed trader, resulting from the repeated application of such a strategy. The results are plotted in figure 5. In particular, we consider 2000 realizations (sample paths) for the information processes governing the bond price valuations. For each fixed *t*∈[0,*T*], we calculate the total P&L for the informed trader and for the market trader obtained by applying the specified strategy to each of the 2000 independent sample paths. For each fixed *t*, we chart in figure 5 the difference between the total P&L of the informed trader and that of the market trader. Provided that the strategy is executed after enough time has passed for the informed trader to gain an informational advantage, we find that the difference between the P&L of the informed trader and that of the market trader is always positive. Furthermore, the qualitative behaviour of the resulting P&L difference is in agreement with the qualitative behaviour of the magnitude of the excess information possessed by the informed trader indicated in figure 4 (we have chosen the same parameter values for these two figures to allow for direct comparison).
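A simulation of this kind can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code used to produce figure 5: the cash flow is digital (*X*_{T}∈{0,1}), interest rates are deterministic so that the discount factor cancels from the threshold rule and from the repayment, and the market and extra noises are independent (*ρ*=0). The function names and the parameter values (`prior`, `sigma`, `sigma_extra`, `K=0.6`) are hypothetical.

```python
import numpy as np

def posterior_prob(prior, stat, snr_sq, t, T):
    """P(X_T = 1 | information at time t) for a cash flow X_T in {0, 1}."""
    log_odds = np.log(prior / (1.0 - prior)) + T / (T - t) * (stat - 0.5 * snr_sq * t)
    return 1.0 / (1.0 + np.exp(-np.clip(log_odds, -700.0, 700.0)))

def simulate(t, T=1.0, prior=0.5, sigma=1.0, sigma_extra=1.0, n_paths=2000, seed=0):
    """Draw cash flows and both traders' valuations at a time 0 < t < T.

    Valuations are expressed in units of the discount factor, i.e. as
    conditional probabilities that the digital bond pays out.
    """
    rng = np.random.default_rng(seed)
    x = (rng.random(n_paths) < prior).astype(float)
    sd = np.sqrt(t * (T - t) / T)                      # bridge std. dev. at t
    xi = sigma * t * x + sd * rng.standard_normal(n_paths)
    xi_extra = sigma_extra * t * x + sd * rng.standard_normal(n_paths)
    pi_market = posterior_prob(prior, sigma * xi, sigma**2, t, T)
    pi_informed = posterior_prob(prior, sigma * xi + sigma_extra * xi_extra,
                                 sigma**2 + sigma_extra**2, t, T)
    return x, pi_market, pi_informed

def total_excess_pnl(x, pi_market, pi_informed, K=0.6):
    """Total P&L of the informed trader minus that of the market trader."""
    buy_market = (pi_market > K).astype(float)
    buy_informed = (pi_informed > K).astype(float)
    # Either trader pays the market price, so a purchase nets x - pi_market at T.
    payoff = x - pi_market
    return float(np.sum((buy_informed - buy_market) * payoff))

if __name__ == "__main__":
    for t in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(t, total_excess_pnl(*simulate(t)))
```

Both traders transact at the market price; only the purchase decision differs. The excess P&L is therefore non-zero only on paths where the two decisions disagree, which is where the informed trader's better estimate matters.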

Our objective has been to demonstrate how statistical arbitrage strategies arise in a market characterized by heterogeneous information flow. It is interesting that a qualitatively similar behaviour for the excess information and the excess P&L is observed in the case of the rather primitive strategy we have considered. There are many ways in which one can improve upon the trading strategy examined above. An important open issue is to determine the optimal trading strategy, subject to suitable optimality criteria, that exploits the excess information. We conclude by remarking that a related approach to the modelling of insider trading within the information-based framework is suggested in Macrina (2006), where the asset return is modelled as being dependent on more than one market factor, for which only some of the associated information processes are accessible to the market. It would be of interest to examine whether the kind of hedge fund strategy considered here is also applicable in a setup involving multiple market factors.

## Acknowledgments

We thank J. Z. Kelly, A. Macrina, B. K. Meister and M. F. Parry for stimulating discussions.

## Footnotes

- Received November 14, 2008.
- Accepted November 17, 2008.

- © 2008 The Royal Society