## Abstract

The aim of the study is to propose a technique for the retrieval of point sources of atmospheric trace species from concentration measurements. The inverse problem of identifying the parameters of a point source is addressed within the assimilative framework of renormalization recently proposed for the identification of distributed emissions. This theory has been extended for the point sources based on the property that these are associated with the maximum of the renormalized estimate computed from the observations. This approach along with an analytic dispersion model is used for point source identification, and the sensitivity of the samplers is described by the same model in backward mode. The proposed technique is illustrated not only with synthetic measurements but also with seven sets of observations, corresponding to convective conditions, taken from the low-wind tracer diffusion experiment conducted at the Indian Institute of Technology Delhi in 1991. The position and intensity of the source are retrieved exactly with the synthetic measurements in all the sets validating the technique. The position of the source is retrieved with an average error of 17 m, mostly along wind; its intensity is estimated within a factor 2 for all the sets of real observations. From a theoretical point of view, the link established between point and distributed sources clarifies new concepts for the exploitation of monitoring networks. In particular, the influence of the noise on the identification of a source is related to the relative visibility of the various regions described with a renormalizing weight function. The geometry of the environment modified according to the weights is interpreted as an apparent geometry. It is analogous to the apparent flatness of the starry sky in eye's view, usually considered an impression rather than a scientific fact.

## 1. Introduction

In this paper, we investigate the identification of point sources of atmospheric trace species based on an assimilative technique related to the more general problem of distributed emissions. The point of view is mainly theoretical to explain the geometric concepts associated with this recent theory (Issartel *et al*. 2007). In general, the releases of the species are not directly observable and must be deduced from concentration measurements by solving an inverse problem. The proposed inversion technique applies when the emissions and concentrations are related linearly. It can be handled with concentrations measured by monitoring networks of all scales, from local to global, provided that the appropriate atmospheric dispersion model is available. It is illustrated here at a local scale with an analytic dispersion model for low-wind unstable steady conditions and the seven runs of the Indian Institute of Technology (IIT) Delhi tracer diffusion experiment (Singh *et al*. 1991) corresponding to these conditions.

The problem of identification of point sources has raised much attention among researchers (Rao 2007). It is often related to safety issues for species accidentally released in the air, for instance, from an industrial leakage (Thomson *et al*. 2007) or nuclear testing. A point source is characterized by its position, time and intensity, and thus identification is a parametric problem. If the corresponding number of measurements is available, success is limited only to the noise in the data. In general, it is addressed as a least-squares problem by comparing the observed concentrations to those ideally associated to the various values of the parameters and computed by an adjoint dispersion model (Seibert 2001; Penenko *et al*. 2002; Davoine & Bocquet 2007). On the other hand, the identification of the distributed emissions is an estimation problem called assimilation of data (Bouttier & Courtier 2001). These more general sources should be described as functions of space and time equivalent to a very large or infinite number of parameters. As the number of concentration measurements is limited, a complete identification is not feasible. The assimilative methods proposed so far (Mulholland & Seinfeld 1995; Peylin *et al*. 2001; Meirink *et al*. 2006) are not helpful in the resolution of the above parametric problem, so that the identification of point sources is most often addressed in a specific theoretical framework.

Here, robust relationships are established between the inverse problems for point and distributed sources within the framework of renormalization. We originally introduced this assimilative technique to estimate distributed emissions based on a geometric transformation of space and time called renormalization and associated with a visibility function. This function indicates how well a possible location of a release is observed and, thus, it provides a simple criterion for optimizing the design of the monitoring network. The geometric tools associated with the proposed theory are fully and easily realizable, whereas in the Bayesian framework, the elicitation of the background error covariance matrix is impossible in practice and always subject to simplifications. The special behaviour of renormalized estimation with respect to point sources proves to be especially helpful in understanding the extent of applicability of renormalization theory for the retrieval of the emissions of the atmospheric species. In particular, it helps in understanding some unexpected concepts associated with this new theory.

## 2. Inverse modelling

Inverse modelling denotes any technique used for identifying the emissions of a trace species from a set of *n* concentration measurements *μ*_{1}, *μ*_{2}, …, *μ*_{n}. Here, we use a technique based on the concept of *renormalization* described by Issartel *et al*. (2007). The technique returns an emission's estimate linear with respect to *μ*_{i}. It was primarily designed for the estimation of distributed sources and, at first sight, may look inappropriate for the identification of point sources. The renormalized estimate is a combination of *n* base functions with no particular focus when the measurements originate from a point source. However, in that case, the maximum value of the estimate coincides with the position of the source. This allows identification of the source from a set of concentration measurements. The coincidence is perturbed by the various errors in the measurements, and the perturbation influences significantly the estimated location of the source as the maximum is flat.

### (a) Normed space of continuous ground sources

Renormalized inversion is based on the choice of a *fundamental* vector space of emission functions acceptable for estimation, and this vector space is associated with a fundamental scalar product. In the present study, we shall be concerned with the continuous ground-level point sources generating a steady-state concentration field from which the measurements *μ*_{i} are taken. However, these do not constitute a vector space. We then choose as fundamental the vector space of all continuous ground-level functions *s*, such that *s*(*x*, *y*) can be given the meaning of a rate of release of the species per unit time and area around the point at horizontal coordinates (*x*, *y*). The vector space is associated with the fundamental product(2.1)in which *Σ* is the surface area on the ground. In equation (2.1), all portions of the surface *Σ* are equally weighted, implying that no specific region is given a preference as a possible location of a release.

Given the fundamental product, the correspondence between a source *s* and the measurements *μ*_{i} under steady-state conditions is described by introducing adjoint functions *a*_{i}(*x*, *y*) such that(2.2)

### (b) Computation of the adjoint functions

The adjoint function *a*_{i} introduced in equation (2.2) is a key element to describe the link between the measurement and the source (Marchuk 1992). Here, *a*_{i} is derived as *a*_{i}(*x*, *y*)=*r*_{i}(*x*, *y*, 0), in which *r*_{i}(*x*, *y*, *z*) is adjoint to the measurement *μ*_{i} for a more familiar product, facilitating the computations, defined as for continuous volume sources *σ*(*x*, *y*, *z*) per unit mass of air and per unit time; *ρ*(*x*, *y*, *z*) denotes air density. This product provides a symmetric relationship for forward and adjoint steady-state transport, so that corresponding adjoint functions may be termed as *retroplumes* (Hourdin & Issartel 2000)(2.3)in which is the wind vector; *Χ*(*x*, *y*, *z*) is the steady-state concentration generated by a continuous source *σ*(*x*, *y*, *z*); *π*_{i} is the sampling function associated with the measurement *μ*_{i}, such that *π*_{i}(*x*, *y*, *z*) is the proportion, in the sample, of the air taken at point (*x*, *y*, *z*). The diffusion operator *ζ* is assumed self-adjoint, based on the assumption of a time-symmetric turbulence that gives (Issartel & Baverel 2003)(2.4)where ** K** is the diffusion tensor and

**is the gradient operator.**

*∇*The variables *μ*_{i}, *Χ*, *π*_{i} and *r*_{i} are related as(2.5)By expressing in equation (2.5) the source *σ* as , in which *δ* is the Dirac delta function, one gets(2.6)implying that the adjoint function *a*_{i} for the fundamental product (equation (2.1)) coincides with the retroplume *r*_{i} at the ground level.

### (c) Geometric weights: the renormalization

In this and §2*d*, we describe the process of transforming a set of observations *μ*_{1}, *μ*_{2}, …, *μ*_{n} into an estimate of the unknown source without assuming a point source. In this process, as shown in our earlier paper, a straightforward use of the fundamental product is inappropriate. The estimate is a linear combination of the *a*_{i}, which is revealed to be arbitrary and artificial in the case of measurements at a point (Issartel 2005). In that case, the adjoint functions become singular at the location of the respective samplers, where the estimate may then take the value either infinity or zero. Such artefacts are taken care of by means of geometric weights *f* such that(2.7)the weights are non-negative and, in order to focus on their distribution, the total is fixed, equal to the number of measurements. They are used to define a modified scalar product with modified adjoint functions(2.8)According to this product, a source function decomposes as *s*=*s*_{∥f}*+s*_{⊥f} with parts, respectively, parallel and perpendicular to the *a*_{fi}. The parallel part is same for all source functions following the given measurement, *μ*_{i}=(*s*, *a*_{fi})_{f}=(*s*_{∥f}, *a*_{fi}) (equation (2.2)) as the perpendicular part does not contribute: (*s*_{⊥f}, *a*_{fi})_{f}=0. Among these functions, *s*_{∥f} has minimum norm (*s*_{∥f}, *s*_{∥f})_{f}. Its expression as a linear combination of the *a*_{fi} is derived as(2.9)in which ** μ** and

*a*_{f}(

*x*,

*y*) are, respectively, the column vectors of the measurements and modified adjoint functions; superscript ‘T’ denotes the transposition.

To eliminate artificial information, the weights are chosen such that (Issartel 2005)(2.10)The weights *φ* subject to equation (2.10) are called *renormalizing* weights. The function *φ* is also referred to as a visibility function, and it will be useful in the physical interpretation of the extent of the regions seen by the monitoring network. Its computation is described in §4.

### (d) Retrieval of a point source

Suppose that a concentration vector ** μ** has been observed and, for some reason, it is known that it has been generated from a point release. Is it possible to identify the location of this release from the observations? A positive answer is given in this section for ideal measurements, without errors. The location of the release coincides with the location

*x*_{0}=(

*x*

_{0},

*y*

_{0}), for which the estimate

*s*

_{∥φ}(

*x*,

*y*) computed from the observations (equation (2.9)) becomes maximum.

To see that, a point source is considered at location *x*_{0} with intensity *q*. The measurements generated by it are deduced from equation (2.2)(2.11)The signal ** μ** is detectable provided that the source is located in an observable region , and equation (2.9) yields out of it a source estimate(2.12)in which (

**)=(**

*x**x*,

*y*). This procedure is intended for the estimation of distributed areal emissions and, as a matter of fact, the estimate

*s*

_{∥φ}in equation (2.12) is a function distributed throughout the domain. It gives no indication that the measurements have been caused by a point source. However, if we have some reason to consider that the measurements have indeed been caused by a point source, its position and intensity are revealed by

*s*

_{∥φ}as follows.

As is a positive definite symmetric matrix (Boseck 1967), it may be regarded as defining an *n*-dimensional scalar product, so that the application of the Cauchy–Schwartz inequality to equation (2.12) yields the following:(2.13)In this inequality, both the expressions under the square root symbol are subject to the renormalizing condition (2.10), and we thus conclude that (Issartel & Burkhart 2006)(2.14)The maximum value of *s*_{0}(** x**) or

*s*

_{∥φ}(

**) is obtained for the location of the release. This allows to deduce the location of the source from the observations. Then, its intensity is derived from the relationship(2.15)obtained from equation (2.11) with the renormalizing condition (, equation (2.10)).**

*x*### (e) Accuracy of point source retrieval

In reality, the measurements include a noise perturbing the computation of the source estimate (equation (2.9)) with its maximum shifted from *x*_{0}. Contrary to the ideal data, **μ**_{r} is probably not proportional to any vector *a*_{φ}(** x**). The estimated location

*x*_{e}of the point source maximizing also minimizes the distance of the noisy data to a vector

*a*_{φ}(

**). In this, the part of the noise perpendicular to the cone generated by the**

*x*

*a*_{φ}(

**) in the**

*x**n*-dimensional real vector space

^{n}is corrected.

A flat maximum of the estimate is more susceptible to the noise than a peaked one. This is related to the visibility weights *φ*. The following result is obtained using the expression (2.12) of *s*_{0} after recognizing in the expression the definition of *H*_{φ} (see equation (2.9))(2.16)Since the total value of the integral is unity and the integrand is positive, this integrand takes either relatively large values over a smaller domain or small values over a larger domain. Note also that |*s*_{0}|≤1 and *s*_{0}(*x*_{0})=1 (see equation (2.14)). This allows us to conclude that the maximum of *s*_{0} will be either sharp, corresponding to a relatively large value of *φ*(*x*_{0}), or flat, corresponding to a relatively small value of *φ*(*x*_{0}).

Thus, when the function *φ* is weak, the location of the source may not be characterized accurately.

## 3. Diffusion experiment

Inverse modelling requires measurements of atmospheric concentration fields with sufficient accuracy. However, such concentration measurements are expensive, and are typically available only on a sparse spatial and temporal network. In the present study, we have used the observations taken in the tracer diffusion experiment conducted for surface releases of tracer SF_{6} in February 1991 at IIT Delhi (28°52′ N, 77°18′ E) in low-wind conditions to evaluate the inversion technique for retrieval of a point source. Details of the experiment including atmospheric stability are given in Singh *et al*. (1991) and Sharan *et al*. (1996). The tracer was released at a height of 1 m above the ground, and the samplers were also placed approximately at the same height. For the computational purposes here, the release is considered from a ground source and the measured concentrations are at the ground *z*=0.

In all, 14 test runs were conducted. Only seven (runs 1, 6, 7, 8, 11, 12 and 13) corresponding to the unstable steady conditions required by our dispersion model (§4) have been chosen for the analysis. Run 2 was ignored due to a relatively large variability in wind direction and the remaining runs because of neutral and stable conditions. In runs 1, 6, 7 and 11, the release point was located at the centre of the circular arcs, whereas in the remaining runs (8, 12 and 13), the release point was shifted 100 m towards northwest. In runs 1, 8, 12 and 13, the release rate was 5000 μg s^{−1}, while in the remaining runs (6, 7 and 11), it was 3000 μg s^{−1}. In run 1, the tracer was released continuously, and the air was collected for a period of 30 min. In the remaining runs, release periods were of 60 min duration with sampling during the later 30 min. The sampling network involved 20 receptors on 50, 100 and 150 m, and in some cases on 200 m circular arcs with 45° angular spacing between them (figure 1). When the source is shifted from the centre onto the arc at 100 m, the corresponding sampler is moved to the centre. In some of the runs, a few measurements are discarded (Sharan *et al*. 1996), thus reducing the effective numbers of samplers in the monitoring network.

Wind and temperature measurements were obtained at four levels (2, 4, 15 and 30 m) from a 30 m micrometeorological tower. The values of wind speed, wind direction, atmospheric stability and mixing height required in the dispersion model are taken from Sharan *et al*. (1996).

## 4. Description of the computations

An analytic low-wind dispersion model (Sharan *et al*. 1996) developed for point source emissions, based on steady dispersion equation (2.3), is used in backward mode for the computation of the functions adjoint to the measurements. In principle, this model is applicable for unstable, neutral and stable conditions, but adequate data are not available under neutral and stable conditions. Hence, renormalized source estimation has been carried out for the seven runs corresponding to unstable steady conditions. For each run, two inversions are done by considering (i) the complete set of concentrations observed and (ii) the analogous set of synthetic measurements generated from the dispersion model. Synthetic data minimize the errors associated with the dispersion model and the measurements. This helps in validating the inversion process as well as in interpreting the results with the real data. To illustrate the performance of the dispersion model, a comparison of synthetic and real data is given in table 1 for representative run 1. It shows differences within a factor two for the most significant values, except those obtained from samplers located at the edges of the plume (sampler 1). The effect of this error will be addressed qualitatively.

Inversion mainly consists of the following steps: (i) determination of the adjoint function *a*_{i}(*x*, *y*) from the dispersion model, (ii) computation of the renormalizing function *φ*, (iii) computation of the renormalized estimate according to equation (2.9), and (iv) identification of the source, its location corresponding to the maximum of the estimate and intensity being computed from equation (2.15).

For step (i), since forward and adjoint steady-state transports are symmetric (equation (2.3)) and each measurement represents the concentration sampled at a point, the dispersion model is used in the computation of the adjoint functions by simply changing the direction of the wind by 180°. Intensity *q* is taken as unity, released from the position of the sampler. Note that adjoint functions for a given run are superposable on each other with reference to the sampler at origin. Accordingly, in each given run, all adjoint functions are deduced from a single computation.

For step (ii), we are not aware of any analytical construction of *φ*. The weights *φ*, subject to the renormalizing condition (equation (2.10)), are computed with an algorithm (Issartel 2005; Issartel *et al*. 2007)(4.1)The regions not fulfilling the condition with threshold *ϵ* in equation (4.1) are, in fact, eliminated from the inversion process. This condition is aimed at avoiding divisions by small numbers in the numerical process. The computation of the vector , of elements , involves a division by *f*_{k}(** x**). Thus, the algorithm cannot take care of the points not or poorly seen. These points are not known in advance; however, we suspect that a region with relatively low values of all the adjoint functions

*a*

_{i}is poorly illuminated. The threshold

*ϵ*is chosen semi-empirically to be very small (a factor 10

^{−9}) compared with the maximum value of . The choice is supported afterwards by verifying that eliminated regions do not influence the regions corresponding to the high values of the converged weights. The numerical implementation of the algorithm requires a discretization of the domain.

In spite of the availability of *a*_{i} in analytic form, a square domain of size 995×995 m is discretized into 199×199 grid points for the inversion process. Each mesh is a square of 5×5 m. The centre of the arcs is at the grid point (100, 100). Depending on the run, source is either at (100, 100) or (86, 114). For discretization, we note that *a*_{i} is singular at the position of the sampler. In order to resolve this singularity, we compute average values in the mesh containing the sampler and the neighbouring meshes. For this purpose, each of these meshes is further subdivided into 99×99 grid points.

The threshold in equation (4.1) was taken as *ϵ*=3.6×10^{−12} kg^{−2} in all computations. The computations for run 1 have also been carried out using *ϵ*=3.6×10^{−10} kg^{−2}.

## 5. Results and discussion

The computations have been carried out for (i) synthetic data and (ii) real data taken from the IIT diffusion experiment. With the synthetic data, the inversion technique proposed here is able to reproduce the location and intensity of the source exactly. The technique is also able to retrieve approximately the emitted point source in the case of real data subject to noise.

Estimates are commonly represented in the form of isopleths drawn in the traditional *xy*-plane, referred to as *usual geometry*. The renormalization function *φ* is a pivotal element of the present study. Some features of the estimates, observed in the usual geometry, become more understandable if they are represented in a transformed framework, called *renormalized geometry*, with areas drawn in proportion to *φ*. The transformation, made in polar coordinates, is purely radial (*l*, *θ*)→(*l*′, *θ*), such that d*l*′^{2}=*φ*(*l*, *θ*)d*l*^{2}. Practically, the images of the estimates are first prepared for the usual geometry in vector form, isopleths being described as sequences of points. The samplers and isopleths are then transformed point by point into the renormalized geometry. In all runs, the pole is chosen at the centre of the monitoring network. Hence, the angles seen from the centre are preserved, which helps in reading the correspondence between the usual and renormalized representations. The total renormalized area is finite as (equation (2.7)), and its total extent is captured in the figures.

### (a) Renormalizing weights and renormalized representation

The renormalizing function *φ* computed for each run is represented in figures 2*a*(i)–*c*(i) and 3*a*(i)–*d*(i) for all runs. The area of the region seen by the samplers is almost similar in all the runs and oriented in the direction of the mean wind. We observe that *φ* is peaked at the position of the samplers. It decays upwind of the monitoring network and becomes negligible after approximately 500 m. A source located downwind of the monitoring network cannot even be detected, and we observe that the visibility *φ* vanishes immediately in that region. There is a difference, from a theoretical point of view, between the region weakly seen far away upwind of the monitoring network (*φ*>0 with *φ*≈0) and the region not seen at all downwind of it (*φ*=0). A source far away in the upwind direction will be detected, but the signal will be weak and poorly exploitable because the various samplers will see the almost same homogenized concentrations: the location of the source will not be identifiable.

The peaked values of *φ* around the samplers are seen in the form of eight or nine branches of the renormalized domain. These branches correspond to the eight directions in which the samplers are regularly arranged around the centre of the monitoring network. The ninth branch is related to a sampler isolated in the southeast direction from the centre.

### (b) Synthetic data

#### (i) Identification of a point source

For all runs, the maximum of the estimate computed from the synthetic data is found exactly at the grid point corresponding to the prescribed location of the point source in both central and shifted cases. In runs 1, 6, 7 and 11, the location of the source is retrieved at the grid point (100, 100), whereas in the shifted runs 8, 12 and 13, it is found at the point (86, 114). Also, for all the runs, the computed intensity of the source is found to be the same, up to a round-off error, as that prescribed, 3000 μg s^{−1} in runs 6, 7 and 11 (in fact, 2999.8, 2999.7 and 2999.7 μg s^{−1} were retrieved, respectively) and 5000 μg s^{−1} in runs 1, 8, 12 and 13 (4999.5, 4999.7, 4999.7 and 4999.7 μg s^{−1}). This very good result is not surprising, it just reflects the mathematical consistency of the theory as well as the synthetic data used here.

#### (ii) Extension of the estimate upwind

In figure 3*a*(ii), we note that the estimate is composed of two parts. The first part, where the estimate has its maximum, is around the prescribed source surrounded by the monitoring network. Detached from it is the second part of the estimate, with considerable extension upwind of the monitoring network. The physical meaning of this feature is not obvious.

The classical Bayesian assimilation endeavours to avoid such features (Bouttier & Courtier 2001) by minimizing the statistical expectation of the quadratic difference between the unknown source *s*_{r} and its estimate *s*_{e}. The constraint is imposed homogeneously throughout the domain. We are of the view that this will, in fact, degrade the quality of the inversion in the region close to the samplers.

In the renormalized geometry (figure 3*a*(iv)), the first part of the estimate is magnified, but the second one shrinks along the northeastern boundary of the domain. Hence, this extension falls in a weakly illuminated region with a marginal influence on the value of measurements. It is thus considered an unimportant feature of a globally optimized estimate. Similar behaviour is seen in all the runs having the source at the centre (figure 3*a–d*).

When samplers are present upwind of the source (runs 1, 6, 7 and 11), they observe there relatively low concentrations of SF_{6} providing valuable information about the origin of the release. As a result, the part of the estimate from which the location of the source is identified is clearly demarked from the part upwind of the monitoring network (figure 3*a*(ii)–*d*(ii)). On the other hand, in the shifted runs, no sampler is available upwind of the source, and this is the reason why the estimate extends continuously from its maximum to the upwind direction (figure 2*a*(ii)–*c*(ii)).

### (c) Real data

The observed concentrations from IIT tracer diffusion experiment are subject to the instrumental errors associated with the samplers and representativity errors associated with the dispersion model and the input meteorological observations, altogether usually termed the measurement errors. As a result, the identification of source location and intensity becomes uncertain. Representativity errors are certainly prevailing, but their probability distribution cannot be determined from the data. The magnitude of the errors, comparable to the measurements, is illustrated in table 1.

#### (i) Identification of a point source

In run 1, the source estimate computed from the real data has its maximum at the grid point (99, 102) (table 2). Hence, the source is retrieved roughly 10 m away from its real location, corresponding to the grid point (100, 100), and the shift is in the upwind direction. We observe that the estimate *s*_{∥φ} takes similar values, 0.338 and 0.337 μg m^{−2} s^{−1}, respectively, at these two points. Similar features are observed in all runs—source at the centre or shifted. The estimated position of the source is shifted either upwind (runs 1, 7, 11 and 12) or downwind (runs 6, 8 and 13) of the real location, at a minimum distance of 5 m in run 6, a maximum distance of 30 m in run 7 and an average distance of 17 m. The difference between the maximum value of the estimate and its value for the real position is less than 0.5 per cent in runs 1, 6, 8 and 13 and less than 10 per cent in the other runs (table 2). The distance between the estimated and the prescribed locations of the release is similar, in average, for the runs with central source (14 m) and the runs with shifted source (20 m). The estimates have their maximum elongated along the wind in such a way that its position in that direction is markedly susceptible to the noise contained in the data.

The intensity of the source is estimated within a factor 2 for all runs from equation (2.15), , using the noisy observations and the estimated source location. If the estimated location of the source falls in a region better (less) illuminated than the real source, this tends to cause an underestimation (overestimation) of the intensity. The comparison of *φ* for both calculated and real positions of the source (table 2) shows that this effect is dominant in all runs, except run 6. Analogous errors are observed in the size of an object and the corresponding distance viewing with the eyes. A region weakly (strongly) illuminated may thus be interpreted as far from (close to) the monitoring network.

#### (ii) Separation of noise and signal

The source estimate computed from the data observed in run 1 (figure 3*a*) is similar to that computed from the synthetic data. It is still composed of two parts around the centre of the monitoring network and upwind of it (figure 3*a*(iii)). We observe that the estimate obtained for the real data (figure 3*a*(iii)) is relatively larger than that obtained for the synthetic data (figure 3*a*(ii)).

The resemblance of the estimates is emphasized in the renormalized geometry (figure 3*a*(iv)(v)–*d*(iv)(v)). Both of them are mostly focused at the centre and in the southern branch of the renormalized domain. As an effect of the noise contained in the real data, the corresponding estimate slightly spreads in the other branches (figure 3*a*(v)–*d*(v)). Hence, the inversion technique seems to have the potential for distinguishing the signal from the noise.

Similar features are observed in all runs, be the release at the centre or shifted. In run 6, the intensity is overestimated, in spite of the fact that the estimated location has a larger visibility *φ* than the real one (table 2). This suggests that the concentrations observed in this run are relatively noisy, which is surprising because the run is also associated with the least error of source location. In the renormalized framework (figure 3*b*(v)), the importance of the noise is reaffirmed by the spread of the estimate in all the branches, and this may be the reason that the location of the source is not affected appreciably.

Note that in run 8, the source location is estimated with a relatively large error of 29 m. With a slight correction of 10° in the wind direction, it is observed that the error in the location of the source reduces to 11 m.

The identification of the source in all the runs using real data not only reaffirms the use of the inversion based on renormalization, but also the quality of the dispersion model used for reproducing the dispersion of a tracer in low-wind conditions. In spite of unavoidable significant measurement errors between observed (real) and predicted (synthetic) concentrations, the dispersion model provides the relevant information extracted and exploited by the renormalized inversion.

## 6. Issues and limitations

So far, we have presented the results with both synthetic and real data for a single release, and the inversion technique is working well. Now, we would like to discuss underlying theoretical issues and the limitations associated with the theory, data and atmospheric dispersion model.

### (a) Renormalized framework

A unique feature in this study is the illustration of the results in the renormalized geometrical framework highlighting (i) the relative importance of the various regions in the inversion and (ii) the influence of the noise on the identification of a point source. In this geometry, regions weakly seen by the monitoring network, far away or downwind of it, simply disappear and they do not affect the interpretation of the data. This allows some flexibility in choosing the threshold *ϵ* used in the computation of *φ* (equation (4.1)). Increasing this threshold implies that more of the regions weakly seen are withdrawn from the inversion. Figure 4 shows that, when *ϵ* is increased by a factor 100, inversion results in run 1 are appreciably altered only in regions weakly seen, be the data real or synthetic (compared to run 1 in figure 3). In particular, the computational window should not be much larger than the region well illuminated.

In the Bayesian framework, the difficulty with the remote regions is accounted for by considering that the estimation is subject to errors. These errors are described using high-dimensional probability distributions that, in practice, are neither observed nor fully computed (Elbern *et al*. 2007).

Using the renormalized representations, we have also noted that the source estimate computed from the real data is spreading widely throughout the domain. This spread corresponds to the noise and can be analysed in the renormalized framework. It shows that our inversion technique is able to distinguish the structured signal associated with a point release and an unstructured signal associated with a noise.

### (b) Analogy with sky in eye's view

The above results suggest that the renormalized geometry is, in fact, familiar to us. In viewing the stars from the eyes, we have no sensation of their distance, just as if they were lying on a remote sphere. We are so used to it that we do not even try to make a Euclidean representation: one could just represent the stars as sharp cones filling to infinity the solid angle under which they are seen. These cones are similar to the upwind extensions of the estimates in the weakly illuminated regions of the fundamental geometry. In the renormalized geometry, these regions are condensed at an edge analogous to the sky. Based on this concept, one should arrange the samplers in such a way that the region of interest is not shrunk.

### (c) Intensity of the point source

The intensity of the source is calculated here using equation (2.15), and we now discuss other approaches that may be addressed for its computation. Recently, Issartel *et al*. (2007) have retrieved the intensity of distributed areal emissions using synthetic satellite data by integrating the estimated source over the Indian region. It appears that the success of this technique was related to (i) the homogeneous illumination of the whole domain provided by the satellite and (ii) the smoothness of the release to be estimated. These conditions are not fulfilled in the present study.

Here, the addition of the fluxes estimated from the synthetic concentration measurements on whole-ground domain, , is clearly overestimating the intensity of the release except for run 1 (table 3). The method does not work well, because this addition of fluxes in regions—some well illuminated and others weakly illuminated—does not appear to be a well-defined physical quantity. To overcome this, the integration of the fluxes, , is done on a smaller domain *Σ*′ homogeneously well illuminated, in all the runs, within 50 m of the prescribed release. The intensity is now underestimated by a factor 4 or more (table 3). This is due to the reason that a point source is a singularity in the domain, and this does not agree with the statistical hypothesis underlying the estimation procedure with weights *φ* (Issartel 2005). This approach should not be used to infer the intensity of a point source.

### (d) Comparison with a least-squares estimation

A more classical estimation of source location ** x** and intensity

*q*was prepared in each run from the concentrations

*μ*

_{i}really observed by minimizing the cost function . This is a least-squares estimation based on the hypothesis that the various measurement errors are uncorrelated with equal variance. The result (table 2) is slightly inferior to that obtained from the renormalized inversion. The estimated location of the release is found at 20.5 m, on average, of the real location, instead of 17 m with the proposed technique. A similar trend is observed for the estimated intensity. When the data are noisy, it is essential to describe a probability distribution of the real emissions around the estimated values (Diggle & Ribeiro 2007). In this, the information known

*a priori*about the emissions must be taken into account. The above cost function does not allow this and, in the Bayesian framework, it is complemented with a background term(6.1)in which, according to the discrete notations generally used, the source function

*s*is regarded a column vector of discretized emissions; the matrix

**corresponds to the adjoint functions, in such a way that**

*A*

**A***s*represents the measurements ideally associated with the emissions (equation (2.2));

**is the covariance matrix of the measurement errors; and**

*Q***is the background error covariance matrix describing the uncertainties about the distribution of the emissions.**

*B*These techniques raise the following difficulties. (i) The identification of point and distributed emissions is not consistent. To identify a point emission in the Bayesian framework, the cost function should be minimized among point sources only, because the distributed source minimizing is not expected to have any special helpful property. (ii) In spite of enduring efforts, the practical determination of the background error covariance matrix ** B** remains an open problem. Furthermore,

**is defined with reference to a first guess of the emissions, but there is no basis in the present application for such a guess prior to the measurements. (iii) The determination of measurement errors covariance matrix**

*B***is generally considered relatively simple. This is true as far as the instrumental errors with the samplers, uncorrelated in principle, are concerned. However, the representativity errors associated with the dispersion model are certainly prevailing with complex correlations, but they have never been investigated conclusively.**

*Q*In renormalized assimilation, the quality of an estimation is addressed with different concepts, and so the above difficulties are avoided.

### (e) Feasibility of source identification

In the proposed framework, contrary to the Bayesian framework, the information known *a priori*, described by the fundamental geometry, is not of a statistical nature, and thus the estimate is derived directly from the measurements. The correspondence can be seen as a coordinate change, such that the coordinates of the point source are read directly from the direction of the measurement vector ** μ** (equation (2.11)). A noise in the measurements and its probability distribution is just forwarded to the estimated source location with the partial correction described in §2

*e*. The important limitation is related to the visibility described by the renormalizing function

*φ*. If the source is located in a region weakly visible, the maximum of the estimate

*s*

_{∥φ}(equation (2.9)) at source location is flat, indicating that the correspondence between

**and**

*x*

*a*_{φ}(

**) is relatively inaccurate. The release location is uncertain within an area proportional (i) quadratically to measurement errors and (ii) inversely to local weight**

*x**φ*. In the case of the IIT tracer diffusion experiment, estimation errors are mostly along wind and this is indicated by the elongation of the maximum of the estimate in that direction. Additional samplers would have been useful to observe the weak concentrations upwind of the source in the shifted location. The visibility function

*φ*depends on the meteorological conditions and the design of the monitoring network. In practice, we can improve it only by modifying the network, with more samplers in the region of interest. The improvement is not evenly distributed but, globally, it is in proportion to the number of detectors (, equation (2.7)).

### (f) Vertical and temporal dimensions

In the present study, the assimilation process was carried out in the horizontal *xy*-plane. Time has not been considered as a variable, owing to the continuous nature of the release and steady-state concentration measurements. The vertical dimension was also eliminated by focusing the inversion on the ground only, thus using our knowledge that the real source was not elevated. These simplifications are taken within the framework of the IIT tracer diffusion experiment, and provide not only a deep insight into the interpretation of the results, but also that the computational cost of the inversion decreases significantly. However, they are not essential to the theoretical framework presented in this paper. The computational time is proportional to the number of grid points in the domain of assimilation, and this number will be multiplied by the number of points required in the description of the vertical or temporal axes. The inversion is expected to be relatively less accurate in such a large domain. We wish to point out that, when the temporal dimension is included in the inversion, the present theory is applied to deal with an instantaneous point source.

### (g) Experimental data

The IIT tracer diffusion experiment has certain limitations, for example the sampling network was too sparse, angular distance of 45° was too large, 30 m micrometeorological tower was not on site, the response of its wind measuring instruments might be poor in light wind conditions and turbulence measurements (Sharan *et al*. 1996) were not available. Though the micrometeorological tower was located approximately 300 m away from the release point, measurements from it have been taken as representative of on-site. Here, we have considered the runs corresponding to low-wind convective conditions for the identification of the source. However, occurrence of low and variable wind conditions leads to irregular and indefinite diffusion of pollutant. The situation becomes more complex in stable conditions where the mixing due to turbulence is suppressed. Owing to non-availability of data, we have not carried out the source identification in these conditions. To capture the concentration pattern better and thus for a more reliable source identification, we should have samplers with lesser spacing between them and on-site meteorological measurements from both slow and fast response sensors in unstable as well as stable conditions.

## 7. Conclusions

A technique has been proposed and evaluated for the identification of point emissions. It was illustrated by using (i) measurements of seven runs, corresponding to convective conditions, from the IIT tracer diffusion experiment, each run with 20 samplers within 200 m of a continuous release of SF_{6}, (ii) the analytic model for atmospheric dispersion and (iii) synthetic measurements generated with the dispersion model. The identification of a point source is a parametric problem, simple compared to the more general problem of identifying a distributed areal source. The first one is often related to safety issues for species accidentally released into the air and the second to more fundamental issues to assess the fluxes of the trace species, in particular the anthropogenic emissions of green house gases. The methods proposed so far in the literature for solving both inverse problems are distinct from a theoretical point of view, even if the assimilative techniques relevant for the distributed areal emissions have been applied regularly with experimental data produced from a point source. The theory of renormalization extends for the point sources, based on the property that these are associated with the maximum of the renormalized estimate computed from the observations. In other words, the impulse response of the assimilation is characterized by its maximum at the point of impulse. This property is not supported by the Bayesian framework used classically (Cohn 1997; Kalnay 2003). Following this approach, when synthetic data generated from the dispersion model are used in place of the real noisy observations, the position and intensity of the source are retrieved exactly in each run. When the real data are used, the position of the source is retrieved with an average error of 17 m, mostly along wind; its intensity is estimated within a factor 2 for all the runs. This slight discrepancy is explained by the noise in the observations, and this suggests that the performance of the atmospheric model used is also good.

The theoretical link established between point and distributed sources was also able to clarify new concepts for the exploitation of monitoring networks. The visibility function provides an unprecedented simple criterion for appreciating the performance of a network and optimizing its design. A unique feature in this study is the illustration of the results in the renormalized geometrical framework, highlighting (i) the relative importance of the various regions in the inversion and (ii) the influence of the noise on the identification of a point source. In this geometry, the regions weakly seen simply disappear, implying that they are not under the control of the monitoring network. The interpretation of the renormalized geometry with its edge analogous to the sky in eye's view is a strong indication that the theory used in this paper is not just a convenient mathematical tool. It gives a scientific meaning to a common sensation not discussed ordinarily: the sky is apparently flat because the information obtained by our eyes from the deep Universe is negligible. Similar limits are met by the atmospheric monitoring network investigated here.

## Acknowledgments

Authors thank Ms Sandie Favier for her encouragement during the course of the study and the Cultural and Scientific Service, French Embassy, New Delhi, India, for providing the partial travel support to Dr Jean-Pierre Issartel to visit IIT Delhi to begin this work. Authors also thank Dr Isabelle Herlin for inviting Dr Maithili Sharan to visit INRIA, Paris-Rocquencourt research centre, France, under the CLIME project that helped to finalize the work.

## Footnotes

- Received October 8, 2008.
- Accepted March 11, 2009.

- © 2009 The Royal Society