## Abstract

Whole ecosystem exchange of water, carbon and energy is predominately determined by complex leaf-level processes occurring at individual plants. Interaction between individuals results in a distribution of environmental conditions that drive a variety of nonlinear response functions such as transpiration and photosynthesis. The nonlinearity of biophysical processes requires higher-order statistical descriptions of micro-environment distributions in order to accurately determine the landscape-scale mean functional response. We present a mathematical framework for describing vegetation structure based on the density, dispersion, size distribution and allometry of individuals within a landscape. Using three-dimensional stochastic vegetation modelling, we develop analytic expressions of the second-order statistics of vegetation canopies, namely the mean and variance of leaf area density and leaf area index with height. These expressions also allow for the approximation of the distribution of beam penetration and sunfleck statistics through the canopy as a function of height. Finally, we demonstrate how landscape-scale fluxes are strongly affected by the variability in canopy micro-environments, and how stochastic vegetation modelling improves flux estimates relative to traditional homogeneous canopy models.

## 1. Introduction

Estimates of the landscape-scale exchange of water, carbon and energy are a necessity for accurate understanding of the Earth's geological and biological processes. The biophysical functionality of vegetated environments is therefore mathematically represented via ecosystems models. Within different ecosystem models, the spatial resolution of plant canopy architecture spans many degrees of complexity. However, it is increasingly understood that accurate description of both the vertical [1–3] and horizontal [4–7] structural dimensions is required to correctly model biophysical functionality. Accurate models of vegetation canopies that can be directly parameterized with information about the abundance and allometric properties of individuals are required to enable a rapid estimation of biophysical exchange in global climate models and at specific, intensely studied, field sites [6,8].

The description of land–atmosphere interactions is inherently influenced by the effects of scale choice. Spatial patterns in rainfall, soil properties and topography can all drive variability in ecohydrological processes, and the schematization of a surface for model representation often prescribes uniform conditions within cells [9]. In the case of heterogeneous canopies, at scales below 10 ha, variability in vegetation has been shown to significantly impact the mean field hydrological processes. Conversely, at scales larger than 100 ha, the spatial structure of climatology has been shown to control variability [10]. The distributions of soil properties and topographic relief also influence vegetation patterns and fluxes [11], and the scale at which geomorphologic features are manifest will drive variability at these spatial extents. Thus, at the field scale, where much ecohydrological field research and modelling efforts are directed, heterogeneity in vegetation structure is expected to be the dominate source of variability.

Vegetation canopies are typically modelled in two distinct manners: one highly efficient and the other highly precise. The first common approach assumes canopy structure to be horizontally homogeneous throughout a given area, and is typically referred to as ‘big-leaf’ modelling. The second approach involves explicitly specifying the location and characteristics of each individual within the area to be studied [7]. The assumption of a homogeneous field implies that the physiological response of the landscape to the average condition is equal to the average of the responses to a distribution of conditions. The classic proof by Jensen [12] demonstrated the inequality of the response to the mean and the mean of the response for nonlinear functions: , when *f*(** X**) is a convex function of

**(bold letters are used to represent random variables throughout the text). Ecophysiological research has shown that leaf photosynthesis and transpiration are strongly nonlinear functions of micro-environment radiation, temperature and humidity [13]. In the example case of transpiration, the phase change of liquid water to vapour in stomata is governed by the availability of a finite amount of incoming radiation; thus as leaf area at a location becomes more dense, increases in transpiration gains become smaller owing to energy limitation. In environments characterized by sparse vegetation, foliage is typically clumped into areas of varyingly dense leaf area separated by patches of bare ground [14], and the assumption of a homogeneous canopy environment with a single average leaf area will result in unreliable estimates of landscape-scale fluxes [6,7]. Natural landscapes span a wide range in both the average amount of leaf area,**

*X**μ*

_{LAI}(m

^{2}m

^{−2}), and the standard deviation of the average leaf area,

*σ*

_{LAI}(m

^{2}m

^{−2}), as shown in figure 1. More complex modelling approaches are required to accurately model surface-to-atmosphere interactions as variability in leaf area increases relative to total leaf area.

In an effort to accurately resolve landscape-scale biophysical fluxes, research has led to individual-based ecosystem models. In these models, the explicit location, size, canopy geometry and leaf physiology of each individual is simulated in a three-dimensional space. The MAESTRO model [18] and/or the coupled atmosphere–canopy Forest Light Environment Simulator (FLiES) [19] are examples of this class of model. Once the three-dimensional environment is fully specified, energy and mass transfer within the canopy are directly resolved. In the case of FLiES, a large number of individual photons (10^{5}–10^{8}) are injected into the simulated canopy, after which their path and energy status is tracked as they are reflected, transmitted and absorbed by canopy elements. The large amount of input data required, as well as the computational demands of full three-dimensional modelling, limit the applicability of these models at regional and continental scales. Another approach used to represent subgrid-scale heterogeneity is to divide areas into ‘patches’ or ‘mosaics’ that are then modelled as separate entities and aggregated [20]. Finally, probability distributions of micro-climates have been used to model subgrid-scale variability [21,22]; yet, parameterizing these distributions in ecosystem models remains a difficulty.

Stochastic canopy and radiation transfer models do not require the explicit simulation of all individuals, yet are able to estimate the average conditions present in a heterogeneous system. These models, pioneered by Estonians Nilson and co-workers [23–25] and Ross and co-workers [26,27], assume that individuals are randomly distributed on a landscape. Using the basic assumption of spatial randomness, foliage area and light penetration profiles are estimated based on statistical geometry. The geometrical optical radiation transfer [28], analytical clumped two-stream [8] and canopy semi-analytical *P*_{gap} and radiative transfer [29,30] models are state-of-the-art radiation transfer models able to predict the light environment vertically through the canopy. Radiation transfer schemes within these models are focused on light attenuation through the canopy and predict the average probability (*p*_{gap}) that light will penetrate to a given depth. These models, with the exception of those of Nilson, typically assume that individuals are randomly distributed following a spatial Poisson process on a landscape. The spatial Poisson assumption does not take into account the underdispersion of ordered systems such as crop stands and forestry plantations, nor the clumping observed in sparsely vegetated systems. The clumping of leaves into crowns is typically considered via an effective leaf area index or clumping factor, *Ω*(*z*,*θ*), which is used to relate true leaf area to observed average light penetration [31].

The radiation transfer equations of Shabanov *et al.* [32,33] use both the probability of encountering vegetation elements and the pair-correlation distance between elements to resolve the mean field vertical radiation regime within heterogeneous canopies. This approach is based on the stochastic radiative theory of Vainikko [34,35], which was developed for addressing radiation transfer through discontinuous clouds. The case of identically sized and shaped cylindrical vegetation elements ordered into canopies separated by gaps with a Poisson distribution in space was numerically investigated with the stochastic mixture radiative transfer model and found to significantly alter radiative fluxes relative to the classic turbid medium approach [33]. The work of Huang *et al.* [36] incorporated pair-correlation functions for both over- and underdispersed vegetation patterns. Huang *et al.* [36] clearly demonstrated that mean representations of canopy vegetation are not sufficient to accurately represent the mean landscape canopy reflectance.

The majority of the earlier-mentioned models are focused on light penetration, and variability is typically considered with respect to gap probabilities, not in the distribution of leaf area itself. No models yet present analytical solutions describing the mean and variability in leaf area as a function of height based on abundance, dispersion, size distributions and allometric parameters. Representation of the statistical distribution of foliage is necessary because ecosystem fluxes are functions of both the canopy micro-environment (e.g. the profile of incident photosynthetically active radiation) and the amount of foliage experiencing these conditions. In this study, we present a framework for modelling the three-dimensional structure of vegetation that mathematically represents complex, heterogeneous canopies with minimal parameters and assumptions. We provide the analytical solution of the mean and variance of leaf area density, *L*_{AD} (m^{2} m^{−3}), and leaf area index, *L*_{AI} (m^{2} m^{−2}), as a function of landscape density, clustering level, size distribution and allometric relationships. With the canopy leaf area characterized, expressions for the landscape fractional cover, gap frequency probability density function as well as the empirical clumping factor *Ω* are analytically derived. Finally, we use the distribution of *L*_{AI} to predict the distribution of evapotranspiration and demonstrate how spatial variability in foliage alters the average surface–atmosphere flux of water.

## 2. Representation of individuals on a stochastic field

The spatial distribution of individuals is described by the average number of individuals, *λ*_{S} (m^{−2}), and their dispersion, *ν*_{S} (), on a subplot *S* of the two-dimensional plane. Each individual on the landscape is assumed to possess a height randomly chosen from a known distribution characterized by a mean, *μ*_{H} (m), and variance, *σ*^{2}_{H} (m^{2}). The canopy above each individual is assumed to conform to allometric relationships characterized by species-specific constants for canopy width (*c*_{W}), canopy thickness (*c*_{T}) and canopy foliage (*c*_{L1}, and *c*_{L2}). We directly derive the analytical expressions for the mean and variance of leaf area density, *L*_{AD} (m^{2} m^{−3}), and leaf area index, *L*_{AI} (m^{2} m^{−2}), as a function of elevation *z*. The appendix describes the random variables, biophysical constants, derived values and other parameters referenced in this text.

### (a) Individual density and dispersion

The number and organization of individuals of a specific species or functional type in a subplot of size *S* is treated as a random variable, ** N**. Often, the spatial organization of individuals is assumed to conform to a two-dimensional Poisson process, and the distribution of

**therefore follows a Poisson probability mass function [37]. The Poisson distribution is a single-parameter distribution where the average number of individuals,**

*N**λ*

_{S}, and variance are equal. However, in more ordered systems, such as a crop fields, forestry plantation or situations where individuals strongly inhibit the growth of their neighbours, the variance in the number of individuals per area is much lower than a Poisson process. These underdispersed systems may be characterized by a binomial distribution [24,38]. Conversely, when vegetation is organized into clumped patters, as is found in sparsely vegetated savannas [14], the variance in the number of individuals in an area

*S*may be significantly higher than in a Poisson process. These overdispersed systems may be described by a negative binomial distribution [38].

To accurately represent the variety of landscapes observed in nature, a two-parameter discrete distribution bound at zero with adjustable variance is required. Efron [39] proposed a two-parameter double Poisson distribution, ∼DP(*λ*,*ν*), as a way to model both overdispersed (e.g. an ordered lattice) and underdispersed data (e.g. clumped points) [39]. The probability that the random number of individuals, ** N**, will be equal to a number

*n*is given by the probability mass function 2.1where

*Sλ*

_{S}and

*ν*

_{S}are the parameters determining the location and shape of

*f*

_{DP(λ,ν)}(

*n*) [39]. The normalizing constant

*c*

_{DP}is defined such that . An approximation of

*c*

_{DP}is given [39], but

*c*

_{DP}is also easily determined numerically. The mean and variance of

*f*

_{DP(SλS,νS)}(

*n*) given over a region

*S*are 2.2

The parameter *ν*_{S} determines the amount of dispersion present in the region *S* and thus also the broadness of the distribution of *f*_{DP(λ,ν)}(*n*), with *ν*_{S}<1 relating to underdispersed data, *ν*_{S}=1 describing complete spatial randomness, and *ν*_{S}>1 relating to overdispersed data. When *ν*_{S} is equal to one, equation (2.1) simplifies to the Poisson distribution. *ν*_{S} is directly analogous to the Cox index of clumping used to describe the level of dispersion in measured subplots in forests [40]. Figure 2 shows three simulations of landscapes all with the same values of *λ*_{S} and *S*, but with different values of *ν*_{S}. Simulations were generated by dividing the area into a gridded lattice with square areas of size *S*. A random number of individuals drawn from the double Poisson distribution was assigned to each lattice box, and then each individual was randomly placed within the area. As shown in the bottom of figure 2, the double Poisson distribution is able to accurately represent the normalized histograms of simulated arrangements in space over a wide range of dispersion conditions.

### (b) The Poisson random case

We begin with the case of complete spatial randomness, i.e. *ν*=1, and derive expressions for the statistics of interactions of individuals. In the case of foliage density and area, interaction (i.e. overlapping) of two or more individuals is an additive process. Assuming radial symmetry of all individuals, let *g*(*x*;*u*,** H**) represent the influence of an individual located at

*u*=(

*u*

_{1},

*u*

_{2}) with the random set of characteristics

**at position**

*H**x*=(

*x*

_{1},

*x*

_{2}) in the landscape. At this point, it is not necessary to state the form of

*g*(

*x*;

*u*,

**), nor do we need to define the random variable**

*H***. The total influence**

*H***(**

*Y**x*) of all elements in the landscape at position

*x*is given by the sum of all contributions, 2.3where

**(**

*N**u*) is a double Poisson stochastic process over the two-dimensional plane with characteristics given by (2.2). As noted by Rodriguez-Iturbe

*et al.*[41], the cumulates of

**are obtained by Taylor series expansion where, subject to convergence, the**

*Y**m*th cumulant

*κ*

_{m}(

**) is the coefficient of**

*Y**p*

^{m}/

*m*! [41]; therefore, 2.4which is a form of Campbell's theorem [42]. The above results may also be derived directly with the fact that for the Poisson random case [41].

The random variables ** H** and

**are considered to be completely independent and identically distributed in this framework. Using the properties of d**

*N***(**

*N**u*), we reformulate the results in polar coordinates, with

*r*as the Euclidean distance from

*x*to

*u*. The expected value of all contributions at a random location on the stochastic landscape is 2.5In a similar manner, the variance is given by 2.6

Equations (2.5) and (2.6) are used to estimate the first- and second-order statistics of the random process ** Y**. Assuming that all individual elements are radially symmetric, of finite cylindrical form and homogeneous within themselves, we can complete the integration in (2.5) and (2.6) out to the element edge. This gives the element area,

*a*(

**), with characteristics**

*H***. Finally, we condition on the probability that a random height**

*H***is equal to a value of**

*H**h*, . The expected value of

**is formally defined by rewriting equation (2.5) as 2.7In a similar manner, the variance of**

*Y***is formally stated by rewriting equation (2.6) as 2.8where**

*Y**f*(

*h*) is a known probability distribution. Equations (2.7) and (2.8) are the core of the vegetation structure model and enable us to estimate the first and second central moments at the landscape scale of processes arising from the additive influence of a field of individuals.

### (c) Over- and underdispersed systems

Modelling individual dispersion with a lattice-based approach provides an analytically tractable representation of complex natural phenomena. Both clustered and thinned processes may be described with respect to a measure of the variance of their intensity, *ν*, and their appropriate length scale, *S*. Values of *ν* and *S* may be obtained through direct surveys at field sites. Additionally, traditional processes such as Neyman–Scott clustering models and Matérn inhibition processes may be approximated by fitting *ν* and *S* to the theoretical correlation structure or Ripley's *K* function for a particular model [43].

We define and as the mean and variance of the process ** Y**(

*x*) with a dispersion parameter

*ν*

_{S}≠1. For all cases, the expected number of individuals and their average influence at point

*x*does not change from the Poisson random case above, therefore . Beginning again at (2.3), and making use of the law of total variance, we use the double Poisson properties in (2.2) and derive the second cumulant of

**(**

*Y**x*) as 2.9In the Poisson case, (2.9) simplifies to (2.4) through the definition of variance, i.e. . If the definition of variance is used in the first term on the right-hand side of (2.9), then this term can be separated, and the first of the resulting three terms is equal to . The second and third terms are multiplied by and factored to form . Equation (2.9) is then rewritten with respect to the moments of the Poisson random case as 2.10where the

*λ*

_{S}

*S*term in the denominator arises because

*ν*

_{S}is defined at a particular scale, and thus integrated over the area

*S*only. As equation (2.10) is integrated over

*S*and not , the influence of adjacent lattice cells is not considered. As

*S*approaches the areal extent of individual elements, i.e.

*S*≤

*a*(

**), equation (2.10) becomes a poor predictor of the .**

*H*We compare simulations of stochastic systems with equation (2.10) in figure 3. A 200×200 m grid with cells of size *S* was created, and within each lattice cell, a random number of marks drawn from a double Poisson probably mass function were placed uniformly. Each mark was assigned a value of ** H** drawn from a gamma distribution with a mean of 10 and a variance of 25. A circular region around each mark with an area proportional to the mark value,

*a*(

**)=0.6**

*H***, was allocated an intensity proportional to the mark value,**

*H**g*(

*x*;

*u*,

**)=0.1**

*H***. Except in the case of low values of both**

*H**S*and

*ν*

_{S}, equation (2.10) is able to capture changes in landscape variability accurately.

## 3. The stochastic three-dimensional canopy environment

The foliage canopies of all individuals combine to form the landscape vertical profile of leaf area, expressed as either leaf area density, *L*_{AD}, or leaf area index, *L*_{AI}. The spatial dimensions of individual canopies are characterized as functions of each individual's height with simple allometric relationships. The stochastic double Poisson process in two-dimensional space is defined as a marked stochastic process where the value of each mark refers to the height of each individual. For representation of a three-dimensional canopy environment, we use equations (2.7) and (2.8) in the estimation of the mean and variance of leaf area in a vegetation canopy as a function of height *z*. Dispersion can then be addressed using equation (2.10).

### (a) Allometric relationships

We have defined the density and dispersion of individuals on the landscape; next, we characterize the influence, *g*(⋅), that each individual exerts on the overall field. Individual canopies in the landscape take the form of vertical cylinders, though other forms, such as ellipsis or cones, may also be constructed. These cylinders are filled uniformly with leaf area at a density specific to each height. The horizontal width, ** W** (m), and vertical thickness,

**(m), of cylindrical canopies are defined as 3.1where**

*T**c*

_{W}(m m

^{−1}) is a constant relating the height of an individual to its canopy width. Similarly,

*c*

_{T}(m m

^{−1}) is a constant relating the thickness of the canopy to the total height of the individual. The area under the canopy is , and the volume of the cylindrical canopy is . The total foliage area present within an individual canopy,

**(m**

*F*^{2}), is given by a simple power relationship to height as 3.2where

*c*

_{L1}[m

^{(2−cL2)}] and

*c*

_{L2}[ ] are constants [27].

The canopy of a single individual is filled with foliage at a leaf area density of *g*_{LAD}(** H**) (m

^{2}m

^{−3}), which is equal to the total foliage area over the total volume,

**/**

*F***. The function**

*V**g*(

**) is then written with respect to elevation above the surface,**

*H**z*, where we note the fact that the function is zero when outside the canopy as 3.3where

**(1−**

*H**c*

_{T}) is the elevation of the bottom of a canopy of an individual of height

**. Figure 4**

*H**a*(black line) shows a single realization of equation (3.3) for an individual with

**=2**

*H**m*. We condition on the fact that

**=**

*H**h*, and use equations (3.1) and (3.2) to resolve the canopy volume and total leaf area with respect to

*h*as 3.4Equation (3.4) states that until an individual's height

*h*reaches

*z*, leaf area density at elevation

*z*is zero. While the individual has a height between

*z*and

*z*/(1−

*c*

_{T}), the leaf area density is described by the mean leaf area density (

**/**

*F***). When an individual is so tall such that the bottom of the canopy does not reach down to elevation**

*V**z*, there is again zero leaf area density. Note also that as

*c*

_{T}approaches one, leaf area is present for any height above

*z*and therefore the upper bound of the second inequality is infinity. Figure 4

*b*(black line) shows equation (3.4) for individuals with different

*h*values at an elevation of

*z*=2 m.

Leaf area index above a specific point is the product of the leaf area density and the vertical length through the canopy. Leaf area index at an elevation *z* within a single canopy, *g*_{LAI}(** H**) (m

^{2}m

^{−2}), is given by 3.5where the leaf area index is constant below the canopy, linearly decreases through the canopy, and is zero above an individual canopy. Figure 4

*a*(grey line) shows a single realization of equation (3.5) for an individual with

**=2 m. We rewrite the**

*H**g*

_{LAI}(

**) function conditioned on**

*H***=**

*H**h*and have 3.6As with the case of leaf area density, the leaf area index at an elevation

*z*is zero until an individual's height

*h*exceeds

*z*. As the height of an individual increases, the leaf area index at

*z*increases slowly because the full canopy is not above

*z*. Once the entire canopy is above the elevation

*z*, leaf area index increases with height more rapidly as the canopy volume expands as height increases. Figure 4

*b*(grey line) shows equation (3.6) for individuals with different

*h*values at an elevation of

*z*=2 m.

### (b) Marked heights

The size of each mark in the stochastic double Poisson process corresponds to the height ** H** of each individual. Many distributions, including the beta, normal, lognormal, gamma and Weibull, have been used to characterize the distribution of heights observed in forest stands [44]. Given the allometric relationships chosen above and the form of equations (2.7) and (2.8), the gamma distribution allows for closed-form analytical solutions to be derived for the mean and variance of

*L*

_{AI}and

*L*

_{AD}. The gamma distribution is a two-parameter continuous distribution bound at zero, of which the exponential distribution is a specific form.

The distribution of ** H** is specified as a gamma distribution, ∼

*Γ*(

*k*,

*m*), with a probability density function

*f*

_{Γ(k,m)}(

*h*) and cumulative density function

*F*

_{Γ(k,m)}(

*h*). The probability that a random height

**is equal to**

*H**h*is defined as 3.7The probability that a random height is less than

*h*is defined as 3.8where

*Γ*(

*x*) is the complete gamma function evaluated at

*x*, and

*γ*(

*α*,

*x*) is the lower incomplete gamma function with parameter

*α*evaluated at

*x*. The average height of

*μ*

_{H}is equal to

*km*and variance is equal to

*km*

^{2}(note, and ).

### (c) Landscape-level variability

We combine equations (3.4) or (3.6) with (2.7) and (2.8) to estimate the mean and variance of leaf area density, and leaf area index for a stochastic ecosystem as a function of elevation *z*. The mean leaf area density, *μ*_{LAD}(*z*) (m^{2} m^{−3}), and variance of leaf area density, (m^{4} m^{−6}), are described by
3.9and
3.10In a similar manner, the mean leaf area index, *μ*_{LAI}(*z*) (m^{2} m^{−2}), and variance of leaf area index, (m^{4} m^{−4}), are given by
3.11and
3.12The constants, [ ], are defined with the gamma cumulative distribution function, *F*_{Γ(k,m)}(*x*), as given in table 1. The first terms of equations (3.11) and (3.12) give the values of *μ*_{LAI} and at ground level (*z*=0).

Figure 5 shows two examples of equations (3.9)–(3.12) for hypothetical distributions of heights with the same landscape density, dispersion and allometric constants. For two distributions of individual heights, one following an exponential distribution and another simulating a more even-sized distribution, we observe that the maximum mean and standard deviation of *L*_{AD} are similar; however, the foliage occupies a much smaller vertical band for the even-sized distribution. This translates to a smaller total leaf area index at ground level for the even-sized distribution. Variability in individual heights affects both the vertical foliage placement and total leaf area. Figure 6 depicts how the mean and variability in leaf area index relate to both landscape density (*λ*_{S}) and aggregation level (*ν*_{S}). As expected, increasing the number of individuals present results in a direct and linear increase in landscape average leaf area index and a decrease in variability. Changes in the dispersion *ν*_{S} do not affect mean leaf area index, but significantly affect the leaf area variability. For a given landscape density, increasing aggregation levels leads to larger variability in leaf area index, as expressed by the coefficient of variation, .

### (d) Fractional cover

The fraction of the landscape covered by canopies, *f*_{cc} [ ], is an important ecosystem feature, and is easily derived from the model. Because each individual is associated with a single canopy, the distribution of the number of canopies above any given location is assumed to be of the same form as that of the individuals themselves, i.e. a double Poisson distribution. The number of canopies at any given location is therefore also ∼DP(*λ*_{cc},*ν*_{fcc}), where *λ*_{cc} and *ν*_{cc} are rescaled parameters describing the distribution of canopy cover.

The expressions for the average number of canopies per unit area *λ*_{cc} and the dispersion in canopy cover *ν*_{cc} are found with the solutions to the moments of the leaf area density. By setting *c*_{L1} to (*πc*^{2}_{W})/4, *c*_{L2} to 3, and *c*_{T} to 1, we create a canopy with a value of one that does not vary with the height of an individual. Using equation (3.9), the average number of canopies per unit area is
3.13where because *z*_{1}=0 and . Similarly, equations (2.10) and (3.10) are used to find *ν*_{cc}=1+*λ*_{cc}(*ν*_{S}−1)/(*λ*_{S}*S*).

The fractional canopy cover is defined as the probability that at least one canopy is present at a given location. This probability is simply estimated as one minus the probability of zero canopies. Using the probability mass function in equation (2.1), we have
3.14where *c*_{DP(λcc,νcc)} is the constant of integration for the double Poisson distribution described by *λ*_{cc} and *ν*_{cc}.

### (e) Gap probabilities and radiation transfer

Characterization of the radiation regime in the vegetation canopy is necessary to drive biophysical models of photosynthesis and transpiration. Given the characterizations of the mean and variance of *L*_{AI} and *L*_{AD}, we wish to estimate the statistics of radiation transfer. We derive analytically an approximation for the fraction of light rays that pass through the canopy to elevation *z*. At any given location, if we consider only canopies and not stem material, then the fraction of direct beam radiation passing through a canopy with a leaf area index ** L** is given by
3.15where

*G*(

*θ*) is the projection of unit foliage area on the plane perpendicular to the view direction

*θ*[1,24,31]. Equation (3.15) was derived based on homogeneous canopies and its application to heterogeneous environments is a simplified approximation. In the treatment here, we consider the landscape as broken up into separate homogeneous patches with different foliage densities. The leaf area index present at a given patch is considered a random variable,

**, where**

*L***has mean**

*L**μ*

_{LAI}and variance . The probably that a realization of

**is equal to a specific value**

*L**l*is denoted by

*f*(

*l*); i.e. . Given that leaf area is a random variable in space (

**), the radiation regime is also a random process in space, denoted as**

*L***. We can then estimate the moments of light penetration as 3.16where and .**

*P*The distribution of leaf area index, *f*(*l*), is assumed to follow a gamma distribution, as defined in equation (3.7). We find *μ*_{P} and as
3.17and
3.18The negative exponential transform of a gamma random variable in equation (3.15) leads to a random variable with a beta-like distribution bound between zero and one. The mean and variance of light penetration, as defined by equations (3.17) and (3.18), can then be used to model the entire distribution of light penetration for the vegetated system.

The ability of equations (3.17) and (3.18) to capture the statistics of the random radiative regime ** P** is assessed with the three-dimensional radiative transfer model FLiES [19]. The FLiES model is a Monte Carlo ray-tracing simulator with an explicitly specified three-dimensional canopy. The model demonstrated low model-to-reference deviations in the RAMI4PILPS radiation model inter-comparison project [45] and has been coupled to energy and carbon exchange models to predict landscape-scale fluxes [7]. In FLiES, a three-dimensional vegetation configuration is specified for a 30×30 m plot, and then a set number of individual photons are bombarded on the canopy. The energy of each photon is followed as it passes through the canopy, encounters vegetation elements and randomly scatters. After all photons have been released, absorbed photosynthetically active radiation (PAR) is calculated for each 0.1×0.1×1 m voxel by summing the change in energy of all scattering events that have occurred within the grid box.

Two separate vegetation configurations were simulated in FLiES, with 50 separate random realizations of each configuration, and 10^{6} photons in each simulation. PAR at the top of the canopy was specified as 1000 μmol m^{−2} s^{−1} with no diffuse light. Incident PAR at each level was calculated as the top of canopy PAR minus the absorbed PAR above. The mean and standard deviation of incident PAR was calculated for each elevation above the surface in 1 m layers. Solar zenith angle was set at 0, leaf albedo set at 0.15 and soil albedo at 0. *G*(*θ*) was set at , where *α* is one albedo, and 0.5 denotes a uniform leaf angle distribution [46]. Vegetation was modelled as cylindrical canopy elements with uniform foliage density (the inner stem area of FLiES was removed by setting the constant ‘`bp2`’ to 0.0 on line 277 of the source file ‘`iprams.f`’ and recompiling the model). FLiES does not consider overlapping canopies, and when two canopies intersect, the leaf area density of this region is set at one of the two values, not their sum. This treatment of intersecting canopies diverges from the stochastic framework developed earlier, and to minimize intersecting canopies in FLiES, the parameter *c*_{T} was set to 0.1.

Vegetation canopies with an even-sized distribution of heights (*μ*_{H}=10, *σ*_{H}=5) and an exponential distribution of heights (*μ*_{H}=10, *σ*_{H}=5) were analysed. Figure 7 shows the incident PAR profile based on equations (3.17) and (3.18) and the FLiES model results. For both cases, the form of the mean incident PAR vertical profile obtained from the FLiES model is captured by the analytical equations derived earlier. The offset for both the ray-tracing simulations from the analytical prediction is probably caused by the different treatment of intersecting canopies between methods. This lack of an additive canopy intersection leads to a reduction in total simulated foliage and a decrease in the variability of foliage densities. Spatial variability in PAR is also captured very well by the analytical formulation when *σ*_{H}=5, and to a lesser degree when *σ*_{H} is set to 10. The increase in variability in *σ*_{H} leads to more large vegetation elements and thereby more canopy intersections, and because intersecting canopies do not have an additive leaf area density in FLiES, the total variability is severely decreased. The application of equation (3.15) to a non-homogeneous field by integrating across the probability space of different leaf areas provides a fast, analytically tractable estimate of second-order radiation regime properties. This application is a simple case with the Sun directly overhead, no diffuse radiation and a black understorey, and therefore the application of equations (3.17) and (3.18) in their current form is limited. However, given that both the mean and variability of PAR can be solved analytically in specific cases, further work may extend this approach further.

Many models make use of an empirical clumping factor, *Ω*, or an or an ‘effective’ leaf area, *L*_{e} [31]. The clumping factor scales leaf area index such that the average leaf area index inserted in equation (3.15) will give the average beam penetration (i.e. ). Similarly, the ‘effective’ leaf area *L*_{e} is equal to *μ*_{LAI}*Ω*(*θ*) [31]. The value of *Ω*(*θ*) can be derived as
3.19However, the use of a clumping factor or ‘effective’ leaf area presents problems when estimating biophysical processes in heterogeneous environments. Areas with varying degrees of leaf area will respond nonlinearly owing to energy limitations, such that model results using averaged leaf area or radiation transfer will overestimate ecosystem functionality.

Multi-angle observations of canopy reflectance data collected from space-based platforms have been used to estimate the clumping index for the land cover types of the Global Landcover 2000 dataset [15,17,47]. Values of *Ω*(*θ*) from [17] were used in conjunction with the average leaf area index obtained from NASA's moderate resolution imaging spectroradiometer (MODIS) platform to estimate the variance of leaf area index for each land cover type in figure 1. MODIS MYD15A2 data were used from the year 2006 [16] and a *G*(*θ*) value of 0.5 was assumed. Equation (3.19) was solved numerically to obtain estimates of the variance of leaf area index for land cover types as shown in figure 1.

## 4. Variability in leaf area and biophysical processes

The leaf surface at the stomatal opening is the site at which transpiration and carbon assimilation occur. These biophysical processes are driven by a finite amount of available energy. Because of energy limitation, an area with twice as much leaf area as another will not necessary have double the transpiration or assimilate twice the carbon. In order to accurately characterize the landscape-scale exchange of water and carbon, information about the distribution of leaf area and its micro-environment is required. By characterizing both the mean and variance of leaf area, we are able to describe the distribution of foliage area and the biophysical response at any amount of leaf area.

We present an example of how variability in the spatial organization of vegetation structure affects the total landscape flux by estimating the total landscape transpiration with the Penman–Monteith (PM) equation. The PM equation is a statement of the energy balance at the leaf scale used to estimate ecosystem transpiration rates given meteorological forcing of net radiation, ** R**, air temperature,

*T*

_{a}, horizontal wind speed,

*U*

_{x}, relative humidity,

*h*

_{r}, and stomatal conductance,

*g*

_{s}(see [48] for a complete description of the PM equation and its terms). The evapotranspiration,

**(mm day**

*E*^{−1}), from a surface is calculated as 4.1where

**is a random variable describing the total leaf area index at a given location, and**

*L***is a random variable describing the net radiation present at a given location, determined as a function of**

*R***. Figure 8 (thick black line) depicts the estimated evapotranspiration for given values of leaf area index following [48]. Radiation at the top of the canopy,**

*L**R*

_{in}, was attenuated as a function of the random leaf area index at a given location, as in equation (3.15), as

**=**

*R**R*

_{in}e

^{−G(θ)L}. Leaf area at each

**was split into sunlit and shaded portions, and the total evapotranspiration for each leaf index was calculated as the sum [46].**

*L*Evapotranspiration is a nonlinear function of leaf area, and therefore, variability in ** L** will influence the total average landscape transpiration. For a given

*λ*

_{S}and

*ν*

_{S}and a set of biophysical constants,

*μ*

_{H},

*σ*

^{2}

_{H},

*c*

_{W},

*c*

_{T},

*c*

_{L1}and

*c*

_{L2}, we can calculate the mean,

*μ*

_{LAI}, and standard deviation,

*σ*

_{LAI}, of the leaf area index and estimate the distribution of

**. The distribution of**

*L***is again assumed as a gamma random variable with probability density function**

*L**f*

_{Γ}(

*l*). We calculate the expected landscape transpiration rate by conditioning on the probability that a particular value of leaf area is present in the canopy by the definition of expectation, 4.2Equation (4.2) is numerically integrated to estimate the average landscape evapotranspiration.

As the landscape moves from highly regular arrangements to highly clumped arrangements, large changes in the expected transpiration rate occur. Figure 8 shows the value of for landscapes with different coefficients of variability in leaf area index. Although the treatment of radiation attenuation through integration of equation (3.15) is a simplification, the presented method allows for exploration of the consequences of different vegetation patterns to be quickly investigated. In more clustered landscapes characterized by large variability in leaf area index, the landscape evapotranspiration is significantly less than the flux obtained when the average leaf area index is used to estimate evapotranspiration. Clustering leads to patches of variably dense foliage where the patch transpiration may be higher in total but lower per unit leaf area due to a limited amount of energy to drive the evaporation process.

## 5. Final discussion

Understanding variability in the structure of vegetated environments is necessary to accurately estimate the exchange of water, carbon and energy between the Earth's surface and the atmosphere. The degree to which individuals are dispersed on a landscape considerably influences structural variability, and the stochastic vegetation model presented earlier is able to incorporate diverse configurations of individuals. With the stochastic field of individuals defined as a marked double Poisson process, the earlier-mentioned methodology captures the vertical and horizontal distribution of canopy structure. Allometric relationships relate the marked two-dimensional process to a three-dimensional space where canopies occupy finite volumes and contain specified foliage amounts. The choice of cylindrical canopies and simple allometry in the *g*(⋅) functions, as well as assuming the distributions of individual heights are gamma random variables, provides straightforward closed-form analytical solutions. Through stochastic spatial geometry, direct analytical expressions of the moments of the vertical distribution of foliage are available.

Descriptions of the mean and variance of leaf area density and leaf area index provide a detailed representation of vegetation structure with minimal parameterization. Analytical expression allows for fast computation of vegetation profiles that can be parameterized easily to resolve canopy structure in global ecosystem models. The amount of canopy cover, as well as the full distribution of light penetration, can be quickly modelled as functions of the vertically resolved leaf area distribution.

Furthermore, if the independence of different species or functional groups is assumed, then this methodology may be used to estimate the vertical profile of leaf area for a biologically diverse system through a superposition of separate leaf area profiles for each modelled species. This would allow for representation of ecotones in Earth-system models, where the observed ranges of species partially overlap. However, we have assumed that leaf area distributions follow a gamma distribution, which is a bound distribution that does not permit simple combinations of multiple distributions. Other distributions, such as the normal, may be chosen such that the leaf area profiles of different species can be combined. Furthermore, by setting *c*_{T} to one, vegetation structures low to the ground, such as grass clumps, are able to be modelled. Finally, because of the generality of the model, it is possible to model the distribution of subsurface elements such as rooting networks.

Because this three-dimensional stochastic vegetation model is able to characterize both the mean and variance in leaf area distribution, differences in ecosystem processes as functions of leaf area can also be address. The micro-climate where leaf-level processes occur is highly dependent on the amount of leaves in the immediate area. Energy available to drive photosynthesis and transpiration is a finite quantity and, as such, increases in leaf area lead to diminishing increases in transpiration. Landscape-scale fluxes from ecosystems characterized by larger variability in their leaf area can only be estimated by consideration of the entire distribution of micro-environments. The difference between the evapotranspiration at mean leaf area index, *E*(*μ*_{LAI}), and the expected value of ** E** given a distribution of leaf area, , is shown in figure 9 as the percentage of flux overestimation, (%). The overestimation of evapotranspiration becomes large as the variability in canopy structure increases. Land cover types characterized by open vegetation cover with lower leaf area overestimate evapotranspiration to a larger degree for the modelled conditions. Other micro-environment factors such as temperature and humidity may also vary with foliage density and could possibly be incorporated into more complicated models. Finally, this methodology is formulated such that changes in the density and size distribution of individual species can be quickly propagated into the entire vegetation structure. This straightforward analytical solution to canopy structure as a function of individual abundance is therefore suited to studies of vegetation succession, biodiversity and optimality wherein changes in external forcing and disturbances drive ecosystem dynamics.

## Appendix

- Received January 3, 2013.
- Accepted April 8, 2013.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.