## Abstract

Stream runoff is perhaps the most poorly represented process in ecohydrological stochastic soil moisture models. Here we present a rainfall-runoff model with a new stochastic description of runoff linked to soil moisture dynamics. We describe the rainfall-runoff system as the joint probability density function (PDF) of rainfall, soil moisture and runoff forced by random, instantaneous jumps of rainfall. We develop a master equation for the soil moisture PDF that accounts explicitly for a general state-dependent rainfall-runoff transformation. This framework is then used to derive the joint rainfall-runoff and soil moisture-runoff PDFs. Runoff is initiated by a soil moisture threshold and a linear progressive partitioning of rainfall based on the soil moisture status. We explore the dependence of the PDFs on the rainfall occurrence PDF (homogeneous or state-dependent Poisson process) and the rainfall magnitude PDF (exponential or mixed-exponential distribution). We calibrate the model to 63 years of rainfall and runoff data from the Upper Little Tennessee watershed (USA) and show how the new model can reproduce the measured runoff PDF.

## 1. Introduction

Runoff is perhaps the most poorly represented process in ecohydrological stochastic soil moisture models [1]. This is a problem because runoff from extreme rainfall-runoff events creates average annual losses in excess of $2.6 billion in the United States alone [2]. The frequency of such extreme events increased in the twentieth century and this trend is predicted to continue in the twenty-first century [3,4].

The quantitative description of runoff remains a challenge because of nonlinearities (e.g. thresholds) in the watershed rainfall-runoff response. Further, landscape heterogeneities and the spatial and temporal randomness of multiple hydrologic states make prediction extraordinarily difficult [5,6]. Many attempts have been made to address these challenges, which generally involve detailed modelling of small-scale processes and explicit mapping of watershed heterogeneities (e.g. [7–10]). However, these approaches often require a large number of model parameters, leading to the problem of model over-parameterization. Most models are therefore highly calibrated to compensate for a lack of understanding of the actual watershed processes [11,5]. As multiple sets of different parameter values may yield the same model calibration result, these models have a high degree of predictive uncertainty that can result in fundamental misrepresentations of the runoff processes [5,12].

Soil moisture and its spatial and temporal variability have been recognized as primary factors controlling runoff variability (e.g. [6,13,14]). Watershed soil moisture dynamics represent a complex spatio-temporal stochastic process that responds to stochastic rainfall forcing, evapotranspiration and other deterministic losses, as well as soil type and land cover, all of which may be heterogeneous in space [15,16]. Previous descriptions of runoff variability have included some combination of stochastic rainfall, the spatial statistics of soil storage capacity or soil moisture dynamics [17–20]. However, these models have been limited to one type of runoff transformation, such as infiltration excess overland flow described by the Philip equation (e.g. [17,18]) or saturation excess overland flow (e.g. [19,20]). Further, stochastic soil moisture models have been applied primarily at the point scale without an explicit description of the runoff statistics [19,21,22]. For different spatial scales and runoff transformations, a grand challenge is to characterize the effect of multiple soil moisture states on runoff production and, thus, to find the joint probability density functions (PDFs) of soil moisture and runoff as well as rainfall and runoff. The joint PDF of runoff and soil moisture and the marginal runoff PDF, dependent on the magnitude of soil moisture fluxes, present a potentially efficient means of understanding how runoff variability is affected by seasonal, intra-annual and inter-annual changes in soil moisture fluxes such as evapotranspiration or rainfall.

Here we develop a framework to find PDFs of runoff (including joint PDFs of runoff and soil moisture) that characterize the effect of multiple soil moisture states on runoff variability. The framework is general to any rainfall-runoff transformation function and builds upon the stochastic soil moisture model of [19,21–23]. We develop a general function to represent the rainfall-runoff transformation, which was previously fixed as the rainfall excess over an upper soil moisture threshold (e.g. [19,23,24]). The model is developed for a spatially uniform soil reservoir that can be approximated by a homogeneous runoff response. A further generalization to the spatial distribution of runoff generation will be presented elsewhere [25].

The paper starts by defining the soil moisture balance for a general rainfall-runoff transformation. We then present a new procedure to find the PDFs of runoff and soil moisture (both antecedent and posterior to a rainfall event). For these PDFs, we then discuss the system crossing properties, which, in turn, are the basis of the forward equation for the soil moisture PDF. In §3, we introduce a specific rainfall-runoff transformation, defined as a soil moisture-dependent fraction of each rainfall event, which describes the saturation threshold complimented by the progressive partitioning of rainfall into runoff and infiltration. We compare and contrast model results for rainfall arrivals described by a homogeneous or non-homogeneous Poisson process and for different rainfall depth distributions, i.e. either exponential or mixed-exponential. Finally, we evaluate the framework against 63 years of streamflow data from the Upper Little Tennessee River watershed, USA.

## 2. Soil moisture and runoff model

We begin with an area where the soil is considered as a spatially uniform reservoir, with a vertically averaged soil porosity (i.e. volume of voids/volume of soil), *n*, rooting depth, *Z*_{r}, and dimensionless state variable of relative soil moisture, *s*. The soil reservoir, with a maximum storage capacity of *nZ*_{r}, is intermittently filled by random pulses of rainfall and emptied by losses due to evapotranspiration and runoff. Later in §4, we consider stormflow runoff to be generated from a spatially uniform soil reservoir that occupies a fraction of the watershed area.

The unit area soil water storage may be described by an effective soil moisture given by
*s*_{w} is the soil moisture at the plant wilting point, *s*_{1} is the upper bound of the effective reservoir beyond which rainfall inputs are rapidly lost to runoff and *s* varies between *s*_{w} and *s*_{1}. The soil moisture terms *s*, *s*_{w} and *s*_{1} represent vertically averaged values over the root zone. The maximum capacity of the effective soil water reservoir is
*x*⋅*w*_{0}.

The temporal dynamics of the effective soil moisture, *x*(*t*), follow the differential balance equation [19,21,22,26]
*F*[*x*(*t*)] (*F*[*x*(*t*)]≥0) represents continuous evapotranspiration and slow leakage losses, *R*(*t*) is the rainfall input and *H*[*x*(*t*)] represents runoff. Equation (2.3) is assumed to operate at the daily time scale, such that *R*(*t*) and *H*[*x*(*t*)] can be assumed to be instantaneous. Consequently, rainfall events are modelled as a stochastic arrival process, with time increments and rainfall depths described by appropriate PDFs to be introduced later. In this paper, differently from previous stochastic soil moisture models, runoff losses are modelled with a general function *H*[*x*(*t*)] that represents any rapid, episodic loss of water at the soil surface or subsurface. The runoff loss, *H*, and continuous loss, *F*, normalized by the maximum storage depth, *w*_{0}, are indicated by the variables *h* and *f*, respectively.

### (a) Distributions of runoff and soil moisture

The rainfall-runoff system can be described by the joint PDF of runoff, rainfall and the soil moisture before (antecedent, *x*^{−}) and after (posterior, *x*^{+}) a rainfall event, i.e.
*z*=*R*/*w*_{0} is the normalized rainfall depth. The joint PDF of equation (2.4) is found by defining the conditional PDF, *p*_{hx+|z,x−}(*h*,*x*^{+}|*z*,*x*^{−}), and the joint PDF, *p*_{zx−}(*z*,*x*^{−};*t*).

The normalized runoff is assumed to be a deterministic function of *z* and *x*^{−}, i.e.
*y*=*z*−*h*(*z*,*x*^{−}), and *x*^{−} jumps by this amount to the posterior soil moisture *x*^{+}, i.e.

Because of the deterministic relationship that *h* and *x*^{+} have with *z* and *x*^{−}, the joint PDF *p*_{hx+|zx−}(*h*,*x*^{+}|*z*,*x*^{−}) is a point mass of probability indicated by Dirac delta functions, *δ*(⋅), and can be written as
*δ*(*h*(*z*,*x*^{−})−*h*) is the conditional PDF *p*_{h|z,x−}(*h*|*z*,*x*^{−}) derived from equation (2.5), while *δ*(*x*^{−}+*z*−*h*(*z*,*x*^{−})−*x*^{+}) is the conditional PDF *p*_{x+|z,x−}(*x*^{+}|*z*,*x*^{−}) derived from equation (2.6). In contrast to a continuous distribution, a point mass indicates that only one value is possible for a given *z* and *x*^{−}. Thus, equation (2.7) states that with probability 1, *h* and *x*^{+} take the values of the r.h.s. of equations (2.5) and (2.6), respectively.

The joint PDF *p*_{zx−}(*z*,*x*^{−};*t*) represents the variability of *z* and *x*^{−} for multiple realizations of rainfall events. Here, we reasonably assume that the two random variables are statistically independent and, therefore, the joint PDF is the product
*p*_{x−}(*x*^{−};*t*) is the PDF of antecedent soil moisture. Equations (2.7) and (2.8) allow us to rewrite equation (2.4) as

From the rainfall-runoff joint PDF of equation (2.9), the PDFs for *h* and *x*^{+} are
*δ*(⋅) is performed using the property presented in appendix A. Both equations (2.10) and (2.11) are an extension of an alternative form of the change of variables theorem (e.g. [27–29]).

### (b) Crossing properties and master equation

Having described the soil moisture and runoff process during a single jump event, we now focus on the entire dynamics of the soil moisture time series trajectory described by equation (2.3). The soil moisture, *x*, increases from *x*^{−} along a discontinuous trajectory, i.e. a jump of storm infiltration, and then decreases along a continuous trajectory because of evapotranspiration/leakage. The probability of soil moisture, *x*, depends on the frequency a trajectory of soil moisture crosses the level *x* from either evapotranspiration/leakage abstraction (downcrossing) or infiltration inputs (upcrossing, see figure 2). The balance between these upcrossing and downcrossing frequencies may form the basis for constructing the master equation (i.e. the Chapman–Kolmogorov forward equation) describing the evolution of the soil moisture PDF, *p*_{x}(*x*;*t*).

To derive the master equation, we start with the upcrossing frequency, an expression of which follows from linking the distribution of antecedent values *p*_{x−}(*x*;*t*) to the soil moisture distribution *p*_{x}(*x*;*t*). The two may be linked for a non-homogeneous Poisson arrival process of rainfall, with an arrival frequency of λ(*x*;*t*). For such a process, we consider the trajectories of the system (see figure 2) over a time interval (*t*,*t*+d*t*), noting that the variable *x*^{−} represents only the values of *x* from which a jump trajectory starts. For the interval (*t*,*t*+d*t*), the probability of being in an infinitesimal interval (*x*,*x*+d*x*) conditional on a jump occurrence, i.e. *p*_{x−}(*x*;*t*) d*x*, equals the joint probability of being in (*x*,*x*+d*x*) and jumping, i.e. λ(*x*;*t*)*p*_{x}(*x*;*t*) d*x* d*t*, divided by the probability of jumping independent of the level *x*, i.e. 〈λ(*t*)〉 d*t*; this expression is [30]
*x*;*t*)=λ_{o}, and the PDF of antecedent soil moisture *p*_{x−}(*x*;*t*) equals the PDF of soil moisture *p*_{x}(*x*;*t*).

Based on equation (2.12), the frequency of jumps from an antecedent value less than *x* may be written as
*x* and so do not cross *x* is given by
*x*, i.e. 〈λ(*t*)〉*P*_{x+}(*x*;*t*), from the frequency of jumps starting from below *x*, i.e. 〈λ(*t*)〉*P*_{x−}(*x*;*t*), retrieves the upcrossing frequency of *x*, i.e.
*t*)〉 multiplied by the probability of upcrossing *x*. Note that for negative jumps, the opposite is true, and that the downcrossing frequency for a generic level *x* is *J*^{↓}(*x*;*t*)=〈λ(*t*)〉(*P*_{x+}(*x*;*t*)−*P*_{x−}(*x*;*t*)). The upcrossing frequency of equation (2.16) may finally be written in terms of the soil moisture PDF by substituting for the posterior PDF of equation (2.11) and using equation (2.12) to write *p*_{x−}(*x*;*t*) in terms of *p*_{x}(*x*;*t*), i.e.
*p*_{hx+z|x−}(*h*,*x*^{+},*z*|*x*^{−}) is

The soil moisture trajectories (see figure 2) downcross a level *x* due to the deterministic loss function *f*(*x*). The frequency of downcrossing is equal to the fraction of time spent by the process between *x* and *x*+d*x*, i.e. *p*_{x}(*x*;*t*) d*x*, divided by the time spent during a single downcrossing, i.e. d*x*/*f*(*x*), which gives [23,24,31]

Taken together, the crossing frequencies of equations (2.16) and (2.19) constitute the probability current *J*(*x*;*t*), which represents the flux of probability passing through a boundary per unit time. In our case, the probability current *J*(*x*;*t*) is the net frequency of crossing a level (or boundary) *x*, and thus equals the rate of change of the soil moisture CDF, i.e.

By substituting equations (2.17) and (2.19) for the crossing frequencies into equation (2.20) and differentiating with respect to *x*, the general master equation for the probability density of the process is obtained as
*f*(*x*), losses due to the jumps of infiltration, and gains due to the jumps of infiltration. Note the correspondence of equation (2.21) with equation (2.3), i.e. the first term represents the change due to evapotranspiration and slow leakage and the second and third terms together represent the change due to infiltration, which includes the general runoff transformation (see equation (2.18)).

The master equation (2.21) is general to both bounded and unbounded systems. In the case of soil moisture, the system is bounded by the reflecting barrier of soil saturation, i.e. *x*=1. Immediately before the bound, the upcrossing and downcrossing frequencies must be equal, i.e.
*x*=1 refers to the end of the open interval [0,1). In terms of the crossing frequencies of equations (2.17) and (2.19), equation (2.22) is
*t*)〉 (see equation (2.13)), whereas the second term is 〈λ(*t*)〉 multiplied by the probability of not reaching posterior soil saturation *P*_{x+}(*x*<1;*t*), and so
*P*_{x+}(*x*<1;*t*)) is the probability of soil saturation. Therefore, before the saturation bound, the downcrossing frequency of *f*(1)*p*_{x}(1;*t*) is equivalent to the average frequency of jumping to soil saturation, i.e. 〈λ(*t*)〉(1−*P*_{x+}(*x*<1;*t*)).

Equation (2.21) is a specific version of the general master equation which is usually written in terms of transition probabilities per unit time (e.g. [32,33]). To introduce transition probabilities here, consider that the second term on the right-hand side of equation (2.21) is equivalent to the general term for a jump away from the state *x*, i.e.
*W*(*u*|*x*;*t*) is the transition probability density per unit time for a transition to any state *u* from state *x*, while the third term is equivalent to the general term for a jump to the state *x*, i.e.

where *W*(*x*|*u*;*t*) is the transition probability density per unit time for a transition to the state *x* from any state *u*. The previous expressions imply that the instantaneous transition probability per unit time of jumping away from state *x* to any state *u* is given by
*x* from any state *u* is
*u*;*t*) multiplied by the conditional PDF *p*_{x|u}(*x*|*u*), given by

## 3. Runoff PDFs for a rainfall-runoff transformation with saturation threshold and progressive partitioning

To analyse the general framework of the previous section, we consider a novel form for the rainfall-runoff transformation of equation (2.5), *h*(*z*,*x*^{−}), that consists of a pure threshold partitioning at soil saturation (e.g. [34,23]) complemented by a pre-threshold progressive partitioning where runoff increases linearly with rainfall. The pre-threshold runoff is attenuated by a progressive partitioning fraction, *β* (where 0≤*β*≤1), that may gradually reduce the frequency of saturation excess threshold runoff, i.e.
*Θ*(0)=1 (figure 1*b*). At the saturation threshold indicated by the Heaviside step function *Θ*(⋅), runoff losses become equivalent to the excess of rainfall, *z*, minus the spare capacity (1−*x*^{−}), but before the threshold, runoff losses occur by the progressive partitioning term *zβx*^{−}. The argument of the step function describes the rainfall amount needed for soil saturation, i.e. *z*=(1−*x*^{−})/(1−*βx*^{−}) (see figure 1*b*). When *β*=0 this rainfall amount is simply the antecedent soil storage spare capacity (1−*x*^{−}), but as *β* increases, the rainfall amount required for soil saturation increases to a value greater than (1−*x*^{−}) (see figure 1). Thus, increasing *β* decreases the occurrence of saturation-threshold excess runoff.

The progressive partitioned runoff term, *zβx*^{−}, represents runoff production mechanisms that occur when the soil is not fully saturated, such as subsurface preferential flow (i.e, macropore flow) or infiltration-excess overland flow [14,35–40]. These runoff mechanisms may depend on the antecedent soil moisture condition, *x*^{−}, or the rainfall magnitude, *z* [14,35,37]. As the antecedent soil moisture increases, a series of mechanisms may allow for a greater self-organization of the soil macropores that expands the preferential flow network and thus increases the quantity of preferential flow runoff in the watershed [39,38]. Considering both mechanisms, the progressive partitioning runoff increases with both the antecedent soil moisture status and the rainfall magnitude [14,1], and the parameter *β* controls the maximum fraction of rainfall *z* partitioned to preferential subsurface flow or infiltration-excess overland flow.

### (a) Pure runoff threshold: *β*=0

When *β*=0, the runoff transformation of equation (3.1) reverts to the saturation excess mechanism of [34] where runoff is partitioned by the soil saturation threshold of *x*=1 (e.g. [23]). Immediately prior to runoff, the soil moisture is at a distance (spare capacity) of 1−*x*^{−} below the saturation threshold, and runoff losses occur only when rainfall, *z*, exceeds this spare capacity, i.e.
*Θ*(⋅) sets runoff equal to zero when rainfall does not exceed the spare capacity (figure 1*a*). The behavior of soil moisture and runoff may be observed in the series of trajectories in figure 2 (i.e. traces of the soil moisture state *x* and runoff *h*). Note the two distinct cases (figure 2): when *z*>1−*x*^{−}, i.e. the posterior soil moisture is at the saturation threshold of *x*^{+}=1, the rainfall is partitioned between infiltration and runoff, otherwise the rainfall is partitioned as only infiltration that increases the antecedent soil moisture to the posterior soil moisture value.

The continuous joint PDF *p*_{hx+zx−}(*h,x*^{+},*z,x*^{−};*t*) is found by substituting equation (3.2) for the function *h*(*z*,*x*^{−}) in equation (2.9), i.e.
*Θ*(⋅)=0 and *Θ*(⋅)=1. Each event corresponds to a different probability density term from the right-hand side of equation (3.3), and because the events are disjoint,
*Θ*(⋅)=0 and the second term is for *Θ*(⋅)=1. Since the PDF may be integrated to a CDF, any Heaviside step functions are right continuous, i.e. *Θ*(0)=1.

Following equations (2.10) and (2.11), integrating *p*_{hx+zx−}(*h*,*x*^{+},*z*,*x*^{−};*t*) over *x*^{+}, *x*^{−}, and *z* retrieves the marginal distributions of runoff, i.e.
*h*, *x*^{−} and *z* retrieves the marginal distribution of posterior soil moisture, i.e.
*P*_{h}(*h*=0;*t*) or *P*_{x+}(*x*^{+}=1;*t*)), while the other terms represent distributions for either runoff, *h*, or posterior soil moisture, *x*^{+}. When the posterior soil moisture is below saturation, runoff must be zero, and so the discrete probability of zero runoff is equivalent to the total probability of the continuous part of the posterior soil moisture distribution on [0,1). Similarly, when runoff occurs, the soil is at saturation, and so the discrete probability of posterior soil saturation is equivalent to the total probability of the continuous part of the runoff distribution on

The integrands of the respective equations for each PDF are different because each represents a different joint distribution. The PDFs of the joint distributions of runoff and rainfall, *p*_{hz}(*h*,*z*;*t*), and posterior soil moisture and rainfall, *p*_{x+z}(*x*^{+},*z*;*t*), are simply equations (3.5) and (3.7) not integrated by *z*, i.e.
*p*_{hx−}(*h*,*x*^{−};*t*), and posterior soil moisture and antecedent soil moisture, *p*_{x+x−}(*x*^{+},*x*^{−};*t*), are equations (3.6) and (3.8) not integrated by *x*^{−}, i.e.

For the rainfall-runoff function of equation (3.2), the general master equation for the probability density of the process defined by equation (2.21) is
*p*_{x+}(*x*) of equation (3.7) with the antecedent PDF *p*_{x−}(*x*;*t*) substituted with the expression of equation (2.12) that contains the soil moisture PDF *p*_{x}(*x*;*t*). Note equation (3.13) is equally valid when the last two terms are instead represented by the PDF *p*_{x+}(*x*;*t*) of equation (3.8). The four terms on the right-hand side, respectively, represent the contributions of probability density from the deterministic drift function *f*(*x*), losses due to the jumps of infiltration, gains due to the jumps of infiltration and contributions from the discrete probability of posterior soil moisture at the saturation threshold.

For continuity immediately before the bound of *x*=1 on the interval [0,1), the crossing frequency continuity condition of equation (2.24) is equivalent to the first three terms of the right-hand side of equation (3.13) at *x*=1, but without the fourth term that exists on the bound and outside the interval [0,1), i.e.
*P*_{x+}(*x*<1;*t*) is equivalent to the probability of no runoff *P*_{h}(*h*=0;*t*). Since (1−*P*_{h}(*h*=0;*t*)) is the probability of a runoff event, the downcrossing frequency from soil saturation of *f*(1)*p*_{x}(1;*t*) is equivalent to the average frequency of runoff events of 〈λ(*t*)〉(1−*P*_{h}(*h*=0;*t*)).

#### (i) Exponential distribution of jumps with state-dependent frequency

Of the various distributions considered for the rainfall amounts, the exponential distribution is special because of its memoryless property. For the soil moisture system, the normalized form is given by
*γ*=*w*_{0}/*α* is the inverse of the average rainfall depth, *α*, normalized by the storage depth, *w*_{0}. Substituting this exponential jump distribution into equation (2.21), assuming λ=λ(*x*), retrieves

The master equation of equation (3.16) coincides with the master equation formulated by [22] when λ(*x*) is constant, i.e. λ(*x*)=λ_{o}. For steady-state conditions, the general solution for the process bounded at *x*=1 is
*Θ*(⋅) is the Heaviside step function and *C* is a normalization constant (see appendix B).

An example of an explicit solution can be found for the special case of linear drift, i.e. *f*(*x*)=*kx*, and linear state-dependent arrivals, i.e.
*Γ*(⋅) is the gamma function, *Γ*(⋅,⋅) is the incomplete gamma function and *k* is a maximum loss rate (for evapotranspiration) normalized by the storage depth *w*_{0}, and λ_{o} and *σ* are frequencies of rainfall arrival. The parameter *σ* does not alter the mathematical form of the solution but only affects the scale parameter. For *σ*=0, the solution reverts to that of [21]. Also, note that the magnitude of *k* reflects the intensity of evapotranspiration for the time period of interest and, thus, an ecohydrological control on the system. Higher evapotranspiration will shift *p*_{x}(*x*) to lower values and consequently decrease the frequency and quantity of runoff.

Based on equation (3.17), the steady-state runoff PDF from either equations (3.5) or (3.6) is
*C*_{1} is
*C*_{2}=1−*C*_{1}. By the memoryless property of the exponential distribution used for rainfall, the continuous distribution of runoff (first term of equation (3.21)) is also exponential. The steady-state distribution of posterior soil moisture from either equations (3.7) or (3.8) is

The effect of state-dependent arrivals on the resulting distributions is explored in figure 3 by changing the value of the parameter *σ* in equation (3.18). For a hypothetical case where the frequency of rainfall arrivals increases with increasing soil moisture, i.e. *σ*>0, the values of antecedent soil moisture are more likely to be greater than the values of soil moisture, as indicated by the modes of the respective distributions *p*_{x}(*x*) and *p*_{x−}(*x*^{−}) (figure 3*a*). With the opposite condition of the frequency of rainfall arrivals decreasing with increasing soil moisture, i.e. *σ*<0, the values of antecedent soil moisture are now more likely to be less than the values of soil moisture, as indicated by the modes of the respective distributions *p*_{x}(*x*) and *p*_{x−}(*x*^{−}) (figure 3*b*). Posterior soil moisture saturation and hence runoff are more common when rainfall arrivals increase with increasing soil moisture (figure 3*a*) versus when rainfall arrivals are more likely with decreasing soil moisture (figure 3*b*).

#### (ii) Mixed-exponential jump distribution

The shape of the jump distribution, especially the tails, significantly affects the PDFs of posterior soil moisture and runoff losses. A mixed-exponential distribution of rainfall jumps may allow for a better overall fit to the data through multiple parameter values. These multiple parameters may increase the strength of the tails of the distribution (e.g. [42,43]) and may capture a simultaneous shift of probability density to smaller and larger values which could occur in the future due to an intensification of the water cycle [3,4]. For a mixture of two exponential distributions, the normalized form of the rainfall distribution is given by
*ω*_{1} and *ω*_{2}=1−*ω*_{1}, *γ*_{1}=*α*_{1}/*w*_{0}, *γ*_{2}=*α*_{2}/*w*_{0}, and the average rainfall is *α*=*α*_{1}*ω*_{1}+*α*_{2}*ω*_{2}.

For a linear deterministic drift of *f*(*x*)=*kx* and a linear frequency of arrivals per equation (3.18), the steady-state solution to the master equation is [44]
*C*_{3} is the normalization constant (see appendix B). The solution is a truncated gamma distribution modulated by a confluent hypergeometric function of the first kind, i.e. _{1}*F*_{1}(⋅) [45]. In this case, it is not possible to find an analytical solution for the distributions of runoff *p*_{q}(*q*) and posterior soil moisture *p*_{x+}(*x*^{+}); however, equations (3.5) and (3.7) (or (3.6) and (3.8)) may be integrated numerically to find these distributions.

The heavy tail of the mixed exponential distribution represents a greater likelihood of extreme rainfall events (inset of figure 4*d*). For a relatively dry climate (λ_{o}=0.1 d^{−1} and *α*=1 cm), in comparison to the exponential distribution, the mixed exponential distribution increases the probability of posterior soil saturation (figure 4*a*) and runoff losses (figure 4*b*). For a wetter climate (λ_{o}=0.1 d^{−1} and *α*=2.5 cm), in comparison to the exponential distribution, the heavier tailed mixed exponential distributions lowers the probability of posterior soil moisture (figure 4*c*) and runoff losses (figure 4*d*).

### (b) Runoff threshold with a progressive partitioning: *β*≠0

For the general case of the rainfall-runoff transformation of equation (3.1) when *β*≠0, the behaviour of soil moisture and runoff is shown in the series of trajectories in figure 5. In contrast to the pure threshold case, runoff events occur according to a Poisson process with the same frequency as rainfall events; therefore, there is only an atom of probability for posterior soil saturation and not for zero runoff. For each rainfall event, there are two cases (figure 5): (1) a linear progressive partitioning of rainfall between infiltration, *y*, and runoff, *h*, when infiltration is less than the spare capacity, 1−*x*^{−}, and (2) at soil saturation, a threshold partitioning of rainfall into runoff where *h* is equal to rainfall, *z*, minus the spare capacity amount, 1−*x*^{−}.

The continuous joint PDF *p*_{hx+zx−}(*h*,*x*^{+},*z*,*x*^{−};*t*) is found by substituting equation (3.1) for *h*(*z*,*x*^{−}) in equation (2.9), i.e.
*Θ*(⋅)=0 and *Θ*(⋅)=1. Each disjoint event corresponds to a different probability density term from equation (3.26), i.e.
*Θ*(⋅)=0 and the second term is for *Θ*(⋅)=1.

Following equation (2.10), integrating the continuous joint PDF *p*_{hx+zx−}(*h*,*x*^{+},*z*,*x*^{−};*t*) of equation (3.27) over *x*^{+}, *x*^{−}, and *z* using the properties of appendix A retrieves the runoff PDFs, i.e.
*z*=(1−*x*^{−})/(1−*βx*^{−}) and *x*^{−}=(1−*z*)/(1−*βz*), and substituting these values into equation (3.1) results in the respective equations *h*=*z*−1+(1−*z*)/(1−*βz*) and *h*=*x*^{−}−1+(1−*x*^{−})/(1−*βx*^{−}). These equations may be solved in terms of *z* or *x*^{−} to find the upper and lower bounds of equations (3.28) and (3.29), respectively, and the maximum of either equation, i.e.

Following equation (2.11), integrating the joint PDF *p*_{hx+zx−}(*h*,*x*^{+},*z*,*x*^{−};*t*) over *h*, *x*^{−} and *z* retrieves the expressions for the PDFs of posterior soil moisture, i.e.
*p*_{hz}(*h*,*z*;*t*) and posterior soil moisture and rainfall, *p*_{x+z}(*x*^{+},*z*;*t*), are simply equations (3.28) and (3.30) not integrated by *z*, while the joint PDFs of runoff and antecedent soil moisture, *p*_{hx−}(*h*,*x*^{−};*t*), and posterior soil moisture and antecedent soil moisture, *p*_{x+x−}(*x*^{+},*x*^{−};*t*), are equations (3.29) and (3.31) not integrated by *x*^{−}.

When rainfall events arrive as a homogeneous Poisson process at frequency λ(*x*;*t*)=λ_{o} (i.e. *σ*=0) the general master equation (2.21) is
*p*_{x+}(*x*) of equation (3.31) multiplied by the constant rate of the homogeneous Poisson process, λ_{o}, where by equation (2.12) for a homogeneous Poisson process *p*_{x}(*x*;*t*)=*p*_{x−}(*x*;*t*). For a mixed exponential distribution of rainfall, *p*_{z}(*z*) (see equation (3.24)), the steady-state master equation (3.32) is
*γ*_{1}(*x*^{−})=*γ*_{1}/(1−*βx*^{−}) and *γ*_{2}(*x*^{−})=*γ*_{2}/(1−*βx*^{−}), and the integral term consists of the soil moisture PDF, *p*_{x}(*x*), convolved with either the infiltration PDF *p*_{y}(*y*)=*γ*_{1}(*x*^{−}) *e*^{−γ1(x−)y} or the infiltration PDF *p*_{y}(*y*)=*γ*_{2}(*x*^{−}) *e*^{−γ2(x−)y}. For linear drift, i.e. *f*(*x*)=*kx*, and the assumption that *γ*_{1}(*x*^{−}) and *γ*_{2}(*x*^{−}) are independent of the convolution, i.e. the convolution dummy variable is *x** (see appendix B), an approximate solution can be found as (see equation (B11) of appendix B)
*C*_{4} normalizes the PDF, and the solution has a dependence on the antecedent soil moisture value through *γ*(*x*^{−}). Closing the solution of equation (3.34) requires approximating *γ*(*x*^{−}) in terms of *x*. For *γ*(*x*^{−}) we assume that *x*^{−} is proportional to *x*, i.e.
*υ* is a constant fraction. Because *γ*(*x*^{−}) is from the master equation term for λ_{o}*P*_{x+}(*x*) (see equations (2.15), (2.20) and (2.21)), the relationship between *x*^{−} and *x* is actually symbolic of a relationship between *x*^{−} and the posterior value, *x*^{+}, and the linear relation of equation (3.35) is based on the previous figures where *p*_{x−}(*x*^{−}) roughly appears to be a linear rescaling of *p*_{x+}(*x*^{+}). After substituting equation (3.35) into equation (3.34), we find the value of *υ* such that at the bound of *x*=1, the downcrossing frequency equals the frequency of soil saturation (see equation (2.24)), i.e.

For a mixed exponential distribution of rainfall, the solution consists of equations (3.34) and (3.35), where for each value of *β*, we calculate a unique value of *υ* that satisfies the boundary condition of equation (3.36). For the soil moisture PDF, *p*_{x}(*x*), we favourably compare the approximate analytical solution of equations (3.34) and (3.35) (figure 6, dotted-dashed line) against the exact soil moisture PDF constructed from the numerical results of a Monte Carlo simulation (figure 6, histogram bars). The approximate solution captures the shape of the exact PDF of the system (figure 6, histogram), but with a relatively small bias that decreases as the value of *β* decreases to *β*=0 where the solution becomes the exact solution of equation (3.25). The favourable comparison to the histogram values shows that *γ*(*υ* *x*) is a reasonable approximation of *γ*(*x*^{−}).

For the solution of the soil moisture PDF of equations (3.34) and (3.35), the resulting PDFs of posterior soil moisture *p*_{x+}(*x*^{+}) (black solid line) found from equation (3.30) or (3.31), and runoff *p*_{q}(*q*) (dashed line) found from equation (3.28) or (3.29) are shown for values of *β*=0.3 and *β*=0.7 (figure 6). The system reverts to the pure threshold case for a progressive partitioning strength of *β*=0. As the progressive partitioning strength increases, i.e. *β* increases, the infiltration input decreases and the mode of the soil moisture and posterior soil moisture distributions shifts to lower soil moisture values. Concurrently, there is a decrease in the discrete probability of soil saturation, i.e. *x*^{+}=1 (bar), and the mode of the runoff distribution shifts to a higher value of runoff (figure 6).

## 4. Stormflow response model

Streamflow can be separated into a slow baseflow component (fed by deep percolation and groundwater) and a fast stormflow component (fed by direct runoff from the surface and shallow soil layers). Under the assumptions of our model, *h* refers to stormflow with a relatively fast response time. It is reasonable to assume that the majority of stormflow runoff is produced on a fraction, *ζ*, of the total watershed area, or the so-called source area (e.g. [34,46,47]). Therefore, stormflow, *q*, which is observed at the watershed scale, is the modelled runoff scaled by the source area fraction,
*ζ*, does not produce runoff that contributes to stormflow. For the stormflow model, we assume that the rainfall-runoff transformation is given by equation (3.1) for *β*≠0, and thus the stormflow PDFs are equations (3.28) and (3.29) of §3b, transformed by the change of variables technique (e.g. [48]), which results in substituting *q*/*ζ* for *h* and scaling the PDFs by the factor *ζ*^{−1}. We now apply the results obtained in §3 to model observed stormflow in the Upper Little Tennessee watershed. For the data comparison, the parameters *ζ* and *β* are calibrated and the remaining parameters are derived from data measurements related to soil properties.

### (a) Site description and stormflow separation

Daily runoff data were obtained for the Upper Little Tennessee watershed (United States Geological Survey (USGS) gauge 03500000 near Prentiss, NC with a contributing area of 363 *km*^{2}). Daily rainfall was extracted from ‘Climate Station 01’ at the Coweeta Hydrologic Laboratory. We used HYSEP [49,50] to separate the streamflow into baseflow and stormflow. Regional soils data (table 1) were used to define the soil moisture loss rate, *k*, and the maximum water storage, *w*_{0}, which is dependent on the soil porosity, *n*, minimum soil moisture, *s*_{w}, maximum soil moisture, *s*_{1}, and rooting depth *Z*_{r}.

### (b) Model validation

We compare our probabilistic model of *q* to 63 years of rainfall and stormflow data between 1 January 1950 and 31 December 2012 (figure 7). The data represent storm totals of rainfall and stormflow runoff. Based on the hydrograph separation, we assume that a single storm may consist of consecutive daily data values of rainfall and stormflow up to 3 days. The 23 011 days of data yielded 4567 rainfall-runoff events with average unit area rainfall and runoff of 24.8 and 3.3 mm, respectively (figure 7). The transformed §3b runoff PDFs, i.e. *p*_{q}(*q*)=*ζ*^{−1}*p*_{h}(*q*/*ζ*), require a rainfall PDF, and we assume a mixed exponential PDF, which best fit the data (figure 8*a*). For the mixed exponential PDF of rainfall, the soil moisture PDF is given by equations (3.34) and (3.35), and the parameters are listed in table 1. Note that the mixed exponential distribution of rainfall is a normalized version for which the parameters are *γ*_{1}=*α*_{1}/*w*_{0} and *γ*_{2}=*α*_{2}/*w*_{0} (see equation (3.24) and table 1). Model calibration consists of selecting different values of *β*, calculating the *ζ* that sets the theoretical mean runoff equal to the data mean (see table 1), and then selecting the optimal pair of *β* and *ζ* that provide the overall best fit between the theoretical and data runoff quantiles of figure 8*b*. In this case, the model calibration resulted in the lowest root mean square error (RMSE) of 0.0068 and the highest Nash–Sutcliffe efficiency (NSE) of 0.976 between the data and theoretical model quantiles of normalized runoff, *q*, where the correlation coefficient *R*^{2} is 0.99 (see figure 8*b*).

The joint rainfall-runoff distribution for the stormflow model, *p*_{qz}(*q*,*z*), is the transformed integrand of equation (3.28), i.e. =*ζ*^{−1}*p*_{hz}(*q*/*ζ*,*z*), and this joint PDF compares favourably to the data based on the goodness of fit between the model and data quantiles (figure 8*a*,*b*). For the quantiles of the model marginal PDFs, similar data quantiles fall near the 1 : 1 line, which indicates an exact match between the model PDFs and the data. In addition, the quantiles are within the 95% confidence bands given by the Kolmogorov–Smirnov statistic. Both the theoretical and data distributions of runoff (figure 8*c*) show a high density of smaller runoff amounts that are captured by the progressive partitioning term *zβx*^{−}, while the tails of the distribution represent larger runoff amounts from rainfall exceeding the saturation threshold of the soil (figure 8*c*,*d*). The log plot of the stormflow distribution shows a kink near *q*=0.04 (figure 8*d*), where the system transitions from progressive partitioning to threshold excess runoff. Beyond this kink, the goodness of fit in the tails shows that threshold excesses well describe runoff production in the watershed for large runoff events except for extreme values (figure 8*b*,*d*). About six extreme rainfall-runoff events, i.e. 0.13% of the data, significantly deviate from the model (figure 8*b*,*d*). Since the mixed exponential distribution under represents large rainfall quantities (see figure 8*a*), alternative rainfall distributions may allow the model to better represent large runoff quantities.

The high density of progressive partitioned values (figure 8*c*) correlates with the large number of rainfall-runoff data clustered in the region below the theoretical boundary that defines the maximum value of progressive partitioned runoff (figure 7, red large-dashed line). Beyond this boundary, the rainfall-runoff transformation of equation (3.1) for average soil moisture (figure 7, green small-dashed line) transitions to runoff production by rainfall in excess of the soil saturation threshold. While this rainfall-runoff response has been observed at the watershed scale (e.g. [53]), here we have given it the quantitative form of equation (3.1). This transformation of equation (3.1), which is normalized to a watershed area basis by *ζ*, may be viewed as a new type of runoff curve, i.e. it serves the same function as the SCS curve. However, here the standard deviation (SD) of the soil moisture PDF of equation (3.34) provides the range of the transformation between wet and dry states of soil moisture (figure 7).

The joint distributions of stormflow runoff and rainfall, *p*_{qz}(*q*,*z*), and stormflow runoff and soil moisture, *p*_{qx−}(*q*,*x*^{−}), are the respective integrands of the runoff PDFs of either equation (3.28) and (3.29), respectively, again transformed by *ζ*. In both cases, the dependence structure of these joint PDFs is set by the rainfall-runoff transformation of equation (3.1). We further compare the data to the dependence structure in the joint PDF *p*_{qz}(*q*,*z*)=*p*_{q|z}(*q*|*z*)*p*_{z}(*z*) by examining the dependency on *z* for the integrand term representing the conditional distribution *p*_{q|z}(*q*|*z*) that is given by

where *p*_{x−}(⋅) is the soil moisture PDF of equation (3.34). This distribution has a term for progressive partitioned stormflow before the soil saturation bound indicated by *ζ*(*z*−1+((1−*z*)/(1−*βz*))*Θ*(1−*z*)) and a term for threshold excess loss after the saturation bound. As *z* increases, both the data and theoretical CDFs of *P*_{q|z}(*q*|*z*) shift to higher values of runoff (figure 9*a*), so equation (3.1) estimates the general trend of the dependence structure of the rainfall-runoff data. For values *U*_{i}=*P*_{q|z}(*q*_{i}|*z*_{i}) calculated from data points (*q*_{i},*z*_{i}), a CDF of *P*_{U}(*U*) following a uniform distribution indicates goodness of fit between the data and theoretical CDFs of *P*_{q|z}(*q*|*z*). For rainfall greater than the average, equation (3.1) well predicts the data dependence structure since the CDF *P*_{U}(*U*) follows the 1:1 line (figure 9*c*). Conversely, for rainfall less than the average (figure 9*b*), the model dependence structure well predicts the median runoff of the data since the CDF *P*_{U}(*U*) passes near the point (0.5, 0.5). While in figure 9*b* the dependence structure of the model both overpredicts small runoff values (where *P*_{U}(*U*) is above the 1 : 1 line) and underpredicts large runoff values (where *P*_{U}(*U*) is below the 1 : 1 line), the magnitudes of the deviations are relatively small, as may be observed in figure 9*a*. For example, for *z*=0.5〈*z*〉 (figure 9*a*), the difference on the *q* axis between the theoretical and data CDF, Δ*q*, is typically around Δ*q*=0.004, which amounts to an actual runoff difference of 0.6 mm based on the *w*_{0} of table 1. For rainfall events less than the average, the correspondingly small runoff amounts typically only deviate by a relatively small magnitude as observed on the Q–Q plot (figure 8*b*). Understanding the discrepancy between the dependence structures of the model and data CDFs (figure 9*a*,*b*,*c*) is a topic for future work.

### (c) Antecedent soil moisture control on storm exceedance probability

Having examined the dependence structure of equation (3.1), we are now able to explore the relation between stormflow runoff and soil moisture, which is contained in the joint PDF *p*_{qx−}(*q*,*x*^{−})=*p*_{q|x−}(*q*|*x*^{−})*p*_{x−}(*x*^{−}). This PDF is the integrand of equation (3.29) transformed by *ζ*, and the dependence on *x*^{−} is described by the conditional PDF *p*_{q|x−}(*q*|*x*^{−}), i.e.
*p*_{z}(⋅) is the mixed exponential distribution of equation (3.24). This conditional distribution consists of a progressive partitioning term before the saturation bound and a threshold excess term after the bound. In areas where the soil moisture is monitored, the conditional PDF of equation (4.3) may help to assess the effect of soil moisture on the risk of storm runoff. The increased runoff risk from an increase in the soil moisture status may be observed through the exceedance probability, 1−*P*_{q|x−}(*q*|*x*^{−}), where *P*_{q|x−}(*q*|*x*^{−}) is the CDF of equation (4.3), and the soil moisture state is a significant control for the exceedance probability (figure 9*d*). For example for *q*=0.1, the probability that a storm will exceed this runoff amount is about 20% when the antecedent soil moisture is at *x*^{−}=1, but is near 0% when the antecedent soil moisture is at *x*^{−}=0.1 (figure 9*d*). Thus, the exceedance probability quantifies the increased runoff risk from an increase in the soil moisture status, a relationship which has been observed but not concisely characterized by previous studies (e.g. [6,54]).

## 5. Concluding remarks

We have presented the first probabilistic framework for determining the runoff PDFs that are linked to a simple model of the soil moisture balance forced by stochastic rainfall. Thus, the derived analytical forms of the joint PDFs for both runoff and rainfall, *p*_{hz}(*h*,*z*;*t*), and runoff and antecedent soil moisture, *p*_{hx−}(*h*,*x*^{−};*t*), explicitly account for the effects of multiple soil moisture states. Based on a new progressive partitioning rainfall-runoff transformation that accounts for runoff generation in unsaturated soils, we were able to successfully apply the framework to create a probabilistic stormflow runoff model with PDFs that reproduces the measured rainfall-runoff response for 363 *km*^{2} of the Upper Little Tennessee River watershed. The stormflow model relies on only two calibration parameters, the fraction of watershed area producing stormflow runoff, *ζ*, and the progressive partitioning fraction, *β*, which controls the frequency of threshold initiated stormflow events. While low parameter models are known for their predictability, here we have also demonstrated their suitability for creating stochastic descriptions consisting of runoff PDFs, including joint PDFs of rainfall and runoff as well as soil moisture and runoff. For this watershed, the model allows an explicit link between the antecedent soil moisture state and runoff variability, which we summarize with the stormflow runoff event return period.

The soil moisture dependence of the joint runoff PDFs paves the way for a statistical characterization of the effect of ecohydrological processes on runoff dynamics from cumulative threshold excesses from hillslope to watershed scales. Extending the framework to multiple soil layers may allow for a representation of threshold initiated hillslope preferential flow, which may contribute to stormflow for more extreme rainfall events. In addition, a better characterization of extreme rainfall events may be achieved by different approximations of the rainfall PDF, e.g. a mixed PDF with a heavier tailed distribution for extreme rainfall events (e.g. Pareto PDF) that is combined with a exponential PDF for smaller rainfall events. We anticipate that the runoff PDFs, including the joint PDF of runoff and soil moisture, may aid in the analysis of how runoff variability may be affected by seasonal, intra-annual, inter-annual, and climatic changes in the ecohydrology of the soil moisture system.

## Ethics

This research poses no ethical considerations.

## Data accessibility

Processed data and code are available by email from the corresponding author.

## Author' contributions

M.S.B., A.P. and J.J.M. conceived the study. M.S.B. created the model, analysed the data and drafted the manuscript. E.D. assisted M.S.B. in validating the model. A.J.P. assisted M.S.B. in interpreting data. All authors revised the manuscript and gave final approval for publication.

## Competing interests

We have no competing interests.

## Funding

This work was supported by the Agriculture and Food Research Initiative of the USDA National Institute of Food and Agriculture (2011-67003-30222); National Science Foundation (grants CBET-1033467, EAR-1331846, FESD-1338694 and EAR-1316258); and by the US Department of Energy (DOE) Office of Biological and Environmental Research (BER) Terrestrial Carbon Processes Program (DE-SC0006967).

## Acknowledgements

We thank Gaby Katul and Larry Band for early discussions. Thanks to Duke University and UNC Chapel Hill for the Keohane Professorship support of J.J.M. during the preparation of this manuscript. Data sets were provided by the Climate and Hydrology Database Projects, a partnership between the Long-Term Ecological Research program and the US Forest Service Pacific Northwest Research Station, Corvallis, Oregon. Significant funding for these data was provided by the National Science Foundation Long-Term Ecological Research program and the USDA Forest Service.

## Appendix A. Dirac delta function property

When integrating equations (2.10) and (2.11), the Dirac delta functions of equation (2.9) often cannot be evaluated explicitly using the sifting property but may be evaluated (in this case for the variable of integration *x*^{−}) with the property
*g*(*x*^{−}) is the function nested within the delta function, i.e. *g*(*x*^{−})=*h*(*z*,*x*^{−})−*h* or *g*(*x*^{−})=*x*^{−}+*z*−*h*(*z*,*x*^{−})−*x*^{+}, the summation is for all the roots *x*_{n} where *g*(*x*_{n})=0, and *g*^{′}(⋅) is the derivative of the function with respect to *x*^{−} (e.g. [29]).

## Appendix B. Solution of the forward equation

For a general case where the system state *x* is forced by a *m* number of exponentially distributed jump variables *z*_{i} (with PDFs _{i}(*x*_{i}) and rainfall-runoff transformation of *h*(*z*,*x*^{−}) of equation (3.1) of §3, the master equation of equation (2.21) of §2b is
*ω*_{i} *γ*_{i}, which is the inverse of the normalized average rainfall depth *w*_{0}/*α*_{i}. In the case of a pure threshold of §3a,

When each jump variable *z*_{i} is considered separately, we may construct a master equation for each state variable *x*_{i}, which in steady state is
*x*<1 the atom of probability represented by the delta function disappears, i.e.
*p*_{yi}(*y*_{i}) on *x*^{−} will produce a relatively small change in *x*_{i} provides
*C* normalizes the PDF, and this solution is exact when *p*_{xi}(*x*_{i}) represents a system only forced by the specific exponentially distributed random variable *z*_{i} arriving at frequency λ_{i}(*x*_{i}). For only a single exponentially distributed random variable with a constant parameter *γ* with the weight *ω*_{1}=1, equation (B7) is equal to the exact solution of equation (3.17) of §3a. The specific solution of equation (B7) for the linear rainfall arrival process of equation (3.18) of §3a and linear drift, i.e. *f*(*x*_{i})=*kx*_{i}, is
*γ* with the weight *ω*_{1}=1, equation (B8) is equal to equation (3.19) of §3a. For a system forced by multiple exponentially distributed random variables, *z*_{i}, the system solution is a summation of random variables, i.e. *x* is a convolution of different distributions with the form of equation (B8).

For the convolution of two distributions with the respective weights *ω*_{1} and 1−*ω*_{1}, the convolution integral is
*x*_{1}=*sx* and rearrangement of the terms equation (B9) may be written as
_{1}*F*_{1}(⋅) is the confluent hypergeometric function of the first kind, and *Γ*(⋅) is the gamma function. When equation (13.1.27) of [45] is applied, equation (B11) is equal to equation (3.25) of §3a where *γ*_{1}(*x*^{−})=*γ*_{1} and *γ*_{2}(*x*^{−})=*γ*_{2}, as well as to equation (3.34) of §3b where *γ*_{1}(*x*^{−})=*γ*_{1}/(1−*βx*^{−}) and *γ*_{2}(*x*^{−})=*γ*_{2}/(1−*βx*^{−}). In the case where *γ*_{1}(*x*^{−})=*γ*_{1} and *γ*_{2}(*x*^{−})=*γ*_{2}, the solution is exact; however, for all other cases, the solution requires closure through an approximation such as the one presented by equation (3.35) of §3b.

- Received June 10, 2015.
- Accepted September 29, 2015.

- © 2015 The Author(s)