## Abstract

Odlyzko has computed a dataset listing more than 10^{9} successive Riemann zeros, starting from a zero number beyond 10^{23}. This dataset relates to random matrix theory since, according to the Montgomery–Odlyzko law, the statistical properties of the large Riemann zeros agree with the statistical properties of the eigenvalues of large random Hermitian matrices. Moreover, Keating and Snaith, and then Bogomolny and co-workers, have used *N*×*N* random unitary matrices to analyse deviations from this law. We contribute to this line of study in two ways. First, we point out that a natural process to apply to the dataset is to thin it by deleting each member independently with some specified probability, and we proceed to compute empirical two-point correlation functions and nearest neighbour spacings in this setting. Second, we show how to characterize the order 1/*N*^{2} correction term to the spacing distribution for random unitary matrices in terms of a second-order differential equation with coefficients that are Painlevé transcendents, and where the thinning parameter appears only in the boundary condition. This equation can be solved numerically using a power series method. Comparison with the Riemann zero data exhibits accurate agreement.

## 1. Introduction

The celebrated Riemann hypothesis asserts that all the non-trivial zeros of the Riemann zeta function *ζ*(*s*) are on the so-called critical line, Re *s*=1/2. We refer to the non-trivial zeros of *ζ*(*s*) on the critical line as Riemann zeros, and restrict attention to those in the upper half plane—those in the lower half plane are obtained by complex conjugation. There is an intriguing relationship between the Riemann zeros and random matrix theory. It originates from the viewpoint that the Riemann zeros, for distances far up the critical line, are best considered in a statistical sense rather than as a deterministic sequence. The first step in a statistical analysis is to scale the sequence so that locally the density of the zeros far up the critical line is unity. This is straightforward from knowledge of the fact that at position *E*≫1 along the critical line, the density of zeros is, to leading order,

$$\rho(E) = \frac{1}{2\pi} \log \frac{E}{2\pi}. \tag{1.1}$$

A statistical quantity of fundamental interest is then the density of zeros a given scaled distance from a fixed zero near position *E*, averaged over a window (*E*, *E*+Δ*E*) with Δ*E*≪*E*. This quantity, referred to as the *two-point correlation function*, is probed by computing the average value of a test function *f*(*E*−*E*_{0}), with *E*_{0} corresponding to the fixed zero. Subject to the technical assumption that the class of test function used has Fourier transform supported on |*k*|<1, the limiting two-point correlation function was proved by Montgomery to be equal to

$$1 - \left(\frac{\sin \pi s}{\pi s}\right)^{2}. \tag{1.2}$$
To make contact with random matrix theory, let *X* be an *N*×*N* random matrix with standard complex Gaussian entries, and form the Hermitian matrix

$$H = \tfrac{1}{2}\left(X + X^{\dagger}\right),$$

which specifies the Gaussian unitary ensemble (GUE). For large *N*, the eigenvalues of *H* in the bulk of the spectrum, scaled so that their local density is unity, have limiting two-point correlation function again given by (1.2).

This coincidence, first observed by Dyson (e.g. [5], p. 159), is in keeping with and extends the so-called *Hilbert–Pólya conjecture* [6] that the Riemann zeros correspond to the eigenvalues of some unbounded self-adjoint operator. Considerations in quantum chaos (e.g. [7,8]) give evidence that generically the large eigenvalues of such operators have local statistics coinciding with the eigenvalues of large random Hermitian matrices. The Hermitian matrices must be constrained to have real entries should there be a time reversal symmetry. Thus this line of reasoning suggests that asymptotically the Riemann zeros have the same local statistical properties as the large eigenvalues of a chaotic quantum system without time reversal symmetry. This assertion is essentially a statement of what is now called the Montgomery–Odlyzko law or the GUE conjecture (see also [5,9–12]).

At an analytic level, further evidence for the validity of the law was given by Rudnick & Sarnak [13], who extended Montgomery’s result to the general *k*-point correlation function by obtaining the explicit functional form

$$\rho_{(k)}(x_1,\dots,x_k) = \det\left[\frac{\sin \pi(x_j - x_l)}{\pi(x_j - x_l)}\right]_{j,l=1,\dots,k}. \tag{1.3}$$

The *k*-point correlation (1.3) is precisely that for the eigenvalues of large Hermitian random matrices in the bulk scaling limit ([4], prop. 7.1.1). At a numerical level, Odlyzko has made high-precision computations for the 10^{20}th Riemann zero and over 70 million of its neighbours [23], and (later) for the 10^{22}nd zero and 1 billion of its neighbours [24]. (The zeros on the critical line are conventionally numbered in order of their distance from the real axis, with the first zero at Im *s*≈14.134725 [25], A058303.) These data exhibit consistency with random matrix theory for statistical quantities such as the pair correlation function outside the range known rigorously from the work of Montgomery, the distribution of the spacing between neighbouring zeros, the variance of the fluctuation of the number density in an interval and so on; for a popular account of this line of research, see [26].

Notwithstanding the extraordinary distance along the critical line achieved in Odlyzko’s calculations, it turns out that finite-size corrections to the limiting behaviour occur on the scale of the logarithm of the distance as suggested by (1.1), and thus are of significance in the interpretation of the data from the viewpoint of the random matrix predictions. In addressing the question of the functional form of the finite-size corrections from the viewpoint of random matrix theory, Keating & Snaith [27] (see also the review [28] and the later work [29]) were led to a totally unexpected conclusion: the correct model for this purpose is not complex random Hermitian matrices, but rather *N*×*N* random unitary matrices chosen with Haar measure, with *N* related to *E* so as to be consistent with (1.1). Random unitary matrices with Haar measure result, for example, by applying the Gram–Schmidt orthonormalization procedure to a matrix of standard complex Gaussians. The statistical quantities considered in [27] relate to the value distribution of the logarithm of the Riemann zeta function on the critical line, which for random matrices corresponds to the value distribution of the characteristic polynomial. Coram & Diaconis [30] subsequently used the random unitary matrix model to predict the functional form of the covariance between the number of eigenvalues in overlapping intervals of equal size. Following up on [24], where Odlyzko studies the difference between the spacing distribution for the Riemann zeros and the corresponding limiting random matrix distribution and writes: ‘Clearly there is structure in this difference graph, and the challenge is to understand where it comes from’, Bogomolny *et al.* [31] considered finite-size corrections to the spacing distribution for random unitary matrices.
While previous studies in random matrix theory gave analytic results relating to the value distribution of the characteristic polynomial [32], there is no existing literature on the analytic calculation of the leading finite-size correction to the spacing distribution for random unitary matrices. Here we are referring to the pointwise value of the expected probability density function; there is an existing literature relating to the decay, as a function of *N*, of the Kolmogorov distance between the empirical cumulative spacing distribution and its expected large *N* form [33–35]. In [31], the corrections were computed using a numerical extrapolation procedure based on a determinant expression valid for finite *N*. A primary objective of the present paper is to provide an analytic characterization of the leading finite-size correction to the spacing distribution.
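The construction of Haar-distributed unitary matrices via orthonormalization of a complex Gaussian matrix, mentioned above, is easy to carry out numerically. A minimal NumPy sketch (the function name `haar_unitary` is ours; the phase correction after the QR factorization, equivalent to fixing the Gram–Schmidt normalization, is needed for the law to be exactly Haar):

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n unitary matrix with Haar measure: QR-factorize a
    standard complex Gaussian matrix (numerically equivalent to Gram-Schmidt),
    then divide out the phases of diag(R) so the distribution is exactly Haar."""
    x = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(x)
    d = np.diag(r)
    return q * (d / np.abs(d))  # multiplies column j by the unit phase d_j/|d_j|

rng = np.random.default_rng(0)
u = haar_unitary(50, rng)
# the eigenvalue angles give one sample of the point process studied below
theta = np.sort(np.angle(np.linalg.eigvals(u)))
```

Repeating this over many samples generates the ensemble whose spacing statistics are analysed in §3.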

Variants of the nearest neighbour spacing distribution also provide relevant statistical quantities. For example, one could consider spacing distributions between zeros *k* apart (e.g. [4], §8.1), or the distribution of the closest of the left and right neighbours [36]. Another variant is to consider spacing distributions resulting from first thinning the dataset by the process of deleting each member independently with probability (1−*ξ*), 0<*ξ*<1, as first considered in random matrix theory by Bohigas & Pato [37]. In both cases, the finite-size correction to the large *N* form of the corresponding quantity for the eigenvalues of random unitary matrices can be computed analytically. This is of concrete consequence as we are fortunate enough to have had A. Odlyzko provide us with a dataset extending that reported in [24]. Specifically, this dataset begins at a zero number beyond 10^{23}, at a correspondingly large position *E* along the critical line, and contains more than 10^{9} of the subsequent Riemann zeros. Thus, we were able to compare the analytic forms with the finite-size corrections exhibited by the Riemann zeros, essentially uninhibited by sampling error.

In §2, we begin by considering the two-point correlation. Bogomolny & Keating [18] have derived an explicit formula for this quantity in the case of the Riemann zeros, in the regime that *E* is large but finite; Rodgers [38] has recently made this computation rigorous for the class of test functions with Fourier transform supported on |*k*|<1. Subsequently, Bogomolny *et al.* [31] showed how the result of [18] could be expanded for large *E* to obtain the correction term to the random matrix result (1.2). Moreover, upon (partial) resummation, the correction term was found to be identical in its functional form to the correction term of the scaled large *N* expansion of the two-point correlation for the eigenvalues of random unitary matrices chosen with Haar measure. After reviewing this result, we compare the empirical form of the two-point correlation as computed from Odlyzko’s dataset, and for the dataset thinned with parameter *ξ*=0.6, with the theoretical prediction. Note that this thinning value of *ξ*=0.6 was chosen more or less arbitrarily, with the constraints that it would produce a large enough deformation from *ξ*=1 to be visible to the eye, while still keeping enough zeros to allow us to resolve the structure from Odlyzko’s data. Very good agreement is found for distances up to approximately one-and-a-half times the average spacing between zeros, after which systematic deviation is observed.

Higher-order correlations for random unitary matrices are fully determined by the same function (see (2.1)) as determines the two-point correlation. This certainly is not literally true for the higher-order correlations of the Riemann zeros, which are not determinantal for finite *E* [17,22]. Hypothesizing as in [31] that, nonetheless, it is asymptotically true for the leading term (this is equivalent to the Montgomery–Odlyzko law) and the *first subleading term*, it follows that the coincidence between the correction term to the leading large *E* form of the two-point correlation for the Riemann zeros and the correction term to the leading large *N* form of the two-point correlation for the eigenvalues of random unitary matrices persists to all higher-order correlations.

While the *k*-point correlation is a function of *k*−1 difference variables, and so cannot be measured empirically from the Riemann zero data with accurate statistics for *k*>2, this hypothesis becomes predictive, as the coincidence must carry over to any distribution function which can be written in terms of the correlation functions, for example, the nearest neighbour spacing distribution [31]. In §3, we make two main contributions to the study of this theme. One is to test the hypothesis upon thinning of the dataset. The other is to provide an analytic determination of the correction term in the random matrix case using certain Painlevé transcendents.

The correction term to the nearest neighbour spacing is oscillatory and becomes more so as the thinning parameter *ξ* is decreased from 1. In §§3 and 4, we discuss this effect in the context of the large distance form of the Painlevé transcendents.

## 2. Two-point correlation

Correlation functions are fundamental to the theoretical description of general point processes. For definiteness, and in keeping with the setting of the Riemann zeros, we will specify that the point process is defined on a line. Starting with *ρ*_{(1)}(*x*) as the density at point *x*, the *k*-point correlations *ρ*_{(k)}(*x*_{1},…,*x*_{k}) can be defined inductively by the requirement that the ratio

$$\frac{\rho_{(k)}(x_1,\dots,x_k)}{\rho_{(k-1)}(x_1,\dots,x_{k-1})}$$

is the density of particles at *x*_{k}, given that there are particles (or zeros, or eigenvalues, etc.) at locations *x*_{1},…,*x*_{k−1}. Suppose the particle density *ρ*_{(1)}(*x*) is identically constant so that the point process is translationally invariant, and furthermore, take the constant to be unity. We then have that the two-point correlation *ρ*_{(2)}(*x*_{1},*x*_{2}) is just the density at *x*_{2} given that there is a particle at *x*_{1}. As such, in this circumstance, *ρ*_{(2)}(*x*_{1},*x*_{2}) can be empirically determined from a single dataset. To see this, note that due to translational invariance, *ρ*_{(2)}(*x*_{1},*x*_{1}+*s*) depends only on *s*, allowing *x*_{1} to be averaged over to generate the necessary data for a statistical determination.

For the eigenvalues {e^{iθj}}_{j=1,…,N} of *N*×*N* unitary matrices chosen with Haar measure, the *k*-point correlation is given by the *k*×*k* determinant (e.g. [4], §5.5.2)

$$\rho_{(k)}(x_1,\dots,x_k) = \det\left[K_N(x_j, x_l)\right]_{j,l=1,\dots,k},$$

where *x*_{j}=*Nθ*_{j}/2*π* (*j*=1,…,*N*) and *K*_{N}(*x*,*y*)—referred to as the *correlation kernel*—is given by

$$K_N(x,y) = \frac{\sin \pi(x-y)}{N \sin\left(\pi(x-y)/N\right)}. \tag{2.1}$$
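Assuming the standard scaled form *K*_{N}(*x*,*y*) = sin *π*(*x*−*y*) / (*N* sin(*π*(*x*−*y*)/*N*)) for the correlation kernel, a quick numerical check illustrates its convergence to the sine kernel sin *π*(*x*−*y*)/(*π*(*x*−*y*)) at rate 1/*N*^{2}, the rate that underpins the finite-size analysis of §3; the code is an illustrative sketch:

```python
import numpy as np

def K_N(x, y, n):
    """Finite-N CUE correlation kernel in the scaled variables (standard
    form assumed): sin(pi(x-y)) / (n sin(pi(x-y)/n))."""
    z = x - y
    if np.isclose(z, 0.0):
        return 1.0
    return np.sin(np.pi * z) / (n * np.sin(np.pi * z / n))

def K_sine(x, y):
    """Limiting sine kernel; np.sinc(z) = sin(pi z)/(pi z)."""
    return np.sinc(x - y)

# at fixed separation the deviation from the sine kernel decays like 1/N^2,
# so doubling N should reduce the error by a factor of about 4
errs = [abs(K_N(0.0, 0.7, n) - K_sine(0.0, 0.7)) for n in (10, 20, 40)]
```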

The situation with the Riemann zeros is more complicated. For a start, as the Riemann zeros are a deterministic sequence, a statistical characterization only results after an averaging over a suitable interval of zeros about the point *x* of interest along the critical line, and with the use of a test function. And, as mentioned in the Introduction, it is only for a restricted class of test functions—those whose Fourier transforms have support on |*k*|<1—that rigorous analysis of the correlations has been possible. Even then the theorems obtained are statements about the leading asymptotic form only.

The discussion in the Introduction mentioned that a non-rigorous, but predictive analysis based on an analogy with a chaotic quantum system, together with the use of a conjecture of Hardy and Littlewood for the pair correlation function of the primes, allows for further progress [18]. One consequence is that (1.2) can be derived as the two-point correlation function without the assumption of a restricted class of test functions. Again, this is a statement about the limiting asymptotic form. For present purposes, the most relevant feature of the results of [18] is that they allow for the determination of the leading correction term to the limiting asymptotic form. The necessary working is given in [31], where it was shown that at position *E*≫1 along the critical line, with the local density normalized to unity, the two-point correlation function admits a large-*E* expansion whose leading term is (1.2). The constants *Λ* and *Q* appearing in the correction term can be expressed in terms of convergent expressions involving primes, which when evaluated give the numerical values *Λ*=1.57314… and *Q*/*Λ*=1.4720… . Moreover, with the matrix size *N* related to the distance along the critical line *E* according to [31] so that the mean densities agree, *N*=log(*E*/2*π*), and with *s* further rescaled appropriately, the correction term takes on the same functional form as for random unitary matrices, at each fixed large *E*.

In the theory of point processes (e.g. [39]), a thinning operation, whereby each member is deleted independently with probability (1−*ξ*), is used to create a family of point processes from a single parent process. The effect on the corresponding *k*-point correlation *ρ*_{(k)}(*x*_{1},…,*x*_{k};*ξ*) is very simple—the *ξ* dependence scales to give

$$\rho_{(k)}(x_1,\dots,x_k;\xi) = \xi^k \rho_{(k)}(x_1,\dots,x_k;1). \tag{2.7}$$

Introducing the scaled variables *y*_{j}=*ξx*_{j} (*j*=1,2,…), one sees that

$$\xi^{-k}\rho_{(k)}(y_1/\xi,\dots,y_k/\xi;\xi) = \rho_{(k)}(y_1/\xi,\dots,y_k/\xi;1). \tag{2.8}$$

The variables {*y*_{j}} correspond to a rescaling so that the density of the thinned point process agrees with that of the original point process, assuming translational invariance.
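The thinning-and-rescaling operation just described is straightforward to implement. A sketch (assuming the input sequence has already been scaled to unit density; the test data here is a hypothetical Poisson sequence, not the Riemann zeros):

```python
import numpy as np

def thin_and_rescale(points, xi, rng):
    """Delete each point independently with probability 1 - xi, then rescale
    y = xi * x so that the surviving points again have unit mean density."""
    keep = rng.random(len(points)) < xi
    return xi * points[keep]

rng = np.random.default_rng(1)
# hypothetical unit-density Poisson test sequence standing in for the zeros
x = np.cumsum(rng.exponential(1.0, size=200_000))
y = thin_and_rescale(x, 0.6, rng)
```

After the rescaling, the mean spacing of `y` is again unity, so the thinned data can be analysed with the same unit-density formulae as the parent process.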

### (a) Numerical results

To determine the two-point correlation *ρ*_{(2)}(*y*_{1}/*ξ*,*y*_{2}/*ξ*;1) empirically from Odlyzko’s dataset, we must first rescale the data by (1.1) so that the rescaled local mean density is unity; we then delete each zero with probability 1−*ξ*. Next, we sample this sequence by empirically computing the zero density of the subsystem formed by each zero in turn as the left boundary, and a fixed number (say 50) of its neighbours. Figure 1 displays the results of this empirical determination of (2.8) in both the variables {*x*_{j}} (figure 1*a*) and {*y*_{j}} (figure 1*b*). To the eye, the resulting graphical forms are identical to the leading-order random matrix prediction (1.2).
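The empirical sampling procedure just described—taking each zero in turn as the left boundary of a subsystem comprising a fixed number of its neighbours—can be sketched as follows. The function name and the unit-density Poisson test sequence are our illustrative assumptions:

```python
import numpy as np

def two_point_estimate(points, bins, n_neigh=50):
    """Estimate rho_(2)(s) for an ordered unit-density sequence: for each
    point, histogram the distances to its next n_neigh right neighbours,
    then normalize per reference point and per unit length."""
    counts = np.zeros(len(bins) - 1)
    n_ref = len(points) - n_neigh
    for j in range(n_ref):
        d = points[j + 1:j + 1 + n_neigh] - points[j]
        counts += np.histogram(d, bins=bins)[0]
    return counts / (n_ref * np.diff(bins))

rng = np.random.default_rng(2)
poisson = np.cumsum(rng.exponential(1.0, size=20_000))  # uncorrelated test data
bins = np.linspace(0.0, 8.0, 33)
rho2 = two_point_estimate(poisson, bins)
# for a Poisson process rho_(2)(s) = 1 identically; applied to the rescaled
# zeros, the same estimator instead approaches the form (1.2)
```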

Following [31], our main point of interest is in the functional form that results after subtracting the leading random matrix prediction from the empirical two-point function for Odlyzko’s dataset. This difference has previously, in the case *ξ*=1, been compared with the two-point correlation of the same sequence of Riemann zeros as used here ([40], figs 12–17 or [41], figs 3–7), and quantitative agreement is found for all distances displayed.

## 3. Spacing distributions

### (a) Fredholm determinant formulae

The *k*-point correlations are functions of *k*−1 difference variables, and so become increasingly more complicated to compute from data and exhibit increasing statistical error; for the case *k*=3 and the low-lying Riemann zeros see [21]. On the other hand, certain functionals of the data can be quite simple to compute with negligible statistical error. A specific example is the *n*th *nearest neighbour spacing distribution* *p*(*n*;*s*), *n*=0,1,2,…. In terms of the correlations, and assuming a translationally invariant system with constant density *ρ*, there is in the case *n*=0 a standard series formula expressing *p*(0;*s*) in terms of the {*ρ*_{(k)}}, and similarly for general *n* (e.g. [4], §8.1 and §9.1).
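Empirically, the *n*th nearest neighbour spacings are obtained from an ordered, unit-density dataset simply as the differences between each point and its (*n*+1)st successor; a minimal sketch (illustrated on a hypothetical perfectly rigid sequence):

```python
import numpy as np

def nth_spacings(points, n=0):
    """Differences between each point and its (n+1)st neighbour to the right,
    for an ordered sequence; histogramming these estimates p(n; s)."""
    return points[n + 1:] - points[:-(n + 1)]

x = np.arange(100, dtype=float)   # perfectly rigid unit-density test sequence
s0 = nth_spacings(x, n=0)         # nearest neighbour spacings: all equal to 1
s1 = nth_spacings(x, n=1)         # next-nearest spacings: all equal to 2
```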

The calculation of the functional form of the scaled limit of *p*(0;*s*) for *N*×*N* unitary matrices *U*(*N*) chosen with Haar measure, or equivalently bulk scaled GUE matrices, is a celebrated problem in random matrix theory. First note that a Fredholm determinant can be expanded as a series involving multiple integrals with integrands given as determinants of the integral kernel. With this expansion one can show from the form of the scaled *k*-point correlation functions (1.3) that

$$p(0;s) = \frac{d^2}{ds^2} \det(\mathbb{I} - \mathcal{K}_s), \tag{3.2}$$

where 𝒦_s is the integral operator on (0,*s*) with kernel

$$K(x,y) = \frac{\sin \pi(x-y)}{\pi(x-y)},$$

and the density *ρ* is unity. The required working can be found, for example, in [4], §9.1. This Fredholm determinant formula is equivalent to the expression

$$\det(\mathbb{I} - \mathcal{K}_s) = \prod_{l=1}^{\infty}\left(1 - \lambda_l(s)\right),$$

where *λ*_{1}(*s*)>*λ*_{2}(*s*)>⋯>0 are the eigenvalues of 𝒦_s. Gaudin related the {*λ*_{l}(*s*)} to the eigenvalues of the second-order differential operator having the prolate spheroidal functions as eigenfunctions (e.g. [4], §9.6.1). Only recently has it been shown that there are numerical schemes based directly on (3.2) that exhibit exponentially fast convergence to the limiting value [43,44].
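A minimal sketch of such a scheme, in the spirit of [43,44]: discretize the integral operator with Gauss–Legendre quadrature and evaluate a finite determinant. The thinning parameter ξ is included here as well, so the quantity computed is det(𝕀−ξ𝒦_s); convergence in the number of quadrature nodes is exponentially fast:

```python
import numpy as np

def fredholm_det(s, xi=1.0, m=40):
    """Nystrom approximation to det(I - xi K_s), with K_s the integral
    operator on (0, s) whose kernel is sin(pi(x-y))/(pi(x-y)).  The
    Gauss-Legendre discretization converges exponentially fast in m."""
    t, w = np.polynomial.legendre.leggauss(m)
    x = 0.5 * s * (t + 1.0)                 # nodes mapped from [-1, 1] to [0, s]
    w = 0.5 * s * w                         # correspondingly scaled weights
    k = np.sinc(x[:, None] - x[None, :])    # np.sinc(z) = sin(pi z)/(pi z)
    a = np.sqrt(w)
    return np.linalg.det(np.eye(m) - xi * (a[:, None] * k * a[None, :]))

e_small = fredholm_det(0.1)         # close to 1 - s for small s
e_thin = fredholm_det(1.0, xi=0.6)  # thinned gap probability at s = 1
```

The symmetrized weighting by the square roots of the quadrature weights keeps the discretized matrix symmetric, which improves numerical stability without changing the determinant.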

Nearly two decades after the work of Gaudin, the Kyoto school of Jimbo *et al.* [45] expressed (3.2) as the so-called *τ-function* for a particular Painlevé V system. Explicitly, it was shown that

$$\det(\mathbb{I} - \xi \mathcal{K}_s) = \exp\left( \int_0^{\pi s} \frac{\sigma(t;\xi)}{t}\, dt \right), \tag{3.3}$$

where *σ*(*t*)=*σ*(*t*;*ξ*) satisfies the differential equation (DE)

$$(t \sigma'')^2 + 4\left(t \sigma' - \sigma\right)\left(t \sigma' - \sigma + (\sigma')^2\right) = 0, \tag{3.4}$$

subject to the small-*t* boundary condition

$$\sigma(t) \sim -\frac{\xi t}{\pi} - \left(\frac{\xi t}{\pi}\right)^2, \quad t \to 0^+. \tag{3.5}$$

Note that the parameter *ξ* introduced in (3.3) only enters in the characterization through the boundary condition. For background theory relating to the Painlevé equations as they occur in random matrix theory, we refer the reader to [4], ch. 8.

Although (3.2) only requires the case *ξ*=1, the Fredholm expansion formula used to derive (3.2) generalizes to read

$$\det(\mathbb{I} - \xi \mathcal{K}_s) = 1 + \sum_{k=1}^{\infty} \frac{(-\xi)^k}{k!} \int_0^s \cdots \int_0^s \det\left[K(x_j,x_l)\right]_{j,l=1,\dots,k}\, dx_1 \cdots dx_k. \tag{3.6}$$

At *ξ*=1, this formula, via the inclusion/exclusion principle, gives the probability, *E*(0;*s*) say, that there are no zeros in an interval of length *s* of the original dataset. Thus, we have

$$E(0;s) = \det(\mathbb{I} - \mathcal{K}_s). \tag{3.7}$$

For 0<*ξ*<1, (2.7) together with the inclusion/exclusion principle shows (3.6) is equal to the probability that there are no zeros in an interval of length *s* in the dataset corresponding to a thinning of the original dataset by deleting each zero with probability (1−*ξ*)—this is the procedure alluded to in the Introduction. With *p*(0;*s*;*ξ*) denoting the corresponding distribution of the nearest neighbour spacing, (3.2) generalizes to

$$p(0;s;\xi) = \frac{1}{\rho} \frac{d^2}{ds^2} \det(\mathbb{I} - \xi \mathcal{K}_s), \tag{3.8}$$

with *ρ*=*ξ* in this decimated system. We remark that the notation *p*((0,*s*);*ξ*) has been used in [4], §8.1 to denote the generating function for the conditioned spacing distributions {*p*(*n*;*s*)}_{n=0,1,…}, corresponding to the conditioned gap probabilities in the next paragraph. These quantities are related by

$$p(0;s;\xi) = \xi \sum_{n=0}^{\infty} (1-\xi)^n p(n;s). \tag{3.9}$$

As just mentioned, another interpretation shows itself via the formula (e.g. [4], eqn (8.1) together with (9.15))

$$E(m;s) = \frac{(-1)^m}{m!} \frac{\partial^m}{\partial \xi^m} \det(\mathbb{I} - \xi \mathcal{K}_s)\bigg|_{\xi=1}, \tag{3.10}$$

where *E*(*m*;*s*) denotes the probability of an interval of length *s* in the original dataset containing exactly *m* zeros. This is referred to as a *(conditioned) gap probability*. Note that (3.10) reduces to (3.7) in the case *m*=0.

Both expressions (3.2) and (3.3) have analogues for the eigenvalues of *U*(*N*) itself, rather than their scaled limit. Let 𝒦_{N,s} denote the integral operator on (0,*s*) with kernel *K*_{N}(*x*,*y*) as specified by (2.1). The analogue of (3.8) is then the structurally identical formula

$$p^N(0;s;\xi) = \frac{1}{\xi} \frac{d^2}{ds^2} \det(\mathbb{I} - \xi \mathcal{K}_{N,s}), \tag{3.11}$$

where *p*^{N}(0;*s*;*ξ*) denotes the nearest neighbour spacing distribution for matrices from *U*(*N*) chosen with Haar measure, with *s* the difference in neighbouring angles. The *τ*-function formula (3.3) generalizes to a structurally similar exponential formula (3.12) ([46], eqn (1.33)), in which the role of *σ* is played by a transcendent *U*(*s*;*ξ*) satisfying a particular *σ*-form Painlevé DE (3.13) in *u*(*s*)=*U*(*s*;*ξ*), with *N*-dependent parameters, and with *u*(*s*) proportional to *ξN*/*π* in the accompanying boundary condition.

### (b) Finite *N* correction for the nearest neighbour spacing

Our interest is in the leading correction term to the large *N* form of (3.11). Since, from (2.1),

$$K_N(x,y) = \frac{\sin \pi(x-y)}{\pi(x-y)} + \frac{1}{N^2}\, \frac{\pi(x-y)\sin \pi(x-y)}{6} + O\!\left(\frac{1}{N^4}\right),$$

the correction to the corresponding Fredholm determinant is of order 1/*N*^{2}. Specifically

$$\det(\mathbb{I} - \xi \mathcal{K}_{N,s}) = \det(\mathbb{I} - \xi \mathcal{K}_{s})\left(1 - \frac{\xi}{N^2} \operatorname{Tr}\left[(\mathbb{I} - \xi \mathcal{K}_{s})^{-1} \mathcal{L}_{s}\right] + O\!\left(\frac{1}{N^4}\right)\right), \tag{3.14}$$

where 𝒦_{N,s} and 𝒦_s denote the integral operators on (0,*s*) with kernels *K*_{N}(*x*,*y*) and sin *π*(*x*−*y*)/(*π*(*x*−*y*)) respectively, ℒ_s is the integral operator on (0,*s*) with kernel *π*(*x*−*y*) sin *π*(*x*−*y*)/6, and, with *R*(*x*,*y*) denoting the resolvent kernel—that is, the kernel supported on (0,*s*) of the integral operator *ξ*𝒦_s(𝕀−*ξ*𝒦_s)^{−1}—the trace in (3.14) can be written as an explicit double integral involving *R*.

The expansion (3.14) tells us that, in relation to the representation (3.12), we should change variables and write

$$U(s;\xi) = \frac{1}{s}\left(\sigma^{(0)}(X) + \frac{1}{N^2}\, \sigma^{(1)}(X) + O\!\left(\frac{1}{N^4}\right)\right), \quad X = \pi s. \tag{3.15}$$

Here *σ*^{(0)} and *σ*^{(1)} also depend on *ξ* (which, as mentioned below (3.5), enters through the boundary conditions), but for notational convenience this has been suppressed. As noted above, *σ*^{(0)} satisfies the particular Painlevé V equation in sigma form (3.4) with boundary condition (3.5). By changing variables in (3.13), a second-order linear DE for *σ*^{(1)}(*X*), with coefficients involving *σ*^{(0)}(*X*), can be obtained.

### Proposition 3.1

*With σ*^{(0)}(*X*), *σ*^{(1)}(*X*) *related to U*(*s*;*ξ*) *in* (*3.12*) *by* (*3.15*) *we have that σ*^{(0)}(*X*) *satisfies the particular Painlevé V equation in sigma form* (*3.4*) *with boundary condition* (*3.5*), *while σ*^{(1)}(*X*) *satisfies the second-order, linear DE*
*where, with σ*(*s*)=*σ*^{(0)}(*s*),
*The corresponding s*→0^{+} *boundary condition is*

### Proof.

We see there are three main terms in (3.13). Changing variables according to (3.15), the first main term is of order *N*^{4}. Similarly, expanding the other two main terms and then equating terms of order *N*^{2} gives (3.4), while equating terms independent of *N* gives (3.17).

In relation to the boundary conditions, for *U*(*s*;*ξ*) in (3.12) we know from [4], eqn (8.78), further extended to the next order, an explicit expansion of *U*(*s*;*ξ*), in which *c*=*ξN*/*π*. Substituting in (3.15) and performing appropriate additional expansions we read off (3.18). ▪

While we know of no other analytic formulae for the correction term to the spacing distribution for random matrix ensembles in the bulk, at an edge—specifically the soft edge—this task seems to have been first taken up by [48]. Aspects of this same problem at the hard edge are discussed in the recent works [49–51].

Let *f*^{N}(*x*)[*N*^{−p}] denote the coefficient of *N*^{−p} in the large *N* expansion of *f*^{N}(*x*). Then we have from (3.16) an explicit expression (3.19) for *p*^{N}(0;*s*;*ξ*)[*N*^{−2}] in terms of *σ*^{(0)} and *σ*^{(1)}. The following result gives the corresponding small *s* expansions.

### Corollary 3.2

*We have*

### Proof.

Differential equation (3.4) admits a unique power series expansion about the origin with the first two terms given by (3.5). Expanding to higher order and solving for the unspecified coefficients gives (3.20). Substituting the expansion in (3.3) gives (3.22) for the small *s* expansion of the Fredholm determinant; this series is also reported in [4], eqn (8.114). If we substitute (3.20) instead in DE (3.17) and make use of the boundary condition (3.18), we find a unique power series expansion solution is generated, which upon solving for the unspecified coefficients gives (3.21). Substituting (3.20) and (3.21) in the order 1/*N*^{2} correction term (3.19) gives (3.23). Recalling (3.11) and that we have a uniform density *ρ*_{(1)}(*s*)=*ξ*, we see (3.24) and (3.25) follow immediately from (3.22) and (3.23). ▪

A consistency check can be placed on the above expansions. For this we note that it is also possible to use the characterization (3.12) to generate an *N*-dependent small *s* expansion ([4], eqn (8.79)). Expanding the latter in *N*, we obtain agreement with (3.22) for the leading form, and agreement with (3.23) for the next leading-order 1/*N*^{2} term.

A noteworthy feature of (3.24) and (3.25) is the weak dependence on the dilution parameter *ξ*, which only changes from *ξ* to *ξ*^{2} at order *s*^{7}. One understanding of this result relates to the interpretation of (3.8), upon division by *ξ*, as the generating function for the conditioned spacing distributions {*p*(*n*;*s*)}: in particular, *p*(1;*s*) is obtained from (3.24) and (3.25) by applying −∂/∂*ξ* and setting *ξ*=1. It follows that *p*(1;*s*) has leading small *s* behaviour proportional to *s*^{7}, which is a known result [4], eqn (8.115).

We turn our attention now to the behaviours for large *s*. The asymptotics of spacing distributions for random matrix ensembles in this regime have been reviewed in the recent work [52]. The case *ξ*=1 must be distinguished from 0<*ξ*<1.

### Corollary 3.3

*Suppose 0<ξ<1. As s→∞, we have* [53]*, eqn (1.16), with σ*_{0}*(s/2)=σ*^{(0)}*(s) in our notation*
*and thus*
*Also*
*and this together with (3.28) implies*

*Suppose instead ξ=1. In this case, for s→∞ we have*
*which is equivalent to the expansion*
*We also have*
*which when combined with* (*3.32*) *is equivalent to*

*Thus, for large s*
*and*

### Proof.

As noted, expansion (3.27) can be read off from [53], and this substituted in (3.3) gives (3.28). The value of *A*(*ξ*) is given in [54], eqn (1.14), although its derivation requires different methods [55,56]. Substituting (3.27) in (3.17) and solving for large *s* gives (3.29), and this together with knowledge of (3.28) gives (3.30).

From the definition of *k*, we see that the behaviour (3.27) breaks down when *ξ*=1. In that case we have, instead of (3.27), the large *s* expansion (3.31). This follows from (3.3) and knowledge of the leading two terms of the large *s* form of the Fredholm determinant. Substituting (3.31) in (3.17) and solving for large *s* gives (3.33). Then, substituting this in (3.19) and using (3.32), we see that the large *s* form of the 1/*N*^{2} correction term (3.19) for *ξ*=1 is given by (3.34).

To obtain the behaviour of the spacing distribution, we use (3.28), (3.30), (3.32) and (3.34) in conjunction with (3.11) and so deduce (3.35) and (3.36). ▪

Numerical methods to be discussed in §4 allow the large *s* forms of *σ*^{(0)}(*s*) and *σ*^{(1)}(*s*) to be compared with their computed values from DEs (figures 3–5). A prominent feature seen upon differentiating the transcendents is that, for 0<*ξ*<1, higher-order terms in the large *s* expansion are oscillatory, as known from the analytic work of McCoy & Tang [53] in the case of *σ*^{(0)}(*s*). Since there is no such effect for *ξ*=1, we display only the transcendent and its asymptotic form, and not the derivative. The asymptotic form in this case is functionally distinct from that for 0<*ξ*<1—note that the quantity *k* in the latter actually diverges at *ξ*=1.

A variation on the process of deleting each zero with the (constant) probability 1−*ξ* is to delete with probability 1−*ξ*(*x*), with *x*=0 corresponding to some pre-determined origin in the data. Since upon the scaling (1.1) the original data set is translationally invariant, an ensemble can be generated by varying the origin. The corresponding *k*-point correlation, *ρ*_{(k)}(*x*_{1},…,*x*_{k};*ξ*(*x*)) say, has the factorization property

$$\rho_{(k)}(x_1,\dots,x_k;\xi(x)) = \prod_{j=1}^{k} \xi(x_j)\, \rho_{(k)}(x_1,\dots,x_k;1).$$

Denoting the integral operator on (0,*s*) with kernel (3.37) by 𝒦_{s,ξ}, it follows that the probability that the interval of length *s* from the origin has no zeros is equal to det(𝕀−𝒦_{s,ξ}). For *ξ*(*x*) having the functional form *ξ*(*x*)=*v*(*y*), *y*=*x*/*s*, with *v*(*y*) analytic on [0,1], then (for large *s*) the expansion (3.28) generalizes to [59], theorem 2.1, with a functional-dependent constant term *C*[*ξ*]. Unfortunately, the requirement that *ξ* scale with *s* prohibits probing this process in the Riemann zero data. This is similarly true of the choice *ξ*=1−e^{−2*κs*} recently studied for large *s* in [60].

### (c) Alternative characterization of finite *N* correction for the nearest neighbour spacing

We see from (3.11) that to obtain the order 1/*N*^{2} correction to the limiting spacing distribution from knowledge of the 1/*N*^{2} correction to the gap probability, the second derivative with respect to *s* of the latter must be taken. In fact, as first shown in [61] and further refined in [47], the second derivative can be incorporated within the Painlevé theory. Specifically, we have from [47] (again recalling (3.9)) a representation of the limiting spacing distribution itself as a τ-function, in which *u*^{(0)} satisfies the particular, modified *σ*PV equation (3.38), subject to the small-*s* boundary condition (3.39), while for finite *N* there is the analogous representation (3.40), in which *V* satisfies a DE of the same type in *u*(*s*)=*V*(*s*;*ξ*), with *N*-dependent parameters.

Analogous to (3.15), in keeping with the correction to the large *N* form being of order 1/*N*^{2}, we expand

$$V(s;\xi) = u^{(0)}(X) + \frac{1}{N^2}\, u^{(1)}(X) + O\!\left(\frac{1}{N^4}\right), \tag{3.41}$$

with the scaled variable *X* as in (3.15). Here *u*^{(0)}(*X*) satisfies the nonlinear equation (3.38) subject to the boundary condition (3.39). Analogous to proposition 3.1, it is possible to characterize *u*^{(1)}(*X*) as the solution of a second-order, linear DE with coefficients involving *u*^{(0)}(*X*).

### Proposition 3.4

*We have that u*^{(1)}(*X*) *satisfies the second-order, linear DE*
*where, with u*(*s*)=*u*^{(0)}(*s*),
*The equation must be solved subject to the s*→0^{+} *boundary condition*

### Proof.

We apply the same technique as in the proof of proposition 3.1 to the DE satisfied by *V*. The terms of order *N*^{2} give (3.38), while the terms of order 1 give (3.42). For the boundary condition on *u*^{(0)}(*s*), we have (3.39), which we extend to degree 6 using (3.38). Then we combine this series with (3.42) to obtain the boundary condition on *u*^{(1)}(*s*). ▪

One check on the consistency of proposition 3.4 is to verify that substitution of (3.43) and the first few terms of the power series solution of (3.38) into (3.41) reproduces the expansion (3.25). We find that this is indeed the case.

The required consistency between (3.41) and (3.16), combined with (3.11), together with knowledge of the large *s* asymptotic forms in corollary 3.3, gives information on the large *s* behaviour of *u*^{(0)}(*s*) and *u*^{(1)}(*s*). In relation to *u*^{(0)}(*s*), for 0<*ξ*<1, we have the large *s* expansion (3.44) implied by [53], eqn (1.26) (up to a correction *b*_{0}(*n*)→−*b*_{0}(*n*) in [53], eqn (1.13)), with *k* as in (3.27). From [53], we know that the *O*(1/*s*) term in (3.44) is oscillatory. This is clearly displayed in figure 6, where we compare a numerical solution of DE (3.38) with [53], eqn (1.26) as given in (3.44), with the oscillatory term O(1/*s*) included. These oscillatory terms have concrete significance for the deduction of the functional form of *u*^{(1)}(*s*,*ξ*) in the large *s* limit: if we substitute (3.44) without them for this purpose in DE (3.42), we find a result inconsistent with (3.36). Consistency with the latter and (3.29) instead requires the form (3.45), which depends on *ξ*. From figure 6, we also see that there are oscillatory subleading terms in (3.45); the graphs of the derivatives in figure 7 indicate that these oscillations begin with the O(*s*) term.

For *ξ*=1, we have consistency with (3.35) when *u*^{(0)}(*s*) and *u*^{(1)}(*s*) have the analogous large *s* forms (3.46) and (3.47).

## 4. Numerical power series solution of the differential equations and evaluation of the spacing distributions

The utility of propositions 3.1 and 3.4 is that we can use a power series method together with computer packages to numerically compute to high accuracy the next-to-leading order term in the nearest neighbour spacing distribution (3.8); this can be done directly by computing the terms in (3.41) or via the second derivative of (3.19). Such power series methods were introduced as a technique to compute Painlevé transcendents associated with spacing distributions in random matrix theory in [62]. This was in relation to the soft edge. Subsequently, the same numerical method was used for the computation of bulk spacings [4], §8.3.4. The first point is to note that in generating a power series solution of the *σ*PV equation (3.4) about the origin, one finds a radius of convergence of approximately 8.5 (for both *ξ*=1 and *ξ*=0.6). Inside the radius of convergence, this power series can be used to accurately evaluate the transcendent and its derivative, and from these data a new power series can be computed and the procedure iterated to cover the interval from *s*=0 to beyond *s*=20. With *σ*^{(0)}(*s*) so determined, iterative power series solutions of the DE in proposition 3.1 can be obtained. It is through this procedure, and its analogue starting with DE (3.38) and proceeding to the DE of proposition 3.4, that the graphs of *σ*^{(0)}(*s*), *σ*^{(1)}(*s*), *u*^{(0)}(*s*) and *u*^{(1)}(*s*) displayed in figures 3–8 have been generated.

We remark that use of a power series method to generate solutions over a large interval seems necessary in the cases 0<*ξ*<1. In particular, if in these cases one tries to use a computer algebra package to solve the DE (3.4) subject to initial conditions for *σ*^{(0)}(*x*_{0}) and *σ*^{(0)}′(*x*_{0}) with *x*_{0} small, as computed from the boundary condition (3.5) (with the latter further extended in accuracy as given in (3.20)), it is found that as *x* is increased from *x*_{0}, the DE solver gives incorrect values, and furthermore soon diverges to a spurious pole.

Substitution of the piecewise functions for *σ*^{(0)}(*s*) and *σ*^{(1)}(*s*) into (3.16) gives us an approximation of the nearest neighbour spacing distribution and its correction term as functions of *s*; plotting these for *ξ*=1 and *ξ*=0.6, we obtain figures 9 and 10, respectively. Note that to obtain the comparison with the Riemann zero data in these figures, we have used the scalings (2.5) and (2.6) from [31]. Also in figures 9 and 10 we plot an extrapolation of a sequence of finite calculations (3.11), similar to that in [31]: by calculating (3.11), with kernel (2.1), for 20 values of *N* between 100 and 138, we extrapolate the limiting value of the nearest neighbour spacing and the next-to-leading order correction at each point. To the eye, the extrapolated values are indistinguishable from those computed from the DE. Note that when *ξ*=0.6 the correction term shows more pronounced oscillations for finite *s* values than its *ξ*=1 counterpart, a feature that seems to be driven by oscillations in the functional forms for the corresponding Painlevé transcendents.
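The extrapolation of the finite-*N* sequence can be sketched as a least squares fit of the model *f*_{*N*}≈*f*_{∞}+*c*/*N*^{2} at each point *s*. The snippet below does not evaluate (3.11) with kernel (2.1); it uses synthetic data with made-up values of *f*_{∞} and *c*, purely to illustrate how both the limit and the *O*(1/*N*^{2}) correction coefficient are recovered from the 20 values of *N* between 100 and 138.

```python
import numpy as np

# Matrix sizes as in the text: 20 values of N between 100 and 138.
Ns = np.arange(100, 139, 2, dtype=float)

# Synthetic stand-in for the finite-N statistic at one fixed s;
# f_inf_true and c_true are illustrative numbers, not computed values.
f_inf_true, c_true = 0.7, -3.2
fN = f_inf_true + c_true / Ns**2

# Least squares fit of f_N ~ f_inf + c / N^2, giving both the limiting
# value and the next-to-leading order correction coefficient.
A = np.column_stack([np.ones_like(Ns), 1.0 / Ns**2])
(f_inf_est, c_est), *_ = np.linalg.lstsq(A, fN, rcond=None)
print(f_inf_est, c_est)
```

In practice this fit is repeated at each sampled value of *s*, producing the extrapolated curves plotted alongside the DE solutions.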

Most strikingly, the predicted functional forms from random matrix theory show accurate agreement with the Riemann zero data both for *ξ*=1 [31], and upon thinning of the Riemann zero dataset with *ξ*=0.6, for all displayed values of *s*. This is in contrast to what was found in figure 2 for the two-point function, where the accuracy deteriorates after approximately one and a half units of the mean spacing. A significant difference between the two quantities that may explain this observation relates to their respective large *s* forms. That is, the spacing distribution decays as an exponential or faster (see (3.41) with (3.44)–(3.47)), whereas the correction term for the two-point correlation function is an oscillatory order one quantity for all *s* (see (2.2)). It follows that, at a graphical level, discrepancies in the correction term to the spacing distribution functional form cannot be quantitatively probed, in contrast to the situation for the correction term to the two-point correlation function.
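The thinning procedure referred to above, in which each member of the sequence is retained independently with probability *ξ*, can be sketched as follows. The sequence used here is a stand-in Poisson process, not the actual Riemann zero or unitary matrix eigenvalue data; the rescaling at the end restores unit mean spacing after thinning, as required before computing nearest neighbour spacings.

```python
import numpy as np

rng = np.random.default_rng(0)
xi = 0.6  # retention probability, as in the comparisons in the text

# Stand-in point sequence with unit mean spacing (Poisson process),
# used only to illustrate the mechanics of the thinning.
points = np.cumsum(rng.exponential(size=10_000))

# Delete each member independently with probability 1 - xi.
thinned = points[rng.random(points.size) < xi]

# The thinned sequence has mean spacing 1/xi; rescale by xi so the
# nearest neighbour spacings again have unit mean.
spacings = np.diff(thinned) * xi
print(spacings.mean())
```

Applied to the Riemann zero dataset, the same retention step with *ξ*=0.6 produces the thinned sequence whose spacing statistics are compared against the random matrix predictions.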

In conclusion, our study corroborates the study of Bogomolny *et al.* [31], and so lends further weight to the conjecture that correction terms to the Montgomery–Odlyzko law themselves have a random matrix interpretation. Specifically, as put forward in [31] as an extension of the earlier work [27], the results of our study are consistent with the hypothesis that the leading correction term to the Montgomery–Odlyzko law for correlation functions and associated distribution functions of the Riemann zeros coincides with the *O*(1/*N*^{2}) correction term for the corresponding quantities of the eigenvalues of random unitary matrices. This holds with the relationship between *E* and *N* given by (2.5), and with the change of scale (2.6), both of which are arithmetic in their origin.

## Data accessibility

The Riemann zero data that we used was obtained on request from Andrew Odlyzko (http://www.dtc.umn.edu/~odlyzko/); note that this is the same dataset as that used in [31].

## Authors' contributions

P.J.F. initiated this project, directed the analytical analysis and prepared the bulk of the text of the paper; A.M. performed the majority of the analytical and numerical computations and contributed parts of the text. Both authors gave final approval for publication.

## Competing interests

The authors have no competing interests.

## Funding

This research project is part of the programme of study supported by the ARC Centre of Excellence for Mathematical and Statistical Frontiers.

## Acknowledgements

We are very appreciative of A. Odlyzko for providing us with the Riemann zero dataset used in our work. We also thank F. Bornemann for spotting some points for correction in an earlier draft, and the referees for some useful remarks.

- Received June 27, 2015.
- Accepted September 7, 2015.

- © 2015 The Author(s)

Published by the Royal Society. All rights reserved.