## Abstract

In this paper, bounds on the mean power-weighted nearest neighbour distance are derived. Previous work concentrates mainly on the infinite sample limit, whereas our bounds hold for any sample size. The results are expected to be of importance, for example in statistical physics, non-parametric statistics and computational geometry, where they are related to the structure of matter as well as properties of statistical estimators and random graphs.

## 1. Introduction

Let *X*_{1}, …, *X*_{M} be random variables sampled independently from a probability distribution *P* on ℝ^{n} with a density *q* supported on a bounded set (with respect to the Lebesgue measure *λ*). The *k*-th nearest neighbour distance of the point *X*_{i} in the *l*^{p}-norm is denoted by *d*_{i,k}. In this paper we examine the expected power-weighted distance E[*d*_{i,k}^{α}] with *α*>0. This quantity plays a significant role in many fields, including non-parametric statistics (see Evans & Jones 2002; Kohler *et al*. 2006), physics (Torquato 1995) and geometric probability (Penrose & Yukich 2003). Nearest neighbour distances have turned out to be important in tasks such as the convergence analysis of statistical estimators and the theoretical analysis of particle systems in statistical physics; furthermore, nearest neighbour graphs are a fundamental class of graphs in computational geometry.
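To make these quantities concrete, here is a small illustrative sketch (not from the paper) that estimates the mean power-weighted *k*-th nearest neighbour distance by brute force for an i.i.d. sample; the function name and the test data are our own choices.

```python
import numpy as np

def mean_power_weighted_nn(X, k=1, alpha=1.0, p=2):
    """Return (1/M) * sum_i d_{i,k}^alpha in the l^p-norm, by brute force."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], ord=p, axis=2)
    np.fill_diagonal(D, np.inf)          # a point is never its own neighbour
    d_ik = np.sort(D, axis=1)[:, k - 1]  # d_{i,k}: k-th smallest entry of row i
    return float(np.mean(d_ik ** alpha))

rng = np.random.default_rng(0)
X = rng.random((500, 2))                 # uniform i.i.d. sample on the unit square
print(mean_power_weighted_nn(X, k=1, alpha=1.0))
```

The pairwise-distance matrix makes this O(M²) in memory, which is fine for illustration but not for large samples.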

A large part of the previous work on the topic concentrates on the asymptotic form of E[*d*_{i,k}^{α}]; for example, denoting by *V*_{n,p} the volume of the *l*^{p} unit ball, defining *Γ*(.) as the gamma function and taking *p*=2, it is shown by Evans *et al*. (2002) that

$$M^{\alpha/n}\,\mathrm{E}\big[d_{i,k}^{\alpha}\big] \to \frac{\Gamma(k+\alpha/n)}{\Gamma(k)}\, V_{n,2}^{-\alpha/n} \int q(x)^{1-\alpha/n}\,\mathrm{d}\lambda(x) \qquad (1.1)$$

as *M*→∞, assuming that *q* is smooth and positive on a convex and compact set. Moreover, similar asymptotic results can also be derived for more general random structures than the nearest neighbour graph (see for example Penrose & Yukich 2003).
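The constant in the Evans *et al*. (2002) limit can be evaluated numerically. The sketch below assumes the standard reading of that limit, with constant Γ(*k*+*α*/*n*)/Γ(*k*) · *V*_{n,2}^{−α/n}, and uses the classical closed form for the volume of the *l*^{p} unit ball; all function names are our own.

```python
import math

def unit_ball_volume(n, p=2):
    """Volume of the l^p unit ball in R^n: (2*Gamma(1/p+1))^n / Gamma(n/p+1)."""
    return (2 * math.gamma(1 / p + 1)) ** n / math.gamma(n / p + 1)

def asymptotic_constant(n, k, alpha, p=2):
    """Constant Gamma(k+alpha/n)/Gamma(k) * V_{n,p}^{-alpha/n} (our reading of (1.1))."""
    return (math.gamma(k + alpha / n) / math.gamma(k)) * unit_ball_volume(n, p) ** (-alpha / n)

# For n=2, p=2 the unit ball volume is pi, and with k=1, alpha=1 the
# constant reduces to Gamma(3/2)/sqrt(pi) = 1/2.
print(asymptotic_constant(n=2, k=1, alpha=1.0))  # → 0.5
```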

In this paper, instead of the aforementioned asymptotic analysis, we derive non-asymptotic lower and upper bounds for E[*d*_{i,k}^{α}]. Denoting by *B*_{p}(0, *r*) the open ball of radius *r* centred at the origin, we assume that the support of *q* is contained in *B*_{p}(0, *r*), which ensures that all nearest neighbour distances are smaller than 2*r*. In a special case, our main result can be summarized as follows (with the notation ‖.‖_{p} for the function *L*^{p}-norm with respect to *λ*).

**Theorem 1.1.** *The inequalities in* (1.2) *hold almost surely when* 0<*α*<*n*. *The latter inequality is valid also for* *α*=*n*.

A remarkable fact about theorem 1.1 is its universality: being based on purely deterministic arguments, the upper bounds hold for (almost) any configuration of points.

Previous upper bounds on the average power-weighted *k*-th nearest neighbour distances have been given by Kulkarni & Posner (1995), Torquato (1995) and Kohler *et al*. (2006). Compared with our bounds, the proof technique used by Kulkarni & Posner (1995) gives much higher constants when *α* is close to *n*; in fact, for *α*=*n* their bound contains an additional logarithmic factor. The probabilistic upper bounds in Kohler *et al*. (2006), on the other hand, do not yield explicit constants, while Torquato (1995) concentrates on hard-sphere systems.

In the special case *k*=1, an upper bound essentially similar to ours has been derived by Tewari & Gokhale (2004). However, that bound is not rigorously derived, as the analysis of the boundary effect is omitted. While the authors discuss a physical model in which taking the limit *M*→∞ is realistic, for a moderate sample size the points close to the boundary tend to have a significant effect and should be taken into account.

Our lower bound differs from (1.2) in being based on a measure-theoretic argument. To our knowledge, there is no previous work on non-asymptotic lower bounds.

**Theorem 1.2.** *The inequality* *holds for* *α*>0 *and* 1≤*i*≤*M*.

It is worth noting that by lemma 5.1 in Evans *et al*. (2002),

## 2. A geometric upper bound

The formal definition of the nearest neighbour of the point *X*_{i} in the *l*^{p}-norm is

$$N[i,1] = \underset{j \neq i}{\arg\min}\ \lVert X_j - X_i \rVert_p.$$

The *k*-th nearest neighbour is defined recursively as

$$N[i,k] = \underset{j \neq i,\; j \notin \{N[i,1],\ldots,N[i,k-1]\}}{\arg\min}\ \lVert X_j - X_i \rVert_p,$$

i.e. the closest point after removal of the preceding neighbours. The corresponding distances are defined as

$$d_{i,k} = \lVert X_{N[i,k]} - X_i \rVert_p.$$

Note that the definition of *N*[*i*,*k*] is not necessarily unique if two points lie at the same distance from *X*_{i}. However, the probability of such an event is zero and it can consequently be neglected, as theorem 1.1 is stated to hold almost surely. Moreover, the conclusion of theorem 1.1 holds for all configurations of points with an arbitrary method of tie-breaking; for example, *N*[*i*,*k*] can be chosen as the smallest index among the alternatives.
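The recursive definition above, including tie-breaking by the smallest index, can be transcribed directly; the helper below is our own illustrative implementation.

```python
import numpy as np

def kth_neighbour(X, i, k, p=2):
    """Return N[i,k]: nearest index to X_i after removing N[i,1],...,N[i,k-1].

    Ties are broken by taking the smallest index, as suggested in the text.
    """
    M = X.shape[0]
    excluded = {i}
    best = None
    for _ in range(k):
        # argmin over the remaining indices; the (distance, index) key
        # breaks ties in favour of the smaller index.
        best = min(
            (j for j in range(M) if j not in excluded),
            key=lambda j: (np.linalg.norm(X[j] - X[i], ord=p), j),
        )
        excluded.add(best)
    return best  # N[i, k]

X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [0.5, 0.0]])
print(kth_neighbour(X, i=0, k=1))  # → 3 (the point at distance 0.5)
print(kth_neighbour(X, i=0, k=2))  # → 1 (next closest after removing index 3)
```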

To fix some notation, let us define

Next, we prove a geometric upper bound for the average *k*-th nearest neighbour distance. The proof is based on showing that any point *x* belongs to at most *k* of the balls *B*_{p}(*X*_{i}, *d*_{i,k}/2).

**Theorem 2.1.** *For any* 0<*α*≤*n* *and* *r*>0, *inequality* (2.1) *holds almost surely*.

*Proof.* Choose any *x*∈ℝ^{n}. Let us make the counter assumption that there exist *k*+1 points, denoted by $X_{i_1},\ldots,X_{i_{k+1}}$ (the indices being distinct), such that $\lVert x - X_{i_j}\rVert_p < d_{i_j,k}/2$ for *j*=1, …, *k*+1. Let $(X_{i_a}, X_{i_b})$ be the pair that maximizes the distance $\lVert X_{i_a} - X_{i_b}\rVert_p$. Under these conditions the triangle inequality yields

$$\lVert X_{i_a} - X_{i_b}\rVert_p \le \lVert X_{i_a} - x\rVert_p + \lVert x - X_{i_b}\rVert_p < \tfrac{1}{2}\big(d_{i_a,k} + d_{i_b,k}\big) \le \max\big(d_{i_a,k},\, d_{i_b,k}\big).$$

The strict inequality holds because the balls are open. On the other hand, assuming without loss of generality that $d_{i_a,k}$ attains the maximum, every one of the remaining *k* points satisfies $\lVert X_{i_j} - X_{i_a}\rVert_p \le \lVert X_{i_a} - X_{i_b}\rVert_p < d_{i_a,k}$, so that *k* points would lie strictly closer to $X_{i_a}$ than its *k*-th nearest neighbour, leading to a contradiction. Thus, we have for the sum of indicator functions

$$\sum_{i=1}^{M} \mathbf{1}\{x \in B_p(X_i, d_{i,k}/2)\} \le k.$$

On the other hand, integrating this inequality over *x* and noting that $\lambda(B_p(X_i, d_{i,k}/2)) = V_{n,p}\,(d_{i,k}/2)^n$ implies that

$$\frac{V_{n,p}}{2^n}\sum_{i=1}^{M} d_{i,k}^{\,n} \le k\,\lambda(A),$$

where *λ*(*A*) is the volume of the region covered by the balls. By Jensen's inequality,

$$\frac{1}{M}\sum_{i=1}^{M} d_{i,k}^{\alpha} \le \left(\frac{1}{M}\sum_{i=1}^{M} d_{i,k}^{\,n}\right)^{\alpha/n},$$

which implies equation (2.1). ▪
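The covering property established in the proof — no point belongs to more than *k* of the open balls *B*_{p}(*X*_{i}, *d*_{i,k}/2) — can be checked empirically on a random configuration; the setup below is our own.

```python
import numpy as np

rng = np.random.default_rng(1)
M, n, k, p = 200, 2, 3, 2
X = rng.random((M, n))

# d_{i,k} for every i, via the pairwise distance matrix.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], ord=p, axis=2)
np.fill_diagonal(D, np.inf)
d_k = np.sort(D, axis=1)[:, k - 1]

# counts[j] = number of open balls B_p(X_i, d_{i,k}/2) containing query j.
queries = rng.random((1000, n))
dist_q = np.linalg.norm(queries[:, None, :] - X[None, :, :], ord=p, axis=2)
counts = (dist_q < d_k[None, :] / 2).sum(axis=1)
print(int(counts.max()))  # never exceeds k, as the proof guarantees
```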

As stated in the following corollary, the last inequality in theorem 1.1 follows straightforwardly from theorem 2.1, because the boundedness assumption implies that all nearest neighbour distances are smaller than 2*r*.

**Corollary.** *For* 0<*α*≤*n* *we have*

## 3. The boundary effect

In this section we show that the bound in theorem 2.1 can be improved by dividing the nearest neighbour distances into two sets corresponding to small and large values. We will show that the volume of the exceptional set is asymptotically negligible, which consequently implies the first inequality in (1.2).

**Theorem 3.1.** *Assume that* *when r*≤*c*_{2} *for some constants c*_{1}, *c*_{2}>0. *Then for any* 0<*α*<*n*, (3.1)

*Proof.* Let us define the set of indices

$$I_r = \{\, i : d_{i,k} > r \,\},$$

corresponding to points whose *k*-th nearest neighbour distance is larger than *r*. If the number of elements |*I*_{r}| is larger than one, we may define the subsample $\{X_i : i \in I_r\}$ and the corresponding nearest neighbour distances $\tilde d_{i,k}$. Because excluding points can only increase the distances between a point and its nearest neighbours, we obtain $d_{i,k} \le \tilde d_{i,k}$ for $i \in I_r$. A straightforward application of theorem 2.1 to the subsample bounds the sum of these distances. The first inequality in (3.1) now follows by Chebyshev's inequality. One should also take into account the case |*I*_{r}|=1, which, however, does not pose any problems.

To see the second result, choose *r* appropriately and use an approximation valid for small *x*. ▪
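The monotonicity step used above — excluding points can only increase nearest neighbour distances — admits a quick numerical sanity check; the helper names are our own.

```python
import numpy as np

def nn_dists(X, k=1):
    """k-th nearest neighbour distance of every point (Euclidean norm)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)
    return np.sort(D, axis=1)[:, k - 1]

rng = np.random.default_rng(3)
X = rng.random((100, 2))
keep = np.arange(0, 100, 2)          # an arbitrary subsample of the indices
full = nn_dists(X, k=2)[keep]        # distances within the full sample
sub = nn_dists(X[keep], k=2)         # distances within the subsample only
print(bool(np.all(sub >= full)))     # → True: distances only grow
```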

The condition requires some regularity of the boundary of the support; it is similar to condition C.2 in Evans *et al*. (2002). Such a bound holds for most sets encountered in practice.

It is clear that the influence of points close to the boundary grows as the dimensionality of the space increases. To demonstrate the improvement over a direct application of theorem 2.1, both bounds are plotted in figure 1 for *n*=3, *k*=1, *p*=2 and *α*=1, using the estimate in (3.1).

## 4. A probabilistic lower bound

### (a) The small ball probability

In this section the lower bound in theorem 1.2 is derived. The proof is based on the properties of the random variable *P*(*B*_{p}(*X*_{i}, *d*_{i,k})), the probability mass of the *k*-th nearest neighbour ball. It is interesting that this random variable has a distribution that is independent of the probability measure *P*, as shown in the following well-known lemma. The proof is given here for completeness, but it can also be found, for example, in Evans & Jones (2002).

**Lemma 4.1.** *For any i and* *α*>0,

$$\mathrm{E}\big[P(B_p(X_i, d_{i,k}))^{\alpha}\big] = \frac{\Gamma(M)\,\Gamma(k+\alpha)}{\Gamma(k)\,\Gamma(M+\alpha)}.$$

*Proof.* Choose 0<*z*<1 and let *t* be a radius for which the ball *B*_{p}(*X*_{i}, *t*) has probability mass *z*. Then *P*(*B*_{p}(*X*_{i}, *d*_{i,k})) > *z* if and only if there are at most *k*−1 of the remaining points in the set *B*_{p}(*X*_{i}, *t*). Thus, a combinatorial argument yields almost surely

$$P\big(P(B_p(X_i, d_{i,k})) > z\big) = \sum_{j=0}^{k-1} \binom{M-1}{j} z^{j} (1-z)^{M-1-j}. \qquad (4.1)$$

Using the formula expressing moments in terms of the distribution function (theorem 8.16 in Rudin (1987)) and, for example, an induction argument together with the properties of the beta function, we obtain the result. ▪
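The distribution-free behaviour described by the lemma can be illustrated by Monte Carlo: for uniform data on [0,1], the probability mass of the *k*-th nearest neighbour ball behaves like the *k*-th order statistic of *M*−1 uniforms, so its mean should be close to *k*/*M*. The simulation below is our own construction.

```python
import numpy as np

rng = np.random.default_rng(2)
M, k, trials = 20, 2, 2000
masses = np.empty(trials)
for t in range(trials):
    X = rng.random(M)                        # i.i.d. Uniform[0,1] sample
    x = X[0]                                 # a fixed index i = 0
    d = np.sort(np.abs(X[1:] - x))[k - 1]    # d_{0,k}
    # Mass of the ball (x-d, x+d) under Uniform[0,1]: the overlap length.
    masses[t] = min(x + d, 1.0) - max(x - d, 0.0)

print(masses.mean())  # close to k/M = 0.1, independently of the density
```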

### (b) A derivation of the lower bound

Let us define the Hardy–Littlewood maximal function

$$L(x) = \sup_{r>0} \frac{1}{\lambda(B_p(x,r))} \int_{B_p(x,r)} q \,\mathrm{d}\lambda.$$

As a consequence of basic properties of maximal functions, we may bound the *L*^{1}-norm of *L*(*x*) by the following lemma.

**Lemma 4.2.** *Choose s*>1 *and let s*′=*s*/(*s*−1) *be its conjugate exponent. For any i*>0*,* (4.2)

*Proof.* By Hölder's inequality, (4.3) holds. By a classical result for maximal functions (see, for example, theorem 8.18 in Rudin (1987)), the *L*^{s}-norm of the maximal function is finite if the density *q* belongs to the space *L*^{s}. In fact, the proof of theorem 8.18 in the aforementioned reference gives a well-known explicit bound, which can be substituted into equation (4.3) to obtain equation (4.2). ▪
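The maximal function used in this section can be approximated on a grid. The following one-dimensional discretization is a rough illustrative sketch (the true supremum ranges over all radii, not just grid multiples), with our own names and test density.

```python
import numpy as np

def maximal_function(q_grid, dx):
    """Discrete Hardy-Littlewood maximal function of a density sampled on
    a uniform grid with spacing dx: at each cell, the largest average of
    q over centred windows of 1, 2, ..., N cells in each direction."""
    N = len(q_grid)
    # csum[j] = integral of q over the first j cells.
    csum = np.concatenate(([0.0], np.cumsum(q_grid) * dx))
    L = np.zeros(N)
    for i in range(N):
        best = 0.0
        for r in range(1, N + 1):
            lo, hi = max(0, i - r), min(N, i + r + 1)
            avg = (csum[hi] - csum[lo]) / ((hi - lo) * dx)
            best = max(best, avg)
        L[i] = best
    return L

# A density equal to 2 on [0, 0.5) and 0 elsewhere (total mass 1).
q = np.where(np.linspace(0, 1, 101) < 0.5, 2.0, 0.0)
L = maximal_function(q, dx=0.01)
print(L.max())  # the maximal function never exceeds sup q (= 2 here)
```

Note that *L* is strictly positive everywhere, even off the support of *q*, which is why its integrability is the non-trivial point addressed by the lemma.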

Given lemmas 4.1 and 4.2, the main result of the section follows rather easily.

**Theorem 4.3.** *For α, i*>0 *and s*>1*,*

*Proof.* By lemma 4.1, the moment of the ball probability is known exactly. The proof is completed by observing that Jensen's inequality relates this moment to the expected power-weighted distance, and applying lemma 4.2. ▪

Theorem 1.2 follows now as a corollary.

**Corollary.** *The inequality* *holds for* *α*>0 *and* 1≤*i*≤*M*.

## 5. Discussion

### (a) Possible improvements

A comparison between theorem 1.1 and equation (1.1) reveals that our upper bound is approximately 2^{α} times larger than the asymptotic moments in (1.1). An interesting question is whether theorem 3.1 can be improved by taking into account that the samples are independent and identically distributed (i.i.d.). On the other hand, the geometric bounds are strong in the sense that they hold for any configuration of points.

Theorem 4.3 requires that the density *q* belongs to *L*^{s} for some *s*≥2. Possibly such a condition could be relaxed to obtain a similar result under weaker assumptions. Another direction of considerable interest is the extension to the non-i.i.d. case, which occurs especially in physics.

### (b) Applications

In general, nearest neighbour distances arise as a basic quantity in many fields. One specific application of particular interest is the analysis of finite sphere packings, where the centres of non-overlapping hard spheres are chosen in a random way. Such packings arise, for example, via random sequential adsorption (see Talbot *et al*. 2000) or in equilibrium statistical physics (see for example Torquato 1995). The main challenge in hard-sphere systems is the possibility of long-range dependencies that hamper the theoretical analysis. Thus theorem 1.1 is an interesting tool for the analysis of packing fractions and other important quantities, as it is based on a non-probabilistic argument and holds for any configuration of points. It is also of interest to ask whether an analogue of theorem 1.2 could be proven for hard-sphere systems and how tight such bounds would be. The potential of bounds on nearest neighbour distances has already been noted by Torquato (1995), who found rather deep connections between physical and geometric quantities.

Many estimators in non-parametric statistics are based on the use of nearest neighbour distances. Thus, non-parametric statistics is another interesting application area for the theory of nearest neighbour distributions. For some recent work we refer to Kohler *et al*. (2006), where a probabilistic nearest neighbour bound plays an important role in the convergence analysis of non-parametric regression estimators for unbounded data. Apart from regression, similar techniques are useful in the analysis of non-parametric classifiers, as demonstrated by Kulkarni & Posner (1995). The advantage of our bounds is that they take a concrete form without unknown constants while being rather tight and general.

## Footnotes

- Received September 26, 2007.
- Accepted March 18, 2008.

- © 2008 The Royal Society