## Abstract

The Moran process models the spread of genetic mutations through populations. A mutant with relative fitness *r* is introduced and the system evolves, either reaching fixation (an all-mutant population) or extinction (no mutants). In a widely cited paper, Lieberman *et al.* (2005 Evolutionary dynamics on graphs. *Nature* **433**, 312–316) generalize the model to populations on the vertices of graphs. They describe a class of graphs (‘superstars’), with a parameter *k* and state that the fixation probability tends to 1−*r*^{−k} as the graphs get larger: we show that this is untrue as stated. Specifically, for *k*=5, we show that the fixation probability (in the limit, as graphs get larger) cannot exceed 1−1/*j*(*r*), where *j*(*r*)=*Θ*(*r*^{4}), contrary to the claimed result. Our proof is fully rigorous, though we use a computer algebra package to invert a 31×31 symbolic matrix. We do believe the qualitative claim of Lieberman *et al.*—that superstar fixation probability tends to 1 as *k* increases—and that it can probably be proved similarly to their sketch. We were able to run larger simulations than the ones they presented. Simulations on graphs of around 40 000 vertices do not support their claim but these graphs might be too small to exhibit the limiting behaviour.

## 1. Introduction

The Moran process [1] is a simple, discrete-time model of the spread of genetic mutations through a finite population. Individuals that do not possess the mutation have ‘fitness’ 1 and mutants have fitness *r*>0. At each time step, an individual is selected, with probability proportional to its fitness, to reproduce. A second individual is chosen uniformly at random, without regard to fitness, and is replaced with a copy of the reproducer. Because the reproducer is chosen with probability proportional to its fitness, the case *r*>1 corresponds to an advantageous mutation. With probability 1, the population will reach one of two states, after which no further change is possible: the population will consist entirely of mutants or of non-mutants. These scenarios are referred to as *fixation* and *extinction*, respectively.

Lieberman *et al*. [2] extend the model by structuring the population on the vertices of a fixed directed graph. Each vertex corresponds to exactly one individual. In each time step of this generalized Moran process, the reproducer is chosen as before: an individual is selected, with probability proportional to its fitness. Then a second individual is selected uniformly at random from the set of out-neighbours of the reproducer. Once again, the second individual is replaced with a copy of the reproducer. The original Moran process corresponds to the special case of the extended process in which the graph is a complete graph (one with edges between all pairs of individuals).

In this paper, we study the model of Lieberman *et al*. It is referred to as an *invasion process* because an individual duplicates and then replaces another. This is in contrast to the *voter model*, which is another generalization of the Moran process in which individuals first die, and are then replaced. There is much work on voter-model variants of the Moran process [3]. In general, voter models and invasion process behave differently [4].

Given a graph *G*, we can ask what is the probability that a mutant with fitness *r* reaches fixation in the invasion process, and we denote this probability by *f*(*G*;*r*). It is easy to see that the number of mutants in the original Moran process behaves as a random walk on the integers with bias *r* to the right and with absorbing barriers at 0 and *N*, where *N* is the population size. Hence, as , the fixation probability tends to 1−1/*r*. The generalized Moran process can have a higher fixation probability. For example, on the complete bipartite graph *K*_{1,N−1}, the fixation probability tends to 1−1/*r*^{2} as *N* tends to infinity (see, for example, Broom and Rychtář’s [5] calculation of the exact fixation probability, as a function of *r* and *N*).

### (a) Families of graphs with high fixation probability

Lieberman *et al.* [2] introduce three classes of graphs, which they call funnels, metafunnels and superstars. Superstars will be defined formally in §2. An example is given in figure 1. Funnels, metafunnels and superstars are essentially layered graphs, with the addition of ‘positive feedback loops’, and they have a parameter *k* that corresponds to the number of layers. Lieberman *et al.* claim that, for fixed *r*>1, for sufficiently large graphs in these classes, the fixation probability tends to 1−*r*^{−k}. This is stated as [2], theorem 3 for superstars and a proof sketch is given. Hauert states [6, eqn.] (5) that the same limiting fixation probability (and presumably the same argument) also applies to funnels. Lieberman *et al.* [2] conclude that funnels, metafunnels and superstars ‘have the amazing property that, for large *N* [the number of vertices in the graph], the fixation probability of any advantageous mutant converges to one. […] Hence, these population structures guarantee fixation of advantageous mutants, however small their selective advantage’ ([2], p. 314).

The claimed limiting fixation probability of 1−*r*^{−k} is cited frequently in the literature (see, for example, [7, eqn. (2)], [8, eqn. (4)], the survey paper [9, eqn. (6)] and the references therein). We prove that this limiting fixation probability is incorrect for *k*=5, demonstrating that the proof sketch cannot be made rigorous, at least for the exact claim that they make.

On the other hand, superstars do seem to be well designed to amplify selection. Informally, the chains in these graphs (such as the chain *c*_{1,1},*c*_{1,2},*c*_{1,3} in figure 1) seem to be a good mechanism for amplifying the fitness of a mutant, and the trade-off between the high out-degree of the centre vertex and the lower in-degree seems to be a useful feature.

We have investigated the fixation probability of superstars via computer simulation. Before discussing our proof, and the result of these simulations, we give a brief survey of the relevant literature. Lieberman *et al.* [2] simulated the fixation probability of superstars for the special cases when *r*=1.1 and *k*=3 and *k*=4 on graphs of around 10 000 vertices. Unfortunately, these particular values are too small to give evidence of their general claim.

Funnels and metafunnels are not very amenable to simulation because the number of vertices is exponential in the relevant parameters. We are not aware of any published justification for the claim for metafunnels but there has been some simulation work relevant to funnels. Barbosa *et al.* [7] have found the fixation probability to be close to 1−*r*^{−3} for funnels of up to around 1600 vertices, for the special cases, *k*=3 and *r*=1.1 and *r*=2. Motivated by the claimed fixation probability for funnels, their objective was to see whether similar phenomena occur for similar randomly generated layered graphs, which they argue are more like ‘naturally occurring population structures’ than are funnels, metafunnels and superstars. They found that the fixation probabilities for *r*=1.1 and *r*=2 on these randomly generated graphs with *k*=5 or *k*=10 generally exceed the value of 1−1/*r* that would be seen in an unstructured population but are substantially lower than 1−*r*^{−k}. These experiments do not apply directly to funnels (and it may be that the graphs that they considered were too small to demonstrate the limit behaviour) but, in any case, their experiments do not give evidence in favour of the fixation probability claimed by Lieberman *et al.* [2].

For small graphs, it is possible to calculate exact fixation probabilities by solving a linear system. If the graph has *n* vertices, then the Moran process has 2^{n} states, so there are 2^{n} equations in the linear system. Computationally, solving such a system is not feasible, apart from for tiny graphs. A significant improvement was introduced by Houchmandzadeh & Vallade [10], who present a new method for calculating fixation probabilities by solving differential equations. The relevant equation [10, eqn. 23] has a variable *z*_{i} for each vertex *i*, so there are *n* variables in all. A further improvement is given: if the vertices in the graph can be partitioned into equivalence classes such that all of the vertices in a given equivalence class have exactly the same set of in-neighbours and the same set of out-neighbours, then these vertices can share a variable (in the terminology of Houchmandzadeh & Vallade [10], they can be viewed as a single ‘island’). Thus, the fixation probability can be calculated by solving a differential equation in which the number of variables equals the number of equivalence classes. The paper [10] also offers a method for approximately solving the relevant differential equations. This seems to work well, in practice, though the approximation is difficult to analyse and there are currently no known results guaranteeing how close the approximate value will be to the actual fixation probability.

### (b) Outline of the paper

In §2, we prove that the fixation probability for sufficiently large parameter-5 superstars cannot exceed 1−(*r*+1)/(2*r*^{5}+*r*+1), which is clearly bounded below 1−*r*^{−5} for all sufficiently large *r* (in particular, for ). This proof is fully rigorous, though we use a computer algebra package to invert a 31×31 symbolic matrix. Thus, we show that theorem 3 of Lieberman *et al.* [2] is incorrect as stated (though something very similar may well be true).

Section 3 presents simulation results on graphs of around 40 000 vertices. These simulations do not support the claim that the fixation probability is 1−*r*^{−k}, or that this probability increases as *k* increases. However, it may be that 40 000 vertices is not enough to exhibit the true limiting behaviour.

## 2. An upper bound for *k*=5

The superstars of Lieberman *et al.* are defined as follows. A superstar has a *centre* vertex *v*, and ℓ disjoint subgraphs called *leaves*. Each leaf consists of a *reservoir* of *m* vertices, together with a chain of length *k*−2. There are edges from the centre to the reservoir vertices, from the reservoir vertices to the start of the chain, and from the end of the chain back to the centre. The formal definition follows, where [*n*] denotes the set {1,…,*n*}.

### Definition 2.1 (Lieberman *et al.* [2])

Define
and
The graph is a *parameter- k superstar* with ℓ leaves and reservoir size

*m*. We use

*n*to denote |

*V*|=1+ℓ(

*m*+

*k*−2).

Figure 1 shows the parameter-5 superstar . The parameter *k* is sometimes referred to as the ‘amplification factor’.

Lieberman *et al.* state the following proposition (which turns out to be incorrect—see theorem 2.4 below).

### Proposition 2.2 (stated as Lieberman *et al.* [2, theorem 3])

The statement of proposition 2.2 is not sufficiently precise because the two variables ℓ and *m* are simultaneously taken to infinity without regard to the relative rates at which they tend to infinity. Nevertheless, we can make sense of the proposition by regarding *m* as a function of ℓ. We require that *m*(ℓ)=*ω*(1), that is that the function *m*(ℓ) is an increasing function of ℓ that grows without bound. For example, Nowak [11] considers *m*=ℓ. Because we are only interested in *r*>1, we can also simplify the expression, using the fact that the denominator tends to 1.

### Proposition 2.3 (The *r*>1 case of Lieberman *et al*. [2, theorem 3])

*r*>1

*Suppose r*>1 *and m*(ℓ)=*ω*(1). *Then*

Lieberman *et al.* give a brief sketch of a proposed proof of proposition 2.3. However, we now show that this sketch cannot be made rigorous for the proposition as stated. We do this by choosing a fixed value of *k* (specifically, *k*=5) and showing that proposition 2.3 is false for this value of *k*. Specifically, we show the following:

### Theorem 2.4

*Let m(ℓ) be any function which is ω(1). Let j(r)= (2r*^{5}*+r+1)/(r+1). For any r>1, if* *exists, then
*

Note that theorem 2.4 applies for any function *m*(ℓ)=*ω*(1). In particular, it shows that, for all *r*>1, if exists then
whereas proposition 2.3 would give the contrary conclusion
where 1−1/*r*^{5}>1−1/*j*(*r*) for all sufficiently large *r* (specifically, for ) because *j*(*r*)=*Θ*(*r*^{4}).

### Proof of theorem 2.4.

Let *m*(ℓ) be any function which is *ω*(1). Consider the generalized Moran process on . Let *R* be the event that the initial mutant is placed on a reservoir vertex, and let *F* be the event that, at some time during the execution of the process, the centre vertex *v* is occupied by a mutant and is chosen for reproduction. Let *p*(ℓ,*r*) be the probability that *R* does not occur and let *q*(ℓ,*r*) be the probability that *F* occurs, conditioned on the fact that event *R* occurs. Clearly,
Let . We will show that this limit exists for every *r*>0, and that *h*(*r*)=1−1/*j*(*r*). From the calculation above, it is clear that, for every *r*>1, if exists then

In fact, the value *q*(ℓ,*r*) is a rational function in the variables ℓ, *m*(ℓ) and *r*. This rational function can be calculated by solving a linear system. We solved this linear system using Mathematica—the corresponding Mathematica program is in appendix A. The program consists of three main parts. The first block of code defines useful constants, the bulk of the file defines the system of linear equations and the last four blocks solve the system for all variables and extract the solution of interest.

In the Mathematica program, *V* denotes the vertex *v*, *X* denotes the reservoir vertex *x*_{i,j} in which the initial mutant is placed, and *O*, *P* and *Q* represent the vertices in the corresponding chain (*c*_{i,1}, *c*_{i,2} and *c*_{i,3}, respectively). Let *Ψ*={*V*,*X*,*O*,*P*,*Q*}. If we start the generalized Moran process from the state in which vertex *X* is occupied by a mutant, and no other vertices are occupied by mutants, then no vertices outside *Ψ* can be occupied by mutants until event *F* occurs. In the program, *L* is a variable representing the quantity ℓ, and *M* is a variable representing the quantity *m*(ℓ). Let *Ω* be the state space of the generalized Moran process, which contains one state for each subset of *Ψ*. The state corresponding to subset *S*∈*Ω* is the state in which the vertices in *S* are occupied by mutants and no other vertices are occupied by mutants. We use the program variable *FS* to denote the probability that event *F* occurs, starting from state *S*.

For each state *S*, *EQS* is a linear equation relating *FS* to the other variables in {*FS*′|*S*′∈*Ω*}. The linear equations can be derived by considering the transitions of the system. To aid the reader, we give an example. Consider the state *XO* in which vertices *X* and *O* are occupied by mutants. From this state, three transitions are possible. (We write *W* for the total fitness of vertices in the state under consideration.)

— With probability

*r*/*W*, vertex*O*is chosen for reproduction. Vertex*P*becomes a mutant so the new state is*XOP*.— With probability (1/

*W*)×(1/*LM*), vertex*V*is chosen for reproduction. From among its*LM*neighbours, it chooses vertex*X*to update (removing the mutant from vertex*X*), so the new state is*O*.— With probability (

*M*−1)/*W*, one of the vertices in {*x*_{i,j}|*j*∈[*m*(ℓ)]}\*X*is chosen for reproduction, removing the mutant from vertex*O*, so the new state is*X*.

Thus, we have the equality

This equality (which we called *EQXO*) is included in the linear system constructed in the Mathematica program (except that we normalized by multiplying the numerator and denominator by *W*). The constant *DXO* is defined to stand for the denominator of this expression to enhance readability. The constants *XonO*, *XoffO* and so on refer to the probabilities that, respectively, the vertex *O* is made a mutant or a non-mutant (‘switched on or off’) by *X* (again, normalized by multiplying by *W*).

We similarly derive an equation *EQS* for every non-empty state *S*∈*Ω*. Clearly, if *S* is the state in which no vertices are mutants then *FS*=0, so we can account for this directly in the other equations. The system therefore consists of 31 equations in 31 variables with one variable *FS* for each non-empty state *S*∈*Ω*. The desired quantity *q*(ℓ,*r*) is equal to *FX*, which can therefore be calculated by (symbolically) solving the linear system.

The solution for *FX* is a rational function in *L*, *M* and *r*. The numerator of this rational function can be written as . We say that the term *c*_{i,j}(*r*)*L*^{i}*M*^{j} is *dominated* by the term *c*_{i′,j′}(*r*)*L*^{i′}*M*^{j′} if *c*_{i′,j′}≠0, , and *i*+*j*<*i*′+*j*′. The sum of the undominated terms in the numerator is
Similarly, the sum of the undominated terms in the denominator is

Thus, for any fixed *r*,
Because *j*(*r*)=(2*r*^{5}+*r*+1)/(*r*+1), we have . □

## 3. Simulations on superstars

We simulated the generalized Moran process on superstars with ℓ=*m*=200 and for *k*∈{3,4,5,6,7,12} and *r*∈{1.1,2,3,5,10,50}. Thus, the size of the graphs ranges from approximately 40 000 to 42 000 vertices. For each choice of parameters, we ran 2500 simulations for and 10 000 for . The results are presented in table 1 and figure 2.

For clarity, we have plotted extinction probability (i.e. 1−*f*(*G*;*r*)) rather than fixation probability, and we have plotted on a log–log scale. The straight line shows the value of *r*^{−k}, that is the extinction probability predicted by proposition 2.3, and the points are the fixation probabilities derived by simulation, along with their 99.5% confidence intervals.^{1} The only parameter values that we simulated for which *r*^{−k} falls within the 99.5% confidence interval of our simulations are *k*=3 and *r*∈{1.1,2} and *k*=4, *r*=1.1; these are the points marked with asterisks in figure 2. In all other cases, the extinction probabilities are significantly higher than the claimed value of *r*^{−k}, with the disparity growing as *k* increases.^{2}

Reading down the columns of table 1, it can be seen that for , the fixation probabilities do not increase towards 1 but tail off for larger values of *k*. In particular, the lower end of the 99.5% confidence interval for *k*=5 is greater than the upper end of the corresponding interval for *k*=12 for *r*∈{3,5,10}. This observation does suggest that the claimed fixation probability in proposition 2.3 may be qualitatively wrong in the sense that the fixation probability might not tend to 1 as *k* increases. However, we are inclined to believe that the proposition is qualitatively correct, and that the tailing off in the data is explained by the fact that, for large values of *k*, the values of ℓ and *m* which we were able to simulate may have been too small for the limiting behaviour to be apparent.

One can also consider the degenerate case *k*=2, which has chains of length zero (i.e. direct edges) from the reservoir vertices to the centre: that is, the superstar is just the complete bipartite graph *K*_{1,ℓm}, also known as a ‘star’. Large stars have fixation probability tending towards 1−*r*^{−2} [5] which is 0.9996 for *r*=50. This is above the upper end of the 99.5% confidence interval of all our *r*=50 superstar simulations, but again we suspect that ℓ=*m*=200 is too small for our simulations to exhibit limiting behaviour in that case.

Note that each graph in figure 2 corresponds to a row of the table. For fixed *k*, the fixation probability does indeed tend to 1 as *r* increases and this is easily seen to hold for any strongly connected graph.

Lieberman *et al.* simulated only the case *r*=1.1 with *k*=3 and *k*=4, on graphs of around 10 000 vertices (they do not state what values of ℓ and *m* they used). Their results in these cases are consistent with ours: they measure fixation probabilities of approximately 0.25 and 0.30 for *k*=3 and *k*=4, respectively. For *r* close to 1 and small *k*, the fixation probability is reasonably close to 1−*r*^{−k}.

The reader is referred to the electronic supplementary material for the simulation code, a description of it and a proof of its correctness. As Barbosa *et al.* [8] point out, it is difficult to simulate on large graphs because of resource constraints. We use various time-saving tricks that they discuss such as skipping simulation steps where nothing changes [13]. We also describe several optimizations that we use that are specific to superstars.

## Acknowledgements

L.A.G. and D.G. are partially supported by EPSRC grant EP/I011528/1 *Computational Counting*. P.G.S. is partially supported by the EU FET IP project MULTIPLEX contract no. 317532.

## Appendix A. Mathematica code

Here is the text of the Mathematica program that we ran to solve the linear system. We explain the code in the proof of theorem 2.4.

## Footnotes

↵1 Brown

*et al*. [12] and others have shown that the standard (Wald) binomial confidence interval of has severely chaotic behaviour, especially when*p*is close to 0 or 1, as here, even for values of*n*in the thousands. This unpredictably produces confidence intervals with much lower coverage probabilities than the nominal confidence level—often by 10 per cent or more. Following the discussion in [12], we use what they call the Agresti–Coull interval, which applies a small adjustment to*p*and*n*before computing the interval. This avoids the erratic behaviour of the Wald interval and gives coverage probabilities that are closer to the nominal confidence level and generally exceed it for*p*close to 0 or 1.↵2 Quantitatively, these results would only be weakened slightly by using the standard Wald interval: 1−

*r*^{−k}would be within the confidence interval for the additional points*k*=3,*r*=3 and*k*=4,*r*=2.

- Received March 25, 2013.
- Accepted April 30, 2013.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.