Optimizing snake locomotion in the plane

Silas Alben

Abstract

We develop a numerical scheme to determine which planar snake motions are optimal for locomotory efficiency, across a wide range of frictional parameter space. For a large coefficient of transverse friction, we show that retrograde travelling waves are optimal. We give an asymptotic analysis showing that the optimal wave amplitude decays as the power of the coefficient of transverse friction. This result agrees well with the numerical optima. At the other extreme, zero coefficient of transverse friction, we propose a triangular direct wave that is optimal. Between these two extremes, a variety of complex, locally optimal motions are found. Some of these can be classified as standing waves (or ratcheting motions).

1. Introduction

Snake locomotion has a long history of study by biologists, engineers and applied mathematicians [16]. A lack of appendages makes snake motions simpler in some respects than those of other locomoting animals [7]. However, snakes can move through a wide range of terrestrial [1,8,9], aquatic [10] and aerial [11] environments, with different kinematics corresponding to the different dynamical laws that apply in each setting. Here, we focus on terrestrial locomotion, with motions restricted to two dimensions, and a Coulomb frictional force acting on the snake. Even with these simplifying assumptions, there are many possible snake kinematics to consider. One way to organize our understanding of a diversity of locomotory kinematics is to propose a measure of locomotory performance (here, efficiency), and determine which motions are optimal, and how their performance varies with physical parameters. Well-known examples are optimization studies of organisms moving in low- [1217] and high-Reynolds-number fluid flows [1823].

In this work, we use a recently proposed model for snake locomotion in the plane, which has shown good agreement with biological snakes [3,6]. The snake is a slender body whose curvature changes as a prescribed function of arc length distance along the backbone and time. Its net rotation and translation are then determined by Newton's laws, with Coulomb friction between the snake and the ground. This model is perhaps one of the simplest that can represent arbitrary planar snake motions. Previous studies have used the model to find optimal snake motions when the motions are restricted to sinusoidal travelling waves [6], or the bodies are composed of two or three links [24]. These motions are represented by a small number of parameters (2–10); at the upper end of this range, the optimization problem is difficult to solve using current methods. Guo & Mahadevan [2] developed a model that included the internal elasticity and viscosity of the snake, and studied the effects of these and other physical parameters on locomotion for a prescribed sinusoidal shape and sinusoidal and square-wave internal bending moments. In this work, we neglect the internal mechanics of the snake, and consider only the work performed by the snake on its environment, the ground. However, we consider a more general class of snake motions. Hu & Shelley assumed a sinusoidal travelling-wave snake shape, and computed the snake speed and locomotory efficiency across the two-parameter space of travelling-wave amplitude and wavelength for two values of frictional coefficients [6]. They also considered the ability of the snake to redistribute its weight during locomotion. Jing & Alben [24] considered the locomotion of two- and three-link snakes, and found analytical and numerical results for the scaling of snake speed and efficiency. Among the results were the optimal temporal function for actuating the internal angles between the links, expressed as Fourier series with one and two frequencies.

Here, we address the optimization problem for planar snakes more generally, by using a much larger number of parameters (45 in most cases) to represent the snake's curvature as a function of arc length and time. Although our snake shape space is limited, it is large enough to represent a wide range of shapes. To solve the optimization problem in a 45-dimensional shape space, and simultaneously across a large region of the two-dimensional friction parameter space, we develop an efficient numerical method based on a quasi-Newton method. The simulations show clearly that two types of travelling-wave motions, retrograde and direct waves, are optimal when the coefficient of friction transverse to the snake is large and small, respectively. An asymptotic analysis shows how the snake's travelling-wave amplitude should scale with the transverse coefficient of friction when it is large. The analysis gives a power law for the amplitude, which agrees well with the numerics. In this limit, the power required to move a snake optimally is simply that required to tow a straight snake forwards. Another analytical solution gives an optimally efficient direct-wave motion when the transverse coefficient of friction is zero. Between these extremes, the numerics show a more complicated set of optima that include standing-wave (or ratcheting) motions, among a wide variety of other locally optimal motions. Taken together, these results show which planar snake motions are optimal within a large space of possible motions and frictional coefficients.

2. Model

We use the same frictional snake model as [3,6,24], and so we only summarize it here. The snake's position is given by X(s,t)=(x(s,t),y(s,t)), a planar curve that is parametrized by arc length s and varies with time t. A schematic diagram is shown in figure 1.

Figure 1.

Schematic diagram of snake position, parametrized by arc length s (non-dimensionalized by snake length), at an instant in time. The tangent angle and unit vectors tangent and normal to the curve at a point are labelled. Vectors representing forwards, backwards and transverse velocities are shown with the corresponding friction coefficients μf, μb and μt.

The unit vectors tangent and normal to the curve are and respectively. The tangent angle and curvature are denoted by θ(s,t) and κ(s,t), and satisfy , and κ=∂sθ. We consider the problem of prescribing the curvature of the snake in order to obtain efficient locomotion. When κ(s,t) is prescribed, the tangent angle and position are obtained by integration, 2.1 2.2 and 2.3The trailing-edge position (x0,y0) and tangent angle θ0 are determined by the force and torque balance for the snake, i.e. Newton's second law, 2.4 2.5 and 2.6Here, X=(−y,x), ρ is the snake's mass per unit length and L is the snake length. The snake is locally inextensible, ρ is uniform in space and constant in time and L is constant in time. Here, f is the force per unit length on the snake owing to Coulomb friction with the ground, 2.7Here, H is the Heaviside function and the hats denote normalized vectors. When ∥∂tX∥=0, we define to be 0. According to (2.7), the snake experiences friction with different coefficients for motions in different directions. The frictional coefficients are μf, μb and μt for motions in the forwards (), backwards () and transverse (i.e. normal, ) directions, respectively. In general, the snake velocity at a given point has both tangential and normal components, and the frictional force density has components acting in each direction. A similar decomposition of force into directional components occurs for viscous fluid forces on slender bodies [25].

We assume that the snake curvature κ(s,t) is a prescribed function of s and t that is periodic in t with period T. Many of the motions commonly observed in real snakes are essentially periodic in time [3]. We non-dimensionalize equations (2.4)–(2.6) by dividing lengths by the snake length L, time by T and mass by ρL. Dividing both sides by μfg, we obtain 2.8 2.9 and 2.10In (2.8)–(2.10) and, from now on, all variables are dimensionless. For most of the snake motions observed in nature, L/μfgT2≪1 [3], which means that the snake's inertia is negligible. By setting this parameter to zero, we simplify the problem considerably while maintaining a good representation of real snakes. Equations (2.8)–(2.10) become 2.11 2.12 and 2.13In equations (2.11)–(2.13), the dimensionless force f is 2.14Equations (2.11)–(2.13) thus involve only two parameters, which are ratios of the friction coefficients. From now on, for the sake of simplicity, we refer to μt/μf as μt and μb/μf as μb. Without loss of generality, we assume μb≥1. This amounts to defining the backward direction as that with the higher of the tangential frictional coefficients, when they are unequal; μt may assume any non-negative value. The same model was used in [3,6,24], and was found to agree well with the motions of biological snakes in [3].

Given the curvature κ(s,t), we solve the three nonlinear equations (2.11)–(2.13) at each time t for the three unknowns x0(t), y0(t) and θ0(t). Then, we obtain the snake's position as a function of time using (2.1)–(2.3). The distance travelled by the snake's centre of mass over one period is 2.15The work performed by the snake against friction over one period is 2.16We define the cost of locomotion as 2.17and our objective is to find κ(s,t) that minimizes η at (μb,μt) values which range widely over . In our computations, instead of η, we minimize a different function that is more convenient for physical and numerical reasons, 2.18To obtain F from η, we have substituted −d/W for W/d, to avoid infinities in the objective function for a common class of motions with d→0 and W of order 1. The exponential term in (2.18) penalizes rotations over a period, to ensure that the snake moves in a straight path rather than a circular path over many periods. As will be described in §4 (Results), for and all μb, we find optima that have an analytical travelling-wave form, and in the limit of large μt, rotations tend to zero for these solutions. These optima are found even with no rotation penalty, so the rotation penalty plays a negligible role in this region of parameter space. For smaller μt, however, in the absence of the rotation penalty, many of the optima have significant rotations. In appendix A, we give a more detailed explanation for the choice of F.

3. Numerical minimization method

Our goal is to determine the snake shape, κ(s,t), that minimizes F, for a given (μb,μt). Because κ(s,t) has t-period 1, we represent it via a double series expansion, 3.1Here, Tk is a Chebyshev polynomial in s, 3.2The expansion (3.1) converges as for a large class of κ that includes differentiable functions, and with a convergence rate that increases rapidly with the number of bounded derivatives of κ [26]. It is reasonable to hope that minimizing F over a class of functions (3.1) with small m1 and n1 gives a good approximation to the minimizers with infinite m1 and n1. Because the terms in (3.1) with coefficients β0k are automatically zero, the functions given by (3.1) are a (2m1−1)n1-dimensional space. In most of this work, we use m1=n1=5, so we are minimizing F in a 45-dimensional space. We give a few results at large μt with m1=n1=10 for comparison, and find similar minimizers in this region of parameter space. The minimization algorithm converges in a smaller portion of (μb,μt) space as m1 and n1 increase, and convergence is considerably slowed by the increased dimensionality of the search space. We find that m1=n1=5 is small enough to allow convergence to minima in a large portion of (μb,μt) space, but large enough to approximate the gross features of a wide range of κ(s,t). Another reason to expect that good approximations to minimizers can be obtained with small n1 is that the dynamical equations (2.11)–(2.13) depend only on spatial integrals of body position and velocity.

Using the dynamical equations (2.11)–(2.13), we show in the electronic supplementary material section A that d, W and F are the same for any reparametrization of time. Such a reparametrization can be used to reduce the high-frequency components of a motion while keeping the efficiency the same. Therefore, it is reasonable to expect that a good approximation to any minimizer of F can be obtained with low temporal frequencies, that is, with an expansion in the form of (3.1) with small m1.

We use the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm [27] to minimize F numerically, by computing a sequence of iterates that converge to a local minimizer of F. The algorithm requires routines for computing the values and gradients of F as well as a line search method for finding the next iterate at each step, using the search direction provided within the BFGS algorithm. The BFGS algorithm is a quasi-Newton algorithm, and therefore forms an approximation to the Hessian matrix of second derivatives, without the expense and difficulty of computing the Hessian matrix explicitly.

We have developed an efficient method for evaluating F and its gradient with respect to the (2m1−1)n1 shape parameters with just a single simulation of the snake trajectory over one period. Computing the value of F requires computing the trajectory of the snake over one period. We discretize the period interval uniformly with m time points: {0,1/m,2/m,…,1−1/m}. At each time point, we write (2.11)–(2.13) as a nonlinear system of three equations in three unknowns: {∂tx0,∂ty0,∂tθ0}. We also define an artificial frame for each time point in which the snake's trailing-edge coordinates are at the time in question; each such frame is fixed in time. We then solve (2.11)–(2.13) for the trailing-edge velocities in each of the artificial frames, . Because each artificial frame is obtained from the laboratory frame by a fixed translation and rotation, the tangent angle at the snake's trailing edge changes at the same rate in the two frames: . We thus obtain the trailing-edge tangent angle in the laboratory frame by integrating 3.3We then obtain the trailing-edge position in the laboratory frame by integrating the trailing-edge translational velocities, rotated to the laboratory frame, 3.4We then integrate κ(s,t) using {x0(t),y0(t),θ0(t)} to obtain the body trajectory a posteriori. Using {∂tx0,∂ty0,∂tθ0} as the unknowns instead of {x0,y0,θ0} has two advantages: we avoid the cancellation error associated with computing discrete time derivatives of {x0,y0,θ0}, and we solve m decoupled systems of three equations in three unknowns instead of a larger coupled system of 3m equations in 3m unknowns ({x0,y0,θ0} at all time points), which is more expensive to solve.

We solve the system (2.11)–(2.13) using Newton's method with a finite-difference Jacobian matrix. We solve it at each time sequentially from t=0, using a random initial guess for {∂tx0,∂ty0,∂tθ0} at t=0 and an extrapolation from previous time points as an initial guess at other t. An important element of our method is the quadrature used to compute the integrals in (2.11)–(2.13) using (2.14). Here, we introduce notation for the tangential and normal components of the snake velocity, 3.5In (2.14), we can write 3.6These terms have unbounded derivatives with respect to us and un when us,un→0. The local error when integrating (2.14) using the trapezoidal rule (for example) with uniform mesh size h is . To obtain convergence, in this case, it is necessary to have , so the trapezoidal rule (and other classical quadrature rules) needs to be locally adaptive when (us,un) tends to zero.

We use a different approach that allows for a uniform mesh even when (us,un)→0. We perform the quadrature analytically on subintervals using locally linear approximations for us and un. Using {∂tx0,∂ty0,∂tθ0} and κ(s,t), we compute ∂tX, X, and then us and un on a uniform mesh. On each subinterval, we approximate us and un as linear functions of the form As+B using their values at the interval endpoints. We approximate X, and as constants using their values at the midpoints of the intervals. Then, the integrals in (2.11)–(2.13), using (2.14), are of the form 3.7on each subinterval. We evaluate such integrals analytically using 3.8and 3.9When used to evaluate (3.7), the logarithms in (3.8)–(3.9) have cancelling singularities when the linear approximations to us and un are proportional (including the case in which one is zero), i.e. when ADBC→0. We use different (and simpler) analytical formulae on these subintervals because the integrands are constants or sign functions, which are easy to integrate analytically. This quadrature method is O(h2) for mesh size h owing to the use of linear and midpoint approximations of the functions in the integrands.

With this method for evaluating (2.11)–(2.13), we compute {∂tx0,∂ty0,∂tθ0} by Newton's method as described above, and obtain the snake trajectory and d, W and F for the prescribed curvature function. The BFGS method also requires the gradient of F with respect to the (2m1−1)n1 curvature coefficients {αjk,βjk}. In appendix B, we describe how the gradient of F is computed with a cost comparable with that of computing F alone.

4. Results

In figure 2, we show the result of one run of the optimization routine with μb=1 and μt=30, starting with a randomly chosen initial vector of curvature coefficients. In figure 2a, we plot a map of curvature values κ(s,t) over one period, and find that the curvature approximates a sinusoidal travelling wave with wavelength approximately equal to the snake body length. In this case, the waves appear to travel with nearly constant speed, shown by the straightness of the curvature bands. The bands deviate from straightness near the ends, where the largest curvatures occur. In figure 2b, we plot the objective function and of the two-norm of its gradient over the iteration count of the optimization routine. We find that the gradient norm decreases by four orders of magnitude within the first 100 iterations, and the objective function reaches a plateau at the same time. In figure 2c, we give snapshots of the position of the snake X(s,t) at equal instants in time, moving from left to right, and see that the snake approximately follows a sinusoidal path.

Figure 2.

Result of one snake optimization run for μb=1 and μt=30. (a) Map of curvature versus arc length and time. (b) Values of the objective function (dashed line) and of the two-norm of its gradient (solid line) with respect to the iteration count in the optimization routine. (c) Snake trajectory over one period. (Online version in colour.)

Figure 3 shows three different converged results for the same parameters μb=1 and μt=30, but with different randomly chosen initial iterates. Again, we have travelling waves of curvature shown by bands, but unlike in figure 2a, the bands are not straight. In addition, the bands travel from right to left instead of left to right. The bands are not straight because the travelling-wave speed changes over time. However, at any given time, the distances in s from the maximum to the minimum of curvature are nearly the same in figure 3ac and figure 2a, and so are the shapes. In all four cases, the snake passes through the same sequence of shapes, but at different speeds, and different (integer) numbers of times in one period. Thus, the four motions are essentially the same after a reparametrization of time. Correspondingly, the values of F for all four are in the range −5.4418±0.0024. The distances travelled, d, are within 0.2% of integer multiples of 0.597, with the multiples 4,3,2 and 1 for the motions in figures 2a and 3ac, respectively, which correspond to the number of times they move through the sequence of shapes represented by figure 3c. The values of F are the same whether the waves move from left to right or right to left in s, because μb (or μb/μf)=1, so there is a symmetry between forward and backward motions.

Figure 3.

Results of three additional snake optimization routines for μb=1 and μt=30. (ac) Maps of curvature versus arc length and time after convergence, starting from three different randomly chosen initial iterates. (Online version in colour.)

Over an ensemble of 20 random initializations, this sinusoidal motion yields the lowest value of F among the local optima found. We have repeated the search with twice as many spatial and temporal modes (m1=10, n1=10) and 48 random initializations. About three-quarters of the searches converge, and the majority of the converged states are very close to those shown in figures 2 and 3. The values of F for these states lie in the range −5.577±0.026. The mean is systematically lower than the mean for m1=5, n1=5 by about 2.5%. The decrease in the objective function is slight, and the optimal shape is fairly smooth, which indicates that the m1=5, n1=5 minimizer is representative of the minimizer for much larger m1 and n1, at least for μb=1 and μt=30. With m1=10 and n1=10, the curvature bands of figures 2a and 3 become straighter near the ends, showing the effect of the truncation in the Chebyshev series. However, it is more difficult to obtain convergence with four times as many modes because of the higher dimensionality of the space. As a primary goal here is to obtain the optima across as broad a range of (μb,μt) space as possible, we proceed with m1=5 and n1=5, except when noted.

Performing searches with μb=1 and a wide range of μt, we find that similar sinusoidal travelling-wave optima occur for . In figure 4a, we plot the values of F corresponding to the optima over a range of μt on a logarithmic scale, and find that the optima are more efficient at higher μt. In figure 4b, the curvatures of the optimal shapes are plotted for the same μt, at the times when the curvatures have a local maximum at . The corresponding snake body positions are shown in figure 4c, and are vertically displaced to make different bodies easier to distinguish. In all cases, the bodies approximate sinusoidal shapes with slightly more than one wavelength of curvature on the body. The curvature amplitudes decrease monotonically with increasing μt.

Figure 4.

Travelling-wave optima for 10 different μt on a logarithmic scale. (a) Values of F. (b) Curvature versus arc length at the instant when a curvature maximum crosses the midpoint, for the 10 μt values in (a). (c) Snake shapes corresponding to (b).

Performing searches with a wide range of μt and μb now, we find that similar sinusoidal travelling-wave optima are found for and all μb used (1≤μb≤3000). For these motions, the curvature travelling waves move backwards in s, and the tangential motion is purely forwards, so they have the same F at all μb. However, for μb>1, a separate class of (local, not global) optima are also found. These have curvature travelling waves that move forwards in s, and for which the tangential motion is purely backwards. As friction is higher in the backward direction, these local optima have higher F (are less efficient) than the oppositely moving optima, and the values of F for these optima decrease with increasing μb, which makes physical sense.

In figure 5a, we plot the values of F for these ‘reverse’ sinusoidal motions at various μb and μt, with leftward-pointing triangles. The triangles connected by a given solid line are data for various μt at a particular value of μb, labelled to the right of the line. When μb=1, the reverse sinusoidal motions have the same F values as the forward sinusoidal motions shown in figure 4. These have the lowest F values across all μb for , and are thus the most efficient motions we have found in this region of parameter space.

Figure 5.

Four categories of locally optimal motions: reverse sinusoidal motion, sinusoidal motion with curls, reverse sinusoidal motion with curls and other motions, for 10 different μt on a logarithmic scale. (a) Values of F. Corresponding values of μb are labelled only for the reverse sinusoidal motions. (b) Snapshots of a reverse sinusoidal motion with curls. Here, μb=1 and μt=10. (c,d) Snapshots of motions in the ‘other’ category, with μb=10 and μt=60 for (c) and μb=6 and μt=20 for (d).

Three other sets of symbols are used to denote other classes of local optima found by our method. All are less efficient than the forward sinusoidal motions, but we present examples of them to show some of the types of local optima that exist for this problem. For these three sets, shown by circles, downward-pointing triangles and plus signs, we do not join the optima with a given μb by lines, to reduce visual clutter. The circles, labelled ‘curl’, give F values for shapes that are like sinusoidal travelling waves, but they have a curl – the snake body winds around itself once in a closed loop. Snapshots of the snake motion for such a case are shown in figure 5b. Here, the snake moves horizontally, but the snapshots are displaced vertically to make them distinguishable. The arrow shows the direction of increasing time. The tail of the snake is marked by a circle. The snake body moves rightwards, whereas the curvature wave and curl both move leftwards. Because the snake overlaps itself, this motion is not physically valid in two-dimensional space. However, a large number of such optima are found numerically, with various small-integer numbers of curls (and with both directions of rotation). The efficiency of such motions decreases as the number of curls increases. The downward-pointing triangles, labelled ‘curl, reverse’, are for travelling waves with curls that pass backwards along the snake. These have a yet smaller efficiency for the same reason that the reverse sinusoidal motions are less efficient than the forward sinusoidal motions, in the absence of curls. The plusses, labelled ‘other’, are for other local optima, which have a variety of motions. Figure 5c shows snapshots of a curling (but not self-intersecting) motion in which the snake moves horizontally leftwards. The horizontal displacement is increased by of a snake length between snapshots to make them distinguishable. Figure 5d shows a different curling motion in which the snake moves horizontally rightwards, and the snapshots are displaced vertically to make them distinguishable. These alternative minima increase greatly in number as μt decreases below 6, and some of them outperform the forward sinusoidal motions at these transitional μt values. We will discuss this regime further after giving an analytical travelling-wave solution to the problem in the large-μt regime.

5. Large-μt analysis

Hu et al. [3] and Hu & Shelley [6] found that when the curvature of the snake backbone is prescribed as a sinusoidal travelling wave, high speed and efficiency are obtained when the coefficient of transverse friction is large compared with the coefficient of forward friction. Jing & Alben [24] found that the same holds for two- and three-link snakes. In biological snakes, the ratio of transverse to forward friction is thought to be greater than 1 [3,6], although how much greater is not clear. In [3,6], a ratio of only 1.7 was measured for anaesthetized snakes. These works noted that active snakes use their scales to increase frictional anisotropy, so the ratio in locomoting snakes is potentially much larger, and indeed, a ratio of 10 was found to give better agreement between the Coulomb friction model and a biological snake [6]. Wheeled snake robots have been used very successfully for locomotion [28,29], and for most wheels, the frictional coefficient ratio (with forward friction defined by the rolling resistance coefficient of the wheel) is much greater than 10 [30].

We now analytically determine the optimal snake motion in the limit of large μt, which helps us understand the numerical results in this regime. We give the position of the snake in terms of its components, 5.1If μt is large, then the snake moves more easily in the tangential direction than in the transverse direction for a given shape dynamics. Furthermore, the most efficient shape dynamics will give motion mainly in the tangential direction, to avoid large work performed against friction. We assume that the mean direction of motion is aligned with the x-axis, and that the deflections from the x-axis are small, that is, |y|,|∂ty|,|∂sy| and higher derivatives are O(μαt) for some negative α. This has been observed in the numerical solutions, and makes intuitive sense. Small deflections allow the snake to move along a nearly straight path, which is more efficient than a more curved path, which requires more tangential motion (and work performed against tangential friction) for a given forward motion, i.e. d.

We expand each of the terms in the force and torque balance equations in powers of |y| and retain only the terms that are dominant at large μt. We expand the cost of locomotion similarly, and find the dynamics that minimize it, in terms of y(s,t).

We first expand x(s,t). We decompose x(s,t) into its s-average and a zero-s-average remainder, 5.2where means that the constant of integration is chosen so that the integrated function has zero s-average. Hence, 5.3We denote the s-averaged horizontal velocity (the horizontal velocity of the centre of mass) by 5.4We expand the integrand in (5.3) in powers of |y| and its derivatives, 5.5The quartic remainder term in (5.5), O(|∂sy|4), actually includes time derivatives of ∂sy also, but we assume that |y| and all its derivatives are of the same order of smallness with respect to μt. We define 5.6and with the small-deflection assumption, (5.3) becomes 5.7and 5.8We will see that, for a quite general class of shape dynamics, U(t) is O(1), i.e. of the same order as one snake length per actuation period, even in the limit of large μt. Therefore, U is the dominant term in (5.7) and (5.8). We use this assumption on U to expand the velocity two-norm, 5.9In writing O(|y|4), we have again assumed that |y| has the same order as its derivatives. Dividing (5.8) by (5.9), we obtain the expansion for the normalized velocity, 5.10We next expand the unit tangent and normal vectors, 5.11and 5.12Using (5.10)–(5.12), 5.13and 5.14

Using (5.13) and (5.14), we can write the force and torque balance equations (2.11)–(2.13) and the cost of locomotion η in terms of U, h and derivatives of y. We revert to η instead of F here because it is analytically simpler, and the rotation penalty in F does not play a significant role at large μt, so η and F have the same optima.

We first expand η and use it to argue that the optimal shape dynamics is approximately a travelling wave. This allows us to neglect certain terms, which simplifies the formulae for the rest of the analysis. If the power expended at time t is P(t), then the cost of locomotion is 5.15 5.16In (5.16), we have assumed that the tangential motion of the snake is entirely forwards, with no portion moving backwards, so the coefficient in front of the tangential term is unity (i.e. μf, rather than the Heaviside term that switches between μf and μb). This assumption is not required, but it simplifies the equations from this point onwards, and will be seen to be correct shortly. The numerator of (5.16) can be separated into a part involving tangential velocity and a part involving normal velocity with a factor of μt. Using (5.9), (5.13) and (5.14), we can list the terms with the lowest powers of y in each part, 5.17and 5.18The optimal shape dynamics is a y(s,t) that minimizes η subject to the constraints of force and torque balance. The leading-order term in (5.17) is 1, which is independent of the shape dynamics. At the next (quadratic) order in y, terms appear in both (5.17) and (5.18); only that in (5.18) is given explicitly. It is multiplied by μt, so it appears to be dominant in the large μt limit. Thus, to a first approximation, minimizing η is achieved by setting 5.19so a first guess for the solution is a travelling wave, i.e. a function of s+Ut with U constant. However, such a function does not satisfy the x-component of the force balance equation (2.11), which is 5.20Inserting (5.13) and (5.14) and retaining only the lowest powers in y from each expression gives 5.21which cannot be satisfied by a function of s+Ut. Physically, the first term (unity) in (5.21) represents the leading-order drag force on the body due to forward friction. For a function of s+Ut, in which the deflection wave speed equals the forward speed, the snake body moves purely tangentially (up to this order of expansion in y) along a fixed path on the ground, with no transverse forces to balance the drag from tangential frictional forces. In addition, when a function of s+Ut is inserted into (5.21), all factors of U cancel out and U is left undetermined. This is related to a more fundamental reason why posing the shape dynamics as a function of s+Ut is invalid: the problem of solving for the snake dynamics given in §2 is then not well posed. In the well-posed problem, we provide the snake curvature and curvature velocity versus time as inputs, and solve the force and torque balance equations to obtain the average translational and rotational velocities as outputs. Thus, U(t) is one of the three outputs (or unknowns) that are required to solve the three equations.

Instead, we pose the shape dynamics as 5.22which is a travelling wave with a prescribed wave speed Uw, different from U in general. Because y is periodic in t with period 1, g is periodic with a period of Uw. Inserting (5.22) into (5.21), we obtain an equation for U in terms of Uw and g, 5.23In (5.23), as , (1−Uw/U)→0. In the solution given by (5.22) and (5.23), the shape wave moves backwards along the snake at speed Uw, which propels the snake forwards at a speed U, with U slightly less than Uw. Therefore, the snake slips transversely to itself, in the backward direction, which provides a forward thrust to balance the backward drag owing to forward tangential friction. We refer to (1−Uw/U) in (5.23), a measure of the amount of backward slipping of the snake, as the ‘slip’.

We now solve (5.23) for (1−Uw/U) and insert the result into (5.15) to find g and Uw that minimize η. Using the lowest order expansions (5.17) and (5.18) with (5.23), we obtain 5.24 5.25 5.26where 5.27If we minimize (5.26) over possible g for a given μt, using only the terms in the expansion that are given explicitly, then we find that η tends to a minimum of 1 as the amplitude of g diverges. However, in this limit, our small-y power series expansions are no longer valid. We therefore add the next-order terms in our expansions of η and the Fx equation and search again for an η-minimizer.

We now expand to higher order the components of η, (5.17) and (5.18), 5.28and 5.29and equation (2.11), 5.30We again assume the travelling-wave form of y (5.22), and use this to evaluate h given by (5.6) in terms of g, 5.31Inserting the travelling-wave forms of y (5.22) and h (5.31) into (5.30), we obtain 5.32To obtain (5.32), we have used the fact that the slip (1−Uw/U)→0 as , which will be verified subsequently. One can also proceed without this assumption, at the expense of a lengthier version of (5.32). We insert the travelling-wave forms of y and h into (5.28) and (5.29) to obtain η, again with the assumption that (1−Uw/U)→0 as to simplify the result, 5.33Solving for the slip (1−Uw/U) from (5.32) and inserting into (5.33), we obtain an improved version of (5.26), 5.34We see that (5.34) is similar to (5.26), but with an additional term in the denominator of the integrand that penalizes large amplitudes for g. If we approximate 〈g′(s+Uwt)2〉 as constant in time, we obtain 5.35which is minimized for 5.36Thus, the r.m.s. amplitude of the optimal g should scale as μ−1/4t. Equation (5.36) represents a balance between competing effects. At smaller amplitudes, the forward component of normal friction is reduced in comparison with the backward component of tangential friction, so the snake slips more, which reduces its forward motion and does more work in the normal direction. At larger amplitudes, as already noted, the snake's path is more curved, which requires more work carried out against tangential friction for a given forward distance travelled. We note that (5.36) is only a criterion for the amplitude of the motion. To obtain information about the shape of the optimal g, we now consider the balances of the y-component of the force and the torque.

So far, we have searched for the optimal shape dynamics in terms of y(s,t), but, in fact, it is the curvature κ(s,t) that is prescribed. We obtain y(s,t) from the curvature by integrating twice in s, 5.37 5.38 5.39 5.40 5.41where Y and R are defined for notational convenience. Prescribing the curvature is equivalent to prescribing k(s,t). We set 5.42so we have the same form for y as before, with an additional translation Y (t) and rotation R(t). We see that Y and R are determined by the y-force and torque balance equations, 5.43and 5.44We expand these to leading order in y and obtain 5.45and 5.46We insert (5.41) with (5.42) into the three equations (5.30), (5.45) and (5.46) to solve for the three unknowns U, Y and R in terms of g and Uw. We obtain 5.47 5.48 and 5.49We solve (5.48) and (5.49) for Y and R in terms of U, 5.50 5.51 5.52 and 5.53We then solve (5.47) for U in terms of g, 5.54We now form η, (5.16) with terms given by (5.28) and (5.29), using the updated form of y ((5.41) with (5.42)) and (5.54). We obtain a modified version of (5.34), 5.55We can find η-minimizing g in a few steps. First, let {Lk} be the family of orthonormal polynomials with unit weight on [0,1] (essentially the Legendre polynomials), 5.56At a fixed time t, we expand g′ in the basis of the Lk, 5.57Then, we have 5.58 5.59 and 5.60Inserting into (5.55), η becomes 5.61The denominator of the integrand in (5.61) is minimized when 5.62The function g(s+Uwt) that minimizes η is that for which (5.62) holds for all t. Then, the integral in (5.61) is maximized, so η is minimized. Recall that g is a periodic function with period Uw. The relations (5.62) hold for any such g in the limit that Uw→0, as long as g is normalized appropriately. Define 5.63Then, 5.64 5.65 and 5.66It is fairly straightforward to show (5.64)–(5.66) hold, and we show them in the electronic supplementary material, section B for completeness. If (5.64)–(5.66) hold, then in the limit that Uw→0, (5.62) holds with A=21/4μ−1/4t, by (5.58)–(5.60). A physical interpretation of the small-wavelength limit is that the net vertical force and torque on the snake due to the travelling wave alone (g) become zero in this limit, so no additional heaving motion (Y) or rotation (R) are required, and thus the additional work associated with these motions is avoided. Similar ‘recoil conditions’ were proposed as kinematic constraints in the context of fish swimming [18]. We also note two other scaling laws. For a given gμ−1/4t, by (5.54), the slip and by (5.50)–(5.53), Y,Rμ−3/4t.

Now, in order to obtain an η-minimizing function g*, let g be any periodic function with period Uw. Define 5.67where A is given in (5.63). Then, in the limit Uw→0, 5.68by the estimate in (5.55). Thus, the optimal travelling-wave motions become more efficient as , which agrees with the decrease in F for the numerical solutions in figure 4a.

We consider the relationship of the numerically found optima to the theoretical results further in figure 6. Here, we take the plots of figure 4 and transform them according to the theory. Figure 6a transforms F from figure 4a into η by the formula η=−e2F−1 because rotations over one period are negligible. In figure 6a, we plot η−1 versus μt for the numerical solutions (squares) from figure 4a, along with the leading-order terms in the theoretical result (5.68). The numerics follow the same scaling as the theory, but with a consistent upward shift (as expected from the finite-mode truncation). In particular, the numerically determined optima do not have very small wavelengths, owing to the finite number of modes used. This is explained further in the electronic supplementary material, section C. The uniformity of the shift may be due to the shape consistently assumed by the numerical optima for various μt. For μt=300, we have also computed η via the dynamical simulation for a much shorter-wavelength shape given by with wavelength and A1 chosen so that satisfies (5.36). The result is given by the open circle, which outperforms the optimum identified in the truncated-mode space, as expected. The circle is 0.013 above the solid line. This deviation is due, in part, to the neglect of the next term in the asymptotic expansion, proportional to μ−1t or here. Numerically, it is difficult to simulate motions with Uw much smaller than . Figure 6b shows the curvature, rescaled by μ1/4t to give a collapse according to (5.36). The data collapse well compared with the unscaled data in figure 4b. Figure 6c shows the body shapes from figure 4c with the vertical coordinate rescaled, and the shapes plotted with all centres of mass located at the origin. We again find a good collapse, consistent with the curvature collapse.

Figure 6.

The μt-scalings of the numerical travelling-wave optima from figure 4. (a) The cost of locomotion η minus unity, on a log scale, for the numerical optima (squares) of figure 4a together with η−1 for the analytical optimum (solid line). (b) The curvature from figure 4b rescaled by μ1/4t. (c) The deflection from figure 4c rescaled by μ1/4t.

6. Moderate to small μt: from retrograde waves to standing waves to direct waves

Near μt=6, there is a qualitative change in the numerically computed optima. At values of , the global minimizers of F are all backward-moving travelling waves. In addition, there are local minimizers in which the waves move forwards along the snake (and the snake moves backwards), and overlapping solutions with integer numbers of curls appear in the travelling waves. The values of F at the local minima, shown in figure 5a, are generally well separated. In addition, there is a small number of local minima with ratcheting types of motions. For μt≤6, in figure 5a, there is an increasing number of local minimizers, denoted as ‘other’, quite close to the global minimizer. These motions are perturbed travelling waves. Thus, the number of local minima increases greatly as μt decreases below 6, and the spacings between their F values decrease greatly.

Figure 7 shows the values of F for all computed local optima at four values of μt near the transition: 2, 3, 6 and 10. For each μt, the F values are plotted against μb. In addition, symbols are used to distinguish whether the local maximum in curvature travels forwards or backwards on average, and hence gives the direction of the curvature travelling wave (or perturbed travelling wave). At μt=10, we find that the smallest F is nearly constant with respect to μb. The same is true for μt=6, except at μb=2000, where there are several closely spaced values of F, which are slightly below the minimal F at smaller μb. These values mark the first appearance of the perturbed travelling waves as μt decreases. For μt=3, the distribution of values is quite different, with a much stronger decrease in F as μb increases. There are many clusters of closely spaced but distinct values of F, showing a large increase in the number of local equilibria. At each μb, the smallest values of F are spread over a band of about 0.5 in width. At μt=2, the same qualitative trend holds, except that the decrease in F with μb is much greater. There is more scatter of values overall and of those representing forward-moving curvature maxima. There is randomness in the sample of F values, particularly evident for μt≤3, because they are values for a finite sample of local minimizers starting from random initial κ coefficients.

Figure 7.

Values of F for all computed local optima with various random initial iterates, across μb at four μt (10 (a), 6 (b), 3 (c) and 2 (d)) showing the emergence of non-travelling-wave optima.

Intuitively, the minima for μt=3 represent perturbations of travelling-wave motions towards ‘ratcheting’-type motions that involve small amounts of backward motion in a portion of the snake to ‘push’ the rest of the snake forwards. The appearance of backward motion explains the variation of F with μb. As μt decreases from 2000 to 6, F for the travelling waves increases, and for μt=3, other motions outperform the travelling-wave motions at sufficiently large μb. For μt=2, the motions tend more towards ratcheting-type motions, and we will investigate this further subsequently.

Another way of viewing the transition is to consider the large-μt solution as μt decreases. A given travelling-wave motion will slip more as μt decreases, unless the amplitude increases. The component of the transverse force in the forward direction reaches a maximum when the slope of the snake becomes vertical. Thus, there is a limit to how much the slip can be decreased by increasing the amplitude of the travelling wave. When μt decreases below 6, the slope of the snake becomes nearly vertical at its maximum.

In figure 8a, we plot a function of the shape dynamics that quantifies the transition away from travelling-wave motions near μt=6. The quantity is 6.1For κ(s,t), which is a travelling wave with a wavelength sufficiently small that a maximum or minimum in curvature occurs in 0.2≤s≤0.8 at all times, and for which the maximum and minimum are equal in magnitude, ψ equals unity. For , the numerical κ(s,t) are approximately travelling waves with local extrema in 0.2≤s≤0.8; we restrict s away from the ends to avoid slight intensifications in curvatures near the ends. In figure 8a, we plot the maxima of ψ over all equilibria located at a given (μt,μb) pair.

Figure 8.

(a) At each (μt,μb), the maximum value of the function ψ over all optima is plotted as a shaded dot. (b) The values of F corresponding to the optima in (a) are plotted. (Online version in colour.)

We find ψ>0.8 for . We see that ψ increases slightly from about 0.8 to 0.9 as μt increases from 6 to 200. For μt=6, ψ decreases slightly but notably, from 0.8 to 0.7, as μb exceeds 200, showing that, at transitional μt, the perturbations away from the travelling-wave solutions appear first at the largest μb. At μt=3, the values of ψ have decreased to around 0.6 for most μb; ψ is as large as 0.9 at smaller μb and as small as 0.4 at larger μb. For μt<3, the values of ψ are generally much smaller. There is considerable randomness in the values of ψ plotted for μt<3 because they are the maxima over samples of 10–20 local equilibria starting from random initial κ coefficients.

In figure 8b, we plot the values of F for the optima whose ψ-values are shown in figure 8a. We find that F varies smoothly over the region of transitional μt, 2≤μt≤10, attaining its highest values (i.e. lowest efficiency) there. The retrograde travelling-wave motions and other motions are equally efficient at a transitional μt, and the other motions become optimal as μt decreases.

A global picture of the optimal states is presented in the ‘phase diagram’ of figure 9. At large μt, the retrograde travelling waves (plotted with circles) are optimal. At small μt, all of the optima have the form of direct travelling waves (plotted with triangles). Unlike for the large-μt waves, these direct waves all have large amplitudes, and their specific form varies with μb. Near μt=1, there is a region where standing waves with non-zero mean deflection (or ratcheting motions) are seen (plotted with crosses). Examples will be given shortly. These optima coexist with retrograde-wave optima at larger μt. We note that there are other optima that do not fit easily into one of these categories, throughout the phase plane. These other optima are generally found less often by a computational search, and are generally suboptimal to the travelling waves, but there are exceptions. For the sake of brevity, we do not discuss such motions here. We also note that with twice the spatial and temporal modes (m1=10 and n1=10), we again find a transition from retrograde travelling waves to standing waves near μt=6; so the transition does not appear to be strongly affected by the mode truncation.

Figure 9.

Phase diagram of optimal motions in (μt,μb)-space.

To obtain a clear picture of the snake kinematics across the transitional region shown in figure 9, we give figures showing snapshots of snake kinematics, together with spatio-temporal curvature plots, at μb=1 and 3 in this section and at μb=2,5 and 20 in the electronic supplementary material, section D. Each of these figures represents one horizontal ‘slice’ through the phase diagram in figure 9.

In figure 10, we show the optimal snake kinematics for μb=1 and eight μt spanning the transition. Each set of snapshots is arranged in a single column, moving downwards as time moves forwards over one period, from top to bottom (shown by arrows). Similar to figure 5b,d, the snapshots are shown in a frame in which the net motion is purely horizontal, from left to right. However, the snapshots are displaced vertically to make them visible (i.e. to prevent overlapping). Below the snapshots, the corresponding curvature plot is shown, in the same format as in figures 2 and 3, with s increasing from 0 to 1 on the horizontal axis, from left to right, and t increasing from 0 to 1, from bottom to top. We have omitted the axes, and the specific curvature values for each plot, instead using a single colour bar (shown at the bottom) with curvature values that range from the maximum to the minimum for each plot. By omitting this information, we are able to show several of the optimal motions side by side. By comparing a large number of motions in a single figure, it is easier to see the general trends.

Figure 10.

Snapshots of optimal snake motions at various μt (one per column, labelled at the top), for μb=1. (Online version in colour.)

We begin by discussing the rightmost column in figure 10, with μt=6. A retrograde travelling wave is clearly seen, so this optimum is a continuation of the large-μt solutions. At this μt, just above the transition, the wave amplitude is large, and the wave clearly moves backwards in the laboratory frame. Thus, the snake slips backwards considerably, even though it moves forwards, because μt is not very large. Moving one column to the left, the optimum at μt=3 shows a clear difference. At certain instants, the curvature becomes locally intensified near the snake midpoint. This occurs when the local curvature maximum reaches the snake midpoint. The motion is still that of a retrograde travelling wave, but now the wave shape changes over the period. Hence, the displacement can no longer be written in the form g(s+Uwt) for a single periodic function g, as in the previous section. Comparing the curvature plots for μt=6 and 3, we see that the bands have nearly constant width across time for μt=6. At μt=3, in contrast, the curvature bands become modulated. The bands show bulges near the curvature intensifications. Moving leftwards again, to μt=2 and then to 1.5, the curvature intensifications increase further, showing a general trend. In all cases, however, the curvature peaks move backwards along the snake, even as the peak values change, so these motions are retrograde waves. At μt=1.2, a novel motion is seen: the endpoints of the snake curl up (but do not self-intersect), forming ‘anchor points’. Between these points, the remainder of the snake moves forwards. In the curvature plot, bands can still be seen corresponding to retrograde waves. The three remaining columns, moving leftwards, all show direct waves. That is, the wave moves in the same direction as the snake. No motion is shown for μt=1. The point μb=1,μt=1, in which friction is equal in all directions, is in some sense a singular point for our computations. Near this point, it is difficult to obtain convergence to optimal motions, and optima are difficult to classify clearly as retrograde or direct waves (or other simple motions). Nonetheless, we have computed one optimum that can be classified as a direct wave, but it has complicated and special features that are not easy to classify. In the three left columns, the direct waves can be recognized by the bands in the curvature plots. The bands have the direction of direct waves. However, the bands are not nearly as simple in form as those for the retrograde waves at large μt.

Figure 11 shows optima for μb now increased to 3. The same transition is seen, but now in the two middle columns, with μt=1 and 1.5, standing waves are seen. The curvature plots are nearly left–right symmetric. In the corresponding motions, the snake bends and unbends itself. As μb>1, there is a preference for motion in the forward direction even though the curvature is nearly symmetric with respect to . These optima did not occur for μb=1 (figure 10) because symmetric motions with equal forward and backward friction would yield zero net distance travelled. It is somewhat surprising that such motions can be completely ineffective for μb=1 and yet represent optima for μb as small as 2, as shown in figure 9.

Figure 11.

Snapshots of optimal snake motions at various μt (one per column, labelled at the top), for μb=3. (Online version in colour.)

We have presented only a limited selection of optima for moderate μt, and largely with pictures, because it is difficult to describe and classify many of the motions we have observed. Future work may consider this rich region of parameter space further.

We now move on to the limit of zero μt. In this limit, we find a direct-wave motion with zero cost of locomotion. This explains the preference for direct waves at smaller μt in this section.

7. Zero μt: direct-wave locomotion at zero cost

When μt=0, it is possible to define a simple shape dynamics with zero cost of locomotion. The shape moves with a non-zero speed while doing zero work. It is a travelling wave with the shape of a triangular wave that has zero mean vertical deflection, 7.1The vertical speed corresponding to (7.1) is a square wave, 7.2The wave of vertical deflection moves forwards with a constant speed of 1, and the horizontal motion is also forwards, with a constant speed U, 7.3We determine U by setting the net horizontal force on the snake to zero, 7.4with 7.5Solving (7.4) for U, we obtain 7.6With this choice of U, is identically zero on the snake, so the motion is purely in the normal direction. Therefore, there are no forces and torques on the snake, and no work performed against friction. However, the snake moves forwards at a non-zero speed U. This solution has infinite curvature at the peaks and troughs of the triangular wave, and we may ask if zero cost of locomotion (η) can be obtained in the limit of a sequence of smooth curvature functions that tend to this singular curvature function. To answer this question, we have set y to the Fourier series approximation to (7.1) with n wavenumbers, for n ranging from 2 to 1000. At each time t, we have then calculated a net horizontal speed U(t), net vertical speed V (t) and a net rotational speed dR(t)/dt such that Fx(t)=0, Fy(t)=0 and τ(t)=0. We find very clear asymptotic behaviour for U(t), V (t), dR(t)/dt and η, 7.7Therefore, zero cost of locomotion does, in fact, arise in the limit of a sequence of smooth shape dynamics. This type of travelling-wave shape dynamics, which yields locomotion in the direction of wave propagation, has been called a ‘direct’ wave in the context of crawling by snails [31] and worms [32]. A triangular wave motion was previously considered (in less quantitative detail) by Bekker [33].

The triangular travelling-wave motion is particularly interesting because it can represent an optimal motion at both zero and infinite μt, and its motion can be computed analytically for all μt. We show how this motion embodies many aspects of the general problem in the electronic supplementary material, section E.

8. Conclusion

We have studied the optimization of planar snake motions for efficiency, using a fairly simple model for the motion of snakes using friction. Our numerical optimization is performed in a space of limited dimension (usually 45, increased to 180 at selected (μb,μt) values), but which is large enough to allow for a wide range of motions. The optimal motions have a clear pattern when the coefficient of transverse friction is large. Our numerical results and asymptotic analysis indicate that a travelling-wave motion of small amplitude is optimal. When this coefficient is small, another travelling-wave motion, of large amplitude, is optimal. When the coefficient of transverse friction is neither large nor small, the optimal motions are far from simple, and we have only begun to address this regime.

A natural extension of this work is to include three-dimensional motion of snakes. In other words, one would give the snake the ability to lift itself off the plane [2,3,6], as occurs in many familiar motions such as side-winding [9]. One can consider more general frictional force laws, perhaps involving a non-diagonal frictional force tensor. Incorporating the internal mechanics of the snake as in [2] would penalize bending with high curvature. Our optima at large and small μt can be approximated well with shapes of small and moderate curvature, respectively. Some of the more exotic optima at moderate μt that involved curling at the ends would be significantly modified by a curvature penalty, however. Another interesting direction is to consider the motions of snakes in the presence of confining walls or barriers [5,34].

Acknowledgements

We acknowledge helpful discussions on snake physiology and mechanics with David Hu and Hamidreza Marvi, and helpful discussions with Fangxu Jing during our previous study of two- and three-link snakes, and the support of NSF-DMS Mathematical Biology grant no. 1022619 and a Sloan Research Fellowship.

Appendix A. Objective function

To understand our choice of F, we first describe the class of motions X(s,t) that result from periodic-in-time κ(s,t). Let X(s,t) and ∂tX(s,t) be given for the moment. Let us rotate X(s,t) by an angle α and translate X(s,t) by a vector v, uniformly over s; and so we have a rigid-body motion. Then, and are also uniformly rotated by α. Let us also rotate ∂tX(s,t) uniformly by α. Then by (2.14), f is uniformly rotated by α. Hence, if X(s,t) and ∂tX(s,t) are such that the force and torque balances (2.11)–(2.13) hold, they will continue to hold if we apply a rigid-body rotation and translation to X(s,t) and rotate ∂tX(s,t) by the same angle. Now, because κ(s,t+1)=κ(s,t), X(s,t+1) is obtained from X(s,t) by a rigid-body rotation by an angle α1 and translation by a vector v1. In fact, the trajectory of the body during each period is the same as during the previous period, but with a rigid translation by v1 and rotation by α1. Over many periods, such a body follows a circular path when α1≠0, and owing to the rotation, the distance travelled over n periods may be much smaller than n times the distance travelled over one period. To obtain a motion that yields a large distance over many periods, we restrict to body motions for which X(s,t+1) is obtained from X(s,t) by a translation only, with rotation angle α1=0. We implement this constraint approximately in our optimization problem, using a penalization term. We multiply d in (2.17) by a factor that penalizes rotations over a single period, A1Equation (A1) involves the change in angle at the snake's trailing edge over a period, which is the same as α1. The exponential factor has a maximum of e2 when α1=0, and a minimum of e−2 when α1=π. Using the factor of 2 in the exponent of (A1), all of the optima found by our computations involve very little rotation – less than 0.5° over all of the (μb,μt) space. When the factor is decreased from 2, some of the optima begin to involve non-trivial amounts of rotation. In short, the factor of 2 represents a choice of how much to weight lack of rotation relative to distance and work in the objective function.

A second modification to η is made to avoid large values of η. There are snake motions (i.e. with the κ(s,t)=κ(1−s,t) symmetry and μb=1) that have d=0 and W≠0, giving . The minimization search commonly encounters motions that are nearly as ineffective throughout shape space, and η has a large gradient in the vicinity of such motions. It is challenging to minimize such a function numerically. We therefore substitute −d/W for W/d. The minima are the same for both because each is an increasing function of the other, but −d/W varies much more smoothly in most of the shape parameter space. There are cases where −d/W tends to , but these occur only in a small region of parameter space that is rarely approached by our computations.

Combining these ideas, the function we minimize in place of η is A2

Appendix B. Computing ∇F

The BFGS method requires the gradient of F with respect to the (2m1−1)n1 curvature coefficients {αjk,βjk}. For notation, let a be a (2m1−1)n1×1 vector whose entries are the curvature coefficients. We compute a finite-difference gradient using B1where ej is the unit basis vector in the jth coordinate direction. Equation (B1) requires (2m1−1)n1+1 evaluations of F, or (2m1−1)n1+1 computations of the snake motion over a period. We use a very accurate approximation that requires only a single computation of the snake motion over a period.

Computing F(a+10−8aej) requires solving b=0 from (2.11)–(2.13) for {∂tx0,∂ty0,∂tθ0} at each time tk=k/m∈[0,1/m,…1−1/m] using the vector of curvature coefficients a+10−8aej. Here, b is a function of (∂tx0,∂ty0,∂tθ0) and the curvature coefficients, so we write it as b((∂tx0,∂ty0,∂tθ0),a+10−8aej) and solve b((∂tx0,∂ty0,∂tθ0),a+10−8aej)=0 at each time tk. The value of (∂tx0,∂ty0,∂tθ0) that satisfies these equations at each tk may be written ck(a+10−8aej). As ck(a) solves a nearby set of equations, it provides a very good initial guess for a Newton iteration to find ck(a+10−8aej). We define ϵj≡10−8aej for our description. We start with the usual Taylor series expansion from which Newton's method is obtained, B2Here, is the 3×3 Jacobian matrix of partial derivatives of b with respect to its leading 3×1 vector argument, B3Having previously computed ck(a) via Newton's method, we have also calculated and stored at each tk. We therefore modify in (B2) while retaining the same order of accuracy in ϵj, B4We solve (B4) to obtain cjk, our approximation to ck(a+ϵj), B5Equation (B5) requires only a single computation of the snake motion over a period, which gives ck(a) and at each tk. It also requires (2m1−1)n1 evaluations of the function b. These are much less expensive than even a single computation of the snake motion over a period. The O(∥ϵj2) error in cjk is of the same order of magnitude as machine precision. The value of the cost function corresponding to cjk has an error that is also comparable with machine precision, and is therefore subdominant to the ≈10−8 error in (B1). This level of error leads to essentially no deterioration in the convergence of BFGS up to machine precision relative to the version with the exact gradient [35]. Therefore, we obtain the gradient of F for approximately the cost of computing F. Each computation of a search direction in BFGS requires only one gradient computation. These ingredients give an efficient optimization method, which is critical to finding optima in the high-dimensional space of curvature coefficients.

• Received April 15, 2013.
• Accepted August 2, 2013.

View Abstract