## Abstract

We study a trajectory-planning problem whose solution path evolves by means of a Lie group action and passes near a designated set of target positions at particular times. This is a higher-order variational problem in optimal control, motivated by potential applications in computational anatomy and quantum control. Reduction by symmetry in such problems naturally summons methods from Lie group theory and Riemannian geometry. A geometrically illuminating form of the Euler–Lagrange equations is obtained from a higher-order Hamilton–Pontryagin variational formulation. In this context, the previously known node equations are recovered with a new interpretation as Legendre–Ostrogradsky momenta possessing certain conservation properties. Three example applications are discussed as well as a numerical integration scheme that follows naturally from the Hamilton–Pontryagin principle and preserves the geometric properties of the continuous-time solution.

## 1. Introduction

This paper is concerned with the analysis of a class of higher-order trajectory planning problems that are important in a wide range of contexts, ranging from computational anatomy to quantum control, both of which are discussed in this paper. The problem in its abstract formulation consists of finding an optimal curve *g*(*t*) in a Lie group *G* that generates a curve *q*(*t*)=*g*(*t*)*Q*_{0} in an object manifold *Q*, such that *q*(*t*) passes near a set of fixed *target points* at given *node times*. Precise definitions of optimality and proximity will be provided below.

In the higher-order variational formulation appropriate for such problems, the Euler–Lagrange equations split into a set of Euler–Poincaré equations that hold on the open time intervals between the nodes, and a set of *node equations* that describe how to pass from one open interval to the next. The primary purpose of this paper is to develop a new geometric understanding of the node equations in terms of conjugate momenta. Continuing from this, we develop a numerical algorithm for the trajectory planning problem that respects the geometric properties exhibited by the continuous-time solution.

### (a) Background and problem formulation

The problem treated in this paper fits into a classical type of problem in control theory called *trajectory planning* or *interpolation by variational curves*. The task in this type of problem is to find an optimal curve that interpolates through a given set of points (or configurations) lying in some manifold, specific to the application one has in mind. Such trajectory planning problems are relevant in numerous applications, for example, in aeronautics, robotics, computer-aided design, air traffic control and more recently, computational anatomy. Some trajectory planning applications require the optimal trajectories to possess a certain degree of smoothness. This requirement summons variational principles that depend on higher-order derivatives of the interpolation path, such as *acceleration* (rate of change of velocity) or *jerk* (rate of change of acceleration), etc. Properties of such higher-order variational principles have been widely studied, one of the earlier references being [1], where an intrinsic formulation in terms of higher-order tangent bundles was given. Prieto-Martínez & Román Roy [2] contains a literature overview concerned both with mechanics and field theories, and some further recent developments are given in earlier studies [3–5].

In some applications, for example in computational anatomy [3] and quantum control [6], the optimal trajectory is generated via a group action. That is, a curve *g*(*t*) in a Lie group *G* acts on a point *Q*_{0} in an *object manifold* and generates a curve *q*(*t*)=*g*(*t*)*Q*_{0}, which passes through a sequence of target points at prescribed times. In some instances, it is desirable to relax the target constraints in such a way that the optimal curve does not exactly pass through the target points but still passes *near* them at the prescribed times. This may be achieved by including a *soft constraint* in the cost functional, that is, a term penalizing the discrepancy between the trajectory *q*(*t*) and the targets. This leads to cost functionals of the following type,
1.1Here, *ξ*^{(j)} are the *j*th time derivatives of a curve *ξ*(*t*) in the Lie algebra (the tangent space at the identity element of the Lie group) that integrates to the curve *g*(*t*), which in turn produces the trajectory in the object manifold according to *q*(*t*)=*g*(*t*)*Q*_{0}. The first part of the cost is the integral over a Lagrangian ℓ and is associated with the curve on the group. The second part sums up the squares of the distances *d* between the curve *q*(*t*) and the target points *q*_{i}, at the prescribed times *t*_{i}. The *tolerance parameter* *σ* may be adjusted to suitably weight the two parts. In computational anatomy, for example, a diffeomorphism group acts by transforming a medical image (or substructures thereof, such as points of interest, fibres or surfaces). See the study of Miller *et al.* [7] for an overview. In this context, the prescribed target images (or substructures) may not be diffeomorphically related to the initial one. That is, the initial and target configurations may lie on different group orbits. In general, one expects therefore that exact matching may not be possible and works instead with a soft constraint. Two of the authors have recently studied a problem of similar nature in quantum control, see [6]. There, one considers the group of unitary matrices acting on quantum state space, with the goal of finding the optimal experimental manipulation of the system such that the evolution of an initial state passes near a sequence of given target states. The cost functional is directly related to the required amount of change in the experimental apparatus over time. The introduction of a soft constraint is appropriate in this problem even though in this case the group action is transitive, because by increasing the tolerance parameter *σ* optimal trajectories may be achieved at a lower cost.

In a more general sense such trajectory planning problems can be thought of as *inverse problems*, where the data points *q*_{i}∈*Q* have been determined experimentally at the times *t*_{i} and one seeks the corresponding curve *g*(*t*) in configuration space *G*. In this context, a natural choice of the tolerance parameter *σ* would be a measure of the uncertainty inherent in the experiment, such as standard deviation. The Lagrangian ℓ represents a modelling choice and is specific to the application one has in mind.

This paper is concerned with trajectory planning problems of the above type. It was found in [3] that the Euler–Lagrange equations split into higher-order Euler–Poincaré equations, which hold on the open intervals between node times, and a set of *node equations* that describe the passage from one open interval to the next. These node equations describe the continuity properties of a set of quantities involving derivatives of various orders of the Lagrangian. As we will see in examples, a natural choice of Lagrangian ℓ leads to *Riemannian cubics* and their higher-order generalizations. This class of curves was introduced in Noakes *et al.* [8] and has since been studied in a series of papers including [9–14]. Riemannian cubics appear in a variety of applications, for example, in the quantum control problem mentioned above, but also in computer graphics, robotics and spacecraft control [15–18].

### (b) Plan of the paper

In §2, we shall rederive the Euler–Lagrange equations for the higher-order variational problem by using Lagrange multipliers in a generalization of the symmetry reduced Hamilton–Pontryagin principle of geometric mechanics. In this approach, the derivation of the Euler–Lagrange equations simplifies considerably and a new geometric interpretation of the node equations emerges. Namely, they describe the evolution of Legendre–Ostrogradsky momenta across the nodes, in which the highest order momentum experiences a discontinuous jump related to the distance between the curve in the object manifold and the target points. The discontinuity can be understood in terms of a momentarily broken symmetry at the node times. However, if the object manifold is isotropic with respect to a subgroup action, then a residual symmetry remains. By Noether’s theorem, this residual symmetry leads to a conservation law across node times.

In §3, we discuss a number of applications, including rigid body splines, macromolecular configurations and quantum splines. Section 4 is concerned with the numerical solution of the inexact trajectory planning problem. More precisely, §4 describes a geometric discretization of the higher-order Hamilton–Pontryagin principle for inexact trajectory planning, similar to the approach given in [19] for first-order systems. Our main motivation for the development of a geometric integrator is the exact momentum behaviour of the discrete solution. This leads in turn to a dimensionality reduction of the search space in the numerical optimization.

## 2. Geometry of the inexact trajectory planning problem

We start with the statement of the problem considered here. One aims at steering from an initial point *Q*_{0} in some *object manifold* *Q* along an optimal trajectory *q*(*t*) that evolves via the action of a Lie group *G*. That is, *q*(*t*)=*g*(*t*)*Q*_{0}, where the right-hand side denotes the action of *g*(*t*) on *Q*_{0} and the curve *q*(*t*) lies in the *G*-orbit of *Q*_{0}.

The optimality condition is given in terms of a function defined on the *k*-fold Cartesian product of the Lie algebra , which measures the cost of the transformation *g*(*t*), and a distance function *d*:*Q*×*Q*→. As we shall see, the integer *k* determines the degree of smoothness of solution curves. The optimal curve *q*(*t*) is required to pass near prescribed target points *Q*_{ti} at prescribed *node times* *t*_{i} for *i*=1,…,*l*. This is formalized by including a squared distance term *d*^{2}(*g*(*t*_{i})*Q*_{0},*Q*_{ti}) in the cost functional, for each *i*. Thus, the cost functional , where is a suitable space of -valued curves (see below), is defined by
2.1The notation *ξ*^{(j)} is shorthand for d^{j}*ξ*/d*t*^{j}, the quantity *σ* is a *tolerance parameter*, and the curve *g*(*t*) originates at the identity *g*(0)=*e* and satisfies . We write *R*_{g} for multiplication by *g* from the right and *TR*_{g} for its differential, thus . Variations are considered in the space consisting of curves for which (2.1) exists and whose restrictions to open intervals (*t*_{i−1},*t*_{i}) for *i*=1,…,*l* are *C*^{2k−2} and whose *j*th derivatives *ξ*^{(j)} are continuous on [0,*t*_{l}] for *j*=0,…,*k*−2. We also require the existence of one-sided limits of certain derivatives at the node times *t*_{i}. Moreover, we assume that initial values of *ξ*^{(j)}(0) for *j*=0,…,*k*−2, are given.

This type of trajectory planning problem is familiar, for example, from image registration in computational anatomy, where one typically thinks of *Q*_{0} as a template shape being deformed by a curve of diffeomorphisms *g*(*t*), in turn generated by the time-dependent vector field *ξ*(*t*) [20]. At times *t*_{i}, the resulting curve in shape space passes near the given target shapes *Q*_{ti}, the parameter *σ* determining the proximity of the passage. In this case, the Lie group *G* of diffeomorphisms is infinite dimensional. However, in this paper, we will restrict ourselves to the case of finite-dimensional Lie groups and object manifolds. A natural finite-dimensional instance for illustrating these ideas arises in quantum control [6], where quantum state vectors evolve under the action of the unitary group. The generator curve *ξ*(*t*) in this case corresponds to the Hamiltonian operator, which is controlled in experiments.

### (a) Euler–Lagrange equations via Lagrange multipliers

The Euler–Lagrange equations characterize solutions to Hamilton’s principle, *δS*=0, and were derived in [3]. The equations split into a set of higher-order Euler–Poincaré equations on the open time intervals between the node times and a number of *node equations* describing how the solution evolves across the nodes. In previous formulations of this problem, one must find the variation *δg*(*t*_{i}) that is produced by a (time-dependent) variation *δξ*(*t*). This can be facilitated by taking advantage of Lagrange multipliers in an equivalent variational formulation that we describe now. As this paper demonstrates, the new approach also provides a geometric interpretation of the node equations, and furthermore suggests a geometric numerical procedure for the solution of the problem, see §4.

The method of Lagrange multipliers involves enlarging the space on which the dynamics happen. We define the cost functional *S* on some space of curves ,
2.2Again some technical assumptions about the space of curves are needed. Namely, (2.2) exists, the curves are *C*^{1} when restricted to the open intervals (*t*_{i−1},*t*_{i}) for *i*=1,…,*l*, and *g*,*ξ*^{0},…,*ξ*^{k−2} are continuous on [0,*t*_{l}]. Moreover, one-sided limits of the curves *μ*^{r}, *r*=0,…,*k*−1, exist at the node times *t*_{i}, for *i*=0,…,*l*. We also assume *g*(0)=*e* and given initial values , for *j*=0,…,*k*−2.

Before we take variations of *S*, it is useful to introduce the *cotangent lift momentum map* associated with the action of *G* on *Q* (see [21, ch. 11], for more information). This map is defined to satisfy
2.3where we have used the notation *ξ*_{Q}(*q*):=d/d*ε*|_{ε=0} e^{εξ}*q* and 〈 . ,. 〉 for the respective duality pairings. We also introduce the shorthand to denote the exterior derivative of the distance function *d* with respect to the first entry, and d_{1}*d*(*t*_{i}):=d_{1}*d*(*g*(*t*_{i})*Q*_{0},*Q*_{ti}). Moreover, we will shorten *d*(*g*(*t*_{i})*Q*_{0},*Q*_{ti}) to *d*(*t*_{i}). Integrating by parts and using (2.3), we obtain
2.4where we set *η*:=*TR*_{g−1}*δg* and defined as well as . Arriving at (2.4) can be understood by splitting the integral in (2.2) over each open interval, (*t*_{i},*t*_{i+1}), *i*=0,…,*l*−1. The time derivatives present in these integrals lead to boundary terms for each *t*_{i}, *i*=0,…,*l*. Those boundary terms involving the *η*(*t*_{i}) can be combined with the only other non-integral terms that arise in the calculation, the variations of the penalty terms with respect to the *g*(*t*_{i}),
in which we used (2.3). Note also that the last line of (2.4) could have been omitted because by assumption *η*(0)=0 and *δξ*^{j}=0 for *j*=0,…,*k*−2.

We can now read off the Euler–Lagrange equations. On the one hand, for *t* in any of the open intervals (*t*_{i},*t*_{i+1}), *i*=0,…,*l*−1, we have
2.5
2.6
2.7
2.8
and
2.9On the other hand, the *node equations* are given by
2.10
2.11
2.12
and
2.13

### Remark 2.1

There are four versions of the action functional. The one above can be called the *left-action, right-reduction* version as *g*(*t*) acts on *Q*_{0} from the left, while *ξ*^{0} is the right-reduced velocity (see §2*b*, for more details on reduced variables). There are the following three other cases.

(1) The

*right-action, right-reduction*case with action functional The variation of the penalty term in this case changes according to This means that (2.10) and (2.12) are replaced by(2) In the

*right-action, left-reduction*case, the action functional is where we wrote*L*_{G}for multiplication by*G*from the left and*TL*_{G}for its differential. However, this is equivalent to the left-action, right-reduction case by identifying 2.14and setting ℓ=*l*°*κ*, where is multiplication by −1.(3) By the same token the

*left-action, left-reduction*case with action functional can be mapped to the right-action, right-reduction case.

In the analysis that follows, we will largely restrict ourselves to the left-action, right-reduction case. Anything we say can be transferred to the remaining three cases by applying the modifications listed above.

### (b) Euler–Poincaré equations

From (2.5) to (2.9), it follows that on open intervals (*t*_{i},*t*_{i+1}), *i*=0,…,*l*−1,
2.15This is a *kth-order Euler–Poincaré equation* for a system that exhibits right-invariance. This type of equation was derived in [3] from a variational perspective and in [5] from the Hamiltonian one. We now explain its appearance in the inexact trajectory planning problem from the viewpoint of invariant Lagrangians, starting with some necessary definitions that can be found in [4,22].

The *kth-order tangent bundle* is defined as a set of equivalence classes of curves as follows: Two curves *g*_{i}(*t*)∈*G*,*i*=1,2, are *equivalent*, if and only if their time derivatives at *t*=0 up to order *k* coincide in any local chart. That is, , for 0≤*j*≤*k*. The equivalence class of a curve *g*(*t*) is denoted by , or formally as . The set of all equivalence classes of curves emanating from *g*_{0}∈*G* is written as . Then is a fibre bundle over *G* with projection map . Note that a curve *g*(*t*) defines, at each time *t* in its domain, an element by setting *h*(*s*):=*g*(*t*+*s*).

It is convenient to represent *T*^{(k)}*G* using the *trivialization map* that makes use of the right group multiplication (analogous constructions exist using the left multiplication map). Let *g*(*t*) be a representative of and define . The trivialization map is given by
The *reduction map* is obtained by omitting the first entry.

It is well known (for example, [21, ch. 13]) that the first-order (*k*=1) Euler–Poincaré equation appears when the Euler–Lagrange equation for a Lagrangian *TG*→ with symmetry is written in terms of the reduced velocity vector . The higher-order Euler–Poincaré equation appears in a similar fashion when computing the Euler–Lagrange equations for an invariant *k*th-order Lagrangian *L*:*T*^{(k)}*G*→. More precisely, *L* is called *right-invariant* if . The definition of *left-invariance* follows analogously. Let *L* be a right-invariant Lagrangian and consider Hamilton’s principle, , for
2.16where variations are taken with respect to fixed end points up to order *k*−1, that is, *δg*^{(j)}(*a*)=*δg*^{(j)}(*b*)=0, for *j*=0,…,*k*−1, in any local chart. The Euler–Lagrange equations can be written in terms of the right-reduced velocity vector, which leads to the *kth-order Euler–Poincaré equations* [3]
2.17with *reduced Lagrangian* ,
It is now straightforward to see from the viewpoint of invariant Lagrangians why the Euler–Poincaré equation (2.15) must characterize optimal curves in the trajectory planning problem on open time intervals. Indeed, fix 0≤*i*≤*l* and suppose *ξ*(*t*) is a local extremum of (2.1) with integral curve *g*(*t*). If *ξ*|_{(ti,ti+1)} is not a solution to the *k*th-order Euler–Poincaré equation one can find a variation *g*_{ε}(*t*) keeping fixed *g*(*t*_{i}) and *g*(*t*_{i+1}), such that with Lagrangian *L*:=ℓ°*β*_{k}. By consequence, is a variation of *ξ* such that *δS*≠0 for *S* as in (2.1), a contradiction.

### (c) Geometry of multipliers

Furthermore, we observe from (2.5) to (2.9) that on open intervals (*t*_{i},*t*_{i+1}), *i*=0,…,*l*−1,
2.18In order to discuss the geometric meaning of these identities, we recall from above that the trajectory planning problem (2.2) on open intervals reduces to a problem of the type with of the form (2.16). In particular, equations (2.5)–(2.9) and therefore (2.18) are obtained by taking suitably constrained variations of
2.19This variational principle is a higher-order generalization of the *reduced Hamilton–Pontryagin principle* of first-order mechanics. In first-order mechanics, this principle provides a unified treatment of the Lagrangian and Hamiltonian descriptions of invariant mechanical systems on Lie groups (see [23] for a detailed discussion. There is a close connection between the Hamilton–Pontryagin principle and Dirac structures, an aspect we do not enter into in this paper). In particular, the *Legendre transform* connecting the two descriptions is revealed by the variational calculus. This remains true for higher-order mechanics. Indeed, (2.18) can be recognized to be the reduced Legendre transform that appears in [3,4]. While we found (2.18) from a variational approach, these references take as a starting point [1], where a coordinate-free description of the higher-order Legendre transform on manifolds was given. We briefly review this approach here.

The Legendre transform of higher-order mechanics, given in [1], is a map Leg:*T*^{(2k−1)}*G*→*T**(*T*^{(k−1)}*G*). If Leg is a diffeomorphism (that is, *L* is *hyperregular*), it connects the Lagrangian and Hamiltonian descriptions just as in the first-order case. With respect to the right-trivializations
2.20it is given as [4]
The same equations were seen in (2.18) to emerge from the Hamilton–Pontryagin principle (2.19). This means that, as for first order, the higher-order Hamilton–Pontryagin principle contains both Lagrangian and Hamiltonian descriptions of higher-order mechanics and provides a unified framework for both views. To obtain the Lagrangian description, one may eliminate *μ*^{0},…,*μ*^{k−1} from (2.5)–(2.9) using (2.18). The resulting equations are the trivialized flow equations of the *Lagrangian vector field*, an element of [1]. On the other hand, if (2.9) can be solved for *ξ*^{k−1} (this is the case, for example, when *L* is hyperregular), then (2.5)–(2.8) are the trivialized flow equations of the *Hamiltonian vector field* , which solves [1]
2.21where *ω* is the canonical symplectic form on *T**(*T*^{(k−1)}*G*) and *H*:*T**(*T*^{(k−1)}*G*)→ is given as
2.22with respect to the trivialization (2.20). By consequence of (2.21), the flow map *F*_{t}:*T**(*T*^{(k−1)}*G*)→*T**(*T*^{(k−1)}*G*) of the Hamiltonian vector field preserves the symplectic form *ω*.

For later reference, we point out how this can be seen alternatively from the Hamilton–Pontryagin principle. This is a generalization to higher order of a standard argument (see [19, §3], for the first-order case). If we omit end point constraints on the variations of in (2.19), the integration by parts contributes boundary terms to (cf. (2.4)),
2.23where *θ* in the second equality is the canonical one-form on *T**(*T*^{(k−1)}*G*) and we defined *δx*(*t*) to be the curve in *TT**(*T*^{(k−1)}*G*) whose trivialization corresponds to the variations (*TR*_{g−1}*δg*,*δξ*^{0},…,*δξ*^{(k−2)},*δμ*^{0},…,*δμ*^{(k−1)}). If we restrict the variations to solution curves of (2.5)–(2.9), we may just as well express as a function of initial conditions . The integral part of (2.23) then vanishes and we obtain
Therefore, . Taking into account that d^{2}=0, we obtain the desired identity .

### (d) Momentum conservation and Noether’s theorem

Another important point to notice from (2.5)–(2.9) is that is a conserved quantity. Here Ad* is the map dual to given by (*g*,*ξ*)↦Ad_{g}*ξ*:=*TL*_{g}*TR*_{g−1}*ξ*. Indeed by (2.5)
2.24In the context of first-order Euler–Poincaré equations, a similar momentum conservation is owing to the invariance of the Lagrangian with respect to group multiplication operations. This is an instance of Noether’s theorem, which roughly speaking guarantees that the momentum map associated with the action of a symmetry group is preserved (see [21, ch. 11]). We now show that the situation is similar for (2.24). The right action *R* of *G* on itself,
can be lifted to an action on *T*^{(k−1)}*G*,
This action can be lifted to its so-called *cotangent-lifted action* [21, §12.1]
given in trivialized form as
It is apparent that the Hamiltonian (2.22) is symmetric with respect to this group action. By Noether’s theorem, the associated momentum map is conserved.

What is this momentum map? By appealing to standard formulae [21, §12.1] we find that the momentum map is with respect to the trivialization (2.20). The conservation law observed in (2.24) thus arises from the right-invariance of the Lagrangian (respectively, the Hamiltonian) via Noether’s theorem.

The conservation law can also be obtained from a variational perspective. This is well known in first-order mechanics, and it is also the case in higher-order mechanics. We take a solution of (2.5)–(2.9) on the time interval [*a*,*b*] and vary it according to *δg*=*TL*_{g}*ν* for . For as in (2.19), we have (cf. (2.23))
2.25The same argument holds after replacing the upper boundary *b* by any *b*′∈[*a*,*b*]. As *ν* was arbitrary, we conclude that is conserved along a solution of (2.5)–(2.9).

### (e) Node equations

The remarks above concerned equations (2.5)–(2.9) on the open time intervals between nodes. We now come to the node equations (2.10)–(2.13). These specify the evolution across node times of the Lagrange multipliers *μ*^{r}, which we interpreted above as the reduced *Legendre momenta* of the system. More specifically, the momenta *μ*^{r}, *r*=1,…,*k*−1 are continuous on [0,*t*_{l}], while the 0th momentum *μ*^{0} experiences jump discontinuities at the nodes. If the Lagrangian ℓ is hyperregular, we can conclude that *g*∈*C*^{2k−2}([0,*t*_{l}]), that is, *g* is (2*k*−2) times continuously differentiable on [0,*t*_{l}]. Furthermore, the node equations specify terminal values for the curves *μ*^{r}, *r*=1,…,*k*−1.

For *a*,*b*∈ define 1_{a≤b} to be equal to 1 if *a*≤*b* and 0 otherwise. We can now prove the following theorem.

### Theorem 2.2

*For t in any of the open time intervals (t*_{s}*,t*_{s+1}*) as well as for t∈{0,t*_{l}*},
*2.26

### Proof.

At final time *t*=*t*_{l} (2.26) clearly holds because of (2.12). As is conserved on open intervals, it follows that for *t*∈(*t*_{s},*t*_{s+1}),
We can now obtain (2.26) by induction over the open time intervals, noting that at each node *t*=*t*_{s} a term −(*d*(*t*_{s})/*σ*^{2})*J*^{Q}(d_{1}*d*(*t*_{s})) gets added on. □

In order to formulate the following corollary, we use the notation for any point *q*∈*Q* to denote the Lie algebra of the *isotropy subgroup* of that point, *G*_{q}:={*g*∈*G*|*gq*=*q*}. In particular, *ρ*_{Q}(*q*)=0 for any .

### Corollary 2.3

*For a solution of (2.5)*–*(2.13) we have, for t in any of the open time intervals* (*t*_{s},*t*_{s+1}) *as well as for t*∈{0,*t*_{l}},
2.27

In §4, we will develop a geometric algorithm that inherits an exact version of this corollary. This implies that the numerical search for the optimal initial value of *μ*^{0} can be restricted to the subspace of that annihilates .

### Proof.

For *t* and *ρ* as in the statement of the corollary it follows from theorem 2.2 that
where we used (2.3) for the second equality and noted that for the third. □

### (f) Residual conservation law after partial symmetry breaking

A physically intuitive perspective on corollary 2.3 is to understand it as a residual conservation law after partial symmetry breaking. We can see the sum in (2.2) as the integral over a time-dependent *potential function* *V* :[0,*t*_{l}]×*G*→ given by
2.28where *δ* denotes the Dirac delta function. This potential produces instantaneous singular forces at node times *t*_{i} which impart the jump discontinuities on the otherwise conserved momentum ,
2.29This is because the presence of this potential breaks the *G*-invariance of the variational problem, however a residual symmetry remains. Clearly, multiplication of *g* from the right by an element *h*∈*G*_{Q0} leaves *V* invariant. An adaptation of the argument surrounding equation (2.25), restricting *ν* to the subspace , then leads to
for any *t*∈[0,*t*_{l}]. Moreover, (2.12) gives . Therefore, for any *t*∈[0,*t*_{l}] and , which is equivalent to corollary 2.3.

## 3. Applications

In this section, we discuss a number of examples that summon the inexact trajectory planning problem. We start with a discussion of Riemannian cubics on general Riemannian manifolds and specifically on Lie groups with invariant metrics, because this type of curve appears for a certain natural choice of Lagrangian. We then treat the rigid body, finite-dimensional quantum systems and molecular strands.

### (a) Riemannian cubics

Let (*M*,*γ*) be a Riemannian manifold with metric *γ*, whose norm we denote by ∥.∥_{γ}. The notion of straight lines in Euclidean space generalizes to geodesics in *M*. These are curves *x*(*t*)∈*M* that satisfy . Here, we introduce the notation , where ∇ is the Levi–Civita connection for *γ*. In local coordinates, the geodesic equation is given by
where are the Christoffel symbols of the Levi–Civita connection. We define the vector bundle isomorphism over the identity, ♭:*TM*→*T***M*, which maps *v*∈*T*_{x}*M* to *v*^{♭}=*γ*(*x*)(*v*,⋅). The inverse map is denoted by ♯.

The physical interpretation of straight lines in Euclidean space follows from Newton’s second law . Here, *F* is an external force acting on a particle of unit mass that follows the trajectory *x*(*t*). By analogy, one can often understand as a generalized force acting on a physical system whose time evolution is represented by a curve *x*(*t*) in configuration space *M*. Consider the Lagrangian *L*:*T*^{(2)}*M*→,
3.1whose corresponding action functional measures the square of the *L*^{2}-norm of the external force. Hamilton’s principle *δS*=0 for variations with fixed boundary velocities and leads to the Euler–Lagrange equation [8]
where *R* is the curvature tensor, *R*(*X*,*Y*)*Z*:=∇_{X}∇_{Y}*Z*−∇_{Y}∇_{X}*Z*−∇_{[X,Y ]}*Z* for any vector fields . Solutions to this equation are called *Riemannian cubics* and were introduced in [8].

When the manifold is a Lie group *M*=*G*, we call a Riemannian metric *right (or left) invariant* if *γ*(*g*)(*v*_{g},*w*_{g})=*γ*(*gh*)(*TR*_{h}*v*_{g},*TR*_{h}*w*_{g}), or *γ*(*g*)(*v*_{g},*w*_{g})=*γ*(*hg*)(*TL*_{h}*v*_{g},*TL*_{h}*w*_{g}), respectively, for all *g*, *h*∈*G* and *v*_{g}, *w*_{g}∈*T*_{g}*G*. If the metric is right (or left) invariant, then the Lagrangian (3.1) can be reduced to a function [3]
3.2where we introduced the operation . This can be seen from the fact (e.g. [3]) that for a curve *g*(*t*), whose right (or left) reduced velocity is given by ,
In the following, we will discuss a number of physical systems whose configuration spaces are Lie groups and whose equation of motion in the absence of external forces is given by for some right (or left) invariant Riemannian metric. The Lagrangian (3.2) is then one natural choice for ℓ in the inexact trajectory planning problem because optimal curves minimize (in the *L*^{2} sense) the amount of external forcing necessary to achieve them. It is clear from equations of motion (2.5)–(2.11) that the solution *g*(*t*) is a Riemannian cubic on open intervals and twice continuously differentiable on the whole time interval [0,*t*_{l}]. Such curves are called *Riemannian cubic splines*.

### Remark 3.1

Without going into the mathematical details, we point to a probabilistic interpretation of Riemannian cubics. See also [18], where a closely related idea is discussed in the context of stochastic modelling of biological growth. Let *G* be a Lie group with right-invariant metric *γ* and let *e*_{i}, *i*=1,…,*d* be an orthonormal basis of the *d*-dimensional Lie algebra . Consider a curve *g*(*t*)∈*G*, whose right-reduced velocity satisfies the following Ito stochastic differential equation
3.3where *W*_{i}, *i*=1,…,*d*, are independent Brownian motions (see [24, §2.2], for a definition) and *σ*_{W}∈. Suppose the (noisy) data are given in a vector space *V* equipped with an inner product, whose norm we denote by ∥.∥_{V}. The noise distribution is assumed to be Gaussian, that is, the probability density function has the form , where is the true state of the system and *σ*_{n}∈. Suppose experiments at times *t*_{i}, *i*=1,…,*l* measuring the trajectory *g*(*t*)*Q*_{0} have given results *Q*_{ti}. Then the minimization of
with ℓ as in (3.2), can formally be understood as the maximization of the (logarithm of the) probability of the path *g*(*t*), given the measurements. Alternative models of stochastic forcing will typically lead to minimization problems of the same type, but with different choices of ℓ (see remark 3.2).

### (b) Rigid body splines

Let the Lie group *G* be the set of rigid rotations *SO*(3) and let *Q* be the unit sphere *S*^{2}∈^{3}. We use vector notation for the Lie algebra of the Lie group of rotations *SO*(3) as well as for its dual . One identifies with ^{3} via the familiar isomorphism
3.4called the *hat map*. This is a Lie algebra isomorphism when the vector cross product × is used as the Lie bracket operation on ^{3}. The identification of with ^{3} induces an isomorphism of the dual spaces , with the dot product as duality pairing. Let *γ* be a left-invariant Riemannian metric on *SO*(3). This defines an inner product on which can be expressed as
for a symmetric, positive definite matrix . The geodesic equation in Euler–Poincaré form is
3.5Consequently, the Lagrangian (3.2) takes the form
3.6Consider the inexact trajectory planning problem in the left-action, left-reduction form. That is, suppose an initial point **x**_{0} and targets **x**_{t1},…,**x**_{tl}∈*S*^{2} are given as well as a tolerance parameter *σ*. We seek the minimizer of
3.7The physical interpretation is as follows. The group of rigid rotations, *SO*(3), is the configuration manifold of a rigid body constrained to rotate around a fixed point. In the absence of external torques, the motion is governed by the geodesic equation (3.5), where is the moment of inertia tensor (for example, [25, §2.4]). The resulting curve *g*(*t*) describes the orientation of the rigid body relative to a space fixed reference frame.

Suppose the motion of a rigid body (with or without external torque) is partially observed in an experiment. At discrete times *t*_{i}, the direction of a particular body fixed axis is measured in the space fixed frame, generating a sequence of outcomes **x**_{ti}∈*S*^{2}. Therefore, if **x**_{0} is the initial direction of the axis and *g*(*t*) describes the rigid body motion, then *g*(*t*_{i})**x**_{0}−**x**_{ti}=0, up to measurement error. One would like to model the trajectory *g*(*t*), taking into account this information. The action functional (3.7) encodes one such model, yielding the curve *g*(*t*) of minimal external torque (in the *L*^{2} sense) that is consistent with the experiment. A natural choice for the parameter *σ*^{2} is to set it equal to the variance of the measurement. Example simulations are shown in figures 1 and 2. These were generated by the numerical algorithm discussed in §4.

If the observer is instead moving with a body fixed frame and measuring a space fixed direction, then *g*(*t*_{i})^{−1}**x**_{0}−**x**_{ti}=0, up to measurement error. In this case, the inexact trajectory planning problem presents itself in the right-action, left-reduction form. Evidently, the formalism presented in this paper applies to any (sufficiently smooth) choice of Lagrangian. For example,
leads to optimal curves *g*(*t*) with minimal work done by external torques.

### Remark 3.2

We mentioned in remark 3.1 that the minimization of (3.7) is related to a certain inverse problem given the stochastic evolution (3.3). Alternative stochastic models lead to forms of ℓ different from (3.6). For example,
where d**W**=(d*W*^{1},d*W*^{2},d*W*^{3})^{T} is a vector of independent Brownian motions, leads to
with ∥.∥ being the Euclidean norm.

### (c) Quantum splines

In this section, we give an overview of a quantum mechanical application presented in [6], where more details can be found. We consider an *n*+1-level quantum system with Hilbert space . Quantum state space is the *n*-dimensional complex projective space , which is defined as −{0} modulo the equivalence relation |*ψ*〉∼*λ*|*ψ*〉 for any complex number *λ*≠0. The standard geodesic distance is given by
Here, we used Dirac notation to write |*ψ*〉 for a vector in and 〈*ψ*| for its Hermitian conjugate. The evolution of a quantum state is given by the *Schrödinger equation*
where the Hamiltonian operator *H* is a Hermitian matrix. Equivalently, one may express this as a differential equation on the unitary group *U*(*n*+1) by
3.8One may assume without loss of generality that the anti-Hermitian matrix *iH* is of zero trace, therefore *U*(*t*) is in the special unitary group *SU*(*n*+1). This is owing to the fact that the trace contributes a complex phase factor to the state evolution, which can be neglected in projective terms. Note that −*iH* is skew-Hermitian and trace free, and therefore lies in the Lie algebra .

Suppose a time-dependent Hamiltonian *H*(*t*) is controlled in an experiment whose goal it is to steer a system from initial state |*ψ*_{0}〉 through states |*ψ*_{tj}〉 at times *t*_{j}, for *j*=1,…,*l*. One would like to achieve this trajectory with least change to the experimental apparatus. We formalize this requirement using an inner product on ,
3.9The cost we associate with a time-dependent Hamiltonian *H*(*t*) is . We note that *γ*_{e} extends to a bi-invariant Riemannian metric *γ* on *SU*(*n*+1), which means, in particular, that ad^{†}=−ad. The Lagrangian is therefore equal to the reduced Lagrangian (3.2) for the metric *γ* on *SU*(*n*+1). The total cost functional takes the left-action, right-reduction form
The tolerance parameter *σ* can be used to trade off quality of matching against a reduced cost associated with change in the Hamiltonian over time. For more details, see [6].

### (d) Macromolecular configurations

Equilibrium configurations of macromolecular structures and of DNA in particular can be modelled using the classical theory of elastic rods. See [26,27] for examples of this approach. In this section, we formulate an inexact trajectory planning problem in this context.

We start by showing how the configuration of an elastic rod can be described by a position curve **r**(*s*) in ^{3} and a curve *R*(*s*) in the group of rigid rotations, *SO*(3). Here, *s*∈[0,1] parametrizes the cross sections of the rod along its length, whereby for a given value of *s* the vector **r**(*s*) points to the centre of mass of the respective cross section, as seen in the laboratory frame. Let **e**_{i}(*s*), *i*=1, 2, 3 be an orthonormal frame such that **e**_{1}(*s*) and **e**_{2}(*s*) point along the principal axes of the moment of inertia tensor of the cross section. We will refer to this frame as the *body fixed* frame. *R*(*s*) is the rotation that transforms the initial frame at *s*=0 (which we assume to coincide with the laboratory frame) to the body fixed frame at *s*. Therefore, the configuration of a macromolecule can be described by a curve *g*(*s*)=(*R*(*s*),**r**(*s*)) in the special Euclidean group, *SE*(3), originating at the identity. The group multiplication rule of *SE*(3) is
The Lie algebra consists of elements (*Ω*,**v**) with and **v**∈^{3}. Applying the inverse of the hat map (3.4) to *Ω*, we can represent as ^{6}. The ad operation becomes
If we identify the dual with ^{6} using the standard dot product as duality pairing, then
3.10The *body fixed velocity* vector pertaining to a configuration (*R*(*s*),**r**(*s*)) is defined as
where a superscript ⋅ means derivation with respect to *s*. In vector notation, we can set so that ** ξ**=(

**,**

*Ω***v**)∈

^{6}.

A macromolecule that is experimentally constrained to assume a configuration with given final rotation and displacement *g*(1)=(*R*(1),**r**(1)) will relax into an equilibrium state that minimizes potential energy with respect to all possible configurations respecting the constraint. For the case of DNA, the authors of [27] propose to model this effect using the Lagrangian *L*:*TSE*(3)→ whose left-reduced form is given by
The 6×6 matrix *K* is symmetric and positive definite and encodes the various stiffness properties. The double helix structure of DNA means that the equilibrium configuration for unconstrained end points retains a number *n* of rotations along its length. This is expressed by the vector
3.11where we assume without loss of generality that the equilibrium configuration for unconstrained end points is oriented along the spatial *z*-axis and has unit length.

In the case of constrained final rotation and displacement *g*(1)=(*R*(1),**r**(1)), the equilibrium configuration minimizes the action functional . Hamilton’s principle *δS*=0 leads to Euler–Poincaré equations
3.12with ad* operation given by (3.10) and the operation ad^{†} defined by . This equation of motion can be used to design second-order Lagrangians to model non-equilibrium states of the DNA. For example, let us set
Suppose an experiment measures the position of the centre of mass **r**(*s*_{i}) at a number of parameter values *s*_{i} (*i*=1,…,*l*) as well as a body fixed direction, say **e**_{3}(*s*_{i}). The space of measurement outcomes is *Q*=*S*^{2}×^{3}⊂^{6} with *SE*(3) action given by
If the measurements yield the sequence (**x**_{si},**y**_{si}) (*i*=1,…,*l*), this suggests that up to measurement error the configuration *g*(*s*)∈*SE*(3) satisfies
The task of modelling the configuration *g*(*s*) can then be cast in the form of an inexact trajectory planning problem in the left-action, left-reduction form with cost functional

### Remark 3.3

Owing to the anisotropy in velocity space, expressed by the vector **z**, the Euler–Poincaré equation (3.12) is *not* the reduced geodesic equation for the curve *g*(*t*) with respect to the metric defined by *K*. Consequently, the solution to the inexact trajectory matching problem is not a Riemannian cubic spline.

## 4. Geometric discretization

The purpose of this section is to illustrate how the higher-order Hamilton–Pontryagin principle offers a direct route towards geometric numerical integrators. All one needs to do, in essence, is to provide a geometric discretization of the constraint and define a discrete Hamilton–Pontryagin principle accordingly. For first-order variational problems, this idea was introduced in [19], building on the general theory of variational integrators. We refer to [28–30] for more information and to [31] for an extensive review. Moreover, Marrero *et al.* [32] discusses discrete Lagrangian and Hamiltonian mechanics in the general framework of Lie groupoids. We follow the footsteps of Bou-Rabee & Marsden [19] to treat second-order problems. Third and higher orders can be dealt with in a similar fashion.

Our main motivation for the development of a geometric integrator lies with the exact momentum behaviour exhibited by the discrete solution curves. Not only does the momentum conservation on open intervals (2.24) translate exactly into the discrete picture, but the behaviour at the nodes is given by discrete versions of the continuous time node equations (2.10)–(2.13). As a consequence, one can obtain discrete analogues of theorem 2.2 and corollary 2.3. This means that the numerical search for the optimal initial value of the momentum *μ*^{0} can be restricted to a linear subspace of of the same dimension as the data manifold *Q*. As we shall see below, the variational nature of the integrator also means that the discrete flow map preserves the canonical symplectic form on *T**(*TG*).

### (a) A geometric integrator

In discrete mechanics, the time axis [*t*_{0},*t*_{l}] is replaced by a set of discrete time points *t*_{k}=*t*_{0}+*kh*, *k*=0,…,*N*, where *h* is the step size and *t*_{l}=*t*_{0}+*Nh*. We use integers *N*_{i}, *i*=1,…,*l*, as node indices *t*_{i}=*t*_{0}+*N*_{i}*h*. For convenience, we also define *N*_{0}:=0. We will need a map that approximates the Lie exponential and is an analytical diffeomorphism in a neighbourhood of 0 with *τ*(0)=*e* as well as *τ*(*ξ*)*τ*(−*ξ*)=*e* for all . An example is the *Cayley transform*, which is a second-order approximation of the Lie exponential in quadratic matrix Lie groups. This includes the applications discussed in §3. Indeed, we used the Cayley transform when generating the figures in this article. It is defined as
This and further examples are discussed in [19].

Since *τ* is an approximate of the Lie exponential, a simple way of discretizing the constraint is to require that , where *h* is the size of a time step. Similarly, one may translate to . With these considerations in mind, we define a discretized version of the action functional (2.2) on discrete path space . Let
then we define as
4.1In analogy to the continuous time problem, we assume that *g*_{0}=*e* and are given.

The discrete Euler–Lagrange equations follow from Hamilton’s principle *δS*_{d}=0. That is, they characterize paths , for which *δS*_{d}:=*δγ*(*S*)=0 for all variations with *δg*_{0}=0 and . In the process of computing *δS*_{d}, we need to calculate . For that purpose, it is convenient to introduce the left-trivialized differential of *τ* at ,
whose inverse we denote by . By taking a derivative of *τ*(*ξ*)*τ*(−*ξ*)=*e*, one can show that [19]
4.2Denoting , we find
where in the last equality we used (4.2). Introducing the quantities
4.3and
4.4we obtain, after rearranging terms,
where we used abbreviations *d*_{Ni}=*d*(*g*_{Ni}*Q*_{0},*Q*_{ti}) and d_{1}*d*_{Ni}:=d_{1}*d*(*g*_{Ni}*Q*_{0},*Q*_{ti}). The Euler–Lagrange equations are therefore composed of the following equations. The constraints
4.5the discrete equations for the Legendre–Ostrogradsky momenta
4.6and
4.7the discrete version of the Euler–Poincaré equation for interior indices *k*≠*N*_{i} (*i*=1,…,*l*)
4.8and for node indices *k*=*N*_{i} (*i*=1,…,*l*−1)
4.9Finally,
4.10and
4.11A solution to (4.5)–(4.9) is said to have *initial conditions* , using the definitions (4.3). If the Lagrangian ℓ is hyperregular, then (4.7) can be solved for . This means that can be eliminated from equations (4.5) to (4.9), which can subsequently be integrated for given initial conditions. If in addition equations (4.10) and (4.11) are satisfied, then *γ* is a critical point of the action functional *S*_{d}.

### Remark 4.1

In remark 2.1, we mentioned that besides the above left-action, right-reduction case, three other cases were available to be considered. In a similar manner to what we observed in that remark the right-action, right-reduction case introduces changes to equations (4.9) and (4.10). These become respectively. The remaining two cases (left-action, left-reduction; and right-action, left-reduction) are equivalent to the first two in just the same way as explained in remark 2.1.

### (b) Geometric properties

As in §2*c*, the interior equations are conveniently analysed by omitting the mismatch penalty term (4.1) from the action functional, so that
The arguments that surround equation (2.23) and show symplecticity of the continuous time flow can then be applied in a straightforward manner to the discrete case. Indeed, interior equations (4.5)–(4.8) define a flow map , which integrates a solution *γ* for given initial conditions. That is, or more succinctly (*F*_{d})^{k}(*γ*_{0})=*γ*_{k}. We restrict to solutions of (4.5)–(4.8) and express it as a function of initial conditions *γ*_{0}∈*T**(*TG*). This means that if is the solution obtained by integrating *γ*_{0} then . It follows that
where *θ* is the canonical one-form on *T**(*TG*). Hence, . Taking an exterior derivative shows that the canonical symplectic form *ω*=*dθ* is preserved by *F*_{d}. Hence, the discrete flow map *F*_{d} is symplectic.

Similarly, the observations given in §2*d* can be translated to the discrete picture. Indeed, we pointed out in the paragraph of equation (2.25) how to obtain the conservation of the momentum from a variational perspective. These arguments can be applied to the discrete variational principle to show that for interior indices *k*≠*N*_{i}. Alternatively, a manipulation of equation (4.8) using (4.2) shows that
and therefore .

### Remark 4.2

The equations of motion on open intervals can equivalently be discretized in the purely Lagrangian picture by following [33]. One chooses a suitable discrete Lagrangian *L*_{d}:*G*×*G*×*G*→ and applies variational calculus to the discrete action sum
Let us define by and set
Boundary conditions being equal, the resulting optimal curve (*g*_{0},…,*g*_{N}) is the same as the one we obtained from the discrete Hamilton–Pontryagin principle. The Hamilton–Pontryagin principle has an advantage in situations where more sophisticated discretizations of the constraint are chosen. For example, in the preliminary study [34] Runge–Kutta–Munthe–Kaas methods were used to introduce a class of such integrators. Those integrators can still be understood in the purely Lagrangian framework; however, the definition of the corresponding function *L*_{d}(*g*_{k},*g*_{k+1},*g*_{k+2}) is implicit in that evaluating it requires solving a variational problem. The Hamilton–Pontryagin approach circumvents this difficulty by building the discretization of the constraint into the variational principle from the outset.

The node equation (4.9) reflects in a geometrically consistent way the jump discontinuities of given in (2.29). Indeed, (4.9) says that
when *k*=*N*_{i}. Moreover, the final time conditions (4.10) and (4.11) are exact analogues of (2.12) and (2.13). See figure 3 for an example. Putting everything together leads to discrete versions of theorem 2.2 and corollary 2.3. The proofs are analogous to the continuous-time case.

### Theorem 4.3

*For k=0,…,N
*

### Corollary 4.4

For *k*=0,…,*N*

### (c) Practical remarks

The integrator derived above provides a way of finding a numerical solution to the inexact trajectory planning problem.

The discrete equations of motion (4.5)–(4.9) can be employed to express the action functional as a function of initial conditions . The minimizer in the space of initial conditions can then be determined by a gradient descent method. As *g*_{0}=*e* and are given, the minimization is in effect only over the variables . By corollary 4.4, the optimal lies in the subspace of that annihilates , to which the search can therefore be restricted. On the other hand, one still needs to consider all of for the optimization of .

The gradient of can be estimated via finite-difference methods. However, this requires the repeated forward integration of (4.5)–(4.9). The number of such integrations increases with the number of dimensions of the Lie group *G* and for higher dimensional systems this quickly becomes unfeasible. Such difficulties can be circumvented using the standard method of *adjoint equations*, which can be easily implemented for the geometric discretization presented here (a detailed derivation is provided in the electronic supplementary material). Then the *exact* gradient is obtained at the cost of integrating twice (once forward and once backward) a system of equations of the same complexity as the forward equations.

The significance of the exact preservation of final time constraints (2.12) and (2.13) in the form of (4.10) and (4.11) is to provide verification that a (local) minimum has been found. When the tolerance is tight (*σ* is small) and in the absence of a good initial guess it may occur that the algorithm tends to a local minimum rather than the global one. Suitable initial guesses can be computed using a homotopy strategy. This means a step-by-step reduction of *σ*, where the optimum at one value of *σ* is taken as the initial guess at the next smaller value.

We used this algorithm to generate the figures in this paper. For simulations of quantum splines (see §3*c*) using the same methods, we refer to [6].

## 5. Discussion

This paper has discussed a type of inexact trajectory planning problem whose optimal curves are required to pass near a sequence of fixed target positions at designated times. In §2, a new derivation of the Euler–Lagrange equations for this type of problem was obtained by applying reduction by symmetry to a higher-order Hamilton–Pontryagin principle. This approach provided a new geometric interpretation of the previously known node equations in terms of Legendre–Ostrogradsky momenta. The highest order momentum was seen to undergo discontinuous jumps at the node times as a consequence of a partially broken Lie group symmetry. This was the content of theorem 2.2 and corollary 2.3. In §3, several applications of the theory were discussed, which summoned the inexact trajectory planning problem both from a control theoretic viewpoint (quantum splines) as well as in the context of a type of inverse problem (rigid body splines, macromolecular configurations). Finally, §4 was concerned with the numerical approach to solving the problem at hand. The reduced Hamilton–Pontryagin principle was taken as the starting point and a geometric discretization of the Euler–Lagrange equations was obtained, which led to exact momentum behaviour and discrete versions of both theorem 2.2 and corollary 2.3. This meant, in particular, that the search for the optimal initial value of the highest order momentum could be restricted to a subspace of the dual of the Lie algebra of the Lie group *G*, whose action describes the motion.

## 6. Outlook

This work invites further development in several directions. First, one can show that in general the discrete flow map derived in §4 is accurate only to first order in step size *h*. The development of geometric methods with a higher degree of accuracy would be desirable. The class of integrators presented in the preliminary study [34] may prove useful in this regard. An alternative possibility is the Lagrangian approach of Colombo *et al.* [33] together with suitably exact approximations of the Lagrangian function. Second, the final step in the numerical optimization of the cost functional was based on a shooting method with gradient descent in the space of initial conditions. A comparison with more sophisticated methods of nonlinear programming would be a useful guide for further development. Third, the example of the molecular strand (§3*d*) was solved as a problem of statics. Adding the consideration of dynamics of the strand brings one into the realm of the so-called *G*-Strands [35], for which the inexact trajectory planning problem may prove interesting. Finally, for applications in computational anatomy one must deal with infinite dimensional diffeomorphism groups. Extending the framework of this paper to infinite dimensions is therefore another challenge that lies ahead.

## Funding statements

D.D.H. thanks the Royal Society for a Wolfson Research Merit Award and the European Research Council for Advanced grant no. 267382.

## Acknowledgements

We extend our gratitude to M. Bruveris, F. Gay-Balmaz, H. O. Jacobs, M. Leok, L. Noakes, T. S. Ratiu, J. Vankerschaver and F.-X. Vialard for several enlightening discussions of this material. We thank both referees for their careful reading and constructive suggestions.

- Received April 19, 2013.
- Accepted August 20, 2013.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.