## Abstract

A new perspective on the emergence of the Korteweg–de Vries (KdV) equation is presented. The conventional view is that the KdV equation arises as a model when the dispersion relation of the linearization of some system of partial differential equations has the appropriate form, and the nonlinearity is quadratic. The assumptions of this paper imply that the usual spectral and nonlinearity assumptions for the derivation of the KdV equation are met. In addition to a new mechanism, the theory shows that the emergence of the KdV equation always takes a universal form, where the coefficients in the KdV equation are completely determined from the properties of the background state—even an apparently trivial background state. Moreover, the mechanism for the emergence of the KdV equation is simplified, reducing it to a single condition. Well-known examples, such as the KdV equation in shallow-water hydrodynamics and in the emergence of dark solitary waves, are predicted by the new theory for emergence of KdV.

## 1. Introduction

The Korteweg–de Vries (KdV) equation is one of the best known of all nonlinear partial differential equations (PDEs). It was discovered in the nineteenth century, first appearing in a footnote in the work of Boussinesq in 1877 [1], but it went largely unnoticed until the seminal 1895 paper of Korteweg & de Vries [2]. It is now recognized to be a fundamental model in a wide range of applications. Historical essays on KdV are given by Miles [3] and de Jager [4].

The KdV equation was first derived in the context of water waves in shallow water. A review of the various derivations is given by de Jager [4]. It was subsequently realized that it appears in a wide range of contexts: water waves [3], internal waves [5] and many other applications [6]. The conventional view is that the KdV equation appears as an approximate model equation when the dispersion relation for the frequency versus wavenumber, for some system of PDEs, is approximated by *ω*=*c*_{1}*k*+*c*_{2}*k*^{3} (for some constants *c*_{1} and *c*_{2}, as *k*→0) and there is a ‘balance between dispersion and nonlinearity’ to leading order. In this well-established approach, the coefficient of the linear dispersive system (*c*_{2} above) is readily found, whereas the coefficient of the quadratic nonlinear term is system dependent.
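As a rough numerical illustration of the long-wave dispersion relation *ω*=*c*_{1}*k*+*c*_{2}*k*^{3}, the sketch below assumes the classical finite-depth water-wave relation ω² = g k tanh(kh). That relation is brought in here only as an example and is not derived in this paper. Its Taylor coefficients are c1 = √(gh) and c2 = −√(gh) h²/6, and the function names and parameter values are hypothetical.

```python
import math


def omega_exact(k, g=9.81, h=1.0):
    """Assumed finite-depth water-wave dispersion relation (illustrative only)."""
    return math.sqrt(g * k * math.tanh(k * h))


def omega_long_wave(k, g=9.81, h=1.0):
    """Long-wave approximation omega = c1*k + c2*k**3 of the assumed relation."""
    c1 = math.sqrt(g * h)
    c2 = -c1 * h ** 2 / 6.0
    return c1 * k + c2 * k ** 3


def relative_error(k, g=9.81, h=1.0):
    """Relative difference between the assumed relation and its cubic approximation."""
    exact = omega_exact(k, g, h)
    return abs(exact - omega_long_wave(k, g, h)) / exact


if __name__ == "__main__":
    for k in (0.4, 0.2, 0.1, 0.05):
        print(f"k = {k:5.2f}   relative error = {relative_error(k):.3e}")
```

Halving *k* should reduce the relative error by roughly a factor of sixteen, which is the sense in which the cubic approximation holds as *k*→0.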

The purpose of this paper is to show that there is a quite different derivation for a certain general class of equations, which yields both the dispersive and nonlinear coefficients in terms of simple formulae. The approach here is to obtain the KdV equation by modulating the basic state. This approach leads to a single condition for the emergence of KdV. Moreover, a significant outcome is that the coefficients in the KdV equation can be obtained directly from the properties of the basic state, and so it can be written down without extensive calculation. The main result of the paper is that the emergent KdV equation takes the universal form

$$2\mathscr{A}_k\,q_T + \mathscr{B}_{kk}\,q\,q_X + \mathscr{K}\,q_{XXX} = 0, \tag{1.1}$$

where *k* parametrizes the basic state, *q*(*X*,*T*) is the modulation of *k*, *X*=*εx* and *T*=*ε*^{3}*t* are slow space and time variables, respectively, *ε* is an amplitude parameter, and $\mathscr{K}$ is a parameter associated with the basic state. The functions $\mathscr{A}(k)$ and $\mathscr{B}(k)$ are given, as they are deduced from the basic state. The condition

$$\mathscr{B}_k = 0 \tag{1.2}$$

is the principal necessary condition for the emergence of the KdV equation, and this condition can be checked using only information from the basic background state.

In order to keep the class of PDEs as general as possible, it is assumed that the PDE is the Euler–Lagrange equation for some given Lagrangian

$$\mathcal{L}(Z) = \iint L(Z_t, Z_x, Z)\,\mathrm{d}x\,\mathrm{d}t, \tag{1.3}$$

where *Z*(*x*,*t*) is a vector-valued function.

One feature that may be surprising, but will be clear *a posteriori*, is the role of symmetry in the emergence of the KdV equation. Suppose that the Lagrangian is invariant with respect to a one-parameter group of symmetries. By Noether's theorem there is an associated conservation law,

$$A(Z)_t + B(Z)_x = 0, \tag{1.4}$$

where *A*(*Z*) is the density and *B*(*Z*) the flux of the conservation law.

Now suppose there is a steady solution, a relative equilibrium (RE), of the Euler–Lagrange equation associated with (1.3); more precisely, a one-parameter family of such solutions associated with the symmetry group. Denote this family of solutions by $\hat Z(\theta,k)$, where *θ*=*kx*+*θ*_{0} with *θ*_{0} an arbitrary phase shift, and *k* parametrizes the family of RE. The precise form of the family of RE is given in §3. Since a parameter must be varied in order to generate a KdV equation, there is always explicitly or implicitly a background state. The functions $\mathscr{A}(k)$ and $\mathscr{B}(k)$ are the components of the conservation law (1.4) evaluated on the family of RE.

Given the basic periodic state, the strategy is to modulate and perturb the basic state; that is, a solution of the full PDE (introduced in §2) of the following form is proposed:

$$Z(x,t) = \hat Z(\theta + \varepsilon^{a}\phi,\; k + \varepsilon^{b}q) + \varepsilon^{c}\,W(\theta,X,T,\varepsilon), \tag{1.5}$$

where *ϕ*(*X*,*T*,*ε*) and *q*(*X*,*T*,*ε*) are functions of a slow space variable *X*=*ε*^{α}*x* and slow time variable *T*=*ε*^{β}*t*. The exponents *a*,*b*,*c* and *α*,*β* have to be chosen appropriately to generate a modulation PDE for *ϕ* and *q*. In the present case, where the governing equations are deduced from a Lagrangian functional, it is natural to propose the Whitham scaling (cf. [7,8]); that is, *α*=*β*=1 and *a*=0, *b*=1 and *c*=2. However, this scaling leads to a quasi-linear modulation equation (the Whitham modulation equations) without dispersion. For non-conservative systems, the modulation (1.5) has been used by Doelman *et al*. [9] with *α*=1, *β*=2, *a*=0, *b*=1 and *c*=2. With this scaling, a class of reaction–diffusion equations is reduced to Burgers' equation for *q*.

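Here, the interest is in a scaling that leads to the KdV equation: choose *α*=1, *β*=3 and *a*=1, *b*=2 and *c*=3, giving the following ansatz for the modulation:

$$Z(x,t) = \hat Z(\theta + \varepsilon\phi,\; k + \varepsilon^{2}q) + \varepsilon^{3}\,W(\theta,X,T,\varepsilon), \tag{1.6}$$

with

$$X = \varepsilon x \quad\text{and}\quad T = \varepsilon^{3}t. \tag{1.7}$$

The main result of this paper is that, to leading order, *q*=*ϕ*_{X} and *q* satisfies the *q*-KdV equation (1.1). A necessary condition, which emerges from the perturbation expansion, is $\mathscr{B}_k=0$.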
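The relation *q*=*ϕ*_{X} can be anticipated from the scaling alone. The short calculation below is only a sketch: it assumes that the phase of the basic state is modulated as θ + εϕ(X,T) with X = εx (the choice *a*=1, *α*=1 above) and reads the *x*-derivative of the modulated phase as a local wavenumber.

```latex
\frac{\partial}{\partial x}\bigl(\theta + \varepsilon\,\phi(X,T)\bigr)
  = k + \varepsilon\,\phi_X\,\frac{\partial X}{\partial x}
  = k + \varepsilon^{2}\phi_X ,
\qquad \theta = kx + \theta_0 , \quad X = \varepsilon x .
```

Comparing with a modulated wavenumber of the form *k*+*ε*^{2}*q* (the choice *b*=2) suggests *q*=*ϕ*_{X} at leading order. The perturbation expansion in §4 is what actually establishes it.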

This geometric form for KdV (1.1) is a generalization of the normal form for the emergence of dark solitary waves (DSWs) (cf. [10]). Here, it is shown that the form (1.1) is far more general and applies to the modulation of a general class of basic states. In [10], a dynamical systems argument augmented by temporal transformations was used, whereas here a more general direct PDE derivation is given.

The strategy is to substitute (1.6) into the Euler–Lagrange equation for (1.3) and expand everything in powers of *ε*. The equations are satisfied exactly up to fifth order in *ε* if and only if $\mathscr{B}_k=0$, *q*=*ϕ*_{X} and *q* satisfies (1.1).

Two examples are given in the paper. In §8, a simplified model for shallow-water waves is considered. The standard formula for the KdV equation for water waves (e.g. [11], eqn (6.9c), p. 693) is

(1.8)

where the ± signs are for the cases of left-running and right-running waves, *h*_{0} is a reference depth and *g* is the gravitational constant. It is shown in §8 that (1.8) follows immediately from the formula (1.1). The conservation law in this case is *conservation of mass* and the family of RE are the uniform flows.

The second example, considered in §9, revisits the emergence of DSWs from the defocusing nonlinear Schrödinger (NLS) equation. A derivation of the result in [10] is given using the new theory in this paper.

## 2. Lagrangian form of nonlinear partial differential equations

A general form of Lagrangian as in (1.3) can be taken as the starting point, and the theory can be developed in terms of a Taylor series expansion of the Lagrangian. However, there is a natural canonical form for (1.3),

$$\mathcal{L}(Z) = \iint \Bigl[\tfrac12\langle \mathbf{M}Z_t, Z\rangle + \tfrac12\langle \mathbf{J}Z_x, Z\rangle - S(Z)\Bigr]\,\mathrm{d}x\,\mathrm{d}t. \tag{2.1}$$

For definiteness and simplicity, *Z*(*x*,*t*) takes values in $\mathbb{R}^4$ (comments on higher phase space dimension are in §10). In the Lagrangian density, **M** and **J** are 4×4 skew-symmetric matrices, *S*(*Z*) is a given smooth function and 〈⋅,⋅〉 is the standard inner product on $\mathbb{R}^4$. Although not essential, it is convenient to assume that **J** is invertible:

$$\det(\mathbf{J}) \neq 0. \tag{2.2}$$

Given an arbitrary Lagrangian density, the form (2.1) can be obtained by Legendre transform in both time and space when the Lagrangian is non-degenerate; examples are given in §§8 and 9.

Here, it is assumed that the Lagrangian is already in the form (2.1). This form is an example of a Lagrangian for a *multi-symplectic Hamiltonian PDE* [12–14]. The Euler–Lagrange equation for (2.1) is

$$\mathbf{M}Z_t + \mathbf{J}Z_x = \nabla S(Z). \tag{2.3}$$

The steady part of this equation is a standard Hamiltonian system on $\mathbb{R}^4$ with Hamiltonian function *S*(*Z*),

$$\mathbf{J}Z_x = \nabla S(Z). \tag{2.4}$$

### (a) Symmetry and conservation laws

Suppose that the Lagrangian is invariant under the action of a one-parameter group. By Noether's theorem applied to the Lagrangian, there is a connection between symmetry and conservation laws. Hydon [13] has developed the Noether theory for systems in the form (2.1). A general theory for arbitrary (abelian and non-abelian) Lie groups can be developed, but for simplicity, the essential ideas can be seen by considering the simplest cases. Three cases will be considered: an orthogonal group action, an affine group action and the case where the basic state is periodic (where the conservation law is conservation of wave action).

#### (i) Orthogonal group action

A one-parameter orthogonal action is of the form *G*_{s}*Z*, where with infinitesimal generator

(2.5)

Invariance of the Lagrangian can be characterized in terms of its component parts

(2.6)

Noether's theorem then gives the conservation law

(2.7)

The infinitesimal versions of (2.6) assure that and are symmetric matrices. The conservation law (2.7) can be verified by direct calculation

(2.8)

An explicit illustration of an orthogonal group action is given in §9.

An RE is of the form

(2.9)

An equation for is obtained by substituting (2.9) into (2.4).

#### (ii) Affine group action

A one-parameter affine action is of the form *Z*↦*Z*+*sΥ*, where $\Upsilon\in\mathbb{R}^4$ is a fixed constant vector, which is also the infinitesimal generator. Invariance of the Lagrangian reduces to *S*(*Z*+*sΥ*)=*S*(*Z*). Noether's theorem then gives the conservation law

(2.10)

The conservation law (2.10) can be verified as in (2.8). An explicit illustration of an affine group action is given in §8.
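The affine conservation law can be checked in one line. The sketch below assumes the governing equation in the form **M***Z*_{t}+**J***Z*_{x}=∇*S*(*Z*), consistent with the moving-frame version quoted in §8(c), and uses only the invariance *S*(*Z*+*sΥ*)=*S*(*Z*) together with the fact that *Υ*, **M** and **J** are constant. The sign and normalization of the density and flux are a choice made here and may differ from those in (2.10).

```latex
0 = \frac{d}{ds}\,S(Z + s\Upsilon)\Big|_{s=0}
  = \langle \Upsilon, \nabla S(Z) \rangle
  = \langle \Upsilon, \mathbf{M} Z_t + \mathbf{J} Z_x \rangle
  = \partial_t \langle \Upsilon, \mathbf{M} Z \rangle
  + \partial_x \langle \Upsilon, \mathbf{J} Z \rangle .
```

With this choice the density is 〈*Υ*,**M***Z*〉 and the flux is 〈*Υ*,**J***Z*〉, both linear in *Z*.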

An RE is of the form

(2.11)

An equation for is obtained by substituting (2.11) into (2.4).

#### (iii) Periodic state

In the case where the basic state is periodic, it can also be represented in the form $\hat Z(\theta,k)$ with *θ*=*kx*+*θ*_{0} and $\hat Z$ a 2*π*-periodic function of *θ*. The conservation law in this case is *conservation of wave action* (cf. [15]) with components
where 〈〈⋅,⋅〉〉 is an inner product, including averaging over *θ*,

$$\langle\!\langle U, V\rangle\!\rangle = \frac{1}{2\pi}\int_0^{2\pi}\langle U, V\rangle\,\mathrm{d}\theta. \tag{2.12}$$

The governing equation for $\hat Z$ is obtained by substituting it into (2.4).

## 3. Basic state: a steady relative equilibrium

All three of the above cases can be represented in the form $\hat Z(\theta,k)$, and so this form will be used throughout, and all three will be referred to as ‘RE’. In the case of orthogonal or affine group action, the conservation laws do not require averaging. However, the inner product (2.12) will be used throughout, although in the case when the RE is associated with the orthogonal or affine group action it reduces to the standard inner product: 〈〈⋅,⋅〉〉=〈⋅,⋅〉.

Substitution of a family of RE, $\hat Z(\theta,k)$, into (2.4) shows that it satisfies

$$k\,\mathbf{J}\hat Z_\theta = \nabla S(\hat Z). \tag{3.1}$$

Suppose henceforth that equation (3.1) has a solution and that it is a smooth function of *θ* and *k*.

### (a) Linearization about the family of relative equilibrium

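The derivatives of the basic state with respect to *θ* and *k* will be needed. Introduce the linear operator

$$\mathbf{L} = D^2S(\hat Z) - k\,\mathbf{J}\frac{\mathrm{d}}{\mathrm{d}\theta}. \tag{3.2}$$

Then differentiation of (3.1) with respect to *θ* and *k* gives

$$\mathbf{L}\xi_1 = 0 \quad\text{and}\quad \mathbf{L}\xi_2 = \mathbf{J}\xi_1, \tag{3.3}$$

where

$$\xi_1 = \hat Z_\theta \quad\text{and}\quad \xi_2 = \hat Z_k. \tag{3.4}$$

The vectors {*ξ*_{1},*ξ*_{2}} form the first two elements of a Jordan chain. Higher derivatives with respect to *θ* and *k* are obtained by successive differentiation. The only one needed in the later derivation is an equation for $\hat Z_{kk}$. Differentiating (3.1) twice with respect to *k* gives

$$\mathbf{L}\hat Z_{kk} = 2\,\mathbf{J}\hat Z_{\theta k} - D^3S(\hat Z)(\hat Z_k,\hat Z_k). \tag{3.5}$$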

The operator **L** is formally symmetric with respect to the inner product (2.12). Hence *ξ*_{1} is in the kernel and co-kernel. It is assumed that the kernel and co-kernel are no larger,

$$\mathrm{Ker}(\mathbf{L}) = \mathrm{span}\{\xi_1\}; \tag{3.6}$$

that is, zero is an eigenvalue of **L** of geometric multiplicity one.

The components of the conservation law evaluated along a branch of RE are

(3.7)

The first derivatives of the components of the conservation law will be needed,

(3.8)

The second derivative of the flux will also be needed,

(3.9)

## 4. Modulation of the basic state

Modulate the basic state using the expansion (1.6). Expand *W*(*θ*,*X*,*T*,*ε*) in a Taylor series in *ε*,

$$W(\theta,X,T,\varepsilon) = W_1(\theta,X,T) + \varepsilon\,W_2(\theta,X,T) + \varepsilon^{2}\,W_3(\theta,X,T) + \cdots. \tag{4.1}$$

Now substitute (1.6) with (4.1) into the PDE (2.3), and expand all terms in power series in *ε* up to fifth order. For example, the expansion of the modulation term is
with
where the superscript zero indicates evaluation with *ε*=0. To simplify notation, the zero superscript is dropped henceforth, and it is understood that all terms are now evaluated at *ε*=0. Substitution of all the expansions into (2.3) leads to lengthy expressions and just a summary of the equations at each order is given here.

The terms proportional to *ε*^{0} in the expansion just recover the governing equation (3.1) for the basic state. The terms proportional to *ε*^{1} give
which is exactly satisfied by the basic state for all *ϕ*. The terms proportional to *ε*^{2} give
The terms in brackets on the left-hand side vanish identically (obtained by differentiating with respect to *θ* and using (3.3)). Since $\hat Z_\theta\neq 0$ (otherwise $\hat Z$ would be an equilibrium; cf. equation (3.1)), it follows that

$$q = \phi_X. \tag{4.2}$$

Starting at third order, the equations become more interesting since the *W*_{j} terms begin to appear. The terms proportional to *ε*^{3} are
where the terms in brackets are lengthy expressions which vanish identically. Hence the equation for *W*_{1} reduces to

(4.3)

Equation (4.3) is solvable if and only if the right-hand side is in the range of **L**. Solvability requires
Comparison with the second equation of (3.8) shows that this condition is $\mathscr{B}_k=0$, proving the following: *the equation for* *W*_{1} *in* (4.3) *is solvable if and only if* $\mathscr{B}_k=0$. The solvability of (4.3) is related to the third component of a Jordan chain (see §5). Using *ξ*_{3} from §5, the general solution for *W*_{1} is
for some (yet to be determined) function *α*(*X*,*T*).

## 5. Intermezzo: $\mathscr{B}_k=0$ and Jordan chain theory

The linear operator **L** defined in (3.2) with kernel (3.6) has zero as an eigenvalue of geometric multiplicity one. The equations (3.3) show that the algebraic multiplicity is at least two. In this section, conditions are established for the algebraic multiplicity to be exactly four, with eigenfunctions {*ξ*_{1},*ξ*_{2},*ξ*_{3},*ξ*_{4}}, where *ξ*_{1} is the geometric eigenvector and *ξ*_{2},*ξ*_{3} and *ξ*_{4} are generalized eigenvectors.

### Proposition

*Suppose* **L** *in (3.2) has zero as an eigenvalue of geometric multiplicity one. The algebraic multiplicity is four if and only if*

$$\mathscr{B}_k = 0 \quad\text{and}\quad \mathscr{K}\neq 0. \tag{5.1}$$

The condition $\mathscr{B}_k=0$ is required for the Jordan chain to have length greater than two, and $\mathscr{K}\neq 0$ is the condition for the chain to end at length four. Since the phase space dimension is four, the second condition is automatically satisfied, but it is included here in order to highlight what is necessary for the more general case.

The Jordan chain has length three if there exists a function *ξ*_{3} satisfying

$$\mathbf{L}\xi_3 = \mathbf{J}\xi_2. \tag{5.2}$$

This equation is solvable if and only if 〈*ξ*_{1},**J***ξ*_{2}〉=0, which by (3.8) is the condition $\mathscr{B}_k=0$, confirming the first condition in (5.1). Hence there exists a function *ξ*_{3} satisfying (5.2) and the Jordan chain has length at least three.

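The Jordan chain has length four if there exists a function *ξ*_{4} satisfying

$$\mathbf{L}\xi_4 = \mathbf{J}\xi_3. \tag{5.3}$$

This equation is solvable if and only if 〈*ξ*_{1},**J***ξ*_{3}〉=0. Rearranging and using the Jordan chain,

$$\langle \xi_1,\mathbf{J}\xi_3\rangle = -\langle \mathbf{J}\xi_1,\xi_3\rangle = -\langle \mathbf{L}\xi_2,\xi_3\rangle = -\langle \xi_2,\mathbf{L}\xi_3\rangle = -\langle \xi_2,\mathbf{J}\xi_2\rangle = 0,$$

using symmetry of **L** and skew-symmetry of **J**. Hence there exists a function *ξ*_{4} satisfying (5.3) and the Jordan chain has length at least four.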

It remains to give a condition for the Jordan chain to end at four. Look at the next element in the chain. The Jordan chain has length five if there exists a function *ξ*_{5} satisfying

$$\mathbf{L}\xi_5 = \mathbf{J}\xi_4. \tag{5.4}$$

However, (5.4) is solvable if and only if $\mathscr{K}=0$. Hence the Jordan chain terminates at length four if $\mathscr{K}\neq 0$, confirming the second condition in (5.1).
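A finite-dimensional toy version of this chain argument can be checked directly. The sketch below is an illustration only and is not taken from the paper: it assumes a hypothetical quadratic function S(Z) = p1 q2 + p2²/2 in coordinates Z = (q1, q2, p1, p2), so that the role of **L** is played by the constant symmetric Hessian `H`, and it takes `J` to be the standard 4×4 symplectic matrix. The chain vectors were found by hand for this example.

```python
# Toy Jordan chain: H xi_1 = 0 and H xi_{j+1} = J xi_j, with H symmetric, J skew.
# H is the Hessian of the hypothetical S(Z) = p1*q2 + p2**2/2, Z = (q1, q2, p1, p2).

J = [[0, 0, -1, 0],
     [0, 0, 0, -1],
     [1, 0, 0, 0],
     [0, 1, 0, 0]]   # skew-symmetric and invertible

H = [[0, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]   # symmetric, kernel spanned by (1, 0, 0, 0)

# Chain found by hand for this example.
xi = [[1, 0, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 0, 1],
      [0, 0, -1, 0]]


def matvec(A, v):
    """Matrix-vector product for small lists of lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    """Standard inner product."""
    return sum(a * b for a, b in zip(u, v))


def chain_is_valid():
    """Check H xi_1 = 0 and H xi_{j+1} = J xi_j for j = 1, 2, 3."""
    ok = matvec(H, xi[0]) == [0, 0, 0, 0]
    for j in range(3):
        ok = ok and matvec(H, xi[j + 1]) == matvec(J, xi[j])
    return ok


def solvability(j):
    """<xi_1, J xi_j>: must vanish for H xi_{j+1} = J xi_j to be solvable."""
    return dot(xi[0], matvec(J, xi[j - 1]))


if __name__ == "__main__":
    print("chain valid:", chain_is_valid())
    print("pairings <xi_1, J xi_j>, j = 1..4:", [solvability(j) for j in range(1, 5)])
```

In this toy example the first three pairings vanish, so the chain continues, while the fourth is non-zero, so no fifth element exists and the chain ends at length four.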

## 6. Terms of fourth and fifth order in *ε*

The terms proportional to *ε*^{4} give
Substituting for *W*_{1} and simplifying reduces it to

(6.1)

Introduce a function *ζ* satisfying

(6.2)

The right-hand side is in the range of **L** since **M** is skew-symmetric, so the vector-valued function *ζ*(*θ*) exists. Substitute into (6.1),

(6.3)

The second and fourth terms in the Jordan chain can be used to solve for the inhomogeneous terms, giving the following general solution for *W*_{2}:

(6.4)

for some (yet to be determined) function *β*(*X*,*T*).

### (a) Fifth-order terms

The fifth-order terms are the most interesting. It is necessary to go to fifth order because zero is an eigenvalue of **L** of algebraic multiplicity four. The four terms of the Jordan chain all contribute, and successive terms are one power higher in *ε* (e.g. *ξ*_{4} shows up for the first time at *ε*^{4} in (6.4)). Hence the expansion must be taken one order higher for the entire Jordan chain to contribute.

The terms proportional to *ε*^{5} give
where the terms in Remainder either vanish identically or can be expressed in terms of **L**(⋅). Substituting for *W*_{1} and *W*_{2} and incorporating the terms proportional to **L** into *W*_{3} and calling it ,
This equation is solvable if and only if the right-hand side is in the range of **L**. Applying solvability gives

(6.5)

with

(6.6)

Since , this coefficient vanishes only if the Jordan chain in §5 has length greater than four. It remains to simplify the expressions for the coefficients *a*_{0} and *a*_{1}.

## 7. Coefficients in *q*-KdV

Starting with the expression for *a*_{0} in (6.6)
Use the same strategy to determine the coefficient *a*_{1} but with less detail. Starting with the expression for *a*_{1} in (6.6) and using (3.9)

(7.1)

For the third term in (7.1),

(7.2)

For the second term in (7.1)
using (3.5). Substituting this expression and (7.2) into (7.1) and using permutation invariance of over *i*,*j*,*k*, gives . Hence, multiplying (6.5) through by −1 then gives the *q*-KdV equation (1.1). Non-degeneracy of the *q*-KdV requires the natural assumptions

(7.3)

In summary, the ansatz (1.6) satisfies the governing equations (2.3) up to fifth order in *ε* if and only if (i) $\hat Z(\theta,k)$ satisfies (3.1); (ii) $\mathscr{B}(k)$ satisfies $\mathscr{B}_k=0$; (iii) *q*=*ϕ*_{X}; (iv) $\mathscr{K}\neq 0$; and (v) *q* satisfies the *q*-KdV equation.

## 8. Example: a model for shallow-water waves

In this section, a simplified model for water waves is considered, and it is shown how the theory of this paper applies to give an immediate derivation of the KdV for shallow-water waves. Development of the theory for the full water wave problem and its implications are given in [16]. Consider the following Boussinesq model (cf. [11], ch. 5)

(8.1)

where, for the gravity wave problem,

(8.2)

*h*(*x*,*t*) is the surface elevation, *u*(*x*,*t*) the vertical average of the horizontal velocity, *g* the gravitational constant and *h*_{0} a reference depth.

Introduce a velocity potential *u*=*ϕ*_{x}. Then the equations can be written in the form

(8.3)

where *R* is the Bernoulli function. They are generated by the Lagrangian functional
with
The time-derivative term is already in canonical form. Take a Legendre transform with respect to the *x*-direction
Hence the fully Legendre transformed system is
with
and

### (a) Relative equilibria

The system is invariant under the symmetry *ϕ*↦*ϕ*+const. In terms of the *Z* coordinates, it is an affine symmetry: *Z*↦*Z*+*sΥ* with *Υ*=(1,0,0,0). This symmetry generates the mass conservation law
The family of steady relative equilibria associated with this symmetry is
and *k* represents the background velocity of the uniform flow. This family of RE satisfies

(8.4)

Writing out (8.4) gives *w*=0, *p*=*h*_{0}*k* and , where *R* (the Bernoulli constant) is the specified value of total head.

The components of the conservation law evaluated on the family of RE are
The necessary condition for the emergence of KdV is
which is the usual condition for criticality. By defining the Froude number , the ‘Froude number unity’ condition is recovered. *This condition is the only necessary condition required for emergence of KdV.*
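The criticality condition can be checked with exact rational arithmetic. The sketch below is an illustration under assumptions that are not displayed in the text above: the uniform flow of velocity *k* has depth fixed by a Bernoulli relation R = g h + k²/2, and the mass conservation law has density h and flux h k on that flow. The function names and the sample values g = 10, h0 = 2/5 are hypothetical.

```python
from fractions import Fraction


def depth(k, R, g):
    """Depth of the uniform flow from the assumed Bernoulli relation R = g*h + k**2/2."""
    return (R - k * k / 2) / g


def A(k, R, g):
    """Assumed mass density evaluated on the uniform flow."""
    return depth(k, R, g)


def B(k, R, g):
    """Assumed mass flux evaluated on the uniform flow."""
    return k * depth(k, R, g)


def B_k(k, R, g):
    """dB/dk at fixed R, differentiated by hand: (R - 3*k**2/2)/g."""
    return (R - Fraction(3, 2) * k * k) / g


def froude_squared(k, R, g):
    """Square of the Froude number k / sqrt(g*h) of the uniform flow."""
    return k * k / (g * depth(k, R, g))


if __name__ == "__main__":
    g, h0 = Fraction(10), Fraction(2, 5)   # hypothetical sample values
    k = Fraction(2)                        # chosen so that k**2 = g*h0
    R = g * h0 + k * k / 2                 # total head of this uniform flow
    print("depth     =", depth(k, R, g))
    print("B_k       =", B_k(k, R, g))
    print("Froude^2  =", froude_squared(k, R, g))
```

Under these assumptions the derivative of the flux vanishes exactly when k² = g h, that is, when the Froude number is one.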

### (b) Emergence of Korteweg–de Vries

Differentiating the conservation laws further
Hence the KdV equation should be of the form
To determine $\mathscr{K}$, linearize (8.4) and solve for the Jordan chain {*ξ*_{1},*ξ*_{2},*ξ*_{3},*ξ*_{4}} in (3.3), (5.2) and (5.3). The result is
Hence
and so the universal form for KdV is
or

(8.5)

which is precisely [11], eqn (6.9c), p. 693 with *k*↦−*k* (note that both signs of *k* are admissible). This is the ‘velocity form’ of the KdV equation for water waves. By letting *q*=−*kη*/*h*_{0} the standard form for the ‘surface height’ version of the KdV is obtained

(8.6)

which recovers Dingemans [11], eqn (6.9b), p. 693, noting that .

### (c) Moving versus stationary frame

One subtlety in comparing derivations is that the derivation of KdV in [11] is relative to a moving frame with the uniform flow velocity zero, whereas the above derivation is relative to a laboratory frame of reference. Owing to Galilean invariance the two are equivalent. However, suppose *both* a moving frame and a non-zero background velocity are introduced. For a frame moving at speed *c* (that is, *x*↦*x*−*ct*) the governing equation is modified to **M***Z*_{t}+(**J**−*c***M**)*Z*_{x}=∇*S*(*Z*) and the derivation proceeds as before with the conservation law boosted by *c*. The details and implications for water waves are given in [16].
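The modified equation follows from the chain rule. As a sketch, write the solution in the moving coordinate *y*=*x*−*ct*. The tilde and the symbol *y* are notation introduced here only for this calculation.

```latex
\begin{aligned}
Z(x,t) &= \widetilde{Z}(y,t), \qquad y = x - ct
  \quad\Longrightarrow\quad
  Z_t = \widetilde{Z}_t - c\,\widetilde{Z}_y, \qquad Z_x = \widetilde{Z}_y, \\
\mathbf{M} Z_t + \mathbf{J} Z_x = \nabla S(Z)
  \quad&\Longrightarrow\quad
  \mathbf{M}\widetilde{Z}_t + (\mathbf{J} - c\,\mathbf{M})\,\widetilde{Z}_y = \nabla S(\widetilde{Z}).
\end{aligned}
```

Since **M** and **J** are both skew-symmetric, so is **J**−*c***M**, and the moving-frame system keeps the same structure with **J** replaced by **J**−*c***M**.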

## 9. Example: defocusing nonlinear Schrödinger to Korteweg–de Vries

Consider the defocusing nonlinear Schrödinger (NLS) equation

(9.1)

It has a natural *S*^{1} symmetry with action *Ψ*↦e^{iθ}*Ψ*. Hence e^{iθ}*Ψ*(*x*,*t*) is a solution for any *θ*∈*S*^{1} when *Ψ*(*x*,*t*) is a solution. The conservation law associated with this symmetry is

(9.2)

To apply the theory here, first show that the NLS equation (9.1) is the Euler–Lagrange equation associated with a Lagrangian in the standard form (2.1). Let *Ψ*=*u*_{1}+*iu*_{2}; then the system is
and
It is generated by the Lagrangian
with **u**=(*u*_{1},*u*_{2}) and
The *t*-part is in standard form. Take a Legendre transform in the *x*-direction: giving . Then is in the standard form (2.1) with
with

The family of RE associated with the *S*^{1} symmetry are solutions of the form *Ψ*=*Ψ*_{0}e^{i(kx+θ_{0})}, where *θ*_{0} is an arbitrary phase shift. Substitution into (9.1) shows that *k*^{2}+|*Ψ*_{0}|^{2}=1. The components of the conservation law (9.2) evaluated on the family of RE are

### (a) Emergence of Korteweg–de Vries

The necessary condition for emergence of KdV is
which is satisfied for two values of *k*. Hence the canonical form for KdV is

(9.3)

since

It remains to compute the Jordan chain in order to calculate $\mathscr{K}$. In real coordinates, the basic family of RE is
where
In the linearization about the family of RE, the Jordan chain takes the form
Substitution into the linearization shows that
with **a**_{0}=**b**_{0}=0. Hence a basis for the Jordan chain is
The coefficient of dispersion in the KdV equation is
Substitute into (9.3),

(9.4)

Now show that this agrees with the form for KdV derived by direct perturbation expansion. In §2 of [10], Kivshar's perturbation method [17] is used to give a direct derivation of KdV from defocusing NLS. That derivation gives

(9.5)

where in the notation here is . However, eqn (2.8) in [10] gives
and in the notation here the right-hand side is *q*(*X*,*T*). Substituting into (9.5) gives

(9.6)

Noting that 1=3*k*^{2} the term multiplying *qq*_{X} is −9*k*. Now multiplying by shows that (9.6) is exactly (9.4).
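The arithmetic behind the condition 1=3*k*^{2} can be reproduced numerically. The sketch below is illustrative and rests on an assumption not displayed in the text above: the conservation-law components on the family of RE are taken as 𝒜(k) = |Ψ0|²/2 and ℬ(k) = k|Ψ0|², the form usually associated with the *S*^{1} symmetry of NLS, combined with the relation *k*^{2}+|*Ψ*_{0}|^{2}=1 from the text. The normalization in (9.2) may differ, and the function names are hypothetical.

```python
import math


def amplitude_sq(k):
    """|Psi_0|^2 on the family of RE, from k**2 + |Psi_0|**2 = 1."""
    return 1 - k * k


def A(k):
    """Assumed density component on the RE: |Psi_0|^2 / 2."""
    return amplitude_sq(k) / 2


def B(k):
    """Assumed flux component on the RE: k * |Psi_0|^2 = k - k**3."""
    return k * amplitude_sq(k)


def A_k(k):
    """dA/dk, differentiated by hand."""
    return -k


def B_k(k):
    """dB/dk, differentiated by hand."""
    return 1 - 3 * k * k


def B_kk(k):
    """Second derivative of B, differentiated by hand."""
    return -6 * k


if __name__ == "__main__":
    for k in (1 / math.sqrt(3), -1 / math.sqrt(3)):
        print(f"k = {k:+.6f}  B_k = {B_k(k):+.2e}  "
              f"A_k = {A_k(k):+.6f}  B_kk = {B_kk(k):+.6f}")
```

Under these assumptions the derivative of the flux vanishes at the two values *k*=±1/√3, consistent with 1=3*k*^{2}, and the second derivative of the flux is −6*k*, which is 2/3 of the value −9*k* quoted for the coefficient of *qq*_{X} in (9.6).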

## 10. Concluding remarks

Starting with a Lagrangian functional for a conservative PDE, there were essentially eight assumptions which led to the universal form for the emergence of KdV. The first three assumptions were on the Lagrangian: (i) it is in canonical form (2.1) with skew-symmetric matrices **M** and **J**; (ii) **J** is assumed to be invertible (2.2); (iii) symmetry: it is invariant under the action of a one-parameter Lie group. The invertibility of **J** is not essential but is used to ensure that the steady system (2.4) is a standard Hamiltonian system and the Jordan chain argument in §5 follows the standard theory. The symmetry requirement is surprising since it is never mentioned in conventional derivations for the emergence of KdV. However, the example from water waves shows that even in situations where it is not obvious that symmetry is present, it is in fact playing a key role in the emergence of KdV.

The fourth assumption is that the phase space dimension is four. Dimension four is the lowest dimension in which the phenomenon can occur. KdV can still emerge in higher dimension if there are no other zero or pure imaginary eigenvalues in the linearization. If there are additional non-hyperbolic eigenvalues in the linearization then a more general modulation equation will result: an example is when the KdV generalizes to the ‘longwave–shortwave resonance’ equations. With larger phase space dimension the fifth assumption, that the kernel be restricted by (3.6), becomes essential to get KdV. A generalization of great interest is the case of infinite dimensions, where *Z*(*x*,*t*) depends on an additional coordinate such as a vertical coordinate in a water-wave problem (cf. [16]).

The sixth assumption is the natural assumption that $\hat Z(\theta,k)$ exists and is smooth enough to differentiate. The seventh assumption is the modulation ansatz (1.6). It is justified by the fact that ‘it works’; that is, the governing equations are satisfied exactly up to fifth order by this form of solution. The eighth assumption is just that the emergent KdV equation is non-degenerate; that is, the derived coefficients are non-zero (7.3).

Only the simplest form of group action, and hence relative equilibria, are considered here. It is expected that the theory generalizes in a natural way to more exotic one-parameter groups, such as subgroups of non-abelian groups, where the full generality of relative equilibria comes into play (e.g. [18], ch. 4). An interesting direction, of importance in applications, is the generalization of the emergence of KdV to the case where the underlying symmetry is associated with a multi-parameter group and a system of conservation laws, rather than a one-parameter group associated with a single conservation law.

- Received December 4, 2012.
- Accepted February 1, 2013.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.