PHY6200 W07

Chapter 6: Calculus of Variations


In Taylor, read sections 6.1 to 6.2 for today, and 6.2 to 6.4 for Friday.


Historical Perspective

Newton's laws were published in 1687. For a century, they were tested and used in many applications, and, along with the development of calculus that accompanied Newton's laws, initiated a period of great advancement in mathematics and physics. This led to the development of alternative formulations of the laws of mechanics. In 1788, Joseph Louis Lagrange published a new formulation of the rules of mechanics, now called Lagrangian mechanics, and in 1833, William Rowan Hamilton published another formulation now called Hamiltonian mechanics. Both formulations approach the topic of mechanics from a rather different perspective than the one we've been using for Newtonian mechanics.

In this chapter we will be introduced to the mathematics needed for that different perspective, called the calculus of variations. The calculus of variations deals with finding functions that maximize or minimize an integral involving the function. (To be precise, we will find the function that makes the integral "stationary", meaning zero change, to first order, under infinitesimal variations of the function. Therefore the function can produce a maximum, minimum, or saddle point.) This probably sounds like a daunting challenge. The most difficult part is to understand the formulation of the problem. After that, you already have the necessary tools to solve it.

Two Examples

To help make the formulation of the problem clear, let's consider some examples. These examples are simple in nature, which should help us get a grasp of what we're doing.

The Shortest Path between Two Points

Everyone should know that the shortest path between two points lying in a plane is a straight line. But how does one go about proving that statement? How do we set up the problem mathematically?

Since shortest path refers to the shortest in length, we first need to write down how to determine the length of a path. A path can be represented as a smooth curve between the two points, call them 1 and 2. For an arbitrary path, the length is the infinite sum of infinitesimal line segments tangent to the path at every point. Let a line segment connect the two points on the path (x,y) and (x+dx, y+dy). Its length, to lowest order in dx and dy, is ds ≈ sqrt(dx² + dy²). We want to sum these lengths from point 1 to point 2, but need to massage this expression a little so this can be easily done. Since the path is a smooth curve, it can be represented by a function y(x) that has a first derivative everywhere. (Okay, there are mathematical subtleties here, like y(x) being double valued, or y'(x) being infinite at some points, but those are the details for a math course. For our purposes, let's limit ourselves to cases where this doesn't occur.) Then we can relate dy and dx by dy = (dy/dx)dx = y'(x)dx, and use this to relate ds to the one independent variable x:

ds = sqrt(dx² + y'(x)²dx²) = sqrt(1 + y'(x)²) dx

This quantity can now be integrated from point 1 to point 2 for the total length of the path:

L = ∫₁² ds = ∫₁² sqrt(1 + y'(x)²) dx.

Now the problem is to find the function y(x) that minimizes the above integral. This is the type of problem we want to solve.
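To make the length integral concrete, here is a minimal numerical sketch. It assumes endpoints (0, 0) and (1, 1) and compares the straight line y = x with a hypothetical "bumped" path y = x + 0.1 sin(πx) that shares the same endpoints. The function name path_length, the midpoint rule, and the test paths are illustrative choices, not part of the notes.

```python
import math

def path_length(y_prime, x1, x2, n=10000):
    """Midpoint-rule estimate of L = integral of sqrt(1 + y'(x)^2) dx from x1 to x2."""
    dx = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * dx          # midpoint of the i-th slice
        total += math.sqrt(1.0 + y_prime(x) ** 2) * dx
    return total

# Straight line y = x between (0, 0) and (1, 1): y' = 1 everywhere.
straight = path_length(lambda x: 1.0, 0.0, 1.0)

# Assumed comparison path y = x + 0.1 sin(pi x): same endpoints, different shape.
bumped = path_length(lambda x: 1.0 + 0.1 * math.pi * math.cos(math.pi * x), 0.0, 1.0)

print("straight:", straight)
print("bumped:  ", bumped)
```

The straight-line value should agree with sqrt(2), and the bumped path should come out longer. Of course, trying paths one at a time can never prove that the line is the shortest; that is what the variational machinery is for.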

Fermat's Principle

Another example of this type of problem is to find the path that a ray of light will follow between two points (with mirrors or materials of different index of refraction in the path, to make the problem interesting). The path will be the one that makes the travel time for the light a minimum, maximum, or saddle point, or simply the path that makes the travel time stationary. This is called Fermat's principle.

Again, imagine the path as composed of an infinite number of infinitesimal line segments. The time for the light to travel one segment is dt = ds/v = (n/c) ds, where v = c/n is the speed of light in a medium, n is the index of refraction in that medium, and c is the speed of light in vacuum. The total travel time is the integral over the path

T = ∫₁² dt = (1/c)∫₁² n ds

If n is a constant, then it also factors out of the integral, and the problem reduces to minimizing the length of the path between two points, as set up above. If n differs between regions, or varies with position, then the integral can be cast in the form

T = (1/c)∫₁² n(x,y) sqrt(1 + y'(x)²) dx.
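As a rough illustration of the travel-time integral, consider the simplest non-trivial case: two uniform media separated by a flat interface. The sketch below assumes the interface is the line x = 0, the ray starts at (-1, 0) in a medium of index n1 and ends at (1, 1) in a medium of index n2, and the path is two straight legs meeting at a crossing height h. All of these numbers, and the helper names travel_time and minimize, are assumptions made for the example. Minimizing T over h should reproduce Snell's law, n1 sin θ1 = n2 sin θ2, with the angles measured from the normal to the interface.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def travel_time(h, n1, n2):
    """Travel time for a two-leg path (-1, 0) -> (0, h) -> (1, 1)."""
    leg1 = math.hypot(1.0, h)          # length of the leg in medium 1
    leg2 = math.hypot(1.0, 1.0 - h)    # length of the leg in medium 2
    return (n1 * leg1 + n2 * leg2) / C

def minimize(func, lo, hi, iterations=200):
    """Ternary search for the minimum of a function with a single minimum on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

n1, n2 = 1.0, 1.5   # assumed indices, roughly air and glass
h_best = minimize(lambda h: travel_time(h, n1, n2), 0.0, 1.0)

# Sines of the angles between each leg and the interface normal (the x direction).
sin1 = h_best / math.hypot(1.0, h_best)
sin2 = (1.0 - h_best) / math.hypot(1.0, 1.0 - h_best)

print("crossing height h:", h_best)
print("n1 sin(theta1):   ", n1 * sin1)
print("n2 sin(theta2):   ", n2 * sin2)
```

The two printed products should agree to within the precision of the search. Here the family of trial paths has only one parameter, h; the general problem allows the whole function y(x) to vary.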

While it is obvious why a solution that gives a minimum would be a valid path, it isn't obvious why a maximum or saddle point solution would be the right path for a light ray. But this is the general result: a solution that makes the integral stationary (unchanged to first order by an infinitesimal variation) will be a path for the light ray. Whether one gets a minimum, maximum, or saddle point depends on the nature of the problem. I'll give a homework problem that illustrates this point.

The Euler-Lagrange Equation

Mathematically, the problem to be solved is to find a function that makes an integral of that function stationary, that is, unchanged to first order by infinitesimal variations of the function. This still sounds difficult. There's an infinity of arbitrary functions, and we must find the one that makes the integral stationary. To make this tractable, we need a way to parameterize arbitrary functions.

Problems of this type are called variational problems. The general statement is that we want to find the path y(x) that makes the integral

S = ∫₁² f[y(x), y'(x), x] dx

stationary, where the path has known endpoints y1 = y(x1) and y2 = y(x2), and the function f can be a function of x, y, and the first derivative dy/dx. (The shorthand y' stands for dy/dx.) A useful picture is a rubber band with one end fixed at the point (x1, y1) and the other at (x2, y2); these are the boundary conditions. Different paths correspond to different shapes of the rubber band, and the task is to find the shape that makes the integral stationary.

There is a very simple trick to parameterize all the possible paths. Let y(x) stand for the path that makes the integral stationary -- this is the function we want to find. An arbitrary path is obtained by adding another function, η(x), to it yielding

Y(x) = y(x) + αη(x)

The function η(x) is constrained to be 0 at the end points, η(x1) = η(x2) = 0, so that Y(x) satisfies the same boundary conditions as y(x), Y(x1) = y1 = y(x1), and Y(x2) = y2 = y(x2). Except for possible smoothness criteria, η(x) is otherwise arbitrary. The variable α is included to make the integral into a function of α, S = S(α), with the property that S is stationary when α = 0, that is, dS/dα = 0 at α = 0. Let's evaluate dS/dα.

S(α) = ∫₁² f[y + αη, y' + αη', x] dx
dS/dα = ∫₁² ∂f[y + αη, y' + αη', x]/∂α dx = ∫₁² [(∂f/∂y)η + (∂f/∂y')η'] dx = 0.

Now, we're almost there. To put this in usable form, we must get both terms proportional to η (or η'). We can use integration by parts to convert the second term

∫₁² (∂f/∂y')η' dx = [(∂f/∂y')η]₁² - ∫₁² η (d/dx)(∂f/∂y') dx = - ∫₁² η (d/dx)(∂f/∂y') dx

where the term in square brackets evaluates to zero because η=0 at the endpoints. This gives us the result we want

dS/dα = ∫₁² [(∂f/∂y) - (d/dx)(∂f/∂y')]η dx = 0.

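This result must hold for any allowed η. For instance, take η equal to the term in square brackets (multiplied, if necessary, by a positive factor that vanishes at the endpoints); the integrand is then non-negative everywhere, so the integral can vanish only if the term in square brackets is zero for all x.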

(∂f/∂y) - (d/dx)(∂f/∂y') = 0

This is known as the Euler-Lagrange equation. Once the form of the function f(y,y',x) is known, it yields a differential equation that, along with the boundary conditions y1 = y(x1) and y2 = y(x2), can be solved for y(x).
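The role of the parameter α can be checked numerically. The sketch below uses the path-length integrand f = sqrt(1 + y'²) on the interval from x = 0 to x = 1, with the assumed variation η(x) = sin(πx), which vanishes at both endpoints. It estimates dS/dα at α = 0 by a central difference for two base paths between (0, 0) and (1, 1): the straight line y = x and, for contrast, the parabola y = x². The function names and the choice of η are illustrative assumptions.

```python
import math

def action(y_prime, eta_prime, alpha, x1=0.0, x2=1.0, n=2000):
    """Midpoint-rule estimate of S(alpha) = integral of sqrt(1 + (y' + alpha*eta')^2) dx."""
    dx = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * dx
        slope = y_prime(x) + alpha * eta_prime(x)   # Y'(x) for the varied path
        total += math.sqrt(1.0 + slope * slope) * dx
    return total

def dS_dalpha_at_zero(y_prime, eta_prime, h=1e-4):
    """Central-difference estimate of dS/dalpha at alpha = 0."""
    return (action(y_prime, eta_prime, h) - action(y_prime, eta_prime, -h)) / (2.0 * h)

def eta_prime(x):
    """Derivative of the assumed variation eta(x) = sin(pi x)."""
    return math.pi * math.cos(math.pi * x)

slope_line = dS_dalpha_at_zero(lambda x: 1.0, eta_prime)          # base path y = x
slope_parabola = dS_dalpha_at_zero(lambda x: 2.0 * x, eta_prime)  # base path y = x^2

print("dS/dalpha at 0, straight line:", slope_line)
print("dS/dalpha at 0, parabola:     ", slope_parabola)
```

For the straight line the derivative should vanish to within rounding error, while for the parabola it should be clearly nonzero: the parabola is not a stationary path, so some small deformation of it changes the length at first order.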

Applications of the Euler-Lagrange Equation

Let's return to the two earlier examples and apply the Euler-Lagrange equation to solve them.

The Shortest Path between Two Points

On the x-y plane, find the shortest path between two points.

We determined that the length of a path from 1 to 2 is given by the integral

L = ∫₁² ds = ∫₁² sqrt(1 + y'(x)²) dx.

By inspection we see that f = sqrt(1 + y'(x)²). Applying the Euler-Lagrange equation to f we have

(∂f/∂y) - (d/dx)(∂f/∂y') = 0 - (d/dx)[y'/sqrt(1 + y'²)] = 0

Therefore y'/sqrt(1 + y'²) = C = constant or y'(x) = C /sqrt(1 - C²) = constant = m. Finally, integrating y'(x), we find y(x) = mx + b, the equation for a straight line. The endpoints determine m and b.
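The algebra between the two forms of the constant is short, but worth writing out once. The steps below use only the relation already stated, and note that |C| < 1 because |y'| < sqrt(1 + y'²):

```latex
\begin{align*}
\frac{y'}{\sqrt{1 + y'^2}} = C
  &\;\Longrightarrow\; y'^2 = C^2 \left(1 + y'^2\right) \\
  &\;\Longrightarrow\; y'^2 \left(1 - C^2\right) = C^2 \\
  &\;\Longrightarrow\; y' = \frac{C}{\sqrt{1 - C^2}} \equiv m .
\end{align*}
```

The sign of the square root is fixed by the first line, which shows that y' and C have the same sign.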

In principle, y(x) = mx + b is only the path that makes the length stationary. In this case it is obvious that this is the minimum of the integral, but how do we show that in the general situation?

A Note on Variables


Given two points 1 and 2, with 1 higher above the ground, in what shape should we build a frictionless track so that a mass released from point 1 will reach point 2 in the shortest possible time?

Arrange the coordinate system so that point 1 is at the origin, x is horizontal, and y is vertically downward. This time we will take y as the independent variable. The integral to minimize is

T = ∫₁² (ds/v)

where x' stands for dx/dy, ds = sqrt(1 + x'²)dy, and conservation of energy yields v = sqrt(2gy) after a descent through the vertical distance y. Substitution leads to

T = (1/sqrt(2g)) ∫₁² sqrt((x'² + 1)/y) dy
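Before solving anything, it can help to evaluate this integral for a couple of trial tracks. The sketch below assumes point 2 is at (x, y) = (1 m, 1 m) with y measured downward and g = 9.81 m/s², and compares a straight ramp with a hypothetical curved track x = y², which starts vertically and flattens out. The curved track is only a comparison shape, not the solution of the problem. Because the integrand behaves like 1/sqrt(y) near the release point, the code substitutes y = u², which turns the integral into one with a smooth integrand.

```python
import math

G = 9.81  # m/s^2, assumed value of g

def descent_time(x_prime, y_end, n=20000):
    """T = (1/sqrt(2g)) * integral from 0 to y_end of sqrt((x'^2 + 1)/y) dy.

    With y = u^2 we have dy = 2u du and 1/sqrt(y) = 1/u, so the integrand
    becomes 2*sqrt(x'(y)^2 + 1), which is finite at u = 0.
    """
    u_end = math.sqrt(y_end)
    du = u_end / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du           # midpoint of the i-th slice in u
        y = u * u
        total += 2.0 * math.sqrt(x_prime(y) ** 2 + 1.0) * du
    return total / math.sqrt(2.0 * G)

X_END, Y_END = 1.0, 1.0

# Straight ramp x = (X_END/Y_END) y, so x' is constant.
t_line = descent_time(lambda y: X_END / Y_END, Y_END)

# Assumed comparison track x = X_END (y/Y_END)^2, so x' = 2 X_END y / Y_END^2.
t_curve = descent_time(lambda y: 2.0 * X_END * y / Y_END ** 2, Y_END)

print("straight ramp:", t_line, "s")
print("curved track: ", t_curve, "s")
```

The straight-ramp time can be checked against the elementary result for constant acceleration down an incline of length L, T = L sqrt(2/(g y2)). The curved track should come out faster even though it is longer, which shows that the shortest path is not the quickest one.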

The function inserted into the Euler-Lagrange equation is

f(x, x', y) = sqrt[(x'² + 1)/y]

With the roles of x and y reversed, the Euler-Lagrange equation reads

(∂f/∂x) - (d/dy)(∂f/∂x') = 0

Maximum and Minimum versus Stationary

More than Two Variables

Euler Angles

Motion of a Spinning Top

© 2007 Robert Harr