Thursday, June 13

Calculus of Variations, 2/2


In the first post of this series, the Brachistochrone problem was introduced and we looked at how the performance of a particular curve could be evaluated. Recall that the total transit time may be expressed in this way:

 [;T = \int^{x_B}_{x_A}\frac{\sqrt{1+f'(x)^2}}{v}\,dx;]

Also, because the bead is traveling through a constant gravitational field, the velocity may be expressed as a function of the change in height by conservation of energy.

[; v = \sqrt{2g(y_A-y)};]

[; v = \sqrt{2g(f(x_A)-f(x))};]

Therefore, a functional that evaluates the transit time for some curve f(x) in the brachistochrone problem is given by:

[; T(f(x),f'(x),x) = \int^{x_B}_{x_A} \frac{\sqrt{1+f'(x)^2}}{\sqrt{2g(f(x_A)-f(x))}}\,dx ;]

which may be generalized with a function L:

[; T(f(x),f'(x),x) = \int^{x_B}_{x_A} L(f(x),f'(x),x)\,dx ;]

where

[; L(f(x),f'(x),x) = \frac{\sqrt{1+f'(x)^2}}{\sqrt{2g(f(x_A)-f(x))}} ;]
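
As a quick sanity check, we can evaluate this functional numerically for a straight-line candidate curve. This is a minimal sketch: the endpoints A = (0, 1) and B = (1, 0) and the use of scipy are my own assumptions for illustration.

    import numpy as np
    from scipy.integrate import quad

    g = 9.81            # gravitational acceleration (m/s^2)
    xA, yA = 0.0, 1.0   # assumed start point A (hypothetical)
    xB, yB = 1.0, 0.0   # assumed end point B (hypothetical)

    def transit_time(f, fprime):
        """Evaluate T for a candidate curve f with derivative fprime."""
        def integrand(x):
            return np.sqrt(1.0 + fprime(x)**2) / np.sqrt(2.0 * g * (yA - f(x)))
        # The integrand has an integrable 1/sqrt singularity at x = xA
        # (the bead starts at rest); quad's adaptive scheme copes with it here.
        return quad(integrand, xA, xB)[0]

    # Straight line from A to B: f(x) = 1 - x
    print(transit_time(lambda x: 1.0 - x, lambda x: -1.0))
    # ~0.639 s, matching the closed form sqrt(2*(d^2 + h^2)/(g*h)) for a ramp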

Now that we have an expression for the total transit time of the bead for any continuous differentiable function between points A and B, imagine how changing that function might change the transit time of the bead. For the optimal curve, any perturbation in the curve will increase the transit time. If we can come up with a relationship between perturbations of a function and changes in the functional T, we can look at how infinitesimal perturbations of the function change the transit time. The motivation here is to use that relationship in order to find a function for which infinitesimal perturbations produce no change in the transit time - at which point the function can be said to minimize T.

Perturbing Perturbations

This part was difficult for me to comprehend originally. I didn't understand how, out of all of the possible ways in which a continuous function could be perturbed, a relationship could be established between the change in the functional and the infinite set of all possible infinitesimal perturbations. The trick is to formally define the perturbations and then dissect them by applying constraints.

[Figure: scaled perturbations of a curve]
We can express some perturbed curve g(x) as the sum of the original function f(x) and some test function n(x) scaled by a constant e.

  [; g(x) = f(x) + en(x) ;]

Note that the perturbed function g(x) is subject to the same constraints as the candidate function f(x). This means that the test function n(x) must be continuous and differentiable between A and B, and the value of the test function must be zero at the boundaries in order for the perturbed curve to satisfy the boundary conditions.

[; g(x_A) = y_A ;]  and  [; g(x_B) = y_B ;]

[; g(x_A)=f(x_A)+en(x_A)=y_A+en(x_A) ;]

[; y_A+en(x_A)=y_A ;]

[; en(x_A) = 0 ;] and [; en(x_B) = 0 ;]

Since these must hold for any choice of the scaling constant e, it follows that [; n(x_A) = 0 ;] and [; n(x_B) = 0 ;].

In this way, any valid perturbation of f(x) is covered by g(x). We can then express the functional with respect to g(x).

[; T(g(x),g'(x),x) = \int^{x_B}_{x_A} L(g(x),g'(x),x)\,dx ;]

Noting that: [; g'(x) = f'(x) + en'(x) ;]
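
To make this concrete, here is a small sketch (continuing the hypothetical setup from the earlier snippet) that evaluates the transit time of the perturbed curve for several values of e. The test function n(x) = sin(pi x) is my own choice; it is valid because it is smooth and vanishes at both endpoints.

    import numpy as np
    from scipy.integrate import quad

    g, xA, yA, xB = 9.81, 0.0, 1.0, 1.0   # same hypothetical setup as before

    f      = lambda x: 1.0 - x             # straight-line candidate curve
    fprime = lambda x: -1.0
    n      = lambda x: np.sin(np.pi * x)   # test function, n(xA) = n(xB) = 0
    nprime = lambda x: np.pi * np.cos(np.pi * x)

    def T(e):
        """Transit time of the perturbed curve g(x) = f(x) + e*n(x)."""
        integrand = lambda x: (np.sqrt(1.0 + (fprime(x) + e * nprime(x))**2)
                               / np.sqrt(2.0 * g * (yA - f(x) - e * n(x))))
        return quad(integrand, xA, xB)[0]

    for e in (-0.2, -0.1, 0.0, 0.1, 0.2):
        print(e, T(e))   # the transit time varies smoothly with e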

Next, we can differentiate the transit time with respect to the scaling factor e.

[; \frac{d}{de}T(g(x),g'(x),x) = \frac{d}{de} \int^{x_B}_{x_A} L(g(x),g'(x),x)\,dx ;]

Because the limits of integration do not depend on e, the derivative may be taken inside the integral.

[; \frac{d}{de} \int^{x_B}_{x_A} L(g(x),g'(x),x)\,dx=\int^{x_B}_{x_A}\frac{d}{de} L(g(x),g'(x),x)\,dx ;]

We can express the derivative of L with respect to e using the chain rule (writing [; L_g ;] as shorthand for L evaluated along the perturbed curve g):

[; \frac{dL(g,g',x)}{de} = \frac{dL_g}{de} = \frac{dx}{de}\frac{\partial L_g}{\partial x} + \frac{dg}{de}\frac{\partial L_g}{\partial g} + \frac{dg'}{de}\frac{\partial L_g}{\partial g'} ;]

where, from the definitions of g(x) and g'(x):

[; \frac{dx}{de} = 0 ;],  [; \frac{dg(x)}{de} = n(x) ;], and [; \frac{dg'(x)}{de} = n'(x) ;]

Substituting those values in, we have:

[; \frac{d}{de}T(g(x),g'(x),x) = \int^{x_B}_{x_A}n(x)\frac{\partial L_g}{\partial g}+ n'(x)\frac{\partial L_g}{\partial g'}\,dx ;]

Setting this derivative to zero will give us an expression that reveals the function f(x) which minimizes our functional T.

The next step is a little bit tricky. Although we have already differentiated T by e, the result still depends on e in the integral through the terms [; \frac{\partial L_g}{\partial g};] and [; \frac{\partial L_g}{\partial g'};]. However, notice what happens when e approaches 0:

[; g(x) |_{e=0} = f(x) ;]  and  [; \frac{\partial L_g}{\partial g} |_{e=0} = \frac{\partial L}{\partial f} ;]

We can evaluate this derivative at e = 0 even though we have already differentiated with respect to e.

[; \frac{d}{de}T(g(x),g'(x),x) |_{e=0}= \int^{x_B}_{x_A}n(x)\frac{\partial L}{\partial f}+ n'(x)\frac{\partial L}{\partial f'}\,dx ;]

Integrate the second term by parts to reveal:

[; 0 = \int^{x_B}_{x_A} [\frac{\partial L}{\partial f}-\frac{d}{dx} \frac{\partial L}{\partial f'}] n(x)\,dx +[n(x)\frac{\partial L}{\partial f'}]^{x_B}_{x_A} ;]

Recognize that the boundary term on the right vanishes, because n(x) is zero at both endpoints by our definition.

[; \int^{x_B}_{x_A} [\frac{\partial L}{\partial f}-\frac{d}{dx} \frac{\partial L}{\partial f'}] n(x)\,dx=0 ;]
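
Before invoking the lemma below, the variation formula can be sanity-checked numerically by comparing the integral form (the one before integrating by parts) against a brute-force finite difference of T. This sketch reuses the hypothetical setup from the earlier snippets, with the partial derivatives of L written out analytically for the brachistochrone integrand.

    import numpy as np
    from scipy.integrate import quad

    g, xA, yA, xB = 9.81, 0.0, 1.0, 1.0

    f      = lambda x: 1.0 - x             # candidate curve (not optimal)
    fprime = lambda x: -1.0
    n      = lambda x: np.sin(np.pi * x)   # test function
    nprime = lambda x: np.pi * np.cos(np.pi * x)

    # Partials of L = sqrt(1 + p^2)/sqrt(2g(yA - q)), evaluated at q = f, p = f'
    dL_df  = lambda x: g * np.sqrt(1 + fprime(x)**2) / (2*g*(yA - f(x)))**1.5
    dL_dfp = lambda x: fprime(x) / (np.sqrt(1 + fprime(x)**2)
                                    * np.sqrt(2*g*(yA - f(x))))

    variation = quad(lambda x: n(x)*dL_df(x) + nprime(x)*dL_dfp(x), xA, xB)[0]

    def T(e):   # transit time of the perturbed curve, as in the earlier sketch
        integrand = lambda x: (np.sqrt(1.0 + (fprime(x) + e*nprime(x))**2)
                               / np.sqrt(2.0*g*(yA - f(x) - e*n(x))))
        return quad(integrand, xA, xB)[0]

    de = 1e-4
    print(variation, (T(de) - T(-de)) / (2*de))   # the two should agree closely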

Fundamental Lemma of the Calculus of Variations

This final step takes a little bit of discussion. It deals with the Fundamental Lemma of the Calculus of Variations.

Basically, through all of this toil, we have come up with a weak statement about the minimizing function f(x): the integral of the product of these two functions with respect to x is equal to zero. This weak statement does help us narrow f down to some family of functions that satisfy it, but we want to make a stronger statement about f(x) and narrow the possible functions down even further. One important condition that allows for the application of this lemma in our present case is that
[;n(x_A) = 0;] and [;n(x_B) = 0;]

Also, [;n(x);] is continuous and differentiable on [; [x_A,x_B] ;].


For ease of discussion, label
 [; J(x) = [\frac{\partial L}{\partial f}-\frac{d}{dx} \frac{\partial L}{\partial f'}] ;].

The fundamental lemma states that, given all of these characteristics of n(x) and a continuous J(x), knowing that

[; \int^{x_B}_{x_A} J(x)n(x)\,dx=0 ;]

for every possible test function n(x), we can conclude that J(x) = 0.

The keyword here is every. Obviously, when n(x)=0 for all values of x, the integral evaluates to zero. Even for some simple polynomial n(x) that satisfies the boundary conditions, you could dream up a function J(x) that balances out the integral (consider a sinusoidal J that oscillates equally between positive and negative values). However, because the integral must vanish for any and all n(x), those "equal distribution" solutions fall apart. Essentially n(x) is weighting the values of J(x) that are being integrated, and if J were non-zero anywhere, you could choose an n(x) that is positive wherever J is positive and negative wherever J is negative, making the integral strictly positive. There isn't a single non-zero J(x) that can satisfy all possible n(x) weightings. Therefore J(x) must be zero for all values on [;[x_A, x_B];].
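
Here is a quick numerical illustration of that point (my own example, on the interval [0, 1]): the oscillating candidate J(x) = sin(2 pi x) happens to balance out against the test function n(x) = sin(pi x), but it fails against n(x) = sin(2 pi x), so it cannot pass the test for every n.

    import numpy as np
    from scipy.integrate import quad

    J  = lambda x: np.sin(2 * np.pi * x)   # a non-zero candidate for J(x)
    n1 = lambda x: np.sin(np.pi * x)       # valid test function: n1(0) = n1(1) = 0
    n2 = lambda x: np.sin(2 * np.pi * x)   # another valid test function

    print(quad(lambda x: J(x) * n1(x), 0, 1)[0])  # ~0: J balances this weighting
    print(quad(lambda x: J(x) * n2(x), 0, 1)[0])  # 0.5: but not this one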

The Euler-Lagrange Equation

As a direct consequence of the lemma, all of this effort yields what is called the Euler-Lagrange equation. It is the most important equation in this series of posts, and we may now adapt it to the original Brachistochrone problem.

[; \frac{\partial L}{\partial f}-\frac{d}{dx} \frac{\partial L}{\partial f'}=0 ;]
Euler-Lagrange Equation
Recalling that

[; L(f(x),f'(x),x) = \frac{\sqrt{1+f'(x)^2}}{\sqrt{2g(f(x_A)-f(x))}} ;]
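
Before simplifying anything by hand, note that sympy can form this Euler-Lagrange equation mechanically. This is just a sketch: euler_equations is sympy's utility for exactly this computation, and treating the constant f(x_A) as a bare symbol y_A is my own shortcut.

    import sympy as sp
    from sympy.calculus.euler import euler_equations

    x = sp.Symbol('x')
    g, yA = sp.symbols('g y_A', positive=True)
    f = sp.Function('f')

    # The brachistochrone integrand, with the constant f(x_A) written as y_A
    L = sp.sqrt(1 + f(x).diff(x)**2) / sp.sqrt(2 * g * (yA - f(x)))

    # Prints Eq(dL/df - d/dx(dL/df'), 0) for this L -- messy, as promised
    print(euler_equations(L, f(x), x)[0])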

We can plug this L from the Brachistochrone problem into the Euler-Lagrange equation, as the sketch above does. The result turns out to be quite messy - fortunately, there is a way of simplifying the Euler-Lagrange equation in this case (known as the Beltrami identity) by observing that L does not depend explicitly on x (this method was taken from these notes, starting after Eq 19 therein). The method relies on the definition of the total derivative.

[; \frac{dL(f(x),f'(x))}{dx}= \frac{\partial L}{\partial f}f'+\frac{\partial L}{\partial f'}f'' ;]

Multiply the Euler-Lagrange equation by f' to simplify:


[; 0=[\frac{\partial L}{\partial f}-\frac{d}{dx} \frac{\partial L}{\partial f'}]f' ;]

[; =[\frac{\partial L}{\partial f}f'-f'\frac{d}{dx}\frac{\partial L}{\partial f'}] ;]

[; =[\frac{dL}{dx}-\frac{\partial L}{\partial f'}\frac{d^2f}{dx^2}-f'\frac{d}{dx}\frac{\partial L}{\partial f'}] ;]

[; 0=\frac{d}{dx}[L-f'\frac{\partial L}{\partial f'}] ;]
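
This chain of manipulations is easy to fumble, so here is a symbolic check that d/dx[L - f' dL/df'] really equals f' times the Euler-Lagrange expression for the brachistochrone L. It is a sketch using sympy, with stand-in symbols q and p for f and f' so that the partial derivatives are unambiguous.

    import sympy as sp

    x = sp.Symbol('x')
    g, yA = sp.symbols('g y_A', positive=True)
    q, p = sp.symbols('q p')               # stand-ins for f(x) and f'(x)
    f = sp.Function('f')

    L_qp = sp.sqrt(1 + p**2) / sp.sqrt(2 * g * (yA - q))
    subs = {q: f(x), p: f(x).diff(x)}

    L    = L_qp.subs(subs)
    L_f  = sp.diff(L_qp, q).subs(subs)     # partial dL/df
    L_fp = sp.diff(L_qp, p).subs(subs)     # partial dL/df'

    euler_lagrange = L_f - sp.diff(L_fp, x)
    beltrami = L - f(x).diff(x) * L_fp

    # Prints 0, confirming d/dx[L - f'*dL/df'] = f' * (Euler-Lagrange expression)
    print(sp.simplify(sp.diff(beltrami, x) - f(x).diff(x) * euler_lagrange))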

We can reason from this identity that the quantity being differentiated with respect to x is constant:

[; [L-f'\frac{\partial L}{\partial f'}]=C ;]
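
Carrying this out for the brachistochrone L spells out the step. First,

[; \frac{\partial L}{\partial f'} = \frac{f'}{\sqrt{1+f'^2}\sqrt{2g(f(x_A)-f(x))}} ;]

so that

[; L-f'\frac{\partial L}{\partial f'} = \frac{(1+f'^2)-f'^2}{\sqrt{1+f'^2}\sqrt{2g(f(x_A)-f(x))}} = \frac{1}{\sqrt{1+f'^2}\sqrt{2g(f(x_A)-f(x))}} = C ;]

Squaring both sides and rearranging collapses this to

[; (f(x_A)-f(x))(1+f'^2) = \frac{1}{2gC^2} ;]

where the combined constant on the right plays the role of the A in the parametric solution below.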

We now have a differential equation which may be used to solve for the minimum-time curve in the Brachistochrone problem. The notes go further and determine the minimum-time curve explicitly as:

[;x(\theta)=\frac{A}{2}(\theta-\sin(\theta));]

[;y(\theta)=\frac{A}{2}(1-\cos(\theta));]

where A may be solved for by constraining the parameterized curve to pass through one of the fixed points. Note that the solution in those notes differs by an offset term in the denominator of L(f(x), f'(x)); this is because they assume zero potential energy at the starting point.
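
As a closing check, we can fit this cycloid through a pair of endpoints and compare its transit time against the straight line. This is a sketch: the endpoint values below (1 m across and 1 m of drop) are my own choice, and T = theta_B * sqrt(A/(2g)) is the standard closed form for the time along a cycloid traversed from rest.

    import numpy as np
    from scipy.optimize import brentq

    g = 9.81
    xB, yB = 1.0, 1.0   # hypothetical endpoint B: 1 m across, 1 m of drop

    # Choose theta_B so the cycloid passes through B:
    #   x/y = (theta - sin(theta)) / (1 - cos(theta))
    theta_B = brentq(lambda t: (t - np.sin(t)) / (1 - np.cos(t)) - xB / yB,
                     1e-6, 2 * np.pi - 0.1)
    half_A = yB / (1 - np.cos(theta_B))   # this is A/2 in the notation above

    T_cycloid = theta_B * np.sqrt(half_A / g)
    T_line = np.sqrt(2 * (xB**2 + yB**2) / (g * yB))   # straight-line ramp

    print(T_cycloid, T_line)   # ~0.583 s vs ~0.639 s: the cycloid wins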

The most important takeaway from this post is the Euler-Lagrange equation. It has applications in a large variety of optimization problems. The Euler-Lagrange equation has a more general form covering multiple variables and higher-order derivatives. For problems involving higher-dimensional state spaces, it generalizes to a system of differential equations. I will cover this in a future post on the topic of generating equations of motion for multi-arm pendula.
