Thursday, May 16

Calculus of Variations, 1/2

I've decided to post on this topic for two reasons. First, I want to solidify my own comprehension. I was introduced to variational calculus as a supplemental portion of a Finite Element Analysis course that I took during the third year of my BME. When my FEA professor lectured on it, I wasn't particularly comfortable with the method; I got the gist of what was being done, but I wasn't able to reconstruct it on my own when I sat down with a piece of paper. Second, since that lecture, I have recognized it in a handful of applications in optimization, mechanics, control theory, and FEA. Those topics are pretty damned important for someone interested in robotics. For those reasons, although plenty of people have produced better summaries and course materials on the topic, I feel compelled to make a modest attempt for my own sense of completion.

To start, it is sometimes useful to learn about the history of a subject in order to understand the context in which it was developed. As far as mathematical methods go, there is usually some famous problem for which a method was developed. In the case of variational calculus, that problem was finding the Brachistochrone Curve. The problem was posed by Johann Bernoulli in 1696 as a challenge to his friends and fellow mathematicians. He asked his colleagues to determine the curve along which a bead, moving through a uniform gravitational field, would travel between two points in the shortest amount of time. As one might expect, the name is derived from the Greek brachistos meaning "shortest" and chronos meaning "time".

The Brachistochrone Problem



Leonhard Euler later developed the general approach of variational calculus while studying the Brachistochrone problem and others like it. One might wonder why it isn't called Euler's Calculus of Variations. For one thing, it seems quite a bit was done to develop the topic beyond Euler's original contribution. Furthermore, there are quite enough things named after Euler as it is. In what follows, I will attempt to summarize the method insofar as I comprehend it and make an effort to emphasize and clarify the points that caused me confusion as I was learning it - straight from the dog's mouth, as no one says.

To begin, imagine a curve between the two points; the primary constraints are that the curve will have two fixed boundary values (the coordinates of the end points). It can also be assumed that the solution will be a function of the horizontal distance, because it wouldn't be minimal for the bead's path to double back on itself along the x-axis. As for the behavior of the bead, it is easiest to make use of energy equations to derive the equations of motion. If the wire is frictionless, the only force it exerts on the bead is perpendicular to the motion and so does no work. Mechanical energy is therefore conserved: the sum of the potential energy and the kinetic energy is the same at any given time.
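
Written out, the energy balance looks like the following. This is my own reconstruction under two assumptions that the text above does not state explicitly: the bead is released from rest at point A, and y is the height measured upward, with y_A the height of A. The symbols m (mass), g (gravitational acceleration), and v (speed) are introduced here for illustration.

```latex
\frac{1}{2} m v^2 + m g y = m g y_A
\quad\Longrightarrow\quad
v = \sqrt{2 g \,(y_A - y)}
```

The mass cancels, so the speed at any point depends only on how far the bead has dropped below A.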


Conceptually, it would be advantageous for the bead to lose height as early as possible because higher velocities early on will have a larger 'return on investment' as far as the total time of transit is concerned. At the same time, it is advantageous to minimize the length of the bead's path. The minimum path length is achieved by a straight line between the two points while maximizing the initial acceleration will require driving the curve away from a straight line. Without any calculations, it can already be seen that the minimum time solution will be a trade-off between the initial slope and length of the path; it's also possible to sort of visualize what that might look like. However, imagining the minimum time curve for one instance of the Brachistochrone Problem is very different from analytically determining a general solution for any instance.

The first step towards an analytical solution is to determine an expression that will produce the transit time of the bead along any curve. There are a few ways of developing that expression. One method that I consider to be most natural is to represent the path as a vector function of time, s(t) = (x(t), y(t)), whose components are the bead's horizontal and vertical positions.



The time derivative of this function will be a vector of the time derivatives of the components. This relationship is summarized by the right triangle in the figure below. Note that the magnitude of ds can be determined by the Pythagorean theorem.

Components of ds

Also, ds/dt represents the instantaneous velocity of the bead.


Finally, manipulating the magnitude equation, we can express the total time in terms of x and the path represented as a function f(x).
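
The chain of steps can be sketched as follows. This is a reconstruction from the description rather than a copy of the original equations, and it assumes the bead is released from rest at A (height y_A), so that energy conservation gives its speed in terms of the height it has lost. The limits x_A and x_B are the horizontal coordinates of the two end points.

```latex
\begin{aligned}
\vec{s}(t) &= \bigl(x(t),\, y(t)\bigr), \qquad y = f(x) \\
ds &= \sqrt{dx^2 + dy^2} = \sqrt{1 + f'(x)^2}\; dx \\
v &= \frac{ds}{dt} = \sqrt{2 g \,\bigl(y_A - f(x)\bigr)} \\
T[f] &= \int_A^B \frac{ds}{v}
      = \int_{x_A}^{x_B} \sqrt{\frac{1 + f'(x)^2}{2 g \,\bigl(y_A - f(x)\bigr)}}\; dx
\end{aligned}
```

The numerator measures how much path is covered per unit of x, and the denominator measures how fast the bead is moving there.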



We can call T a functional in that it maps a vector/function to a scalar value. In the case of the brachistochrone, this functional essentially quantifies the performance of the wire curve; it represents the time of the bead's transit from point A to point B. In a sense, as my FEA professor described it, a functional can be thought of as a "function of a function". This was a confusing description for me to grasp originally because, if someone were to say "the function of a function" to me, I would immediately imagine something like f(g(x)). The difference here is that a functional in this case provides one value for the entire range of values in the given boundary. While f(g(x)) varies with x (and may have a unique value for every x), a functional such as T above provides one value for the totality of g, bounded between points A and B - all values of x in that range are incorporated into the score. In that way a functional is a function of a function; the difference is in what one means by 'function': the value returned by the function or the entire span of values that are returned by the function within some boundary - the output versus the essence.
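
To make the "one score per curve" idea concrete, here is a rough numerical sketch. It is my own illustration, not part of the derivation: it assumes the bead starts from rest at A, that there is no friction (so the speed follows from the height lost), and it approximates each curve by many short straight segments. The names transit_time, line, and quarter_circle, and the end points A = (0, 1) and B = (1, 0), are arbitrary choices for the example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def transit_time(f, x_a, x_b, n=2000):
    """Approximate the transit time T[f] of a bead released from rest at x_a.

    The curve y = f(x) is replaced by n straight segments. On a straight
    segment the acceleration is constant, so the average speed is the mean
    of the speeds at the two ends.
    """
    y_a = f(x_a)
    total = 0.0
    x0, y0 = x_a, y_a
    for i in range(1, n + 1):
        x1 = x_a + (x_b - x_a) * i / n
        y1 = f(x1)
        ds = math.hypot(x1 - x0, y1 - y0)            # segment length
        v0 = math.sqrt(2 * G * max(0.0, y_a - y0))   # speed from height lost
        v1 = math.sqrt(2 * G * max(0.0, y_a - y1))
        total += 2 * ds / (v0 + v1)                  # length / average speed
        x0, y0 = x1, y1
    return total


def line(x):
    """Straight wire from A = (0, 1) to B = (1, 0)."""
    return 1.0 - x


def quarter_circle(x):
    """Quarter circle of radius 1 centered at (1, 1): steep at A, flat at B."""
    return 1.0 - math.sqrt(2.0 * x - x * x)


t_line = transit_time(line, 0.0, 1.0)
t_arc = transit_time(quarter_circle, 0.0, 1.0)
print(f"straight line : {t_line:.4f} s")
print(f"quarter circle: {t_arc:.4f} s")
```

Each call takes a whole curve and hands back a single number, which is what makes T a functional. With these end points the quarter circle comes out faster than the straight line even though it is the longer path.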

In the next post, we'll imagine some slight modification is made to the curve. Because this functional essentially 'scores' the curve, we can test how changing the curve impacts the 'score'. Recalling that the task in the brachistochrone problem is to minimize this score, it would be prudent to consider how small changes in the curve impact T. Just like an extremum of a single-variable function, where the derivative vanishes, an optimal path (one that minimizes the functional) will produce no first-order change in the functional for infinitesimal changes in the path.
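
Here is a quick numerical check of that claim, as a sketch under my own assumptions. To keep the arithmetic simple it uses path length rather than T as the functional, since its minimizer is already known to be the straight line between the two points. The perturbation bump(x) = sin(pi x), the sagging comparison curve, and all of the function names are arbitrary choices for illustration.

```python
import math


def arc_length(f, n=2000):
    """Polyline approximation of the length of y = f(x) for x in [0, 1]."""
    total = 0.0
    x0, y0 = 0.0, f(0.0)
    for i in range(1, n + 1):
        x1 = i / n
        y1 = f(x1)
        total += math.hypot(x1 - x0, y1 - y0)
        x0, y0 = x1, y1
    return total


def bump(x):
    """Perturbation shape; it vanishes at both ends so the end points stay fixed."""
    return math.sin(math.pi * x)


def first_order_change(f, eps=1e-4):
    """Central-difference estimate of d/d(eps) arc_length(f + eps*bump) at eps = 0."""
    plus = arc_length(lambda x: f(x) + eps * bump(x))
    minus = arc_length(lambda x: f(x) - eps * bump(x))
    return (plus - minus) / (2 * eps)


def line(x):
    """Straight line from (0, 1) to (1, 0): the shortest path."""
    return 1.0 - x


def sag(x):
    """A sagging curve between the same end points: not the shortest path."""
    return (1.0 - x) ** 2


d_line = first_order_change(line)
d_sag = first_order_change(sag)
print(f"straight line: {d_line:.2e}")
print(f"sagging curve: {d_sag:.2e}")
```

For the straight line the estimate is zero to within rounding error, while for the sagging curve it is clearly nonzero: nudging that curve changes its length at first order, which is the sign that it is not optimal.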

*click here to view the second post on this topic
