16 January 2009...6:08 pm

Mathematical Preparations For Lagrangians

Mathematics is the key and door to the sciences

I would like to discuss classical mechanics quite a bit; this post covers the math necessary for Lagrangian mechanics. I am working from a derivation that John Baez made fairly rigorous, but my target audience is not mathematically savvy enough to just read his beautiful lecture notes (available here), so I will need to discuss the math first, then the actual physics.

Differentials and Infinitesimals

Suppose we are working with some function f(x), and suppose we can Taylor expand it about x_{0}:

f(x) = f(x_{0}) + f'(x_{0})(x-x_{0}) + \mathcal{O}([x-x_{0}]^2).

If we introduce some new “number”, let's call it \varepsilon, that is a “number” in the sense that \sqrt{-1}=i is a “number”, and that satisfies

\varepsilon^2 = 0

while we insist that \varepsilon\neq 0, we can do the following. We can find, by Taylor expanding, f(x+\varepsilon) to be

f(x+\varepsilon) = f(x) + \varepsilon f'(x) + \mathcal{O}(\varepsilon^2).

We keep only the first order, since anything of order \varepsilon^2 is necessarily 0. Why? Well, that’s how we defined it! So, we are then left with

f(x+\varepsilon) = f(x) + \varepsilon f'(x)

which makes intuitive sense if we rearrange it to look like

f(x+\varepsilon) - f(x) = \varepsilon f'(x).

This should remind everyone of the definition of a derivative, if only we could divide through by \varepsilon. But we cannot: if \varepsilon had an inverse, multiplying \varepsilon^2=0 by it would force \varepsilon=0, which we insisted is not the case.
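To make this less mysterious, here is a minimal Python sketch of arithmetic with numbers of the form a + b\varepsilon, where every \varepsilon^2 term is dropped. The class name Dual, the helper _lift, and the test function f(x) = x^3 + 2x are my own choices for illustration; they are not taken from Baez's notes. The coefficient of \varepsilon in f(x+\varepsilon) should come out as f'(x), just as the formula above says.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A 'number' re + eps*epsilon, where epsilon**2 == 0 but epsilon != 0."""
    re: float         # ordinary part
    eps: float = 0.0  # coefficient of epsilon

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.re + other.re, self.eps + other.eps)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.re - other.re, self.eps - other.eps)

    def __mul__(self, other):
        other = _lift(other)
        # (a + b e)(c + d e) = ac + (ad + bc) e, because the b*d*e^2 term vanishes
        return Dual(self.re * other.re,
                    self.re * other.eps + self.eps * other.re)

    __rmul__ = __mul__


def _lift(value):
    """Treat an ordinary number as a Dual with zero epsilon part."""
    return value if isinstance(value, Dual) else Dual(float(value))


def f(x):
    # Illustrative choice: f(x) = x^3 + 2x, so f'(x) = 3x^2 + 2
    return x * x * x + 2 * x


EPS = Dual(0.0, 1.0)         # epsilon itself
result = f(Dual(3.0) + EPS)  # f(3 + epsilon)

print(EPS * EPS)   # Dual(re=0.0, eps=0.0): epsilon squared is zero
print(result)      # Dual(re=33.0, eps=29.0): f(3) = 33 and f'(3) = 29
```

Note that no division is defined on Dual, which matches the point above: we never need (and cannot have) an inverse for \varepsilon.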

What does this mean? Well, the intuition of \varepsilon is that it is an “infinitesimal” or “a very tiny amount”. So we have for some “sufficiently small amount”

f(x + \text{some tiny amount}) - f(x) = (\text{some tiny amount}) f'(x)

which is superb. The left-hand side can be thought of as a sort of “really small deformation” of f(x). It turns out that this is equal to the derivative of f at x times the “really small deformation” of x. This is quite beautiful.

But it is quite problematic. Wouldn’t the very definition of \varepsilon be inconsistent? After all, taking square roots of \varepsilon^2=0 seems to give

\varepsilon = \sqrt{\varepsilon^2} = \sqrt{0} = 0?

Well, as always, yes and no. Yes, this is a bit of a problem, and many have railed against it (ranging from bloggers to the Church). No, this is no problem, as long as the square root operation is not defined for \varepsilon alone. In fact, we could use matrices instead of numbers to have

\varepsilon = \begin{bmatrix}0 & 1\\ 0 & 0\end{bmatrix}

which is a nilpotent matrix, so when we square it, it vanishes as desired. We also represent the number 1 by the identity matrix I to make everything work out. But this is still a bit irritating. How do we do anything productive?
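Before moving on, a quick sanity check of the matrix picture may help. The sketch below uses plain nested lists, and the helper names matmul, add and scale are my own, purely for illustration. It checks that the matrix squares to zero, and that squaring xI + \varepsilon gives x^2 I + 2x\varepsilon, i.e. f(x) on the diagonal and f'(x) in the \varepsilon slot for f(x) = x^2.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    """Add two 2x2 matrices entry by entry."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    """Multiply every entry of a 2x2 matrix by the number c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


I = [[1, 0], [0, 1]]    # plays the role of the number 1
EPS = [[0, 1], [0, 0]]  # the nilpotent matrix standing in for epsilon

eps_squared = matmul(EPS, EPS)
print(eps_squared)      # [[0, 0], [0, 0]]

x = 3                          # arbitrary illustrative value
X = add(scale(x, I), EPS)      # the matrix for x + epsilon
X_squared = matmul(X, X)       # f(x + epsilon) for f(x) = x^2
print(X_squared)               # [[9, 6], [0, 9]], i.e. 9*I + 6*EPS
```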

We can do something really clever instead. We can say, let \delta x be some variable or perhaps a function. Then we can have

\displaystyle\left.\frac{d}{ds} f(x+s\delta x)\right|_{s=0} = f'(x)\delta x

which is precisely what we had when we wrote

f(x+\varepsilon)-f(x)=\varepsilon f'(x)

making the switch of \varepsilon for \delta x.

But there is a subtlety here: we introduced a dummy parameter s. What is it? How should we interpret it? This is worth a book in and of itself, but for the time being just think of it as a bookkeeping device used to find the “deformation” of f with respect to x.
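One way to get a feel for s is to treat it as an honest real variable and approximate the s-derivative numerically. The sketch below does this with a central difference; the function name variation, the step size h, and the choice f = sin with x = 1, \delta x = 0.5 are all assumptions made for illustration.

```python
import math


def variation(f, x, dx, h=1e-6):
    """Approximate d/ds f(x + s*dx) at s = 0 with a central difference in s."""
    return (f(x + h * dx) - f(x - h * dx)) / (2 * h)


x, dx = 1.0, 0.5                      # arbitrary illustrative values
numeric = variation(math.sin, x, dx)  # left-hand side, approximated
exact = math.cos(x) * dx              # right-hand side, f'(x) * dx

print(numeric, exact)  # the two numbers should agree to many digits
```

The parameter s never appears in the answer; it only tells us in which direction, and how fast, we move away from x.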

This is nice and cute for functions of one variable, but what about e.g. f(x,y)? Well, we do the same song and dance; by the chain rule we have

\displaystyle\left.\frac{d}{ds}f(x+s\delta x, y+s\delta y)\right|_{s=0} = \left.\frac{\partial f}{\partial (x+s\delta x)}\frac{d(x+s\delta x)}{ds}\right|_{s=0} + \left.\frac{\partial f}{\partial (y+s\delta y)}\frac{d(y+s\delta y)}{ds}\right|_{s=0}

This reduces to

\displaystyle\left.\frac{d}{ds}f(x+s\delta x, y+s\delta y)\right|_{s=0} = \frac{\partial f(x,y)}{\partial x}\delta x + \frac{\partial f(x,y)}{\partial y}\delta y.
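The same numerical check works in two variables. Again this is only a sketch under my own assumptions: the name variation2, the test function f(x,y) = x^2 y, and the particular values of x, y, \delta x, \delta y are chosen for illustration.

```python
def variation2(f, x, y, dx, dy, h=1e-6):
    """Approximate d/ds f(x + s*dx, y + s*dy) at s = 0 by a central difference."""
    return (f(x + h * dx, y + h * dy) - f(x - h * dx, y - h * dy)) / (2 * h)


def f(x, y):
    # Illustrative choice: f(x, y) = x^2 * y,
    # so df/dx = 2*x*y and df/dy = x^2
    return x * x * y


x, y = 2.0, 3.0      # point at which we vary f
dx, dy = 0.1, -0.4   # the variations delta x and delta y

numeric = variation2(f, x, y, dx, dy)     # left-hand side, approximated
exact = (2 * x * y) * dx + (x * x) * dy   # right-hand side of the formula

print(numeric, exact)  # both should be close to -0.4
```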

So we are happy: we found a generalization for two dimensions. We can further generalize this to n dimensions by summing over the coordinates, taking the partial derivative with respect to the k^{th} coordinate x_k and multiplying it by \delta x_{k}. This should look familiar. Recall that when x and y depend on a parameter t, the chain rule gives the total derivative of f:

\displaystyle\frac{df(x,y)}{dt} = \frac{\partial f(x,y)}{\partial x}\frac{dx}{dt} + \frac{\partial f(x,y)}{\partial y}\frac{dy}{dt}.

If, being physicists, we just multiply through by dt, we find

\displaystyle df = \frac{\partial f(x,y)}{\partial x}dx + \frac{\partial f(x,y)}{\partial y}dy

where dx = (dx/dt)dt, dy = (dy/dt)dt. This is remarkably similar to what we have, if we replace dx\to\delta x, etc. Why not run with it? See what happens!
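Running with it, the n-dimensional generalization described in words above can be written out in one line. This is only a sketch: the shorthand \delta f for the s-derivative at s=0 is my own label, and I am assuming f is differentiable in the coordinates x_1,\dots,x_n.

\displaystyle\delta f := \left.\frac{d}{ds}f(x_1+s\delta x_1,\dots,x_n+s\delta x_n)\right|_{s=0} = \sum_{k=1}^{n}\frac{\partial f}{\partial x_k}\delta x_k.

Setting n=2 recovers the two-dimensional formula, and replacing \delta x_k\to dx_k gives the familiar expression for df.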

I am tempted to introduce Grassmann algebras, and show that the exterior derivative could be set up using noncommuting variables allowing us to have the rule d^2f=0, but I will resist this urge for now. We should just bear in mind that we really only do this once: if we take the exterior derivative (find df) twice, we get 0. There are a number of other technical points to bear in mind when working with differential forms, such as noncommutativity. For now, do not worry about them; they are not needed to continue.

I will end here and continue another time, when I will discuss classical mechanics directly.

2 Comments

  • Nice post. Finding explicit objects that square to zero without being zero themselves is fun indeed. A few years back, I was playing with the bi-octonions (complex numbers with octonion, rather than real, coefficients) and managed to find some examples of Grassmann algebraic behavior.

    Here’s a simple example:

    Let e_1 and e_2 be imaginary units (i.e. e_1^2=e_2^2=-1 and e_1e_2=-e_2e_1) of the octonionic basis and ‘i’ an imaginary unit that commutes with the octonionic units. Then w=e_1+ie_2 is a bi-octonionic element satisfying w^2=0, since w^2=(e_1+ie_2)(e_1+ie_2)=e_1^2+ie_1e_2+ie_2e_1+i^2e_2^2=-1+1=0, by anticommutativity of the octonionic units.

    I first heard about the bi-octonions from Baez’s notes here and Ohwashi formulated a matrix model with them.

  • What I have found more difficult to explain to physics majors is the notion of integrating with respect to a matrix. That is to say, if we could “represent” e.g.

    z = \begin{bmatrix}x & -y\\y & x\end{bmatrix}

    then what exactly is

    \int dz = ?

    There is no intuition of a “dMatrix” type expression for them…

    The octonions are a fun toy to play with though, I’ve found quaternions blow physics majors’ minds.

    But sadly, I must get back to studying :(

