Arbitrary rescaling of rapidity; work-KE theorem in SR

bcrowell · May 25, 2011

This thread https://www.physicsforums.com/showthread.php?t=500493 has led me to the following question, which probably has an obvious answer that I'm just not seeing.

The Lorentz transformation L depends in a simple way on the rapidity φ, and it scales according to L(kφ)=L(φ)^k. What is to stop us from rescaling φ arbitrarily when we generalize a Newtonian law of physics to a relativistic one?
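The scaling property L(kφ)=L(φ)^k can be checked numerically for integer k. The following is only a minimal sketch, assuming a 1+1-dimensional boost acting on (t, x) in units with c=1; the helper names `boost`, `matmul`, `matpow`, and `max_diff` are made up for illustration.

```python
import math

def boost(phi):
    """2x2 Lorentz boost acting on (t, x), in units with c = 1 (assumed convention)."""
    ch, sh = math.cosh(phi), math.sinh(phi)
    return [[ch, sh], [sh, ch]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(a, n):
    """n-th power of a 2x2 matrix for a non-negative integer n."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = matmul(result, a)
    return result

def max_diff(a, b):
    """Largest absolute entry-wise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

phi, k = 0.3, 4
# Should be zero up to floating-point rounding.
print(max_diff(boost(k * phi), matpow(boost(phi), k)))
```

This only covers collinear boosts, which is the case the scaling law refers to.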

For example, in SR we have a law of conservation of energy-momentum. The energy-momentum four-vector (timelike component first) is p=m(cosh φ,sinh φ). This transforms like a four-vector, so we're guaranteed that if energy-momentum is conserved in one frame, it's conserved in all other frames as well.

But suppose we instead defined the energy-momentum as p*=m(cosh αφ,sinh αφ), where α is some arbitrary constant. Flipping the sign of α just corresponds to flipping the coordinate system, but we could also imagine [itex]\alpha^2\ne 1[/itex]. I've written p* like a four-vector, but it isn't one. Under a boost by φ, it transforms like [itex]p^*\rightarrow L(\alpha \phi)p^*[/itex]. But that would seem to be OK, since if p* is conserved in one frame, it's also conserved in every other frame. In the Newtonian limit, p is rescaled by a factor of α and E by α²; since these are separately conserved in Newtonian physics, it's OK to rescale them by different factors, and they're still conserved.
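As a rough numerical illustration of the Newtonian-limit claim (momentum rescaled by α, kinetic energy by α²), here is a sketch assuming c=1 and φ=artanh(v); the function name `p_star` and the sample values are assumptions chosen for the example.

```python
import math

def p_star(m, alpha, v):
    """Return (E, p) for p* = m(cosh(alpha*phi), sinh(alpha*phi)), phi = artanh(v), c = 1."""
    phi = math.atanh(v)
    return m * math.cosh(alpha * phi), m * math.sinh(alpha * phi)

m, alpha, v = 1.0, 3.0, 1e-3
E, p = p_star(m, alpha, v)

# Compare with the Newtonian expressions m*v and (1/2)*m*v^2.
momentum_ratio = p / (m * v)                 # tends to alpha as v -> 0
kinetic_ratio = (E - m) / (0.5 * m * v**2)   # tends to alpha**2 as v -> 0
print(momentum_ratio, kinetic_ratio)
```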

E=mc² becomes E=mc²α^-2, which is contrary to experiment, although not contrary to any experimental evidence available to Einstein in 1905. DrGreg has a nice derivation of relativistic energy and momentum https://www.physicsforums.com/showthread.php?p=2416765 in which he explicitly considers why α²=1, but what bothers me a little is that one of the assumptions is the work-kinetic energy theorem. It's not obvious to me why we're justified in assuming a priori that the work-KE theorem has the same form in SR. DrGreg's treatment is based on one by Einstein, "Elementary derivation of the equivalence of mass and energy", Bull. Amer. Math. Soc. 41 (1935), 223-230, http://www.ams.org/journals/bull/1935-41-04/S0002-9904-1935-06046-X/home.html , and Einstein explicitly states as a selling point of his approach that he avoids talking about force.

The evidence that was available to Einstein in 1905 that clearly shows α²=1 was Maxwell's equations. The energy of an electromagnetic wave transforms by a factor of D^α, where D is the Doppler-shift factor for frequency, and Maxwell's equations say it transforms as D. But this seems to me like a very indirect way of approaching the issue. Actually I think that because the EM field is conformally invariant, it shouldn't care about [itex]\phi\rightarrow\alpha\phi[/itex], so α would probably only be constrained when you let light interact with material particles...? (This last thought may be completely wrong, haven't thought it through properly.)

bcrowell · May 25, 2011

I combed through the 1935 Einstein paper and I have to admit that I'm baffled by it at this point. Pages 223-224 are a plausibility argument. On p. 225 he starts over again and says he's going to give a real proof. He says his assumption is that "...the principles of conservation of impulse and energy are to hold for all coordinate systems which are connected with one another by the Lorentz transformations..." But this isn't enough of an assumption, because an E and p that transform with α²≠1 also satisfy it. So that means I need to go through the remainder of the paper to see where there is some other explicit or implicit assumption brought in. On p. 226 I don't understand what definition he is assuming for "the velocity u' of the pair." On pp. 226-227 he talks about elastic collisions of identical particles, but this doesn't make any sense to me. Elastic collisions of identical particles have to be trivial velocity-swaps, because of symmetry in the c.m. frame, so I don't understand how you can conclude anything from them. He gets some equations and says "These equations, which are valid in general for elastic collisions of equal masses, have the form of conservation equations; it may therefore be taken for granted that no other symmetrical or anti-symmetrical functions of the velocity-components exist which in the present case of the elastic collision of two identically constituted material points give anything analogous." But this doesn't make sense to me, because any function of velocity is automatically conserved in the elastic collision of two identical particles. For instance, the Newtonian expressions mv and (1/2)mv^2 are conserved in a system composed of elastically colliding, identical particles, even if the particles are relativistic. Other silly expressions like m/sin v are also conserved.

PhilDSP · May 27, 2011

bcrowell said:

The Lorentz transformation L depends in a simple way on the rapidity φ, and it scales according to L(kφ)=L(φ)^k. What is to stop us from rescaling φ arbitrarily when we generalize a Newtonian law of physics to a relativistic one?

These are all good questions. It could be worth scrutinizing Lorentz's "The Theory of Electrons". He was generally pretty thorough in both the verbal rationale and the full mathematical development of his derivations. The "Notes" at the end of the book are especially helpful. In one note he recasts and explains Einstein's re-derivation of the Lorentz transformation.

As for rescaling the rapidity, I'm trying to figure out whether that's equivalent to rescaling the wave operator

[tex]\frac{\partial}{\partial x} - \frac{1}{c^2}\frac{\partial}{\partial t}[/tex] to [tex]\frac{\partial}{\partial x} - \frac{1}{k(c - v)^2}\frac{\partial}{\partial t}[/tex]

or is it?

[tex]\frac{\partial}{\partial x} - \frac{1}{c^2}\frac{\partial}{\partial t}[/tex] to [tex]\frac{\partial}{\partial x} - \frac{1}{c^2}(\frac{\partial}{\partial x} - \frac{1}{kv^2}\frac{\partial}{\partial t})[/tex]

or something else?

In any event, Lorentz seemed to be most interested in generating equations for optical characteristics and chose to scale the Lorentz transformation so that the solutions corresponded with Fizeau's results.

bcrowell · May 27, 2011

Hi, PhilDSP,

Thanks for your reply! This issue has been bugging me like crazy for a week or two now.

Hmm...I think I see where you're going with the wave operator. In [itex]\frac{\partial}{\partial x} - \frac{1}{c^2}\frac{\partial}{\partial t}[/itex], I assume you meant c, not c². Basically this is just the gradient four-vector, [itex]\partial[/itex]. Now, velocities of [itex]\pm c[/itex] correspond to rapidities of [itex]\pm\infty[/itex], so rescaling rapidities leaves c the same. Also, the possibility I'm trying to kill off is the possibility that the p four-vector transforms differently from the x four-vector, which still transforms in the usual way. The thing about the gradient is that it forms a link between those two vector spaces, the space of x's and the space of p's.

Let's refer to a vector that transforms with α²≠1 as an α-vector, as opposed to a Lorentz vector with α²=1. I think the way to rule out the α-vectors is by showing that a boost fails to preserve the relationships that α-vectors have with Lorentz vectors.

One example of such a relationship is that x-space and p-space are basically related to one another through a Fourier transform. I'm sure that's one way to go, but I'm not immediately seeing how to carry it out. I would also like to resolve this in a way that works in an elementary presentation.

Another example along the same lines is the canonical momentum p+qA, where A is the four-potential. If p is an α-vector and A is a Lorentz vector, then the canonical momentum is neither fish nor fowl. But this isn't an argument I can use at an elementary level.

Another such relationship is an inner product. If you have α-vector A and Lorentz vector B, then their inner product A⋅B isn't a scalar. Under a boost, it changes, and it can be zero in one frame but nonzero in another. I think this is why the work-KE theorem is how people usually approach this. If you differentiate the p four-vector with respect to proper time, you get the force four-vector F (the Minkowski force). If p is an α-vector, then so is F. But x is a Lorentz four-vector, so differentiating it with respect to proper time gives a velocity four-vector v that is also a Lorentz vector. That means that F⋅v isn't a scalar, but F⋅v is supposed to be zero, because it represents the rate of change of the particle's rest mass. So this sort of sounds like an ironclad proof that p can't be an α-vector. However, what ties me up in logical knots here is the question of what gets logically established before what. Things like the frame-independence of the work-KE theorem for three-vectors, or F⋅v=0 for the four-force, seem to me like things that are not obvious a priori and should really be established *after* you've figured out how relativistic energy and momentum behave.

-Ben

PhilDSP · May 27, 2011

Yes, you're right. The first order wave operator should be as you say. If we consider that as an advection equation for any parameter being transported at speed c then we can combine 2 such advection equations (each traveling in opposite directions) to get the operator for the full wave equation:

[tex](\frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t})(\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t}) = \frac{\partial^2}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}[/tex]

What I'm uncertain of is exactly how Lorentz compounded the wave operators for EM propagation at c with the movement of the charge. If we arrive at the same operators Lorentz used then it would seem to be a simple matter to take the Fourier transform and apply that to the questions at hand. I'll need to think more about what you posted.

PhilDSP · May 29, 2011

After pondering the operators more carefully and looking at a few pages of "Many Minds Relativity" by Swedish mathematician Claes Johnson, I think I've figured out how to derive what the operators should look like:

First, determine the differential operator relations by applying the chain rule to the Lorentz transformation (in the x dimension only)
[tex]x' = \gamma(x - vt) \ \ \ \ \ \ \ \ t' = \gamma(t - v\frac{x}{c^2})[/tex]
[tex]\frac{\partial}{\partial x} \ = \ \frac{\partial x'}{\partial x} \frac{\partial}{\partial x'} + \frac{\partial t'}{\partial x} \frac{\partial}{\partial t'} \ = \ \gamma(\frac{\partial}{\partial x'} - \frac{v}{c^2} \frac{\partial}{\partial t'}) \ \ \ \ \ \ \ \ \ \ \ \ \ \ \frac{\partial}{\partial t} \ = \ \frac{\partial x'}{\partial t} \frac{\partial}{\partial x'} + \frac{\partial t'}{\partial t} \frac{\partial}{\partial t'} \ = \ \gamma(\frac{\partial}{\partial t'} - v\frac{\partial}{\partial x'})[/tex]

Now substitute the primed coordinates for the unprimed coordinates in the advection operator (for both directions):
[tex]\frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t} \ \ = \ \ \gamma(\frac{\partial}{\partial x'} - \frac{v}{c^2} \frac{\partial}{\partial t'}) - \gamma(\frac{1}{c})(\frac{\partial}{\partial t'} - v\frac{\partial}{\partial x'}) \ \ = \ \ \gamma((1 + \frac{v}{c})\frac{\partial}{\partial x'} - (1 + \frac{v}{c})\frac{1}{c}\frac{\partial}{\partial t'}) \ \ = \ \ \gamma(1 + \frac{v}{c})(\frac{\partial}{\partial x'} - \frac{1}{c}\frac{\partial}{\partial t'})[/tex]
[tex]\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t} \ \ = \ \ \gamma(\frac{\partial}{\partial x'} - \frac{v}{c^2} \frac{\partial}{\partial t'}) + \gamma(\frac{1}{c})(\frac{\partial}{\partial t'} - v\frac{\partial}{\partial x'}) \ \ = \ \ \gamma((1 - \frac{v}{c})\frac{\partial}{\partial x'} + (1 - \frac{v}{c})\frac{1}{c}\frac{\partial}{\partial t'}) \ \ = \ \ \gamma(1 - \frac{v}{c})(\frac{\partial}{\partial x'} + \frac{1}{c}\frac{\partial}{\partial t'})[/tex]

Then
[tex](\frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t})(\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t}) \ \ = \ \ \gamma^2(1 - \frac{v}{c})(1 + \frac{v}{c})(\frac{\partial}{\partial x'} - \frac{1}{c}\frac{\partial}{\partial t'})(\frac{\partial}{\partial x'} + \frac{1}{c}\frac{\partial}{\partial t'}) \ \ = \ \ \frac{\gamma^2}{\gamma^2} ( \frac{\partial^2}{\partial (x')^2} - \frac{1}{c^2}\frac{\partial^2}{\partial (t')^2} )[/tex]

So that
[tex]\frac{\partial^2}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} \ \ = \ \ \frac{\partial^2}{\partial (x')^2} - \frac{1}{c^2}\frac{\partial^2}{\partial (t')^2}[/tex]

The wave operator is invariant, so it is no wonder Lorentz thought he was onto something. The Fourier transform equation will trivially be
[tex]k^2 - \frac{1}{c^2}\omega^2 \ \ = \ \ (k')^2 - \frac{1}{c^2}(\omega')^2[/tex]
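A quick numerical sanity check of this invariance, as a sketch only: it assumes the substitution ∂/∂x → ik, ∂/∂t → iω applied to the chain-rule relations above, which gives k = γ(k' − vω'/c²) and ω = γ(ω' − vk'). The function names are hypothetical.

```python
import math

def unprimed_from_primed(k_p, w_p, v, c=1.0):
    """Map (k', w') to (k, w) using the chain-rule relations with d/dx -> ik, d/dt -> iw."""
    gamma = 1 / math.sqrt(1 - (v / c)**2)
    k = gamma * (k_p - v * w_p / c**2)
    w = gamma * (w_p - v * k_p)
    return k, w

def dispersion(k, w, c=1.0):
    """The combination k^2 - w^2/c^2 that should be frame independent."""
    return k**2 - w**2 / c**2

k_p, w_p, v = 1.3, 0.7, 0.6
k, w = unprimed_from_primed(k_p, w_p, v)
# The two values should agree up to rounding.
print(dispersion(k, w), dispersion(k_p, w_p))
```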

PhilDSP · May 29, 2011

Applying the same procedure to the Galilean transformation [itex]\ \ \ x' = x - vt \ \ \ \ \ t' = t \ \ [/itex] we get
[tex]\frac{\partial^2}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} \ \ = \ \ \frac{\partial^2}{\partial (x')^2} - \frac{1}{c^2}(\frac{\partial}{\partial t'} - v\frac{\partial}{\partial x'})^2[/tex]

The Fourier transform equation for this would be
[tex]k^2 - \frac{1}{c^2}\omega^2 \ \ = \ \ (k')^2 + \frac{1}{c^2}(i\omega' - ivk')^2 \ \ = \ \ (k')^2 - \frac{1}{c^2}((\omega')^2 - 2v\omega' k' + v^2(k')^2)[/tex]

Or simply
[tex]c^2k^2 - \omega^2 \ \ = \ \ c^2(k')^2 - (\omega')^2 + 2v\omega' k' - v^2(k')^2[/tex]
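The Galilean result can be checked the same way. This sketch assumes the same Fourier convention (∂/∂x → ik, ∂/∂t → iω), so that k = k' and ω = ω' − vk'; the function names are made up for illustration.

```python
def galilean_unprimed(k_p, w_p, v):
    """Map (k', w') to (k, w) for x' = x - vt, t' = t with d/dx -> ik, d/dt -> iw."""
    return k_p, w_p - v * k_p

def left_side(k_p, w_p, v, c):
    """c^2 k^2 - w^2 evaluated in the unprimed frame."""
    k, w = galilean_unprimed(k_p, w_p, v)
    return c**2 * k**2 - w**2

def right_side(k_p, w_p, v, c):
    """The expanded expression in primed quantities, including the cross terms."""
    return c**2 * k_p**2 - w_p**2 + 2 * v * w_p * k_p - v**2 * k_p**2

k_p, w_p, v, c = 1.3, 0.7, 0.6, 1.0
print(left_side(k_p, w_p, v, c), right_side(k_p, w_p, v, c))
# Without the cross terms the two frames disagree, unlike the Lorentz case.
print(c**2 * k_p**2 - w_p**2)
```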

Hurkyl · May 29, 2011

bcrowell said:

Under a boost by φ, it transforms like [itex]p^*\rightarrow L(\alpha \phi)p^*[/itex]. But that would seem to be OK,

I don't think that's how it transforms. Have you considered boosts in directions different from the velocity vector?

Hurkyl · May 29, 2011

For [itex]\alpha = 2[/itex], here is the relationship between p and p*:

If p = (t, x, y, z) with [itex]m^2 = t^2 - x^2 - y^2 - z^2[/itex], we have

[tex]p^* = \frac{1}{m} [ t^2 + x^2 + y^2 + z^2, 2tx, 2ty, 2tz ][/tex]

Taking p = (2,0,1,0)^T and applying the boost
[tex]\left( \begin{matrix}5/3 & 4/3 & 0 & 0 \\ 4/3 & 5/3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{matrix} \right)[/tex]
we get the transformed vector
[tex]p' = (10/3, 8/3, 1, 0)[/tex]
and the corresponding
[tex]p'^* = (173/9, 160/9, 20/3, 0) 3^{-1/2}[/tex]

Now for [itex]p^* = (5, 0, 4, 0)^T 3^{-1/2}[/itex] and applying this boost twice, we get
[tex](205/9, 200/9, 4, 0) 3^{-1/2}[/tex]
which is not equal to p'*.
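The arithmetic in this counterexample can be reproduced exactly with rational numbers. The sketch below works with m·p* instead of p*, so the common factor 3^{-1/2} drops out (m is the same before and after the boost); the helper names are assumptions for illustration.

```python
from fractions import Fraction as F

def p_star_times_m(p):
    """Return m * p* for alpha = 2: (t^2 + x^2 + y^2 + z^2, 2tx, 2ty, 2tz)."""
    t, x, y, z = p
    return (t*t + x*x + y*y + z*z, 2*t*x, 2*t*y, 2*t*z)

def apply(mat, vec):
    """Multiply a 4x4 matrix (nested lists) by a 4-component vector."""
    return tuple(sum(row[j] * vec[j] for j in range(4)) for row in mat)

# Boost along x with cosh(phi) = 5/3, sinh(phi) = 4/3.
B = [[F(5, 3), F(4, 3), 0, 0],
     [F(4, 3), F(5, 3), 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]

p = (F(2), F(0), F(1), F(0))
p_prime = apply(B, p)                          # boosted p
lhs = p_star_times_m(p_prime)                  # m * p'^*
rhs = apply(B, apply(B, p_star_times_m(p)))    # boost applied twice to m * p^*
print(lhs == rhs)  # False
```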

grav-universe · May 29, 2011

I'm not quite sure what you mean by rescaling the rapidity. If one rescales by setting a different tick rate or meter length within a particular frame to measure speeds differently while the frame remains homogeneous by keeping the same synchronization between clocks, then all speeds will rescale in the same way. For instance, if one frame measures c and a relative speed v of a second frame, then after rescaling the second frame, the second frame might then measure c' and v' of the first frame. But since all speeds are rescaled in the same way within the second frame, then v'/c' = v/c, so the rapidity remains the same. Energy according to the rescaled frame then becomes E = m c'^2 using the rescaled speed of light, applying different values but using the same equation.

As for how one transforms kinetic energy, I doubt this is what you are asking about specifically, but we have

E = F d = (m a) d

where a is the proper acceleration applied and the distance traveled by an accelerating object is

d = (c^2 / a) [sqrt(1 + (a t / c)^2) - 1], so

E = m a (c^2 / a) [sqrt(1 + (a t / c)^2) - 1]

where a t / c = (v / c) / sqrt(1 - (v/c)^2), giving

E = (m c^2) [sqrt(1 + (v/c)^2 / (1 - (v/c)^2)) - 1]

= (m c^2) [sqrt((1 - (v/c)^2) + (v/c)^2) / sqrt(1 - (v/c)^2) - 1]

= (m c^2) [1 / sqrt(1 - (v/c)^2) - 1]
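A numerical cross-check of the algebra above, as a sketch: it assumes the standard constant-proper-acceleration relation v = at / sqrt(1 + (at/c)^2), which is equivalent to the expression for at/c used in the post. The function names are hypothetical.

```python
import math

def work_done(m, a, t, c=1.0):
    """E = m a d with d = (c^2 / a) [sqrt(1 + (a t / c)^2) - 1]."""
    d = (c**2 / a) * (math.sqrt(1 + (a * t / c)**2) - 1)
    return m * a * d

def speed(a, t, c=1.0):
    """Speed after coordinate time t under constant proper acceleration a."""
    return a * t / math.sqrt(1 + (a * t / c)**2)

def kinetic_energy(m, v, c=1.0):
    """E = m c^2 [1 / sqrt(1 - (v/c)^2) - 1]."""
    return m * c**2 * (1 / math.sqrt(1 - (v / c)**2) - 1)

m, a, t = 2.0, 0.5, 3.0
# The two numbers should agree up to rounding.
print(work_done(m, a, t), kinetic_energy(m, speed(a, t)))
```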

I figure you already know this, so it is most probably not what you are asking, but I noticed something interesting with Hurkyl's reply, so I thought I would post this. The kinetic energy just obtained is based on the distance traveled as measured in the initial frame of acceleration. It works out when applied that way, but it should probably be applied from the frame of the ship. Then again, the ship observer should say that the point of departure is accelerating away in the same way over the same distance, at least for the distance measured after the acceleration ceases and the ship becomes inertial again in the final frame. In any case, I would want something more definite than simply applying relativistic distance to the Newtonian formula for work. For instance, from the frame of the accelerating observer, we could say that the accelerating observer's ship is emitting some amount of energy per proper time or per boost in order to accelerate (with insignificant loss to the mass of the accelerating ship due to fuel loss). Or we could have the local frames apply some increment of energy with each boost as the ship passes them, so that after n boosts, an amount of energy n dE has been expended by the ship or applied by the local frames in order to accelerate the ship to v(n).

Let's say that the ship is stationary in our frame. We apply a boost with some energy dE expended and the ship now travels at v. In the frame of v, a second identical boost is applied, and so on. After n boosts, we find that the ship will be traveling with a speed of

v(n) / c = 2 / [(2 / (1 + v/c) - 1)^n + 1] - 1

We also find that for infinitesimal boosts, v/c is infinitesimal (becoming dv/c), so that

(2 / (1 + dv/c) - 1)^n = (1 / e^2)^((dv/c) n), and therefore

v(n) / c = 2 / [1 / e^(2 (dv/c) n) +1] - 1

and solving for n, we get

n = ln[1 / (2 / (1 + v(n)/c) - 1)] / (2 (dv/c))
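The formulas for v(n) and n can be compared against direct repeated use of the relativistic velocity-addition rule. This is a sketch under the assumption that each boost adds the same speed v/c in the ship's current rest frame; the function names are made up for the example.

```python
import math

def compose(beta, n):
    """Apply the relativistic velocity-addition rule n times, starting from rest."""
    total = 0.0
    for _ in range(n):
        total = (total + beta) / (1 + total * beta)
    return total

def closed_form(beta, n):
    """v(n)/c = 2 / [(2 / (1 + v/c) - 1)^n + 1] - 1, as quoted above."""
    return 2 / ((2 / (1 + beta) - 1)**n + 1) - 1

def boosts_needed(beta_final, dbeta):
    """n = ln[1 / (2 / (1 + v(n)/c) - 1)] / (2 (dv/c)), for small dbeta."""
    return math.log(1 / (2 / (1 + beta_final) - 1)) / (2 * dbeta)

# Finite boosts: the loop and the closed form should agree up to rounding.
print(compose(0.1, 7), closed_form(0.1, 7))

# Small boosts: the estimate of n should come out close to the 5000 actually applied.
print(boosts_needed(compose(1e-4, 5000), 1e-4))
```

In terms of rapidity the closed form is just tanh(n artanh(v/c)), which is another way to see why the number of boosts grows logarithmically.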

But here's where things get interesting. This can be used to attain the relativistic acceleration formulas for speed, distance, and the time for the accelerating observer, which I have already done. But when applied to energy, n in this case should be something on the order of E / dE as far as I can tell, being the energy applied with each boost times the number of boosts in all, so we should be able to solve for E with dE being the infinitesimal local energy applied, gaining a constant acceleration of the ship, whereby dE = m dv^2 / 2 = m a dx = m a dv dt / 2, but then we get

n = E / dE = ln[1 / (2 / (1 + v(n)/c) - 1)] / (2 (dv/c))

E = (m a dv dt / 2) ln[1 / (2 / (1 + v(n)/c) - 1)] / (2 (dv/c))

E = (m a dt c / 4) ln[1 / (2 / (1 + v(n)/c) - 1)]

We are left with all finite values except for dt, which should be finite also since E is finite. I've been working on this for a while now, but cannot seem to extract a finite value for energy and I'm not sure why. Any thoughts?

bcrowell · May 30, 2011

Hi, Hurkyl,

Thanks for your reply :-)

I think we're on different wavelengths, or maybe I didn't explain what I had in mind clearly enough. It's definitely not possible to have the standard p four-vector transforming as a standard Lorentz vector, and also the p* transforming as [itex]p^*\rightarrow L(\alpha \phi)p^*[/itex]. They're incompatible theories, and the α=1 theory is the one that is both correct according to experiment and theoretically compatible with Maxwell's equations. What I'm trying to figure out is whether there's any pedagogically simple way to rule out [itex]\alpha^2\ne 1[/itex] without appealing to experiment, and without assuming knowledge of Maxwell's equations. Most freshman-survey texts that bother to give any semblance of a coherent logical foundation for SR (i.e., not very many of them), do it using the work-KE theorem, which they generally assume, without explanation, is valid in its Galilean form when you generalize to the one-dimensional relativistic case.

-Ben

Hurkyl · May 30, 2011

bcrowell said:

also the p* transforming as [itex]p^*\rightarrow L(\alpha \phi)p^*[/itex]

It's worse than you think: the Lorentz group can't act that way. It fails associativity. For example, for [itex]\alpha = 2[/itex]:

[tex] A \cdot (B \cdot p^*) = A \cdot (B^2 p^*) = A^2 B^2 p^* [/tex]
[tex] (A B) \cdot p^* = (AB)^2 p^*[/tex]

(in the above, [itex]\cdot[/itex] is the group action you describe, and juxtaposition is the ordinary multiplication of a matrix by a vector)

This transformation law would only make sense in a representation of the Lorentz group where all of the matrices commute.
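To make the associativity failure concrete, here is a small exact check for α=2, as a sketch only. It takes A and B to be boosts along x and y (an assumed choice of a non-commuting pair) acting on (t, x, y), and compares A^2 B^2 with (AB)^2.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Boosts acting on (t, x, y) with cosh(phi) = 5/3, sinh(phi) = 4/3.
A = [[F(5, 3), F(4, 3), 0],
     [F(4, 3), F(5, 3), 0],
     [0, 0, 1]]                # boost along x
B = [[F(5, 3), 0, F(4, 3)],
     [0, 1, 0],
     [F(4, 3), 0, F(5, 3)]]    # boost along y

AB = matmul(A, B)
lhs = matmul(matmul(A, A), matmul(B, B))   # A^2 B^2
rhs = matmul(AB, AB)                       # (AB)^2
print(lhs == rhs)  # False
```

Since A and B are invertible, A^2 B^2 = (AB)^2 would require AB = BA, which fails for boosts in perpendicular directions.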

bcrowell · May 30, 2011

I think I finally figured out a simple explanation for this.

When you derive the Lorentz transformation, one of the deceptively simple steps is calling dx/dt "velocity." This isn't as trivial as it seems, since velocity was assumed to have a variety of properties in Galilean relativity, and dx/dt doesn't have all of those properties in SR. E.g., in SR boosts don't add linearly, and consecutive boosts in perpendicular directions don't commute. But dx/dt does at least recover these properties approximately in the limit of v<<c.

Suppose you've already convinced yourself that p has to transform as [itex]p\rightarrow L(\alpha \phi)p[/itex], and you just want to rule out [itex]\alpha^2\ne 1[/itex]. Well, in the limit of small velocities, [itex]E\approx m[/itex], so [itex]dp/dE\approx dp/dm\approx v[/itex]. So in the low-velocity limit, we need to refer to dp/dE as "velocity." But in this limit, the line p=0 transforms to a line with a slope of approximately [itex]\alpha v[/itex], and that means we must have [itex]\alpha=1[/itex].
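A numerical illustration of the slope argument, as a sketch: it assumes c=1, φ=artanh(v), and that the rest vector (m, 0) is carried to m(cosh αφ, sinh αφ), so the slope p/E of the transformed line is tanh(αφ). The function name `slope` is hypothetical.

```python
import math

def slope(alpha, v):
    """p/E for m(cosh(alpha*phi), sinh(alpha*phi)) with phi = artanh(v), c = 1."""
    return math.tanh(alpha * math.atanh(v))

v = 1e-3
for alpha in (0.5, 1.0, 2.0):
    # The ratio slope/v tends to alpha as v -> 0, so only alpha = 1 gives slope ~ v.
    print(alpha, slope(alpha, v) / v)
```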
