Showing that Lorentz transformations are the only ones possible

In summary, David Bohm's book "The Special Theory of Relativity" states that if coordinates in frame A are (x,y,z,t) and coordinates in frame B moving with velocity v are (x',y',z',t'), and the relationship c^2t^2 - x^2 - y^2 - z^2 = 0 holds in both frames, then the only possible transformations that keep this relationship invariant are the Lorentz transformations. The author also mentions the need for rotations and reflections in order to preserve the interval. It is possible to show this using physical assumptions and linear functions, but it is difficult to prove for a general Lorentz-Herglotz transformation. However, if we make
  • #106
Fredrik said:
How am I "already imposing linearity"? I'm starting with "takes straight lines to straight lines", because that is the obvious property of inertial coordinate transformations, and then I'm using the theorem to prove that (when spacetime is ℝ4) an inertial coordinate transformation is the composition of a linear map and a translation. I don't think linearity is obvious. It's just an algebraic condition with no obvious connection to the concept of inertial coordinate transformations.


Right, if we add that to our assumptions, we can eliminate the Galilean group as a possibility. But I would prefer to just say this: These are the two theories that are consistent with a) the idea that ℝ4 is the underlying set of "spacetime", and b) our interpretation of the principle of relativity as a set of mathematically precise statements about transformations between global inertial coordinate systems. Now that we have two theories, we can use experiments to determine which one of them makes the better predictions.
The relevant experiment for this discussion is the Michelson-Morley experiment.

How is that a correction? It seems like an unrelated statement.
Intuitively (I am not a specialist), this means that the isomorphism holds only locally (over short distances around the observer). There is not really a global inertial coordinate system (except on paper, in theory). And (as far as I understand the generalized version of the theory) this is a crucial point. Among other things, this forced us (Weyl's work) to introduce the concepts of parallel transport and of a connection.
 
  • #107
Fredrik said:
How am I "already imposing linearity"?
The assumption of a spacetime that is globally R^4 (not just locally, which is the weaker assumption) means your underlying geometry is flat (Minkowskian, Euclidean), do you agree?
Given that space, the transformations that leave inertial coordinates invariant in the sense of the first postulate of SR must automatically be linear transformations, do you agree? Maybe this is not as obvious to see as I think, but I think it is correct.

Fredrik said:
How is that a correction? It seems like an unrelated statement.
Well, it just seemed important to make more precise that the isomorphism you were talking about is local.
 
  • #108
Blackforest said:
But all this concerns only special relativity.
And pre-relativistic classical mechanics. It concerns all theories with ℝ4 as spacetime. I think it's pretty cool that there are only two such theories that are consistent with a straightforward interpretation of the principle of relativity.

Blackforest said:
Where do you see that the question asked by the OP (and recalled by friend) is imposing linearity? For me it only imposes the Christoffel's work; see the other discussion "O-S model of star collapse" post 109, Foundations of the GTR by A. Einstein and translated by Bose, [793], (25).
Someone who tries to argue that a transformation that satisfies the OP's condition must be a Lorentz transformation has probably already assumed that spacetime is ℝ4, and that the theory will involve global (i.e. defined on all of spacetime) inertial coordinate systems. That a transformation between two global inertial coordinate systems is a bijection and takes straight lines to straight lines is just a consequence of the definition of "global inertial coordinate system". The 4-dimensional version of the theorem I stated and proved in #98 shows that a bijection that takes straight lines to straight lines is affine (i.e. a composition of a linear map and a translation). So when we begin to consider the OP's condition, it's already a matter of determining which affine maps satisfy it. And the condition implies that 0 is taken to 0, so there's no translation involved, i.e. the transformation is linear.
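The "takes straight lines to straight lines" property is easy to probe numerically. The following is only a minimal sketch, not part of the proof in #98: the affine map, the non-affine bijection `bent`, and the three sample points are hypothetical choices made for illustration. It shows an affine map preserving collinearity in ℝ4 and a simple non-affine bijection failing to.

```python
import numpy as np

def collinear(p, q, r, tol=1e-9):
    """True if three points of R^4 lie on one straight line."""
    # p, q, r are collinear iff the two difference vectors span rank <= 1.
    return np.linalg.matrix_rank(np.vstack([q - p, r - p]), tol=tol) <= 1

rng = np.random.default_rng(0)

# Hypothetical affine map x -> A x + a (illustration only).
A = rng.normal(size=(4, 4))
a = rng.normal(size=4)

def affine(x):
    return A @ x + a

# Hypothetical non-affine bijection of R^4: shift x by t^3.
def bent(x):
    t, x1, y, z = x
    return np.array([t, x1 + t**3, y, z])

# Three collinear points: p, p + d, p + 2.5 d.
p = np.array([0.5, 1.0, -2.0, 3.0])
d = np.array([1.0, 2.0, 0.5, -1.0])
pts = [p, p + d, p + 2.5 * d]

affine_keeps_lines = collinear(*[affine(x) for x in pts])
bent_keeps_lines = collinear(*[bent(x) for x in pts])
print(affine_keeps_lines, bent_keeps_lines)
```

This only checks one line under one map of each kind. The theorem is the converse and much stronger statement: every bijection that passes this test on all lines is affine.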

Blackforest said:
My impression (perhaps false) is that SR is based on a coherent but circular way of thinking that includes "linearity" for easily understandable historical reasons.
I don't think there's anything circular about it. It's perhaps naive to think that we should be able to use ℝ4 as our spacetime, and talk about global inertial coordinate systems. But it makes sense to first find all such theories, and then ask what other theories are worth considering. I might take a look at that problem when I have worked out all the details of the ℝ4 case.
 
  • #109
TrickyDicky said:
The assumption of a spacetime that is globally R^4 (not just locally, which is the weaker assumption) means your underlying geometry is flat (Minkowskian, Euclidean), do you agree?
I don't agree. We don't have a geometry at that stage, because until we have chosen an inner product (or something similar), ℝ4 is just a set. (And in the case of Galilean transformations, we will never define anything like an inner product on ℝ4). The lines that we call "straight" are straight in the Euclidean sense, but we're not considering them because they're straight in the Euclidean sense, but because they describe motion with a constant velocity. We don't need an inner product to see that they do.

TrickyDicky said:
Given that space, the transformations that leave inertial coordinates invariant in the sense of the first postulate of SR must automatically be linear transformations, do you agree? Maybe this is not as obvious to see as I think, but I think it is correct.
They must automatically be affine maps, but it takes a non-trivial theorem* to see that, and you specifically said that there's no need to prove that theorem.

*) This theorem is essentially "the fundamental theorem of affine geometry", stated in terms of vector spaces instead of affine spaces.

TrickyDicky said:
Well, it just seemed important to make more precise that the isomorphism you were talking about is local.
But it's not. This is the 1+1-dimensional version of what I said, with all the details made explicit: For each K>0, the group ##G_K=\{\Lambda(v)|v\in (-c,c)\}##, where ##c=1/\sqrt{K}## and
$$\Lambda(v)=\frac{1}{\sqrt{1-Kv^2}}\begin{pmatrix}1 & -Kv\\ -v & 1\end{pmatrix}$$ is isomorphic to the restricted Lorentz group.

There's nothing local about this. In fact, when K=1, this group is the restricted Lorentz group, and the isomorphism is the identity map.
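A quick numerical sanity check of this claim, as a sketch only. The helper name `boost`, the sample values of K, u and v, the velocity-addition formula used for the closure check, and the rescaling matrix D are not from the post above; they are assumptions added for illustration. Rescaling x by ##\sqrt{K}## is one explicit way to map ##G_K## onto the K=1 group.

```python
import numpy as np

def boost(v, K):
    """Lambda(v) from the post, acting on column vectors (t, x)."""
    gamma = 1.0 / np.sqrt(1.0 - K * v**2)
    return gamma * np.array([[1.0, -K * v], [-v, 1.0]])

K = 0.25                 # arbitrary positive constant, so c = 1/sqrt(K) = 2
u, v = 0.7, -1.3         # both inside (-c, c)

# Closure: Lambda(u) Lambda(v) = Lambda(w) with w = (u + v) / (1 + K u v).
w = (u + v) / (1.0 + K * u * v)
closed = np.allclose(boost(u, K) @ boost(v, K), boost(w, K))

# Rescaling x by sqrt(K) turns Lambda_K(v) into the K = 1 boost with
# velocity sqrt(K) v. The map is defined on the whole group at once.
D = np.diag([1.0, np.sqrt(K)])
iso = np.allclose(D @ boost(v, K) @ np.linalg.inv(D),
                  boost(np.sqrt(K) * v, 1.0))
print(closed, iso)
```

Both checks hold for every v in (-c, c), which is the sense in which nothing here is local.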
 
  • #110
Fredrik said:
I don't agree. We don't have a geometry at that stage, because until we have chosen an inner product (or something similar), ℝ4 is just a set.
Sorry, aren't we assuming inner product spaces? How can we even talk about transformation matrices otherwise?



Fredrik said:
But it's not.

Well, it's not with your assumption of flat inner product space, but if you consider general manifolds the restricted Lorentz group is locally isomorphic to the Lorentz group.
 
  • #111
TrickyDicky said:
Sorry, aren't we assuming inner product spaces? How can we even talk about transformation matrices otherwise?
I'm not even mentioning matrices until later in the argument, after I've determined that we're dealing with linear operators. To associate a matrix with a linear operator, we only need a basis.
 
  • #112
You wanted to prove that linear transformations are the only ones possible if one wants to apply the first postulate of SR rigorously. You bring in an R^4 vector space because you consider it natural to assume that the space must be globally R^4, not just locally as in general manifolds, and in this space you need to perform matrix multiplications like ##T(x)=\Lambda x##. That looks like a matrix product to me, so we are starting with an R^4 vector space with an inner product structure, no? That is called a Euclidean structure IMO.
 
  • #113
  • #114
TrickyDicky said:
You wanted to prove that linear transformations are the only ones possible if one wants to apply the first postulate of SR rigorously. You bring in an R^4 vector space because you consider it natural to assume that the space must be globally R^4, not just locally as in general manifolds, and in this space you need to perform matrix multiplications like ##T(x)=\Lambda x##. That looks like a matrix product to me, so we are starting with an R^4 vector space with an inner product structure, no? That is called a Euclidean structure IMO.
I'm not using the principle of relativity to prove that they're linear. The notation ##T(x)=\Lambda x+a## doesn't mean that ##\Lambda## is a matrix at this point. It only means that I'm using the standard convention to not write out parentheses when the map is known to be linear. We don't need an inner product to associate matrices with linear operators. We only need a basis for that. If U and V are vector spaces with bases ##A=\{u_i\}## and ##B=\{v_i\}## respectively, then the ij component of ##T:U\to V## with respect to the pair of bases (A,B) is defined as ##(Tu_j)_i##. The matrix associated with T (and the pair (A,B)) has ##(Tu_j)_i## (=the ith component of ##Tu_j##) on row i, column j.
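To make the "only a basis is needed" point concrete, here is a small sketch. The map T and the two non-orthonormal bases are hypothetical examples, not anything from the thread. Column j of the matrix is obtained by expanding ##Tu_j## in the basis B, which is a linear solve and never uses an inner product.

```python
import numpy as np

# Hypothetical linear map T: R^3 -> R^2, given as a function only.
def T(x):
    return np.array([x[0] + 2.0 * x[1], 3.0 * x[2] - x[0]])

# Arbitrary (non-orthonormal) bases; columns are the basis vectors.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 2.0]])      # basis u_1, u_2, u_3 of U = R^3
B = np.array([[2.0, 1.0],
              [1.0, 1.0]])           # basis v_1, v_2 of V = R^2

# Column j holds the components of T(u_j) in the basis B,
# found by solving B c = T(u_j): no inner product is involved.
M = np.column_stack([np.linalg.solve(B, T(A[:, j])) for j in range(3)])

# Check: components of T(x) w.r.t. B equal M times components of x w.r.t. A.
x = np.array([0.3, -1.2, 2.0])
x_A = np.linalg.solve(A, x)
consistent = np.allclose(np.linalg.solve(B, T(x)), M @ x_A)
print(M, consistent)
```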

* Spacetime is a structure with underlying set M.
* We intend to use curves in M to represent motion.
* There's a special set of curves in M that we can use to represent the motion of non-accelerating objects.
* M can be bijectively mapped onto ℝ4.
* A coordinate system on a subset ##U\subset M## is an injective map from U into ℝ4.
* A global coordinate system on M is a coordinate system with domain M.
* A global inertial coordinate system is a global coordinate system that takes the curves that represent non-accelerating motion to straight lines.
* If x and y are global coordinate systems, then ##x\circ y^{-1}## represents a change of coordinates. I call these functions coordinate transformations. When both x and y are global inertial coordinate systems, I call ##x\circ y^{-1}## an inertial coordinate transformation. (I'm getting tired of saying "global" all the time).
* These definitions imply that an inertial coordinate transformation is a bijection that takes straight lines to straight lines.
* The fundamental theorem of affine geometry tells us that this implies that inertial coordinate transformations are affine maps.
* This implies that an inertial coordinate transformation that takes 0 to 0 is linear.
* The principle of relativity tells us among other things that the set of inertial coordinate transformations is a group.
* This group has a subgroup G that consists of the proper and orthochronous inertial coordinate transformations that take 0 to 0.
* We interpret the principle of relativity as imposing a number of other conditions on G.
* Since the members of G are linear (we know this because they are affine and take 0 to 0), we can write an arbitrary member of G as a matrix. (This requires only a basis, not an inner product, and all vector spaces have a basis).
* The conditions inspired by the principle of relativity determine a bunch of relationships between the components of that matrix.
* Those relationships tell us that the group is either the restricted Galilean group without translations, or isomorphic to the restricted Lorentz group. (Restricted = proper and orthochronous).
* This implies that the group of all inertial coordinate transformations is either the Galilean group or the Poincaré group.
* We therefore define spacetime as a structure that has ℝ4 as the underlying set, and somehow singles out exactly one of these two groups as "special".
* A nice way to define a structure that singles out the Poincaré group is to define spacetime as the pair (ℝ4,g), where g is a Lorentzian metric whose isometry group is the Poincaré group.
* There's no equally nice way to handle the Galilean case. I think we either have to define spacetime as (ℝ4,G,g), where G is the Galilean group and g the metric on "space", or define it as a fiber bundle. (An ℝ3 bundle over ℝ, where each copy of ℝ3 is equipped with the Euclidean inner product). The former option is ugly. The latter is difficult to understand, unless you already understand fiber bundles of course.
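The last two items can be illustrated with a short numerical sketch. It assumes units with c = 1, a boost along x, the signature (+,-,-,-) for g, and two hypothetical sample events; none of these choices come from the list above. A Lorentz boost satisfies the isometry condition ##\Lambda^T\eta\Lambda=\eta##, while a Galilean boost does not, although it does preserve the Euclidean distance between simultaneous events.

```python
import numpy as np

v = 0.6                                  # boost speed along x, units with c = 1
gamma = 1.0 / np.sqrt(1.0 - v**2)

# Boosts acting on column vectors (t, x, y, z).
lorentz = np.eye(4)
lorentz[:2, :2] = gamma * np.array([[1.0, -v], [-v, 1.0]])
galilei = np.eye(4)
galilei[1, 0] = -v                       # x' = x - v t, t' = t

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # Lorentzian metric, signature (+,-,-,-)

# Isometry condition: L^T eta L = eta.
lorentz_is_isometry = np.allclose(lorentz.T @ eta @ lorentz, eta)
galilei_is_isometry = np.allclose(galilei.T @ eta @ galilei, eta)

# What the Galilean boost does preserve: the Euclidean distance
# between two events that share the same time coordinate.
e1 = np.array([2.0, 1.0, 0.0, -1.0])
e2 = np.array([2.0, 4.0, 4.0, -1.0])     # same t as e1
d_before = np.linalg.norm((e2 - e1)[1:])
d_after = np.linalg.norm((galilei @ e2 - galilei @ e1)[1:])
galilei_keeps_space = np.isclose(d_before, d_after)
print(lorentz_is_isometry, galilei_is_isometry, galilei_keeps_space)
```

This is why a single metric on ℝ4 is enough in the Poincaré case, while the Galilean case needs a metric that lives only on the constant-t slices.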
 
  • #115
I'm given to understand that

[tex]d\tau^2 = dt^2 - dx^2 = dt'^2 - dx'^2[/tex]

when (t',x') are the Lorentz transformation of (t,x).

Perhaps it's instructive to consider in what circumstances dτ should be considered invariant with respect to coordinate changes. Maybe those requirements are the driving force behind the necessity of the Lorentz transformations.

For example, the most obvious use of dτ is in the calculation of the line integral,

[tex]\int_{{\tau _0}}^\tau {d\tau '} = \tau - {\tau _0}[/tex]
which is the length of a line measured in terms of segments marked off along the length of the line. Then, of course, we can always place this line in an arbitrarily oriented coordinate system and express τ in terms of those coordinates.

So the question is, when do we want to use the coordinates (t,x), and when would we want τ-τ0 to be invariant with respect to those coordinates?

Usually, we specify a curve in space by parameterizing the space coordinates with an arbitrary variable, call it "t". But since the x and t coordinates are arbitrarily assigned, the length of the curve can depend on the (t,x) coordinates. But if you specify that the length of the curve is invariant, then this requires the Lorentz transformations between coordinate systems.
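As a rough numerical illustration of this point, here is a sketch that assumes units with c = 1 and a hypothetical piecewise-linear timelike path. It computes the length ##\sum\sqrt{dt^2-dx^2}## in the original coordinates, after a Lorentz transformation, and after a Galilean transformation. It only shows that the Lorentz transformation preserves the length and the Galilean one does not; it does not prove that Lorentz transformations are the only ones that do.

```python
import numpy as np

def proper_time(events):
    """Sum of sqrt(dt^2 - dx^2) over straight timelike segments (c = 1)."""
    dt, dx = np.diff(events, axis=0).T
    return np.sum(np.sqrt(dt**2 - dx**2))

def lorentz(events, v):
    """Lorentz boost of rows (t, x) with velocity v."""
    gamma = 1.0 / np.sqrt(1.0 - v**2)
    t, x = events.T
    return np.column_stack([gamma * (t - v * x), gamma * (x - v * t)])

def galilei(events, v):
    """Galilean boost of rows (t, x) with velocity v."""
    t, x = events.T
    return np.column_stack([t, x - v * t])

# Hypothetical piecewise-linear worldline: rows are (t, x) events.
path = np.array([[0.0, 0.0], [2.0, 1.0], [5.0, 0.0], [6.0, 0.5]])

tau = proper_time(path)
tau_lorentz = proper_time(lorentz(path, 0.4))
tau_galilei = proper_time(galilei(path, 0.4))
print(tau, tau_lorentz, tau_galilei)
```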

But what requires the length of the curve to be invariant? Perhaps if we have a more fundamental requirement like

[tex]\int_{{\tau _0}}^\tau {f(\tau - {\tau _0})d\tau } = a[/tex]
this will require the length τ-τ0 to be invariant with respect to coordinate changes in (t,x). For example, [itex]{f(\tau - {\tau _0})}[/itex] might be a probability distribution along a path, so that its integral along the path must be 1 in any coordinate system.

Did I get this all right? I would appreciate comments. Thank you.
 
  • #116
TrickyDicky said:
You wanted to prove that linear transformations are the only ones possible if one wants to apply the first postulate of SR rigorously. You bring in an R^4 vector space because you consider it natural to assume that the space must be globally R^4, not just locally as in general manifolds, and in this space you need to perform matrix multiplications like ##T(x)=\Lambda x##. That looks like a matrix product to me, so we are starting with an R^4 vector space with an inner product structure, no? That is called a Euclidean structure IMO.

Why do you think we need inner products to define matrix products??
 
  • #117
micromass said:
Why do you think we need inner products to define matrix products??

No, it's not needed, I thought Fredrik was assuming Euclidean geometry but he wasn't.
 
  • #118
Fredrik said:
* Spacetime is a structure with underlying set M.
* We intend to use curves in M to represent motion.
* There's a special set of curves in M that we can use to represent the motion of non-accelerating objects.
* M can be bijectively mapped onto ℝ4.
* A coordinate system on a subset ##U\subset M## is an injective map from U into ℝ4.
* A global coordinate system on M is a coordinate system with domain M.
* A global inertial coordinate system is a global coordinate system that takes the curves that represent non-accelerating motion to straight lines.
* If x and y are global coordinate systems, then ##x\circ y^{-1}## represents a change of coordinates. I call these functions coordinate transformations. When both x and y are global inertial coordinate systems, I call ##x\circ y^{-1}## an inertial coordinate transformation. (I'm getting tired of saying "global" all the time).
* These definitions imply that an inertial coordinate transformation is a bijection that takes straight lines to straight lines.
I have some concerns about this part. Maybe there is some circularity in the argument after all. It doesn't seem obvious* that the "special" curves in spacetime that represent non-accelerated motion should include curves that correspond to infinite speed in some inertial coordinate system. If we leave them out, then what I call an inertial coordinate transformation will be a map that takes finite-speed straight lines to finite-speed straight lines. Of course, inertial coordinate transformations in SR (i.e. Poincaré transformations) can take infinite-speed lines to finite-speed lines and vice versa. If inertial coordinate transformations can't do this, there's no relativity of simultaneity. So if we leave out the infinite-speed lines from the start, we will come to the conclusion that there's only one possibility: The group is the Galilean group. (Hm, maybe there will actually be infinitely many possibilities, distinguished by what exactly they're doing to infinite-speed lines).

Do we have a reason to include infinite-speed lines other than that we know what we want the final answer to be?

*) Recall that the main reason why we need spacetime to include that special set of curves is that they (or at least some of them) are to represent the motions of "observers" that are minimally disturbed by what's being done to them. (An "observer" here is not necessarily conscious. It could be a measuring device).
 
  • #119
Fredrik said:
Do we have a reason to include infinite-speed lines other than that we know what we want the final answer to be?

*) Recall that the main reason why we need spacetime to include that special set of curves is that they (or at least some of them) are to represent the motions of "observers" that are minimally disturbed by what's being done to them. (An "observer" here is not necessarily conscious. It could be a measuring device).
This sort of thing is one reason why I prefer to start from inertial observers defined as those that feel no acceleration. If one finds the maximal dynamical group applicable to the zero-acceleration equations of motion, the problematic case you mentioned can be handled by taking a limit afterwards.
 
  • #120
Fredrik said:
I have some concerns about this part. Maybe there is some circularity in the argument after all. It doesn't seem obvious* that the "special" curves in spacetime that represent non-accelerated motion should include curves that correspond to infinite speed in some inertial coordinate system. If we leave them out, then what I call an inertial coordinate transformation will be a map that takes finite-speed straight lines to finite-speed straight lines. Of course, inertial coordinate transformations in SR (i.e. Poincaré transformations) can take infinite-speed lines to finite-speed lines and vice versa. If inertial coordinate transformations can't do this, there's no relativity of simultaneity. So if we leave out the infinite-speed lines from the start, we will come to the conclusion that there's only one possibility: The group is the Galilean group. (Hm, maybe there will actually be infinitely many possibilities, distinguished by what exactly they're doing to infinite-speed lines).

Do we have a reason to include infinite-speed lines other than that we know what we want the final answer to be?

*) Recall that the main reason why we need spacetime to include that special set of curves is that they (or at least some of them) are to represent the motions of "observers" that are minimally disturbed by what's being done to them. (An "observer" here is not necessarily conscious. It could be a measuring device).

Why do you think relativity of simultaneity implies nonlinear transformations (taking finite to infinite coordinates and vice versa)?

AFAIK RoS has always been explained with the usual linear Lorentz transformations.
 
  • #121
strangerep said:
This sort of thing is one reason why I prefer to start from inertial observers defined as those that feel no acceleration.
But that's what I do. That doesn't solve the problem. Now that I think about it, it makes things slightly worse than I understood when I wrote my previous post.

We are looking for theories in which there's a set K of curves in M (i.e. in spacetime) such that each member of K represents a possible motion of an accelerometer that measures 0. A global inertial coordinate system should be a bijection from M into ℝ4 that takes every one of those curves to a straight line. But we can't take this as the definition of a global inertial coordinate system, because we know that in SR, those curves are all timelike, and a global inertial coordinate system in SR also takes spacelike geodesics to straight lines.

I think we need to leave the term "global inertial coordinate system" partially undefined at this point. We can define it properly after we have found a group of inertial coordinate transformations.

The partial definition of "global inertial coordinate system" doesn't imply that inertial coordinate transformations take *all* straight lines to straight lines. It just implies that there's a set L of straight lines such that each member of L is taken to a member of L.

It does seem natural to also require that every inertial coordinate transformation takes all constant-velocity motions to constant-velocity motions, but this assumption doesn't pin down what an inertial coordinate transformation does to an infinite-speed straight line.

strangerep said:
If one finds the maximal dynamical group applicable to the zero-acceleration equations of motion, the problematic case you mentioned can be handled by taking a limit afterwards.
After we have found the group? But we used the assumption that *all* straight lines are taken to straight lines to find the group. Also, limits require a topology. If we could do this the way I originally intended (as described in the long list a few posts back), we would find, without any assumptions about topology, that inertial coordinate transformations are affine. Since affine maps are continuous with respect to the Euclidean topology, this could even be thought of as justification for choosing the Euclidean topology later.
 
  • #122
TrickyDicky said:
Why do you think relativity of simultaneity implies nonlinear transformations?
I don't. I said that there's no relativity of simultaneity if inertial coordinate transformations can't take infinite-speed straight lines to finite-speed straight lines. In other words, there's no relativity of simultaneity if inertial coordinate transformations can't change the slope of a horizontal line in a spacetime diagram. You don't need a non-linear transformation to change the slope of a horizontal line. A Lorentz transformation with non-zero velocity will do fine.
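Here is a minimal sketch of that statement in 1+1 dimensions, assuming units with c = 1 and a hypothetical sample of events on the line t = 0. The helper `slope` is an illustrative name. The linear Lorentz boost tilts the horizontal line to ##t'=-vx'##, while the Galilean boost leaves it horizontal.

```python
import numpy as np

v = 0.5                                   # relative velocity, units with c = 1
gamma = 1.0 / np.sqrt(1.0 - v**2)
lorentz = gamma * np.array([[1.0, -v], [-v, 1.0]])   # acts on columns (t, x)
galilei = np.array([[1.0, 0.0], [-v, 1.0]])

# Events on the horizontal ("infinite-speed") line t = 0: columns are (t, x).
x = np.linspace(-2.0, 2.0, 5)
line = np.vstack([np.zeros_like(x), x])

def slope(events):
    """dt/dx of a straight line given as columns (t, x)."""
    t, xs = events
    return np.polyfit(xs, t, 1)[0]

slope_lorentz = slope(lorentz @ line)     # tilted: t' = -v x'
slope_galilei = slope(galilei @ line)     # still horizontal: t' = 0
print(slope_lorentz, slope_galilei)
```

Events that are simultaneous in the first coordinate system get different t' values after the Lorentz boost, and no non-linear map was needed to produce that.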
 
  • #123
Fredrik said:
I don't. I said that there's no relativity of simultaneity if inertial coordinate transformations can't take infinite-speed straight lines to finite-speed straight lines.

OK, so I don't know why you bring this up. There is no such thing as infinite speed in SR, hence the relativity of simultaneity; there's no transformation from spacelike vectors to timelike ones.
 
  • #124
TrickyDicky said:
OK, so I don't know why you bring this up. There is no such thing as infinite speed in SR, hence the relativity of simultaneity; there's no transformation from spacelike vectors to timelike ones.
Who said anything about spacelike to timelike?

I'm just saying that I haven't yet seen a satisfactory way to explain why inertial coordinate transformations should take *all* straight lines to straight lines. For example, why should the straight line with (t,x,y,z) coordinates t=0, x=0, y=z be taken to a straight line?
 
  • #125
Fredrik said:
Who said anything about spacelike to timelike?

I'm just saying that I haven't yet seen a satisfactory way to explain why inertial coordinate transformations should take *all* straight lines to straight lines. For example, why should the straight line with (t,x,y,z) coordinates t=0, x=0, y=z be taken to a straight line?
Hmmm, I think I see what you mean, and I'm not sure there is a satisfactory way using your path.

But of course if you took as starting point a flat spacetime that would trivially come from the fact that all geodesic lines in such a space are straight lines by definition. Still, this begs the question why should one choose such a space for SR in the first place. And the only answer is the postulates which are arbitrary to some extent.
 
  • #126
I think I have an argument that works. Consider the set of "vertical" lines (t variable, x,y,z constant) through a segment on that line. They would represent the motion of the component parts of a thin non-accelerating rod, in a comoving inertial coordinate system. Their union is the "world sheet" of the rod. We have already assumed that all the finite-speed lines in the world sheet, including the ones with arbitrarily high speeds, are taken to straight lines.

Suppose that the world sheet's intersection with the t=x=0 plane (i.e. the line I described) isn't taken to a straight line. Then the images of the vertical lines under the inertial coordinate transformation have discontinuities at the t=x=0 plane, and a straight line with one point removed and put somewhere else isn't really a straight line. So this contradicts the assumption that finite-velocity straight lines are taken to straight lines, and this means that the line I described must be taken to a straight line.

My explanation may not be perfectly clear, but I have a spacetime diagram in my head that seems clear enough, so I think this idea works even if I didn't explain it well enough.
 
