Linear Representations and Why Precision is Important in Math
First of all: What is a representation? It is the description of a mathematical object like a Lie group or a Lie algebra by its action on another space 1). We further want this action to preserve the given structure, because that structure is exactly what we are interested in. And this other space should be a vector space, since we want to deal with operators and transformations.
Our main examples shall be the special unitary group and its Lie algebra. The special unitary group ##SU(n)## is the group of isometries of an n-dimensional complex Hilbert space that preserve the volume form on this space. How is that? I thought it was ##SU(n)=\{\text{unitary matrices}\}##? To be a bit more precise $$SU(n)=\{A \in \mathbb{M}_n(\mathbb{C})\,\vert \, A\cdot A^\dagger = 1 \wedge \det(A)=1\}$$
Well, both are true. And the definition via matrices is already our first example of a representation. It suggests itself via the association
$$\textit{isometry} \rightarrow \textit{transformation} \rightarrow \textit{matrix}$$
However, it requires a basis according to which these matrices can be expressed: we turned the isometries into matrices. Why? To study their behavior on vectors, i.e. the geometric meaning by which we originally defined them. Unfortunately, this is also the point where the trouble starts, since it is not the only possible representation.
A representation of ##SU(n)## is a group homomorphism ##\varphi : SU(n) \longrightarrow GL(V)## into the group of regular, linear transformations of a vector space ##V##.
##V## itself is sometimes referred to as the representation for short, but the correct way would be to call the entire triplet ##(SU(n),V,\varphi)## the representation. I will occasionally call ##V## the representation space. Homomorphisms are nothing but a way of saying “preserves the structure”, which in this case is the group or matrix multiplication in ##SU(n)##. Therefore only regular transformations on ##V## are allowed here. In terms of a formula it says ##\varphi(AB) = \varphi(A) \cdot \varphi(B)##. It is exactly the same as saying “##SU(n)## acts on ##V##” or “##SU(n)## operates on ##V##”. It does so by ##\varphi##, and the formulation with a representation only emphasizes which rules this operation has to obey.
In particular, it means we can apply our matrices ##A## to vectors ##v \in \mathbb{C}^n## via ##\varphi(A)(v)=A.v=A \cdot v=Av##. But in general neither does ##V=\mathbb{C}^n## have to be chosen, nor does ##\varphi## have to be the application of a matrix to a vector to define a representation. It is only one special case, albeit an important one.
Another vector space that naturally comes to mind in these cases is the tangent space at ##1## of our Lie group ##SU(n)##. It is usually denoted by small gothic (fraktur – mathfrak) letters, so $$\mathcal{T}_1(SU(n))=\mathfrak{su}(n)=\{A \in \mathbb{M}_n(\mathbb{C})\,\vert \, A+A^\dagger = 0 \wedge \operatorname{tr}(A)=0\}$$
It is worth noting the similarities in both definitions: multiplication became addition, the identity matrix became the zero matrix, the determinant became the trace and 1 became 0. This is not by accident. It is the result of the exponential map that gets us from the tangent space back into the group. The other way around is by differentiating paths within the group and evaluating the differentials at 1.
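For readers who want to see the exponential map in action, here is a minimal numerical sketch in Python (NumPy/SciPy and the random construction of the tangent vector are assumptions of this illustration, not part of the mathematics): exponentiating a traceless skew-Hermitian matrix should land us in ##SU(2)##.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# A random tangent vector in su(2): skew-Hermitian with trace zero.
X = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
A = X - X.conj().T                   # now A + A^dagger = 0
A -= (np.trace(A) / 2) * np.eye(2)   # remove the (purely imaginary) trace

U = expm(A)  # the exponential map back into the group

print(np.allclose(U @ U.conj().T, np.eye(2)))  # unitary: True
print(np.isclose(np.linalg.det(U), 1.0))       # determinant 1: True
```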
Here is where it’s getting a bit messy. We have elements, i.e. tangent vectors, which are also complex ##n \times n##-matrices. But if ##A## is a skew-Hermitian matrix, then ##\mathrm{i}A## is Hermitian. Thus ##\mathfrak{su}(n)## cannot be viewed as a complex vector space: scalar multiplication by ##\mathrm{i}## would lead out of the structure. So let us consider it as a real vector space. We have ##2n^2## real matrix entries, two for every complex matrix entry. The conditions (##A_{ij}+\overline{A}_{ji}=0\, , \,i \neq j##) give us two linear equations (on the real and imaginary parts of the entries) for every entry above the main diagonal. Below it there are no additional requirements due to the symmetry in the indices. On the diagonal we have (with ##A_{ii}+\overline{A}_{ii}=0##) ##n## conditions for the real parts of our entries. And at last, the condition on the trace gives us another equation. This means we have
$$2n^2 - 2 \cdot \frac{n^2-n}{2} - n - 1 = n^2 - 1$$
real matrix entries as degrees of freedom and thus ##\dim_\mathbb{R}\mathfrak{su}(n)=n^2-1##.
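The count can be cross-checked by writing down an explicit basis. The following sketch (Python with NumPy; the helper name `su_basis` is mine and purely illustrative) constructs the standard basis of ##\mathfrak{su}(n)## and confirms that it consists of ##n^2-1## linearly independent matrices over ##\mathbb{R}##:

```python
import numpy as np

def su_basis(n):
    """A standard real basis of su(n): skew-Hermitian, traceless n x n matrices."""
    basis = []
    for i in range(n):
        for j in range(i + 1, n):
            E = np.zeros((n, n), dtype=complex)
            E[i, j] = 1
            basis.append(E - E.T)          # real antisymmetric part
            basis.append(1j * (E + E.T))   # imaginary symmetric part
    for i in range(n - 1):                 # traceless imaginary diagonals
        D = np.zeros((n, n), dtype=complex)
        D[i, i], D[n - 1, n - 1] = 1j, -1j
        basis.append(D)
    return basis

for n in (2, 3):
    B = su_basis(n)
    # Flatten into real coordinates and check linear independence over R.
    M = np.array([np.concatenate([b.real.ravel(), b.imag.ravel()]) for b in B])
    print(n, len(B), np.linalg.matrix_rank(M))   # prints n, n^2-1, n^2-1
```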
These matrices are in general not regular and don’t form a multiplicative group. They form, however, the real vector space of traceless skew-Hermitian matrices, which is a real Lie algebra: the Lie algebra ##\mathfrak{su}(n)## of the Lie group ##SU(n)##.
##\mathfrak{su}(n)## also has representations. This time “preserving the structure” means preserving the Lie algebra structure, which in terms of a formula is a Lie algebra homomorphism 2) ## \; \Phi : \mathfrak{su}(n) \longrightarrow \mathfrak{gl}(\mathfrak{su}(n))## over ##\mathbb{R}## with
$$\Phi([A,B])=[\Phi(A),\Phi(B)]=\Phi(A)\Phi(B)-\Phi(B)\Phi(A)$$
Groups and algebras are quite different objects and until now we only have the natural representations of the Lie group, resp. Lie algebra as matrices by the choice of a basis in ##\mathbb{C}^n##, the vector space these matrices can be applied to.
So on one hand they operate on the same n-dimensional complex vector space because they are complex matrices. On the other hand, however, the Lie group operates on the Lie algebra as one possible representation space, too.
It acts on its own tangent space. This representation in turn leads to a representation of the Lie algebra, and adjoint is the keyword here. But why should we bother with representations of the Lie algebra at all? As mentioned above, representations are operations on a vector space that help us understand the operating structure itself. In the case of ##SU(n)##, they additionally have a physical relevance as gauge groups in the Standard Model (SM).
Roughly speaking: One can decompose a finite-dimensional representation space ##V## of a (semi)simple Lie algebra like ##\mathfrak{su}(n)## into a direct sum of certain eigenspaces ##V_\lambda## (better: weight spaces or invariant subspaces) under the action of the Cartan subalgebra (CSA) (maximal toral subalgebra) of ##\mathfrak{su}(n)## with eigenvalues ##\lambda## (better: weights). The Lie algebra elements act by shifting between these so-called weight spaces. There are also a highest and a lowest weight, and the corresponding vectors are thus shifted to zero by certain Lie algebra elements. E.g. the weights of an irreducible ##\mathfrak{su}(2)## representation with ##\dim V = m+1## are in the range ##m, m-2,\ldots,-(m-2),-m##. [2] We will see that Lie algebras also act on themselves, so they, too, as a vector space, can be decomposed into eigenspaces. The corresponding eigenvalues here are called roots. And the eigenvectors of the highest, resp. the lowest root, are those certain elements. In the case of ##V=\mathfrak{su}(2)## we have ##m=2## and roots ##\{2,0,-2\}\,##3). It should be mentioned that in general an algebraically closed field like the complex numbers is needed to guarantee the existence of all eigenvalues and the commutativity of the CSA.
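This can be made concrete for the smallest case. The sketch below (Python with NumPy, an assumption of this illustration) computes the matrix of ##\mathfrak{ad}(H)## for the Cartan element ##H## of ##\mathfrak{sl}_\mathbb{C}(2,\mathbb{C})## in the standard basis ##\{H,E,F\}## and recovers the eigenvalues ##\{2,0,-2\}## from above:

```python
import numpy as np

# Standard basis of sl(2, C): H (Cartan element), E (raising), F (lowering).
H = np.array([[1, 0], [0, -1]], dtype=complex)
E = np.array([[0, 1], [0, 0]], dtype=complex)
F = np.array([[0, 0], [1, 0]], dtype=complex)
basis = [H, E, F]

def coords(M):
    """Coordinates of a traceless M in the basis {H, E, F} (read off the entries)."""
    return np.array([M[0, 0], M[0, 1], M[1, 0]])

# Matrix of ad(H): its columns are [H, b] expressed in the basis.
adH = np.column_stack([coords(H @ b - b @ H) for b in basis])
print(np.sort_complex(np.linalg.eigvals(adH)))   # -2, 0, 2 (0 belongs to the CSA)
```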
All these representations are important. There is another natural representation of any Lie algebra given by its left multiplication, the adjoint representation
$$\mathfrak{ad}: \mathfrak{su}(n) \longrightarrow \mathfrak{gl}(\mathfrak{su}(n))$$
$$\mathfrak{ad}: X \longmapsto (Y \mapsto [X,Y])$$
of linear transformations on itself as representation space. In terms of a formula, we have in addition to linearity the requirement
$$\mathfrak{ad}([X,Y])=[\mathfrak{ad}(X),\mathfrak{ad}(Y)]=\mathfrak{ad}(X)\mathfrak{ad}(Y)-\mathfrak{ad}(Y)\mathfrak{ad}(X)$$
which is simply the Jacobi identity 4). It becomes obvious if the formula above is applied to an element ##Z##. The mappings ##\mathfrak{ad}X## are called inner derivations of the Lie algebra and play a central role everywhere in Lie algebra theory, e.g. in their cohomology theory 5). They are closely related to the inner automorphisms (conjugations) of the Lie group which justifies the name. This means we arrive at the adjoint representation of a Lie algebra not only by its left multiplication but also by some differentiation process. What is the mapping to be differentiated? One may have guessed it: It is also called the adjoint representation ##Ad## of the Lie group.
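That ##\mathfrak{ad}## really is a Lie algebra homomorphism, i.e. that the displayed formula is the Jacobi identity in disguise, is easy to check numerically. A minimal sketch (Python with NumPy; the random matrices are just stand-ins for arbitrary Lie algebra elements):

```python
import numpy as np

rng = np.random.default_rng(1)

def comm(A, B):
    """Commutator [A, B]."""
    return A @ B - B @ A

# Three arbitrary matrices standing in for Lie algebra elements.
X, Y, Z = (rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
           for _ in range(3))

# ad([X,Y]) applied to Z ...
lhs = comm(comm(X, Y), Z)
# ... versus (ad(X)ad(Y) - ad(Y)ad(X)) applied to Z.
rhs = comm(X, comm(Y, Z)) - comm(Y, comm(X, Z))

print(np.allclose(lhs, rhs))  # True -- the Jacobi identity
```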
For a unitary matrix ##u## (so that ##u^\dagger = u^{-1}##) and a skew-Hermitian matrix ##A## we have ##(uAu^{-1})^\dagger = u A^\dagger u^{-1}## and therefore
$$uAu^{-1} + uA^\dagger u^{-1} = u(A+A^\dagger)u^{-1} =0$$
and thus ##SU(n)## operates on ##\mathfrak{su}(n)## (conjugation also preserves the zero trace), or in other words: ##\mathfrak{su}(n)## is a representation space of ##SU(n)## by conjugation with unitary matrices.
That is, the (inner) group automorphisms ##u \longmapsto (v \mapsto uvu^{-1})## induce in a natural way Lie algebra automorphisms which define the adjoint representation ##Ad## of the Lie group ##SU(n)## on its Lie algebra ##\mathfrak{su}(n)## as representation space ##V##.
$$Ad : SU(n) \longrightarrow GL(\mathfrak{su}(n))$$
$$Ad: u \longmapsto (A \mapsto uAu^{-1})$$
This is of course also true in general in the language of analytic manifolds and vector fields. However, in this general case there is more work to do than just listing algebraic properties. [1]
If differentiation gets us from ##Ad## to ##\mathfrak{ad}##, what about the other way around? Here we find a beautiful relation in which the exponential map, already mentioned above, comes into play once more: For ##A \in \mathfrak{su}(n)##
$$Ad(\exp(A)) = \exp(\mathfrak{ad}(A))$$
The Baker-Campbell-Hausdorff formula (BCH or CBH) is now almost around the corner.
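This relation, too, can be verified numerically. The sketch below (Python with NumPy/SciPy; the vectorization convention used to write ##\mathfrak{ad}(A)## as a matrix is an implementation detail of this illustration) compares both sides for a random ##A \in \mathfrak{su}(3)##:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)

def random_su(n):
    """A random element of su(n): skew-Hermitian and traceless."""
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    A = X - X.conj().T
    return A - (np.trace(A) / n) * np.eye(n)

A, B = random_su(3), random_su(3)

# Left-hand side: Ad(exp(A)) acting on B by conjugation.
lhs = expm(A) @ B @ expm(-A)

# Right-hand side: exp(ad(A)) acting on B.  In row-major vectorization
# vec(A X) = (A kron I) vec(X) and vec(X A) = (I kron A^T) vec(X).
I = np.eye(3)
adA = np.kron(A, I) - np.kron(I, A.T)
rhs = (expm(adA) @ B.ravel()).reshape(3, 3)

print(np.allclose(lhs, rhs))   # True: Ad(exp(A)) = exp(ad(A))
```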
Basis Vectors, Generators, Pauli and Gell-Mann
Let us finally consider the matrices named after Wolfgang Pauli and Murray Gell-Mann, for they belong in this context.
Groups don’t have a basis, they have generators.
Vector spaces don’t have generators, they have a basis.
This is more than just a pun because the allowed operations on these are completely different.
Generators as elements of a multiplicative group can be multiplied by each other and can be inverted. That’s it. In particular, they don’t commute in general. And addition is forbidden, for it normally leads outside the group; e.g. there is no zero element in a group.
Basis vectors (even if they are matrices, as in our case) can only be added, stretched, compressed, turned around, and added to others. These operations are commutative, and an additional multiplication between elements is only allowed if they form an algebra. But multiplication in an algebra is usually completely different from the one in groups; e.g. we usually don’t have a (multiplicative) inverse element in algebras. In Lie algebras, for example, we have ##[A,A]=0##, which is as far from a group multiplication as it can get.
These fundamental differences make it important to distinguish between the term generator and the term basis vector.
Let’s begin with the Gell-Mann matrices. They are defined as
$$\lambda_{1}=\begin{pmatrix}0&1&0\\1&0&0\\0&0&0\end{pmatrix} \, , \,\lambda_{2}=\begin{pmatrix}0&-\mathrm {i} &0\\\mathrm {i} &0&0\\0&0&0\end{pmatrix}\, , \,\lambda_{3}=\begin{pmatrix}1&0&0\\0&-1&0\\0&0&0\end{pmatrix}$$
$$\lambda_{4}=\begin{pmatrix}0&0&1\\0&0&0\\1&0&0\end{pmatrix}\, , \, \lambda_{5}=\begin{pmatrix}0&0&-\mathrm {i} \\0&0&0\\\mathrm {i} &0&0\end{pmatrix}
\, , \,\lambda_{6}=\begin{pmatrix}0&0&0\\0&0&1\\0&1&0\end{pmatrix}$$
$$\lambda_{7}=\begin{pmatrix}0&0&0\\0&0&-{\mathrm i}\\0&{\mathrm i}&0\end{pmatrix}
\, , \,\lambda _{8}=\frac {1}{\sqrt{3}}
\begin{pmatrix}1&0&0\\0&1&0\\0&0&-2\end{pmatrix}$$
It can be seen without any calculations that all but ##\lambda_8## are singular and thus cannot be elements of the (special) unitary group. But they have a trace of zero, and with a little more effort we see that complex conjugation and reflection across the main diagonal yield ##\lambda_k = \lambda_k^\dagger##, i.e. they are Hermitian. But this means they aren’t vectors of ##\mathfrak{su}(3)## either?! No, they aren’t. But if we build ##\mathrm{i}\lambda_k## instead, then we have skew-Hermitian matrices, i.e. elements of ##\mathfrak{su}(3)##. The commutator relations thus only differ by a minus sign.
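All of these claims are quickly confirmed by computation. A minimal sketch (Python with NumPy, assumed for the illustration) checks for each ##\lambda_k## that it is Hermitian and traceless, that ##\mathrm{i}\lambda_k## is skew-Hermitian, and that all but ##\lambda_8## are singular:

```python
import numpy as np

i = 1j
lam = [
    np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=complex),               # lambda_1
    np.array([[0, -i, 0], [i, 0, 0], [0, 0, 0]]),                             # lambda_2
    np.array([[1, 0, 0], [0, -1, 0], [0, 0, 0]], dtype=complex),              # lambda_3
    np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], dtype=complex),               # lambda_4
    np.array([[0, 0, -i], [0, 0, 0], [i, 0, 0]]),                             # lambda_5
    np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=complex),               # lambda_6
    np.array([[0, 0, 0], [0, 0, -i], [0, i, 0]]),                             # lambda_7
    np.array([[1, 0, 0], [0, 1, 0], [0, 0, -2]], dtype=complex) / np.sqrt(3), # lambda_8
]

for k, L in enumerate(lam, start=1):
    hermitian = np.allclose(L, L.conj().T)               # lambda_k = lambda_k^dagger
    traceless = np.isclose(np.trace(L), 0)               # trace zero
    skew = np.allclose(i * L + (i * L).conj().T, 0)      # i*lambda_k in su(3)
    singular = np.isclose(np.linalg.det(L), 0)           # False only for k = 8
    print(k, hermitian, traceless, skew, singular)
```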
As we’ve seen above, the dimension of the real Lie algebra ##\mathfrak{su}(3)## is ##8##, and the matrices ##\mathrm{i} \lambda_k## are thus a basis of ##\mathfrak{su}(3)##. It is important here to distinguish between the real Lie algebra ##\mathfrak{su}(3)## and the complex linear transformations on ##\mathbb{C}^3## which its elements, as complex skew-Hermitian matrices, define.
Now, what is meant by phrases like “the Gell-Mann matrices are representations of the infinitesimal generators of ##SU(3)##” or similar ones which are in regular use in physics? Unfortunately, they can be confusing and misleading. The reason is that all of the above (and more) is contained in one single phrase. Moreover, every single part of this phrase appears to be wrong at first glance.
Gell-Mann matrices don’t belong to ##SU(3)## – “infinitesimal” has to indicate that actually ##\mathfrak{su}(3)## is meant, but even then a factor ##\mathrm{i}## is needed. However, it is often convenient to work with Hermitian matrices instead, because of their algebraic properties. And since their entries are already complex numbers, the multiplication by ##\mathrm{i}## isn’t that exotic. But it changes the nature of the linear transformations on ##\mathbb{C}^3## from Hermitian to skew-Hermitian. Nevertheless, we consider them here as tangent vectors in the first place anyway.
Gell-Mann matrices themselves are not “generators” in the sense described above – the ##\mathrm{i}\lambda_k## are tangent (basis) vectors of generators of ##SU(3)## which span ##\mathfrak{su}(3)## over ##\mathbb{R}##.
Gell-Mann matrices themselves are not “representations” in the sense described above – ##\mathfrak{su}(3)## is the representation space of ##SU(3)## (among others). Nevertheless, it is an important one for the application in the context of gauge theories.
This doesn’t mean the definitions of generators and representations above are wrong. It means instead that the physicists’ phrase is a very, very condensed form of expressing all of that.
At last, I want to mention some operators which are defined with the help of the Gell-Mann matrices (and already display the multiplets in the SM):
$$F_k := \frac{1}{2} \lambda_k \, (k=1, \ldots , 8) \text{ and }$$
$$T_{\pm} = F_1 \pm iF_2$$
$$T_3 = F_3$$
$$U_{\pm} = F_6 \pm iF_7$$
$$V_{\pm} = F_4 \pm iF_5$$
$$Y =\frac{2}{\sqrt{3}} F_8$$
The operator ##T_3## is related to the third isospin component and the operator ##Y## to the hypercharge ##Y=B+S##; both are connected to the electric charge ##Q## by the Gell-Mann-Nishijima formula ##Q=\frac{1}{2}Y+T_3##. (cp. [5])
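As a quick sanity check, the ladder relations that make ##T_\pm, U_\pm, V_\pm## shift between weight spaces can be computed directly. A minimal sketch (Python with NumPy; the specific commutators I verify, such as ##[T_3,T_\pm]=\pm T_\pm## and ##[Y,T_\pm]=0##, are standard facts not derived in the text above):

```python
import numpy as np

def E(i, j):
    """Matrix unit with a 1 in row i, column j (0-based)."""
    M = np.zeros((3, 3), dtype=complex)
    M[i, j] = 1
    return M

def comm(A, B):
    return A @ B - B @ A

# Operators built from F_k = lambda_k / 2, written out directly as matrix units.
T3 = np.diag([0.5, -0.5, 0.0]).astype(complex)
Y  = np.diag([1/3, 1/3, -2/3]).astype(complex)
Tp, Tm = E(0, 1), E(1, 0)     # T_+- = F1 +- i F2
Up, Um = E(1, 2), E(2, 1)     # U_+- = F6 +- i F7
Vp, Vm = E(0, 2), E(2, 0)     # V_+- = F4 +- i F5

# T_+- raise/lower the T3 eigenvalue by 1; Y commutes with T_+-.
print(np.allclose(comm(T3, Tp), Tp))    # [T3, T+] = +T+
print(np.allclose(comm(T3, Tm), -Tm))   # [T3, T-] = -T-
print(np.allclose(comm(Y, Tp), 0))      # [Y, T+] = 0
print(np.allclose(comm(Y, Up), Up))     # [Y, U+] = +U+
print(np.allclose(comm(Y, Vp), Vp))     # [Y, V+] = +V+
```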
The Pauli matrices were originally defined as
$$\sigma_{1}=\begin{pmatrix}0&1\\1&0\end{pmatrix},\quad \sigma_{2}=\begin{pmatrix}0&-{\mathrm{i}}\\{\mathrm{i}}&0\end{pmatrix},\quad \sigma_{3}=\begin{pmatrix}1&0\\0&-1\end{pmatrix}$$
The situation here is analogous to the one above. However, there is one difference (besides the obvious difference in dimension).
It’s easy to verify that the Pauli matrices (like the Gell-Mann matrices) are Hermitian. We’ve already seen that this is no problem, because we can multiply them by ##\mathrm{i}## and get skew-Hermitian matrices if needed. So the matrices ##\mathrm{i}\sigma_k## form a basis of the ##3##-dimensional real Lie algebra ##\mathfrak{su}(2)##. Up to isomorphism there is only one complex (semi)simple Lie algebra of dimension ##3##: the Lie algebra of traceless complex ##2 \times 2##-matrices ##\mathfrak{sl}_\mathbb{C}(2,\mathbb{C})##. The Pauli matrices as well as their ##\mathrm{i}##-multiples can directly be taken as its (complex) basis. The reason I mention this is that representations of ##\mathfrak{sl}_\mathbb{C}(2,\mathbb{C})## are often treated as examples in textbooks about (semi)simple Lie algebras. It is the prototype of a simple Lie algebra with only a one-dimensional Cartan subalgebra, so there is exactly one positive (and thus highest) and one negative (and thus lowest) root. Therefore the representation theory of ##\mathfrak{sl}_\mathbb{C}(2,\mathbb{C})## can directly be applied to ##\mathfrak{su}_\mathbb{R}(2,\mathbb{C})##, whose complexification is ##\mathfrak{sl}_\mathbb{C}(2,\mathbb{C})##.
Now the difference between the Pauli matrices and the Gell-Mann matrices is that the Pauli matrices in the form 6) ##\mathrm{i}\sigma_k## are also unitary and have determinant ##1##. That is, they actually belong to ##SU(2)##. It is worth reading about the Pauli matrices in greater detail, for they reveal a lot of geometric and algebraic structures with respect to the complex numbers, the quaternions, group structures, and the complex matrix ring ##\mathbb{M}_2(\mathbb{C})##.
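A quick check of this difference (Python with NumPy, assumed for the illustration): each ##\sigma_k## has determinant ##-1##, while each ##\mathrm{i}\sigma_k## is unitary with determinant ##1## and hence lies in ##SU(2)##.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]]),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

for k, s in enumerate(sigma, start=1):
    U = 1j * s
    print(k,
          np.isclose(np.linalg.det(s), -1),        # det(sigma_k) = -1 (cf. footnote 6)
          np.allclose(U @ U.conj().T, np.eye(2)),  # i*sigma_k is unitary
          np.isclose(np.linalg.det(U), 1))         # det(i*sigma_k) = 1
```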
##\underline{Footnotes:}##
1) I will only consider representations of groups and algebras on finite-dimensional vector spaces here, exemplified by ##SU(n)## and ##\mathfrak{su}(n)##. ##\uparrow##
2) ##\mathfrak{gl}(\mathfrak{su}(n))## is the Lie algebra of linear maps from ##\mathfrak{su}(n)## into itself. It is the tangent space of the general linear group ##GL(\mathfrak{su}(n))##. ##\uparrow##
3) These are actually the completely classified irreducible representations of the three-dimensional complex Lie algebra of all complex ##(2 \times 2)##-matrices with trace zero, ##A_1 = \mathfrak{sl}_\mathbb{C}(2,\mathbb{C})##, whose compact real form is ##\mathfrak{su}_\mathbb{R}(2,\mathbb{C})=\mathfrak{su}(2)##. I should add that in case the eigenvalues are actually roots, that is, according to the adjoint representation ##\mathfrak{ad}##, zero is excluded, i.e. not called a root. The nullspace corresponds to the (Abelian) Cartan subalgebra. ##\uparrow##
4) One might observe the connection between the Jacobi identity and the product (Leibniz) rule for differentiation. They are basically the same thing. ##\uparrow##
5) For a quick impression on how apparently different concepts sometimes are deeply connected, see e.g. https://www.physicsforums.com/threads/why-the-terms-exterior-closed-exact.871875\#post-5474443 ##\uparrow##
6) Without the multiplication by ##\mathrm{i}## we have determinants equal to ##-1## which is not preserved under multiplication. ##\uparrow##
##\underline{Sources:}##
##[1]##https://www.amazon.com/Groups-Algebras-Representation-Graduate-Mathematics/dp/0387909699
##[2]##https://www.amazon.com/Introduction-Algebras-Representation-Graduate-Mathematics/dp/0387900535
##[3]##Appendix 1 on http://physik.uni-graz.at/~gxe/ss2013
##[4]##Appendix 3 on http://physik.uni-graz.at/~gxe/ss2013
##[5]##Second on http://www.uni-muenster.de/suche/de.cgi?q=SU%283%29&predefinedSearchArea=override&portalSearch=false (German)
##[6]## https://ncatlab.org/nlab/show/HomePage