Sunday, July 15, 2018

Tensor Invariants

\(\newcommand{\rar}{\rightarrow} \newcommand{\mb}{\mathbb} \newcommand{\mf}{\mathfrak}\)We've mentioned an invariant for \(SO(3,\mb R)\) before, the object \(x\otimes x + y\otimes y + z\otimes z\). We're going to look at more invariants for group and Lie algebra actions.
Recall that we said that \(x\otimes x  + y\otimes y + z\otimes z\) is invariant because for any \(g \in SO(3,\mb R)\), \(g\) sends \(x \otimes x + y\otimes y + z\otimes z\) to itself, where \(SO(3, \mb R)\) acts on \(\mb R^3 \otimes \mb R^3\) using the action we get from the coalgebra structure of \(SO(3, \mb R)\).
We generalize this notion for other groups: for a representation \((V, \rho)\) of a group \(G\), we say that a vector \(v \in V\) is invariant under the action of \(G\) if for all \(g \in G\), 
$$\rho(g)v = v.$$
Given a representation \((V, \rho)\) of \(G\), we say that a tensor \(T\) is a tensor invariant of \(G\) if it is invariant as an element of \(V^{\otimes m} \otimes (V^\vee)^{\otimes k}\) with the action given by the coalgebra structure of \(G\).
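To make this concrete, here's a quick numerical check of the \(SO(3,\mb R)\) example above (a numpy sketch, identifying \(x, y, z\) with the standard basis of \(\mb R^3\)) that a rotation leaves \(x\otimes x + y\otimes y + z\otimes z\) unchanged:

```python
import numpy as np

# A sample element of SO(3): rotation by an arbitrary angle about the z-axis.
theta = 0.7
R = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])

# x⊗x + y⊗y + z⊗z in coordinates: T[i, j] = sum_k (e_k)_i (e_k)_j,
# i.e. the identity matrix viewed as a 2-tensor.
T = sum(np.outer(e, e) for e in np.eye(3))

# The action of g on V⊗V applies R to each tensor factor: T'_{ij} = R_{ia} R_{jb} T_{ab}.
T_acted = np.einsum('ia,jb,ab->ij', R, R, T)

assert np.allclose(T_acted, T)  # the tensor is invariant
```

The same check with a matrix outside \(O(3)\) (say, a shear) fails, which is exactly the point: the invariant tensor singles out the group.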

Invariants of group actions provide a good way to understand the groups in question. Invariant tensors describe structures that are preserved by the group, giving a connection between symmetry and algebra and geometry.

We can generalize the notion of an invariant further to Hopf algebras: given a representation \((V, \rho)\) of a Hopf algebra \(H\), we say that a vector \(v \in V\) is invariant under the action of \(H\) if for all \(h \in H\), 
$$\rho(h)v = \epsilon(h)v.$$
The appearance of \(\epsilon(h)v\) as opposed to just \(v\) allows for linearity. Similarly for the notion of an invariant tensor.
So for instance, for a Lie algebra \(\mf g\), we would say that a vector is invariant under the Lie algebra action if for all \(X \in \mf g\),
$$\rho(X)v = 0.$$

Let's look at some examples.

Every action leaves the identity tensor invariant, essentially because the action is well defined: if two vectors are equal, then they remain equal after applying the action, so the identity map \(V \rar V\) commutes with every \(\rho(g)\). So \(I_i^j e_j \otimes e^i \in V\otimes V^\vee\) is an invariant for any representation \((V,\rho)\).

For \(SO(n)\), we have the example above of the tensor \(g = g_{ij}e^i \otimes e^j \in V^\vee \otimes V^\vee\), where \(V = \mb k^n\). In the standard basis, we would have \(g_{ij}\) be 1 if \(i = j\) and 0 otherwise, but if we're not fixing a basis then we can't guarantee these values; all we can say is that \(g_{ij} = g_{ji}\).
We also have a similar tensor, living in the dual space, \(g^{ij}\), that is also invariant, where we define \(g^{ij}g_{jk} = I_k^i\).
In fact, given such a pair of tensors, we can actually define \(O(n, \mb k)\) as the largest matrix group in \(End(\mb k^n)\) that leaves these tensors invariant. We don't technically need both tensors; what we do need is for \(g\) to be nondegenerate; in other words, for any nonzero \(v \in V\), there has to be some \(u \in V\) such that \(g(u, v)\) is nonzero.
For \(C_n\) we have a similar tensor \(\omega = \omega_{ij}e^i \otimes e^j\), where \(\omega_{ij} = -\omega_{ji}\) and again we demand that \(\omega\) be nondegenerate. We call this the symplectic form. The standard basis has \(\omega(e_i, e_{i + n}) = 1 = -\omega(e_{i+n}, e_i)\) for \(1 \leq i \leq n\), and 0 for all other pairs of indices. We define \(C_n\) to be the largest group that leaves \(\omega\) invariant.
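As a sanity check, here is a numpy sketch: invariance of the lower-index tensor \(\omega_{ij}\) under a matrix \(S\) amounts to \(S^T \omega S = \omega\), which we can verify for a sample symplectic matrix (the block-diagonal construction below is one easy way to manufacture one):

```python
import numpy as np

n = 2  # V is 2n = 4 dimensional
# The standard symplectic form: omega(e_i, e_{i+n}) = 1 = -omega(e_{i+n}, e_i).
I = np.eye(n)
Z = np.zeros((n, n))
omega = np.block([[Z, I], [-I, Z]])

# A sample symplectic matrix: [[A, 0], [0, (A^T)^{-1}]] preserves omega
# for any invertible A.
A = np.array([[2.0, 1.0], [1.0, 1.0]])
S = np.block([[A, Z], [Z, np.linalg.inv(A).T]])

# Invariance of the lower-index tensor omega_{ij} is the statement S^T omega S = omega.
assert np.allclose(S.T @ omega @ S, omega)
# omega is antisymmetric and nondegenerate:
assert np.allclose(omega, -omega.T)
assert abs(np.linalg.det(omega)) > 0
```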
For \(g_{ij}\) and \(\omega_{ij}\), the nondegeneracy allows us to match elements of \(V\) with \(V^\vee\) in a way that is invariant under the respective groups. Normally, for a basis of \(V\) the dual basis transforms in the opposite way so the matching in one basis does not correspond to a matching in another. Hence for most groups, the action on \(V\) and the corresponding action on \(V^\vee\) are inequivalent. But for \(O(n)\) and \(C_n\) they are equivalent.
Another important tensor is the Levi-Civita tensor, \(\varepsilon\). For a vector space \(V\) of dimension \(n\), \(\varepsilon = \varepsilon_{i_1i_2\ldots i_n}\) lives in \((V^\vee)^{\otimes n}\). It takes in \(n\) vectors and returns the signed volume of the \(n\)-dimensional parallelepiped that those vectors give the edges of. If we work with \(\mb k = \mb R\), we often call the Levi-Civita form the volume form.
Since matrices with determinant 1 don't change the volume of such an object, we get that elements of \(SL(n)\) leave this tensor invariant, and conversely \(SL(n)\) is the set of all matrices that have this as an invariant.
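We can check this numerically; the sketch below builds \(\varepsilon\) for \(n = 3\) and verifies the classical identity \(\varepsilon_{abc}M^a_iM^b_jM^c_k = \det(M)\,\varepsilon_{ijk}\) for a sample determinant-1 matrix, so for \(SL(3)\) the tensor is unchanged:

```python
import numpy as np
from itertools import permutations

def perm_sign(p):
    # Parity of a permutation, computed by counting inversions.
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

n = 3
eps = np.zeros((n,) * n)
for p in permutations(range(n)):
    eps[p] = perm_sign(p)

# A sample element of SL(3): integer matrix with determinant 1.
M = np.array([[2.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
assert np.isclose(np.linalg.det(M), 1.0)

# eps_{abc} M^a_i M^b_j M^c_k = det(M) eps_{ijk}; with det(M) = 1,
# the Levi-Civita tensor is left invariant.
eps_acted = np.einsum('abc,ai,bj,ck->ijk', eps, M, M, M)
assert np.allclose(eps_acted, eps)
```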
The Levi-Civita tensor is fully antisymmetric, in that swapping any two indices gives you a minus sign.
 It turns out that in dimension \(2n\), \(\varepsilon\) can be built out of copies of the \(\omega\) tensor due to the nondegeneracy condition, so \(\varepsilon\) is automatically invariant under \(C_n\).

Also recall the map \(Asym_n\) that takes an element of \(V^{\otimes n}\) and returns its antisymmetrization:
$$Asym_2(u\otimes v) = \frac{1}{2}(u \otimes v - v \otimes u)$$
$$Asym_3(u \otimes v \otimes w) = \frac{1}{3!}(u \otimes v \otimes w - u\otimes w\otimes v + v\otimes w\otimes u - \ldots)$$
\(Asym\) can be written in tensor indices, with \(n\) lower indices and \(n\) upper indices. We get that, for \(S_n\) the group of permutations of the numbers \(1,\ldots, n\), we have 
$$Asym_{i_1i_2\ldots i_n}^{j_1j_2\ldots j_n} = \frac{1}{n!}\sum_{\sigma \in S_n} sign(\sigma) I_{i_1}^{j_{\sigma(1)}} I_{i_2}^{j_{\sigma(2)}}\cdots I_{i_n}^{j_{\sigma(n)}}$$
where \(sign(\sigma)\) is \(+1\) or \(-1\) according to whether it takes an even or an odd number of swaps to get from the ordering \(1,\ldots, n\) to \(\sigma(1),\ldots, \sigma(n)\).
\(\varepsilon\) has a dual such that 
\(\varepsilon_{i_1i_2\ldots i_n}\varepsilon^{j_1j_2\ldots j_n} = Asym_{i_1i_2\ldots i_n}^{j_1j_2\ldots j_n}\)
Note that there is no contraction here.
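Here's a numerical sketch of this identity for \(n = 3\), taking the dual to be \(\varepsilon/n!\) (that normalization is an assumption I'm making so the identity holds on the nose):

```python
import numpy as np
from itertools import permutations, product
from math import factorial

def perm_sign(p):
    # Parity of a permutation, via counting inversions.
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

n = 3
eps = np.zeros((n,) * n)
for p in permutations(range(n)):
    eps[p] = perm_sign(p)

# Asym with n lower and n upper indices, built from products of identity tensors:
# Asym[i1..in, j1..jn] = (1/n!) sum_sigma sign(sigma) prod_k delta(i_k, j_{sigma(k)}).
asym = np.zeros((n,) * (2 * n))
for idx in product(range(n), repeat=2 * n):
    i, j = idx[:n], idx[n:]
    asym[idx] = sum(perm_sign(s) * all(i[k] == j[s[k]] for k in range(n))
                    for s in permutations(range(n))) / factorial(n)

# The outer product (note: no contraction) of eps with its dual eps/n!
# reproduces the antisymmetrizer.
outer = np.einsum('abc,ijk->abcijk', eps, eps / factorial(n))
assert np.allclose(outer, asym)
```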
\(Asym_n\) is a tensor invariant for all \(n\), being built from identities. In fact, all of the maps built from partial antisymmetrizations and partial symmetrizations are tensor invariants, and are one way to extract simpler tensor invariants from more complicated ones.

The other simple Lie groups can also be expressed in terms of their invariant tensors for various representations, but these are more complicated.

Symmetrizers and Antisymmetrizers

\(\newcommand{\mb}{\mathbb}\)Let's talk about permutations.

Suppose we have 3 things, which we'll denote by \(a, b\) and \(c\). We can line them up in a row as \(abc\), or we can reorder them into, say, \(bca\), or \(cba\), and so on.
The permutations, not the orderings themselves but the movement from one order to another, form a group, in this case \(S_3\).
Every permutation can be achieved by repeatedly swapping two elements at a time. So we can get from \(abc\) to \(bca\) by starting with \(abc\) and first swapping \(a\) and \(b\) to get \(bac\) and then swapping \(a\) and \(c\) to get to \(bca\). For a permutation \(\sigma\), define \(l(\sigma)\) to be the minimum number of two-element swaps needed to make \(\sigma\).

Now consider a tensor \(t_{abc}\). From it we can make a new tensor, \(t_{bca}\), or \(t_{cba}\), and so on.
We define a symmetrizer \(Sym_3\) to act on \(t_{abc}\) by sending it to
$$\frac{1}{3!}\sum_{\sigma \in S_3} t_{\sigma(a)\sigma(b)\sigma(c)} = \frac{1}{6}(t_{abc} + t_{bac} + t_{bca} + \cdots )$$ We also define an antisymmetrizer \(Asym_3\) to act on \(t_{abc}\) by sending it to
$$\frac{1}{3!}\sum_{\sigma \in S_3} (-1)^{l(\sigma)} t_{\sigma(a)\sigma(b)\sigma(c)} = \frac{1}{6}(t_{abc} - t_{bac} + t_{bca} - \cdots )$$ A few things to note about \(Sym_3\) and \(Asym_3\): they're idempotent, in other words \(Sym_3 \circ Sym_3 = Sym_3, Asym_3 \circ Asym_3 = Asym_3\). They're also self-adjoint, which I'm not going to explain here because it's not terribly interesting for my purposes. But it does mean that we can view \(Sym_3\) and \(Asym_3\) as projection operators, projecting from general \(3\)-index tensors to subspaces of symmetric and antisymmetric 3-index tensors.
Also, they're orthogonal, in that \(Asym_3 \circ Sym_3 = Sym_3 \circ Asym_3 = 0\); the two subspaces intersect only in the zero tensor.
We can, of course, talk about symmetrizers and antisymmetrizers for two index tensors, or four index tensors. For any \(n\), we can consider
$$Sym_n(t_{a_1\ldots a_n}) = \frac{1}{n!}\sum_{\sigma \in S_n} t_{\sigma(a_1)\ldots\sigma(a_n)}$$
$$Asym_n(t_{a_1\ldots a_n}) = \frac{1}{n!}\sum_{\sigma \in S_n} (-1)^{l(\sigma)}t_{\sigma(a_1)\ldots\sigma(a_n)}$$
and again we have that
$$Sym_n \circ Sym_n = Sym_n, \quad Asym_n \circ Asym_n = Asym_n$$
$$Asym_n \circ Sym_n = Sym_n \circ Asym_n = 0$$
We can also talk about partial symmetrizers and partial antisymmetrizers, which act on only some of the indices of a tensor. We note, for instance, that applying a partial symmetrizer to some indices, and then another partial symmetrizer to a subset of those indices, is the same as just applying the first symmetrizer; and that applying a partial symmetrizer and a partial antisymmetrizer can yield something other than 0, but only if they don't have more than one index in common.
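These operators are easy to implement and test numerically; the following numpy sketch checks idempotence and orthogonality on a random 3-index tensor:

```python
import numpy as np
from itertools import permutations
from math import factorial

def perm_sign(p):
    # Parity of a permutation, via counting inversions.
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def sym(t):
    # Average of t over all permutations of its indices.
    k = t.ndim
    return sum(np.transpose(t, p) for p in permutations(range(k))) / factorial(k)

def asym(t):
    # Signed average of t over all permutations of its indices.
    k = t.ndim
    return sum(perm_sign(p) * np.transpose(t, p) for p in permutations(range(k))) / factorial(k)

t = np.random.default_rng(0).normal(size=(3, 3, 3))

# Idempotence: Sym and Asym are projections.
assert np.allclose(sym(sym(t)), sym(t))
assert np.allclose(asym(asym(t)), asym(t))
# Orthogonality: each kills the other's image.
assert np.allclose(asym(sym(t)), 0)
assert np.allclose(sym(asym(t)), 0)
```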

So for any \(n\), we can look for projection operators on \(n\)-index tensors that we can build out of (partial) symmetrizers and antisymmetrizers, and indeed look for minimal projection operators, so that for any projection operator \(P\) in our collection and any other projection operator \(Q\) on \(n\)-index tensors built from symmetrizers and antisymmetrizers, \(P \circ Q = Q \circ P\), and the result is either \(P\) or \(0\).
There's a well-known combinatorial setup for finding such sets of operators, usually via what are called Young Tableaux. Briefly, an \(n\)-box Young Tableau is a collection of \(n\) square boxes arranged in rows and columns so that each row is at least as long as the one below it and each column is at least as tall as the one to the right of it. (Strictly speaking, the bare shape is usually called a Young diagram, and it becomes a tableau once the boxes are filled in, but I'll blur the distinction here.) If you take a Young Tableau and flip it along the diagonal, you get another Young Tableau which I'll call the conjugate Young Tableau.
We can associate \(n\)-box Young Tableaux with projections on \(n\)-index tensors by taking a Young Tableau and associating each box with an index. Then for each row, we symmetrize over the indices corresponding to the boxes in the row. Once we've finished with the rows, then for each column we antisymmetrize over the indices corresponding to the boxes in the column.
The result is a projection that is orthogonal to all other projections made from \(n\)-box Young Tableaux, and it is minimal amongst projections made from symmetrizers and antisymmetrizers.
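For a concrete instance, here is a numpy sketch for the shape \([2,1]\) (boxes 1 and 2 in the first row, box 3 below box 1): symmetrize the first two indices, then antisymmetrize the first and third. One wrinkle worth flagging: with the normalized symmetrizers used here, the composite satisfies \(P \circ P = \frac{3}{4}P\) rather than being idempotent on the nose, so it becomes an honest projection only after rescaling (the constant \(3/4\) is specific to this shape and normalization):

```python
import numpy as np

def sym01(t):
    # Symmetrize over the row indices (positions 0 and 1).
    return (t + t.swapaxes(0, 1)) / 2

def asym02(t):
    # Antisymmetrize over the column indices (positions 0 and 2).
    return (t - t.swapaxes(0, 2)) / 2

def young_21(t):
    # Row-symmetrize first, then column-antisymmetrize, as in the text.
    return asym02(sym01(t))

t = np.random.default_rng(1).normal(size=(4, 4, 4))
P_t = young_21(t)
PP_t = young_21(P_t)

# Quasi-idempotence: P∘P = (3/4) P, so (4/3)·young_21 is a projection.
assert np.allclose(PP_t, 0.75 * P_t)
```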
And there you have, essentially, the representation theory of \(SL(k)\). To get irreducible representations of \(SL(k)\), you take \((\mb C^k)^{\otimes n}\) with the standard action of \(SL(k)\) on it, and then apply various projections built from \(n\)-box Young Tableaux.
This also gives you representations of \(S_n\), in that you take a tensor \(t_{a_1,\ldots,a_n}\) with no symmetry properties, and take the vector space of linear combinations of \(t_{\sigma(a_1),\ldots,\sigma(a_n)}\) for all permutations \(\sigma \in S_n\); this gives you an \(n!\)-dimensional vector space on which \(S_n\) acts, and applying the various projections made from \(n\)-box Young Tableaux gets you irreducible representations of \(S_n\).
There's a nice interplay of the actions of \(SL(k)\) and \(S_n\) on \((\mb C^k)^{\otimes n}\) via what is called Schur-Weyl duality, but I'm not going to go into that here.

Representations of simple Lie algebras

\(\newcommand{\rar}{\rightarrow} \newcommand{\mb}{\mathbb} \newcommand{\mf}{\mathfrak}\)So here's where most of this has been heading.

We start with a simple Lie algebra, which we'll write as \(\mf l_n\). The best thing, representation-wise, about simple Lie algebras is that all finite-dimensional representations of \(\mf l_n\) decompose into irreducible representations. The next best thing is that the only 1-dimensional representations are trivial.
The \(n\) denotes several things, but one of the things it denotes is the number of "fundamental representations" of \(\mf l_n\), which I'll write as \(V_1, V_2,\ldots,V_n\). These are irreducible representations such that if we look at all tensor products of these fundamental representations, i.e. every representation of the form \(V_1^{\otimes k_1} \otimes \cdots \otimes V_n^{\otimes k_n}\), and every irreducible representation that these representations decompose into, we get every finite-dimensional irreducible representation of \(\mf l_n\).
For instance, let's look at \(su(2)\). Also known as \(\mf a_1\), it has 1 fundamental representation \(V\) which is 2-dimensional. From there we can generate \(V\otimes V\), which is 4-dimensional, and which decomposes into the trivial 1-dimensional representation and a 3-dimensional representation. \(V\otimes V\otimes V\) is 8-dimensional and decomposes into two copies of \(V\) and a 4-dimensional representation. And so on. We can think of this in terms of particle spin: each \(V\) is a spin-1/2 particle, so that \(V\otimes V\) is 2 spin-1/2 particles. The trivial representation is the particles cancelling, the 3-dimensional representation is when they act as a single spin-1 particle.
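We can verify the decomposition \(V \otimes V \cong 1 \oplus 3\) numerically by diagonalizing the Casimir element \(J^2\), which acts as \(s(s+1)\) on the spin-\(s\) component (a numpy sketch in the physics spin conventions):

```python
import numpy as np

# Spin-1/2 generators: S_i = sigma_i / 2, with sigma_i the Pauli matrices.
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
I2 = np.eye(2)

# Action on V ⊗ V via the coproduct: J_i = S_i ⊗ I + I ⊗ S_i.
J = [np.kron(S, I2) + np.kron(I2, S) for S in (sx, sy, sz)]

# The Casimir J^2 has eigenvalue s(s+1) on the spin-s component.
casimir = sum(Ji @ Ji for Ji in J)
eigs = np.sort(np.linalg.eigvalsh(casimir).round(6))

# One spin-0 (trivial) state and three spin-1 states: 2 ⊗ 2 = 1 ⊕ 3.
assert np.allclose(eigs, [0.0, 2.0, 2.0, 2.0])
```

The same computation on \(V^{\otimes 3}\) (with \(J_i\) summed over three factors) reproduces the \(2 \oplus 2 \oplus 4\) decomposition mentioned above.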
Another example is \(su(3)\), also known as \(\mf a_2\). The fundamental representations are two 3-dimensional representations, \(V\) and \(V^\vee\). Every finite-dimensional representation of \(su(3)\) is a component of some \(V^{\otimes k} \otimes (V^\vee)^{\otimes l}\). In terms of particles we should think that each copy of \(V\) is a quark and each copy of \(V^\vee\) is an antiquark. A color-neutral configuration of quarks corresponds to the tensor product having a trivial component somewhere.

Let's look at the functions involved: a representation of \(G\) on \(\mb C^n\) gives us a map \(\rho: G \rar End(\mb C^n)\), which is really a set of \(n^2\) maps from \(G\) to \(\mb C\), one for each entry in an \(n \times n\) matrix. So we can write it as \(\rho_i^j\).
We can add two such maps, or multiply them, and get more maps. We may not necessarily get more maps that show up in representations, but we nonetheless get more maps. So we get a ring of such functions.
Consider the tensor product of two representations \((V,\rho)\) and \((W,\sigma)\). The corresponding map is \(\rho \hat \otimes \sigma\), which sends \(G\) to \(End(V\otimes W)\). If we tack on indices, we get indices \(\rho_i^j\) for \(V\) and \(\sigma_k^l\) for \(W\), and so \(\rho \hat \otimes \sigma\) has indices \((\rho \hat \otimes \sigma)_{ik}^{jl}\) and factors as \(\rho_i^j \sigma_k^l\), as one would expect.
Moreover, since \(\rho \hat\otimes \sigma\) is equivalent to \(\sigma \hat \otimes \rho\) via a natural isomorphism, the corresponding functions look the same, and so the ring is commutative. There is an obvious unit function, that sends everything in \(G\) to \(1\), corresponding to the map for the trivial representation.
We also have a comultiplication, since we're looking at functions on a thing that has a multiplication. If we look at \(\rho(gh)\), we get \(\rho(g)\rho(h)\); in indices, we get \(\rho(gh)_i^j = \rho(g)_i^k \rho(h)_k^j\), so we get that \(\Delta(\rho_i^j) = \rho_i^k \otimes \rho_k^j\).
We also get a counit, which is to evaluate \(\rho_i^j\) at the identity in \(G\); since a representation always sends the identity to the identity, we get that \(\rho_i^j(e) = I_i^j\), i.e. is \(1\) if the values of \(i\) and \(j\) are equal, and 0 otherwise.
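A toy example where all of this structure can be checked concretely is \(G = SO(2)\) with \(\rho\) the defining representation; the sketch below verifies the comultiplication, counit, and antipode relations on its matrix coefficients:

```python
import numpy as np

def rho(theta):
    # Matrix coefficients of the defining representation of SO(2),
    # parametrizing the group element by its rotation angle.
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

g, h = 0.4, 1.1

# Comultiplication: rho_i^j(gh) = rho_i^k(g) rho_k^j(h).
assert np.allclose(rho(g + h), rho(g) @ rho(h))
# Counit: evaluating at the identity gives I_i^j.
assert np.allclose(rho(0.0), np.eye(2))
# Antipode: (S rho_i^j)(g) = rho_i^j(g^{-1}).
assert np.allclose(rho(-g), np.linalg.inv(rho(g)))
```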
Finally, we have an antipode, with \((S\rho_i^j)(g) = \rho_i^j(g^{-1})\).
So we get a Hopf algebra \(Rep(G)\). Note that if \(G\) is a Lie group, then a representation of \(G\) becomes a representation of \(\mf g\), so we can evaluate elements of \(Rep(G)\) on \(\mf g\), and thus on \(U(\mf g)\).
In fact, if \(\mf g\) is simple, then we're in luck: \(Rep(G)\) can serve as a dual of \(U(\mf g)\), in that for any nonzero element \(X\) of \(U(\mf g)\), there is an element \(f\) of \(Rep(G)\) such that \(f(X)\) is nonzero, and vice-versa.
One would think it would be a dual to \(\mb kG\), except if \(G\) is a Lie group that isn't 0-dimensional, then \(G\) is infinite and thus \(\mb kG\) is really badly behaved, so we use \(U(\mf g)\) instead.

Saturday, July 14, 2018

Some quantum mechanics

\(\newcommand{\mb}{\mathbb}\)Okay, not actual quantum mechanics. Rather, some aspects of particle physics that can be related to what has already been posted here.

Consider the basic multi-quark objects that we know of so far: baryons, which are things like protons and neutrons that are made of three quarks, and mesons, which are made of a quark and an antiquark.
The basic foundations of quantum mechanics say that to each particle involved we should assign a complex vector space for the possible states of the particle, and for multiple particles we should take the tensor product of those vector spaces for those particles to get a vector space of states for the set of particles.
So for each quark we have a vector space \(V\), and then a set of three quarks has the vector space \(V^{\otimes 3}\). An antiquark should then have the vector space \(V^\vee\), and so a meson should have \(V\otimes V^\vee\).

To each force we assign a group called a gauge group, and we say that if a vector is invariant under the action of that group, then the corresponding object or system of objects is stable with respect to that group. We say that quarks are bound to each other via the strong force, and now ask what is the gauge group of the strong force.

So what could the gauge group for the strong force be? What are our invariants? We have 3-quark systems that are stable, and quark-antiquark systems that are stable. And everything else that is stable is built from these two types of systems, so there aren't any other invariants; nothing in \(V\otimes V\), or in \(V\otimes V\otimes V^\vee\), etc.
So we have a corresponding invariant tensor in \(V^{\otimes 3}\) and an invariant tensor in \(V\otimes V^\vee\). We can guess that the tensor in \(V\otimes V^\vee\) is the identity tensor \(I_j^i e_i \otimes e^j\). What about the one in \(V^{\otimes 3}\)?
If we consider only groups that have an invariant with 3 lower indices, its dual with 3 upper indices, and the identity, and nothing else that can't be written in terms of those invariants, we end up with two possibilities: \(SL(3, \mb C)\) and \(E_6\). For \(SL(3, \mb C)\), \(V\) would be \(\mb C^3\) and the invariant with 3 lower indices is the Levi-Civita tensor. For \(E_6\) we have \(V = \mb C^{27}\) and the tensor is quite a bit more complicated, relating to the multiplication of a Jordan algebra.
So which one is it?
One more bit of information is that quarks, being fermions, prefer antisymmetric tensors. The Levi-Civita tensor is antisymmetric; swap any two indices and you pick up a minus sign. The tensor for \(E_6\) is symmetric. So the gauge group for quarks is \(SL(3, \mb C)\).

Only not quite \(SL(3, \mb C)\). The last bit of quantum mechanics is that we want everything to be unitary. One interpretation of these vectors is that \(v\cdot v^*\), where \(v^*\) is the complex conjugate of \(v\) and \(\cdot\) is the dot product, corresponds to a probability. So we want this quantity to not be affected by the symmetries of the situation.
Hence we want not \(SL(3, \mb C)\) but \(SL(3, \mb C)\cap U(3) = SU(3)\), which is the gauge group for the strong force.

We call the 3 basis directions for \(V\) "red", "green" and "blue", because 3 always means primary colors, except for when it doesn't. So a quark whose state vector is pointing in the red direction is said to have red color charge and so on. Antiquarks come in "antired", "antigreen" and "antiblue", dual to "red", "green" and "blue" respectively.
A quantum mechanical force is carried by particles whose state vectors occur in the Lie algebra of the gauge group for that force. So the force carriers for color, called gluons, occur in \(su(3)\), which is \(8\)-dimensional. The force carriers take a vector in \(V\) to a vector in \(V\), and so are in \(V\otimes V^\vee\), and thus come with a color and an anticolor. So we can talk about "red-antigreen" gluons or "blue-antired" gluons. Note that we don't talk about "red-antired" gluons, because that would require a matrix with nonzero trace, but everything in \(su(3)\) is traceless. Instead we can talk about, say, "red-antired - blue-antiblue" or "red-antired - green-antigreen" gluons, or "red-antired - (1/2)green-antigreen - (1/2)blue-antiblue" gluons.
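Concretely, a standard basis for the gluon states is given by the Gell-Mann matrices (in the physics convention these are Hermitian, with \(su(3)\) proper being their span times \(i\)); each is traceless, and there are eight of them:

```python
import numpy as np

# The eight Gell-Mann matrices, a standard (Hermitian) basis for gluon states.
l1 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=complex)
l2 = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])
l3 = np.array([[1, 0, 0], [0, -1, 0], [0, 0, 0]], dtype=complex)
l4 = np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], dtype=complex)
l5 = np.array([[0, 0, -1j], [0, 0, 0], [1j, 0, 0]])
l6 = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=complex)
l7 = np.array([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]])
l8 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, -2]], dtype=complex) / np.sqrt(3)
gell_mann = [l1, l2, l3, l4, l5, l6, l7, l8]

# Eight basis gluon states; each is Hermitian and traceless,
# which is why there is no "pure red-antired" gluon.
assert len(gell_mann) == 8
for l in gell_mann:
    assert np.isclose(np.trace(l), 0)
    assert np.allclose(l, l.conj().T)
```

Note that \(\lambda_3\) is exactly the "red-antired \(-\) green-antigreen" combination mentioned above.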
Oftentimes physicists will say that there are 8 gluons, by which they mean the space of gluons has 8 basis elements.