Sunday, July 15, 2018

Tensor Invariants

\(\newcommand{\rar}{\rightarrow} \newcommand{\mb}{\mathbb} \newcommand{\mf}{\mathfrak}\)We've mentioned an invariant for \(SO(3,\mb R)\) before, the object \(x\otimes x + y\otimes y + z\otimes z\). We're going to look at more invariants for group and Lie algebra actions.
Recall that we said that \(x\otimes x  + y\otimes y + z\otimes z\) is invariant because every \(g \in SO(3,\mb R)\) sends \(x \otimes x + y\otimes y + z\otimes z\) to itself, where \(SO(3, \mb R)\) acts on \(\mb R^3 \otimes \mb R^3\) using the action we get from the coalgebra structure of \(SO(3, \mb R)\).
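To see this concretely, here's a quick numerical sketch (using numpy, with a rotation about the z-axis standing in for a general element of \(SO(3,\mb R)\)); the action on \(\mb R^3 \otimes \mb R^3\) is \(g \otimes g\), i.e. a Kronecker product:

```python
import numpy as np

# A rotation about the z-axis: one element of SO(3, R)
theta = 0.7
g = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])

# x(x)x + y(x)y + z(x)z as a vector in R^3 (x) R^3, i.e. R^9
e = np.eye(3)
t = sum(np.kron(e[i], e[i]) for i in range(3))

# The action on the tensor product is g (x) g, i.e. np.kron(g, g)
assert np.allclose(np.kron(g, g) @ t, t)
```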
We generalize this notion for other groups: for a representation \((V, \rho)\) of a group \(G\), we say that a vector \(v \in V\) is invariant under the action of \(G\) if for all \(g \in G\), 
$$\rho(g)v = v.$$
Given a representation \((V, \rho)\) of \(G\), we say that a tensor \(T\) is a tensor invariant of \(G\) if it is invariant as an element of \(V^{\otimes m} \otimes (V^\vee)^{\otimes k}\) with the action given by the coalgebra structure of \(G\).

Invariants of group actions provide a good way to understand the groups in question. Invariant tensors describe structures that are preserved by the group, tying the symmetry of the group to algebraic and geometric structure.

We can generalize the notion of an invariant further to Hopf algebras: given a representation \((V, \rho)\) of a Hopf algebra \(H\), we say that a vector \(v \in V\) is invariant under the action of \(H\) if for all \(h \in H\), 
$$\rho(h)v = \epsilon(h)v.$$
Using \(\epsilon(h)v\) rather than just \(v\) is what makes the condition linear in \(h\). The notion of an invariant tensor generalizes similarly.
So for instance, for a Lie algebra \(\mf g\), we would say that a vector is invariant under the Lie algebra action if for all \(X \in \mf g\),
$$\rho(X)v = 0.$$

Let's look at some examples.

Every representation leaves the identity tensor invariant: the action on \(V\otimes V^\vee\) amounts to conjugation of matrices, and conjugating the identity matrix gives back the identity matrix. So \(I_i^j e_j \otimes e^i \in V\otimes V^\vee\) is an invariant for any representation \((V,\rho)\).
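A minimal numpy sketch of that conjugation statement, with a random invertible matrix standing in for the image \(\rho(g)\) of a group element:

```python
import numpy as np

rng = np.random.default_rng(0)
rho_g = rng.normal(size=(4, 4))  # image of some group element; invertible with probability 1

# The identity tensor in V (x) V^dual, viewed as a matrix, is the identity matrix
I = np.eye(4)

# Acting on V (x) V^dual is conjugation: M -> rho(g) M rho(g)^{-1}
assert np.allclose(rho_g @ I @ np.linalg.inv(rho_g), I)
```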

For \(SO(n)\), we have the example above of the tensor \(g = g_{ij}e^i \otimes e^j \in V^\vee \otimes V^\vee\), where \(V = \mb k^n\). In the standard basis, we would have \(g_{ij}\) be 1 if \(i = j\) and 0 otherwise, but if we're not fixing a basis then we can't guarantee these values; all we can say is that \(g_{ij} = g_{ji}\).
We also have a similar invariant tensor \(g^{ij}e_i \otimes e_j\), living in \(V \otimes V\), defined by \(g^{ij}g_{jk} = I_k^i\).
In fact, given such a pair of tensors, we can actually define \(O(n, \mb k)\) as the largest matrix group in \(End(\mb k^n)\) that leaves these tensors invariant. We don't technically need both tensors; what we do need is for \(g\) to be nondegenerate; in other words, for any nonzero \(v \in V\), there has to be some \(u \in V\) such that \(g(u, v)\) is nonzero.
For \(C_n\) we have a similar tensor \(\omega = \omega_{ij}e^i \otimes e^j\) on \(V = \mb k^{2n}\), where \(\omega_{ij} = -\omega_{ji}\) and again we demand that \(\omega\) be nondegenerate. We call this the symplectic form. The standard basis has \(\omega(e_i, e_{i + n}) = 1 = -\omega(e_{i+n}, e_i)\) for \(1 \leq i \leq n\), and 0 for all other pairs of indices. We define \(C_n\) to be the largest group that leaves \(\omega\) invariant.
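As a quick sanity check, in the smallest case \(n = 1\) the symplectic group coincides with \(SL(2)\); here's a sketch verifying that a shear matrix satisfies \(S^T\omega S = \omega\), which is what leaving \(\omega\) invariant means in matrix form:

```python
import numpy as np

omega = np.array([[0., 1.],
                  [-1., 0.]])   # the standard symplectic form for n = 1

S = np.array([[1., 1.],
              [0., 1.]])        # a shear; det = 1, so S is in SL(2) = Sp(2)

# Leaving omega invariant means omega(Su, Sv) = omega(u, v), i.e. S^T omega S = omega
assert np.allclose(S.T @ omega @ S, omega)
```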
For \(g_{ij}\) and \(\omega_{ij}\), the nondegeneracy allows us to match elements of \(V\) with \(V^\vee\) in a way that is invariant under the respective groups. Normally, for a basis of \(V\) the dual basis transforms in the opposite way so the matching in one basis does not correspond to a matching in another. Hence for most groups, the action on \(V\) and the corresponding action on \(V^\vee\) are inequivalent. But for \(O(n)\) and \(C_n\) they are equivalent.
Another important tensor is the Levi-Civita tensor, \(\varepsilon\). For a vector space \(V\) of dimension \(n\), \(\varepsilon = \varepsilon_{i_1i_2\ldots i_n}\) lives in \((V^\vee)^{\otimes n}\). It takes in \(n\) vectors and returns the signed volume of the \(n\)-dimensional parallelepiped whose edges those vectors form. If we work with \(\mb k = \mb R\), we often call the Levi-Civita tensor the volume form.
Since matrices with determinant 1 don't change the volume of such an object, we get that elements of \(SL(n)\) leave this tensor invariant, and conversely \(SL(n)\) is the set of all matrices that have this as an invariant.
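Numerically, the invariance is the identity \(g^a_i g^b_j g^c_k \varepsilon_{abc} = \det(g)\,\varepsilon_{ijk}\); here's a sketch checking it for a random matrix rescaled to determinant 1:

```python
import numpy as np

# Build the 3-index Levi-Civita tensor
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0   # even permutations of (0,1,2)
    eps[i, k, j] = -1.0  # odd permutations

rng = np.random.default_rng(1)
g = rng.normal(size=(3, 3))
d = np.linalg.det(g)
g = g / (np.sign(d) * abs(d) ** (1 / 3))   # rescale so det(g) = 1, putting g in SL(3)

# Evaluate eps on transformed basis vectors: eps(g e_i, g e_j, g e_k)
eps_g = np.einsum('ai,bj,ck,abc->ijk', g, g, g, eps)
assert np.allclose(eps_g, eps)
```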
The Levi-Civita tensor is fully antisymmetric, in that swapping any two indices gives you a minus sign.
 It turns out that in dimension \(2n\), \(\varepsilon\) can be built out of copies of the \(\omega\) tensor due to the nondegeneracy condition, so \(\varepsilon\) is automatically invariant under \(C_n\).

Also recall the map \(Asym_n\) that takes an element of \(V^{\otimes n}\) and returns its antisymmetrization:
$$Asym_2(u\otimes v) = \frac{1}{2}(u \otimes v - v \otimes u)$$
$$Asym_3(u \otimes v \otimes w) = \frac{1}{3!}(u \otimes v \otimes w - u\otimes w\otimes v + v\otimes w\otimes u - \ldots)$$
\(Asym\) can be written in tensor indices, with \(n\) lower indices and \(n\) upper indices. We get that, for \(S_n\) the group of permutations of the numbers \(1,\ldots, n\), we have 
$$Asym_{i_1i_2\ldots i_n}^{j_1j_2\ldots j_n} = \frac{1}{n!}\sum_{\sigma \in S_n} sign(\sigma) I_{i_1}^{j_{\sigma(1)}} I_{i_2}^{j_{\sigma(2)}}\cdots I_{i_n}^{j_{\sigma(n)}}$$
where \(sign(\sigma) = (-1)^{l(\sigma)}\) and \(l(\sigma)\) is the number of swaps it takes to get from the ordering \(1,\ldots, n\) to \(\sigma(1),\ldots, \sigma(n)\).
\(\varepsilon\) has a dual tensor \(\varepsilon^{j_1j_2\ldots j_n}\) such that
$$\varepsilon_{i_1i_2\ldots i_n}\varepsilon^{j_1j_2\ldots j_n} = Asym_{i_1i_2\ldots i_n}^{j_1j_2\ldots j_n}.$$
Note that there is no contraction here.
\(Asym_n\) is a tensor invariant for all \(n\), being built from identities. In fact, all of the maps built from partial antisymmetrizations and partial symmetrizations are tensor invariants, and they give one way to extract simpler tensor invariants from more complicated ones.

The other simple Lie groups can also be expressed in terms of their invariant tensors for various representations, but these are more complicated.

Symmetrizers and Antisymmetrizers

\(\newcommand{\mb}{\mathbb}\)Let's talk about permutations.

Suppose we have 3 things, which we'll denote by \(a, b\) and \(c\). We can line them up in a row as \(abc\), or we can reorder them into, say, \(bca\), or \(cba\), and so on.
The permutations, not the orderings themselves but the movement from one order to another, form a group, in this case \(S_3\).
Every permutation can be achieved by repeatedly swapping two elements at a time. So we can get from \(abc\) to \(bca\) by starting with \(abc\) and first swapping \(a\) and \(b\) to get \(bac\) and then swapping \(a\) and \(c\) to get to \(bca\). For a permutation \(\sigma\), define \(l(\sigma)\) to be the minimum number of two-element swaps needed to make \(\sigma\).

Now consider a tensor \(t_{abc}\). From it we can make a new tensor, \(t_{bca}\), or \(t_{cba}\), and so on.
We define a symmetrizer \(Sym_3\) to act on \(t_{abc}\) by sending it to
$$\frac{1}{3!}\sum_{\sigma \in S_3} t_{\sigma(a)\sigma(b)\sigma(c)} = \frac{1}{6}(t_{abc} + t_{bac} + t_{bca} + \cdots )$$ We also define an antisymmetrizer \(Asym_3\) to act on \(t_{abc}\) by sending it to
$$\frac{1}{3!}\sum_{\sigma \in S_3} (-1)^{l(\sigma)} t_{\sigma(a)\sigma(b)\sigma(c)} = \frac{1}{6}(t_{abc} - t_{bac} + t_{bca} - \cdots )$$ A few things to note about \(Sym_3\) and \(Asym_3\): they're idempotent, in other words \(Sym_3 \circ Sym_3 = Sym_3, Asym_3 \circ Asym_3 = Asym_3\). They're also self-adjoint, which I'm not going to explain here because it's not terribly interesting for my purposes. But it does mean that we can view \(Sym_3\) and \(Asym_3\) as projection operators, projecting from general \(3\)-index tensors to subspaces of symmetric and antisymmetric 3-index tensors.
Also, they're orthogonal, in that \(Asym_3 \circ Sym_3 = Sym_3 \circ Asym_3 = 0\); the two subspaces intersect only in the zero tensor.
We can, of course, talk about symmetrizers and antisymmetrizers for two index tensors, or four index tensors. For any \(n\), we can consider
$$Sym_n(t_{a_1\ldots a_n}) = \frac{1}{n!}\sum_{\sigma \in S_n} t_{\sigma(a_1)\ldots\sigma(a_n)}$$ $$Asym_n(t_{a_1\ldots a_n}) = \frac{1}{n!}\sum_{\sigma \in S_n} (-1)^{l(\sigma)}t_{\sigma(a_1)\ldots\sigma(a_n)}$$ and again we have that $$Sym_n \circ Sym_n = Sym_n, Asym_n \circ Asym_n = Asym_n$$ $$Asym_n \circ Sym_n = Sym_n \circ Asym_n = 0$$We can also talk about partial symmetrizers and partial antisymmetrizers, which act on only some of the indices of a tensor. We note, for instance, that applying a (partial) symmetrizer to some indices, and then another partial symmetrizer to a subset of those indices is the same as just applying the first symmetrizer, and that applying a partial symmetrizer and a partial antisymmetrizer can yield something other than 0, but only if they don't have more than one index in common.
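Here's a numerical sketch of these identities, taking the indices to run over a 2-dimensional space so that \(Sym_3\) and \(Asym_3\) become \(8\times 8\) matrices:

```python
import itertools
import numpy as np

def perm_op(p, k=2):
    """Matrix of the operator that permutes the indices of an n-index tensor by p."""
    n = len(p)
    dim = k ** n
    M = np.zeros((dim, dim))
    for col in range(dim):
        e = np.zeros(dim)
        e[col] = 1.0
        M[:, col] = np.transpose(e.reshape([k] * n), axes=p).reshape(dim)
    return M

def sign(p):
    """(-1)^(number of inversions), which equals (-1)^(l(p))."""
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

S3 = list(itertools.permutations(range(3)))
Sym3 = sum(perm_op(p) for p in S3) / 6
Asym3 = sum(sign(p) * perm_op(p) for p in S3) / 6

assert np.allclose(Sym3 @ Sym3, Sym3)      # idempotent
assert np.allclose(Asym3 @ Asym3, Asym3)   # idempotent
assert np.allclose(Sym3 @ Asym3, 0)        # orthogonal
assert np.allclose(Asym3 @ Sym3, 0)
```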

So for any \(n\), we can look for projection operators on \(n\)-index tensors that we can build out of (partial) symmetrizers and antisymmetrizers, and indeed look for minimal projection operators: for any projection operator \(P\) in our collection and any other projection operator \(Q\) on \(n\)-index tensors built from symmetrizers and antisymmetrizers, \(P \circ Q = Q \circ P\), and the result is either \(P\) or \(0\).
There's a well-known combinatorial setup for finding such sets of operators, usually via what are called Young Tableaux. Briefly, an \(n\)-box Young Tableau is a collection of \(n\) square boxes arranged in rows and columns so that each row is at least as long as the one below it and each column is at least as tall as the one to the right of it. If you take a Young Tableau and flip it along the diagonal, you get another Young Tableau which I'll call the conjugate Young Tableau.
We can associate \(n\)-box Young Tableaux with projections on \(n\)-index tensors by taking a Young Tableau and associating each box with an index. Then for each row, we symmetrize over the indices corresponding to the boxes in the row. Once we've finished with the rows, then for each column we antisymmetrize over the indices corresponding to the boxes in the column.
The result is a projection that is orthogonal to all other projections made from \(n\)-box Young Tableaux, and it is minimal amongst projections made from symmetrizers and antisymmetrizers.
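To make this concrete, here's a sketch for the 3-box hook-shaped tableau, with boxes 1 and 2 in the first row and box 3 below box 1, acting on 3-index tensors over \(\mb C^2\). Strictly speaking the composite of the normalized symmetrizer and antisymmetrizer is only proportional to a projection, so rather than check idempotence we check its rank, which should be 2, the dimension of the mixed-symmetry irreducible representation of \(SL(2)\):

```python
import numpy as np

def perm_op(p, k=2):
    """Matrix of the operator permuting the indices of an n-index tensor by p (as in the previous sketch)."""
    n, dim = len(p), k ** len(p)
    M = np.zeros((dim, dim))
    for col in range(dim):
        e = np.zeros(dim)
        e[col] = 1.0
        M[:, col] = np.transpose(e.reshape([k] * n), axes=p).reshape(dim)
    return M

# Hook tableau: boxes 1,2 in the first row (indices 0,1),
# boxes 1,3 in the first column (indices 0,2); one conventional assignment.
Srow = (perm_op((0, 1, 2)) + perm_op((1, 0, 2))) / 2   # symmetrize the row indices 0,1
Acol = (perm_op((0, 1, 2)) - perm_op((2, 1, 0))) / 2   # antisymmetrize the column indices 0,2

P = Acol @ Srow
print(np.linalg.matrix_rank(P))   # 2: one copy of the mixed-symmetry irrep of SL(2)
```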
And there you have, essentially, the representation theory of \(SL(k)\). To get irreducible representations of \(SL(k)\), you take \((\mb C^k)^{\otimes n}\) with the standard action of \(SL(k)\) on it, and then apply various projections built from \(n\)-box Young Tableaux.
This also gives you representations of \(S_n\), in that you take a tensor \(t_{a_1,\ldots,a_n}\) with no symmetry properties, and take the vector space of linear combinations of \(t_{\sigma(a_1),\ldots,\sigma(a_n)}\) for all permutations \(\sigma \in S_n\); this gives you an \(n!\)-dimensional vector space on which \(S_n\) acts, and applying the various projections made from \(n\)-box Young Tableaux gets you irreducible representations of \(S_n\).
There's a nice interplay of the actions of \(SL(k)\) and \(S_n\) on \((\mb C^k)^{\otimes n}\) via what is called Schur-Weyl duality, but I'm not going to go into that here.

Representations of simple Lie algebras

\(\newcommand{\rar}{\rightarrow} \newcommand{\mb}{\mathbb} \newcommand{\mf}{\mathfrak}\)So here's where most of this has been heading.

We start with a simple Lie algebra, which we'll write as \(\mf l_n\). The best thing, representation-wise, about simple Lie algebras is that all finite-dimensional representations of \(\mf l_n\) decompose into irreducible representations. The next best thing is that the only 1-dimensional representations are trivial.
The \(n\) denotes several things, but one of the things it denotes is the number of "fundamental representations" of \(\mf l_n\), which I'll write as \(V_1, V_2,\ldots,V_n\). These are irreducible representations such that if we look at all tensor products of these fundamental representations, i.e. every representation of the form \(V_1^{\otimes k_1} \otimes \cdots \otimes V_n^{\otimes k_n}\), and every irreducible representation that these representations decompose into, we get every finite-dimensional irreducible representation of \(\mf l_n\).
For instance, let's look at \(su(2)\). Also known as \(\mf a_1\), it has 1 fundamental representation \(V\), which is 2-dimensional. From there we can generate \(V\otimes V\), which is 4-dimensional, and which decomposes into the trivial 1-dimensional representation and a 3-dimensional representation. \(V\otimes V\otimes V\) is 8-dimensional and decomposes into two copies of \(V\) and a 4-dimensional representation. And so on. We can think of this in terms of particle spin: each \(V\) is a spin-1/2 particle, so that \(V\otimes V\) is 2 spin-1/2 particles. The trivial representation is the two spins cancelling; the 3-dimensional representation is the two acting as a single spin-1 particle.
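A sketch of the \(su(2)\) story: taking \(-\frac{i}{2}\) times the Pauli matrices as the standard spin-1/2 generators, the vector \(e_1 \otimes e_2 - e_2 \otimes e_1\) spanning the trivial component of \(V \otimes V\) is killed by every generator, which is exactly the invariance condition \(\rho(X)v = 0\) from earlier:

```python
import numpy as np

# Standard spin-1/2 generators of su(2): -i/2 times the Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
generators = [-0.5j * s for s in (sx, sy, sz)]

# The singlet in V (x) V: e1 (x) e2 - e2 (x) e1
e1, e2 = np.eye(2)
singlet = np.kron(e1, e2) - np.kron(e2, e1)

I = np.eye(2)
for X in generators:
    # Tensor-product action from the coproduct: X (x) I + I (x) X
    X_hat = np.kron(X, I) + np.kron(I, X)
    assert np.allclose(X_hat @ singlet, 0)
```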
Another example is \(su(3)\), also known as \(\mf a_2\). The fundamental representations are two 3-dimensional representations, \(V\) and \(V^\vee\). Every finite-dimensional representation of \(su(3)\) is a component of some \(V^{\otimes k} \otimes (V^\vee)^{\otimes l}\). In terms of particles we should think that each copy of \(V\) is a quark and each copy of \(V^\vee\) is an antiquark. A color-neutral configuration of quarks corresponds to the tensor product having a trivial component somewhere.

Let's look at the functions involved: a representation of \(G\) on \(\mb C^n\) gives us a map \(\rho: G \rar End(\mb C^n)\), which is really a set of \(n^2\) maps from \(G\) to \(\mb C\), one for each entry in an \(n \times n\) matrix. So we can write it as \(\rho_i^j\).
We can add two such maps, or multiply them, and get more maps. The results need not themselves be matrix entries of representations, but they are still functions on \(G\). So we get a ring of such functions.
Consider the tensor product of two representations \((V,\rho)\) and \((W,\sigma)\). The corresponding map is \(\rho \hat \otimes \sigma\), which sends \(G\) to \(End(V\otimes W)\). If we tack on indices, we get indices \(\rho_i^j\) for \(V\) and \(\sigma_k^l\) for \(W\), and so \(\rho \hat \otimes \sigma\) has indices \((\rho \hat \otimes \sigma)_{ik}^{jl}\) and factors as \(\rho_i^j \sigma_k^l\), as one would expect.
Moreover, since \(\rho \hat\otimes \sigma\) is equivalent to \(\sigma \hat \otimes \rho\) via a natural isomorphism, the corresponding functions look the same, and so the ring is commutative. There is an obvious unit function, that sends everything in \(G\) to \(1\), corresponding to the map for the trivial representation.
We also have a comultiplication, since we're looking at functions on a thing that has a multiplication. If we look at \(\rho(gh)\), we get \(\rho(g)\rho(h)\); in indices, we get \(\rho(gh)_i^j = \rho(g)_i^k \rho(h)_k^j\), so we get that \(\Delta(\rho_i^j) = \rho_i^k \otimes \rho_k^j\).
We also get a counit, which is to evaluate \(\rho_i^j\) at the identity in \(G\); since a representation always sends the identity to the identity, we get that \(\rho_i^j(e) = I_i^j\), i.e. is \(1\) if the values of \(i\) and \(j\) are equal, and 0 otherwise.
Finally, we have an antipode, with \((S\rho_i^j)(g) = \rho_i^j(g^{-1})\).
So we get a Hopf algebra \(Rep(G)\). Note that if \(G\) is a Lie group, then a representation of \(G\) becomes a representation of \(\mf g\), so we can evaluate elements of \(Rep(G)\) on \(\mf g\), and thus on \(U(\mf g)\).
In fact, if \(\mf g\) is simple, then we're in luck: \(Rep(G)\) can serve as a dual of \(U(\mf g)\), in that for any nonzero element \(X\) of \(U(\mf g)\), there is an element \(f\) of \(Rep(G)\) such that \(f(X)\) is nonzero, and vice-versa.
One would think it would be a dual to \(\mb kG\), except if \(G\) is a Lie group that isn't 0-dimensional, then \(G\) is infinite and thus \(\mb kG\) is really badly behaved, so we use \(U(\mf g)\) instead.

Saturday, July 14, 2018

Some quantum mechanics

\(\newcommand{\mb}{\mathbb}\)Okay, not actual quantum mechanics. Rather, some aspects of particle physics that can be related to what has already been posted here.

Consider the basic multi-quark objects that we know of so far: baryons, which are things like protons and neutrons that are made of three quarks, and mesons, which are made of a quark and an antiquark.
The basic foundations of quantum mechanics say that to each particle involved we should assign a complex vector space for the possible states of the particle, and for multiple particles we should take the tensor product of those vector spaces for those particles to get a vector space of states for the set of particles.
So for each quark we have a vector space \(V\), and then a set of three quarks has the vector space \(V^{\otimes 3}\). An antiquark should then have the vector space \(V^\vee\), and so a meson should have \(V\otimes V^\vee\).

To each force we assign a group called a gauge group, and we say that if a vector is invariant under the action of that group, then the corresponding object or system of objects is stable with respect to that group. We say that quarks are bound to each other via the strong force, so we can ask what the gauge group of the strong force is.

So what could the gauge group for the strong force be? What are our invariants? We have 3-quark systems that are stable, and quark-antiquark systems that are stable. And everything else that is stable is built from these two types of systems, so there aren't any other invariants; nothing in \(V\otimes V\), or in \(V\otimes V\otimes V^\vee\), etc.
So we have a corresponding invariant tensor in \(V^{\otimes 3}\) and an invariant tensor in \(V\otimes V^\vee\). We can guess that the tensor in \(V\otimes V^\vee\) is the identity tensor \(I_j^i e_i \otimes e^j\). What about the one in \(V^{\otimes 3}\)?
If we consider only groups that have an invariant with 3 lower indices, its dual with 3 upper indices, and the identity, and nothing else that can't be written in terms of those invariants, we end up with two possibilities: \(SL(3, \mb C)\) and \(E_6\). For \(SL(3, \mb C)\), \(V\) would be \(\mb C^3\) and the invariant with 3 lower indices is the Levi-Civita tensor. For \(E_6\) we have \(V = \mb C^{27}\) and the tensor is quite a bit more complicated, relating to the multiplication of a Jordan algebra.
So which one is it?
One more bit of information is that quarks, being fermions, prefer antisymmetric tensors. The Levi-Civita tensor is antisymmetric; swap any two indices and you pick up a minus sign. The tensor for \(E_6\) is symmetric. So the gauge group for quarks is \(SL(3, \mb C)\).

Only not quite \(SL(3, \mb C)\). The last bit of quantum mechanics is that we want everything to be unitary. One interpretation of these vectors is that \(v\cdot v^*\), where \(v^*\) is the complex conjugate of \(v\) and \(\cdot\) is the dot product, corresponds to a probability. So we want this quantity to not be affected by the symmetries of the situation.
Hence we want not \(SL(3, \mb C)\) but \(SL(3, \mb C)\cap U(3) = SU(3)\), which is the gauge group for the strong force.

We call the 3 basis directions for \(V\) "red", "green" and "blue", because 3 always means primary colors, except for when it doesn't. So a quark whose state vector is pointing in the red direction is said to have red color charge and so on. Antiquarks come in "antired", "antigreen" and "antiblue", dual to "red", "green" and "blue" respectively.
A quantum mechanical force is carried by particles whose state vectors occur in the Lie algebra of the gauge group for that force. So the force carriers for color, called gluons, occur in \(su(3)\), which is \(8\)-dimensional. The force carriers take a vector in \(V\) to a vector in \(V\), and so are in \(V\otimes V^\vee\), and thus come with a color and an anticolor. So we can talk about "red-antigreen" gluons or "blue-antired" gluons. Note that we don't talk about "red-antired" gluons, because that would require a matrix with nonzero trace, but everything in \(su(3)\) is traceless. Instead we can talk about, say, "red-antired - blue-antiblue" or "red-antired - green-antigreen" gluons, or "red-antired - (1/2)green-antigreen - (1/2)blue-antiblue" gluons.
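To make the trace condition concrete, here's a small sketch, with the colors labeling the standard basis of \(\mb C^3\) as above and \(E(i,j)\) the matrix unit taking color \(j\) to color \(i\):

```python
import numpy as np

def E(i, j):
    """Matrix unit e_i (x) e^j: takes color j to color i."""
    M = np.zeros((3, 3))
    M[i, j] = 1.0
    return M

# Conventional labeling: index 0 = red, 1 = green, 2 = blue
print(np.trace(E(0, 0)))            # 1.0: pure "red-antired" has nonzero trace, not in su(3)
print(np.trace(E(0, 1)))            # 0.0: "red-antigreen" is traceless
print(np.trace(E(0, 0) - E(1, 1)))  # 0.0: "red-antired - green-antigreen" is traceless
```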
Oftentimes physicists will say that there are 8 gluons, by which they mean the space of gluons has 8 basis elements.

Monday, February 29, 2016

Building representations

\(\newcommand{\rar}{\rightarrow} \newcommand{\mb}{\mathbb} \newcommand{\mf}{\mathfrak}\) Given a set \(S\) and representations of \(S\) on various vector spaces, we can make more representations on other vector spaces in a few ways.

Suppose we have two representations \((V,\rho:S\rar End(V))\) and \((W,\sigma:S\rar End(W))\). We say that a linear map \(\phi:V\rar W\) is an intertwiner from \((V,\rho)\) to \((W,\sigma)\) if for all \(s\in S\), we have that
$$\sigma(s) \circ \phi = \phi \circ \rho(s)$$ In other words, first moving from \(V\) to \(W\) and then acting by \(s\) is the same as first acting by \(s\) on \(V\) and then moving to \(W\).
Intertwiners serve as the appropriate notion of map between representations, since they carry the actions from one representation to the other. You might also see the term \(S\)-equivariant map for the corresponding notion for \(S\)-sets.
Now suppose that \(\phi\) is not just a linear map, but an isomorphism. Then \(V\) and \(W\) are isomorphic as vector spaces, and we can rewrite the intertwiner equation as
$$\sigma(s) = \phi \circ \rho(s) \circ \phi^{-1}$$ which looks a lot like the change-of-basis formula. Since changing basis is really just viewing things from a different perspective, we don't consider it to change anything important. Similarly, we say that if two representations have an intertwiner that's an isomorphism, the two representations are equivalent, differing only in labeling details.
In general we don't distinguish between equivalent representations. So for instance, if we ask for all of the representations of some set, we don't actually mean all representations, but rather for all equivalence classes of representations, or often one representation from each equivalence class.

Now that we've talked about what it means for two representations to be equivalent, let's see how to get inequivalent representations.

Suppose we have a representation \((V,\rho)\), and suppose that for some subspace \(W\subset V\), for all \(w\in W\) and for all \(s\in S\), \(\rho(s)w \in W\). We thus say that \(W\) is a submodule of \(V\), or that \((W,\rho|_W)\) is a subrepresentation of \((V,\rho)\).
Given a module and a submodule, we can form the quotient module, \(V/W\), whose elements are, as described in the post on quotients, equivalence classes of elements of \(V\). We note that we can form an action of \(S\) on \(V/W\), because if \(v\) is equivalent to \(v'\), then \(v-v' \in W\), and so \(\rho(s)v - \rho(s)v' = \rho(s)(v-v') \in W\), and thus \(\rho(s)v\) is equivalent to \(\rho(s)v'\).
We say that \(V\) is a simple module if it has no proper submodules (i.e. no submodules other than all of \(V\) and the \(0\) vector space), or equivalently no proper quotient modules; alternatively, we say that \((V,\rho)\) is an irreducible representation.

Now suppose that we have two representations \((V,\rho)\) and \((W,\sigma)\). We can put these together in a few different ways.

Look at the direct sum of \(V\) and \(W\), i.e. \(V \oplus W\) whose elements can be written as \((v, w)\) for \(v \in V\) and \(w \in W\), with the usual addition of \((v,w)+(v',w') = (v+v',w+w')\) and scalar multiplication \(c(v,w) = (cv,cw)\). The corresponding representation is given by
\((V\oplus W, \rho \oplus \sigma)\) where the map \(\rho \oplus \sigma\) acts as
$$(\rho \oplus \sigma)(s)((v,w)) = (\rho(s)v,\sigma(s)w).$$ Translating all of this into matrix form gives that \((\rho\oplus\sigma)(s)\) is a block-diagonal matrix with one block being \(\rho(s)\) and the other being \(\sigma(s)\).
We say that a representation \((U,\pi)\) is decomposable if it is equivalent to \((V\oplus W,\rho\oplus \sigma)\) for some pair of representations \((V,\rho)\) and \((W,\sigma)\) where both \(V\) and \(W\) are not the 0 vector space. A representation that isn't equivalent to a direct sum is thus indecomposable. This is distinct from being simple, as \(V/W\) is not itself a submodule of \((V,\rho)\), although it is sometimes equivalent to one, so modules can fail to be simple but still be indecomposable.
We say that a representation is fully decomposable if it is the direct sum of simple representations. There are some criteria for when we should expect representations to be fully decomposable. One case is if \(S\) is a finite group; another case is if \(S\) is a finite-dimensional simple Lie algebra or Lie group and \(V\) is finite-dimensional. These are the cases I'm mostly concerned with.
Going a bit further, we can also ask how to determine what simple representations a given representation decomposes or reduces into.
Example: Consider the group \(Uni_n(\mb R)\) of upper-triangular \(n\times n\) real-valued matrices with 1s down the diagonal, acting on \(\mb R^n\) in the usual fashion. The vector \((1,0,0,\ldots)\) is sent to itself by everything in this group, and so it is the basis of an invariant 1-dimensional subspace \(V\). However there is no \((n-1)\)-dimensional subspace complementary to \(V\) that is also invariant: for any \(u\) outside of \(V\), there is a group element \(g\) such that \(\rho(g)u - u\) is a nonzero element of \(V\), so any invariant subspace containing \(u\) must also intersect \(V\). So while \(\mb R^n\) is not simple as a \(Uni_n(\mb R)\) module as it has an invariant subspace, it's indecomposable since we can't find a complementary invariant subspace.

We've also talked about tensor products before, when we talked about coalgebra actions. Recall that for a coalgebra \(C\) and representations \((V,\rho)\) and \((W,\sigma)\), \(C\) acts on \(V\otimes W\) via
$$(\rho\hat\otimes \sigma)(c) = (\rho\otimes \sigma)\Delta(c).$$ We can of course extend this to Hopf algebras. Notably, without some sort of comultiplication there is no natural way for a set to act on tensor products.

Example:
Consider the group \(SO(3)\) acting on \(\mb R^3\) via the usual rotation action. We can examine its action on \(\mb R^3 \otimes \mb R^3\), which as noted above fully decomposes into simple modules. Firstly we note that \(SO(3)\) preserves the symmetric and antisymmetric subspaces of \(\mb R^3 \otimes \mb R^3\), by which we mean the subspaces spanned by elements of the form \(a \otimes b + b \otimes a\) and \(a\otimes b - b \otimes a\) respectively. The antisymmetric subspace is 3-dimensional, and indeed the resulting representation is equivalent to the original representation on \(\mb R^3\).
In the case of the symmetric subspace, if we write the standard basis elements of \(\mb R^3\) as \(a,b,\) and \(c\), we get that the element \(a \otimes a + b \otimes b + c\otimes c\) gets sent to itself by any element of \(SO(3)\), so the vector space spanned by that element is invariant, giving us the trivial representation. So the 6-dimensional symmetric subspace decomposes into a copy of the trivial representation and a 5-dimensional representation, which turns out to also be simple.
Hence as an \(SO(3)\)-module, \(\mb R^3 \otimes \mb R^3\) decomposes into a 1-dimensional, a 3-dimensional, and a 5-dimensional simple module.

Example:
Suppose that we have a cocommutative Hopf algebra \(H\), for instance a group algebra or the universal enveloping algebra of a Lie algebra. Suppose that \(H\) has a module \(V\). Then we can build actions of \(H\) on tensor powers of \(V\), and thus on the entire tensor algebra \(T(V) = \bigotimes^* V\).
Now consider the ideal \(I_S = \left\langle xy - yx\right\rangle\) in the tensor algebra, remembering that we don't use \(\otimes\) to indicate multiplication inside the tensor algebra itself. \(I_S\) is invariant under the action of \(H\) since \(H\) is cocommutative, so we can form the quotient module, \(Sym(V) = T(V)/I_S\), which is the symmetric algebra on \(V\), i.e. the space generated by products of basis vectors of \(V\) where the order of multiplication doesn't matter.
Similarly, given the ideal \(I_A = \left\langle xy + yx \right\rangle\), we can form the quotient module \(Alt(V) = T(V)/I_A\), the alternating or exterior algebra on \(V\) where swapping two vectors in a product gives you a minus sign.
Another example is when \(V\) has a Lie bracket, and thus we can form the ideal \(I_L = \left\langle xy - yx - [x,y]\right\rangle\), which gives us the quotient module \(U(V) = T(V)/I_L\), i.e. the universal enveloping algebra.
Because \(H\) has a coalgebra structure, in all of these cases it preserves the algebraic structures of the quotient modules. How these various algebras decompose into simple or at least simpler \(H\)-modules is of quite some interest to representation theorists. Depending on where \(V\) comes from and the relationship of \(H\) and \(V\), these examples also give interesting results in combinatorics, geometry and theoretical physics.

Monday, February 8, 2016

Lie algebra actions, universal enveloping algebra

\(\newcommand{\rar}{\rightarrow} \newcommand{\mb}{\mathbb} \newcommand{\mf}{\mathfrak}\)Now that we've talked about algebra and coalgebra actions in the context of groups, we can talk about them in the context of Lie algebras.

So let's consider a Lie group \(G\) and its Lie algebra \(\mf g\). Since we'll have several identities floating around, we'll write the identity element of \(G\) as \(e\), whereas identity matrices will be denoted \(I\).
For a vector space \(V\), \(End(V)\) has a Lie bracket, where for matrices \(S\) and \(T\) we have \([S, T] = ST - TS\). A Lie algebra action of \(\mf g\) on a vector space \(V\) is a map \(\rho: \mf g \rar End(V)\) such that the bracket on \(\mf g\) becomes the bracket on \(End(V)\). In other words,
$$\rho([X,Y]) = \rho(X)\rho(Y) - \rho(Y)\rho(X).$$Given an action of \(G\) on a vector space \(V\) with map \(\rho\), we can make a Lie algebra action by saying that for \(X \in \mf g\),
$$\rho(X) = \frac{\rho(e + sX) - \rho(e)}{s}$$This looks like a derivative, and those less comfortable with infinitesimals may replace it with such.
Note that this is not an algebra action, at least not if we use the Lie bracket as the multiplication, because the bracket in the Lie algebra does not become composition in \(End(V)\).
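Here's a numerical sketch of this derivative, with a finite difference standing in for the infinitesimal \(s\): differentiating the family of z-axis rotations at the identity recovers the corresponding generator in \(so(3)\).

```python
import numpy as np

def Rz(s):
    """Rotation about the z-axis by angle s: a curve in SO(3) through the identity."""
    return np.array([[np.cos(s), -np.sin(s), 0],
                     [np.sin(s),  np.cos(s), 0],
                     [0,          0,         1]])

s = 1e-6
X = (Rz(s) - Rz(0)) / s   # (rho(e + sX) - rho(e)) / s

# The generator of z-rotations in so(3)
Lz = np.array([[0., -1., 0.],
               [1.,  0., 0.],
               [0.,  0., 0.]])
assert np.allclose(X, Lz, atol=1e-5)
```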

The comultiplication on \(G\) sends \(g\) to \(g \otimes g\). In particular, the identity element \(e\) gets sent to \(e \otimes e\). We view the Lie algebra as elements \(X\) such that \(e + sX\) is in \(G\) for infinitesimal \(s\). So
$$\Delta(e + sX) = (e + sX)\otimes (e + sX) = e \otimes e + sX \otimes e + e \otimes sX$$ where we use the fact that \(s^2 = 0\).
The point of infinitesimals is that they make everything look linear, so we have the idea that
$$\Delta(e + sX) = \Delta(e) + \Delta(sX) = \Delta(e) + s\Delta(X)$$Since as mentioned, \(\Delta(e) = e \otimes e\), we get that \(\Delta(X) = X\otimes e + e \otimes X\). This comultiplication is cocommutative and coassociative.
Thus we get that for two representations \((V,\rho)\) and \((W, \sigma)\), \(\mf g\) acts on \(V\otimes W\) via
$$(\rho \hat \otimes \sigma)(X)(v \otimes w) = \rho(X)v \otimes w + v \otimes \sigma(X)w$$This should look somewhat product-rule-y, in keeping with the understanding of Lie algebras as derivatives of Lie groups.
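We can check numerically that this product-rule action really respects brackets, with random matrices standing in for \(\rho(X)\) and \(\rho(Y)\) (for simplicity both tensor factors carry the same representation):

```python
import numpy as np

rng = np.random.default_rng(2)
X, Y = rng.normal(size=(2, 3, 3))   # stand-ins for rho(X) and rho(Y) on V = R^3
I = np.eye(3)

def hat(A):
    """Action on V (x) V given by the coproduct: A (x) I + I (x) A."""
    return np.kron(A, I) + np.kron(I, A)

bracket = lambda A, B: A @ B - B @ A

# The tensor-product action of [X, Y] equals the bracket of the tensor-product actions
assert np.allclose(hat(bracket(X, Y)), bracket(hat(X), hat(Y)))
```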
Although the comultiplication sends \(G\) to \(G\otimes G\), we don't get that the comultiplication sends \(\mf g\) to \(\mf g \otimes \mf g\). Instead it gets sent to \((\mb ke \oplus \mf g) \otimes (\mb ke \oplus \mf g)\). Most of the time we'll leave off the \(e\).
We can make \(\mb k \oplus \mf g\) into a counital coalgebra by defining \(\epsilon(X) = 0\) for all \(X \in \mf g\) and \(\epsilon(c) = c\) for \(c \in \mb k\).

What if we want an algebra that contains the Lie algebra \(\mf g\) where Lie algebra actions of \(\mf g\) become algebra actions? We call such an object the universal enveloping algebra of \(\mf g\), denoted \(U(\mf g)\), and we create it as a quotient.
We start with the tensor algebra \(\bigotimes^* \mf g\), where we're viewing \(\mf g\) as a vector space. Now we impose the Lie algebra structure by looking at the ideal \(I = \left\langle X \otimes Y - Y \otimes X - [X, Y]\right\rangle\) where \([, ]\) is the Lie bracket, and forming the quotient \(U(\mf g) = \bigotimes^* \mf g/I\).
Recall that \(\bigotimes^* V\) has a comultiplication, given by sending \(v \in V\) to \(v \otimes 1 + 1 \otimes v\), and extending by multiplication. It also has a counit, sending \(v\) to \(0\), and an antipode that sends \(v\) to \(-v\). So \(\bigotimes^* V\) is a Hopf algebra. If \(V = \mf g\), then the ideal \(I\) defined above is both an ideal and a coideal and is closed under antipode, so the quotient \(U(\mf g)\) is not only an algebra, but a Hopf algebra.
Given a universal enveloping algebra, we can recover the Lie algebra that it is built from by looking for the elements \(X\) such that \(\Delta(X) = X \otimes 1 + 1 \otimes X\), which we call "primitive" elements. Note that \(\Delta\) respects the Lie bracket, in that
$$\Delta([X, Y]) = [X, Y] \otimes 1 + 1 \otimes [X, Y] = \Delta(X)\Delta(Y) - \Delta(Y)\Delta(X)$$Thus the primitive elements in a Hopf algebra do form a Lie algebra.
For our universal enveloping algebra, we can take a representation of \(\mf g\) and extend it to a representation of \(U(\mf g)\) as follows: any non-scalar element in \(U(\mf g)\) can be written as a sum of products of elements of \(\mf g\), so we focus on such products, writing them as \(X_1X_2\cdots X_k\). Given a representation \((V,\rho)\) of \(\mf g\), we apply \(\rho\) to \(X_1X_2\cdots X_k\) in the obvious fashion, as \(\rho(X_1)\circ \rho(X_2) \circ \cdots \circ \rho(X_k)\). Scalars also map in the obvious fashion, with \(c \in \mb k \subset U(\mf g)\) going to \(cI\in End(V)\). Since \(\rho\) respects the Lie bracket, we get a well-defined algebra representation of \(U(\mf g)\). So now we can use algebra action theorems and techniques to talk about Lie algebra actions.

If we instead look at the ideal \(I_h = \left\langle X \otimes Y - Y\otimes X - h[X,Y]\right\rangle\) for \(h \in \mb k\), we get a slightly different algebra sometimes denoted by \(U_h(\mf g)\). For \(h\) nonzero this is isomorphic to \(U(\mf g)\) just by rescaling things, but when \(h = 0\), we would get the relation \(X \otimes Y - Y \otimes X = 0\), which would leave us with things being commutative, i.e. we'd end up with the symmetric algebra \(Sym(\mf g)\) instead. So we can think of \(U(\mf g)\) as a deformed version of \(Sym(\mf g)\), which is not exactly commutative but where the noncommutativity is tightly controlled.

Sunday, February 7, 2016

Quotients

\(\newcommand{\mb}{\mathbb}\)Let's talk about dividing sets by other sets.

You've probably heard of modular arithmetic. The usual first explanation is either in terms of clocks or remainders, or sometimes both. You say "if I multiply 5 by 9 modulo 13, I get 6" because 5 * 9 = 45, and then we subtract 13s and end up with 6.
Here's another way to look at it: consider a map \(f\) from \(\mb Z\) to a set \(T\) that has 13 elements in it, where \(f\) has the property that if \(a\) and \(b\) differ by an integer multiple of \(13\), then \(f(a) = f(b)\), and if they don't differ by an integer multiple of \(13\), then \(f\) sends them to different places. So \(f(1) = f(14)\) and \(f(45) = f(6)\), but \(f(1)\) and \(f(45)\) are different. Note that each element of \(T\) has to get hit by something, i.e. \(f\) is a surjection.
Now we give \(T\) some structure. We'll define an addition on \(T\) via addition on \(\mb Z\). In other words, we'll say that \(f(a) + f(b) = f(a + b)\) so that \(f\) is a homomorphism of groups with addition. Since \(f\) hits everything in \(T\), this defines addition for everything in \(T\). The question is whether this is well-defined; if we have \(A \in T\), there are several possible values \(a\) in \(\mb Z\) such that \(f(a) = A\), and similarly for \(B\) in \(T\) there are several values \(b\) such that \(f(b) = B\), and we want to make sure that \(A + B\) gives the same value no matter which \(a\) and \(b\) we pick. The conditions on \(f\) ensure this works in this case.
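Here's a tiny Python sketch of that well-definedness check, with \(f(a) = a \bmod 13\):

```python
f = lambda a: a % 13   # the surjection from Z onto T = {0, 1, ..., 12}

# Different preimages of the same elements of T...
a, a2 = 5, 5 + 13 * 4
b, b2 = 9, 9 - 13 * 7
assert f(a) == f(a2) and f(b) == f(b2)

# ...give the same answer for f(a) + f(b) := f(a + b), so addition is well-defined
assert f(a + b) == f(a2 + b2)   # both are 1, since 5 + 9 = 14 = 13 + 1
```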
We can now consider general surjections \(f\) from \(\mb Z\) to other sets \(R\). If we want \(f\) to be a homomorphism, we need to have an addition structure on \(R\), with an identity and additive inverses, but we also need conditions on \(f\). In particular, let's look at \(f(0)\); this has to be the additive identity of \(R\). We call the set of things that get mapped to \(f(0)\) the kernel of \(f\), \(\ker(f)\); for the case above, this kernel is \(13\mb Z = \{13 k | k \in \mb Z\}\). In general, if \(f\) is going to be a homomorphism from \(\mb Z\) to another group, we need that \(\ker(f)\) is a subgroup of \(\mb Z\).

Let's suppose that we're looking at a general group \(G\) and a surjective homomorphism \(f\) from \(G\) to \(H\), and look at \(\ker(f)\), everything that gets sent to the identity \(e\) in \(H\). \(\ker(f)\) is a subgroup, since products and inverses of elements that map to the identity also map to the identity.
But it needs to be more than just a subgroup of \(G\).
Consider \(g\) and \(h\) in \(G\) where \(h \in \ker(f)\). \(f(gh) = f(g)f(h) = f(g)\), since \(f\) is a homomorphism and \(f(h)\) is the identity. So everything of the form \(gh\) for \(h \in \ker(f)\) gets sent to the same place. Moreover, suppose that \(f(g') = f(g)\) for some \(g'\). Then we get that \(f(g^{-1}g') = f(g^{-1})f(g') = f(g)^{-1}f(g') = f(g)^{-1}f(g) = e\). So \(g^{-1}g'\) is in \(\ker(f)\). Multiplying it by \(g\) gives that \(g'\) is of the form \(gh\) for some \(h \in \ker(f)\). We call the set \(g\ker(f) = \{gh | h \in \ker(f)\}\) a left-coset of \(\ker(f)\) in \(G\).
A similar argument tells us that \(g'\) is of the form \(h'g\) for some \(h' \in \ker(f)\), i.e. \(g'\) must be in the right-coset \(\ker(f)g\). So in order for \(f\) to be a homomorphism, we need that the left and right cosets \(g\ker(f)\) and \(\ker(f)g\) are the same for all \(g\) in \(G\). We thus say that \(\ker(f)\) is normal in \(G\).
Conversely, if we have a normal subgroup \(N\), i.e. if for each \(g \in G\) the sets \(gN = \{gn | n \in N\}\) and \(Ng = \{ng | n \in N\}\) contain the same elements, then we can make a surjective homomorphism \(f\) from \(G\) to a group denoted \(G/N\) whose elements are cosets \(gN\), with multiplication in \(G/N\) given by \(gNhN = ghN\). We call \(G/N\) the quotient group of \(G\) by \(N\).
In the case of modular arithmetic above, if we treat \(\mb Z\) as a group whose composition operation is addition, then we get that all subgroups of \(\mb Z\) are normal. Indeed, for any group whose composition operation is commutative, all subgroups are normal. So for instance, given a vector space \(V\) and a subspace \(W\), we can form \(V/W\) whose elements are cosets of the form \(v + W = \{v + w | w \in W\}\). Note that in this case we can also scalar multiply, since if \(w \in W\) then \(cw \in W\) for any scalar \(c\).
For groups that aren't commutative, the situation is more complicated. For example, consider the group \(S_3\) of permutations of 3 elements \(\{a,b,c\}\). The subgroup that contains just the identity and the permutation that swaps \(a\) and \(b\) is not normal. The subgroup \(C\) that contains the identity and the two ways to cycle all 3 elements is normal, and the quotient \(S_3/C\) is a group with two elements in it.

What about multiplication in quotients of \(\mb Z\)? Well, that always works so it's boring, so let's generalize until it sometimes breaks.
Consider a general associative ring \(R\), i.e. a set with an addition, additive identity, additive inverses, and a multiplication. Think matrices, for instance, or algebras.
We have a surjection \(f\) to another ring and we want to make this a homomorphism; what can we say about \(\ker(f)\)?
\(R\) is a group if we only look at addition, so \(\ker(f)\) must be a subgroup; the addition is commutative, so \(\ker(f)\) is automatically normal for addition. Our elements of \(R/\ker(f)\) are cosets of the form \(r + \ker(f)\).
Recalling that everything in \(\ker(f)\) is equivalent to \(0\) according to our map \(f\), we should get that \(\ker(f)\) times itself is equivalent to \(0*0 = 0\), so we want \(\ker(f)^2 = \{ab | a, b \in \ker(f)\}\) to be contained in \(\ker(f)\). So \(\ker(f)\) has to be a subring. But just as we needed more than just a subgroup, we need more than just a subring.
In our quotient object, we need that when we multiply something by anything equivalent to 0, the product is equivalent to 0. In other words, we need that \(a\ker(f) = \{ab | b \in \ker(f)\}\) is contained in \(\ker(f)\) for every \(a \in R\), and similarly that \(\ker(f)b = \{ab | a \in \ker(f)\}\) is contained in \(\ker(f)\) for every \(b \in R\). We say that \(\ker(f)\) is therefore a two-sided ideal, which I'll just refer to as an ideal.
Being an ideal in a ring is the equivalent of being a normal subgroup for groups: if we have a ring \(R\) and an ideal \(I\), then we can form the quotient ring \(R/I\) whose elements are of the form \(r + I\) and we have a natural surjective homomorphism \(f\) from \(R\) to \(R/I\) whose kernel is \(I\).
This is how we can think about modular arithmetic: for an integer \(k\), \(k\mb Z\) is an ideal in \(\mb Z\), so \(\mb Z/k\mb Z\) is a ring with addition, subtraction, and multiplication modulo \(k\).

We can think of this in terms of actions: A group \(G\) acts on itself via what is called the adjoint action or conjugation action: \(\rho(g)\) is the map that sends \(h\) to \(ghg^{-1}\). \(H\) being normal is then equivalent to saying that the set \(H\) is fixed as a whole by the adjoint action, although the individual elements of \(H\) might get shuffled around. Note that for commutative groups, the adjoint action is trivial, so all subgroups are fixed under it.
Similarly, a ring \(R\) acts on itself via either the left regular action or right regular action, also known as multiplication: \(lm(a)\) is the map that sends \(b\) to \(ab\), \(rm(b)\) is the map that sends \(a\) to \(ab\). \(I\) being an ideal is then equivalent to saying that \(I\) is invariant under both the left and right regular actions.
This invariance then allows for making quotients.

Given a bunch of elements \(\{x_1,\ldots,x_k\}\) in \(R\), we often write \(\left\langle x_1,\ldots,x_k\right\rangle\) for the smallest ideal in \(R\) that contains all of those elements. For groups, the same notation means the smallest group that contains all of those elements, although this makes no claims to normality. We say that \(\left\langle x_1,\ldots,x_k\right\rangle\) is generated by the elements \(x_1,\ldots,x_k\).

This gives a good way to talk about constructions of groups or algebras: we start with a thing \(F\) with no structures or relations other than the ones guaranteed by the axioms, called a "free" thing. So for instance a set that obeys the group axioms but has no other relations is called a free group. \(\mb Z\) with addition is the free group on one generator; everything in \(\mb Z\) has to be there because the group axioms demand that we can form any power of an element, but we don't get "a = b" except in the cases demanded by the associativity and inverse laws.
Then if we want to impose a relation \(A = B\) where \(A\) and \(B\) are expressions, we form the invariant subthing \(I\) generated by the difference between \(A\) and \(B\), and the quotient \(F/I\) thus can't tell the difference between \(A\) and \(B\).

For instance, suppose we start with a vector space \(V\) and form the tensor algebra \(\bigotimes^* V\). This is an associative, unital algebra, but has no other relations between elements; there are no equations that aren't just due to linearity in \(V\), so \(\bigotimes^* V\) is the free associative algebra on \(V\).
Now suppose we want to make a commutative algebra, i.e. \(a\otimes b = b\otimes a\). We start with \(\bigotimes^* V\) and consider the ideal \(I = \left\langle a\otimes b - b\otimes a | a, b \in V\right\rangle\). Then \(S(V) = \bigotimes^* V/I\) is commutative, because we've quotiented out the difference between \(a\otimes b\) and \(b \otimes a\); \(S(V)\) is often called the symmetric algebra on \(V\).
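As a toy illustration, sympy's commutative flag can model the two sides of this quotient: noncommutative symbols behave like the tensor algebra, and ordinary commutative symbols behave like \(S(V)\). A sketch:

```python
import sympy

# In the tensor algebra, a*b and b*a are different elements...
a, b = sympy.symbols('a b', commutative=False)
print(sympy.simplify(a*b - b*a) == 0)   # False: no relation between a*b and b*a

# ...while in the quotient S(V), the relation a*b = b*a has been imposed
x, y = sympy.symbols('x y')             # commutative symbols model S(V)
print(sympy.simplify(x*y - y*x) == 0)   # True
```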

Finally, because of course we should talk about it, we can talk about quotients of coalgebras. Here we define a coideal \(I\) in a coalgebra \(C\) to be a subspace of \(C\) as a vector space with the property that
$$\Delta(I) \subset I \otimes C + C\otimes I$$ which is the dualized version of being an ideal. This is more coaction-y than action-y, using the left and right coregular coactions. Anyway, in this case we have that \(C/I\) is a coalgebra. We'll see an example of a quotient coalgebra in a bit.

We often say that a thing with no proper quotients is simple or irreducible. Such objects are like the atoms of whatever we're studying, because we can't break them into smaller pieces. And just as studying atoms is important for chemistry, studying the simple things is important for the study of things. Also just as there are many ways to put a given set of atoms together to make a molecule, there are often many ways to put a given set of simple things together to make a larger thing.