Sunday, December 27, 2015

Diagrams

\(\newcommand{\mb}{\mathbb}\)The rules for algebras, coalgebras, bialgebras and Hopf algebras can get kind of convoluted, with all of these maps and indices and the like. Some people doing research in these areas have come to prefer a different way of writing things, using diagrams to keep track of all the various pieces.
The underlying vector space \(V\) is implied, since otherwise we'd be writing it all the time and there aren't any other vector spaces involved so it's not that important to note it. Any place where a line segment begins or ends is a copy of \(V\).
Instead we start with the identity map \(id\), written as a line segment - which we view as going from the left to the right. Note that sometimes the identity map will go diagonally up like /, or diagonally down like \, but always to the right. Also the length doesn't matter; any line that doesn't have any other markings is an identity map from \(V\) to \(V\).
We might expect to put an arrow on the line segment to indicate direction, but without such indications we can immediately dualize everything by reading from right to left.
Two diagrams on top of each other indicates tensor product. This includes line segments, and includes beginnings and endings of line segments, so that a diagram like = indicates a map from \(V\otimes V\) to \(V\otimes V\) that sends \(a \otimes b\) to \(a \otimes b\).
The next interesting object is the swap operation, \(\sigma\), which we write as two line segments crossing each other:
The fact that we have two endpoints on each side indicates that we're going from \(V\otimes V\) to \(V\otimes V\). The crossing diagram obeys the following, fairly basic rules regarding \(V\otimes V\) and \(V\otimes V\otimes V\):
The interpretation of the crossed line segments as a swap makes these rules automatic.
So that's vector spaces and tensor products and the identity map.
Now let's talk about algebras and coalgebras. A unital algebra has a multiplication map and a unit map. The multiplication map takes in an element of \(V\otimes V\), so it should have two line segments going in from the left, and it spits out an element of \(V\), so it should have one line segment coming out from the right. The unit map takes an element of \(\mb k\), so it should have no line segments coming in from the left, and spits out an element of \(V\), so it should have one line segment coming out from the right. As such:
A counital coalgebra has a comultiplication and a counit; the comultiplication goes from \(V\) to \(V\otimes V\) and so has one line segment going in from the left and two coming out from the right, while the counit goes from \(V\) to \(\mb k\) and so has one line segment going in from the left and none coming out from the right. As such:
So to get just an algebra structure, we'd ignore any diagram with black dots in it; to get just a coalgebra structure, we'd ignore any diagram with white dots in it. To dualize, we'd read in the opposite direction and swap black with white. 
These structures obey rules, which I've grouped together to show duality:
The top two rules show (co)unital behavior, the bottom two show (co)associativity. Note that the only difference is the direction and the colors on the dots, so we could flip everything. Hence we see the coalgebra structures and rules are really the duals of the algebra structures and rules.
We also have the compatibility rules for bialgebras:
where again swapping directions and colors gives you back the same set of rules. The last rule says that if you ever see the diagram on the left, you can remove it without changing anything. In terms of maps, it says "apply the counit to the result of the unit map", which ends up as \(\epsilon(\eta(c)) = c\) for \(c \in \mb k\). Since juxtaposing diagrams just means tensor product, this is just scalar multiplication, which we can get away with for free since we're in a vector space.
Finally we have the antipode, which we'll write as an S in the middle of a line:
with the following rules:
The first rule just tells us that we can move antipodes past swaps. The second and third tell us that the unit and counit aren't affected by antipode, the fourth and fifth tell us that the antipode induces a swap for multiplication and comultiplication, and the last is our statement that the antipode is an inverse map.
We can write down some rules that aren't general rules for Hopf algebras, for instance commutativity and cocommutativity:
Another rule that seems obvious but doesn't actually apply in general is called involutivity:
For group algebras it's true, since \(S(\hat g) = \widehat{g^{-1}}\) and the inverse of the inverse is the original. But if you look at the Sweedler Hopf algebra, you'll see that applying \(S\) twice to \(x\) or \(y\) gives you a sign change. Commutative and cocommutative Hopf algebras are necessarily involutive, but Hopf algebras that aren't commutative or cocommutative, sometimes called true quantum groups, don't have to be involutive.

Bi, Hopf algebras

\(\newcommand{\rar}{\rightarrow} \newcommand{\mb}{\mathbb} \newcommand{\mf}{\mathfrak}\)Last time we talked about coalgebras as duals to algebras. Now we can consider what we get when we have a vector space with both an algebra and a coalgebra structure.
Let's consider a group algebra, \(\mb kG\). Recall that for a finite group \(G\), this means a vector space with a basis \(\{\hat g\}\) corresponding to elements \(g \in G\), and multiplication given by composition in \(G\).
We defined a comultiplication on this vector space by setting
$$\Delta(\hat g) = \hat g \otimes \hat g$$This is one reason why we use the letter \(\Delta\) for the comultiplication: in some cases \(\Delta\) is the diagonal map \(x \mapsto (x, x)\) whose image, when applied to \(\mb R\), is a diagonal line in \(\mb R^2\). This comultiplication is coassociative and cocommutative.
We also defined the counit \(\epsilon\) by
$$\epsilon(\hat g) = 1$$How does this interact with the group algebra structure? Well, first we note that \(\mb kG \otimes \mb kG\) has a multiplication on it, via
$$(a \otimes b)(c \otimes d) = ac \otimes bd$$In other words, we have a multiplication \(m_2\) on \(\mb kG \otimes \mb kG\) defined by
$$m_2 = (m \otimes m) \circ (id \otimes \sigma_{\mb kG\mb kG} \otimes id)$$where the \(\sigma\) moves the \(c\) to the left of the \(b\).
The next thing we note is that \(\Delta\) turns group multiplication in \(\mb kG\) into multiplication in \(\mb kG \otimes \mb kG\):
$$\Delta(\widehat{gh}) = \widehat{gh} \otimes \widehat{gh} = \hat g\hat h \otimes \hat g \hat h = (\hat g \otimes \hat g)(\hat h \otimes \hat h) = \Delta(\hat g)\Delta(\hat h)$$Also that \(\Delta(\hat e) = \hat e \otimes \hat e\), where \(e\) is the identity element of \(G\) and hence \(\hat e = \eta(1)\) is the multiplicative identity of \(\mb kG\). Hence \(\hat e \otimes \hat e\) is the multiplicative identity of \(\mb kG \otimes \mb kG\), so \(\Delta\) and \(\eta\) also play well together. Dually, we have \(\epsilon(\hat f)\epsilon(\hat g) = \epsilon(\hat f \hat g)\), so the multiplication \(m\) and \(\epsilon\) play well together too.
These compatibility rules give us what we call a bialgebra.

We can then write down the rules for a general bialgebra \(B\): \(B\) is a vector space with an associative, unital algebra structure \(m, \eta\) and a coassociative counital coalgebra structure \(\Delta, \epsilon\) that are compatible.
Written out as tensors, the \(m, \Delta\) compatibility rule is:
$$m_{ij}^h\Delta_h^{kl} = \Delta_i^{pq}\Delta_j^{rs}m_{pr}^km_{qs}^l$$Written as maps, we get
$$\Delta \circ m = (m \otimes m)\circ (id \otimes \sigma_{BB} \otimes id) \circ (\Delta \otimes \Delta)$$For the \(\eta, \Delta\) compatibility, it looks like
$$\Delta_i^{jk} \eta^i = \eta^j \eta^k \text{ i.e. }\Delta \circ \eta = (\eta \otimes \eta) \circ \Delta_{\mb k}$$where \(\Delta_{\mb k}\) is the comultiplication on \(\mb k\) as a 1-dimensional coalgebra over itself; \(\Delta_{\mb k}(c) = c \otimes 1\) for \(c \in \mb k\).
The \(m, \epsilon\) compatibility then becomes
$$m_{ij}^k \epsilon_k = \epsilon_i\epsilon_j\text{ i.e. }\epsilon \circ m = m_{\mb k}\circ (\epsilon \otimes \epsilon)$$Note that we usually don't even think about \(m_{\mb k}\) and just identify the tensor product of two scalars as the product of them.
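If you want to see that first compatibility rule concretely, here's a small numerical sketch for the group algebra \(\mb k\mb Z_2\) (the group being integers mod 2 under addition), with the structure tensors stored as numpy arrays; the array layout m[i, j, k] for \(m_{ij}^k\) and D[i, j, k] for \(\Delta_i^{jk}\) is just a convention chosen for the code.

```python
# A sketch checking Delta ∘ m = (m ⊗ m) ∘ (id ⊗ sigma ⊗ id) ∘ (Delta ⊗ Delta)
# for k[Z/2], basis g_0, g_1, group law addition mod 2.
# Conventions: m[i, j, k] = m_{ij}^k and D[i, j, k] = Delta_i^{jk}.
import numpy as np

m = np.zeros((2, 2, 2))
D = np.zeros((2, 2, 2))
for i in range(2):
    for j in range(2):
        m[i, j, (i + j) % 2] = 1.0   # g_i g_j = g_{i+j mod 2}
    D[i, i, i] = 1.0                 # Delta(g_i) = g_i ⊗ g_i

lhs = np.einsum('ijh,hkl->ijkl', m, D)                 # m_{ij}^h Delta_h^{kl}
rhs = np.einsum('ipq,jrs,prk,qsl->ijkl', D, D, m, m)   # Delta_i^{pq} Delta_j^{rs} m_{pr}^k m_{qs}^l
assert np.allclose(lhs, rhs)
```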
The best thing about bialgebras is that the dual of a finite-dimensional bialgebra is also a bialgebra. The multiplication in \(B\) becomes a comultiplication in \(B^\vee\), and the comultiplication in \(B\) becomes a multiplication in \(B^\vee\); the unit becomes a counit, the counit becomes a unit, and the compatibility conditions for \(B\) dualize to the compatibility conditions for \(B^\vee\).

There's one more piece of group structure that we haven't even touched: inversion. Every group element has an inverse.
So we are going to introduce a new map, \(S\), which is short for "antipode", that goes from \(\mb kG\) to \(\mb kG\) and sends \(\hat g\) to \(\widehat{g^{-1}}\).
A few observations about \(S\): \(S(\hat e) = \hat e\), \(S(\widehat{gh}) = S(\hat h)S(\hat g)\), \(\epsilon(S(\hat g)) = 1 = \epsilon(\hat g)\), and \(\Delta(S(\hat g)) = S(\hat g)\otimes S(\hat g)\); there's a swap there that we can't see because \(\Delta\) here is cocommutative, but we can detect the swap by duality with the multiplication rule. Finally, the most important rule,
$$m(S(\hat g)\otimes \hat g) = \hat e = m(\hat g \otimes S(\hat g))$$which is kind of why we care about \(S\) at all.
Again, generalizing, we say that a Hopf algebra \(H\) is a bialgebra with an antipode map \(S\) such that
$$S_i^j \eta^i = \eta^j, S_i^j m^i_{kl} = m_{ba}^j S_k^a S_l^b,$$$$S_i^j \epsilon_j = \epsilon_i, S_i^j \Delta_j^{kl} = \Delta_i^{ba} S_a^k S_b^l,$$ $$m_{hi}^j S_k^h \Delta_l^{ki} = I_l^j = m_{hi}^j S_k^i \Delta_l^{hk}$$ $$S \circ \eta = \eta, S \circ m = m \circ \sigma_{HH} \circ (S\otimes S),$$ $$ \epsilon \circ S = \epsilon, \Delta \circ S = (S \otimes S) \circ \sigma_{HH} \circ \Delta.$$ $$m \circ (S \otimes id) \circ \Delta = id = m \circ (id \otimes S)\circ \Delta$$Like with bialgebras, the duals of finite-dimensional Hopf algebras are Hopf algebras. The dual of \(\mb kG\) is the space \(Fun(G)\) of functions from \(G\) to \(\mb k\). It's clearly an algebra, since you can multiply pointwise: \((st)(g) = s(g)t(g)\), and thus, treating \(s\) and \(t\) as linear functions on \(\mb kG\), we get
$$m(s \otimes t)(\hat g) = (s \otimes t)(\Delta(\hat g))$$and the unit is the function that always returns 1.
We can take as a basis of \(Fun(G)\) the set of functions \(d_g\) where \(d_g(\hat g) = 1\) and \(d_g(\hat h) = 0\) if \(h \neq g\), so that the unit then becomes \(\eta(c) = c\sum_{g \in G}d_g\).
The comultiplication is dual to the multiplication in \(\mb kG\):
$$\Delta(s)(\hat g \otimes \hat h) = s(\hat g\hat h)$$and more explicitly,
$$\Delta(d_g) = \sum_{h \in G} d_{gh^{-1}} \otimes d_h$$and the counit is evaluation at \(\hat e\).
The antipode is then \(S(d_g) = d_{g^{-1}}\), dual to the antipode of \(\mb kG\).
\(\mb kG\) is always cocommutative, but commutative only if \(G\) is commutative; \(Fun(G)\) is commutative, but cocommutative only if \(G\) is commutative.

The tensor algebra \(\bigotimes^* V\) of a vector space \(V\) is itself a Hopf algebra. Its multiplication and unit we already know as tensor product \(\otimes\) and \(1 \in \mb k\), but to make things easier we'll write said multiplication without the \(\otimes\) symbol.
The comultiplication \(\Delta\) sends \(v \in V\) to \(v \otimes 1 + 1 \otimes v\), treated as an element of \(\bigotimes^* V \otimes \bigotimes^* V\). We extend this to the rest of \(\bigotimes^* V\) by the usual multiplication rules: \(\Delta(uv) = \Delta(u)\Delta(v) = uv \otimes 1 + u\otimes v + v\otimes u + 1\otimes uv\) and so on. The general rule is a little complicated.
In contrast the counit is easy: any \(v \in V\) gets sent to 0, and so does any product of vectors. So only things in \(\mb k\) have nonzero counit.
Finally, the antipode sends \(v \in V\) to \(-v\), and again we extend by multiplication. So \(S(uv) = (-v)(-u) = vu\), and \(S(uvw) = -wvu\) and so on.
And you can check that this does give us a Hopf algebra.

One last weird example before we end:
The Sweedler Hopf algebra \(H\) is four dimensional. We'll denote \(\eta(1)\) as just \(1\) here to match conventional notation. A basis for \(H\) is given as \(\{1, c, x, y\}\), with the following properties:
$$c^2 = 1, cx = y = -xc, x^2 = xy = yx = y^2 = 0$$ $$\Delta(c) = c \otimes c, \Delta(x) = x \otimes 1 + c \otimes x,\Delta(y) = y \otimes c + 1 \otimes y$$ $$\epsilon(c) = 1, \epsilon(x) = 0,\epsilon(y) = 0$$ $$S(c) = c, S(x) = -y, S(y) = x$$This is a noncommutative, noncocommutative Hopf algebra and, along with its dual, is the lowest-dimensional example of such. Compute its dual, which looks even weirder.
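As a quick check that the antipode axiom holds here, apply \(m \circ (S \otimes id) \circ \Delta\) to \(x\): $$m \circ (S \otimes id) \circ \Delta(x) = S(x)1 + S(c)x = -y + cx = -y + y = 0 = \epsilon(x)1,$$which is what the axiom demands. Note also that \(S^2(x) = S(-y) = -x\), so the antipode here is not an involution.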

Wednesday, December 16, 2015

Simple Lie Groups

\(\newcommand{\rar}{\rightarrow} \newcommand{\mb}{\mathbb} \newcommand{\mf}{\mathfrak}\)Here we'll establish some general framework around the notion of being simple.

Recall that we mentioned that given a group \(G\) and a subgroup \(N\) invariant under the adjoint action of \(G\) on itself, usually called a normal subgroup, we can form a quotient group \(G/N\). Similarly, for a Lie algebra \(\mf g\), an ideal \(I\) is a subspace invariant under the adjoint action \(ad(X)(Y) = [X,Y]\), and given an ideal we can form a quotient Lie algebra \(\mf g/I\).
We can bind these two ideas together by going to Hopf algebras. For a Hopf algebra \(H\), we write $$\Delta(x) = \sum_i x_{(1i)}\otimes x_{(2i)}$$ mimicking Sweedler notation, and define the adjoint action of \(H\) on itself as $$Ad(x)y =  \sum_i x_{(1i)}yS(x_{(2i)})$$ which again preserves the Hopf algebra structure if \(H\) is cocommutative. You can see that this action gives the group adjoint action when \(H = \mb kG\), and the Lie algebra adjoint action when \(H = U(\mf g)\).
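Spelling out that last sentence: for \(H = \mb kG\) we have \(\Delta(\hat g) = \hat g \otimes \hat g\) and \(S(\hat g) = \widehat{g^{-1}}\), so $$Ad(\hat g)\hat h = \hat g \hat h \widehat{g^{-1}} = \widehat{ghg^{-1}},$$which is conjugation; for \(H = U(\mf g)\), where an element \(X\) of \(\mf g\) has \(\Delta(X) = X \otimes 1 + 1 \otimes X\) and \(S(X) = -X\), we get $$Ad(X)Y = XYS(1) + YS(X) = XY - YX = [X, Y],$$which is the adjoint action given by the Lie bracket.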
When there are no nontrivial invariant subthings under the adjoint action, i.e. no quotient groups or quotient Lie algebras, we say that we have a simple group or a simple Lie algebra. Except for the 1-dimensional Lie algebra, which is not considered simple most of the time. It's too trivial, in that the bracket vanishes.

For Lie groups, the notion of being simple is tied to the Lie algebra: a simple Lie group is a Lie group whose Lie algebra is simple. This does not mean that a simple Lie group is simple as a group; it can still have nontrivial normal subgroups. However, because the Lie algebra is simple, the normal subgroups have to be \(0\)-dimensional as Lie groups, i.e. isolated points.
Most interesting properties of simple Lie groups can be determined by the Lie algebra, which is why we're not so concerned about those pesky normal subgroups. For "connected" simple Lie groups, which covers all the cases we'll be considering, those normal subgroups are constrained to being in the center, i.e. in the set of elements that commute with everything. The center of a group is always a normal subgroup, and for the connected simple Lie groups the center is a finite set of points.

The finite simple groups and the finite-dimensional simple Lie algebras over \(\mb R\) and \(\mb C\) have been classified, and in both cases the setup is a few infinite families indexed by integers along with a finite set of outliers that don't fit any particular pattern. For the simple Lie algebras over \(\mb C\), we have the following four families: \(\mf a_n = sl_{n+1}\), \(\mf b_n = so_{2n+1}\), \(\mf c_n = sp_{2n}\) and \(\mf d_n = so_{2n}\), where \(n\) can be any positive integer. We also have five exceptional simple Lie algebras, \(\mf e_6, \mf e_7, \mf e_8, \mf f_4,\) and \(\mf g_2\).
The great thing about the simple Lie algebras over \(\mb C\) is that all of their finite dimensional representations are fully decomposable. In other words, any finite dimensional representation can be written uniquely as a direct sum of irreducible representations. Contrast this to the case of, say, the Lie algebra of upper triangular matrices, where the obvious representation is not decomposable but is also not irreducible. Moreover, the irreducible representations of a finite-dimensional simple Lie algebra are easy to describe just from a few bits of data about the Lie algebra itself.
Over the reals, we have a somewhat more complicated picture for both the classification of the Lie algebras themselves and the representations. For the Lie algebras there are several real forms of each object on the list above. For instance, both \(sl_2(\mb R)\) and \(so_3(\mb R)\) are real forms of \(\mf a_1\). For the representations there is a bit of trickiness in terms of how representations decompose, since we have to worry about whether certain eigenvalues are real or not.

The classification of the finite simple groups is much more complicated, but is tightly linked to the classification of simple Lie algebras as most of the finite simple groups are "groups of Lie type", meaning that we take the field \(\mb k\) to be a finite field, build a Lie group over that field, and then fiddle a bit to get rid of some extra bits.

Unfortunately, the adjoint action does not in general yield the proper notion of ideal or coideal for Hopf algebra quotients of Hopf algebras. For algebras in general, we have to go back to the notion of ideal which uses the left- and right- regular actions, i.e. multiplication rather than the Lie bracket or its analogues. But we can still define a simple algebra as one with no proper ideals, i.e. with no proper quotients.

Sunday, December 6, 2015

Coalgebras

\( \newcommand{\ncmd}{\newcommand} \ncmd{\rar}{\rightarrow} \ncmd{\mb}{\mathbb} \ncmd{\mf}{\mathfrak} \) At some point there's going to be a plan. Not at the moment. But I'm introducing coalgebras because it's easier to do representation theory of Lie algebras as infinitesimal group stuff if I have comultiplication available. Plus duality is the best thing ever.

Last time we saw an algebra \(A\) as a vector space with a multiplication operation \(m: A \otimes A \rar A\). Maybe it obeyed some laws, like associativity or commutativity, and maybe it had a multiplicative identity, given by the unit map \(\eta: \mb k\rar A\).
This time we're going to flip things around.

The motivation here is duality. Recall the dual of a vector space \(V\): it's the set \(V^\vee\) of linear maps from \(V\) to \(\mb k\), and happens to form another vector space.
Given a linear map \(T\) from a vector space \(W\) to \(V\), and an element \(f\) in \(V^\vee\), we can look at the composition \(f\circ T\), which takes a vector from \(W\) and spits out a number in \(\mb k\). Since both \(T\) and \(f\) are linear, we get that \(f\circ T\) is in \(W^\vee\). Hence the map \(T\) from \(W\) to \(V\) gives us a way to get maps from \(V^\vee\) to \(W^\vee\). The technical term is that the operation of taking the dual is "contravariant", with maps becoming maps in the other direction.
So our multiplication \(m: A\otimes A\rar A\) dualizes to a map from \(A^\vee\) to \((A\otimes A)^\vee\). If \(A\) is finite-dimensional, then \((A\otimes A)^\vee = A^\vee \otimes A^\vee\), so we want to look at a vector space \(A^\vee\) equipped with a map from \(A^\vee\) to \(A^\vee \otimes A^\vee\). The multiplication map goes from the tensor product to the vector space, this dualized "comultiplication" map goes from the vector space to the tensor product.

So, ignoring the original vector space, a coalgebra \(C\) is a vector space with a comultiplication map \(\Delta: C \rightarrow C \otimes C\). Everything is tensors, so we can write \(\Delta\) with tensor indices. Letting \(\{e_i\}\) be a basis for \(C\), we write
$$\Delta(e_i) = \Delta_i^{jk} e_j\otimes e_k$$ Compare to the multiplication map, which when written as a tensor has two lower indices and an upper index.

Let's put in some properties. Recall that associativity looked like
$$m_{ab}^e m_{ec}^d = m_{ae}^d m_{bc}^e$$ $$m \circ (m \otimes id) = m \circ (id \otimes m)$$ Flipping that around gives the notion of coassociativity, which looks like
$$\Delta_h^{ij} \Delta_l^{hk} = \Delta_l^{ih} \Delta_h^{jk}$$ $$(\Delta \otimes id) \circ \Delta = (id \otimes \Delta) \circ \Delta$$ We could write this out in terms of the behavior on vectors, but that would require instating some specific notation for \(\Delta(v)\). The usual route is to use Sweedler notation, writing \(\Delta(v) = \sum_i v_{(i1)} \otimes v_{(i2)}\) and then abbreviating the sum as \(v_{(1)} \otimes v_{(2)}\), but I'm going to hold off on using that for now.

We can also talk about commutativity for algebras, using a swapping map \(\sigma_{AA}\) that sends \(u \otimes v\) to \(v\otimes u\). Commutativity looks like
$$m_{ab}^c = m_{ba}^c$$ $$m = m \circ \sigma_{AA}$$ Flipping this around we get the rule for cocommutativity
$$\Delta_i^{jk} = \Delta_i^{kj}$$ $$\Delta = \sigma_{CC} \circ \Delta$$
Finally we can talk about the counit. The unit map \(\eta\) goes from \(\mb k\) to \(A\), so the counit map \(\epsilon\) goes from \(C\) to the dual of \(\mb k\), which is just \(\mb k\) again.
The law for units in tensor indices is
$$m_{ij}^k \eta^j = m_{ji}^k \eta^j = I_i^k$$ So the counit looks like
$$\Delta_i^{jk}\epsilon_j = \Delta_i^{kj} \epsilon_j = I_i^k$$ In terms of maps, for the unit we write
$$m \circ (\eta \otimes id) = lm$$ $$m \circ (id \otimes \eta) = rm$$ where \(lm\) is the scalar multiplication map \(c \otimes v \mapsto cv\), and \(rm\) is scalar multiplication with the scalar on the right.
So we also need to dualize \(lm\) and \(rm\). We define \(ld\) to be the map that sends \(v\) in \(C\) to \(1 \otimes v\) in \(\mathbb{k}\otimes C\), and similarly \(rd\) to be the map that sends \(v\) to \(v\otimes 1\). Then our counit map acts as
$$(\epsilon \otimes id) \circ \Delta = ld$$ $$(id \otimes \epsilon) \circ \Delta = rd$$
So now we can talk about coassociative, cocommutative, counital coalgebras.

A field \(\mb k\) is a coalgebra as a 1-dimensional vector space over itself, with the comultiplication:
$$\Delta(c) = c \otimes 1 = 1 \otimes c$$ and the counit being the identity. Compare with the structure of \(\mb k\) as an algebra.

The coalgebra \(\mb C^\vee\) as a 2-dimensional vector space over \(\mb R\) has as a basis \(x\) and \(y\), where \(x\) returns the real part of a complex number and \(y\) the imaginary part. Note that \(x\) and \(y\) are linear functions.
The comultiplication in \(\mb C^\vee\) comes from the multiplication in \(\mb C\), i.e. the following rule: \((a + bi)(c + di) = (ac - bd) + (ad + bc)i\). So our coalgebra looks like:
$$\Delta(x) = x \otimes x - y \otimes y, \Delta(y) = x \otimes y + y \otimes x$$ This is the general idea for comultiplication in the dual of a finite-dimensional algebra: if we have \(A\) and \(A^\vee\), then for \(a\) and \(b\) in \(A\) and \(f\) in \(A^\vee\), we have
$$f(m(a\otimes b)) = \Delta(f)(a \otimes b)$$ Since \(\mb C\) has a unit, sending the real number \(a\) to the complex number \(a + 0i\), \(\mb C^\vee\) has a counit, which is just evaluating a function at the number \(1 + 0i\). So \(\epsilon(x) = 1, \epsilon(y) = 0\). This is the general idea for counits in the dual of a finite-dimensional algebra: evaluate at the multiplicative identity.
Note that \(\Delta\) in this case is coassociative, cocommutative, and counital; this is the general case for duals; if the algebra is ___, then the dual coalgebra is co___, and vice-versa.
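If you like seeing this sort of duality checked numerically, here's a small sketch in code verifying \(\Delta(f)(a \otimes b) = f(ab)\) for \(f = x, y\) on random complex numbers; the function names are just the basis described above.

```python
# A sketch: the comultiplication on C^vee is dual to multiplication on C,
# i.e. Delta(f)(a ⊗ b) = f(ab) for f = x (real part) and f = y (imaginary part).
import random

def x(z): return z.real
def y(z): return z.imag

def Delta_x(a, b):   # Delta(x) = x ⊗ x - y ⊗ y, evaluated on a ⊗ b
    return x(a) * x(b) - y(a) * y(b)

def Delta_y(a, b):   # Delta(y) = x ⊗ y + y ⊗ x, evaluated on a ⊗ b
    return x(a) * y(b) + y(a) * x(b)

for _ in range(5):
    a = complex(random.uniform(-5, 5), random.uniform(-5, 5))
    b = complex(random.uniform(-5, 5), random.uniform(-5, 5))
    assert abs(Delta_x(a, b) - x(a * b)) < 1e-9
    assert abs(Delta_y(a, b) - y(a * b)) < 1e-9
```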

Consider the set of \(n \times n\) matrices, \(M_n = Mat_n(\mathbb{k})\). It's a unital, associative algebra. The dual space \(M_n^\vee\) is an \(n^2\)-dimensional counital, coassociative coalgebra. If we write the basis of \(M_n\) as \(e_i^j\), then we get a basis of \(M_n^\vee\) as \(f_p^q\), where \(f_p^q(e_i^j)\) is \(1\) if \(p = i\) and \(q = j\) and is \(0\) if either of those conditions doesn't hold.
Multiplication in \(M_n\) is that \(e_i^j e_k^l\) equals \(e_i^l\) if \(j = k\) and is \(0\) otherwise. So comultiplication in \(M_n^\vee\) is
$$\Delta(f_p^q) = f_p^r \otimes f_r^q$$ Again, since \(M_n\) is unital, \(M_n^\vee\) has a counit which sends \(f_p^q\) to 1 if \(p = q\) and to 0 otherwise, i.e. evaluating on the identity matrix.
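Unwinding the indices, this comultiplication is just the formula for the entries of a matrix product read backwards: \(\Delta(f_p^q)\) applied to \(A \otimes B\) is \(\sum_r A_{pr}B_{rq}\), which is \(f_p^q\) applied to \(AB\). Here's a minimal numerical sketch of that, storing a matrix by its entries, which are exactly its coefficients in the basis \(e_i^j\):

```python
# A sketch of the duality between multiplication in M_n and comultiplication in M_n^vee.
import numpy as np

n = 3
A = np.random.rand(n, n)
B = np.random.rand(n, n)

for p in range(n):
    for q in range(n):
        lhs = (A @ B)[p, q]                               # f_p^q applied to AB
        rhs = sum(A[p, r] * B[r, q] for r in range(n))    # Delta(f_p^q) applied to A ⊗ B
        assert abs(lhs - rhs) < 1e-12
```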

One example that will be important for us is the group algebra, \(\mb kG\). The comultiplication sends the basis elements \(\hat g\) to \(\hat g \otimes \hat g\) and the counit sends \(\hat g\) to 1. We can even discuss a comultiplication for \(G\), which sends \(g\) to \(g \otimes g\). This is not a coalgebra structure because \(G\) does not have a vector-space structure that this respects, but it is still a structure on \(G\). The counit just sends everything in \(G\) to 1. This comultiplication indicates why we use the symbol \(\Delta\): as a stand-in for the diagonal map \(g \rar (g, g)\). In particular, if \(G\) is the real line then the image of \(\Delta\) in \(\mb R^2\) would be a diagonal line.

You can build similar setups for the other finite-dimensional algebras we mentioned. For infinite dimensional algebras we have a bit of an issue taking duals, from a number of standpoints. The duals of infinite dimensional vector spaces are themselves a bit tricky to deal with, because they're generally "bigger" than the original vector spaces. Also, dualizing the multiplication in an infinite dimensional algebra to get a comultiplication in the dual often leads to needing to take infinite sums of tensor products, which is generally a bad thing if you don't want to keep careful track of convergence issues. There are infinite-dimensional coalgebras, but duality takes a little bit of doing.
In particular, we remove the "dual space" aspect and simply ask for a duality map: an algebra \(A\) is dual to a coalgebra \(C\) if there is a map \(ev\) from \(C \otimes A\) to \(\mb k\) such that, viewed as maps from \(C \otimes A \otimes A\) to \(\mb k\), the following holds:
$$ev \circ (id \otimes m) = (ev \otimes ev) \circ (id \otimes \sigma_{CA} \otimes id)\circ (\Delta \otimes id \otimes id)$$ where we use \(\sigma_{CA}\) to move an element of \(A\) to the left of an element of \(C\).
In terms of individual elements \(f\) in \(C\) and \(a\) and \(b\) in \(A\), we have
$$ev(f \otimes m(a \otimes b)) = (ev(f_{(1)} \otimes a))(ev(f_{(2)} \otimes b))$$ where we're using the Sweedler notation for the comultiplication.

Saturday, December 5, 2015

Algebras

There is a class of mathematical objects called, somewhat unfortunately, "algebras". They are vector spaces with multiplications. Not in the sense of tensoring something in \(V\) with something in \(W\) and ending up with something in \(V\otimes W\), but in the sense of multiplying something in a vector space \(A\) with another thing in \(A\) and ending up in \(A\) again.
So our multiplication map sends pairs of vectors to vectors: \(m: A \times A \rightarrow A\). Furthermore we ask that it be bilinear, i.e. if we fix a vector \(u\), the maps \(l_u: v \mapsto m(u, v)\) and \(r_u: v \mapsto m(v, u)\) are both linear maps.
We still end up talking about tensor products, however, because tensor products give us the most general form of bilinear multiplication. In particular, because of the bilinearity, we can in fact view \(m\) not as a map from \(A\times A\) to \(A\) upon which we need to impose bilinearity, but as a map from \(A \otimes A\) to \(A\) which is linear, remembering that \(A\otimes A\) is itself a vector space.
If we have our basis vectors \(e_i\) of \(A\), we can consider products of basis vectors. In particular, for \(e_i\) and \(e_j\), their product can be written as a linear combination of basis vectors, i.e.
$$e_ie_j = m(e_i \otimes e_j) = m_{ij}^k e_k$$ We call the numbers \(m_{ij}^k\) the structure constants of the algebra with respect to the given basis.
As noted in the previous posts, we have familiar algebras like Lie algebras, where the multiplication is the bracket. Other examples include the complex numbers, which can be viewed as a 2-dimensional real vector space with basis elements \(e_1 = 1\) and \(e_2 = i\). There are 8 structure constants with respect to this basis, half of them 0; the nonzero ones are easy to figure out.
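If you'd rather let a computer figure them out, here's a small sketch computing those structure constants in the basis \(e_1 = 1\), \(e_2 = i\); the array layout m[i, j, k] for \(m_{ij}^k\) (with 0-based indices) is just a convention chosen for the code.

```python
# A sketch computing the structure constants of C as a 2-dimensional algebra over R.
import numpy as np

basis = [1 + 0j, 1j]               # e_1 = 1 and e_2 = i (stored at indices 0 and 1)
m = np.zeros((2, 2, 2))            # m[i, j, k] holds m_{ij}^k

for i, ei in enumerate(basis):
    for j, ej in enumerate(basis):
        product = ei * ej
        m[i, j, 0] = product.real  # coefficient of e_1 = 1
        m[i, j, 1] = product.imag  # coefficient of e_2 = i

print(m)  # the nonzero ones: m[0,0,0] = m[0,1,1] = m[1,0,1] = 1 and m[1,1,0] = -1
```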

There are actually algebras built from vector spaces \(V\) where the tensor product is the multiplication operation. Recall the sum of two vector spaces \(V\oplus W\), each element of which can be written in exactly one way as the sum of an element of \(V\) and an element of \(W\). We define the tensor algebra of \(V\) as \(\bigotimes^* V = \mathbb{k} \oplus V \oplus (V\otimes V) \oplus (V\otimes V \otimes V) \oplus \ldots\); to keep things nice we assume that each element of \(\bigotimes^* V\) can be written as a unique sum of one vector from finitely many of the various \(V\otimes...\otimes V\) terms, so that we don't have to worry about convergence issues.
Now the multiplication here is not a map from \(V\otimes V\) to \(V\), but rather from \((\bigotimes^* V) \otimes (\bigotimes^* V)\) to \(\bigotimes^* V\). So when we write the product of two elements \(u\) and \(v\) in \(\bigotimes^* V\), we won't write \(u \otimes v\), we'll just use concatenation and write \(uv\) to denote multiplication inside the tensor algebra.

We often ask that our algebras \(A\) obey certain conditions to make them nicer. The ones we will look at are associativity, commutativity, and being unital.
Associativity tells us that we don't need so many parentheses: \(a(bc) = (ab)c\). In terms of our map \(m\), we have \(m(a \otimes m(b \otimes c)) = m(m(a\otimes b)\otimes c)\). We can rewrite this in two ways which will be useful later. The first uses tensor indices:
$$m_{il}^h m_{jk}^l = m_{ij}^l m_{lk}^h$$ The other is in terms of the maps but without any vectors:
$$m\circ(m\otimes id) = m\circ(id\otimes m)$$ You can check that these both are equivalent to associativity.
For commutativity, we would say that \(ab = ba\). In terms of our map \(m\), we have \(m(a \otimes b) = m(b \otimes a)\). In tensor indices,
$$m_{ij}^k = m_{ji}^k$$ Compare to the antisymmetry/anticommutativity of the Lie bracket, expressed as \(u_{ij}^k = -u_{ji}^k\).
In vector-less maps, we need to introduce an operation \(\sigma_{VW}\), which takes an element of \(V\otimes W\) and returns an element of \(W \otimes V\) by sending \(v \otimes w\) to \(w \otimes v\). This operation will come in handy later. At the moment, we want \(\sigma_{AA}\), to express commutativity as
$$m = m \circ \sigma_{AA}$$ Finally we have being unital. This means picking out a particular vector to act as the multiplicative identity, the way \(1\) does for scalars. For future purposes I'm going to make this a map as well, \(\eta: \mathbb{k} \rightarrow A\), and we impose the following rule:
$$m(\eta(c) \otimes v) = m(v\otimes \eta(c)) = cv$$ Here the vector \(\eta(1)\) is our multiplicative identity in \(A\).
We can read \(\eta\) as a vector, \(\eta(1) = \eta^i e_i\), so that
$$m_{ij}^k \eta^i = m_{ji}^k \eta^i = I_j^k$$ where \(I_j^k\) is 1 if \(j = k\) and 0 otherwise, giving the identity map on vectors.
If we want to write this out as maps without vectors, we need to remember that for a scalar \(c\) we defined \(c \otimes v\) to be \(cv\), so that there's an obvious isomorphism between \(\mathbb{k}\otimes A\) and \(A\), which we'll call \(lm\) for left (scalar) multiplication, and similarly a map \(rm\) from \(A \otimes \mathbb{k}\) to \(A\). Then we get that
$$m \circ (\eta \otimes id) = lm$$ $$m \circ (id \otimes \eta) = rm$$
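Before the examples, here's a small sketch checking the index forms of associativity, commutativity, and the unit law with np.einsum, using the structure constants of the complex numbers over \(\mathbb{R}\) from the sketch above, with \(\eta(1) = e_1 = 1\):

```python
# A sketch checking the index identities for C over R, with m[i, j, k] = m_{ij}^k
# in the basis e_1 = 1, e_2 = i, and eta the component vector of eta(1) = 1.
import numpy as np

m = np.zeros((2, 2, 2))
m[0, 0, 0] = m[0, 1, 1] = m[1, 0, 1] = 1.0
m[1, 1, 0] = -1.0
eta = np.array([1.0, 0.0])

# associativity: m_{il}^h m_{jk}^l = m_{ij}^l m_{lk}^h
assert np.allclose(np.einsum('jkl,ilh->ijkh', m, m), np.einsum('ijl,lkh->ijkh', m, m))

# commutativity: m_{ij}^k = m_{ji}^k
assert np.allclose(m, m.transpose(1, 0, 2))

# unit law: m_{ij}^k eta^i = m_{ji}^k eta^i = I_j^k
assert np.allclose(np.einsum('ijk,i->jk', m, eta), np.eye(2))
assert np.allclose(np.einsum('jik,i->jk', m, eta), np.eye(2))
```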
Examples!
A field \(\mathbb{k}\) is a 1-dimensional vector space over itself, with the obvious multiplication and the unit map being the identity.

The complex numbers are associative, commutative, and unital as an algebra over \(\mathbb{R}\); Lie algebras are generally not commutative or unital, and when they're not commutative then they're generally not associative by the Jacobi identity, which in fact describes how far from associative a given Lie algebra is in terms of how far from commutative it is.

The set of \(n\times n\) matrices with entries in \(\mathbb{k}\) form an algebra, with the usual addition, scalar multiplication, and multiplication operations. This algebra is associative, since matrix multiplication is associative, and unital, since we have the matrix \(I\) to serve as \(\eta(1)\), but not commutative if \(n > 1\).

From an associative-but-not-commutative algebra \(A\) we can get a commutative-but-not-associative algebra called a Jordan algebra, \(J\), which has the same elements as \(A\), and hence the same vector space structure, but the multiplication looks different: 
$$m_J(u \otimes v) = a(m(u\otimes v) + m(v\otimes u))$$ where the value of \(a\) depends on who you ask; sometimes it's 1, sometimes it's \(1/2\). Whether \(J\) is unital depends on whether \(A\) is unital, and also whether we can divide by \(2\) in \(\mathbb{k}\) (for the kinds of \(\mathbb{k}\) we're considering here, namely the real and complex numbers, we can always divide by 2). There are actually Jordan algebras that we can't get in this fashion; those are called exceptional Jordan algebras. The ones we can get this way are called special Jordan algebras. It says something about the people involved that all Jordan algebras are either special or exceptional.
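Here's a small sketch of the special Jordan construction on \(2 \times 2\) real matrices, taking \(a = 1/2\); the product is commutative by construction, and a random triple of matrices will almost always witness the failure of associativity.

```python
# A sketch of the Jordan product built from matrix multiplication, with a = 1/2.
import numpy as np

def jordan(u, v):
    return 0.5 * (u @ v + v @ u)

rng = np.random.default_rng(0)
u, v, w = (rng.standard_normal((2, 2)) for _ in range(3))

assert np.allclose(jordan(u, v), jordan(v, u))                        # commutative
print(np.allclose(jordan(jordan(u, v), w), jordan(u, jordan(v, w))))  # typically False
```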

Consider a set \(S\), and consider the set \(Fun(S)\) of all functions from \(S\) to \(\mathbb{k}\). We're not going to put any restrictions on these functions yet, so a function just assigns to each element of \(S\) a number in \(\mathbb{k}\). Given two functions \(f\) and \(g\) and any number \(c \in \mathbb{k}\), we can form the functions \(f + g\), \(cf\) and \(f\cdot g\) by defining what they do on elements of \(S\):
$$(f + g)(s) = f(s) + g(s), (cf)(s) = cf(s), (f\cdot g)(s) = f(s)g(s)$$ The first two rules tell us that we have a vector space, the last rule tells us that we have an associative, commutative algebra. We have a multiplicative identity and thus a unit map that takes a number \(c\) and spits out the function denoted by \(\hat c\), where \(\hat c(s)\) is always \(c\) for all \(s \in S\).
If \(S\) has some structure or properties, we could restrict our functions via those structures or properties, like demanding only continuous functions or smooth functions or polynomial functions, etc.

For the last class of examples, take a finite group, \(G\), and make a vector space \(\mathbb{k}G\) which we define to have a basis vector for each element of \(G\). In other words, for each \(g \in G\) we have a basis element \(\hat g\), and these are all linearly independent. So our elements are linear combinations:
$$\sum_{g \in G} c_g \hat g$$ We define our multiplication in \(\mathbb{k}G\) as just group composition: \(\hat g \hat h = \widehat{gh}\). This multiplication is associative, since group composition is associative, and unital, since groups have to have identity elements. It's commutative if and only if \(G\) is commutative.
We call this construction the group algebra for \(G\).
We could do this construction with infinite groups as well, but we're not going to be using those as examples.
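As a small concrete sketch, here's the group algebra of \(\mathbb{Z}/3\) (written additively, so \(\hat g \hat h = \widehat{g + h}\)) in code, storing an element \(\sum_g c_g \hat g\) as a dictionary of its coefficients.

```python
# A sketch of multiplication in k[Z/3]: elements are dicts {g: c_g}, and
# basis elements multiply by adding mod 3.
def multiply(u, v, n=3):
    result = {}
    for g, a in u.items():
        for h, b in v.items():
            k = (g + h) % n
            result[k] = result.get(k, 0) + a * b
    return result

u = {0: 2, 1: 1}       # 2*(hat 0) + (hat 1)
v = {1: 1, 2: -1}      # (hat 1) - (hat 2)
print(multiply(u, v))  # {1: 2, 2: -1, 0: -1}
```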

At this point you might be wondering why I insisted on writing everything in terms of both tensor indices and maps without vectors. The reason is that next time we're going to dualize everything, swapping our upper and lower indices and flipping all our arrows around to talk about coalgebras.

Thursday, December 3, 2015

Matrices, Products, Lie brackets as tensors

One good thing about tensors is that they give us a way to unify matrices and vectors. What's a matrix? It's a linear map that takes in a vector in \(V\) and spits out a vector in \(W\). An element of \(V^\vee\) takes in a vector and spits out a number, while a vector \(\textbf{w}\) in \(W\) can be viewed as taking a number \(c\) and spitting out the vector \(c\textbf{w}\). So elements of \(V^\vee \otimes W\) take in vectors of \(V\) and spit out vectors in \(W\). If we look at a general element, \(u_i^j \textbf{e}^i \otimes \textbf{f}_j\), we can look at the coefficients \(u_i^j\) as the entries in the matrix with respect to the given bases. There are \(mn\) basis elements for \(V^\vee \otimes W\), and there are \(mn\) entries in a matrix from \(V\) to \(W\). So \(V^\vee\otimes W\) is equivalent to the set of linear transformations from \(V\) to \(W\).
If we have a matrix \(R = R_i^j \textbf{e}^i\otimes \textbf{f}_j\), then the behavior on a basis vector \(\textbf{e}_k\) is to compute \(R_i^j\textbf{e}^i(\textbf{e}_k) \textbf{f}_j\). Remember that \(\textbf{e}^i(\textbf{e}_k)\) is 1 if \(i = k\) and 0 otherwise, so the previous expression becomes \(R_k^j \textbf{f}_j\). Hence for a general vector \(\textbf{v} = v^k \textbf{e}_k\), we get
$$R(\textbf{v}) = v^k R_k^j \textbf{f}_j$$ which you can check matches the usual notion of applying matrices to vectors. Note: \(v^k\) and \(R_k^j\) are both scalars, so it doesn't matter what order we write them in. We only need to worry about order for \(\otimes\).
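Here's a tiny sketch of that contraction with np.einsum; the storage convention R[k, j] for \(R_k^j\), with the \(V\) index first, is just a choice made in the code (it's the transpose of the usual rows-and-columns layout).

```python
# A sketch of R(v) = v^k R_k^j f_j as an index contraction.
import numpy as np

R = np.random.rand(4, 3)         # a map from a 4-dimensional V to a 3-dimensional W; R[k, j] = R_k^j
v = np.random.rand(4)            # components v^k of a vector in V

w = np.einsum('k,kj->j', v, R)   # v^k R_k^j
assert np.allclose(w, R.T @ v)   # the same computation in matrix-times-vector form
```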
Now we can talk about change of basis stuff, which makes all of this index notation useful.
Suppose that we want to change our basis of \(V\) from \(\textbf{e}_i\) to some new basis, \(\hat{\textbf{e}}_i\). In particular, suppose that we have a change-of-basis matrix \(R\) that expresses vectors \(\textbf{e}_i\) in terms of the vectors \(\hat{\textbf{e}}_i\):
$$R(\textbf{e}_i) = R_i^j\hat{\textbf{e}}_j.$$ We can also change to the corresponding dual basis, \(\hat{\textbf{e}}^j\) via \(R^{-1}\):
$$R^{-1}(\textbf{e}^j) = (R^{-1})_i^j \hat{\textbf{e}}^i.$$ Now suppose we have a linear transformation \(S\) from \(W\) to \(V\), written in terms of the \(\textbf{e}_i\) basis. So we get that
$$S(\textbf{f}_k) = S_k^i \textbf{e}_i.$$ To change that to the new basis, we rewrite \(\textbf{e}_i\) in terms of the matrix \(R\) to get
$$S(\textbf{f}_k) = S_k^i R_i^j\hat{\textbf{e}}_j = (RS)_k^j\hat{\textbf{e}}_j.$$ You can check that this is the usual way of doing change-of-basis stuff. Similarly, if we have a linear transformation \(T\) from \(V\) to \(W\), we go the other way around:
$$T(\textbf{e}_i) = T_i^k \textbf{f}_k.$$ becomes
$$T(R_i^j\hat{\textbf{e}}_j) = T_i^k \textbf{f}_k.$$ Rewriting everything gives us
$$T(\hat{\textbf{e}}_j) = T_i^k (R^{-1})_j^i \textbf{f}_k = (TR^{-1})_j^k \textbf{f}_k.$$ Finally, if we have a transformation \(P\) from \(V\) to \(V\) and we change the basis on both ends, we get
$$P(\textbf{e}_i) = P_i^k \textbf{e}_k$$ becoming
$$P(\hat{\textbf{e}}_j) = R_l^kP^l_i(R^{-1})^i_j \hat{\textbf{e}}_k = (RPR^{-1})_j^k  \hat{\textbf{e}}_k.$$ So we see that by matching indices together we can perform linear transformations.
Other familiar objects that can be written as tensors:
The dot product works well in the standard basis but can get a little wonky in other bases. We noted that there's even a group of matrices that preserve the dot product, \(O_n\). Let's get rid of that specific basis dependency by writing it in this index notation so that we can see how it transforms. The dot product takes in two vectors and spits out a number, so it lives in \(V^\vee \otimes V^\vee\) and is therefore a linear combination of tensors of the form \(\textbf{e}^i \otimes \textbf{e}^j\), with coefficients of the form \(u_{ij}\). We say that it's a symmetric tensor, in that for any pair of values \(i\) and \(j\), \(u_{ij} = u_{ji}\) regardless of what basis we're in.
If we change our basis using the matrix \(R\), we change the dual basis by \(R^{-1}\), so we end up with $$\hat{u}_{kl} = u_{ij}(R^{-1})_k^i(R^{-1})_l^j.$$ Another familiar object: the cross product. It takes two vectors and spits out a vector, so it lives in \(V^\vee \otimes V^\vee \otimes V\) where \(V = \mathbb{R}^3\). The cross product is a linear combination of tensors of the form \(\textbf{e}^i \otimes \textbf{e}^j \otimes \textbf{e}_k\) and thus has coefficients of the form \(u_{ij}^k\). It's antisymmetric in \(i\) and \(j\), in that \(u_{ij}^k = -u_{ji}^k\). When we change our basis it becomes
$$\hat{u}_{pq}^r = u_{ij}^k (R^{-1})_p^i (R^{-1})_q^j R_k^r.$$ A generalization of the cross product is the Lie bracket of a Lie algebra. Here \(V\) is the vector space of the Lie algebra, and again the bracket takes in two vectors and spits out a vector, so its coefficients have the same structure \(u_{ij}^k\) and transform the same way. The antisymmetry rule is again \(u_{ij}^k = -u_{ji}^k\). The Jacobi identity is \(u_{ij}^h u_{kl}^j = u_{jl}^h u_{ik}^j + u_{kj}^h u_{il}^j\).
Okay, that's kind of an ugly mess, but the important bit is that we can see how the coefficients of the bracket transform when we change basis.
Since we're here, we can note that the cross product also obeys the Jacobi identity, so \(\mathbb{R}^3\) is also a Lie algebra with this bracket. Which one? \(so_3(\mathbb{R})\)!
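Here's a small sketch verifying that numerically: the structure constants of the cross product are the Levi-Civita symbol, and they satisfy both the antisymmetry rule and the index form of the Jacobi identity written above.

```python
# A sketch: u[i, j, k] = u_{ij}^k for the cross product on R^3 (the Levi-Civita symbol).
import numpy as np

u = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    u[i, j, k] = 1.0
    u[j, i, k] = -1.0

# antisymmetry: u_{ij}^k = -u_{ji}^k
assert np.allclose(u, -u.transpose(1, 0, 2))

# Jacobi: u_{ij}^h u_{kl}^j = u_{jl}^h u_{ik}^j + u_{kj}^h u_{il}^j
lhs = np.einsum('ijh,klj->iklh', u, u)
rhs = np.einsum('jlh,ikj->iklh', u, u) + np.einsum('kjh,ilj->iklh', u, u)
assert np.allclose(lhs, rhs)
```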

Tensors

Tensors are like products for vector spaces.

More explicitly, we can add vectors, and we have two things that are called "products" for vectors, namely the dot product and the cross product. But we would like something a little more general.
Let's take some vector spaces, \(V\) and \(W\), over our field \(\mathbb{k}\). Let's suppose that we have a basis for each space, so that \(V\) has basis \(\textbf{e}_1,\textbf{e}_2,\ldots,\textbf{e}_m\) and similarly \(W\) has basis \(\textbf{f}_1,\ldots,\textbf{f}_n\).
We introduce the notion of an upper index so that we can write vectors in \(V\) as
$$\textbf{v} = v^1 \textbf{e}_1 + v^2\textbf{e}_2 + \ldots + v^m \textbf{e}_m$$ where the terms \(v^i\) are all numbers in \(\mathbb{k}\). For the rest of this post and many posts afterwards, \(^1, ^2, ^i\) and so on will indicate just bookkeeping indices rather than exponents; the numbers \(v^1, v^2,\ldots\) don't necessarily have any relation to each other. The upper-lower stuff becomes useful for distinguishing what goes where later.
Now we want to consider pairs where one element is a basis vector from \(V\) and the other is a basis element from \(W\): \((\textbf{e}_i, \textbf{f}_j)\). We'll write this pair as \(\textbf{e}_i \otimes \textbf{f}_j\) to emphasize that this is supposed to be viewed as the product of \(\textbf{e}_i\) by \(\textbf{f}_j\). The symbol \(\otimes\) is called the tensor product.
What if we want to take the product of more general vectors? Let's look at \((v^1 \textbf{e}_1 + v^2 \textbf{e}_2) \otimes (w^3 \textbf{f}_3 + w^4 \textbf{f}_4)\). Expanding out as if this were regular multiplication gives:
$$v^1 \textbf{e}_1 \otimes w^3 \textbf{f}_3 + v^1 \textbf{e}_1 \otimes w^4 \textbf{f}_4 + v^2 \textbf{e}_2 \otimes w^3 \textbf{f}_3 + v^2 \textbf{e}_2 \otimes w^4 \textbf{f}_4$$ We'll instate the following rule: scalars can pass through \(\otimes\). This makes sense, since for scalars, \(a\) times a product \(bc\) is equal to \(b\) times the product \(ac\), so \(b\) passes through the product of \(a\) and \(c\).
So we can rewrite the previous expression as
$$v^1w^3 \textbf{e}_1 \otimes \textbf{f}_3 + v^1w^4 \textbf{e}_1 \otimes \textbf{f}_4 + v^2w^3 \textbf{e}_2 \otimes \textbf{f}_3 + v^2w^4 \textbf{e}_2 \otimes \textbf{f}_4$$ This tells us that we'll want to have the objects \(\textbf{e}_i\otimes \textbf{f}_j\) be considered distinct, and indeed linearly independent, when we vary \(i\) and \(j\). We get a new vector space \(V\otimes W\) with basis given by all the products of the form \(\textbf{e}_i \otimes \textbf{f}_j\). Since there are \(m\) basis elements \(\textbf{e}_i\) for \(V\) and \(n\) basis elements \(\textbf{f}_j\) for \(W\), we get that the vector space \(V\otimes W\) has dimension \(mn\).

Note that we can also form \(W \otimes V\), whose basis is given by products of the form \(\textbf{f}_j \otimes \textbf{e}_i\). The two vector spaces \(V\otimes W\) and \(W\otimes V\) have the same dimension, and indeed the basis vectors of each can be easily and naturally matched up. So we don't always distinguish between them.
We do care about the ordering when it comes to individual elements. In particular, if we look at \(V\otimes V\), that has basis elements of the form \(\textbf{e}_i \otimes \textbf{e}_j\). If this is going to have dimension \(m^2\), then we need that \(\textbf{e}_i \otimes \textbf{e}_j\) is linearly independent from \(\textbf{e}_j \otimes \textbf{e}_i\); this distinguishes \(\otimes\) from the usual product for scalars.
We note that \(v^1w^3\) has two upper indices, and so in general, the coefficients for elements of \(V\otimes W\) will have two upper indices, one for \(V\) and one for \(W\). So we get that a general element of \(V\otimes W\) looks like \(\sum_{i,j=1}^{m,n} u^{ij} \textbf{e}_i \otimes \textbf{f}_j\).
Since \(V\otimes W\) is a vector space, if we have another vector space \(U\) we can form the tensor product \(U \otimes V \otimes W\), and the coefficients will then have 3 indices, one for each factor, and so on for more complicated products. Just for completeness, if for some reason we have the tensor product of a vector and a scalar, i.e. \(c \otimes \textbf{v}\) or \(\textbf{v}\otimes c\), both of those are just equal to \(c\textbf{v}\).
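For the computationally inclined, here's a minimal sketch of these coefficient arrays in code; np.outer builds the coefficients \(u^{ij} = v^i w^j\) of the tensor product of two specific vectors, and the dimension count \(mn\) shows up as the shape. Keep in mind that a general element of \(V \otimes W\) is an arbitrary array of coefficients \(u^{ij}\), not necessarily of this special product form.

```python
# A sketch: the coefficients of v ⊗ w in V ⊗ W are u^{ij} = v^i w^j.
import numpy as np

v = np.array([1.0, 2.0])        # v^i e_i in V, with m = 2
w = np.array([3.0, 4.0, 5.0])   # w^j f_j in W, with n = 3

u = np.outer(v, w)              # u[i, j] = v^i w^j
print(u.shape)                  # (2, 3): V ⊗ W has dimension mn = 6
```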

Let's introduce another idea: duals.
Let's look at linear transformations from \(V\) to \(\mathbb{k}\). Given two such transformations \(p\) and \(q\), we can form their sum, \(p + q\), defined by \((p + q)(\textbf{v}) = p(\textbf{v}) + q(\textbf{v})\). We can also multiply a transformation by a scalar in the obvious fashion. So the set of linear transformations from \(V\) to \(\mathbb{k}\) forms a vector space, which we'll call the dual space of \(V\) and write as \(V^\vee\).
If \(V\) is finite dimensional, then we can use the basis of \(V\) to make a dual basis for \(V^\vee\): we define \(\textbf{e}^i\) such that \(\textbf{e}^i(\textbf{e}_j)\) is 1 if \(i = j\) and 0 if \(i \neq j\); since everything is linear, this condition defines \(\textbf{e}^i\) on all of \(V\). Again, the upper index is just for bookkeeping, not an exponent.
We write a more general element of \(V^\vee\) as \(\sum_{i = 1}^m v_i\textbf{e}^i\), where the \(v_i\) are in \(\mathbb{k}\).
We can talk about tensor products of elements of \(V^\vee\) with elements of \(V\) or \(W\); again we just form the products of the basis elements and then take linear combinations.

A general basis element of \(V \otimes V^\vee\) is of the form \(\textbf{e}_i \otimes \textbf{e}^j\), and so the coefficient to match that should have an upper index \(^i\) and a lower index \(_j\). So a general element of \(V\otimes V^\vee\) looks like \(\sum_{i,j=1}^m u^i_j \textbf{e}_i\otimes \textbf{e}^j\).

A bit about how matrices act on tensors:
Suppose we have a tensor product of a bunch of vectors, \(\textbf{v}_1, \textbf{v}_2, \ldots, \textbf{v}_k\), and a bunch of matrices, \(T_1, T_2, \ldots, T_k\), such that \(T_i \textbf{v}_i\) makes sense. Then we define the tensor product \(T_1 \otimes T_2 \otimes \cdots \otimes T_k\) as a transformation on \(\textbf{v}_1 \otimes \textbf{v}_2 \otimes \cdots \otimes \textbf{v}_k\) such that
$$(T_1 \otimes T_2 \otimes \cdots \otimes T_k)(\textbf{v}_1 \otimes \textbf{v}_2 \otimes \cdots \otimes \textbf{v}_k) = (T_1 \textbf{v}_1) \otimes (T_2 \textbf{v}_2) \otimes \cdots \otimes (T_k \textbf{v}_k).$$ From this, matrix composition for tensors acts factor-wise as well:
$$(S_1 \otimes S_2 \otimes \cdots \otimes S_k)(T_1 \otimes T_2 \otimes \cdots \otimes T_k) = (S_1 T_1) \otimes (S_2T_2) \otimes \cdots \otimes (S_kT_k)$$
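Here's a small sketch of both rules with numpy, where np.kron plays the role of \(\otimes\) on coordinate arrays.

```python
# A sketch: matrices act factor-wise on tensor products, and compose factor-wise too.
import numpy as np

rng = np.random.default_rng(1)
S, T = rng.standard_normal((2, 2)), rng.standard_normal((3, 3))
v, w = rng.standard_normal(2), rng.standard_normal(3)

# (S ⊗ T)(v ⊗ w) = (S v) ⊗ (T w)
assert np.allclose(np.kron(S, T) @ np.kron(v, w), np.kron(S @ v, T @ w))

# (S ⊗ T)(S2 ⊗ T2) = (S S2) ⊗ (T T2)
S2, T2 = rng.standard_normal((2, 2)), rng.standard_normal((3, 3))
assert np.allclose(np.kron(S, T) @ np.kron(S2, T2), np.kron(S @ S2, T @ T2))
```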
At the moment we'll introduce what is called "Einstein summation": if in an expression we have something with an upper index, say, \(^i\) and in the same term a matching lower index, \(_i\), then we automatically assume that we sum the term over all possible values of that index. So when we write \(v^i \textbf{e}_i\) we really mean \(\sum_{i=1}^m v^i \textbf{e}_i\). This helps when our tensors have lots of pieces.
So a general element of \(V\otimes V^\vee\) looks like \(u^i_j \textbf{e}_i \otimes \textbf{e}^j\), and a general element of \(V\otimes W\) looks like \(u^{ij} \textbf{e}_i \otimes \textbf{f}_j\).
This convention will be in effect for basically ever, except when otherwise noted.

Lie Algebras

Now we're going to talk about a subject closely related to matrix groups using the infinitesimals that we established.

So let's go back to our idea of matrix groups. We are considering \(n \times n\) matrices that are invertible, and we're in particular considering subsets of such matrices that obey the various group laws. We're going to say that our matrices have entries in either \(\mathbb{R}\) or \(\mathbb{C}\), and when I don't want to specify one or the other I'll say that the entries are in \(\mathbb{k}\).
Recall that every group has an identity element that we'll call \(I\), which here means the \(n \times n\) matrix with \(1\)s down the diagonal and \(0\)s everywhere else. We want to look at group elements of the form \(I + sX\), where \(s\) is an infinitesimal and \(X\) has either real or complex entries. This is the first instance of "things that don't technically exist" regarding infinitesimals; if our group is made of matrices only with real or complex entries, then we don't actually have elements of the form \(I + sX\) since that would require infinitesimal entries.
But we're going to pretend as if \(I + sX\) is in the group, the same way that we're pretending that our numbers allow for infinitesimals at all. There are ways to do this more formally, via what are called "algebraic schemes" but I'm not going to get into that.
To add in a condition that will make this pretending a little less egregious, we're going to say that if \(I + sX\) is in the group, then \(I + csX\) is in the group for any number \(c \in \mathbb{k}\).
Note that we already know that \(I + nsX\) is in the group for any integer \(n\): if we multiply out \((I + sX)^n\) for positive integer \(n\) we get \(I + nsX\), and the inverse of \(I + sX\) is \(I - sX\) (Check this yourself!), so \(I + nsX\) is in the group for any negative integer \(n\). But we extend to any number in \(\mathbb{k}\) to get ourselves a Lie group.
We want to look at the set of matrices \(X\) with entries in \(\mathbb{k}\) such that \(I\) plus an infinitesimal times \(X\) is in \(G\). We call this set \(\mathfrak{g}\). What is in this set? We know that if \(X\) is in \(\mathfrak{g}\) then \(cX\) is for any \(c\in \mathbb{k}\). If we look at \(Y\) such that \(I + sY\) is in the group, then we get that
$$(I + sX)(I + sY) = I + s(X + Y)$$ so \(X + Y\) is in \(\mathfrak{g}\). Hence \(\mathfrak{g}\) is a vector space.
\(\mathfrak{g}\) has some further structure, though.
Consider the question of whether \(G\) is commutative or not, i.e. whether for \(g\) and \(h\) in \(G\), \(gh = hg\). We can look at this question as instead asking whether \(ghg^{-1}h^{-1} = I\), since if we can swap \(g\) and \(h\), then \(ghg^{-1}h^{-1} = hgg^{-1}h^{-1} = hh^{-1} = I\), and conversely.
Now let's look at group elements of the form \(I + sX\). Recall that \((I + sX)^{-1} = I - sX\). We get
$$(I + sX)(I + sY)(I - sX)(I - sY) = I$$ so that doesn't tell us anything useful. But remember that we have two independent infinitesimals to play around with, so we look instead at \(I + tY\) and get
$$(I + sX)(I + tY)(I - sX)(I - tY) = I + st(XY - YX)$$ So now the object \(st(XY - YX)\) measures whether \(I + sX\) and \(I + tY\) commute or not.
Now \(st\) is an infinitesimal, and \(I + st(XY - YX)\) is in the group, so we get that \(XY - YX\) is in \(\mathfrak{g}\). This is the extra structure that \(\mathfrak{g}\) has, called a Lie bracket, denoted by \([X, Y] = XY - YX\).
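If you'd like to watch that computation happen without doing the algebra by hand, here's a small sketch with sympy, using ordinary symbols \(s\) and \(t\) and throwing away every \(s^2\) and \(t^2\) term at the end to enforce the infinitesimal rule.

```python
# A sketch of (I + sX)(I + tY)(I - sX)(I - tY) = I + st(XY - YX) once s^2 = t^2 = 0,
# for generic 2x2 matrices X and Y with symbolic entries.
import sympy as sp

s, t = sp.symbols('s t')
X = sp.Matrix(2, 2, sp.symbols('x:4'))
Y = sp.Matrix(2, 2, sp.symbols('y:4'))
I = sp.eye(2)

def truncate(M):
    # kill every term containing s**2 or t**2, mimicking the infinitesimal rule
    return M.applyfunc(lambda e: sp.expand(e).subs({s**2: 0, t**2: 0}))

comm = truncate((I + s*X) * (I + t*Y) * (I - s*X) * (I - t*Y))
diff = (comm - (I + s*t*(X*Y - Y*X))).applyfunc(sp.expand)
assert all(entry == 0 for entry in diff)
```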
Abstracting away from our groups and matrices, we get the abstract rules for a Lie algebra:
1): A Lie algebra \(\mathfrak{g}\) is a vector space with an operation denoted by \([, ]\).
2): If \(X\) and \(Y\) are in \(\mathfrak{g}\) then \([X, Y]\) is in \(\mathfrak{g}\).
3): Bilinearity: For \(X, Y,\) and \(Z\) in \(\mathfrak{g}\) and \(c \in \mathbb{k}\),
$$[X, Y + cZ] = [X, Y] + c[X, Z], [X + cY, Z] = [X, Z] + c[Y, Z]$$ 4): Antisymmetry: For \(X\) and \(Y\) in \(\mathfrak{g}\), \([X, Y] = -[Y, X]\)
5): Jacobi identity: For \(X, Y\) and \(Z\) in \(\mathfrak{g}\),
$$[X, [Y, Z]] = [[X, Y], Z] + [Y, [X, Z]]$$ You can check that all of these hold for matrices, but sometimes we want to talk about Lie algebras without going through matrices, just as we have groups that are sets without being realized as matrices.
Let's look at some examples. Recall the group \(SO_3(\mathbb{R})\) of rotations in 3-dimensional real space. The elements of \(SO_3(\mathbb{R})\) obey the rule \(T^t T = I\) where \(^t\) indicates matrix transposition.
If we replace \(T\) with \(I + sX\), we get the rule \((I + sX)^t (I + sX) = I\). \((I + sX)^t = I + sX^t\), so we get
$$(I + sX^t)(I + sX) = I$$ We already know that \((I - sX)(I + sX) = I\), so we get the rule that \(X^t = -X\). So the Lie algebra of \(SO_3(\mathbb{R})\), often denoted by \(so_3(\mathbb{R})\), is given by the \(3 \times 3\) antisymmetric matrices with real entries.
Another example: the group \(SL_n\) of \(n \times n\) matrices with determinant \(1\). Given an element \(I + sX\), what is the determinant?
The determinant of \(I + sX\) ends up being \(1 + s\text{tr}(X)\), which you can see by writing out a few of the terms and noting that any term that involves \(X\) more than once has to vanish since \(s^2 = 0\). So if \(I + sX\) has determinant \(1\), that implies that \(\text{tr}(X)\) is \(0\), and conversely. So the Lie algebra \(sl_n\) is the set of \(n \times n\) matrices with trace \(0\).
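Here's a quick sketch of that determinant computation in sympy, keeping only the constant and linear parts in \(s\), which is what setting \(s^2 = 0\) amounts to.

```python
# A sketch: det(I + sX) = 1 + s*tr(X) once s^2 and higher powers are thrown away,
# checked for a generic 3x3 matrix X with symbolic entries.
import sympy as sp

s = sp.symbols('s')
X = sp.Matrix(3, 3, sp.symbols('x:9'))

det = sp.expand((sp.eye(3) + s * X).det())
truncated = det.coeff(s, 0) + det.coeff(s, 1) * s   # drop the s^2 and s^3 terms
assert sp.expand(truncated - (1 + s * X.trace())) == 0
```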
For \(GL_n\), \(X\) can be anything, since \(I + sX\) always has inverse \(I - sX\) for any \(X\).
This, by the way, is how we're going to get around the issues of whether these infinitesimal elements are actually in the group or not. We look at matrices with entries not in \(\mathbb{k}\), but rather in \(\mathbb{k}[s, t]\) so that we have infinitesimals, and then we focus on groups of the form "matrices obeying these conditions" where the conditions can be written as polynomials in the entries of the matrices. In this setup, we actually do end up with elements of the form \(I + sX\).
If we wanted to do this without infinitesimals, we'd have to talk about derivatives and one-parameter groups and invariant vector fields and all that machinery from analytic differential geometry, and honestly the synthetic, infinitesimal stuff doesn't get enough love.

Wednesday, December 2, 2015

Infinitesimals

So now we get to the bit that looks kind of weird to many mathematicians, as opposed to just being weird to laypeople.
One of the things we take for granted about the real numbers, or indeed elements of any field, is that if you have a number \(b\) and \(b\) is not 0, then if \(ab = 0\) then \(a\) has to be 0. The technical jargon is that a field has no "zero-divisors".
For our purposes, we're going to break this rule. We're going to define what are called "infinitesimals". We say that a number \(s\) is infinitesimal if \(s\) is not equal to 0, but for some positive integer \(n\), \(s^n = 0\). This isn't true for any real or complex number, so infinitesimals are necessarily a new thing.
We'll assume for all of our infinitesimals here that \(s^2 = 0\). We could have infinitesimals where \(s^3 = 0\) but \(s^2 \neq 0\), but we don't need them for what I want to do. So now we explicitly define an infinitesimal to be an object \(s\) where \(s \neq 0\) but \(s^2 = 0\).
If \(s\) is an infinitesimal and \(a\) is a real number, then \(as\) is also an infinitesimal. We can add infinitesimals to regular numbers, and get things like \(a + bs\), and we can multiply such sums:
$$(a + bs)(x + ys) = ax + ays + bsx + bsys = ax + (ay + bx)s$$ Here we've decided that for \(s\) an infinitesimal, \(as = sa\) just like with regular numbers. We also say that for two infinitesimals \(s\) and \(t\), \(st = ts\). Just to make life simpler.
Also note that if \(s\) is an infinitesimal and \(t\) is an infinitesimal, then \(st\) is either infinitesimal or 0. We'll say that \(s\) and \(t\) are independent infinitesimals if they're both infinitesimal and \(st\) is not 0. I don't think this is standard convention, but I need a term for this.
So let's talk about calculus for a moment.
Consider the function \(f(x) = x^n\). What happens if we evaluate \(f(x + s)\)?
Recall the binomial theorem:
$$(x + y)^n = x^n + \binom{n}{1}x^{n-1}y + \binom{n}{2}x^{n-2}y^2 + \ldots + y^n$$ Plugging in \(s\) for \(y\) and recalling that \(s^2 = 0\), we get that
$$(x + s)^n = x^n + nx^{n-1}s.$$ This looks like \(f(x)\) plus \(f'(x)s\). Extending to polynomials and then to Taylor series tells us that for functions that are equal to their Taylor series,
$$f(x+s) = f(x) + f'(x)s.$$ Hence our infinitesimals allow us to take derivatives!
Perhaps more reassuringly, we can rewrite the previous expression as
$$f'(x) = \frac{f(x+s) - f(x)}{s}$$ where we have to be a bit careful about dividing by an infinitesimal; we can only do it because \(f(x+s)-f(x)\) is a multiple of \(s\); dividing a regular number by \(s\) gets us into trouble.
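Just to double-check that claim against ordinary calculus, here is a small sympy sketch (again my own, not part of the main argument): expand \(f(x+s)\), keep only the terms up to first order in \(s\), and compare the coefficient of \(s\) with the usual derivative. The test functions are just a sample of things equal to their Taylor series.

```python
import sympy as sp

x, s = sp.symbols('x s')

for f in (x**5, 3*x**4 - 2*x + 7, sp.sin(x), sp.exp(x)):
    # expand f(x + s) and keep only terms up to first order in s,
    # which is exactly the rule s**2 = 0
    expansion = sp.series(f.subs(x, x + s), s, 0, 2).removeO()
    regular_part = expansion.coeff(s, 0)
    infinitesimal_part = expansion.coeff(s, 1)
    assert sp.simplify(regular_part - f) == 0                 # f(x)
    assert sp.simplify(infinitesimal_part - sp.diff(f, x)) == 0  # f'(x)
```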
We'll use this trick of using infinitesimals to mimic derivatives several times, and every time we do we could have used derivatives but that would lead to a more complicated setup. We can do this only because I'm going to be doing everything algebraically; no inequalities, no limits, no integrals. Only these derivatives that we get from infinitesimals. Occasionally this will mean that I leave out some conditions necessary for various things to exist, instead just assuming that we're in whatever case needed for those things to exist. I'll try to note when that happens.
Anyway, for the nonce we're going to say that for a field \(\mathbb{k}\) and independent infinitesimals \(s\) and \(t\), the set \(\mathbb{k}[s,t]\) is the set of elements of the form \(a + bs + ct + dst\) where \(a, b, c\) and \(d\) are in \(\mathbb{k}\). Note that we can add, subtract, and multiply in this set without any problems; dividing is tricky, so we'll try to only divide by numbers in \(\mathbb{k}\).
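To see that \(\mathbb{k}[s,t]\) really is closed under multiplication, here is one more small sympy sketch (my own bookkeeping; the coefficients \(p, q, r, w\) of the second factor are just labels I picked): multiply two general elements and drop every monomial containing \(s^2\) or \(t^2\).

```python
import sympy as sp

s, t = sp.symbols('s t')
a, b, c, d = sp.symbols('a b c d')
p, q, r, w = sp.symbols('p q r w')

def reduce_st(expr):
    """Impose s**2 = 0 and t**2 = 0 by dropping offending monomials."""
    poly = sp.Poly(sp.expand(expr), s, t)
    kept = [coeff * s**i * t**j
            for (i, j), coeff in poly.terms() if i < 2 and j < 2]
    return sp.expand(sum(kept))

u = a + b*s + c*t + d*s*t
v = p + q*s + r*t + w*s*t
product = reduce_st(u * v)
expected = (a*p + (a*q + b*p)*s + (a*r + c*p)*t
            + (a*w + b*r + c*q + d*p)*s*t)
# the product is again of the form (regular) + (...)s + (...)t + (...)st
assert sp.expand(product - expected) == 0
```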

Matrix Groups

Let's look at a big class of groups called "matrix groups": groups that can be realized as sets of matrices, with matrix multiplication as the composition.
Let's fix a positive integer \(n\) and consider \(n \times n\) matrices with entries in either \(\mathbb{R}\) or \(\mathbb{C}\); when I don't want to pick one in particular, I'll just write \(\mathbb{k}\). Recall what we need for a group:
1): Composition
2): Identity
3): Inverses
4): Associativity
Here composition means matrix multiplication. If we have two \(n \times n\) matrices, we can multiply them to get another \(n \times n\) matrix. Here's where the order of operations comes in. When we apply a matrix to a vector by multiplication, we multiply with the matrix on the left: \(T \textbf{v}\). So if we first apply \(T\) and then apply \(S\), we get \(S(T \textbf{v})\) = \((ST) \textbf{v}\). Hence the ordering as described last time: things applied later go on the left. It's kind of weird looking, but we're stuck with it.
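Here is a quick numpy sketch of that ordering convention, using a hypothetical rotation \(S\) and stretch \(T\) of the plane (my own example, nothing canonical about these particular matrices):

```python
import numpy as np

S = np.array([[0., -1.], [1., 0.]])   # rotate the plane a quarter turn
T = np.array([[2., 0.], [0., 1.]])    # stretch along the x-axis
v = np.array([1., 1.])

# "first apply T, then apply S" means S(Tv), which is the same as (ST)v
first_T_then_S = S @ (T @ v)
assert np.allclose(first_T_then_S, (S @ T) @ v)

# and it is generally not the same as (TS)v, so the order matters
print((S @ T) @ v)   # [-1.  2.]
print((T @ S) @ v)   # [-2.  1.]
```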
What about identity? Well, we have an object nicely called the "identity matrix", which has 1s down the main diagonal and 0s everywhere else. Call this object \(I\). It satisfies \(IT = TI = T\) for any \(n \times n\) matrix \(T\), so we have an identity.
Now we need inverses. Not all matrices have inverses. Fortunately there are several ways to determine whether a matrix has an inverse. We're not going to pick one in particular, we're just going to restrict ourselves to invertible matrices for the moment. The set of invertible \(n \times n\) matrices over \(\mathbb{k}\) is often written as \(GL_n(\mathbb{k})\), where \(GL\) stands for "general linear", since matrices are linear transformations. If we've decided on what \(\mathbb{k}\) stands for, we'll often just write \(GL_n\).
Finally, matrix multiplication always obeys associativity, so we don't need to worry about that one.
\(GL_n\) is a group: the product of two invertible matrices is invertible (its inverse is the product of the inverses in the opposite order), the identity matrix is its own inverse, and if a matrix is invertible then so is its inverse. But there are plenty of other groups that can be realized as matrices.
Consider the cube again. Let's place it so that the center of the cube is at the origin in \(\mathbb{R}^3\) and that its edges are parallel to the coordinate axes. Then the symmetries of the cube can be written as linear transformations on all of \(\mathbb{R}^3\), since we can match basis vectors to the sides of the cube and so moving the sides moves the basis vectors. Hence we have another group of matrices. Note that all of these matrices are necessarily invertible, and hence live in \(GL_3\); we say that they form a subgroup, since they're a subset of \(GL_3\) and they form a group in their own right.
Now consider a sphere centered at the origin. Just like the cube, we can rotate it around various axes. In fact, we can rotate it around any axis that passes through the origin and it will still occupy the same space. So we can again realize the symmetries of a sphere as a group of matrices. If we include reflections, we get another group of matrices, denoted \(O_3(\mathbb{R})\) (since the sphere is real and any matrix that takes the sphere to itself has real entries). The \(O\) indicates "orthogonal", because any linear transformation that preserves the sphere also preserves right angles between things (in fact, it preserves all angles). In fact, \(O_3(\mathbb{R})\) preserves more than just angles; since it preserves a sphere, and a sphere is defined in terms of distances, this group preserves distances as well. In general, the group \(O_n(\mathbb{R})\) is the group of \(n \times n\) matrices that preserve distance in \(n\)-dimensional space.
Preserving distances means that the rotations preserve the dot product. In other words, if \(T\) is a rotation matrix, then for vectors \(\textbf{u}\) and \(\textbf{v}\), we get that
$$T\textbf{u} \cdot T\textbf{v} = \textbf{u} \cdot \textbf{v}$$ If we view our vectors as matrices \([u]\) and \([v]\), we get that we can write the dot product as
$$\textbf{u}\cdot \textbf{v} = [u]^t [v]$$ where \(^t\) indicates matrix transposition. So our rule about preserving dot products can be written with only matrices:
$$(T[u])^t T[v] = [u]^t T^t T [v] = [u]^t [v]$$ Since this has to be true for all \(\textbf{u}\) and \(\textbf{v}\), we get that the only possibility is for \(T^t T = I\). Thus we get that any real-valued matrix such that \(T^t T = I\) is in \(O_n(\mathbb{R})\). Similarly, any complex-valued matrix such that \(T^t T = I\) is in \(O_n(\mathbb{C})\).
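As a concrete check, here is a small numpy sketch (mine, purely illustrative): a rotation about the \(z\)-axis satisfies \(T^tT = I\) and preserves dot products.

```python
import numpy as np

theta = 0.7
# rotation about the z-axis in R^3
T = np.array([[np.cos(theta), -np.sin(theta), 0.],
              [np.sin(theta),  np.cos(theta), 0.],
              [0.,             0.,            1.]])

# T^t T = I, so T is in O_3(R)
assert np.allclose(T.T @ T, np.eye(3))

# and it preserves dot products (hence lengths and angles)
rng = np.random.default_rng(0)
u, v = rng.standard_normal(3), rng.standard_normal(3)
assert np.isclose((T @ u) @ (T @ v), u @ v)
```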
In keeping with the promise to become less easily-visualized, let's talk about the determinant briefly. The determinant is multiplicative, in that given two \(n \times n\) matrices \(S\) and \(T\),
$$\det(S)\det(T) = \det(ST)$$ In particular, if \(\det(S) = \det(T) = 1\), then \(\det(ST) = 1\), and also \(S\) and \(T\) are invertible, and their inverses have determinant 1. So the set of \(n \times n\) matrices with determinant 1 forms a group, called \(SL_n\), for "special linear" group. We can also look at \(SO_n(\mathbb{k})\), which consists of the orthogonal matrices that also have determinant 1.
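Here is a tiny numpy sketch (mine, purely illustrative) of the multiplicativity of the determinant, plus the closure and inverse parts of \(SL_n\) being a group, using two shear matrices as a concrete example:

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((3, 3))
T = rng.standard_normal((3, 3))

# the determinant is multiplicative
assert np.isclose(np.linalg.det(S) * np.linalg.det(T), np.linalg.det(S @ T))

# two shear matrices with determinant 1; their product and their inverses
# also have determinant 1, which is the group structure on SL_2
A = np.array([[1., 1.], [0., 1.]])
B = np.array([[1., 0.], [1., 1.]])
assert np.isclose(np.linalg.det(A), 1.0)
assert np.isclose(np.linalg.det(B), 1.0)
assert np.isclose(np.linalg.det(A @ B), 1.0)
assert np.isclose(np.linalg.det(np.linalg.inv(A)), 1.0)
```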
Before I give anyone the wrong idea, there are groups that actually cannot be realized as matrices. They obey the four rules, but for some reason or other there is no set of matrices such that for each element in the group there is a distinct matrix and the group composition matches the matrix multiplication. We mostly won't be concerning ourselves with such objects, but it is notable that they exist.

Groups

One of the objects that almost every mathematician comes across is a group. Groups describe how symmetries work in mathematics.

Consider a square hole and a square peg that sits inside the hole. You can remove the peg, rotate it by a quarter turn, and put it back in the hole, and it will fit. You can remove the peg, rotate it by half a turn, or 3-quarters of a turn, or a whole turn, and it will fit. You can do any of these actions followed by any other action, and the peg will fit, and you can undo any of these: if you rotate the peg by a quarter turn, followed by a 3-quarters turn, it will be in the original position, so it's like you never did anything.
So we have four positions that the peg can end up in, and thus four essentially different things we can do to the peg: rotate a quarter turn, rotate a half turn, rotate 3-quarters of a turn, or do nothing.
If we started with a triangular peg in a triangular hole, we'd have three positions, and thus three things we could do.
If we started with a circular peg in a circular hole, we'd have an infinite number of positions.
Now imagine a cubical box. It's not sitting in a hole, it's just lying on a table, but we're still supposed to pick it up, move it in some way, and then put it back so that it's occupying the same space that it was in before.
We can do a few things to the box. We can flip it over in various ways, and rotate it in various ways. In fact we have three axes about which we can rotate the box, and we can do a quarter turn, half turn, or 3-quarters turn around each of those axes. If we do several rotations, it matters which order we do the rotations in. If you rotate it a quarter turn around one axis and then a quarter turn around another, you get a different result than if you do the two turns in the opposite order.

Given an object and a hole that it fits in, we can rotate or flip the object in various ways so that it still fits in the hole. Call these things we can do to the object "symmetries" of the object. Normally when we say "symmetry" we mean an object looking the same on both sides, which corresponds to flipping the object over; since it looks the same on both sides it still fits in the "hole". Mathematicians use "symmetry" to refer to the act of flipping itself.
We've noticed some things about symmetries:
Doing nothing is a symmetry. A symmetry followed by another symmetry is a symmetry. Every symmetry is reversible: we can undo any symmetry.
If we write our symmetries symbolically, we can write the do-nothing symmetry as \(e\); for symmetries \(f\) and \(g\), we write "doing \(f\) followed by doing \(g\)" as \(gf\) (the reason for the ordering will come later); and for a symmetry \(f\), there is another symmetry denoted \(f^{-1}\) such that \(f\) followed by its inverse is like doing nothing: \(f^{-1}f = e\).
There is one more rule that symmetries follow. Remember that \(gf\) is a symmetry, and \(h\) is a symmetry, so doing \(gf\) followed by doing \(h\) is also a symmetry, \(h(gf)\). Likewise \(hg\) is a symmetry, and doing \(f\) followed by doing \(hg\) is also a symmetry, \((hg)f\). The rule, which should sound trivial, is that \(h(gf) = (hg)f\): both just mean doing \(f\), then \(g\), then \(h\). So we can just write \(hgf\) without parentheses.
Any full set of symmetries of an object obeys these rules:
1): Closure: if \(f\) and \(g\) are in the set then \(gf\) is in the set.
2): Identity: there is a do-nothing symmetry called the identity \(e\), so that \(ef = fe = f\).
3): Inverses: there is an inverse of \(f\), denoted \(f^{-1}\), such that \(ff^{-1} = f^{-1}f = e\).
4): Associativity: \((hg)f = h(gf)\).
Any set with a composition operation that obeys the above four rules is called a group. The set of rotations that we can do to a square peg and still have it fit in the square hole is called the "group of symmetries of the square peg", and similarly for the triangular peg, the circular peg, and the cubical box.
As we go on, the "pegs" and the kinds of "holes" we want them to fit will become more complicated, less easily visualized, but these examples provide the basic intuition.
One last bit to note:
For the square peg, you have four symmetries, and if you do two of them in a row, it doesn't matter what order you do them in: the result is the same. We thus say that the symmetries commute, and that the group of symmetries of the square peg is "commutative" or "Abelian". For the cube, in contrast, the order does sometimes matter, so we say that the group of symmetries of the cube is "noncommutative" or "non-Abelian". These notions, of commutativity and noncommutativity, play a big part in what's to come.
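If you want to see the contrast concretely, here is a small numpy sketch (my own illustration, using the matrix realizations of these symmetries): quarter and half turns of the square commute, while quarter turns of the cube about two different axes do not.

```python
import numpy as np

# quarter turn of the square peg (rotation of the plane by 90 degrees)
Q = np.array([[0., -1.],
              [1.,  0.]])
H = Q @ Q   # half turn
# the square peg's symmetries commute
assert np.allclose(Q @ H, H @ Q)

# quarter turns of the cube about the x-axis and about the z-axis
Rx = np.array([[1., 0.,  0.],
               [0., 0., -1.],
               [0., 1.,  0.]])
Rz = np.array([[0., -1., 0.],
               [1.,  0., 0.],
               [0.,  0., 1.]])
# these do not commute: the order of the two quarter turns matters
print(np.allclose(Rx @ Rz, Rz @ Rx))   # False
```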

Preliminaries

I'm going to assume everyone here is familiar with linear algebra, at least so far as matrices being used for turning vectors into other vectors. I'll want to go a little bit beyond that, but more on that as I figure it out.
Also I hope everyone is comfortable with the complex numbers, because a lot of what I want to do doesn't work well if we don't have square roots of everything. Some things can be done with just the real numbers, some can't. Similarly, I'm going to be assuming characteristic 0 unless I explicitly say otherwise, because representation theory over fields of positive characteristic gets really hairy really quickly.
Finally, I won't be proving much. Some proofs are great, but some are just tedious, and I mostly want to build things and then show why I'm building them rather than checking fiddly details to make sure the constructions work.

While we're here, I'm going to mention something that I did on G+ but is a little bit nonstandard: I'm going to be using algebraic infinitesimals to motivate certain things about Lie algebras. Often mathematicians get a little leery of infinitesimals, and for good reason, but those reasons don't apply to what I'm doing so I'm going to do it. It makes things a little nicer in a lot of cases.

But that won't come up for a bit, as I want to get all of my tensor stuff here so I can refer to it.

Welcome

I don't know what I'm doing here, but this is a better venue for posting what apparently is becoming a book on Lie algebras than just using G+. Mostly because it has LaTeX support, and also because I somehow feel less guilty about spewing long posts on a personal blog than on G+. More general nonsense will remain on G+, except for the stuff that ends up elsewhere and for the stuff that I forget doesn't really belong here.
So as such this will mostly be a not-really-coherent blog about Lie algebras from someone who first learned them in terms of differential geometry and wants to learn them more algebraically, but is instead mostly just looking at categories at the moment for some reason. We'll see how this goes.