Thursday, December 3, 2015

Tensors

Tensors are like products for vector spaces.

More explicitly, we can add vectors, and we have two things that are called "products" for vectors, namely the dot product and the cross product. But we would like something a little more general.
Let's take some vector spaces, \(V\) and \(W\), over our field \(\mathbb{k}\). Let's suppose that we have a basis for each space, so that \(V\) has basis \(\textbf{e}_1,\textbf{e}_2,\ldots,\textbf{e}_m\) and similarly \(W\) has basis \(\textbf{f}_1,\ldots,\textbf{f}_n\).
We introduce the notion of an upper index so that we can write vectors in \(V\) as
$$\textbf{v} = v^1 \textbf{e}_1 + v^2\textbf{e}_2 + \ldots + v^m \textbf{e}_m$$ where the coefficients \(v^i\) are all numbers in \(\mathbb{k}\). For the rest of this post and many posts afterwards, \(^1, ^2, ^i\) and so on are just bookkeeping indices rather than exponents; the numbers \(v^1, v^2,\ldots\) don't necessarily have any relation to each other. The upper-versus-lower distinction becomes useful for keeping track of what goes where later.
Now we want to consider pairs where one element is a basis vector from \(V\) and the other is a basis element from \(W\): \((\textbf{e}_i, \textbf{f}_j)\). We'll write this pair as \(\textbf{e}_i \otimes \textbf{f}_j\) to emphasize that this is supposed to be viewed as the product of \(\textbf{e}_i\) by \(\textbf{f}_j\). The symbol \(\otimes\) is called the tensor product.
What if we want to take the product of more general vectors? Let's look at \((v^1 \textbf{e}_1 + v^2 \textbf{e}_2) \otimes (w^3 \textbf{f}_3 + w^4 \textbf{f}_4)\). Expanding out as if this were regular multiplication gives:
$$v^1 \textbf{e}_1 \otimes w^3 \textbf{f}_3 + v^1 \textbf{e}_1 \otimes w^4 \textbf{f}_4 + v^2 \textbf{e}_2 \otimes w^3 \textbf{f}_3 + v^2 \textbf{e}_2 \otimes w^4 \textbf{f}_4$$ We'll institute the following rule: scalars can pass through \(\otimes\), so that \((a\textbf{v}) \otimes \textbf{w} = \textbf{v} \otimes (a\textbf{w}) = a(\textbf{v} \otimes \textbf{w})\). This makes sense, since for scalars \(a(bc) = b(ac)\): the scalar \(b\) passes through the product of \(a\) and \(c\).
So we can pull the scalars out front and rewrite the expression above as
$$v^1w^3 \textbf{e}_1 \otimes \textbf{f}_3 + v^1w^4 \textbf{e}_1 \otimes \textbf{f}_4 + v^2w^3 \textbf{e}_2 \otimes \textbf{f}_3 + v^2w^4 \textbf{e}_2 \otimes \textbf{f}_4$$ This tells us that we'll want to have the objects \(\textbf{e}_i\otimes \textbf{f}_j\) be considered distinct, and indeed linearly independent, when we vary \(i\) and \(j\). We get a new vector space \(V\otimes W\) with basis given by all the products of the form \(\textbf{e}_i \otimes \textbf{f}_j\). Since there are \(m\) basis elements \(\textbf{e}_i\) for \(V\) and \(n\) basis elements \(\textbf{f}_j\) for \(W\), we get that the vector space \(V\otimes W\) has dimension \(mn\).
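To make the coordinates concrete, here is a small numpy sketch (the particular values of \(\textbf{v}\) and \(\textbf{w}\) are made up for illustration). The coefficients \(u^{ij} = v^iw^j\) of a pure tensor \(\textbf{v}\otimes\textbf{w}\) are exactly the outer product of the two coefficient vectors:

```python
import numpy as np

v = np.array([2.0, 3.0])         # coefficients v^i, so m = 2
w = np.array([5.0, 7.0, 11.0])   # coefficients w^j, so n = 3

# u[i, j] = v[i] * w[j]: the coefficient of e_i (x) f_j in v (x) w.
u = np.outer(v, w)               # shape (2, 3): mn = 6 entries in all
print(u)

# Flattening lists the same coefficients in the basis order
# e_1 (x) f_1, e_1 (x) f_2, ..., which is numpy's Kronecker product.
assert np.allclose(u.ravel(), np.kron(v, w))
```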

Note that we can also form \(W \otimes V\), whose basis is given by products of the form \(\textbf{f}_j \otimes \textbf{e}_i\). The two vector spaces \(V\otimes W\) and \(W\otimes V\) have the same dimension, and indeed the basis vectors of each can be easily and naturally matched up. So we don't always distinguish between them.
We do care about the ordering when it comes to individual elements. In particular, if we look at \(V\otimes V\), that has basis elements of the form \(\textbf{e}_i \otimes \textbf{e}_j\). If this is going to have dimension \(m^2\), then we need \(\textbf{e}_i \otimes \textbf{e}_j\) to be linearly independent from \(\textbf{e}_j \otimes \textbf{e}_i\) when \(i \neq j\); this distinguishes \(\otimes\) from the usual product for scalars, which commutes.
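A quick coordinate check of that independence, identifying \(\textbf{e}_i \otimes \textbf{e}_j\) with its flattened coefficient vector as in the sketch above (here with \(m = 2\)):

```python
import numpy as np

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# e_1 (x) e_2 and e_2 (x) e_1 occupy different coordinate slots,
# so they really are distinct, linearly independent basis elements.
print(np.kron(e1, e2))  # [0. 1. 0. 0.]
print(np.kron(e2, e1))  # [0. 0. 1. 0.]
```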
We note that \(v^1w^3\) has two upper indices, and so in general, the coefficients for elements of \(V\otimes W\) will have two upper indices, one for \(V\) and one for \(W\). So we get that a general element of \(V\otimes W\) looks like \(\sum_{i,j=1}^{m,n} u^{ij} \textbf{e}_i \otimes \textbf{f}_j\).
Since \(V\otimes W\) is a vector space, if we have another vector space \(U\) we can form the tensor product \(U \otimes V \otimes W\), and the coefficients will then have 3 indices, one for each factor; and so on for more complicated products. Just for completeness, if for some reason we have the tensor product of a vector and a scalar, i.e. \(c \otimes \textbf{v}\) or \(\textbf{v}\otimes c\), both of those are just equal to \(c\textbf{v}\).
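In coordinates, the three-index coefficients are again just products of the individual coefficients, one per factor; a sketch with made-up values:

```python
import numpy as np

u = np.array([1.0, 2.0])        # coefficients in U (dim 2)
v = np.array([3.0, 4.0, 5.0])   # coefficients in V (dim 3)
w = np.array([6.0, 7.0])        # coefficients in W (dim 2)

# t[i, j, k] = u[i] * v[j] * w[k]: one index per tensor factor.
t = np.multiply.outer(u, np.multiply.outer(v, w))
print(t.shape)  # (2, 3, 2): dim(U (x) V (x) W) = 2 * 3 * 2 = 12
```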

Let's introduce another idea: duals.
Let's look at linear transformations from \(V\) to \(\mathbb{k}\). Given two such transformations \(p\) and \(q\), we can form their sum, \(p + q\), defined by \((p + q)(\textbf{v}) = p(\textbf{v}) + q(\textbf{v})\). We can also multiply a transformation by a scalar in the obvious fashion. So the set of linear transformations from \(V\) to \(\mathbb{k}\) forms a vector space, which we'll call the dual space of \(V\) and write as \(V^\vee\).
If \(V\) is finite dimensional, then we can use the basis of \(V\) to make a dual basis for \(V^\vee\): we define \(\textbf{e}^i\) such that \(\textbf{e}^i(\textbf{e}_j)\) is 1 if \(i = j\) and 0 if \(i \neq j\); since everything is linear, this condition defines \(\textbf{e}^i\) on all of \(V\). Again, the upper index is just for bookkeeping, not an exponent.
We write a more general element of \(V^\vee\) as \(\sum_{i = 1}^m v_i\textbf{e}^i\), where the \(v_i\) are in \(\mathbb{k}\).
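Here is a numpy sketch of the dual basis, under the assumption that we represent vectors as columns and covectors as rows acting by the dot product; the basis matrix \(B\) below is made up:

```python
import numpy as np

# Columns of B are an (assumed) basis e_1, e_2, e_3 of R^3.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# The dual basis rows e^i must satisfy e^i(e_j) = 1 if i == j else 0,
# i.e. D @ B = I, so D is just the matrix inverse of B.
D = np.linalg.inv(B)
assert np.allclose(D @ B, np.eye(3))
```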
We can talk about tensor products of elements of \(V^\vee\) with elements of \(V\) or \(W\); again we just form the products of the basis elements and then take linear combinations.

A general basis element of \(V \otimes V^\vee\) is of the form \(\textbf{e}_i \otimes \textbf{e}^j\), and so the coefficient to match that should have an upper index \(^i\) and a lower index \(_j\). So a general element of \(V\otimes V^\vee\) looks like \(\sum_{i,j=1}^m u^i_j \textbf{e}_i\otimes \textbf{e}^j\).

A bit about how matrices act on tensors:
Suppose we have a tensor product of a bunch of vectors, \(\textbf{v}_1, \textbf{v}_2, \ldots, \textbf{v}_k\), and a bunch of matrices, \(T_1, T_2, \ldots, T_k\), such that each \(T_i \textbf{v}_i\) makes sense. Then we define the tensor product \(T_1 \otimes T_2 \otimes \cdots \otimes T_k\) as a transformation on \(\textbf{v}_1 \otimes \textbf{v}_2 \otimes \cdots \otimes \textbf{v}_k\) such that
$$(T_1 \otimes T_2 \otimes \cdots \otimes T_k)(\textbf{v}_1 \otimes \textbf{v}_2 \otimes \cdots \otimes \textbf{v}_k) = (T_1 \textbf{v}_1) \otimes (T_2 \textbf{v}_2) \otimes \cdots \otimes (T_k \textbf{v}_k).$$ From this, matrix composition for tensors acts factor-wise as well:
$$(S_1 \otimes S_2 \otimes \cdots \otimes S_k)(T_1 \otimes T_2 \otimes \cdots \otimes T_k) = (S_1 T_1) \otimes (S_2T_2) \otimes \cdots \otimes (S_kT_k)$$
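In coordinates, with the basis of the tensor product ordered as above, the tensor product of matrices is the Kronecker product, so both identities can be spot-checked numerically (the random matrices here are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
S1, T1 = rng.normal(size=(2, 2)), rng.normal(size=(3, 3))
S2, T2 = rng.normal(size=(2, 2)), rng.normal(size=(3, 3))
v, w = rng.normal(size=2), rng.normal(size=3)

# (S1 (x) T1)(v (x) w) = (S1 v) (x) (T1 w): factor-wise action.
assert np.allclose(np.kron(S1, T1) @ np.kron(v, w),
                   np.kron(S1 @ v, T1 @ w))

# (S1 (x) T1)(S2 (x) T2) = (S1 S2) (x) (T1 T2): factor-wise composition.
assert np.allclose(np.kron(S1, T1) @ np.kron(S2, T2),
                   np.kron(S1 @ S2, T1 @ T2))
```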
At this point we'll introduce what is called "Einstein summation": if in an expression we have something with an upper index, say \(^i\), and in the same term a matching lower index \(_i\), then we automatically assume that we sum the term over all possible values of that index. So when we write \(v^i \textbf{e}_i\) we really mean \(\sum_{i=1}^m v^i \textbf{e}_i\). This helps when our tensors have lots of pieces.
So a general element of \(V\otimes V^\vee\) looks like \(u^i_j \textbf{e}_i \otimes \textbf{e}^j\), and a general element of \(V\otimes W\) looks like \(u^{ij} \textbf{e}_i \otimes \textbf{f}_j\).
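Fittingly, numpy's einsum is named after exactly this convention: any index letter repeated across its inputs gets summed over. A sketch with made-up coefficients:

```python
import numpy as np

u = np.arange(9.0).reshape(3, 3)  # coefficients u^i_j
v = np.array([1.0, 2.0, 3.0])     # coefficients v^j
w = np.array([5.0, 7.0])          # coefficients w^k

# In 'ij,j->i' the index j is repeated, so it is summed over:
# this computes u^i_j v^j, which is just the matrix-vector product.
assert np.allclose(np.einsum('ij,j->i', u, v), u @ v)

# With no repeated index nothing is summed: 'j,k->jk' builds the
# two-index coefficient array v^j w^k, i.e. the outer product.
assert np.allclose(np.einsum('j,k->jk', v, w), np.outer(v, w))
```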
This convention will be in effect basically from now on, except when otherwise noted.
