I was trying to make sense of tensor products; here are some notes on the topic.

As multilinear maps

Let $(V, +, \cdot)$ be a vector space. Let $r$, $s$ be nonnegative integers. An $(r, s)$-tensor over $V$ - let us call it $T$ - is a multilinear map [1], [2], [3]

$$T:\underbrace{V^* \times \ldots \times V^*}_{r \text{ times}} \times \underbrace{V \times \ldots \times V}_{s \text{ times}} \to {\rm I\!R},$$

where $V^*$ is the dual vector space of $V$ - the space of covectors.

Before we explain what multilinearity is, let's give an example.

Let's say $T$ is a $(1, 1)$-tensor. Then $T: V^* \times V \to {\rm I\!R}$; that is, $T$ takes a covector and a vector. Multilinearity means that for two covectors $\phi, \psi \in V^*$,

$$T(\phi+\psi, v) = T(\phi, v) + T(\psi, v),$$

and

$$T(\lambda \phi, v) = \lambda T(\phi, v),$$

for all $v\in V$, and that it is also linear in the second argument, that is,

$$T(\phi, v+w) = T(\phi, v) + T(\phi, w),$$

and

$$T(\phi, \lambda v) = \lambda T(\phi, v).$$

Using these properties twice we can see that

$$T(\phi+\psi, v+w) = T(\phi, v) + T(\phi, w) + T(\psi, v) + T(\psi, w).$$
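
Here is a minimal numerical sanity check of these identities, assuming $V = {\rm I\!R}^3$ and representing a $(1,1)$-tensor by a matrix $A$ via $T(\phi, v) = \phi^\top A v$ (covectors identified with coordinate vectors); the matrix and variable names are placeholders for illustration.

```python
import numpy as np

# A (1,1)-tensor on V = R^3, represented by a matrix A: T(phi, v) = phi^T A v.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

def T(phi, v):
    return phi @ A @ v

phi, psi, v, w = rng.standard_normal((4, 3))
lam = 2.5

assert np.isclose(T(phi + psi, v), T(phi, v) + T(psi, v))  # additive in arg 1
assert np.isclose(T(lam * phi, v), lam * T(phi, v))        # homogeneous in arg 1
assert np.isclose(T(phi, v + w), T(phi, v) + T(phi, w))    # additive in arg 2
assert np.isclose(T(phi + psi, v + w),                     # four-term expansion
                  T(phi, v) + T(phi, w) + T(psi, v) + T(psi, w))
```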

Example: A $(0, 2)$-tensor. Let the set of polynomials with real coefficients be denoted by $\mathsf{P}$. Define the map

$$g: (p, q) \mapsto g(p, q) \coloneqq \int_0^1 p(x)q(x)\,{\rm d}x.$$

Then this is a $(0, 2)$-tensor,

$$g: \mathsf{P} \times \mathsf{P} \to {\rm I\!R},$$

and multilinearity is easy to check.
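
Here is a small sketch of this very tensor using `numpy.polynomial`, checking bilinearity in the first argument numerically; the particular polynomials are arbitrary choices.

```python
import numpy as np
from numpy.polynomial import Polynomial

# g(p, q) = integral over [0, 1] of p(x) q(x) dx, a (0,2)-tensor on polynomials.
def g(p, q):
    antideriv = (p * q).integ()  # antiderivative of the product, constant 0
    return antideriv(1.0) - antideriv(0.0)

p = Polynomial([1, 2])     # 1 + 2x
q = Polynomial([0, 0, 3])  # 3x^2
r = Polynomial([1, 1])     # 1 + x

assert np.isclose(g(p + r, q), g(p, q) + g(r, q))  # additivity in argument 1
assert np.isclose(g(2.0 * p, q), 2.0 * g(p, q))    # homogeneity in argument 1
print(g(p, q))  # integral of 3x^2 + 6x^3 over [0, 1] = 2.5
```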

It becomes evident that an inner product is a $(0, 2)$-tensor. More precisely, an inner product is a bilinear, symmetric, positive definite operator $V\times V \to {\rm I\!R}$.

Example: Covectors as tensors. Covectors $\phi: V\to {\rm I\!R}$ are, trivially, $(0, 1)$-tensors.

Example: Vectors as tensors. Given a vector $v\in V$ we can consider the mapping

$$T_v: V^* \ni \phi \mapsto T_v(\phi) = \phi(v) \in {\rm I\!R}.$$

This makes $T_v$ into a $(1, 0)$-tensor.

Example: Linear maps as tensors. We can see a linear map $\phi \in {\rm Hom}(V, V)$ as a $(1,1)$-tensor

$$T_\phi: V^* \times V \ni (\psi, v) \mapsto T_\phi(\psi, v) = \psi(\phi(v)) \in {\rm I\!R}.$$

Components of a tensor

Let $T$ be an $(r, s)$-tensor over a finite-dimensional vector space $V$ and let $\{e_i\}_i$ be a basis for $V$. Let the dual basis, for $V^*$, be $\{\epsilon^i\}_i$. The two bases have the same cardinality. Then define the $(\dim V)^{r+s}$ many numbers

$$T^{i_1, \ldots, i_r}_{j_1, \ldots, j_s} = T(\epsilon^{i_1}, \ldots, \epsilon^{i_r}, e_{j_1}, \ldots, e_{j_s}).$$

These are the components of TT with respect to the given basis.

This allows us to conceptualise a tensor as a multidimensional (multi-indexed) array. But maybe we shouldn't… This is as bad as treating vectors as mere sequences of numbers, or matrices as "tables", instead of as elements of a vector space and linear maps, respectively.
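
Still, the coordinate view is useful for computing. A minimal sketch, assuming $V = {\rm I\!R}^3$ with the standard basis (so the dual basis vectors are the rows of the identity): the components of the $(1,1)$-tensor $T_\phi(\psi, v) = \psi(\phi(v))$ from the earlier example recover exactly the matrix entries of $\phi$; the matrix is an arbitrary choice.

```python
import numpy as np

# Components of the (1,1)-tensor T_phi(psi, v) = psi(phi(v)) on V = R^3,
# where phi is the linear map v |-> A v.
A = np.array([[1., 2., 0.],
              [0., 3., 1.],
              [4., 0., 5.]])

e = np.eye(3)    # basis vectors e_j as columns
eps = np.eye(3)  # dual basis eps^i as rows, so eps[i] @ e[:, j] = delta_ij

# T^i_j = T_phi(eps^i, e_j) = eps^i(A e_j) recovers the matrix entries A[i, j].
components = np.array([[eps[i] @ (A @ e[:, j]) for j in range(3)]
                       for i in range(3)])
assert np.allclose(components, A)
```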

Let's see how exactly this works via an example [4]. Consider a $(3,2)$-tensor of the tensor space $\mathcal{T}^{3}_{2} = V^{\otimes 3} \otimes (V^*)^{\otimes 2}$, which can be seen as a multilinear map

$$T: V^* \times V^* \times V^* \times V \times V \to {\rm I\!R}.$$

Yes, we have three $V^*$ and two $V$!

Take a basis $(e_i)_i$ of $V$ and a basis $(\theta^j)_j$ of $V^*$. Let us start with an example using a pure tensor of the form $e_{i_1}\otimes e_{i_2} \otimes e_{i_3} \otimes \theta^{j_1} \otimes \theta^{j_2}$, for some indices $i_1, i_2, i_3$ and $j_1, j_2$. This can be seen as a map $V^*\times V^*\times V^* \times V \times V \to {\rm I\!R}$, which is defined as

$$(e_{i_1}\otimes e_{i_2} \otimes e_{i_3} \otimes \theta^{j_1} \otimes \theta^{j_2}) (\bar{\theta}^1, \bar{\theta}^2, \bar{\theta}^3, \bar{e}_1, \bar{e}_2) = \langle e_{i_1}, \bar{\theta}^1\rangle \langle e_{i_2}, \bar{\theta}^2\rangle \langle e_{i_3}, \bar{\theta}^3\rangle \langle \theta^{j_1}, \bar{e}_1\rangle \langle \theta^{j_2}, \bar{e}_2\rangle,$$

where $\langle {}\cdot{}, {}\cdot{}\rangle$ is the natural pairing between $V$ and its dual, so

$$(e_{i_1}\otimes e_{i_2} \otimes e_{i_3} \otimes \theta^{j_1} \otimes \theta^{j_2}) (\bar{\theta}^1, \bar{\theta}^2, \bar{\theta}^3, \bar{e}_1, \bar{e}_2) = \bar{\theta}^1(e_{i_1})\, \bar{\theta}^2(e_{i_2})\, \bar{\theta}^3(e_{i_3})\, \theta^{j_1}(\bar{e}_1)\, \theta^{j_2}(\bar{e}_2).$$

More specifically, when the above tensor is applied to a tuple of basis elements $(\theta^{i_1'}, \theta^{i_2'}, \theta^{i_3'}, e_{j_1'}, e_{j_2'})$, the result is

$$\begin{aligned} &(e_{i_1}\otimes e_{i_2} \otimes e_{i_3} \otimes \theta^{j_1} \otimes \theta^{j_2}) (\theta^{i_1'}, \theta^{i_2'}, \theta^{i_3'}, e_{j_1'}, e_{j_2'}) \\ {}={}& \theta^{i_1'}(e_{i_1})\, \theta^{i_2'}(e_{i_2})\, \theta^{i_3'}(e_{i_3})\, \theta^{j_1}(e_{j_1'})\, \theta^{j_2}(e_{j_2'}) \\ {}={}& \delta_{i_1}^{i_1'}\, \delta_{i_2}^{i_2'}\, \delta_{i_3}^{i_3'}\, \delta_{j_1'}^{j_1}\, \delta_{j_2'}^{j_2}. \end{aligned}$$

A general tensor of $\mathcal{T}^{3}_{2}$ has the form

$$T = \sum_{\substack{i_1, i_2, i_3 \\ j_1, j_2}} T^{i_1, i_2, i_3}_{j_1, j_2}\, (e_{i_1}\otimes e_{i_2} \otimes e_{i_3} \otimes \theta^{j_1} \otimes \theta^{j_2}),$$

for some parameters (components) $T^{i_1, i_2, i_3}_{j_1, j_2}$. This can be seen as a mapping $T: V^*\times V^*\times V^* \times V \times V \to {\rm I\!R}$ that acts on tuples of the form

$$x = \left( \sum_{i_1}a^1_{i_1}\theta^{i_1},\ \sum_{i_2}a^2_{i_2}\theta^{i_2},\ \sum_{i_3}a^3_{i_3}\theta^{i_3},\ \sum_{j_1}b^1_{j_1}e_{j_1},\ \sum_{j_2}b^2_{j_2}e_{j_2} \right),$$

and gives

$$\begin{aligned} T(x) {}={}& \sum_{\substack{i_1, i_2, i_3 \\ j_1, j_2}} T^{i_1, i_2, i_3}_{j_1, j_2}\, (e_{i_1}\otimes e_{i_2} \otimes e_{i_3} \otimes \theta^{j_1} \otimes \theta^{j_2})(x) \\ {}={}& \sum_{\substack{i_1, i_2, i_3 \\ j_1, j_2}} T^{i_1, i_2, i_3}_{j_1, j_2} \left\langle e_{i_1}, \sum_{i_1'}a^1_{i_1'}\theta^{i_1'}\right\rangle \left\langle e_{i_2}, \sum_{i_2'}a^2_{i_2'}\theta^{i_2'}\right\rangle \left\langle e_{i_3}, \sum_{i_3'}a^3_{i_3'}\theta^{i_3'}\right\rangle \left\langle \theta^{j_1}, \sum_{j_1'}b^1_{j_1'}e_{j_1'}\right\rangle \left\langle \theta^{j_2}, \sum_{j_2'}b^2_{j_2'}e_{j_2'}\right\rangle \\ {}={}& \sum_{\substack{i_1, i_2, i_3 \\ j_1, j_2}} T^{i_1, i_2, i_3}_{j_1, j_2}\, a^1_{i_1}a^2_{i_2}a^3_{i_3}\, b^1_{j_1}b^2_{j_2}. \end{aligned}$$
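
Numerically, this last contraction is a one-liner. A minimal sketch with `np.einsum`, assuming $V = {\rm I\!R}^2$ and identifying each covector and vector argument with its coefficient vector; the components are random placeholders.

```python
import numpy as np

# Evaluate a (3,2)-tensor on R^2 from its component array T[i1, i2, i3, j1, j2]:
# T(x) = sum T^{i1 i2 i3}_{j1 j2} a1_{i1} a2_{i2} a3_{i3} b1_{j1} b2_{j2}.
rng = np.random.default_rng(1)
T = rng.standard_normal((2, 2, 2, 2, 2))
a1, a2, a3, b1, b2 = rng.standard_normal((5, 2))

value = np.einsum("ijklm,i,j,k,l,m->", T, a1, a2, a3, b1, b2)
print(value)  # a single real number, multilinear in each of the five arguments
```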

As a quotient on a huge space

Here we construct a huge vector space and apply a quotient to enforce the axioms we expect tensors and tensor products to have. This huge space is a space of functions sometimes referred to as the "formal product" [5]. See also this video [7].

We will define the tensor product of two vector spaces. Let $V, W$ be two vector spaces. We define a vector space $V*W$, which we will call the formal product of $V$ and $W$, as the linear space that has $V\times W$ as a Hamel basis. This space is also known as the free vector space, $V*W = {\rm Free}(V\times W)$.

To make this more concrete, we can identify $V*W$ with the space of functions $\varphi: V\times W \to {\rm I\!R}$ with finite support. Representatives (and a basis) for this space are the functions

$$\delta_{v, w}(x, y) {}={} \begin{cases} 1, & \text{if } (x,y)=(v,w), \\ 0, & \text{otherwise.} \end{cases}$$

Indeed, every function $f: V\times W\to{\rm I\!R}$ with finite support (i.e., an element of $V*W$) can be written as a finite linear combination of such $\delta$ functions, and each $\delta$ function is identified by a pair $(v,w)\in V\times W$.

Note that $V*W$ is a vector space when equipped with the natural operations of function addition and scalar multiplication.

We consider the natural embedding, $\delta$, of $V\times W$ into $V*W$, defined as

$$\delta: V\times W \ni (v_0, w_0) \mapsto \delta_{v_0, w_0} \in V * W.$$

Consider now the following subspace of $V*W$:

$$M_0 {}={} \operatorname{span} \left\{ \begin{array}{l} \delta(v_1+v_2, w) - \delta(v_1, w) - \delta(v_2, w), \\ \delta(v, w_1+w_2) - \delta(v, w_1) - \delta(v, w_2), \\ \delta(\lambda v, w) - \lambda\, \delta(v, w), \\ \delta(v, \lambda w) - \lambda\, \delta(v, w), \\ v, v_1, v_2 \in V,\ w, w_1, w_2 \in W,\ \lambda \in {\rm I\!R} \end{array} \right\}$$

Quotient space definition of tensor space.

We define the tensor product of $V$ with $W$ as

$$V\otimes W = \frac{V * W}{M_0}.$$

This is called the tensor space of $V$ with $W$ and its elements are called tensors.

This is the space we were looking for. Here's what we mean: we have already defined the mapping $\delta: V\times W \to V * W$. We also define the canonical projection (the quotient map) $\pi: V*W \to (V*W)/M_0 = V \otimes W$. We then define the tensor product of $v\in V$ and $w\in W$ as

$$v\otimes w = (\pi {}\circ{} \delta)(v, w) = \delta_{v, w} {}+{} M_0.$$

It is a mapping $\otimes: V\times W \to V\otimes W$ and we can see that it is bilinear.

Properties of $\otimes$.

The operator $\otimes$ is bilinear.

Proof. For $v\in V$, $w\in W$ and $\lambda \in {\rm I\!R}$ we have

$$\begin{aligned} (\lambda v)\otimes w {}={}& \delta_{\lambda v, w} + M_0 \\ {}={}& \underbrace{\delta_{\lambda v, w} - \lambda \delta_{v, w}}_{\in{}M_0} + \lambda \delta_{v, w} + M_0 \\ {}={}& \lambda \delta_{v, w} + M_0 \\ {}={}& \lambda (\delta_{v, w} + M_0) \\ {}={}& \lambda (v\otimes w). \end{aligned}$$

Likewise, for $v_1, v_2 \in V$ and $w\in W$,

$$\begin{aligned} (v_1 + v_2)\otimes w {}={}& \delta_{v_1+v_2, w} + M_0 \\ {}={}& \underbrace{\delta_{v_1+v_2, w} - \delta_{v_1, w} -\delta_{v_2, w}}_{\in {} M_0} + \delta_{v_1, w} + \delta_{v_2, w} + M_0 \\ {}={}& \delta_{v_1, w} + \delta_{v_2, w} + M_0 \\ {}={}& (\delta_{v_1, w} + M_0) + (\delta_{v_2, w} + M_0) \\ {}={}& v_1\otimes w + v_2 \otimes w. \end{aligned}$$

The other properties are proved likewise. $\Box$
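
To make the construction tangible, here is a toy sketch for $V = {\rm I\!R}^2$ and $W = {\rm I\!R}^3$: elements of $V*W$ are finitely supported coefficient dictionaries, and (in finite dimensions) the quotient map can be realised concretely as $\delta_{v,w} \mapsto vw^\top$. The helper names (`delta`, `add`, `scale`, `quotient`) are made up for this sketch.

```python
import numpy as np

# Elements of the free vector space V*W, stored as {(v, w): coefficient},
# where v and w are tuples so they can serve as dictionary keys.
def delta(v, w):
    return {(tuple(v), tuple(w)): 1.0}

def add(f, g):
    out = dict(f)
    for k, c in g.items():
        out[k] = out.get(k, 0.0) + c
    return out

def scale(lam, f):
    return {k: lam * c for k, c in f.items()}

# The quotient map V*W -> V (x) W for V = R^2, W = R^3: delta_{v,w} |-> outer(v, w).
def quotient(f):
    out = np.zeros((2, 3))
    for (v, w), c in f.items():
        out += c * np.outer(v, w)
    return out

v1, v2 = np.array([1., 2.]), np.array([0., 5.])
w = np.array([1., -1., 3.])

# A generator of M_0: delta(v1+v2, w) - delta(v1, w) - delta(v2, w).
m0 = add(delta(v1 + v2, w), scale(-1.0, add(delta(v1, w), delta(v2, w))))
assert np.allclose(quotient(m0), 0.0)  # generators of M_0 map to zero
```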

Universal property

Here's how I understand the universal property: suppose we have a function $f: V\times W \to{} ?$ which is bilinear. We can always think of the mysterious space $?$ as the tensor space $V\otimes W$ [5].

Let's look at Figure 1.


Figure 1. Universal property of tensor product.

There is a unique linear function $\tilde{f}: V\otimes W \to{} ?$ such that

$$f(v, w) = \tilde{f}(v\otimes w).$$

Let us underline that $\tilde{f}$ is linear! This makes $\otimes$ a prototype bilinear function, as any other bilinear function is a linear map composed with precisely this one.
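
Here is what this looks like in coordinates, as a minimal sketch with $V = {\rm I\!R}^2$, $W = {\rm I\!R}^3$, realising $v\otimes w$ as `np.kron(v, w)`; the bilinear map `f` is an arbitrary example.

```python
import numpy as np

# Every bilinear f: R^2 x R^3 -> R is v^T M w for some matrix M, and it factors
# through (x): f(v, w) = f_tilde(v (x) w) for a *linear* f_tilde on R^6.
rng = np.random.default_rng(2)
M = rng.standard_normal((2, 3))
f = lambda v, w: v @ M @ w

f_tilde = M.reshape(-1)  # the induced linear functional on R^2 (x) R^3 = R^6

v, w = rng.standard_normal(2), rng.standard_normal(3)
assert np.allclose(f(v, w), f_tilde @ np.kron(v, w))  # f = f_tilde after (x)
```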

Dimension

Let $V$ and $W$ be finite dimensional. We will show that

Dimension of the tensor product.

$$\dim(V\otimes W) = \dim V \dim W.$$

Proof 1. To that end we use the fact that the dual vector space has the same dimension as the original vector space. That is, the dimension of $V\otimes W$ is the dimension of $(V\otimes W)^*$.

The space $(V\otimes W)^* = {\rm Hom}(V\otimes W, {\rm I\!R})$ can be identified, by the universal property, with the space of bilinear maps $V\times W \to {\rm I\!R}$.

Suppose $V$ has the basis $\{e^V_1, \ldots, e^V_{n_V}\}$ and $W$ has the basis $\{e^W_1, \ldots, e^W_{n_W}\}$.

To form a basis for the space of bilinear maps $f: V\times W\to{\rm I\!R}$ we need to identify every such function with a sequence of scalars. We have

$$f(u, v) = f\left(\sum_{i=1}^{n_V}a^V_{i}e^V_{i}, \sum_{i=1}^{n_W}a^W_{i}e^W_{i}\right).$$

From the bilinearity of $f$ we have

$$f(u, v) = \sum_{i=1}^{n_V}\sum_{j=1}^{n_W} a^V_{i}a^W_{j}\, f(e^V_{i}, e^{W}_{j}).$$

The right-hand side shows that $f$ is completely determined by the $n_V n_W$ scalars $f(e^V_i, e^W_j)$, so the dimension is $n_V n_W$. $\Box$

Proof 2. This is due to [22]. We can write any $T \in V\otimes W$ as

$$T = \sum_{k=1}^{n}a_k \otimes b_k,$$

for $a_k\in V$ and $b_k\in W$ and some finite $n$. If we take bases $\{v_i\}_{i=1}^{n_V}$ and $\{w_j\}_{j=1}^{n_W}$ we can write

$$a_k = \sum_{i=1}^{n_V}\alpha_{ki}v_i,$$

and

$$b_k = \sum_{j=1}^{n_W}\beta_{kj}w_j,$$

therefore,

$$\begin{aligned}T {}={}& \sum_{k=1}^{n}\left(\sum_{i=1}^{n_V}\alpha_{ki}v_i\right) \otimes\left(\sum_{j=1}^{n_W}\beta_{kj}w_j\right) \\ {}={}& \sum_{k=1}^{n} \sum_{i=1}^{n_V} \sum_{j=1}^{n_W} \alpha_{ki}\beta_{kj}\, v_i\otimes w_j. \end{aligned}$$

This means that $\{v_i\otimes w_j\}$ spans $V\otimes W$; checking linear independence shows that it is in fact a basis. $\Box$

Tensor basis

In the finite-dimensional case, a basis of $V\otimes W$ can be constructed from the bases $\{e^V_1, \ldots, e^V_{n_V}\}$ of $V$ and $\{e^W_1, \ldots, e^W_{n_W}\}$ of $W$ as the set

$$\mathcal{B}_{V\otimes W} = \{e^V_i \otimes e^{W}_j,\ i=1,\ldots, n_V,\ j=1,\ldots, n_W\}.$$

This implies that

$$\dim (V\otimes W) = \dim V \dim W.$$
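
A quick numerical illustration, identifying $e^V_i \otimes e^W_j$ with `np.kron` of standard basis vectors for $V = {\rm I\!R}^3$ and $W = {\rm I\!R}^4$:

```python
import numpy as np

# The 3 * 4 = 12 Kronecker products of basis vectors form a basis of R^12,
# matching dim(V (x) W) = dim V * dim W.
basis = [np.kron(e, f) for e in np.eye(3) for f in np.eye(4)]
B = np.stack(basis)
assert B.shape == (12, 12) and np.linalg.matrix_rank(B) == 12
```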

Tensor products of spaces of functions

We first need to define the function space $F(S)$, as in Kostrikin [2].

Let $S$ be any set. We define the set $F(S)$ (also denoted ${\rm Funct}(S, {\rm I\!R})$) as the set of all functions from $S$ to ${\rm I\!R}$. If $f\in F(S)$, then $f$ is a function $f: S\to{\rm I\!R}$ and $f(s)$ denotes the value at $s\in S$.

On $F(S)$ we define addition and scalar multiplication in a pointwise manner: for $f, g\in F(S)$ and $c\in {\rm I\!R}$,

$$\begin{aligned} (f+g)(s) {}={}& f(s) + g(s), \\ (cf)(s) {}={}& c f(s). \end{aligned}$$

This makes $F(S)$ into a vector space. Note that $S$ is not necessarily a vector space.

If $S = \{s_1, \ldots, s_n\}$ is a finite set, $F(S)$ can be identified with ${\rm I\!R}^n$. After all, for every $f\in F(S)$ all you need to know is $f(s_1), \ldots, f(s_n)$.

Every element $s\in S$ is associated with the delta function

$$\delta_{s}(\sigma) = \begin{cases} 1, &\text{if } \sigma = s, \\ 0, &\text{otherwise.} \end{cases}$$

The function $\delta_s: S \to \{0,1\}$ is called Kronecker's delta.

If $S$ is finite, then every $f\in F(S)$ can be written as

$$f = \sum_{s\in S} a_s \delta_s,$$

where $a_s = f(s)$.

Let $S_1$ and $S_2$ be finite sets and let $F(S_1)$ and $F(S_2)$ be the corresponding function spaces. Then there is a canonical identification

$$\underbrace{F(\underbrace{S_1 \times S_2}_{\text{finite set of pairs}})}_{\text{has }\delta_{s_1, s_2}\text{ as a basis}} = F(S_1) \otimes F(S_2),$$

which associates each function $\delta_{s_1, s_2}$ with $\delta_{s_1} \otimes \delta_{s_2}$.

If $f_1 \in F(S_1)$ and $f_2 \in F(S_2)$, then, using the standard bases of $F(S_1)$ and $F(S_2)$,

$$f_1 \otimes f_2 = \left(\sum_{s_1 \in S_1}f_1(s_1) \delta_{s_1}\right) \otimes \left(\sum_{s_2 \in S_2}f_2(s_2) \delta_{s_2}\right) = \sum_{s_1 \in S_1}\sum_{s_2 \in S_2} f_1(s_1) f_2(s_2)\, \delta_{s_1}\otimes\delta_{s_2}.$$
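
A small sketch with made-up finite sets, where functions are dictionaries and the identification with an outer product becomes visible once we fix orderings of $S_1$ and $S_2$:

```python
import numpy as np

# F(S1) (x) F(S2) = F(S1 x S2) for finite sets: the tensor product of two
# functions is the function of pairs (s1, s2) |-> f1(s1) * f2(s2).
S1, S2 = ["a", "b"], ["x", "y", "z"]
f1 = {"a": 2.0, "b": -1.0}            # an element of F(S1)
f2 = {"x": 1.0, "y": 0.0, "z": 3.0}   # an element of F(S2)

tensor = {(s1, s2): f1[s1] * f2[s2] for s1 in S1 for s2 in S2}

# Identifying F(S1) with R^2 and F(S2) with R^3, this is just np.outer.
v1 = np.array([f1[s] for s in S1])
v2 = np.array([f2[s] for s in S2])
table = np.array([[tensor[(s1, s2)] for s2 in S2] for s1 in S1])
assert np.allclose(table, np.outer(v1, v2))
```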

Isomorphism $(U\otimes V)^* \cong U^* \otimes V^*$.

For finite-dimensional spaces we have the isomorphism of vector spaces $(U\otimes V)^* \cong U^* \otimes V^*$ with isomorphism map

$$\rho: U^* \otimes V^* \to (U\otimes V)^*,$$

where for $f\in U^*$ and $g\in V^*$,

$$\rho(f\otimes g)(u\otimes v) = f(u)g(v).$$

For short, we write $(f\otimes g)(u\otimes v) = f(u)g(v)$.

Roman [5], in Theorem 14.7, proves that $\rho$ is indeed an isomorphism.

Associativity

As Kostrikin notes [2], the spaces $(L_1\otimes L_2)\otimes L_3$ and $L_1 \otimes (L_2 \otimes L_3)$ do not coincide; they are spaces of different types. They can, however, be shown to be isomorphic via a canonical isomorphism.

We will start by studying the relationship between $L_1 \otimes L_2 \otimes L_3$ and $(L_1 \otimes L_2) \otimes L_3$.

The mapping $L_1 \times L_2 \ni (l_1, l_2) \mapsto l_1 \otimes l_2 \in L_1 \otimes L_2$ is bilinear, so the mapping

$$L_1 \times L_2 \times L_3 \ni (l_1, l_2, l_3) \mapsto (l_1 \otimes l_2) \otimes l_3 \in (L_1 \otimes L_2) \otimes L_3$$

is trilinear, so by the universal property it induces a unique linear map

$$L_1 \otimes L_2 \otimes L_3 \to (L_1 \otimes L_2) \otimes L_3,$$

that maps $l_1 \otimes l_2 \otimes l_3$ to $(l_1 \otimes l_2) \otimes l_3$. Using bases, we can show that this map is an isomorphism.

Commutativity and the braiding map

See Wikipedia [3].

As a note, take $x = (x_1, x_2)$ and $y = (y_1, y_2)$. Then $x\otimes y$ and $y\otimes x$ look like

$$\begin{bmatrix}x_1y_1\\x_1y_2\\x_2y_1\\x_2y_2\end{bmatrix}, \qquad \begin{bmatrix}x_1y_1\\x_2y_1\\x_1y_2\\x_2y_2\end{bmatrix},$$

and we see that the two vectors contain the same entries in a different order. In that sense,

$$V\otimes W \cong W \otimes V.$$

The corresponding map, $V\otimes V \ni v\otimes v' \mapsto v'\otimes v \in V\otimes V$, is called the braiding map and it induces an automorphism of $V\otimes V$.
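
In coordinates (a sketch with $V = {\rm I\!R}^2$, using `np.kron` to stand in for $\otimes$), the braiding is just a fixed permutation of the entries:

```python
import numpy as np

# The braiding x (x) y |-> y (x) x on R^2 (x) R^2 permutes Kronecker coordinates.
x, y = np.array([1., 2.]), np.array([3., 5.])
perm = [0, 2, 1, 3]  # swap the two middle entries
assert np.allclose(np.kron(x, y)[perm], np.kron(y, x))
```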

Some key isomorphisms

Result 1: Linear maps as tensors

This is only true in the finite-dimensional case.

To prove this we build a natural isomorphism

$$\Phi: V^*\otimes W \to {\rm Hom}(V, W),$$

by defining

$$\Phi(v^*\otimes w)(v) = v^*(v)\,w.$$

Note that $v^*\otimes w$ is a pure tensor. A general tensor has the form

$$\sum_i v_i^*\otimes w_i,$$

and

$$\Phi\left(\sum_i v_i^*\otimes w_i\right)(v) = \sum_i v_i^*(v)\,w_i.$$

This is a linear map. Linearity is easy to see. See also this video [6].

Here is the statement:

First isomorphism result: ${\rm Hom}(V, W) \cong V^* \otimes W$.

In the finite-dimensional case, ${\rm Hom}(V, W) \cong V^* \otimes W$, with isomorphism $\phi \otimes w \mapsto (v \mapsto \phi(v)w)$.

Proof. [10] The proposed map, $\phi \otimes w \mapsto (v \mapsto \phi(v)w)$, is a linear map $V^*\otimes W\to {\rm Hom}(V, W)$. We will construct its inverse (and prove that it is the inverse). Let $(e_i)_i$ be a basis for $V$ and let $(e_i^*)_i$ be the naturally induced basis of the dual vector space $V^*$. We define the mapping

$$g: {\rm Hom}(V, W) \ni \phi \mapsto \sum_{i}e_i^* \otimes \phi(e_i) \in V^*\otimes W.$$

We claim that $g$ is the inverse of $\Phi$, which we already defined to be the map $V^*\otimes W \to {\rm Hom}(V, W)$ given by $\Phi(v^*\otimes w)(v) = v^*(v)w$. Indeed, we see that

$$\begin{aligned} \Phi(g(\phi)) {}={}& \Phi\left(\sum_{i=1}^{n}e_i^* \otimes \phi(e_i)\right) &&\text{definition of } g \\ {}={}& \sum_{i=1}^{n} \Phi\left(e_i^* \otimes \phi(e_i)\right) &&\Phi \text{ is linear} \\ {}={}& \sum_{i=1}^{n}e_i^*({}\cdot{})\,\phi(e_i) \\ {}={}& \phi\bigg(\underbrace{\sum_{i=1}^{n}e_i^*({}\cdot{})\,e_i}_{{\rm id}}\bigg) && \phi \text{ is linear} \\ {}={}& \phi. \end{aligned}$$

Conversely,

$$\begin{aligned} g(\Phi(v^*\otimes w)) {}={}& g(v^*({}\cdot{})\,w) \\ {}={}& \sum_{i=1}^{n}e_i^* \otimes v^*(e_i)\,w \\ {}={}& \left(\sum_{i=1}^{n}v^*(e_i)\, e_i^*\right) \otimes w \\ {}={}& v^*\otimes w. \end{aligned}$$

This completes the proof. $\Box$
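
In coordinates, this is the familiar fact that a matrix is the sum of the outer products of its columns with the dual basis vectors. A sketch with $V = {\rm I\!R}^3$, $W = {\rm I\!R}^4$, realising $v^*\otimes w$ as `np.outer(w, v)`; the matrix is an arbitrary example.

```python
import numpy as np

# g(phi) = sum_i e_i^* (x) phi(e_i); applying Phi turns it back into the matrix.
rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))  # a linear map phi: R^3 -> R^4

E = np.eye(3)  # standard basis of V; its dual basis has the same coordinates
recon = sum(np.outer(A @ E[:, i], E[:, i]) for i in range(3))
assert np.allclose(recon, A)  # Phi(g(phi)) = phi
```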

Result 2: Dual of space of linear maps

For two finite-dimensional spaces $V$ and $W$, we have

$${\rm Hom}(V, W)^* {}\cong{} (V^*\otimes W)^* {}\cong{} V^{**}\otimes W^* {}\cong{} V\otimes W^* {}\cong{} {\rm Hom}(W, V).$$

In short, ${\rm Hom}(V, W)^* {}\cong{} {\rm Hom}(W, V)$.

Some additional consequences of this are:

  1. ${\rm Hom}({\rm Hom}(V,W),U) \cong {\rm Hom}(V,W)^*\otimes U \cong (V^*\otimes W)^*\otimes U \cong V\otimes W^* \otimes U.$
  2. $U\otimes V \otimes W \cong {\rm Hom}(U^*, V)\otimes W \cong {\rm Hom}({\rm Hom}(V,U^*),W).$
  3. $V\otimes V^* \cong {\rm Hom}(V^*, V) \cong {\rm Hom}(V,V)$, where, if $V$ is finite dimensional, $V^*\cong V$, so we see that $V^*\otimes V \cong {\rm End}(V)$.

Question: How can we describe linear maps from ${\rm I\!R}^{m\times n}$ to ${\rm I\!R}^{p\times q}$ with tensors?

We are talking about the space ${\rm Hom}({\rm I\!R}^{m\times n}, {\rm I\!R}^{p\times q})$. But every matrix $A$ can be identified with the linear map $x\mapsto Ax$, so ${\rm I\!R}^{m\times n}\cong {\rm Hom}({\rm I\!R}^{n}, {\rm I\!R}^{m})$, and therefore

$$\begin{aligned} {\rm Hom}({\rm I\!R}^{m\times n}, {\rm I\!R}^{p\times q}) {}\cong{}& {\rm Hom}({\rm Hom}({\rm I\!R}^n, {\rm I\!R}^m), {\rm Hom}({\rm I\!R}^q, {\rm I\!R}^p)) \\ {}\cong{}& (({\rm I\!R}^n)^*\otimes {\rm I\!R}^m)^* \otimes ({\rm I\!R}^{q})^*\otimes {\rm I\!R}^p \\ {}\cong{}& {\rm I\!R}^n \otimes ({\rm I\!R}^{m})^* \otimes ({\rm I\!R}^{q})^* \otimes {\rm I\!R}^p. \end{aligned}$$

Such objects can be seen as multilinear maps $({\rm I\!R}^{n})^* \times {\rm I\!R}^{m} \times {\rm I\!R}^{q} \times ({\rm I\!R}^{p})^* \to {\rm I\!R}$, that is,

$$(x\otimes \phi \otimes \psi \otimes y)(\theta, s, t, \zeta) = \theta(x)\,\phi(s)\,\psi(t)\,\zeta(y).$$

Lastly,

$$\dim {\rm Hom}({\rm I\!R}^{m\times n}, {\rm I\!R}^{p\times q}) = \dim\left({\rm I\!R}^n \otimes ({\rm I\!R}^{m})^* \otimes ({\rm I\!R}^{q})^* \otimes {\rm I\!R}^p\right) = nmpq.$$
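
Concretely, such a map can be stored as a four-index array and applied with `np.einsum`; a sketch with made-up sizes $m=3$, $n=2$, $p=4$, $q=5$:

```python
import numpy as np

# A linear map R^{3x2} -> R^{4x5} as a 4-index array K:
# Y[p, q] = sum over (m, n) of K[p, q, m, n] * X[m, n].
rng = np.random.default_rng(4)
K = rng.standard_normal((4, 5, 3, 2))
X = rng.standard_normal((3, 2))

Y = np.einsum("pqmn,mn->pq", K, X)
assert Y.shape == (4, 5)  # K has 4*5*3*2 = 120 = nmpq independent entries
```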

Result 3: Tensor product of linear maps

Tensor product of linear maps. Let $U_1, U_2$ and $V_1, V_2$ be finite-dimensional vector spaces. Then

$${\rm Hom}(U_1, U_2) \otimes {\rm Hom}(V_1, V_2) \cong {\rm Hom}(U_1\otimes V_1, U_2\otimes V_2),$$

with the isomorphism determined on pure tensors by

$$(\phi\otimes\psi)(u\otimes v) = \phi(u) \otimes \psi(v).$$
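
In coordinates this isomorphism is the Kronecker product of matrices, and the defining property above is the mixed-product rule; a quick numerical check with arbitrary sizes:

```python
import numpy as np

# (A (x) B)(u (x) v) = (A u) (x) (B v): the mixed-product property of np.kron.
rng = np.random.default_rng(5)
A, B = rng.standard_normal((2, 3)), rng.standard_normal((4, 5))
u, v = rng.standard_normal(3), rng.standard_normal(5)

assert np.allclose(np.kron(A, B) @ np.kron(u, v), np.kron(A @ u, B @ v))
```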

Again, using the universal property

We can define $V\otimes W$ to be a space accompanied by a bilinear operation $\otimes: V\times W \to V \otimes W$ that has the universal property in the following sense [5]:

Universal property. If $f: V\times W \to Z$ is a bilinear function, then there is a unique linear function $\tilde{f}: V\otimes W \to Z$ such that $f = \tilde{f} \circ \otimes$.

The universal property is mind-blowing.

The universal property says that every bilinear map $f(u, v)$ on $V\times W$ is a linear function $\tilde{f}$ of the tensor product, $\tilde{f}(u\otimes v)$ [6].

See more about the uniqueness of the tensor product via the universality-based definition in [8].

Making sense of tensors using polynomials

This is based on [9]. We can identify vectors $u=(u_0, \ldots, u_{n-1})\in{\rm I\!R}^n$ and $v=(v_0, \ldots, v_{m-1})\in{\rm I\!R}^m$ with polynomials over ${\rm I\!R}$ of the form

$$u(x) = u_0 + u_1x + \ldots + u_{n-1}x^{n-1}, \qquad v(y) = v_0 + v_1y + \ldots + v_{m-1}y^{m-1}.$$

We know that a pair of vectors lives in the Cartesian product space ${\rm I\!R}^n\times {\rm I\!R}^m$. A pair of polynomials lives in a space with basis

$$\mathcal{B}_{\rm prod} {}\coloneqq{} \{(1, 0), (x, 0), \ldots, (x^{n-1}, 0)\} \cup \{(0, 1), (0, y), \ldots, (0, y^{m-1})\}.$$

But now take two polynomials $u$ and $v$ and multiply them to get

$$u(x)v(y) = \sum_{i}\sum_{j}c_{ij}x^iy^j, \qquad c_{ij} = u_iv_j.$$

The result is a polynomial of two variables, $x$ and $y$. This corresponds to the tensor product of $u$ and $v$.
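
In coordinates, the coefficient matrix of $u(x)v(y)$ is exactly the outer product of the two coefficient vectors; a tiny sketch:

```python
import numpy as np

# Multiplying u(x) and v(y) gives a two-variable polynomial whose coefficient
# matrix c[i, j] = u_i * v_j is the outer product u (x) v.
u = np.array([1., 2., 3.])  # u(x) = 1 + 2x + 3x^2
v = np.array([4., 5.])      # v(y) = 4 + 5y

c = np.outer(u, v)          # c[i, j] is the coefficient of x^i y^j
assert c.shape == (3, 2)    # n * m coefficients, matching dim(V (x) W)
```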

Further reading

Lots of books on tensors for physicists are listed in [11]. Linear Algebra Done Wrong [12] has an extensive chapter on the matter. It would be interesting to read about tensor norms [13]. These lecture notes [14], [15], [16], [17] seem worth studying too. The lecture notes on multilinear algebra in [18] look good, but are more theoretical and a bit category-theoretical. A must-read book is [19], as are the MIT lecture notes [20]. A book that seems to explain things in an accessible way, yet rigorously, is [21].

References

  1. F. Schuller, Video lecture 3, accessed on 12 Sept 2023.
  2. A. I. Kostrikin and Yu. I. Manin, Linear Algebra and Geometry, Part 4.
  3. Wikipedia, Tensor product, accessed on 15 Sept 2023.
  4. P. Renteln, Manifolds, Tensors, and Forms: An Introduction for Mathematicians and Physicists, Cambridge University Press, 2014.
  5. S. Roman, Advanced Linear Algebra, Springer, 3rd edition, 2010.
  6. @mlbaker on YouTube gives a cool construction and explanation of tensors.
  7. M. Penn, "What is a tensor anyway?? (from a mathematician)", video on YouTube.
  8. Mu Prime Math, Proof: Uniqueness of the Tensor Product, video on YouTube.
  9. Prof. Macauley on YouTube, Advanced Linear Algebra, Lecture 3.7: Tensors, accessed on 9 November 2023; see also his lecture slides, with some nice diagrams and insights.
  10. Answer on MSE on why ${\rm Hom}(V, W)$ is the same thing as $V^*\otimes W$, accessed on 9 November 2023.
  11. Lots of books on tensors, tensor calculus, and applications to physics can be found on this GitHub repo, accessed on 9 November 2023.
  12. S. Treil, Linear Algebra Done Wrong, Chapter 8: Dual spaces and tensors, 2017.
  13. A. Defant and K. Floret, Tensor Norms and Operator Ideals (link), North-Holland, 1993.
  14. K. Purbhoo, Notes on Tensor Products and the Exterior Algebra, for Math 245, link.
  15. Lecture notes on tensors, UC Berkeley, link.
  16. J. Zintl, Notes on tensor products, part 3, link.
  17. R. Schwartz, Notes on tensor products.
  18. J. Zintl, Notes on multilinear maps, link.
  19. A. Iozzi, Multilinear Algebra and Applications; concise (100 pages) and very clear.
  20. MIT lecture notes/book on multilinear algebra: introduction (dual spaces, quotients), tensors, the pullback operation, alternating tensors, the space $\Lambda^k(V^*)$, the wedge product, the interior product, orientations, and more 👍 (must read).
  21. J. R. Ruiz-Tolosa and E. Castillo, From Vectors to Tensors, Springer, Universitext, 2005.
  22. E. Erdtman and C. Jonsson, Tensor Rank, Applied Mathematics, Linköping University, 2012.