In the last chapter, we dealt with the notion of the dot product and geometry in \(\R^n\text{.}\) The dot product and the related notions can be generalized to an arbitrary vector space over \(\R\) or \(\mathbb{C}\text{.}\) All the notions we learned in the last chapter can be generalized to an inner product space. In this chapter, we shall introduce an inner product on a vector space \(V\) over \(\mathbb{R}\) and extend the results studied in the last chapter to \(V\text{.}\)
Note that the dot product of two vectors in \(\R^n\) is a scalar; in particular, the dot product can be thought of as a function \(`\cdot' \colon \R^n\times \R^n \to\R\) satisfying the following properties:
Let \(V\) be a vector space over \(\R\text{.}\) An inner product on \(V\) is a function that assigns a real number \(\langle x, y\rangle\) to every pair \(x,y\) of vectors in \(V\) (that is, a function \(\langle \cdot, \cdot\rangle\colon V \times V \to \R\)) satisfying the following properties.
If \(V\) is a real vector space with an inner product \(\langle \cdot , \cdot\rangle\text{,}\) then \((V, \langle \cdot , \cdot\rangle)\) is called an inner product space over \(\R\text{.}\)
The last two properties make the inner product linear in the second variable. Using the symmetry property, it can also be shown that the inner product is linear in the first variable as well. That is,
\begin{equation*}
\langle x+y,z\rangle=\langle x, z\rangle+\langle y, z\rangle, \text{ and } \langle \alpha x, y\rangle=\alpha\langle x, y\rangle
\end{equation*}
Note that this inner product can be thought of as the standard dot product on \(\R^{n^2}\text{,}\) once the entries of the matrix \(A\) are regarded as the components of a vector in \(\R^{n^2}\text{.}\) Then
Since \(A\) is a symmetric positive definite matrix, there exists a positive definite matrix \(B\) such that \(B^2=A\text{.}\) We call \(B\) the positive definite square root of \(A\text{,}\) and it is denoted by \(A^{1/2}\text{.}\)
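For readers who wish to experiment, the positive definite square root can be computed from the spectral decomposition \(A=Q\,\mathrm{diag}(\lambda_1,\ldots,\lambda_n)\,Q^T\) by taking square roots of the eigenvalues. The following Python sketch (usable from Sage as well) illustrates this; the matrix used here is only an illustrative choice, not one taken from the text.
\begin{verbatim}
import numpy as np

def spd_sqrt(A):
    # Spectral decomposition A = Q diag(w) Q^T with w > 0 for an SPD matrix;
    # the positive definite square root is Q diag(sqrt(w)) Q^T.
    w, Q = np.linalg.eigh(A)
    return Q @ np.diag(np.sqrt(w)) @ Q.T

A = np.array([[2.0, -1.0], [-1.0, 2.0]])   # an illustrative SPD matrix
B = spd_sqrt(A)
print(np.allclose(B @ B, A))               # True, so B really is A^(1/2)
\end{verbatim}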
We have defined an inner product on \(\R^n\) using a symmetric positive definite matrix \(A\text{.}\) In fact any inner product on \(\R^n\) can be obtained in this way.
Let \(\innprod{x}{y}\) be an inner product on \(\R^n\text{.}\) Let \(e_1, e_2, \ldots, e_n\) be the standard basis of \(\R^n\text{.}\) For \(x,y\in \R^n\) with
It is easy to see that \(\langle p,q \rangle\) defines an inner product on the vector space \({\cal P}_n(\mathbb{R})\text{.}\) This inner product is called the discrete inner product on \({\cal P}_n(\mathbb{R})\text{.}\)
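The discrete inner product is easy to evaluate by machine. The Python sketch below assumes it has the form \(\langle p,q\rangle=\sum_{i} p(x_i)q(x_i)\) over a fixed set of nodes, as described above; the nodes and polynomials chosen here are purely illustrative.
\begin{verbatim}
import numpy as np

def discrete_inner(p, q, nodes):
    # <p, q> = sum_i p(x_i) q(x_i) over the chosen nodes (assumed definition).
    x = np.asarray(nodes, dtype=float)
    return float(np.sum(p(x) * q(x)))

# Purely illustrative: p(x) = 1 + x, q(x) = x^2 evaluated at the nodes 0, 1, 2.
print(discrete_inner(lambda x: 1 + x, lambda x: x**2, [0, 1, 2]))   # 14
\end{verbatim}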
Let \((V, \langle \cdot, \cdot\rangle)\) be a real inner product space. The norm of a vector \(x\in V\) corresponding to the inner product \(\langle \cdot, \cdot \rangle\) is defined as
This is called the parallelogram identity. Geometrically, in a parallelogram, the sum of the squares of the lengths of the diagonals is twice the sum of the squares of the lengths of two adjacent sides.
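The identity is easy to verify numerically. The short Python check below uses the standard dot product on \(\R^3\) and two arbitrarily chosen vectors.
\begin{verbatim}
import numpy as np

# Check the parallelogram identity for the standard dot product on R^3
# with two arbitrarily chosen vectors.
x = np.array([1.0, 2.0, -1.0])
y = np.array([0.5, -3.0, 2.0])

lhs = np.dot(x + y, x + y) + np.dot(x - y, x - y)   # |x+y|^2 + |x-y|^2
rhs = 2 * (np.dot(x, x) + np.dot(y, y))             # 2(|x|^2 + |y|^2)
print(np.isclose(lhs, rhs))                         # True
\end{verbatim}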
If \(x = 0\) or \(y = 0\text{,}\) then \(\innprod{x}{y} = 0\) and either \(\innprod{x}{x} = 0\) or \(\innprod{y}{y} = 0\text{,}\) so both sides of the inequality are zero. Hence the result follows.
We now prove the statement for equality. Suppose \(\mod{\innprod{x}{y}} = 1.\) This implies either \(\innprod{x}{y} = 1\) or \(\innprod{x}{y} = -1.\) If \(\innprod{x}{y} = 1\text{,}\) from the above chain of inequalities we deduce that \(\innprod{x - y}{x - y} = 0\text{,}\) that is, \(x = y.\) If \(\innprod{x}{y} = -1\text{,}\) we see that \(x = -y.\) Thus the equality holds if and only if \(x = \pm y.\)
Next suppose \(x\) and \(y\) are nonzero (not necessarily of unit length). Then \(u = \frac{x}{\norm{x}}\) and \(v = \frac{y}{\norm{y}}\) are of unit length. Hence by the previous case, \(\mod{\innprod{u}{v}} \leq 1.\) This implies,
If \(x\) and \(y\) are nonzero, then the equality from the earlier case means \(\innprod{x}{y} = \norm{x} \norm{y}\) or \(-\innprod{x}{y} = \norm{x} \norm{y}\text{.}\) Let us look at one of the cases.
Fix \(x\) and \(y\) in \(V\text{.}\) If \(y = 0\text{,}\) then the result is obviously true. Without loss of generality assume that \(y \neq 0\text{.}\) Consider the real-valued function \(\varphi\colon \R \to \R\) defined as
\begin{equation*}
\varphi(t) := \innprod{x + t y}{x + t y}\text{.}
\end{equation*}
We shall use calculus and investigate the minimum value of \(\varphi(t)\) to prove the inequality.
\begin{align*}
\varphi(t) \amp= \amp\innprod{x + t y}{x + t y}\\
\amp= \amp\innprod{x}{x} + 2 t \innprod{x}{y} + t^{2} \innprod{y}{y}.
\end{align*}
Note that this is a differentiable function of \(t\text{.}\) Differentiating \(\varphi(t)\) with respect to \(t\text{,}\) we get
\begin{equation*}
\varphi'(t) = 2 \innprod{x}{y} + 2 t \innprod{y}{y}.
\end{equation*}
Setting \(\varphi'(t) = 0\text{,}\) we get the critical point at
\begin{equation*}
t = -\frac{\innprod{x}{y}}{\innprod{y}{y}}.
\end{equation*}
Since \(\varphi''(t) = 2\innprod{y}{y} \gt 0\text{,}\) this critical point is a minimum. Thus
Thus for any two nonzero vectors \(x\) and \(y\text{,}\) the quantity \(\frac{\inner{x}{y}}{\norm{x}\norm{y}}\) always lies between \(-1\) and \(1\text{.}\) This allows us to define the angle between two nonzero vectors: we take \(\theta\in[0,\pi]\) to be the unique angle whose cosine equals this number, and call \(\theta\) the angle between \(x\) and \(y\text{.}\) Thus, if \(\theta\) is the angle between \(x\) and \(y\text{,}\) then we have
All the notions that we defined for the dot product, namely orthogonality, orthogonal projection, and the Gram-Schmidt orthogonalization process, can be defined in a similar manner. All we need to do is replace the dot product by the given inner product.
Any vector space \(V\) over \(\R\) with a function \(\norm{.} \colon V \to \R\) which satisfies all the properties mentioned in Theorem 7.1.15 is called a normed linear space. Thus any inner product space is also a normed linear space.
The concepts such as orthogonality, orthogonal projection, the orthogonal complement of a subset, orthogonal and orthonormal sets, and the Gram-Schmidt orthogonalization process that we defined and dealt with in the previous chapter with respect to the dot product on \(\R^n\) can all be defined on an inner product space. All we need to do is to replace the dot product by the corresponding inner product. We encourage readers to define each one of them.
As mentioned earlier all the notions related to orthogonality can be defined in a similar manner on an inner product space and the results that we proved in the previous chapter hold true in this more general setting. However, let us state some of the definitions and results here for completeness.
Let \((V,\innprod{.}{.})\) be an inner product space. A set of vectors \(\{v_1,\ldots,v_k\}\) is said to be an orthogonal set if \(\innprod{v_i}{v_j}=0\) for \(i\neq j\text{.}\)
In addition, if each vector in the set is of unit norm, i.e., \(\norm{v_i}=1\) for all \(i=1,2,\ldots,k\text{,}\) then the set is called an orthonormal set.
Let \(C([-\pi,\pi])\) be the vector space of continuous functions from \([-\pi,\pi]\) to \(\R\text{.}\) Define the inner product on \(C([-\pi,\pi])\) as
Let \(\beta=\{u_1,\ldots, u_n\}\) be an orthogonal basis of an inner product space \(V\text{.}\) Let \(v\in V\) and let \(\theta_1,\ldots, \theta_n\) be the angles between \(v\) and \(u_1,\ldots, u_n\text{,}\) respectively. Then
If \(v=0\) the statement is trivial (all direction cosines are zero). Assume \(v\neq 0\text{.}\) Since \(\beta\) is an orthogonal basis we can expand \(v\) uniquely as
\begin{equation*}
v = \sum_{i=1}^{n} \frac{\innprod{v}{u_{i}}}{\innprod{u_{i}}{u_{i}}}\, u_{i}.
\end{equation*}
Taking norms squared and using orthogonality of the \(u_{i}\) gives
A basis \(\{v_1, \ldots, v_n\}\) of an inner product space \(V\) is said to be an orthonormal basis if \(\innprod{v_i}{v_j}=\delta_{ij}\) for \(1 \leq i,j \leq n\text{.}\)
Let us assume that \(\{v_i\}\) is an orthonormal basis of \(V\text{.}\) Write \(v = \sum_i \alpha_i v_i\text{.}\) Taking the inner product of both sides with the vector \(v_j\text{,}\) and using the orthonormal properties of the basis, we get
Next we turn our attention to finding an orthogonal basis in an inner product space using the Gram-Schmidt process. The process is exactly the same as before; we just need to replace the dot product by the given inner product. Let \((V, \inner{.}{.})\) be an inner product space. Let \(\{v_1, v_2, \ldots, v_n\}\) be a basis of \(V\text{.}\) We construct an orthogonal basis \(\{u_1, u_2, \ldots, u_n\}\) as follows:
Let \(\{v_1, v_2, \ldots, v_n\}\) be a linearly independent set in an inner product space \(V\text{.}\) Then there exists an orthogonal set \(\{u_1, u_2, \ldots, u_n\}\) such that
The proof proceeds by induction on \(k\text{.}\) For \(k = 1\text{,}\) set \(u_1 = v_1\text{,}\) which is nonzero since \(v_1\) belongs to a linearly independent set. Suppose \(u_1, \ldots, u_{k-1}\) have been constructed such that they are mutually orthogonal and \(\operatorname{span}\{u_1, \ldots, u_{k-1}\} = \operatorname{span}\{v_1, \ldots, v_{k-1}\}\text{.}\)
The Gram-Schmidt process provides an explicit algorithm for transforming any linearly independent set into an orthonormal one. It is a fundamental tool in numerical linear algebra, functional analysis, and the construction of orthogonal polynomials.
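As a computational aside, the process translates directly into code once the inner product is supplied as a function. The following Python sketch (usable from Sage) is a minimal implementation; the example basis and the choice of the standard dot product are illustrative only.
\begin{verbatim}
import numpy as np

def gram_schmidt(vectors, inner):
    # Gram-Schmidt with respect to an arbitrary inner product `inner`.
    # Returns mutually orthogonal vectors with the same spans as the input.
    ortho = []
    for v in vectors:
        u = np.array(v, dtype=float)
        for w in ortho:
            u = u - (inner(v, w) / inner(w, w)) * w
        ortho.append(u)
    return ortho

# Illustrative example: the standard dot product on R^3.
dot = lambda u, v: float(np.dot(u, v))
basis = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
for u in gram_schmidt(basis, dot):
    print(u)
\end{verbatim}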
The set \(\{e_{1},e_{2},e_{3}\}\) is an orthonormal basis of \(\operatorname{span}\{1,x,x^{2}\}=\mathcal{P}_2(\mathbb{R})\) (with respect to the discrete inner product above).
Let \(\beta=\{u_1,\ldots, u_n\}\) be an orthogonal basis of an inner product space \(V\text{.}\) Let \(x\) and \(y\) be two vectors such that \(x=\sum x_i u_i\) and \(y=\sum y_i u_i\text{.}\) Then
Consider \(V ={\cal P}_3(\R)\) with inner product \(\inner{p}{q}:=\int_{-1}^1 p(x)q(x)\,dx\text{.}\) Use the standard basis \(\beta =\{v_1,v_2,v_3,v_4\} = \{1,x,x^2,x^3\}\) to find an orthogonal basis of \({\cal P}_3(\R)\text{.}\)
First of all, notice that \(\beta\) is not an orthogonal basis: \(\inner{v_1}{v_3}=\inner{1}{x^2} = \int_{-1}^1 x^2\,dx = \frac23\) and \(\inner{v_2}{v_4}=\int_{-1}^1 x^4\,dx = \frac25\text{.}\) On the other hand, \(\inner{v_1}{v_2}=\int_{-1}^1 x\,dx = 0\text{,}\) \(\inner{v_2}{v_3}=\int_{-1}^1 x^3\,dx = 0\text{,}\) \(\inner{v_1}{v_4}=\int_{-1}^1 x^3\,dx = 0\text{,}\) and \(\inner{v_3}{v_4}=\int_{-1}^1 x^5\,dx = 0\text{.}\)
Consider the standard basis \(\beta=\{1,x,x^2,x^3\}\) of \({\cal P}_3(\R)\) with inner product \(\inner{f}{g}:=\int_0^1 f(x)g(x)\,dx\text{.}\) Find an orthonormal basis starting with \(\beta\) using the Gram-Schmidt orthogonalization process.
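One possible way to check a hand computation for this exercise is with SymPy (available inside Sage), carrying out the Gram-Schmidt steps symbolically with the inner product \(\inner{f}{g}=\int_0^1 f(x)g(x)\,dx\text{.}\)
\begin{verbatim}
import sympy as sp

x = sp.symbols('x')
inner = lambda f, g: sp.integrate(f * g, (x, 0, 1))   # <f, g> on P_3(R)

ortho = []
for v in [sp.Integer(1), x, x**2, x**3]:
    u = v
    for w in ortho:
        u = u - inner(v, w) / inner(w, w) * w
    ortho.append(sp.expand(u))

# Normalize each vector to obtain an orthonormal basis.
orthonormal = [sp.simplify(u / sp.sqrt(inner(u, u))) for u in ortho]
print(orthonormal)
\end{verbatim}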
Let \(A=\left(\begin{array}{rrr}2 \amp -1 \amp 0 \\-1 \amp 2 \amp -1 \\0 \amp -1 \amp 2 \end{array} \right)\text{.}\) It is easy to check that \(A\) is a symmetric and positive definite matrix. (why?) Define an inner product on \(\mathbb{R}^3\) as \(\inner{u}{v}:=v^TAu\text{.}\)
Use the Gram-Schmidt orthogonalization process to find an orthonormal basis of \(\R^3\) from the standard basis vectors \(\beta=\{e_1, e_2, e_3\}\) with respect to the above inner product.
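A numerical check for this exercise can be carried out in Python by running the Gram-Schmidt process with the inner product \(\inner{u}{v}=v^TAu\) supplied as a function; the sketch below is one such possibility.
\begin{verbatim}
import numpy as np

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
inner = lambda u, v: float(v @ A @ u)          # <u, v> = v^T A u

ortho = []
for v in np.eye(3):                            # e_1, e_2, e_3 as rows of I
    u = v.copy()
    for w in ortho:
        u = u - (inner(v, w) / inner(w, w)) * w
    ortho.append(u)

# Normalize with respect to the same inner product.
orthonormal = [u / np.sqrt(inner(u, u)) for u in ortho]
for u in orthonormal:
    print(np.round(u, 4))
\end{verbatim}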
Note that the concepts of Gram-Schmidt orthogonalization, orthogonal projection, and reflection can be naturally extended to an inner product space \((V, \langle \cdot, \cdot \rangle)\text{.}\) Explore how these notions generalize in such spaces, and implement solutions to related problems using Sage.
Let \((V,\innprod{.}{.})\) be an inner product space. Suppose \(u,v\in V\) with \(u\neq 0\text{.}\) Then the orthogonal projection of \(v\) onto \(u\) is defined as
Let \(V=\mathcal{P}_2(\R)\) with inner product \(\inner{p}{q}=\int_0^1 p(x)q(x)dx\text{.}\) Find the orthogonal projection of \(p(x)=x^2\) onto \(u(x)=x+1\text{.}\)
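A quick symbolic check with SymPy, using the projection formula \(\frac{\inner{p}{u}}{\inner{u}{u}}\,u\) with the given inner product, might look as follows.
\begin{verbatim}
import sympy as sp

x = sp.symbols('x')
inner = lambda f, g: sp.integrate(f * g, (x, 0, 1))

p = x**2
u = x + 1
proj = inner(p, u) / inner(u, u) * u   # projection of p onto u
print(sp.expand(proj))                 # x/4 + 1/4
\end{verbatim}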
Let \(V\) be an inner product space and \(W\leq V\text{,}\) a finite dimensional subspace of \(V\text{.}\) Let \(\{u_1,\ldots, u_k\}\) be an orthonormal basis of \(W\text{.}\) Suppose \(v\in V\text{.}\) Similar to Definition 6.3.5, we can define the orthogonal projection of \(v\) onto \(W\) as
Find the orthogonal projection of the vector \(b=\begin{bmatrix}1\\2\\3\\4 \end{bmatrix}\) onto the subspace spanned by the three vectors \(\left\{\begin{bmatrix}1\\-1\\0\\1 \end{bmatrix} , \begin{bmatrix}0\\1\\1\\-1 \end{bmatrix} , \begin{bmatrix}1\\1\\-1\\0 \end{bmatrix} \right\}\text{.}\)
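Assuming the standard dot product on \(\R^4\) is intended, the projection can be computed numerically by solving the normal equations, equivalently a least squares problem. A possible Python sketch:
\begin{verbatim}
import numpy as np

b = np.array([1.0, 2.0, 3.0, 4.0])
# Columns of U are the three spanning vectors.
U = np.array([[ 1.0,  0.0,  1.0],
              [-1.0,  1.0,  1.0],
              [ 0.0,  1.0, -1.0],
              [ 1.0, -1.0,  0.0]])

# proj_W(b) = U c, where c minimizes |U c - b| (normal equations).
c, *_ = np.linalg.lstsq(U, b, rcond=None)
print(U @ c)
\end{verbatim}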
Let \((V,\innprod{.}{.})\) be an inner product space over \(\R\) and \(U\subset V\text{.}\) Then the orthogonal complement of \(U\) in \(V\) is defined as
\begin{equation*}
U^{\perp}=\{v\in V: \innprod{u}{v}=0 \text{ for all } u\in U\}.
\end{equation*}
Let \(A\) be an \(n\times n\) symmetric positive definite matrix. Define an inner product on \(\R^n\) as \(\innprod{u}{v}=v^TAu\text{.}\) Let \(U=\{e_1\}\text{.}\) Then
Since \(e_1=(1,0,\ldots,0)^T\text{,}\) \(Ae_1\) is the first column of \(A\text{.}\) This implies that \(U^{\perp}\) is the set of vectors orthogonal to the first column of \(A\) with respect to the standard inner product. Since the first column of \(A\) is nonzero, \(\dim(U^{\perp})=n-1\text{.}\)
Let \(u\in W\cap W^{\perp}\text{.}\) Then \(\innprod{u}{u}=0\text{,}\) which forces \(u=0\text{.}\) Hence the only vector in both \(W\) and \(W^{\perp}\) is the zero vector. Let \(x\in V\text{.}\) We need to show that there exist \(w\in W\) and \(v\in W^{\perp}\) such that \(x=w+v\text{.}\) Let \(\{u_1,\ldots, u_k\}\) be an orthonormal basis of \(W\text{.}\) Define
\begin{equation*}
w = \innprod{x}{u_1}u_1+\innprod{x}{u_2}u_2+\cdots +\innprod{x}{u_k}u_k.
\end{equation*}
Now define \(v=x-w\text{.}\) For each \(i=1,2,\ldots, k\text{,}\) we have
Hence \(\dim(W)=\dim(U^{\perp})\text{.}\) It remains to show that \(W\subseteq U^{\perp}\) and \(U^{\perp}\subseteq W\text{.}\) Let \(w\in W\) and \(u\in U\text{.}\) Then \(\innprod{w}{u}=0\) since \(U=W^{\perp}\text{.}\) This shows that \(W\subseteq U^{\perp}\text{.}\) Now let \(y\in U^{\perp}\text{.}\) We need to show that \(y\in W\text{.}\) By (1), there exist \(w\in W\) and \(u\in U\) such that \(y=w+u\text{.}\)
Let \(W\) be a finite dimensional subspace of an inner product space \(V\text{.}\) Then the subspace \(W^{\perp}\) is called the orthogonal complement of \(W\) in \(V\text{.}\) Since \(V=W\oplus W^{\perp}\text{,}\) every vector \(x\in V\) can be uniquely written as \(x=w+v\) with \(w\in W\) and \(v\in W^{\perp}\text{.}\) The vector \(w\) is called the orthogonal projection of \(x\) onto \(W\) and is denoted by \(\proj_W(x)\text{.}\)
Let \((V,\innprod{.}{.})\) be an inner product space over \(\R\) and \(W\leq V\) be a finite dimensional subspace of \(V\text{.}\) Given \(x\in V\text{,}\) let \(w=\proj_W(x)\text{.}\) Then for all \(u\in W\text{,}\) \(\norm{x-w}\leq \norm{x-u}\text{.}\)
Let \(u\in W\text{.}\) Then \(u-w\in W\text{.}\) Since \(x-w\in W^{\perp}\text{,}\) we have \(\innprod{x-w}{u-w}=0\text{.}\) Hence by the Pythagorean theorem,
The Approximation Theorem shows that the orthogonal projection of a vector \(x\) onto a subspace \(W\) is the best approximation of \(x\) by a vector in \(W\text{.}\) This result has important applications in numerical analysis and scientific computing.
Let us explore one such application from Fourier analysis. Define an inner product on the space of continuous real-valued functions on \([-\pi, \pi]\) as
The coefficients \(a_0, a_k, b_k\) are called the Fourier coefficients of \(f\text{.}\) Thus we have shown that the Fourier coefficients give the best approximation of a function by a trigonometric polynomial of degree at most \(n\text{.}\)
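As an illustration, the Fourier coefficients of a given function can be computed by numerical integration. The Python sketch below assumes the usual normalization \(a_k=\frac1\pi\int_{-\pi}^{\pi}f(x)\cos kx\,dx\) and \(b_k=\frac1\pi\int_{-\pi}^{\pi}f(x)\sin kx\,dx\text{;}\) if the text's convention differs, the constants should be adjusted accordingly.
\begin{verbatim}
import numpy as np
from scipy.integrate import quad

def fourier_coefficients(f, n):
    # a_k = (1/pi) * int_{-pi}^{pi} f(x) cos(kx) dx,  k = 0, ..., n
    # b_k = (1/pi) * int_{-pi}^{pi} f(x) sin(kx) dx,  k = 1, ..., n
    a = [quad(lambda x, k=k: f(x) * np.cos(k * x), -np.pi, np.pi)[0] / np.pi
         for k in range(n + 1)]
    b = [quad(lambda x, k=k: f(x) * np.sin(k * x), -np.pi, np.pi)[0] / np.pi
         for k in range(1, n + 1)]
    return a, b

# Illustrative example: f(x) = x, approximated by a degree-3 trigonometric polynomial.
a, b = fourier_coefficients(lambda x: x, 3)
print(np.round(a, 4), np.round(b, 4))   # a is (numerically) zero; b is close to [2, -1, 2/3]
\end{verbatim}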
Let \((V,\innprod{.}{.})\) be an inner product space over \(\R\text{.}\) Fix a vector \(a\in V\) and define \(f_a\colon V\to \R\) as \(f_a(x)=\innprod{x}{a}\text{.}\) It follows from the properties of the inner product that \(f_a\) is a linear map. (why?)
Recall that we have characterized all linear maps from \(\R^n\) to \(\R\text{.}\) In particular, if \(T\colon \R^n\to \R\) is a linear map, then there exists a vector \(a=(a_1,\ldots, a_n)\) such that
\begin{equation*}
T(x)=\sum a_ix_i=x \cdot a.
\end{equation*}
The question is whether we can characterize linear functionals on an inner product space in the same way.
Let \((V,\innprod{.}{.})\) be a finite dimensional inner product space over \(\R\text{.}\) Given a linear map \(f\colon V \to \R\text{,}\) there exists a unique \(y \in V\) such that \(f(x) = \innprod{x}{y}\) for all \(x \in V\text{.}\)
To show uniqueness, suppose there exists \(z\in V\) such that \(f(x)=\innprod{x}{z}\) for all \(x\in V\text{.}\) Then for each \(i=1,2,\ldots, n\text{,}\) we have
Let \((V,\innprod{.}{.})\) and \((W,\innprod{.}{.})\) be finite dimensional inner product spaces over \(\R\text{.}\) Let \(T\colon V\to W\) be a linear map. Let \(\alpha=\{u_1,\ldots, u_n\}\) be an orthonormal basis of \(V\) and \(\beta =\{v_1,\ldots, v_m\}\) an orthonormal basis of \(W\text{.}\) Then the matrix of \(T\) with respect to these bases is given by
The above theorem shows that a linear map between two finite dimensional inner product spaces is completely determined by the images of an orthonormal basis of the domain space.
Now suppose \(W=V\) in Theorem 7.1.48 and let \(T\colon V\to V\) be a linear map on \(V\text{.}\) Let \(\alpha=\{u_1,\ldots, u_n\}\) be an orthonormal basis of \(V\text{.}\) Then the matrix of \(T\) with respect to this basis is given by
Recall that we defined a linear map \(T_A\colon \R^n\to \R^n\) associated with an \(n\times n\) real matrix \(A\) as \(T_A(x)=Ax\text{.}\) Also, the matrix of \(T_A\) with respect to the standard basis is \(A\text{.}\) Thus we can define an \(n\times n\) matrix \(A\) to be symmetric if the linear map \(T_A\colon \R^n\to \R^n\) is symmetric with respect to the standard inner product (dot product) on \(\R^n\text{.}\)
It is easy to see that an \(n\times n\) real matrix \(A\) is symmetric in the usual sense (i.e., \(A=A^T\)) if and only if the linear map \(T_A\colon \R^n\to \R^n\) is symmetric with respect to the standard inner product on \(\R^n\text{.}\)
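This equivalence is easy to test numerically: for a random, generically non-symmetric matrix the identity \(\innprod{Ax}{y}=\innprod{x}{Ay}\) fails for randomly chosen vectors, while it holds for its symmetric part. A small Python illustration:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))        # generically not symmetric
S = (A + A.T) / 2                      # its symmetric part

x = rng.standard_normal(4)
y = rng.standard_normal(4)
for M in (A, S):
    symmetric = np.allclose(M, M.T)
    preserves = np.isclose(np.dot(M @ x, y), np.dot(x, M @ y))   # <Mx, y> = <x, My>?
    print(symmetric, preserves)        # False False for A, then True True for S
\end{verbatim}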
Next we revisit the orthogonal linear transformations defined in the last chapter. We have the following equivalent definition in terms of inner products.
Let \((V,\innprod{.}{.})\) be a finite dimensional inner product space over \(\R\text{.}\) A linear map \(T\colon V\to V\) is an orthogonal linear transformation if and only if
In particular orthogonal linear transformations are precisely those linear maps that preserve the inner product. It is easy to see that a linear map \(T\colon V\to V\) is an orthogonal linear transformation if and only if
Let \((V,\innprod{.}{.})\) be a finite dimensional inner product space over \(\R\) and let \(T\colon V\to V\) be a linear map. Then the following are equivalent.
\(T\) maps every orthonormal basis of \(V\) to an orthonormal basis of \(V\text{.}\) That is, if \(\{u_1,\ldots, u_n\}\) is an orthonormal basis of \(V\text{,}\) then \(\{T(u_1),\ldots, T(u_n)\}\) is also an orthonormal basis of \(V\text{.}\)
This shows that \(\{T(u_1),\ldots, T(u_n)\}\) is an orthonormal set. Since \(\{u_1,\ldots, u_n\}\) spans \(V\) and \(T\) is a linear map, \(\{T(u_1),\ldots, T(u_n)\}\) spans \(\im(T)\text{.}\) Being an orthonormal set, \(\{T(u_1),\ldots, T(u_n)\}\) is also linearly independent. Hence
(3) \(\Rightarrow\) (1): Let \(u,v\in V\text{.}\) Since \(\{u_1,\ldots, u_n\}\) is an orthonormal basis of \(V\text{,}\) we can write \(u=c_1u_1+c_2 u_2+\cdots +c_nu_n\) and \(v=d_1u_1+d_2 u_2+\cdots +d_nu_n\text{.}\) Then it is easy to see that
Now let us see what happens to the matrix of an orthogonal linear transformation with respect to an orthonormal basis. Let \(T\colon V\to V\) be an orthogonal linear transformation on a finite dimensional inner product space \((V,\innprod{.}{.})\text{.}\) Let \(\alpha=\{u_1,\ldots, u_n\}\) be an orthonormal basis of \(V\text{.}\) Then the matrix of \(T\) with respect to this basis is given by
Let us write the columns of \(A\) as \(A=[c_1 \; c_2 \; \cdots \; c_n]\text{.}\) Since \(\{T(u_1),\ldots, T(u_n)\}\) is an orthonormal basis of \(V\text{,}\) it follows that
the dot product of \(c_i\) and \(c_j\text{.}\) This shows that the columns of \(A\) form an orthonormal set in \(\R^n\text{.}\) This suggests the following definition of an orthogonal matrix.
A real \(n\times n\) matrix \(A\) is called an orthogonal matrix if its columns form an orthonormal set in \(\R^n\text{.}\) That is, if \(A=[c_1 \; c_2 \; \cdots \; c_n]\text{,}\) then