In this chapter, we extend our study of linear transformations to the concepts of eigenvalues and eigenvectors, which are essential for understanding how certain vectors retain their direction under a transformation, being merely scaled by a scalar factor.
Throughout the chapter, we will combine conceptual insight with computational exploration using SageMath, helping you see not only "what" eigenvalues and eigenvectors are, but also "how" to compute them effectively and "where" they matter in applications that span mathematics and applied sciences.
Suppose \(T\) is a linear map from a vector space \(V\) to itself. For a given \(v\text{,}\) \(T(v)\) is a vector in \(V\text{.}\) Look at Figure 5.1.1, in which the images of the vectors \(v_1,\ldots, v_{24}\) under the linear map \(T(x,y)=(x/2+y,x+y/2)\) are shown. Notice that \(v_3\) and \(T(v_3)\) are parallel; similarly \(v_{21}\) and \(T(v_{21})\) are parallel.
Figure 5.1.1. Images of vectors under \(T(x,y)=(x/2+y,x+y/2)\text{.}\)
You can also observe that \(v_{15}\) and \(T(v_{15})\) are parallel, but this is expected as \(v_{15}=-v_3\text{.}\) Similarly \(v_{9}\) and \(T(v_{9})\) are parallel. Such vectors are called eigenvectors of \(T\text{.}\) In particular, if \(v\) and \(T(v)\) are parallel, then \(T(v)=\lambda v\) for some \(\lambda\in \R\text{;}\) such a \(\lambda\) is called the eigenvalue corresponding to the eigenvector \(v\text{.}\) What happens if \(v=0\text{?}\)
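We can check this in Sage. Below is a minimal sketch, assuming the matrix \(\begin{pmatrix}1/2 \amp 1\\1 \amp 1/2 \end{pmatrix}\) of \(T\) with respect to the standard basis; the eigenvectors it reports are exactly the directions preserved in Figure 5.1.1.

T = matrix(QQ, [[1/2, 1], [1, 1/2]])   # matrix of T(x,y) = (x/2 + y, x + y/2)
print(T.eigenvalues())                 # eigenvalues 3/2 and -1/2
print(T.eigenvectors_right())          # eigenvectors along (1, 1) and (1, -1)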
You can also explore the same phenomenon by rotating a vector \(v\) and observing what happens to its image, using the interactive diagram below (see Figure 5.1.2).
Let \(T\) be a linear transformation from \(\R^n\) to \(\R^n\text{.}\) A real number (scalar) \(\lambda\) is called an eigenvalue of \(T\) if there exists a nonzero vector \(v\in \R^n\) (called an eigenvector corresponding to the eigenvalue \(\lambda\)) such that \(T(v)=\lambda v\text{,}\) that is, if \(T(v)\) is parallel to \(v\text{.}\)
If \(A\) is an \(n\times n\) real matrix, then we know that \(T_A(x)=Ax\) is the linear transformation induced by \(A\text{.}\) We define an eigenvalue of \(A\) as an eigenvalue of \(T_A\text{.}\) In particular, a real number \(\lambda\) is called an eigenvalue of \(A\) if there exists a nonzero vector \(v\in \R^n\) (called an eigenvector corresponding to the eigenvalue \(\lambda\)) such that \(Av=\lambda v\text{.}\)
Let \(A=\left(\begin{array}{rr} 1 \amp 2 \\ -1 \amp 4 \end{array} \right)\text{.}\) Consider a vector \(u=\begin{pmatrix}1\\1 \end{pmatrix}\text{.}\) Then \(Au=\begin{pmatrix}3\\3 \end{pmatrix} =3\begin{pmatrix}1\\1 \end{pmatrix} =3u\text{.}\) Hence \(u\) is an eigenvector and \(\lambda = 3\) is an eigenvalue.
Consider \(v=\begin{pmatrix}2\\1 \end{pmatrix}\text{.}\) Then it is easy to check that \(Av=\begin{pmatrix}4\\2 \end{pmatrix} =2\begin{pmatrix}2\\1 \end{pmatrix}\text{.}\) Hence \(v\) is also an eigenvector and \(\lambda = 2\) is an eigenvalue.
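A quick Sage verification of both eigenpairs:

A = matrix(QQ, [[1, 2], [-1, 4]])
u = vector([1, 1])
v = vector([2, 1])
print(A*u == 3*u, A*v == 2*v)  # True True
print(A.eigenvalues())         # eigenvalues 3 and 2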
If \(T\) is the identity transformation from \(\R^n\to \R^n\text{,}\) then every nonzero vector is an eigenvector corresponding to the eigenvalue 1. The same is true for the \(n\times n\) identity matrix.
Consider the matrix \(R_\theta\) of the anticlockwise rotation by an angle \(\theta\neq n\pi\) for \(n\in \Z\text{.}\) Then it is easy to see that \(R_\theta\) does not have a real eigenvector. Thus not all square matrices have (real) eigenvectors.
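For instance, for \(\theta=\pi/4\text{,}\) a short Sage computation (a sketch) shows that the characteristic polynomial of \(R_\theta\) has no real roots:

theta = pi/4
R = matrix([[cos(theta), -sin(theta)], [sin(theta), cos(theta)]])
print(R.charpoly())     # x^2 - sqrt(2)*x + 1, with negative discriminant
print(R.eigenvalues())  # both eigenvalues are complex, so no real eigenvector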
If \(\lambda\) is an eigenvalue of \(A\) with corresponding eigenvector \(v\neq 0\text{,}\) then any nonzero scalar multiple of \(v\) is also an eigenvector corresponding to the same eigenvalue \(\lambda\text{.}\)
Let us analyze the notion of eigenvalues and eigenvectors. If \(v\) is an eigenvector corresponding to an eigenvalue \(\lambda\text{,}\) then \(Av=\lambda v\text{.}\) This implies \((\lambda I-A)v=0\text{,}\) where \(I\) is the \(n\times n\) identity matrix. This means that the homogeneous system \((\lambda I-A)x=0\) has a nonzero solution, namely \(v\text{.}\) Hence \(\det{(\lambda I-A)}=0\text{.}\) Notice that \(\det(A-\lambda I)\) is a polynomial of degree \(n\) in \(\lambda\text{,}\) called the characteristic polynomial of \(A\text{.}\) Thus if \(Av=\lambda v\text{,}\) then \(\lambda\) is a root of the characteristic polynomial \(\det(A-\lambda I)\text{.}\) By the fundamental theorem of algebra, an \(n\times n\) real matrix can have at most \(n\) real eigenvalues. The equation \(\det(A-\lambda I)=0\) is called the characteristic equation of \(A\text{.}\)
Let \(A\) be an \(n\times n\) real matrix. Then (i) the sum of the eigenvalues of \(A\) is the trace of \(A\text{,}\) and (ii) the product of the eigenvalues is the determinant of \(A\text{.}\)
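For the matrix \(A=\begin{pmatrix}1 \amp 2 \\ -1 \amp 4 \end{pmatrix}\) of the example above, a quick Sage check of the characteristic polynomial and of both facts:

A = matrix(QQ, [[1, 2], [-1, 4]])
print(A.charpoly().factor())   # factors as (x - 2)*(x - 3)
evs = A.eigenvalues()
print(sum(evs) == A.trace())   # True: 3 + 2 = 5
print(prod(evs) == A.det())    # True: 3 * 2 = 6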
Let \(A = \begin{pmatrix}1 \amp 1 \amp 1\\ 1 \amp 1 \amp 1\\1 \amp 1 \amp 1 \end{pmatrix}\text{.}\) What are the eigenvalues and eigenvectors of \(A\text{?}\)
Note that \(Ae_1=Ae_2=Ae_3=e_1+e_2+e_3\text{.}\) This means \(A(e_1+e_2+e_3)=3(e_1+e_2+e_3)\text{.}\) Hence \(3\) is an eigenvalue and \(e_1+e_2+e_3=\begin{pmatrix}1 \\1\\1 \end{pmatrix}\) is an eigenvector corresponding to the eigenvalue 3.
Also \(A(e_1-e_2)=0\text{.}\) Hence \(0\) is an eigenvalue and \(e_1-e_2=\begin{pmatrix}1 \\-1\\0 \end{pmatrix}\) is an eigenvector corresponding to the eigenvalue 0. Similarly, \(e_1-e_3\) and \(e_2-e_3\) are eigenvectors corresponding to the eigenvalue 0.
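Sage confirms these eigenvalues and the dimensions of the corresponding eigenspaces:

A = matrix(QQ, 3, 3, [1]*9)   # the all-ones matrix
for ev, vecs, mult in A.eigenvectors_right():
    print(ev, vecs, mult)     # 3 with eigenvector (1,1,1); 0 with a 2-dimensional eigenspace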
Now consider the matrix \(A = \begin{pmatrix}1 \amp t \amp t\\ t \amp 1 \amp t\\ t \amp t \amp 1 \end{pmatrix}\) for \(t\in\R\text{.}\) The trace of \(A\) is 3, and \(\det{(A)}=2t^3 - 3t^2 + 1=(2t + 1)(t - 1)^2\text{.}\) Since the sum of the eigenvalues is 3 and the product of the eigenvalues is \(\det{(A)}\text{,}\) it is easy to guess that \(\lambda_1 =2t+1\) and \(\lambda_2=\lambda_3=1-t\) are the eigenvalues of \(A\text{.}\)
We can adopt a procedure similar to Example 5.1.10 to show that \(e_1+e_2+e_3\) is an eigenvector corresponding to the eigenvalue \(1+2t\text{.}\) Similarly, \(e_1-e_2, e_2-e_3, e_1-e_3\) are eigenvectors corresponding to the eigenvalue \(1-t\text{.}\)
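Sage can carry out this computation symbolically; here is a sketch.

var('t x')
A = matrix(SR, [[1, t, t], [t, 1, t], [t, t, 1]])
p = (x*identity_matrix(3) - A).det()
print(p.factor())   # (x - 2*t - 1)*(x + t - 1)^2, giving the eigenvalues 1 + 2t and 1 - t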
It is easy to see that the characteristic polynomial \(\det(A-\lambda I)\) has roots \(\lambda=1, \lambda=-2, \lambda=2\text{.}\) Thus \(A\) has eigenvalues \(1, -2, 2\text{.}\)
Let us find eigenvectors with respect to the eigenvalue \(\lambda=1\text{.}\) Let \(v=\begin{pmatrix}x_1\\x_2\\x_3 \end{pmatrix}\) be an eigenvector corresponding to \(\lambda=1\text{.}\) Then \(Av=\lambda v=v\text{.}\) That is,
Solving the above system, we get \(x_1=-x_2\) and \(x_2=x_3\text{.}\) Thus \(v=\begin{pmatrix}\alpha\\-\alpha\\-\alpha \end{pmatrix}\) is an eigenvector for any nonzero \(\alpha\in \R\text{.}\) In particular, \(v=\begin{pmatrix}1\\-1\\-1 \end{pmatrix}\) is an eigenvector of \(A\) corresponding to \(\lambda=1\text{.}\)
Similarly, show that \(\begin{pmatrix}0\\1\\1 \end{pmatrix}\) is an eigenvector of \(A\) corresponding to \(\lambda=2\) and \(\begin{pmatrix}8/7\\-5/7\\1 \end{pmatrix}\) is an eigenvector of \(A\) corresponding to \(\lambda=-2\text{.}\)
The characteristic polynomial of \(A\) is given by \(\det(A-\lambda I)=\begin{vmatrix}-\lambda\amp 1\\-1\amp -\lambda\end{vmatrix}=\lambda^2+1\text{.}\) Hence the eigenvalues of \(A\) are \(\lambda=\pm i\text{.}\)
Let us find eigenvectors with respect to the eigenvalue \(\lambda=i\text{.}\) Let \(v=\begin{pmatrix}x_1\\x_2 \end{pmatrix}\) be an eigenvector corresponding to \(\lambda=i\text{.}\) Then \(Av=\lambda v=i v\text{.}\) That is,
Now it is easy to see that \(v=\begin{pmatrix}1\\ i \end{pmatrix}\) is an eigenvector of \(A\) corresponding to \(\lambda=i\text{.}\) Similarly one can show that \(v=\begin{pmatrix}1\\ -i \end{pmatrix}\) is an eigenvector of \(A\) corresponding to \(\lambda=-i\text{.}\)
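In Sage the same computation can be done over the algebraic numbers (a sketch):

A = matrix(QQ, [[0, 1], [-1, 0]])
print(A.charpoly())   # x^2 + 1
B = A.change_ring(QQbar)
for ev, vecs, mult in B.eigenvectors_right():
    print(ev, vecs)   # I with eigenvector (1, I); -I with (1, -I), up to scaling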
The collection of all eigenvectors of \(A\) corresponding to \(\lambda\text{,}\) together with the zero vector, is a subspace of \(\R^n\text{,}\) denoted by \(E_\lambda\) and called the eigenspace of \(A\) corresponding to \(\lambda\text{.}\) The dimension of \(E_\lambda\) is called the geometric multiplicity of \(\lambda\text{.}\)
The geometric multiplicity of an eigenvalue is always less than or equal to its algebraic multiplicity. That is, if \(m\) is the algebraic multiplicity of \(\lambda\text{,}\) then \(\dim{(E_\lambda)} \leq m\text{.}\)
The geometric multiplicity of an eigenvalue \(\lambda\) is the nullity of \(A-\lambda I\text{,}\) that is, the dimension of the null space of \(A-\lambda I\text{.}\)
Consider the matrix \(A=\begin{pmatrix}-1\amp 1 \amp 0\\0 \amp -1 \amp 1\\0 \amp 0 \amp -1 \end{pmatrix}\text{.}\) It is easy to check that \(\det{(xI-A)}=(x+1)^3\text{.}\) That is, \(A\) has only one eigenvalue, \(\lambda =-1\text{,}\) of algebraic multiplicity 3. It is easy to see that \(e_1=(1,0,0)\) is an eigenvector corresponding to \(\lambda=-1\text{.}\) We have
\begin{equation*}
A -\lambda I = \begin{pmatrix}0\amp 1 \amp 0\\0 \amp 0 \amp 1\\0 \amp 0 \amp 0 \end{pmatrix}\text{.}
\end{equation*}
Since the rank of \(A-\lambda I\) is 2, by the rank-nullity theorem the nullity of \(A -\lambda I\) is 1. Hence the geometric multiplicity of \(\lambda\) is 1, whereas its algebraic multiplicity is 3.
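A Sage verification of both multiplicities:

A = matrix(QQ, [[-1, 1, 0], [0, -1, 1], [0, 0, -1]])
print(A.charpoly().factor())   # (x + 1)^3: algebraic multiplicity 3
print((A + identity_matrix(3)).right_kernel().dimension())   # 1: geometric multiplicity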
If \(\lambda\) is an eigenvalue of \(A\text{,}\) then \(\lambda^2\) is an eigenvalue of \(A^2\text{.}\) In general, \(\lambda^k\) is an eigenvalue of \(A^k\) for any \(k\in \N\text{.}\)
If \(\lambda\) is an eigenvalue of \(A\) and \(p(x)=c_0+c_1x+c_2x^2+\cdots + c_kx^k\) is a polynomial in \(x\text{,}\) then \(p(\lambda)\) is an eigenvalue of \(p(A)=c_0I+c_1A+c_2A^2+\cdots + c_kA^k\text{.}\)
Two matrices \(A\) and \(B\) are called similar if there exists an invertible matrix \(P\) such that \(B=P^{-1}AP\text{.}\) Similar matrices have the same eigenvalues.
If \(\lambda_1\) and \(\lambda_2\) are distinct eigenvalues of \(A\text{,}\) then the eigenvectors \(v_1\) and \(v_2\) corresponding to \(\lambda_1\) and \(\lambda_2\) are linearly independent. Can you generalize this?
Let \(T\) be a linear transformation from \(\R^n\to \R^n\text{.}\) Fix a basis \(\beta\) of \(\R^n\text{,}\) and let \(A=[T]_\beta\) be the matrix of \(T\) with respect to \(\beta\text{.}\) Then \(A\) and \(T\) have the same eigenvalues. Furthermore, the eigenvalues of \(T\) are independent of the choice of basis.
Let \(\lambda_1\) and \(\lambda_2\) be distinct eigenvalues of \(A\) with corresponding eigenvectors \(v_1\) and \(v_2\) respectively. Let \(\alpha_1\) and \(\alpha_2\) be scalars such that \(\alpha_1v_1+\alpha_2 v_2=0\text{.}\) Applying \(A\) to both sides, we have \(\alpha_1\lambda_1v_1+\alpha_2\lambda_2 v_2=0\text{.}\) Multiplying the first equation by \(\lambda_1\) and subtracting it from the second, we get \(\alpha_2(\lambda_2-\lambda_1)v_2=0\text{.}\) Since \(v_2\neq 0\) and \(\lambda_1\neq \lambda_2\text{,}\) we have \(\alpha_2=0\text{.}\) Then the first equation gives \(\alpha_1 v_1=0\text{,}\) and since \(v_1\neq 0\text{,}\) we have \(\alpha_1=0\text{.}\) Hence \(v_1\) and \(v_2\) are linearly independent.
We have the following generalization. If \(v_i\) for \(i=1,\ldots, k\) are eigenvectors corresponding to distinct eigenvalues \(\lambda_i\) for \(i=1,\ldots, k\) respectively, then \(v_1,\ldots, v_k\) are linearly independent. The proof follows by induction on \(k\text{.}\)
Since \(T\colon \R^n\to \R^n\) is a linear transformation, it can be written as \(T(x)=Mx\text{,}\) where \(M\) is the matrix of \(T\) with respect to the standard basis of \(\R^n\text{,}\) and the eigenvalues of \(T\) are the same as the eigenvalues of \(M\text{.}\) Moreover, \(A=[T]_\beta\) and \(M\) are similar matrices, hence they have the same eigenvalues.
Let \(A=\begin{pmatrix}1\amp 2\amp -2\\1\amp 1\amp 1\\1\amp 3\amp -1 \end{pmatrix}\) and \(B=A^3-3A+I\text{.}\) Let us find the eigenvalues of \(B\text{.}\)
It is easy to check that the characteristic polynomial of \(A\) is \(\lambda^3-\lambda^2-4\lambda+4\text{,}\) with roots \(\lambda=-2, 1, 2\text{.}\) Then the eigenvalues of \(B\) are given by
In general, let \(f(x)=\alpha_0+\alpha_1 x+\alpha_2 x^2+\cdots+\alpha_kx^k\) be a polynomial of degree \(k\) and \(A\) be an \(n\times n\) real matrix. Then we can define \(f(A)=\alpha_0 I+\alpha_1 A+\alpha_2 A^2+\cdots+\alpha_k A^k\text{.}\) If \(\lambda\) is an eigenvalue of \(A\text{,}\) then \(f(\lambda)\) is an eigenvalue of \(f(A)\text{.}\)
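We can check the eigenvalues of \(B=A^3-3A+I\) in Sage:

A = matrix(QQ, [[1, 2, -2], [1, 1, 1], [1, 3, -1]])
print(A.eigenvalues())               # -2, 1, 2 (in some order)
B = A^3 - 3*A + identity_matrix(3)
print(B.eigenvalues())               # -1, -1, 3: the values f(lambda) for f(x) = x^3 - 3x + 1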
Note that the adjugate matrix \(\text{adj}(\lambda I-A)\) is an \(n\times n\) matrix whose entries are polynomials in \(\lambda\) of degree at most \(n-1\text{.}\) Hence we may write
Multiplying the first equation by \(A^n\text{,}\) the second equation by \(A^{n-1}\text{,}\) and so on, the second last equation by \(A\) and the last equation by \(I\text{,}\) and adding, we get the desired result.
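For the matrix \(A\) of the previous example, Sage confirms the Cayley-Hamilton theorem; the computations of \(A^{-1}\) and \(A^4\) carried out by hand below can be verified the same way.

A = matrix(QQ, [[1, 2, -2], [1, 1, 1], [1, 3, -1]])
I3 = identity_matrix(QQ, 3)
print(A^3 - A^2 - 4*A + 4*I3 == 0)            # True: A satisfies its characteristic polynomial
print(A.inverse() == (-A^2 + A + 4*I3) / 4)   # True, from A^2 - A - 4I + 4A^{-1} = 0
print(A^4 == A^3 + 4*A^2 - 4*A)               # True, multiplying the equation by A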
It is easy to check that \(\det{(A)}=-4\text{;}\) hence \(A\) is nonsingular. Since \(A^3-A^2-4A+4I=0\text{,}\) multiplying both sides by \(A^{-1}\text{,}\) we get \(A^2-A-4I+4A^{-1}=0\text{.}\) Hence
We can also find higher powers of a matrix using the Cayley-Hamilton theorem. For example, multiplying the equation \(A^3-A^2-4A+4I=0\) by \(A\text{,}\) we get \(A^4-A^3-4A^2+4A=0\text{,}\) from which we have
Let \(A\) be an \(n\times n\) matrix and let \(\lambda_i\) for \(1\leq i\leq n\) be the eigenvalues of \(A\text{.}\) Then the spectral radius of \(A\) is defined as \(\rho(A):=\displaystyle\max_{1\leq i\leq n}\{ |\lambda_i| \}\text{.}\)
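The next example computes a spectral radius by hand; here is the same computation as a Sage sketch:

A = matrix(QQ, [[0, -1], [1, 0]])
evs = A.eigenvalues()              # the eigenvalues I and -I
print(max(abs(ev) for ev in evs))  # spectral radius rho(A) = 1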
Let \(A= \begin{pmatrix}0 \amp -1\\1 \amp 0 \end{pmatrix}\text{.}\) Then the characteristic polynomial of \(A\) is \(\det{(xI-A)}=x^2+1\text{.}\) Hence \(i\) and \(-i\) are the roots of the characteristic polynomial, that is, the eigenvalues of \(A\text{.}\) Hence
Let \(A\) be an \(n\times n\) symmetric matrix. Then \(A\) is said to be positive definite if \(x^TAx\gt 0\) for all \(x\in \R^n\setminus \{0\}\text{.}\) \(A\) is called negative definite if \(-A\) is positive definite.
Thus if \(x=\begin{pmatrix}1\\-1 \end{pmatrix}\text{,}\) then \(x^TAx=-2\lt 0\text{.}\) Hence \(A\) is not positive definite. It is easy to see that \(A\) is also not negative definite.
We have the following result about positive definite matrices, known as Sylvester's criterion. It allows us to determine whether a given matrix is positive definite using the leading principal minors of the matrix.
The leading principal minors of a matrix \(A\) are \({\rm det}(A)\) and the minors obtained by successively removing the last row and the last column. That is, the leading principal minors of a matrix \(A\) are
Let \(A=\begin{pmatrix}2 \amp 1 \amp 1\\1\amp 2\amp 1\\1\amp 1\amp 2 \end{pmatrix}\text{.}\) For any \(x=\begin{pmatrix}x_1\\x_2\\x_3 \end{pmatrix} \in \R^3\text{,}\) we have
Note that if \(A\) is not a symmetric matrix, then Sylvester's criterion cannot be used to check positive definiteness. For instance, consider the matrix \(A=\begin{pmatrix} 1 \amp 0\\-3 \amp 1\end{pmatrix}\text{.}\) It is easy to see that all leading principal minors of \(A\) are positive. However, for \(u=\begin{pmatrix}1\\2\end{pmatrix}\text{,}\) \(u^TAu=-1\text{,}\) while for \(v=\begin{pmatrix}3\\1\end{pmatrix}\text{,}\) \(v^TAv=1\text{.}\)
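In Sage we can compute the leading principal minors, test positive definiteness, and reproduce the caution about non-symmetric matrices (a sketch):

A = matrix(QQ, [[2, 1, 1], [1, 2, 1], [1, 1, 2]])
print([A.submatrix(0, 0, k, k).det() for k in range(1, 4)])  # [2, 3, 4]: all positive
print(A.is_positive_definite())                              # True

B = matrix(QQ, [[1, 0], [-3, 1]])  # non-symmetric: leading minors are positive...
u = vector([1, 2])
print(u * B * u)                   # ...yet u^T B u = -1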
Consider a linear transformation \(D\colon {\cal P}_n(\R)\to {\cal P}_n(\R)\) defined by \(D(f(x))=f'(x)\text{.}\) What are the eigenvalues and eigenvectors of \(D\text{?}\)
Consider a linear transformation \(T\colon {\cal P}_2(\R)\to {\cal P}_2(\R)\) defined by \(T(f(x))=xf'(x)-f(x)\text{.}\) What are the eigenvalues and eigenvectors of \(T\text{?}\)
Consider a linear transformation \(T\colon M_2(\R)\to M_2(\R)\) defined as \(T(A)=A+A^T\text{.}\) What are eigenvalues and eigenvectors of \(T\text{?}\)
Consider a linear transformation \(T\colon M_2(\R)\to M_2(\R)\) defined as \(T(A)=A-A^T\text{.}\) What are eigenvalues and eigenvectors of \(T\text{?}\)
Hint: write the matrix of each linear transformation with respect to the standard basis, find the eigenvalues and eigenvectors of that matrix, and use them to determine the corresponding eigenvalues and eigenvectors of \(T\text{.}\)
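For instance, for \(T(A)=A+A^T\) we have \(T(E_{11})=2E_{11}\text{,}\) \(T(E_{12})=T(E_{21})=E_{12}+E_{21}\) and \(T(E_{22})=2E_{22}\text{,}\) so with respect to the standard basis \(E_{11}, E_{12}, E_{21}, E_{22}\) of \(M_2(\R)\) the matrix of \(T\) is the one below (a sketch):

M = matrix(QQ, [[2, 0, 0, 0],
                [0, 1, 1, 0],
                [0, 1, 1, 0],
                [0, 0, 0, 2]])
print(M.eigenvalues())  # [2, 2, 2, 0]: symmetric matrices give eigenvalue 2,
                        # skew-symmetric matrices give eigenvalue 0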
Let \(V\) be the set of all twice continuously differentiable functions from \(\R\) to \(\R\text{.}\) It is easy to check that it is a vector space over \(\R\text{.}\) Consider the following linear maps:
Let \(T\colon \R^n\to\R^n\) be a linear map. A subspace \(W\) of \(\R^n\) is called invariant under \(T\) (or \(T\)-invariant) if \(T(w)\in W\) for all \(w\in W\text{.}\)
It is easy to check that if \(T\colon \R^n\to\R^n\) is a linear map then \(\{0\}\) and \(\R^n\) are invariant subspaces. Furthermore, \({\rm ker}(T)\) and \({\rm Im}(T)\) are also \(T\)-invariant subspaces.
Suppose \(T,S\colon \R^n \to \R^n\) are linear maps such that \(TS=ST\text{,}\) that is, they commute. Then show that \({\rm ker}(T)\) is \(S\)-invariant.
Let \(T\colon \R^n\to\R^n\) be a linear map and let \(W_1\) and \(W_2\) be \(T\)-invariant subspaces such that \(\R^n=W_1\oplus W_2\text{.}\) Let \(\beta_1=\{u_1,\ldots,u_k\}\) and \(\beta_2=\{v_1,\ldots,v_m\}\) be bases of \(W_1\) and \(W_2\) respectively. Then it is easy to check that \(\beta=\beta_1\cup \beta_2\) is a basis of \(\R^n\text{.}\) What will be the matrix of \(T\) with respect to \(\beta\text{?}\)