The Fibonacci sequence originally appeared in the solution of the following problem, posed by Leonardo of Pisa, popularly known as Fibonacci, in his book Liber Abaci (1202).
A certain man put a pair of rabbits in a place surrounded on all sides by a wall. How many pairs of rabbits can be produced from that pair in a year if it is supposed that every month each pair begets a new pair which from the second month on becomes productive?
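Before setting up the algebra, we can answer the question by a direct month-by-month simulation. Here is a minimal sketch (plain Sage/Python, as used later in this section); the list name pairs is ours.

    # Direct simulation of Fibonacci's rabbit problem: pairs[n] is the
    # number of pairs at the end of month n, starting from one newborn pair.
    pairs = [0, 1]                      # F_0 = 0, F_1 = 1
    for month in range(2, 13):          # months 2 through 12
        pairs.append(pairs[-1] + pairs[-2])
    print(pairs)                        # [0, 1, 1, 2, 3, 5, ..., 144]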
We assume that the number of pairs at the end of the \(n\)-th month is denoted by \(F_n\text{.}\) We start with \(F_0=0\) and \(F_1=1\text{.}\) Note that the number of pairs of rabbits in any given month is the number of pairs in the previous month plus the pairs born in that month, and the newly born pairs are as many as the pairs two months earlier, since a pair becomes productive from its second month. This gives rise to the recurrence relation \(F_{n+1}=F_n+F_{n-1}\text{,}\) which in matrix form reads \(\begin{pmatrix}F_{n+1}\\F_n\end{pmatrix}=M\begin{pmatrix}F_n\\F_{n-1}\end{pmatrix}\) with \(M=\begin{pmatrix}1\amp 1\\1\amp 0\end{pmatrix}\text{.}\)
Note that \(M\) is a symmetric matrix with characteristic polynomial \(\lambda^2-\lambda-1\text{.}\) The eigenvalues of \(M\) are \(\frac{1\pm
\sqrt{5}}{2}\) and the corresponding eigenvectors are \(\begin{pmatrix}\frac{1\pm
\sqrt{5}}{2} \\1\end{pmatrix}\text{.}\) Since \(M\) has distinct eigenvalues, \(M\) is diagonalizable and we have
Let \(\varphi=\frac{1+\sqrt{5}}{2}\) be the larger of the two eigenvalues of \(M\text{;}\) then the other eigenvalue is \(1-\varphi=\frac{1-\sqrt{5}}{2}\text{.}\) It is also easy to check that \(1+\frac{1}{\varphi}=\varphi\text{.}\) Now we have
In a certain area, it is observed that every year the number of rabbits is equal to 4 times the number of rabbits minus 2 times the number of weasels in the previous year, while the number of weasels is equal to the sum of the numbers of rabbits and weasels in the previous year. If the initial populations of rabbits and weasels are 100 and 10 respectively, find the number of each species after \(n\) years. Let \(r_n\) and \(w_n\) be the number of rabbits and weasels after \(n\) years respectively.
Let \(X_n=\begin{pmatrix}r_n\\w_n\end{pmatrix}\) and \(A=\begin{pmatrix}4 \amp -2 \\
1\amp 1 \end{pmatrix}\text{.}\) Then the above system is equivalent to
\begin{equation*}
X_{n+1}=AX_n\quad \text{ with } \quad X_0=\begin{pmatrix}100\\10\end{pmatrix}\text{.}
\end{equation*}
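As a quick numerical check, the following minimal Sage sketch (using exactly the matrix and initial vector above) computes the eigenvalues of \(A\) and the population vector for the first few years by direct matrix powers.

    # Rabbits and weasels: X_{n+1} = A X_n with X_0 = (100, 10).
    A  = matrix(ZZ, [[4, -2], [1, 1]])
    X0 = vector([100, 10])
    print(A.eigenvalues())          # expect 2 and 3 (in some order)
    for n in [1, 2, 3]:
        print(n, A^n * X0)          # population after n years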
It is easy to check that \(\lambda_1=2\) and \(\lambda_2=3\) are eigenvalues of \(A\) with corresponding eigenvectors \(v_1=\begin{pmatrix}1\\1\end{pmatrix}\) and \(v_2=\begin{pmatrix}2\\1\end{pmatrix}\) respectively. We define \(P = \begin{pmatrix}1 \amp
2\\1 \amp 1\end{pmatrix}\text{.}\) It is easy to see that \(P^{-1}=\begin{pmatrix}-1 \amp 2\\1
\amp -1\end{pmatrix} \text{.}\) Also \(D=\begin{pmatrix}2 \amp 0\\0 \amp 3\end{pmatrix}\text{.}\) Using diagonalization, we have
In this subsection, we shall use the theory of eigenvalues and eigenvectors to solve systems of linear differential equations. We shall consider a system of first-order linear differential equations with constant coefficients. A general form of such a system is given by
We want to find a solution of the above system, that is, a column vector \(x
=\begin{bmatrix}x_1 \amp x_2\amp \cdots \amp x_n\end{bmatrix}^T\) of \(n\) differentiable functions on some interval \([a,b]\) which satisfies the above system.
Let \(S\) be the set of all solutions of the linear system of ODEs \(x'=Ax\text{.}\) Then \(S\) is an \(n\)-dimensional vector subspace of \(D[a,b]\text{,}\) the space of differentiable functions on \([a,b]\text{.}\)
This means that if \(x_1, \ldots, x_n\) are linearly independent solutions of the above system, then any solution can be written as a linear combination of \(x_1,\ldots,
x_n\text{.}\)
In addition, suppose \(A\) is diagonalizable, with \(A=PDP^{-1}\) where \(D={\rm
diag}(\lambda_1,\ldots,\lambda_n)\text{,}\) the diagonal matrix of eigenvalues of \(A\) and \(P=\begin{bmatrix}u_1\amp
\ldots\amp u_n\end{bmatrix}\) is the matrix whose columns form an eigenbasis of \(A\text{.}\)
Here \(e^{tD}\) is the exponential of the matrix \(tD\) and \(c\) is the column matrix of constants. From this we get a general solution of the original system as
Suppose we want to solve the initial value problem \(x'(t)=Ax(t)\) with initial condition \(x(0)=x_0\text{.}\) Then we have \(c=P^{-1}x_0\text{.}\) Hence the solution in matrix form is given by
Find the eigenvalues and corresponding eigenvectors of \(A\text{.}\) Let \(\lambda_1,
\ldots, \lambda_n\) be the eigenvalues of \(A\) with corresponding eigenvectors \(v_1, \ldots, v_n\) respectively.
It is easy to see that the eigenvalues of \(A\) are \(\lambda_1=-3\) and \(\lambda_2=-6\) with corresponding eigenvectors \(v_1=\begin{pmatrix}1\\1\end{pmatrix}\) and \(v_2=\begin{pmatrix}2\\-1\end{pmatrix}\) respectively.
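Since the eigenvalues and eigenvectors above determine \(A=PDP^{-1}\) uniquely, we can reconstruct \(A\) and write down a solution in Sage. The following is a minimal sketch; the initial value \(x(0)=(1,0)^T\) is a hypothetical choice made only for illustration.

    # Reconstruct A from the eigendata and solve x' = A x symbolically.
    P = matrix(QQ, [[1, 2], [1, -1]])          # columns are v1 and v2
    D = diagonal_matrix([-3, -6])
    A = P * D * P.inverse()                    # gives [[-5, 2], [1, -4]]
    t = var('t')
    c = P.inverse() * vector([1, 0])           # hypothetical x(0) = (1, 0)
    x = c[0]*exp(-3*t)*vector(SR, [1, 1]) + c[1]*exp(-6*t)*vector(SR, [2, -1])
    print(A)
    print(x)                                   # the solution x(t)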
Here \(f_1(x_1,x_2)\) and \(f_2(x_1,x_2)\) represent the components of the velocity vector at each point \((x_1,x_2)\) at time \(t\text{.}\) This can be visualized by drawing velocity vectors at a large number of grid points in some domain. This is what is called the slope field. Geometrically, the slope field describes the flow of the system. Any solution curve \((x_1(t), x_2(t))\) is tangent to these arrows at every point along its path.
We can visualize the slope field along with the solution curve in Sage using the function plot_vector_field. Let us demonstrate this for the ODE in Example 5.5.3.
Here we have drawn unit slope vectors at all the grid points. Along with the slope field, Sage provides a way to visualize the trajectory, or solution curve, with a given starting point using the Sage function streamline_plot.
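For reference, the commands involved look roughly as follows. This is a minimal sketch that assumes, for illustration, the \(2\times 2\) system \(x'=Ax\) with the matrix reconstructed in the previous example; the plotting window and the starting point are arbitrary choices.

    # Slope field together with one trajectory for x' = A x.
    var('x1 x2')
    A  = matrix(QQ, [[-5, 2], [1, -4]])
    f1 = A[0,0]*x1 + A[0,1]*x2          # first component of the velocity
    f2 = A[1,0]*x1 + A[1,1]*x2          # second component of the velocity
    field = plot_vector_field((f1, f2), (x1, -3, 3), (x2, -3, 3))
    flow  = streamline_plot((f1, f2), (x1, -3, 3), (x2, -3, 3),
                            start_points=[[2, 2]], color='red')
    show(field + flow)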
The eigenvalues of \(A\) are \(\lambda_1=1, \lambda_2=-2,
\lambda_3=\frac{3-\sqrt{5}}{2},
\lambda_4=\frac{3+\sqrt{5}}{2}\) and the corresponding eigenvectors are
\begin{equation*}
x'(t)=Ax(t), \quad \text{ where } A = \begin{pmatrix} 1 \amp 1\\-1 \amp 1\end{pmatrix}.
\end{equation*}
The eigenvalues of \(A\) are \(\lambda_1=1-i\) and \(\lambda_2=1+i,\) with eigenvectors \(v_1=\begin{pmatrix} 1\\ -i \end{pmatrix}\) and \(v_2=\begin{pmatrix}
1\\ i \end{pmatrix}\) respectively.
Let us define \(P:=[v_1~~v_2]=\begin{pmatrix}
1 \amp 1\\-i \amp i
\end{pmatrix}\text{.}\) Then it is easy to see that \(P^{-1}AP=D=\begin{pmatrix}
1-i \amp 0\\0 \amp 1+i
\end{pmatrix}\text{.}\)
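As a minimal Sage sketch, we can confirm the eigenvalues numerically and check that one real solution extracted from the complex eigenpair, \(x(t)=e^{t}(\cos t,\,-\sin t)\text{,}\) indeed satisfies the system.

    # Verify the eigenvalues of A and check one real solution of x' = A x.
    A = matrix(CDF, [[1, 1], [-1, 1]])
    print(A.eigenvalues())                     # approximately 1 - i and 1 + i
    t = var('t')
    x1 = exp(t)*cos(t); x2 = -exp(t)*sin(t)    # candidate real solution
    print(bool(diff(x1, t) == x1 + x2))        # checks x1' = x1 + x2
    print(bool(diff(x2, t) == -x1 + x2))       # checks x2' = -x1 + x2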
In case the coefficient matrix \(A\) of the system \(x'=Ax\) is not diagonalizable, one can use the concept of the Jordan canonical form to solve the system. However, we shall not discuss this here.
In this subsection, we look at some applications of linear algebra in dealing with matrices that arise from probabilistic systems. We consider chance processes in which knowledge of previous outcomes influences predictions for future experiments. Such a chance process is called a Markov chain when the prediction for the next experiment depends only on the outcome of the present one.
The weather of a certain area is classified as "fair", "cloudy" and "rainy". A fair day is followed by a fair day 60% of the time, and by a cloudy day 25% of the time. A cloudy day is followed by a cloudy day 35% of the time, and by a rainy day 25% of the time. A rainy day is followed by a cloudy day 40% of the time and by another rainy day 25% of the time. What proportion of days is expected to be fair, cloudy and rainy over a long period?
In the above example, a fair, cloudy and rainy day can be thought of as three states, say state 1, state 2 and state 3 respectively. Each day can be thought of as a generation. Let us record the data given in the example in the following table:
A transition matrix of an \(n\)-state Markov chain is an \(n\times n\) matrix having nonnegative entries such that the sum of the entries in each column is 1. Such a matrix is called a stochastic matrix.
Note that \(p_{21}=0.25\) is the probability of a cloudy day following a fair day. Similarly, \(p_{13}=0.35\) represents the probability of a fair day following a rainy day. The transition matrix represents the change of state from the present generation to the next generation.
The powers of a transition matrix have the same properties as a transition matrix: all the entries are nonnegative and the sum of the entries in any column is 1.
If \(P\) is a transition matrix of a Markov chain and \(p_{ij}^{(k)}\) denotes the \(ij\)-th entry of \(P^k\text{,}\) then \(p_{ij}^{(k)}\) is the probability of moving to state \(i\) from state \(j\) in \(k\) generations or \(k\) time periods.
We consider an initial probability vector \(x^{(0)}=\begin{pmatrix}
0.4\\0.3\\0.3\end{pmatrix}\text{.}\) This means the current day has a 40% chance of being fair, a 30% chance of being cloudy and a 30% chance of being rainy.
Then \(x^{(k)}=P^kx^{(0)}\) denotes the probability vector after \(k\) stages. We are interested in the long-term behavior of the probability vector. In particular, we want to see if the limit \(\lim_{n\to \infty} P^nx^{(0)}\) exists.
In general, the above limit does not exist; however, if the transition matrix is regular, then one can show that the limit \(\displaystyle\lim_{n\to \infty} P^nx^{(0)}\) exists.
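The following minimal Sage sketch builds \(P\) from the data of the weather example (columns indexed by the present state, rows by the next state; the entries not stated explicitly are forced by the column sums) and iterates, suggesting that \(P^nx^{(0)}\) indeed settles down.

    # Weather chain: states 1 = fair, 2 = cloudy, 3 = rainy.
    P  = matrix(RDF, [[0.60, 0.40, 0.35],
                      [0.25, 0.35, 0.40],
                      [0.15, 0.25, 0.25]])
    x0 = vector(RDF, [0.4, 0.3, 0.3])
    for k in [1, 5, 50]:
        print(k, P^k * x0)              # approaches the steady state vector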
It is easy to see that if \(P\) is a regular transition matrix with all the entries of \(P^m\) positive, then all the entries of \(P^{m+k}\) are positive for all nonnegative integers \(k\text{.}\)
If \(P\) is an \(n\times n\) regular transition matrix, then \(P^k\) converges as \(k\to\infty\) to a matrix, say \(Q\text{,}\) whose columns are eigenvectors of \(P\) corresponding to the eigenvalue 1.
If \(P\) is a regular transition matrix, then its eigenvalue \(\lambda=1\) has multiplicity 1 and, up to scalar multiples, there is only one eigenvector associated with \(\lambda=1\text{.}\) This implies that all columns of \(Q\) are identical.
Let \(x^{(\infty)}=\begin{pmatrix}x\\y\\z\end{pmatrix}\) be the steady state vector. Then it is an eigenvector of \(P\) corresponding to the eigenvalue 1. That is, \(Px^{(\infty)}=x^{(\infty)}\) and also \(x+y+z=1\text{.}\) Thus we get the following set of equations which we need to solve.
A travel company has a fleet of 6000 rental cars. A car rented at one location can be returned to any of the three locations A, B and C. The fractions of cars returned to the three locations are given in the table below.
Suppose on Monday there are 2000 cars at each location. What will be the approximate distribution of cars on Thursday? How should the company distribute the cars among the locations in order to serve its customers smoothly?
Let the initial distribution of cars at the three locations be denoted by the vector \(X =
\begin{bmatrix}2000\\2000\\2000 \end{bmatrix}\text{.}\) The proportions of cars that are returned to the various locations can be represented by the matrix \(P = \begin{bmatrix}0.3
\amp 0.4 \amp 0.5\\ 0.3 \amp 0.4 \amp 0.3 \\ 0.4 \amp 0.2 \amp 0.2 \end{bmatrix}\text{,}\) which is a stochastic matrix. (Any square matrix with nonnegative entries whose columns each sum to 1 is called a column-stochastic or Markov matrix.)
Similarly, after two days it is \(P(PX)=P^2X\text{.}\) Thus on Thursday the car distribution at the various locations is given by \(P^3X=\begin{bmatrix}2336\\ 2000\\ 1664 \end{bmatrix}\text{.}\)
After, say, 50 days the car distribution at the various locations is given by \(P^{50}X\approx\begin{bmatrix}2333.33\\2000\\1666.67
\end{bmatrix}\text{.}\) In fact, for large \(n\) the vector \(P^nX\) is essentially constant, approximately \(\begin{bmatrix}2333.33\\2000\\1666.67 \end{bmatrix}\text{.}\) Thus in the long run the car distribution is \(\begin{bmatrix}2333\\2000\\1667 \end{bmatrix}\text{.}\)
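A minimal Sage sketch reproducing these numbers:

    # Car rental: distribution on Thursday and in the long run.
    P = matrix(RDF, [[0.3, 0.4, 0.5],
                     [0.3, 0.4, 0.3],
                     [0.4, 0.2, 0.2]])
    X = vector(RDF, [2000, 2000, 2000])
    print(P^3 * X)      # approximately (2336, 2000, 1664)
    print(P^50 * X)     # approximately (2333.33, 2000, 1666.67)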
Suppose we define \(\pi = \left(\begin{array}{r} \frac{7}{18}\\ \frac{1}{3} \\
\frac{5}{18} \end{array} \right)\text{.}\) Then it is easy to check that \(P\pi=\pi\text{,}\) that is, \(\pi\) is an eigenvector of \(P\) corresponding to the eigenvalue 1. This is called the steady state vector. Note that \(6000\,\pi\approx\begin{bmatrix}2333.33\\2000\\1666.67\end{bmatrix}\text{,}\) which is exactly the long-run distribution found above.
Suppose \(D\) is a diagonal matrix given by \(D= \left(\begin{array}{rrr} 1 \amp 0
\amp 0 \\ 0 \amp 0.3 \amp 0 \\ 0 \amp 0 \amp 0.2 \end{array} \right)\) and \(v\in \R^3\text{.}\) What happens to the vector \(v\) geometrically when we compute \(D^nv\) for large \(n\text{?}\) (It pulls the vector \(v\) toward the first coordinate direction.)
Next let us consider a diagonalizable matrix \(A\) with eigenvectors \(u_1, u_2, u_3\) and corresponding eigenvalues \(\lambda_1=1, \lambda_2=0.3\) and \(\lambda_3=0.2\) respectively. What happens to the vector \(v\) geometrically when we compute \(A^nv\) for large \(n\text{?}\) (It makes the components of \(v\) along \(u_2\) and \(u_3\) very small and pulls \(v\) toward the direction of \(u_1\text{.}\))
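Here is a minimal sketch of this pulling-in effect, using the diagonal matrix \(D\) above and a hypothetical starting vector \(v\text{;}\) for a diagonalizable \(A=PDP^{-1}\) the same happens in the coordinates of the eigenbasis.

    # Repeatedly applying D shrinks the components with eigenvalue < 1.
    D = diagonal_matrix(RDF, [1, 0.3, 0.2])
    v = vector(RDF, [1, 1, 1])          # hypothetical choice of v
    for n in [1, 5, 20]:
        print(n, D^n * v)               # second and third entries shrink to 0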
Over 900 million Indians use the Internet (as of January 2024), and most of them use Google, a search engine designed by Lawrence Page and Sergey Brin in 1998. There are about 200 million active websites in the world as of January 2024, and the number is growing every second. More often than not, a Google search gives the results users are looking for. Have you ever wondered how it works? Linear algebra plays an important role in how Google's search algorithm works. Google ranks pages according to their importance.
The success of Google lies in its PageRank algorithm, which rates the importance of each page on the web and presents the most relevant and helpful pages to the user first. PageRank uses the link structure of the web to determine the importance of webpages.
An Internet network can be represented by a directed graph in which each website is thought of as a node or vertex and links between pages are edges with arrows. If there is an arrow from node \(i\) to \(j\text{,}\) then it means that there is a link to move from webpage \(i\) to \(j\text{.}\) If there is a double arrow between \(i\) and \(j\text{,}\) then there is a link from \(i\) to \(j\) and also from \(j\) to \(i\text{.}\)
To every network with \(n\) webpages, we assign a matrix \(H=[h_{ij}]_{n\times n}\text{,}\) where \(h_{ij}\) represents the probability that a surfer visits webpage \(i\) from webpage \(j\text{.}\) If webpage \(j\) has \(m_j\) links to other webpages, then \(h_{ij}=1/m_j\) for each webpage \(i\) to which \(j\) links. If there is no link from \(j\) to \(i\text{,}\) then \(h_{ij}=0\text{.}\) The matrix \(H\) is called the hyperlink matrix of the network.
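To make the construction concrete, here is a minimal Sage sketch for a small hypothetical network (not the one discussed below): page 1 links to pages 2 and 3, page 2 links to page 3, page 3 links to page 1, and page 4 links to pages 1 and 3.

    # Hyperlink matrix: column j lists the probabilities of leaving page j.
    H = matrix(QQ, [[0,   0, 1, 1/2],
                    [1/2, 0, 0, 0  ],
                    [1/2, 1, 0, 1/2],
                    [0,   0, 0, 0  ]])
    print([sum(col) for col in H.columns()])   # each column sums to 1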
We wish to determine the long-term behaviour of the surfer. In particular, we wish to find the probability of the surfer being on page \(i\) in the long run. The vector
Thus if the surfer starts from webpage 3, then after one click the surfer reaches webpage 1 with probability \(1/2\) or webpage 5 with probability \(1/2\text{.}\)
which means that a surfer will visit webpage 1 with probability 0.19047, webpage 2 with probability 0.22222, and so on. Since webpage 4 has the highest probability, it gets rank 1. Thus this network will be ranked as \([4, 2, 5, 1, 3]\text{.}\)
One can show that the vector \(\pi\) is independent of the initial vector \(p_0\) for this particular network; \(\pi\) is called the stationary distribution vector. Thus the stationary distribution vector for any network is \(\displaystyle \lim_{n\to
\infty} H^np_0\text{,}\) if it exists.
From Example 5.5.21, it appears that we can find the stationary distribution vector, and hence rank a network, after sufficiently many iterations. However, this is not always true. In case a network has a dangling webpage (a webpage that does not link to any other page), or a trapping loop, we may not be able to rank the webpages. When a surfer comes across a dangling node or a trapping loop, they can type a new URL in the address bar manually. The next two examples demonstrate this.
Note that the column corresponding to the dangling node is a zero column. If the initial probability vector is \(p_0=\begin{pmatrix}0\\0\\1\\0 \\0 \end{pmatrix}\text{,}\) then
Let us consider a network with 5 webpages as shown in Figure 5.5.27. This network has a trapping loop \(2\longleftrightarrow 5\text{.}\) In the long run, for any initial vector \(p_0\text{,}\) we may get \(p_k\approx\begin{pmatrix} 0\amp 1/2\amp 0\amp 0\amp
1/2\end{pmatrix}^T\text{.}\)
If \(\alpha=0.9\text{,}\) it means that 90% of the time the surfer follows the web links and 10% of the time types a URL manually in the address bar. In the original paper, Page and Brin used \(\alpha=0.85\) as the damping factor.
Any regular stochastic matrix \(G\) has a stationary vector, that is, there exists a vector \(\pi\) such that \(G\pi=\pi\text{.}\) Thus one can always find a stationary probability vector for the network.
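As a minimal sketch of the damped iteration, we reuse the hypothetical 4-page matrix \(H\) from earlier with \(\alpha=0.85\text{:}\) the surfer follows a link 85% of the time and jumps to a uniformly random page otherwise.

    # Damped (Google) matrix G = alpha*H + (1 - alpha)/n * (all-ones matrix).
    H = matrix(RDF, [[0,   0, 1, 1/2],
                     [1/2, 0, 0, 0  ],
                     [1/2, 1, 0, 1/2],
                     [0,   0, 0, 0  ]])
    n = H.ncols()
    G = 0.85*H + 0.15/n * ones_matrix(RDF, n, n)
    p = vector(RDF, [1/n]*n)            # start from the uniform distribution
    for k in range(100):
        p = G*p                         # converges to the stationary vector
    print(p)                            # larger entry = higher page rank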