The Fibonacci sequence originally appeared in the solution of the following problem, posed by Leonardo of Pisa, popularly known as Fibonacci, in his book Liber Abaci (1202).
A certain man put a pair of rabbits in a place surrounded on all sides by a wall. How many pairs of rabbits can be produced from that pair in a year if it is supposed that every month each pair begets a new pair which from the second month on becomes productive?
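Before setting up the algebra, we can answer the question by a direct month-by-month simulation. Here is a minimal sketch (plain Sage/Python, as used later in this section); the list name pairs is ours.

    # Direct simulation of Fibonacci's rabbit problem: pairs[n] is the
    # number of pairs at the end of month n, starting from one newborn pair.
    pairs = [0, 1]                      # F_0 = 0, F_1 = 1
    for month in range(2, 13):          # months 2 through 12
        pairs.append(pairs[-1] + pairs[-2])
    print(pairs)                        # [0, 1, 1, 2, 3, 5, ..., 144]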
We assume that the number of pairs at the end of the \(n\)-th month is denoted by \(F_n\text{.}\) We start with \(F_0=0\) and \(F_1=1\text{.}\) Note that the number of pairs of rabbits in any given month is the number of pairs in the previous month plus the pairs born in that month, and the newly born pairs are as many as the pairs two months earlier, since a pair becomes productive from its second month. This gives rise to the recurrence relation \(F_{n+1}=F_n+F_{n-1}\text{,}\) which in matrix form reads \(\begin{pmatrix}F_{n+1}\\F_n\end{pmatrix}=M\begin{pmatrix}F_n\\F_{n-1}\end{pmatrix}\) with \(M=\begin{pmatrix}1\amp 1\\1\amp 0\end{pmatrix}\text{.}\)
Note that \(M\) is a symmetric matrix with characteristic polynomial \(\lambda^2-\lambda-1\text{.}\) The eigenvalues of \(M\) are \(\frac{1\pm
\sqrt{5}}{2}\) and the corresponding eigenvectors are \(\begin{pmatrix}\frac{1\pm
\sqrt{5}}{2} \\1\end{pmatrix}\text{.}\) Since \(M\) has distinct eigenvalues, \(M\) is diagonalizable and we have
Let \(\varphi=\frac{1+\sqrt{5}}{2}\) be the larger of the two eigenvalues of \(M\text{;}\) then the other eigenvalue is \(1-\varphi=\frac{1-\sqrt{5}}{2}\text{.}\) It is also easy to check that \(1+\frac{1}{\varphi}=\varphi\text{.}\) Now we have
In a certain area, it is observed that every year the number of rabbits is equal to 4 times the number of rabbits minus 2 times the number of weasels in the previous year, while the number of weasels is equal to the sum of the numbers of rabbits and weasels in the previous year. If the initial populations of rabbits and weasels are 100 and 10 respectively, find the number of each species after \(n\) years. Let \(r_n\) and \(w_n\) be the number of rabbits and weasels after \(n\) years respectively.
Let \(X_n=\begin{pmatrix}r_n\\w_n\end{pmatrix}\) and \(A=\begin{pmatrix}4 \amp -2 \\
1\amp 1 \end{pmatrix}\text{.}\) Then the above system is equivalent to
\begin{equation*}
X_{n+1}=AX_n\quad \text{ with } \quad X_0=\begin{pmatrix}100\\10\end{pmatrix}\text{.}
\end{equation*}
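As a quick numerical check, the following minimal Sage sketch (using exactly the matrix and initial vector above) computes the eigenvalues of \(A\) and the population vector for the first few years by direct matrix powers.

    # Rabbits and weasels: X_{n+1} = A X_n with X_0 = (100, 10).
    A  = matrix(ZZ, [[4, -2], [1, 1]])
    X0 = vector([100, 10])
    print(A.eigenvalues())          # expect 2 and 3 (in some order)
    for n in [1, 2, 3]:
        print(n, A^n * X0)          # population after n years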
It is easy to check that \(\lambda_1=2\) and \(\lambda_2=3\) are eigenvalues of \(A\) with corresponding eigenvectors \(v_1=\begin{pmatrix}1\\1\end{pmatrix}\) and \(v_2=\begin{pmatrix}2\\1\end{pmatrix}\) respectively. We define \(P = \begin{pmatrix}1 \amp
2\\1 \amp 1\end{pmatrix}\text{.}\) It is easy to see that \(P^{-1}=\begin{pmatrix}-1 \amp 2\\1
\amp -1\end{pmatrix} \text{.}\) Also \(D=\begin{pmatrix}2 \amp 0\\0 \amp 3\end{pmatrix}\text{.}\) Using diagonalization, we have
In this subsection, we shall use the theory of eigenvalues and eigenvectors to solve systems of linear differential equations. We shall consider a system of first-order linear differential equations with constant coefficients. A general form of such a system is given by
We want to find a solution of the above system, that is, a column vector \(x
=\begin{bmatrix}x_1 \amp x_2\amp \cdots \amp x_n\end{bmatrix}^T\) of \(n\) differentiable functions on some interval \([a,b]\) which satisfies the above system.
Let \(S\) be the set of all solutions of the linear system of ODEs \(x'=Ax\text{.}\) Then \(S\) is an \(n\)-dimensional vector subspace of \(D[a,b]\text{,}\) the space of differentiable functions on \([a,b]\text{.}\)
This means that if \(x_1, \ldots, x_n\) are linearly independent solutions of the above system, then any solution can be written as a linear combination of \(x_1,\ldots,
x_n\text{.}\)
In addition, suppose \(A\) is diagonalizable, with \(A=PDP^{-1}\) where \(D={\rm
diag}(\lambda_1,\ldots,\lambda_n)\text{,}\) the diagonal matrix of eigenvalues of \(A\) and \(P=\begin{bmatrix}u_1\amp
\ldots\amp u_n\end{bmatrix}\) is the matrix whose columns form an eigenbasis of \(A\text{.}\)
Here \(e^{tD}\) is the exponential of the matrix \(tD\) and \(c\) is the column matrix of constants. From this we get a general solution of the original system as
Suppose we want to solve the initial value problem \(x'(t)=Ax(t)\) with initial condition \(x(0)=x_0\text{.}\) Then we have \(c=P^{-1}x_0\text{.}\) Hence the solution in matrix form is given by
Find the eigenvalues and corresponding eigenvectors of \(A\text{.}\) Let \(\lambda_1,
\ldots, \lambda_n\) be the eigenvalues of \(A\) with corresponding eigenvectors \(v_1, \ldots, v_n\) respectively.
It is easy to see that the eigenvalues of \(A\) are \(\lambda_1=-3\) and \(\lambda_2=-6\) with corresponding eigenvectors \(v_1=\begin{pmatrix}1\\1\end{pmatrix}\) and \(v_2=\begin{pmatrix}2\\-1\end{pmatrix}\) respectively.
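Since the eigenvalues and eigenvectors above determine \(A=PDP^{-1}\) uniquely, we can reconstruct \(A\) and write down a solution in Sage. The following is a minimal sketch; the initial value \(x(0)=(1,0)^T\) is a hypothetical choice made only for illustration.

    # Reconstruct A from the eigendata and solve x' = A x symbolically.
    P = matrix(QQ, [[1, 2], [1, -1]])          # columns are v1 and v2
    D = diagonal_matrix([-3, -6])
    A = P * D * P.inverse()                    # gives [[-5, 2], [1, -4]]
    t = var('t')
    c = P.inverse() * vector([1, 0])           # hypothetical x(0) = (1, 0)
    x = c[0]*exp(-3*t)*vector(SR, [1, 1]) + c[1]*exp(-6*t)*vector(SR, [2, -1])
    print(A)
    print(x)                                   # the solution x(t)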
Here \(f_1(x_1,x_2)\) and \(f_2(x_1,x_2)\) represent the components of the velocity vector at each point \((x_1,x_2)\) at time \(t\text{.}\) This can be visualized by drawing velocity vectors at a large number of grid points in some domain. This is what is called the slope field. Geometrically, the slope field describes the flow of the system. Any solution curve \((x_1(t), x_2(t))\) is tangent to these arrows at every point along its path.
We can visualize the slope field along with the solution curve in Sage using the function plot_vector_field. Let us demonstrate this for the ODE in Example 5.5.3.
Here we have drawn unit slope vectors at all the grid points. Along with the slope field, Sage provides a way to visualize the trajectory, or solution curve, with a given starting point using the Sage function streamline_plot.
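For reference, the commands involved look roughly as follows. This is a minimal sketch that assumes, for illustration, the \(2\times 2\) system \(x'=Ax\) with the matrix reconstructed in the previous example; the plotting window and the starting point are arbitrary choices.

    # Slope field together with one trajectory for x' = A x.
    var('x1 x2')
    A  = matrix(QQ, [[-5, 2], [1, -4]])
    f1 = A[0,0]*x1 + A[0,1]*x2          # first component of the velocity
    f2 = A[1,0]*x1 + A[1,1]*x2          # second component of the velocity
    field = plot_vector_field((f1, f2), (x1, -3, 3), (x2, -3, 3))
    flow  = streamline_plot((f1, f2), (x1, -3, 3), (x2, -3, 3),
                            start_points=[[2, 2]], color='red')
    show(field + flow)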
The eigenvalues of \(A\) are \(\lambda_1=1, \lambda_2=-2,
\lambda_3=\frac{3-\sqrt{5}}{2},
\lambda_4=\frac{3+\sqrt{5}}{2}\) and the corresponding eigenvectors are
\begin{equation*}
x'(t)=Ax(t), \quad \text{ where } A = \begin{pmatrix} 1 \amp 1\\-1 \amp 1\end{pmatrix}.
\end{equation*}
The eigenvalues of \(A\) are \(\lambda_1=1-i\) and \(\lambda_2=1+i,\) with eigenvectors \(v_1=\begin{pmatrix} 1\\ -i \end{pmatrix}\) and \(v_2=\begin{pmatrix}
1\\ i \end{pmatrix}\) respectively.
Let us define \(P:=[v_1~~v_2]=\begin{pmatrix}
1 \amp 1\\-i \amp i
\end{pmatrix}\text{.}\) Then it is easy to see that \(P^{-1}AP=D=\begin{pmatrix}
1-i \amp 0\\0 \amp 1+i
\end{pmatrix}\text{.}\)
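As a minimal Sage sketch, we can confirm the eigenvalues numerically and check that one real solution extracted from the complex eigenpair, \(x(t)=e^{t}(\cos t,\,-\sin t)\text{,}\) indeed satisfies the system.

    # Verify the eigenvalues of A and check one real solution of x' = A x.
    A = matrix(CDF, [[1, 1], [-1, 1]])
    print(A.eigenvalues())                     # approximately 1 - i and 1 + i
    t = var('t')
    x1 = exp(t)*cos(t); x2 = -exp(t)*sin(t)    # candidate real solution
    print(bool(diff(x1, t) == x1 + x2))        # checks x1' = x1 + x2
    print(bool(diff(x2, t) == -x1 + x2))       # checks x2' = -x1 + x2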
In case the coefficient matrix \(A\) of the system \(x'=Ax\) is not diagonalizable, one can use the concept of the Jordan canonical form to solve the system. However, we shall not discuss this here.
In this subsection, we look at some applications of linear algebra in dealing with matrices that arise from probabilistic systems. We consider chance processes in which knowledge of previous outcomes influences predictions for future experiments. Such a chance process is called a Markov chain when the prediction for the next experiment depends only on the outcome of the present one.
The weather of a certain area is classified as "fair", "cloudy" and "rainy". A fair day is followed by a fair day 60% of the time, and by a cloudy day 25% of the time. A cloudy day is followed by a cloudy day 35% of the time, and by a rainy day 25% of the time. A rainy day is followed by a cloudy day 40% of the time and by another rainy day 25% of the time. What proportion of days is expected to be fair, cloudy and rainy over a long period?
In the above example, a fair, cloudy and rainy day can be thought of as three states, say state 1, state 2 and state 3 respectively. Each day can be thought of as a generation. Let us record the data given in the example in the following table:
A transition matrix of an \(n\)-state Markov chain is an \(n\times n\) matrix having nonnegative entries such that the sum of the entries in each column is 1. Such a matrix is called a stochastic matrix.
Note that \(p_{21}=0.25\) is the probability of a cloudy day following a fair day. Similarly, \(p_{13}=0.35\) represents the probability of a fair day following a rainy day. The transition matrix represents the change of state from the present generation to the next generation.
The powers of a transition matrix have the same properties as a transition matrix: all the entries are nonnegative and the sum of the entries in any column is 1.
If \(P\) is a transition matrix of a Markov chain and \(p_{ij}^{(k)}\) denotes the \(ij\)-th entry of \(P^k\text{,}\) then \(p_{ij}^{(k)}\) is the probability of moving to state \(i\) from state \(j\) in \(k\) generations or \(k\) time periods.
We consider an initial probability vector \(x^{(0)}=\begin{pmatrix}
0.4\\0.3\\0.3\end{pmatrix}\text{.}\) This means the current day has a 40% chance of being fair, a 30% chance of being cloudy and a 30% chance of being rainy.
Then \(x^{(k)}=P^kx^{(0)}\) denotes the probability vector after \(k\) stages. We are interested in the long-term behavior of the probability vector. In particular, we want to see if the limit \(\lim_{n\to \infty} P^nx^{(0)}\) exists.
In general, the above limit does not exist; however, if the transition matrix is regular, then one can show that the limit \(\displaystyle\lim_{n\to \infty} P^nx^{(0)}\) exists.
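The following minimal Sage sketch builds \(P\) from the data of the weather example (columns indexed by the present state, rows by the next state; the entries not stated explicitly are forced by the column sums) and iterates, suggesting that \(P^nx^{(0)}\) indeed settles down.

    # Weather chain: states 1 = fair, 2 = cloudy, 3 = rainy.
    P  = matrix(RDF, [[0.60, 0.40, 0.35],
                      [0.25, 0.35, 0.40],
                      [0.15, 0.25, 0.25]])
    x0 = vector(RDF, [0.4, 0.3, 0.3])
    for k in [1, 5, 50]:
        print(k, P^k * x0)              # approaches the steady state vector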
It is easy to see that if \(P\) is a regular transition matrix with all the entries of \(P^m\) positive, then all the entries of \(P^{m+k}\) are positive for all nonnegative integers \(k\text{.}\)
If \(P\) is an \(n\times n\) regular transition matrix, then \(P^k\) converges as \(k\to\infty\) to a matrix, say \(Q\text{,}\) whose columns are eigenvectors of \(P\) corresponding to the eigenvalue 1.
If \(P\) is a regular transition matrix, then its eigenvalue \(\lambda=1\) has multiplicity 1 and, up to scalar multiples, there is only one eigenvector associated with \(\lambda=1\text{.}\) This implies that all columns of \(Q\) are identical.
Let \(x^{(\infty)}=\begin{pmatrix}x\\y\\z\end{pmatrix}\) be the steady state vector. Then it is an eigenvector of \(P\) corresponding to the eigenvalue 1. That is, \(Px^{(\infty)}=x^{(\infty)}\) and also \(x+y+z=1\text{.}\) Thus we get the following set of equations which we need to solve.
A travel company has a fleet of 6000 rental cars. A car rented at one location can be returned to any of the three locations A, B and C. The fractions of cars returned to the three locations are given in the table below.
Suppose on Monday there are 2000 cars at each location. What will be the approximate distribution of cars on Thursday? How should the company distribute the cars among the locations in order to serve its customers smoothly?
Let the initial distribution of cars at the three locations be denoted by the vector \(X =
\begin{bmatrix}2000\\2000\\2000 \end{bmatrix}\text{.}\) The proportions of cars that are returned to the various locations can be represented by the matrix \(P = \begin{bmatrix}0.3
\amp 0.4 \amp 0.5\\ 0.3 \amp 0.4 \amp 0.3 \\ 0.4 \amp 0.2 \amp 0.2 \end{bmatrix}\text{,}\) which is a stochastic matrix. (Any square matrix with nonnegative entries whose columns each sum to 1 is called a column-stochastic or Markov matrix.)
Similarly, after two days it is \(P(PX)=P^2X\text{.}\) Thus on Thursday the car distribution at the various locations is given by \(P^3X=\begin{bmatrix}2336\\ 2000\\ 1664 \end{bmatrix}\text{.}\)
After, say, 50 days the car distribution at the various locations is given by \(P^{50}X\approx\begin{bmatrix}2333.33\\2000\\1666.67
\end{bmatrix}\text{.}\) In fact, for large \(n\) the vector \(P^nX\) is essentially constant, approximately \(\begin{bmatrix}2333.33\\2000\\1666.67 \end{bmatrix}\text{.}\) Thus in the long run the car distribution is \(\begin{bmatrix}2333\\2000\\1667 \end{bmatrix}\text{.}\)
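A minimal Sage sketch reproducing these numbers:

    # Car rental: distribution on Thursday and in the long run.
    P = matrix(RDF, [[0.3, 0.4, 0.5],
                     [0.3, 0.4, 0.3],
                     [0.4, 0.2, 0.2]])
    X = vector(RDF, [2000, 2000, 2000])
    print(P^3 * X)      # approximately (2336, 2000, 1664)
    print(P^50 * X)     # approximately (2333.33, 2000, 1666.67)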
Suppose we define \(\pi = \left(\begin{array}{r} \frac{7}{18}\\ \frac{1}{3} \\
\frac{5}{18} \end{array} \right)\text{.}\) Then it is easy to check that \(P\pi=\pi\text{,}\) that is, \(\pi\) is an eigenvector of \(P\) corresponding to the eigenvalue 1. This is called the steady state vector. Note that \(6000\,\pi\approx\begin{bmatrix}2333.33\\2000\\1666.67\end{bmatrix}\text{,}\) which is exactly the long-run distribution found above.
Suppose \(D\) is a diagonal matrix given by \(D= \left(\begin{array}{rrr} 1 \amp 0
\amp 0 \\ 0 \amp 0.3 \amp 0 \\ 0 \amp 0 \amp 0.2 \end{array} \right)\) and \(v\in \R^3\text{.}\) What happens to the vector \(v\) geometrically when we compute \(D^nv\) for large \(n\text{?}\) (It pulls the vector \(v\) toward the first coordinate direction.)
Next let us consider a diagonalizable matrix \(A\) with eigenvectors \(u_1, u_2, u_3\) and corresponding eigenvalues \(\lambda_1=1, \lambda_2=0.3\) and \(\lambda_3=0.2\) respectively. What happens to the vector \(v\) geometrically when we compute \(A^nv\) for large \(n\text{?}\) (It makes the components of \(v\) along \(u_2\) and \(u_3\) very small and pulls \(v\) toward the direction of \(u_1\text{.}\))
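Here is a minimal sketch of this pulling-in effect, using the diagonal matrix \(D\) above and a hypothetical starting vector \(v\text{;}\) for a diagonalizable \(A=PDP^{-1}\) the same happens in the coordinates of the eigenbasis.

    # Repeatedly applying D shrinks the components with eigenvalue < 1.
    D = diagonal_matrix(RDF, [1, 0.3, 0.2])
    v = vector(RDF, [1, 1, 1])          # hypothetical choice of v
    for n in [1, 5, 20]:
        print(n, D^n * v)               # second and third entries shrink to 0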
Over 900 million Indians use the Internet (as of January 2024), and most of them use Google, a search engine designed by Lawrence Page and Sergey Brin in 1998. There are about 200 million active websites in the world as of January 2024, and the number is growing every second. More often than not, a Google search gives the results users are looking for. Have you ever wondered how it works? Linear algebra plays an important role in how Google's search algorithm works. Google ranks pages according to their importance.
The success of Google lies in its PageRank algorithm, which rates the importance of each page on the web and presents the most relevant and helpful pages to the user first. PageRank uses the link structure of the web to determine the importance of webpages.
An Internet network can be represented by a directed graph in which each website is thought of as a node or vertex and links between pages are edges with arrows. If there is an arrow from node \(i\) to \(j\text{,}\) then it means that there is a link to move from webpage \(i\) to \(j\text{.}\) If there is a double arrow between \(i\) and \(j\text{,}\) then there is a link from \(i\) to \(j\) and also from \(j\) to \(i\text{.}\)
To every network with \(n\) webpages, we assign a matrix \(H=[h_{ij}]_{n\times n}\text{,}\) where \(h_{ij}\) represents the probability that a surfer visits webpage \(i\) from webpage \(j\text{.}\) If webpage \(j\) has \(m_j\) links to other webpages, then \(h_{ij}=1/m_j\) for each webpage \(i\) to which \(j\) links. If there is no link from \(j\) to \(i\text{,}\) then \(h_{ij}=0\text{.}\) The matrix \(H\) is called the hyperlink matrix of the network.
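To make the construction concrete, here is a minimal Sage sketch for a small hypothetical network (not the one discussed below): page 1 links to pages 2 and 3, page 2 links to page 3, page 3 links to page 1, and page 4 links to pages 1 and 3.

    # Hyperlink matrix: column j lists the probabilities of leaving page j.
    H = matrix(QQ, [[0,   0, 1, 1/2],
                    [1/2, 0, 0, 0  ],
                    [1/2, 1, 0, 1/2],
                    [0,   0, 0, 0  ]])
    print([sum(col) for col in H.columns()])   # each column sums to 1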
We wish to determine the long-term behaviour of the surfer. In particular, we wish to find the probability of the surfer being on page \(i\) in the long run. The vector
Thus if the surfer starts from webpage 3, then after one click the surfer reaches webpage 1 with probability \(1/2\) or webpage 5 with probability \(1/2\text{.}\)
which means that a surfer will visit webpage 1 with probability 0.19047, webpage 2 with probability 0.22222, and so on. Since webpage 4 has the highest probability, it gets rank 1. Thus this network will be ranked as \([4, 2, 5, 1, 3]\text{.}\)
One can show that the vector \(\pi\) is independent of the initial vector \(p_0\) for this particular network; \(\pi\) is called the stationary distribution vector. Thus the stationary distribution vector for any network is \(\displaystyle \lim_{n\to
\infty} H^np_0\text{,}\) if it exists.
From Example 5.5.21, it appears that we can find the stationary distribution vector, and hence rank a network, after sufficiently many iterations. However, this is not always true. In case a network has a dangling webpage (a webpage that does not link to any other page), or a trapping loop, we may not be able to rank the webpages. When a surfer comes across a dangling node or a trapping loop, they can type a new URL in the address bar manually. The next two examples demonstrate this.
Note that the column corresponding to the dangling node is a zero column. If the initial probability vector is \(p_0=\begin{pmatrix}0\\0\\1\\0 \\0 \end{pmatrix}\text{,}\) then
Let us consider a network with 5 webpages as shown in Figure 5.5.27. This network has a trapping loop \(2\longleftrightarrow 5\text{.}\) In the long run, for any initial vector \(p_0\text{,}\) we may get \(p_k\approx\begin{pmatrix} 0\amp 1/2\amp 0\amp 0\amp
1/2\end{pmatrix}^T\text{.}\)
If \(\alpha=0.9\text{,}\) it means that 90% of the time the surfer follows the web links and 10% of the time types a URL manually in the address bar. In the original paper, Page and Brin used \(\alpha=0.85\) as the damping factor.
Any regular stochastic matrix \(G\) has a stationary vector, that is, there exists a vector \(\pi\) such that \(G\pi=\pi\text{.}\) Thus one can always find a stationary probability vector for the network.
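As a minimal sketch of the damped iteration, we reuse the hypothetical 4-page matrix \(H\) from earlier with \(\alpha=0.85\text{:}\) the surfer follows a link 85% of the time and jumps to a uniformly random page otherwise.

    # Damped (Google) matrix G = alpha*H + (1 - alpha)/n * (all-ones matrix).
    H = matrix(RDF, [[0,   0, 1, 1/2],
                     [1/2, 0, 0, 0  ],
                     [1/2, 1, 0, 1/2],
                     [0,   0, 0, 0  ]])
    n = H.ncols()
    G = 0.85*H + 0.15/n * ones_matrix(RDF, n, n)
    p = vector(RDF, [1/n]*n)            # start from the uniform distribution
    for k in range(100):
        p = G*p                         # converges to the stationary vector
    print(p)                            # larger entry = higher page rank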