
Section 6.4 Orthogonal Diagonalizations

Recall the concept of diagonalization of a square matrix. We have seen that an \(n\times n\) matrix \(A\) is diagonalizable if there is an eigenbasis of \(\R^n\text{.}\) In this section, we explore when we can find an eigenbasis which is also orthonormal. First of all, we define what is meant by an orthogonal matrix.

Proof.

Assume that \(P^{-1}=P^T\text{.}\) This implies \(P^TP=PP^T=I\text{.}\) Let the columns of \(P\) be \(p_1, p_2,\ldots, p_n\text{.}\) It is easy to see that the \(ij\)-th entry of \(P^TP\) is \(p_i\cdot p_j\text{.}\) Hence we have \(p_i\cdot p_j=\delta_{ij}\text{,}\) which is 1 if \(i=j\) and 0 otherwise. This proves that the columns of \(P\) are orthonormal; in particular, \(\{p_1,\ldots, p_n\}\) is linearly independent. Applying the same argument to \(PP^T=I\) shows that the rows of \(P\) are orthonormal. The converse is easy.

Definition 6.4.2.

A square matrix \(P\) is called an orthogonal matrix if it satisfies any one (and hence all) of the conditions of Theorem 6.4.1.

Example 6.4.3.

  1. The matrix \(\begin{pmatrix}\cos \theta \amp -\sin\theta\\\sin\theta \amp \cos\theta \end{pmatrix}\) is an orthogonal matrix.
  2. \(\left(\begin{array}{rrr} -\frac{1}{3} \, \sqrt{3} \amp \sqrt{\frac{2}{3}} \amp 0 \\ \frac{1}{3} \, \sqrt{3} \amp \frac{1}{2} \, \sqrt{\frac{2}{3}} \amp -\sqrt{\frac{1}{2}} \\ \frac{1}{3} \, \sqrt{3} \amp \frac{1}{2} \, \sqrt{\frac{2}{3}} \amp \sqrt{\frac{1}{2}} \end{array} \right)\) is an orthogonal matrix.
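We can verify orthogonality quickly in Sage. The following is a minimal sketch (our own cell, not part of the original text) that checks \(P^TP=I\) for the second matrix above.

P = matrix([[-sqrt(3)/3,      sqrt(2/3),        0],
            [ sqrt(3)/3,  1/2*sqrt(2/3), -sqrt(1/2)],
            [ sqrt(3)/3,  1/2*sqrt(2/3),  sqrt(1/2)]])
print((P.transpose()*P).simplify_full())   # prints the 3x3 identity matrix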

Definition 6.4.4.

An \(n\times n\) matrix \(A\) is called orthogonally diagonalizable if there exists an orthogonal matrix \(P\) such that \(P^{-1}AP=P^TAP\) is a diagonal matrix.
It is easy to see that if \(P\) and \(Q\) are orthogonal matrices, then \(PQ\) is also orthogonal. (Why?)

Definition 6.4.5.

Two \(n\times n\) matrices \(A\) and \(B\) are called orthogonally similar if there exists an orthogonal matrix \(P\) such that \(B =P^{-1}AP=P^TAP\text{.}\)
Thus an orthogonally diagonalizable matrix is orthogonally similar to a diagonal matrix.
Suppose a matrix \(A\) is orthogonally diagonalizable, that is, \(P^TAP=D\text{,}\) a diagonal matrix. This means \(A=PDP^T\text{.}\) Hence
\begin{equation*} A^T=(PDP^T)^T=PD^TP^T=PDP^T=A. \end{equation*}
Thus if \(A\) is orthogonally diagonalizable then \(A\) must be symmetric.

Proof.

We have
\begin{equation*} (\lambda_1 v_1)\cdot v_2 = (Av_1) \cdot v_2= {(Av_1)}^Tv_2=v_1^T A^Tv_2=v_1^T A v_2=v_1^T(\lambda_2 v_2), \end{equation*}
where we used \(A^T=A\text{.}\)
This implies \((\lambda_1-\lambda_2)(v_1\cdot v_2)=0\text{.}\) Since \(\lambda_1\neq \lambda_2\text{,}\) we have \(v_1\cdot v_2=0\text{.}\)
The following theorem shows that every real symmetric matrix is orthogonally diagonalizable.

Proof.

\((1\implies 2)\)
Let \(v_1,\ldots, v_n\) be orthonormal eigenvectors of \(A\) such that \(Av_i=\lambda_i v_i\text{.}\) Then \(P=\begin{bmatrix} v_1\amp v_2\amp \cdots \amp v_n\end{bmatrix}\) is orthogonal. Hence
\begin{equation*} P^TAP={\rm diag}(\lambda_1,\ldots,\lambda_n)=D. \end{equation*}
Hence \(A\) is orthogonally diagonalizable.
\((2\implies 1)\)
Suppose there exists an orthogonal matrix \(P\) such that \(P^{-1}AP=D\text{.}\) Then \(AP=PD\text{.}\) Let \(D={\rm diag}(\lambda_1,\ldots,\lambda_n)\) and let \(v_1,\ldots, v_n\) be the columns of \(P\text{;}\) then \(\beta=\{v_1,\ldots, v_n\}\) is an orthonormal basis of \(\R^n\text{.}\) Also \(AP=PD\) implies \(Av_i=\lambda_i v_i\text{.}\) Hence \(\beta\) is an orthonormal eigenbasis of \(A\text{.}\)
\((2\implies 3)\)
If \(A\) is orthogonally diagonalizable with \(P^TAP=D\) then
\begin{equation*} A^T={(PDP^T)}^T=PDP^T=A. \end{equation*}
Hence \(A\) is symmetric.
\((3\implies 2)\)
We prove this result using induction on \(n\text{.}\) For \(n=1\text{,}\) let \(A=[\alpha]\text{.}\) Then \(\{1\}\) is an orthonormal basis of \(\R\) consisting of an eigenvector of \(A\text{.}\)
Assume that the result is true for \(n-1\text{.}\) That is if \(A\) is an \((n-1)\times (n-1)\) real symmetric matrix then it is orthogonally diagonalizable.
Let us prove the result for \(n\text{.}\) Let \(A\) be an \(n\times n\) real symmetric matrix. By the fundamental theorem of algebra, every real polynomial has a root in \(\mathbb{C}\text{.}\) Hence the characteristic polynomial of \(A\) has a complex root. By Theorem 5.3.4, all eigenvalues of \(A\) are real. Thus \(A\) has a real eigenvalue, say, \(\lambda\text{.}\) Let \(u\) be a unit eigenvector corresponding to the eigenvalue \(\lambda\) and \(W=\R u\text{.}\) Then \(W\) is a one-dimensional subspace of \(\R^n\text{.}\) Hence \(W^\perp\) is an \((n-1)\)-dimensional subspace of \(\R^n\text{.}\) Also \(W\) is \(A\)-invariant. Hence by Checkpoint 6.3.12, \(W^\perp\) is \(A\)-invariant. Also \(\R^n=W\oplus W^\perp\text{.}\)
Let \(\beta = \{u,v_1,\ldots,v_{n-1}\}\) be an orthonormal basis of \(\R^n\) obtained by extending \(u\) by an orthonormal basis \(\{v_1,\ldots,v_{n-1}\}\) of \(W^\perp\text{.}\) Let \(P=[u~v_1~\cdots~v_{n-1}]\text{,}\) the orthogonal matrix whose columns are the vectors \(u,v_1,\ldots,v_{n-1}\text{.}\) Then the matrix \(M\) of \(A\) with respect to \(\beta\) is \(P^TAP\text{,}\) which is of the form
\begin{equation*} M=\left[ \begin{array}{c|c} \lambda \amp 0 \\ \hline 0 \amp C \end{array}\right]\text{,} \end{equation*}
where \(C\) is an \((n-1)\times (n-1)\) real symmetric matrix. (Why?) Hence by induction, there exists an \((n-1)\times (n-1)\) orthogonal matrix \(Q\) such that \(Q^TCQ=D\text{,}\) a diagonal matrix. Hence
\begin{equation*} P^TAP = M = \left[ \begin{array}{c|c} 1 \amp 0 \\ \hline 0 \amp Q \end{array}\right] \left[ \begin{array}{c|c} \lambda \amp 0 \\ \hline 0 \amp D \end{array}\right] \left[ \begin{array}{c|c} 1 \amp 0 \\ \hline 0 \amp Q^T \end{array}\right]. \end{equation*}
This implies
\begin{equation*} A = P\left[ \begin{array}{c|c} 1 \amp 0 \\ \hline 0 \amp Q \end{array}\right] \left[ \begin{array}{c|c} \lambda \amp 0 \\ \hline 0 \amp D \end{array}\right] \left[ \begin{array}{c|c} 1 \amp 0 \\ \hline 0 \amp Q^T \end{array}\right]P^T. \end{equation*}
Define \(P_1=P\left[ \begin{array}{c|c} 1 \amp 0 \\ \hline 0 \amp Q \end{array}\right]\text{.}\) Then \(P_1\) is an orthogonal matrix and
\begin{equation*} A=P_1 \left[ \begin{array}{c|c} \lambda \amp 0 \\ \hline 0 \amp D \end{array}\right] P_1^T. \end{equation*}
The above theorem is called the spectral theorem for real symmetric matrices.

Example 6.4.9.

Consider a matrix \(A=\left(\begin{array}{rrr} 5 \amp -2 \amp -4 \\ -2 \amp 8 \amp -2 \\ -4 \amp -2 \amp 5 \end{array} \right)\text{.}\) Clearly \(A\) is symmetric and hence it is orthogonally diagonalizable. The characteristic polynomial of \(A\) is
\begin{equation*} \det{(xI-A)}=x^3 - 18x^2 + 81x=x(x-9)^2\text{.} \end{equation*}
Hence \(0, 9, 9\) are the eigenvalues of \(A\text{.}\) It is easy to find that \(v_1=(1, 1/2, 1)\) is an eigenvector corresponding to the eigenvalue 0, and \(v_2=(1, 0, -1), v_3=(0, 1, -1/2)\) are eigenvectors corresponding to the eigenvalue 9. Take \(P:=\left(\begin{array}{rrr} 1 \amp 1 \amp 0 \\ \frac{1}{2} \amp 0 \amp 1 \\ 1 \amp -1 \amp -\frac{1}{2} \end{array} \right)\text{.}\) Then
\begin{equation*} P^{-1}AP=\left(\begin{array}{rrr} 0 \amp 0 \amp 0 \\ 0 \amp 9 \amp 0 \\ 0 \amp 0 \amp 9 \end{array} \right) \end{equation*}
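Note that \(v_2\cdot v_3=1/2\neq 0\text{,}\) so this \(P\) diagonalizes \(A\) but is not orthogonal. To diagonalize \(A\) orthogonally, we apply the Gram-Schmidt process to \(\{v_2,v_3\}\) inside the eigenspace of 9 and normalize all three vectors. The following minimal Sage sketch (our own code, not part of the original text) carries this out.

A = matrix(QQ, [[5, -2, -4], [-2, 8, -2], [-4, -2, 5]])
v1 = vector([1, 1/2, 1]); v2 = vector([1, 0, -1]); v3 = vector([0, 1, -1/2])
w3 = v3 - (v3*v2)/(v2*v2)*v2                  # Gram-Schmidt within the eigenspace of 9
cols = [v/v.norm() for v in (v1, v2, w3)]     # normalize all three vectors
P = matrix(SR, cols).transpose()              # orthogonal matrix with these columns
print((P.transpose()*A*P).simplify_full())    # prints diag(0, 9, 9)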

Problem 6.4.10.

For the following matrices find an orthogonal matrix \(P\) such that \(P^{-1}AP\) is a diagonal matrix.
\begin{equation*} \begin{pmatrix}2 \amp -1 \\-1 \amp 1 \end{pmatrix} , \begin{pmatrix}1 \amp 0 \amp -1\\0 \amp 1 \amp 2\\-1 \amp 2 \amp 5 \end{pmatrix} \end{equation*}


Distance Preserving Maps in \(\R^n\).
Suppose \(f\colon \R^n \to \R^n\) is a map that preserves distance, that is, \(\norm{f(x)-f(y)}=\norm{x-y}\) for all \(x,y\in \R^n\text{.}\) We would like to study such maps. Let us first look at a special case when \(f\) fixes the origin.

Proof.

From (1) and (2) we have
\begin{equation*} \norm{f(x)}=\norm{f(x)-f(0)}=\norm{x-0}=\norm{x}, \end{equation*}
for all \(x\in \R^n\text{.}\) Using this we have
\begin{equation*} \norm{f(x)-f(y)}^2=\norm{x-y}^2. \end{equation*}
Expanding both sides, we get
\begin{equation*} f(x)\cdot f(y) = x\cdot y \end{equation*}
for all \(x,y\text{.}\) That is, \(f\) preserves the dot product. This implies that \(f\) maps an orthonormal basis of \(\R^n\) to an orthonormal basis of \(\R^n\text{.}\) In particular, \(\{f(e_i)\}\) is an orthonormal basis of \(\R^n\text{,}\) where \(\{e_i\}\) is the standard basis. Hence
\begin{equation*} f(x)=f\Bigl(\sum_i x_i e_i\Bigr)=\sum_i \bigl(f(x)\cdot f(e_i)\bigr)\, f(e_i)=\sum_i (x\cdot e_i)\, f(e_i)=\sum_i x_i\, f(e_i). \end{equation*}
This shows that \(f\) is a linear map. (Why?)
Now using the above Lemma 6.4.12, we can identify all distance-preserving maps on \(\R^n\text{;}\) this is the content of the next theorem.

Proof.

Let \(x_0:=f(0)\) and \(g(x)=f(x)-x_0\text{.}\) Then it is easy to check that \(g(0)=0\) and \(\norm{g(x)-g(y)}=\norm{x-y}\) for all \(x,y\text{.}\) Hence by Lemma 6.4.12, \(g\) is linear. By Theorem 6.4.11, \(g(x)=Ax\) for some orthogonal linear transformation \(A\text{.}\) Hence \(f(x)=Ax+x_0\text{.}\)

Definition 6.4.14.

Let \(A\) be a symmetric positive definite matrix. A square root of \(A\) is a matrix \(B\) such that
\begin{equation*} B^2 = A. \end{equation*}
The unique symmetric positive definite square root of \(A\) is denoted by \(A^{1/2}\text{.}\)
How do we find the square root of a symmetric positive definite matrix?

Proof.

Since \(A\) is symmetric, it is orthogonally diagonalizable:
\begin{equation*} A = Q DQ^T, \end{equation*}
with \(D= \mathrm{diag}(\lambda_1,\dots,\lambda_n)\text{,}\) where all \(\lambda_i>0\text{.}\) Define \(D^{1/2} = \mathrm{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\text{.}\) Then
\begin{equation*} (Q D^{1/2} Q^T)^2 = Q D^{1/2} Q^T Q D^{1/2} Q^T = Q DQ^T = A. \end{equation*}
Thus \(A^{1/2} = QD^{1/2}Q^T\) is symmetric and positive definite. Uniqueness follows from the strict positivity of eigenvalues.
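In Sage, we can compute \(A^{1/2}\) directly from this spectral decomposition. Here is a minimal sketch (the matrix is our own illustrative choice):

A = matrix(QQ, [[2, 1], [1, 2]])                    # symmetric positive definite
D, Q = A.eigenmatrix_right()                        # A*Q = Q*D, columns are eigenvectors
Q = matrix(SR, [c/c.norm() for c in Q.columns()]).transpose()  # orthonormalize columns
Dhalf = diagonal_matrix([sqrt(d) for d in D.diagonal()])
B = (Q*Dhalf*Q.transpose()).simplify_full()         # the square root A^(1/2)
print(B)
print((B*B).simplify_full() == A)                   # verify B^2 = A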

Subsection 6.4.1 Applications of Affine Linear Maps

Let us look at some applications of affine linear transformations to fractals.

Example 6.4.16. Koch Curve.

The Koch curve is a classic fractal that can be described using the language of affine linear transformations.
The construction of the Koch curve begins with a single line segment from \((0,0)\) to \((1,0)\text{,}\) called the initiator.
Next we remove the middle third of the segment and replace it with two segments that each have the same length (1/3 of the original) as the remaining pieces on each side. This new form is called the generator, because it specifies the rule that is used to generate each new form. Note that the length of each segment is 1/3. See Figure 6.4.18.
Figure 6.4.17. Initiator
Figure 6.4.18. Generator
Next we repeat the above steps on each of the four segments in the generator. Then we get the curve as in Figure 6.4.19; each segment now has length \(1/9\text{,}\) and the total length of the curve is \(16/9\text{.}\) If we apply the generator once again, we get the curve as in Figure 6.4.20, with total length \(64/27\text{.}\)
Figure 6.4.19. After 2 iterations
Figure 6.4.20. After 3 iterations
If we keep applying this process, we get what is called the Koch curve (named after the mathematician Helge von Koch, who introduced it in 1904). After 7 iterations we get the curve as in Figure 6.4.21.
Figure 6.4.21. Koch Curve with 7 iterations.
Now let us construct the Koch curve as an application of affine linear maps. The construction begins with a single line segment from \((0,0)\) to \((1,0)\text{.}\) At each step, this segment is replaced by four smaller segments:
  1. The first third (straight, scaled by \(1/3\)).
  2. The second piece (scaled by \(1/3\) and rotated by \(+60^\circ\)), rising from \((1/3,0)\).
  3. The third piece (scaled by \(1/3\) and rotated by \(-60^\circ\)), descending from the peak \((1/2,\sqrt{3}/6)\).
  4. The last third (straight, shifted to start at \((2/3,0)\)).
Each of these pieces is obtained from the original segment by applying one of four affine linear maps.
\begin{align*} T_1(z) \amp= \tfrac{1}{3}z, \\ T_2(z) \amp= \tfrac{1}{3} e^{i\pi/3} z + \tfrac{1}{3}, \\ T_3(z) \amp= \tfrac{1}{3} e^{-i\pi/3} z + \tfrac{1}{2} + \tfrac{\sqrt{3}}{6}i, \\ T_4(z) \amp= \tfrac{1}{3}z + \tfrac{2}{3}. \end{align*}
Written in real coordinates, these are of the form \(T_j(x) = A_j x + b_j\) with
  • \(A_1 = \tfrac{1}{3}I, \quad b_1 = (0,0)\text{,}\)
  • \(A_2 = \tfrac{1}{3}R_{60}, \quad b_2 = (1/3,0)\text{,}\)
  • \(A_3 = \tfrac{1}{3}R_{-60}, \quad b_3 = (1/2,\sqrt{3}/6)\text{,}\)
  • \(A_4 = \tfrac{1}{3}I, \quad b_4 = (2/3,0)\text{,}\)
where \(R_{\theta}\) is the \(2\times 2\) rotation matrix for the angle \(\theta\text{.}\)
We demonstrate this in Sage.
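Here is a minimal Sage sketch (our own code; the name koch_step is our choice) that builds the curve by repeatedly replacing each segment by the four pieces above.

w = CC(exp(I*pi/3))                      # rotation by +60 degrees as a complex number

def koch_step(pts):
    # Replace each segment [p, q] by the four Koch sub-segments.
    new = [pts[0]]
    for p, q in zip(pts[:-1], pts[1:]):
        d = (q - p)/3
        a = p + d                        # end of the first third
        b = a + d*w                      # peak of the generator
        c = p + 2*d                      # start of the last third
        new.extend([a, b, c, q])
    return new

pts = [CC(0), CC(1)]
for _ in range(5):                       # five applications of the generator
    pts = koch_step(pts)
line([(z.real(), z.imag()) for z in pts], aspect_ratio=1, axes=False)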

Example 6.4.22. Sierpiński Triangle.

The Sierpiński triangle (named after the Polish mathematician Wacław Sierpiński), also called the Sierpiński gasket, is a self-similar fractal subset \(S\) of the plane. It can be obtained by an iterative geometric construction starting from a filled equilateral triangle and applying an iterated function system (IFS) consisting of affine maps.
The construction begins with a filled equilateral triangle (stage 0). At each stage we subdivide and remove parts according to the following rules:
  • Stage 0: Start with a solid equilateral triangle of side length 1. See Figure 6.4.23.
  • Stage 1: Subdivide the triangle into four smaller equilateral triangles of side length 1/2 and remove the central one. See Figure 6.4.24.
  • Stage \(n+1\text{:}\) For each filled triangle from stage \(n\text{,}\) repeat the same process: divide into four, remove the central one. See Figure 6.4.25 and Figure 6.4.26 for two and three iterations.
Continuing indefinitely, the limit of this process is the Sierpiński triangle. See Figure 6.4.27, which shows 8 iterations.
Figure 6.4.23. Original Triangle
Figure 6.4.24. After One Iteration
Figure 6.4.25. After Two Iterations
Figure 6.4.26. After Three Iterations
Figure 6.4.27. Sierpinski Triangle after 8 iterations.
The Sierpiński triangle arises from three specific affine maps:
\begin{align*} T_1(x) \amp = \tfrac{1}{2} x, \\ T_2(x) \amp= \tfrac{1}{2} x + (1/2, 0), \\ T_3(x) \amp= \tfrac{1}{2} x + (1/4, \tfrac{\sqrt{3}}{4}). \end{align*}
Each of these maps scales the plane by a factor of \(1/2\) and then translates:
  • \(T_1\) shrinks towards the origin.
  • \(T_2\) shrinks and shifts right to cover the bottom-right subtriangle.
  • \(T_3\) shrinks and shifts upward to cover the top subtriangle.
If \(S\) denotes the Sierpiński triangle, then it satisfies the fundamental iterated function system equation:
\begin{equation*} S = T_1(S) \cup T_2(S) \cup T_3(S). \end{equation*}
Now let us see how we can generate the Sierpiński triangle in Sage.
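A standard way to render the attractor of this IFS is the "chaos game": start at any point and repeatedly apply one of \(T_1, T_2, T_3\) chosen at random. The following minimal Sage sketch (our own code; the iteration count is an arbitrary choice) does this.

import random
s = sqrt(3.0)/4                                # height offset for the top map T3
maps = [lambda p: (p[0]/2, p[1]/2),            # T1: shrink toward the origin
        lambda p: (p[0]/2 + 0.5, p[1]/2),      # T2: bottom-right copy
        lambda p: (p[0]/2 + 0.25, p[1]/2 + s)] # T3: top copy
p, pts = (0.0, 0.0), []
for k in range(20000):
    p = random.choice(maps)(p)                 # apply a randomly chosen map
    if k > 20:                                 # discard the first few transient points
        pts.append(p)
points(pts, size=1, aspect_ratio=1, axes=False)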

Example 6.4.28. Sierpinski Carpet.

The Sierpinski carpet is the planar fractal obtained by repeatedly removing the open central square from a subdivided unit square. Equivalently, it is the unique nonempty set \(C\) satisfying an iterated-function system (IFS) of eight contractive affine maps.
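For concreteness, one standard choice of the eight maps (an explicit description consistent with the construction, though not spelled out above) scales the unit square by \(1/3\) and translates to each of the eight non-central cells:
\begin{equation*} T_{ij}(x) = \tfrac{1}{3}\,x + \left(\tfrac{i}{3}, \tfrac{j}{3}\right), \qquad (i,j)\in\{0,1,2\}^2,\ (i,j)\neq (1,1), \end{equation*}
so that \(C = \bigcup_{(i,j)\neq(1,1)} T_{ij}(C)\text{.}\)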

Example 6.4.29. Sierpinski Pyramid.

The Sierpiński pyramid (also called the Sierpiński tetra-pyramid when based on a triangle, or Sierpiński square pyramid when based on a square) is a three-dimensional fractal obtained by repeatedly subdividing a pyramid into smaller self-similar pyramids. It provides a natural extension of the ideas behind the Sierpiński triangle and Sierpiński carpet to three dimensions.
The construction can be described as an application of affine linear maps. Starting from an initial pyramid \(P_0\text{,}\) we apply scaling by a factor of \(\tfrac{1}{2}\) followed by translations to position the smaller pyramids. In the square-based case, four pyramids are placed at the corners of the base, and one is placed on the top near the apex. This gives a total of five affine maps:
\begin{equation*} P = T_1(P) \cup T_2(P) \cup T_3(P) \cup T_4(P) \cup T_5(P), \end{equation*}
where each \(T_j\) is of the form \(T_j(x) = A x + b_j\text{,}\) with \(A\) being the scaling matrix and \(b_j\) the translation vector.
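For a concrete coordinate description (our own normalization, assuming the base is the unit square \([0,1]^2\) in the \(xy\)-plane and the apex is \(a=(1/2,1/2,1)\)), one may take \(A=\tfrac{1}{2}I\) with translations
\begin{equation*} b_1=(0,0,0),\quad b_2=(\tfrac{1}{2},0,0),\quad b_3=(0,\tfrac{1}{2},0),\quad b_4=(\tfrac{1}{2},\tfrac{1}{2},0),\quad b_5=\tfrac{1}{2}a=(\tfrac{1}{4},\tfrac{1}{4},\tfrac{1}{2}), \end{equation*}
so that \(T_1,\ldots,T_4\) place half-size pyramids at the four base corners and \(T_5\) places one at the apex.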

Subsection 6.4.2 Quadratic Forms and Conic Sections

In this subsection, we give an application of orthogonal diagonalizability to conic sections.
A general second-degree equation in two variables is given by
\begin{equation*} Q(x,y) = ax^2 + 2bxy + cy^2 + dx + ey + f = 0, \end{equation*}
where \(a,b,c,d,e,f \in \mathbb{R}\text{.}\)
This equation can be written compactly in matrix notation as
\begin{equation*} Q(x,y) = \begin{bmatrix}x \amp y\end{bmatrix} \begin{bmatrix} a \amp b \\ b \amp c \end{bmatrix} \begin{bmatrix}x \\ y\end{bmatrix}+ \begin{bmatrix} d \amp e \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + f = 0. \end{equation*}
Here,
\begin{equation*} A = \begin{bmatrix} a \amp b \\ b \amp c \end{bmatrix} \end{equation*}
is the symmetric matrix associated with the quadratic part \(ax^2+2bxy+cy^2\text{.}\)
Since \(A\) is symmetric, it is orthogonally diagonalizable. That is, there exists an orthogonal matrix \(P\) such that
\begin{equation*} P^TAP = D = \begin{pmatrix} \lambda_1 \amp 0\\ 0 \amp \lambda_2\end{pmatrix}, \end{equation*}
where \(\lambda_1,\lambda_2\) are the eigenvalues of \(A\) and the columns of \(P\) form an orthonormal eigenbasis of \(\R^2\text{.}\)
With the change of variables
\begin{equation*} \begin{bmatrix} x \\ y \end{bmatrix} = P \begin{bmatrix} u \\ v \end{bmatrix}, \end{equation*}
the quadratic form simplifies to
\begin{equation*} Q(u,v) = \lambda_1 u^2 + \lambda_2 v^2 + \alpha u+\beta v + f = 0. \end{equation*}
Note that here
\begin{equation*} \begin{bmatrix}\alpha\amp\beta\end{bmatrix} =\begin{bmatrix}d\amp e\end{bmatrix}P. \end{equation*}
Thus, the cross term \(2bxy\) is eliminated by the orthogonal linear transformation \(\begin{bmatrix} u \\ v \end{bmatrix}=P^T\begin{bmatrix} x \\ y \end{bmatrix}\text{,}\) and the conic aligns with its principal axes, that is, along the eigenvector directions.
Now we have various cases. If we assume that \(\lambda_1\) and \(\lambda_2\) are positive, then we can complete the square to get
\begin{equation*} Q(u,v) = \lambda_1\left(u+\frac{\alpha}{2\lambda_1}\right)^2+ \lambda_2\left(v+\frac{\beta}{2\lambda_2}\right)^2-g, \end{equation*}
for some real number \(g\text{.}\) What is \(g\text{?}\) It is \(\left(\frac{\alpha^2}{4\lambda_1} + \frac{\beta^2}{4\lambda_2} - f\right)\text{.}\)
The new origin (the center of the conic) in \(uv\)-coordinates is
\begin{equation*} \begin{bmatrix} u_0\\v_0\end{bmatrix}= \begin{bmatrix}-\frac{\alpha}{2\lambda_1}\\-\frac{\beta}{2\lambda_2}\end{bmatrix}. \end{equation*}
Hence the new origin in terms of \(xy\)-coordinates is
\begin{equation*} \begin{bmatrix} x_0\\y_0\end{bmatrix}= P\begin{bmatrix}-\frac{\alpha}{2\lambda_1}\\-\frac{\beta}{2\lambda_2}\end{bmatrix}. \end{equation*}
Thus we have converted the original quadratic \(Q(x,y)\) to
\begin{equation*} Q(\tilde{x},\tilde{y})=\lambda_1\tilde{x}^2+\lambda_2\tilde{y}^2-g = 0, \quad\text{that is,}\quad \frac{\tilde{x}^2}{g/\lambda_1}+\frac{\tilde{y}^2}{g/\lambda_2}=1. \end{equation*}
This is an ellipse (assuming \(g>0\)). Here, we have
\begin{equation*} \begin{pmatrix} \tilde{x}\\ \tilde{y} \end{pmatrix} = \begin{pmatrix} u+\frac{\alpha}{2\lambda_1}\\v+\frac{\beta}{2\lambda_2}\end{pmatrix} =\begin{pmatrix} u\\v\end{pmatrix}+\begin{pmatrix} \frac{\alpha}{2\lambda_1}\\\frac{\beta}{2\lambda_2}\end{pmatrix}= P^{-1}\begin{pmatrix}x\\y \end{pmatrix}+\begin{pmatrix} \frac{\alpha}{2\lambda_1}\\\frac{\beta}{2\lambda_2}\end{pmatrix}. \end{equation*}
The transformation \(P^{-1}\begin{pmatrix}x\\y \end{pmatrix}+\begin{pmatrix} \frac{\alpha}{2\lambda_1}\\\frac{\beta}{2\lambda_2}\end{pmatrix}\) is called an affine linear transformation. Here \(P^{-1}\) is an orthogonal linear map. Thus an affine linear transformation on \(\R^n\) is a map of the form \(T(x)=Px+v\text{,}\) where \(P\) is an orthogonal transformation and \(v\) is called a translation vector. Such maps are also called isometries.
In case \(\lambda_1\) and \(\lambda_2\) are both negative, we can multiply the whole equation by \(-1\) and we get a similar expression, except that the right-hand side changes sign.
In case the eigenvalues have opposite signs, say \(\lambda_1>0\) and \(\lambda_2<0\text{,}\) the conic transforms to
\begin{equation*} Q(\tilde{x},\tilde{y})=\lambda_1\tilde{x}^2-|\lambda_2|\tilde{y}^2-g, \end{equation*}
which is a hyperbola.
In case one of the eigenvalues is zero, say \(\lambda_2=0\text{,}\) the conic transforms to
\begin{equation*} Q(\tilde{x},\tilde{y})=\lambda_1\tilde{x}^2+\beta \tilde{y} -g, \end{equation*}
which is a parabola (provided \(\beta\neq 0\)). Here \(\tilde{y}=v\) and \(g=\alpha^2/(4\lambda_1)-f\text{.}\)
Classification of Conics in two variables
Based on the above discussions, the classification of the above conic section depends on the eigenvalues of \(A\text{.}\)
  • Ellipse: If both eigenvalues \(\lambda_1, \lambda_2\) have the same sign, then the quadratic is an ellipse of the form \(x^2/a^2+y^2/b^2=1\text{.}\)
  • Circle: When \(\lambda_1 = \lambda_2\text{,}\) the quadratic is a circle.
  • Hyperbola: If eigenvalues have opposite signs, then the quadratic is a hyperbola of the form
    \begin{equation*} x^2/a^2 -y^2/b^2= 1. \end{equation*}
  • Parabola: If one eigenvalue is zero, then it is a parabola.

Example 6.4.30.

Consider the quadratic \(Q(x,y)=7 \, x^{2} - 6 \, x y + 7 \, y^{2} - 6 \, x + 8 \, y - 56\text{.}\) Let us convert this quadratic into a conic section in canonical form.
Solution.
The symmetric matrix \(A\) associated with this quadratic is given by
\begin{equation*} A = \begin{pmatrix} 7 \amp -3 \\-3 \amp 7\end{pmatrix}. \end{equation*}
It is easy to check that the eigenvalues of \(A\) are \(\lambda_1=10\) and \(\lambda_2=4\) with the corresponding eigenvectors \(v_1 = \left(\frac{1}{2} \, \sqrt{2},\,-\frac{1}{2} \, \sqrt{2}\right)\) and \(v_2=\left(\frac{1}{2} \, \sqrt{2},\,\frac{1}{2} \, \sqrt{2}\right)\text{.}\) Hence we have
\begin{equation*} D= \left(\begin{array}{rr} 10 \amp 0 \\ 0 \amp 4 \end{array}\right), P = \left(\begin{array}{rr} \frac{1}{\sqrt{2}} \amp \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \amp \frac{1}{\sqrt{2}} \end{array}\right). \end{equation*}
The \(xy\)-coordinates in terms of the new \(uv\)-coordinates are
\begin{equation*} \begin{bmatrix} x \\ y \end{bmatrix} = P \begin{bmatrix} u \\ v \end{bmatrix}=\left(\begin{array}{r} \frac{1}{\sqrt{2}} u + \frac{1}{\sqrt{2}} v \\ -\frac{1}{\sqrt{2}} u + \frac{1}{\sqrt{2}} v \end{array}\right). \end{equation*}
Now substituting \(x=\frac{1}{\sqrt{2}} u + \frac{1}{\sqrt{2}} v\) and \(y=-\frac{1}{\sqrt{2}} u + \frac{1}{\sqrt{2}} v \) in the given quadratic, we get
\begin{equation*} Q(u,v)=10 \, u^{2} + 4 \, v^{2} - 7 \, \sqrt{2} u + \sqrt{2} v - 56. \end{equation*}
After completing the squares, we get
\begin{equation*} Q(u,v)=10 \left(u- \frac{7\sqrt{2}}{20}\right)^2+ 4 \left(v+ \frac{\sqrt{2}}{8}\right)^2 - 2343/40. \end{equation*}
This can be written as the equation of an ellipse. Note that here the translation vector is given by
\begin{equation*} \begin{pmatrix}x_0\\y_0\end{pmatrix}=P\begin{pmatrix} \frac{7\sqrt{2}}{20}\\\frac{-\sqrt{2}}{8}\end{pmatrix}= \begin{pmatrix}\frac{9}{40}\\ -\frac{19}{40}\end{pmatrix}. \end{equation*}
Let us explore this in Sage. Here we plot the original quadratic curve along with the transformed coordinates.
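The following is a minimal Sage sketch (our own code, not the book's original cell) that diagonalizes the quadratic part and plots the original curve.

x, y, u, v = var('x y u v')
Q = 7*x^2 - 6*x*y + 7*y^2 - 6*x + 8*y - 56
A = matrix(QQ, [[7, -3], [-3, 7]])                    # matrix of the quadratic part
D, P = A.eigenmatrix_right()                          # A*P = P*D
P = matrix(SR, [c/c.norm() for c in P.columns()]).transpose()  # orthonormal columns
xy = P * vector([u, v])                               # (x, y) = P (u, v)
Qu = Q.subs({x: xy[0], y: xy[1]}).expand()
print(Qu)                                             # the uv cross term vanishes
implicit_plot(Q == 0, (x, -5, 6), (y, -6, 5), aspect_ratio=1)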

Example 6.4.31.

Consider the quadratic equation \(3x^{2}+4xy + 2 y^{2} - 8 x + 6 y-3=0\text{.}\) We wish to classify this as a conic section.
Let us first plot the graph of this curve in Sage.
The symmetric matrix associated with the quadratic term is given by
\begin{equation*} A = \begin{pmatrix}3 \amp 2 \\ 2 \amp 2 \end{pmatrix}\text{.} \end{equation*}
It is easy to check that the eigenvalues are \(\lambda_1=(5-\sqrt{17})/2\approx 0.4384\) and \(\lambda_2=(5+\sqrt{17})/2\approx 4.5616\text{.}\) Since both eigenvalues are positive, this quadratic is an ellipse. This is what the graph shows.
Now we give all the steps in Sage to plot the curve along with the new coordinate system.
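The sign pattern of the eigenvalues can also be read off from \(\det A=\lambda_1\lambda_2\text{,}\) without computing the eigenvalues themselves. Here is a minimal Sage sketch (the helper classify_conic is our own name, and degenerate cases are ignored):

def classify_conic(a, b, c):
    # Classify a*x^2 + 2*b*x*y + c*y^2 + (linear terms) via [[a, b], [b, c]].
    A = matrix(QQ, [[a, b], [b, c]])
    d = A.det()                        # equals the product of the eigenvalues
    if d > 0:
        return "circle" if (a == c and b == 0) else "ellipse"
    elif d < 0:
        return "hyperbola"
    return "parabola"

print(classify_conic(3, 2, 2))         # both eigenvalues positive: 'ellipse'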

Example 6.4.32.

Consider the quadratic equation given by \(-x^2+4xy-y^2-30x+y+20=0\text{.}\) Use Sage to classify this and plot the curve along with the transformed coordinate system.

Example 6.4.33.

Consider the quadratic equation \(Q(x,y) = 3 \, x^{2} - 6 \, x y + 3 \, y^{2} - 6 \, x + 8 \, y + 5\) and classify it as a conic section.
Solution.
The matrix associated with the quadratic part of the above equation is \(A = \left(\begin{array}{rr} 3 \amp -3 \\ -3 \amp 3 \end{array}\right)\text{.}\) It is easy to check that the eigenvalues of \(A\) are \(\lambda_1=6, \lambda_2=0\text{.}\) Since one of the eigenvalues is 0, this curve is a parabola. Let us draw this curve along with the transformed origin and the two new coordinate directions in Sage.
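A minimal Sage sketch (our own construction, not the book's original cell) that plots the curve together with the eigenvector directions of \(A\):

x, y = var('x y')
Q = 3*x^2 - 6*x*y + 3*y^2 - 6*x + 8*y + 5
A = matrix(QQ, [[3, -3], [-3, 3]])
g = implicit_plot(Q == 0, (x, -4, 8), (y, -4, 8))
for e, vecs, m in A.eigenvectors_right():    # (eigenvalue, eigenvectors, multiplicity)
    w = vecs[0] / vecs[0].norm()             # unit vector along a principal axis
    g += arrow((0, 0), tuple(w), color='red')
g.show(aspect_ratio=1)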

Activity 6.4.1.

For a given quadratic equation \(Q(x,y)=ax^2+2bxy+cy^2+dx+ey+f=0\text{,}\) write down the corresponding canonical conics by describing the new origin \((x_0,y_0)\) and the new coordinate vectors, considering the different cases in a tabular form.

Subsection 6.4.3 Classification of Quadratic Surfaces in Three Variables

The classification of quadratic equations in three variables can be done in a manner very similar to the two-variable case of Subsection 6.4.2.
A general quadratic equation in three variables is
\begin{equation*} Q(x,y,z) = ax^2 + by^2 + cz^2 + 2dxy + 2eyz + 2fzx + gx + hy + iz + j=0 \end{equation*}
where \(a,b,c,d,e,f,g,h,i,j \in \mathbb{R}\text{.}\)
In matrix form,
\begin{equation*} Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x} + \mathbf{b}^T \mathbf{x} + j, \quad \mathbf{x} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \quad A = \begin{bmatrix} a \amp d \amp f \\ d \amp b \amp e \\ f \amp e \amp c \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} g \\ h \\ i \end{bmatrix}. \end{equation*}
Since \(A\) is symmetric, there exists an orthogonal matrix \(P\) such that
\begin{equation*} P^T A P = D= \operatorname{diag}(\lambda_1,\lambda_2,\lambda_3). \end{equation*}
After an orthogonal change of variables \(\mathbf{x} = P\mathbf{u}\) and a translation to eliminate the linear terms, the quadratic form reduces to the canonical form
\begin{equation*} Q(u,v,w) = \lambda_1 u^2 + \lambda_2 v^2 + \lambda_3 w^2 + j' = 0, \end{equation*}
where \(j'\) is the resulting constant term. (If some \(\lambda_i=0\text{,}\) a linear term in the corresponding variable may survive, as in the paraboloid and parabolic cylinder cases below.)
Classification of Quadrics
Depending on the signs of \(\lambda_1, \lambda_2, \lambda_3\text{,}\) we obtain the following surfaces; a short Sage sketch for reading off the sign pattern follows the list.
  1. Ellipsoid: All eigenvalues positive.
    \begin{equation*} \frac{u^2}{a^2} + \frac{v^2}{b^2} + \frac{w^2}{c^2} = 1, a,b,c > 0\text{.} \end{equation*}
  2. Hyperboloid of One Sheet: Two positive eigenvalues, one negative.
    \begin{equation*} \frac{u^2}{a^2} + \frac{v^2}{b^2} - \frac{w^2}{c^2} = 1. \end{equation*}
  3. Hyperboloid of Two Sheets: One positive eigenvalue, two negative.
    \begin{equation*} -\frac{u^2}{a^2} - \frac{v^2}{b^2} + \frac{w^2}{c^2} = 1. \end{equation*}
  4. Elliptic Cone: Two positive and one negative eigenvalue, with no constant term.
    \begin{equation*} \frac{u^2}{a^2} + \frac{v^2}{b^2} - \frac{w^2}{c^2} = 0. \end{equation*}
  5. Elliptic Paraboloid: (Bowl-shaped surface) Two positive eigenvalues, one zero.
    \begin{equation*} \frac{u^2}{a^2} + \frac{v^2}{b^2} = \frac{w}{c}. \end{equation*}
  6. Hyperbolic Paraboloid: (Saddle Surface) One positive eigenvalue, one negative, one zero.
    \begin{equation*} \frac{u^2}{a^2} - \frac{v^2}{b^2} = \frac{w}{c}. \end{equation*}
  7. Elliptic Cylinder: Two positive eigenvalues, third zero.
    \begin{equation*} \frac{u^2}{a^2} + \frac{v^2}{b^2} = 1. \end{equation*}
  8. Hyperbolic Cylinder: One positive, one negative, third zero.
    \begin{equation*} \frac{u^2}{a^2} - \frac{v^2}{b^2} = 1. \end{equation*}
  9. Parabolic Cylinder: Only one nonzero eigenvalue.
    \begin{equation*} \frac{u^2}{a^2} = \frac{v}{b}. \end{equation*}
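To read off the sign pattern in practice, the following minimal Sage sketch (the helper quadric_signature is our own, not from the text) counts positive, negative, and zero eigenvalues numerically.

def quadric_signature(A, tol=1e-9):
    # Return (#positive, #negative, #zero) eigenvalues of a symmetric matrix A.
    evs = [e.real() for e in matrix(CDF, A).eigenvalues()]  # symmetric => real spectrum
    pos = sum(1 for e in evs if e > tol)
    neg = sum(1 for e in evs if e < -tol)
    return (pos, neg, len(evs) - pos - neg)

A = matrix([[1, 0, -1], [0, 1, 2], [-1, 2, 5]])   # eigenvalues 0, 1, 6
print(quadric_signature(A))                        # (2, 0, 1): two positive, one zero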