Steven R. Dunbar
Department of Mathematics
203 Avery Hall
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466

Topics in
Probability Theory and Stochastic Processes
Steven R. Dunbar

__________________________________________________________________________

Eigenvalues, Eigenvectors and Normal Forms of Matrices

_______________________________________________________________________

### Rating

Mathematically Mature: may contain mathematics beyond calculus with proofs.

_______________________________________________________________________

### Question of the Day

What is the definition of an eigenvalue and an eigenvector of a matrix? What is the geometric meaning and interpretation of an eigenvalue and eigenvector for a matrix?

_______________________________________________________________________

### Key Concepts

1. The eigenvalues of a real symmetric matrix are real numbers.
2. Eigenvectors of a symmetric matrix corresponding to distinct eigenvalues are orthogonal.
3. Let $A$ be a matrix of order $n$ with elements from $ℂ$. Then there exists a unitary matrix $U$ such that
$T={U}^{\ast }AU$

is upper triangular. The diagonal elements of $T$ are the eigenvalues of $A$.

4. Let $A$ be a Hermitian matrix of order $n$, so ${A}^{\ast }=A$. There is a unitary matrix $U$ for which
${U}^{\ast }AU=D=diag\left[{\lambda }_{1},\dots ,{\lambda }_{n}\right]$

is a diagonal matrix whose diagonal elements are the eigenvalues ${\lambda }_{1},\dots ,{\lambda }_{n}$. If $A$ is real, then $U$ can be taken as orthogonal.

__________________________________________________________________________

### Vocabulary

1. A matrix $P$ is stochastic if the column sums ${\sum }_{i=1}^{n}{P}_{ij}=1$ for $j=1,\dots ,n$. This is identical to saying that
${1}^{T}P={1}^{T}$

2. Let $A$ be a matrix of order $n$ with elements from $ℂ$. Then there exists a unitary matrix $U$ such that
$T={U}^{\ast }AU$

is upper triangular. The diagonal elements of $T$ are the eigenvalues of $A$. The upper triangular matrix $T$, which is unitarily similar to $A$, is called the Schur Normal Form of the matrix $A$.

__________________________________________________________________________

### Mathematical Ideas

This section provides proofs of some standard facts from linear algebra about eigenvalues, eigenvectors and normal forms of matrices. These facts are needed in the section on the fastest mixing Markov chain.

#### Some basic facts about eigenvalues of symmetric matrices.

Lemma 1. The eigenvalues of a real symmetric matrix are real numbers.

Proof. Let $P$ be a symmetric matrix, so that ${P}^{T}=P$. Let $\lambda$ be an eigenvalue of $P$ with corresponding eigenvector $x$. Denote the (complex) inner product on ${ℂ}^{n}$ as $\left(x,y\right)={\sum }_{i=1}^{n}\stackrel{̄}{{x}_{i}}{y}_{i}$, where $\stackrel{̄}{{x}_{i}}$ is the complex conjugate of ${x}_{i}$. Note that we use the complex inner product because $\lambda$ and $x$ are potentially complex. Then

$\begin{array}{rl}\lambda \left(x,x\right)&=\left(x,\lambda x\right)\\ &=\left(x,Px\right)\\ &=\left({P}^{T}x,x\right)\\ &=\left(Px,x\right)\\ &=\left(\lambda x,x\right)\\ &=\overline{\lambda }\left(x,x\right).\end{array}$

Hence, $\stackrel{̄}{\lambda }=\lambda$ and $\lambda$ must be real. □
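As a numerical illustration (not part of the proof), Lemma 1 can be checked with NumPy; the matrix size and random seed below are arbitrary choices.

```python
import numpy as np

# Sketch: a random real symmetric matrix has only real eigenvalues.
rng = np.random.default_rng(0)           # arbitrary seed for reproducibility
A = rng.standard_normal((5, 5))
P = (A + A.T) / 2                        # symmetrize so that P == P.T

eigvals = np.linalg.eigvals(P)           # general eigenvalue routine
# Lemma 1 predicts the imaginary parts vanish (up to roundoff).
assert np.allclose(eigvals.imag, 0.0)
```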

Lemma 2. Let ${x}_{k}$ and ${x}_{l}$ be eigenvectors corresponding to distinct eigenvalues ${\lambda }_{k}\ne {\lambda }_{l}$ of the symmetric matrix $P$. Then

$\left({x}_{k},{x}_{l}\right)=0$

That is, eigenvectors of a symmetric matrix corresponding to distinct eigenvalues are orthogonal.

Proof.

$\begin{array}{rl}{\lambda }_{l}\left({x}_{k},{x}_{l}\right)&=\left({x}_{k},{\lambda }_{l}{x}_{l}\right)\\ &=\left({x}_{k},P{x}_{l}\right)\\ &=\left({P}^{T}{x}_{k},{x}_{l}\right)\\ &=\left(P{x}_{k},{x}_{l}\right)\\ &=\left({\lambda }_{k}{x}_{k},{x}_{l}\right)\\ &=\overline{{\lambda }_{k}}\left({x}_{k},{x}_{l}\right)\\ &={\lambda }_{k}\left({x}_{k},{x}_{l}\right),\end{array}$

where the last equality holds because ${\lambda }_{k}$ is real by Lemma 1. Subtracting, $\left({\lambda }_{l}-{\lambda }_{k}\right)\left({x}_{k},{x}_{l}\right)=0$, and since ${\lambda }_{l}\ne {\lambda }_{k}$,

$\left({x}_{k},{x}_{l}\right)=0.$ □
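Again as a numerical sketch (assuming NumPy; the test matrix is arbitrary): `np.linalg.eigh` returns the eigenvectors of a symmetric matrix as orthonormal columns, consistent with Lemma 2.

```python
import numpy as np

# Sketch: eigenvectors of a random symmetric matrix are mutually orthogonal.
rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
P = (B + B.T) / 2                        # symmetric test matrix

eigvals, Q = np.linalg.eigh(P)           # columns of Q are eigenvectors
# Pairwise inner products vanish off the diagonal: Q^T Q = I.
assert np.allclose(Q.T @ Q, np.eye(4))
```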

Remark.

1. These are standard proofs in numerical analysis, linear algebra, operator theory and many other places in mathematics. They are included here for completeness.
2. The same proofs show that the eigenvalues of a Hermitian matrix ($P={P}^{\ast }$) are real.

Remark. Actually, more is true and the next two theorems prove that a symmetric matrix has a complete set of orthogonal eigenvectors even if some eigenvalues are repeated. We will need this crucial fact in order to prove an essential inequality about the rate of mixing of Markov chains.

Theorem 3 (Schur Normal Form). Let $A$ be a matrix of order $n$ with elements from $ℂ$. Then there exists a unitary matrix $U$ such that

$T={U}^{\ast }AU$

is upper triangular. The diagonal elements of $T$ are the eigenvalues of $A$.

Proof. The proof is by induction on the order $n$ of $A$. The result is trivially true for $n=1$ using $U=\left(1\right)$. We assume the result is true for all matrices of order $n\le k-1$, and we will then prove it true for all matrices of order $n=k$.

Let ${\lambda }_{1}$ be an eigenvalue of $A$ and let ${u}^{\left(1\right)}$ be an associated eigenvector with $||{u}^{\left(1\right)}|{|}_{2}=1$. Beginning with ${u}^{\left(1\right)}$, pick an orthonormal basis for ${ℂ}^{k}$. (Note that this can be done by filling out a basis with the standard unit vectors and then using the Gram-Schmidt orthogonalization procedure.) Call the basis so obtained $\left\{{u}^{\left(1\right)},{u}^{\left(2\right)},\dots ,{u}^{\left(k\right)}\right\}$. Define the $k×k$ column matrix

${P}_{1}=\left[{u}^{\left(1\right)},{u}^{\left(2\right)},\dots ,{u}^{\left(k\right)}\right]$

Note that ${P}_{1}^{\ast }{P}_{1}=I$, so that ${P}_{1}^{-1}={P}_{1}^{\ast }$. Define

${B}_{1}={P}_{1}^{\ast }A{P}_{1}$

The claim is that

${B}_{1}=\left(\begin{array}{cccc}\hfill {\lambda }_{1}\hfill & \hfill {\alpha }_{2}\hfill & \hfill \dots \phantom{\rule{0em}{0ex}}\hfill & \hfill {\alpha }_{k}\hfill \\ \hfill 0\hfill & \hfill \hfill & \hfill \hfill & \hfill \hfill \\ \hfill ⋮\hfill & \hfill \hfill & \hfill {A}_{2}\hfill & \hfill \hfill \\ \hfill 0\hfill & \hfill \hfill & \hfill \hfill & \hfill \hfill \end{array}\right)$

with ${A}_{2}$ of order $k-1$ and ${\alpha }_{2},{\alpha }_{3},\dots ,{\alpha }_{k}$ some numbers. To prove this, simply multiply:

$\begin{array}{rl}A{P}_{1}&=A\left[{u}^{\left(1\right)},{u}^{\left(2\right)},\dots ,{u}^{\left(k\right)}\right]\\ &=\left[A{u}^{\left(1\right)},A{u}^{\left(2\right)},\dots ,A{u}^{\left(k\right)}\right]\\ &=\left[{\lambda }_{1}{u}^{\left(1\right)},{v}^{\left(2\right)},\dots ,{v}^{\left(k\right)}\right]\end{array}$

where ${v}^{\left(j\right)}=A{u}^{\left(j\right)}$, so

$\begin{array}{rl}{B}_{1}&={P}_{1}^{\ast }A{P}_{1}\\ &=\left[{\lambda }_{1}{P}_{1}^{\ast }{u}^{\left(1\right)},{P}_{1}^{\ast }{v}^{\left(2\right)},\dots ,{P}_{1}^{\ast }{v}^{\left(k\right)}\right]\\ &=\left[{\lambda }_{1}{e}^{\left(1\right)},{w}^{\left(2\right)},\dots ,{w}^{\left(k\right)}\right].\end{array}$

The last equality follows from the constructed orthonormality of the columns of ${P}_{1}$, which gives ${P}_{1}^{\ast }{u}^{\left(1\right)}={e}^{\left(1\right)}={\left(1,0,\dots ,0\right)}^{T}$, with ${w}^{\left(j\right)}={P}_{1}^{\ast }{v}^{\left(j\right)}$. Note that ${B}_{1}$ has the desired form, and the claim is established.

By the induction hypothesis, there exists a unitary matrix ${\stackrel{̂}{P}}_{2}$ of order $k-1$ for which

$\stackrel{̂}{T}={\stackrel{̂}{P}}_{2}^{\ast }{A}_{2}{\stackrel{̂}{P}}_{2}$

is upper triangular of order $k-1$. Define

${P}_{2}=\left(\begin{array}{cccc}\hfill 1\hfill & \hfill 0\hfill & \hfill \dots \phantom{\rule{0em}{0ex}}\hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill \hfill & \hfill \hfill & \hfill \hfill \\ \hfill ⋮\hfill & \hfill \hfill & \hfill {\stackrel{̂}{P}}_{2}\hfill & \hfill \hfill \\ \hfill 0\hfill & \hfill \hfill & \hfill \hfill & \hfill \hfill \end{array}\right)$

Then ${P}_{2}$ is unitary and

$\begin{array}{llll}\hfill {P}_{2}^{\ast }{B}_{1}{P}_{2}& =\left(\begin{array}{cccc}\hfill {\lambda }_{1}\hfill & \hfill {\gamma }_{2}\hfill & \hfill \dots \phantom{\rule{0em}{0ex}}\hfill & \hfill {\gamma }_{k}\hfill \\ \hfill 0\hfill & \hfill \hfill & \hfill \hfill & \hfill \hfill \\ \hfill ⋮\hfill & \hfill \hfill & \hfill {\stackrel{̂}{P}}_{2}^{\ast }{A}_{2}{\stackrel{̂}{P}}_{2}\hfill & \hfill \hfill \\ \hfill 0\hfill & \hfill \hfill & \hfill \hfill & \hfill \hfill \end{array}\right)\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\\ \hfill & =\left(\begin{array}{cccc}\hfill {\lambda }_{1}\hfill & \hfill {\gamma }_{2}\hfill & \hfill \dots \phantom{\rule{0em}{0ex}}\hfill & \hfill {\gamma }_{k}\hfill \\ \hfill 0\hfill & \hfill \hfill & \hfill \hfill & \hfill \hfill \\ \hfill ⋮\hfill & \hfill \hfill & \hfill \stackrel{̂}{T}\hfill & \hfill \hfill \\ \hfill 0\hfill & \hfill \hfill & \hfill \hfill & \hfill \hfill \end{array}\right)\phantom{\rule{2em}{0ex}}& \hfill & \phantom{\rule{2em}{0ex}}\end{array}$

is an upper triangular matrix. Thus

$T={P}_{2}^{\ast }{B}_{1}{P}_{2}={P}_{2}^{\ast }{P}_{1}^{\ast }A{P}_{1}{P}_{2}={\left({P}_{1}{P}_{2}\right)}^{\ast }A\left({P}_{1}{P}_{2}\right)$

Define $U={P}_{1}{P}_{2}$, which is easily seen to be unitary. This completes the induction and the proof. □
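In practice the Schur Normal Form is computed by library routines such as `scipy.linalg.schur`. The NumPy-only sketch below mirrors the theorem for a matrix that happens to be diagonalizable (true of a generic random matrix): if $A=V\Lambda V^{-1}$ and $V=UR$ is a QR factorization with $U$ unitary, then $U^{\ast}AU=R\Lambda R^{-1}$ is upper triangular with the eigenvalues of $A$ on its diagonal.

```python
import numpy as np

# Sketch of the Schur Normal Form for a (generically) diagonalizable A.
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)).astype(complex)

lam, V = np.linalg.eig(A)                # A = V diag(lam) V^{-1}
U, R = np.linalg.qr(V)                   # U unitary, R upper triangular

T = U.conj().T @ A @ U                   # T = R diag(lam) R^{-1}
assert np.allclose(np.tril(T, -1), 0.0, atol=1e-8)   # T is upper triangular
assert np.allclose(np.diag(T), lam)                  # eigenvalues on diagonal
```

Note this construction covers only the diagonalizable case; the inductive proof above handles every matrix.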

Theorem 4 (Principal Axes Theorem). Let $A$ be a Hermitian matrix of order $n$, so ${A}^{\ast }=A$. Then $A$ has $n$ real eigenvalues ${\lambda }_{1},\dots ,{\lambda }_{n}$, not necessarily distinct, and $n$ corresponding eigenvectors $\left\{{u}^{\left(1\right)},{u}^{\left(2\right)},\dots ,{u}^{\left(n\right)}\right\}$ that form an orthonormal basis for ${ℂ}^{n}$. If $A$ is real, the eigenvectors can be taken as real and they form an orthonormal basis for ${ℝ}^{n}$. Finally, there is a unitary matrix $U$ for which

${U}^{\ast }AU=D=diag\left[{\lambda }_{1},\dots ,{\lambda }_{n}\right]$

is a diagonal matrix with diagonal elements ${\lambda }_{1},\dots ,{\lambda }_{n}$. If $A$ is real, then $U$ can be taken as orthogonal.

Proof. From the Schur Normal Form theorem, there is a unitary matrix $U$ with

${U}^{\ast }AU=T$

with $T$ upper triangular. Take the conjugate transpose of both sides to obtain

${T}^{\ast }={\left({U}^{\ast }AU\right)}^{\ast }={U}^{\ast }{A}^{\ast }{\left({U}^{\ast }\right)}^{\ast }={U}^{\ast }AU=T.$

Since $T$ is upper triangular, then ${T}^{\ast }$ is lower triangular, and since the two are equal, $T$ must be a diagonal matrix

$T=diag\left[{\lambda }_{1},\dots ,{\lambda }_{n}\right].$

Also, ${T}^{\ast }=T$ means each diagonal element of $T$ equals its own complex conjugate, and thus all diagonal elements of $T$ must be real.

Write $U$ as

$U=\left[{u}^{\left(1\right)},{u}^{\left(2\right)},\dots ,{u}^{\left(n\right)}\right].$

Then $T={U}^{\ast }AU$ implies $AU=UT$,

$\begin{array}{rl}A\left[{u}^{\left(1\right)},{u}^{\left(2\right)},\dots ,{u}^{\left(n\right)}\right]&=\left[{u}^{\left(1\right)},{u}^{\left(2\right)},\dots ,{u}^{\left(n\right)}\right]\left(\begin{array}{ccc}{\lambda }_{1}& & 0\\ & \ddots & \\ 0& & {\lambda }_{n}\end{array}\right)\\ \left[A{u}^{\left(1\right)},A{u}^{\left(2\right)},\dots ,A{u}^{\left(n\right)}\right]&=\left[{\lambda }_{1}{u}^{\left(1\right)},{\lambda }_{2}{u}^{\left(2\right)},\dots ,{\lambda }_{n}{u}^{\left(n\right)}\right].\end{array}$

Hence,

$A{u}^{\left(j\right)}={\lambda }_{j}{u}^{\left(j\right)}\phantom{\rule{2em}{0ex}}j=1,\dots ,n$

Since the columns of $U$ are orthonormal, and since the dimension of ${ℂ}^{n}$ is $n$, these columns must form an orthonormal basis for ${ℂ}^{n}$. □
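As a sketch of the theorem in the real symmetric case (assuming NumPy; the matrix is arbitrary): `np.linalg.eigh` produces exactly such an orthogonal $U$ diagonalizing $A$.

```python
import numpy as np

# Sketch: orthogonal diagonalization of a random real symmetric matrix.
rng = np.random.default_rng(3)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2

lam, U = np.linalg.eigh(A)                      # real eigenvalues, orthogonal U
assert np.allclose(U.T @ U, np.eye(5))          # U is orthogonal
assert np.allclose(U.T @ A @ U, np.diag(lam))   # U^T A U = diag(lambda)
```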

Lemma 5. If $P$ is a stochastic matrix, then the vector $1$ of all ones gives a left eigenvector ${1}^{T}$ of $P$ corresponding to eigenvalue $1$.

Proof. To say the matrix is stochastic is to say that the column sums ${\sum }_{i=1}^{n}{P}_{ij}=1$ for each column $j=1,\dots ,n$. But this is identical to saying that

${1}^{T}P={1}^{T}$

which is to say that ${1}^{T}$ is a left eigenvector of $P$ corresponding to eigenvalue $1$. □

Remark. Again, this is a completely standard and well-known fact from Markov chain analysis.
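A quick numerical illustration (normalizing a random nonnegative matrix is just one convenient way to manufacture a column-stochastic matrix):

```python
import numpy as np

# Build a column-stochastic matrix by normalizing a random nonnegative one.
rng = np.random.default_rng(4)
M = rng.random((4, 4))
P = M / M.sum(axis=0)                    # now every column sums to 1

ones = np.ones(4)
assert np.allclose(ones @ P, ones)       # 1^T P = 1^T: a left eigenvector
# Since P and P^T have the same eigenvalues, 1 is an eigenvalue of P.
assert np.min(np.abs(np.linalg.eigvals(P) - 1.0)) < 1e-8
```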

Lemma 6. The eigenvalues of a symmetric, stochastic matrix may be arranged in nonincreasing order:

$1={\lambda }_{1}\left(P\right)\ge {\lambda }_{2}\left(P\right)\ge \cdots \ge {\lambda }_{n}\left(P\right)\ge -1$

Proof. Since the eigenvalues are real, and one eigenvalue has value $1$, the proof must show that the remaining eigenvalues lie between $-1$ and $1$. This follows from the Gershgorin disk theorem; because $P$ is symmetric, its row sums and column sums agree, so the disks may be written with column sums. The Gershgorin disk around the $j$-th diagonal element of $P$ is

$|z-{p}_{jj}|\le \sum _{i=1,i\ne j}^{n}|{p}_{ij}|=1-{p}_{jj}$

The eigenvalues must lie in the union of these disks in the complex plane. On the real axis, the $j$-th disk gives $2{p}_{jj}-1\le z\le 1$, so since $0\le {p}_{jj}\le 1$ and the eigenvalues are real, $-1\le {\lambda }_{k}\le 1$. □
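A concrete check of the lemma (the lazy random walk on a cycle is just one convenient example of a symmetric stochastic matrix):

```python
import numpy as np

# Lazy random walk on a cycle of n vertices: stay with probability 1/2,
# move to either neighbor with probability 1/4 each.  The transition
# matrix is symmetric and stochastic.
n = 6
P = np.zeros((n, n))
for i in range(n):
    P[i, i] = 0.5
    P[i, (i + 1) % n] += 0.25
    P[i, (i - 1) % n] += 0.25

assert np.allclose(P, P.T)                    # symmetric
assert np.allclose(P.sum(axis=0), 1.0)        # stochastic: column sums are 1

lam = np.sort(np.linalg.eigvalsh(P))[::-1]    # real, in nonincreasing order
assert np.isclose(lam[0], 1.0)                # lambda_1 = 1
assert lam[-1] >= -1.0 - 1e-12                # lambda_n >= -1
```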

Remark. This is a special case of well-known facts from Perron-Frobenius theory about the eigenvalues and eigenspaces of non-negative and strictly positive matrices.

#### Sources

This section is completely standard material that can be found in most books on linear algebra or allied subjects such as numerical analysis, advanced calculus and engineering mathematics, or functional analysis.

_______________________________________________________________________

### Problems to Work for Understanding

