Math 314

Topics since the third exam

The final exam is on Wednesday, May 5, from 10:00am to noon. It will cover the material from the entire course, with a slight emphasis on the material from this sheet.

Chapter 4: Eigenvalues

§ 3:
Gram-Schmidt orthogonalization
We've seen how a basis consisting of vectors orthogonal to one another can prove useful; this section is about how to build such a basis.

The starting point is our old formula for the projection of one vector onto another:

$v - \frac{\langle w,v\rangle}{\langle w,w\rangle}\,w$ is perpendicular to $w$.

Gram-Schmidt orthogonalization consists of repeatedly using this formula to replace a collection of vectors with ones that are orthogonal to one another, without changing their span. Starting with a collection $\{v_1,\ldots,v_n\}$ of vectors in $V$,

let $w_1 = v_1$, then let $w_2 = v_2 - \frac{\langle w_1,v_2\rangle}{\langle w_1,w_1\rangle}\,w_1$.

Then $w_1$ and $w_2$ are orthogonal, and since $w_2$ is a linear combination of $w_1 = v_1$ and $v_2$, while the above equation can also be rewritten to give $v_2$ as a linear combination of $w_1$ and $w_2$, the span is unchanged. Continuing,

let $w_3 = v_3 - \frac{\langle w_1,v_3\rangle}{\langle w_1,w_1\rangle}\,w_1 - \frac{\langle w_2,v_3\rangle}{\langle w_2,w_2\rangle}\,w_2$; then since $w_1$ and $w_2$ are orthogonal, it is not hard to check that $w_3$ is orthogonal to both of them, and using the same argument, the span is unchanged (in this case, $\mathrm{span}\{w_1,w_2,w_3\} = \mathrm{span}\{w_1,w_2,v_3\} = \mathrm{span}\{v_1,v_2,v_3\}$).

Continuing this, we let $w_k = v_k - \frac{\langle w_1,v_k\rangle}{\langle w_1,w_1\rangle}\,w_1 - \cdots - \frac{\langle w_{k-1},v_k\rangle}{\langle w_{k-1},w_{k-1}\rangle}\,w_{k-1}$.

Doing this all the way to $n$ will replace $v_1,\ldots,v_n$ with orthogonal vectors $w_1,\ldots,w_n$, without changing the span.
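
The recipe above can be sketched in a few lines of Python. This is only an illustration, assuming the standard dot product on R^n, exact rational arithmetic, and linearly independent inputs; the names `inner` and `gram_schmidt` are made up for this sketch.

```python
from fractions import Fraction

def inner(u, v):
    # Standard dot product on R^n (assumed here; any inner product would do).
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    # Returns orthogonal w_1, ..., w_n with the same span as the inputs.
    # Assumes the inputs are linearly independent, so no w_k is zero.
    ws = []
    for v in vectors:
        v = [Fraction(x) for x in v]
        w = list(v)
        for u in ws:
            # Subtract (<u, v> / <u, u>) u, the projection of v onto u.
            c = inner(u, v) / inner(u, u)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        ws.append(w)
    return ws

vs = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
ws = gram_schmidt(vs)
for w in ws:
    print([str(x) for x in w])
# ['1', '1', '0']
# ['1/2', '-1/2', '1']
# ['-2/3', '2/3', '2/3']
```

Each pair of output vectors has inner product 0, which is easy to confirm by hand.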

One thing worth noting is that if two vectors are orthogonal, then any scalar multiples of them are, too. This means that if the coordinates of one of our $w_k$ are not to our satisfaction (having an ugly denominator, perhaps), we can scale it to change the coordinates to something more pleasant. It is interesting to note that in so doing, the later vectors $w_k$ are unchanged: replacing $w_j$ by $cw_j$ (with $c \neq 0$) puts one factor of $c$ in the top inner product, two in the bottom one, and one on the vector itself, and these cancel in later calculations.
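
Here is a quick check of the scaling remark, as a sketch with made-up helper names (`coeff`, `third_vector`) and the standard dot product assumed. It computes the third vector twice, once with w2 = (1/2, -1/2, 1) and once with the rescaled 2*w2 = (1, -1, 2).

```python
from fractions import Fraction

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def coeff(u, v):
    # The scalar <u, v> / <u, u> from the projection formula.
    return Fraction(inner(u, v)) / Fraction(inner(u, u))

def third_vector(w1, w2, v3):
    # w3 = v3 - coeff(w1, v3) w1 - coeff(w2, v3) w2
    c1, c2 = coeff(w1, v3), coeff(w2, v3)
    return [x - c1 * a - c2 * b for x, a, b in zip(v3, w1, w2)]

w1 = [1, 1, 0]
w2 = [Fraction(1, 2), Fraction(-1, 2), 1]
w2_scaled = [1, -1, 2]   # 2 * w2, with the denominators cleared
v3 = [0, 1, 1]

w3 = third_vector(w1, w2, v3)
w3_alt = third_vector(w1, w2_scaled, v3)
print(w3 == w3_alt)   # True
```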

We've seen that if $w_1,\ldots,w_n$ is an orthogonal basis for a subspace $W$ of $V$, and $w \in W$, then $w = \frac{\langle w_1,w\rangle}{\langle w_1,w_1\rangle}\,w_1 + \cdots + \frac{\langle w_n,w\rangle}{\langle w_n,w_n\rangle}\,w_n$.

On the other hand, if $v \in V$, we can define the orthogonal projection

$\mathrm{proj}_W(v) = \frac{\langle w_1,v\rangle}{\langle w_1,w_1\rangle}\,w_1 + \cdots + \frac{\langle w_n,v\rangle}{\langle w_n,w_n\rangle}\,w_n$

of $v$ onto $W$. This vector is in $W$, and by the Gram-Schmidt argument, $v - \mathrm{proj}_W(v)$ is orthogonal to all of the $w_i$, so it is orthogonal to every linear combination, i.e., it is orthogonal to every vector in $W$. As a result:

$\|v - \mathrm{proj}_W(v)\| \le \|v - w\|$ for every vector $w$ in $W$. (**)
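
To see the projection and the inequality (**) in numbers, here is a small sketch. It assumes the standard dot product and an orthogonal (not necessarily orthonormal) basis of W; the names `proj` and `dist_sq` are illustrative only. It compares squared distances, which avoids square roots.

```python
from fractions import Fraction

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def proj(basis, v):
    # Orthogonal projection of v onto span(basis); basis must be orthogonal.
    p = [Fraction(0)] * len(v)
    for w in basis:
        c = Fraction(inner(w, v)) / Fraction(inner(w, w))
        p = [pi + c * wi for pi, wi in zip(p, w)]
    return p

def dist_sq(u, v):
    # Squared distance ||u - v||^2.
    d = [a - b for a, b in zip(u, v)]
    return inner(d, d)

basis = [[1, 1, 0], [1, -1, 2]]   # orthogonal: 1 - 1 + 0 = 0
v = [1, 2, 3]
p = proj(basis, v)                # (7/3, 2/3, 5/3)
r = [vi - pi for vi, pi in zip(v, p)]
best = dist_sq(v, p)              # 16/3

# r is orthogonal to both basis vectors of W.
print(inner(r, basis[0]), inner(r, basis[1]))   # 0 0

# No combination a*w1 + b*w2 on this small grid gets closer to v than p does.
closer = [(a, b) for a in range(-2, 3) for b in range(-2, 3)
          if dist_sq(v, [a * x + b * y for x, y in zip(*basis)]) < best]
print(closer)   # []
```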

In the case that the $w_i$ are not just orthogonal but also orthonormal, we can simplify this somewhat:

$\mathrm{proj}_W(v) = \langle w_1,v\rangle w_1 + \cdots + \langle w_n,v\rangle w_n = (w_1w_1^T + \cdots + w_nw_n^T)v = Pv$,

where $P = w_1w_1^T + \cdots + w_nw_n^T$ is the projection matrix giving us orthogonal projection.

This projection matrix has three useful properties: (1) since it has the property (**), the matrix you get will be the same no matter what orthonormal basis you use to build it; (2) it is symmetric ($P^T = P$); and (3) it is idempotent, meaning $P^2 = P$ (this is because the orthogonal projection of a vector already in $W$, such as $Pv$, is the same vector).
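
The three properties can be checked on a small example. The sketch below is one illustrative setup, not the only one: it picks an orthonormal pair in R^3 with rational entries (so the arithmetic is exact), builds a second orthonormal basis of the same plane by rotating the first pair, and compares the resulting matrices. All function names are made up for the example.

```python
from fractions import Fraction as F

def outer(w):
    # The n x n matrix w w^T.
    return [[a * b for b in w] for a in w]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def projection_matrix(basis):
    # P = w_1 w_1^T + ... + w_n w_n^T; the basis is assumed orthonormal.
    P = outer(basis[0])
    for w in basis[1:]:
        P = mat_add(P, outer(w))
    return P

# An orthonormal basis of a plane W in R^3.
w1 = [F(1, 3), F(2, 3), F(2, 3)]
w2 = [F(2, 3), F(1, 3), F(-2, 3)]
# A second orthonormal basis of the same plane (w1, w2 rotated within W).
u1 = [F(3, 5) * a + F(4, 5) * b for a, b in zip(w1, w2)]
u2 = [F(-4, 5) * a + F(3, 5) * b for a, b in zip(w1, w2)]

P = projection_matrix([w1, w2])
P_alt = projection_matrix([u1, u2])

print(P == P_alt)          # True: property (1)
print(transpose(P) == P)   # True: property (2)
print(mat_mul(P, P) == P)  # True: property (3)
```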

If we think of the vectors $w_i$ as the columns of a matrix $A$, then $W = \mathcal{C}(A)$, and so the result (**) is talking about the least squares solution to the equation $Ax = v$! The closest vector $Ax$ to $v$ is then $Pv$, which, looking at what we did before, means that $P = A(A^TA)^{-1}A^T$. This, however, makes sense even if the columns of $A$ are not orthogonal, as long as they are linearly independent (so that $A^TA$ is invertible); if we picked orthonormal ones, and computed $P$, we would still get the least squares solution, which this formula also gives!
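
As a sanity check on the formula for P when the columns of A are independent but not orthogonal, here is a sketch with a 3x2 matrix, chosen so that the matrix to be inverted is only 2x2. The helper names are illustrative, and `inverse_2x2` handles only the 2x2 case.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse_2x2(M):
    # Inverse of a 2x2 matrix; assumes the determinant is nonzero.
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Columns (1,1,0) and (1,0,1): independent, but not orthogonal.
A = [[F(1), F(1)],
     [F(1), F(0)],
     [F(0), F(1)]]
At = transpose(A)
G_inv = inverse_2x2(mat_mul(At, A))      # (A^T A)^{-1}
P = mat_mul(mat_mul(A, G_inv), At)       # A (A^T A)^{-1} A^T

v = [[F(1)], [F(2)], [F(3)]]             # a column vector
x = mat_mul(mat_mul(G_inv, At), v)       # least squares solution of Ax = v

print(mat_mul(A, x) == mat_mul(P, v))    # True: Ax is the projection Pv
print(mat_mul(P, P) == P)                # True: P is idempotent
```

Here x works out to (2/3, 5/3) and Pv to (7/3, 2/3, 5/3).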

§ 4:
Orthogonal matrices
We've seen that having a basis consisting of orthonormal vectors can simplify some of our previous calculations. Now we'll see where some of them come from.

An $n \times n$ matrix $Q$ is called orthogonal if its columns form an orthonormal basis for $\mathbb{R}^n$. This means $\langle (i\text{th column of } Q), (j\text{th column of } Q)\rangle = 1$ if $i = j$, and $0$ otherwise. This in turn means that $Q^TQ = I$, which in turn means $Q^T = Q^{-1}$! So an orthogonal matrix is one whose inverse is equal to its own transpose.

A basic fact about an orthogonal matrix $Q$: for any $v,w \in \mathbb{R}^n$, $\langle Qv,Qw\rangle = \langle v,w\rangle$.
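
Both facts about Q are easy to test on an example. The sketch below uses one particular 2x2 orthogonal matrix with rational entries (built from the 3-4-5 triangle) and two arbitrary test vectors; the names are illustrative.

```python
from fractions import Fraction as F

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(M, v):
    return [inner(row, v) for row in M]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

Q = [[F(3, 5), F(-4, 5)],
     [F(4, 5), F(3, 5)]]

# Q^T Q = I, so the transpose of Q is its inverse.
print(mat_mul(transpose(Q), Q) == [[1, 0], [0, 1]])   # True

v = [1, 2]
w = [3, -1]
# Multiplying both vectors by Q leaves the inner product unchanged.
print(inner(v, w), inner(mat_vec(Q, v), mat_vec(Q, w)))   # 1 1
```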

A basic fact about a symmetric matrix $A$: if $v_1$ and $v_2$ are eigenvectors for $A$ with different eigenvalues $\lambda_1,\lambda_2$, then $v_1$ and $v_2$ are orthogonal.

This is a main ingredient needed to show: If $A$ is a symmetric $n \times n$ matrix, then $A$ is always diagonalizable; in fact there is an orthonormal basis for $\mathbb{R}^n$ consisting of eigenvectors of $A$. This means that the matrix $P$, with $AP = PD$, whose columns are a basis of eigenvectors for $A$, can (when $A$ is symmetric) be chosen to be an orthogonal matrix.
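
For a concrete instance, take the symmetric matrix with rows (2, 1) and (1, 2), whose eigenvectors (1, 1) and (1, -1) have eigenvalues 3 and 1. The sketch below (an illustration with made-up helper names, using floating point because normalizing introduces a square root of 2) checks that AP = PD and that P is orthogonal, up to rounding.

```python
import math

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def close(A, B, tol=1e-12):
    # Entrywise comparison of two matrices, allowing for rounding error.
    return all(math.isclose(a, b, abs_tol=tol)
               for ra, rb in zip(A, B) for a, b in zip(ra, rb))

A = [[2.0, 1.0],
     [1.0, 2.0]]
s = 1 / math.sqrt(2)
# Columns of P: (1, 1)/sqrt(2) and (1, -1)/sqrt(2), the normalized eigenvectors.
P = [[s, s],
     [s, -s]]
D = [[3.0, 0.0],
     [0.0, 1.0]]

print(close(mat_mul(A, P), mat_mul(P, D)))                       # True
print(close(mat_mul(transpose(P), P), [[1.0, 0.0], [0.0, 1.0]]))  # True
```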

Wow, short section.

§ 5:
Orthogonal complements
This notion of orthogonal vectors can even be used to reinterpret some of our dearly-held results about systems of linear equations, where all of this stuff began.

Starting with $Ax = 0$, this can be interpreted as saying that $\langle (\text{every row of } A), x\rangle = 0$, i.e., $x$ is orthogonal to every row of $A$. This in turn implies that $x$ is orthogonal to every linear combination of rows of $A$, i.e., $x$ is orthogonal to every vector in the row space of $A$.

This leads us to introduce a new concept: the orthogonal complement of a subspace $W$ in a vector space $V$, denoted $W^\perp$, is the collection of vectors $v$ with $v \perp w$ for every vector $w \in W$. It is not hard to see that these vectors form a subspace of $V$; the sum of two vectors orthogonal to $w$, for example, is orthogonal to $w$, so the sum of two vectors in $W^\perp$ is also in $W^\perp$. The same is true for scalar multiples.

Some basic facts:

For every subspace $W$, $W \cap W^\perp = \{0\}$ (since anything in both is orthogonal to itself, and only the 0-vector has that property).

Any vector $v \in V$ can be written, uniquely, as $v = w + w^\perp$, for $w \in W$ and $w^\perp \in W^\perp$; $w$ in fact is $\mathrm{proj}_W(v)$. The difference $v - \mathrm{proj}_W(v)$ will be in $W^\perp$, more or less by definition of $\mathrm{proj}_W(v)$. The uniqueness comes from the result above about intersections.

Even further, a basis for $W$ and a basis for $W^\perp$ together form a basis for $V$; this implies that $\dim(W) + \dim(W^\perp) = \dim(V)$.

Finally, $(W^\perp)^\perp = W$; this is because $W$ is contained in $(W^\perp)^\perp$ (a vector in $W$ is orthogonal to every vector that is orthogonal to things in $W$), and the dimensions of the two spaces are the same.

The importance that this has to systems of equations stems from the following facts:

$\mathcal{N}(A) = \mathcal{R}(A)^\perp$ (this is what we noted, actually, at the beginning of this section!)

$\mathcal{R}(A) = \mathcal{N}(A)^\perp$

$\mathcal{C}(A) = \mathcal{N}(A^T)^\perp$

So, for example, to compute a basis for $W^\perp$, start with a basis for $W$, writing its vectors as the columns of a matrix $A$, so $W = \mathcal{C}(A)$; then $W^\perp = \mathcal{C}(A)^\perp = \mathcal{R}(A^T)^\perp = \mathcal{N}(A^T)$, which we know how to compute a basis for!
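
The procedure just described can be sketched as follows: put the basis vectors of W into the rows of a matrix (this is the transpose of A), row reduce, and read off a basis for the null space from the free variables. The function name `null_space_basis` is made up for this sketch, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def null_space_basis(M):
    # Basis for {x : Mx = 0}, found by reducing M to reduced echelon form.
    rows = [[Fraction(x) for x in row] for row in M]
    ncols = len(rows[0])
    pivots = []
    r = 0
    for c in range(ncols):
        # Look for a row at or below r with a nonzero entry in column c.
        pr = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pr is None:
            continue
        rows[r], rows[pr] = rows[pr], rows[r]
        piv = rows[r][c]
        rows[r] = [x / piv for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    # One basis vector per free column: set that variable to 1, the other
    # free variables to 0, and solve for the pivot variables.
    free = [c for c in range(ncols) if c not in pivots]
    basis = []
    for fc in free:
        x = [Fraction(0)] * ncols
        x[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            x[pc] = -rows[row_idx][fc]
        basis.append(x)
    return basis

# W = span{(1,1,0), (1,0,1)}; these vectors are the rows of A^T.
A_T = [[1, 1, 0],
       [1, 0, 1]]
W_perp = null_space_basis(A_T)
print([[str(x) for x in vec] for vec in W_perp])   # [['-1', '1', '1']]
```

The single vector (-1, 1, 1) is orthogonal to both basis vectors of W, and the dimensions add up as expected: 2 + 1 = 3.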
