This section was written by LT Grant and is used with permission. Pieces of this section are adapted from: Elementary Linear Algebra, Richard Hill, 1996.
A rectangular array of real numbers with m rows and n columns is an m × n matrix A. We denote the matrix and its entries by A = [a_{ij}], where a_{ij} is the entry in row i and column j, for 1 ≤ i ≤ m and 1 ≤ j ≤ n.
We will mention a few standard forms for an n × n square matrix A, although most of these terms have parallels in non-square matrices. We say A is diagonal if all of its non-diagonal entries are zero, that is, a_{ij} = 0 for i ≠ j. This however does not mean each diagonal entry must be nonzero. We say A is an upper-triangular matrix if it has all zero entries below the diagonal, that is, a_{ij} = 0 for i > j. Similarly, we say A is lower-triangular if all entries above the diagonal are zero, that is, a_{ij} = 0 for i < j. The following three matrices are examples of each type respectively:
The transpose of a matrix A is denoted A^{T}. Thus if we consider A as the m × n matrix defined above, then A^{T} is the n × m matrix with entries a_{ij}^{T} = a_{ji}. One can think of A^{T} as the matrix A “flipped” about the diagonal entries.
In order to solve a linear system Ax = b, one (computationally inefficient and inaccurate, but theoretically simple) approach is to take the inverse of the matrix A. However, this is only possible if A is nonsingular, or equivalently, det(A) ≠ 0. To calculate the inverse, one can either perform the operation by hand, or use a computer program such as MATLAB or Maple.
Computing the inverse of even small matrices by hand is tedious, and the computer speeds this process up considerably. However, one should be aware that for large matrices computing inverses can be a difficult, inaccurate, and time-consuming task. Thus we almost always “transform” our matrix into a “nicer” form in order to save computational steps. In doing this we often obtain additional results and facts about our problem that would not have been made clear by just using the computer. What makes a matrix “nicer” depends on the context, and that determines the nature of the transformation used to obtain the “nicer” form.
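As a small sketch of why one avoids the explicit inverse, the following (with an illustrative matrix not taken from the text) compares inverting A against a factor-and-solve routine; both give the same answer here, but the solve route is the standard practice:

```python
import numpy as np

# Hypothetical 3x3 system Ax = b (illustrative values only).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

# Inefficient/less accurate in general: explicitly form the inverse.
x_inv = np.linalg.inv(A) @ b

# Preferred: solve the system directly (LU factorization under the hood).
x_solve = np.linalg.solve(A, b)

print(np.allclose(x_inv, x_solve))  # the two solutions agree here
```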
In order to transform matrices into “nicer” forms we perform row (and column) operations. In simple terms, this consists of multiplying a row of the matrix by a constant, adding one row to another, or swapping rows. The same operations are available for columns.
More formally, one can represent any such row operation as left multiplication by a nonsingular matrix. For example, given a matrix A, left multiplication by a suitably chosen matrix B (forming the product BA) can double the first row, subtract the second row from the third, and swap the fourth and fifth rows.
We can also represent column operations by right multiplication. That is, AB would double the first column, and so on.
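The idea above can be sketched concretely. Here B is a hypothetical elementary-operation matrix (the original example matrix did not survive extraction, so this one is chosen to match the operations described): left multiplication doubles row 1, replaces row 3 with (row 3 − row 2), and swaps rows 4 and 5, while right multiplication acts on columns instead:

```python
import numpy as np

# B encodes three row operations for 5-row matrices:
# double row 1, replace row 3 by (row 3 - row 2), swap rows 4 and 5.
B = np.array([[2,  0, 0, 0, 0],
              [0,  1, 0, 0, 0],
              [0, -1, 1, 0, 0],
              [0,  0, 0, 0, 1],
              [0,  0, 0, 1, 0]])

A = np.arange(1.0, 26.0).reshape(5, 5)  # an arbitrary 5x5 matrix

BA = B @ A   # left multiplication performs the row operations
AB = A @ B   # right multiplication performs analogous column operations

print(np.allclose(BA[0], 2 * A[0]))        # first row doubled
print(np.allclose(AB[:, 0], 2 * A[:, 0]))  # first column doubled
```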
We say two vectors u and v are linearly dependent if there are two constants c_{1}, c_{2}, not both zero, such that c_{1}u + c_{2}v = 0. If no such pair of constants exists, then we say u and v are linearly independent. This idea extends to sets of n vectors. For example, if {v_{1},...,v_{n}} is a set of linearly independent vectors and
c_{1}v_{1} + c_{2}v_{2} + ... + c_{n}v_{n} = 0,
then we know that c_{1} = c_{2} = ... = c_{n} = 0.
An equivalent interpretation is that a set of vectors is linearly dependent if one vector in the set can be written as a linear combination of the other vectors.
One might wonder how large a set of linearly independent vectors can be, and there is an upper limit. Consider any set of three distinct 2 × 1 vectors, say u_{1}, u_{2}, u_{3}. No matter which three vectors you choose, you will always be able to write one as a linear combination of the other two. In general, when working in R^{n}, any set of n + 1 vectors must be linearly dependent. This however does not guarantee that sets of size n or smaller will necessarily be linearly independent; for example, if v_{2} = -v_{1}, then {v_{1}, v_{2}} is linearly dependent, since v_{1} + v_{2} = 0.
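One practical way to test independence, sketched here with illustrative vectors: stack the vectors as the columns of a matrix; the set is linearly independent exactly when the rank of that matrix equals the number of vectors.

```python
import numpy as np

v1 = np.array([1.0, 2.0])
v2 = np.array([-1.0, -2.0])   # v2 = -v1, so {v1, v2} is dependent
u1 = np.array([1.0, 0.0])
u2 = np.array([0.0, 1.0])

# rank < number of columns  ->  linearly dependent
print(np.linalg.matrix_rank(np.column_stack([v1, v2])))      # 1: dependent
print(np.linalg.matrix_rank(np.column_stack([u1, u2])))      # 2: independent

# Any three vectors in R^2 are dependent: the rank is at most 2.
print(np.linalg.matrix_rank(np.column_stack([u1, u2, v1])))  # 2 < 3
```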
Let A be an n × n matrix. Then we say that v, a nonzero vector in R^{n}, is an eigenvector with eigenvalue λ if:
Av = λv.
We call (λ, v) an eigenpair for the matrix A. We note that while we require the vector v to be nonzero, the eigenvalue λ may be zero. For an n × n matrix we will have n, not necessarily distinct, eigenvalues. To find the eigenvalues of a matrix we solve the equation:
det(A - λI) = 0.
Example 1 Find all the eigenvalues of the following matrix:
We begin by using the formula above:
While we can find all eigenvalues in one set of operations, we must treat each eigenvalue individually when we look for our eigenvectors. So with a fixed eigenvalue λ, we try to find a nonzero vector v such that:
(A - λI)v = 0.
Note that for a given eigenvalue λ, the set of all eigenvectors (together with the zero vector) is the null space of the matrix A - λI, and so is a subspace of R^{n} called the eigenspace. We often need a basis of the eigenspace.
Example 2 Find all the eigenvectors of the following matrix:
From above we know the eigenvalues of A. Treating each eigenvalue in turn, we first find the eigenvectors for the first eigenvalue:
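Since the worked matrices in the examples above were lost from the text, here is a small illustrative computation in the same spirit: for a hypothetical symmetric matrix we find the eigenvalues (the roots of det(A − λI) = 0) and verify that each returned column of eigenvectors satisfies Av = λv.

```python
import numpy as np

# Illustrative 2x2 matrix, not the example matrix from the original text.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eig returns the eigenvalues and unit eigenvectors (as columns of V).
lams, V = np.linalg.eig(A)

for lam, v in zip(lams, V.T):
    print(lam, np.allclose(A @ v, lam * v))  # A v = lambda v for each pair
```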
A few additional facts relate the eigenvalues to the trace of an n × n matrix A. The trace is the sum of the diagonal elements of the matrix, and it equals the sum of the n eigenvalues (counted with multiplicity).
Special forms of matrices can make finding eigenvalues trivial. In particular, the diagonal entries of diagonal matrices are the eigenvalues of the matrix. This fact also holds for upper and lower triangular matrices.
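A quick check of this fact, using an illustrative upper-triangular matrix: the computed eigenvalues coincide with the diagonal entries (and their sum with the trace).

```python
import numpy as np

# Illustrative upper-triangular matrix: eigenvalues sit on the diagonal.
U = np.array([[3.0, 5.0, -1.0],
              [0.0, 1.0,  4.0],
              [0.0, 0.0,  2.0]])

print(np.sort(np.linalg.eigvals(U)))  # [1. 2. 3.]
print(np.sort(np.diag(U)))            # [1. 2. 3.]
print(np.trace(U))                    # 6.0, the sum of the eigenvalues
```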
In order to compare the relative size of vectors, we introduce the notion of a norm. In the most general sense one can view the norm as describing the size of a vector or, as we will see later, a matrix. We denote the norm of the vector v by ||v||. More specifically, we say something is a norm if it satisfies the following three requirements:
1. ||v|| ≥ 0 for all v, with ||v|| = 0 if and only if v = 0;
2. ||cv|| = |c| ||v|| for any scalar c;
3. ||u + v|| ≤ ||u|| + ||v|| (the triangle inequality).
We note that one has multiple choices of norm; while at first this appears to be overkill, each norm has characteristics which make it more useful than the others in certain situations. We will simply list the three most common vector norms, with their common names, and show how to calculate each for a generic vector v = (v_{1},...,v_{n}):
||v||_{1} = |v_{1}| + ... + |v_{n}| (the 1-norm),
||v||_{2} = (v_{1}^{2} + ... + v_{n}^{2})^{1/2} (the 2-norm, or Euclidean norm),
||v||_{∞} = max_{i} |v_{i}| (the infinity-norm, or max norm).
The 2-norm can also be defined using inner product, or dot product, notation: ||v||_{2} = (v · v)^{1/2}.
One can apply any vector norm to create a norm for matrices operating on those vectors through the following definition:
||A|| = max_{v ≠ 0} ||Av|| / ||v||.
The following is an equivalent, and often more useful, definition of the matrix norm:
||A|| = max_{||v|| = 1} ||Av||.
For the norms mentioned above, the matrix norms can be calculated without appealing directly to these definitions: ||A||_{1} is the maximum absolute column sum, ||A||_{∞} is the maximum absolute row sum, and ||A||_{2} is the largest singular value of A.
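A short sketch of these computations, with illustrative values chosen so each norm can be checked by hand:

```python
import numpy as np

v = np.array([3.0, -4.0])

print(np.linalg.norm(v, 1))       # 1-norm: |3| + |-4| = 7
print(np.linalg.norm(v))          # 2-norm: sqrt(9 + 16) = 5
print(np.linalg.norm(v, np.inf))  # infinity-norm: max(|3|, |4|) = 4

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

print(np.linalg.norm(A, 1))       # max absolute column sum: 2 + 4 = 6
print(np.linalg.norm(A, np.inf))  # max absolute row sum: 3 + 4 = 7
print(np.linalg.norm(A, 2))       # largest singular value of A
```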
Orthogonal - We say that a collection of vectors in R^{n} is orthogonal if any two vectors in the set are perpendicular. That is, for any v_{i}, v_{j} with i ≠ j, we have v_{i} · v_{j} = 0. If these vectors are all nonzero, then they are linearly independent. If in addition ||v_{i}|| = 1 for every vector in our collection, then we say the set is orthonormal.
These definitions allow us to introduce a useful class of matrices. The n × n matrix A is said to be orthogonal if its columns are orthonormal. This group of matrices has many special properties; one such property is that the transpose of the matrix is its inverse, that is, AA^{T} = A^{T}A = I.
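A standard illustration of this property, using a 2 × 2 rotation matrix (chosen here as an example; any matrix with orthonormal columns works):

```python
import numpy as np

# A rotation matrix is orthogonal: its columns are orthonormal.
theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(Q @ Q.T, np.eye(2)))     # True: Q Q^T = I
print(np.allclose(np.linalg.inv(Q), Q.T))  # True: the transpose is the inverse
```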
Hermitian/Symmetric - While we have limited most of our discussion to matrices with real entries, we mention briefly a class of matrices with complex entries. We define the conjugate transpose of a matrix A, denoted A^{*}, by transposing A and conjugating each entry; more explicitly, a_{ij}^{*} is the complex conjugate of a_{ji}.
We call a matrix Hermitian if it is equal to its conjugate transpose, that is, A = A^{*}. In the case of real matrices, each entry is its own conjugate, thus a real Hermitian matrix is simply equal to its transpose, A^{T} = A, and we call this group symmetric matrices.
Among the many properties these matrices have, one of the nicest is that all the eigenvalues of a Hermitian or symmetric matrix are real numbers.
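A small verification of this fact on an illustrative 2 × 2 Hermitian matrix (trace 5, determinant 4, so the eigenvalues are 1 and 4, both real):

```python
import numpy as np

# Illustrative Hermitian matrix: equal to its conjugate transpose.
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])

print(np.allclose(A, A.conj().T))  # True: A is Hermitian

# eigvalsh is specialized for Hermitian matrices and returns real eigenvalues.
print(np.linalg.eigvalsh(A))       # [1. 4.]
```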
Stochastic - If A has only nonnegative entries, and each row of A sums to 1, then we call A a stochastic matrix. Similarly, if all the rows and all the columns sum to 1, then we call it a doubly stochastic matrix. These matrices can be used to represent transition probabilities, and will play a major role in our study of Markov chains.
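A brief sketch with a hypothetical two-state transition matrix (the states and probabilities are invented for illustration): row i holds the probabilities of moving from state i to each state, and multiplying a distribution by powers of P advances it in time.

```python
import numpy as np

# Hypothetical transition matrix: each row sums to 1 (stochastic).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

print(P.sum(axis=1))  # [1. 1.]

# Distribution after k steps: row vector times P^k.
pi0 = np.array([1.0, 0.0])  # start in state 0
pi2 = pi0 @ np.linalg.matrix_power(P, 2)
print(pi2)                  # [0.86 0.14]
```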
(Problem created by LT Grant, 2006.)