Basic Linear Algebra

Let $V$ be a set of elements on which vector addition and scalar multiplication are defined. $V$ is a vector space if the following axioms are satisfied:

Axioms of vector addition:

  1. If $\mathbf{x}, \mathbf{y} \in V$, then $\mathbf{x} + \mathbf{y} \in V$
  2. For all $\mathbf{x}, \mathbf{y} \in V$, $\mathbf{x}+\mathbf{y}=\mathbf{y}+\mathbf{x}$
  3. For all $\mathbf{x}, \mathbf{y}, \mathbf{z} \in V$, $(\mathbf{x}+\mathbf{y})+\mathbf{z} = \mathbf{x}+(\mathbf{y}+\mathbf{z})$
  4. There is a unique vector $\mathbf{0} \in V$ such that $\mathbf{0}+\mathbf{x} = \mathbf{x}+\mathbf{0} = \mathbf{x}$
  5. For each $\mathbf{x} \in V$, there exists a vector $-\mathbf{x}$ such that $\mathbf{x}+(-\mathbf{x}) = (-\mathbf{x})+\mathbf{x} = \mathbf{0}$

Axioms of scalar multiplication:

  1. If $k$ is any scalar and $\mathbf{x} \in V$, then $k\mathbf{x}\in V$
  2. $k(\mathbf{x}+\mathbf{y}) = k\mathbf{x}+k\mathbf{y}$
  3. $(k_1+k_2)\mathbf{x} = k_1\mathbf{x} + k_2\mathbf{x}$
  4. $k_1(k_2\mathbf{x}) = (k_1k_2)\mathbf{x}$
  5. $1\mathbf{x} = \mathbf{x}$
  • Subspace - a subset $W$ of a vector space $V$ that is itself a vector space under the operations of vector addition and scalar multiplication defined on $V$
    • Criteria for a subspace
      • Closed under vector addition: if $\mathbf{x}, \mathbf{y} \in W$, then $\mathbf{x}+\mathbf{y} \in W$
      • Closed under scalar multiplication: if $\mathbf{x} \in W$ and $k$ is a scalar, then $k\mathbf{x} \in W$
  • Linearly independent - a set of vectors for which $k_1\mathbf{x}_1 + k_2\mathbf{x}_2 + \cdots + k_n\mathbf{x}_n = \mathbf{0}$ is satisfied only by $k_1=k_2=\cdots=k_n=0$
  • Linearly dependent - a set of vectors that is not linearly independent
  • Basis - a set of vectors $B \subseteq V$ that is linearly independent and such that every vector in $V$ can be expressed as a linear combination of vectors in $B$, i.e. $\mathrm{span}(B) = V$
    • Standard basis in $\Reals^n$
      • $\mathbf{e}_1 = \langle 1,0,0,\ldots,0 \rangle$
      • $\mathbf{e}_2 = \langle 0,1,0,\ldots,0 \rangle$
      • $\cdots$
      • $\mathbf{e}_n = \langle 0,0,0,\ldots,1 \rangle$
    • Coordinates $c_i$ of $\mathbf{v}$ relative to the basis $B=\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n\}$
      • $\mathbf{v} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_n\mathbf{x}_n$
  • Dimension - the number of vectors in a basis $B$ for a vector space $V$
    • equivalently, the minimum number of vectors in a spanning set $S$ for $V$
  • Span - the set of all linear combinations of a set of vectors $S=\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n\}$ in $V$: $\mathrm{span}(S) = \{c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_n\mathbf{x}_n : c_i \text{ scalars}\}$
    • Spanning set - a set $S$ with $V = \mathrm{span}(S)$
  • Coordinates of $\mathbf{u}$ relative to an orthogonal basis $B = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$
    • $\mathbf{u} = \left(\dfrac{\mathbf{u}\cdot\mathbf{w}_1}{\mathbf{w}_1\cdot\mathbf{w}_1}\right)\mathbf{w}_1 + \left(\dfrac{\mathbf{u}\cdot\mathbf{w}_2}{\mathbf{w}_2\cdot\mathbf{w}_2}\right)\mathbf{w}_2 + \cdots + \left(\dfrac{\mathbf{u}\cdot\mathbf{w}_n}{\mathbf{w}_n\cdot\mathbf{w}_n}\right)\mathbf{w}_n$
    • If the basis is orthonormal, this reduces to $\mathbf{u} = (\mathbf{u}\cdot\mathbf{w}_1)\mathbf{w}_1 + (\mathbf{u}\cdot\mathbf{w}_2)\mathbf{w}_2 + \cdots + (\mathbf{u}\cdot\mathbf{w}_n)\mathbf{w}_n$
  • Gram-Schmidt orthogonalization (a code sketch follows this block)
    • Given a basis $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$,
    • then $B' = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthogonal basis, where
      • $\mathbf{v}_1 = \mathbf{u}_1$
      • $\mathbf{v}_2 = \mathbf{u}_2 - \mathrm{proj}_{\mathbf{v}_1}\mathbf{u}_2$
      • $\mathbf{v}_3 = \mathbf{u}_3 - \mathrm{proj}_{\mathbf{v}_1}\mathbf{u}_3 - \mathrm{proj}_{\mathbf{v}_2}\mathbf{u}_3$
      • $\cdots$
      • $\mathbf{v}_n = \mathbf{u}_n - \mathrm{proj}_{\mathbf{v}_1}\mathbf{u}_n - \mathrm{proj}_{\mathbf{v}_2}\mathbf{u}_n - \cdots - \mathrm{proj}_{\mathbf{v}_{n-1}}\mathbf{u}_n$
      • Note: $\mathrm{proj}_{\mathbf{v}}\mathbf{u} = \left(\dfrac{\mathbf{u}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\right)\mathbf{v}$ is the projection of $\mathbf{u}$ onto $\mathbf{v}$
    • and $B'' = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ is an orthonormal basis, where
      • $\mathbf{w}_i = \dfrac{\mathbf{v}_i}{\lVert\mathbf{v}_i\rVert}$
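
A minimal NumPy sketch of the procedure above (the function name `gram_schmidt` and the example vectors are illustrative, not from the notes):

```python
import numpy as np

def gram_schmidt(U):
    """Orthogonalize the rows of U (each row is a basis vector u_i)."""
    V = []
    for u in U.astype(float):
        v = u.copy()
        for w in V:
            # subtract proj_w(u) = ((u . w) / (w . w)) w
            v -= (u @ w) / (w @ w) * w
        V.append(v)
    V = np.array(V)
    # normalize each orthogonal vector to get an orthonormal basis
    W = V / np.linalg.norm(V, axis=1, keepdims=True)
    return V, W

U = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
V, W = gram_schmidt(U)
print(np.round(W @ W.T, 10))   # ~ identity matrix, so the rows of W are orthonormal
```
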
  • Consistent - a system with at least one solution
    • Unique solution
    • Infinitely many solutions
  • Inconsistent - a system with no solutions
  • Homogeneous - all constants are zero: $\mathbf{A}\mathbf{x} = \mathbf{0}$
  • Nonhomogeneous - not all constants are zero: $\mathbf{A}\mathbf{x} = \mathbf{b}$ with $\mathbf{b} \ne \mathbf{0}$
  • Elementary row operations
    • $cR_i$ - multiply a row by a nonzero constant $c$
    • $R_i \leftrightarrow R_j$ - interchange any two rows
    • $cR_i + R_j$ - add a nonzero multiple of one row to another row
  • Row equivalent - a matrix obtained from another matrix by a sequence of elementary row operations
  • Row reduction - the procedure of carrying out elementary row operations on a matrix
  • Row-echelon form - an augmented matrix in which…
    • Rows consisting of all zeros are at the bottom of the matrix
    • In consecutive nonzero rows, the first nonzero entry (pivot) in the lower row appears to the right of the pivot in the higher row
    • The first nonzero entry in a nonzero row is a 1
      • (depending on the text, this condition may instead be part of the definition of reduced row-echelon form)
    • Note: the row-echelon form of a matrix is not unique
  • Reduced row-echelon form - an augmented matrix which…
    • Is in row-echelon form
    • A column containing a leading 1 has zeros everywhere else
      • columns containing leading entries are standard basis vectors
    • Note: the reduced row-echelon form of a matrix is unique
  • Gaussian elimination - row-reduce the augmented matrix until it is in row-echelon form
  • Gauss-Jordan elimination - row-reduce the augmented matrix until it is in reduced row-echelon form
  • Overdetermined - a linear system with more equations than variables
  • Underdetermined - a linear system with fewer equations than variables
  • Trivial solution - a solution consisting of all zeros, i.e. $\mathbf{x} = \mathbf{0}$
  • Nontrivial solution - a solution with at least one nonzero entry, i.e. $\mathbf{x} \not= \mathbf{0}$
    • Existence of nontrivial solutions
      • An underdetermined homogeneous system always has nontrivial solutions
  • Superposition principle
    • If $\mathbf{x}_1, \mathbf{x}_2$ are solutions of the homogeneous system $\mathbf{A}\mathbf{x} = \mathbf{0}$,
      • then any linear combination $\mathbf{x} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2$ is also a solution
  • Row vector - a $1 \times n$ matrix formed from a row of a matrix
    • Row space - the span of the row vectors
  • Column vector - an $n \times 1$ matrix formed from a column of a matrix
    • Column space - the span of the column vectors
  • Rank - the maximum number of linearly independent row vectors in a matrix
    • Rank by row reduction: if $\mathbf{A}$ is row equivalent to a row-echelon form $\mathbf{B}$, then
      • row space of $\mathbf{A}$ = row space of $\mathbf{B}$
      • the nonzero rows of $\mathbf{B}$ form a basis for the row space of $\mathbf{A}$
      • $\mathrm{rank}(\mathbf{A})$ = number of nonzero rows in $\mathbf{B}$
  • Consistency of nonhomogeneous systems
    • $\mathbf{A}\mathbf{x} = \mathbf{b}$ is consistent $\iff \mathrm{rank}(\mathbf{A}) = \mathrm{rank}(\mathbf{A}\mid\mathbf{b})$
    • If $\mathbf{A}\mathbf{x} = \mathbf{b}$ with $m$ equations and $n$ variables is consistent and $\mathrm{rank}(\mathbf{A}) = r$,
      • then the solution contains $n-r$ parameters

$$\mathbf{A}\mathbf{x} = \mathbf{0} \implies \text{Always consistent} \begin{cases} \text{Unique solution: } \mathrm{rank}(\mathbf{A}) = n,\ \mathbf{x} = \mathbf{0} \\ \text{Infinitely many solutions: } \mathrm{rank}(\mathbf{A}) = r < n,\ n-r \text{ parameters} \end{cases}$$

$$\mathbf{A}\mathbf{x} = \mathbf{b} \begin{cases} \text{Consistent: } \mathrm{rank}(\mathbf{A}) = \mathrm{rank}(\mathbf{A}\mid\mathbf{b}) \begin{cases} \text{Unique solution: } \mathrm{rank}(\mathbf{A}) = n \\ \text{Infinitely many solutions: } \mathrm{rank}(\mathbf{A}) = r < n,\ n-r \text{ parameters} \end{cases} \\ \text{Inconsistent: } \mathrm{rank}(\mathbf{A}) < \mathrm{rank}(\mathbf{A}\mid\mathbf{b}) \end{cases}$$
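
A small NumPy sketch of the consistency test summarized above; the matrix `A` and vector `b` are made-up examples:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
b = np.array([[6.0], [12.0], [2.0]])

rank_A = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.hstack([A, b]))   # rank of the augmented matrix (A|b)
n = A.shape[1]

if rank_A < rank_Ab:
    print("inconsistent: no solution")
elif rank_A == n:
    print("consistent with a unique solution")
else:
    print(f"consistent with infinitely many solutions ({n - rank_A} parameters)")
```
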

  • Determinant of a square matrix - $\det\mathbf{A}$
    • For a $2 \times 2$ matrix: $\det\mathbf{A} = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$
  • Minor determinant - $M_{ij}$, the determinant of the submatrix obtained by deleting the $i$th row and the $j$th column of $\mathbf{A}$
  • Cofactor - $C_{ij}$, the signed minor determinant
    • $C_{ij} = (-1)^{i+j}M_{ij}$
    • The signs form a checkerboard pattern starting with $+$ in the $(1,1)$ position
  • Cofactor expansion of $\det\mathbf{A}$ along the $i$th row (a code sketch follows the determinant properties below)
    • $\det\mathbf{A} = a_{i1}C_{i1} + a_{i2}C_{i2} + \cdots + a_{in}C_{in}$
  • Cofactor expansion of $\det\mathbf{A}$ along the $j$th column
    • $\det\mathbf{A} = a_{1j}C_{1j} + a_{2j}C_{2j} + \cdots + a_{nj}C_{nj}$
  • $\det\mathbf{A}^T = \det\mathbf{A}$
  • $\det(\mathbf{A}\mathbf{B}) = \det\mathbf{A}\cdot\det\mathbf{B}$
  • If $\mathbf{A}$ has two identical rows or columns,
    • then $\det\mathbf{A} = 0$
  • If $\mathbf{A}$ has a row or column of all zero entries,
    • then $\det\mathbf{A} = 0$
  • If $\mathbf{A}$ is a triangular matrix,
    • then $\det\mathbf{A} = a_{11}a_{22}\cdots a_{nn}$
  • If $\mathbf{B}$ is obtained by interchanging two rows (or columns) of $\mathbf{A}$,
    • then $\det\mathbf{B} = -\det\mathbf{A}$
  • If $\mathbf{B}$ is obtained by multiplying a row (or column) of $\mathbf{A}$ by $k$,
    • then $\det\mathbf{B} = k\det\mathbf{A}$
  • If $\mathbf{B}$ is obtained by multiplying a row (or column) of $\mathbf{A}$ by $k$ and adding it to another row (or column),
    • then $\det\mathbf{B} = \det\mathbf{A}$
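
A minimal recursive cofactor-expansion sketch in Python, expanding along the first row; `np.linalg.det` is used only as a cross-check, and the example matrix is illustrative:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # minor M_{1,j+1}: delete row 0 and column j
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        cofactor = (-1) ** j * det_cofactor(minor)   # C_{1,j+1} = (-1)^(1+(j+1)) M_{1,j+1}
        total += A[0, j] * cofactor
    return total

A = np.array([[2.0, 1.0, 3.0],
              [0.0, -1.0, 4.0],
              [5.0, 2.0, 1.0]])
print(det_cofactor(A), np.linalg.det(A))   # the two values should agree
```
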
  • Inverse - $\mathbf{A}^{-1}$ such that $\mathbf{A}^{-1}\mathbf{A} = \mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$
    • Invertible (nonsingular) - a matrix with an inverse
    • Noninvertible (singular) - a matrix with no inverse
  • Properties of the inverse
    • $(\mathbf{A}^{-1})^{-1} = \mathbf{A}$
    • $(\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1}$
    • $(\mathbf{A}^{T})^{-1} = (\mathbf{A}^{-1})^{T}$
  • Adjoint - the transpose of the matrix of cofactors of $\mathbf{A}$
    • $\mathrm{adj}\,\mathbf{A} = \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{bmatrix}^{T}$
  • Adjoint method
    • $\mathbf{A}^{-1} = \left(\dfrac{1}{\det\mathbf{A}}\right)\mathrm{adj}\,\mathbf{A}$
    • Nonsingular $\iff \det\mathbf{A} \not= 0$
    • Singular $\iff \det\mathbf{A} = 0$
  • Row operation method
    • $(\mathbf{A}\mid\mathbf{I}) \xRightarrow{\text{elementary row operations}} (\mathbf{I}\mid\mathbf{A}^{-1})$
  • Nonhomogeneous system $\mathbf{A}\mathbf{x} = \mathbf{b}$ with $\mathbf{A}$ nonsingular
    • $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$
  • Homogeneous system $\mathbf{A}\mathbf{x} = \mathbf{0}$
    • Nonsingular $\iff$ only the trivial solution $\mathbf{x} = \mathbf{0}$
    • Singular $\iff$ nontrivial solutions exist
  • Cramer's rule
    • If $\mathbf{A}\xleftarrow{i}\mathbf{b}$ denotes the matrix obtained by replacing the $i$th column of $\mathbf{A}$ by $\mathbf{b}$,
      • then the $i$th component of the solution $\mathbf{x}$ is $x_i = \dfrac{\det\left(\mathbf{A}\xleftarrow{i}\mathbf{b}\right)}{\det\mathbf{A}}$
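
A short NumPy sketch of the adjoint method and Cramer's rule on a made-up 2×2 system (the helper `adjoint` is illustrative, not a library routine):

```python
import numpy as np

def adjoint(A):
    """Transpose of the matrix of cofactors of A."""
    n = A.shape[0]
    C = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[3.0, 1.0], [4.0, 2.0]])
b = np.array([5.0, 6.0])

# adjoint method: A^{-1} = adj(A) / det(A)
A_inv = adjoint(A) / np.linalg.det(A)

# Cramer's rule: x_i = det(A with column i replaced by b) / det(A)
x = np.empty(2)
for i in range(2):
    Ai = A.copy()
    Ai[:, i] = b
    x[i] = np.linalg.det(Ai) / np.linalg.det(A)

print(A_inv @ A)        # ~ identity matrix
print(x, A_inv @ b)     # both give the same solution (2, -1)
```
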
  • Eigenvalue - a number $\lambda$ that satisfies $\mathbf{A}\mathbf{v} = \lambda\mathbf{v}$ for some nonzero vector $\mathbf{v}$
  • Eigenvector - a nonzero vector $\mathbf{v}$ that satisfies $\mathbf{A}\mathbf{v} = \lambda\mathbf{v}$ for the corresponding $\lambda$
    • Characteristic equation - $\det(\mathbf{A} - \lambda\mathbf{I}) = 0$
      • Eigenvalues are the roots of the characteristic equation
      • Eigenvectors are the nontrivial solutions of $(\mathbf{A} - \lambda\mathbf{I})\mathbf{v} = \mathbf{0}$, found by applying Gauss-Jordan elimination to $(\mathbf{A} - \lambda\mathbf{I} \mid \mathbf{0})$
        • Eigenvectors are not unique
        • Any nonzero constant multiple of an eigenvector is another eigenvector
  • Complex eigenvalues and eigenvectors
    • If $\lambda = \alpha + i\beta$ is a complex eigenvalue of a real matrix,
      • then its complex conjugate $\overline{\lambda} = \alpha - i\beta$ is also an eigenvalue
    • If $\mathbf{v}$ is an eigenvector corresponding to $\lambda$,
      • then $\overline{\mathbf{v}}$ is an eigenvector corresponding to $\overline{\lambda}$
  • Singular matrices and the zero eigenvalue
    • Singular $\iff$ $\lambda = 0$ is an eigenvalue
    • Nonsingular $\iff$ every eigenvalue $\lambda \not= 0$
    • The determinant of $\mathbf{A}$ is the product of its eigenvalues
      • $\det\mathbf{A} = \lambda_1\lambda_2\cdots\lambda_n$
  • If $\mathbf{A}$ has eigenvalue $\lambda$ with corresponding eigenvector $\mathbf{v}$,
    • then $\mathbf{A}^{-1}$ has eigenvalue $1/\lambda$ with the same corresponding eigenvector $\mathbf{v}$
  • If $\mathbf{A}$ is a triangular matrix,
    • then the eigenvalues are the main diagonal entries
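
A quick NumPy check of the eigenvalue facts above on an illustrative matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])             # eigenvalues 5 and 2

eigvals, eigvecs = np.linalg.eig(A)    # columns of eigvecs are eigenvectors

for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))   # A v = lambda v

# det A equals the product of the eigenvalues
print(np.isclose(np.prod(eigvals), np.linalg.det(A)))

# eigenvalues of A^{-1} are the reciprocals 1/lambda
print(np.allclose(np.sort(1 / eigvals), np.sort(np.linalg.eigvals(np.linalg.inv(A)))))
```
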
  • Cayley-Hamilton theorem - an $n \times n$ matrix $\mathbf{A}$ satisfies its own characteristic equation
    • Characteristic equation: $(-1)^n \lambda^{n} + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0 = 0$
    • Cayley-Hamilton theorem: $(-1)^n \mathbf{A}^{n} + c_{n-1}\mathbf{A}^{n-1} + \cdots + c_1\mathbf{A} + c_0\mathbf{I} = \mathbf{0}$
  • Computing $\mathbf{A}^m$ for a matrix of order 2
    • $\mathbf{A}^m = c_0\mathbf{I} + c_1\mathbf{A}$
    • $\lambda^m = c_0 + c_1\lambda$, solved for $c_0, c_1$ at each eigenvalue of $\mathbf{A}$
  • Computing $\mathbf{A}^m$ for a matrix of order $n$
    • $\mathbf{A}^m = c_0\mathbf{I} + c_1\mathbf{A} + c_2\mathbf{A}^2 + \cdots + c_{n-1}\mathbf{A}^{n-1}$
    • $\lambda^m = c_0 + c_1\lambda + c_2\lambda^2 + \cdots + c_{n-1}\lambda^{n-1}$, solved for $c_0, \ldots, c_{n-1}$ at each eigenvalue of $\mathbf{A}$
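
A sketch of the power computation above for a 2×2 matrix, assuming distinct eigenvalues: solve $\lambda^m = c_0 + c_1\lambda$ at each eigenvalue, then form $\mathbf{A}^m = c_0\mathbf{I} + c_1\mathbf{A}$. The matrix and exponent are illustrative:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
m = 5

lam = np.linalg.eigvals(A)                 # eigenvalues (assumed distinct here)
# Solve c0 + c1*lam_i = lam_i**m for c0, c1
V = np.vander(lam, 2, increasing=True)     # rows [1, lam_i]
c = np.linalg.solve(V, lam ** m)

A_m = c[0] * np.eye(2) + c[1] * A
print(np.allclose(A_m, np.linalg.matrix_power(A, m)))   # True
```
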
  • Symmetric - $\mathbf{A}$ where $\mathbf{A}^T = \mathbf{A}$
    • A real symmetric matrix has real eigenvalues
    • A symmetric matrix has orthogonal eigenvectors corresponding to distinct eigenvalues
  • Orthogonal - $\mathbf{A}$ where $\mathbf{A}^T = \mathbf{A}^{-1}$
    • $\mathbf{A}^T\mathbf{A} = \mathbf{I}$
    • Orthonormal - a set of vectors in which every pair of distinct vectors is orthogonal and every vector is a unit vector
    • Orthogonal $\iff$ columns form an orthonormal set
  • Dominant eigenvalue - the eigenvalue whose absolute value is greater than that of all remaining eigenvalues: $\lvert \lambda_k \rvert > \lvert \lambda_i \rvert$ for $i \ne k$
  • Dominant eigenvector - the eigenvector corresponding to the dominant eigenvalue $\lambda_k$
  • Power method
    • $\mathbf{x}_m = \mathbf{A}^m\mathbf{x}_0 \approx \lambda_1^m c_1 \mathbf{v}_1$, which for large $m$ approximates (a multiple of) the dominant eigenvector $\mathbf{v}_1$
    • $\lambda_1\approx\dfrac{\mathbf{A}\mathbf{x}_m\cdot\mathbf{x}_m}{\mathbf{x}_m\cdot\mathbf{x}_m}$ approximates the dominant eigenvalue $\lambda_1$
      • Rayleigh quotient - $\dfrac{\mathbf{A}\mathbf{x}_m\cdot\mathbf{x}_m}{\mathbf{x}_m\cdot\mathbf{x}_m}$
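
A minimal power-method sketch with normalization at each step; the matrix, starting vector, and iteration count are illustrative:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])     # dominant eigenvalue 5, eigenvector ~ (1, 1)
x = np.array([1.0, 0.0])       # starting vector x_0

for _ in range(20):
    x = A @ x
    x = x / np.linalg.norm(x)  # rescale each iterate to avoid overflow

rayleigh = (A @ x) @ x / (x @ x)   # Rayleigh quotient ~ dominant eigenvalue
print(rayleigh)                    # ~ 5
print(x)                           # ~ dominant eigenvector (up to sign/scale)
```
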
  • Diagonalizable - $\mathbf{A}$ for which there exists a nonsingular $\mathbf{P}$ satisfying $\mathbf{P}^{-1}\mathbf{A}\mathbf{P} = \mathbf{D}$, where $\mathbf{D}$ is a diagonal matrix
    • $\mathbf{P}^{-1}\mathbf{A}\mathbf{P} = \mathbf{D}$, equivalently $\mathbf{A}\mathbf{P} = \mathbf{P}\mathbf{D}$
    • The columns of $\mathbf{P}$ are $n$ linearly independent eigenvectors of $\mathbf{A}$; the diagonal entries of $\mathbf{D}$ are the corresponding eigenvalues
    • Diagonalizable $\iff$ $\mathbf{A}$ has $n$ linearly independent eigenvectors
      • $n$ distinct eigenvalues $\implies$ diagonalizable
  • A symmetric matrix can always be diagonalized
    • Orthogonally diagonalizable - an orthogonal matrix $\mathbf{P}$ diagonalizes $\mathbf{A}$
      • $\mathbf{A}$ is symmetric $\iff \mathbf{A}$ can be orthogonally diagonalized
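
A brief NumPy check of the diagonalization relations above; the matrices are illustrative, and `np.linalg.eigh` is used for the symmetric case because it returns an orthogonal eigenvector matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, P = np.linalg.eig(A)      # columns of P are eigenvectors of A
D = np.diag(eigvals)
print(np.allclose(np.linalg.inv(P) @ A @ P, D))   # P^{-1} A P = D

# Orthogonal diagonalization of a symmetric matrix
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, Q = np.linalg.eigh(S)           # Q is orthogonal when S is symmetric
print(np.allclose(Q.T @ S @ Q, np.diag(w)))       # Q^T S Q = D
print(np.allclose(Q.T @ Q, np.eye(2)))            # Q^T Q = I
```
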
  • LU-factorization - $\mathbf{A} = \mathbf{L}\mathbf{U}$
    • Lower triangular matrix - a square matrix $\mathbf{L}$ whose entries above the main diagonal are all zero
    • Upper triangular matrix - a square matrix $\mathbf{U}$ whose entries below the main diagonal are all zero
    • LU-factorization is not uniquely determined
  • Doolittle's method - require $\mathbf{L}$ to have 1s on its main diagonal, then solve $\mathbf{A} = \mathbf{L}\mathbf{U}$ entry by entry for the remaining entries of $\mathbf{L}$ and $\mathbf{U}$
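
A minimal Doolittle-style sketch, assuming the factorization exists without row exchanges (no pivoting); the example matrix is illustrative:

```python
import numpy as np

def doolittle(A):
    """LU factorization with 1s on the main diagonal of L (no pivoting)."""
    n = A.shape[0]
    L = np.eye(n)
    U = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):                       # fill row i of U
            U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
        for j in range(i + 1, n):                   # fill column i of L
            L[j, i] = (A[j, i] - L[j, :i] @ U[:i, i]) / U[i, i]
    return L, U

A = np.array([[2.0, 1.0, 1.0],
              [4.0, -6.0, 0.0],
              [-2.0, 7.0, 2.0]])
L, U = doolittle(A)
print(np.allclose(L @ U, A))   # True
```
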