Linear Algebra Primer

by Chee Yee Lim


Posted on 2021-04-06



A collection of notes on linear algebra, covering its core concepts.


Set

Set, Cardinality, Cartesian Product

  • A set is an unordered collection of objects.
    • E.g. \( \{C, A\} \)
  • \( \in \) indicates an object belongs to a set.
    • E.g. \( A \in \{C, A\} \)
  • \( A \subseteq B \)
    • \( A \) is a subset of \( B \).
    • \( B \) is a superset of \( A \).
  • \( A = B \)
    • Two sets are equal if they contain exactly the same elements. (Order does not matter)
    • \( A = B \Leftrightarrow A \subseteq B \text{ and } B \subseteq A \)
  • \( \{ x \in R : x \geq 0 \} \)
    • The set of all real numbers \( x \) that are greater than or equal to 0.
  • \( |S| \) indicates cardinality, i.e. the number of elements in set \( S \).
    • In these notes, \( |S| \) is used only for finite sets.
  • \( A \times B \)
    • Cartesian product between set \( A \) and \( B \).
    • E.g. for \( A = \{ 1,2 \} \) and \( B = \{ c,a \} \), \( A \times B \) is \( \{ (1,c), (2,c), (1,a), (2,a) \} \).
  • \( |A \times B| = |A| \times |B| \)
  • \( \{ (x,y) \in R \times R : y = x^2 \} \)
    • The set of all pairs of real numbers \( (x,y) \) where \( y \) is the square of \( x \).

Function

  • For each input element in set \( A \), a function assigns a single output element from another set \( B \).
    • \( A \) is the domain of the function.
    • \( B \) is the co-domain of the function.
    • The output of a given input is the image of that input.
  • \( f(q) = r \Leftrightarrow q \mapsto r \)
    • \( q \) maps to \( r \) under \( f \).
  • \( f : D \rightarrow F \)
    • \( f \) is a function with domain \( D \) and co-domain \( F \).
  • \( Im f \)
    • The image of a function is the set of all images of inputs.
    • E.g. for a function \( f : \{ 1,2,3,4 \} \rightarrow \{ a,b,c,d,e \} \) and \( \{ (1 \mapsto a), (2 \mapsto b), (3 \mapsto c), (4 \mapsto e) \} \), the image of \( f \) is \( Im f = \{ a,b,c,e \} \)
    • Note that not every element of the co-domain has to be the image of an element of the domain.
  • \( id_D : D \rightarrow D \)
    • The identity function on a domain \( D \) is the function that maps each element \( d \in D \) to itself.
  • For functions \( f : A \rightarrow B \) and \( g : B \rightarrow C \), the functional composition of \( f \) and \( g \) is the function \( (g \circ f) : A \rightarrow C \) defined by \( (g \circ f)(x) = g(f(x)) \).
    • E.g. the functional composition of \( g(y) = y^2 \) and \( f(x) = x + 1 \) is \( (g \circ f)(x) = (x + 1)^2 \)
  • \( h \circ (g \circ f) = (h \circ g) \circ f \)
    • Associativity of function composition
  • Functions \( f \) and \( g \) are functional inverses if \( f \circ g \) and \( g \circ f \) are defined and are identity functions.
    • A function that has an inverse is invertible.
  • \( f : D \rightarrow F \) is one-to-one if \( f(x) = f(y) \) implies \( x = y \).
    • Invertible functions are one-to-one.
  • \( f : D \rightarrow F \) is onto if for every \( z \in F \) there exists an \( a \) such that \( f(a) = z \).
    • Invertible functions are onto.
    • Onto means every element of the co-domain is the image of some input.
  • Function invertibility theorem
    • A function \( f \) is invertible if and only if it is one-to-one and onto.

Field

Field

  • A field \( F \) is a set of numbers with addition and multiplication operations.
    • Three commonly used fields are:
      • The field \( R \) of real numbers.
      • The field \( C \) of complex numbers.
      • The finite field \( GF(2) \), which consists of 0 and 1 under mod 2 arithmetic.

Complex Numbers

  • Complex numbers are a precursor to vectors.
  • Complex numbers
    • A real number plus an imaginary number is a complex number.
    • A complex number has a real part and an imaginary part.
  • By interpreting real and imaginary parts of a complex number as \( x \) and \( y \) coordinates, a complex number becomes a point in the complex plane.
    • (Figure: Complex Coordinates)
  • Absolute value of \( z \), \( |z| \)
    • \( |z| \) corresponds to the distance from origin to the point \( z \) on the complex plane.
    • (Figure: Complex Modulus)
  • Addition \( \Leftrightarrow \) Translation
    • \( f(z) = z + z_0 \) where \( z_0 \) is a complex number.
    • A translation will move a point in the plane.
    • (Figure: Complex Translation)
  • Euler's formula
    • For any real number \( \theta \), \( e^{\theta i} \) is the point \( z \) on the unit circle with argument \( \theta \).
    • (Figure: Euler 1)
    • When \( \theta = \pi \), \( e^{\pi i} = -1 \).
    • (Figure: Euler 2)
  • Every complex number can be written in the form \( z = re^{\theta i} \).
    • \( r \) is the absolute value of \( z \).
    • \( \theta \) is the argument of \( z \).
  • Rotation \( \Leftrightarrow \) Increase the argument of \( z \)
    • By the exponentiation law \( e^a \cdot e^b = e^{a+b} \), rotating \( z \) amounts to increasing its argument.
    • To rotate by an angle \( \tau \), multiply by \( e^{\tau i} \): \( re^{\theta i} \cdot e^{\tau i} = re^{\theta i + \tau i} = re^{(\theta + \tau) i} \)

Galois Field 2

  • Galois Field 2, \( GF(2) \), has just two elements: 0 and 1.
  • Addition is exclusive-or (XOR).
  • Multiplication is ordinary multiplication (AND).
  • Applications of \( GF(2) \)
    • Cryptosystem
    • Network coding
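
A minimal sketch of \( GF(2) \) arithmetic in plain Python (no library needed): addition is exclusive-or on the bits 0 and 1, and multiplication is logical AND.

```python
# GF(2): the field with two elements {0, 1} under mod-2 arithmetic.

def gf2_add(a: int, b: int) -> int:
    """Addition in GF(2) is exclusive-or: 1 + 1 = 0."""
    return a ^ b

def gf2_mul(a: int, b: int) -> int:
    """Multiplication in GF(2) is ordinary multiplication (logical AND)."""
    return a & b

assert gf2_add(1, 1) == 0   # 1 + 1 = 0 (mod 2)
assert gf2_add(1, 0) == 1
assert gf2_mul(1, 1) == 1
assert gf2_mul(1, 0) == 0
```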

Vector

Vector

  • For a field \( F \) and a set \( D \), a \( D \)-vector over \( F \) is a function from \( D \) to \( F \).
    • The set of such functions is written \( F^D \).
    • Therefore we can consider a vector as a function, i.e. a mapping from a vector index/key to a vector value, like a Python dictionary. (See the sketch after this list.)
  • A vector with 2 elements, 2-vector \( [x,y] \), can be interpreted as a point in the plane.
    • A vector with 3 elements, 3-vector \( [x,y,z] \), can be interpreted as a point in space.
  • Zero vector
    • A zero vector, \( \boldsymbol{0}_D \) or \( \boldsymbol{0} \), has all zero elements.
    • \( \boldsymbol{v} + \boldsymbol{0} = \boldsymbol{v} \)
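
Following the function view above, here is a minimal sketch of a \( D \)-vector as a Python dictionary mapping each key in \( D \) to a field value; the helper names are illustrative, not from any particular library.

```python
# A D-vector over R, represented as a function (dict) from D to R.
v = {"x": 3.0, "y": 1.5}                    # here D = {"x", "y"}

def zero_vector(D):
    """The zero vector 0_D: every entry is 0."""
    return {k: 0.0 for k in D}

def vec_add(u, v):
    """Entrywise addition over the shared domain D."""
    return {k: u[k] + v[k] for k in u}

# v + 0 = v
assert vec_add(v, zero_vector(v.keys())) == v
```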

Vector Addition

  • Vector addition / translation
    • \( [u_1,u_2,...,u_n] + [v_1,v_2,...,v_n] = [u_1 + v_1, u_2 + v_2, ..., u_n + v_n] \)
    • Vector addition is done by adding values across the same index in both vectors.
  • Vector addition is associative.
    • \( (x + y) + z = x + (y + z) \)
  • Vector addition is commutative.
    • \( x + y = y + x \)
  • \( n \)-vectors over \( R \) can be visualised as arrows in \( R^n \).
    • The 2-vector \( [3, 1.5] \) can be represented by either of the two following arrows, which have different origins and endpoints.
    • (Figures: Vector Addition 1 and 2)
  • Addition of vectors can also be visualised using arrows.
    • E.g. \( \boldsymbol{u} + \boldsymbol{v} \)
    • (Figure: Vector Addition 3)
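
A short sketch of componentwise vector addition in plain Python, matching the arrow picture above; the function name is illustrative.

```python
def vec_add(u, v):
    """Add values across the same index in both vectors."""
    assert len(u) == len(v), "vectors must have the same length"
    return [ui + vi for ui, vi in zip(u, v)]

u, v, w = [3, 1.5], [1, 2], [2, 2]
print(vec_add(u, v))                                            # [4, 3.5]
assert vec_add(u, v) == vec_add(v, u)                           # commutativity
assert vec_add(vec_add(u, v), w) == vec_add(u, vec_add(v, w))   # associativity
```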

Scalar-vector Multiplication

  • Scalar-vector multiplication / scaling
    • \( \alpha \boldsymbol{v} = \alpha [v_1,v_2,...,v_n] = [\alpha v_1, \alpha v_2, ..., \alpha v_n] \), where \( \alpha \) is a scalar.
    • Multiplying a vector \( \boldsymbol{v} \) by a scalar \( \alpha \) is defined as multiplying each entry of \( \boldsymbol{v} \) by \( \alpha \).
  • Scalar-vector multiplication changes a vector's length but not its direction; a negative scalar reverses the direction.
    • E.g. given a vector \( [3,1.5] \),
      • (Figure: Scalar-vector Multiplication 1)
      • \( 2 \times [3,1.5] \) gives a vector with twice the length
      • (Figure: Scalar-vector Multiplication 2)
    • E.g. multiplying \( \boldsymbol{v} = [3,2] \) by a series of \( \alpha \) with different magnitudes.
      • (Figure: Scalar-vector Multiplication 3)
  • Scalar-vector multiplication is associative.
    • \( \alpha (\beta \boldsymbol{v}) = (\alpha \beta) \boldsymbol{v} \)
  • Scalar-vector multiplication distributes over vector addition.
    • \( \alpha (\boldsymbol{u} + \boldsymbol{v}) = \alpha \boldsymbol{u} + \alpha \boldsymbol{v} \)
  • Convex combination of \( \boldsymbol{u} \) and \( \boldsymbol{v} \)
    • \( \alpha \boldsymbol{u} + \beta \boldsymbol{v} \), where \( 0 \leq \alpha \leq 1 \), \( 0 \leq \beta \leq 1 \) and \( \alpha + \beta = 1 \)
    • The \( \boldsymbol{u} \)-to-\( \boldsymbol{v} \) line segment consists of the set of convex combinations of \( \boldsymbol{u} \) and \( \boldsymbol{v} \).
  • Affine combination of \( \boldsymbol{u} \) and \( \boldsymbol{v} \)
    • \( \alpha \boldsymbol{u} + \beta \boldsymbol{v} \), where \( \alpha \in R \), \( \beta \in R \) and \( \alpha + \beta = 1 \)
    • The line through \( \boldsymbol{u} \) and \( \boldsymbol{v} \) consists of the set of affine combinations of \( \boldsymbol{u} \) and \( \boldsymbol{v} \).
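
A sketch of scaling and of convex combinations in plain Python; `scale` and `combo` are illustrative helpers, not library functions.

```python
def scale(alpha, v):
    """Multiply each entry of v by the scalar alpha."""
    return [alpha * vi for vi in v]

def combo(alpha, u, beta, v):
    """The combination alpha*u + beta*v."""
    return [alpha * ui + beta * vi for ui, vi in zip(u, v)]

u, v = [3, 1.5], [1, 2]
print(scale(2, u))     # [6, 3.0]: twice the length, same direction
print(scale(-1, u))    # [-3, -1.5]: same length, opposite direction

# Convex combinations (coefficients in [0, 1] summing to 1) trace the
# u-to-v line segment: endpoints at t = 0 and t = 1, midpoint at t = 0.5.
for t in (0.0, 0.5, 1.0):
    print(combo(1 - t, u, t, v))
```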

Dot-product

  • Dot-product / scalar product
    • \( \boldsymbol{u} \cdot \boldsymbol{v} = \sum_{k \in D} \boldsymbol{u} [k] \boldsymbol{v} [k] = u_1 v_1 + u_2 v_2 + ... + u_n v_n \)
    • The dot-product of two \( D \)-vectors is the sum of the products of corresponding entries.
    • Output is a scalar, not a vector.
  • Linear equation
    • \( \boldsymbol{a} \cdot \boldsymbol{x} = \beta \), where \( \boldsymbol{a} \) is a vector, \( \beta \) is a scalar and \( \boldsymbol{x} \) is a vector of variables.
  • Algebraic properties of dot-product
    • Commutativity - \( \boldsymbol{u} \cdot \boldsymbol{v} = \boldsymbol{v} \cdot \boldsymbol{u} \)
    • Homogeneity - \( (\alpha \boldsymbol{u}) \cdot \boldsymbol{v} = \alpha (\boldsymbol{u} \cdot \boldsymbol{v}) \)
    • Distributive law - \( (\boldsymbol{v}_1 + \boldsymbol{v}_2) \cdot \boldsymbol{x} = \boldsymbol{v}_1 \cdot \boldsymbol{x} + \boldsymbol{v}_2 \cdot \boldsymbol{x} \)
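
A one-function sketch of the dot-product in plain Python, with checks of the commutativity and homogeneity properties listed above.

```python
def dot(u, v):
    """Sum of the products of corresponding entries; returns a scalar."""
    return sum(ui * vi for ui, vi in zip(u, v))

u, v = [1, 2, 3], [4, 5, 6]
print(dot(u, v))                                         # 1*4 + 2*5 + 3*6 = 32
assert dot(u, v) == dot(v, u)                            # commutativity
assert dot([2 * ui for ui in u], v) == 2 * dot(u, v)     # homogeneity
```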

Vector Space

Linear Combination, Span

  • Linear combination of vectors
    • \( \alpha_1 \boldsymbol{v}_1 + ... + \alpha_n \boldsymbol{v}_n \), where \( \alpha_1, ..., \alpha_n \) are the coefficients of the linear combination.
  • Span of a set of vectors, \( V = \{ \boldsymbol{v}_1, ..., \boldsymbol{v}_n \} \)
    • It corresponds to the set of all linear combinations of some vectors \( \boldsymbol{v}_1, ..., \boldsymbol{v}_n \).
    • E.g. \( [19,3] \in span(V) \), where \( V = \{ [2,3], [1,2] \} \).
    • This is because \( 35 \times [2,3] - 51 \times [1,2] = [19,3] \).
  • Given a vector space \( V \), if \( \boldsymbol{v}_1, ..., \boldsymbol{v}_n \) are vectors such that \( V = Span \{ \boldsymbol{v}_1, ..., \boldsymbol{v}_n \} \), then:
    • \( \{ \boldsymbol{v}_1, ..., \boldsymbol{v}_n \} \) is a generating set for \( V \).
    • \( \boldsymbol{v}_1, ..., \boldsymbol{v}_n \) are generators for \( V \).
  • It is possible to express a vector as a linear combination of some other vectors.
    • E.g. \( [3,0,0] = 2[2,0,1] - 1[1,0,2] + 0[2,2,2] \)
  • A linear combination is trivial when all the coefficients \( \alpha_1, ..., \alpha_n \) are zero; otherwise it is non-trivial. (The span example above is checked numerically below.)
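
To verify the span example above numerically, we can solve for the coefficients with numpy (assumed available); the columns of the matrix are the generators \( [2,3] \) and \( [1,2] \).

```python
import numpy as np

# Columns are the generators [2,3] and [1,2]; solve A @ coeffs = target.
A = np.array([[2.0, 1.0],
              [3.0, 2.0]])
target = np.array([19.0, 3.0])

coeffs = np.linalg.solve(A, target)
print(coeffs)                            # [ 35. -51.] -> 35*[2,3] - 51*[1,2]
assert np.allclose(A @ coeffs, target)   # so [19,3] is in the span
```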

Specification of Geometric Objects

  • There are two ways to specify a geometric object that passes through the origin.
    1. Spans of some vectors
      • A line in 3-dimensions: \( \{ [4,-1,1] \} \)
      • A plane in 3-dimensions: \( \{ [4,-1,1], [0,1,1] \} \)
    2. Solution set of linear equations with zero right-hand sides
      • A line in 3-dimensions: \( \{ [x,y,z] : \boldsymbol{a}_1 \cdot [x,y,z] = 0, \boldsymbol{a}_2 \cdot [x,y,z] = 0 \} \)
      • A plane in 3-dimensions: \( \{ (x,y,z) : ax + by + cz = 0 \} \)
      • A plane in 3-dimensions (using dot-product): \( \{ [x,y,z] : [a,b,c] \cdot [x,y,z] = 0 \} \)
  • To represent a geometric object that does not pass through the origin,
    • Translate it by adding a vector \( \boldsymbol{c} \) to every vector in \( V \), giving \( \{ \boldsymbol{c} + \boldsymbol{v} : \boldsymbol{v} \in V \} \).

Affine Combination, Convex Combination

  • Affine space
    • If \( \boldsymbol{c} \) is a vector and \( V \) is a vector space, then \( \boldsymbol{c} + V \) is called an affine space.
  • Affine combination
    • A linear combination \( \alpha_1 \boldsymbol{u}_1 + \alpha_2 \boldsymbol{u}_2 + ... + \alpha_n \boldsymbol{u}_n \), where \( \alpha_1 + \alpha_2 + ... + \alpha_n = 1 \) is an affine combination.
    • The set of all affine combinations of vectors \( \boldsymbol{u}_1, \boldsymbol{u}_2, ..., \boldsymbol{u}_n \) is called the affine hull of these vectors.
    • Affine hull of some vectors is an affine space.
  • The solution set of a linear system is either empty or an affine space.
  • A linear equation \( \boldsymbol{a}_1 \cdot \boldsymbol{x} = 0 \) with zero right-hand side is a homogeneous linear equation.
    • A system of homogeneous linear equations is called a homogeneous linear system.
  • Convex combination
    • A linear combination \( \alpha_1 \boldsymbol{u}_1 + \alpha_2 \boldsymbol{u}_2 + ... + \alpha_n \boldsymbol{u}_n \), where \( \alpha_1, ..., \alpha_n \geq 0 \) and \( \alpha_1 + \alpha_2 + ... + \alpha_n = 1 \), is a convex combination.

Matrix

Matrix

  • A matrix is a two-dimensional array.
    • For a matrix \( A \), the \( i,j \) element of \( A \) is the element in row \( i \) and column \( j \).
    • It is traditionally written as \( A_{i,j} \).
  • A matrix can be viewed as a nested column of row vectors, or a nested row of column vectors.
    • E.g. \( A = \begin{pmatrix} r_1 \\\ r_2 \\\ r_3 \end{pmatrix} = \begin{pmatrix} c_1 & c_2 & c_3 \end{pmatrix} \)
  • For finite sets \( R \) and \( C \), an \( R \times C \) matrix over \( F \) is a function from \( R \times C \) to \( F \).
    • E.g. we can retrieve a value from \( F \) given an index from \( R \) and an index from \( C \). (See the dictionary sketch after this list.)
  • Identity matrix
    • A \( D \times D \) identity matrix is the matrix \( I_D \) such that \( I_D[k,k] = 1 \) for all \( k \in D \), and 0 elsewhere.
    • I.e. identity matrix only has 1s for the diagonals and 0s for everything else.
  • Matrix transpose
    • Swap rows and columns in a matrix.
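
Mirroring the dictionary view of vectors, here is a sketch of an \( R \times C \) matrix as a Python dict keyed by (row, column) pairs; the helper names are illustrative.

```python
# An R x C matrix over a field, as a function from R x C to the field.
R, C = {"r1", "r2"}, {"c1", "c2"}
A = {("r1", "c1"): 1.0, ("r1", "c2"): 2.0,
     ("r2", "c1"): 3.0, ("r2", "c2"): 4.0}

def transpose(M):
    """Swap rows and columns: entry (r, c) moves to (c, r)."""
    return {(c, r): val for (r, c), val in M.items()}

def identity(D):
    """I_D[k, k] = 1 for all k in D, and 0 elsewhere."""
    return {(i, j): (1.0 if i == j else 0.0) for i in D for j in D}

assert transpose(transpose(A)) == A
I = identity({"a", "b"})
assert I[("a", "a")] == 1.0 and I[("a", "b")] == 0.0
```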

Matrix-Vector Multiplication

  • Matrix multiplication can be computed in either of two ways:
    • Dot each row of matrix 1 with each column of matrix 2, OR
    • Multiply each column of matrix 1 with the corresponding row of matrix 2, and sum the results.
  • Matrix-vector multiplication
    • Let \( M \) be an \( R \times C \) matrix and \( \boldsymbol{v} \) be a \( C \)-vector, then \( M \times \boldsymbol{v} = \sum_{c \in C} \boldsymbol{v}[c] \cdot (column\ c\ of\ M) \).
    • E.g. multiplying row element in matrix 1 with column element in matrix 2.
    • \( \begin{pmatrix} 1 & 2 & 3 \\\ 10 & 20 & 30 \end{pmatrix} \times \begin{pmatrix} 7 \\\ 0 \\\ 4 \end{pmatrix} = \begin{pmatrix} 1 \times 7 + 2 \times 0 + 3 \times 4 \\\ 10 \times 7 + 20 \times 0 + 30 \times 4 \end{pmatrix} = \begin{pmatrix} 19 \\\ 190 \end{pmatrix} \)
    • E.g. multiplying column in matrix 1 with row in matrix 2.
    • \( \begin{pmatrix} 1 & 2 & 3 \\\ 10 & 20 & 30 \end{pmatrix} \times \begin{pmatrix} 7 \\\ 0 \\\ 4 \end{pmatrix} = 7 \begin{pmatrix} 1 \\\ 10 \end{pmatrix} + 0 \begin{pmatrix} 2 \\\ 20 \end{pmatrix} + 4 \begin{pmatrix} 3 \\\ 30 \end{pmatrix} = \begin{pmatrix} 19 \\\ 190 \end{pmatrix} \)
  • Vector-matrix multiplication
    • Let \( M \) be an \( R \times C \) matrix and \( \boldsymbol{w} \) be an \( R \)-vector, then \( \boldsymbol{w} \times M = \sum_{r \in R} \boldsymbol{w}[r] \cdot (row\ r\ of\ M) \).
    • E.g. multiplying row element in matrix 1 with column element in matrix 2
    • \( \begin{pmatrix} 3 & 4 \end{pmatrix} \times \begin{pmatrix} 1 & 2 & 3 \\\ 10 & 20 & 30 \end{pmatrix} = \begin{pmatrix} 3 \times 1 + 4 \times 10 & 3 \times 2 + 4 \times 20 & 3 \times 3 + 4 \times 30 \end{pmatrix} = \begin{pmatrix} 43 & 86 & 129 \end{pmatrix} \)
    • E.g. multiplying column in matrix 1 with row in matrix 2
    • \( \begin{pmatrix} 3 & 4 \end{pmatrix} \times \begin{pmatrix} 1 & 2 & 3 \\\ 10 & 20 & 30 \end{pmatrix} = 3 \begin{pmatrix} 1 & 2 & 3 \end{pmatrix} + 4 \begin{pmatrix} 10 & 20 & 30 \end{pmatrix} = \begin{pmatrix} 43 & 86 & 129 \end{pmatrix} \)
  • As a vector is always assumed to be a column vector, vector-matrix multiplication is usually written with a transposed vector.
    • E.g. \( \boldsymbol{b}^T A \)
  • Dot-product definition of matrix-vector multiplication
    • Let \( M \) be an \( R \times C \) matrix, \( \boldsymbol{v} \) be an \( R \)-vector and \( \boldsymbol{u} \) be a \( C \)-vector, then \( M \times \boldsymbol{u} = \boldsymbol{v} \) such that \( \boldsymbol{v} [r] \) is the dot-product of row \( r \) of \( M \) with \( \boldsymbol{u} \).
    • E.g. \( M \cdot \boldsymbol{u} = \begin{pmatrix} \boldsymbol{v}_1 \\\ \boldsymbol{v}_2 \end{pmatrix} \cdot \boldsymbol{u} = \begin{pmatrix} \boldsymbol{v}_1 \cdot \boldsymbol{u} \\\ \boldsymbol{v}_2 \cdot \boldsymbol{u} \end{pmatrix} \)
  • Algebraic properties of matrix-vector multiplication
    • \( A \times (\alpha \boldsymbol{v}) = \alpha (A \times \boldsymbol{v}) \)
    • \( A \times (\boldsymbol{u} + \boldsymbol{v}) = A \times \boldsymbol{u} + A \times \boldsymbol{v} \)
  • Matrix-vector and vector-matrix multiplication are related by transposition, \( (M \boldsymbol{v})^T = \boldsymbol{v}^T M^T \); for a square matrix \( M \), \( M \boldsymbol{v} \) and \( \boldsymbol{v}^T M \) give the same entries only when \( M \) is symmetric.
  • Matrices with non-matching dimensions cannot be multiplied.
    • E.g. \( \begin{pmatrix} 1 & 2 & 3 \\\ 10 & 20 & 30 \end{pmatrix} \times \begin{pmatrix} 7 \\\ 0 \end{pmatrix} = undefined \)
  • Matrix conditioning
    • Some matrix-vector equations are ill-conditioned, which can prevent an algorithm from getting approximate solutions even when solutions exist.
  • Triangular matrix
    • An \( n \times n \) upper triangular matrix \( A \) is a matrix where \( A_{i,j} = 0 \) for \( i > j \).
    • A matrix-vector equation with a triangular matrix can be solved by backward substitution, as sketched below.
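
A sketch of backward substitution for an upper triangular system \( A \boldsymbol{x} = \boldsymbol{b} \), in plain Python and assuming nonzero diagonal entries.

```python
def back_substitution(A, b):
    """Solve A x = b for an upper triangular A (A[i][j] == 0 for i > j).
    Works bottom-up: the last equation has one unknown, and each earlier
    equation introduces exactly one more."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - known) / A[i][i]      # assumes A[i][i] != 0
    return x

A = [[2.0, 1.0, 1.0],
     [0.0, 3.0, 1.0],
     [0.0, 0.0, 4.0]]
b = [9.0, 11.0, 8.0]
print(back_substitution(A, b))               # [2.0, 3.0, 2.0]
```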

Matrix-Matrix Multiplication

  • Definitions of matrix-matrix multiplications
    • Vector-matrix definition
      • For each row \( r \) of \( A \), \( row\ r\ of\ AB = (row\ r\ of\ A) \times B \).
      • E.g. \( \begin{pmatrix} 1 & 0 & 0 \\\ 2 & 1 & 0 \\\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \boldsymbol{b}_1 \\\ \boldsymbol{b}_2 \\\ \boldsymbol{b}_3 \end{pmatrix} = \begin{pmatrix} \begin{pmatrix} 1 & 0 & 0 \end{pmatrix} \times \begin{pmatrix} \boldsymbol{b}_1 \\\ \boldsymbol{b}_2 \\\ \boldsymbol{b}_3 \end{pmatrix} \\\ \begin{pmatrix} 2 & 1 & 0 \end{pmatrix} \times \begin{pmatrix} \boldsymbol{b}_1 \\\ \boldsymbol{b}_2 \\\ \boldsymbol{b}_3 \end{pmatrix} \\\ \begin{pmatrix} 0 & 0 & 1 \end{pmatrix} \times \begin{pmatrix} \boldsymbol{b}_1 \\\ \boldsymbol{b}_2 \\\ \boldsymbol{b}_3 \end{pmatrix} \end{pmatrix} = \begin{pmatrix} \boldsymbol{b}_1 \\\ 2 \boldsymbol{b}_1 + \boldsymbol{b}_2 \\\ \boldsymbol{b}_3 \end{pmatrix} \)
    • Matrix-vector definition
      • For each column \( c \) of \( B \), \( column\ c\ of\ AB = A \times (column\ c\ of\ B) \).
      • E.g. \( \begin{pmatrix} 1 & 2 \\\ -1 & 1 \end{pmatrix} \begin{pmatrix} 4 & 2 & 0 \\\ 3 & 1 & -1 \end{pmatrix} = \begin{pmatrix} \begin{pmatrix} 1 & 2 \\\ -1 & 1 \end{pmatrix} \times \begin{pmatrix} 4 \\\ 3 \end{pmatrix} & \begin{pmatrix} 1 & 2 \\\ -1 & 1 \end{pmatrix} \times \begin{pmatrix} 2 \\\ 1 \end{pmatrix} & \begin{pmatrix} 1 & 2 \\\ -1 & 1 \end{pmatrix} \times \begin{pmatrix} 0 \\\ -1 \end{pmatrix} \end{pmatrix} = \begin{pmatrix} 10 & 4 & -2 \\\ -1 & -1 & -1 \end{pmatrix} \)
    • Dot-product definition
      • \( Element\ rc\ of\ AB = (row\ r\ of\ A) \cdot (column\ c\ of\ B) \)
      • E.g. \( \begin{pmatrix} 1 & 0 & 2 \\\ 3 & 1 & 0 \\\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 \\\ 5 & 0 \\\ 1 & 3 \end{pmatrix} = \begin{pmatrix} \begin{pmatrix} 1 & 0 & 2 \end{pmatrix} \cdot \begin{pmatrix} 2 \\\ 5 \\\ 1 \end{pmatrix} & \begin{pmatrix} 1 & 0 & 2 \end{pmatrix} \cdot \begin{pmatrix} 1 \\\ 0 \\\ 3 \end{pmatrix} \\\ \begin{pmatrix} 3 & 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} 2 \\\ 5 \\\ 1 \end{pmatrix} & \begin{pmatrix} 3 & 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} 1 \\\ 0 \\\ 3 \end{pmatrix} \\\ \begin{pmatrix} 2 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} 2 \\\ 5 \\\ 1 \end{pmatrix} & \begin{pmatrix} 2 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\\ 0 \\\ 3 \end{pmatrix} \\\ \end{pmatrix} = \begin{pmatrix} 4 & 7 \\\ 11 & 3 \\\ 5 & 5 \end{pmatrix} \)
  • Algebraic properties of matrix-matrix multiplication.
    • \( A \cdot B \neq B \cdot A \) in general (matrix multiplication is not commutative).
    • \( (AB)C = A(BC) \)
    • \( (AB)^T = B^T A^T \)
    • \( (AB)^T \neq A^T B^T \) in general.
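
A quick numerical check of these properties with numpy (assumed available), on small random matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

print(np.allclose(A @ B, B @ A))               # False: not commutative in general
assert np.allclose((A @ B) @ C, A @ (B @ C))   # associativity
assert np.allclose((A @ B).T, B.T @ A.T)       # transpose reverses the order
```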

Null Space, Solution Space for Matrix Vector Equation

  • The null space of a matrix \( A \) is \( \{ \boldsymbol{u} : A \times \boldsymbol{u} = \boldsymbol{0} \} \), i.e. the set of vectors that give the zero vector when multiplied by \( A \).
    • E.g. \( \begin{pmatrix} 1 & 2 & 4 \\\ 2 & 3 & 9 \end{pmatrix} \times \begin{pmatrix} 6 \\\ -1 \\\ -1 \end{pmatrix} = \begin{pmatrix} 0 \\\ 0 \end{pmatrix} \), where the null space includes \( \begin{pmatrix} 6 \\\ -1 \\\ -1 \end{pmatrix} \).
  • Null space of a matrix \( \begin{pmatrix} \boldsymbol{a}_1 \\\ ... \\\ \boldsymbol{a}_m \end{pmatrix} \) equals the solution set of the homogeneous linear system \( \begin{pmatrix} \boldsymbol{a}_1 \cdot x = 0 \\\ ... \\\ \boldsymbol{a}_m \cdot x = 0 \end{pmatrix} \).
  • If \( \boldsymbol{u}_1 \) is a solution to \( A \boldsymbol{x} = \boldsymbol{b} \), then solution set is \( \boldsymbol{u}_1 + V \) where \( V = Null A \).
    • E.g. given null space of \( \begin{pmatrix} 1 & 2 & 4 \\\ 2 & 3 & 9 \end{pmatrix} \) is \( Span \{ \begin{pmatrix} 6 \\\ -1 \\\ -1 \end{pmatrix} \} \), and one solution to \( \begin{pmatrix} 1 & 2 & 4 \\\ 2 & 3 & 9 \end{pmatrix} \times \boldsymbol{x} = \begin{pmatrix} 1 \\\ 1 \end{pmatrix} \) is \( x = \begin{pmatrix} -1 \\\ 1 \\\ 0 \end{pmatrix} \), therefore solution set for \( \boldsymbol{x} \) is \( \begin{pmatrix} -1 \\\ 1 \\\ 0 \end{pmatrix} + Span \{ \begin{pmatrix} 6 \\\ -1 \\\ -1 \end{pmatrix} \} \).
    • Examples of solution \( \boldsymbol{x} \) are:
      • \( \begin{pmatrix} -1 \\\ 1 \\\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\\ 0 \\\ 0 \end{pmatrix} \)
      • \( \begin{pmatrix} -1 \\\ 1 \\\ 0 \end{pmatrix} + \begin{pmatrix} 6 \\\ -1 \\\ -1 \end{pmatrix} \)
      • \( \begin{pmatrix} -1 \\\ 1 \\\ 0 \end{pmatrix} + 2 \begin{pmatrix} 6 \\\ -1 \\\ -1 \end{pmatrix} \)
  • \( A \boldsymbol{x} = \boldsymbol{b} \) has at most one solution if and only if \( Null A \) is a trivial vector space.
  • Knowing the relationships governing \( A \boldsymbol{x} = \boldsymbol{b} \) allows us to solve for any one of \( A \), \( \boldsymbol{x} \) and \( \boldsymbol{b} \) given the other two.
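
A numpy sketch (numpy assumed) of the example above: the null space is read off from the SVD, and shifting one particular solution by null-space vectors produces further solutions.

```python
import numpy as np

A = np.array([[1.0, 2.0, 4.0],
              [2.0, 3.0, 9.0]])
b = np.array([1.0, 1.0])

# Null space via the SVD: the rows of Vt beyond the rank span Null A.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
null_basis = Vt[rank:]            # one vector here, parallel to [6, -1, -1]
print(null_basis)

u1 = np.array([-1.0, 1.0, 0.0])   # one particular solution of A x = b
assert np.allclose(A @ u1, b)

# Every u1 + alpha * (null-space vector) is also a solution.
for alpha in (0.0, 1.0, 2.0):
    x = u1 + alpha * np.array([6.0, -1.0, -1.0])
    assert np.allclose(A @ x, b)
```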

Inner Products, Outer Products

  • Inner products
    • Let \( \boldsymbol{u} \) and \( \boldsymbol{v} \) be two column vectors, the inner product is \( \boldsymbol{u}^T \boldsymbol{v} \).
    • E.g. \( \boldsymbol{u}^T \boldsymbol{v} = \begin{pmatrix} u_1 & u_2 & u_3 \end{pmatrix} \begin{pmatrix} v_1 \\\ v_2 \\\ v_3 \end{pmatrix} = \begin{pmatrix} u_1 v_1 + u_2 v_2 + u_3 v_3 \end{pmatrix} \)
  • Outer products
    • Let \( \boldsymbol{u} \) and \( \boldsymbol{v} \) be two column vectors, the outer product is \( \boldsymbol{u} \boldsymbol{v}^T \).
    • E.g. \( \boldsymbol{u} \boldsymbol{v}^T = \begin{pmatrix} u_1 \\\ u_2 \\\ u_3 \end{pmatrix} \begin{pmatrix} v_1 & v_2 & v_3 \end{pmatrix} = \begin{pmatrix} u_1 v_1 & u_1 v_2 & u_1 v_3 \\\ u_2 v_1 & u_2 v_2 & u_2 v_3 \\\ u_3 v_1 & u_3 v_2 & u_3 v_3 \end{pmatrix} \)
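
Both products are one-liners in numpy (assumed available): np.inner gives the scalar, np.outer the matrix.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

print(np.inner(u, v))   # scalar: u1*v1 + u2*v2 + u3*v3 = 32.0
print(np.outer(u, v))   # 3x3 matrix with entry (i, j) = u_i * v_j
```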

Matrix Inverse

  • The inverse of matrix \( A \) is denoted as \( A^{-1} \).
    • A matrix that can be inverted is an invertible matrix.
  • A matrix that is not invertible is also known as a singular matrix.
    • An invertible matrix is also known as a non-singular matrix.
  • Matrix invertibility
    • Let \( A \) be an \( R \times C \) matrix, then \( A \) is invertible if and only if \( |R| = |C| \) and the columns of \( A \) are linearly independent.
  • The transpose of an invertible matrix is invertible.
  • There are multiple ways to obtain the inverse of a matrix (when it exists):
    1. Using determinants.
    2. Using Gauss-Jordan method (i.e. elementary row operations).
    3. Using minors, cofactors and adjugate.
  • If \( A \) is invertible, then \( A \boldsymbol{x} = \boldsymbol{b} \) has exactly one \( \boldsymbol{x} \) solution for each \( \boldsymbol{b} \).
  • \( AB \) is an invertible matrix if and only if both \( A \) and \( B \) are invertible matrices.
  • If the \( R \times C \) matrix \( A \) has an inverse \( A^{-1} \), then \( A A^{-1} \) is the \( R \times R \) identity matrix.
    • \( A A^{-1} = A^{-1} A = I \)
  • Matrix \( A \) is invertible if and only if the function \( f(\boldsymbol{x}) = A \boldsymbol{x} \) is an invertible function, i.e. the function is one-to-one and onto.
  • The matrix inverse is very useful for solving matrix equations.
    • E.g. \( \begin{eqnarray} AX &=& B \\\ A^{-1} AX &=& A^{-1} B \\\ IX &=& A^{-1} B \\\ X &=& A^{-1} B \end{eqnarray} \)
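
A numpy sketch of the derivation above. Note that in numerical work np.linalg.solve is preferred over forming \( A^{-1} \) explicitly; both are shown for comparison.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.eye(2)                                  # any right-hand side works

A_inv = np.linalg.inv(A)
assert np.allclose(A @ A_inv, np.eye(2))       # A A^{-1} = I
assert np.allclose(A_inv @ A, np.eye(2))       # A^{-1} A = I

X1 = A_inv @ B                 # X = A^{-1} B, as in the derivation above
X2 = np.linalg.solve(A, B)     # solves A X = B without inverting A
assert np.allclose(X1, X2)
```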

Functions

Linear Functions

  • Linear function
    • Let \( V \) and \( W \) be vector spaces over a field \( F \), and given a function \( f : V \rightarrow W \), \( f \) is a linear function if it satisfies 2 conditions.
      1. For every vector \( \boldsymbol{v} \) in \( V \) and every scalar \( \alpha \) in \( F \), \( f(\alpha \boldsymbol{v}) = \alpha f(\boldsymbol{v}) \).
      2. For every two vectors \( \boldsymbol{u} \) and \( \boldsymbol{v} \) in \( V \), \( f (\boldsymbol{u} + \boldsymbol{v}) = f(\boldsymbol{u}) + f(\boldsymbol{v}) \).
  • A linear function \( f \) is able to map from vector \( \boldsymbol{v} \) to vector \( \boldsymbol{w} \) by multiplication with matrix \( M \).
    • E.g. \( f(\boldsymbol{v}) = M \boldsymbol{v} = \boldsymbol{w} \)
  • We can solve for the matrix \( M \) in \( f(\boldsymbol{x}) = M \boldsymbol{x} \) by plugging in the standard generators \( \boldsymbol{e}_1 = [1,0,...,0,0], ..., \boldsymbol{e}_n = [0,0,...,0,1] \), provided that \( M \) exists. (See the sketch after this list.)
    • I.e. we can solve for \( M \) if we know the value of \( f(\boldsymbol{x}) \) for enough inputs \( \boldsymbol{x} \).
  • Kernel of a linear function \( f \) is \( \{ \boldsymbol{v} : f(\boldsymbol{v}) = 0 \} \).
    • For a function \( f(\boldsymbol{x}) = M \boldsymbol{x} \), \( Kernel f = Null M \).
  • A linear function is one-to-one if and only if its kernel is a trivial vector space.
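
A sketch of recovering \( M \) by plugging in the standard generators, as described above; numpy is assumed, and `f` here is an illustrative example of a linear function.

```python
import numpy as np

def f(x):
    """An example linear function from R^3 to R^2; we recover its matrix."""
    return np.array([x[0] + 2 * x[1] + 4 * x[2],
                     2 * x[0] + 3 * x[1] + 9 * x[2]])

n = 3
# Column j of M is f(e_j), where e_j is the j-th standard generator.
M = np.column_stack([f(np.eye(n)[j]) for j in range(n)])
print(M)                       # [[1. 2. 4.]
                               #  [2. 3. 9.]]
x = np.array([1.0, 1.0, 1.0])
assert np.allclose(M @ x, f(x))
```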

Function Composition

  • Function composition
    • Given functions \( f(\boldsymbol{y}) = A \boldsymbol{y} \), \( g(\boldsymbol{x}) = B \boldsymbol{x} \) and \( h(\boldsymbol{x}) = (AB) \boldsymbol{x} \), \( f \circ g = h \). I.e. \( f(g(\boldsymbol{x})) = h(\boldsymbol{x}) \)
  • Function inverse
    • Given functions \( f(\boldsymbol{y}) = A \boldsymbol{y} \), \( g(\boldsymbol{x}) = B \boldsymbol{x} \) and \( h(\boldsymbol{x}) = (AB) \boldsymbol{x} \).
    • If \( f \) and \( g \) are functional inverses of each other, then \( A \) and \( B \) are matrix inverses of each other.

Linear Function Invertibility

  • For a linear function \( f : V \rightarrow W \), the image of \( f \) is a subspace of \( W \).
  • \( f \) is one-to-one if and only if \( dim Ker f = 0 \).
  • \( f \) is onto if \( dim Im f = dim W \).
  • Linear-function invertibility theorem
    • Let \( f : V \rightarrow W \) be a linear function, then \( f \) is invertible if and only if \( dim Ker f = 0 \) and \( dim V = dim W \).
    • \( f \) is invertible if \( dim Ker f = 0 \) and \( dim Im f = dim W \).
  • Kernel-image theorem
    • For any linear function \( f : V \rightarrow W \), \( dim Ker f + dim Im f = dim V \).
  • The nullity of matrix \( A \) is \( dim Null A \).
  • Rank-nullity theorem
    • For any n-column matrix \( A \), \( nullity A + rank A = n \)

Basis

Coordinate Representation

  • Coordinate representation
    • The coordinate representation of \( \boldsymbol{v} \) in terms of generators \( \boldsymbol{a}_1, ..., \boldsymbol{a}_n \) is the vector \( [ \alpha_1, ..., \alpha_n ] \) such that
      • \( \boldsymbol{v} = \alpha_1 \boldsymbol{a}_1 + ... + \alpha_n \boldsymbol{a}_n \)
      • In this context, the coefficients \( \alpha_1, ..., \alpha_n \) are the coordinates.
    • E.g. when expressing \( \boldsymbol{v} = [1,3,5,3] \) using the standard generators \( [1,0,0,0], [0,1,0,0], [0,0,1,0], [0,0,0,1] \),
      • \( \boldsymbol{v} = [1,3,5,3] = 1[1,0,0,0] + 3[0,1,0,0] + 5[0,0,1,0] + 3[0,0,0,1] \)
      • Where the coordinate representation of \( \boldsymbol{v} \) is \( [1,3,5,3] \).
    • E.g. when expressing \( \boldsymbol{v} = [6,3,2,5] \) using the generators \( [2,2,2,3], [1,0,-1,0], [0,1,0,1] \),
      • \( \boldsymbol{v} = [6,3,2,5] = 2[2,2,2,3] + 2[1,0,-1,0] - 1[0,1,0,1] \)
      • Where the coordinate representation of \( \boldsymbol{v} \) is \( [2,2,-1] \).
    • The statement "\( \boldsymbol{u} \) is the coordinate representation of \( \boldsymbol{v} \) in terms of \( \boldsymbol{a}_1, ..., \boldsymbol{a}_n \)" can be written as the matrix-vector equation \( A \boldsymbol{u} = \boldsymbol{v} \), where the columns of \( A \) are the generators.
      • To go from the coordinate representation \( \boldsymbol{u} \) to the vector \( \boldsymbol{v} \), we compute the product \( A \boldsymbol{u} \).
      • To go from the vector \( \boldsymbol{v} \) to \( \boldsymbol{u} \), we solve \( A \boldsymbol{x} = \boldsymbol{v} \) for \( \boldsymbol{x} \). (See the sketch after this list.)
    • Standard generators are usually called the standard basis vectors.
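
A numpy sketch of the second example above: the coordinate representation of \( \boldsymbol{v} = [6,3,2,5] \) is found by solving \( A \boldsymbol{u} = \boldsymbol{v} \); since \( A \) is 4 × 3 rather than square, least squares (np.linalg.lstsq) is used.

```python
import numpy as np

# The columns of A are the generators [2,2,2,3], [1,0,-1,0], [0,1,0,1].
A = np.array([[2.0,  1.0, 0.0],
              [2.0,  0.0, 1.0],
              [2.0, -1.0, 0.0],
              [3.0,  0.0, 1.0]])
v = np.array([6.0, 3.0, 2.0, 5.0])

# v -> u: solve A u = v (least squares, because A is not square).
u, *_ = np.linalg.lstsq(A, v, rcond=None)
print(np.round(u, 6))          # [ 2.  2. -1.]

# u -> v: just multiply.
assert np.allclose(A @ u, v)
```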

Linear Dependence

  • Superfluous vector
    • For any set \( S \) and any vector \( \boldsymbol{v} \in S \), if \( \boldsymbol{v} \) can be written as a linear combination of the other vectors in \( S \), then \( Span(S - \{ \boldsymbol{v} \}) = Span S \).
    • I.e. vector \( \boldsymbol{v} \) contributes nothing to \( Span S \) that the other vectors do not already provide.
  • Linear dependence
    • Vectors \( \boldsymbol{v}_1, ..., \boldsymbol{v}_n \) are linearly dependent if the zero vector can be written as a non-trivial linear combination of the vectors: \( \boldsymbol{0} = \alpha_1 \boldsymbol{v}_1 + ... + \alpha_n \boldsymbol{v}_n \).
    • E.g. vectors \( [1,0,0], [0,2,0], [2,4,0] \) are linearly dependent as shown below:
      • \( 2[1,0,0] + 2[0,2,0] - 1[2,4,0] = [0,0,0] \)
    • E.g. vectors \( [1,0,0], [0,2,0], [0,0,4] \) are linearly independent, as the only linear combination that equals the zero vector is the trivial linear combination, where every coefficient is zero.
      • \( \alpha_1 [1,0,0] + \alpha_2 [0,2,0] + \alpha_3 [0,0,4] = [ \alpha_1, 2 \alpha_2, 4 \alpha_3 ] \)
      • The only way to have \( [ \alpha_1, 2 \alpha_2, 4 \alpha_3 ] = [0,0,0] \) is \( \alpha_1 = \alpha_2 = \alpha_3 = 0 \), which is the trivial linear combination.
    • This also means that vectors \( \boldsymbol{v}_1, ..., \boldsymbol{v}_n \) are linearly dependent if and only if the null space of the matrix with these vectors as columns is non-trivial.
  • Heredity of linear dependence
    • If a finite set \( S \) of vectors is linearly dependent and \( S \) is a subset of \( T \) then \( T \) is linearly dependent.

Unique Representation

  • Unique representation
    • Let \( \boldsymbol{a}_1, ..., \boldsymbol{a}_n \) be a basis for \( V \). For any vector \( \boldsymbol{v} \in V \), there is exactly one representation of \( \boldsymbol{v} \) in terms of the basis vectors.
    • I.e. if we represent a 2-vector \( \boldsymbol{v} \) with basis vectors \( [1,0] \) and \( [0,1] \), this will be a unique representation.
  • Vector basis
    • A vector basis of a vector space \( V \) is defined as a subset \( \boldsymbol{v}_1, ..., \boldsymbol{v}_n \) of vectors in \( V \) that are linearly independent and span \( V \).
  • Change of basis
    • Given two bases of \( F^M \), \( \boldsymbol{a}_1, ..., \boldsymbol{a}_n \) and \( \boldsymbol{c}_1, ..., \boldsymbol{c}_n \), there is a matrix \( B \) such that multiplication by \( B \) converts from one coordinate representation to the other.
      • To convert coordinates in terms of \( \boldsymbol{a}_1, ..., \boldsymbol{a}_n \) into coordinates in terms of \( \boldsymbol{c}_1, ..., \boldsymbol{c}_n \): \( B = \begin{pmatrix} \boldsymbol{c}_1 & ... & \boldsymbol{c}_n \end{pmatrix}^{-1} \begin{pmatrix} \boldsymbol{a}_1 & ... & \boldsymbol{a}_n \end{pmatrix} \)
    • E.g. to map from coordinate representations in terms of \( [1,2,3], [2,1,0], [0,1,4] \) to coordinate representations in terms of \( [2,0,1], [0,1,-1], [1,2,0] \), we need to obtain \( B \). (Verified numerically in the sketch after this list.)
      • \( B = \begin{pmatrix} 2 & 0 & 1 \\\ 0 & 1 & 2 \\\ 1 & -1 & 0 \end{pmatrix}^{-1} \begin{pmatrix} 1 & 2 & 0 \\\ 2 & 1 & 1 \\\ 3 & 0 & 4 \end{pmatrix} = \begin{pmatrix} \frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\\ \frac{2}{3} & -\frac{1}{3} & -\frac{4}{3} \\\ -\frac{1}{3} & \frac{2}{3} & \frac{2}{3} \end{pmatrix} \begin{pmatrix} 1 & 2 & 0 \\\ 2 & 1 & 1 \\\ 3 & 0 & 4 \end{pmatrix} = \begin{pmatrix} -1 & 1 & -\frac{5}{3} \\\ -4 & 1 & -\frac{17}{3} \\\ 3 & 0 & \frac{10}{3} \end{pmatrix} \)
  • Simplified exchange lemma
    • Suppose \( S \) is a set of vectors, and \( \boldsymbol{z} \) is a nonzero vector in \( Span S \). Then there is a vector \( \boldsymbol{w} \) in \( S \) such that \( Span(S \cup \{ \boldsymbol{z} \} - \{ \boldsymbol{w} \}) = Span S \).
  • Exchange lemma
    • Suppose \( S \) is a set of vectors and \( A \) is a subset of \( S \). Suppose \( \boldsymbol{z} \) is a vector in \( Span S \) such that \( A \cup \{ \boldsymbol{z} \} \) is linearly independent.
    • Then there is a vector \( \boldsymbol{w} \in S - A \) such that \( Span S = Span(S \cup \{ \boldsymbol{z} \} - \{ \boldsymbol{w} \}) \).
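
A numpy sketch of the change of basis above; np.linalg.solve is used instead of an explicit inverse, and the final assert checks that the converted coordinates represent the same underlying vector.

```python
import numpy as np

# Columns of A_mat are the a-basis; columns of C_mat are the c-basis.
A_mat = np.array([[1.0, 2.0, 0.0],
                  [2.0, 1.0, 1.0],
                  [3.0, 0.0, 4.0]])
C_mat = np.array([[2.0,  0.0, 1.0],
                  [0.0,  1.0, 2.0],
                  [1.0, -1.0, 0.0]])

# B converts a-coordinates into c-coordinates: u_c = B u_a.
B = np.linalg.solve(C_mat, A_mat)      # same as inv(C_mat) @ A_mat
print(np.round(B, 6))

u_a = np.array([1.0, 0.0, 0.0])        # arbitrary a-coordinates
v = A_mat @ u_a                        # the underlying vector
assert np.allclose(C_mat @ (B @ u_a), v)
```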

Dimension

Dimension, Rank

  • Dimension
    • The dimension of a vector space is the size of a basis for that vector space.
    • The dimension of a vector space \( V \) is written as \( dim V \).
  • Morphing lemma
    • Suppose \( S \) is a set of vectors, and \( B \) is a linearly independent set of vectors in \( Span S \).
    • Then \( |S| \geq |B| \).
  • Rank
    • The rank of a set \( S \) of vectors is the dimension of \( Span S \).
    • The rank of a set \( S \) is written as \( rank S \).
    • E.g. the vectors \( [1,0,0], [0,2,0], [2,4,0] \) are linearly dependent, where first two of these vectors form a basis for the span of all three.
      • So the rank is 2. (Not 3)
    • E.g. the vector space \( Span \{ [0,0,0] \} \) is spanned by an empty set of vectors.
      • Therefore the rank of \( \{ [0,0,0] \} \) is zero.
  • For a matrix \( M \), the row rank of \( M \) is the rank of its rows, and the column rank of \( M \) is the rank of its columns.
    • The row rank of \( M \) is the dimension of \( Row M \), and the column rank of \( M \) is the dimension of \( Col M \).
    • E.g. given the matrix \( M = \begin{pmatrix} 1 & 0 & 0 \\\ 0 & 2 & 0 \\\ 2 & 4 & 0 \end{pmatrix} \),
      • The row rank is 2, because the last row is a linear combination of the first two rows.
      • The column rank is 2, because the last column is a zero vector, which is not needed to span the column space.
  • For every matrix \( M \), row rank and column rank are always equal.
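
The rank example above can be checked with numpy's np.linalg.matrix_rank (numpy assumed).

```python
import numpy as np

M = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [2.0, 4.0, 0.0]])

print(np.linalg.matrix_rank(M))   # 2: row rank and column rank agree
```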

Direct Sum

  • A direct sum is a sum of vector subspaces (of a common vector space), not of individual vectors.
  • Direct sum
    • If \( U \) and \( V \) share only the zero vector, then we define the direct sum of \( U \) and \( V \) to be the set \( \{ \boldsymbol{u} + \boldsymbol{v} : \boldsymbol{u} \in U, \boldsymbol{v} \in V \} \), written as \( U \oplus V \).
  • The union of a basis of \( U \) and a basis of \( V \) is a basis of \( U \oplus V \).
  • If \( U \oplus V = W \), we say \( U \) and \( V \) are complementary subspaces of \( W \).

Annihilator

  • A vector space can be represented in 2 ways:
    • As the solution set of homogeneous linear system, \( \boldsymbol{a}_1 \cdot \boldsymbol{x} = 0, ..., \boldsymbol{a}_m \cdot \boldsymbol{x} = 0 \).
    • As a span, \( Span \{ \boldsymbol{b}_1, ..., \boldsymbol{b}_k \} \).
  • An annihilator converts the linear system representation into the span representation for a vector space.
  • Annihilator
    • The set of vectors \( \boldsymbol{u} \) such that \( \boldsymbol{u} \cdot \boldsymbol{v} = 0 \) for every vector \( \boldsymbol{v} \) in \( V \) is called the annihilator of \( V \), written as \( V^{\circ} \).

Gaussian Elimination

Echelon Form

  • Echelon form
    • An \( m \times n \) matrix \( A \) is in echelon form if it satisfies the following condition: for any row, if that row's first nonzero entry is in position \( k \) then every previous row's first nonzero entry is in some position less than \( k \).
    • Echelon form is a generalisation of triangular matrices.
  • Uses of echelon form
    • If a matrix is in echelon form, the nonzero rows form a basis for the row space.
    • E.g. a basis for the row space of \( \begin{pmatrix} 0 & 2 & 3 & 0 & 5 & 6 \\\ 0 & 0 & 1 & 0 & 3 & 4 \\\ 0 & 0 & 0 & 0 & 0 & 0 \\\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \) is \( \{ [0,2,3,0,5,6], [0,0,1,0,3,4] \} \).
    • E.g. if every row is nonzero, then all the rows form a basis for the row space.

Gaussian Elimination

  • Uses
    • Finding a basis for the span of given vectors.
    • Solving a matrix equation, which is the same as solving a system of linear equations, which is the same as expressing a given vector as a linear combination of other given vectors.
    • Finding a basis for the null space of a matrix, which is the same as finding a basis for the solution set of a homogeneous linear system.
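
A compact sketch of Gaussian elimination to echelon form with partial pivoting, in plain Python; this is illustrative rather than production code.

```python
def to_echelon(M):
    """Return a copy of M (a list of row lists) reduced to echelon form.
    Partial pivoting: at each step, pick the row with the largest entry
    in the current column, which reduces rounding error."""
    M = [row[:] for row in M]                  # work on a copy
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for c in range(cols):
        if pivot_row >= rows:
            break
        # Choose the pivot: largest |entry| in column c, at or below pivot_row.
        r = max(range(pivot_row, rows), key=lambda i: abs(M[i][c]))
        if abs(M[r][c]) < 1e-12:
            continue                           # no pivot in this column
        M[pivot_row], M[r] = M[r], M[pivot_row]
        # Eliminate the entries below the pivot.
        for i in range(pivot_row + 1, rows):
            factor = M[i][c] / M[pivot_row][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[pivot_row])]
        pivot_row += 1
    return M

A = [[0.0, 2.0, 3.0],
     [1.0, 1.0, 1.0],
     [2.0, 2.0, 5.0]]
for row in to_echelon(A):
    print([round(x, 6) for x in row])   # an echelon form of A
```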