Linear Algebra(4):线性系统

线性代数

Linear System

什么是线性系统

看待矩阵非常重要的一个视角就是将其看作对系统的描述，准确来说是对线性系统的描述。所谓的线性系统其实就是我们初中小学就接触过的线性方程组。

比如下面的二元一次方程组就是一个线性系统：

$\begin{cases} x+2y=5\\ 3x+4y=6 \end{cases}$

那么什么是线性？用数学语言来看就是未知数只能是一次方项，而像 $x^2-1=0;\sqrt{z}-4=0;sin(x)=\pi$ 这种是非线性方程.而线性系统就是若干个这样的线性方程组织在一起，研究线性系统最重要的原因就是要解决这个线性系统即解线性方程组，

怎么解这个线性系统？初中我们就学过一种叫消元法的思路，比如下面的三元一次方程组：

$\begin{array}{c} \begin{aligned} \begin{cases} x+2y+4z=7\\ 3x+7y+2z=-11\\ 2x+3y+3z=1 \end{cases} \quad \rightarrow \begin{cases} x+2y+4z=7\\ y-10z=-32\\ -y-5z=-13 \end{cases} \quad \rightarrow \begin{cases} x+2y+4z=7\\ y-10z=-32\\ -15z=-45 \end{cases} \end{aligned} \\[1.5em] % 加一点间距 \begin{aligned} \begin{cases} x+2y+4z=7\\ y=-2\\ z=3 \end{cases} \end{aligned} \end{array}$

上面解方程的过程就是消元法.这个过程涉及的主要操作：

一个方程左右两边同时乘以一个常数
一个方程加(减)另一个方程
交换两个方程的位置

这三个基本操作其实就是线性代数矩阵的初等变换。

高斯消元法

上面解决这个线性系统问题并没有使用矩阵，但是我们可以非常方便的使用矩阵表示这个线性系统，相应也可以使用基于矩阵的操作解决这个问题。将上面的线性系统用下面矩阵形式表达：

$\begin{pmatrix} 1 & 2 & 4\\ 3 & 7 & 2\\ 2 & 3 & 3 \end{pmatrix} \begin{pmatrix} x\\ y\\ z \end{pmatrix} =\begin{pmatrix} 7\\ -11\\ 1 \end{pmatrix}$

解决线性系统相关问题时，可以更进一步直接扔掉未知数，把系数矩阵和右侧结果列向量合在一起形成一个增广矩阵：

$\left[ \begin{array}{ccc|c} 1 & 2 & 4 & 7\\ 3 & 7 & 2&-11\\ 2 & 3 & 3&1 \end{array} \right]$

增广矩阵的每一行都对应了线性方程组中的一个方程，之前在线性方程组中所有的操作都可以直接在这个矩阵中完成。

把上面每一步方程组消元过程中对应的增广矩阵写出来：

$\begin{pmatrix} 1 & 2 & 4 & 7\\ 3 & 7 & 2&-11\\ 2 & 3 & 3&1 \end{pmatrix} \quad \rightarrow \begin{pmatrix} 1 & 2 & 4 & 7\\ 0 & 1 & -10&-32\\ 2 & 3 & 3&1 \end{pmatrix} \quad\rightarrow \begin{pmatrix} 1 & 2 & 4 & 7\\ 0 & 1 & -10&-32\\ 0 & -1 & -5&-13 \end{pmatrix} \quad\rightarrow \begin{pmatrix} 1 & 2 & 4 & 7\\ 0 & 1 & -10&-32\\ 0 & 0 & -15&-45 \end{pmatrix}$

$\quad\rightarrow \begin{pmatrix} 1 & 2 & 4 & 7\\ 0 & 1 & -10&-32\\ 0 & 0 & 1&3 \end{pmatrix} \\$

上面的三个基本操作都可以转换为在矩阵上的三个基本操作:

矩阵的某一行乘以一个常数
矩阵的一行加(减)另一行
交换矩阵的两行

同时上面的消元过程就叫做高斯消元法，这个计算过程学过线性代数的话都很熟练. 简单来说高斯消元法就是通过行变换把矩阵变成上三角，在回代求解.

然而，这种方法仍然需要额外的回代过程。所以能不能像得到的矩阵最后一行的样子直接看出来未知数的值?为了进一步简化计算，我们可以继续对矩阵做行变换，使其不仅成为上三角矩阵，而是化为行最简阶梯形。这一方法被称为 高斯-约旦消元法。

高斯-约旦消元法

进一步使用高斯约旦消元法(Gauss-Jordan Elimination)：

$\begin{pmatrix} 1 & 2 & 4 & 7\\ 0 & 1 & -10&-32\\ 0 & 0 & 1&3 \end{pmatrix} \quad\rightarrow \begin{pmatrix} 1 & 2 & 4 & 7\\ 0 & 1 & 0&-2\\ 0 & 0 & 1&3 \end{pmatrix}\quad\rightarrow \begin{pmatrix} 1 & 2 & 0 & -5\\ 0 & 1 & 0&-2\\ 0 & 0 & 1&3 \end{pmatrix} \quad\rightarrow \begin{pmatrix} 1 & 0 & 0 & -1\\ 0 & 1 & 0&-2\\ 0 & 0 & 1&3 \end{pmatrix} \\$

此时就得到了：

$\begin{cases} x=-1\\ y=-2\\ z=3 \end{cases}$

整个过程分成了两部分：

前向过程(从上到下)：选择最上的主元，化为1；主元下面的所有行减去主元所在行的某个倍数，使得主元下面所有元素都为0.
后向过程(从下到上)：选择最下的主元；主元上面的所有行减去主元所在行的某个倍数，使得主元上面所有元素都为0.

当前解决的线性系统还有一个局限性，就是n个未知数相应有n行，并且上面的线性系统是有唯一解的。不过不妨碍我们先尝试代码实现：

from .Matrix import Matrix
from .Vector import Vector

class LinearSystem:
    def __init__(self, A, b):
        assert A.row_num() == len(b),\
        "row number of A must be equal to  the length of b"
        self._m = A.row_num()
        self._n = A.col_num()

        self.Ab = [Vector(A.row_vector(i).underlying_list() + [b[i]]) for i in range(self._m)]
    
    def _max_row(self, index, n):
        best, ret= self.Ab[index][index], index
        for i in range(index+1, n):
            if self.Ab[i][index] > best:
                best, ret = self.Ab[i][index], i
        return ret

    def _forward(self):
        '''前向过程:化成上三角矩阵'''
        n = self._m
        for i in range(n): 
            #Ab[i][i]为主元
            max_row = self._max_row(i, n)
            self.Ab[i], self.Ab[max_row] = self.Ab[max_row], self.Ab[i]

            self.Ab[i] = self.Ab[i]/self.Ab[i][i]
            for j in range(i+1, n):
                self.Ab[j] = self.Ab[j] - self.Ab[j][i]*self.Ab[i]
    
    def _backward(self):
        '''后向过程:化成行最简阶梯形'''
        n = self._m
        for i in range(n-1, 0, -1): 
            for j in range(i-1, 0, -1):
                self.Ab[j] = self.Ab[j] - self.Ab[j][i]*self.Ab[i]

    def gauss_jordan_elimination(self):

        self._forward()  # 前向过程
        self._backward() # 后向过程

    def fancy_print(self):
        for i in range(self._m):
            print(" ".join(str(self.Ab[i][j])for j in range(self._n)),end=" ")
            print("|", self.Ab[i][-1])

测试代码：

from LA.LinearSystem import LinearSystem
from LA.Matrix import Matrix
from LA.Vector import Vector


if __name__ == "__main__":
    A = Matrix([[1,2,4],[3,7,2],[2,3,3]])
    b = Vector([7,-11,1])

    ls = LinearSystem(A, b)
    ls.fancy_print()
    ls.gauss_jordan_elimination()
    ls.fancy_print()

上面代码中的使用的A,b就是上面举的例子，得到打印结果如下：

可以发现初始的增广矩阵为左边，经过高斯约旦消元法后得到结果右边的增广矩阵：

$\begin{pmatrix} 1 & 2 & 4 & 7\\ 3 & 7 & 2&-11\\ 2 & 3 & 3&1 \end{pmatrix} \quad \begin{pmatrix} 1 & 0 & 0 & -1\\ 0 & 1 & 0&-2\\ 0 & 0 & 1&3 \end{pmatrix}$

线性方程组解的结构

上面的例子中线性方程组是有唯一的解的，但实际上一个线性系统还可能没有解或者有无数个解，这些情况如何处理呢？

下面分别是无解和无数组解的例子：

整体来看三种情况：

前面对增广矩阵进行高斯消元的过程本质上是把增广矩阵变换成一个阶梯型矩阵，如何形式化的表示什么是阶梯型矩阵？可以发现如果矩阵存在全0行的话，这个全0行一定在整个矩阵的最下面；其次对于非全0行的每一行，每一行都有非0的元素，其中第一个非零元素即主元为1，主元所在列其他元素均为0.

简而言之上面的过程就是执行高斯-约旦消元法得到的矩阵，叫做行最简形式RREF(reduced row echelon form). 为什么要引入这个概念？因为无论是要求解线性方程组还是要看线性方程组解的结构，都是通过这个行最简形式来看的.

总结来看：

更一般的线性系统求解

上面的例子是方程个数等于未知数个数，接下来看看更一般的情况即方程个数不等于未知数个数的情况：

代码实现：

from .Matrix import Matrix
from .Vector import Vector
from ._global import is_zero

class LinearSystem:
    def __init__(self, A, b):
        assert A.row_num() == len(b),\
        "row number of A must be equal to  the length of b"
        self._m = A.row_num()
        self._n = A.col_num()

        self.Ab = [Vector(A.row_vector(i).underlying_list() + [b[i]]) for i in range(self._m)]
        self.pivots = []
    
    def _max_row(self, index_i, index_j, n):
        best, ret= self.Ab[index_i][index_j], index_i
        for i in range(index_i+1, n):
            if self.Ab[i][index_j] > best:
                best, ret = self.Ab[i][index_j], i
        return ret

    def _forward(self):
        i, k = 0, 0
        while i < self._m and k < self._n: 
            # 看Ab[i][k]是否为主元
            max_row = self._max_row(i, k, self._m)
            self.Ab[i], self.Ab[max_row] = self.Ab[max_row], self.Ab[i]

            if is_zero(self.Ab[i][k]) :
                k += 1
            else:
                self.Ab[i] = self.Ab[i]/self.Ab[i][k]
                for j in range(i+1, self._m):
                    self.Ab[j] = self.Ab[j] - self.Ab[j][k]*self.Ab[i]
                self.pivots.append(k)
                i+=1
    
    def _backward(self):
        n = len(self.pivots)
        for i in range(n-1, -1, -1): 
            k = self.pivots(i)
            for j in range(i-1, -1, -1):
                self.Ab[j] = self.Ab[j] - self.Ab[j][k]*self.Ab[i]

    def gauss_jordan_elimination(self):

        self._forward()
        self._backward()

    def fancy_print(self):
        for i in range(self._m):
            print(" ".join(str(self.Ab[i][j])for j in range(self._n)),end=" ")
            print("|", self.Ab[i][-1])

测试代码，测试用例就是上面的例子：

from LA.LinearSystem import LinearSystem
from LA.Matrix import Matrix
from LA.Vector import Vector


if __name__ == "__main__":
    A = Matrix([[1,-1,2,0,3],[-1,1,0,2,-5],[1,-1,4,2,4],[-2,2,-5,-1,-3]])
    b = Vector([1,5,13,-1])

    ls = LinearSystem(A, b)
    ls.fancy_print()
    ls.gauss_jordan_elimination()
    ls.fancy_print()