
3. The Hessian Matrix and the Method of Lagrange Multipliers


3.1. Second-Order Partial Derivatives and the Hessian Matrix

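For a function of several variables \(z=f(x,y)\), the differential is \(dz=\nabla f\cdot(dx,dy)^T\), where the Jacobian matrix, here the gradient \(\nabla f\), plays the role of the derivative of \(z\). Differentiating \(dz\) once more: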

\[ d^2z= \begin{pmatrix} \frac{\partial^2 f}{\partial x^2}dx+\frac{\partial^2f}{\partial x\partial y}dy \\ \frac{\partial^2f}{\partial y\partial x}dx+\frac{\partial^2f}{\partial y^2}dy \end{pmatrix}^T\cdot \begin{pmatrix} dx \\ dy \end{pmatrix}= \mathbf h^TH\mathbf{h}\ ,\\ H= \begin{pmatrix} \frac{\partial^2f}{\partial x^2} & \frac{\partial^2f}{\partial x\partial y}\\ \frac{\partial^2f}{\partial y \partial x} & \frac{\partial^2f}{\partial y^2} \end{pmatrix} \]

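Here \(\mathbf h=(dx,dy)^T\). The second differential is thus a quadratic form in \(\mathbf h\), and its coefficient matrix \(H\) is the Hessian matrix.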

If \(f_{xy}\) and \(f_{yx}\) exist near \((x_0,y_0)\) and are continuous at that point, then \(f_{xy}=f_{yx}\) there. Hence \(H=H^T\): the Hessian matrix is real and symmetric.

For the function \(z=f(x,y)\), Taylor's formula gives the second-order approximation:

\[ \Delta z=\nabla f\cdot\mathbf h+\frac{1}{2}\mathbf h^TH\mathbf h+o(\|\mathbf h\|^2) \]
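The second-order approximation can be checked numerically. The sketch below is only an illustration: the test function \(f(x,y)=e^x\cos y\), the expansion point \((0,0)\), and the helper names `taylor2` and `error` are assumptions chosen for this example, and the gradient and Hessian are computed by hand.

```python
import math

def f(x, y):
    # Illustrative test function (an assumption, not from the text).
    return math.exp(x) * math.cos(y)

# Hand-computed at (0, 0): grad f = (1, 0), H = [[1, 0], [0, -1]].
grad = (1.0, 0.0)
H = ((1.0, 0.0), (0.0, -1.0))

def taylor2(h):
    """First- plus second-order terms: grad . h + 0.5 * h^T H h."""
    linear = grad[0] * h[0] + grad[1] * h[1]
    quad = sum(h[i] * H[i][j] * h[j] for i in range(2) for j in range(2))
    return linear + 0.5 * quad

def error(h):
    """Absolute gap between the true increment and the approximation."""
    exact = f(h[0], h[1]) - f(0.0, 0.0)
    return abs(exact - taylor2(h))

err_big = error((0.1, 0.2))      # a larger step
err_small = error((0.01, 0.02))  # the same direction, ten times shorter

print(err_big, err_small, err_big / err_small)
```

Shrinking \(\mathbf h\) by a factor of ten should shrink the error far more than a hundredfold, which is what a remainder of order \(o(\|\mathbf h\|^2)\) predicts.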

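It is not hard to show that a necessary condition for \((x_0,y_0)\) to be an extremum point is \(\nabla f=\mathbf 0\). At such a point the linear term vanishes, so the increment of \(z\) is approximately \(\frac{1}{2}\mathbf h^TH\mathbf h\). If \(\mathbf h^TH\mathbf h>0\) for every \(\mathbf h\neq\mathbf 0\), i.e. \(H\) is positive definite, the point is a local minimum; if \(\mathbf h^TH\mathbf h<0\) for every \(\mathbf h\neq\mathbf 0\), i.e. \(H\) is negative definite, the point is a local maximum. If instead there exist \(\mathbf p,\mathbf q\) such that \(\mathbf p^TH\mathbf p<0<\mathbf q^TH\mathbf q\), the point is not an extremum.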

Let \(\lambda\) be an eigenvalue of \(H\) with eigenvector \(\mathbf x\). Then:

\[ H\mathbf x=\lambda\mathbf x \]

Multiplying both sides on the left by \(\mathbf x^T\):

\[ \mathbf x^TH\mathbf x= \mathbf x^T\lambda\mathbf x=\lambda ||\mathbf x||^2 \]

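So \(H\) is positive (negative) definite exactly when all of its eigenvalues are positive (negative). In the \(2\times 2\) case the eigenvalues \(\lambda_i\) must satisfy \(\lambda_1+\lambda_2>0\ (\lambda_1+\lambda_2<0)\) and \(\lambda_1\lambda_2>0\), that is, \(\operatorname{tr}(H)>0\ (\operatorname{tr}(H)<0)\) and \(\det(H)>0\).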
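The link between eigenvalues, trace, and determinant can be illustrated with a small sketch. The function names `eigenvalues_sym2` and `is_positive_definite` and the sample matrix are assumptions made for this example; the eigenvalue formula is the standard closed form for a symmetric \(2\times 2\) matrix.

```python
import math

def eigenvalues_sym2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]]."""
    tr = a + c
    # The discriminant tr^2 - 4*det equals (a - c)^2 + 4*b^2, never negative,
    # so a real symmetric matrix always has real eigenvalues.
    root = math.sqrt((a - c) ** 2 + 4.0 * b * b)
    return (tr - root) / 2.0, (tr + root) / 2.0

def is_positive_definite(a, b, c):
    """Trace/determinant test for [[a, b], [b, c]]."""
    return (a + c) > 0 and (a * c - b * b) > 0

# Sample matrix H = [[2, 1], [1, 3]] with tr(H) = 5 and det(H) = 5.
lam1, lam2 = eigenvalues_sym2(2.0, 1.0, 3.0)
print(lam1 + lam2, lam1 * lam2)            # sum matches tr, product matches det
print(is_positive_definite(2.0, 1.0, 3.0))
```

For this sample matrix both eigenvalues are positive, and the trace and determinant test reaches the same verdict without computing them.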

Therefore a sufficient condition for \(z=f(x,y)\) to attain a local minimum (maximum) at \((x_0,y_0)\) is:

  1. \(\nabla f(x_0,y_0)=\mathbf 0\)
  2. the Hessian \(H_f\) is positive (negative) definite at \((x_0,y_0)\)
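The two conditions above can be sketched in code. Everything here is an illustrative assumption: the test function \(f(x,y)=x^3-3x+y^2\), whose critical points \((\pm 1,0)\) are found by hand from \(\nabla f=(3x^2-3,\,2y)=\mathbf 0\), the finite-difference helper `hessian_fd`, and the tolerance used in `classify`.

```python
def f(x, y):
    # Illustrative test function (an assumption, not from the text).
    return x**3 - 3.0 * x + y**2

def hessian_fd(func, x, y, eps=1e-4):
    """Central finite-difference estimate of (f_xx, f_xy, f_yy)."""
    fxx = (func(x + eps, y) - 2.0 * func(x, y) + func(x - eps, y)) / eps**2
    fyy = (func(x, y + eps) - 2.0 * func(x, y) + func(x, y - eps)) / eps**2
    fxy = (func(x + eps, y + eps) - func(x + eps, y - eps)
           - func(x - eps, y + eps) + func(x - eps, y - eps)) / (4.0 * eps**2)
    return fxx, fxy, fyy

def classify(fxx, fxy, fyy, tol=1e-3):
    """Classify a critical point from the 2x2 Hessian via trace and determinant."""
    tr = fxx + fyy
    det = fxx * fyy - fxy**2
    if det > tol:
        return "minimum" if tr > 0 else "maximum"
    if det < -tol:
        return "saddle"
    return "inconclusive"  # det close to 0: the second-order test says nothing

for point in [(1.0, 0.0), (-1.0, 0.0)]:
    print(point, classify(*hessian_fd(f, *point)))
```

By hand, the Hessian is \(\operatorname{diag}(6,2)\) at \((1,0)\) and \(\operatorname{diag}(-6,2)\) at \((-1,0)\), so the first point is a minimum and the second is a saddle. The `inconclusive` branch reflects that the test is only sufficient: when \(\det(H)=0\) it gives no answer.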

3.2. The Method of Lagrange Multipliers

Consider the following constrained extremum problem:

\[ \left\{\begin{matrix} z=f(x,y) \\ g(x,y)=0 \end{matrix}\right. \]

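The normal vector \(\mathbf r\) of the curve \(g(x,y)=0\) satisfies \(\mathbf r \parallel \nabla g\). Intuitively, at a point \((x_0,y_0)\) on the curve there is always a small displacement \(\mathbf h\) tangent to the curve, so that the displaced point still satisfies the constraint to first order; such an \(\mathbf h\) is perpendicular to \(\nabla g\). For \(z=f(x,y)\) to increase or decrease while the constraint \(g(x,y)=0\) holds, this \(\mathbf h\) must make an acute or an obtuse angle with \(\nabla f\). At a constrained extremum no admissible \(\mathbf h\) can do either, so \(\mathbf h \perp \nabla f\). Since also \(\mathbf h \perp \nabla g\), and assuming \(\nabla g\neq\mathbf 0\), it follows that \(\nabla f \parallel \nabla g\).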

A constrained extremum therefore satisfies the following system of equations:

\[ \left\{\begin{matrix} \nabla f=\lambda\nabla g \\ g(x,y)=0 \end{matrix}\right. \]
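As a rough numerical illustration of the system above, the sketch below takes the assumed example \(f(x,y)=x+y\) with constraint \(g(x,y)=x^2+y^2-1=0\). It finds the constrained maximum by a brute-force scan of the unit circle, then checks that the two gradients are parallel there. The helper names and the grid size are choices made only for this example.

```python
import math

def f(x, y):
    return x + y

def grad_f(x, y):
    return (1.0, 1.0)

def grad_g(x, y):
    # Gradient of the assumed constraint g(x, y) = x**2 + y**2 - 1.
    return (2.0 * x, 2.0 * y)

def cross2(u, v):
    """z-component of the 2D cross product; zero when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]

# Brute-force scan of the constraint curve, parametrized as (cos t, sin t).
n = 100_000
best_t = max((2.0 * math.pi * k / n for k in range(n)),
             key=lambda t: f(math.cos(t), math.sin(t)))
x_star, y_star = math.cos(best_t), math.sin(best_t)

# At the constrained maximum the gradients should be (nearly) parallel.
parallel_residual = cross2(grad_f(x_star, y_star), grad_g(x_star, y_star))

# Multiplier read off from the first component of grad f = lambda * grad g.
lam = grad_f(x_star, y_star)[0] / grad_g(x_star, y_star)[0]

print(x_star, y_star, parallel_residual, lam)
```

Solving the system by hand gives \(x=y=\lambda=\tfrac{1}{\sqrt 2}\) at the maximum, which the scan should reproduce up to grid and rounding error. A real solver would attack the system \(\nabla f=\lambda\nabla g,\ g=0\) directly instead of scanning the curve.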