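The Hessian matrix (German: Hesse-Matrix; English: Hessian matrix or Hessian; rendered in Chinese as 海森矩阵, 黑塞矩阵, 海塞(赛)矩阵, or 海瑟矩阵) is a square matrix made up of all the second-order partial derivatives of a real-valued function of several variables. It was introduced by the German mathematician Otto Hesse and is named after him.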
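Suppose $f(x_1, x_2, \dots, x_n)$ is a real-valued function. If all second-order partial derivatives of $f$ exist and are continuous on its domain, then the Hessian matrix of $f$ is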
![{\displaystyle \mathbf {H} ={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{n}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{n}}}\\\\\vdots &\vdots &\ddots &\vdots \\\\{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{n}^{2}}}\end{bmatrix}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/23f4db415be866163432946603c07edbc4a21a41)
or, in index notation,
![{\displaystyle \mathbf {H} _{ij}={\frac {\partial ^{2}f}{\partial x_{i}\partial x_{j}}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/8aa2fc260814bfe867df5ee7b9ba4d663771ebae)
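Clearly the Hessian matrix $\mathbf{H}$ is an $n \times n$ square matrix. Its determinant is called the Hessian determinant. Note that in English the word "Hessian" may refer either to this matrix or to its determinant[1].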
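The definition $\mathbf{H}_{ij} = \partial^2 f / \partial x_i \partial x_j$ can be approximated numerically. The following is a minimal sketch using central finite differences; the function name `numerical_hessian`, the step size `h`, and the sample function are illustrative assumptions, not part of the original text.

```python
from typing import Callable, List, Sequence


def numerical_hessian(f: Callable[[Sequence[float]], float],
                      x: Sequence[float],
                      h: float = 1e-4) -> List[List[float]]:
    """Approximate H[i][j] = d^2 f / (dx_i dx_j) at x by central differences."""
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]

    def shifted(i: int, di: float, j: int, dj: float) -> List[float]:
        # Copy of x with coordinate i moved by di and coordinate j moved by dj.
        p = list(x)
        p[i] += di
        p[j] += dj
        return p

    for i in range(n):
        for j in range(i, n):
            value = (f(shifted(i, h, j, h)) - f(shifted(i, h, j, -h))
                     - f(shifted(i, -h, j, h)) + f(shifted(i, -h, j, -h))) / (4 * h * h)
            # Assumes continuous second partials, so H is symmetric:
            # only the upper triangle is computed and then mirrored.
            hess[i][j] = hess[j][i] = value
    return hess


# Hypothetical sample function: f(x1, x2) = x1^2 * x2 + x2^3.
# Its analytic Hessian is [[2*x2, 2*x1], [2*x1, 6*x2]], i.e. [[4, 2], [2, 12]] at (1, 2).
H = numerical_hessian(lambda p: p[0] ** 2 * p[1] + p[1] ** 3, [1.0, 2.0])
```

Finite differences are sensitive to the choice of `h`; automatic differentiation is usually preferable when it is available.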
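From calculus, if a function of one variable $f(x)$ has derivatives of every order in some neighborhood of the point $x_0$, then the Taylor expansion of $f(x)$ at $x_0$ is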
![{\displaystyle f(x)=f(x_{0})+f'(x_{0})\Delta x+{\frac {f''(x_{0})}{2!}}\Delta x^{2}+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c7099923f582ffe5517e204a02780e3981e6c101)
where $\Delta x = x - x_0$ and $\Delta x^2 = (x - x_0)^2$.
Similarly, the Taylor expansion of a function of two variables $f(x_1, x_2)$ at the point $x_0(x_{10}, x_{20})$ is
![{\displaystyle f(x_{1},x_{2})=f(x_{10},x_{20})+f_{x_{1}}(x_{10},x_{20})\Delta x_{1}+f_{x_{2}}(x_{10},x_{20})\Delta x_{2}+{\frac {1}{2}}[f_{x_{1}x_{1}}(x_{10},x_{20})\Delta x_{1}^{2}+2f_{x_{1}x_{2}}(x_{10},x_{20})\Delta x_{1}\Delta x_{2}+f_{x_{2}x_{2}}(x_{10},x_{20})\Delta x_{2}^{2}]+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/6f716863c511e8df89e2239a014ac68cb4552072)
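where $\Delta x_1 = x_1 - x_{10}$, $\Delta x_2 = x_2 - x_{20}$, $f_{x_1} = \frac{\partial f}{\partial x_1}$, $f_{x_2} = \frac{\partial f}{\partial x_2}$, $f_{x_1 x_1} = \frac{\partial^2 f}{\partial x_1^2}$, $f_{x_2 x_2} = \frac{\partial^2 f}{\partial x_2^2}$, and $f_{x_1 x_2} = \frac{\partial^2 f}{\partial x_1 \partial x_2} = \frac{\partial^2 f}{\partial x_2 \partial x_1}$.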
Written in matrix form, the expansion above becomes
![{\displaystyle f(x)=f(x_{0})+\nabla f(x_{0})^{\mathrm {T} }\Delta x+{\frac {1}{2}}\Delta x^{\mathrm {T} }G(x_{0})\Delta x+\cdots }](https://wikimedia.org/api/rest_v1/media/math/render/svg/410d9cadefc4015ace1832de2c31dc8163eda8f0)
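where $\Delta x = (\Delta x_1, \Delta x_2)^{\mathrm T}$, $\Delta x^{\mathrm T} = (\Delta x_1, \Delta x_2)$ is the transpose of $\Delta x$, $\nabla f(x_0) = \left[\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}\right]^{\mathrm T}_{x_0}$ is the gradient of the function $f(x_1, x_2)$ at $x_0(x_{10}, x_{20})$, and the matrix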
![{\displaystyle G(x_{0})={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{bmatrix}}_{x_{0}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/a05ada63da2ce012b28f4f9a58499b96f3954ffa)
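is the $2 \times 2$ Hessian matrix of $f(x_1, x_2)$ at the point $x_0(x_{10}, x_{20})$. It is the square matrix formed from all second-order partial derivatives of $f(x_1, x_2)$ at $x_0$.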
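Because the second-order partial derivatives of the function are continuous, the mixed partial derivatives are equal: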
![{\displaystyle {\frac {\partial ^{2}f}{\partial x_{1}\partial x_{2}}}={\frac {\partial ^{2}f}{\partial x_{2}\partial x_{1}}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/33744c4388fb98b61ce851e5c5e7ecf2c9a59d29)
so the Hessian matrix $G(x_0)$ is symmetric.
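Extending the two-variable Taylor expansion to functions of several variables, the Taylor expansion of $f(x_1, x_2, \dots, x_n)$ at the point $x_0$ is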
![{\displaystyle f(x)=f(x_{0})+\nabla f(x_{0})^{\mathrm {T} }\Delta x+{\frac {1}{2}}\Delta x^{\mathrm {T} }G(x_{0})\Delta x+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c1190639eb839919d8982edaf464e8360e510102)
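where $\nabla f(x_0) = \left[\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n}\right]^{\mathrm T}_{x_0}$ is the gradient of $f(x_1, x_2, \dots, x_n)$ at the point $x_0$, and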
![{\displaystyle G(x_{0})={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{n}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{n}}}\\\\\vdots &\vdots &\ddots &\vdots \\\\{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{n}^{2}}}\end{bmatrix}}_{x_{0}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/05c00292d8658bcad4960686b04a604d7323d663)
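is the $n \times n$ Hessian matrix of $f(x_1, x_2, \dots, x_n)$ at the point $x_0$. If the second-order partial derivatives of the function are continuous, its $n \times n$ Hessian matrix is symmetric.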
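The matrix form of the expansion, truncated after the second-order term, can be evaluated directly. The sketch below is an illustration only: the helper name `quadratic_model` and the sample function $f = e^{x_1} \sin x_2$ are assumptions chosen for the example, with the gradient and Hessian at the origin worked out by hand.

```python
import math
from typing import Sequence


def quadratic_model(f0: float,
                    grad: Sequence[float],
                    hess: Sequence[Sequence[float]],
                    dx: Sequence[float]) -> float:
    """Evaluate f(x0) + grad^T dx + 0.5 * dx^T G dx."""
    n = len(dx)
    linear = sum(g * d for g, d in zip(grad, dx))
    quad = sum(dx[i] * hess[i][j] * dx[j] for i in range(n) for j in range(n))
    return f0 + linear + 0.5 * quad


# Hypothetical sample function, expanded about x0 = (0, 0).
def f(x1: float, x2: float) -> float:
    return math.exp(x1) * math.sin(x2)


f0 = f(0.0, 0.0)                 # f(x0) = 0
grad = [0.0, 1.0]                # [e^x1 sin x2, e^x1 cos x2] at x0
hess = [[0.0, 1.0], [1.0, 0.0]]  # second partials of f at x0
dx = [0.1, 0.1]

approx = quadratic_model(f0, grad, hess, dx)
exact = f(dx[0], dx[1])
error = abs(exact - approx)      # dominated by the neglected third-order terms
```

For a small displacement the error shrinks roughly like the cube of its length, which is what makes the quadratic term the deciding one in the extremum conditions.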
Note: in the optimal design literature the Hessian matrix is commonly denoted $G$, and the gradient is sometimes denoted $g$.[2]
The Hessian matrix of a function $f$ is related to the Jacobian matrix as follows:
![{\displaystyle \mathrm {H} (f)=\mathrm {J} (\nabla f)^{T}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/2b23bfb757383e1bcb2a908d970a947358b0e769)
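That is, the Hessian matrix of $f$ equals the transpose of the Jacobian matrix of its gradient; when the Hessian is symmetric, the transpose changes nothing.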
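As a small worked example of this relation (the function below is chosen purely for illustration), differentiating each component of the gradient once more reproduces the Hessian:

```latex
f(x_1, x_2) = x_1^2 x_2 + x_2^3, \qquad
\nabla f = \begin{bmatrix} 2 x_1 x_2 \\ x_1^2 + 3 x_2^2 \end{bmatrix}, \qquad
\mathrm{J}(\nabla f) =
\begin{bmatrix}
\frac{\partial (2 x_1 x_2)}{\partial x_1} & \frac{\partial (2 x_1 x_2)}{\partial x_2} \\
\frac{\partial (x_1^2 + 3 x_2^2)}{\partial x_1} & \frac{\partial (x_1^2 + 3 x_2^2)}{\partial x_2}
\end{bmatrix}
=
\begin{bmatrix} 2 x_2 & 2 x_1 \\ 2 x_1 & 6 x_2 \end{bmatrix}
= \mathrm{H}(f)
```

Here the matrix is symmetric, so it coincides with its transpose.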
Extremum conditions for functions
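Let $f(x)$ be a function of one variable that is differentiable at a point $x_0$ of a given interval. A necessary condition for $f(x)$ to attain an extremum at $x_0$ is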
![{\displaystyle f'(x_{0})=0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/ccb00c237e61f1121e81c87826a52a394a31b720)
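In other words, a differentiable function $f(x)$ can only attain an extremum at a stationary point. The converse fails: a stationary point need not be an extremum. Whether a stationary point is an extremum can be checked from the sign of the second derivative. Substituting the necessary condition into the Taylor expansion of $f(x)$ at $x_0$ gives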
![{\displaystyle f(x)=f(x_{0})+{\frac {f''(x_{0})}{2!}}\Delta x^{2}+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/307ffc2a2d5a9926c3590d54f540c9f8ef03b1da)
If $f(x)$ attains a minimum at $x_0$, then every point $x \neq x_0$ in some neighborhood of $x_0$ must satisfy
![{\displaystyle f(x)-f(x_{0})>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/bc124cae2932535e1ade0cef2fd43eb5b4844062)
With the higher-order terms neglected, this requires
![{\displaystyle {\frac {f''(x_{0})}{2!}}\Delta x^{2}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/4ad8dc656e4b7f04ddf2cd716638a480f8d23c03)
that is,
![{\displaystyle f''(x_{0})>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/12f46feabcab044cd1440902b7b50a7bf0f2a878)
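The case of a maximum at $x_0$ is handled in the same way. This gives a sufficient condition for an extremum:
Suppose the function of one variable $f(x)$ has a second derivative at $x_0$, with $f'(x_0) = 0$ and $f''(x_0) \neq 0$. Then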
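- if $f''(x_0) > 0$, the function $f(x)$ attains a local minimum at $x_0$;
- if $f''(x_0) < 0$, the function $f(x)$ attains a local maximum at $x_0$.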
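When $f''(x_0) = 0$ the test is inconclusive, and the signs of successively higher derivatives have to be examined. The rule is: if the first nonvanishing derivative at the stationary point has even order, the point is an extremum; if it has odd order, the point is an inflection point rather than an extremum.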
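The one-variable rule can be written as a short procedure. This is a sketch under stated assumptions: the caller supplies the derivative values at $x_0$ by hand, and the function name, the tolerance `tol`, and the returned labels are illustrative choices.

```python
from typing import Sequence


def classify_stationary_point(derivs: Sequence[float], tol: float = 1e-12) -> str:
    """Classify x0 from derivative values, where derivs[k] is the (k+1)-th
    derivative of f at x0 (so derivs[0] = f'(x0), derivs[1] = f''(x0), ...)."""
    if abs(derivs[0]) > tol:
        return "not a stationary point"
    for order, value in enumerate(derivs[1:], start=2):
        if abs(value) > tol:
            if order % 2 == 1:
                # First nonvanishing derivative has odd order.
                return "inflection point"
            # Even order: the sign decides between minimum and maximum.
            return "local minimum" if value > 0 else "local maximum"
    return "undetermined"  # every supplied derivative vanishes


# f(x) = x^4 at x0 = 0: f' = f'' = f''' = 0 and f'''' = 24.
quartic = classify_stationary_point([0.0, 0.0, 0.0, 24.0])
# f(x) = x^3 at x0 = 0: f' = f'' = 0 and f''' = 6.
cubic = classify_stationary_point([0.0, 0.0, 6.0])
# f(x) = -x^2 at x0 = 0: f' = 0 and f'' = -2.
parabola = classify_stationary_point([0.0, -2.0])
```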
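Let $f(x_1, x_2)$ be a function of two variables that is differentiable at a point $x_0(x_{10}, x_{20})$ of a given region. A necessary condition for it to attain an extremum at $x_0$ is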
![{\displaystyle f_{x_{1}}(x_{0})=f_{x_{2}}(x_{0})=0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/a0e415d3dd559fcbce0f4360324230e051e145b2)
that is,
![{\displaystyle \nabla f(x_{0})=0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/de546d17e32536211c8d6ecf39eb17d52acd1c3a)
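Again, this is only a necessary condition. To decide whether $x_0$ is an extremum, a sufficient condition is needed. Substituting the necessary condition into the Taylor expansion of $f(x_1, x_2)$ at $x_0$ gives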
![{\displaystyle f(x_{1},x_{2})=f(x_{10},x_{20})+{\frac {1}{2}}[f_{x_{1}x_{1}}(x_{0})\Delta x_{1}^{2}+2f_{x_{1}x_{2}}(x_{0})\Delta x_{1}\Delta x_{2}+f_{x_{2}x_{2}}(x_{0})\Delta x_{2}^{2}]+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/ee48435bd67521c76d889b01677bf7c454a822b1)
Let $A = f_{x_1 x_1}(x_0)$, $B = f_{x_1 x_2}(x_0)$, and $C = f_{x_2 x_2}(x_0)$. Then
![{\displaystyle f(x_{1},x_{2})=f(x_{10},x_{20})+{\frac {1}{2}}[A\Delta x_{1}^{2}+2B\Delta x_{1}\Delta x_{2}+C\Delta x_{2}^{2}]+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/ae8df8836f38fe763c0bbb2c3d54910d98b40110)
or, completing the square (for $A \neq 0$),
![{\displaystyle f(x_{1},x_{2})=f(x_{10},x_{20})+{\frac {1}{2A}}[(A\Delta x_{1}+B\Delta x_{2})^{2}+(AC-B^{2})\Delta x_{2}^{2}]+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/74791d552062316bab8b1607f69c415631ebb6df)
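The completed-square form can be checked by expanding it, assuming $A \neq 0$ so that the division is allowed:

```latex
\frac{1}{2A}\left[(A\,\Delta x_1 + B\,\Delta x_2)^2 + (AC - B^2)\,\Delta x_2^2\right]
= \frac{1}{2A}\left[A^2\,\Delta x_1^2 + 2AB\,\Delta x_1 \Delta x_2 + B^2\,\Delta x_2^2 + AC\,\Delta x_2^2 - B^2\,\Delta x_2^2\right]
= \frac{1}{2}\left[A\,\Delta x_1^2 + 2B\,\Delta x_1 \Delta x_2 + C\,\Delta x_2^2\right]
```

The $B^2\,\Delta x_2^2$ terms cancel, which recovers the second-order term of the previous expansion.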
If $f(x_1, x_2)$ attains a minimum at $x_0$, then every point $x \neq x_0$ in some neighborhood of $x_0$ must satisfy
![{\displaystyle f(x_{1},x_{2})-f(x_{10},x_{20})>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/fbf9eb2e0bccf2c322984a9cb0486f89efaf0056)
With the higher-order terms neglected, this requires
![{\displaystyle {\frac {1}{2A}}[(A\Delta x_{1}+B\Delta x_{2})^{2}+(AC-B^{2})\Delta x_{2}^{2}]>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/f005560ba6efd5567601712a8164c01a2c94bd19)
which holds for all $(\Delta x_1, \Delta x_2) \neq (0, 0)$ when $A > 0$ and $AC - B^2 > 0$, that is,
![{\displaystyle \left.{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}\right|_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c528aae5ae3061bed869a4add9932af46d6fc9cf)
![{\displaystyle {\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}-({\frac {\partial ^{2}f}{\partial x_{1}\partial x_{2}}})^{2}\end{bmatrix}}_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d9ddd41047c623b067cce9e3fea5d43faa9b6eaa)
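This condition states that the leading principal minors of the Hessian matrix $G(x_0)$ of $f(x_1, x_2)$ at $x_0$ are all positive. That is, for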
![{\displaystyle G(x_{0})={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{bmatrix}}_{x_{0}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/a05ada63da2ce012b28f4f9a58499b96f3954ffa)
we require
![{\displaystyle \left.{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}\right|_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c528aae5ae3061bed869a4add9932af46d6fc9cf)
![{\displaystyle |G(x_{0})|={\begin{vmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{vmatrix}}_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/3034ba0cfc2dd22fa6e6b8af7bdc0078c9c15fb6)
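The case of a maximum at $x_0$ is handled in the same way. This gives a sufficient condition for an extremum:
Suppose the function of two variables $f(x_1, x_2)$ is continuous and has continuous first- and second-order partial derivatives in a neighborhood of $x_0(x_{10}, x_{20})$, with $f_{x_1}(x_0) = f_{x_2}(x_0) = 0$, and set $A = f_{x_1 x_1}(x_0)$, $B = f_{x_1 x_2}(x_0)$, $C = f_{x_2 x_2}(x_0)$. Then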
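- if $A > 0$ and $AC - B^2 > 0$, the function $f(x_1, x_2)$ attains a local minimum at $x_0(x_{10}, x_{20})$;
- if $A < 0$ and $AC - B^2 > 0$, the function $f(x_1, x_2)$ attains a local maximum at $x_0(x_{10}, x_{20})$.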
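Furthermore, when $AC - B^2 < 0$, the function $f(x_1, x_2)$ has no extremum at $x_0(x_{10}, x_{20})$; such a point is called a saddle point. When $AC - B^2 = 0$ the test is inconclusive. A supplementary rule for this case: when […], if […], then the function $f(x_1, x_2)$ has an extremum at $x_0(x_{10}, x_{20})$, which is a minimum when […] and a maximum when […].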
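The two-variable test in terms of $A$, $B$, and $C$ translates directly into code. This is a minimal sketch: the function name and the returned labels are illustrative, the point is assumed to be a stationary point already, and the degenerate case $AC - B^2 = 0$ is simply reported as inconclusive.

```python
def second_derivative_test(A: float, B: float, C: float) -> str:
    """Classify a stationary point of f(x1, x2) from A = f_x1x1, B = f_x1x2,
    C = f_x2x2, all evaluated at that point."""
    D = A * C - B * B  # determinant of the 2 x 2 Hessian
    if D > 0:
        # D > 0 forces A != 0, so the sign of A decides.
        return "local minimum" if A > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    # D == 0: the second-order terms do not decide. With floating-point
    # input a tolerance would be needed instead of an exact comparison.
    return "inconclusive"


bowl = second_derivative_test(2.0, 0.0, 2.0)     # f = x1^2 + x2^2 at the origin
dome = second_derivative_test(-2.0, 0.0, -2.0)   # f = -(x1^2 + x2^2) at the origin
saddle = second_derivative_test(2.0, 0.0, -2.0)  # f = x1^2 - x2^2 at the origin
flat = second_derivative_test(0.0, 0.0, 2.0)     # A*C - B^2 = 0
```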
From linear algebra, if the matrix $G(x_0)$ satisfies
![{\displaystyle \left.{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}\right|_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c528aae5ae3061bed869a4add9932af46d6fc9cf)
![{\displaystyle {\begin{vmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{vmatrix}}_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d4e5188c5bf5eea088f17ad7e048a2202649506f)
then $G(x_0)$ is a positive definite matrix; equivalently, $G(x_0)$ is said to be positive definite.
If the matrix $G(x_0)$ satisfies
![{\displaystyle \left.{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}\right|_{x_{0}}<0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/410e4e3dc279b8ea0b5b9baef24343e337b0b32d)
![{\displaystyle {\begin{vmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{vmatrix}}_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d4e5188c5bf5eea088f17ad7e048a2202649506f)
then $G(x_0)$ is a negative definite matrix; equivalently, $G(x_0)$ is said to be negative definite.[3]
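The extremum condition for a function of two variables $f(x_1, x_2)$ at $x_0(x_{10}, x_{20})$ can therefore be stated as follows: if the Hessian matrix of $f(x_1, x_2)$ at a stationary point $x_0$ is positive definite, the function attains a local minimum there; if the Hessian matrix at $x_0$ is negative definite, it attains a local maximum.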
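A short worked example of this statement, using a function chosen only for illustration:

```latex
f(x_1, x_2) = x_1^2 + x_1 x_2 + x_2^2, \qquad
\nabla f = \begin{bmatrix} 2 x_1 + x_2 \\ x_1 + 2 x_2 \end{bmatrix} = 0
\;\Longrightarrow\; x_0 = (0, 0), \qquad
G(x_0) = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad
2 > 0, \quad \begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} = 3 > 0
```

Both leading principal minors are positive, so $G(x_0)$ is positive definite and the origin is a local minimum. This agrees with the direct rewriting $f = (x_1 + \tfrac{1}{2} x_2)^2 + \tfrac{3}{4} x_2^2 \geq 0$.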
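For a function of several variables $f(x_1, x_2, \dots, x_n)$ to attain an extremum at the point $x_0$, the necessary condition is
$$\nabla f(x_0) = \left[\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n}\right]^{\mathrm T}_{x_0} = 0$$
A sufficient condition for a minimum is that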
![{\displaystyle G(x_{0})={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{n}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{n}}}\\\\\vdots &\vdots &\ddots &\vdots \\\\{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{n}^{2}}}\end{bmatrix}}_{x_{0}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/05c00292d8658bcad4960686b04a604d7323d663)
is positive definite, which requires all leading principal minors of $G(x_0)$ to be positive, that is,
![{\displaystyle \left.{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}\right|_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c528aae5ae3061bed869a4add9932af46d6fc9cf)
![{\displaystyle {\begin{vmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{vmatrix}}_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d4e5188c5bf5eea088f17ad7e048a2202649506f)
![{\displaystyle \vdots }](https://wikimedia.org/api/rest_v1/media/math/render/svg/f8039d9feb6596ae092e5305108722975060c083)
![{\displaystyle |G(x_{0})|>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/243c218404919cdd0bc052fd06277593317e22bc)
A sufficient condition for a maximum is that
![{\displaystyle G(x_{0})={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{n}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{n}}}\\\\\vdots &\vdots &\ddots &\vdots \\\\{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{n}^{2}}}\end{bmatrix}}_{x_{0}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/05c00292d8658bcad4960686b04a604d7323d663)
is negative definite.[4][5][6]
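The leading-principal-minor test for an $n \times n$ symmetric matrix can be sketched in plain Python. The function names are illustrative assumptions. The text above states the maximum condition only as "negative definite"; the alternating-sign check used here (odd-order minors negative, even-order minors positive) is the standard criterion for negative definiteness and is added as an assumption of this sketch.

```python
from typing import List, Sequence


def determinant(m: Sequence[Sequence[float]]) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def leading_principal_minors(g: Sequence[Sequence[float]]) -> List[float]:
    """Determinants of the top-left k x k blocks for k = 1, ..., n."""
    return [determinant([row[:k] for row in g[:k]]) for k in range(1, len(g) + 1)]


def definiteness(g: Sequence[Sequence[float]]) -> str:
    """Classify a symmetric matrix by the signs of its leading principal minors."""
    minors = leading_principal_minors(g)
    if all(d > 0 for d in minors):
        return "positive definite"
    if all((d < 0) if k % 2 == 1 else (d > 0)
           for k, d in enumerate(minors, start=1)):
        return "negative definite"
    return "neither"


G = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
neg_G = [[-v for v in row] for row in G]
pos = definiteness(G)                               # minors are 2, 3, 4
neg = definiteness(neg_G)                           # minors are -2, 3, -4
mixed = definiteness([[1.0, 0.0], [0.0, -1.0]])     # minors are 1, -1
```

For large or ill-conditioned matrices, an attempted Cholesky factorization or an eigenvalue computation is usually a more robust way to test definiteness than evaluating determinants.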
Further reading
References