转自 http://www.cnblogs.com/90zeng/p/Lagrange_duality.html 原作者:博客园-90Zeng
1.原始问题
假设
![](https://img.haomeiwen.com/i7549805/0bf3799c68a8fa8c.jpg)
![](https://img.haomeiwen.com/i7549805/447f27f1f93a3de1.jpg)
![](https://img.haomeiwen.com/i7549805/e1908c2c7364f032.jpg)
![](https://img.haomeiwen.com/i7549805/ac4f475c020b944b.jpg)
称为约束最优化问题的原始问题。
现在如果不考虑约束条件,原始问题就是:
![](https://img.haomeiwen.com/i7549805/e1908c2c7364f032.jpg)
因为假设其连续可微,利用高中的知识,对
![](https://img.haomeiwen.com/i7549805/e5c665f347df686f.jpg)
引进广义拉格朗日函数(generalized Lagrange function):
![](https://img.haomeiwen.com/i7549805/cb3a30364f343672.jpg)
不要怕这个式子,也不要被拉格朗日这个高大上的名字给唬住了,让我们慢慢剖析!这里
![](https://img.haomeiwen.com/i7549805/0c85bee4f4beb6f5.jpg)
![](https://img.haomeiwen.com/i7549805/2b907b480d2a6c87.jpg)
![](https://img.haomeiwen.com/i7549805/39eaa2d7d1be534d.jpg)
.
现在,如果把![](https://img.haomeiwen.com/i7549805/827252078a98f6c3.jpg)
![](https://img.haomeiwen.com/i7549805/2b907b480d2a6c87.jpg)
![](https://img.haomeiwen.com/i7549805/69b96e1361032451.jpg)
再次注意
![](https://img.haomeiwen.com/i7549805/827252078a98f6c3.jpg)
![](https://img.haomeiwen.com/i7549805/2b907b480d2a6c87.jpg)
![](https://img.haomeiwen.com/i7549805/2b907b480d2a6c87.jpg)
![](https://img.haomeiwen.com/i7549805/827252078a98f6c3.jpg)
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/2b907b480d2a6c87.jpg)
![](https://img.haomeiwen.com/i7549805/827252078a98f6c3.jpg)
![](https://img.haomeiwen.com/i7549805/2b907b480d2a6c87.jpg)
![](https://img.haomeiwen.com/i7549805/69b96e1361032451.jpg)
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/ef6ef7fccf1f9e20.jpg)
其中
![](https://img.haomeiwen.com/i7549805/cb3a30364f343672.jpg)
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
是否满足约束条件两方面来分析这个函数:
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/83441d974674869c.jpg)
![](https://img.haomeiwen.com/i7549805/4955276216b00487.jpg)
,那么:
![](https://img.haomeiwen.com/i7549805/c866dac743154dce.jpg)
注意中间的最大化式子就是确定
![](https://img.haomeiwen.com/i7549805/2b907b480d2a6c87.jpg)
![](https://img.haomeiwen.com/i7549805/83441d974674869c.jpg)
![](https://img.haomeiwen.com/i7549805/4cc621ad568ea108.jpg)
![](https://img.haomeiwen.com/i7549805/4955276216b00487.jpg)
![](https://img.haomeiwen.com/i7549805/7c3cdb232744b715.jpg)
![](https://img.haomeiwen.com/i7549805/0dd7653f28c4b7dc.jpg)
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/39d77e63e3a709d8.jpg)
![](https://img.haomeiwen.com/i7549805/2b907b480d2a6c87.jpg)
![](https://img.haomeiwen.com/i7549805/e5c665f347df686f.jpg)
就是个常量,常量的最大值显然是本身.
通过上面两条分析可以得出:
![](https://img.haomeiwen.com/i7549805/be78a3c56a7734b3.jpg)
那么在满足约束条件下:
![](https://img.haomeiwen.com/i7549805/3597a76c010ab31f.jpg)
即
![](https://img.haomeiwen.com/i7549805/b3a0cffdcd4133ec.jpg)
![](https://img.haomeiwen.com/i7549805/b3a0cffdcd4133ec.jpg)
代表原始问题,下标 P 表示原始问题,定义原始问题的最优值:
![](https://img.haomeiwen.com/i7549805/35abbc84f8a0715f.jpg)
原始问题讨论就到这里,做一个总结:通过拉格朗日这位大神的办法重新定义一个无约束问题(大家都喜欢无拘无束),这个无约束问题等价于原来的约束优化问题,从而将约束问题无约束化!
2.对偶问题
定义关于
![](https://img.haomeiwen.com/i7549805/62e290d4ef1b418f.jpg)
![](https://img.haomeiwen.com/i7549805/838671fd3b781aff.jpg)
注意等式右边是关于
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/62e290d4ef1b418f.jpg)
![](https://img.haomeiwen.com/i7549805/62e290d4ef1b418f.jpg)
![](https://img.haomeiwen.com/i7549805/838671fd3b781aff.jpg)
,即
![](https://img.haomeiwen.com/i7549805/4dd1eceecdeb0715.jpg)
这就是原始问题的对偶问题,再把原始问题写出来:
![](https://img.haomeiwen.com/i7549805/c1dbf05ca490b69c.jpg)
形式上可以看出很对称,只不过原始问题是先固定
![](https://img.haomeiwen.com/i7549805/827252078a98f6c3.jpg)
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/62e290d4ef1b418f.jpg)
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/62e290d4ef1b418f.jpg)
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/62e290d4ef1b418f.jpg)
定义对偶问题的最优值:
![](https://img.haomeiwen.com/i7549805/9e7a8f088662ac5b.jpg)
3. 原始问题与对偶问题的关系
定理:若原始问题与对偶问题都有最优值,则
![](https://img.haomeiwen.com/i7549805/72af362f3cd2ee4c.jpg)
证明:对任意的
![](https://img.haomeiwen.com/i7549805/62e290d4ef1b418f.jpg)
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/96aa5176dce7f4d8.jpg)
即
![](https://img.haomeiwen.com/i7549805/7a713ffc13cc2a65.jpg)
由于原始问题与对偶问题都有最优值,所以
![](https://img.haomeiwen.com/i7549805/c585c3c0ea98553e.jpg)
即
![](https://img.haomeiwen.com/i7549805/72af362f3cd2ee4c.jpg)
也就是说原始问题的最优值不小于对偶问题的最优值,但是我们要通过对偶问题来求解原始问题,就必须使得原始问题的最优值与对偶问题的最优值相等,于是可以得出下面的推论:
推论:设
![](https://img.haomeiwen.com/i7549805/6fa322ebd25323a7.jpg)
![](https://img.haomeiwen.com/i7549805/1f3ea13b52230919.jpg)
![](https://img.haomeiwen.com/i7549805/6fa322ebd25323a7.jpg)
![](https://img.haomeiwen.com/i7549805/1f3ea13b52230919.jpg)
![](https://img.haomeiwen.com/i7549805/1f3ea13b52230919.jpg)
4. KKT 条件
定理:对于原始问题和对偶问题,假设函数
![](https://img.haomeiwen.com/i7549805/e5c665f347df686f.jpg)
![](https://img.haomeiwen.com/i7549805/e61367538a340dcd.jpg)
![](https://img.haomeiwen.com/i7549805/2bedb3a64aee972f.jpg)
![](https://img.haomeiwen.com/i7549805/e61367538a340dcd.jpg)
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/a4c8e259275023e6.jpg)
![](https://img.haomeiwen.com/i7549805/29cc09682a01bc4b.jpg)
![](https://img.haomeiwen.com/i7549805/6fa322ebd25323a7.jpg)
![](https://img.haomeiwen.com/i7549805/371732078e32af80.jpg)
![](https://img.haomeiwen.com/i7549805/df655b961cb66959.jpg)
![](https://img.haomeiwen.com/i7549805/3dfb36b40b5cc971.jpg)
![](https://img.haomeiwen.com/i7549805/e5c665f347df686f.jpg)
![](https://img.haomeiwen.com/i7549805/e61367538a340dcd.jpg)
![](https://img.haomeiwen.com/i7549805/2bedb3a64aee972f.jpg)
![](https://img.haomeiwen.com/i7549805/e61367538a340dcd.jpg)
![](https://img.haomeiwen.com/i7549805/eb78c8541a71ad07.jpg)
![](https://img.haomeiwen.com/i7549805/a4c8e259275023e6.jpg)
![](https://img.haomeiwen.com/i7549805/29cc09682a01bc4b.jpg)
![](https://img.haomeiwen.com/i7549805/6fa322ebd25323a7.jpg)
![](https://img.haomeiwen.com/i7549805/6fa322ebd25323a7.jpg)
![](https://img.haomeiwen.com/i7549805/6e066a4bdd51b6c0.jpg)
关于KKT 条件的理解:前面三个条件是由解析函数的知识,对于各个变量的偏导数为0(这就解释了一开始为什么假设三个函数连续可微,如果不连续可微的话,这里的偏导数存不存在就不能保证),后面四个条件就是原始问题的约束条件以及拉格朗日乘子需要满足的约束。
![](https://img.haomeiwen.com/i7549805/c9c29aaf51704763.jpg)
![](https://img.haomeiwen.com/i7549805/6110e96be4de932e.jpg)
,这个知识点会在 SVM 的推导中用到.
5. 总结
一句话,某些条件下,把原始的约束问题通过拉格朗日函数转化为无约束问题,如果原始问题求解棘手,在满足KKT的条件下用求解对偶问题来代替求解原始问题,使得问题求解更加容易。
网友评论