{
"cells": [
{
"cell_type": "markdown",
"id": "d9ba5257",
"metadata": {
"origin_pos": 0
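},
"source": [
"# Gradient Descent\n",
":label:`sec_gd`\n",
"\n",
"Although *gradient descent* is rarely used directly in deep learning,\n",
"understanding it is key to understanding the stochastic gradient descent algorithm in the next section.\n",
"For instance, an optimization problem may diverge because the learning rate is too large, a phenomenon that already appears in gradient descent.\n",
"Likewise, *preconditioning* is a common technique in gradient descent\n",
"that carries over to more advanced algorithms.\n",
"Let us start with simple one-dimensional gradient descent.\n",
"\n",
"## One-Dimensional Gradient Descent\n",
"\n",
"Why does the gradient descent algorithm reduce the objective function?\n",
"Gradient descent in one dimension offers good intuition.\n",
"Consider a continuously differentiable real-valued function $f: \\mathbb{R} \\rightarrow \\mathbb{R}$.\n",
"Using a Taylor expansion, we obtain\n",
"\n",
"$$f(x + \\epsilon) = f(x) + \\epsilon f'(x) + \\mathcal{O}(\\epsilon^2).$$\n",
":eqlabel:`gd-taylor`\n",
"\n",
"That is, to first order, $f(x+\\epsilon)$ is given by the function value $f(x)$ and the first derivative $f'(x)$ at $x$.\n",
"It is reasonable to assume that, for small $\\epsilon$, moving in the direction of the negative gradient decreases $f$.\n",
"To keep things simple, we pick a fixed step size $\\eta > 0$ and choose $\\epsilon = -\\eta f'(x)$.\n",
"Plugging this into the Taylor expansion gives\n",
"\n",
"$$f(x - \\eta f'(x)) = f(x) - \\eta f'^2(x) + \\mathcal{O}(\\eta^2 f'^2(x)).$$\n",
":eqlabel:`gd-taylor-2`\n",
"\n",
"If the derivative does not vanish, that is, $f'(x) \\neq 0$, we make progress because $\\eta f'^2(x)>0$.\n",
"Moreover, we can always choose $\\eta$ small enough for the higher-order terms to become negligible.\n",
"Hence,\n",
"\n",
"$$f(x - \\eta f'(x)) \\lessapprox f(x).$$\n",
"\n",
"This means that if we use\n",
"\n",
"$$x \\leftarrow x - \\eta f'(x)$$\n",
"\n",
"to iterate $x$, the value of $f(x)$ may decrease.\n",
"In gradient descent we therefore first choose an initial value $x$ and a constant $\\eta > 0$,\n",
"and then use them to update $x$ repeatedly until a stopping condition is met,\n",
"for example, when the gradient magnitude $|f'(x)|$ is small enough or the number of iterations reaches a set limit.\n",
"\n",
"Below we show how to implement gradient descent. For simplicity, we choose the objective function $f(x)=x^2$.\n",
"Although we know that $f(x)$ attains its minimum at $x=0$,\n",
"we still use this simple function to observe how $x$ changes.\n"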
]
},
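{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick worked check under the choice $f(x)=x^2$ made above (the iterate notation $x_k$, with $x_0$ the initial value, is introduced here only for illustration), the update can be written in closed form. Since $f'(x) = 2x$,\n",
"\n",
"$$x_{k+1} = x_k - \\eta f'(x_k) = (1 - 2\\eta)\\, x_k \\quad\\Longrightarrow\\quad x_k = (1 - 2\\eta)^k x_0.$$\n",
"\n",
"For this particular function the iterates therefore shrink toward the minimizer $x=0$ exactly when $|1 - 2\\eta| < 1$, that is, for $0 < \\eta < 1$. This is specific to $f(x)=x^2$ and is not a general convergence guarantee.\n"
]
},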
{
"cell_type": "code",
"execution_count": 1,
"id": "7b783f74",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:23.083492Z",
"iopub.status.busy": "2023-08-18T07:01:23.082697Z",
"iopub.status.idle": "2023-08-18T07:01:26.147110Z",
"shell.execute_reply": "2023-08-18T07:01:26.145926Z"
},
"origin_pos": 2,
"tab": [
"pytorch"
]
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import numpy as np\n",
"import torch\n",
"from d2l import torch as d2l"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "b544b8ce",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:26.152743Z",
"iopub.status.busy": "2023-08-18T07:01:26.151955Z",
"iopub.status.idle": "2023-08-18T07:01:26.158133Z",
"shell.execute_reply": "2023-08-18T07:01:26.157003Z"
},
"origin_pos": 5,
"tab": [
"pytorch"
]
},
"outputs": [],
"source": [
"def f(x):  # Objective function\n",
"    return x ** 2\n",
"\n",
"def f_grad(x):  # Gradient (derivative) of the objective function\n",
"    return 2 * x"
]
},
{
"cell_type": "markdown",
"id": "784941de",
"metadata": {
"origin_pos": 6
},
"source": [
"Next, we use $x=10$ as the initial value and set $\\eta=0.2$.\n",
"After 10 iterations of gradient descent, the value of $x$ is close to the optimal solution.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "3674fea9",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:26.163008Z",
"iopub.status.busy": "2023-08-18T07:01:26.162385Z",
"iopub.status.idle": "2023-08-18T07:01:26.170126Z",
"shell.execute_reply": "2023-08-18T07:01:26.169003Z"
},
"origin_pos": 7,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 10, x: 0.060466\n"
]
}
],
"source": [
"def gd(eta, f_grad):\n",
"    x = 10.0  # Initial value\n",
"    results = [x]\n",
"    for i in range(10):\n",
"        x -= eta * f_grad(x)  # Gradient descent update\n",
"        results.append(float(x))\n",
"    print(f'epoch 10, x: {x:f}')\n",
"    return results\n",
"\n",
"results = gd(0.2, f_grad)"
]
},
{
"cell_type": "markdown",
"id": "a005e799",
"metadata": {
"origin_pos": 9
},
"source": [
"The progress of optimizing over $x$ can be plotted as follows.\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "56e7d8fb",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:26.174775Z",
"iopub.status.busy": "2023-08-18T07:01:26.174138Z",
"iopub.status.idle": "2023-08-18T07:01:26.480544Z",
"shell.execute_reply": "2023-08-18T07:01:26.479162Z"
},
"origin_pos": 10,
"tab": [
"pytorch"
]
},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"249.465625pt\" height=\"180.65625pt\" viewBox=\"0 0 249.465625 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:01:26.426728</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 249.465625 180.65625 \n",
"L 249.465625 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 46.965625 143.1 \n",
"L 242.265625 143.1 \n",
"L 242.265625 7.2 \n",
"L 46.965625 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 55.842898 143.1 \n",
"L 55.842898 7.2 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"mcbc85dced4\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#mcbc85dced4\" x=\"55.842898\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(45.290554 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"147.412109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 100.229261 143.1 \n",
"L 100.229261 7.2 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#mcbc85dced4\" x=\"100.229261\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(92.858168 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 144.615625 143.1 \n",
"L 144.615625 7.2 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#mcbc85dced4\" x=\"144.615625\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(141.434375 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 189.001989 143.1 \n",
"L 189.001989 7.2 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#mcbc85dced4\" x=\"189.001989\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(185.820739 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-35\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 233.388352 143.1 \n",
"L 233.388352 7.2 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#mcbc85dced4\" x=\"233.388352\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(227.025852 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- x -->\n",
" <g transform=\"translate(141.65625 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 46.965625 136.922727 \n",
"L 242.265625 136.922727 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"m313674a0ef\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m313674a0ef\" x=\"46.965625\" y=\"136.922727\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(33.603125 140.721946)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 46.965625 112.213636 \n",
"L 242.265625 112.213636 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#m313674a0ef\" x=\"46.965625\" y=\"112.213636\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 20 -->\n",
" <g transform=\"translate(27.240625 116.012855)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 46.965625 87.504545 \n",
"L 242.265625 87.504545 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#m313674a0ef\" x=\"46.965625\" y=\"87.504545\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 40 -->\n",
" <g transform=\"translate(27.240625 91.303764)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-34\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 46.965625 62.795455 \n",
"L 242.265625 62.795455 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#m313674a0ef\" x=\"46.965625\" y=\"62.795455\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 60 -->\n",
" <g transform=\"translate(27.240625 66.594673)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-36\" d=\"M 2113 2584 \n",
"Q 1688 2584 1439 2293 \n",
"Q 1191 2003 1191 1497 \n",
"Q 1191 994 1439 701 \n",
"Q 1688 409 2113 409 \n",
"Q 2538 409 2786 701 \n",
"Q 3034 994 3034 1497 \n",
"Q 3034 2003 2786 2293 \n",
"Q 2538 2584 2113 2584 \n",
"z\n",
"M 3366 4563 \n",
"L 3366 3988 \n",
"Q 3128 4100 2886 4159 \n",
"Q 2644 4219 2406 4219 \n",
"Q 1781 4219 1451 3797 \n",
"Q 1122 3375 1075 2522 \n",
"Q 1259 2794 1537 2939 \n",
"Q 1816 3084 2150 3084 \n",
"Q 2853 3084 3261 2657 \n",
"Q 3669 2231 3669 1497 \n",
"Q 3669 778 3244 343 \n",
"Q 2819 -91 2113 -91 \n",
"Q 1303 -91 875 529 \n",
"Q 447 1150 447 2328 \n",
"Q 447 3434 972 4092 \n",
"Q 1497 4750 2381 4750 \n",
"Q 2619 4750 2861 4703 \n",
"Q 3103 4656 3366 4563 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-36\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 46.965625 38.086364 \n",
"L 242.265625 38.086364 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <g>\n",
" <use xlink:href=\"#m313674a0ef\" x=\"46.965625\" y=\"38.086364\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 80 -->\n",
" <g transform=\"translate(27.240625 41.885582)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-38\" d=\"M 2034 2216 \n",
"Q 1584 2216 1326 1975 \n",
"Q 1069 1734 1069 1313 \n",
"Q 1069 891 1326 650 \n",
"Q 1584 409 2034 409 \n",
"Q 2484 409 2743 651 \n",
"Q 3003 894 3003 1313 \n",
"Q 3003 1734 2745 1975 \n",
"Q 2488 2216 2034 2216 \n",
"z\n",
"M 1403 2484 \n",
"Q 997 2584 770 2862 \n",
"Q 544 3141 544 3541 \n",
"Q 544 4100 942 4425 \n",
"Q 1341 4750 2034 4750 \n",
"Q 2731 4750 3128 4425 \n",
"Q 3525 4100 3525 3541 \n",
"Q 3525 3141 3298 2862 \n",
"Q 3072 2584 2669 2484 \n",
"Q 3125 2378 3379 2068 \n",
"Q 3634 1759 3634 1313 \n",
"Q 3634 634 3220 271 \n",
"Q 2806 -91 2034 -91 \n",
"Q 1263 -91 848 271 \n",
"Q 434 634 434 1313 \n",
"Q 434 1759 690 2068 \n",
"Q 947 2378 1403 2484 \n",
"z\n",
"M 1172 3481 \n",
"Q 1172 3119 1398 2916 \n",
"Q 1625 2713 2034 2713 \n",
"Q 2441 2713 2670 2916 \n",
"Q 2900 3119 2900 3481 \n",
"Q 2900 3844 2670 4047 \n",
"Q 2441 4250 2034 4250 \n",
"Q 1625 4250 1398 4047 \n",
"Q 1172 3844 1172 3481 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-38\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_6\">\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 46.965625 13.377273 \n",
"L 242.265625 13.377273 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <g>\n",
" <use xlink:href=\"#m313674a0ef\" x=\"46.965625\" y=\"13.377273\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- 100 -->\n",
" <g transform=\"translate(20.878125 17.176491)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"127.246094\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- f(x) -->\n",
" <g transform=\"translate(14.798438 83.771094)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-66\" d=\"M 2375 4863 \n",
"L 2375 4384 \n",
"L 1825 4384 \n",
"Q 1516 4384 1395 4259 \n",
"Q 1275 4134 1275 3809 \n",
"L 1275 3500 \n",
"L 2222 3500 \n",
"L 2222 3053 \n",
"L 1275 3053 \n",
"L 1275 0 \n",
"L 697 0 \n",
"L 697 3053 \n",
"L 147 3053 \n",
"L 147 3500 \n",
"L 697 3500 \n",
"L 697 3744 \n",
"Q 697 4328 969 4595 \n",
"Q 1241 4863 1831 4863 \n",
"L 2375 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-28\" d=\"M 1984 4856 \n",
"Q 1566 4138 1362 3434 \n",
"Q 1159 2731 1159 2009 \n",
"Q 1159 1288 1364 580 \n",
"Q 1569 -128 1984 -844 \n",
"L 1484 -844 \n",
"Q 1016 -109 783 600 \n",
"Q 550 1309 550 2009 \n",
"Q 550 2706 781 3412 \n",
"Q 1013 4119 1484 4856 \n",
"L 1984 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-29\" d=\"M 513 4856 \n",
"L 1013 4856 \n",
"Q 1481 4119 1714 3412 \n",
"Q 1947 2706 1947 2009 \n",
"Q 1947 1309 1714 600 \n",
"Q 1481 -109 1013 -844 \n",
"L 513 -844 \n",
"Q 928 -128 1133 580 \n",
"Q 1338 1288 1338 2009 \n",
"Q 1338 2731 1133 3434 \n",
"Q 928 4138 513 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-66\"/>\n",
" <use xlink:href=\"#DejaVuSans-28\" x=\"35.205078\"/>\n",
" <use xlink:href=\"#DejaVuSans-78\" x=\"74.21875\"/>\n",
" <use xlink:href=\"#DejaVuSans-29\" x=\"133.398438\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_23\">\n",
" <path d=\"M 55.842898 13.377273 \n",
"L 60.459075 25.891924 \n",
"L 64.89772 37.295193 \n",
"L 69.247573 47.871029 \n",
"L 73.508668 57.655837 \n",
"L 77.680989 66.685159 \n",
"L 81.764535 74.993839 \n",
"L 85.75931 82.615983 \n",
"L 89.576536 89.431856 \n",
"L 93.304987 95.648166 \n",
"L 96.944671 101.29605 \n",
"L 100.495581 106.405892 \n",
"L 103.957717 111.00734 \n",
"L 107.331081 115.129312 \n",
"L 110.615671 118.799969 \n",
"L 113.811488 122.046742 \n",
"L 116.918535 124.89632 \n",
"L 120.02558 127.443208 \n",
"L 123.043852 129.627491 \n",
"L 125.973351 131.474372 \n",
"L 128.814079 133.008313 \n",
"L 131.654806 134.289232 \n",
"L 134.406761 135.288838 \n",
"L 137.158716 136.050991 \n",
"L 139.821898 136.562469 \n",
"L 142.48508 136.851565 \n",
"L 145.148261 136.91828 \n",
"L 147.811443 136.762612 \n",
"L 150.474625 136.384563 \n",
"L 153.137807 135.784132 \n",
"L 155.889762 134.930062 \n",
"L 158.641715 133.838539 \n",
"L 161.39367 132.50956 \n",
"L 164.234398 130.888644 \n",
"L 167.163897 128.952069 \n",
"L 170.093397 126.746412 \n",
"L 173.111671 124.19248 \n",
"L 176.218715 121.265071 \n",
"L 179.414535 117.938238 \n",
"L 182.61035 114.29118 \n",
"L 185.894944 110.20911 \n",
"L 189.268309 105.664614 \n",
"L 192.730444 100.62952 \n",
"L 196.281354 95.074914 \n",
"L 199.921034 88.971153 \n",
"L 203.649489 82.287839 \n",
"L 207.55549 74.818774 \n",
"L 211.550261 66.685159 \n",
"L 215.633807 57.853636 \n",
"L 219.806119 48.290118 \n",
"L 224.067214 37.959738 \n",
"L 228.417076 26.826932 \n",
"L 232.855721 14.855356 \n",
"L 233.299578 13.624247 \n",
"L 233.299578 13.624247 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_24\">\n",
" <path d=\"M 233.388352 13.377273 \n",
"L 197.879261 92.446364 \n",
"L 176.573807 120.911236 \n",
"L 163.790534 131.158591 \n",
"L 156.12057 134.847638 \n",
"L 151.518592 136.175695 \n",
"L 148.757405 136.653796 \n",
"L 147.100693 136.825912 \n",
"L 146.106666 136.887874 \n",
"L 145.51025 136.91018 \n",
"L 145.1524 136.91821 \n",
"\" clip-path=\"url(#p62bcb874e4)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"m1e30e819a8\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p62bcb874e4)\">\n",
" <use xlink:href=\"#m1e30e819a8\" x=\"233.388352\" y=\"13.377273\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m1e30e819a8\" x=\"197.879261\" y=\"92.446364\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m1e30e819a8\" x=\"176.573807\" y=\"120.911236\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m1e30e819a8\" x=\"163.790534\" y=\"131.158591\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m1e30e819a8\" x=\"156.12057\" y=\"134.847638\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m1e30e819a8\" x=\"151.518592\" y=\"136.175695\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m1e30e819a8\" x=\"148.757405\" y=\"136.653796\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m1e30e819a8\" x=\"147.100693\" y=\"136.825912\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m1e30e819a8\" x=\"146.106666\" y=\"136.887874\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m1e30e819a8\" x=\"145.51025\" y=\"136.91018\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m1e30e819a8\" x=\"145.1524\" y=\"136.91821\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 46.965625 143.1 \n",
"L 46.965625 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 242.265625 143.1 \n",
"L 242.265625 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 46.965625 143.1 \n",
"L 242.265625 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 46.965625 7.2 \n",
"L 242.265625 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p62bcb874e4\">\n",
" <rect x=\"46.965625\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"def show_trace(results, f):\n",
"    # Plot f over a symmetric range wide enough to contain every iterate\n",
"    n = max(abs(min(results)), abs(max(results)))\n",
"    f_line = torch.arange(-n, n, 0.01)\n",
"    d2l.set_figsize()\n",
"    d2l.plot([f_line, results], [[f(x) for x in f_line], [\n",
"        f(x) for x in results]], 'x', 'f(x)', fmts=['-', '-o'])\n",
"\n",
"show_trace(results, f)"
]
},
{
"cell_type": "markdown",
"id": "467465ec",
"metadata": {
"origin_pos": 12
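},
"source": [
"### Learning Rate\n",
":label:`subsec_gd-learningrate`\n",
"\n",
"The *learning rate* $\\eta$ determines whether the iterates converge to a local minimum and how quickly they do so.\n",
"It is set by the algorithm designer.\n",
"If the learning rate is too small, $x$ updates very slowly and many more iterations are needed to reach a good solution.\n",
"For example, consider the progress on the same optimization problem with $\\eta = 0.05$.\n",
"As shown below, even after 10 steps we are still far from the optimal solution.\n"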
]
},
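{
"cell_type": "markdown",
"metadata": {},
"source": [
"A rough worked estimate shows how much slower this is. It assumes the same objective $f(x)=x^2$ and initial value $x=10$ used earlier; the symbol $x_k$ for the value after $k$ steps is notation introduced only for this sketch. Because $f'(x)=2x$, each step multiplies $x$ by $1-2\\eta$, so\n",
"\n",
"$$x_k = 10\\,(1-2\\eta)^k, \\qquad x_{10} = 10 \\cdot 0.9^{10} \\approx 3.487 \\quad \\text{for } \\eta = 0.05.$$\n",
"\n",
"To get as close as $\\eta = 0.2$ does in 10 steps, where $x_{10} = 10 \\cdot 0.6^{10} \\approx 0.0605$, we would need\n",
"\n",
"$$0.9^k \\le 0.6^{10} \\iff k \\ge \\frac{10 \\ln 0.6}{\\ln 0.9} \\approx 48.5,$$\n",
"\n",
"that is, about 49 steps instead of 10.\n"
]
},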
{
"cell_type": "code",
"execution_count": 5,
"id": "f785cac5",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:26.486541Z",
"iopub.status.busy": "2023-08-18T07:01:26.485458Z",
"iopub.status.idle": "2023-08-18T07:01:26.770061Z",
"shell.execute_reply": "2023-08-18T07:01:26.768785Z"
},
"origin_pos": 13,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 10, x: 3.486784\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"249.465625pt\" height=\"180.65625pt\" viewBox=\"0 0 249.465625 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:01:26.717468</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 249.465625 180.65625 \n",
"L 249.465625 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 46.965625 143.1 \n",
"L 242.265625 143.1 \n",
"L 242.265625 7.2 \n",
"L 46.965625 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 55.842898 143.1 \n",
"L 55.842898 7.2 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"m55e3e10dda\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m55e3e10dda\" x=\"55.842898\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(45.290554 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"147.412109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 100.229261 143.1 \n",
"L 100.229261 7.2 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#m55e3e10dda\" x=\"100.229261\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(92.858168 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 144.615625 143.1 \n",
"L 144.615625 7.2 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m55e3e10dda\" x=\"144.615625\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(141.434375 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 189.001989 143.1 \n",
"L 189.001989 7.2 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#m55e3e10dda\" x=\"189.001989\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(185.820739 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-35\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 233.388352 143.1 \n",
"L 233.388352 7.2 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#m55e3e10dda\" x=\"233.388352\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(227.025852 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- x -->\n",
" <g transform=\"translate(141.65625 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 46.965625 136.922727 \n",
"L 242.265625 136.922727 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"mf6d008a5b3\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#mf6d008a5b3\" x=\"46.965625\" y=\"136.922727\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(33.603125 140.721946)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 46.965625 112.213636 \n",
"L 242.265625 112.213636 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#mf6d008a5b3\" x=\"46.965625\" y=\"112.213636\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 20 -->\n",
" <g transform=\"translate(27.240625 116.012855)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 46.965625 87.504545 \n",
"L 242.265625 87.504545 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#mf6d008a5b3\" x=\"46.965625\" y=\"87.504545\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 40 -->\n",
" <g transform=\"translate(27.240625 91.303764)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-34\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 46.965625 62.795455 \n",
"L 242.265625 62.795455 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#mf6d008a5b3\" x=\"46.965625\" y=\"62.795455\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 60 -->\n",
" <g transform=\"translate(27.240625 66.594673)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-36\" d=\"M 2113 2584 \n",
"Q 1688 2584 1439 2293 \n",
"Q 1191 2003 1191 1497 \n",
"Q 1191 994 1439 701 \n",
"Q 1688 409 2113 409 \n",
"Q 2538 409 2786 701 \n",
"Q 3034 994 3034 1497 \n",
"Q 3034 2003 2786 2293 \n",
"Q 2538 2584 2113 2584 \n",
"z\n",
"M 3366 4563 \n",
"L 3366 3988 \n",
"Q 3128 4100 2886 4159 \n",
"Q 2644 4219 2406 4219 \n",
"Q 1781 4219 1451 3797 \n",
"Q 1122 3375 1075 2522 \n",
"Q 1259 2794 1537 2939 \n",
"Q 1816 3084 2150 3084 \n",
"Q 2853 3084 3261 2657 \n",
"Q 3669 2231 3669 1497 \n",
"Q 3669 778 3244 343 \n",
"Q 2819 -91 2113 -91 \n",
"Q 1303 -91 875 529 \n",
"Q 447 1150 447 2328 \n",
"Q 447 3434 972 4092 \n",
"Q 1497 4750 2381 4750 \n",
"Q 2619 4750 2861 4703 \n",
"Q 3103 4656 3366 4563 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-36\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 46.965625 38.086364 \n",
"L 242.265625 38.086364 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <g>\n",
" <use xlink:href=\"#mf6d008a5b3\" x=\"46.965625\" y=\"38.086364\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 80 -->\n",
" <g transform=\"translate(27.240625 41.885582)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-38\" d=\"M 2034 2216 \n",
"Q 1584 2216 1326 1975 \n",
"Q 1069 1734 1069 1313 \n",
"Q 1069 891 1326 650 \n",
"Q 1584 409 2034 409 \n",
"Q 2484 409 2743 651 \n",
"Q 3003 894 3003 1313 \n",
"Q 3003 1734 2745 1975 \n",
"Q 2488 2216 2034 2216 \n",
"z\n",
"M 1403 2484 \n",
"Q 997 2584 770 2862 \n",
"Q 544 3141 544 3541 \n",
"Q 544 4100 942 4425 \n",
"Q 1341 4750 2034 4750 \n",
"Q 2731 4750 3128 4425 \n",
"Q 3525 4100 3525 3541 \n",
"Q 3525 3141 3298 2862 \n",
"Q 3072 2584 2669 2484 \n",
"Q 3125 2378 3379 2068 \n",
"Q 3634 1759 3634 1313 \n",
"Q 3634 634 3220 271 \n",
"Q 2806 -91 2034 -91 \n",
"Q 1263 -91 848 271 \n",
"Q 434 634 434 1313 \n",
"Q 434 1759 690 2068 \n",
"Q 947 2378 1403 2484 \n",
"z\n",
"M 1172 3481 \n",
"Q 1172 3119 1398 2916 \n",
"Q 1625 2713 2034 2713 \n",
"Q 2441 2713 2670 2916 \n",
"Q 2900 3119 2900 3481 \n",
"Q 2900 3844 2670 4047 \n",
"Q 2441 4250 2034 4250 \n",
"Q 1625 4250 1398 4047 \n",
"Q 1172 3844 1172 3481 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-38\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_6\">\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 46.965625 13.377273 \n",
"L 242.265625 13.377273 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <g>\n",
" <use xlink:href=\"#mf6d008a5b3\" x=\"46.965625\" y=\"13.377273\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- 100 -->\n",
" <g transform=\"translate(20.878125 17.176491)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"127.246094\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- f(x) -->\n",
" <g transform=\"translate(14.798438 83.771094)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-66\" d=\"M 2375 4863 \n",
"L 2375 4384 \n",
"L 1825 4384 \n",
"Q 1516 4384 1395 4259 \n",
"Q 1275 4134 1275 3809 \n",
"L 1275 3500 \n",
"L 2222 3500 \n",
"L 2222 3053 \n",
"L 1275 3053 \n",
"L 1275 0 \n",
"L 697 0 \n",
"L 697 3053 \n",
"L 147 3053 \n",
"L 147 3500 \n",
"L 697 3500 \n",
"L 697 3744 \n",
"Q 697 4328 969 4595 \n",
"Q 1241 4863 1831 4863 \n",
"L 2375 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-28\" d=\"M 1984 4856 \n",
"Q 1566 4138 1362 3434 \n",
"Q 1159 2731 1159 2009 \n",
"Q 1159 1288 1364 580 \n",
"Q 1569 -128 1984 -844 \n",
"L 1484 -844 \n",
"Q 1016 -109 783 600 \n",
"Q 550 1309 550 2009 \n",
"Q 550 2706 781 3412 \n",
"Q 1013 4119 1484 4856 \n",
"L 1984 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-29\" d=\"M 513 4856 \n",
"L 1013 4856 \n",
"Q 1481 4119 1714 3412 \n",
"Q 1947 2706 1947 2009 \n",
"Q 1947 1309 1714 600 \n",
"Q 1481 -109 1013 -844 \n",
"L 513 -844 \n",
"Q 928 -128 1133 580 \n",
"Q 1338 1288 1338 2009 \n",
"Q 1338 2731 1133 3434 \n",
"Q 928 4138 513 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-66\"/>\n",
" <use xlink:href=\"#DejaVuSans-28\" x=\"35.205078\"/>\n",
" <use xlink:href=\"#DejaVuSans-78\" x=\"74.21875\"/>\n",
" <use xlink:href=\"#DejaVuSans-29\" x=\"133.398438\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_23\">\n",
" <path d=\"M 55.842898 13.377273 \n",
"L 60.459075 25.891924 \n",
"L 64.89772 37.295193 \n",
"L 69.247573 47.871029 \n",
"L 73.508668 57.655837 \n",
"L 77.680989 66.685159 \n",
"L 81.764535 74.993839 \n",
"L 85.75931 82.615983 \n",
"L 89.576536 89.431856 \n",
"L 93.304987 95.648166 \n",
"L 96.944671 101.29605 \n",
"L 100.495581 106.405892 \n",
"L 103.957717 111.00734 \n",
"L 107.331081 115.129312 \n",
"L 110.615671 118.799969 \n",
"L 113.811488 122.046742 \n",
"L 116.918535 124.89632 \n",
"L 120.02558 127.443208 \n",
"L 123.043852 129.627491 \n",
"L 125.973351 131.474372 \n",
"L 128.814079 133.008313 \n",
"L 131.654806 134.289232 \n",
"L 134.406761 135.288838 \n",
"L 137.158716 136.050991 \n",
"L 139.821898 136.562469 \n",
"L 142.48508 136.851565 \n",
"L 145.148261 136.91828 \n",
"L 147.811443 136.762612 \n",
"L 150.474625 136.384563 \n",
"L 153.137807 135.784132 \n",
"L 155.889762 134.930062 \n",
"L 158.641715 133.838539 \n",
"L 161.39367 132.50956 \n",
"L 164.234398 130.888644 \n",
"L 167.163897 128.952069 \n",
"L 170.093397 126.746412 \n",
"L 173.111671 124.19248 \n",
"L 176.218715 121.265071 \n",
"L 179.414535 117.938238 \n",
"L 182.61035 114.29118 \n",
"L 185.894944 110.20911 \n",
"L 189.268309 105.664614 \n",
"L 192.730444 100.62952 \n",
"L 196.281354 95.074914 \n",
"L 199.921034 88.971153 \n",
"L 203.649489 82.287839 \n",
"L 207.55549 74.818774 \n",
"L 211.550261 66.685159 \n",
"L 215.633807 57.853636 \n",
"L 219.806119 48.290118 \n",
"L 224.067214 37.959738 \n",
"L 228.417076 26.826932 \n",
"L 232.855721 14.855356 \n",
"L 233.299578 13.624247 \n",
"L 233.299578 13.624247 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_24\">\n",
" <path d=\"M 233.388352 13.377273 \n",
"L 224.51108 36.850909 \n",
"L 216.521534 55.864555 \n",
"L 209.330943 71.265607 \n",
"L 202.859411 83.74046 \n",
"L 197.035033 93.845091 \n",
"L 191.793092 102.029842 \n",
"L 187.075345 108.65949 \n",
"L 182.829373 114.029505 \n",
"L 179.007998 118.379217 \n",
"L 175.568761 121.902484 \n",
"\" clip-path=\"url(#p9c16b0ba75)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"mbc97294271\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p9c16b0ba75)\">\n",
" <use xlink:href=\"#mbc97294271\" x=\"233.388352\" y=\"13.377273\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mbc97294271\" x=\"224.51108\" y=\"36.850909\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mbc97294271\" x=\"216.521534\" y=\"55.864555\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mbc97294271\" x=\"209.330943\" y=\"71.265607\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mbc97294271\" x=\"202.859411\" y=\"83.74046\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mbc97294271\" x=\"197.035033\" y=\"93.845091\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mbc97294271\" x=\"191.793092\" y=\"102.029842\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mbc97294271\" x=\"187.075345\" y=\"108.65949\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mbc97294271\" x=\"182.829373\" y=\"114.029505\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mbc97294271\" x=\"179.007998\" y=\"118.379217\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mbc97294271\" x=\"175.568761\" y=\"121.902484\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 46.965625 143.1 \n",
"L 46.965625 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 242.265625 143.1 \n",
"L 242.265625 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 46.965625 143.1 \n",
"L 242.265625 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 46.965625 7.2 \n",
"L 242.265625 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p9c16b0ba75\">\n",
" <rect x=\"46.965625\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"show_trace(gd(0.05, f_grad), f)"
]
},
{
"cell_type": "markdown",
"id": "ffd34ac9",
"metadata": {
"origin_pos": 14
},
"source": [
"相反,如果我们使用过高的学习率,$\\left|\\eta f'(x)\\right|$对于一阶泰勒展开式可能太大。\n",
"也就是说, :eqref:`gd-taylor`中的$\\mathcal{O}(\\eta^2 f'^2(x))$可能变得显著了。\n",
"在这种情况下,$x$的迭代不能保证降低$f(x)$的值。\n",
"例如,当学习率为$\\eta=1.1$时,$x$超出了最优解$x=0$并逐渐发散。\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "dabdbe10",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:26.775496Z",
"iopub.status.busy": "2023-08-18T07:01:26.774773Z",
"iopub.status.idle": "2023-08-18T07:01:27.351226Z",
"shell.execute_reply": "2023-08-18T07:01:27.349971Z"
},
"origin_pos": 15,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 10, x: 61.917364\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"255.828125pt\" height=\"183.635382pt\" viewBox=\"0 0 255.828125 183.635382\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:01:27.300248</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 183.635382 \n",
"L 255.828125 183.635382 \n",
"L 255.828125 0 \n",
"L 0 0 \n",
"L 0 183.635382 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 53.328125 146.079132 \n",
"L 248.628125 146.079132 \n",
"L 248.628125 10.179132 \n",
"L 53.328125 10.179132 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 79.291672 146.079132 \n",
"L 79.291672 10.179132 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"mefc5dc4dba\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#mefc5dc4dba\" x=\"79.291672\" y=\"146.079132\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 50 -->\n",
" <g transform=\"translate(68.739328 160.677569)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"147.412109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 115.134899 146.079132 \n",
"L 115.134899 10.179132 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#mefc5dc4dba\" x=\"115.134899\" y=\"146.079132\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 25 -->\n",
" <g transform=\"translate(104.582555 160.677569)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"147.412109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 150.978126 146.079132 \n",
"L 150.978126 10.179132 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#mefc5dc4dba\" x=\"150.978126\" y=\"146.079132\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(147.796876 160.677569)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 186.821353 146.079132 \n",
"L 186.821353 10.179132 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#mefc5dc4dba\" x=\"186.821353\" y=\"146.079132\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 25 -->\n",
" <g transform=\"translate(180.458853 160.677569)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 222.664581 146.079132 \n",
"L 222.664581 10.179132 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#mefc5dc4dba\" x=\"222.664581\" y=\"146.079132\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 50 -->\n",
" <g transform=\"translate(216.302081 160.677569)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-35\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- x -->\n",
" <g transform=\"translate(148.01875 174.355694)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 53.328125 139.901859 \n",
"L 248.628125 139.901859 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"me500198b42\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#me500198b42\" x=\"53.328125\" y=\"139.901859\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(39.965625 143.701078)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 53.328125 107.676199 \n",
"L 248.628125 107.676199 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#me500198b42\" x=\"53.328125\" y=\"107.676199\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 1000 -->\n",
" <g transform=\"translate(20.878125 111.475418)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"127.246094\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"190.869141\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 53.328125 75.450539 \n",
"L 248.628125 75.450539 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#me500198b42\" x=\"53.328125\" y=\"75.450539\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 2000 -->\n",
" <g transform=\"translate(20.878125 79.249758)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"127.246094\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"190.869141\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 53.328125 43.224879 \n",
"L 248.628125 43.224879 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#me500198b42\" x=\"53.328125\" y=\"43.224879\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 3000 -->\n",
" <g transform=\"translate(20.878125 47.024098)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-33\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"127.246094\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"190.869141\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 53.328125 10.999219 \n",
"L 248.628125 10.999219 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <g>\n",
" <use xlink:href=\"#me500198b42\" x=\"53.328125\" y=\"10.999219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 4000 -->\n",
" <g transform=\"translate(20.878125 14.798437)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-34\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"127.246094\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"190.869141\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- f(x) -->\n",
" <g transform=\"translate(14.798437 86.750225)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-66\" d=\"M 2375 4863 \n",
"L 2375 4384 \n",
"L 1825 4384 \n",
"Q 1516 4384 1395 4259 \n",
"Q 1275 4134 1275 3809 \n",
"L 1275 3500 \n",
"L 2222 3500 \n",
"L 2222 3053 \n",
"L 1275 3053 \n",
"L 1275 0 \n",
"L 697 0 \n",
"L 697 3053 \n",
"L 147 3053 \n",
"L 147 3500 \n",
"L 697 3500 \n",
"L 697 3744 \n",
"Q 697 4328 969 4595 \n",
"Q 1241 4863 1831 4863 \n",
"L 2375 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-28\" d=\"M 1984 4856 \n",
"Q 1566 4138 1362 3434 \n",
"Q 1159 2731 1159 2009 \n",
"Q 1159 1288 1364 580 \n",
"Q 1569 -128 1984 -844 \n",
"L 1484 -844 \n",
"Q 1016 -109 783 600 \n",
"Q 550 1309 550 2009 \n",
"Q 550 2706 781 3412 \n",
"Q 1013 4119 1484 4856 \n",
"L 1984 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-29\" d=\"M 513 4856 \n",
"L 1013 4856 \n",
"Q 1481 4119 1714 3412 \n",
"Q 1947 2706 1947 2009 \n",
"Q 1947 1309 1714 600 \n",
"Q 1481 -109 1013 -844 \n",
"L 513 -844 \n",
"Q 928 -128 1133 580 \n",
"Q 1338 1288 1338 2009 \n",
"Q 1338 2731 1133 3434 \n",
"Q 928 4138 513 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-66\"/>\n",
" <use xlink:href=\"#DejaVuSans-28\" x=\"35.205078\"/>\n",
" <use xlink:href=\"#DejaVuSans-78\" x=\"74.21875\"/>\n",
" <use xlink:href=\"#DejaVuSans-29\" x=\"133.398438\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 62.205398 16.356404 \n",
"L 66.778997 28.758673 \n",
"L 71.25223 40.254351 \n",
"L 75.625103 50.885672 \n",
"L 79.89762 60.693932 \n",
"L 84.069766 69.71943 \n",
"L 88.141557 78.001586 \n",
"L 92.098652 85.552371 \n",
"L 95.955382 92.439193 \n",
"L 99.711756 98.698485 \n",
"L 103.367764 104.365689 \n",
"L 106.937749 109.495131 \n",
"L 110.407374 114.097459 \n",
"L 113.790972 118.222147 \n",
"L 117.088551 121.896578 \n",
"L 120.300104 125.147432 \n",
"L 123.439968 128.013076 \n",
"L 126.493813 130.503685 \n",
"L 129.490307 132.663297 \n",
"L 132.415114 134.499735 \n",
"L 135.282572 136.039782 \n",
"L 138.092681 137.298905 \n",
"L 140.859778 138.296815 \n",
"L 143.583863 139.044708 \n",
"L 146.279274 139.555719 \n",
"L 148.960347 139.838031 \n",
"L 151.627083 139.895257 \n",
"L 154.293819 139.729507 \n",
"L 156.960556 139.340782 \n",
"L 159.641629 138.725189 \n",
"L 162.351377 137.874004 \n",
"L 165.0898 136.779918 \n",
"L 167.871235 135.427954 \n",
"L 170.710018 133.797992 \n",
"L 173.591814 131.884886 \n",
"L 176.545295 129.654006 \n",
"L 179.556127 127.098282 \n",
"L 182.638644 124.187247 \n",
"L 185.792849 120.900114 \n",
"L 189.033074 117.198512 \n",
"L 192.359325 113.056218 \n",
"L 195.757267 108.466448 \n",
"L 199.255561 103.362913 \n",
"L 202.839884 97.735889 \n",
"L 206.524573 91.531403 \n",
"L 210.309617 84.714684 \n",
"L 214.166346 77.306827 \n",
"L 218.123436 69.221459 \n",
"L 222.195227 60.389213 \n",
"L 226.353043 50.833935 \n",
"L 230.61122 40.486193 \n",
"L 234.969758 29.305793 \n",
"L 239.428651 17.251603 \n",
"L 239.744073 16.375279 \n",
"L 239.744073 16.375279 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <path d=\"M 165.315417 136.679293 \n",
"L 133.773377 135.261364 \n",
"L 171.623825 133.219546 \n",
"L 126.203288 130.279329 \n",
"L 180.707933 126.045415 \n",
"L 115.302359 119.94858 \n",
"L 193.789047 111.169137 \n",
"L 99.605021 98.526739 \n",
"L 212.625853 80.321686 \n",
"L 77.000855 54.10641 \n",
"L 239.750852 16.356413 \n",
"\" clip-path=\"url(#p5e3e8d099e)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"m907449e165\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p5e3e8d099e)\">\n",
" <use xlink:href=\"#m907449e165\" x=\"165.315417\" y=\"136.679293\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m907449e165\" x=\"133.773377\" y=\"135.261364\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m907449e165\" x=\"171.623825\" y=\"133.219546\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m907449e165\" x=\"126.203288\" y=\"130.279329\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m907449e165\" x=\"180.707933\" y=\"126.045415\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m907449e165\" x=\"115.302359\" y=\"119.94858\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m907449e165\" x=\"193.789047\" y=\"111.169137\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m907449e165\" x=\"99.605021\" y=\"98.526739\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m907449e165\" x=\"212.625853\" y=\"80.321686\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m907449e165\" x=\"77.000855\" y=\"54.10641\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m907449e165\" x=\"239.750852\" y=\"16.356413\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 53.328125 146.079132 \n",
"L 53.328125 10.179132 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 248.628125 146.079132 \n",
"L 248.628125 10.179132 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 53.328125 146.079132 \n",
"L 248.628125 146.079132 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 53.328125 10.179132 \n",
"L 248.628125 10.179132 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p5e3e8d099e\">\n",
" <rect x=\"53.328125\" y=\"10.179132\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"show_trace(gd(1.1, f_grad), f)"
]
},
{
"cell_type": "markdown",
"id": "f6918593",
"metadata": {
"origin_pos": 16
},
"source": [
"### 局部最小值\n",
"\n",
"为了演示非凸函数的梯度下降,考虑函数$f(x) = x \\cdot \\cos(cx)$,其中$c$为某常数。\n",
"这个函数有无穷多个局部最小值。\n",
"根据我们选择的学习率,我们最终可能只会得到许多解的一个。\n",
"下面的例子说明了(不切实际的)高学习率如何导致较差的局部最小值。\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "51583e82",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:27.356773Z",
"iopub.status.busy": "2023-08-18T07:01:27.355855Z",
"iopub.status.idle": "2023-08-18T07:01:27.649547Z",
"shell.execute_reply": "2023-08-18T07:01:27.648346Z"
},
"origin_pos": 17,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 10, x: -1.528166\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"245.120313pt\" height=\"180.65625pt\" viewBox=\"0 0 245.120313 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:01:27.605331</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 245.120313 180.65625 \n",
"L 245.120313 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"L 42.620312 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 51.497585 143.1 \n",
"L 51.497585 7.2 \n",
"\" clip-path=\"url(#p179f62d71c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"me00d650fad\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#me00d650fad\" x=\"51.497585\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(40.945241 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"147.412109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 95.883949 143.1 \n",
"L 95.883949 7.2 \n",
"\" clip-path=\"url(#p179f62d71c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#me00d650fad\" x=\"95.883949\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(88.512855 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 140.270312 143.1 \n",
"L 140.270312 7.2 \n",
"\" clip-path=\"url(#p179f62d71c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#me00d650fad\" x=\"140.270312\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(137.089062 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 184.656676 143.1 \n",
"L 184.656676 7.2 \n",
"\" clip-path=\"url(#p179f62d71c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#me00d650fad\" x=\"184.656676\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(181.475426 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-35\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 229.04304 143.1 \n",
"L 229.04304 7.2 \n",
"\" clip-path=\"url(#p179f62d71c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#me00d650fad\" x=\"229.04304\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(222.68054 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- x -->\n",
" <g transform=\"translate(137.310937 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 42.620312 119.411597 \n",
"L 237.920313 119.411597 \n",
"\" clip-path=\"url(#p179f62d71c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"m91884548f2\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m91884548f2\" x=\"42.620312\" y=\"119.411597\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(20.878125 123.210816)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 42.620312 75.15 \n",
"L 237.920313 75.15 \n",
"\" clip-path=\"url(#p179f62d71c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#m91884548f2\" x=\"42.620312\" y=\"75.15\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(29.257812 78.949219)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 42.620312 30.888403 \n",
"L 237.920313 30.888403 \n",
"\" clip-path=\"url(#p179f62d71c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#m91884548f2\" x=\"42.620312\" y=\"30.888403\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(29.257812 34.687622)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-35\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- f(x) -->\n",
" <g transform=\"translate(14.798437 83.771094)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-66\" d=\"M 2375 4863 \n",
"L 2375 4384 \n",
"L 1825 4384 \n",
"Q 1516 4384 1395 4259 \n",
"Q 1275 4134 1275 3809 \n",
"L 1275 3500 \n",
"L 2222 3500 \n",
"L 2222 3053 \n",
"L 1275 3053 \n",
"L 1275 0 \n",
"L 697 0 \n",
"L 697 3053 \n",
"L 147 3053 \n",
"L 147 3500 \n",
"L 697 3500 \n",
"L 697 3744 \n",
"Q 697 4328 969 4595 \n",
"Q 1241 4863 1831 4863 \n",
"L 2375 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-28\" d=\"M 1984 4856 \n",
"Q 1566 4138 1362 3434 \n",
"Q 1159 2731 1159 2009 \n",
"Q 1159 1288 1364 580 \n",
"Q 1569 -128 1984 -844 \n",
"L 1484 -844 \n",
"Q 1016 -109 783 600 \n",
"Q 550 1309 550 2009 \n",
"Q 550 2706 781 3412 \n",
"Q 1013 4119 1484 4856 \n",
"L 1984 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-29\" d=\"M 513 4856 \n",
"L 1013 4856 \n",
"Q 1481 4119 1714 3412 \n",
"Q 1947 2706 1947 2009 \n",
"Q 1947 1309 1714 600 \n",
"Q 1481 -109 1013 -844 \n",
"L 513 -844 \n",
"Q 928 -128 1133 580 \n",
"Q 1338 1288 1338 2009 \n",
"Q 1338 2731 1133 3434 \n",
"Q 928 4138 513 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-66\"/>\n",
" <use xlink:href=\"#DejaVuSans-28\" x=\"35.205078\"/>\n",
" <use xlink:href=\"#DejaVuSans-78\" x=\"74.21875\"/>\n",
" <use xlink:href=\"#DejaVuSans-29\" x=\"133.398438\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 51.497585 75.150001 \n",
"L 54.515859 61.507494 \n",
"L 57.090268 50.885046 \n",
"L 59.309586 42.62211 \n",
"L 61.351355 35.837137 \n",
"L 63.215583 30.377167 \n",
"L 64.991044 25.861634 \n",
"L 66.588956 22.386195 \n",
"L 68.098084 19.624675 \n",
"L 69.518451 17.490893 \n",
"L 70.850038 15.899777 \n",
"L 72.092859 14.769329 \n",
"L 73.246906 14.022221 \n",
"L 74.312178 13.586851 \n",
"L 75.377449 13.390552 \n",
"L 76.442725 13.427546 \n",
"L 77.507997 13.691167 \n",
"L 78.662039 14.22383 \n",
"L 79.816086 15.003166 \n",
"L 81.058907 16.105109 \n",
"L 82.479269 17.678382 \n",
"L 83.988402 19.689963 \n",
"L 85.675089 22.313917 \n",
"L 87.628087 25.787156 \n",
"L 89.847405 30.206604 \n",
"L 92.599359 36.214766 \n",
"L 96.59413 45.535707 \n",
"L 102.719453 59.79435 \n",
"L 105.648948 66.04003 \n",
"L 108.134584 70.834186 \n",
"L 110.353902 74.634558 \n",
"L 112.306903 77.556075 \n",
"L 114.17113 79.946414 \n",
"L 115.857812 81.757913 \n",
"L 117.45572 83.158273 \n",
"L 118.964857 84.195921 \n",
"L 120.385222 84.920387 \n",
"L 121.805586 85.403652 \n",
"L 123.225949 85.651421 \n",
"L 124.646313 85.67174 \n",
"L 126.066676 85.474901 \n",
"L 127.575812 85.041746 \n",
"L 129.173722 84.350543 \n",
"L 130.949177 83.330131 \n",
"L 132.902176 81.942544 \n",
"L 135.210267 80.014888 \n",
"L 138.22854 77.174086 \n",
"L 146.04054 69.663818 \n",
"L 148.34863 67.823794 \n",
"L 150.301631 66.531906 \n",
"L 152.077085 65.614112 \n",
"L 153.674995 65.026673 \n",
"L 155.18413 64.700263 \n",
"L 156.604495 64.61058 \n",
"L 158.024858 64.743475 \n",
"L 159.445222 65.10813 \n",
"L 160.865585 65.711433 \n",
"L 162.285949 66.557882 \n",
"L 163.795086 67.725879 \n",
"L 165.392994 69.263456 \n",
"L 167.079676 71.216922 \n",
"L 168.855131 73.628385 \n",
"L 170.808131 76.680042 \n",
"L 172.938677 80.448158 \n",
"L 175.33554 85.168809 \n",
"L 178.087492 91.099238 \n",
"L 181.815947 99.711378 \n",
"L 189.272858 117.052872 \n",
"L 191.936041 122.62392 \n",
"L 194.155359 126.773504 \n",
"L 196.019583 129.822238 \n",
"L 197.706269 132.178722 \n",
"L 199.215402 133.924756 \n",
"L 200.635769 135.226991 \n",
"L 201.878586 136.07617 \n",
"L 203.032632 136.608833 \n",
"L 204.097904 136.872454 \n",
"L 205.16318 136.909448 \n",
"L 206.228452 136.713149 \n",
"L 207.293723 136.277775 \n",
"L 208.358995 135.598428 \n",
"L 209.513041 134.582583 \n",
"L 210.667088 133.272041 \n",
"L 211.909901 131.528413 \n",
"L 213.241497 129.276672 \n",
"L 214.661859 126.438757 \n",
"L 216.259771 122.713826 \n",
"L 217.946449 118.183227 \n",
"L 219.721902 112.772058 \n",
"L 221.674904 106.096761 \n",
"L 223.805448 98.012828 \n",
"L 226.202315 88.022895 \n",
"L 228.954265 75.566753 \n",
"L 228.954265 75.566753 \n",
"\" clip-path=\"url(#p179f62d71c)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <path d=\"M 229.04304 75.149999 \n",
"L 145.376715 70.243883 \n",
"L 129.559103 84.150541 \n",
"L 120.033699 84.763719 \n",
"L 128.344442 84.73779 \n",
"L 120.680218 85.040448 \n",
"L 127.613196 85.028227 \n",
"L 121.143037 85.207934 \n",
"L 127.097544 85.203029 \n",
"L 121.500618 85.319975 \n",
"L 126.704368 85.319115 \n",
"\" clip-path=\"url(#p179f62d71c)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"mc9bbaa1c85\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p179f62d71c)\">\n",
" <use xlink:href=\"#mc9bbaa1c85\" x=\"229.04304\" y=\"75.149999\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mc9bbaa1c85\" x=\"145.376715\" y=\"70.243883\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mc9bbaa1c85\" x=\"129.559103\" y=\"84.150541\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mc9bbaa1c85\" x=\"120.033699\" y=\"84.763719\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mc9bbaa1c85\" x=\"128.344442\" y=\"84.73779\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mc9bbaa1c85\" x=\"120.680218\" y=\"85.040448\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mc9bbaa1c85\" x=\"127.613196\" y=\"85.028227\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mc9bbaa1c85\" x=\"121.143037\" y=\"85.207934\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mc9bbaa1c85\" x=\"127.097544\" y=\"85.203029\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mc9bbaa1c85\" x=\"121.500618\" y=\"85.319975\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mc9bbaa1c85\" x=\"126.704368\" y=\"85.319115\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 42.620312 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 42.620312 7.2 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p179f62d71c\">\n",
" <rect x=\"42.620312\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"c = torch.tensor(0.15 * np.pi)\n",
"\n",
"def f(x): # 目标函数\n",
" return x * torch.cos(c * x)\n",
"\n",
"def f_grad(x): # 目标函数的梯度\n",
" return torch.cos(c * x) - c * x * torch.sin(c * x)\n",
"\n",
"show_trace(gd(2, f_grad), f)"
]
},
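{
"cell_type": "markdown",
"id": "c41f0a27",
"metadata": {},
"source": [
"As a sanity check on the run above, the first update can be worked out by hand. This sketch assumes that `gd` starts from $x = 10$ (which is where the trace in the plot begins) and uses the learning rate $\\eta = 2$ passed to `gd`. With $c = 0.15\\pi$, the product rule gives\n",
"\n",
"$$f'(x) = \\cos(cx) - c x \\sin(cx), \\qquad f'(10) = \\cos(1.5\\pi) - 1.5\\pi \\sin(1.5\\pi) = 1.5\\pi \\approx 4.71,$$\n",
"\n",
"so that\n",
"\n",
"$$x \\leftarrow 10 - 2 \\cdot 1.5\\pi = 10 - 3\\pi \\approx 0.58.$$\n",
"\n",
"A single step therefore jumps from $x = 10$ past the deeper minimum near $x \\approx 7.3$ and lands close to the origin, which is consistent with the iterates ending up in the shallower basin around $x \\approx -1.8$.\n"
]
},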
{
"cell_type": "markdown",
"id": "3a19f7d8",
"metadata": {
"origin_pos": 18
},
"source": [
"## 多元梯度下降\n",
"\n",
"现在我们对单变量的情况有了更好的理解,让我们考虑一下$\\mathbf{x} = [x_1, x_2, \\ldots, x_d]^\\top$的情况。\n",
"即目标函数$f: \\mathbb{R}^d \\to \\mathbb{R}$将向量映射成标量。\n",
"相应地,它的梯度也是多元的,它是一个由$d$个偏导数组成的向量:\n",
"\n",
"$$\\nabla f(\\mathbf{x}) = \\bigg[\\frac{\\partial f(\\mathbf{x})}{\\partial x_1}, \\frac{\\partial f(\\mathbf{x})}{\\partial x_2}, \\ldots, \\frac{\\partial f(\\mathbf{x})}{\\partial x_d}\\bigg]^\\top.$$\n",
"\n",
"梯度中的每个偏导数元素$\\partial f(\\mathbf{x})/\\partial x_i$代表了当输入$x_i$时$f$在$\\mathbf{x}$处的变化率。\n",
"和先前单变量的情况一样,我们可以对多变量函数使用相应的泰勒近似来思考。\n",
"具体来说,\n",
"\n",
"$$f(\\mathbf{x} + \\boldsymbol{\\epsilon}) = f(\\mathbf{x}) + \\mathbf{\\boldsymbol{\\epsilon}}^\\top \\nabla f(\\mathbf{x}) + \\mathcal{O}(\\|\\boldsymbol{\\epsilon}\\|^2).$$\n",
":eqlabel:`gd-multi-taylor`\n",
"\n",
"换句话说,在$\\boldsymbol{\\epsilon}$的二阶项中,\n",
"最陡下降的方向由负梯度$-\\nabla f(\\mathbf{x})$得出。\n",
"选择合适的学习率$\\eta > 0$来生成典型的梯度下降算法:\n",
"\n",
"$$\\mathbf{x} \\leftarrow \\mathbf{x} - \\eta \\nabla f(\\mathbf{x}).$$\n",
"\n",
"这个算法在实践中的表现如何呢?\n",
"我们构造一个目标函数$f(\\mathbf{x})=x_1^2+2x_2^2$\n",
"并有二维向量$\\mathbf{x} = [x_1, x_2]^\\top$作为输入,\n",
"标量作为输出。\n",
"梯度由$\\nabla f(\\mathbf{x}) = [2x_1, 4x_2]^\\top$给出。\n",
"我们将从初始位置$[-5, -2]$通过梯度下降观察$\\mathbf{x}$的轨迹。\n",
"\n",
"我们还需要两个辅助函数:\n",
"第一个是update函数,并将其应用于初始值20次;\n",
"第二个函数会显示$\\mathbf{x}$的轨迹。\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "ef6dbba5",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:27.655203Z",
"iopub.status.busy": "2023-08-18T07:01:27.654471Z",
"iopub.status.idle": "2023-08-18T07:01:27.663479Z",
"shell.execute_reply": "2023-08-18T07:01:27.662393Z"
},
"origin_pos": 19,
"tab": [
"pytorch"
]
},
"outputs": [],
"source": [
"def train_2d(trainer, steps=20, f_grad=None): #@save\n",
" \"\"\"用定制的训练机优化2D目标函数\"\"\"\n",
" # s1和s2是稍后将使用的内部状态变量\n",
" x1, x2, s1, s2 = -5, -2, 0, 0\n",
" results = [(x1, x2)]\n",
" for i in range(steps):\n",
" if f_grad:\n",
" x1, x2, s1, s2 = trainer(x1, x2, s1, s2, f_grad)\n",
" else:\n",
" x1, x2, s1, s2 = trainer(x1, x2, s1, s2)\n",
" results.append((x1, x2))\n",
" print(f'epoch {i + 1}, x1: {float(x1):f}, x2: {float(x2):f}')\n",
" return results"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d47839ef",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:27.668397Z",
"iopub.status.busy": "2023-08-18T07:01:27.667552Z",
"iopub.status.idle": "2023-08-18T07:01:27.675730Z",
"shell.execute_reply": "2023-08-18T07:01:27.674619Z"
},
"origin_pos": 21,
"tab": [
"pytorch"
]
},
"outputs": [],
"source": [
"def show_trace_2d(f, results): #@save\n",
" \"\"\"显示优化过程中2D变量的轨迹\"\"\"\n",
" d2l.set_figsize()\n",
" d2l.plt.plot(*zip(*results), '-o', color='#ff7f0e')\n",
" x1, x2 = torch.meshgrid(torch.arange(-5.5, 1.0, 0.1),\n",
" torch.arange(-3.0, 1.0, 0.1), indexing='ij')\n",
" d2l.plt.contour(x1, x2, f(x1, x2), colors='#1f77b4')\n",
" d2l.plt.xlabel('x1')\n",
" d2l.plt.ylabel('x2')"
]
},
{
"cell_type": "markdown",
"id": "bd6785e8",
"metadata": {
"origin_pos": 23
},
"source": [
"接下来,我们观察学习率$\\eta = 0.1$时优化变量$\\mathbf{x}$的轨迹。\n",
"可以看到,经过20步之后,$\\mathbf{x}$的值接近其位于$[0, 0]$的最小值。\n",
"虽然进展相当顺利,但相当缓慢。\n"
]
},
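{
"cell_type": "markdown",
"id": "9e2b7d14",
"metadata": {},
"source": [
"For this particular objective the iterates can be written in closed form, which gives a quick check on the numbers to expect. This is a sketch assuming the plain update $\\mathbf{x} \\leftarrow \\mathbf{x} - \\eta \\nabla f(\\mathbf{x})$ with no additional state; the superscript $(t)$ for the step index is notation introduced here. Since $\\nabla f(\\mathbf{x}) = [2x_1, 4x_2]^\\top$, the two coordinates decouple:\n",
"\n",
"$$x_1 \\leftarrow (1 - 2\\eta) x_1, \\qquad x_2 \\leftarrow (1 - 4\\eta) x_2.$$\n",
"\n",
"With $\\eta = 0.1$ and the start $[-5, -2]$, after $t$ steps\n",
"\n",
"$$x_1^{(t)} = -5 \\cdot 0.8^t, \\qquad x_2^{(t)} = -2 \\cdot 0.6^t,$$\n",
"\n",
"so $x_1^{(20)} \\approx -0.0576$ and $x_2^{(20)} \\approx -7.3 \\times 10^{-5}$. The $x_1$ coordinate contracts by only a factor of $0.8$ per step, which is why progress along that direction is comparatively slow.\n"
]
},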
{
"cell_type": "code",
"execution_count": 10,
"id": "6d58435d",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:27.679558Z",
"iopub.status.busy": "2023-08-18T07:01:27.679151Z",
"iopub.status.idle": "2023-08-18T07:01:27.882280Z",
"shell.execute_reply": "2023-08-18T07:01:27.881035Z"
},
"origin_pos": 24,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 20, x1: -0.057646, x2: -0.000073\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"245.120313pt\" height=\"180.65625pt\" viewBox=\"0 0 245.120313 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:01:27.838043</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 245.120313 180.65625 \n",
"L 245.120313 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"L 42.620312 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <defs>\n",
" <path id=\"m5e41782ad5\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m5e41782ad5\" x=\"88.39375\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 4 -->\n",
" <g transform=\"translate(81.022656 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-34\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_2\">\n",
" <g>\n",
" <use xlink:href=\"#m5e41782ad5\" x=\"149.425\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(142.053907 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_3\">\n",
" <g>\n",
" <use xlink:href=\"#m5e41782ad5\" x=\"210.456251\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(207.275001 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- x1 -->\n",
" <g transform=\"translate(134.129687 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_4\">\n",
" <defs>\n",
" <path id=\"m66f6a67b37\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m66f6a67b37\" x=\"42.620312\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 3 -->\n",
" <g transform=\"translate(20.878125 146.899219)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_5\">\n",
" <g>\n",
" <use xlink:href=\"#m66f6a67b37\" x=\"42.620312\" y=\"108.253846\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(20.878125 112.053065)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m66f6a67b37\" x=\"42.620312\" y=\"73.407692\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 1 -->\n",
" <g transform=\"translate(20.878125 77.206911)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_7\">\n",
" <g>\n",
" <use xlink:href=\"#m66f6a67b37\" x=\"42.620312\" y=\"38.561538\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(29.257812 42.360757)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- x2 -->\n",
" <g transform=\"translate(14.798437 81.290625)rotate(-90)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <path d=\"M 57.878125 108.253846 \n",
"L 88.39375 80.376923 \n",
"L 112.80625 63.650769 \n",
"L 132.33625 53.615076 \n",
"L 147.96025 47.593661 \n",
"L 160.45945 43.980812 \n",
"L 170.45881 41.813102 \n",
"L 178.458299 40.512476 \n",
"L 184.857889 39.732101 \n",
"L 189.977561 39.263876 \n",
"L 194.073299 38.982941 \n",
"L 197.349889 38.814379 \n",
"L 199.971162 38.713243 \n",
"L 202.068179 38.652561 \n",
"L 203.745794 38.616152 \n",
"L 205.087885 38.594306 \n",
"L 206.161558 38.581199 \n",
"L 207.020497 38.573334 \n",
"L 207.707647 38.568616 \n",
"L 208.257368 38.565785 \n",
"L 208.697145 38.564086 \n",
"\" clip-path=\"url(#p304360d856)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"m961f3d55a7\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p304360d856)\">\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"57.878125\" y=\"108.253846\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"88.39375\" y=\"80.376923\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"112.80625\" y=\"63.650769\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"132.33625\" y=\"53.615076\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"147.96025\" y=\"47.593661\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"160.45945\" y=\"43.980812\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"170.45881\" y=\"41.813102\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"178.458299\" y=\"40.512476\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"184.857889\" y=\"39.732101\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"189.977561\" y=\"39.263876\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"194.073299\" y=\"38.982941\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"197.349889\" y=\"38.814379\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"199.971162\" y=\"38.713243\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"202.068179\" y=\"38.652561\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"203.745794\" y=\"38.616152\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"205.087885\" y=\"38.594306\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"206.161558\" y=\"38.581199\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"207.020497\" y=\"38.573334\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"207.707647\" y=\"38.568616\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"208.257368\" y=\"38.565785\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m961f3d55a7\" x=\"208.697145\" y=\"38.564086\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"PathCollection_1\"/>\n",
" <g id=\"PathCollection_2\">\n",
" <path d=\"M 133.389338 7.2 \n",
"L 131.354961 10.684614 \n",
"L 131.115628 11.149219 \n",
"L 129.618631 14.16923 \n",
"L 128.121641 17.653845 \n",
"L 128.064069 17.812225 \n",
"L 126.898922 21.138461 \n",
"L 125.900228 24.623076 \n",
"L 125.123465 28.107691 \n",
"L 125.012502 28.804598 \n",
"L 124.58421 31.592307 \n",
"L 124.262994 35.076923 \n",
"L 124.155921 38.561539 \n",
"L 124.262994 42.046154 \n",
"L 124.58421 45.530769 \n",
"L 125.012502 48.318478 \n",
"L 125.123465 49.015384 \n",
"L 125.900228 52.500001 \n",
"L 126.898922 55.984615 \n",
"L 128.064069 59.310851 \n",
"L 128.121641 59.469231 \n",
"L 129.618631 62.953845 \n",
"L 131.115628 65.973855 \n",
"L 131.354961 66.438459 \n",
"L 133.389338 69.923076 \n",
"L 134.167188 71.115182 \n",
"L 135.724107 73.407692 \n",
"L 137.218755 75.398907 \n",
"L 138.387435 76.892308 \n",
"L 140.270314 79.089128 \n",
"L 141.423126 80.37692 \n",
"L 143.321874 82.328309 \n",
"L 144.883135 83.861536 \n",
"L 146.373441 85.216671 \n",
"L 148.829573 87.346153 \n",
"L 149.425 87.826789 \n",
"L 152.476564 90.169895 \n",
"L 153.383783 90.830769 \n",
"L 155.528127 92.292059 \n",
"L 158.57969 94.25918 \n",
"L 158.672164 94.315385 \n",
"L 161.631253 96.004895 \n",
"L 164.682813 97.641606 \n",
"L 164.998499 97.800001 \n",
"L 167.734376 99.094286 \n",
"L 170.785939 100.438353 \n",
"L 172.861005 101.284618 \n",
"L 173.837499 101.66133 \n",
"L 176.889062 102.744387 \n",
"L 179.940626 103.733262 \n",
"L 182.992189 104.627962 \n",
"L 183.530712 104.769234 \n",
"L 186.04375 105.394675 \n",
"L 189.095313 106.064794 \n",
"L 192.146877 106.645562 \n",
"L 195.198438 107.136982 \n",
"L 198.250001 107.539054 \n",
"L 201.301564 107.851774 \n",
"L 204.353126 108.075148 \n",
"L 207.404689 108.20917 \n",
"L 210.456251 108.253846 \n",
"L 213.507813 108.20917 \n",
"L 216.559376 108.075148 \n",
"L 219.610939 107.851774 \n",
"L 222.662501 107.539054 \n",
"L 225.714063 107.136982 \n",
"L 228.765626 106.645562 \n",
"L 231.817188 106.064794 \n",
"L 234.868751 105.394675 \n",
"L 237.38179 104.769234 \n",
"L 237.920313 104.627962 \n",
"\" clip-path=\"url(#p304360d856)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_3\">\n",
" <path d=\"M 94.741 7.2 \n",
"L 94.496869 7.814946 \n",
"L 93.387217 10.684614 \n",
"L 92.198295 14.16923 \n",
"L 91.44531 16.715693 \n",
"L 91.174923 17.653845 \n",
"L 90.325119 21.138461 \n",
"L 89.629825 24.623076 \n",
"L 89.089044 28.107691 \n",
"L 88.702769 31.592307 \n",
"L 88.471007 35.076923 \n",
"L 88.393757 38.561206 \n",
"L 88.39375 38.561539 \n",
"L 88.393757 38.561871 \n",
"L 88.471007 42.046154 \n",
"L 88.702769 45.530769 \n",
"L 89.089044 49.015384 \n",
"L 89.629825 52.500001 \n",
"L 90.325119 55.984615 \n",
"L 91.174923 59.469231 \n",
"L 91.44531 60.407382 \n",
"L 92.198295 62.953845 \n",
"L 93.387217 66.438459 \n",
"L 94.496869 69.30813 \n",
"L 94.741 69.923076 \n",
"L 96.287127 73.407692 \n",
"L 97.548436 75.979666 \n",
"L 98.008263 76.892308 \n",
"L 99.931161 80.37692 \n",
"L 100.599996 81.491996 \n",
"L 102.061308 83.861536 \n",
"L 103.651563 86.249145 \n",
"L 104.403397 87.346153 \n",
"L 106.703122 90.470284 \n",
"L 106.976403 90.830769 \n",
"L 109.754682 94.25917 \n",
"L 109.801639 94.315385 \n",
"L 112.806249 97.694405 \n",
"L 112.903126 97.800001 \n",
"L 115.857816 100.836595 \n",
"L 116.308047 101.284618 \n",
"L 118.909375 103.733262 \n",
"L 120.04725 104.769234 \n",
"L 121.960942 106.422193 \n",
"L 124.155924 108.253846 \n",
"L 125.012502 108.93377 \n",
"L 128.064069 111.271016 \n",
"L 128.697415 111.738466 \n",
"L 131.115628 113.440255 \n",
"L 133.748347 115.223078 \n",
"L 134.167188 115.494103 \n",
"L 137.218755 117.391282 \n",
"L 139.426266 118.70769 \n",
"L 140.270314 119.189606 \n",
"L 143.321874 120.857774 \n",
"L 145.876677 122.192311 \n",
"L 146.373441 122.441211 \n",
"L 149.425 123.899059 \n",
"L 152.476564 125.285791 \n",
"L 153.383788 125.676923 \n",
"L 155.528127 126.565159 \n",
"L 158.57969 127.760863 \n",
"L 161.631253 128.888239 \n",
"L 162.418734 129.161535 \n",
"L 164.682813 129.917634 \n",
"L 167.734376 130.870972 \n",
"L 170.785939 131.758564 \n",
"L 173.837499 132.580406 \n",
"L 174.102859 132.646155 \n",
"L 176.889062 133.311398 \n",
"L 179.940626 133.976642 \n",
"L 182.992189 134.57853 \n",
"L 186.04375 135.117062 \n",
"L 189.095313 135.592236 \n",
"L 192.146877 136.004057 \n",
"L 193.256511 136.130768 \n",
"L 195.198438 136.344737 \n",
"L 198.250001 136.619839 \n",
"L 201.301564 136.833805 \n",
"L 204.353126 136.986639 \n",
"L 207.404689 137.078339 \n",
"L 210.456251 137.108907 \n",
"L 213.507813 137.078339 \n",
"L 216.559376 136.986639 \n",
"L 219.610939 136.833805 \n",
"L 222.662501 136.619839 \n",
"L 225.714063 136.344737 \n",
"L 227.655991 136.130768 \n",
"L 228.765626 136.004057 \n",
"L 231.817188 135.592236 \n",
"L 234.868751 135.117062 \n",
"L 237.920313 134.57853 \n",
"\" clip-path=\"url(#p304360d856)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_4\">\n",
" <path d=\"M 66.101283 7.2 \n",
"L 65.009145 10.684614 \n",
"L 64.045489 14.16923 \n",
"L 63.981244 14.437284 \n",
"L 63.226219 17.653845 \n",
"L 62.534115 21.138461 \n",
"L 61.967845 24.623076 \n",
"L 61.527415 28.107691 \n",
"L 61.21282 31.592307 \n",
"L 61.024065 35.076923 \n",
"L 60.961145 38.561539 \n",
"L 61.024065 42.046154 \n",
"L 61.21282 45.530769 \n",
"L 61.527415 49.015384 \n",
"L 61.967845 52.500001 \n",
"L 62.534115 55.984615 \n",
"L 63.226219 59.469231 \n",
"L 63.981244 62.685792 \n",
"L 64.045489 62.953845 \n",
"L 65.009145 66.438459 \n",
"L 66.101283 69.923076 \n",
"L 67.032818 72.582402 \n",
"L 67.328125 73.407692 \n",
"L 68.706252 76.892308 \n",
"L 70.084378 80.073917 \n",
"L 70.218509 80.37692 \n",
"L 71.895191 83.861536 \n",
"L 73.135938 86.249145 \n",
"L 73.718821 87.346153 \n",
"L 75.70748 90.830769 \n",
"L 76.187512 91.617635 \n",
"L 77.871125 94.315385 \n",
"L 79.239071 96.374484 \n",
"L 80.208385 97.800001 \n",
"L 82.290631 100.687261 \n",
"L 82.731817 101.284618 \n",
"L 85.342191 104.627962 \n",
"L 85.455214 104.769234 \n",
"L 88.39375 108.253846 \n",
"L 88.393757 108.253854 \n",
"L 91.44531 111.610969 \n",
"L 91.564212 111.738466 \n",
"L 94.496869 114.736843 \n",
"L 94.98513 115.223078 \n",
"L 97.548436 117.662305 \n",
"L 98.677098 118.70769 \n",
"L 100.599996 120.412926 \n",
"L 102.66303 122.192311 \n",
"L 103.651563 123.010127 \n",
"L 106.703122 125.463574 \n",
"L 106.976403 125.676923 \n",
"L 109.754682 127.760856 \n",
"L 111.679511 129.161535 \n",
"L 112.806249 129.950509 \n",
"L 115.857816 132.021554 \n",
"L 116.808306 132.646155 \n",
"L 118.909375 133.97664 \n",
"L 121.960942 135.845668 \n",
"L 122.442758 136.130768 \n",
"L 125.012502 137.597976 \n",
"L 128.064069 139.279152 \n",
"L 128.697416 139.615388 \n",
"L 131.115628 140.855672 \n",
"L 134.167188 142.361734 \n",
"L 135.724104 143.1 \n",
"\" clip-path=\"url(#p304360d856)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_5\">\n",
" <path d=\"M 42.620312 71.115188 \n",
"L 43.320213 73.407692 \n",
"L 44.496043 76.892308 \n",
"L 45.671872 80.073903 \n",
"L 45.785952 80.37692 \n",
"L 47.211915 83.861536 \n",
"L 48.723432 87.281609 \n",
"L 48.7525 87.346153 \n",
"L 50.438126 90.830769 \n",
"L 51.775006 93.41615 \n",
"L 52.249024 94.315385 \n",
"L 54.204398 97.800001 \n",
"L 54.826565 98.845393 \n",
"L 56.307021 101.284618 \n",
"L 57.878125 103.733263 \n",
"L 58.556253 104.769234 \n",
"L 60.929685 108.209162 \n",
"L 60.961151 108.253846 \n",
"L 63.540821 111.738466 \n",
"L 63.981244 112.305719 \n",
"L 66.294014 115.223078 \n",
"L 67.032818 116.113596 \n",
"L 69.231252 118.70769 \n",
"L 70.084378 119.671522 \n",
"L 72.364666 122.192311 \n",
"L 73.135938 123.010127 \n",
"L 75.70748 125.676923 \n",
"L 76.187512 126.155214 \n",
"L 79.239071 129.127391 \n",
"L 79.274952 129.161535 \n",
"L 82.290631 131.922942 \n",
"L 83.09947 132.646155 \n",
"L 85.342191 134.578536 \n",
"L 87.188189 136.130768 \n",
"L 88.393757 137.108915 \n",
"L 91.44531 139.52368 \n",
"L 91.564212 139.615388 \n",
"L 94.496869 141.800646 \n",
"L 96.287127 143.1 \n",
"\" clip-path=\"url(#p304360d856)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_6\">\n",
" <path d=\"M 42.620312 115.494102 \n",
"L 44.943982 118.70769 \n",
"L 45.671872 119.671517 \n",
"L 47.611182 122.192311 \n",
"L 48.723432 123.579037 \n",
"L 50.438126 125.676923 \n",
"L 51.775006 127.248424 \n",
"L 53.434101 129.161535 \n",
"L 54.826565 130.706605 \n",
"L 56.609163 132.646155 \n",
"L 57.878125 133.97664 \n",
"L 59.974143 136.130768 \n",
"L 60.929685 137.078341 \n",
"L 63.540815 139.615388 \n",
"L 63.981244 140.028815 \n",
"L 67.032818 142.834235 \n",
"L 67.32812 143.1 \n",
"\" clip-path=\"url(#p304360d856)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_7\">\n",
" <path d=\"M 42.620312 142.361735 \n",
"L 43.320214 143.1 \n",
"\" clip-path=\"url(#p304360d856)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_8\"/>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 42.620312 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 42.620312 7.2 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p304360d856\">\n",
" <rect x=\"42.620312\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"def f_2d(x1, x2): # 目标函数\n",
" return x1 ** 2 + 2 * x2 ** 2\n",
"\n",
"def f_2d_grad(x1, x2): # 目标函数的梯度\n",
" return (2 * x1, 4 * x2)\n",
"\n",
"def gd_2d(x1, x2, s1, s2, f_grad):\n",
" g1, g2 = f_grad(x1, x2)\n",
" return (x1 - eta * g1, x2 - eta * g2, 0, 0)\n",
"\n",
"eta = 0.1\n",
"show_trace_2d(f_2d, train_2d(gd_2d, f_grad=f_2d_grad))"
]
},
{
"cell_type": "markdown",
"id": "1e90e61f",
"metadata": {
"origin_pos": 25
},
"source": [
"## 自适应方法\n",
"\n",
"正如我们在 :numref:`subsec_gd-learningrate`中所看到的,选择“恰到好处”的学习率$\\eta$是很棘手的。\n",
"如果我们把它选得太小,就没有什么进展;如果太大,得到的解就会振荡,甚至可能发散。\n",
"如果我们可以自动确定$\\eta$,或者完全不必选择学习率,会怎么样?\n",
"除了考虑目标函数的值和梯度、还考虑它的曲率的二阶方法可以帮我们解决这个问题。\n",
"虽然由于计算代价的原因,这些方法不能直接应用于深度学习,但它们为如何设计高级优化算法提供了有用的思维直觉,这些算法可以模拟下面概述的算法的许多理想特性。\n",
"\n",
"### 牛顿法\n",
"\n",
"回顾一些函数$f: \\mathbb{R}^d \\rightarrow \\mathbb{R}$的泰勒展开式,事实上我们可以把它写成\n",
"\n",
"$$f(\\mathbf{x} + \\boldsymbol{\\epsilon}) = f(\\mathbf{x}) + \\boldsymbol{\\epsilon}^\\top \\nabla f(\\mathbf{x}) + \\frac{1}{2} \\boldsymbol{\\epsilon}^\\top \\nabla^2 f(\\mathbf{x}) \\boldsymbol{\\epsilon} + \\mathcal{O}(\\|\\boldsymbol{\\epsilon}\\|^3).$$\n",
":eqlabel:`gd-hot-taylor`\n",
"\n",
"为了避免繁琐的符号,我们将$\\mathbf{H} \\stackrel{\\mathrm{def}}{=} \\nabla^2 f(\\mathbf{x})$定义为$f$的Hessian,是$d \\times d$矩阵。\n",
"当$d$的值很小且问题很简单时,$\\mathbf{H}$很容易计算。\n",
"但是对于深度神经网络而言,考虑到$\\mathbf{H}$可能非常大,\n",
"$\\mathcal{O}(d^2)$个条目的存储代价会很高,\n",
"此外通过反向传播进行计算可能雪上加霜。\n",
"然而,我们姑且先忽略这些考量,看看会得到什么算法。\n",
"\n",
"毕竟,$f$的最小值满足$\\nabla f = 0$。\n",
"遵循 :numref:`sec_calculus`中的微积分规则,\n",
"通过取$\\boldsymbol{\\epsilon}$对 :eqref:`gd-hot-taylor`的导数,\n",
"再忽略不重要的高阶项,我们便得到\n",
"\n",
"$$\\nabla f(\\mathbf{x}) + \\mathbf{H} \\boldsymbol{\\epsilon} = 0 \\text{ and hence }\n",
"\\boldsymbol{\\epsilon} = -\\mathbf{H}^{-1} \\nabla f(\\mathbf{x}).$$\n",
"\n",
"也就是说,作为优化问题的一部分,我们需要将Hessian矩阵$\\mathbf{H}$求逆。\n",
"\n",
"举一个简单的例子,对于$f(x) = \\frac{1}{2} x^2$,我们有$\\nabla f(x) = x$和$\\mathbf{H} = 1$。\n",
"因此,对于任何$x$,我们可以获得$\\epsilon = -x$。\n",
"换言之,单单一步就足以完美地收敛,而无须任何调整。\n",
"我们在这里比较幸运:泰勒展开式是确切的,因为$f(x+\\epsilon)= \\frac{1}{2} x^2 + \\epsilon x + \\frac{1}{2} \\epsilon^2$。\n",
"\n",
"让我们看看其他问题。\n",
"给定一个凸双曲余弦函数$c$,其中$c$为某些常数,\n",
"我们可以看到经过几次迭代后,得到了$x=0$处的全局最小值。\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "de14c854",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:27.887060Z",
"iopub.status.busy": "2023-08-18T07:01:27.886284Z",
"iopub.status.idle": "2023-08-18T07:01:28.178108Z",
"shell.execute_reply": "2023-08-18T07:01:28.176813Z"
},
"origin_pos": 26,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 10, x: tensor(0.)\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"243.103125pt\" height=\"180.65625pt\" viewBox=\"0 0 243.103125 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:01:28.130732</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 243.103125 180.65625 \n",
"L 243.103125 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 40.603125 143.1 \n",
"L 235.903125 143.1 \n",
"L 235.903125 7.2 \n",
"L 40.603125 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 49.480398 143.1 \n",
"L 49.480398 7.2 \n",
"\" clip-path=\"url(#p20691e33d4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"mb45305dd2c\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#mb45305dd2c\" x=\"49.480398\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(38.928054 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"147.412109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 93.866761 143.1 \n",
"L 93.866761 7.2 \n",
"\" clip-path=\"url(#p20691e33d4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#mb45305dd2c\" x=\"93.866761\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(86.495668 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 138.253125 143.1 \n",
"L 138.253125 7.2 \n",
"\" clip-path=\"url(#p20691e33d4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#mb45305dd2c\" x=\"138.253125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(135.071875 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 182.639489 143.1 \n",
"L 182.639489 7.2 \n",
"\" clip-path=\"url(#p20691e33d4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#mb45305dd2c\" x=\"182.639489\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(179.458239 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-35\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 227.025852 143.1 \n",
"L 227.025852 7.2 \n",
"\" clip-path=\"url(#p20691e33d4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#mb45305dd2c\" x=\"227.025852\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(220.663352 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- x -->\n",
" <g transform=\"translate(135.29375 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 40.603125 138.610277 \n",
"L 235.903125 138.610277 \n",
"\" clip-path=\"url(#p20691e33d4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"m35bd69b896\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m35bd69b896\" x=\"40.603125\" y=\"138.610277\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(27.240625 142.409496)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 40.603125 104.859278 \n",
"L 235.903125 104.859278 \n",
"\" clip-path=\"url(#p20691e33d4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#m35bd69b896\" x=\"40.603125\" y=\"104.859278\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 20 -->\n",
" <g transform=\"translate(20.878125 108.658497)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 40.603125 71.108278 \n",
"L 235.903125 71.108278 \n",
"\" clip-path=\"url(#p20691e33d4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#m35bd69b896\" x=\"40.603125\" y=\"71.108278\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 40 -->\n",
" <g transform=\"translate(20.878125 74.907497)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-34\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 40.603125 37.357279 \n",
"L 235.903125 37.357279 \n",
"\" clip-path=\"url(#p20691e33d4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#m35bd69b896\" x=\"40.603125\" y=\"37.357279\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 60 -->\n",
" <g transform=\"translate(20.878125 41.156498)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-36\" d=\"M 2113 2584 \n",
"Q 1688 2584 1439 2293 \n",
"Q 1191 2003 1191 1497 \n",
"Q 1191 994 1439 701 \n",
"Q 1688 409 2113 409 \n",
"Q 2538 409 2786 701 \n",
"Q 3034 994 3034 1497 \n",
"Q 3034 2003 2786 2293 \n",
"Q 2538 2584 2113 2584 \n",
"z\n",
"M 3366 4563 \n",
"L 3366 3988 \n",
"Q 3128 4100 2886 4159 \n",
"Q 2644 4219 2406 4219 \n",
"Q 1781 4219 1451 3797 \n",
"Q 1122 3375 1075 2522 \n",
"Q 1259 2794 1537 2939 \n",
"Q 1816 3084 2150 3084 \n",
"Q 2853 3084 3261 2657 \n",
"Q 3669 2231 3669 1497 \n",
"Q 3669 778 3244 343 \n",
"Q 2819 -91 2113 -91 \n",
"Q 1303 -91 875 529 \n",
"Q 447 1150 447 2328 \n",
"Q 447 3434 972 4092 \n",
"Q 1497 4750 2381 4750 \n",
"Q 2619 4750 2861 4703 \n",
"Q 3103 4656 3366 4563 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-36\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- f(x) -->\n",
" <g transform=\"translate(14.798437 83.771094)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-66\" d=\"M 2375 4863 \n",
"L 2375 4384 \n",
"L 1825 4384 \n",
"Q 1516 4384 1395 4259 \n",
"Q 1275 4134 1275 3809 \n",
"L 1275 3500 \n",
"L 2222 3500 \n",
"L 2222 3053 \n",
"L 1275 3053 \n",
"L 1275 0 \n",
"L 697 0 \n",
"L 697 3053 \n",
"L 147 3053 \n",
"L 147 3500 \n",
"L 697 3500 \n",
"L 697 3744 \n",
"Q 697 4328 969 4595 \n",
"Q 1241 4863 1831 4863 \n",
"L 2375 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-28\" d=\"M 1984 4856 \n",
"Q 1566 4138 1362 3434 \n",
"Q 1159 2731 1159 2009 \n",
"Q 1159 1288 1364 580 \n",
"Q 1569 -128 1984 -844 \n",
"L 1484 -844 \n",
"Q 1016 -109 783 600 \n",
"Q 550 1309 550 2009 \n",
"Q 550 2706 781 3412 \n",
"Q 1013 4119 1484 4856 \n",
"L 1984 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-29\" d=\"M 513 4856 \n",
"L 1013 4856 \n",
"Q 1481 4119 1714 3412 \n",
"Q 1947 2706 1947 2009 \n",
"Q 1947 1309 1714 600 \n",
"Q 1481 -109 1013 -844 \n",
"L 513 -844 \n",
"Q 928 -128 1133 580 \n",
"Q 1338 1288 1338 2009 \n",
"Q 1338 2731 1133 3434 \n",
"Q 928 4138 513 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-66\"/>\n",
" <use xlink:href=\"#DejaVuSans-28\" x=\"35.205078\"/>\n",
" <use xlink:href=\"#DejaVuSans-78\" x=\"74.21875\"/>\n",
" <use xlink:href=\"#DejaVuSans-29\" x=\"133.398438\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 49.480398 13.377273 \n",
"L 51.522166 26.980492 \n",
"L 53.563944 39.105811 \n",
"L 55.605721 49.913743 \n",
"L 57.647489 59.547321 \n",
"L 59.689258 68.134144 \n",
"L 61.731035 75.787923 \n",
"L 63.772812 82.609959 \n",
"L 65.814581 88.690547 \n",
"L 67.945124 94.332011 \n",
"L 70.075672 99.335114 \n",
"L 72.206215 103.771963 \n",
"L 74.336763 107.706548 \n",
"L 76.556081 111.332057 \n",
"L 78.775399 114.530783 \n",
"L 80.994718 117.352778 \n",
"L 83.302811 119.93542 \n",
"L 85.69967 122.283408 \n",
"L 88.185308 124.404097 \n",
"L 90.759717 126.306978 \n",
"L 93.4229 128.003223 \n",
"L 96.263625 129.549831 \n",
"L 99.281899 130.939216 \n",
"L 102.566489 132.19986 \n",
"L 106.117397 133.316347 \n",
"L 109.934624 134.280667 \n",
"L 114.195716 135.121514 \n",
"L 118.90067 135.816968 \n",
"L 124.138261 136.360754 \n",
"L 129.908489 136.732881 \n",
"L 136.12258 136.910562 \n",
"L 142.514216 136.873892 \n",
"L 148.728306 136.620389 \n",
"L 154.498534 136.165615 \n",
"L 159.647353 135.541931 \n",
"L 164.263535 134.763827 \n",
"L 168.435851 133.837354 \n",
"L 172.164306 132.787244 \n",
"L 175.626444 131.582519 \n",
"L 178.822263 130.233733 \n",
"L 181.751762 128.759507 \n",
"L 184.503716 127.13061 \n",
"L 187.1669 125.291583 \n",
"L 189.652538 123.305207 \n",
"L 192.049397 121.105704 \n",
"L 194.35749 118.686198 \n",
"L 196.576808 116.042311 \n",
"L 198.796127 113.045339 \n",
"L 200.92667 109.792602 \n",
"L 203.057218 106.124383 \n",
"L 205.09899 102.17027 \n",
"L 207.140763 97.733708 \n",
"L 209.182536 92.755956 \n",
"L 211.224309 87.171112 \n",
"L 213.266078 80.905247 \n",
"L 215.21908 74.198133 \n",
"L 217.172083 66.71084 \n",
"L 219.125076 58.352722 \n",
"L 221.078079 49.022466 \n",
"L 223.031081 38.607112 \n",
"L 224.984084 26.980492 \n",
"L 226.937078 14.001838 \n",
"L 226.937078 14.001838 \n",
"\" clip-path=\"url(#p20691e33d4)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <path d=\"M 227.025852 13.377273 \n",
"L 209.272924 92.522075 \n",
"L 191.530286 121.607711 \n",
"L 173.86341 132.226396 \n",
"L 156.74043 135.92216 \n",
"L 142.920474 136.86408 \n",
"L 138.357749 136.922698 \n",
"L 138.253126 136.922727 \n",
"L 138.253125 136.922727 \n",
"L 138.253125 136.922727 \n",
"L 138.253125 136.922727 \n",
"\" clip-path=\"url(#p20691e33d4)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"m4983b9dc7d\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p20691e33d4)\">\n",
" <use xlink:href=\"#m4983b9dc7d\" x=\"227.025852\" y=\"13.377273\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m4983b9dc7d\" x=\"209.272924\" y=\"92.522075\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m4983b9dc7d\" x=\"191.530286\" y=\"121.607711\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m4983b9dc7d\" x=\"173.86341\" y=\"132.226396\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m4983b9dc7d\" x=\"156.74043\" y=\"135.92216\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m4983b9dc7d\" x=\"142.920474\" y=\"136.86408\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m4983b9dc7d\" x=\"138.357749\" y=\"136.922698\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m4983b9dc7d\" x=\"138.253126\" y=\"136.922727\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m4983b9dc7d\" x=\"138.253125\" y=\"136.922727\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m4983b9dc7d\" x=\"138.253125\" y=\"136.922727\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m4983b9dc7d\" x=\"138.253125\" y=\"136.922727\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 40.603125 143.1 \n",
"L 40.603125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 235.903125 143.1 \n",
"L 235.903125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 40.603125 143.1 \n",
"L 235.903125 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 40.603125 7.2 \n",
"L 235.903125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p20691e33d4\">\n",
" <rect x=\"40.603125\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
   ],
   "source": [
    "c = torch.tensor(0.5)\n",
    "\n",
    "def f(x):  # Objective function\n",
    "    return torch.cosh(c * x)\n",
    "\n",
    "def f_grad(x):  # Gradient of the objective function\n",
    "    return c * torch.sinh(c * x)\n",
    "\n",
    "def f_hess(x):  # Hessian of the objective function\n",
    "    return c**2 * torch.cosh(c * x)\n",
"\n",
"def newton(eta=1):\n",
" x = 10.0\n",
" results = [x]\n",
" for i in range(10):\n",
" x -= eta * f_grad(x) / f_hess(x)\n",
" results.append(float(x))\n",
" print('epoch 10, x:', x)\n",
" return results\n",
"\n",
"show_trace(newton(), f)"
]
},
{
"cell_type": "markdown",
"id": "61c98547",
"metadata": {
"origin_pos": 27
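   },
   "source": [
    "Now consider a nonconvex function, such as $f(x) = x \\cos(c x)$ for some constant $c$.\n",
    "Note that Newton's method ends up dividing by the Hessian.\n",
    "This means that if the second derivative is negative, a step may move in a direction that increases the value of $f$.\n",
    "That is a fatal flaw of the algorithm.\n",
    "Let's see what happens in practice.\n"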
]
},
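  {
   "cell_type": "markdown",
   "id": "9f3b6c1e",
   "metadata": {},
   "source": [
    "A short derivation may make this claim concrete. It is only a sketch: it assumes $f''(x) \\neq 0$ and keeps the second-order Taylor model of $f$, dropping the $\\mathcal{O}(\\epsilon^3)$ remainder.\n",
    "The update in the `newton` function above corresponds to the step\n",
    "\n",
    "$$\\epsilon = -\\eta \\frac{f'(x)}{f''(x)}.$$\n",
    "\n",
    "Substituting it into the second-order expansion gives\n",
    "\n",
    "$$f(x + \\epsilon) \\approx f(x) + \\epsilon f'(x) + \\frac{1}{2} \\epsilon^2 f''(x) = f(x) - \\eta \\left(1 - \\frac{\\eta}{2}\\right) \\frac{[f'(x)]^2}{f''(x)}.$$\n",
    "\n",
    "For $0 < \\eta < 2$ the factor $\\eta (1 - \\eta / 2)$ is positive, so the sign of the predicted change is opposite to the sign of $f''(x)$.\n",
    "If $f''(x) > 0$ the model predicts a decrease; if $f''(x) < 0$ it predicts an increase, because the step then heads toward the maximum of the local quadratic model rather than its minimum.\n"
   ]
  },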
{
"cell_type": "code",
"execution_count": 12,
"id": "302d97de",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:28.183563Z",
"iopub.status.busy": "2023-08-18T07:01:28.182810Z",
"iopub.status.idle": "2023-08-18T07:01:28.660824Z",
"shell.execute_reply": "2023-08-18T07:01:28.659664Z"
},
"origin_pos": 28,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 10, x: tensor(26.8341)\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"251.482813pt\" height=\"180.65625pt\" viewBox=\"0 0 251.482813 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:01:28.616405</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M -0 180.65625 \n",
"L 251.482813 180.65625 \n",
"L 251.482813 0 \n",
"L -0 0 \n",
"L -0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 48.982813 143.1 \n",
"L 244.282813 143.1 \n",
"L 244.282813 7.2 \n",
"L 48.982813 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 82.72875 143.1 \n",
"L 82.72875 7.2 \n",
"\" clip-path=\"url(#pc9dd1287a9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"mf2d922344a\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#mf2d922344a\" x=\"82.72875\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 20 -->\n",
" <g transform=\"translate(72.176406 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"147.412109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 146.632812 143.1 \n",
"L 146.632812 7.2 \n",
"\" clip-path=\"url(#pc9dd1287a9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#mf2d922344a\" x=\"146.632812\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(143.451562 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 210.536875 143.1 \n",
"L 210.536875 7.2 \n",
"\" clip-path=\"url(#pc9dd1287a9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#mf2d922344a\" x=\"210.536875\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 20 -->\n",
" <g transform=\"translate(204.174375 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- x -->\n",
" <g transform=\"translate(143.673438 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 48.982813 121.334161 \n",
"L 244.282813 121.334161 \n",
"\" clip-path=\"url(#pc9dd1287a9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <defs>\n",
" <path id=\"m24b54810d0\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m24b54810d0\" x=\"48.982813\" y=\"121.334161\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 20 -->\n",
" <g transform=\"translate(20.878125 125.133379)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"147.412109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 48.982813 98.242083 \n",
"L 244.282813 98.242083 \n",
"\" clip-path=\"url(#pc9dd1287a9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#m24b54810d0\" x=\"48.982813\" y=\"98.242083\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(20.878125 102.041301)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"147.412109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 48.982813 75.150004 \n",
"L 244.282813 75.150004 \n",
"\" clip-path=\"url(#pc9dd1287a9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <g>\n",
" <use xlink:href=\"#m24b54810d0\" x=\"48.982813\" y=\"75.150004\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(35.620313 78.949223)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 48.982813 52.057926 \n",
"L 244.282813 52.057926 \n",
"\" clip-path=\"url(#pc9dd1287a9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#m24b54810d0\" x=\"48.982813\" y=\"52.057926\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(29.257813 55.857145)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 48.982813 28.965848 \n",
"L 244.282813 28.965848 \n",
"\" clip-path=\"url(#pc9dd1287a9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#m24b54810d0\" x=\"48.982813\" y=\"28.965848\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 20 -->\n",
" <g transform=\"translate(29.257813 32.765067)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- f(x) -->\n",
" <g transform=\"translate(14.798438 83.771094)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-66\" d=\"M 2375 4863 \n",
"L 2375 4384 \n",
"L 1825 4384 \n",
"Q 1516 4384 1395 4259 \n",
"Q 1275 4134 1275 3809 \n",
"L 1275 3500 \n",
"L 2222 3500 \n",
"L 2222 3053 \n",
"L 1275 3053 \n",
"L 1275 0 \n",
"L 697 0 \n",
"L 697 3053 \n",
"L 147 3053 \n",
"L 147 3500 \n",
"L 697 3500 \n",
"L 697 3744 \n",
"Q 697 4328 969 4595 \n",
"Q 1241 4863 1831 4863 \n",
"L 2375 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-28\" d=\"M 1984 4856 \n",
"Q 1566 4138 1362 3434 \n",
"Q 1159 2731 1159 2009 \n",
"Q 1159 1288 1364 580 \n",
"Q 1569 -128 1984 -844 \n",
"L 1484 -844 \n",
"Q 1016 -109 783 600 \n",
"Q 550 1309 550 2009 \n",
"Q 550 2706 781 3412 \n",
"Q 1013 4119 1484 4856 \n",
"L 1984 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-29\" d=\"M 513 4856 \n",
"L 1013 4856 \n",
"Q 1481 4119 1714 3412 \n",
"Q 1947 2706 1947 2009 \n",
"Q 1947 1309 1714 600 \n",
"Q 1481 -109 1013 -844 \n",
"L 513 -844 \n",
"Q 928 -128 1133 580 \n",
"Q 1338 1288 1338 2009 \n",
"Q 1338 2731 1133 3434 \n",
"Q 928 4138 513 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-66\"/>\n",
" <use xlink:href=\"#DejaVuSans-28\" x=\"35.205078\"/>\n",
" <use xlink:href=\"#DejaVuSans-78\" x=\"74.21875\"/>\n",
" <use xlink:href=\"#DejaVuSans-29\" x=\"133.398438\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 57.860085 130.630593 \n",
"L 58.690839 133.600136 \n",
"L 59.393781 135.384032 \n",
"L 60.000872 136.379287 \n",
"L 60.512104 136.824142 \n",
"L 60.927477 136.921895 \n",
"L 61.342857 136.785185 \n",
"L 61.79019 136.378934 \n",
"L 62.301422 135.591557 \n",
"L 62.908506 134.220602 \n",
"L 63.643401 131.952571 \n",
"L 64.506107 128.489803 \n",
"L 65.496619 123.54133 \n",
"L 66.678847 116.45201 \n",
"L 68.148643 106.212157 \n",
"L 70.289423 89.536543 \n",
"L 74.12367 59.579166 \n",
"L 75.721271 48.919626 \n",
"L 77.031304 41.585852 \n",
"L 78.149626 36.520711 \n",
"L 79.108185 33.14269 \n",
"L 79.938938 30.972341 \n",
"L 80.641887 29.698751 \n",
"L 81.248977 29.01727 \n",
"L 81.760209 28.743383 \n",
"L 82.239488 28.733358 \n",
"L 82.718768 28.958682 \n",
"L 83.23 29.453259 \n",
"L 83.837084 30.37181 \n",
"L 84.540032 31.866536 \n",
"L 85.37078 34.194101 \n",
"L 86.361297 37.694113 \n",
"L 87.543525 42.767301 \n",
"L 89.013316 50.150417 \n",
"L 91.122149 62.039573 \n",
"L 94.988348 83.923689 \n",
"L 96.585947 91.624331 \n",
"L 97.895979 96.915357 \n",
"L 99.014302 100.569831 \n",
"L 100.004813 103.082637 \n",
"L 100.867519 104.692 \n",
"L 101.602418 105.63016 \n",
"L 102.241458 106.122308 \n",
"L 102.816592 106.310654 \n",
"L 103.39173 106.262399 \n",
"L 103.966864 105.983805 \n",
"L 104.605907 105.413848 \n",
"L 105.340805 104.436356 \n",
"L 106.171556 102.945478 \n",
"L 107.162067 100.686479 \n",
"L 108.344295 97.40669 \n",
"L 109.846038 92.541074 \n",
"L 112.242444 83.896549 \n",
"L 115.213981 73.366242 \n",
"L 116.843535 68.398749 \n",
"L 118.18552 64.9901 \n",
"L 119.335792 62.645275 \n",
"L 120.358259 61.043905 \n",
"L 121.252915 60.027156 \n",
"L 122.083668 59.404774 \n",
"L 122.850517 59.100057 \n",
"L 123.585414 59.042627 \n",
"L 124.320311 59.203493 \n",
"L 125.119111 59.609444 \n",
"L 126.013769 60.323785 \n",
"L 127.068184 61.47173 \n",
"L 128.346266 63.219121 \n",
"L 130.103628 66.042762 \n",
"L 134.417152 73.130377 \n",
"L 135.886946 75.040979 \n",
"L 137.133074 76.309805 \n",
"L 138.251395 77.139628 \n",
"L 139.273861 77.631716 \n",
"L 140.264373 77.867359 \n",
"L 141.254887 77.876908 \n",
"L 142.309303 77.660658 \n",
"L 143.491529 77.180935 \n",
"L 144.92937 76.342454 \n",
"L 147.613341 74.448763 \n",
"L 149.498511 73.261171 \n",
"L 150.808544 72.686589 \n",
"L 151.926865 72.43211 \n",
"L 152.94933 72.42667 \n",
"L 153.939843 72.65011 \n",
"L 154.930355 73.11038 \n",
"L 155.952821 73.83659 \n",
"L 157.071142 74.913786 \n",
"L 158.317271 76.432515 \n",
"L 159.819018 78.631182 \n",
"L 162.023706 82.308106 \n",
"L 164.963294 87.146223 \n",
"L 166.369183 89.035444 \n",
"L 167.487504 90.189224 \n",
"L 168.446065 90.869772 \n",
"L 169.276819 91.195271 \n",
"L 170.043667 91.25703 \n",
"L 170.778564 91.087316 \n",
"L 171.513461 90.683763 \n",
"L 172.28031 90.006537 \n",
"L 173.143013 88.928355 \n",
"L 174.069622 87.399934 \n",
"L 175.124042 85.209395 \n",
"L 176.306264 82.221877 \n",
"L 177.712152 78.032974 \n",
"L 179.501467 71.94848 \n",
"L 185.093075 52.382355 \n",
"L 186.371155 48.955419 \n",
"L 187.425569 46.712774 \n",
"L 188.320228 45.296142 \n",
"L 189.087076 44.4752 \n",
"L 189.726116 44.087495 \n",
"L 190.301254 43.979388 \n",
"L 190.844442 44.093122 \n",
"L 191.419576 44.446838 \n",
"L 192.058616 45.125017 \n",
"L 192.761562 46.219401 \n",
"L 193.560359 47.903235 \n",
"L 194.455018 50.332755 \n",
"L 195.477482 53.780638 \n",
"L 196.659707 58.587519 \n",
"L 198.065601 65.288813 \n",
"L 199.88686 75.15378 \n",
"L 205.574324 106.838877 \n",
"L 206.916309 112.652587 \n",
"L 208.002679 116.440748 \n",
"L 208.929291 118.912859 \n",
"L 209.696139 120.376643 \n",
"L 210.335182 121.168029 \n",
"L 210.87836 121.522082 \n",
"L 211.357646 121.585043 \n",
"L 211.836925 121.410173 \n",
"L 212.348157 120.95868 \n",
"L 212.923288 120.122039 \n",
"L 213.594284 118.706717 \n",
"L 214.361132 116.516725 \n",
"L 215.223839 113.34237 \n",
"L 216.214356 108.814651 \n",
"L 217.364625 102.468655 \n",
"L 218.738563 93.565332 \n",
"L 220.495923 80.585678 \n",
"L 226.79047 32.544559 \n",
"L 228.068557 25.15944 \n",
"L 229.122973 20.250096 \n",
"L 230.017632 17.046236 \n",
"L 230.78448 15.06564 \n",
"L 231.423517 13.985428 \n",
"L 231.934749 13.507618 \n",
"L 232.382076 13.377326 \n",
"L 232.797456 13.500003 \n",
"L 233.244782 13.896603 \n",
"L 233.75602 14.686958 \n",
"L 234.363111 16.091744 \n",
"L 235.066053 18.344575 \n",
"L 235.385569 19.587277 \n",
"L 235.385569 19.587277 \n",
"\" clip-path=\"url(#pc9dd1287a9)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <path d=\"M 178.584844 75.150004 \n",
"L 162.608828 83.314287 \n",
"L 226.496497 34.435752 \n",
"L 235.40554 19.669416 \n",
"L 232.264564 13.385311 \n",
"L 232.373458 13.377273 \n",
"L 232.373312 13.377273 \n",
"L 232.373318 13.377273 \n",
"L 232.373318 13.377273 \n",
"L 232.373318 13.377273 \n",
"L 232.373318 13.377273 \n",
"\" clip-path=\"url(#pc9dd1287a9)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"m5feaa7c4a8\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#pc9dd1287a9)\">\n",
" <use xlink:href=\"#m5feaa7c4a8\" x=\"178.584844\" y=\"75.150004\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m5feaa7c4a8\" x=\"162.608828\" y=\"83.314287\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m5feaa7c4a8\" x=\"226.496497\" y=\"34.435752\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m5feaa7c4a8\" x=\"235.40554\" y=\"19.669416\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m5feaa7c4a8\" x=\"232.264564\" y=\"13.385311\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m5feaa7c4a8\" x=\"232.373458\" y=\"13.377273\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m5feaa7c4a8\" x=\"232.373312\" y=\"13.377273\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m5feaa7c4a8\" x=\"232.373318\" y=\"13.377273\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m5feaa7c4a8\" x=\"232.373318\" y=\"13.377273\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m5feaa7c4a8\" x=\"232.373318\" y=\"13.377273\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m5feaa7c4a8\" x=\"232.373318\" y=\"13.377273\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 48.982813 143.1 \n",
"L 48.982813 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 244.282813 143.1 \n",
"L 244.282813 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 48.982813 143.1 \n",
"L 244.282812 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 48.982813 7.2 \n",
"L 244.282812 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"pc9dd1287a9\">\n",
" <rect x=\"48.982813\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
   ],
   "source": [
    "c = torch.tensor(0.15 * np.pi)\n",
    "\n",
    "def f(x):  # Objective function\n",
    "    return x * torch.cos(c * x)\n",
    "\n",
    "def f_grad(x):  # Gradient of the objective function\n",
    "    return torch.cos(c * x) - c * x * torch.sin(c * x)\n",
    "\n",
    "def f_hess(x):  # Hessian of the objective function\n",
    "    return - 2 * c * torch.sin(c * x) - x * c**2 * torch.cos(c * x)\n",
"\n",
"show_trace(newton(), f)"
]
},
{
"cell_type": "markdown",
"id": "28696853",
"metadata": {
"origin_pos": 29
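   },
   "source": [
    "This went spectacularly wrong. How can we fix it?\n",
    "One way is to correct the Hessian by taking its absolute value instead; another strategy is to bring back the learning rate.\n",
    "This seems to defeat the purpose, but not quite: second-order information lets us be cautious when the curvature is large and take larger steps when the objective function is flatter.\n",
    "Let's see how this works with a slightly smaller learning rate, say $\\eta = 0.5$.\n",
    "As we can see, the result is quite an efficient algorithm.\n"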
]
},
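  {
   "cell_type": "markdown",
   "id": "5ad07e42",
   "metadata": {},
   "source": [
    "A worked first step may help show why the smaller learning rate matters. This is only a hand calculation under the settings of the code above ($c = 0.15\\pi$ and the starting point $x = 10$ set in `newton`); the values are rounded.\n",
    "At $x = 10$ we have $c x = 1.5\\pi$, so $\\cos(c x) = 0$ and $\\sin(c x) = -1$, which gives\n",
    "\n",
    "$$f'(10) = \\cos(c x) - c x \\sin(c x) = 10c, \\qquad f''(10) = -2c \\sin(c x) - x c^2 \\cos(c x) = 2c,$$\n",
    "\n",
    "so the full Newton step is $f'(10) / f''(10) = 5$.\n",
    "With $\\eta = 1$ the iterate lands at $x = 5$, where $f'(5) \\approx -2.37$ while $f''(5) \\approx 0.12$ is close to zero; dividing by such a small curvature throws the next iterate out to $x \\approx 25$, where $f''$ is negative.\n",
    "With $\\eta = 0.5$ the first step stops at $x = 7.5$, where $f''(7.5) \\approx 1.90 > 0$, so the iterates remain in a region of positive curvature, consistent with the printed result $x \\approx 7.27$.\n"
   ]
  },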
{
"cell_type": "code",
"execution_count": 13,
"id": "1d0aa6d6",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:01:28.666394Z",
"iopub.status.busy": "2023-08-18T07:01:28.665576Z",
"iopub.status.idle": "2023-08-18T07:01:28.904465Z",
"shell.execute_reply": "2023-08-18T07:01:28.903311Z"
},
"origin_pos": 30,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 10, x: tensor(7.2699)\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"245.120313pt\" height=\"180.65625pt\" viewBox=\"0 0 245.120313 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:01:28.869577</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 245.120313 180.65625 \n",
"L 245.120313 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"L 42.620312 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 51.497585 143.1 \n",
"L 51.497585 7.2 \n",
"\" clip-path=\"url(#p427918317c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"m5e67eabe5f\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m5e67eabe5f\" x=\"51.497585\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(40.945241 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"147.412109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 95.883949 143.1 \n",
"L 95.883949 7.2 \n",
"\" clip-path=\"url(#p427918317c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#m5e67eabe5f\" x=\"95.883949\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(88.512855 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 140.270312 143.1 \n",
"L 140.270312 7.2 \n",
"\" clip-path=\"url(#p427918317c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m5e67eabe5f\" x=\"140.270312\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(137.089062 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 184.656676 143.1 \n",
"L 184.656676 7.2 \n",
"\" clip-path=\"url(#p427918317c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#m5e67eabe5f\" x=\"184.656676\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(181.475426 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-35\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 229.04304 143.1 \n",
"L 229.04304 7.2 \n",
"\" clip-path=\"url(#p427918317c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#m5e67eabe5f\" x=\"229.04304\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(222.68054 157.698438)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- x -->\n",
" <g transform=\"translate(137.310937 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 42.620312 119.411597 \n",
"L 237.920313 119.411597 \n",
"\" clip-path=\"url(#p427918317c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"m4fe7b34166\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m4fe7b34166\" x=\"42.620312\" y=\"119.411597\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(20.878125 123.210816)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 42.620312 75.15 \n",
"L 237.920313 75.15 \n",
"\" clip-path=\"url(#p427918317c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#m4fe7b34166\" x=\"42.620312\" y=\"75.15\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(29.257812 78.949219)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 42.620312 30.888403 \n",
"L 237.920313 30.888403 \n",
"\" clip-path=\"url(#p427918317c)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#m4fe7b34166\" x=\"42.620312\" y=\"30.888403\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 5 -->\n",
" <g transform=\"translate(29.257812 34.687622)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-35\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- f(x) -->\n",
" <g transform=\"translate(14.798437 83.771094)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-66\" d=\"M 2375 4863 \n",
"L 2375 4384 \n",
"L 1825 4384 \n",
"Q 1516 4384 1395 4259 \n",
"Q 1275 4134 1275 3809 \n",
"L 1275 3500 \n",
"L 2222 3500 \n",
"L 2222 3053 \n",
"L 1275 3053 \n",
"L 1275 0 \n",
"L 697 0 \n",
"L 697 3053 \n",
"L 147 3053 \n",
"L 147 3500 \n",
"L 697 3500 \n",
"L 697 3744 \n",
"Q 697 4328 969 4595 \n",
"Q 1241 4863 1831 4863 \n",
"L 2375 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-28\" d=\"M 1984 4856 \n",
"Q 1566 4138 1362 3434 \n",
"Q 1159 2731 1159 2009 \n",
"Q 1159 1288 1364 580 \n",
"Q 1569 -128 1984 -844 \n",
"L 1484 -844 \n",
"Q 1016 -109 783 600 \n",
"Q 550 1309 550 2009 \n",
"Q 550 2706 781 3412 \n",
"Q 1013 4119 1484 4856 \n",
"L 1984 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-29\" d=\"M 513 4856 \n",
"L 1013 4856 \n",
"Q 1481 4119 1714 3412 \n",
"Q 1947 2706 1947 2009 \n",
"Q 1947 1309 1714 600 \n",
"Q 1481 -109 1013 -844 \n",
"L 513 -844 \n",
"Q 928 -128 1133 580 \n",
"Q 1338 1288 1338 2009 \n",
"Q 1338 2731 1133 3434 \n",
"Q 928 4138 513 4856 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-66\"/>\n",
" <use xlink:href=\"#DejaVuSans-28\" x=\"35.205078\"/>\n",
" <use xlink:href=\"#DejaVuSans-78\" x=\"74.21875\"/>\n",
" <use xlink:href=\"#DejaVuSans-29\" x=\"133.398438\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 51.497585 75.150001 \n",
"L 54.515859 61.507494 \n",
"L 57.090268 50.885046 \n",
"L 59.309586 42.62211 \n",
"L 61.351355 35.837137 \n",
"L 63.215583 30.377167 \n",
"L 64.991044 25.861634 \n",
"L 66.588956 22.386195 \n",
"L 68.098084 19.624675 \n",
"L 69.518451 17.490893 \n",
"L 70.850038 15.899777 \n",
"L 72.092859 14.769329 \n",
"L 73.246906 14.022221 \n",
"L 74.312178 13.586851 \n",
"L 75.377449 13.390552 \n",
"L 76.442725 13.427546 \n",
"L 77.507997 13.691167 \n",
"L 78.662039 14.22383 \n",
"L 79.816086 15.003166 \n",
"L 81.058907 16.105109 \n",
"L 82.479269 17.678382 \n",
"L 83.988402 19.689963 \n",
"L 85.675089 22.313917 \n",
"L 87.628087 25.787156 \n",
"L 89.847405 30.206604 \n",
"L 92.599359 36.214766 \n",
"L 96.59413 45.535707 \n",
"L 102.719453 59.79435 \n",
"L 105.648948 66.04003 \n",
"L 108.134584 70.834186 \n",
"L 110.353902 74.634558 \n",
"L 112.306903 77.556075 \n",
"L 114.17113 79.946414 \n",
"L 115.857812 81.757913 \n",
"L 117.45572 83.158273 \n",
"L 118.964857 84.195921 \n",
"L 120.385222 84.920387 \n",
"L 121.805586 85.403652 \n",
"L 123.225949 85.651421 \n",
"L 124.646313 85.67174 \n",
"L 126.066676 85.474901 \n",
"L 127.575812 85.041746 \n",
"L 129.173722 84.350543 \n",
"L 130.949177 83.330131 \n",
"L 132.902176 81.942544 \n",
"L 135.210267 80.014888 \n",
"L 138.22854 77.174086 \n",
"L 146.04054 69.663818 \n",
"L 148.34863 67.823794 \n",
"L 150.301631 66.531906 \n",
"L 152.077085 65.614112 \n",
"L 153.674995 65.026673 \n",
"L 155.18413 64.700263 \n",
"L 156.604495 64.61058 \n",
"L 158.024858 64.743475 \n",
"L 159.445222 65.10813 \n",
"L 160.865585 65.711433 \n",
"L 162.285949 66.557882 \n",
"L 163.795086 67.725879 \n",
"L 165.392994 69.263456 \n",
"L 167.079676 71.216922 \n",
"L 168.855131 73.628385 \n",
"L 170.808131 76.680042 \n",
"L 172.938677 80.448158 \n",
"L 175.33554 85.168809 \n",
"L 178.087492 91.099238 \n",
"L 181.815947 99.711378 \n",
"L 189.272858 117.052872 \n",
"L 191.936041 122.62392 \n",
"L 194.155359 126.773504 \n",
"L 196.019583 129.822238 \n",
"L 197.706269 132.178722 \n",
"L 199.215402 133.924756 \n",
"L 200.635769 135.226991 \n",
"L 201.878586 136.07617 \n",
"L 203.032632 136.608833 \n",
"L 204.097904 136.872454 \n",
"L 205.16318 136.909448 \n",
"L 206.228452 136.713149 \n",
"L 207.293723 136.277775 \n",
"L 208.358995 135.598428 \n",
"L 209.513041 134.582583 \n",
"L 210.667088 133.272041 \n",
"L 211.909901 131.528413 \n",
"L 213.241497 129.276672 \n",
"L 214.661859 126.438757 \n",
"L 216.259771 122.713826 \n",
"L 217.946449 118.183227 \n",
"L 219.721902 112.772058 \n",
"L 221.674904 106.096761 \n",
"L 223.805448 98.012828 \n",
"L 226.202315 88.022895 \n",
"L 228.954265 75.566753 \n",
"L 228.954265 75.566753 \n",
"\" clip-path=\"url(#p427918317c)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <path d=\"M 229.04304 75.149999 \n",
"L 206.849858 136.488573 \n",
"L 205.848192 136.810382 \n",
"L 205.331865 136.89407 \n",
"L 205.069008 136.915488 \n",
"L 204.936286 136.920908 \n",
"L 204.869587 136.922276 \n",
"L 204.83615 136.922613 \n",
"L 204.819408 136.922702 \n",
"L 204.811031 136.922723 \n",
"L 204.806841 136.922727 \n",
"\" clip-path=\"url(#p427918317c)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"m2f04cbd60e\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p427918317c)\">\n",
" <use xlink:href=\"#m2f04cbd60e\" x=\"229.04304\" y=\"75.149999\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2f04cbd60e\" x=\"206.849858\" y=\"136.488573\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2f04cbd60e\" x=\"205.848192\" y=\"136.810382\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2f04cbd60e\" x=\"205.331865\" y=\"136.89407\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2f04cbd60e\" x=\"205.069008\" y=\"136.915488\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2f04cbd60e\" x=\"204.936286\" y=\"136.920908\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2f04cbd60e\" x=\"204.869587\" y=\"136.922276\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2f04cbd60e\" x=\"204.83615\" y=\"136.922613\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2f04cbd60e\" x=\"204.819408\" y=\"136.922702\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2f04cbd60e\" x=\"204.811031\" y=\"136.922723\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2f04cbd60e\" x=\"204.806841\" y=\"136.922727\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 42.620312 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 42.620312 7.2 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p427918317c\">\n",
" <rect x=\"42.620312\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"show_trace(newton(0.5), f)"
]
},
{
"cell_type": "markdown",
"id": "1575b769",
"metadata": {
"origin_pos": 31
},
"source": [
"### 收敛性分析\n",
"\n",
"在此,我们以部分目标凸函数$f$为例,分析它们的牛顿法收敛速度。\n",
"这些目标凸函数三次可微,而且二阶导数不为零,即$f'' > 0$。\n",
"由于多变量情况下的证明是对以下一维参数情况证明的直接拓展,对我们理解这个问题不能提供更多帮助,因此我们省略了多变量情况的证明。\n",
"\n",
"用$x^{(k)}$表示$x$在第$k^\\mathrm{th}$次迭代时的值,\n",
"令$e^{(k)} \\stackrel{\\mathrm{def}}{=} x^{(k)} - x^*$表示$k^\\mathrm{th}$迭代时与最优性的距离。\n",
"通过泰勒展开,我们得到条件$f'(x^*) = 0$可以写成\n",
"\n",
"$$0 = f'(x^{(k)} - e^{(k)}) = f'(x^{(k)}) - e^{(k)} f''(x^{(k)}) + \\frac{1}{2} (e^{(k)})^2 f'''(\\xi^{(k)}),$$\n",
"\n",
"这对某些$\\xi^{(k)} \\in [x^{(k)} - e^{(k)}, x^{(k)}]$成立。\n",
"将上述展开除以$f''(x^{(k)})$得到\n",
"\n",
"$$e^{(k)} - \\frac{f'(x^{(k)})}{f''(x^{(k)})} = \\frac{1}{2} (e^{(k)})^2 \\frac{f'''(\\xi^{(k)})}{f''(x^{(k)})}.$$\n",
"\n",
"回想之前的方程$x^{(k+1)} = x^{(k)} - f'(x^{(k)}) / f''(x^{(k)})$。\n",
"代入这个更新方程,取两边的绝对值,我们得到\n",
"\n",
"$$\\left|e^{(k+1)}\\right| = \\frac{1}{2}(e^{(k)})^2 \\frac{\\left|f'''(\\xi^{(k)})\\right|}{f''(x^{(k)})}.$$\n",
"\n",
"因此,每当我们处于有界区域$\\left|f'''(\\xi^{(k)})\\right| / (2f''(x^{(k)})) \\leq c$\n",
"我们就有一个二次递减误差\n",
"\n",
"$$\\left|e^{(k+1)}\\right| \\leq c (e^{(k)})^2.$$\n",
"\n",
"另一方面,优化研究人员称之为“线性”收敛,而将$\\left|e^{(k+1)}\\right| \\leq \\alpha \\left|e^{(k)}\\right|$这样的条件称为“恒定”收敛速度。\n",
"请注意,我们无法估计整体收敛的速度,但是一旦我们接近极小值,收敛将变得非常快。\n",
"另外,这种分析要求$f$在高阶导数上表现良好,即确保$f$在如何变化它的值方面没有任何“超常”的特性。\n",
"\n",
"### 预处理\n",
"\n",
"计算和存储完整的Hessian非常昂贵,而改善这个问题的一种方法是“预处理”。\n",
"它回避了计算整个Hessian,而只计算“对角线”项,即如下的算法更新:\n",
"\n",
"$$\\mathbf{x} \\leftarrow \\mathbf{x} - \\eta \\mathrm{diag}(\\mathbf{H})^{-1} \\nabla f(\\mathbf{x}).$$\n",
"\n",
"虽然这不如完整的牛顿法精确,但它仍然比不使用要好得多。\n",
"为什么预处理有效呢?\n",
"假设一个变量以毫米表示高度,另一个变量以公里表示高度的情况。\n",
"假设这两种自然尺度都以米为单位,那么我们的参数化就出现了严重的不匹配。\n",
"幸运的是,使用预处理可以消除这种情况。\n",
"梯度下降的有效预处理相当于为每个变量选择不同的学习率(矢量$\\mathbf{x}$的坐标)。\n",
"我们将在后面一节看到,预处理推动了随机梯度下降优化算法的一些创新。\n",
"\n",
"### 梯度下降和线搜索\n",
"\n",
"梯度下降的一个关键问题是我们可能会超过目标或进展不足,\n",
"解决这一问题的简单方法是结合使用线搜索和梯度下降。\n",
"也就是说,我们使用$\\nabla f(\\mathbf{x})$给出的方向,\n",
"然后进行二分搜索,以确定哪个学习率$\\eta$使$f(\\mathbf{x} - \\eta \\nabla f(\\mathbf{x}))$取最小值。\n",
"\n",
"有关分析和证明,此算法收敛迅速(请参见 :cite:`Boyd.Vandenberghe.2004`)。\n",
"然而,对深度学习而言,这不太可行。\n",
"因为线搜索的每一步都需要评估整个数据集上的目标函数,实现它的方式太昂贵了。\n",
"\n",
"## 小结\n",
"\n",
"* 学习率的大小很重要:学习率太大会使模型发散,学习率太小会没有进展。\n",
"* 梯度下降会可能陷入局部极小值,而得不到全局最小值。\n",
"* 在高维模型中,调整学习率是很复杂的。\n",
"* 预处理有助于调节比例。\n",
"* 牛顿法在凸问题中一旦开始正常工作,速度就会快得多。\n",
"* 对于非凸问题,不要不作任何调整就使用牛顿法。\n",
"\n",
"## 练习\n",
"\n",
"1. 用不同的学习率和目标函数进行梯度下降实验。\n",
"1. 在区间$[a, b]$中实现线搜索以最小化凸函数。\n",
" 1. 是否需要导数来进行二分搜索,即决定选择$[a, (a+b)/2]$还是$[(a+b)/2, b]$。\n",
" 1. 算法的收敛速度有多快?\n",
" 1. 实现该算法,并将其应用于求$\\log (\\exp(x) + \\exp(-2x -3))$的最小值。\n",
"1. 设计一个定义在$\\mathbb{R}^2$上的目标函数,它的梯度下降非常缓慢。提示:不同坐标的缩放方式不同。\n",
"1. 使用预处理实现牛顿方法的轻量版本。\n",
" 1. 使用对角Hessian作为预条件子。\n",
" 1. 使用它的绝对值,而不是实际值(可能有符号)。\n",
" 1. 将此应用于上述问题。\n",
"1. 将上述算法应用于多个目标函数(凸或非凸)。如果把坐标旋转$45$度会怎么样?\n"
]
},
{
"cell_type": "markdown",
"id": "5db14585",
"metadata": {
"origin_pos": 33,
"tab": [
"pytorch"
]
},
"source": [
"[Discussions](https://discuss.d2l.ai/t/3836)\n"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"required_libs": []
},
"nbformat": 4,
"nbformat_minor": 5
}