{
"cells": [
{
"cell_type": "markdown",
"id": "5ccbf7d3",
"metadata": {
"origin_pos": 0
},
"source": [
"# 动量法\n",
":label:`sec_momentum`\n",
"\n",
"在 :numref:`sec_sgd`一节中,我们详述了如何执行随机梯度下降,即在只有嘈杂的梯度可用的情况下执行优化时会发生什么。\n",
"对于嘈杂的梯度,我们在选择学习率需要格外谨慎。\n",
"如果衰减速度太快,收敛就会停滞。\n",
"相反,如果太宽松,我们可能无法收敛到最优解。\n",
"\n",
"## 基础\n",
"\n",
"本节将探讨更有效的优化算法,尤其是针对实验中常见的某些类型的优化问题。\n",
"\n",
"### 泄漏平均值\n",
"\n",
"上一节中我们讨论了小批量随机梯度下降作为加速计算的手段。\n",
"它也有很好的副作用,即平均梯度减小了方差。\n",
"小批量随机梯度下降可以通过以下方式计算:\n",
"\n",
"$$\\mathbf{g}_{t, t-1} = \\partial_{\\mathbf{w}} \\frac{1}{|\\mathcal{B}_t|} \\sum_{i \\in \\mathcal{B}_t} f(\\mathbf{x}_{i}, \\mathbf{w}_{t-1}) = \\frac{1}{|\\mathcal{B}_t|} \\sum_{i \\in \\mathcal{B}_t} \\mathbf{h}_{i, t-1}.\n",
"$$\n",
"\n",
"为了保持记法简单,在这里我们使用$\\mathbf{h}_{i, t-1} = \\partial_{\\mathbf{w}} f(\\mathbf{x}_i, \\mathbf{w}_{t-1})$作为样本$i$的随机梯度下降,使用时间$t-1$时更新的权重$t-1$。\n",
"如果我们能够从方差减少的影响中受益,甚至超过小批量上的梯度平均值,那很不错。\n",
"完成这项任务的一种选择是用*泄漏平均值*(leaky average)取代梯度计算:\n",
"\n",
"$$\\mathbf{v}_t = \\beta \\mathbf{v}_{t-1} + \\mathbf{g}_{t, t-1}$$\n",
"\n",
"其中$\\beta \\in (0, 1)$。\n",
"这有效地将瞬时梯度替换为多个“过去”梯度的平均值。\n",
"$\\mathbf{v}$被称为*动量*momentum),\n",
"它累加了过去的梯度。\n",
"为了更详细地解释,让我们递归地将$\\mathbf{v}_t$扩展到\n",
"\n",
"$$\\begin{aligned}\n",
"\\mathbf{v}_t = \\beta^2 \\mathbf{v}_{t-2} + \\beta \\mathbf{g}_{t-1, t-2} + \\mathbf{g}_{t, t-1}\n",
"= \\ldots, = \\sum_{\\tau = 0}^{t-1} \\beta^{\\tau} \\mathbf{g}_{t-\\tau, t-\\tau-1}.\n",
"\\end{aligned}$$\n",
"\n",
"其中,较大的$\\beta$相当于长期平均值,而较小的$\\beta$相对于梯度法只是略有修正。\n",
"新的梯度替换不再指向特定实例下降最陡的方向,而是指向过去梯度的加权平均值的方向。\n",
"这使我们能够实现对单批量计算平均值的大部分好处,而不产生实际计算其梯度的代价。\n",
"\n",
"上述推理构成了\"加速\"梯度方法的基础,例如具有动量的梯度。\n",
"在优化问题条件不佳的情况下(例如,有些方向的进展比其他方向慢得多,类似狭窄的峡谷),\"加速\"梯度还额外享受更有效的好处。\n",
"此外,它们允许我们对随后的梯度计算平均值,以获得更稳定的下降方向。\n",
"诚然,即使是对于无噪声凸问题,加速度这方面也是动量如此起效的关键原因之一。\n",
"\n",
"正如人们所期望的,由于其功效,动量是深度学习及其后优化中一个深入研究的主题。\n",
"例如,请参阅[文章](https://distill.pub/2017/momentum/)(作者是 :cite:`Goh.2017`),观看深入分析和互动动画。\n",
"动量是由 :cite:`Polyak.1964`提出的。\n",
" :cite:`Nesterov.2018`在凸优化的背景下进行了详细的理论讨论。\n",
"长期以来,深度学习的动量一直被认为是有益的。\n",
"有关实例的详细信息,请参阅 :cite:`Sutskever.Martens.Dahl.ea.2013`的讨论。\n",
"\n",
"### 条件不佳的问题\n",
"\n",
"为了更好地了解动量法的几何属性,我们复习一下梯度下降,尽管它的目标函数明显不那么令人愉快。\n",
"回想我们在 :numref:`sec_gd`中使用了$f(\\mathbf{x}) = x_1^2 + 2 x_2^2$,即中度扭曲的椭球目标。\n",
"我们通过向$x_1$方向伸展它来进一步扭曲这个函数\n",
"\n",
"$$f(\\mathbf{x}) = 0.1 x_1^2 + 2 x_2^2.$$\n",
"\n",
"与之前一样,$f$在$(0, 0)$有最小值,\n",
"该函数在$x_1$的方向上*非常*平坦。\n",
"让我们看看在这个新函数上执行梯度下降时会发生什么。\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "aa0feaa9",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:08.280335Z",
"iopub.status.busy": "2023-08-18T07:02:08.279709Z",
"iopub.status.idle": "2023-08-18T07:02:10.386954Z",
"shell.execute_reply": "2023-08-18T07:02:10.386084Z"
},
"origin_pos": 2,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 20, x1: -0.943467, x2: -0.000073\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"245.120313pt\" height=\"180.65625pt\" viewBox=\"0 0 245.120313 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:02:10.355545</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 245.120313 180.65625 \n",
"L 245.120313 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"L 42.620312 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <defs>\n",
" <path id=\"m905ff13dfe\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m905ff13dfe\" x=\"88.39375\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 4 -->\n",
" <g transform=\"translate(81.022656 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-34\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_2\">\n",
" <g>\n",
" <use xlink:href=\"#m905ff13dfe\" x=\"149.425\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(142.053907 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_3\">\n",
" <g>\n",
" <use xlink:href=\"#m905ff13dfe\" x=\"210.456251\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(207.275001 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- x1 -->\n",
" <g transform=\"translate(134.129687 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_4\">\n",
" <defs>\n",
" <path id=\"m0fe3ec2b71\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m0fe3ec2b71\" x=\"42.620312\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 3 -->\n",
" <g transform=\"translate(20.878125 146.899219)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_5\">\n",
" <g>\n",
" <use xlink:href=\"#m0fe3ec2b71\" x=\"42.620312\" y=\"112.283673\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(20.878125 116.082892)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m0fe3ec2b71\" x=\"42.620312\" y=\"81.467347\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 1 -->\n",
" <g transform=\"translate(20.878125 85.266566)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_7\">\n",
" <g>\n",
" <use xlink:href=\"#m0fe3ec2b71\" x=\"42.620312\" y=\"50.65102\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(29.257812 54.450239)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#m0fe3ec2b71\" x=\"42.620312\" y=\"19.834694\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 1 -->\n",
" <g transform=\"translate(29.257812 23.633913)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- x2 -->\n",
" <g transform=\"translate(14.798437 81.290625)rotate(-90)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 57.878125 112.283673 \n",
"L 70.084375 13.671429 \n",
"L 81.314125 72.838776 \n",
"L 91.645495 37.338367 \n",
"L 101.150356 58.638612 \n",
"L 109.894827 45.858465 \n",
"L 117.939741 53.526553 \n",
"L 125.341062 48.925701 \n",
"L 132.150277 51.686212 \n",
"L 138.414755 50.029905 \n",
"L 144.178075 51.023689 \n",
"L 149.480329 50.427419 \n",
"L 154.358402 50.785181 \n",
"L 158.84623 50.570524 \n",
"L 162.975032 50.699318 \n",
"L 166.773529 50.622042 \n",
"L 170.268147 50.668408 \n",
"L 173.483195 50.640588 \n",
"L 176.44104 50.65728 \n",
"L 179.162257 50.647265 \n",
"L 181.665776 50.653274 \n",
"\" clip-path=\"url(#p02a00d0013)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"mf4060ef2eb\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p02a00d0013)\">\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"57.878125\" y=\"112.283673\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"70.084375\" y=\"13.671429\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"81.314125\" y=\"72.838776\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"91.645495\" y=\"37.338367\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"101.150356\" y=\"58.638612\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"109.894827\" y=\"45.858465\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"117.939741\" y=\"53.526553\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"125.341062\" y=\"48.925701\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"132.150277\" y=\"51.686212\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"138.414755\" y=\"50.029905\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"144.178075\" y=\"51.023689\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"149.480329\" y=\"50.427419\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"154.358402\" y=\"50.785181\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"158.84623\" y=\"50.570524\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"162.975032\" y=\"50.699318\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"166.773529\" y=\"50.622042\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"170.268147\" y=\"50.668408\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"173.483195\" y=\"50.640588\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"176.44104\" y=\"50.65728\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"179.162257\" y=\"50.647265\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#mf4060ef2eb\" x=\"181.665776\" y=\"50.653274\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"PathCollection_1\"/>\n",
" <g id=\"PathCollection_2\">\n",
" <path d=\"M 97.100868 22.916327 \n",
"L 94.496869 23.4964 \n",
"L 91.44531 24.194299 \n",
"L 88.393757 24.910323 \n",
"L 85.342191 25.644477 \n",
"L 83.908323 25.997959 \n",
"L 82.290631 26.449931 \n",
"L 79.239071 27.32306 \n",
"L 76.187512 28.216732 \n",
"L 73.307377 29.079592 \n",
"L 73.135938 29.138856 \n",
"L 70.084378 30.217425 \n",
"L 67.032818 31.3197 \n",
"L 64.752171 32.161224 \n",
"L 63.981244 32.497404 \n",
"L 60.929685 33.856123 \n",
"L 57.878125 35.242857 \n",
"L 54.826565 36.971992 \n",
"L 52.48604 38.32449 \n",
"L 51.775006 38.852763 \n",
"L 48.723432 41.164001 \n",
"L 48.409729 41.406122 \n",
"L 45.671872 44.364491 \n",
"L 45.55989 44.487755 \n",
"L 43.880133 47.569388 \n",
"L 43.320214 50.651021 \n",
"L 43.880133 53.732654 \n",
"L 45.55989 56.814286 \n",
"L 45.671872 56.93755 \n",
"L 48.409729 59.895919 \n",
"L 48.723432 60.13804 \n",
"L 51.775006 62.449278 \n",
"L 52.486042 62.977552 \n",
"L 54.826565 64.330047 \n",
"L 57.878125 66.059184 \n",
"L 60.929685 67.445918 \n",
"L 63.981244 68.804637 \n",
"L 64.752171 69.140817 \n",
"L 67.032818 69.982341 \n",
"L 70.084378 71.084616 \n",
"L 73.135938 72.163185 \n",
"L 73.307377 72.222449 \n",
"L 76.187512 73.085309 \n",
"L 79.239071 73.978981 \n",
"L 82.290631 74.852111 \n",
"L 83.908319 75.30408 \n",
"L 85.342191 75.657562 \n",
"L 88.393757 76.391718 \n",
"L 91.44531 77.107741 \n",
"L 94.496869 77.80564 \n",
"L 97.100868 78.385714 \n",
"L 97.548436 78.47492 \n",
"L 100.599996 79.066917 \n",
"L 103.651563 79.642697 \n",
"L 106.703122 80.202255 \n",
"L 109.754682 80.745594 \n",
"L 112.806249 81.272716 \n",
"L 113.968755 81.467347 \n",
"L 115.857816 81.753499 \n",
"L 118.909375 82.201068 \n",
"L 121.960942 82.633966 \n",
"L 125.012502 83.052187 \n",
"L 128.064069 83.455735 \n",
"L 131.115628 83.844607 \n",
"L 134.167188 84.218805 \n",
"L 136.969651 84.54898 \n",
"L 137.218755 84.575777 \n",
"L 140.270314 84.890639 \n",
"L 143.321874 85.192103 \n",
"L 146.373441 85.480169 \n",
"L 149.425 85.754835 \n",
"L 152.476564 86.016105 \n",
"L 155.528127 86.263976 \n",
"L 158.57969 86.498447 \n",
"L 161.631253 86.719521 \n",
"L 164.682813 86.927197 \n",
"L 167.734376 87.121472 \n",
"L 170.785939 87.302352 \n",
"L 173.837499 87.469831 \n",
"L 176.889062 87.623913 \n",
"L 177.034329 87.63061 \n",
"L 179.940626 87.753878 \n",
"L 182.992189 87.870979 \n",
"L 186.04375 87.975756 \n",
"L 189.095313 88.068204 \n",
"L 192.146877 88.148327 \n",
"L 195.198438 88.216122 \n",
"L 198.250001 88.271592 \n",
"L 201.301564 88.314734 \n",
"L 204.353126 88.345552 \n",
"L 207.404689 88.364041 \n",
"L 210.456251 88.370204 \n",
"L 213.507813 88.364041 \n",
"L 216.559376 88.345552 \n",
"L 219.610939 88.314734 \n",
"L 222.662501 88.271592 \n",
"L 225.714063 88.216122 \n",
"L 228.765626 88.148327 \n",
"L 231.817188 88.068204 \n",
"L 234.868751 87.975756 \n",
"L 237.920313 87.870979 \n",
"\" clip-path=\"url(#p02a00d0013)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_3\">\n",
" <path d=\"M 42.620312 88.216122 \n",
"L 45.671872 88.887919 \n",
"L 48.723432 89.547385 \n",
"L 51.775006 90.194531 \n",
"L 54.263642 90.712243 \n",
"L 54.826565 90.820674 \n",
"L 57.878125 91.397052 \n",
"L 60.929685 91.962018 \n",
"L 63.981244 92.51557 \n",
"L 67.032818 93.05771 \n",
"L 70.084378 93.588436 \n",
"L 71.29158 93.793877 \n",
"L 73.135938 94.086102 \n",
"L 76.187512 94.558974 \n",
"L 79.239071 95.021218 \n",
"L 82.290631 95.472837 \n",
"L 85.342191 95.913829 \n",
"L 88.393757 96.344195 \n",
"L 91.44531 96.763934 \n",
"L 92.277553 96.87551 \n",
"L 94.496869 97.15385 \n",
"L 97.548436 97.526629 \n",
"L 100.599996 97.889466 \n",
"L 103.651563 98.242363 \n",
"L 106.703122 98.585318 \n",
"L 109.754682 98.918333 \n",
"L 112.806249 99.241407 \n",
"L 115.857816 99.554542 \n",
"L 118.909375 99.857734 \n",
"L 119.943828 99.957144 \n",
"L 121.960942 100.139238 \n",
"L 125.012502 100.40538 \n",
"L 128.064069 100.662182 \n",
"L 131.115628 100.909646 \n",
"L 134.167188 101.147773 \n",
"L 137.218755 101.376562 \n",
"L 140.270314 101.596012 \n",
"L 143.321874 101.806121 \n",
"L 146.373441 102.006896 \n",
"L 149.425 102.19833 \n",
"L 152.476564 102.380427 \n",
"L 155.528127 102.553186 \n",
"L 158.57969 102.716605 \n",
"L 161.631253 102.870687 \n",
"L 164.682813 103.015431 \n",
"L 165.20896 103.038777 \n",
"L 167.734376 103.144432 \n",
"L 170.785939 103.263294 \n",
"L 173.837499 103.373353 \n",
"L 176.889062 103.474608 \n",
"L 179.940626 103.567056 \n",
"L 182.992189 103.650701 \n",
"L 186.04375 103.725539 \n",
"L 189.095313 103.791576 \n",
"L 192.146877 103.848806 \n",
"L 195.198438 103.89723 \n",
"L 198.250001 103.936853 \n",
"L 201.301564 103.967669 \n",
"L 204.353126 103.989679 \n",
"L 207.404689 104.002887 \n",
"L 210.456251 104.007289 \n",
"L 213.507813 104.002887 \n",
"L 216.559376 103.989679 \n",
"L 219.610939 103.967669 \n",
"L 222.662501 103.936853 \n",
"L 225.714063 103.89723 \n",
"L 228.765626 103.848806 \n",
"L 231.817188 103.791576 \n",
"L 234.868751 103.725539 \n",
"L 237.920313 103.650701 \n",
"\" clip-path=\"url(#p02a00d0013)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_4\">\n",
" <path d=\"M 42.620312 103.89723 \n",
"L 45.671872 104.377084 \n",
"L 48.723432 104.848133 \n",
"L 51.775006 105.310381 \n",
"L 54.826565 105.76382 \n",
"L 57.27387 106.12041 \n",
"L 57.878125 106.203696 \n",
"L 60.929685 106.615968 \n",
"L 63.981244 107.019912 \n",
"L 67.032818 107.415527 \n",
"L 70.084378 107.802813 \n",
"L 73.135938 108.181769 \n",
"L 76.187512 108.552401 \n",
"L 79.239071 108.914701 \n",
"L 81.716227 109.202044 \n",
"L 82.290631 109.265256 \n",
"L 85.342191 109.59317 \n",
"L 88.393757 109.913188 \n",
"L 91.44531 110.2253 \n",
"L 94.496869 110.529513 \n",
"L 97.548436 110.825823 \n",
"L 100.599996 111.114232 \n",
"L 103.651563 111.39474 \n",
"L 106.703122 111.667347 \n",
"L 109.754682 111.932052 \n",
"L 112.806249 112.188853 \n",
"L 113.968764 112.283673 \n",
"L 115.857816 112.430237 \n",
"L 118.909375 112.659484 \n",
"L 121.960942 112.881211 \n",
"L 125.012502 113.095419 \n",
"L 128.064069 113.302115 \n",
"L 131.115628 113.501295 \n",
"L 134.167188 113.692956 \n",
"L 137.218755 113.877101 \n",
"L 140.270314 114.053731 \n",
"L 143.321874 114.222845 \n",
"L 146.373441 114.384444 \n",
"L 149.425 114.538527 \n",
"L 152.476564 114.685091 \n",
"L 155.528127 114.824139 \n",
"L 158.57969 114.955675 \n",
"L 161.631253 115.079692 \n",
"L 164.682813 115.196189 \n",
"L 167.734376 115.305175 \n",
"L 169.542874 115.365311 \n",
"L 170.785939 115.404722 \n",
"L 173.837499 115.494303 \n",
"L 176.889062 115.576718 \n",
"L 179.940626 115.651967 \n",
"L 182.992189 115.72005 \n",
"L 186.04375 115.780967 \n",
"L 189.095313 115.834717 \n",
"L 192.146877 115.881298 \n",
"L 195.198438 115.920717 \n",
"L 198.250001 115.952966 \n",
"L 201.301564 115.978049 \n",
"L 204.353126 115.995966 \n",
"L 207.404689 116.006713 \n",
"L 210.456251 116.010298 \n",
"L 213.507813 116.006713 \n",
"L 216.559376 115.995966 \n",
"L 219.610939 115.978049 \n",
"L 222.662501 115.952966 \n",
"L 225.714063 115.920717 \n",
"L 228.765626 115.881298 \n",
"L 231.817188 115.834717 \n",
"L 234.868751 115.780967 \n",
"L 237.920313 115.72005 \n",
"\" clip-path=\"url(#p02a00d0013)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_5\">\n",
" <path d=\"M 42.620312 115.920717 \n",
"L 45.671872 116.311294 \n",
"L 48.723432 116.694707 \n",
"L 51.775006 117.070955 \n",
"L 54.826565 117.440034 \n",
"L 57.878125 117.801946 \n",
"L 60.929685 118.156692 \n",
"L 63.477901 118.44694 \n",
"L 63.981244 118.501724 \n",
"L 67.032818 118.827008 \n",
"L 70.084378 119.145442 \n",
"L 73.135938 119.457031 \n",
"L 76.187512 119.761769 \n",
"L 79.239071 120.05966 \n",
"L 82.290631 120.350703 \n",
"L 85.342191 120.634898 \n",
"L 88.393757 120.912246 \n",
"L 91.44531 121.182743 \n",
"L 94.496869 121.446393 \n",
"L 95.47338 121.52857 \n",
"L 97.548436 121.695764 \n",
"L 100.599996 121.935082 \n",
"L 103.651563 122.167844 \n",
"L 106.703122 122.39405 \n",
"L 109.754682 122.6137 \n",
"L 112.806249 122.82679 \n",
"L 115.857816 123.033325 \n",
"L 118.909375 123.233306 \n",
"L 121.960942 123.426728 \n",
"L 125.012502 123.61359 \n",
"L 128.064069 123.7939 \n",
"L 131.115628 123.967653 \n",
"L 134.167188 124.134847 \n",
"L 137.218755 124.295485 \n",
"L 140.270314 124.449566 \n",
"L 143.321874 124.597092 \n",
"L 143.605786 124.610207 \n",
"L 146.373441 124.732842 \n",
"L 149.425 124.861769 \n",
"L 152.476564 124.984403 \n",
"L 155.528127 125.10075 \n",
"L 158.57969 125.210811 \n",
"L 161.631253 125.31458 \n",
"L 164.682813 125.412057 \n",
"L 167.734376 125.503249 \n",
"L 170.785939 125.588153 \n",
"L 173.837499 125.666765 \n",
"L 176.889062 125.739088 \n",
"L 179.940626 125.805123 \n",
"L 182.992189 125.864869 \n",
"L 186.04375 125.918326 \n",
"L 189.095313 125.965495 \n",
"L 192.146877 126.006372 \n",
"L 195.198438 126.040964 \n",
"L 198.250001 126.069264 \n",
"L 201.301564 126.091276 \n",
"L 204.353126 126.106999 \n",
"L 207.404689 126.11643 \n",
"L 210.456251 126.119576 \n",
"L 213.507813 126.11643 \n",
"L 216.559376 126.106999 \n",
"L 219.610939 126.091276 \n",
"L 222.662501 126.069264 \n",
"L 225.714063 126.040964 \n",
"L 228.765626 126.006372 \n",
"L 231.817188 125.965495 \n",
"L 234.868751 125.918326 \n",
"L 237.920313 125.864869 \n",
"\" clip-path=\"url(#p02a00d0013)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_6\">\n",
" <path d=\"M 42.620312 126.040964 \n",
"L 45.671872 126.383715 \n",
"L 48.723432 126.72018 \n",
"L 51.775006 127.050356 \n",
"L 54.826565 127.374241 \n",
"L 57.878125 127.691837 \n",
"L 60.929685 127.990936 \n",
"L 63.981244 128.283994 \n",
"L 67.032818 128.57101 \n",
"L 70.084378 128.851981 \n",
"L 73.135938 129.126913 \n",
"L 76.187512 129.3958 \n",
"L 79.239071 129.658644 \n",
"L 82.290631 129.915447 \n",
"L 85.342191 130.166208 \n",
"L 88.393757 130.410927 \n",
"L 91.44531 130.649601 \n",
"L 93.070112 130.773466 \n",
"L 94.496869 130.878129 \n",
"L 97.548436 131.096168 \n",
"L 100.599996 131.308394 \n",
"L 103.651563 131.514805 \n",
"L 106.703122 131.715403 \n",
"L 109.754682 131.910186 \n",
"L 112.806249 132.099153 \n",
"L 115.857816 132.282307 \n",
"L 118.909375 132.459647 \n",
"L 121.960942 132.631172 \n",
"L 125.012502 132.79688 \n",
"L 128.064069 132.956776 \n",
"L 131.115628 133.110859 \n",
"L 134.167188 133.259125 \n",
"L 137.218755 133.401577 \n",
"L 140.270314 133.538216 \n",
"L 143.321874 133.66904 \n",
"L 146.373441 133.79405 \n",
"L 147.936485 133.855104 \n",
"L 149.425 133.911132 \n",
"L 152.476564 134.020389 \n",
"L 155.528127 134.124043 \n",
"L 158.57969 134.222097 \n",
"L 161.631253 134.314546 \n",
"L 164.682813 134.40139 \n",
"L 167.734376 134.482634 \n",
"L 170.785939 134.558275 \n",
"L 173.837499 134.628311 \n",
"L 176.889062 134.692744 \n",
"L 179.940626 134.751575 \n",
"L 182.992189 134.804804 \n",
"L 186.04375 134.85243 \n",
"L 189.095313 134.894453 \n",
"L 192.146877 134.930871 \n",
"L 195.198438 134.961689 \n",
"L 198.250001 134.986902 \n",
"L 201.301564 135.006512 \n",
"L 204.353126 135.02052 \n",
"L 207.404689 135.028922 \n",
"L 210.456251 135.031725 \n",
"L 213.507813 135.028922 \n",
"L 216.559376 135.02052 \n",
"L 219.610939 135.006512 \n",
"L 222.662501 134.986902 \n",
"L 225.714063 134.961689 \n",
"L 228.765626 134.930871 \n",
"L 231.817188 134.894453 \n",
"L 234.868751 134.85243 \n",
"L 237.920313 134.804804 \n",
"\" clip-path=\"url(#p02a00d0013)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_7\">\n",
" <path d=\"M 42.620312 134.961686 \n",
"L 45.671872 135.267051 \n",
"L 48.723432 135.566806 \n",
"L 51.775006 135.860966 \n",
"L 54.826565 136.149521 \n",
"L 57.878125 136.432465 \n",
"L 60.929685 136.709815 \n",
"L 63.477871 136.936733 \n",
"L 63.981244 136.979986 \n",
"L 67.032818 137.236787 \n",
"L 70.084378 137.488185 \n",
"L 73.135938 137.734174 \n",
"L 76.187512 137.974759 \n",
"L 79.239071 138.209936 \n",
"L 82.290631 138.439707 \n",
"L 85.342191 138.66407 \n",
"L 88.393757 138.883028 \n",
"L 91.44531 139.096579 \n",
"L 94.496869 139.304725 \n",
"L 97.548436 139.507465 \n",
"L 100.599996 139.704798 \n",
"L 103.651563 139.896722 \n",
"L 105.641791 140.01837 \n",
"L 106.703122 140.081043 \n",
"L 109.754682 140.256016 \n",
"L 112.806249 140.425768 \n",
"L 115.857816 140.590295 \n",
"L 118.909375 140.749602 \n",
"L 121.960942 140.903684 \n",
"L 125.012502 141.052541 \n",
"L 128.064069 141.196177 \n",
"L 131.115628 141.334588 \n",
"L 134.167188 141.467779 \n",
"L 137.218755 141.595744 \n",
"L 140.270314 141.71849 \n",
"L 143.321874 141.83601 \n",
"L 146.373441 141.948305 \n",
"L 149.425 142.055379 \n",
"L 152.476564 142.157229 \n",
"L 155.528127 142.253858 \n",
"L 158.57969 142.345262 \n",
"L 161.631253 142.431441 \n",
"L 164.682813 142.512399 \n",
"L 167.734376 142.588138 \n",
"L 170.785939 142.658646 \n",
"L 173.837499 142.723939 \n",
"L 176.889062 142.784001 \n",
"L 179.940626 142.838844 \n",
"L 182.992189 142.888466 \n",
"L 186.04375 142.932863 \n",
"L 189.095313 142.972034 \n",
"L 192.146877 143.005986 \n",
"L 195.198438 143.034712 \n",
"L 198.250001 143.058213 \n",
"L 201.301564 143.076494 \n",
"L 204.353126 143.089555 \n",
"L 207.404689 143.09739 \n",
"L 210.456251 143.1 \n",
"\" clip-path=\"url(#p02a00d0013)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" <path d=\"M 210.456251 143.1 \n",
"L 213.507813 143.09739 \n",
"L 216.559376 143.089555 \n",
"L 219.610939 143.076494 \n",
"L 222.662501 143.058213 \n",
"L 225.714063 143.034712 \n",
"L 228.765626 143.005986 \n",
"L 231.817188 142.972034 \n",
"L 234.868751 142.932863 \n",
"L 237.920313 142.888466 \n",
"\" clip-path=\"url(#p02a00d0013)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_8\">\n",
" <path d=\"M 42.620312 143.034712 \n",
"L 43.320206 143.1 \n",
"\" clip-path=\"url(#p02a00d0013)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_9\"/>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 42.620312 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 42.620312 7.2 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p02a00d0013\">\n",
" <rect x=\"42.620312\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"%matplotlib inline\n",
"import torch\n",
"from d2l import torch as d2l\n",
"\n",
"eta = 0.4\n",
"def f_2d(x1, x2):\n",
" return 0.1 * x1 ** 2 + 2 * x2 ** 2\n",
"def gd_2d(x1, x2, s1, s2):\n",
" return (x1 - eta * 0.2 * x1, x2 - eta * 4 * x2, 0, 0)\n",
"\n",
"d2l.show_trace_2d(f_2d, d2l.train_2d(gd_2d))"
]
},
{
"cell_type": "markdown",
"id": "4d786bff",
"metadata": {
"origin_pos": 5
},
"source": [
"从构造来看,$x_2$方向的梯度比水平$x_1$方向的梯度大得多,变化也快得多。\n",
"因此,我们陷入两难:如果选择较小的学习率,我们会确保解不会在$x_2$方向发散,但要承受在$x_1$方向的缓慢收敛。相反,如果学习率较高,我们在$x_1$方向上进展很快,但在$x_2$方向将会发散。\n",
"下面的例子说明了即使学习率从$0.4$略微提高到$0.6$,也会发生变化。\n",
"$x_1$方向上的收敛有所改善,但整体来看解的质量更差了。\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a201276f",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:10.391031Z",
"iopub.status.busy": "2023-08-18T07:02:10.390173Z",
"iopub.status.idle": "2023-08-18T07:02:10.517281Z",
"shell.execute_reply": "2023-08-18T07:02:10.516451Z"
},
"origin_pos": 6,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 20, x1: -0.387814, x2: -1673.365109\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"264.207812pt\" height=\"180.65625pt\" viewBox=\"0 0 264.207812 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:02:10.489285</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 264.207812 180.65625 \n",
"L 264.207812 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 61.707813 143.1 \n",
"L 257.007812 143.1 \n",
"L 257.007812 7.2 \n",
"L 61.707813 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <defs>\n",
" <path id=\"ma26511a81d\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#ma26511a81d\" x=\"107.48125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 4 -->\n",
" <g transform=\"translate(100.110156 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-34\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_2\">\n",
" <g>\n",
" <use xlink:href=\"#ma26511a81d\" x=\"168.5125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(161.141407 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_3\">\n",
" <g>\n",
" <use xlink:href=\"#ma26511a81d\" x=\"229.543751\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(226.362501 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- x1 -->\n",
" <g transform=\"translate(153.217188 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_4\">\n",
" <defs>\n",
" <path id=\"mdeda8b93ee\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#mdeda8b93ee\" x=\"61.707813\" y=\"107.922362\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 1000 -->\n",
" <g transform=\"translate(20.878125 111.721581)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"147.412109\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"211.035156\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"274.658203\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_5\">\n",
" <g>\n",
" <use xlink:href=\"#mdeda8b93ee\" x=\"61.707813\" y=\"64.854545\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(48.345313 68.653764)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#mdeda8b93ee\" x=\"61.707813\" y=\"21.786729\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 1000 -->\n",
" <g transform=\"translate(29.257812 25.585947)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"127.246094\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"190.869141\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- x2 -->\n",
" <g transform=\"translate(14.798438 81.290625)rotate(-90)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 76.965625 64.940681 \n",
"L 95.275 64.733956 \n",
"L 111.38725 65.023371 \n",
"L 125.56603 64.618189 \n",
"L 138.043357 65.185444 \n",
"L 149.023404 64.391287 \n",
"L 158.685846 65.503107 \n",
"L 167.188794 63.94656 \n",
"L 174.671389 66.125726 \n",
"L 181.256072 63.074893 \n",
"L 187.050594 67.346059 \n",
"L 192.149773 61.366427 \n",
"L 196.63705 69.737912 \n",
"L 200.585854 58.017833 \n",
"L 204.060802 74.425943 \n",
"L 207.118755 51.454589 \n",
"L 209.809755 83.614484 \n",
"L 212.177834 38.590631 \n",
"L 214.261744 101.624026 \n",
"L 216.095585 13.377273 \n",
"L 217.709365 136.922727 \n",
"\" clip-path=\"url(#p6dc97a64f2)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"me332fceb19\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p6dc97a64f2)\">\n",
" <use xlink:href=\"#me332fceb19\" x=\"76.965625\" y=\"64.940681\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"95.275\" y=\"64.733956\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"111.38725\" y=\"65.023371\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"125.56603\" y=\"64.618189\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"138.043357\" y=\"65.185444\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"149.023404\" y=\"64.391287\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"158.685846\" y=\"65.503107\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"167.188794\" y=\"63.94656\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"174.671389\" y=\"66.125726\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"181.256072\" y=\"63.074893\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"187.050594\" y=\"67.346059\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"192.149773\" y=\"61.366427\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"196.63705\" y=\"69.737912\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"200.585854\" y=\"58.017833\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"204.060802\" y=\"74.425943\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"207.118755\" y=\"51.454589\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"209.809755\" y=\"83.614484\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"212.177834\" y=\"38.590631\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"214.261744\" y=\"101.624026\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"216.095585\" y=\"13.377273\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#me332fceb19\" x=\"217.709365\" y=\"136.922727\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"PathCollection_1\"/>\n",
" <g id=\"PathCollection_2\">\n",
" <path d=\"M 116.188368 64.815784 \n",
"L 113.584369 64.816595 \n",
"L 110.53281 64.81757 \n",
"L 107.481257 64.818571 \n",
"L 104.429691 64.819597 \n",
"L 102.995823 64.820091 \n",
"L 101.378131 64.820723 \n",
"L 98.326571 64.821943 \n",
"L 95.275012 64.823192 \n",
"L 92.394877 64.824398 \n",
"L 92.223438 64.824481 \n",
"L 89.171878 64.825988 \n",
"L 86.120318 64.827529 \n",
"L 83.839671 64.828705 \n",
"L 83.068744 64.829175 \n",
"L 80.017185 64.831073 \n",
"L 76.965625 64.833012 \n",
"L 73.914065 64.835428 \n",
"L 71.57354 64.837318 \n",
"L 70.862506 64.838057 \n",
"L 67.810932 64.841287 \n",
"L 67.497229 64.841625 \n",
"L 64.759372 64.84576 \n",
"L 64.64739 64.845932 \n",
"L 62.967633 64.850239 \n",
"L 62.407714 64.854545 \n",
"L 62.967633 64.858852 \n",
"L 64.64739 64.863159 \n",
"L 64.759372 64.863331 \n",
"L 67.497229 64.867466 \n",
"L 67.810932 64.867804 \n",
"L 70.862506 64.871034 \n",
"L 71.573542 64.871773 \n",
"L 73.914065 64.873663 \n",
"L 76.965625 64.876079 \n",
"L 80.017185 64.878017 \n",
"L 83.068744 64.879916 \n",
"L 83.839671 64.880386 \n",
"L 86.120318 64.881562 \n",
"L 89.171878 64.883103 \n",
"L 92.223438 64.88461 \n",
"L 92.394877 64.884693 \n",
"L 95.275012 64.885899 \n",
"L 98.326571 64.887148 \n",
"L 101.378131 64.888368 \n",
"L 102.995819 64.889 \n",
"L 104.429691 64.889494 \n",
"L 107.481257 64.89052 \n",
"L 110.53281 64.89152 \n",
"L 113.584369 64.892496 \n",
"L 116.188368 64.893306 \n",
"L 116.635936 64.893431 \n",
"L 119.687496 64.894259 \n",
"L 122.739063 64.895063 \n",
"L 125.790622 64.895845 \n",
"L 128.842182 64.896605 \n",
"L 131.893749 64.897341 \n",
"L 133.056255 64.897613 \n",
"L 134.945316 64.898013 \n",
"L 137.996875 64.898639 \n",
"L 141.048442 64.899244 \n",
"L 144.100002 64.899828 \n",
"L 147.151569 64.900392 \n",
"L 150.203128 64.900936 \n",
"L 153.254688 64.901459 \n",
"L 156.057151 64.90192 \n",
"L 156.306255 64.901958 \n",
"L 159.357814 64.902398 \n",
"L 162.409374 64.902819 \n",
"L 165.460941 64.903221 \n",
"L 168.5125 64.903605 \n",
"L 171.564064 64.90397 \n",
"L 174.615627 64.904317 \n",
"L 177.66719 64.904645 \n",
"L 180.718753 64.904954 \n",
"L 183.770313 64.905244 \n",
"L 186.821876 64.905515 \n",
"L 189.873439 64.905768 \n",
"L 192.924999 64.906002 \n",
"L 195.976562 64.906217 \n",
"L 196.121829 64.906227 \n",
"L 199.028126 64.906399 \n",
"L 202.079689 64.906563 \n",
"L 205.13125 64.906709 \n",
"L 208.182813 64.906838 \n",
"L 211.234377 64.90695 \n",
"L 214.285938 64.907045 \n",
"L 217.337501 64.907123 \n",
"L 220.389064 64.907183 \n",
"L 223.440626 64.907226 \n",
"L 226.492189 64.907252 \n",
"L 229.543751 64.90726 \n",
"L 232.595313 64.907252 \n",
"L 235.646876 64.907226 \n",
"L 238.698439 64.907183 \n",
"L 241.750001 64.907123 \n",
"L 244.801563 64.907045 \n",
"L 247.853126 64.90695 \n",
"L 250.904688 64.906838 \n",
"L 253.956251 64.906709 \n",
"L 257.007812 64.906563 \n",
"\" clip-path=\"url(#p6dc97a64f2)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_3\">\n",
" <path d=\"M 61.707812 64.907045 \n",
"L 64.759372 64.907984 \n",
"L 67.810932 64.908906 \n",
"L 70.862506 64.90981 \n",
"L 73.351142 64.910534 \n",
"L 73.914065 64.910685 \n",
"L 76.965625 64.911491 \n",
"L 80.017185 64.91228 \n",
"L 83.068744 64.913054 \n",
"L 86.120318 64.913812 \n",
"L 89.171878 64.914553 \n",
"L 90.37908 64.91484 \n",
"L 92.223438 64.915249 \n",
"L 95.275012 64.91591 \n",
"L 98.326571 64.916556 \n",
"L 101.378131 64.917187 \n",
"L 104.429691 64.917803 \n",
"L 107.481257 64.918405 \n",
"L 110.53281 64.918991 \n",
"L 111.365053 64.919147 \n",
"L 113.584369 64.919536 \n",
"L 116.635936 64.920057 \n",
"L 119.687496 64.920564 \n",
"L 122.739063 64.921057 \n",
"L 125.790622 64.921537 \n",
"L 128.842182 64.922002 \n",
"L 131.893749 64.922454 \n",
"L 134.945316 64.922891 \n",
"L 137.996875 64.923315 \n",
"L 139.031328 64.923454 \n",
"L 141.048442 64.923708 \n",
"L 144.100002 64.92408 \n",
"L 147.151569 64.924439 \n",
"L 150.203128 64.924785 \n",
"L 153.254688 64.925118 \n",
"L 156.306255 64.925438 \n",
"L 159.357814 64.925744 \n",
"L 162.409374 64.926038 \n",
"L 165.460941 64.926319 \n",
"L 168.5125 64.926586 \n",
"L 171.564064 64.926841 \n",
"L 174.615627 64.927082 \n",
"L 177.66719 64.92731 \n",
"L 180.718753 64.927526 \n",
"L 183.770313 64.927728 \n",
"L 184.29646 64.927761 \n",
"L 186.821876 64.927908 \n",
"L 189.873439 64.928075 \n",
"L 192.924999 64.928228 \n",
"L 195.976562 64.92837 \n",
"L 199.028126 64.928499 \n",
"L 202.079689 64.928616 \n",
"L 205.13125 64.928721 \n",
"L 208.182813 64.928813 \n",
"L 211.234377 64.928893 \n",
"L 214.285938 64.92896 \n",
"L 217.337501 64.929016 \n",
"L 220.389064 64.929059 \n",
"L 223.440626 64.92909 \n",
"L 226.492189 64.929108 \n",
"L 229.543751 64.929114 \n",
"L 232.595313 64.929108 \n",
"L 235.646876 64.92909 \n",
"L 238.698439 64.929059 \n",
"L 241.750001 64.929016 \n",
"L 244.801563 64.92896 \n",
"L 247.853126 64.928893 \n",
"L 250.904688 64.928813 \n",
"L 253.956251 64.928721 \n",
"L 257.007812 64.928616 \n",
"\" clip-path=\"url(#p6dc97a64f2)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_4\">\n",
" <path d=\"M 61.707812 64.92896 \n",
"L 64.759372 64.929631 \n",
"L 67.810932 64.930289 \n",
"L 70.862506 64.930935 \n",
"L 73.914065 64.931569 \n",
"L 76.36137 64.932068 \n",
"L 76.965625 64.932184 \n",
"L 80.017185 64.93276 \n",
"L 83.068744 64.933325 \n",
"L 86.120318 64.933878 \n",
"L 89.171878 64.934419 \n",
"L 92.223438 64.934948 \n",
"L 95.275012 64.935466 \n",
"L 98.326571 64.935973 \n",
"L 100.803727 64.936374 \n",
"L 101.378131 64.936463 \n",
"L 104.429691 64.936921 \n",
"L 107.481257 64.937368 \n",
"L 110.53281 64.937804 \n",
"L 113.584369 64.93823 \n",
"L 116.635936 64.938644 \n",
"L 119.687496 64.939047 \n",
"L 122.739063 64.939439 \n",
"L 125.790622 64.93982 \n",
"L 128.842182 64.94019 \n",
"L 131.893749 64.940549 \n",
"L 133.056264 64.940681 \n",
"L 134.945316 64.940886 \n",
"L 137.996875 64.941206 \n",
"L 141.048442 64.941516 \n",
"L 144.100002 64.941816 \n",
"L 147.151569 64.942104 \n",
"L 150.203128 64.942383 \n",
"L 153.254688 64.942651 \n",
"L 156.306255 64.942908 \n",
"L 159.357814 64.943155 \n",
"L 162.409374 64.943391 \n",
"L 165.460941 64.943617 \n",
"L 168.5125 64.943832 \n",
"L 171.564064 64.944037 \n",
"L 174.615627 64.944232 \n",
"L 177.66719 64.944415 \n",
"L 180.718753 64.944589 \n",
"L 183.770313 64.944752 \n",
"L 186.821876 64.944904 \n",
"L 188.630374 64.944988 \n",
"L 189.873439 64.945043 \n",
"L 192.924999 64.945168 \n",
"L 195.976562 64.945283 \n",
"L 199.028126 64.945388 \n",
"L 202.079689 64.945484 \n",
"L 205.13125 64.945569 \n",
"L 208.182813 64.945644 \n",
"L 211.234377 64.945709 \n",
"L 214.285938 64.945764 \n",
"L 217.337501 64.945809 \n",
"L 220.389064 64.945844 \n",
"L 223.440626 64.945869 \n",
"L 226.492189 64.945884 \n",
"L 229.543751 64.945889 \n",
"L 232.595313 64.945884 \n",
"L 235.646876 64.945869 \n",
"L 238.698439 64.945844 \n",
"L 241.750001 64.945809 \n",
"L 244.801563 64.945764 \n",
"L 247.853126 64.945709 \n",
"L 250.904688 64.945644 \n",
"L 253.956251 64.945569 \n",
"L 257.007812 64.945484 \n",
"\" clip-path=\"url(#p6dc97a64f2)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_5\">\n",
" <path d=\"M 61.707812 64.945764 \n",
"L 64.759372 64.94631 \n",
"L 67.810932 64.946846 \n",
"L 70.862506 64.947372 \n",
"L 73.914065 64.947887 \n",
"L 76.965625 64.948393 \n",
"L 80.017185 64.948889 \n",
"L 82.565401 64.949295 \n",
"L 83.068744 64.949371 \n",
"L 86.120318 64.949826 \n",
"L 89.171878 64.950271 \n",
"L 92.223438 64.950706 \n",
"L 95.275012 64.951132 \n",
"L 98.326571 64.951549 \n",
"L 101.378131 64.951955 \n",
"L 104.429691 64.952352 \n",
"L 107.481257 64.95274 \n",
"L 110.53281 64.953118 \n",
"L 113.584369 64.953487 \n",
"L 114.56088 64.953601 \n",
"L 116.635936 64.953835 \n",
"L 119.687496 64.95417 \n",
"L 122.739063 64.954495 \n",
"L 125.790622 64.954811 \n",
"L 128.842182 64.955118 \n",
"L 131.893749 64.955416 \n",
"L 134.945316 64.955704 \n",
"L 137.996875 64.955984 \n",
"L 141.048442 64.956254 \n",
"L 144.100002 64.956515 \n",
"L 147.151569 64.956767 \n",
"L 150.203128 64.95701 \n",
"L 153.254688 64.957244 \n",
"L 156.306255 64.957468 \n",
"L 159.357814 64.957684 \n",
"L 162.409374 64.95789 \n",
"L 162.693286 64.957908 \n",
"L 165.460941 64.95808 \n",
"L 168.5125 64.95826 \n",
"L 171.564064 64.958431 \n",
"L 174.615627 64.958594 \n",
"L 177.66719 64.958748 \n",
"L 180.718753 64.958893 \n",
"L 183.770313 64.959029 \n",
"L 186.821876 64.959156 \n",
"L 189.873439 64.959275 \n",
"L 192.924999 64.959385 \n",
"L 195.976562 64.959486 \n",
"L 199.028126 64.959578 \n",
"L 202.079689 64.959662 \n",
"L 205.13125 64.959736 \n",
"L 208.182813 64.959802 \n",
"L 211.234377 64.959859 \n",
"L 214.285938 64.959908 \n",
"L 217.337501 64.959947 \n",
"L 220.389064 64.959978 \n",
"L 223.440626 64.96 \n",
"L 226.492189 64.960013 \n",
"L 229.543751 64.960018 \n",
"L 232.595313 64.960013 \n",
"L 235.646876 64.96 \n",
"L 238.698439 64.959978 \n",
"L 241.750001 64.959947 \n",
"L 244.801563 64.959908 \n",
"L 247.853126 64.959859 \n",
"L 250.904688 64.959802 \n",
"L 253.956251 64.959736 \n",
"L 257.007812 64.959662 \n",
"\" clip-path=\"url(#p6dc97a64f2)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_6\">\n",
" <path d=\"M 61.707812 64.959908 \n",
"L 64.759372 64.960387 \n",
"L 67.810932 64.960857 \n",
"L 70.862506 64.961318 \n",
"L 73.914065 64.961771 \n",
"L 76.965625 64.962215 \n",
"L 80.017185 64.962633 \n",
"L 83.068744 64.963043 \n",
"L 86.120318 64.963444 \n",
"L 89.171878 64.963836 \n",
"L 92.223438 64.964221 \n",
"L 95.275012 64.964596 \n",
"L 98.326571 64.964964 \n",
"L 101.378131 64.965323 \n",
"L 104.429691 64.965673 \n",
"L 107.481257 64.966015 \n",
"L 110.53281 64.966349 \n",
"L 112.157612 64.966522 \n",
"L 113.584369 64.966668 \n",
"L 116.635936 64.966973 \n",
"L 119.687496 64.967269 \n",
"L 122.739063 64.967558 \n",
"L 125.790622 64.967838 \n",
"L 128.842182 64.96811 \n",
"L 131.893749 64.968375 \n",
"L 134.945316 64.96863 \n",
"L 137.996875 64.968878 \n",
"L 141.048442 64.969118 \n",
"L 144.100002 64.96935 \n",
"L 147.151569 64.969573 \n",
"L 150.203128 64.969788 \n",
"L 153.254688 64.969996 \n",
"L 156.306255 64.970195 \n",
"L 159.357814 64.970386 \n",
"L 162.409374 64.970569 \n",
"L 165.460941 64.970743 \n",
"L 167.023985 64.970829 \n",
"L 168.5125 64.970907 \n",
"L 171.564064 64.97106 \n",
"L 174.615627 64.971204 \n",
"L 177.66719 64.971341 \n",
"L 180.718753 64.971471 \n",
"L 183.770313 64.971592 \n",
"L 186.821876 64.971706 \n",
"L 189.873439 64.971811 \n",
"L 192.924999 64.971909 \n",
"L 195.976562 64.971999 \n",
"L 199.028126 64.972081 \n",
"L 202.079689 64.972156 \n",
"L 205.13125 64.972222 \n",
"L 208.182813 64.972281 \n",
"L 211.234377 64.972332 \n",
"L 214.285938 64.972375 \n",
"L 217.337501 64.97241 \n",
"L 220.389064 64.972438 \n",
"L 223.440626 64.972457 \n",
"L 226.492189 64.972469 \n",
"L 229.543751 64.972473 \n",
"L 232.595313 64.972469 \n",
"L 235.646876 64.972457 \n",
"L 238.698439 64.972438 \n",
"L 241.750001 64.97241 \n",
"L 244.801563 64.972375 \n",
"L 247.853126 64.972332 \n",
"L 250.904688 64.972281 \n",
"L 253.956251 64.972222 \n",
"L 257.007812 64.972156 \n",
"\" clip-path=\"url(#p6dc97a64f2)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_7\">\n",
" <path d=\"M 61.707812 64.972375 \n",
"L 64.759372 64.972802 \n",
"L 67.810932 64.973221 \n",
"L 70.862506 64.973632 \n",
"L 73.914065 64.974035 \n",
"L 76.965625 64.974431 \n",
"L 80.017185 64.974818 \n",
"L 82.565371 64.975135 \n",
"L 83.068744 64.975196 \n",
"L 86.120318 64.975555 \n",
"L 89.171878 64.975906 \n",
"L 92.223438 64.97625 \n",
"L 95.275012 64.976586 \n",
"L 98.326571 64.976915 \n",
"L 101.378131 64.977236 \n",
"L 104.429691 64.977549 \n",
"L 107.481257 64.977855 \n",
"L 110.53281 64.978154 \n",
"L 113.584369 64.978445 \n",
"L 116.635936 64.978728 \n",
"L 119.687496 64.979004 \n",
"L 122.739063 64.979272 \n",
"L 124.729291 64.979442 \n",
"L 125.790622 64.97953 \n",
"L 128.842182 64.979774 \n",
"L 131.893749 64.980011 \n",
"L 134.945316 64.980241 \n",
"L 137.996875 64.980464 \n",
"L 141.048442 64.980679 \n",
"L 144.100002 64.980887 \n",
"L 147.151569 64.981088 \n",
"L 150.203128 64.981282 \n",
"L 153.254688 64.981468 \n",
"L 156.306255 64.981647 \n",
"L 159.357814 64.981818 \n",
"L 162.409374 64.981982 \n",
"L 165.460941 64.982139 \n",
"L 168.5125 64.982289 \n",
"L 171.564064 64.982431 \n",
"L 174.615627 64.982566 \n",
"L 177.66719 64.982694 \n",
"L 180.718753 64.982815 \n",
"L 183.770313 64.982928 \n",
"L 186.821876 64.983034 \n",
"L 189.873439 64.983132 \n",
"L 192.924999 64.983223 \n",
"L 195.976562 64.983307 \n",
"L 199.028126 64.983384 \n",
"L 202.079689 64.983453 \n",
"L 205.13125 64.983515 \n",
"L 208.182813 64.98357 \n",
"L 211.234377 64.983618 \n",
"L 214.285938 64.983658 \n",
"L 217.337501 64.983691 \n",
"L 220.389064 64.983716 \n",
"L 223.440626 64.983734 \n",
"L 226.492189 64.983745 \n",
"L 229.543751 64.983749 \n",
"\" clip-path=\"url(#p6dc97a64f2)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" <path d=\"M 229.543751 64.983749 \n",
"L 232.595313 64.983745 \n",
"L 235.646876 64.983734 \n",
"L 238.698439 64.983716 \n",
"L 241.750001 64.983691 \n",
"L 244.801563 64.983658 \n",
"L 247.853126 64.983618 \n",
"L 250.904688 64.98357 \n",
"L 253.956251 64.983515 \n",
"L 257.007812 64.983453 \n",
"\" clip-path=\"url(#p6dc97a64f2)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_8\">\n",
" <path d=\"M 61.707812 64.983658 \n",
"L 62.407706 64.983749 \n",
"\" clip-path=\"url(#p6dc97a64f2)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_9\"/>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 61.707813 143.1 \n",
"L 61.707813 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 257.007812 143.1 \n",
"L 257.007812 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 61.707812 143.1 \n",
"L 257.007812 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 61.707812 7.2 \n",
"L 257.007812 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p6dc97a64f2\">\n",
" <rect x=\"61.707813\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"eta = 0.6\n",
"d2l.show_trace_2d(f_2d, d2l.train_2d(gd_2d))"
]
},
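  {
   "cell_type": "markdown",
   "id": "9f3a6c1e",
   "metadata": {},
   "source": [
    "The divergence in the trace above can be checked by hand. This is a sketch that assumes the iteration starts at $(x_1, x_2) = (-5, -2)$ and runs for 20 steps, which is consistent with the printed output. Each call to `gd_2d` multiplies every coordinate by a fixed factor:\n",
    "\n",
    "$$x_1 \\leftarrow (1 - 0.2\\eta)\\, x_1, \\qquad x_2 \\leftarrow (1 - 4\\eta)\\, x_2.$$\n",
    "\n",
    "The iterates shrink only if both factors have magnitude below $1$, and the condition $|1 - 4\\eta| < 1$ requires $\\eta < 0.5$.\n",
    "For $\\eta = 0.4$ the factors are $0.92$ and $-0.6$, so $x_2$ oscillates but decays.\n",
    "For $\\eta = 0.6$ they are $0.88$ and $-1.4$, so after 20 steps\n",
    "\n",
    "$$x_1 = -5 \\cdot 0.88^{20} \\approx -0.388, \\qquad x_2 = -2 \\cdot (-1.4)^{20} \\approx -1673.4,$$\n",
    "\n",
    "which agrees with the values printed by the cell above.\n"
   ]
  },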
{
"cell_type": "markdown",
"id": "2003ec3e",
"metadata": {
"origin_pos": 7
},
"source": [
"### 动量法\n",
"\n",
"*动量法*(momentum)使我们能够解决上面描述的梯度下降问题。\n",
"观察上面的优化轨迹,我们可能会直觉到计算过去的平均梯度效果会很好。\n",
"毕竟,在$x_1$方向上,这将聚合非常对齐的梯度,从而增加我们在每一步中覆盖的距离。\n",
"相反,在梯度振荡的$x_2$方向,由于相互抵消了对方的振荡,聚合梯度将减小步长大小。\n",
"使用$\\mathbf{v}_t$而不是梯度$\\mathbf{g}_t$可以生成以下更新等式:\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"\\mathbf{v}_t &\\leftarrow \\beta \\mathbf{v}_{t-1} + \\mathbf{g}_{t, t-1}, \\\\\n",
"\\mathbf{x}_t &\\leftarrow \\mathbf{x}_{t-1} - \\eta_t \\mathbf{v}_t.\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"请注意,对于$\\beta = 0$,我们恢复常规的梯度下降。\n",
"在深入研究它的数学属性之前,让我们快速看一下算法在实验中的表现如何。\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "c31034be",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:10.520714Z",
"iopub.status.busy": "2023-08-18T07:02:10.520405Z",
"iopub.status.idle": "2023-08-18T07:02:10.648963Z",
"shell.execute_reply": "2023-08-18T07:02:10.648142Z"
},
"origin_pos": 8,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 20, x1: 0.007188, x2: 0.002553\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"245.120313pt\" height=\"180.65625pt\" viewBox=\"0 0 245.120313 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:02:10.620611</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 245.120313 180.65625 \n",
"L 245.120313 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"L 42.620312 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <defs>\n",
" <path id=\"maedafd561d\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#maedafd561d\" x=\"88.39375\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 4 -->\n",
" <g transform=\"translate(81.022656 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-34\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_2\">\n",
" <g>\n",
" <use xlink:href=\"#maedafd561d\" x=\"149.425\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(142.053907 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_3\">\n",
" <g>\n",
" <use xlink:href=\"#maedafd561d\" x=\"210.456251\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(207.275001 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- x1 -->\n",
" <g transform=\"translate(134.129687 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_4\">\n",
" <defs>\n",
" <path id=\"m4afc7c710b\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m4afc7c710b\" x=\"42.620312\" y=\"120.784729\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(20.878125 124.583948)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_5\">\n",
" <g>\n",
" <use xlink:href=\"#m4afc7c710b\" x=\"42.620312\" y=\"76.154187\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(29.257812 79.953406)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m4afc7c710b\" x=\"42.620312\" y=\"31.523645\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(29.257812 35.322864)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- x2 -->\n",
" <g transform=\"translate(14.798437 81.290625)rotate(-90)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 57.878125 120.784729 \n",
"L 76.1875 13.671429 \n",
"L 101.454438 110.073399 \n",
"L 127.168124 76.868276 \n",
"L 150.019542 58.551901 \n",
"L 168.697657 91.6392 \n",
"L 183.047745 71.018819 \n",
"L 193.51181 73.033513 \n",
"L 200.777175 81.530479 \n",
"L 205.571347 72.875862 \n",
"L 208.554621 76.416534 \n",
"L 210.274454 77.557238 \n",
"L 211.156186 74.760268 \n",
"L 211.51306 76.707189 \n",
"L 211.564679 76.353445 \n",
"L 211.457478 75.698354 \n",
"L 211.28373 76.464808 \n",
"L 211.097558 76.102545 \n",
"L 210.927516 76.045355 \n",
"L 210.785942 76.277957 \n",
"L 210.675593 76.09721 \n",
"\" clip-path=\"url(#pff4acb2cbb)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"m0d2b1b2271\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#pff4acb2cbb)\">\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"57.878125\" y=\"120.784729\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"76.1875\" y=\"13.671429\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"101.454438\" y=\"110.073399\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"127.168124\" y=\"76.868276\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"150.019542\" y=\"58.551901\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"168.697657\" y=\"91.6392\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"183.047745\" y=\"71.018819\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"193.51181\" y=\"73.033513\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"200.777175\" y=\"81.530479\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"205.571347\" y=\"72.875862\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"208.554621\" y=\"76.416534\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"210.274454\" y=\"77.557238\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"211.156186\" y=\"74.760268\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"211.51306\" y=\"76.707189\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"211.564679\" y=\"76.353445\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"211.457478\" y=\"75.698354\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"211.28373\" y=\"76.464808\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"211.097558\" y=\"76.102545\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"210.927516\" y=\"76.045355\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"210.785942\" y=\"76.277957\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m0d2b1b2271\" x=\"210.675593\" y=\"76.09721\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"PathCollection_1\"/>\n",
" <g id=\"PathCollection_2\">\n",
" <path d=\"M 97.100868 56.070444 \n",
"L 94.496869 56.490497 \n",
"L 91.44531 56.995871 \n",
"L 88.393757 57.514372 \n",
"L 85.342191 58.046001 \n",
"L 83.908323 58.30197 \n",
"L 82.290631 58.62926 \n",
"L 79.239071 59.261526 \n",
"L 76.187512 59.908668 \n",
"L 73.307377 60.533498 \n",
"L 73.135938 60.576413 \n",
"L 70.084378 61.357446 \n",
"L 67.032818 62.155645 \n",
"L 64.752171 62.765024 \n",
"L 63.981244 63.008465 \n",
"L 60.929685 63.992365 \n",
"L 57.878125 64.996552 \n",
"L 54.826565 66.248684 \n",
"L 52.48604 67.228079 \n",
"L 51.775006 67.610622 \n",
"L 48.723432 69.284276 \n",
"L 48.409729 69.459606 \n",
"L 45.671872 71.601873 \n",
"L 45.55989 71.691133 \n",
"L 43.880133 73.922661 \n",
"L 43.320214 76.154188 \n",
"L 43.880133 78.385715 \n",
"L 45.55989 80.617242 \n",
"L 45.671872 80.706502 \n",
"L 48.409729 82.848769 \n",
"L 48.723432 83.024098 \n",
"L 51.775006 84.697753 \n",
"L 52.486042 85.080296 \n",
"L 54.826565 86.059689 \n",
"L 57.878125 87.311823 \n",
"L 60.929685 88.31601 \n",
"L 63.981244 89.299909 \n",
"L 64.752171 89.54335 \n",
"L 67.032818 90.152729 \n",
"L 70.084378 90.950929 \n",
"L 73.135938 91.731962 \n",
"L 73.307377 91.774877 \n",
"L 76.187512 92.399707 \n",
"L 79.239071 93.046849 \n",
"L 82.290631 93.679115 \n",
"L 83.908319 94.006403 \n",
"L 85.342191 94.262373 \n",
"L 88.393757 94.794003 \n",
"L 91.44531 95.312502 \n",
"L 94.496869 95.817878 \n",
"L 97.100868 96.237931 \n",
"L 97.548436 96.302528 \n",
"L 100.599996 96.731216 \n",
"L 103.651563 97.14816 \n",
"L 106.703122 97.553357 \n",
"L 109.754682 97.94681 \n",
"L 112.806249 98.328519 \n",
"L 113.968755 98.469458 \n",
"L 115.857816 98.676672 \n",
"L 118.909375 99.000774 \n",
"L 121.960942 99.314251 \n",
"L 125.012502 99.617101 \n",
"L 128.064069 99.909325 \n",
"L 131.115628 100.190923 \n",
"L 134.167188 100.461893 \n",
"L 136.969651 100.700986 \n",
"L 137.218755 100.72039 \n",
"L 140.270314 100.948394 \n",
"L 143.321874 101.166696 \n",
"L 146.373441 101.375295 \n",
"L 149.425 101.574191 \n",
"L 152.476564 101.763386 \n",
"L 155.528127 101.942879 \n",
"L 158.57969 102.112669 \n",
"L 161.631253 102.272756 \n",
"L 164.682813 102.423143 \n",
"L 167.734376 102.563825 \n",
"L 170.785939 102.694807 \n",
"L 173.837499 102.816085 \n",
"L 176.889062 102.927661 \n",
"L 177.034329 102.932511 \n",
"L 179.940626 103.021774 \n",
"L 182.992189 103.106571 \n",
"L 186.04375 103.182444 \n",
"L 189.095313 103.249389 \n",
"L 192.146877 103.307409 \n",
"L 195.198438 103.356502 \n",
"L 198.250001 103.39667 \n",
"L 201.301564 103.427911 \n",
"L 204.353126 103.450227 \n",
"L 207.404689 103.463616 \n",
"L 210.456251 103.468079 \n",
"L 213.507813 103.463616 \n",
"L 216.559376 103.450227 \n",
"L 219.610939 103.427911 \n",
"L 222.662501 103.39667 \n",
"L 225.714063 103.356502 \n",
"L 228.765626 103.307409 \n",
"L 231.817188 103.249389 \n",
"L 234.868751 103.182444 \n",
"L 237.920313 103.106571 \n",
"\" clip-path=\"url(#pff4acb2cbb)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_3\">\n",
" <path d=\"M 42.620312 103.356502 \n",
"L 45.671872 103.842976 \n",
"L 48.723432 104.32052 \n",
"L 51.775006 104.789143 \n",
"L 54.263642 105.164038 \n",
"L 54.826565 105.242557 \n",
"L 57.878125 105.659934 \n",
"L 60.929685 106.069047 \n",
"L 63.981244 106.469895 \n",
"L 67.032818 106.86248 \n",
"L 70.084378 107.246799 \n",
"L 71.29158 107.395566 \n",
"L 73.135938 107.607177 \n",
"L 76.187512 107.949602 \n",
"L 79.239071 108.284331 \n",
"L 82.290631 108.611365 \n",
"L 85.342191 108.930704 \n",
"L 88.393757 109.242348 \n",
"L 91.44531 109.546297 \n",
"L 92.277553 109.627094 \n",
"L 94.496869 109.82865 \n",
"L 97.548436 110.098594 \n",
"L 100.599996 110.361338 \n",
"L 103.651563 110.616884 \n",
"L 106.703122 110.86523 \n",
"L 109.754682 111.106379 \n",
"L 112.806249 111.340329 \n",
"L 115.857816 111.567082 \n",
"L 118.909375 111.786635 \n",
"L 119.943828 111.858621 \n",
"L 121.960942 111.990483 \n",
"L 125.012502 112.183206 \n",
"L 128.064069 112.369166 \n",
"L 131.115628 112.548365 \n",
"L 134.167188 112.720801 \n",
"L 137.218755 112.886476 \n",
"L 140.270314 113.045388 \n",
"L 143.321874 113.197536 \n",
"L 146.373441 113.342924 \n",
"L 149.425 113.481549 \n",
"L 152.476564 113.613412 \n",
"L 155.528127 113.738514 \n",
"L 158.57969 113.856852 \n",
"L 161.631253 113.968428 \n",
"L 164.682813 114.073243 \n",
"L 165.20896 114.090149 \n",
"L 167.734376 114.166657 \n",
"L 170.785939 114.25273 \n",
"L 173.837499 114.332428 \n",
"L 176.889062 114.40575 \n",
"L 179.940626 114.472696 \n",
"L 182.992189 114.533266 \n",
"L 186.04375 114.587459 \n",
"L 189.095313 114.635279 \n",
"L 192.146877 114.676722 \n",
"L 195.198438 114.711787 \n",
"L 198.250001 114.740479 \n",
"L 201.301564 114.762795 \n",
"L 204.353126 114.778733 \n",
"L 207.404689 114.788297 \n",
"L 210.456251 114.791485 \n",
"L 213.507813 114.788297 \n",
"L 216.559376 114.778733 \n",
"L 219.610939 114.762795 \n",
"L 222.662501 114.740479 \n",
"L 225.714063 114.711787 \n",
"L 228.765626 114.676722 \n",
"L 231.817188 114.635279 \n",
"L 234.868751 114.587459 \n",
"L 237.920313 114.533266 \n",
"\" clip-path=\"url(#pff4acb2cbb)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_4\">\n",
" <path d=\"M 42.620312 114.711787 \n",
"L 45.671872 115.059268 \n",
"L 48.723432 115.400372 \n",
"L 51.775006 115.735103 \n",
"L 54.826565 116.063456 \n",
"L 57.27387 116.321676 \n",
"L 57.878125 116.381986 \n",
"L 60.929685 116.680529 \n",
"L 63.981244 116.973039 \n",
"L 67.032818 117.25952 \n",
"L 70.084378 117.539968 \n",
"L 73.135938 117.814385 \n",
"L 76.187512 118.082773 \n",
"L 79.239071 118.345128 \n",
"L 81.716227 118.553204 \n",
"L 82.290631 118.598978 \n",
"L 85.342191 118.836434 \n",
"L 88.393757 119.06817 \n",
"L 91.44531 119.294183 \n",
"L 94.496869 119.514475 \n",
"L 97.548436 119.729044 \n",
"L 100.599996 119.937892 \n",
"L 103.651563 120.141019 \n",
"L 106.703122 120.338424 \n",
"L 109.754682 120.530107 \n",
"L 112.806249 120.716066 \n",
"L 113.968764 120.784729 \n",
"L 115.857816 120.890861 \n",
"L 118.909375 121.056867 \n",
"L 121.960942 121.217428 \n",
"L 125.012502 121.372545 \n",
"L 128.064069 121.522221 \n",
"L 131.115628 121.666455 \n",
"L 134.167188 121.805244 \n",
"L 137.218755 121.938591 \n",
"L 140.270314 122.066495 \n",
"L 143.321874 122.188957 \n",
"L 146.373441 122.305977 \n",
"L 149.425 122.417554 \n",
"L 152.476564 122.523686 \n",
"L 155.528127 122.624376 \n",
"L 158.57969 122.719627 \n",
"L 161.631253 122.809432 \n",
"L 164.682813 122.893792 \n",
"L 167.734376 122.972713 \n",
"L 169.542874 123.016259 \n",
"L 170.785939 123.044799 \n",
"L 173.837499 123.109668 \n",
"L 176.889062 123.169348 \n",
"L 179.940626 123.223838 \n",
"L 182.992189 123.27314 \n",
"L 186.04375 123.317252 \n",
"L 189.095313 123.356175 \n",
"L 192.146877 123.389906 \n",
"L 195.198438 123.41845 \n",
"L 198.250001 123.441803 \n",
"L 201.301564 123.459967 \n",
"L 204.353126 123.472941 \n",
"L 207.404689 123.480723 \n",
"L 210.456251 123.483319 \n",
"L 213.507813 123.480723 \n",
"L 216.559376 123.472941 \n",
"L 219.610939 123.459967 \n",
"L 222.662501 123.441803 \n",
"L 225.714063 123.41845 \n",
"L 228.765626 123.389906 \n",
"L 231.817188 123.356175 \n",
"L 234.868751 123.317252 \n",
"L 237.920313 123.27314 \n",
"\" clip-path=\"url(#pff4acb2cbb)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_5\">\n",
" <path d=\"M 42.620312 123.41845 \n",
"L 45.671872 123.701282 \n",
"L 48.723432 123.978926 \n",
"L 51.775006 124.251381 \n",
"L 54.826565 124.518645 \n",
"L 57.878125 124.780719 \n",
"L 60.929685 125.037605 \n",
"L 63.477901 125.247784 \n",
"L 63.981244 125.287455 \n",
"L 67.032818 125.523006 \n",
"L 70.084378 125.753596 \n",
"L 73.135938 125.979229 \n",
"L 76.187512 126.199902 \n",
"L 79.239071 126.415616 \n",
"L 82.290631 126.626371 \n",
"L 85.342191 126.832168 \n",
"L 88.393757 127.033006 \n",
"L 91.44531 127.228883 \n",
"L 94.496869 127.419802 \n",
"L 95.47338 127.479309 \n",
"L 97.548436 127.600381 \n",
"L 100.599996 127.77368 \n",
"L 103.651563 127.942232 \n",
"L 106.703122 128.106036 \n",
"L 109.754682 128.265093 \n",
"L 112.806249 128.4194 \n",
"L 115.857816 128.568959 \n",
"L 118.909375 128.713773 \n",
"L 121.960942 128.853837 \n",
"L 125.012502 128.989152 \n",
"L 128.064069 129.11972 \n",
"L 131.115628 129.245542 \n",
"L 134.167188 129.366613 \n",
"L 137.218755 129.482937 \n",
"L 140.270314 129.594513 \n",
"L 143.321874 129.701342 \n",
"L 143.605786 129.71084 \n",
"L 146.373441 129.799644 \n",
"L 149.425 129.893005 \n",
"L 152.476564 129.981809 \n",
"L 155.528127 130.06606 \n",
"L 158.57969 130.145759 \n",
"L 161.631253 130.220903 \n",
"L 164.682813 130.29149 \n",
"L 167.734376 130.357525 \n",
"L 170.785939 130.419007 \n",
"L 173.837499 130.475933 \n",
"L 176.889062 130.528305 \n",
"L 179.940626 130.576123 \n",
"L 182.992189 130.619388 \n",
"L 186.04375 130.658098 \n",
"L 189.095313 130.692255 \n",
"L 192.146877 130.721856 \n",
"L 195.198438 130.746905 \n",
"L 198.250001 130.767398 \n",
"L 201.301564 130.783338 \n",
"L 204.353126 130.794723 \n",
"L 207.404689 130.801553 \n",
"L 210.456251 130.803831 \n",
"L 213.507813 130.801553 \n",
"L 216.559376 130.794723 \n",
"L 219.610939 130.783338 \n",
"L 222.662501 130.767398 \n",
"L 225.714063 130.746905 \n",
"L 228.765626 130.721856 \n",
"L 231.817188 130.692255 \n",
"L 234.868751 130.658098 \n",
"L 237.920313 130.619388 \n",
"\" clip-path=\"url(#pff4acb2cbb)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_6\">\n",
" <path d=\"M 42.620312 130.746905 \n",
"L 45.671872 130.995104 \n",
"L 48.723432 131.238751 \n",
"L 51.775006 131.477844 \n",
"L 54.826565 131.712381 \n",
"L 57.878125 131.942365 \n",
"L 60.929685 132.158954 \n",
"L 63.981244 132.371168 \n",
"L 67.032818 132.579007 \n",
"L 70.084378 132.782469 \n",
"L 73.135938 132.981557 \n",
"L 76.187512 133.176269 \n",
"L 79.239071 133.366605 \n",
"L 82.290631 133.552565 \n",
"L 85.342191 133.734151 \n",
"L 88.393757 133.911361 \n",
"L 91.44531 134.084194 \n",
"L 93.070112 134.173889 \n",
"L 94.496869 134.24968 \n",
"L 97.548436 134.40757 \n",
"L 100.599996 134.561251 \n",
"L 103.651563 134.710721 \n",
"L 106.703122 134.855981 \n",
"L 109.754682 134.997031 \n",
"L 112.806249 135.133869 \n",
"L 115.857816 135.266498 \n",
"L 118.909375 135.394917 \n",
"L 121.960942 135.519124 \n",
"L 125.012502 135.63912 \n",
"L 128.064069 135.754907 \n",
"L 131.115628 135.866484 \n",
"L 134.167188 135.973849 \n",
"L 137.218755 136.077004 \n",
"L 140.270314 136.175949 \n",
"L 143.321874 136.270684 \n",
"L 146.373441 136.361209 \n",
"L 147.936485 136.40542 \n",
"L 149.425 136.445992 \n",
"L 152.476564 136.525109 \n",
"L 155.528127 136.600169 \n",
"L 158.57969 136.671174 \n",
"L 161.631253 136.738119 \n",
"L 164.682813 136.801006 \n",
"L 167.734376 136.859838 \n",
"L 170.785939 136.914613 \n",
"L 173.837499 136.965329 \n",
"L 176.889062 137.011987 \n",
"L 179.940626 137.054589 \n",
"L 182.992189 137.093134 \n",
"L 186.04375 137.127621 \n",
"L 189.095313 137.158052 \n",
"L 192.146877 137.184424 \n",
"L 195.198438 137.20674 \n",
"L 198.250001 137.224998 \n",
"L 201.301564 137.239198 \n",
"L 204.353126 137.249342 \n",
"L 207.404689 137.255427 \n",
"L 210.456251 137.257456 \n",
"L 213.507813 137.255427 \n",
"L 216.559376 137.249342 \n",
"L 219.610939 137.239198 \n",
"L 222.662501 137.224998 \n",
"L 225.714063 137.20674 \n",
"L 228.765626 137.184424 \n",
"L 231.817188 137.158052 \n",
"L 234.868751 137.127621 \n",
"L 237.920313 137.093134 \n",
"\" clip-path=\"url(#pff4acb2cbb)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_7\">\n",
" <path d=\"M 42.620312 137.206738 \n",
"L 45.671872 137.427865 \n",
"L 48.723432 137.644928 \n",
"L 51.775006 137.857941 \n",
"L 54.826565 138.066894 \n",
"L 57.878125 138.271785 \n",
"L 60.929685 138.472624 \n",
"L 63.477871 138.636945 \n",
"L 63.981244 138.668266 \n",
"L 67.032818 138.854225 \n",
"L 70.084378 139.036272 \n",
"L 73.135938 139.214402 \n",
"L 76.187512 139.388619 \n",
"L 79.239071 139.558919 \n",
"L 82.290631 139.725305 \n",
"L 85.342191 139.887775 \n",
"L 88.393757 140.046331 \n",
"L 91.44531 140.200971 \n",
"L 94.496869 140.351697 \n",
"L 97.548436 140.498509 \n",
"L 100.599996 140.641405 \n",
"L 103.651563 140.780385 \n",
"L 105.641791 140.868475 \n",
"L 106.703122 140.913859 \n",
"L 109.754682 141.040563 \n",
"L 112.806249 141.163487 \n",
"L 115.857816 141.282628 \n",
"L 118.909375 141.397988 \n",
"L 121.960942 141.509564 \n",
"L 125.012502 141.617357 \n",
"L 128.064069 141.721369 \n",
"L 131.115628 141.821598 \n",
"L 134.167188 141.918047 \n",
"L 137.218755 142.010711 \n",
"L 140.270314 142.099596 \n",
"L 143.321874 142.184697 \n",
"L 146.373441 142.266014 \n",
"L 149.425 142.343551 \n",
"L 152.476564 142.417304 \n",
"L 155.528127 142.487276 \n",
"L 158.57969 142.553466 \n",
"L 161.631253 142.615871 \n",
"L 164.682813 142.674496 \n",
"L 167.734376 142.729341 \n",
"L 170.785939 142.780399 \n",
"L 173.837499 142.82768 \n",
"L 176.889062 142.871173 \n",
"L 179.940626 142.910887 \n",
"L 182.992189 142.94682 \n",
"L 186.04375 142.97897 \n",
"L 189.095313 143.007335 \n",
"L 192.146877 143.031921 \n",
"L 195.198438 143.052723 \n",
"L 198.250001 143.069741 \n",
"L 201.301564 143.082978 \n",
"L 204.353126 143.092436 \n",
"L 207.404689 143.09811 \n",
"L 210.456251 143.1 \n",
"\" clip-path=\"url(#pff4acb2cbb)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" <path d=\"M 210.456251 143.1 \n",
"L 213.507813 143.09811 \n",
"L 216.559376 143.092436 \n",
"L 219.610939 143.082978 \n",
"L 222.662501 143.069741 \n",
"L 225.714063 143.052723 \n",
"L 228.765626 143.031921 \n",
"L 231.817188 143.007335 \n",
"L 234.868751 142.97897 \n",
"L 237.920313 142.94682 \n",
"\" clip-path=\"url(#pff4acb2cbb)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_8\">\n",
" <path d=\"M 42.620312 143.052723 \n",
"L 43.320206 143.1 \n",
"\" clip-path=\"url(#pff4acb2cbb)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_9\"/>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 42.620312 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 42.620312 7.2 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"pff4acb2cbb\">\n",
" <rect x=\"42.620312\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"def momentum_2d(x1, x2, v1, v2):\n",
" v1 = beta * v1 + 0.2 * x1\n",
" v2 = beta * v2 + 4 * x2\n",
" return x1 - eta * v1, x2 - eta * v2, v1, v2\n",
"\n",
"eta, beta = 0.6, 0.5\n",
"d2l.show_trace_2d(f_2d, d2l.train_2d(momentum_2d))"
]
},
{
"cell_type": "markdown",
"id": "3394c74c",
"metadata": {
"origin_pos": 9
},
"source": [
"正如所见,尽管学习率与我们以前使用的相同,动量法仍然很好地收敛了。\n",
"让我们看看当降低动量参数时会发生什么。\n",
"将其减半至$\\beta = 0.25$会导致一条几乎没有收敛的轨迹。\n",
"尽管如此,它比没有动量时解将会发散要好得多。\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "647460a2",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:10.652444Z",
"iopub.status.busy": "2023-08-18T07:02:10.652134Z",
"iopub.status.idle": "2023-08-18T07:02:10.782619Z",
"shell.execute_reply": "2023-08-18T07:02:10.781789Z"
},
"origin_pos": 10,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 20, x1: -0.126340, x2: -0.186632\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"245.120313pt\" height=\"180.65625pt\" viewBox=\"0 0 245.120313 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:02:10.754456</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 245.120313 180.65625 \n",
"L 245.120313 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"L 42.620312 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <defs>\n",
" <path id=\"ma7811b720e\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#ma7811b720e\" x=\"88.39375\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 4 -->\n",
" <g transform=\"translate(81.022656 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-34\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_2\">\n",
" <g>\n",
" <use xlink:href=\"#ma7811b720e\" x=\"149.425\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(142.053907 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_3\">\n",
" <g>\n",
" <use xlink:href=\"#ma7811b720e\" x=\"210.456251\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(207.275001 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- x1 -->\n",
" <g transform=\"translate(134.129687 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_4\">\n",
" <defs>\n",
" <path id=\"md551d3c23f\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#md551d3c23f\" x=\"42.620312\" y=\"120.784729\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(20.878125 124.583948)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_5\">\n",
" <g>\n",
" <use xlink:href=\"#md551d3c23f\" x=\"42.620312\" y=\"76.154187\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(29.257812 79.953406)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#md551d3c23f\" x=\"42.620312\" y=\"31.523645\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(29.257812 35.322864)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- x2 -->\n",
" <g transform=\"translate(14.798437 81.290625)rotate(-90)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 57.878125 120.784729 \n",
"L 76.1875 13.671429 \n",
"L 96.877094 136.851724 \n",
"L 115.678991 21.972709 \n",
"L 131.752737 123.288502 \n",
"L 145.215595 35.495094 \n",
"L 156.410188 111.128565 \n",
"L 165.694364 46.098425 \n",
"L 173.386834 101.974719 \n",
"L 179.758282 53.974517 \n",
"L 185.0349 95.205676 \n",
"L 189.404616 59.789893 \n",
"L 193.023242 90.210253 \n",
"L 196.019859 64.080785 \n",
"L 198.50138 86.524583 \n",
"L 200.556345 67.246582 \n",
"L 202.258075 83.805334 \n",
"L 203.667289 69.58227 \n",
"L 204.834267 81.799106 \n",
"L 205.80065 71.30551 \n",
"L 206.600918 80.318936 \n",
"\" clip-path=\"url(#p4e59ddec63)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"m646e15f0a3\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p4e59ddec63)\">\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"57.878125\" y=\"120.784729\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"76.1875\" y=\"13.671429\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"96.877094\" y=\"136.851724\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"115.678991\" y=\"21.972709\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"131.752737\" y=\"123.288502\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"145.215595\" y=\"35.495094\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"156.410188\" y=\"111.128565\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"165.694364\" y=\"46.098425\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"173.386834\" y=\"101.974719\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"179.758282\" y=\"53.974517\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"185.0349\" y=\"95.205676\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"189.404616\" y=\"59.789893\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"193.023242\" y=\"90.210253\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"196.019859\" y=\"64.080785\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"198.50138\" y=\"86.524583\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"200.556345\" y=\"67.246582\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"202.258075\" y=\"83.805334\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"203.667289\" y=\"69.58227\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"204.834267\" y=\"81.799106\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"205.80065\" y=\"71.30551\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m646e15f0a3\" x=\"206.600918\" y=\"80.318936\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"PathCollection_1\"/>\n",
" <g id=\"PathCollection_2\">\n",
" <path d=\"M 97.100868 56.070444 \n",
"L 94.496869 56.490497 \n",
"L 91.44531 56.995871 \n",
"L 88.393757 57.514372 \n",
"L 85.342191 58.046001 \n",
"L 83.908323 58.30197 \n",
"L 82.290631 58.62926 \n",
"L 79.239071 59.261526 \n",
"L 76.187512 59.908668 \n",
"L 73.307377 60.533498 \n",
"L 73.135938 60.576413 \n",
"L 70.084378 61.357446 \n",
"L 67.032818 62.155645 \n",
"L 64.752171 62.765024 \n",
"L 63.981244 63.008465 \n",
"L 60.929685 63.992365 \n",
"L 57.878125 64.996552 \n",
"L 54.826565 66.248684 \n",
"L 52.48604 67.228079 \n",
"L 51.775006 67.610622 \n",
"L 48.723432 69.284276 \n",
"L 48.409729 69.459606 \n",
"L 45.671872 71.601873 \n",
"L 45.55989 71.691133 \n",
"L 43.880133 73.922661 \n",
"L 43.320214 76.154188 \n",
"L 43.880133 78.385715 \n",
"L 45.55989 80.617242 \n",
"L 45.671872 80.706502 \n",
"L 48.409729 82.848769 \n",
"L 48.723432 83.024098 \n",
"L 51.775006 84.697753 \n",
"L 52.486042 85.080296 \n",
"L 54.826565 86.059689 \n",
"L 57.878125 87.311823 \n",
"L 60.929685 88.31601 \n",
"L 63.981244 89.299909 \n",
"L 64.752171 89.54335 \n",
"L 67.032818 90.152729 \n",
"L 70.084378 90.950929 \n",
"L 73.135938 91.731962 \n",
"L 73.307377 91.774877 \n",
"L 76.187512 92.399707 \n",
"L 79.239071 93.046849 \n",
"L 82.290631 93.679115 \n",
"L 83.908319 94.006403 \n",
"L 85.342191 94.262373 \n",
"L 88.393757 94.794003 \n",
"L 91.44531 95.312502 \n",
"L 94.496869 95.817878 \n",
"L 97.100868 96.237931 \n",
"L 97.548436 96.302528 \n",
"L 100.599996 96.731216 \n",
"L 103.651563 97.14816 \n",
"L 106.703122 97.553357 \n",
"L 109.754682 97.94681 \n",
"L 112.806249 98.328519 \n",
"L 113.968755 98.469458 \n",
"L 115.857816 98.676672 \n",
"L 118.909375 99.000774 \n",
"L 121.960942 99.314251 \n",
"L 125.012502 99.617101 \n",
"L 128.064069 99.909325 \n",
"L 131.115628 100.190923 \n",
"L 134.167188 100.461893 \n",
"L 136.969651 100.700986 \n",
"L 137.218755 100.72039 \n",
"L 140.270314 100.948394 \n",
"L 143.321874 101.166696 \n",
"L 146.373441 101.375295 \n",
"L 149.425 101.574191 \n",
"L 152.476564 101.763386 \n",
"L 155.528127 101.942879 \n",
"L 158.57969 102.112669 \n",
"L 161.631253 102.272756 \n",
"L 164.682813 102.423143 \n",
"L 167.734376 102.563825 \n",
"L 170.785939 102.694807 \n",
"L 173.837499 102.816085 \n",
"L 176.889062 102.927661 \n",
"L 177.034329 102.932511 \n",
"L 179.940626 103.021774 \n",
"L 182.992189 103.106571 \n",
"L 186.04375 103.182444 \n",
"L 189.095313 103.249389 \n",
"L 192.146877 103.307409 \n",
"L 195.198438 103.356502 \n",
"L 198.250001 103.39667 \n",
"L 201.301564 103.427911 \n",
"L 204.353126 103.450227 \n",
"L 207.404689 103.463616 \n",
"L 210.456251 103.468079 \n",
"L 213.507813 103.463616 \n",
"L 216.559376 103.450227 \n",
"L 219.610939 103.427911 \n",
"L 222.662501 103.39667 \n",
"L 225.714063 103.356502 \n",
"L 228.765626 103.307409 \n",
"L 231.817188 103.249389 \n",
"L 234.868751 103.182444 \n",
"L 237.920313 103.106571 \n",
"\" clip-path=\"url(#p4e59ddec63)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_3\">\n",
" <path d=\"M 42.620312 103.356502 \n",
"L 45.671872 103.842976 \n",
"L 48.723432 104.32052 \n",
"L 51.775006 104.789143 \n",
"L 54.263642 105.164038 \n",
"L 54.826565 105.242557 \n",
"L 57.878125 105.659934 \n",
"L 60.929685 106.069047 \n",
"L 63.981244 106.469895 \n",
"L 67.032818 106.86248 \n",
"L 70.084378 107.246799 \n",
"L 71.29158 107.395566 \n",
"L 73.135938 107.607177 \n",
"L 76.187512 107.949602 \n",
"L 79.239071 108.284331 \n",
"L 82.290631 108.611365 \n",
"L 85.342191 108.930704 \n",
"L 88.393757 109.242348 \n",
"L 91.44531 109.546297 \n",
"L 92.277553 109.627094 \n",
"L 94.496869 109.82865 \n",
"L 97.548436 110.098594 \n",
"L 100.599996 110.361338 \n",
"L 103.651563 110.616884 \n",
"L 106.703122 110.86523 \n",
"L 109.754682 111.106379 \n",
"L 112.806249 111.340329 \n",
"L 115.857816 111.567082 \n",
"L 118.909375 111.786635 \n",
"L 119.943828 111.858621 \n",
"L 121.960942 111.990483 \n",
"L 125.012502 112.183206 \n",
"L 128.064069 112.369166 \n",
"L 131.115628 112.548365 \n",
"L 134.167188 112.720801 \n",
"L 137.218755 112.886476 \n",
"L 140.270314 113.045388 \n",
"L 143.321874 113.197536 \n",
"L 146.373441 113.342924 \n",
"L 149.425 113.481549 \n",
"L 152.476564 113.613412 \n",
"L 155.528127 113.738514 \n",
"L 158.57969 113.856852 \n",
"L 161.631253 113.968428 \n",
"L 164.682813 114.073243 \n",
"L 165.20896 114.090149 \n",
"L 167.734376 114.166657 \n",
"L 170.785939 114.25273 \n",
"L 173.837499 114.332428 \n",
"L 176.889062 114.40575 \n",
"L 179.940626 114.472696 \n",
"L 182.992189 114.533266 \n",
"L 186.04375 114.587459 \n",
"L 189.095313 114.635279 \n",
"L 192.146877 114.676722 \n",
"L 195.198438 114.711787 \n",
"L 198.250001 114.740479 \n",
"L 201.301564 114.762795 \n",
"L 204.353126 114.778733 \n",
"L 207.404689 114.788297 \n",
"L 210.456251 114.791485 \n",
"L 213.507813 114.788297 \n",
"L 216.559376 114.778733 \n",
"L 219.610939 114.762795 \n",
"L 222.662501 114.740479 \n",
"L 225.714063 114.711787 \n",
"L 228.765626 114.676722 \n",
"L 231.817188 114.635279 \n",
"L 234.868751 114.587459 \n",
"L 237.920313 114.533266 \n",
"\" clip-path=\"url(#p4e59ddec63)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_4\">\n",
" <path d=\"M 42.620312 114.711787 \n",
"L 45.671872 115.059268 \n",
"L 48.723432 115.400372 \n",
"L 51.775006 115.735103 \n",
"L 54.826565 116.063456 \n",
"L 57.27387 116.321676 \n",
"L 57.878125 116.381986 \n",
"L 60.929685 116.680529 \n",
"L 63.981244 116.973039 \n",
"L 67.032818 117.25952 \n",
"L 70.084378 117.539968 \n",
"L 73.135938 117.814385 \n",
"L 76.187512 118.082773 \n",
"L 79.239071 118.345128 \n",
"L 81.716227 118.553204 \n",
"L 82.290631 118.598978 \n",
"L 85.342191 118.836434 \n",
"L 88.393757 119.06817 \n",
"L 91.44531 119.294183 \n",
"L 94.496869 119.514475 \n",
"L 97.548436 119.729044 \n",
"L 100.599996 119.937892 \n",
"L 103.651563 120.141019 \n",
"L 106.703122 120.338424 \n",
"L 109.754682 120.530107 \n",
"L 112.806249 120.716066 \n",
"L 113.968764 120.784729 \n",
"L 115.857816 120.890861 \n",
"L 118.909375 121.056867 \n",
"L 121.960942 121.217428 \n",
"L 125.012502 121.372545 \n",
"L 128.064069 121.522221 \n",
"L 131.115628 121.666455 \n",
"L 134.167188 121.805244 \n",
"L 137.218755 121.938591 \n",
"L 140.270314 122.066495 \n",
"L 143.321874 122.188957 \n",
"L 146.373441 122.305977 \n",
"L 149.425 122.417554 \n",
"L 152.476564 122.523686 \n",
"L 155.528127 122.624376 \n",
"L 158.57969 122.719627 \n",
"L 161.631253 122.809432 \n",
"L 164.682813 122.893792 \n",
"L 167.734376 122.972713 \n",
"L 169.542874 123.016259 \n",
"L 170.785939 123.044799 \n",
"L 173.837499 123.109668 \n",
"L 176.889062 123.169348 \n",
"L 179.940626 123.223838 \n",
"L 182.992189 123.27314 \n",
"L 186.04375 123.317252 \n",
"L 189.095313 123.356175 \n",
"L 192.146877 123.389906 \n",
"L 195.198438 123.41845 \n",
"L 198.250001 123.441803 \n",
"L 201.301564 123.459967 \n",
"L 204.353126 123.472941 \n",
"L 207.404689 123.480723 \n",
"L 210.456251 123.483319 \n",
"L 213.507813 123.480723 \n",
"L 216.559376 123.472941 \n",
"L 219.610939 123.459967 \n",
"L 222.662501 123.441803 \n",
"L 225.714063 123.41845 \n",
"L 228.765626 123.389906 \n",
"L 231.817188 123.356175 \n",
"L 234.868751 123.317252 \n",
"L 237.920313 123.27314 \n",
"\" clip-path=\"url(#p4e59ddec63)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_5\">\n",
" <path d=\"M 42.620312 123.41845 \n",
"L 45.671872 123.701282 \n",
"L 48.723432 123.978926 \n",
"L 51.775006 124.251381 \n",
"L 54.826565 124.518645 \n",
"L 57.878125 124.780719 \n",
"L 60.929685 125.037605 \n",
"L 63.477901 125.247784 \n",
"L 63.981244 125.287455 \n",
"L 67.032818 125.523006 \n",
"L 70.084378 125.753596 \n",
"L 73.135938 125.979229 \n",
"L 76.187512 126.199902 \n",
"L 79.239071 126.415616 \n",
"L 82.290631 126.626371 \n",
"L 85.342191 126.832168 \n",
"L 88.393757 127.033006 \n",
"L 91.44531 127.228883 \n",
"L 94.496869 127.419802 \n",
"L 95.47338 127.479309 \n",
"L 97.548436 127.600381 \n",
"L 100.599996 127.77368 \n",
"L 103.651563 127.942232 \n",
"L 106.703122 128.106036 \n",
"L 109.754682 128.265093 \n",
"L 112.806249 128.4194 \n",
"L 115.857816 128.568959 \n",
"L 118.909375 128.713773 \n",
"L 121.960942 128.853837 \n",
"L 125.012502 128.989152 \n",
"L 128.064069 129.11972 \n",
"L 131.115628 129.245542 \n",
"L 134.167188 129.366613 \n",
"L 137.218755 129.482937 \n",
"L 140.270314 129.594513 \n",
"L 143.321874 129.701342 \n",
"L 143.605786 129.71084 \n",
"L 146.373441 129.799644 \n",
"L 149.425 129.893005 \n",
"L 152.476564 129.981809 \n",
"L 155.528127 130.06606 \n",
"L 158.57969 130.145759 \n",
"L 161.631253 130.220903 \n",
"L 164.682813 130.29149 \n",
"L 167.734376 130.357525 \n",
"L 170.785939 130.419007 \n",
"L 173.837499 130.475933 \n",
"L 176.889062 130.528305 \n",
"L 179.940626 130.576123 \n",
"L 182.992189 130.619388 \n",
"L 186.04375 130.658098 \n",
"L 189.095313 130.692255 \n",
"L 192.146877 130.721856 \n",
"L 195.198438 130.746905 \n",
"L 198.250001 130.767398 \n",
"L 201.301564 130.783338 \n",
"L 204.353126 130.794723 \n",
"L 207.404689 130.801553 \n",
"L 210.456251 130.803831 \n",
"L 213.507813 130.801553 \n",
"L 216.559376 130.794723 \n",
"L 219.610939 130.783338 \n",
"L 222.662501 130.767398 \n",
"L 225.714063 130.746905 \n",
"L 228.765626 130.721856 \n",
"L 231.817188 130.692255 \n",
"L 234.868751 130.658098 \n",
"L 237.920313 130.619388 \n",
"\" clip-path=\"url(#p4e59ddec63)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_6\">\n",
" <path d=\"M 42.620312 130.746905 \n",
"L 45.671872 130.995104 \n",
"L 48.723432 131.238751 \n",
"L 51.775006 131.477844 \n",
"L 54.826565 131.712381 \n",
"L 57.878125 131.942365 \n",
"L 60.929685 132.158954 \n",
"L 63.981244 132.371168 \n",
"L 67.032818 132.579007 \n",
"L 70.084378 132.782469 \n",
"L 73.135938 132.981557 \n",
"L 76.187512 133.176269 \n",
"L 79.239071 133.366605 \n",
"L 82.290631 133.552565 \n",
"L 85.342191 133.734151 \n",
"L 88.393757 133.911361 \n",
"L 91.44531 134.084194 \n",
"L 93.070112 134.173889 \n",
"L 94.496869 134.24968 \n",
"L 97.548436 134.40757 \n",
"L 100.599996 134.561251 \n",
"L 103.651563 134.710721 \n",
"L 106.703122 134.855981 \n",
"L 109.754682 134.997031 \n",
"L 112.806249 135.133869 \n",
"L 115.857816 135.266498 \n",
"L 118.909375 135.394917 \n",
"L 121.960942 135.519124 \n",
"L 125.012502 135.63912 \n",
"L 128.064069 135.754907 \n",
"L 131.115628 135.866484 \n",
"L 134.167188 135.973849 \n",
"L 137.218755 136.077004 \n",
"L 140.270314 136.175949 \n",
"L 143.321874 136.270684 \n",
"L 146.373441 136.361209 \n",
"L 147.936485 136.40542 \n",
"L 149.425 136.445992 \n",
"L 152.476564 136.525109 \n",
"L 155.528127 136.600169 \n",
"L 158.57969 136.671174 \n",
"L 161.631253 136.738119 \n",
"L 164.682813 136.801006 \n",
"L 167.734376 136.859838 \n",
"L 170.785939 136.914613 \n",
"L 173.837499 136.965329 \n",
"L 176.889062 137.011987 \n",
"L 179.940626 137.054589 \n",
"L 182.992189 137.093134 \n",
"L 186.04375 137.127621 \n",
"L 189.095313 137.158052 \n",
"L 192.146877 137.184424 \n",
"L 195.198438 137.20674 \n",
"L 198.250001 137.224998 \n",
"L 201.301564 137.239198 \n",
"L 204.353126 137.249342 \n",
"L 207.404689 137.255427 \n",
"L 210.456251 137.257456 \n",
"L 213.507813 137.255427 \n",
"L 216.559376 137.249342 \n",
"L 219.610939 137.239198 \n",
"L 222.662501 137.224998 \n",
"L 225.714063 137.20674 \n",
"L 228.765626 137.184424 \n",
"L 231.817188 137.158052 \n",
"L 234.868751 137.127621 \n",
"L 237.920313 137.093134 \n",
"\" clip-path=\"url(#p4e59ddec63)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_7\">\n",
" <path d=\"M 42.620312 137.206738 \n",
"L 45.671872 137.427865 \n",
"L 48.723432 137.644928 \n",
"L 51.775006 137.857941 \n",
"L 54.826565 138.066894 \n",
"L 57.878125 138.271785 \n",
"L 60.929685 138.472624 \n",
"L 63.477871 138.636945 \n",
"L 63.981244 138.668266 \n",
"L 67.032818 138.854225 \n",
"L 70.084378 139.036272 \n",
"L 73.135938 139.214402 \n",
"L 76.187512 139.388619 \n",
"L 79.239071 139.558919 \n",
"L 82.290631 139.725305 \n",
"L 85.342191 139.887775 \n",
"L 88.393757 140.046331 \n",
"L 91.44531 140.200971 \n",
"L 94.496869 140.351697 \n",
"L 97.548436 140.498509 \n",
"L 100.599996 140.641405 \n",
"L 103.651563 140.780385 \n",
"L 105.641791 140.868475 \n",
"L 106.703122 140.913859 \n",
"L 109.754682 141.040563 \n",
"L 112.806249 141.163487 \n",
"L 115.857816 141.282628 \n",
"L 118.909375 141.397988 \n",
"L 121.960942 141.509564 \n",
"L 125.012502 141.617357 \n",
"L 128.064069 141.721369 \n",
"L 131.115628 141.821598 \n",
"L 134.167188 141.918047 \n",
"L 137.218755 142.010711 \n",
"L 140.270314 142.099596 \n",
"L 143.321874 142.184697 \n",
"L 146.373441 142.266014 \n",
"L 149.425 142.343551 \n",
"L 152.476564 142.417304 \n",
"L 155.528127 142.487276 \n",
"L 158.57969 142.553466 \n",
"L 161.631253 142.615871 \n",
"L 164.682813 142.674496 \n",
"L 167.734376 142.729341 \n",
"L 170.785939 142.780399 \n",
"L 173.837499 142.82768 \n",
"L 176.889062 142.871173 \n",
"L 179.940626 142.910887 \n",
"L 182.992189 142.94682 \n",
"L 186.04375 142.97897 \n",
"L 189.095313 143.007335 \n",
"L 192.146877 143.031921 \n",
"L 195.198438 143.052723 \n",
"L 198.250001 143.069741 \n",
"L 201.301564 143.082978 \n",
"L 204.353126 143.092436 \n",
"L 207.404689 143.09811 \n",
"L 210.456251 143.1 \n",
"\" clip-path=\"url(#p4e59ddec63)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" <path d=\"M 210.456251 143.1 \n",
"L 213.507813 143.09811 \n",
"L 216.559376 143.092436 \n",
"L 219.610939 143.082978 \n",
"L 222.662501 143.069741 \n",
"L 225.714063 143.052723 \n",
"L 228.765626 143.031921 \n",
"L 231.817188 143.007335 \n",
"L 234.868751 142.97897 \n",
"L 237.920313 142.94682 \n",
"\" clip-path=\"url(#p4e59ddec63)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_8\">\n",
" <path d=\"M 42.620312 143.052723 \n",
"L 43.320206 143.1 \n",
"\" clip-path=\"url(#p4e59ddec63)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_9\"/>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 42.620312 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 42.620312 7.2 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p4e59ddec63\">\n",
" <rect x=\"42.620312\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"eta, beta = 0.6, 0.25\n",
"d2l.show_trace_2d(f_2d, d2l.train_2d(momentum_2d))"
]
},
{
"cell_type": "markdown",
"id": "86235621",
"metadata": {
"origin_pos": 11
},
"source": [
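    "Note that we can combine momentum with stochastic gradient descent, and in particular with minibatch stochastic gradient descent.\n",
    "The only change is that in this case we replace the gradient $\\mathbf{g}_{t, t-1}$ with $\\mathbf{g}_t$.\n",
    "For convenience, we initialize $\\mathbf{v}_0 = 0$ at time $t=0$.\n",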
"\n",
    "### Effective Sample Weight\n",
"\n",
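    "Recall that $\\mathbf{v}_t = \\sum_{\\tau = 0}^{t-1} \\beta^{\\tau} \\mathbf{g}_{t-\\tau, t-\\tau-1}$.\n",
    "In the limit, the weights add up to $\\sum_{\\tau=0}^\\infty \\beta^\\tau = \\frac{1}{1-\\beta}$.\n",
    "In other words, rather than taking a step of size $\\eta$ as in gradient descent or stochastic gradient descent, we take a step of size $\\frac{\\eta}{1-\\beta}$ while at the same time following a potentially much better-behaved descent direction.\n",
    "For example, $\\beta = 0.9$ gives an effective step size of $10\\eta$.\n",
    "These are two benefits in one.\n",
    "To illustrate how the weighting behaves for different choices of $\\beta$, consider the diagram below.\n"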
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "d4356576",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:10.786111Z",
"iopub.status.busy": "2023-08-18T07:02:10.785799Z",
"iopub.status.idle": "2023-08-18T07:02:10.963762Z",
"shell.execute_reply": "2023-08-18T07:02:10.962940Z"
},
"origin_pos": 12,
"tab": [
"pytorch"
]
},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"234.6408pt\" height=\"180.65625pt\" viewBox=\"0 0 234.6408 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:02:10.919653</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 234.6408 180.65625 \n",
"L 234.6408 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 30.103125 143.1 \n",
"L 225.403125 143.1 \n",
"L 225.403125 7.2 \n",
"L 30.103125 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <defs>\n",
" <path id=\"m239bec96be\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m239bec96be\" x=\"38.980398\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(35.799148 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_2\">\n",
" <g>\n",
" <use xlink:href=\"#m239bec96be\" x=\"84.504873\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(78.142373 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_3\">\n",
" <g>\n",
" <use xlink:href=\"#m239bec96be\" x=\"130.029349\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 20 -->\n",
" <g transform=\"translate(123.666849 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#m239bec96be\" x=\"175.553824\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 30 -->\n",
" <g transform=\"translate(169.191324 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-33\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_5\">\n",
" <g>\n",
" <use xlink:href=\"#m239bec96be\" x=\"221.0783\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 40 -->\n",
" <g transform=\"translate(214.7158 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-34\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- time -->\n",
" <g transform=\"translate(116.457031 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-74\" d=\"M 1172 4494 \n",
"L 1172 3500 \n",
"L 2356 3500 \n",
"L 2356 3053 \n",
"L 1172 3053 \n",
"L 1172 1153 \n",
"Q 1172 725 1289 603 \n",
"Q 1406 481 1766 481 \n",
"L 2356 481 \n",
"L 2356 0 \n",
"L 1766 0 \n",
"Q 1100 0 847 248 \n",
"Q 594 497 594 1153 \n",
"L 594 3053 \n",
"L 172 3053 \n",
"L 172 3500 \n",
"L 594 3500 \n",
"L 594 4494 \n",
"L 1172 4494 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-69\" d=\"M 603 3500 \n",
"L 1178 3500 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 3500 \n",
"z\n",
"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 4134 \n",
"L 603 4134 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6d\" d=\"M 3328 2828 \n",
"Q 3544 3216 3844 3400 \n",
"Q 4144 3584 4550 3584 \n",
"Q 5097 3584 5394 3201 \n",
"Q 5691 2819 5691 2113 \n",
"L 5691 0 \n",
"L 5113 0 \n",
"L 5113 2094 \n",
"Q 5113 2597 4934 2840 \n",
"Q 4756 3084 4391 3084 \n",
"Q 3944 3084 3684 2787 \n",
"Q 3425 2491 3425 1978 \n",
"L 3425 0 \n",
"L 2847 0 \n",
"L 2847 2094 \n",
"Q 2847 2600 2669 2842 \n",
"Q 2491 3084 2119 3084 \n",
"Q 1678 3084 1418 2786 \n",
"Q 1159 2488 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2956 \n",
"Q 1356 3278 1631 3431 \n",
"Q 1906 3584 2284 3584 \n",
"Q 2666 3584 2933 3390 \n",
"Q 3200 3197 3328 2828 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
"L 3597 1613 \n",
"L 953 1613 \n",
"Q 991 1019 1311 708 \n",
"Q 1631 397 2203 397 \n",
"Q 2534 397 2845 478 \n",
"Q 3156 559 3463 722 \n",
"L 3463 178 \n",
"Q 3153 47 2828 -22 \n",
"Q 2503 -91 2169 -91 \n",
"Q 1331 -91 842 396 \n",
"Q 353 884 353 1716 \n",
"Q 353 2575 817 3079 \n",
"Q 1281 3584 2069 3584 \n",
"Q 2775 3584 3186 3129 \n",
"Q 3597 2675 3597 1894 \n",
"z\n",
"M 3022 2063 \n",
"Q 3016 2534 2758 2815 \n",
"Q 2500 3097 2075 3097 \n",
"Q 1594 3097 1305 2825 \n",
"Q 1016 2553 972 2059 \n",
"L 3022 2063 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-74\"/>\n",
" <use xlink:href=\"#DejaVuSans-69\" x=\"39.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-6d\" x=\"66.992188\"/>\n",
" <use xlink:href=\"#DejaVuSans-65\" x=\"164.404297\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_6\">\n",
" <defs>\n",
" <path id=\"m172497d4e9\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m172497d4e9\" x=\"30.103125\" y=\"136.922727\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0.0 -->\n",
" <g transform=\"translate(7.2 140.721946)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
"L 1344 794 \n",
"L 1344 0 \n",
"L 684 0 \n",
"L 684 794 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_7\">\n",
" <g>\n",
" <use xlink:href=\"#m172497d4e9\" x=\"30.103125\" y=\"112.213636\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0.2 -->\n",
" <g transform=\"translate(7.2 116.012855)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#m172497d4e9\" x=\"30.103125\" y=\"87.504545\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 0.4 -->\n",
" <g transform=\"translate(7.2 91.303764)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-34\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_9\">\n",
" <g>\n",
" <use xlink:href=\"#m172497d4e9\" x=\"30.103125\" y=\"62.795455\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 0.6 -->\n",
" <g transform=\"translate(7.2 66.594673)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-36\" d=\"M 2113 2584 \n",
"Q 1688 2584 1439 2293 \n",
"Q 1191 2003 1191 1497 \n",
"Q 1191 994 1439 701 \n",
"Q 1688 409 2113 409 \n",
"Q 2538 409 2786 701 \n",
"Q 3034 994 3034 1497 \n",
"Q 3034 2003 2786 2293 \n",
"Q 2538 2584 2113 2584 \n",
"z\n",
"M 3366 4563 \n",
"L 3366 3988 \n",
"Q 3128 4100 2886 4159 \n",
"Q 2644 4219 2406 4219 \n",
"Q 1781 4219 1451 3797 \n",
"Q 1122 3375 1075 2522 \n",
"Q 1259 2794 1537 2939 \n",
"Q 1816 3084 2150 3084 \n",
"Q 2853 3084 3261 2657 \n",
"Q 3669 2231 3669 1497 \n",
"Q 3669 778 3244 343 \n",
"Q 2819 -91 2113 -91 \n",
"Q 1303 -91 875 529 \n",
"Q 447 1150 447 2328 \n",
"Q 447 3434 972 4092 \n",
"Q 1497 4750 2381 4750 \n",
"Q 2619 4750 2861 4703 \n",
"Q 3103 4656 3366 4563 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-36\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#m172497d4e9\" x=\"30.103125\" y=\"38.086364\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 0.8 -->\n",
" <g transform=\"translate(7.2 41.885582)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-38\" d=\"M 2034 2216 \n",
"Q 1584 2216 1326 1975 \n",
"Q 1069 1734 1069 1313 \n",
"Q 1069 891 1326 650 \n",
"Q 1584 409 2034 409 \n",
"Q 2484 409 2743 651 \n",
"Q 3003 894 3003 1313 \n",
"Q 3003 1734 2745 1975 \n",
"Q 2488 2216 2034 2216 \n",
"z\n",
"M 1403 2484 \n",
"Q 997 2584 770 2862 \n",
"Q 544 3141 544 3541 \n",
"Q 544 4100 942 4425 \n",
"Q 1341 4750 2034 4750 \n",
"Q 2731 4750 3128 4425 \n",
"Q 3525 4100 3525 3541 \n",
"Q 3525 3141 3298 2862 \n",
"Q 3072 2584 2669 2484 \n",
"Q 3125 2378 3379 2068 \n",
"Q 3634 1759 3634 1313 \n",
"Q 3634 634 3220 271 \n",
"Q 2806 -91 2034 -91 \n",
"Q 1263 -91 848 271 \n",
"Q 434 634 434 1313 \n",
"Q 434 1759 690 2068 \n",
"Q 947 2378 1403 2484 \n",
"z\n",
"M 1172 3481 \n",
"Q 1172 3119 1398 2916 \n",
"Q 1625 2713 2034 2713 \n",
"Q 2441 2713 2670 2916 \n",
"Q 2900 3119 2900 3481 \n",
"Q 2900 3844 2670 4047 \n",
"Q 2441 4250 2034 4250 \n",
"Q 1625 4250 1398 4047 \n",
"Q 1172 3844 1172 3481 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-38\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_6\">\n",
" <g id=\"line2d_11\">\n",
" <g>\n",
" <use xlink:href=\"#m172497d4e9\" x=\"30.103125\" y=\"13.377273\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- 1.0 -->\n",
" <g transform=\"translate(7.2 17.176491)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <path d=\"M 38.980398 13.377273 \n",
"L 43.532845 19.554545 \n",
"L 48.085293 25.422955 \n",
"L 52.63774 30.997943 \n",
"L 57.190188 36.294182 \n",
"L 61.742635 41.32561 \n",
"L 66.295083 46.105466 \n",
"L 70.847531 50.646329 \n",
"L 75.399978 54.960149 \n",
"L 79.952426 59.058277 \n",
"L 84.504873 62.9515 \n",
"L 89.057321 66.650061 \n",
"L 93.609768 70.163695 \n",
"L 98.162216 73.501646 \n",
"L 102.714663 76.6727 \n",
"L 107.267111 79.685202 \n",
"L 111.819559 82.547078 \n",
"L 116.372006 85.26586 \n",
"L 120.924454 87.848704 \n",
"L 125.476901 90.302405 \n",
"L 130.029349 92.633421 \n",
"L 134.581796 94.847886 \n",
"L 139.134244 96.951628 \n",
"L 143.686691 98.950183 \n",
"L 148.239139 100.848811 \n",
"L 152.791587 102.652506 \n",
"L 157.344034 104.366017 \n",
"L 161.896482 105.993853 \n",
"L 166.448929 107.540297 \n",
"L 171.001377 109.009418 \n",
"L 175.553824 110.405084 \n",
"L 180.106272 111.730966 \n",
"L 184.658719 112.990554 \n",
"L 189.211167 114.187163 \n",
"L 193.763615 115.323941 \n",
"L 198.316062 116.40388 \n",
"L 202.86851 117.429822 \n",
"L 207.420957 118.404468 \n",
"L 211.973405 119.330381 \n",
"L 216.525852 120.209998 \n",
"\" clip-path=\"url(#p75e4ebb616)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 38.980398 13.377273 \n",
"L 43.532845 25.731818 \n",
"L 48.085293 36.850909 \n",
"L 52.63774 46.858091 \n",
"L 57.190188 55.864555 \n",
"L 61.742635 63.970372 \n",
"L 66.295083 71.265607 \n",
"L 70.847531 77.831319 \n",
"L 75.399978 83.74046 \n",
"L 79.952426 89.058687 \n",
"L 84.504873 93.845091 \n",
"L 89.057321 98.152855 \n",
"L 93.609768 102.029842 \n",
"L 98.162216 105.51913 \n",
"L 102.714663 108.65949 \n",
"L 107.267111 111.485814 \n",
"L 111.819559 114.029505 \n",
"L 116.372006 116.318827 \n",
"L 120.924454 118.379217 \n",
"L 125.476901 120.233568 \n",
"L 130.029349 121.902484 \n",
"L 134.581796 123.404509 \n",
"L 139.134244 124.75633 \n",
"L 143.686691 125.97297 \n",
"L 148.239139 127.067946 \n",
"L 152.791587 128.053424 \n",
"L 157.344034 128.940354 \n",
"L 161.896482 129.738592 \n",
"L 166.448929 130.457005 \n",
"L 171.001377 131.103577 \n",
"L 175.553824 131.685492 \n",
"L 180.106272 132.209216 \n",
"L 184.658719 132.680567 \n",
"L 189.211167 133.104783 \n",
"L 193.763615 133.486577 \n",
"L 198.316062 133.830192 \n",
"L 202.86851 134.139446 \n",
"L 207.420957 134.417774 \n",
"L 211.973405 134.668269 \n",
"L 216.525852 134.893715 \n",
"\" clip-path=\"url(#p75e4ebb616)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <path d=\"M 38.980398 13.377273 \n",
"L 43.532845 62.795455 \n",
"L 48.085293 92.446364 \n",
"L 52.63774 110.236909 \n",
"L 57.190188 120.911236 \n",
"L 61.742635 127.315833 \n",
"L 66.295083 131.158591 \n",
"L 70.847531 133.464245 \n",
"L 75.399978 134.847638 \n",
"L 79.952426 135.677674 \n",
"L 84.504873 136.175695 \n",
"L 89.057321 136.474508 \n",
"L 93.609768 136.653796 \n",
"L 98.162216 136.761368 \n",
"L 102.714663 136.825912 \n",
"L 107.267111 136.864638 \n",
"L 111.819559 136.887874 \n",
"L 116.372006 136.901815 \n",
"L 120.924454 136.91018 \n",
"L 125.476901 136.915199 \n",
"L 130.029349 136.91821 \n",
"L 134.581796 136.920017 \n",
"L 139.134244 136.921101 \n",
"L 143.686691 136.921752 \n",
"L 148.239139 136.922142 \n",
"L 152.791587 136.922376 \n",
"L 157.344034 136.922517 \n",
"L 161.896482 136.922601 \n",
"L 166.448929 136.922651 \n",
"L 171.001377 136.922682 \n",
"L 175.553824 136.9227 \n",
"L 180.106272 136.922711 \n",
"L 184.658719 136.922717 \n",
"L 189.211167 136.922721 \n",
"L 193.763615 136.922724 \n",
"L 198.316062 136.922725 \n",
"L 202.86851 136.922726 \n",
"L 207.420957 136.922727 \n",
"L 211.973405 136.922727 \n",
"L 216.525852 136.922727 \n",
"\" clip-path=\"url(#p75e4ebb616)\" style=\"fill: none; stroke: #2ca02c; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 38.980398 13.377273 \n",
"L 43.532845 136.922727 \n",
"L 48.085293 136.922727 \n",
"L 52.63774 136.922727 \n",
"L 57.190188 136.922727 \n",
"L 61.742635 136.922727 \n",
"L 66.295083 136.922727 \n",
"L 70.847531 136.922727 \n",
"L 75.399978 136.922727 \n",
"L 79.952426 136.922727 \n",
"L 84.504873 136.922727 \n",
"L 89.057321 136.922727 \n",
"L 93.609768 136.922727 \n",
"L 98.162216 136.922727 \n",
"L 102.714663 136.922727 \n",
"L 107.267111 136.922727 \n",
"L 111.819559 136.922727 \n",
"L 116.372006 136.922727 \n",
"L 120.924454 136.922727 \n",
"L 125.476901 136.922727 \n",
"L 130.029349 136.922727 \n",
"L 134.581796 136.922727 \n",
"L 139.134244 136.922727 \n",
"L 143.686691 136.922727 \n",
"L 148.239139 136.922727 \n",
"L 152.791587 136.922727 \n",
"L 157.344034 136.922727 \n",
"L 161.896482 136.922727 \n",
"L 166.448929 136.922727 \n",
"L 171.001377 136.922727 \n",
"L 175.553824 136.922727 \n",
"L 180.106272 136.922727 \n",
"L 184.658719 136.922727 \n",
"L 189.211167 136.922727 \n",
"L 193.763615 136.922727 \n",
"L 198.316062 136.922727 \n",
"L 202.86851 136.922727 \n",
"L 207.420957 136.922727 \n",
"L 211.973405 136.922727 \n",
"L 216.525852 136.922727 \n",
"\" clip-path=\"url(#p75e4ebb616)\" style=\"fill: none; stroke: #d62728; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 30.103125 143.1 \n",
"L 30.103125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 225.403125 143.1 \n",
"L 225.403125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 30.103125 143.1 \n",
"L 225.403125 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 30.103125 7.2 \n",
"L 225.403125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"legend_1\">\n",
" <g id=\"patch_7\">\n",
" <path d=\"M 126.851562 73.9125 \n",
"L 218.403125 73.9125 \n",
"Q 220.403125 73.9125 220.403125 71.9125 \n",
"L 220.403125 14.2 \n",
"Q 220.403125 12.2 218.403125 12.2 \n",
"L 126.851562 12.2 \n",
"Q 124.851562 12.2 124.851562 14.2 \n",
"L 124.851562 71.9125 \n",
"Q 124.851562 73.9125 126.851562 73.9125 \n",
"z\n",
"\" style=\"fill: #ffffff; opacity: 0.8; stroke: #cccccc; stroke-linejoin: miter\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <path d=\"M 128.851562 20.298437 \n",
"L 138.851562 20.298437 \n",
"L 148.851562 20.298437 \n",
"\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- beta = 0.95 -->\n",
" <g transform=\"translate(156.851562 23.798437)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-62\" d=\"M 3116 1747 \n",
"Q 3116 2381 2855 2742 \n",
"Q 2594 3103 2138 3103 \n",
"Q 1681 3103 1420 2742 \n",
"Q 1159 2381 1159 1747 \n",
"Q 1159 1113 1420 752 \n",
"Q 1681 391 2138 391 \n",
"Q 2594 391 2855 752 \n",
"Q 3116 1113 3116 1747 \n",
"z\n",
"M 1159 2969 \n",
"Q 1341 3281 1617 3432 \n",
"Q 1894 3584 2278 3584 \n",
"Q 2916 3584 3314 3078 \n",
"Q 3713 2572 3713 1747 \n",
"Q 3713 922 3314 415 \n",
"Q 2916 -91 2278 -91 \n",
"Q 1894 -91 1617 61 \n",
"Q 1341 213 1159 525 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 4863 \n",
"L 1159 4863 \n",
"L 1159 2969 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-61\" d=\"M 2194 1759 \n",
"Q 1497 1759 1228 1600 \n",
"Q 959 1441 959 1056 \n",
"Q 959 750 1161 570 \n",
"Q 1363 391 1709 391 \n",
"Q 2188 391 2477 730 \n",
"Q 2766 1069 2766 1631 \n",
"L 2766 1759 \n",
"L 2194 1759 \n",
"z\n",
"M 3341 1997 \n",
"L 3341 0 \n",
"L 2766 0 \n",
"L 2766 531 \n",
"Q 2569 213 2275 61 \n",
"Q 1981 -91 1556 -91 \n",
"Q 1019 -91 701 211 \n",
"Q 384 513 384 1019 \n",
"Q 384 1609 779 1909 \n",
"Q 1175 2209 1959 2209 \n",
"L 2766 2209 \n",
"L 2766 2266 \n",
"Q 2766 2663 2505 2880 \n",
"Q 2244 3097 1772 3097 \n",
"Q 1472 3097 1187 3025 \n",
"Q 903 2953 641 2809 \n",
"L 641 3341 \n",
"Q 956 3463 1253 3523 \n",
"Q 1550 3584 1831 3584 \n",
"Q 2591 3584 2966 3190 \n",
"Q 3341 2797 3341 1997 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-20\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-3d\" d=\"M 678 2906 \n",
"L 4684 2906 \n",
"L 4684 2381 \n",
"L 678 2381 \n",
"L 678 2906 \n",
"z\n",
"M 678 1631 \n",
"L 4684 1631 \n",
"L 4684 1100 \n",
"L 678 1100 \n",
"L 678 1631 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-39\" d=\"M 703 97 \n",
"L 703 672 \n",
"Q 941 559 1184 500 \n",
"Q 1428 441 1663 441 \n",
"Q 2288 441 2617 861 \n",
"Q 2947 1281 2994 2138 \n",
"Q 2813 1869 2534 1725 \n",
"Q 2256 1581 1919 1581 \n",
"Q 1219 1581 811 2004 \n",
"Q 403 2428 403 3163 \n",
"Q 403 3881 828 4315 \n",
"Q 1253 4750 1959 4750 \n",
"Q 2769 4750 3195 4129 \n",
"Q 3622 3509 3622 2328 \n",
"Q 3622 1225 3098 567 \n",
"Q 2575 -91 1691 -91 \n",
"Q 1453 -91 1209 -44 \n",
"Q 966 3 703 97 \n",
"z\n",
"M 1959 2075 \n",
"Q 2384 2075 2632 2365 \n",
"Q 2881 2656 2881 3163 \n",
"Q 2881 3666 2632 3958 \n",
"Q 2384 4250 1959 4250 \n",
"Q 1534 4250 1286 3958 \n",
"Q 1038 3666 1038 3163 \n",
"Q 1038 2656 1286 2365 \n",
"Q 1534 2075 1959 2075 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-62\"/>\n",
" <use xlink:href=\"#DejaVuSans-65\" x=\"63.476562\"/>\n",
" <use xlink:href=\"#DejaVuSans-74\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"164.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"225.488281\"/>\n",
" <use xlink:href=\"#DejaVuSans-3d\" x=\"257.275391\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"341.064453\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"372.851562\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"436.474609\"/>\n",
" <use xlink:href=\"#DejaVuSans-39\" x=\"468.261719\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"531.884766\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 128.851562 34.976562 \n",
"L 138.851562 34.976562 \n",
"L 148.851562 34.976562 \n",
"\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"text_14\">\n",
" <!-- beta = 0.90 -->\n",
" <g transform=\"translate(156.851562 38.476562)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-62\"/>\n",
" <use xlink:href=\"#DejaVuSans-65\" x=\"63.476562\"/>\n",
" <use xlink:href=\"#DejaVuSans-74\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"164.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"225.488281\"/>\n",
" <use xlink:href=\"#DejaVuSans-3d\" x=\"257.275391\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"341.064453\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"372.851562\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"436.474609\"/>\n",
" <use xlink:href=\"#DejaVuSans-39\" x=\"468.261719\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"531.884766\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <path d=\"M 128.851562 49.654687 \n",
"L 138.851562 49.654687 \n",
"L 148.851562 49.654687 \n",
"\" style=\"fill: none; stroke: #2ca02c; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"text_15\">\n",
" <!-- beta = 0.60 -->\n",
" <g transform=\"translate(156.851562 53.154687)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-62\"/>\n",
" <use xlink:href=\"#DejaVuSans-65\" x=\"63.476562\"/>\n",
" <use xlink:href=\"#DejaVuSans-74\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"164.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"225.488281\"/>\n",
" <use xlink:href=\"#DejaVuSans-3d\" x=\"257.275391\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"341.064453\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"372.851562\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"436.474609\"/>\n",
" <use xlink:href=\"#DejaVuSans-36\" x=\"468.261719\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"531.884766\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 128.851562 64.332812 \n",
"L 138.851562 64.332812 \n",
"L 148.851562 64.332812 \n",
"\" style=\"fill: none; stroke: #d62728; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"text_16\">\n",
" <!-- beta = 0.00 -->\n",
" <g transform=\"translate(156.851562 67.832812)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-62\"/>\n",
" <use xlink:href=\"#DejaVuSans-65\" x=\"63.476562\"/>\n",
" <use xlink:href=\"#DejaVuSans-74\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"164.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"225.488281\"/>\n",
" <use xlink:href=\"#DejaVuSans-3d\" x=\"257.275391\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"341.064453\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"372.851562\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"436.474609\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"468.261719\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"531.884766\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p75e4ebb616\">\n",
" <rect x=\"30.103125\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
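   ],
   "source": [
    "d2l.set_figsize()\n",
    "betas = [0.95, 0.9, 0.6, 0]\n",
    "# Time steps are shared by all curves, so compute them once\n",
    "x = torch.arange(40).numpy()\n",
    "for beta in betas:\n",
    "    # beta ** x is the weight given to a gradient from x steps ago\n",
    "    d2l.plt.plot(x, beta ** x, label=f'beta = {beta:.2f}')\n",
    "d2l.plt.xlabel('time')\n",
    "d2l.plt.legend();"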
]
},
{
"cell_type": "markdown",
"id": "34672832",
"metadata": {
"origin_pos": 13
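   },
   "source": [
    "## Practical Experiments\n",
    "\n",
    "Let's see how the momentum method behaves in practice.\n",
    "For this we need a more scalable implementation.\n",
    "\n",
    "### Implementation from Scratch\n",
    "\n",
    "Compared with minibatch stochastic gradient descent, the momentum method needs to maintain a set of auxiliary variables, namely the velocity.\n",
    "It has the same shape as the gradients and as the variables of the optimization problem.\n",
    "In the implementation below, we call these variables `states`.\n"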
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "efcefb6c",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:10.967317Z",
"iopub.status.busy": "2023-08-18T07:02:10.967005Z",
"iopub.status.idle": "2023-08-18T07:02:10.971350Z",
"shell.execute_reply": "2023-08-18T07:02:10.970589Z"
},
"origin_pos": 14,
"tab": [
"pytorch"
]
   },
   "outputs": [],
   "source": [
    "def init_momentum_states(feature_dim):\n",
    "    # One zero-initialized velocity buffer per parameter (weights, bias)\n",
    "    v_w = torch.zeros((feature_dim, 1))\n",
    "    v_b = torch.zeros(1)\n",
    "    return (v_w, v_b)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "09906cb6",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:10.974571Z",
"iopub.status.busy": "2023-08-18T07:02:10.974264Z",
"iopub.status.idle": "2023-08-18T07:02:10.979046Z",
"shell.execute_reply": "2023-08-18T07:02:10.978264Z"
},
"origin_pos": 18,
"tab": [
"pytorch"
]
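   },
   "outputs": [],
   "source": [
    "def sgd_momentum(params, states, hyperparams):\n",
    "    for p, v in zip(params, states):\n",
    "        with torch.no_grad():\n",
    "            # Velocity: decay the old value, then add the new gradient\n",
    "            v[:] = hyperparams['momentum'] * v + p.grad\n",
    "            # Step along the velocity rather than the raw gradient\n",
    "            p[:] -= hyperparams['lr'] * v\n",
    "        # Clear the gradient before the next minibatch\n",
    "        p.grad.data.zero_()"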
]
},
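  {
   "cell_type": "markdown",
   "id": "momentum-update-eq",
   "metadata": {},
   "source": [
    "For reference, the update that `sgd_momentum` applies to each parameter can be sketched in the notation of the leaky average.\n",
    "As assumptions for this sketch, $\\eta$ stands for `hyperparams['lr']`, $\\beta$ for `hyperparams['momentum']`, $\\mathbf{w}$ for a parameter, and $\\mathbf{g}_{t, t-1}$ for its minibatch gradient:\n",
    "\n",
    "$$\\begin{aligned}\n",
    "\\mathbf{v}_t &\\leftarrow \\beta \\mathbf{v}_{t-1} + \\mathbf{g}_{t, t-1}, \\\\\n",
    "\\mathbf{w}_t &\\leftarrow \\mathbf{w}_{t-1} - \\eta \\mathbf{v}_t.\n",
    "\\end{aligned}$$\n",
    "\n",
    "With $\\beta = 0$ the velocity equals the current gradient, so the update reduces to plain minibatch stochastic gradient descent.\n"
   ]
  },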
{
"cell_type": "markdown",
"id": "b7ac4059",
"metadata": {
"origin_pos": 21
   },
   "source": [
    "Let's see how it works in practice.\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "a9e8f491",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:10.982336Z",
"iopub.status.busy": "2023-08-18T07:02:10.982029Z",
"iopub.status.idle": "2023-08-18T07:02:13.845008Z",
"shell.execute_reply": "2023-08-18T07:02:13.844067Z"
},
"origin_pos": 22,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loss: 0.246, 0.013 sec/epoch\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"266.957813pt\" height=\"184.455469pt\" viewBox=\"0 0 266.957813 184.455469\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:02:13.806048</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M -0 184.455469 \n",
"L 266.957813 184.455469 \n",
"L 266.957813 0 \n",
"L -0 0 \n",
"L -0 184.455469 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"L 56.50625 10.999219 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"m04a8415d8d\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m04a8415d8d\" x=\"56.50625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 0.0 -->\n",
" <g transform=\"translate(48.554688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
"L 1344 794 \n",
"L 1344 0 \n",
"L 684 0 \n",
"L 684 794 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 105.33125 146.899219 \n",
"L 105.33125 10.999219 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#m04a8415d8d\" x=\"105.33125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 0.5 -->\n",
" <g transform=\"translate(97.379688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 154.15625 146.899219 \n",
"L 154.15625 10.999219 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m04a8415d8d\" x=\"154.15625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 1.0 -->\n",
" <g transform=\"translate(146.204688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 202.98125 146.899219 \n",
"L 202.98125 10.999219 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#m04a8415d8d\" x=\"202.98125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 1.5 -->\n",
" <g transform=\"translate(195.029688 161.497656)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#m04a8415d8d\" x=\"251.80625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 2.0 -->\n",
" <g transform=\"translate(243.854688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- epoch -->\n",
" <g transform=\"translate(138.928125 175.175781)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
"L 3597 1613 \n",
"L 953 1613 \n",
"Q 991 1019 1311 708 \n",
"Q 1631 397 2203 397 \n",
"Q 2534 397 2845 478 \n",
"Q 3156 559 3463 722 \n",
"L 3463 178 \n",
"Q 3153 47 2828 -22 \n",
"Q 2503 -91 2169 -91 \n",
"Q 1331 -91 842 396 \n",
"Q 353 884 353 1716 \n",
"Q 353 2575 817 3079 \n",
"Q 1281 3584 2069 3584 \n",
"Q 2775 3584 3186 3129 \n",
"Q 3597 2675 3597 1894 \n",
"z\n",
"M 3022 2063 \n",
"Q 3016 2534 2758 2815 \n",
"Q 2500 3097 2075 3097 \n",
"Q 1594 3097 1305 2825 \n",
"Q 1016 2553 972 2059 \n",
"L 3022 2063 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-70\" d=\"M 1159 525 \n",
"L 1159 -1331 \n",
"L 581 -1331 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2969 \n",
"Q 1341 3281 1617 3432 \n",
"Q 1894 3584 2278 3584 \n",
"Q 2916 3584 3314 3078 \n",
"Q 3713 2572 3713 1747 \n",
"Q 3713 922 3314 415 \n",
"Q 2916 -91 2278 -91 \n",
"Q 1894 -91 1617 61 \n",
"Q 1341 213 1159 525 \n",
"z\n",
"M 3116 1747 \n",
"Q 3116 2381 2855 2742 \n",
"Q 2594 3103 2138 3103 \n",
"Q 1681 3103 1420 2742 \n",
"Q 1159 2381 1159 1747 \n",
"Q 1159 1113 1420 752 \n",
"Q 1681 391 2138 391 \n",
"Q 2594 391 2855 752 \n",
"Q 3116 1113 3116 1747 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6f\" d=\"M 1959 3097 \n",
"Q 1497 3097 1228 2736 \n",
"Q 959 2375 959 1747 \n",
"Q 959 1119 1226 758 \n",
"Q 1494 397 1959 397 \n",
"Q 2419 397 2687 759 \n",
"Q 2956 1122 2956 1747 \n",
"Q 2956 2369 2687 2733 \n",
"Q 2419 3097 1959 3097 \n",
"z\n",
"M 1959 3584 \n",
"Q 2709 3584 3137 3096 \n",
"Q 3566 2609 3566 1747 \n",
"Q 3566 888 3137 398 \n",
"Q 2709 -91 1959 -91 \n",
"Q 1206 -91 779 398 \n",
"Q 353 888 353 1747 \n",
"Q 353 2609 779 3096 \n",
"Q 1206 3584 1959 3584 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-63\" d=\"M 3122 3366 \n",
"L 3122 2828 \n",
"Q 2878 2963 2633 3030 \n",
"Q 2388 3097 2138 3097 \n",
"Q 1578 3097 1268 2742 \n",
"Q 959 2388 959 1747 \n",
"Q 959 1106 1268 751 \n",
"Q 1578 397 2138 397 \n",
"Q 2388 397 2633 464 \n",
"Q 2878 531 3122 666 \n",
"L 3122 134 \n",
"Q 2881 22 2623 -34 \n",
"Q 2366 -91 2075 -91 \n",
"Q 1284 -91 818 406 \n",
"Q 353 903 353 1747 \n",
"Q 353 2603 823 3093 \n",
"Q 1294 3584 2113 3584 \n",
"Q 2378 3584 2631 3529 \n",
"Q 2884 3475 3122 3366 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-68\" d=\"M 3513 2113 \n",
"L 3513 0 \n",
"L 2938 0 \n",
"L 2938 2094 \n",
"Q 2938 2591 2744 2837 \n",
"Q 2550 3084 2163 3084 \n",
"Q 1697 3084 1428 2787 \n",
"Q 1159 2491 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 4863 \n",
"L 1159 4863 \n",
"L 1159 2956 \n",
"Q 1366 3272 1645 3428 \n",
"Q 1925 3584 2291 3584 \n",
"Q 2894 3584 3203 3211 \n",
"Q 3513 2838 3513 2113 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-65\"/>\n",
" <use xlink:href=\"#DejaVuSans-70\" x=\"61.523438\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"186.181641\"/>\n",
" <use xlink:href=\"#DejaVuSans-68\" x=\"241.162109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 56.50625 141.672296 \n",
"L 251.80625 141.672296 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"m4160a6110f\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m4160a6110f\" x=\"56.50625\" y=\"141.672296\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0.225 -->\n",
" <g transform=\"translate(20.878125 145.471514)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 56.50625 115.53768 \n",
"L 251.80625 115.53768 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#m4160a6110f\" x=\"56.50625\" y=\"115.53768\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0.250 -->\n",
" <g transform=\"translate(20.878125 119.336899)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 56.50625 89.403065 \n",
"L 251.80625 89.403065 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#m4160a6110f\" x=\"56.50625\" y=\"89.403065\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 0.275 -->\n",
" <g transform=\"translate(20.878125 93.202284)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-37\" d=\"M 525 4666 \n",
"L 3525 4666 \n",
"L 3525 4397 \n",
"L 1831 0 \n",
"L 1172 0 \n",
"L 2766 4134 \n",
"L 525 4134 \n",
"L 525 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-37\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 56.50625 63.26845 \n",
"L 251.80625 63.26845 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#m4160a6110f\" x=\"56.50625\" y=\"63.26845\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 0.300 -->\n",
" <g transform=\"translate(20.878125 67.067668)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 56.50625 37.133834 \n",
"L 251.80625 37.133834 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <g>\n",
" <use xlink:href=\"#m4160a6110f\" x=\"56.50625\" y=\"37.133834\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 0.325 -->\n",
" <g transform=\"translate(20.878125 40.933053)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_6\">\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <g>\n",
" <use xlink:href=\"#m4160a6110f\" x=\"56.50625\" y=\"10.999219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- 0.350 -->\n",
" <g transform=\"translate(20.878125 14.798437)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- loss -->\n",
" <g transform=\"translate(14.798438 88.607031)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-73\" d=\"M 2834 3397 \n",
"L 2834 2853 \n",
"Q 2591 2978 2328 3040 \n",
"Q 2066 3103 1784 3103 \n",
"Q 1356 3103 1142 2972 \n",
"Q 928 2841 928 2578 \n",
"Q 928 2378 1081 2264 \n",
"Q 1234 2150 1697 2047 \n",
"L 1894 2003 \n",
"Q 2506 1872 2764 1633 \n",
"Q 3022 1394 3022 966 \n",
"Q 3022 478 2636 193 \n",
"Q 2250 -91 1575 -91 \n",
"Q 1294 -91 989 -36 \n",
"Q 684 19 347 128 \n",
"L 347 722 \n",
"Q 666 556 975 473 \n",
"Q 1284 391 1588 391 \n",
"Q 1994 391 2212 530 \n",
"Q 2431 669 2431 922 \n",
"Q 2431 1156 2273 1281 \n",
"Q 2116 1406 1581 1522 \n",
"L 1381 1569 \n",
"Q 847 1681 609 1914 \n",
"Q 372 2147 372 2553 \n",
"Q 372 3047 722 3315 \n",
"Q 1072 3584 1716 3584 \n",
"Q 2034 3584 2315 3537 \n",
"Q 2597 3491 2834 3397 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"27.783203\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"88.964844\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"141.064453\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_23\">\n",
" <path d=\"M 69.52625 2.016678 \n",
"L 82.54625 81.460884 \n",
"L 95.56625 110.899384 \n",
"L 108.58625 117.084789 \n",
"L 121.60625 117.378577 \n",
"L 134.62625 119.060417 \n",
"L 147.64625 121.186947 \n",
"L 160.66625 119.48947 \n",
"L 173.68625 123.853698 \n",
"L 186.70625 123.148077 \n",
"L 199.72625 121.068396 \n",
"L 212.74625 120.698405 \n",
"L 225.76625 120.907019 \n",
"L 238.78625 123.087477 \n",
"L 251.80625 119.215066 \n",
"\" clip-path=\"url(#p915f42d281)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p915f42d281\">\n",
" <rect x=\"56.50625\" y=\"10.999219\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"def train_momentum(lr, momentum, num_epochs=2):\n",
" d2l.train_ch11(sgd_momentum, init_momentum_states(feature_dim),\n",
" {'lr': lr, 'momentum': momentum}, data_iter,\n",
" feature_dim, num_epochs)\n",
"\n",
"data_iter, feature_dim = d2l.get_data_ch11(batch_size=10)\n",
"train_momentum(0.02, 0.5)"
]
},
{
"cell_type": "markdown",
"id": "169bd66c",
"metadata": {
"origin_pos": 23
},
"source": [
"当我们将动量超参数`momentum`增加到0.9时,它相当于有效样本数量增加到$\\frac{1}{1 - 0.9} = 10$。\n",
"我们将学习率略微降至$0.01$,以确保可控。\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "a18e94fe",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:13.849422Z",
"iopub.status.busy": "2023-08-18T07:02:13.848541Z",
"iopub.status.idle": "2023-08-18T07:02:16.686903Z",
"shell.execute_reply": "2023-08-18T07:02:16.685706Z"
},
"origin_pos": 24,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loss: 0.261, 0.013 sec/epoch\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"266.957813pt\" height=\"184.455469pt\" viewBox=\"0 0 266.957813 184.455469\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:02:16.649209</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M -0 184.455469 \n",
"L 266.957813 184.455469 \n",
"L 266.957813 0 \n",
"L -0 0 \n",
"L -0 184.455469 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"L 56.50625 10.999219 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"mbdf492a3bf\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#mbdf492a3bf\" x=\"56.50625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 0.0 -->\n",
" <g transform=\"translate(48.554688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
"L 1344 794 \n",
"L 1344 0 \n",
"L 684 0 \n",
"L 684 794 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 105.33125 146.899219 \n",
"L 105.33125 10.999219 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#mbdf492a3bf\" x=\"105.33125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 0.5 -->\n",
" <g transform=\"translate(97.379688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 154.15625 146.899219 \n",
"L 154.15625 10.999219 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#mbdf492a3bf\" x=\"154.15625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 1.0 -->\n",
" <g transform=\"translate(146.204688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 202.98125 146.899219 \n",
"L 202.98125 10.999219 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#mbdf492a3bf\" x=\"202.98125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 1.5 -->\n",
" <g transform=\"translate(195.029688 161.497656)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#mbdf492a3bf\" x=\"251.80625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 2.0 -->\n",
" <g transform=\"translate(243.854688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- epoch -->\n",
" <g transform=\"translate(138.928125 175.175781)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
"L 3597 1613 \n",
"L 953 1613 \n",
"Q 991 1019 1311 708 \n",
"Q 1631 397 2203 397 \n",
"Q 2534 397 2845 478 \n",
"Q 3156 559 3463 722 \n",
"L 3463 178 \n",
"Q 3153 47 2828 -22 \n",
"Q 2503 -91 2169 -91 \n",
"Q 1331 -91 842 396 \n",
"Q 353 884 353 1716 \n",
"Q 353 2575 817 3079 \n",
"Q 1281 3584 2069 3584 \n",
"Q 2775 3584 3186 3129 \n",
"Q 3597 2675 3597 1894 \n",
"z\n",
"M 3022 2063 \n",
"Q 3016 2534 2758 2815 \n",
"Q 2500 3097 2075 3097 \n",
"Q 1594 3097 1305 2825 \n",
"Q 1016 2553 972 2059 \n",
"L 3022 2063 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-70\" d=\"M 1159 525 \n",
"L 1159 -1331 \n",
"L 581 -1331 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2969 \n",
"Q 1341 3281 1617 3432 \n",
"Q 1894 3584 2278 3584 \n",
"Q 2916 3584 3314 3078 \n",
"Q 3713 2572 3713 1747 \n",
"Q 3713 922 3314 415 \n",
"Q 2916 -91 2278 -91 \n",
"Q 1894 -91 1617 61 \n",
"Q 1341 213 1159 525 \n",
"z\n",
"M 3116 1747 \n",
"Q 3116 2381 2855 2742 \n",
"Q 2594 3103 2138 3103 \n",
"Q 1681 3103 1420 2742 \n",
"Q 1159 2381 1159 1747 \n",
"Q 1159 1113 1420 752 \n",
"Q 1681 391 2138 391 \n",
"Q 2594 391 2855 752 \n",
"Q 3116 1113 3116 1747 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6f\" d=\"M 1959 3097 \n",
"Q 1497 3097 1228 2736 \n",
"Q 959 2375 959 1747 \n",
"Q 959 1119 1226 758 \n",
"Q 1494 397 1959 397 \n",
"Q 2419 397 2687 759 \n",
"Q 2956 1122 2956 1747 \n",
"Q 2956 2369 2687 2733 \n",
"Q 2419 3097 1959 3097 \n",
"z\n",
"M 1959 3584 \n",
"Q 2709 3584 3137 3096 \n",
"Q 3566 2609 3566 1747 \n",
"Q 3566 888 3137 398 \n",
"Q 2709 -91 1959 -91 \n",
"Q 1206 -91 779 398 \n",
"Q 353 888 353 1747 \n",
"Q 353 2609 779 3096 \n",
"Q 1206 3584 1959 3584 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-63\" d=\"M 3122 3366 \n",
"L 3122 2828 \n",
"Q 2878 2963 2633 3030 \n",
"Q 2388 3097 2138 3097 \n",
"Q 1578 3097 1268 2742 \n",
"Q 959 2388 959 1747 \n",
"Q 959 1106 1268 751 \n",
"Q 1578 397 2138 397 \n",
"Q 2388 397 2633 464 \n",
"Q 2878 531 3122 666 \n",
"L 3122 134 \n",
"Q 2881 22 2623 -34 \n",
"Q 2366 -91 2075 -91 \n",
"Q 1284 -91 818 406 \n",
"Q 353 903 353 1747 \n",
"Q 353 2603 823 3093 \n",
"Q 1294 3584 2113 3584 \n",
"Q 2378 3584 2631 3529 \n",
"Q 2884 3475 3122 3366 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-68\" d=\"M 3513 2113 \n",
"L 3513 0 \n",
"L 2938 0 \n",
"L 2938 2094 \n",
"Q 2938 2591 2744 2837 \n",
"Q 2550 3084 2163 3084 \n",
"Q 1697 3084 1428 2787 \n",
"Q 1159 2491 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 4863 \n",
"L 1159 4863 \n",
"L 1159 2956 \n",
"Q 1366 3272 1645 3428 \n",
"Q 1925 3584 2291 3584 \n",
"Q 2894 3584 3203 3211 \n",
"Q 3513 2838 3513 2113 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-65\"/>\n",
" <use xlink:href=\"#DejaVuSans-70\" x=\"61.523438\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"186.181641\"/>\n",
" <use xlink:href=\"#DejaVuSans-68\" x=\"241.162109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 56.50625 141.672296 \n",
"L 251.80625 141.672296 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"m2308b1c746\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m2308b1c746\" x=\"56.50625\" y=\"141.672296\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0.225 -->\n",
" <g transform=\"translate(20.878125 145.471514)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 56.50625 115.53768 \n",
"L 251.80625 115.53768 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#m2308b1c746\" x=\"56.50625\" y=\"115.53768\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0.250 -->\n",
" <g transform=\"translate(20.878125 119.336899)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 56.50625 89.403065 \n",
"L 251.80625 89.403065 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#m2308b1c746\" x=\"56.50625\" y=\"89.403065\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 0.275 -->\n",
" <g transform=\"translate(20.878125 93.202284)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-37\" d=\"M 525 4666 \n",
"L 3525 4666 \n",
"L 3525 4397 \n",
"L 1831 0 \n",
"L 1172 0 \n",
"L 2766 4134 \n",
"L 525 4134 \n",
"L 525 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-37\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 56.50625 63.26845 \n",
"L 251.80625 63.26845 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#m2308b1c746\" x=\"56.50625\" y=\"63.26845\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 0.300 -->\n",
" <g transform=\"translate(20.878125 67.067668)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 56.50625 37.133834 \n",
"L 251.80625 37.133834 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <g>\n",
" <use xlink:href=\"#m2308b1c746\" x=\"56.50625\" y=\"37.133834\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 0.325 -->\n",
" <g transform=\"translate(20.878125 40.933053)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_6\">\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <g>\n",
" <use xlink:href=\"#m2308b1c746\" x=\"56.50625\" y=\"10.999219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- 0.350 -->\n",
" <g transform=\"translate(20.878125 14.798437)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- loss -->\n",
" <g transform=\"translate(14.798438 88.607031)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-73\" d=\"M 2834 3397 \n",
"L 2834 2853 \n",
"Q 2591 2978 2328 3040 \n",
"Q 2066 3103 1784 3103 \n",
"Q 1356 3103 1142 2972 \n",
"Q 928 2841 928 2578 \n",
"Q 928 2378 1081 2264 \n",
"Q 1234 2150 1697 2047 \n",
"L 1894 2003 \n",
"Q 2506 1872 2764 1633 \n",
"Q 3022 1394 3022 966 \n",
"Q 3022 478 2636 193 \n",
"Q 2250 -91 1575 -91 \n",
"Q 1294 -91 989 -36 \n",
"Q 684 19 347 128 \n",
"L 347 722 \n",
"Q 666 556 975 473 \n",
"Q 1284 391 1588 391 \n",
"Q 1994 391 2212 530 \n",
"Q 2431 669 2431 922 \n",
"Q 2431 1156 2273 1281 \n",
"Q 2116 1406 1581 1522 \n",
"L 1381 1569 \n",
"Q 847 1681 609 1914 \n",
"Q 372 2147 372 2553 \n",
"Q 372 3047 722 3315 \n",
"Q 1072 3584 1716 3584 \n",
"Q 2034 3584 2315 3537 \n",
"Q 2597 3491 2834 3397 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"27.783203\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"88.964844\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"141.064453\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_23\">\n",
" <path d=\"M 69.52625 85.486909 \n",
"L 82.54625 111.266967 \n",
"L 95.56625 115.567562 \n",
"L 108.58625 118.443296 \n",
"L 121.60625 120.164867 \n",
"L 134.62625 109.010761 \n",
"L 147.64625 120.786631 \n",
"L 160.66625 115.414898 \n",
"L 173.68625 114.853701 \n",
"L 186.70625 116.209553 \n",
"L 199.72625 120.52659 \n",
"L 212.74625 109.4016 \n",
"L 225.76625 120.14478 \n",
"L 238.78625 114.669767 \n",
"L 251.80625 103.676985 \n",
"\" clip-path=\"url(#p5973f0fc0e)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p5973f0fc0e\">\n",
" <rect x=\"56.50625\" y=\"10.999219\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"train_momentum(0.01, 0.9)"
]
},
{
"cell_type": "markdown",
"id": "8e9b6723",
"metadata": {
"origin_pos": 25
},
"source": [
"降低学习率进一步解决了任何非平滑优化问题的困难,将其设置为$0.005$会产生良好的收敛性能。\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "8f540239",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:16.690931Z",
"iopub.status.busy": "2023-08-18T07:02:16.690014Z",
"iopub.status.idle": "2023-08-18T07:02:19.517867Z",
"shell.execute_reply": "2023-08-18T07:02:19.517070Z"
},
"origin_pos": 26,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loss: 0.245, 0.013 sec/epoch\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"266.957813pt\" height=\"184.455469pt\" viewBox=\"0 0 266.957813 184.455469\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:02:19.483270</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M -0 184.455469 \n",
"L 266.957813 184.455469 \n",
"L 266.957813 0 \n",
"L -0 0 \n",
"L -0 184.455469 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"L 56.50625 10.999219 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"m51e235b036\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m51e235b036\" x=\"56.50625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 0.0 -->\n",
" <g transform=\"translate(48.554688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
"L 1344 794 \n",
"L 1344 0 \n",
"L 684 0 \n",
"L 684 794 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 105.33125 146.899219 \n",
"L 105.33125 10.999219 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#m51e235b036\" x=\"105.33125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 0.5 -->\n",
" <g transform=\"translate(97.379688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 154.15625 146.899219 \n",
"L 154.15625 10.999219 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m51e235b036\" x=\"154.15625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 1.0 -->\n",
" <g transform=\"translate(146.204688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 202.98125 146.899219 \n",
"L 202.98125 10.999219 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#m51e235b036\" x=\"202.98125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 1.5 -->\n",
" <g transform=\"translate(195.029688 161.497656)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#m51e235b036\" x=\"251.80625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 2.0 -->\n",
" <g transform=\"translate(243.854688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- epoch -->\n",
" <g transform=\"translate(138.928125 175.175781)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
"L 3597 1613 \n",
"L 953 1613 \n",
"Q 991 1019 1311 708 \n",
"Q 1631 397 2203 397 \n",
"Q 2534 397 2845 478 \n",
"Q 3156 559 3463 722 \n",
"L 3463 178 \n",
"Q 3153 47 2828 -22 \n",
"Q 2503 -91 2169 -91 \n",
"Q 1331 -91 842 396 \n",
"Q 353 884 353 1716 \n",
"Q 353 2575 817 3079 \n",
"Q 1281 3584 2069 3584 \n",
"Q 2775 3584 3186 3129 \n",
"Q 3597 2675 3597 1894 \n",
"z\n",
"M 3022 2063 \n",
"Q 3016 2534 2758 2815 \n",
"Q 2500 3097 2075 3097 \n",
"Q 1594 3097 1305 2825 \n",
"Q 1016 2553 972 2059 \n",
"L 3022 2063 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-70\" d=\"M 1159 525 \n",
"L 1159 -1331 \n",
"L 581 -1331 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2969 \n",
"Q 1341 3281 1617 3432 \n",
"Q 1894 3584 2278 3584 \n",
"Q 2916 3584 3314 3078 \n",
"Q 3713 2572 3713 1747 \n",
"Q 3713 922 3314 415 \n",
"Q 2916 -91 2278 -91 \n",
"Q 1894 -91 1617 61 \n",
"Q 1341 213 1159 525 \n",
"z\n",
"M 3116 1747 \n",
"Q 3116 2381 2855 2742 \n",
"Q 2594 3103 2138 3103 \n",
"Q 1681 3103 1420 2742 \n",
"Q 1159 2381 1159 1747 \n",
"Q 1159 1113 1420 752 \n",
"Q 1681 391 2138 391 \n",
"Q 2594 391 2855 752 \n",
"Q 3116 1113 3116 1747 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6f\" d=\"M 1959 3097 \n",
"Q 1497 3097 1228 2736 \n",
"Q 959 2375 959 1747 \n",
"Q 959 1119 1226 758 \n",
"Q 1494 397 1959 397 \n",
"Q 2419 397 2687 759 \n",
"Q 2956 1122 2956 1747 \n",
"Q 2956 2369 2687 2733 \n",
"Q 2419 3097 1959 3097 \n",
"z\n",
"M 1959 3584 \n",
"Q 2709 3584 3137 3096 \n",
"Q 3566 2609 3566 1747 \n",
"Q 3566 888 3137 398 \n",
"Q 2709 -91 1959 -91 \n",
"Q 1206 -91 779 398 \n",
"Q 353 888 353 1747 \n",
"Q 353 2609 779 3096 \n",
"Q 1206 3584 1959 3584 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-63\" d=\"M 3122 3366 \n",
"L 3122 2828 \n",
"Q 2878 2963 2633 3030 \n",
"Q 2388 3097 2138 3097 \n",
"Q 1578 3097 1268 2742 \n",
"Q 959 2388 959 1747 \n",
"Q 959 1106 1268 751 \n",
"Q 1578 397 2138 397 \n",
"Q 2388 397 2633 464 \n",
"Q 2878 531 3122 666 \n",
"L 3122 134 \n",
"Q 2881 22 2623 -34 \n",
"Q 2366 -91 2075 -91 \n",
"Q 1284 -91 818 406 \n",
"Q 353 903 353 1747 \n",
"Q 353 2603 823 3093 \n",
"Q 1294 3584 2113 3584 \n",
"Q 2378 3584 2631 3529 \n",
"Q 2884 3475 3122 3366 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-68\" d=\"M 3513 2113 \n",
"L 3513 0 \n",
"L 2938 0 \n",
"L 2938 2094 \n",
"Q 2938 2591 2744 2837 \n",
"Q 2550 3084 2163 3084 \n",
"Q 1697 3084 1428 2787 \n",
"Q 1159 2491 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 4863 \n",
"L 1159 4863 \n",
"L 1159 2956 \n",
"Q 1366 3272 1645 3428 \n",
"Q 1925 3584 2291 3584 \n",
"Q 2894 3584 3203 3211 \n",
"Q 3513 2838 3513 2113 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-65\"/>\n",
" <use xlink:href=\"#DejaVuSans-70\" x=\"61.523438\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"186.181641\"/>\n",
" <use xlink:href=\"#DejaVuSans-68\" x=\"241.162109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 56.50625 141.672296 \n",
"L 251.80625 141.672296 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"mdba9516d74\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#mdba9516d74\" x=\"56.50625\" y=\"141.672296\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0.225 -->\n",
" <g transform=\"translate(20.878125 145.471514)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 56.50625 115.53768 \n",
"L 251.80625 115.53768 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#mdba9516d74\" x=\"56.50625\" y=\"115.53768\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0.250 -->\n",
" <g transform=\"translate(20.878125 119.336899)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 56.50625 89.403065 \n",
"L 251.80625 89.403065 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#mdba9516d74\" x=\"56.50625\" y=\"89.403065\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 0.275 -->\n",
" <g transform=\"translate(20.878125 93.202284)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-37\" d=\"M 525 4666 \n",
"L 3525 4666 \n",
"L 3525 4397 \n",
"L 1831 0 \n",
"L 1172 0 \n",
"L 2766 4134 \n",
"L 525 4134 \n",
"L 525 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-37\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 56.50625 63.26845 \n",
"L 251.80625 63.26845 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#mdba9516d74\" x=\"56.50625\" y=\"63.26845\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 0.300 -->\n",
" <g transform=\"translate(20.878125 67.067668)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 56.50625 37.133834 \n",
"L 251.80625 37.133834 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <g>\n",
" <use xlink:href=\"#mdba9516d74\" x=\"56.50625\" y=\"37.133834\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 0.325 -->\n",
" <g transform=\"translate(20.878125 40.933053)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_6\">\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <g>\n",
" <use xlink:href=\"#mdba9516d74\" x=\"56.50625\" y=\"10.999219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- 0.350 -->\n",
" <g transform=\"translate(20.878125 14.798437)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- loss -->\n",
" <g transform=\"translate(14.798438 88.607031)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-73\" d=\"M 2834 3397 \n",
"L 2834 2853 \n",
"Q 2591 2978 2328 3040 \n",
"Q 2066 3103 1784 3103 \n",
"Q 1356 3103 1142 2972 \n",
"Q 928 2841 928 2578 \n",
"Q 928 2378 1081 2264 \n",
"Q 1234 2150 1697 2047 \n",
"L 1894 2003 \n",
"Q 2506 1872 2764 1633 \n",
"Q 3022 1394 3022 966 \n",
"Q 3022 478 2636 193 \n",
"Q 2250 -91 1575 -91 \n",
"Q 1294 -91 989 -36 \n",
"Q 684 19 347 128 \n",
"L 347 722 \n",
"Q 666 556 975 473 \n",
"Q 1284 391 1588 391 \n",
"Q 1994 391 2212 530 \n",
"Q 2431 669 2431 922 \n",
"Q 2431 1156 2273 1281 \n",
"Q 2116 1406 1581 1522 \n",
"L 1381 1569 \n",
"Q 847 1681 609 1914 \n",
"Q 372 2147 372 2553 \n",
"Q 372 3047 722 3315 \n",
"Q 1072 3584 1716 3584 \n",
"Q 2034 3584 2315 3537 \n",
"Q 2597 3491 2834 3397 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"27.783203\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"88.964844\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"141.064453\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_23\">\n",
" <path d=\"M 69.52625 6.885397 \n",
"L 82.54625 96.011984 \n",
"L 95.56625 116.974664 \n",
"L 108.58625 119.339103 \n",
"L 121.60625 118.637945 \n",
"L 134.62625 122.031835 \n",
"L 147.64625 119.43365 \n",
"L 160.66625 117.51143 \n",
"L 173.68625 121.849597 \n",
"L 186.70625 119.695501 \n",
"L 199.72625 122.531252 \n",
"L 212.74625 120.296248 \n",
"L 225.76625 119.387917 \n",
"L 238.78625 123.581581 \n",
"L 251.80625 121.07205 \n",
"\" clip-path=\"url(#pa50a3f2d4f)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"pa50a3f2d4f\">\n",
" <rect x=\"56.50625\" y=\"10.999219\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"train_momentum(0.005, 0.9)"
]
},
{
"cell_type": "markdown",
"id": "870d8217",
"metadata": {
"origin_pos": 27
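   },
   "source": [
    "### Concise Implementation\n",
    "\n",
    "Because the optimizers in deep learning frameworks already have momentum built in, configuring one with the same hyperparameters as above yields a very similar trajectory.\n"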
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "02c798ba",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:19.521379Z",
"iopub.status.busy": "2023-08-18T07:02:19.521094Z",
"iopub.status.idle": "2023-08-18T07:02:24.726079Z",
"shell.execute_reply": "2023-08-18T07:02:24.725193Z"
},
"origin_pos": 29,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loss: 0.247, 0.012 sec/epoch\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"262.1875pt\" height=\"184.455469pt\" viewBox=\"0 0 262.1875 184.455469\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:02:24.690768</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M -0 184.455469 \n",
"L 262.1875 184.455469 \n",
"L 262.1875 0 \n",
"L -0 0 \n",
"L -0 184.455469 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"L 56.50625 10.999219 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"m8223db35e0\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m8223db35e0\" x=\"56.50625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(53.325 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 105.33125 146.899219 \n",
"L 105.33125 10.999219 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#m8223db35e0\" x=\"105.33125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 1 -->\n",
" <g transform=\"translate(102.15 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 154.15625 146.899219 \n",
"L 154.15625 10.999219 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m8223db35e0\" x=\"154.15625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(150.975 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 202.98125 146.899219 \n",
"L 202.98125 10.999219 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#m8223db35e0\" x=\"202.98125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 3 -->\n",
" <g transform=\"translate(199.8 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-33\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#m8223db35e0\" x=\"251.80625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 4 -->\n",
" <g transform=\"translate(248.625 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-34\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- epoch -->\n",
" <g transform=\"translate(138.928125 175.175781)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
"L 3597 1613 \n",
"L 953 1613 \n",
"Q 991 1019 1311 708 \n",
"Q 1631 397 2203 397 \n",
"Q 2534 397 2845 478 \n",
"Q 3156 559 3463 722 \n",
"L 3463 178 \n",
"Q 3153 47 2828 -22 \n",
"Q 2503 -91 2169 -91 \n",
"Q 1331 -91 842 396 \n",
"Q 353 884 353 1716 \n",
"Q 353 2575 817 3079 \n",
"Q 1281 3584 2069 3584 \n",
"Q 2775 3584 3186 3129 \n",
"Q 3597 2675 3597 1894 \n",
"z\n",
"M 3022 2063 \n",
"Q 3016 2534 2758 2815 \n",
"Q 2500 3097 2075 3097 \n",
"Q 1594 3097 1305 2825 \n",
"Q 1016 2553 972 2059 \n",
"L 3022 2063 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-70\" d=\"M 1159 525 \n",
"L 1159 -1331 \n",
"L 581 -1331 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2969 \n",
"Q 1341 3281 1617 3432 \n",
"Q 1894 3584 2278 3584 \n",
"Q 2916 3584 3314 3078 \n",
"Q 3713 2572 3713 1747 \n",
"Q 3713 922 3314 415 \n",
"Q 2916 -91 2278 -91 \n",
"Q 1894 -91 1617 61 \n",
"Q 1341 213 1159 525 \n",
"z\n",
"M 3116 1747 \n",
"Q 3116 2381 2855 2742 \n",
"Q 2594 3103 2138 3103 \n",
"Q 1681 3103 1420 2742 \n",
"Q 1159 2381 1159 1747 \n",
"Q 1159 1113 1420 752 \n",
"Q 1681 391 2138 391 \n",
"Q 2594 391 2855 752 \n",
"Q 3116 1113 3116 1747 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6f\" d=\"M 1959 3097 \n",
"Q 1497 3097 1228 2736 \n",
"Q 959 2375 959 1747 \n",
"Q 959 1119 1226 758 \n",
"Q 1494 397 1959 397 \n",
"Q 2419 397 2687 759 \n",
"Q 2956 1122 2956 1747 \n",
"Q 2956 2369 2687 2733 \n",
"Q 2419 3097 1959 3097 \n",
"z\n",
"M 1959 3584 \n",
"Q 2709 3584 3137 3096 \n",
"Q 3566 2609 3566 1747 \n",
"Q 3566 888 3137 398 \n",
"Q 2709 -91 1959 -91 \n",
"Q 1206 -91 779 398 \n",
"Q 353 888 353 1747 \n",
"Q 353 2609 779 3096 \n",
"Q 1206 3584 1959 3584 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-63\" d=\"M 3122 3366 \n",
"L 3122 2828 \n",
"Q 2878 2963 2633 3030 \n",
"Q 2388 3097 2138 3097 \n",
"Q 1578 3097 1268 2742 \n",
"Q 959 2388 959 1747 \n",
"Q 959 1106 1268 751 \n",
"Q 1578 397 2138 397 \n",
"Q 2388 397 2633 464 \n",
"Q 2878 531 3122 666 \n",
"L 3122 134 \n",
"Q 2881 22 2623 -34 \n",
"Q 2366 -91 2075 -91 \n",
"Q 1284 -91 818 406 \n",
"Q 353 903 353 1747 \n",
"Q 353 2603 823 3093 \n",
"Q 1294 3584 2113 3584 \n",
"Q 2378 3584 2631 3529 \n",
"Q 2884 3475 3122 3366 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-68\" d=\"M 3513 2113 \n",
"L 3513 0 \n",
"L 2938 0 \n",
"L 2938 2094 \n",
"Q 2938 2591 2744 2837 \n",
"Q 2550 3084 2163 3084 \n",
"Q 1697 3084 1428 2787 \n",
"Q 1159 2491 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 4863 \n",
"L 1159 4863 \n",
"L 1159 2956 \n",
"Q 1366 3272 1645 3428 \n",
"Q 1925 3584 2291 3584 \n",
"Q 2894 3584 3203 3211 \n",
"Q 3513 2838 3513 2113 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-65\"/>\n",
" <use xlink:href=\"#DejaVuSans-70\" x=\"61.523438\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"186.181641\"/>\n",
" <use xlink:href=\"#DejaVuSans-68\" x=\"241.162109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 56.50625 141.672296 \n",
"L 251.80625 141.672296 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"meeeb0e4cdb\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#meeeb0e4cdb\" x=\"56.50625\" y=\"141.672296\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0.225 -->\n",
" <g transform=\"translate(20.878125 145.471514)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
"L 1344 794 \n",
"L 1344 0 \n",
"L 684 0 \n",
"L 684 794 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 56.50625 115.53768 \n",
"L 251.80625 115.53768 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#meeeb0e4cdb\" x=\"56.50625\" y=\"115.53768\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0.250 -->\n",
" <g transform=\"translate(20.878125 119.336899)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 56.50625 89.403065 \n",
"L 251.80625 89.403065 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#meeeb0e4cdb\" x=\"56.50625\" y=\"89.403065\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 0.275 -->\n",
" <g transform=\"translate(20.878125 93.202284)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-37\" d=\"M 525 4666 \n",
"L 3525 4666 \n",
"L 3525 4397 \n",
"L 1831 0 \n",
"L 1172 0 \n",
"L 2766 4134 \n",
"L 525 4134 \n",
"L 525 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-37\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 56.50625 63.26845 \n",
"L 251.80625 63.26845 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#meeeb0e4cdb\" x=\"56.50625\" y=\"63.26845\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 0.300 -->\n",
" <g transform=\"translate(20.878125 67.067668)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 56.50625 37.133834 \n",
"L 251.80625 37.133834 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <g>\n",
" <use xlink:href=\"#meeeb0e4cdb\" x=\"56.50625\" y=\"37.133834\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 0.325 -->\n",
" <g transform=\"translate(20.878125 40.933053)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_6\">\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <g>\n",
" <use xlink:href=\"#meeeb0e4cdb\" x=\"56.50625\" y=\"10.999219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- 0.350 -->\n",
" <g transform=\"translate(20.878125 14.798437)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- loss -->\n",
" <g transform=\"translate(14.798438 88.607031)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-73\" d=\"M 2834 3397 \n",
"L 2834 2853 \n",
"Q 2591 2978 2328 3040 \n",
"Q 2066 3103 1784 3103 \n",
"Q 1356 3103 1142 2972 \n",
"Q 928 2841 928 2578 \n",
"Q 928 2378 1081 2264 \n",
"Q 1234 2150 1697 2047 \n",
"L 1894 2003 \n",
"Q 2506 1872 2764 1633 \n",
"Q 3022 1394 3022 966 \n",
"Q 3022 478 2636 193 \n",
"Q 2250 -91 1575 -91 \n",
"Q 1294 -91 989 -36 \n",
"Q 684 19 347 128 \n",
"L 347 722 \n",
"Q 666 556 975 473 \n",
"Q 1284 391 1588 391 \n",
"Q 1994 391 2212 530 \n",
"Q 2431 669 2431 922 \n",
"Q 2431 1156 2273 1281 \n",
"Q 2116 1406 1581 1522 \n",
"L 1381 1569 \n",
"Q 847 1681 609 1914 \n",
"Q 372 2147 372 2553 \n",
"Q 372 3047 722 3315 \n",
"Q 1072 3584 1716 3584 \n",
"Q 2034 3584 2315 3537 \n",
"Q 2597 3491 2834 3397 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"27.783203\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"88.964844\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"141.064453\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_23\">\n",
" <path d=\"M 63.01625 95.193089 \n",
"L 69.52625 115.836536 \n",
"L 76.03625 113.16367 \n",
"L 82.54625 111.010545 \n",
"L 89.05625 120.448619 \n",
"L 95.56625 120.213831 \n",
"L 102.07625 119.807208 \n",
"L 108.58625 107.476192 \n",
"L 115.09625 121.154558 \n",
"L 121.60625 119.014139 \n",
"L 128.11625 102.327127 \n",
"L 134.62625 119.598571 \n",
"L 141.13625 105.5642 \n",
"L 147.64625 123.988469 \n",
"L 154.15625 117.282589 \n",
"L 160.66625 114.657588 \n",
"L 167.17625 112.178342 \n",
"L 173.68625 122.002839 \n",
"L 180.19625 119.586152 \n",
"L 186.70625 112.131701 \n",
"L 193.21625 120.411827 \n",
"L 199.72625 115.736965 \n",
"L 206.23625 121.595515 \n",
"L 212.74625 118.38555 \n",
"L 219.25625 115.403163 \n",
"L 225.76625 111.722826 \n",
"L 232.27625 102.935077 \n",
"L 238.78625 116.944578 \n",
"L 245.29625 111.275613 \n",
"L 251.80625 118.949482 \n",
"\" clip-path=\"url(#p4afc4589cc)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p4afc4589cc\">\n",
" <rect x=\"56.50625\" y=\"10.999219\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"trainer = torch.optim.SGD\n",
"d2l.train_concise_ch11(trainer, {'lr': 0.005, 'momentum': 0.9}, data_iter)"
]
},
{
"cell_type": "markdown",
"id": "a19f8648",
"metadata": {
"origin_pos": 32
},
"source": [
"## 理论分析\n",
"\n",
"$f(x) = 0.1 x_1^2 + 2 x_2^2$的2D示例似乎相当牵强。\n",
"下面我们将看到,它在实际生活中非常具有代表性,至少最小化凸二次目标函数的情况下是如此。\n",
"\n",
"### 二次凸函数\n",
"\n",
"考虑这个函数\n",
"\n",
"$$h(\\mathbf{x}) = \\frac{1}{2} \\mathbf{x}^\\top \\mathbf{Q} \\mathbf{x} + \\mathbf{x}^\\top \\mathbf{c} + b.$$\n",
"\n",
"这是一个普通的二次函数。\n",
"对于正定矩阵$\\mathbf{Q} \\succ 0$,即对于具有正特征值的矩阵,有最小化器为$\\mathbf{x}^* = -\\mathbf{Q}^{-1} \\mathbf{c}$,最小值为$b - \\frac{1}{2} \\mathbf{c}^\\top \\mathbf{Q}^{-1} \\mathbf{c}$。\n",
"因此我们可以将$h$重写为\n",
"\n",
"$$h(\\mathbf{x}) = \\frac{1}{2} (\\mathbf{x} - \\mathbf{Q}^{-1} \\mathbf{c})^\\top \\mathbf{Q} (\\mathbf{x} - \\mathbf{Q}^{-1} \\mathbf{c}) + b - \\frac{1}{2} \\mathbf{c}^\\top \\mathbf{Q}^{-1} \\mathbf{c}.$$\n",
"\n",
"梯度由$\\partial_{\\mathbf{x}} f(\\mathbf{x}) = \\mathbf{Q} (\\mathbf{x} - \\mathbf{Q}^{-1} \\mathbf{c})$给出。\n",
"也就是说,它是由$\\mathbf{x}$和最小化器之间的距离乘以$\\mathbf{Q}$所得出的。\n",
"因此,动量法还是$\\mathbf{Q} (\\mathbf{x}_t - \\mathbf{Q}^{-1} \\mathbf{c})$的线性组合。\n",
"\n",
"由于$\\mathbf{Q}$是正定的,因此可以通过$\\mathbf{Q} = \\mathbf{O}^\\top \\boldsymbol{\\Lambda} \\mathbf{O}$分解为正交(旋转)矩阵$\\mathbf{O}$和正特征值的对角矩阵$\\boldsymbol{\\Lambda}$。\n",
"这使我们能够将变量从$\\mathbf{x}$更改为$\\mathbf{z} := \\mathbf{O} (\\mathbf{x} - \\mathbf{Q}^{-1} \\mathbf{c})$,以获得一个非常简化的表达式:\n",
"\n",
"$$h(\\mathbf{z}) = \\frac{1}{2} \\mathbf{z}^\\top \\boldsymbol{\\Lambda} \\mathbf{z} + b'.$$\n",
"\n",
"这里$b' = b - \\frac{1}{2} \\mathbf{c}^\\top \\mathbf{Q}^{-1} \\mathbf{c}$。\n",
"由于$\\mathbf{O}$只是一个正交矩阵,因此不会真正意义上扰动梯度。\n",
"以$\\mathbf{z}$表示的梯度下降变成\n",
"\n",
"$$\\mathbf{z}_t = \\mathbf{z}_{t-1} - \\boldsymbol{\\Lambda} \\mathbf{z}_{t-1} = (\\mathbf{I} - \\boldsymbol{\\Lambda}) \\mathbf{z}_{t-1}.$$\n",
"\n",
"这个表达式中的重要事实是梯度下降在不同的特征空间之间不会混合。\n",
"也就是说,如果用$\\mathbf{Q}$的特征系统来表示,优化问题是以逐坐标顺序的方式进行的。\n",
"这在动量法中也适用。\n",
"\n",
"$$\\begin{aligned}\n",
"\\mathbf{v}_t & = \\beta \\mathbf{v}_{t-1} + \\boldsymbol{\\Lambda} \\mathbf{z}_{t-1} \\\\\n",
"\\mathbf{z}_t & = \\mathbf{z}_{t-1} - \\eta \\left(\\beta \\mathbf{v}_{t-1} + \\boldsymbol{\\Lambda} \\mathbf{z}_{t-1}\\right) \\\\\n",
" & = (\\mathbf{I} - \\eta \\boldsymbol{\\Lambda}) \\mathbf{z}_{t-1} - \\eta \\beta \\mathbf{v}_{t-1}.\n",
"\\end{aligned}$$\n",
"\n",
"在这样做的过程中,我们只是证明了以下定理:带有和带有不凸二次函数动量的梯度下降,可以分解为朝二次矩阵特征向量方向坐标顺序的优化。\n",
"\n",
"### 标量函数\n",
"\n",
"鉴于上述结果,让我们看看当我们最小化函数$f(x) = \\frac{\\lambda}{2} x^2$时会发生什么。\n",
"对于梯度下降我们有\n",
"\n",
"$$x_{t+1} = x_t - \\eta \\lambda x_t = (1 - \\eta \\lambda) x_t.$$\n",
"\n",
"每$|1 - \\eta \\lambda| < 1$时,这种优化以指数速度收敛,因为在$t$步之后我们可以得到$x_t = (1 - \\eta \\lambda)^t x_0$。\n",
"这显示了在我们将学习率$\\eta$提高到$\\eta \\lambda = 1$之前,收敛率最初是如何提高的。\n",
"超过该数值之后,梯度开始发散,对于$\\eta \\lambda > 2$而言,优化问题将会发散。\n"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "5bb5257b",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:02:24.729721Z",
"iopub.status.busy": "2023-08-18T07:02:24.729429Z",
"iopub.status.idle": "2023-08-18T07:02:24.950890Z",
"shell.execute_reply": "2023-08-18T07:02:24.950096Z"
},
"origin_pos": 33,
"tab": [
"pytorch"
]
},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"386.845312pt\" height=\"262.19625pt\" viewBox=\"0 0 386.845312 262.19625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:02:24.900377</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 262.19625 \n",
"L 386.845312 262.19625 \n",
"L 386.845312 0 \n",
"L 0 0 \n",
"L 0 262.19625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 44.845313 224.64 \n",
"L 379.645313 224.64 \n",
"L 379.645313 7.2 \n",
"L 44.845313 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <defs>\n",
" <path id=\"maf0a4672bb\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#maf0a4672bb\" x=\"60.063494\" y=\"224.64\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 0.0 -->\n",
" <g transform=\"translate(52.111932 239.238437)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
"L 1344 794 \n",
"L 1344 0 \n",
"L 684 0 \n",
"L 684 794 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_2\">\n",
" <g>\n",
" <use xlink:href=\"#maf0a4672bb\" x=\"100.111341\" y=\"224.64\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 2.5 -->\n",
" <g transform=\"translate(92.159779 239.238437)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_3\">\n",
" <g>\n",
" <use xlink:href=\"#maf0a4672bb\" x=\"140.159188\" y=\"224.64\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 5.0 -->\n",
" <g transform=\"translate(132.207626 239.238437)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-35\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#maf0a4672bb\" x=\"180.207035\" y=\"224.64\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 7.5 -->\n",
" <g transform=\"translate(172.255472 239.238437)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-37\" d=\"M 525 4666 \n",
"L 3525 4666 \n",
"L 3525 4397 \n",
"L 1831 0 \n",
"L 1172 0 \n",
"L 2766 4134 \n",
"L 525 4134 \n",
"L 525 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-37\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_5\">\n",
" <g>\n",
" <use xlink:href=\"#maf0a4672bb\" x=\"220.254882\" y=\"224.64\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 10.0 -->\n",
" <g transform=\"translate(209.122069 239.238437)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"127.246094\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_6\">\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#maf0a4672bb\" x=\"260.302729\" y=\"224.64\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- 12.5 -->\n",
" <g transform=\"translate(249.169916 239.238437)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"127.246094\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_7\">\n",
" <g id=\"line2d_7\">\n",
" <g>\n",
" <use xlink:href=\"#maf0a4672bb\" x=\"300.350576\" y=\"224.64\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 15.0 -->\n",
" <g transform=\"translate(289.217763 239.238437)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"127.246094\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_8\">\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#maf0a4672bb\" x=\"340.398423\" y=\"224.64\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 17.5 -->\n",
" <g transform=\"translate(329.26561 239.238437)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-37\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"127.246094\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- time -->\n",
" <g transform=\"translate(200.949219 252.916562)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-74\" d=\"M 1172 4494 \n",
"L 1172 3500 \n",
"L 2356 3500 \n",
"L 2356 3053 \n",
"L 1172 3053 \n",
"L 1172 1153 \n",
"Q 1172 725 1289 603 \n",
"Q 1406 481 1766 481 \n",
"L 2356 481 \n",
"L 2356 0 \n",
"L 1766 0 \n",
"Q 1100 0 847 248 \n",
"Q 594 497 594 1153 \n",
"L 594 3053 \n",
"L 172 3053 \n",
"L 172 3500 \n",
"L 594 3500 \n",
"L 594 4494 \n",
"L 1172 4494 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-69\" d=\"M 603 3500 \n",
"L 1178 3500 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 3500 \n",
"z\n",
"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 4134 \n",
"L 603 4134 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6d\" d=\"M 3328 2828 \n",
"Q 3544 3216 3844 3400 \n",
"Q 4144 3584 4550 3584 \n",
"Q 5097 3584 5394 3201 \n",
"Q 5691 2819 5691 2113 \n",
"L 5691 0 \n",
"L 5113 0 \n",
"L 5113 2094 \n",
"Q 5113 2597 4934 2840 \n",
"Q 4756 3084 4391 3084 \n",
"Q 3944 3084 3684 2787 \n",
"Q 3425 2491 3425 1978 \n",
"L 3425 0 \n",
"L 2847 0 \n",
"L 2847 2094 \n",
"Q 2847 2600 2669 2842 \n",
"Q 2491 3084 2119 3084 \n",
"Q 1678 3084 1418 2786 \n",
"Q 1159 2488 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2956 \n",
"Q 1356 3278 1631 3431 \n",
"Q 1906 3584 2284 3584 \n",
"Q 2666 3584 2933 3390 \n",
"Q 3200 3197 3328 2828 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
"L 3597 1613 \n",
"L 953 1613 \n",
"Q 991 1019 1311 708 \n",
"Q 1631 397 2203 397 \n",
"Q 2534 397 2845 478 \n",
"Q 3156 559 3463 722 \n",
"L 3463 178 \n",
"Q 3153 47 2828 -22 \n",
"Q 2503 -91 2169 -91 \n",
"Q 1331 -91 842 396 \n",
"Q 353 884 353 1716 \n",
"Q 353 2575 817 3079 \n",
"Q 1281 3584 2069 3584 \n",
"Q 2775 3584 3186 3129 \n",
"Q 3597 2675 3597 1894 \n",
"z\n",
"M 3022 2063 \n",
"Q 3016 2534 2758 2815 \n",
"Q 2500 3097 2075 3097 \n",
"Q 1594 3097 1305 2825 \n",
"Q 1016 2553 972 2059 \n",
"L 3022 2063 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-74\"/>\n",
" <use xlink:href=\"#DejaVuSans-69\" x=\"39.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-6d\" x=\"66.992188\"/>\n",
" <use xlink:href=\"#DejaVuSans-65\" x=\"164.404297\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_9\">\n",
" <defs>\n",
" <path id=\"m58610aa314\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m58610aa314\" x=\"44.845313\" y=\"199.150622\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 0.75 -->\n",
" <g transform=\"translate(7.2 202.949841)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"147.412109\"/>\n",
" <use xlink:href=\"#DejaVuSans-37\" x=\"179.199219\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"242.822266\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#m58610aa314\" x=\"44.845313\" y=\"173.141053\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 0.50 -->\n",
" <g transform=\"translate(7.2 176.940271)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"147.412109\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"179.199219\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"242.822266\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_11\">\n",
" <g>\n",
" <use xlink:href=\"#m58610aa314\" x=\"44.845313\" y=\"147.131483\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- 0.25 -->\n",
" <g transform=\"translate(7.2 150.930702)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"83.789062\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"147.412109\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"179.199219\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"242.822266\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_12\">\n",
" <g>\n",
" <use xlink:href=\"#m58610aa314\" x=\"44.845313\" y=\"121.121914\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- 0.00 -->\n",
" <g transform=\"translate(15.579688 124.921133)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_13\">\n",
" <g>\n",
" <use xlink:href=\"#m58610aa314\" x=\"44.845313\" y=\"95.112344\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_14\">\n",
" <!-- 0.25 -->\n",
" <g transform=\"translate(15.579688 98.911563)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_6\">\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#m58610aa314\" x=\"44.845313\" y=\"69.102775\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_15\">\n",
" <!-- 0.50 -->\n",
" <g transform=\"translate(15.579688 72.901994)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_7\">\n",
" <g id=\"line2d_15\">\n",
" <g>\n",
" <use xlink:href=\"#m58610aa314\" x=\"44.845313\" y=\"43.093206\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_16\">\n",
" <!-- 0.75 -->\n",
" <g transform=\"translate(15.579688 46.892424)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-37\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_8\">\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#m58610aa314\" x=\"44.845313\" y=\"17.083636\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_17\">\n",
" <!-- 1.00 -->\n",
" <g transform=\"translate(15.579688 20.882855)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 60.063494 17.083636 \n",
"L 76.082633 18.124019 \n",
"L 92.101772 19.153998 \n",
"L 108.120911 20.173677 \n",
"L 124.140049 21.18316 \n",
"L 140.159188 22.182547 \n",
"L 156.178327 23.171941 \n",
"L 172.197466 24.151441 \n",
"L 188.216604 25.121145 \n",
"L 204.235743 26.081153 \n",
"L 220.254882 27.031561 \n",
"L 236.274021 27.972464 \n",
"L 252.293159 28.903959 \n",
"L 268.312298 29.826138 \n",
"L 284.331437 30.739096 \n",
"L 300.350576 31.642924 \n",
"L 316.369714 32.537714 \n",
"L 332.388853 33.423556 \n",
"L 348.407992 34.30054 \n",
"L 364.427131 35.168753 \n",
"\" clip-path=\"url(#p9d4de64db5)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <path d=\"M 60.063494 17.083636 \n",
"L 76.082633 27.487464 \n",
"L 92.101772 36.850909 \n",
"L 108.120911 45.27801 \n",
"L 124.140049 52.8624 \n",
"L 140.159188 59.688351 \n",
"L 156.178327 65.831708 \n",
"L 172.197466 71.360728 \n",
"L 188.216604 76.336847 \n",
"L 204.235743 80.815354 \n",
"L 220.254882 84.84601 \n",
"L 236.274021 88.4736 \n",
"L 252.293159 91.738431 \n",
"L 268.312298 94.67678 \n",
"L 284.331437 97.321293 \n",
"L 300.350576 99.701355 \n",
"L 316.369714 101.843411 \n",
"L 332.388853 103.771261 \n",
"L 348.407992 105.506327 \n",
"L 364.427131 107.067885 \n",
"\" clip-path=\"url(#p9d4de64db5)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 60.063494 17.083636 \n",
"L 76.082633 121.121914 \n",
"L 92.101772 121.121914 \n",
"L 108.120911 121.121914 \n",
"L 124.140049 121.121914 \n",
"L 140.159188 121.121914 \n",
"L 156.178327 121.121914 \n",
"L 172.197466 121.121914 \n",
"L 188.216604 121.121914 \n",
"L 204.235743 121.121914 \n",
"L 220.254882 121.121914 \n",
"L 236.274021 121.121914 \n",
"L 252.293159 121.121914 \n",
"L 268.312298 121.121914 \n",
"L 284.331437 121.121914 \n",
"L 300.350576 121.121914 \n",
"L 316.369714 121.121914 \n",
"L 332.388853 121.121914 \n",
"L 348.407992 121.121914 \n",
"L 364.427131 121.121914 \n",
"\" clip-path=\"url(#p9d4de64db5)\" style=\"fill: none; stroke: #2ca02c; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <path d=\"M 60.063494 17.083636 \n",
"L 76.082633 214.756364 \n",
"L 92.101772 36.850909 \n",
"L 108.120911 196.965818 \n",
"L 124.140049 52.8624 \n",
"L 140.159188 182.555476 \n",
"L 156.178327 65.831708 \n",
"L 172.197466 170.883099 \n",
"L 188.216604 76.336847 \n",
"L 204.235743 161.428474 \n",
"L 220.254882 84.84601 \n",
"L 236.274021 153.770228 \n",
"L 252.293159 91.738431 \n",
"L 268.312298 147.567048 \n",
"L 284.331437 97.321293 \n",
"L 300.350576 142.542473 \n",
"L 316.369714 101.843411 \n",
"L 332.388853 138.472566 \n",
"L 348.407992 105.506327 \n",
"L 364.427131 135.175942 \n",
"\" clip-path=\"url(#p9d4de64db5)\" style=\"fill: none; stroke: #d62728; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 44.845313 224.64 \n",
"L 44.845313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 379.645313 224.64 \n",
"L 379.645313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 44.845312 224.64 \n",
"L 379.645313 224.64 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 44.845312 7.2 \n",
"L 379.645313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"legend_1\">\n",
" <g id=\"patch_7\">\n",
" <path d=\"M 259.809375 219.64 \n",
"L 372.645313 219.64 \n",
"Q 374.645313 219.64 374.645313 217.64 \n",
"L 374.645313 159.9275 \n",
"Q 374.645313 157.9275 372.645313 157.9275 \n",
"L 259.809375 157.9275 \n",
"Q 257.809375 157.9275 257.809375 159.9275 \n",
"L 257.809375 217.64 \n",
"Q 257.809375 219.64 259.809375 219.64 \n",
"z\n",
"\" style=\"fill: #ffffff; opacity: 0.8; stroke: #cccccc; stroke-linejoin: miter\"/>\n",
" </g>\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 261.809375 166.025937 \n",
"L 271.809375 166.025937 \n",
"L 281.809375 166.025937 \n",
"\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"text_18\">\n",
" <!-- lambda = 0.10 -->\n",
" <g transform=\"translate(289.809375 169.525937)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-61\" d=\"M 2194 1759 \n",
"Q 1497 1759 1228 1600 \n",
"Q 959 1441 959 1056 \n",
"Q 959 750 1161 570 \n",
"Q 1363 391 1709 391 \n",
"Q 2188 391 2477 730 \n",
"Q 2766 1069 2766 1631 \n",
"L 2766 1759 \n",
"L 2194 1759 \n",
"z\n",
"M 3341 1997 \n",
"L 3341 0 \n",
"L 2766 0 \n",
"L 2766 531 \n",
"Q 2569 213 2275 61 \n",
"Q 1981 -91 1556 -91 \n",
"Q 1019 -91 701 211 \n",
"Q 384 513 384 1019 \n",
"Q 384 1609 779 1909 \n",
"Q 1175 2209 1959 2209 \n",
"L 2766 2209 \n",
"L 2766 2266 \n",
"Q 2766 2663 2505 2880 \n",
"Q 2244 3097 1772 3097 \n",
"Q 1472 3097 1187 3025 \n",
"Q 903 2953 641 2809 \n",
"L 641 3341 \n",
"Q 956 3463 1253 3523 \n",
"Q 1550 3584 1831 3584 \n",
"Q 2591 3584 2966 3190 \n",
"Q 3341 2797 3341 1997 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-62\" d=\"M 3116 1747 \n",
"Q 3116 2381 2855 2742 \n",
"Q 2594 3103 2138 3103 \n",
"Q 1681 3103 1420 2742 \n",
"Q 1159 2381 1159 1747 \n",
"Q 1159 1113 1420 752 \n",
"Q 1681 391 2138 391 \n",
"Q 2594 391 2855 752 \n",
"Q 3116 1113 3116 1747 \n",
"z\n",
"M 1159 2969 \n",
"Q 1341 3281 1617 3432 \n",
"Q 1894 3584 2278 3584 \n",
"Q 2916 3584 3314 3078 \n",
"Q 3713 2572 3713 1747 \n",
"Q 3713 922 3314 415 \n",
"Q 2916 -91 2278 -91 \n",
"Q 1894 -91 1617 61 \n",
"Q 1341 213 1159 525 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 4863 \n",
"L 1159 4863 \n",
"L 1159 2969 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-64\" d=\"M 2906 2969 \n",
"L 2906 4863 \n",
"L 3481 4863 \n",
"L 3481 0 \n",
"L 2906 0 \n",
"L 2906 525 \n",
"Q 2725 213 2448 61 \n",
"Q 2172 -91 1784 -91 \n",
"Q 1150 -91 751 415 \n",
"Q 353 922 353 1747 \n",
"Q 353 2572 751 3078 \n",
"Q 1150 3584 1784 3584 \n",
"Q 2172 3584 2448 3432 \n",
"Q 2725 3281 2906 2969 \n",
"z\n",
"M 947 1747 \n",
"Q 947 1113 1208 752 \n",
"Q 1469 391 1925 391 \n",
"Q 2381 391 2643 752 \n",
"Q 2906 1113 2906 1747 \n",
"Q 2906 2381 2643 2742 \n",
"Q 2381 3103 1925 3103 \n",
"Q 1469 3103 1208 2742 \n",
"Q 947 2381 947 1747 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-20\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-3d\" d=\"M 678 2906 \n",
"L 4684 2906 \n",
"L 4684 2381 \n",
"L 678 2381 \n",
"L 678 2906 \n",
"z\n",
"M 678 1631 \n",
"L 4684 1631 \n",
"L 4684 1100 \n",
"L 678 1100 \n",
"L 678 1631 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"27.783203\"/>\n",
" <use xlink:href=\"#DejaVuSans-6d\" x=\"89.0625\"/>\n",
" <use xlink:href=\"#DejaVuSans-62\" x=\"186.474609\"/>\n",
" <use xlink:href=\"#DejaVuSans-64\" x=\"249.951172\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"313.427734\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"374.707031\"/>\n",
" <use xlink:href=\"#DejaVuSans-3d\" x=\"406.494141\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"490.283203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"522.070312\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"585.693359\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"617.480469\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"681.103516\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <path d=\"M 261.809375 180.704062 \n",
"L 271.809375 180.704062 \n",
"L 281.809375 180.704062 \n",
"\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"text_19\">\n",
" <!-- lambda = 1.00 -->\n",
" <g transform=\"translate(289.809375 184.204062)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"27.783203\"/>\n",
" <use xlink:href=\"#DejaVuSans-6d\" x=\"89.0625\"/>\n",
" <use xlink:href=\"#DejaVuSans-62\" x=\"186.474609\"/>\n",
" <use xlink:href=\"#DejaVuSans-64\" x=\"249.951172\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"313.427734\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"374.707031\"/>\n",
" <use xlink:href=\"#DejaVuSans-3d\" x=\"406.494141\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"490.283203\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"522.070312\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"585.693359\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"617.480469\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"681.103516\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_23\">\n",
" <path d=\"M 261.809375 195.382187 \n",
"L 271.809375 195.382187 \n",
"L 281.809375 195.382187 \n",
"\" style=\"fill: none; stroke: #2ca02c; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"text_20\">\n",
" <!-- lambda = 10.00 -->\n",
" <g transform=\"translate(289.809375 198.882187)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"27.783203\"/>\n",
" <use xlink:href=\"#DejaVuSans-6d\" x=\"89.0625\"/>\n",
" <use xlink:href=\"#DejaVuSans-62\" x=\"186.474609\"/>\n",
" <use xlink:href=\"#DejaVuSans-64\" x=\"249.951172\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"313.427734\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"374.707031\"/>\n",
" <use xlink:href=\"#DejaVuSans-3d\" x=\"406.494141\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"490.283203\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"522.070312\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"585.693359\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"649.316406\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"681.103516\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"744.726562\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_24\">\n",
" <path d=\"M 261.809375 210.060312 \n",
"L 271.809375 210.060312 \n",
"L 281.809375 210.060312 \n",
"\" style=\"fill: none; stroke: #d62728; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"text_21\">\n",
" <!-- lambda = 19.00 -->\n",
" <g transform=\"translate(289.809375 213.560312)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-39\" d=\"M 703 97 \n",
"L 703 672 \n",
"Q 941 559 1184 500 \n",
"Q 1428 441 1663 441 \n",
"Q 2288 441 2617 861 \n",
"Q 2947 1281 2994 2138 \n",
"Q 2813 1869 2534 1725 \n",
"Q 2256 1581 1919 1581 \n",
"Q 1219 1581 811 2004 \n",
"Q 403 2428 403 3163 \n",
"Q 403 3881 828 4315 \n",
"Q 1253 4750 1959 4750 \n",
"Q 2769 4750 3195 4129 \n",
"Q 3622 3509 3622 2328 \n",
"Q 3622 1225 3098 567 \n",
"Q 2575 -91 1691 -91 \n",
"Q 1453 -91 1209 -44 \n",
"Q 966 3 703 97 \n",
"z\n",
"M 1959 2075 \n",
"Q 2384 2075 2632 2365 \n",
"Q 2881 2656 2881 3163 \n",
"Q 2881 3666 2632 3958 \n",
"Q 2384 4250 1959 4250 \n",
"Q 1534 4250 1286 3958 \n",
"Q 1038 3666 1038 3163 \n",
"Q 1038 2656 1286 2365 \n",
"Q 1534 2075 1959 2075 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"27.783203\"/>\n",
" <use xlink:href=\"#DejaVuSans-6d\" x=\"89.0625\"/>\n",
" <use xlink:href=\"#DejaVuSans-62\" x=\"186.474609\"/>\n",
" <use xlink:href=\"#DejaVuSans-64\" x=\"249.951172\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"313.427734\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"374.707031\"/>\n",
" <use xlink:href=\"#DejaVuSans-3d\" x=\"406.494141\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"490.283203\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"522.070312\"/>\n",
" <use xlink:href=\"#DejaVuSans-39\" x=\"585.693359\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"649.316406\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"681.103516\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"744.726562\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p9d4de64db5\">\n",
" <rect x=\"44.845313\" y=\"7.2\" width=\"334.8\" height=\"217.44\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"lambdas = [0.1, 1, 10, 19]\n",
"eta = 0.1\n",
"d2l.set_figsize((6, 4))\n",
"for lam in lambdas:\n",
" t = torch.arange(20).detach().numpy()\n",
" d2l.plt.plot(t, (1 - eta * lam) ** t, label=f'lambda = {lam:.2f}')\n",
"d2l.plt.xlabel('time')\n",
"d2l.plt.legend();"
]
},
{
"cell_type": "markdown",
"id": "b37a53d5",
"metadata": {
"origin_pos": 34
},
"source": [
"为了分析动量的收敛情况,我们首先用两个标量重写更新方程:一个用于$x$,另一个用于动量$v$。这产生了:\n",
"\n",
"$$\n",
"\\begin{bmatrix} v_{t+1} \\\\ x_{t+1} \\end{bmatrix} =\n",
"\\begin{bmatrix} \\beta & \\lambda \\\\ -\\eta \\beta & (1 - \\eta \\lambda) \\end{bmatrix}\n",
"\\begin{bmatrix} v_{t} \\\\ x_{t} \\end{bmatrix} = \\mathbf{R}(\\beta, \\eta, \\lambda) \\begin{bmatrix} v_{t} \\\\ x_{t} \\end{bmatrix}.\n",
"$$\n",
"\n",
"我们用$\\mathbf{R}$来表示$2 \\times 2$管理的收敛表现。\n",
"在$t$步之后,最初的值$[v_0, x_0]$变为$\\mathbf{R}(\\beta, \\eta, \\lambda)^t [v_0, x_0]$。\n",
"因此,收敛速度是由$\\mathbf{R}$的特征值决定的。\n",
"请参阅[文章](https://distill.pub/2017/momentum/) :cite:`Goh.2017`了解精彩动画。\n",
"请参阅 :cite:`Flammarion.Bach.2015`了解详细分析。\n",
"简而言之,当$0 < \\eta \\lambda < 2 + 2 \\beta$时动量收敛。\n",
"与梯度下降的$0 < \\eta \\lambda < 2$相比,这是更大范围的可行参数。\n",
"另外,一般而言较大值的$\\beta$是可取的。\n",
"\n",
"## 小结\n",
"\n",
"* 动量法用过去梯度的平均值来替换梯度,这大大加快了收敛速度。\n",
"* 对于无噪声梯度下降和嘈杂随机梯度下降,动量法都是可取的。\n",
"* 动量法可以防止在随机梯度下降的优化过程停滞的问题。\n",
"* 由于对过去的数据进行了指数降权,有效梯度数为$\\frac{1}{1-\\beta}$。\n",
"* 在凸二次问题中,可以对动量法进行明确而详细的分析。\n",
"* 动量法的实现非常简单,但它需要我们存储额外的状态向量(动量$\\mathbf{v}$)。\n",
"\n",
"## 练习\n",
"\n",
"1. 使用动量超参数和学习率的其他组合,观察和分析不同的实验结果。\n",
"1. 试试梯度下降和动量法来解决一个二次问题,其中有多个特征值,即$f(x) = \\frac{1}{2} \\sum_i \\lambda_i x_i^2$,例如$\\lambda_i = 2^{-i}$。绘制出$x$的值在初始化$x_i = 1$时如何下降。\n",
"1. 推导$h(\\mathbf{x}) = \\frac{1}{2} \\mathbf{x}^\\top \\mathbf{Q} \\mathbf{x} + \\mathbf{x}^\\top \\mathbf{c} + b$的最小值和最小化器。\n",
"1. 当我们执行带动量法的随机梯度下降时会有什么变化?当我们使用带动量法的小批量随机梯度下降时会发生什么?试验参数如何?\n"
]
},
{
"cell_type": "markdown",
"id": "e4aef6a6",
"metadata": {
"origin_pos": 36,
"tab": [
"pytorch"
]
},
"source": [
"[Discussions](https://discuss.d2l.ai/t/4328)\n"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"required_libs": []
},
"nbformat": 4,
"nbformat_minor": 5
}