{
"cells": [
{
"cell_type": "markdown",
"id": "efd11d18",
"metadata": {
"origin_pos": 0
},
"source": [
"# Adam\n",
":label:`sec_adam`\n",
"\n",
"In this chapter we have already encountered a number of techniques for efficient optimization.\n",
"Before going further, let's recap them in detail:\n",
"\n",
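"* In :numref:`sec_sgd` we saw that stochastic gradient descent is more effective than gradient descent when solving optimization problems.\n",
"* In :numref:`sec_minibatch_sgd` we saw that using larger sets of observations in one minibatch affords additional efficiency through vectorization. This is the key to efficient multi-machine, multi-GPU, and overall parallel processing.\n",
"* In :numref:`sec_momentum` we added a mechanism for aggregating a history of past gradients to accelerate convergence.\n",
"* In :numref:`sec_adagrad` we used per-coordinate scaling to obtain a computationally efficient preconditioner.\n",
"* In :numref:`sec_rmsprop` we decoupled per-coordinate scaling from the learning rate adjustment.\n",
"\n",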
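"Adam :cite:`Kingma.Ba.2014` combines all of these techniques into one efficient learning algorithm.\n",
"As expected, it has become rather popular as one of the more robust and effective optimization algorithms used in deep learning.\n",
"It is not without issues, though. In particular, :cite:`Reddi.Kale.Kumar.2019` show that there are situations where Adam can diverge due to poor variance control.\n",
"In follow-up work, :cite:`Zaheer.Reddi.Sachan.ea.2018` proposed a hotfix to Adam, called Yogi, which addresses these issues.\n",
"Let's now look at the Adam algorithm itself.\n",
"\n",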
"## The Algorithm\n",
"\n",
"One of the key components of Adam is that it uses exponentially weighted moving averages to estimate both the momentum and the second moment of the gradient. That is, it uses the state variables\n",
"\n",
"$$\\begin{aligned}\n",
"    \\mathbf{v}_t & \\leftarrow \\beta_1 \\mathbf{v}_{t-1} + (1 - \\beta_1) \\mathbf{g}_t, \\\\\n",
"    \\mathbf{s}_t & \\leftarrow \\beta_2 \\mathbf{s}_{t-1} + (1 - \\beta_2) \\mathbf{g}_t^2.\n",
"\\end{aligned}$$\n",
"\n",
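"Here $\\beta_1$ and $\\beta_2$ are nonnegative weighting parameters.\n",
"Common choices are $\\beta_1 = 0.9$ and $\\beta_2 = 0.999$.\n",
"That is, the variance estimate (the second moment $\\mathbf{s}_t$) moves much more slowly than the momentum estimate.\n",
"Note that if we initialize $\\mathbf{v}_0 = \\mathbf{s}_0 = 0$, both estimates are initially biased significantly towards zero.\n",
"We can correct for this by using the fact that $\\sum_{i=0}^{t-1} \\beta^i = \\frac{1 - \\beta^t}{1 - \\beta}$ to renormalize the terms.\n",
"Correspondingly, the normalized state variables are given by\n",
"\n",
"$$\\hat{\\mathbf{v}}_t = \\frac{\\mathbf{v}_t}{1 - \\beta_1^t} \\text{ and } \\hat{\\mathbf{s}}_t = \\frac{\\mathbf{s}_t}{1 - \\beta_2^t}.$$\n",
"\n",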
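"To see why dividing by $1 - \\beta_1^t$ removes the initialization bias, we can unroll the recursion for $\\mathbf{v}_t$. The following sketch assumes, purely for illustration, that the expected gradient $\\mathbb{E}[\\mathbf{g}_i] = \\bar{\\mathbf{g}}$ is the same at every step; this stationarity assumption and the symbol $\\bar{\\mathbf{g}}$ are ours and hold at best approximately in practice:\n",
"\n",
"$$\\begin{aligned}\n",
"    \\mathbf{v}_t & = (1 - \\beta_1) \\sum_{i=1}^{t} \\beta_1^{t-i} \\mathbf{g}_i, \\\\\n",
"    \\mathbb{E}[\\mathbf{v}_t] & = (1 - \\beta_1) \\bar{\\mathbf{g}} \\sum_{j=0}^{t-1} \\beta_1^j = (1 - \\beta_1^t) \\bar{\\mathbf{g}}.\n",
"\\end{aligned}$$\n",
"\n",
"Under this assumption $\\mathbb{E}[\\hat{\\mathbf{v}}_t] = \\bar{\\mathbf{g}}$, and the same argument with $\\beta_2$ and $\\mathbf{g}_i^2$ applies to $\\hat{\\mathbf{s}}_t$.\n",
"\n",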
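"Armed with the proper estimates, we can now write out the update equations.\n",
"First, we rescale the gradient in a manner very much akin to that of RMSProp to obtain\n",
"\n",
"$$\\mathbf{g}_t' = \\frac{\\eta \\hat{\\mathbf{v}}_t}{\\sqrt{\\hat{\\mathbf{s}}_t} + \\epsilon}.$$\n",
"\n",
"Unlike RMSProp, our update uses the momentum $\\hat{\\mathbf{v}}_t$ rather than the gradient itself.\n",
"Moreover, there is a slight cosmetic difference, since the rescaling uses $\\frac{1}{\\sqrt{\\hat{\\mathbf{s}}_t} + \\epsilon}$ instead of $\\frac{1}{\\sqrt{\\hat{\\mathbf{s}}_t + \\epsilon}}$.\n",
"The former arguably works slightly better in practice, hence the deviation from RMSProp.\n",
"Typically we pick $\\epsilon = 10^{-6}$ for a good trade-off between numerical stability and fidelity.\n",
"\n",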
"Finally, we simply update\n",
"\n",
"$$\\mathbf{x}_t \\leftarrow \\mathbf{x}_{t-1} - \\mathbf{g}_t'.$$\n",
"\n",
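"As a quick sanity check on these equations, consider the very first step, $t = 1$, with $\\mathbf{v}_0 = \\mathbf{s}_0 = 0$ and all operations taken elementwise. This is a worked illustration rather than part of the algorithm's definition, and the approximation in the last line assumes $|\\mathbf{g}_1|$ is much larger than $\\epsilon$:\n",
"\n",
"$$\\begin{aligned}\n",
"    \\mathbf{v}_1 & = (1 - \\beta_1) \\mathbf{g}_1, & \\hat{\\mathbf{v}}_1 & = \\mathbf{g}_1, \\\\\n",
"    \\mathbf{s}_1 & = (1 - \\beta_2) \\mathbf{g}_1^2, & \\hat{\\mathbf{s}}_1 & = \\mathbf{g}_1^2, \\\\\n",
"    \\mathbf{g}_1' & = \\frac{\\eta \\mathbf{g}_1}{|\\mathbf{g}_1| + \\epsilon} \\approx \\eta \\, \\mathrm{sign}(\\mathbf{g}_1).\n",
"\\end{aligned}$$\n",
"\n",
"So the first update moves every coordinate with a non-negligible gradient by roughly $\\eta$, regardless of the gradient's scale. Without the bias correction, the choices $\\beta_1 = 0.9$ and $\\beta_2 = 0.999$ would instead give a first step of about $0.1 / \\sqrt{0.001} \\approx 3.16$ times $\\eta$.\n",
"\n",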
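"Reviewing the design of Adam, its inspiration is clear.\n",
"First, momentum and scale are clearly visible in the state variables.\n",
"Their rather peculiar definition forces us to debias the terms (this could also be fixed by a slightly different initialization and update condition).\n",
"Second, given RMSProp, the combination of the two terms is straightforward.\n",
"Last, the explicit learning rate $\\eta$ allows us to control the step length to address issues of convergence.\n",
"\n",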
"## Implementation\n",
"\n",
"Implementing Adam from scratch is not very daunting.\n",
"For convenience we store the time step counter $t$ in the `hyperparams` dictionary.\n",
"Beyond that, everything is straightforward.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "5a412a5c",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:06:58.408547Z",
"iopub.status.busy": "2023-08-18T07:06:58.407890Z",
"iopub.status.idle": "2023-08-18T07:07:00.371669Z",
"shell.execute_reply": "2023-08-18T07:07:00.370790Z"
},
"origin_pos": 2,
"tab": [
"pytorch"
]
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import torch\n",
"from d2l import torch as d2l\n",
"\n",
"\n",
"def init_adam_states(feature_dim):\n",
"    # One (momentum, second-moment) state pair each for weights and bias\n",
"    v_w, v_b = torch.zeros((feature_dim, 1)), torch.zeros(1)\n",
"    s_w, s_b = torch.zeros((feature_dim, 1)), torch.zeros(1)\n",
"    return ((v_w, s_w), (v_b, s_b))\n",
"\n",
"def adam(params, states, hyperparams):\n",
"    beta1, beta2, eps = 0.9, 0.999, 1e-6\n",
"    for p, (v, s) in zip(params, states):\n",
"        with torch.no_grad():\n",
"            # Leaky averages of the gradient and of its elementwise square\n",
"            v[:] = beta1 * v + (1 - beta1) * p.grad\n",
"            s[:] = beta2 * s + (1 - beta2) * torch.square(p.grad)\n",
"            # Bias correction for the zero initialization of v and s\n",
"            v_bias_corr = v / (1 - beta1 ** hyperparams['t'])\n",
"            s_bias_corr = s / (1 - beta2 ** hyperparams['t'])\n",
"            p[:] -= hyperparams['lr'] * v_bias_corr / (torch.sqrt(s_bias_corr)\n",
"                                                       + eps)\n",
"        p.grad.data.zero_()\n",
"    # Advance the time step once per call, after all parameters are updated\n",
"    hyperparams['t'] += 1"
]
},
{
"cell_type": "markdown",
"id": "4c28a0af",
"metadata": {
"origin_pos": 5
},
"source": [
"We are now ready to train the model with the Adam implementation above, using a learning rate of $\\eta = 0.01$.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "959e8f03",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:07:00.376075Z",
"iopub.status.busy": "2023-08-18T07:07:00.375389Z",
"iopub.status.idle": "2023-08-18T07:07:03.158768Z",
"shell.execute_reply": "2023-08-18T07:07:03.157913Z"
},
"origin_pos": 6,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loss: 0.244, 0.015 sec/epoch\n"
]
},
{
"data": {
||
"image/svg+xml": [
|
||
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
|
||
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
|
||
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
|
||
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"266.957813pt\" height=\"184.455469pt\" viewBox=\"0 0 266.957813 184.455469\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
|
||
" <metadata>\n",
|
||
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
|
||
" <cc:Work>\n",
|
||
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
|
||
" <dc:date>2023-08-18T07:07:03.123695</dc:date>\n",
|
||
" <dc:format>image/svg+xml</dc:format>\n",
|
||
" <dc:creator>\n",
|
||
" <cc:Agent>\n",
|
||
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
|
||
" </cc:Agent>\n",
|
||
" </dc:creator>\n",
|
||
" </cc:Work>\n",
|
||
" </rdf:RDF>\n",
|
||
" </metadata>\n",
|
||
" <defs>\n",
|
||
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
|
||
" </defs>\n",
|
||
" <g id=\"figure_1\">\n",
|
||
" <g id=\"patch_1\">\n",
|
||
" <path d=\"M -0 184.455469 \n",
|
||
"L 266.957813 184.455469 \n",
|
||
"L 266.957813 0 \n",
|
||
"L -0 0 \n",
|
||
"L -0 184.455469 \n",
|
||
"z\n",
|
||
"\" style=\"fill: none\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"axes_1\">\n",
|
||
" <g id=\"patch_2\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 251.80625 146.899219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"L 56.50625 10.999219 \n",
|
||
"z\n",
|
||
"\" style=\"fill: #ffffff\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"matplotlib.axis_1\">\n",
|
||
" <g id=\"xtick_1\">\n",
|
||
" <g id=\"line2d_1\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 56.50625 10.999219 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_2\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"m7dc2fc49b2\" d=\"M 0 0 \n",
|
||
"L 0 3.5 \n",
|
||
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </defs>\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m7dc2fc49b2\" x=\"56.50625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_1\">\n",
|
||
" <!-- 0.0 -->\n",
|
||
" <g transform=\"translate(48.554688 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
|
||
"Q 1547 4250 1301 3770 \n",
|
||
"Q 1056 3291 1056 2328 \n",
|
||
"Q 1056 1369 1301 889 \n",
|
||
"Q 1547 409 2034 409 \n",
|
||
"Q 2525 409 2770 889 \n",
|
||
"Q 3016 1369 3016 2328 \n",
|
||
"Q 3016 3291 2770 3770 \n",
|
||
"Q 2525 4250 2034 4250 \n",
|
||
"z\n",
|
||
"M 2034 4750 \n",
|
||
"Q 2819 4750 3233 4129 \n",
|
||
"Q 3647 3509 3647 2328 \n",
|
||
"Q 3647 1150 3233 529 \n",
|
||
"Q 2819 -91 2034 -91 \n",
|
||
"Q 1250 -91 836 529 \n",
|
||
"Q 422 1150 422 2328 \n",
|
||
"Q 422 3509 836 4129 \n",
|
||
"Q 1250 4750 2034 4750 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
|
||
"L 1344 794 \n",
|
||
"L 1344 0 \n",
|
||
"L 684 0 \n",
|
||
"L 684 794 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_2\">\n",
|
||
" <g id=\"line2d_3\">\n",
|
||
" <path d=\"M 105.33125 146.899219 \n",
|
||
"L 105.33125 10.999219 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_4\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m7dc2fc49b2\" x=\"105.33125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_2\">\n",
|
||
" <!-- 0.5 -->\n",
|
||
" <g transform=\"translate(97.379688 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
|
||
"L 3169 4666 \n",
|
||
"L 3169 4134 \n",
|
||
"L 1269 4134 \n",
|
||
"L 1269 2991 \n",
|
||
"Q 1406 3038 1543 3061 \n",
|
||
"Q 1681 3084 1819 3084 \n",
|
||
"Q 2600 3084 3056 2656 \n",
|
||
"Q 3513 2228 3513 1497 \n",
|
||
"Q 3513 744 3044 326 \n",
|
||
"Q 2575 -91 1722 -91 \n",
|
||
"Q 1428 -91 1123 -41 \n",
|
||
"Q 819 9 494 109 \n",
|
||
"L 494 744 \n",
|
||
"Q 775 591 1075 516 \n",
|
||
"Q 1375 441 1709 441 \n",
|
||
"Q 2250 441 2565 725 \n",
|
||
"Q 2881 1009 2881 1497 \n",
|
||
"Q 2881 1984 2565 2268 \n",
|
||
"Q 2250 2553 1709 2553 \n",
|
||
"Q 1456 2553 1204 2497 \n",
|
||
"Q 953 2441 691 2322 \n",
|
||
"L 691 4666 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_3\">\n",
|
||
" <g id=\"line2d_5\">\n",
|
||
" <path d=\"M 154.15625 146.899219 \n",
|
||
"L 154.15625 10.999219 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_6\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m7dc2fc49b2\" x=\"154.15625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_3\">\n",
|
||
" <!-- 1.0 -->\n",
|
||
" <g transform=\"translate(146.204688 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
|
||
"L 1825 531 \n",
|
||
"L 1825 4091 \n",
|
||
"L 703 3866 \n",
|
||
"L 703 4441 \n",
|
||
"L 1819 4666 \n",
|
||
"L 2450 4666 \n",
|
||
"L 2450 531 \n",
|
||
"L 3481 531 \n",
|
||
"L 3481 0 \n",
|
||
"L 794 0 \n",
|
||
"L 794 531 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_4\">\n",
|
||
" <g id=\"line2d_7\">\n",
|
||
" <path d=\"M 202.98125 146.899219 \n",
|
||
"L 202.98125 10.999219 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_8\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m7dc2fc49b2\" x=\"202.98125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_4\">\n",
|
||
" <!-- 1.5 -->\n",
|
||
" <g transform=\"translate(195.029688 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_5\">\n",
|
||
" <g id=\"line2d_9\">\n",
|
||
" <path d=\"M 251.80625 146.899219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_10\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m7dc2fc49b2\" x=\"251.80625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_5\">\n",
|
||
" <!-- 2.0 -->\n",
|
||
" <g transform=\"translate(243.854688 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
|
||
"L 3431 531 \n",
|
||
"L 3431 0 \n",
|
||
"L 469 0 \n",
|
||
"L 469 531 \n",
|
||
"Q 828 903 1448 1529 \n",
|
||
"Q 2069 2156 2228 2338 \n",
|
||
"Q 2531 2678 2651 2914 \n",
|
||
"Q 2772 3150 2772 3378 \n",
|
||
"Q 2772 3750 2511 3984 \n",
|
||
"Q 2250 4219 1831 4219 \n",
|
||
"Q 1534 4219 1204 4116 \n",
|
||
"Q 875 4013 500 3803 \n",
|
||
"L 500 4441 \n",
|
||
"Q 881 4594 1212 4672 \n",
|
||
"Q 1544 4750 1819 4750 \n",
|
||
"Q 2544 4750 2975 4387 \n",
|
||
"Q 3406 4025 3406 3419 \n",
|
||
"Q 3406 3131 3298 2873 \n",
|
||
"Q 3191 2616 2906 2266 \n",
|
||
"Q 2828 2175 2409 1742 \n",
|
||
"Q 1991 1309 1228 531 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_6\">\n",
|
||
" <!-- epoch -->\n",
|
||
" <g transform=\"translate(138.928125 175.175781)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
|
||
"L 3597 1613 \n",
|
||
"L 953 1613 \n",
|
||
"Q 991 1019 1311 708 \n",
|
||
"Q 1631 397 2203 397 \n",
|
||
"Q 2534 397 2845 478 \n",
|
||
"Q 3156 559 3463 722 \n",
|
||
"L 3463 178 \n",
|
||
"Q 3153 47 2828 -22 \n",
|
||
"Q 2503 -91 2169 -91 \n",
|
||
"Q 1331 -91 842 396 \n",
|
||
"Q 353 884 353 1716 \n",
|
||
"Q 353 2575 817 3079 \n",
|
||
"Q 1281 3584 2069 3584 \n",
|
||
"Q 2775 3584 3186 3129 \n",
|
||
"Q 3597 2675 3597 1894 \n",
|
||
"z\n",
|
||
"M 3022 2063 \n",
|
||
"Q 3016 2534 2758 2815 \n",
|
||
"Q 2500 3097 2075 3097 \n",
|
||
"Q 1594 3097 1305 2825 \n",
|
||
"Q 1016 2553 972 2059 \n",
|
||
"L 3022 2063 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-70\" d=\"M 1159 525 \n",
|
||
"L 1159 -1331 \n",
|
||
"L 581 -1331 \n",
|
||
"L 581 3500 \n",
|
||
"L 1159 3500 \n",
|
||
"L 1159 2969 \n",
|
||
"Q 1341 3281 1617 3432 \n",
|
||
"Q 1894 3584 2278 3584 \n",
|
||
"Q 2916 3584 3314 3078 \n",
|
||
"Q 3713 2572 3713 1747 \n",
|
||
"Q 3713 922 3314 415 \n",
|
||
"Q 2916 -91 2278 -91 \n",
|
||
"Q 1894 -91 1617 61 \n",
|
||
"Q 1341 213 1159 525 \n",
|
||
"z\n",
|
||
"M 3116 1747 \n",
|
||
"Q 3116 2381 2855 2742 \n",
|
||
"Q 2594 3103 2138 3103 \n",
|
||
"Q 1681 3103 1420 2742 \n",
|
||
"Q 1159 2381 1159 1747 \n",
|
||
"Q 1159 1113 1420 752 \n",
|
||
"Q 1681 391 2138 391 \n",
|
||
"Q 2594 391 2855 752 \n",
|
||
"Q 3116 1113 3116 1747 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-6f\" d=\"M 1959 3097 \n",
|
||
"Q 1497 3097 1228 2736 \n",
|
||
"Q 959 2375 959 1747 \n",
|
||
"Q 959 1119 1226 758 \n",
|
||
"Q 1494 397 1959 397 \n",
|
||
"Q 2419 397 2687 759 \n",
|
||
"Q 2956 1122 2956 1747 \n",
|
||
"Q 2956 2369 2687 2733 \n",
|
||
"Q 2419 3097 1959 3097 \n",
|
||
"z\n",
|
||
"M 1959 3584 \n",
|
||
"Q 2709 3584 3137 3096 \n",
|
||
"Q 3566 2609 3566 1747 \n",
|
||
"Q 3566 888 3137 398 \n",
|
||
"Q 2709 -91 1959 -91 \n",
|
||
"Q 1206 -91 779 398 \n",
|
||
"Q 353 888 353 1747 \n",
|
||
"Q 353 2609 779 3096 \n",
|
||
"Q 1206 3584 1959 3584 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-63\" d=\"M 3122 3366 \n",
|
||
"L 3122 2828 \n",
|
||
"Q 2878 2963 2633 3030 \n",
|
||
"Q 2388 3097 2138 3097 \n",
|
||
"Q 1578 3097 1268 2742 \n",
|
||
"Q 959 2388 959 1747 \n",
|
||
"Q 959 1106 1268 751 \n",
|
||
"Q 1578 397 2138 397 \n",
|
||
"Q 2388 397 2633 464 \n",
|
||
"Q 2878 531 3122 666 \n",
|
||
"L 3122 134 \n",
|
||
"Q 2881 22 2623 -34 \n",
|
||
"Q 2366 -91 2075 -91 \n",
|
||
"Q 1284 -91 818 406 \n",
|
||
"Q 353 903 353 1747 \n",
|
||
"Q 353 2603 823 3093 \n",
|
||
"Q 1294 3584 2113 3584 \n",
|
||
"Q 2378 3584 2631 3529 \n",
|
||
"Q 2884 3475 3122 3366 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-68\" d=\"M 3513 2113 \n",
|
||
"L 3513 0 \n",
|
||
"L 2938 0 \n",
|
||
"L 2938 2094 \n",
|
||
"Q 2938 2591 2744 2837 \n",
|
||
"Q 2550 3084 2163 3084 \n",
|
||
"Q 1697 3084 1428 2787 \n",
|
||
"Q 1159 2491 1159 1978 \n",
|
||
"L 1159 0 \n",
|
||
"L 581 0 \n",
|
||
"L 581 4863 \n",
|
||
"L 1159 4863 \n",
|
||
"L 1159 2956 \n",
|
||
"Q 1366 3272 1645 3428 \n",
|
||
"Q 1925 3584 2291 3584 \n",
|
||
"Q 2894 3584 3203 3211 \n",
|
||
"Q 3513 2838 3513 2113 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-65\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-70\" x=\"61.523438\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-6f\" x=\"125\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-63\" x=\"186.181641\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-68\" x=\"241.162109\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"matplotlib.axis_2\">\n",
|
||
" <g id=\"ytick_1\">\n",
|
||
" <g id=\"line2d_11\">\n",
|
||
" <path d=\"M 56.50625 141.672296 \n",
|
||
"L 251.80625 141.672296 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_12\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"mf8fba4ad52\" d=\"M 0 0 \n",
|
||
"L -3.5 0 \n",
|
||
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </defs>\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mf8fba4ad52\" x=\"56.50625\" y=\"141.672296\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_7\">\n",
|
||
" <!-- 0.225 -->\n",
|
||
" <g transform=\"translate(20.878125 145.471514)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_2\">\n",
|
||
" <g id=\"line2d_13\">\n",
|
||
" <path d=\"M 56.50625 115.53768 \n",
|
||
"L 251.80625 115.53768 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_14\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mf8fba4ad52\" x=\"56.50625\" y=\"115.53768\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_8\">\n",
|
||
" <!-- 0.250 -->\n",
|
||
" <g transform=\"translate(20.878125 119.336899)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_3\">\n",
|
||
" <g id=\"line2d_15\">\n",
|
||
" <path d=\"M 56.50625 89.403065 \n",
|
||
"L 251.80625 89.403065 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_16\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mf8fba4ad52\" x=\"56.50625\" y=\"89.403065\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_9\">\n",
|
||
" <!-- 0.275 -->\n",
|
||
" <g transform=\"translate(20.878125 93.202284)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-37\" d=\"M 525 4666 \n",
|
||
"L 3525 4666 \n",
|
||
"L 3525 4397 \n",
|
||
"L 1831 0 \n",
|
||
"L 1172 0 \n",
|
||
"L 2766 4134 \n",
|
||
"L 525 4134 \n",
|
||
"L 525 4666 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-37\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_4\">\n",
|
||
" <g id=\"line2d_17\">\n",
|
||
" <path d=\"M 56.50625 63.26845 \n",
|
||
"L 251.80625 63.26845 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_18\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mf8fba4ad52\" x=\"56.50625\" y=\"63.26845\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_10\">\n",
|
||
" <!-- 0.300 -->\n",
|
||
" <g transform=\"translate(20.878125 67.067668)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
|
||
"Q 3050 2419 3304 2112 \n",
|
||
"Q 3559 1806 3559 1356 \n",
|
||
"Q 3559 666 3084 287 \n",
|
||
"Q 2609 -91 1734 -91 \n",
|
||
"Q 1441 -91 1130 -33 \n",
|
||
"Q 819 25 488 141 \n",
|
||
"L 488 750 \n",
|
||
"Q 750 597 1062 519 \n",
|
||
"Q 1375 441 1716 441 \n",
|
||
"Q 2309 441 2620 675 \n",
|
||
"Q 2931 909 2931 1356 \n",
|
||
"Q 2931 1769 2642 2001 \n",
|
||
"Q 2353 2234 1838 2234 \n",
|
||
"L 1294 2234 \n",
|
||
"L 1294 2753 \n",
|
||
"L 1863 2753 \n",
|
||
"Q 2328 2753 2575 2939 \n",
|
||
"Q 2822 3125 2822 3475 \n",
|
||
"Q 2822 3834 2567 4026 \n",
|
||
"Q 2313 4219 1838 4219 \n",
|
||
"Q 1578 4219 1281 4162 \n",
|
||
"Q 984 4106 628 3988 \n",
|
||
"L 628 4550 \n",
|
||
"Q 988 4650 1302 4700 \n",
|
||
"Q 1616 4750 1894 4750 \n",
|
||
"Q 2613 4750 3031 4423 \n",
|
||
"Q 3450 4097 3450 3541 \n",
|
||
"Q 3450 3153 3228 2886 \n",
|
||
"Q 3006 2619 2597 2516 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_5\">\n",
|
||
" <g id=\"line2d_19\">\n",
|
||
" <path d=\"M 56.50625 37.133834 \n",
|
||
"L 251.80625 37.133834 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_20\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mf8fba4ad52\" x=\"56.50625\" y=\"37.133834\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_11\">\n",
|
||
" <!-- 0.325 -->\n",
|
||
" <g transform=\"translate(20.878125 40.933053)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_6\">\n",
|
||
" <g id=\"line2d_21\">\n",
|
||
" <path d=\"M 56.50625 10.999219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_22\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mf8fba4ad52\" x=\"56.50625\" y=\"10.999219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_12\">\n",
|
||
" <!-- 0.350 -->\n",
|
||
" <g transform=\"translate(20.878125 14.798437)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_13\">\n",
|
||
" <!-- loss -->\n",
|
||
" <g transform=\"translate(14.798438 88.607031)rotate(-90)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
|
||
"L 1178 4863 \n",
|
||
"L 1178 0 \n",
|
||
"L 603 0 \n",
|
||
"L 603 4863 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-73\" d=\"M 2834 3397 \n",
|
||
"L 2834 2853 \n",
|
||
"Q 2591 2978 2328 3040 \n",
|
||
"Q 2066 3103 1784 3103 \n",
|
||
"Q 1356 3103 1142 2972 \n",
|
||
"Q 928 2841 928 2578 \n",
|
||
"Q 928 2378 1081 2264 \n",
|
||
"Q 1234 2150 1697 2047 \n",
|
||
"L 1894 2003 \n",
|
||
"Q 2506 1872 2764 1633 \n",
|
||
"Q 3022 1394 3022 966 \n",
|
||
"Q 3022 478 2636 193 \n",
|
||
"Q 2250 -91 1575 -91 \n",
|
||
"Q 1294 -91 989 -36 \n",
|
||
"Q 684 19 347 128 \n",
|
||
"L 347 722 \n",
|
||
"Q 666 556 975 473 \n",
|
||
"Q 1284 391 1588 391 \n",
|
||
"Q 1994 391 2212 530 \n",
|
||
"Q 2431 669 2431 922 \n",
|
||
"Q 2431 1156 2273 1281 \n",
|
||
"Q 2116 1406 1581 1522 \n",
|
||
"L 1381 1569 \n",
|
||
"Q 847 1681 609 1914 \n",
|
||
"Q 372 2147 372 2553 \n",
|
||
"Q 372 3047 722 3315 \n",
|
||
"Q 1072 3584 1716 3584 \n",
|
||
"Q 2034 3584 2315 3537 \n",
|
||
"Q 2597 3491 2834 3397 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-6f\" x=\"27.783203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-73\" x=\"88.964844\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-73\" x=\"141.064453\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_23\">\n",
|
||
" <path d=\"M 75.933442 -1 \n",
|
||
"L 82.54625 37.194192 \n",
|
||
"L 95.56625 81.765788 \n",
|
||
"L 108.58625 108.961006 \n",
|
||
"L 121.60625 115.915567 \n",
|
||
"L 134.62625 118.953393 \n",
|
||
"L 147.64625 121.015797 \n",
|
||
"L 160.66625 122.422801 \n",
|
||
"L 173.68625 121.738152 \n",
|
||
"L 186.70625 119.471584 \n",
|
||
"L 199.72625 122.077453 \n",
|
||
"L 212.74625 122.722963 \n",
|
||
"L 225.76625 122.039335 \n",
|
||
"L 238.78625 122.261705 \n",
|
||
"L 251.80625 122.052467 \n",
|
||
"\" clip-path=\"url(#p1d294cb0bf)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_3\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 56.50625 10.999219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_4\">\n",
|
||
" <path d=\"M 251.80625 146.899219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_5\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 251.80625 146.899219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_6\">\n",
|
||
" <path d=\"M 56.50625 10.999219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <defs>\n",
|
||
" <clipPath id=\"p1d294cb0bf\">\n",
|
||
" <rect x=\"56.50625\" y=\"10.999219\" width=\"195.3\" height=\"135.9\"/>\n",
|
||
" </clipPath>\n",
|
||
" </defs>\n",
|
||
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"data_iter, feature_dim = d2l.get_data_ch11(batch_size=10)\n",
"d2l.train_ch11(adam, init_adam_states(feature_dim),\n",
"               {'lr': 0.01, 't': 1}, data_iter, feature_dim);"
]
},
{
"cell_type": "markdown",
"id": "1f3ac5ff",
"metadata": {
"origin_pos": 7
},
"source": [
"Alternatively, we can use the Adam implementation that ships with the deep learning framework; here we only need to pass configuration parameters.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "bba60ced",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:07:03.162734Z",
"iopub.status.busy": "2023-08-18T07:07:03.162173Z",
"iopub.status.idle": "2023-08-18T07:07:08.522679Z",
"shell.execute_reply": "2023-08-18T07:07:08.521534Z"
},
"origin_pos": 9,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loss: 0.254, 0.015 sec/epoch\n"
]
},
{
"data": {
||
"image/svg+xml": [
|
||
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
|
||
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
|
||
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
|
||
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"262.1875pt\" height=\"184.455469pt\" viewBox=\"0 0 262.1875 184.455469\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
|
||
" <metadata>\n",
|
||
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
|
||
" <cc:Work>\n",
|
||
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
|
||
" <dc:date>2023-08-18T07:07:08.488032</dc:date>\n",
|
||
" <dc:format>image/svg+xml</dc:format>\n",
|
||
" <dc:creator>\n",
|
||
" <cc:Agent>\n",
|
||
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
|
||
" </cc:Agent>\n",
|
||
" </dc:creator>\n",
|
||
" </cc:Work>\n",
|
||
" </rdf:RDF>\n",
|
||
" </metadata>\n",
|
||
" <defs>\n",
|
||
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
|
||
" </defs>\n",
|
||
" <g id=\"figure_1\">\n",
|
||
" <g id=\"patch_1\">\n",
|
||
" <path d=\"M -0 184.455469 \n",
|
||
"L 262.1875 184.455469 \n",
|
||
"L 262.1875 0 \n",
|
||
"L -0 0 \n",
|
||
"L -0 184.455469 \n",
|
||
"z\n",
|
||
"\" style=\"fill: none\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"axes_1\">\n",
|
||
" <g id=\"patch_2\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 251.80625 146.899219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"L 56.50625 10.999219 \n",
|
||
"z\n",
|
||
"\" style=\"fill: #ffffff\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"matplotlib.axis_1\">\n",
|
||
" <g id=\"xtick_1\">\n",
|
||
" <g id=\"line2d_1\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 56.50625 10.999219 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_2\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"m12a7fbcaf4\" d=\"M 0 0 \n",
|
||
"L 0 3.5 \n",
|
||
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </defs>\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m12a7fbcaf4\" x=\"56.50625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_1\">\n",
|
||
" <!-- 0 -->\n",
|
||
" <g transform=\"translate(53.325 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
|
||
"Q 1547 4250 1301 3770 \n",
|
||
"Q 1056 3291 1056 2328 \n",
|
||
"Q 1056 1369 1301 889 \n",
|
||
"Q 1547 409 2034 409 \n",
|
||
"Q 2525 409 2770 889 \n",
|
||
"Q 3016 1369 3016 2328 \n",
|
||
"Q 3016 3291 2770 3770 \n",
|
||
"Q 2525 4250 2034 4250 \n",
|
||
"z\n",
|
||
"M 2034 4750 \n",
|
||
"Q 2819 4750 3233 4129 \n",
|
||
"Q 3647 3509 3647 2328 \n",
|
||
"Q 3647 1150 3233 529 \n",
|
||
"Q 2819 -91 2034 -91 \n",
|
||
"Q 1250 -91 836 529 \n",
|
||
"Q 422 1150 422 2328 \n",
|
||
"Q 422 3509 836 4129 \n",
|
||
"Q 1250 4750 2034 4750 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_2\">\n",
|
||
" <g id=\"line2d_3\">\n",
|
||
" <path d=\"M 105.33125 146.899219 \n",
|
||
"L 105.33125 10.999219 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_4\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m12a7fbcaf4\" x=\"105.33125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_2\">\n",
|
||
" <!-- 1 -->\n",
|
||
" <g transform=\"translate(102.15 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
|
||
"L 1825 531 \n",
|
||
"L 1825 4091 \n",
|
||
"L 703 3866 \n",
|
||
"L 703 4441 \n",
|
||
"L 1819 4666 \n",
|
||
"L 2450 4666 \n",
|
||
"L 2450 531 \n",
|
||
"L 3481 531 \n",
|
||
"L 3481 0 \n",
|
||
"L 794 0 \n",
|
||
"L 794 531 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_3\">\n",
|
||
" <g id=\"line2d_5\">\n",
|
||
" <path d=\"M 154.15625 146.899219 \n",
|
||
"L 154.15625 10.999219 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_6\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m12a7fbcaf4\" x=\"154.15625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_3\">\n",
|
||
" <!-- 2 -->\n",
|
||
" <g transform=\"translate(150.975 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
|
||
"L 3431 531 \n",
|
||
"L 3431 0 \n",
|
||
"L 469 0 \n",
|
||
"L 469 531 \n",
|
||
"Q 828 903 1448 1529 \n",
|
||
"Q 2069 2156 2228 2338 \n",
|
||
"Q 2531 2678 2651 2914 \n",
|
||
"Q 2772 3150 2772 3378 \n",
|
||
"Q 2772 3750 2511 3984 \n",
|
||
"Q 2250 4219 1831 4219 \n",
|
||
"Q 1534 4219 1204 4116 \n",
|
||
"Q 875 4013 500 3803 \n",
|
||
"L 500 4441 \n",
|
||
"Q 881 4594 1212 4672 \n",
|
||
"Q 1544 4750 1819 4750 \n",
|
||
"Q 2544 4750 2975 4387 \n",
|
||
"Q 3406 4025 3406 3419 \n",
|
||
"Q 3406 3131 3298 2873 \n",
|
||
"Q 3191 2616 2906 2266 \n",
|
||
"Q 2828 2175 2409 1742 \n",
|
||
"Q 1991 1309 1228 531 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_4\">\n",
|
||
" <g id=\"line2d_7\">\n",
|
||
" <path d=\"M 202.98125 146.899219 \n",
|
||
"L 202.98125 10.999219 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_8\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m12a7fbcaf4\" x=\"202.98125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_4\">\n",
|
||
" <!-- 3 -->\n",
|
||
" <g transform=\"translate(199.8 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
|
||
"Q 3050 2419 3304 2112 \n",
|
||
"Q 3559 1806 3559 1356 \n",
|
||
"Q 3559 666 3084 287 \n",
|
||
"Q 2609 -91 1734 -91 \n",
|
||
"Q 1441 -91 1130 -33 \n",
|
||
"Q 819 25 488 141 \n",
|
||
"L 488 750 \n",
|
||
"Q 750 597 1062 519 \n",
|
||
"Q 1375 441 1716 441 \n",
|
||
"Q 2309 441 2620 675 \n",
|
||
"Q 2931 909 2931 1356 \n",
|
||
"Q 2931 1769 2642 2001 \n",
|
||
"Q 2353 2234 1838 2234 \n",
|
||
"L 1294 2234 \n",
|
||
"L 1294 2753 \n",
|
||
"L 1863 2753 \n",
|
||
"Q 2328 2753 2575 2939 \n",
|
||
"Q 2822 3125 2822 3475 \n",
|
||
"Q 2822 3834 2567 4026 \n",
|
||
"Q 2313 4219 1838 4219 \n",
|
||
"Q 1578 4219 1281 4162 \n",
|
||
"Q 984 4106 628 3988 \n",
|
||
"L 628 4550 \n",
|
||
"Q 988 4650 1302 4700 \n",
|
||
"Q 1616 4750 1894 4750 \n",
|
||
"Q 2613 4750 3031 4423 \n",
|
||
"Q 3450 4097 3450 3541 \n",
|
||
"Q 3450 3153 3228 2886 \n",
|
||
"Q 3006 2619 2597 2516 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-33\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_5\">\n",
|
||
" <g id=\"line2d_9\">\n",
|
||
" <path d=\"M 251.80625 146.899219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_10\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m12a7fbcaf4\" x=\"251.80625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_5\">\n",
|
||
" <!-- 4 -->\n",
|
||
" <g transform=\"translate(248.625 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
|
||
"L 825 1625 \n",
|
||
"L 2419 1625 \n",
|
||
"L 2419 4116 \n",
|
||
"z\n",
|
||
"M 2253 4666 \n",
|
||
"L 3047 4666 \n",
|
||
"L 3047 1625 \n",
|
||
"L 3713 1625 \n",
|
||
"L 3713 1100 \n",
|
||
"L 3047 1100 \n",
|
||
"L 3047 0 \n",
|
||
"L 2419 0 \n",
|
||
"L 2419 1100 \n",
|
||
"L 313 1100 \n",
|
||
"L 313 1709 \n",
|
||
"L 2253 4666 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-34\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_6\">\n",
|
||
" <!-- epoch -->\n",
|
||
" <g transform=\"translate(138.928125 175.175781)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
|
||
"L 3597 1613 \n",
|
||
"L 953 1613 \n",
|
||
"Q 991 1019 1311 708 \n",
|
||
"Q 1631 397 2203 397 \n",
|
||
"Q 2534 397 2845 478 \n",
|
||
"Q 3156 559 3463 722 \n",
|
||
"L 3463 178 \n",
|
||
"Q 3153 47 2828 -22 \n",
|
||
"Q 2503 -91 2169 -91 \n",
|
||
"Q 1331 -91 842 396 \n",
|
||
"Q 353 884 353 1716 \n",
|
||
"Q 353 2575 817 3079 \n",
|
||
"Q 1281 3584 2069 3584 \n",
|
||
"Q 2775 3584 3186 3129 \n",
|
||
"Q 3597 2675 3597 1894 \n",
|
||
"z\n",
|
||
"M 3022 2063 \n",
|
||
"Q 3016 2534 2758 2815 \n",
|
||
"Q 2500 3097 2075 3097 \n",
|
||
"Q 1594 3097 1305 2825 \n",
|
||
"Q 1016 2553 972 2059 \n",
|
||
"L 3022 2063 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-70\" d=\"M 1159 525 \n",
|
||
"L 1159 -1331 \n",
|
||
"L 581 -1331 \n",
|
||
"L 581 3500 \n",
|
||
"L 1159 3500 \n",
|
||
"L 1159 2969 \n",
|
||
"Q 1341 3281 1617 3432 \n",
|
||
"Q 1894 3584 2278 3584 \n",
|
||
"Q 2916 3584 3314 3078 \n",
|
||
"Q 3713 2572 3713 1747 \n",
|
||
"Q 3713 922 3314 415 \n",
|
||
"Q 2916 -91 2278 -91 \n",
|
||
"Q 1894 -91 1617 61 \n",
|
||
"Q 1341 213 1159 525 \n",
|
||
"z\n",
|
||
"M 3116 1747 \n",
|
||
"Q 3116 2381 2855 2742 \n",
|
||
"Q 2594 3103 2138 3103 \n",
|
||
"Q 1681 3103 1420 2742 \n",
|
||
"Q 1159 2381 1159 1747 \n",
|
||
"Q 1159 1113 1420 752 \n",
|
||
"Q 1681 391 2138 391 \n",
|
||
"Q 2594 391 2855 752 \n",
|
||
"Q 3116 1113 3116 1747 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-6f\" d=\"M 1959 3097 \n",
|
||
"Q 1497 3097 1228 2736 \n",
|
||
"Q 959 2375 959 1747 \n",
|
||
"Q 959 1119 1226 758 \n",
|
||
"Q 1494 397 1959 397 \n",
|
||
"Q 2419 397 2687 759 \n",
|
||
"Q 2956 1122 2956 1747 \n",
|
||
"Q 2956 2369 2687 2733 \n",
|
||
"Q 2419 3097 1959 3097 \n",
|
||
"z\n",
|
||
"M 1959 3584 \n",
|
||
"Q 2709 3584 3137 3096 \n",
|
||
"Q 3566 2609 3566 1747 \n",
|
||
"Q 3566 888 3137 398 \n",
|
||
"Q 2709 -91 1959 -91 \n",
|
||
"Q 1206 -91 779 398 \n",
|
||
"Q 353 888 353 1747 \n",
|
||
"Q 353 2609 779 3096 \n",
|
||
"Q 1206 3584 1959 3584 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-63\" d=\"M 3122 3366 \n",
|
||
"L 3122 2828 \n",
|
||
"Q 2878 2963 2633 3030 \n",
|
||
"Q 2388 3097 2138 3097 \n",
|
||
"Q 1578 3097 1268 2742 \n",
|
||
"Q 959 2388 959 1747 \n",
|
||
"Q 959 1106 1268 751 \n",
|
||
"Q 1578 397 2138 397 \n",
|
||
"Q 2388 397 2633 464 \n",
|
||
"Q 2878 531 3122 666 \n",
|
||
"L 3122 134 \n",
|
||
"Q 2881 22 2623 -34 \n",
|
||
"Q 2366 -91 2075 -91 \n",
|
||
"Q 1284 -91 818 406 \n",
|
||
"Q 353 903 353 1747 \n",
|
||
"Q 353 2603 823 3093 \n",
|
||
"Q 1294 3584 2113 3584 \n",
|
||
"Q 2378 3584 2631 3529 \n",
|
||
"Q 2884 3475 3122 3366 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-68\" d=\"M 3513 2113 \n",
|
||
"L 3513 0 \n",
|
||
"L 2938 0 \n",
|
||
"L 2938 2094 \n",
|
||
"Q 2938 2591 2744 2837 \n",
|
||
"Q 2550 3084 2163 3084 \n",
|
||
"Q 1697 3084 1428 2787 \n",
|
||
"Q 1159 2491 1159 1978 \n",
|
||
"L 1159 0 \n",
|
||
"L 581 0 \n",
|
||
"L 581 4863 \n",
|
||
"L 1159 4863 \n",
|
||
"L 1159 2956 \n",
|
||
"Q 1366 3272 1645 3428 \n",
|
||
"Q 1925 3584 2291 3584 \n",
|
||
"Q 2894 3584 3203 3211 \n",
|
||
"Q 3513 2838 3513 2113 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-65\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-70\" x=\"61.523438\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-6f\" x=\"125\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-63\" x=\"186.181641\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-68\" x=\"241.162109\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"matplotlib.axis_2\">\n",
|
||
" <g id=\"ytick_1\">\n",
|
||
" <g id=\"line2d_11\">\n",
|
||
" <path d=\"M 56.50625 141.672296 \n",
|
||
"L 251.80625 141.672296 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_12\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"m27739734b3\" d=\"M 0 0 \n",
|
||
"L -3.5 0 \n",
|
||
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </defs>\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m27739734b3\" x=\"56.50625\" y=\"141.672296\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_7\">\n",
|
||
" <!-- 0.225 -->\n",
|
||
" <g transform=\"translate(20.878125 145.471514)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
|
||
"L 1344 794 \n",
|
||
"L 1344 0 \n",
|
||
"L 684 0 \n",
|
||
"L 684 794 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
|
||
"L 3169 4666 \n",
|
||
"L 3169 4134 \n",
|
||
"L 1269 4134 \n",
|
||
"L 1269 2991 \n",
|
||
"Q 1406 3038 1543 3061 \n",
|
||
"Q 1681 3084 1819 3084 \n",
|
||
"Q 2600 3084 3056 2656 \n",
|
||
"Q 3513 2228 3513 1497 \n",
|
||
"Q 3513 744 3044 326 \n",
|
||
"Q 2575 -91 1722 -91 \n",
|
||
"Q 1428 -91 1123 -41 \n",
|
||
"Q 819 9 494 109 \n",
|
||
"L 494 744 \n",
|
||
"Q 775 591 1075 516 \n",
|
||
"Q 1375 441 1709 441 \n",
|
||
"Q 2250 441 2565 725 \n",
|
||
"Q 2881 1009 2881 1497 \n",
|
||
"Q 2881 1984 2565 2268 \n",
|
||
"Q 2250 2553 1709 2553 \n",
|
||
"Q 1456 2553 1204 2497 \n",
|
||
"Q 953 2441 691 2322 \n",
|
||
"L 691 4666 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_2\">\n",
|
||
" <g id=\"line2d_13\">\n",
|
||
" <path d=\"M 56.50625 115.53768 \n",
|
||
"L 251.80625 115.53768 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_14\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m27739734b3\" x=\"56.50625\" y=\"115.53768\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_8\">\n",
|
||
" <!-- 0.250 -->\n",
|
||
" <g transform=\"translate(20.878125 119.336899)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_3\">\n",
|
||
" <g id=\"line2d_15\">\n",
|
||
" <path d=\"M 56.50625 89.403065 \n",
|
||
"L 251.80625 89.403065 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_16\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m27739734b3\" x=\"56.50625\" y=\"89.403065\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_9\">\n",
|
||
" <!-- 0.275 -->\n",
|
||
" <g transform=\"translate(20.878125 93.202284)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-37\" d=\"M 525 4666 \n",
|
||
"L 3525 4666 \n",
|
||
"L 3525 4397 \n",
|
||
"L 1831 0 \n",
|
||
"L 1172 0 \n",
|
||
"L 2766 4134 \n",
|
||
"L 525 4134 \n",
|
||
"L 525 4666 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-37\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_4\">\n",
|
||
" <g id=\"line2d_17\">\n",
|
||
" <path d=\"M 56.50625 63.26845 \n",
|
||
"L 251.80625 63.26845 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_18\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m27739734b3\" x=\"56.50625\" y=\"63.26845\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_10\">\n",
|
||
" <!-- 0.300 -->\n",
|
||
" <g transform=\"translate(20.878125 67.067668)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_5\">\n",
|
||
" <g id=\"line2d_19\">\n",
|
||
" <path d=\"M 56.50625 37.133834 \n",
|
||
"L 251.80625 37.133834 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_20\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m27739734b3\" x=\"56.50625\" y=\"37.133834\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_11\">\n",
|
||
" <!-- 0.325 -->\n",
|
||
" <g transform=\"translate(20.878125 40.933053)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_6\">\n",
|
||
" <g id=\"line2d_21\">\n",
|
||
" <path d=\"M 56.50625 10.999219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_22\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m27739734b3\" x=\"56.50625\" y=\"10.999219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_12\">\n",
|
||
" <!-- 0.350 -->\n",
|
||
" <g transform=\"translate(20.878125 14.798437)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_13\">\n",
|
||
" <!-- loss -->\n",
|
||
" <g transform=\"translate(14.798438 88.607031)rotate(-90)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
|
||
"L 1178 4863 \n",
|
||
"L 1178 0 \n",
|
||
"L 603 0 \n",
|
||
"L 603 4863 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-73\" d=\"M 2834 3397 \n",
|
||
"L 2834 2853 \n",
|
||
"Q 2591 2978 2328 3040 \n",
|
||
"Q 2066 3103 1784 3103 \n",
|
||
"Q 1356 3103 1142 2972 \n",
|
||
"Q 928 2841 928 2578 \n",
|
||
"Q 928 2378 1081 2264 \n",
|
||
"Q 1234 2150 1697 2047 \n",
|
||
"L 1894 2003 \n",
|
||
"Q 2506 1872 2764 1633 \n",
|
||
"Q 3022 1394 3022 966 \n",
|
||
"Q 3022 478 2636 193 \n",
|
||
"Q 2250 -91 1575 -91 \n",
|
||
"Q 1294 -91 989 -36 \n",
|
||
"Q 684 19 347 128 \n",
|
||
"L 347 722 \n",
|
||
"Q 666 556 975 473 \n",
|
||
"Q 1284 391 1588 391 \n",
|
||
"Q 1994 391 2212 530 \n",
|
||
"Q 2431 669 2431 922 \n",
|
||
"Q 2431 1156 2273 1281 \n",
|
||
"Q 2116 1406 1581 1522 \n",
|
||
"L 1381 1569 \n",
|
||
"Q 847 1681 609 1914 \n",
|
||
"Q 372 2147 372 2553 \n",
|
||
"Q 372 3047 722 3315 \n",
|
||
"Q 1072 3584 1716 3584 \n",
|
||
"Q 2034 3584 2315 3537 \n",
|
||
"Q 2597 3491 2834 3397 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-6f\" x=\"27.783203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-73\" x=\"88.964844\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-73\" x=\"141.064453\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_23\">\n",
|
||
" <path d=\"M 66.69056 -1 \n",
|
||
"L 69.52625 37.371177 \n",
|
||
"L 76.03625 81.541253 \n",
|
||
"L 82.54625 105.384386 \n",
|
||
"L 89.05625 112.227497 \n",
|
||
"L 95.56625 120.636156 \n",
|
||
"L 102.07625 120.715804 \n",
|
||
"L 108.58625 112.552121 \n",
|
||
"L 115.09625 121.254148 \n",
|
||
"L 121.60625 121.542661 \n",
|
||
"L 128.11625 121.798723 \n",
|
||
"L 134.62625 122.25835 \n",
|
||
"L 141.13625 121.317633 \n",
|
||
"L 147.64625 122.931881 \n",
|
||
"L 154.15625 123.027853 \n",
|
||
"L 160.66625 123.069686 \n",
|
||
"L 167.17625 122.553625 \n",
|
||
"L 173.68625 122.508729 \n",
|
||
"L 180.19625 123.807006 \n",
|
||
"L 186.70625 123.146866 \n",
|
||
"L 193.21625 122.755345 \n",
|
||
"L 199.72625 123.158251 \n",
|
||
"L 206.23625 120.2054 \n",
|
||
"L 212.74625 120.426721 \n",
|
||
"L 219.25625 122.305816 \n",
|
||
"L 225.76625 123.435641 \n",
|
||
"L 232.27625 118.664978 \n",
|
||
"L 238.78625 122.102243 \n",
|
||
"L 245.29625 121.385643 \n",
|
||
"L 251.80625 111.625322 \n",
|
||
"\" clip-path=\"url(#p26492ce462)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_3\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 56.50625 10.999219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_4\">\n",
|
||
" <path d=\"M 251.80625 146.899219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_5\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 251.80625 146.899219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_6\">\n",
|
||
" <path d=\"M 56.50625 10.999219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <defs>\n",
|
||
" <clipPath id=\"p26492ce462\">\n",
|
||
" <rect x=\"56.50625\" y=\"10.999219\" width=\"195.3\" height=\"135.9\"/>\n",
|
||
" </clipPath>\n",
|
||
" </defs>\n",
|
||
"</svg>\n"
|
||
],
|
||
"text/plain": [
|
||
"<Figure size 252x180 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"trainer = torch.optim.Adam\n",
|
||
"d2l.train_concise_ch11(trainer, {'lr': 0.01}, data_iter)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "6234218c",
|
||
"metadata": {
|
||
"origin_pos": 12
|
||
},
|
||
"source": [
|
||
"## Yogi\n",
|
||
"\n",
|
||
"Adam算法也存在一些问题:\n",
|
||
"即使在凸环境下,当$\\mathbf{s}_t$的二次矩估计值爆炸时,它可能无法收敛。\n",
|
||
" :cite:`Zaheer.Reddi.Sachan.ea.2018`为$\\mathbf{s}_t$提出了的改进更新和参数初始化。\n",
|
||
"论文中建议我们重写Adam算法更新如下:\n",
|
||
"\n",
|
||
"$$\\mathbf{s}_t \\leftarrow \\mathbf{s}_{t-1} + (1 - \\beta_2) \\left(\\mathbf{g}_t^2 - \\mathbf{s}_{t-1}\\right).$$\n",
|
||
"\n",
|
||
"每当$\\mathbf{g}_t^2$具有值很大的变量或更新很稀疏时,$\\mathbf{s}_t$可能会太快地“忘记”过去的值。\n",
|
||
"一个有效的解决方法是将$\\mathbf{g}_t^2 - \\mathbf{s}_{t-1}$替换为$\\mathbf{g}_t^2 \\odot \\mathop{\\mathrm{sgn}}(\\mathbf{g}_t^2 - \\mathbf{s}_{t-1})$。\n",
|
||
"这就是Yogi更新,现在更新的规模不再取决于偏差的量。\n",
|
||
"\n",
|
||
"$$\\mathbf{s}_t \\leftarrow \\mathbf{s}_{t-1} + (1 - \\beta_2) \\mathbf{g}_t^2 \\odot \\mathop{\\mathrm{sgn}}(\\mathbf{g}_t^2 - \\mathbf{s}_{t-1}).$$\n",
|
||
"\n",
|
||
"论文中,作者还进一步建议用更大的初始批量来初始化动量,而不仅仅是初始的逐点估计。\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"id": "14083265",
|
||
"metadata": {
|
||
"execution": {
|
||
"iopub.execute_input": "2023-08-18T07:07:08.526607Z",
|
||
"iopub.status.busy": "2023-08-18T07:07:08.526054Z",
|
||
"iopub.status.idle": "2023-08-18T07:07:11.129122Z",
|
||
"shell.execute_reply": "2023-08-18T07:07:11.128016Z"
|
||
},
|
||
"origin_pos": 14,
|
||
"tab": [
|
||
"pytorch"
|
||
]
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"loss: 0.245, 0.015 sec/epoch\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"image/svg+xml": [
|
||
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
|
||
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
|
||
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
|
||
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"266.957813pt\" height=\"184.455469pt\" viewBox=\"0 0 266.957813 184.455469\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
|
||
" <metadata>\n",
|
||
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
|
||
" <cc:Work>\n",
|
||
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
|
||
" <dc:date>2023-08-18T07:07:11.093925</dc:date>\n",
|
||
" <dc:format>image/svg+xml</dc:format>\n",
|
||
" <dc:creator>\n",
|
||
" <cc:Agent>\n",
|
||
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
|
||
" </cc:Agent>\n",
|
||
" </dc:creator>\n",
|
||
" </cc:Work>\n",
|
||
" </rdf:RDF>\n",
|
||
" </metadata>\n",
|
||
" <defs>\n",
|
||
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
|
||
" </defs>\n",
|
||
" <g id=\"figure_1\">\n",
|
||
" <g id=\"patch_1\">\n",
|
||
" <path d=\"M -0 184.455469 \n",
|
||
"L 266.957813 184.455469 \n",
|
||
"L 266.957813 0 \n",
|
||
"L -0 0 \n",
|
||
"L -0 184.455469 \n",
|
||
"z\n",
|
||
"\" style=\"fill: none\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"axes_1\">\n",
|
||
" <g id=\"patch_2\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 251.80625 146.899219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"L 56.50625 10.999219 \n",
|
||
"z\n",
|
||
"\" style=\"fill: #ffffff\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"matplotlib.axis_1\">\n",
|
||
" <g id=\"xtick_1\">\n",
|
||
" <g id=\"line2d_1\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 56.50625 10.999219 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_2\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"m237606db71\" d=\"M 0 0 \n",
|
||
"L 0 3.5 \n",
|
||
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </defs>\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m237606db71\" x=\"56.50625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_1\">\n",
|
||
" <!-- 0.0 -->\n",
|
||
" <g transform=\"translate(48.554688 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
|
||
"Q 1547 4250 1301 3770 \n",
|
||
"Q 1056 3291 1056 2328 \n",
|
||
"Q 1056 1369 1301 889 \n",
|
||
"Q 1547 409 2034 409 \n",
|
||
"Q 2525 409 2770 889 \n",
|
||
"Q 3016 1369 3016 2328 \n",
|
||
"Q 3016 3291 2770 3770 \n",
|
||
"Q 2525 4250 2034 4250 \n",
|
||
"z\n",
|
||
"M 2034 4750 \n",
|
||
"Q 2819 4750 3233 4129 \n",
|
||
"Q 3647 3509 3647 2328 \n",
|
||
"Q 3647 1150 3233 529 \n",
|
||
"Q 2819 -91 2034 -91 \n",
|
||
"Q 1250 -91 836 529 \n",
|
||
"Q 422 1150 422 2328 \n",
|
||
"Q 422 3509 836 4129 \n",
|
||
"Q 1250 4750 2034 4750 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
|
||
"L 1344 794 \n",
|
||
"L 1344 0 \n",
|
||
"L 684 0 \n",
|
||
"L 684 794 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_2\">\n",
|
||
" <g id=\"line2d_3\">\n",
|
||
" <path d=\"M 105.33125 146.899219 \n",
|
||
"L 105.33125 10.999219 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_4\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m237606db71\" x=\"105.33125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_2\">\n",
|
||
" <!-- 0.5 -->\n",
|
||
" <g transform=\"translate(97.379688 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
|
||
"L 3169 4666 \n",
|
||
"L 3169 4134 \n",
|
||
"L 1269 4134 \n",
|
||
"L 1269 2991 \n",
|
||
"Q 1406 3038 1543 3061 \n",
|
||
"Q 1681 3084 1819 3084 \n",
|
||
"Q 2600 3084 3056 2656 \n",
|
||
"Q 3513 2228 3513 1497 \n",
|
||
"Q 3513 744 3044 326 \n",
|
||
"Q 2575 -91 1722 -91 \n",
|
||
"Q 1428 -91 1123 -41 \n",
|
||
"Q 819 9 494 109 \n",
|
||
"L 494 744 \n",
|
||
"Q 775 591 1075 516 \n",
|
||
"Q 1375 441 1709 441 \n",
|
||
"Q 2250 441 2565 725 \n",
|
||
"Q 2881 1009 2881 1497 \n",
|
||
"Q 2881 1984 2565 2268 \n",
|
||
"Q 2250 2553 1709 2553 \n",
|
||
"Q 1456 2553 1204 2497 \n",
|
||
"Q 953 2441 691 2322 \n",
|
||
"L 691 4666 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_3\">\n",
|
||
" <g id=\"line2d_5\">\n",
|
||
" <path d=\"M 154.15625 146.899219 \n",
|
||
"L 154.15625 10.999219 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_6\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m237606db71\" x=\"154.15625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_3\">\n",
|
||
" <!-- 1.0 -->\n",
|
||
" <g transform=\"translate(146.204688 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
|
||
"L 1825 531 \n",
|
||
"L 1825 4091 \n",
|
||
"L 703 3866 \n",
|
||
"L 703 4441 \n",
|
||
"L 1819 4666 \n",
|
||
"L 2450 4666 \n",
|
||
"L 2450 531 \n",
|
||
"L 3481 531 \n",
|
||
"L 3481 0 \n",
|
||
"L 794 0 \n",
|
||
"L 794 531 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_4\">\n",
|
||
" <g id=\"line2d_7\">\n",
|
||
" <path d=\"M 202.98125 146.899219 \n",
|
||
"L 202.98125 10.999219 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_8\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m237606db71\" x=\"202.98125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_4\">\n",
|
||
" <!-- 1.5 -->\n",
|
||
" <g transform=\"translate(195.029688 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_5\">\n",
|
||
" <g id=\"line2d_9\">\n",
|
||
" <path d=\"M 251.80625 146.899219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_10\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m237606db71\" x=\"251.80625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_5\">\n",
|
||
" <!-- 2.0 -->\n",
|
||
" <g transform=\"translate(243.854688 161.497656)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
|
||
"L 3431 531 \n",
|
||
"L 3431 0 \n",
|
||
"L 469 0 \n",
|
||
"L 469 531 \n",
|
||
"Q 828 903 1448 1529 \n",
|
||
"Q 2069 2156 2228 2338 \n",
|
||
"Q 2531 2678 2651 2914 \n",
|
||
"Q 2772 3150 2772 3378 \n",
|
||
"Q 2772 3750 2511 3984 \n",
|
||
"Q 2250 4219 1831 4219 \n",
|
||
"Q 1534 4219 1204 4116 \n",
|
||
"Q 875 4013 500 3803 \n",
|
||
"L 500 4441 \n",
|
||
"Q 881 4594 1212 4672 \n",
|
||
"Q 1544 4750 1819 4750 \n",
|
||
"Q 2544 4750 2975 4387 \n",
|
||
"Q 3406 4025 3406 3419 \n",
|
||
"Q 3406 3131 3298 2873 \n",
|
||
"Q 3191 2616 2906 2266 \n",
|
||
"Q 2828 2175 2409 1742 \n",
|
||
"Q 1991 1309 1228 531 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_6\">\n",
|
||
" <!-- epoch -->\n",
|
||
" <g transform=\"translate(138.928125 175.175781)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
|
||
"L 3597 1613 \n",
|
||
"L 953 1613 \n",
|
||
"Q 991 1019 1311 708 \n",
|
||
"Q 1631 397 2203 397 \n",
|
||
"Q 2534 397 2845 478 \n",
|
||
"Q 3156 559 3463 722 \n",
|
||
"L 3463 178 \n",
|
||
"Q 3153 47 2828 -22 \n",
|
||
"Q 2503 -91 2169 -91 \n",
|
||
"Q 1331 -91 842 396 \n",
|
||
"Q 353 884 353 1716 \n",
|
||
"Q 353 2575 817 3079 \n",
|
||
"Q 1281 3584 2069 3584 \n",
|
||
"Q 2775 3584 3186 3129 \n",
|
||
"Q 3597 2675 3597 1894 \n",
|
||
"z\n",
|
||
"M 3022 2063 \n",
|
||
"Q 3016 2534 2758 2815 \n",
|
||
"Q 2500 3097 2075 3097 \n",
|
||
"Q 1594 3097 1305 2825 \n",
|
||
"Q 1016 2553 972 2059 \n",
|
||
"L 3022 2063 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-70\" d=\"M 1159 525 \n",
|
||
"L 1159 -1331 \n",
|
||
"L 581 -1331 \n",
|
||
"L 581 3500 \n",
|
||
"L 1159 3500 \n",
|
||
"L 1159 2969 \n",
|
||
"Q 1341 3281 1617 3432 \n",
|
||
"Q 1894 3584 2278 3584 \n",
|
||
"Q 2916 3584 3314 3078 \n",
|
||
"Q 3713 2572 3713 1747 \n",
|
||
"Q 3713 922 3314 415 \n",
|
||
"Q 2916 -91 2278 -91 \n",
|
||
"Q 1894 -91 1617 61 \n",
|
||
"Q 1341 213 1159 525 \n",
|
||
"z\n",
|
||
"M 3116 1747 \n",
|
||
"Q 3116 2381 2855 2742 \n",
|
||
"Q 2594 3103 2138 3103 \n",
|
||
"Q 1681 3103 1420 2742 \n",
|
||
"Q 1159 2381 1159 1747 \n",
|
||
"Q 1159 1113 1420 752 \n",
|
||
"Q 1681 391 2138 391 \n",
|
||
"Q 2594 391 2855 752 \n",
|
||
"Q 3116 1113 3116 1747 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-6f\" d=\"M 1959 3097 \n",
|
||
"Q 1497 3097 1228 2736 \n",
|
||
"Q 959 2375 959 1747 \n",
|
||
"Q 959 1119 1226 758 \n",
|
||
"Q 1494 397 1959 397 \n",
|
||
"Q 2419 397 2687 759 \n",
|
||
"Q 2956 1122 2956 1747 \n",
|
||
"Q 2956 2369 2687 2733 \n",
|
||
"Q 2419 3097 1959 3097 \n",
|
||
"z\n",
|
||
"M 1959 3584 \n",
|
||
"Q 2709 3584 3137 3096 \n",
|
||
"Q 3566 2609 3566 1747 \n",
|
||
"Q 3566 888 3137 398 \n",
|
||
"Q 2709 -91 1959 -91 \n",
|
||
"Q 1206 -91 779 398 \n",
|
||
"Q 353 888 353 1747 \n",
|
||
"Q 353 2609 779 3096 \n",
|
||
"Q 1206 3584 1959 3584 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-63\" d=\"M 3122 3366 \n",
|
||
"L 3122 2828 \n",
|
||
"Q 2878 2963 2633 3030 \n",
|
||
"Q 2388 3097 2138 3097 \n",
|
||
"Q 1578 3097 1268 2742 \n",
|
||
"Q 959 2388 959 1747 \n",
|
||
"Q 959 1106 1268 751 \n",
|
||
"Q 1578 397 2138 397 \n",
|
||
"Q 2388 397 2633 464 \n",
|
||
"Q 2878 531 3122 666 \n",
|
||
"L 3122 134 \n",
|
||
"Q 2881 22 2623 -34 \n",
|
||
"Q 2366 -91 2075 -91 \n",
|
||
"Q 1284 -91 818 406 \n",
|
||
"Q 353 903 353 1747 \n",
|
||
"Q 353 2603 823 3093 \n",
|
||
"Q 1294 3584 2113 3584 \n",
|
||
"Q 2378 3584 2631 3529 \n",
|
||
"Q 2884 3475 3122 3366 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-68\" d=\"M 3513 2113 \n",
|
||
"L 3513 0 \n",
|
||
"L 2938 0 \n",
|
||
"L 2938 2094 \n",
|
||
"Q 2938 2591 2744 2837 \n",
|
||
"Q 2550 3084 2163 3084 \n",
|
||
"Q 1697 3084 1428 2787 \n",
|
||
"Q 1159 2491 1159 1978 \n",
|
||
"L 1159 0 \n",
|
||
"L 581 0 \n",
|
||
"L 581 4863 \n",
|
||
"L 1159 4863 \n",
|
||
"L 1159 2956 \n",
|
||
"Q 1366 3272 1645 3428 \n",
|
||
"Q 1925 3584 2291 3584 \n",
|
||
"Q 2894 3584 3203 3211 \n",
|
||
"Q 3513 2838 3513 2113 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-65\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-70\" x=\"61.523438\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-6f\" x=\"125\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-63\" x=\"186.181641\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-68\" x=\"241.162109\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"matplotlib.axis_2\">\n",
|
||
" <g id=\"ytick_1\">\n",
|
||
" <g id=\"line2d_11\">\n",
|
||
" <path d=\"M 56.50625 141.672296 \n",
|
||
"L 251.80625 141.672296 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_12\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"mec308516a0\" d=\"M 0 0 \n",
|
||
"L -3.5 0 \n",
|
||
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </defs>\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mec308516a0\" x=\"56.50625\" y=\"141.672296\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_7\">\n",
|
||
" <!-- 0.225 -->\n",
|
||
" <g transform=\"translate(20.878125 145.471514)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_2\">\n",
|
||
" <g id=\"line2d_13\">\n",
|
||
" <path d=\"M 56.50625 115.53768 \n",
|
||
"L 251.80625 115.53768 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_14\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mec308516a0\" x=\"56.50625\" y=\"115.53768\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_8\">\n",
|
||
" <!-- 0.250 -->\n",
|
||
" <g transform=\"translate(20.878125 119.336899)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_3\">\n",
|
||
" <g id=\"line2d_15\">\n",
|
||
" <path d=\"M 56.50625 89.403065 \n",
|
||
"L 251.80625 89.403065 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_16\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mec308516a0\" x=\"56.50625\" y=\"89.403065\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_9\">\n",
|
||
" <!-- 0.275 -->\n",
|
||
" <g transform=\"translate(20.878125 93.202284)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-37\" d=\"M 525 4666 \n",
|
||
"L 3525 4666 \n",
|
||
"L 3525 4397 \n",
|
||
"L 1831 0 \n",
|
||
"L 1172 0 \n",
|
||
"L 2766 4134 \n",
|
||
"L 525 4134 \n",
|
||
"L 525 4666 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-37\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_4\">\n",
|
||
" <g id=\"line2d_17\">\n",
|
||
" <path d=\"M 56.50625 63.26845 \n",
|
||
"L 251.80625 63.26845 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_18\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mec308516a0\" x=\"56.50625\" y=\"63.26845\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_10\">\n",
|
||
" <!-- 0.300 -->\n",
|
||
" <g transform=\"translate(20.878125 67.067668)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
|
||
"Q 3050 2419 3304 2112 \n",
|
||
"Q 3559 1806 3559 1356 \n",
|
||
"Q 3559 666 3084 287 \n",
|
||
"Q 2609 -91 1734 -91 \n",
|
||
"Q 1441 -91 1130 -33 \n",
|
||
"Q 819 25 488 141 \n",
|
||
"L 488 750 \n",
|
||
"Q 750 597 1062 519 \n",
|
||
"Q 1375 441 1716 441 \n",
|
||
"Q 2309 441 2620 675 \n",
|
||
"Q 2931 909 2931 1356 \n",
|
||
"Q 2931 1769 2642 2001 \n",
|
||
"Q 2353 2234 1838 2234 \n",
|
||
"L 1294 2234 \n",
|
||
"L 1294 2753 \n",
|
||
"L 1863 2753 \n",
|
||
"Q 2328 2753 2575 2939 \n",
|
||
"Q 2822 3125 2822 3475 \n",
|
||
"Q 2822 3834 2567 4026 \n",
|
||
"Q 2313 4219 1838 4219 \n",
|
||
"Q 1578 4219 1281 4162 \n",
|
||
"Q 984 4106 628 3988 \n",
|
||
"L 628 4550 \n",
|
||
"Q 988 4650 1302 4700 \n",
|
||
"Q 1616 4750 1894 4750 \n",
|
||
"Q 2613 4750 3031 4423 \n",
|
||
"Q 3450 4097 3450 3541 \n",
|
||
"Q 3450 3153 3228 2886 \n",
|
||
"Q 3006 2619 2597 2516 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_5\">\n",
|
||
" <g id=\"line2d_19\">\n",
|
||
" <path d=\"M 56.50625 37.133834 \n",
|
||
"L 251.80625 37.133834 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_20\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mec308516a0\" x=\"56.50625\" y=\"37.133834\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_11\">\n",
|
||
" <!-- 0.325 -->\n",
|
||
" <g transform=\"translate(20.878125 40.933053)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_6\">\n",
|
||
" <g id=\"line2d_21\">\n",
|
||
" <path d=\"M 56.50625 10.999219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_22\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mec308516a0\" x=\"56.50625\" y=\"10.999219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_12\">\n",
|
||
" <!-- 0.350 -->\n",
|
||
" <g transform=\"translate(20.878125 14.798437)scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_13\">\n",
|
||
" <!-- loss -->\n",
|
||
" <g transform=\"translate(14.798438 88.607031)rotate(-90)scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
|
||
"L 1178 4863 \n",
|
||
"L 1178 0 \n",
|
||
"L 603 0 \n",
|
||
"L 603 4863 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"DejaVuSans-73\" d=\"M 2834 3397 \n",
|
||
"L 2834 2853 \n",
|
||
"Q 2591 2978 2328 3040 \n",
|
||
"Q 2066 3103 1784 3103 \n",
|
||
"Q 1356 3103 1142 2972 \n",
|
||
"Q 928 2841 928 2578 \n",
|
||
"Q 928 2378 1081 2264 \n",
|
||
"Q 1234 2150 1697 2047 \n",
|
||
"L 1894 2003 \n",
|
||
"Q 2506 1872 2764 1633 \n",
|
||
"Q 3022 1394 3022 966 \n",
|
||
"Q 3022 478 2636 193 \n",
|
||
"Q 2250 -91 1575 -91 \n",
|
||
"Q 1294 -91 989 -36 \n",
|
||
"Q 684 19 347 128 \n",
|
||
"L 347 722 \n",
|
||
"Q 666 556 975 473 \n",
|
||
"Q 1284 391 1588 391 \n",
|
||
"Q 1994 391 2212 530 \n",
|
||
"Q 2431 669 2431 922 \n",
|
||
"Q 2431 1156 2273 1281 \n",
|
||
"Q 2116 1406 1581 1522 \n",
|
||
"L 1381 1569 \n",
|
||
"Q 847 1681 609 1914 \n",
|
||
"Q 372 2147 372 2553 \n",
|
||
"Q 372 3047 722 3315 \n",
|
||
"Q 1072 3584 1716 3584 \n",
|
||
"Q 2034 3584 2315 3537 \n",
|
||
"Q 2597 3491 2834 3397 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-6f\" x=\"27.783203\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-73\" x=\"88.964844\"/>\n",
|
||
" <use xlink:href=\"#DejaVuSans-73\" x=\"141.064453\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"line2d_23\">\n",
|
||
" <path d=\"M 73.007467 -1 \n",
|
||
"L 82.54625 62.440437 \n",
|
||
"L 95.56625 101.125442 \n",
|
||
"L 108.58625 115.212926 \n",
|
||
"L 121.60625 119.800377 \n",
|
||
"L 134.62625 122.585943 \n",
|
||
"L 147.64625 121.387646 \n",
|
||
"L 160.66625 116.834716 \n",
|
||
"L 173.68625 118.22928 \n",
|
||
"L 186.70625 121.939493 \n",
|
||
"L 199.72625 118.857904 \n",
|
||
"L 212.74625 123.159671 \n",
|
||
"L 225.76625 122.540552 \n",
|
||
"L 238.78625 119.650837 \n",
|
||
"L 251.80625 120.455439 \n",
|
||
"\" clip-path=\"url(#pc59c6bafe9)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_3\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 56.50625 10.999219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_4\">\n",
|
||
" <path d=\"M 251.80625 146.899219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_5\">\n",
|
||
" <path d=\"M 56.50625 146.899219 \n",
|
||
"L 251.80625 146.899219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_6\">\n",
|
||
" <path d=\"M 56.50625 10.999219 \n",
|
||
"L 251.80625 10.999219 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <defs>\n",
|
||
" <clipPath id=\"pc59c6bafe9\">\n",
|
||
" <rect x=\"56.50625\" y=\"10.999219\" width=\"195.3\" height=\"135.9\"/>\n",
|
||
" </clipPath>\n",
|
||
" </defs>\n",
|
||
"</svg>\n"
|
||
],
|
||
"text/plain": [
|
||
"<Figure size 252x180 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"def yogi(params, states, hyperparams):\n",
|
||
" beta1, beta2, eps = 0.9, 0.999, 1e-3\n",
|
||
" for p, (v, s) in zip(params, states):\n",
|
||
" with torch.no_grad():\n",
|
||
" v[:] = beta1 * v + (1 - beta1) * p.grad\n",
|
||
" s[:] = s + (1 - beta2) * torch.sign(\n",
|
||
" torch.square(p.grad) - s) * torch.square(p.grad)\n",
|
||
" v_bias_corr = v / (1 - beta1 ** hyperparams['t'])\n",
|
||
" s_bias_corr = s / (1 - beta2 ** hyperparams['t'])\n",
|
||
" p[:] -= hyperparams['lr'] * v_bias_corr / (torch.sqrt(s_bias_corr)\n",
|
||
" + eps)\n",
|
||
" p.grad.data.zero_()\n",
|
||
" hyperparams['t'] += 1\n",
|
||
"\n",
|
||
"data_iter, feature_dim = d2l.get_data_ch11(batch_size=10)\n",
|
||
"d2l.train_ch11(yogi, init_adam_states(feature_dim),\n",
|
||
" {'lr': 0.01, 't': 1}, data_iter, feature_dim);"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "4feb3739",
|
||
"metadata": {
|
||
"origin_pos": 17
|
||
},
|
||
"source": [
|
||
"## 小结\n",
|
||
"\n",
|
||
"* Adam算法将许多优化算法的功能结合到了相当强大的更新规则中。\n",
|
||
"* Adam算法在RMSProp算法基础上创建的,还在小批量的随机梯度上使用EWMA。\n",
|
||
"* 在估计动量和二次矩时,Adam算法使用偏差校正来调整缓慢的启动速度。\n",
|
||
"* 对于具有显著差异的梯度,我们可能会遇到收敛性问题。我们可以通过使用更大的小批量或者切换到改进的估计值$\\mathbf{s}_t$来修正它们。Yogi提供了这样的替代方案。\n",
|
||
"\n",
|
||
"## 练习\n",
|
||
"\n",
|
||
"1. 调节学习率,观察并分析实验结果。\n",
|
||
"1. 试着重写动量和二次矩更新,从而使其不需要偏差校正。\n",
|
||
"1. 收敛时为什么需要降低学习率$\\eta$?\n",
|
||
"1. 尝试构造一个使用Adam算法会发散而Yogi会收敛的例子。\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "475fc1bc",
|
||
"metadata": {
|
||
"origin_pos": 19,
|
||
"tab": [
|
||
"pytorch"
|
||
]
|
||
},
|
||
"source": [
|
||
"[Discussions](https://discuss.d2l.ai/t/4331)\n"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"language_info": {
|
||
"name": "python"
|
||
},
|
||
"required_libs": []
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 5
|
||
} |