{
"cells": [
{
"cell_type": "markdown",
"id": "b4886f2a",
"metadata": {
"origin_pos": 0
},
"source": [
"# AdaGrad算法\n",
":label:`sec_adagrad`\n",
"\n",
"我们从有关特征学习中并不常见的问题入手。\n",
"\n",
"## 稀疏特征和学习率\n",
"\n",
"假设我们正在训练一个语言模型。\n",
"为了获得良好的准确性,我们大多希望在训练的过程中降低学习率,速度通常为$\\mathcal{O}(t^{-\\frac{1}{2}})$或更低。\n",
"现在讨论关于稀疏特征(即只在偶尔出现的特征)的模型训练,这对自然语言来说很常见。\n",
"例如,我们看到“预先条件”这个词比“学习”这个词的可能性要小得多。\n",
"但是,它在计算广告学和个性化协同过滤等其他领域也很常见。\n",
"\n",
"只有在这些不常见的特征出现时,与其相关的参数才会得到有意义的更新。\n",
"鉴于学习率下降,我们可能最终会面临这样的情况:常见特征的参数相当迅速地收敛到最佳值,而对于不常见的特征,我们仍缺乏足够的观测以确定其最佳值。\n",
"换句话说,学习率要么对于常见特征而言降低太慢,要么对于不常见特征而言降低太快。\n",
"\n",
"解决此问题的一个方法是记录我们看到特定特征的次数,然后将其用作调整学习率。\n",
"即我们可以使用大小为$\\eta_i = \\frac{\\eta_0}{\\sqrt{s(i, t) + c}}$的学习率,而不是$\\eta = \\frac{\\eta_0}{\\sqrt{t + c}}$。\n",
"在这里$s(i, t)$计下了我们截至$t$时观察到功能$i$的次数。\n",
"这其实很容易实施且不产生额外损耗。\n",
"\n",
"AdaGrad算法 :cite:`Duchi.Hazan.Singer.2011`通过将粗略的计数器$s(i, t)$替换为先前观察所得梯度的平方之和来解决这个问题。\n",
"它使用$s(i, t+1) = s(i, t) + \\left(\\partial_i f(\\mathbf{x})\\right)^2$来调整学习率。\n",
"这有两个好处:首先,我们不再需要决定梯度何时算足够大。\n",
"其次,它会随梯度的大小自动变化。通常对应于较大梯度的坐标会显著缩小,而其他梯度较小的坐标则会得到更平滑的处理。\n",
"在实际应用中,它促成了计算广告学及其相关问题中非常有效的优化程序。\n",
"但是,它遮盖了AdaGrad固有的一些额外优势,这些优势在预处理环境中很容易被理解。\n",
"\n",
"## 预处理\n",
"\n",
"凸优化问题有助于分析算法的特点。\n",
"毕竟对大多数非凸问题来说,获得有意义的理论保证很难,但是直觉和洞察往往会延续。\n",
"让我们来看看最小化$f(\\mathbf{x}) = \\frac{1}{2} \\mathbf{x}^\\top \\mathbf{Q} \\mathbf{x} + \\mathbf{c}^\\top \\mathbf{x} + b$这一问题。\n",
"\n",
"正如在 :numref:`sec_momentum`中那样,我们可以根据其特征分解$\\mathbf{Q} = \\mathbf{U}^\\top \\boldsymbol{\\Lambda} \\mathbf{U}$重写这个问题,来得到一个简化得多的问题,使每个坐标都可以单独解出:\n",
"\n",
"$$f(\\mathbf{x}) = \\bar{f}(\\bar{\\mathbf{x}}) = \\frac{1}{2} \\bar{\\mathbf{x}}^\\top \\boldsymbol{\\Lambda} \\bar{\\mathbf{x}} + \\bar{\\mathbf{c}}^\\top \\bar{\\mathbf{x}} + b.$$\n",
"\n",
"在这里我们使用了$\\mathbf{x} = \\mathbf{U} \\mathbf{x}$,且因此$\\mathbf{c} = \\mathbf{U} \\mathbf{c}$。\n",
"修改后优化器为$\\bar{\\mathbf{x}} = -\\boldsymbol{\\Lambda}^{-1} \\bar{\\mathbf{c}}$且最小值为$-\\frac{1}{2} \\bar{\\mathbf{c}}^\\top \\boldsymbol{\\Lambda}^{-1} \\bar{\\mathbf{c}} + b$。\n",
"这样更容易计算,因为$\\boldsymbol{\\Lambda}$是一个包含$\\mathbf{Q}$特征值的对角矩阵。\n",
"\n",
"如果稍微扰动$\\mathbf{c}$,我们会期望在$f$的最小化器中只产生微小的变化。\n",
"遗憾的是,情况并非如此。\n",
"虽然$\\mathbf{c}$的微小变化导致了$\\bar{\\mathbf{c}}$同样的微小变化,但$f$的(以及$\\bar{f}$的)最小化器并非如此。\n",
"每当特征值$\\boldsymbol{\\Lambda}_i$很大时,我们只会看到$\\bar{x}_i$和$\\bar{f}$的最小值发声微小变化。\n",
"相反,对小的$\\boldsymbol{\\Lambda}_i$来说,$\\bar{x}_i$的变化可能是剧烈的。\n",
"最大和最小的特征值之比称为优化问题的*条件数*(condition number)。\n",
"\n",
"$$\\kappa = \\frac{\\boldsymbol{\\Lambda}_1}{\\boldsymbol{\\Lambda}_d}.$$\n",
"\n",
"如果条件编号$\\kappa$很大,准确解决优化问题就会很难。\n",
"我们需要确保在获取大量动态的特征值范围时足够谨慎:难道我们不能简单地通过扭曲空间来“修复”这个问题,从而使所有特征值都是$1$?\n",
"理论上这很容易:我们只需要$\\mathbf{Q}$的特征值和特征向量即可将问题从$\\mathbf{x}$整理到$\\mathbf{z} := \\boldsymbol{\\Lambda}^{\\frac{1}{2}} \\mathbf{U} \\mathbf{x}$中的一个。\n",
"在新的坐标系中,$\\mathbf{x}^\\top \\mathbf{Q} \\mathbf{x}$可以被简化为$\\|\\mathbf{z}\\|^2$。\n",
"可惜,这是一个相当不切实际的想法。\n",
"一般而言,计算特征值和特征向量要比解决实际问题“贵”得多。\n",
"\n",
"虽然准确计算特征值可能会很昂贵,但即便只是大致猜测并计算它们,也可能已经比不做任何事情好得多。\n",
"特别是,我们可以使用$\\mathbf{Q}$的对角线条目并相应地重新缩放它。\n",
"这比计算特征值开销小的多。\n",
"\n",
"$$\\tilde{\\mathbf{Q}} = \\mathrm{diag}^{-\\frac{1}{2}}(\\mathbf{Q}) \\mathbf{Q} \\mathrm{diag}^{-\\frac{1}{2}}(\\mathbf{Q}).$$\n",
"\n",
"在这种情况下,我们得到了$\\tilde{\\mathbf{Q}}_{ij} = \\mathbf{Q}_{ij} / \\sqrt{\\mathbf{Q}_{ii} \\mathbf{Q}_{jj}}$,特别注意对于所有$i$$\\tilde{\\mathbf{Q}}_{ii} = 1$。\n",
"在大多数情况下,这大大简化了条件数。\n",
"例如我们之前讨论的案例,它将完全消除眼下的问题,因为问题是轴对齐的。\n",
"\n",
"遗憾的是,我们还面临另一个问题:在深度学习中,我们通常情况甚至无法计算目标函数的二阶导数:对于$\\mathbf{x} \\in \\mathbb{R}^d$,即使只在小批量上,二阶导数可能也需要$\\mathcal{O}(d^2)$空间来计算,导致几乎不可行。\n",
"AdaGrad算法巧妙的思路是,使用一个代理来表示黑塞矩阵(Hessian)的对角线,既相对易于计算又高效。\n",
"\n",
"为了了解它是如何生效的,让我们来看看$\\bar{f}(\\bar{\\mathbf{x}})$。\n",
"我们有\n",
"\n",
"$$\\partial_{\\bar{\\mathbf{x}}} \\bar{f}(\\bar{\\mathbf{x}}) = \\boldsymbol{\\Lambda} \\bar{\\mathbf{x}} + \\bar{\\mathbf{c}} = \\boldsymbol{\\Lambda} \\left(\\bar{\\mathbf{x}} - \\bar{\\mathbf{x}}_0\\right),$$\n",
"\n",
"其中$\\bar{\\mathbf{x}}_0$是$\\bar{f}$的优化器。\n",
"因此,梯度的大小取决于$\\boldsymbol{\\Lambda}$和与最佳值的差值。\n",
"如果$\\bar{\\mathbf{x}} - \\bar{\\mathbf{x}}_0$没有改变,那这就是我们所求的。\n",
"毕竟在这种情况下,梯度$\\partial_{\\bar{\\mathbf{x}}} \\bar{f}(\\bar{\\mathbf{x}})$的大小就足够了。\n",
"由于AdaGrad算法是一种随机梯度下降算法,所以即使是在最佳值中,我们也会看到具有非零方差的梯度。\n",
"因此,我们可以放心地使用梯度的方差作为黑塞矩阵比例的廉价替代。\n",
"详尽的分析(要花几页解释)超出了本节的范围,请读者参考 :cite:`Duchi.Hazan.Singer.2011`。\n",
"\n",
"## 算法\n",
"\n",
"让我们接着上面正式开始讨论。\n",
"我们使用变量$\\mathbf{s}_t$来累加过去的梯度方差,如下所示:\n",
"\n",
"$$\\begin{aligned}\n",
" \\mathbf{g}_t & = \\partial_{\\mathbf{w}} l(y_t, f(\\mathbf{x}_t, \\mathbf{w})), \\\\\n",
" \\mathbf{s}_t & = \\mathbf{s}_{t-1} + \\mathbf{g}_t^2, \\\\\n",
" \\mathbf{w}_t & = \\mathbf{w}_{t-1} - \\frac{\\eta}{\\sqrt{\\mathbf{s}_t + \\epsilon}} \\cdot \\mathbf{g}_t.\n",
"\\end{aligned}$$\n",
"\n",
"在这里,操作是按照坐标顺序应用。\n",
"也就是说,$\\mathbf{v}^2$有条目$v_i^2$。\n",
"同样,$\\frac{1}{\\sqrt{v}}$有条目$\\frac{1}{\\sqrt{v_i}}$\n",
"并且$\\mathbf{u} \\cdot \\mathbf{v}$有条目$u_i v_i$。\n",
"与之前一样,$\\eta$是学习率,$\\epsilon$是一个为维持数值稳定性而添加的常数,用来确保我们不会除以$0$。\n",
"最后,我们初始化$\\mathbf{s}_0 = \\mathbf{0}$。\n",
"\n",
"就像在动量法中我们需要跟踪一个辅助变量一样,在AdaGrad算法中,我们允许每个坐标有单独的学习率。\n",
"与SGD算法相比,这并没有明显增加AdaGrad的计算代价,因为主要计算用在$l(y_t, f(\\mathbf{x}_t, \\mathbf{w}))$及其导数。\n",
"\n",
"请注意,在$\\mathbf{s}_t$中累加平方梯度意味着$\\mathbf{s}_t$基本上以线性速率增长(由于梯度从最初开始衰减,实际上比线性慢一些)。\n",
"这产生了一个学习率$\\mathcal{O}(t^{-\\frac{1}{2}})$,但是在单个坐标的层面上进行了调整。\n",
"对于凸问题,这完全足够了。\n",
"然而,在深度学习中,我们可能希望更慢地降低学习率。\n",
"这引出了许多AdaGrad算法的变体,我们将在后续章节中讨论它们。\n",
"眼下让我们先看看它在二次凸问题中的表现如何。\n",
"我们仍然以同一函数为例:\n",
"\n",
"$$f(\\mathbf{x}) = 0.1 x_1^2 + 2 x_2^2.$$\n",
"\n",
"我们将使用与之前相同的学习率来实现AdaGrad算法,即$\\eta = 0.4$。\n",
"可以看到,自变量的迭代轨迹较平滑。\n",
"但由于$\\boldsymbol{s}_t$的累加效果使学习率不断衰减,自变量在迭代后期的移动幅度较小。\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "f2d18ce2",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:07:35.660003Z",
"iopub.status.busy": "2023-08-18T07:07:35.659452Z",
"iopub.status.idle": "2023-08-18T07:07:37.668048Z",
"shell.execute_reply": "2023-08-18T07:07:37.667136Z"
},
"origin_pos": 2,
"tab": [
"pytorch"
]
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import math\n",
"import torch\n",
"from d2l import torch as d2l"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "db64eb41",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:07:37.671920Z",
"iopub.status.busy": "2023-08-18T07:07:37.671523Z",
"iopub.status.idle": "2023-08-18T07:07:37.815807Z",
"shell.execute_reply": "2023-08-18T07:07:37.814923Z"
},
"origin_pos": 5,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 20, x1: -2.382563, x2: -0.158591\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"245.120313pt\" height=\"180.65625pt\" viewBox=\"0 0 245.120313 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:07:37.784960</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 245.120313 180.65625 \n",
"L 245.120313 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"L 42.620312 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <defs>\n",
" <path id=\"m623ecd3123\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m623ecd3123\" x=\"88.39375\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 4 -->\n",
" <g transform=\"translate(81.022656 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-34\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_2\">\n",
" <g>\n",
" <use xlink:href=\"#m623ecd3123\" x=\"149.425\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(142.053907 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_3\">\n",
" <g>\n",
" <use xlink:href=\"#m623ecd3123\" x=\"210.456251\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(207.275001 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- x1 -->\n",
" <g transform=\"translate(134.129687 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_4\">\n",
" <defs>\n",
" <path id=\"mc261d8226f\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#mc261d8226f\" x=\"42.620312\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 3 -->\n",
" <g transform=\"translate(20.878125 146.899219)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_5\">\n",
" <g>\n",
" <use xlink:href=\"#mc261d8226f\" x=\"42.620312\" y=\"108.253846\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(20.878125 112.053065)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#mc261d8226f\" x=\"42.620312\" y=\"73.407692\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 1 -->\n",
" <g transform=\"translate(20.878125 77.206911)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_7\">\n",
" <g>\n",
" <use xlink:href=\"#mc261d8226f\" x=\"42.620312\" y=\"38.561538\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(29.257812 42.360757)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- x2 -->\n",
" <g transform=\"translate(14.798437 81.290625)rotate(-90)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <path d=\"M 57.878125 108.253846 \n",
"L 70.084369 94.315384 \n",
"L 78.348688 85.608097 \n",
"L 84.908015 79.108408 \n",
"L 90.459611 73.910729 \n",
"L 95.325825 69.601347 \n",
"L 99.686585 65.949499 \n",
"L 103.654327 62.810089 \n",
"L 107.304852 60.084211 \n",
"L 110.692126 57.700407 \n",
"L 113.856191 55.604724 \n",
"L 116.827734 53.755022 \n",
"L 119.630892 52.117487 \n",
"L 122.285063 50.664405 \n",
"L 124.806116 49.372664 \n",
"L 127.207235 48.222724 \n",
"L 129.499519 47.197881 \n",
"L 131.692415 46.283724 \n",
"L 133.79405 45.467732 \n",
"L 135.811476 44.738963 \n",
"L 137.750858 44.087809 \n",
"\" clip-path=\"url(#p2d662c3392)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"m93d53f9d16\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p2d662c3392)\">\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"57.878125\" y=\"108.253846\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"70.084369\" y=\"94.315384\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"78.348688\" y=\"85.608097\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"84.908015\" y=\"79.108408\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"90.459611\" y=\"73.910729\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"95.325825\" y=\"69.601347\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"99.686585\" y=\"65.949499\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"103.654327\" y=\"62.810089\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"107.304852\" y=\"60.084211\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"110.692126\" y=\"57.700407\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"113.856191\" y=\"55.604724\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"116.827734\" y=\"53.755022\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"119.630892\" y=\"52.117487\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"122.285063\" y=\"50.664405\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"124.806116\" y=\"49.372664\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"127.207235\" y=\"48.222724\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"129.499519\" y=\"47.197881\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"131.692415\" y=\"46.283724\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"133.79405\" y=\"45.467732\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"135.811476\" y=\"44.738963\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m93d53f9d16\" x=\"137.750858\" y=\"44.087809\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"PathCollection_1\"/>\n",
" <g id=\"PathCollection_2\">\n",
" <path d=\"M 97.100868 7.2 \n",
"L 94.496869 7.855929 \n",
"L 91.44531 8.645091 \n",
"L 88.393757 9.454749 \n",
"L 85.342191 10.284908 \n",
"L 83.908323 10.684614 \n",
"L 82.290631 11.19569 \n",
"L 79.239071 12.182998 \n",
"L 76.187512 13.193534 \n",
"L 73.307377 14.16923 \n",
"L 73.135938 14.236244 \n",
"L 70.084378 15.455857 \n",
"L 67.032818 16.702275 \n",
"L 64.752171 17.653845 \n",
"L 63.981244 18.033987 \n",
"L 60.929685 19.570384 \n",
"L 57.878125 21.138461 \n",
"L 54.826565 23.093713 \n",
"L 52.48604 24.623076 \n",
"L 51.775006 25.220431 \n",
"L 48.723432 27.833908 \n",
"L 48.409729 28.107691 \n",
"L 45.671872 31.452924 \n",
"L 45.55989 31.592307 \n",
"L 43.880133 35.076923 \n",
"L 43.320214 38.561539 \n",
"L 43.880133 42.046154 \n",
"L 45.55989 45.530769 \n",
"L 45.671872 45.670152 \n",
"L 48.409729 49.015384 \n",
"L 48.723432 49.289168 \n",
"L 51.775006 51.902645 \n",
"L 52.486042 52.500001 \n",
"L 54.826565 54.029361 \n",
"L 57.878125 55.984615 \n",
"L 60.929685 57.552692 \n",
"L 63.981244 59.089089 \n",
"L 64.752171 59.469231 \n",
"L 67.032818 60.4208 \n",
"L 70.084378 61.667219 \n",
"L 73.135938 62.886832 \n",
"L 73.307377 62.953845 \n",
"L 76.187512 63.929541 \n",
"L 79.239071 64.940078 \n",
"L 82.290631 65.927386 \n",
"L 83.908319 66.438459 \n",
"L 85.342191 66.838166 \n",
"L 88.393757 67.668327 \n",
"L 91.44531 68.477984 \n",
"L 94.496869 69.267147 \n",
"L 97.100868 69.923076 \n",
"L 97.548436 70.023948 \n",
"L 100.599996 70.69336 \n",
"L 103.651563 71.344434 \n",
"L 106.703122 71.977164 \n",
"L 109.754682 72.591556 \n",
"L 112.806249 73.187609 \n",
"L 113.968755 73.407692 \n",
"L 115.857816 73.731264 \n",
"L 118.909375 74.237361 \n",
"L 121.960942 74.726869 \n",
"L 125.012502 75.199781 \n",
"L 128.064069 75.6561 \n",
"L 131.115628 76.095825 \n",
"L 134.167188 76.518956 \n",
"L 136.969651 76.892308 \n",
"L 137.218755 76.922609 \n",
"L 140.270314 77.278645 \n",
"L 143.321874 77.619532 \n",
"L 146.373441 77.945268 \n",
"L 149.425 78.255852 \n",
"L 152.476564 78.551287 \n",
"L 155.528127 78.831573 \n",
"L 158.57969 79.096705 \n",
"L 161.631253 79.346689 \n",
"L 164.682813 79.581522 \n",
"L 167.734376 79.801203 \n",
"L 170.785939 80.005736 \n",
"L 173.837499 80.195117 \n",
"L 176.889062 80.369347 \n",
"L 177.034329 80.37692 \n",
"L 179.940626 80.516308 \n",
"L 182.992189 80.648722 \n",
"L 186.04375 80.767201 \n",
"L 189.095313 80.871738 \n",
"L 192.146877 80.962338 \n",
"L 195.198438 81.038999 \n",
"L 198.250001 81.101723 \n",
"L 201.301564 81.150507 \n",
"L 204.353126 81.185354 \n",
"L 207.404689 81.206262 \n",
"L 210.456251 81.213231 \n",
"L 213.507813 81.206262 \n",
"L 216.559376 81.185354 \n",
"L 219.610939 81.150507 \n",
"L 222.662501 81.101723 \n",
"L 225.714063 81.038999 \n",
"L 228.765626 80.962338 \n",
"L 231.817188 80.871738 \n",
"L 234.868751 80.767201 \n",
"L 237.920313 80.648722 \n",
"\" clip-path=\"url(#p2d662c3392)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_3\">\n",
" <path d=\"M 42.620312 81.038999 \n",
"L 45.671872 81.798647 \n",
"L 48.723432 82.544351 \n",
"L 51.775006 83.276123 \n",
"L 54.263642 83.861536 \n",
"L 54.826565 83.984147 \n",
"L 57.878125 84.635897 \n",
"L 60.929685 85.274743 \n",
"L 63.981244 85.900682 \n",
"L 67.032818 86.513718 \n",
"L 70.084378 87.113847 \n",
"L 71.29158 87.346153 \n",
"L 73.135938 87.676592 \n",
"L 76.187512 88.211301 \n",
"L 79.239071 88.733993 \n",
"L 82.290631 89.244669 \n",
"L 85.342191 89.74333 \n",
"L 88.393757 90.229974 \n",
"L 91.44531 90.704602 \n",
"L 92.277553 90.830769 \n",
"L 94.496869 91.145507 \n",
"L 97.548436 91.567034 \n",
"L 100.599996 91.977319 \n",
"L 103.651563 92.376364 \n",
"L 106.703122 92.764167 \n",
"L 109.754682 93.14073 \n",
"L 112.806249 93.506052 \n",
"L 115.857816 93.860135 \n",
"L 118.909375 94.202976 \n",
"L 119.943828 94.315385 \n",
"L 121.960942 94.521292 \n",
"L 125.012502 94.822237 \n",
"L 128.064069 95.112621 \n",
"L 131.115628 95.392446 \n",
"L 134.167188 95.661713 \n",
"L 137.218755 95.92042 \n",
"L 140.270314 96.168567 \n",
"L 143.321874 96.406152 \n",
"L 146.373441 96.633182 \n",
"L 149.425 96.84965 \n",
"L 152.476564 97.055559 \n",
"L 155.528127 97.25091 \n",
"L 158.57969 97.435699 \n",
"L 161.631253 97.60993 \n",
"L 164.682813 97.773602 \n",
"L 165.20896 97.800001 \n",
"L 167.734376 97.919472 \n",
"L 170.785939 98.053879 \n",
"L 173.837499 98.17833 \n",
"L 176.889062 98.292825 \n",
"L 179.940626 98.397363 \n",
"L 182.992189 98.491946 \n",
"L 186.04375 98.576571 \n",
"L 189.095313 98.651243 \n",
"L 192.146877 98.715957 \n",
"L 195.198438 98.770714 \n",
"L 198.250001 98.815518 \n",
"L 201.301564 98.850364 \n",
"L 204.353126 98.875252 \n",
"L 207.404689 98.890187 \n",
"L 210.456251 98.895165 \n",
"L 213.507813 98.890187 \n",
"L 216.559376 98.875252 \n",
"L 219.610939 98.850364 \n",
"L 222.662501 98.815518 \n",
"L 225.714063 98.770714 \n",
"L 228.765626 98.715957 \n",
"L 231.817188 98.651243 \n",
"L 234.868751 98.576571 \n",
"L 237.920313 98.491946 \n",
"\" clip-path=\"url(#p2d662c3392)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_4\">\n",
" <path d=\"M 42.620312 98.770713 \n",
"L 45.671872 99.313318 \n",
"L 48.723432 99.845966 \n",
"L 51.775006 100.368661 \n",
"L 54.826565 100.881396 \n",
"L 57.27387 101.284618 \n",
"L 57.878125 101.378794 \n",
"L 60.929685 101.844979 \n",
"L 63.981244 102.301746 \n",
"L 67.032818 102.749096 \n",
"L 70.084378 103.187027 \n",
"L 73.135938 103.615539 \n",
"L 76.187512 104.034638 \n",
"L 79.239071 104.444316 \n",
"L 81.716227 104.769234 \n",
"L 82.290631 104.840712 \n",
"L 85.342191 105.211508 \n",
"L 88.393757 105.573373 \n",
"L 91.44531 105.926301 \n",
"L 94.496869 106.270296 \n",
"L 97.548436 106.605354 \n",
"L 100.599996 106.931478 \n",
"L 103.651563 107.248667 \n",
"L 106.703122 107.556923 \n",
"L 109.754682 107.856243 \n",
"L 112.806249 108.146626 \n",
"L 113.968764 108.253846 \n",
"L 115.857816 108.419576 \n",
"L 118.909375 108.6788 \n",
"L 121.960942 108.929523 \n",
"L 125.012502 109.171742 \n",
"L 128.064069 109.405468 \n",
"L 131.115628 109.630695 \n",
"L 134.167188 109.847419 \n",
"L 137.218755 110.055645 \n",
"L 140.270314 110.255373 \n",
"L 143.321874 110.446602 \n",
"L 146.373441 110.629333 \n",
"L 149.425 110.803565 \n",
"L 152.476564 110.969295 \n",
"L 155.528127 111.126526 \n",
"L 158.57969 111.275263 \n",
"L 161.631253 111.415497 \n",
"L 164.682813 111.547229 \n",
"L 167.734376 111.670467 \n",
"L 169.542874 111.738466 \n",
"L 170.785939 111.783032 \n",
"L 173.837499 111.884327 \n",
"L 176.889062 111.97752 \n",
"L 179.940626 112.062609 \n",
"L 182.992189 112.139595 \n",
"L 186.04375 112.208477 \n",
"L 189.095313 112.269257 \n",
"L 192.146877 112.32193 \n",
"L 195.198438 112.366503 \n",
"L 198.250001 112.402969 \n",
"L 201.301564 112.431332 \n",
"L 204.353126 112.451592 \n",
"L 207.404689 112.463745 \n",
"L 210.456251 112.467798 \n",
"L 213.507813 112.463745 \n",
"L 216.559376 112.451592 \n",
"L 219.610939 112.431332 \n",
"L 222.662501 112.402969 \n",
"L 225.714063 112.366503 \n",
"L 228.765626 112.32193 \n",
"L 231.817188 112.269257 \n",
"L 234.868751 112.208477 \n",
"L 237.920313 112.139595 \n",
"\" clip-path=\"url(#p2d662c3392)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_5\">\n",
" <path d=\"M 42.620312 112.366503 \n",
"L 45.671872 112.808155 \n",
"L 48.723432 113.241707 \n",
"L 51.775006 113.667157 \n",
"L 54.826565 114.084499 \n",
"L 57.878125 114.493739 \n",
"L 60.929685 114.894875 \n",
"L 63.477901 115.223078 \n",
"L 63.981244 115.285026 \n",
"L 67.032818 115.652848 \n",
"L 70.084378 116.012922 \n",
"L 73.135938 116.365258 \n",
"L 76.187512 116.709846 \n",
"L 79.239071 117.046692 \n",
"L 82.290631 117.375794 \n",
"L 85.342191 117.697154 \n",
"L 88.393757 118.01077 \n",
"L 91.44531 118.31664 \n",
"L 94.496869 118.614767 \n",
"L 95.47338 118.70769 \n",
"L 97.548436 118.896748 \n",
"L 100.599996 119.167362 \n",
"L 103.651563 119.430562 \n",
"L 106.703122 119.686349 \n",
"L 109.754682 119.934722 \n",
"L 112.806249 120.175678 \n",
"L 115.857816 120.409221 \n",
"L 118.909375 120.635353 \n",
"L 121.960942 120.854069 \n",
"L 125.012502 121.065367 \n",
"L 128.064069 121.269256 \n",
"L 131.115628 121.465731 \n",
"L 134.167188 121.654788 \n",
"L 137.218755 121.836432 \n",
"L 140.270314 122.010663 \n",
"L 143.321874 122.17748 \n",
"L 143.605786 122.192311 \n",
"L 146.373441 122.330983 \n",
"L 149.425 122.476769 \n",
"L 152.476564 122.615441 \n",
"L 155.528127 122.747002 \n",
"L 158.57969 122.871455 \n",
"L 161.631253 122.988794 \n",
"L 164.682813 123.099019 \n",
"L 167.734376 123.202136 \n",
"L 170.785939 123.298142 \n",
"L 173.837499 123.387034 \n",
"L 176.889062 123.468815 \n",
"L 179.940626 123.543485 \n",
"L 182.992189 123.611044 \n",
"L 186.04375 123.671492 \n",
"L 189.095313 123.724829 \n",
"L 192.146877 123.771052 \n",
"L 195.198438 123.810167 \n",
"L 198.250001 123.842168 \n",
"L 201.301564 123.867058 \n",
"L 204.353126 123.884837 \n",
"L 207.404689 123.895502 \n",
"L 210.456251 123.899059 \n",
"L 213.507813 123.895502 \n",
"L 216.559376 123.884837 \n",
"L 219.610939 123.867058 \n",
"L 222.662501 123.842168 \n",
"L 225.714063 123.810167 \n",
"L 228.765626 123.771052 \n",
"L 231.817188 123.724829 \n",
"L 234.868751 123.671492 \n",
"L 237.920313 123.611044 \n",
"\" clip-path=\"url(#p2d662c3392)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_6\">\n",
" <path d=\"M 42.620312 123.810167 \n",
"L 45.671872 124.197739 \n",
"L 48.723432 124.578203 \n",
"L 51.775006 124.951556 \n",
"L 54.826565 125.317795 \n",
"L 57.878125 125.676923 \n",
"L 60.929685 126.015136 \n",
"L 63.981244 126.346516 \n",
"L 67.032818 126.671065 \n",
"L 70.084378 126.988778 \n",
"L 73.135938 127.299663 \n",
"L 76.187512 127.603712 \n",
"L 79.239071 127.900929 \n",
"L 82.290631 128.191314 \n",
"L 85.342191 128.474866 \n",
"L 88.393757 128.751587 \n",
"L 91.44531 129.021472 \n",
"L 93.070112 129.161535 \n",
"L 94.496869 129.279884 \n",
"L 97.548436 129.526437 \n",
"L 100.599996 129.766414 \n",
"L 103.651563 129.999818 \n",
"L 106.703122 130.226647 \n",
"L 109.754682 130.446903 \n",
"L 112.806249 130.66058 \n",
"L 115.857816 130.867685 \n",
"L 118.909375 131.068216 \n",
"L 121.960942 131.262171 \n",
"L 125.012502 131.449548 \n",
"L 128.064069 131.630355 \n",
"L 131.115628 131.804587 \n",
"L 134.167188 131.972242 \n",
"L 137.218755 132.133322 \n",
"L 140.270314 132.287828 \n",
"L 143.321874 132.43576 \n",
"L 146.373441 132.577118 \n",
"L 147.936485 132.646155 \n",
"L 149.425 132.70951 \n",
"L 152.476564 132.833055 \n",
"L 155.528127 132.950264 \n",
"L 158.57969 133.06114 \n",
"L 161.631253 133.165679 \n",
"L 164.682813 133.263879 \n",
"L 167.734376 133.355747 \n",
"L 170.785939 133.44128 \n",
"L 173.837499 133.520475 \n",
"L 176.889062 133.593334 \n",
"L 179.940626 133.659858 \n",
"L 182.992189 133.720047 \n",
"L 186.04375 133.773901 \n",
"L 189.095313 133.82142 \n",
"L 192.146877 133.8626 \n",
"L 195.198438 133.897448 \n",
"L 198.250001 133.925958 \n",
"L 201.301564 133.948133 \n",
"L 204.353126 133.963972 \n",
"L 207.404689 133.973474 \n",
"L 210.456251 133.976643 \n",
"L 213.507813 133.973474 \n",
"L 216.559376 133.963972 \n",
"L 219.610939 133.948133 \n",
"L 222.662501 133.925958 \n",
"L 225.714063 133.897448 \n",
"L 228.765626 133.8626 \n",
"L 231.817188 133.82142 \n",
"L 234.868751 133.773901 \n",
"L 237.920313 133.720047 \n",
"\" clip-path=\"url(#p2d662c3392)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_7\">\n",
" <path d=\"M 42.620312 133.897445 \n",
"L 45.671872 134.242743 \n",
"L 48.723432 134.581696 \n",
"L 51.775006 134.914323 \n",
"L 54.826565 135.240612 \n",
"L 57.878125 135.560557 \n",
"L 60.929685 135.874175 \n",
"L 63.477871 136.130768 \n",
"L 63.981244 136.179677 \n",
"L 67.032818 136.470059 \n",
"L 70.084378 136.754332 \n",
"L 73.135938 137.032489 \n",
"L 76.187512 137.304535 \n",
"L 79.239071 137.570466 \n",
"L 82.290631 137.830284 \n",
"L 85.342191 138.083987 \n",
"L 88.393757 138.331578 \n",
"L 91.44531 138.573055 \n",
"L 94.496869 138.80842 \n",
"L 97.548436 139.037672 \n",
"L 100.599996 139.260809 \n",
"L 103.651563 139.477831 \n",
"L 105.641791 139.615388 \n",
"L 106.703122 139.686256 \n",
"L 109.754682 139.88411 \n",
"L 112.806249 140.076061 \n",
"L 115.857816 140.262103 \n",
"L 118.909375 140.442243 \n",
"L 121.960942 140.616473 \n",
"L 125.012502 140.784796 \n",
"L 128.064069 140.947215 \n",
"L 131.115628 141.103726 \n",
"L 134.167188 141.254334 \n",
"L 137.218755 141.399034 \n",
"L 140.270314 141.537831 \n",
"L 143.321874 141.670719 \n",
"L 146.373441 141.797698 \n",
"L 149.425 141.918775 \n",
"L 152.476564 142.033943 \n",
"L 155.528127 142.143209 \n",
"L 158.57969 142.246565 \n",
"L 161.631253 142.344014 \n",
"L 164.682813 142.435559 \n",
"L 167.734376 142.521202 \n",
"L 170.785939 142.60093 \n",
"L 173.837499 142.674761 \n",
"L 176.889062 142.742678 \n",
"L 179.940626 142.804692 \n",
"L 182.992189 142.860804 \n",
"L 186.04375 142.911006 \n",
"L 189.095313 142.9553 \n",
"L 192.146877 142.993692 \n",
"L 195.198438 143.026174 \n",
"L 198.250001 143.052749 \n",
"L 201.301564 143.07342 \n",
"L 204.353126 143.088189 \n",
"L 207.404689 143.097049 \n",
"L 210.456251 143.1 \n",
"\" clip-path=\"url(#p2d662c3392)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" <path d=\"M 210.456251 143.1 \n",
"L 213.507813 143.097049 \n",
"L 216.559376 143.088189 \n",
"L 219.610939 143.07342 \n",
"L 222.662501 143.052749 \n",
"L 225.714063 143.026174 \n",
"L 228.765626 142.993692 \n",
"L 231.817188 142.9553 \n",
"L 234.868751 142.911006 \n",
"L 237.920313 142.860804 \n",
"\" clip-path=\"url(#p2d662c3392)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_8\">\n",
" <path d=\"M 42.620312 143.026174 \n",
"L 43.320206 143.1 \n",
"\" clip-path=\"url(#p2d662c3392)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_9\"/>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 42.620312 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 42.620312 7.2 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p2d662c3392\">\n",
" <rect x=\"42.620312\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
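],
"source": [
"def adagrad_2d(x1, x2, s1, s2):\n",
"    eps = 1e-6  # small constant for numerical stability\n",
"    # Gradients of f(x1, x2) = 0.1 * x1^2 + 2 * x2^2\n",
"    g1, g2 = 0.2 * x1, 4 * x2\n",
"    # Accumulate squared gradients per coordinate\n",
"    s1 += g1 ** 2\n",
"    s2 += g2 ** 2\n",
"    # Scale each coordinate's step by the inverse root of its accumulator\n",
"    x1 -= eta / math.sqrt(s1 + eps) * g1\n",
"    x2 -= eta / math.sqrt(s2 + eps) * g2\n",
"    return x1, x2, s1, s2\n",
"\n",
"def f_2d(x1, x2):\n",
"    return 0.1 * x1 ** 2 + 2 * x2 ** 2\n",
"\n",
"eta = 0.4\n",
"d2l.show_trace_2d(f_2d, d2l.train_2d(adagrad_2d))"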
]
},
{
"cell_type": "markdown",
"id": "a777f665",
"metadata": {
"origin_pos": 6
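},
"source": [
"Increasing the learning rate to $2$ gives much better behavior: after 20 epochs the iterate is close to the optimum ($x_1 \\approx -0.0023$, $x_2 \\approx 0$).\n",
"This already indicates that the decay of the learning rate can be rather aggressive, even in the noise-free case, so we need to ensure that the parameters still converge appropriately.\n"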
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "7f344858",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:07:37.819405Z",
"iopub.status.busy": "2023-08-18T07:07:37.819092Z",
"iopub.status.idle": "2023-08-18T07:07:37.956764Z",
"shell.execute_reply": "2023-08-18T07:07:37.955622Z"
},
"origin_pos": 7,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 20, x1: -0.002295, x2: -0.000000\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"245.120313pt\" height=\"180.65625pt\" viewBox=\"0 0 245.120313 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:07:37.925339</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 245.120313 180.65625 \n",
"L 245.120313 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"L 42.620312 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <defs>\n",
" <path id=\"m6f3a720570\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m6f3a720570\" x=\"88.39375\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 4 -->\n",
" <g transform=\"translate(81.022656 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
"L 4684 2272 \n",
"L 4684 1741 \n",
"L 678 1741 \n",
"L 678 2272 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-34\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_2\">\n",
" <g>\n",
" <use xlink:href=\"#m6f3a720570\" x=\"149.425\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(142.053907 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_3\">\n",
" <g>\n",
" <use xlink:href=\"#m6f3a720570\" x=\"210.456251\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(207.275001 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- x1 -->\n",
" <g transform=\"translate(134.129687 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
"L 2247 1797 \n",
"L 3578 0 \n",
"L 2900 0 \n",
"L 1881 1375 \n",
"L 863 0 \n",
"L 184 0 \n",
"L 1544 1831 \n",
"L 300 3500 \n",
"L 978 3500 \n",
"L 1906 2253 \n",
"L 2834 3500 \n",
"L 3513 3500 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_4\">\n",
" <defs>\n",
" <path id=\"m1af605cfd9\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m1af605cfd9\" x=\"42.620312\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 3 -->\n",
" <g transform=\"translate(20.878125 146.899219)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_5\">\n",
" <g>\n",
" <use xlink:href=\"#m1af605cfd9\" x=\"42.620312\" y=\"108.253846\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(20.878125 112.053065)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m1af605cfd9\" x=\"42.620312\" y=\"73.407692\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 1 -->\n",
" <g transform=\"translate(20.878125 77.206911)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-2212\"/>\n",
" <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_7\">\n",
" <g>\n",
" <use xlink:href=\"#m1af605cfd9\" x=\"42.620312\" y=\"38.561538\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(29.257812 42.360757)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- x2 -->\n",
" <g transform=\"translate(14.798437 81.290625)rotate(-90)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-78\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"59.179688\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <path d=\"M 57.878125 108.253846 \n",
"L 118.909345 38.561538 \n",
"L 150.30966 38.561538 \n",
"L 169.853409 38.561538 \n",
"L 182.748839 38.561538 \n",
"L 191.45862 38.561538 \n",
"L 197.40211 38.561538 \n",
"L 201.47704 38.561538 \n",
"L 204.277004 38.561538 \n",
"L 206.202904 38.561538 \n",
"L 207.528241 38.561538 \n",
"L 208.440503 38.561538 \n",
"L 209.068504 38.561538 \n",
"L 209.500842 38.561538 \n",
"L 209.798486 38.561538 \n",
"L 210.003402 38.561538 \n",
"L 210.144479 38.561538 \n",
"L 210.241606 38.561538 \n",
"L 210.308475 38.561538 \n",
"L 210.354512 38.561538 \n",
"L 210.386207 38.561538 \n",
"\" clip-path=\"url(#p15c6aa45f6)\" style=\"fill: none; stroke: #ff7f0e; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" <defs>\n",
" <path id=\"m2bdca4118c\" d=\"M 0 3 \n",
"C 0.795609 3 1.55874 2.683901 2.12132 2.12132 \n",
"C 2.683901 1.55874 3 0.795609 3 0 \n",
"C 3 -0.795609 2.683901 -1.55874 2.12132 -2.12132 \n",
"C 1.55874 -2.683901 0.795609 -3 0 -3 \n",
"C -0.795609 -3 -1.55874 -2.683901 -2.12132 -2.12132 \n",
"C -2.683901 -1.55874 -3 -0.795609 -3 0 \n",
"C -3 0.795609 -2.683901 1.55874 -2.12132 2.12132 \n",
"C -1.55874 2.683901 -0.795609 3 0 3 \n",
"z\n",
"\" style=\"stroke: #ff7f0e\"/>\n",
" </defs>\n",
" <g clip-path=\"url(#p15c6aa45f6)\">\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"57.878125\" y=\"108.253846\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"118.909345\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"150.30966\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"169.853409\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"182.748839\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"191.45862\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"197.40211\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"201.47704\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"204.277004\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"206.202904\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"207.528241\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"208.440503\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"209.068504\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"209.500842\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"209.798486\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"210.003402\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"210.144479\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"210.241606\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"210.308475\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"210.354512\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" <use xlink:href=\"#m2bdca4118c\" x=\"210.386207\" y=\"38.561538\" style=\"fill: #ff7f0e; stroke: #ff7f0e\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"PathCollection_1\"/>\n",
" <g id=\"PathCollection_2\">\n",
" <path d=\"M 97.100868 7.2 \n",
"L 94.496869 7.855929 \n",
"L 91.44531 8.645091 \n",
"L 88.393757 9.454749 \n",
"L 85.342191 10.284908 \n",
"L 83.908323 10.684614 \n",
"L 82.290631 11.19569 \n",
"L 79.239071 12.182998 \n",
"L 76.187512 13.193534 \n",
"L 73.307377 14.16923 \n",
"L 73.135938 14.236244 \n",
"L 70.084378 15.455857 \n",
"L 67.032818 16.702275 \n",
"L 64.752171 17.653845 \n",
"L 63.981244 18.033987 \n",
"L 60.929685 19.570384 \n",
"L 57.878125 21.138461 \n",
"L 54.826565 23.093713 \n",
"L 52.48604 24.623076 \n",
"L 51.775006 25.220431 \n",
"L 48.723432 27.833908 \n",
"L 48.409729 28.107691 \n",
"L 45.671872 31.452924 \n",
"L 45.55989 31.592307 \n",
"L 43.880133 35.076923 \n",
"L 43.320214 38.561539 \n",
"L 43.880133 42.046154 \n",
"L 45.55989 45.530769 \n",
"L 45.671872 45.670152 \n",
"L 48.409729 49.015384 \n",
"L 48.723432 49.289168 \n",
"L 51.775006 51.902645 \n",
"L 52.486042 52.500001 \n",
"L 54.826565 54.029361 \n",
"L 57.878125 55.984615 \n",
"L 60.929685 57.552692 \n",
"L 63.981244 59.089089 \n",
"L 64.752171 59.469231 \n",
"L 67.032818 60.4208 \n",
"L 70.084378 61.667219 \n",
"L 73.135938 62.886832 \n",
"L 73.307377 62.953845 \n",
"L 76.187512 63.929541 \n",
"L 79.239071 64.940078 \n",
"L 82.290631 65.927386 \n",
"L 83.908319 66.438459 \n",
"L 85.342191 66.838166 \n",
"L 88.393757 67.668327 \n",
"L 91.44531 68.477984 \n",
"L 94.496869 69.267147 \n",
"L 97.100868 69.923076 \n",
"L 97.548436 70.023948 \n",
"L 100.599996 70.69336 \n",
"L 103.651563 71.344434 \n",
"L 106.703122 71.977164 \n",
"L 109.754682 72.591556 \n",
"L 112.806249 73.187609 \n",
"L 113.968755 73.407692 \n",
"L 115.857816 73.731264 \n",
"L 118.909375 74.237361 \n",
"L 121.960942 74.726869 \n",
"L 125.012502 75.199781 \n",
"L 128.064069 75.6561 \n",
"L 131.115628 76.095825 \n",
"L 134.167188 76.518956 \n",
"L 136.969651 76.892308 \n",
"L 137.218755 76.922609 \n",
"L 140.270314 77.278645 \n",
"L 143.321874 77.619532 \n",
"L 146.373441 77.945268 \n",
"L 149.425 78.255852 \n",
"L 152.476564 78.551287 \n",
"L 155.528127 78.831573 \n",
"L 158.57969 79.096705 \n",
"L 161.631253 79.346689 \n",
"L 164.682813 79.581522 \n",
"L 167.734376 79.801203 \n",
"L 170.785939 80.005736 \n",
"L 173.837499 80.195117 \n",
"L 176.889062 80.369347 \n",
"L 177.034329 80.37692 \n",
"L 179.940626 80.516308 \n",
"L 182.992189 80.648722 \n",
"L 186.04375 80.767201 \n",
"L 189.095313 80.871738 \n",
"L 192.146877 80.962338 \n",
"L 195.198438 81.038999 \n",
"L 198.250001 81.101723 \n",
"L 201.301564 81.150507 \n",
"L 204.353126 81.185354 \n",
"L 207.404689 81.206262 \n",
"L 210.456251 81.213231 \n",
"L 213.507813 81.206262 \n",
"L 216.559376 81.185354 \n",
"L 219.610939 81.150507 \n",
"L 222.662501 81.101723 \n",
"L 225.714063 81.038999 \n",
"L 228.765626 80.962338 \n",
"L 231.817188 80.871738 \n",
"L 234.868751 80.767201 \n",
"L 237.920313 80.648722 \n",
"\" clip-path=\"url(#p15c6aa45f6)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_3\">\n",
" <path d=\"M 42.620312 81.038999 \n",
"L 45.671872 81.798647 \n",
"L 48.723432 82.544351 \n",
"L 51.775006 83.276123 \n",
"L 54.263642 83.861536 \n",
"L 54.826565 83.984147 \n",
"L 57.878125 84.635897 \n",
"L 60.929685 85.274743 \n",
"L 63.981244 85.900682 \n",
"L 67.032818 86.513718 \n",
"L 70.084378 87.113847 \n",
"L 71.29158 87.346153 \n",
"L 73.135938 87.676592 \n",
"L 76.187512 88.211301 \n",
"L 79.239071 88.733993 \n",
"L 82.290631 89.244669 \n",
"L 85.342191 89.74333 \n",
"L 88.393757 90.229974 \n",
"L 91.44531 90.704602 \n",
"L 92.277553 90.830769 \n",
"L 94.496869 91.145507 \n",
"L 97.548436 91.567034 \n",
"L 100.599996 91.977319 \n",
"L 103.651563 92.376364 \n",
"L 106.703122 92.764167 \n",
"L 109.754682 93.14073 \n",
"L 112.806249 93.506052 \n",
"L 115.857816 93.860135 \n",
"L 118.909375 94.202976 \n",
"L 119.943828 94.315385 \n",
"L 121.960942 94.521292 \n",
"L 125.012502 94.822237 \n",
"L 128.064069 95.112621 \n",
"L 131.115628 95.392446 \n",
"L 134.167188 95.661713 \n",
"L 137.218755 95.92042 \n",
"L 140.270314 96.168567 \n",
"L 143.321874 96.406152 \n",
"L 146.373441 96.633182 \n",
"L 149.425 96.84965 \n",
"L 152.476564 97.055559 \n",
"L 155.528127 97.25091 \n",
"L 158.57969 97.435699 \n",
"L 161.631253 97.60993 \n",
"L 164.682813 97.773602 \n",
"L 165.20896 97.800001 \n",
"L 167.734376 97.919472 \n",
"L 170.785939 98.053879 \n",
"L 173.837499 98.17833 \n",
"L 176.889062 98.292825 \n",
"L 179.940626 98.397363 \n",
"L 182.992189 98.491946 \n",
"L 186.04375 98.576571 \n",
"L 189.095313 98.651243 \n",
"L 192.146877 98.715957 \n",
"L 195.198438 98.770714 \n",
"L 198.250001 98.815518 \n",
"L 201.301564 98.850364 \n",
"L 204.353126 98.875252 \n",
"L 207.404689 98.890187 \n",
"L 210.456251 98.895165 \n",
"L 213.507813 98.890187 \n",
"L 216.559376 98.875252 \n",
"L 219.610939 98.850364 \n",
"L 222.662501 98.815518 \n",
"L 225.714063 98.770714 \n",
"L 228.765626 98.715957 \n",
"L 231.817188 98.651243 \n",
"L 234.868751 98.576571 \n",
"L 237.920313 98.491946 \n",
"\" clip-path=\"url(#p15c6aa45f6)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_4\">\n",
" <path d=\"M 42.620312 98.770713 \n",
"L 45.671872 99.313318 \n",
"L 48.723432 99.845966 \n",
"L 51.775006 100.368661 \n",
"L 54.826565 100.881396 \n",
"L 57.27387 101.284618 \n",
"L 57.878125 101.378794 \n",
"L 60.929685 101.844979 \n",
"L 63.981244 102.301746 \n",
"L 67.032818 102.749096 \n",
"L 70.084378 103.187027 \n",
"L 73.135938 103.615539 \n",
"L 76.187512 104.034638 \n",
"L 79.239071 104.444316 \n",
"L 81.716227 104.769234 \n",
"L 82.290631 104.840712 \n",
"L 85.342191 105.211508 \n",
"L 88.393757 105.573373 \n",
"L 91.44531 105.926301 \n",
"L 94.496869 106.270296 \n",
"L 97.548436 106.605354 \n",
"L 100.599996 106.931478 \n",
"L 103.651563 107.248667 \n",
"L 106.703122 107.556923 \n",
"L 109.754682 107.856243 \n",
"L 112.806249 108.146626 \n",
"L 113.968764 108.253846 \n",
"L 115.857816 108.419576 \n",
"L 118.909375 108.6788 \n",
"L 121.960942 108.929523 \n",
"L 125.012502 109.171742 \n",
"L 128.064069 109.405468 \n",
"L 131.115628 109.630695 \n",
"L 134.167188 109.847419 \n",
"L 137.218755 110.055645 \n",
"L 140.270314 110.255373 \n",
"L 143.321874 110.446602 \n",
"L 146.373441 110.629333 \n",
"L 149.425 110.803565 \n",
"L 152.476564 110.969295 \n",
"L 155.528127 111.126526 \n",
"L 158.57969 111.275263 \n",
"L 161.631253 111.415497 \n",
"L 164.682813 111.547229 \n",
"L 167.734376 111.670467 \n",
"L 169.542874 111.738466 \n",
"L 170.785939 111.783032 \n",
"L 173.837499 111.884327 \n",
"L 176.889062 111.97752 \n",
"L 179.940626 112.062609 \n",
"L 182.992189 112.139595 \n",
"L 186.04375 112.208477 \n",
"L 189.095313 112.269257 \n",
"L 192.146877 112.32193 \n",
"L 195.198438 112.366503 \n",
"L 198.250001 112.402969 \n",
"L 201.301564 112.431332 \n",
"L 204.353126 112.451592 \n",
"L 207.404689 112.463745 \n",
"L 210.456251 112.467798 \n",
"L 213.507813 112.463745 \n",
"L 216.559376 112.451592 \n",
"L 219.610939 112.431332 \n",
"L 222.662501 112.402969 \n",
"L 225.714063 112.366503 \n",
"L 228.765626 112.32193 \n",
"L 231.817188 112.269257 \n",
"L 234.868751 112.208477 \n",
"L 237.920313 112.139595 \n",
"\" clip-path=\"url(#p15c6aa45f6)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_5\">\n",
" <path d=\"M 42.620312 112.366503 \n",
"L 45.671872 112.808155 \n",
"L 48.723432 113.241707 \n",
"L 51.775006 113.667157 \n",
"L 54.826565 114.084499 \n",
"L 57.878125 114.493739 \n",
"L 60.929685 114.894875 \n",
"L 63.477901 115.223078 \n",
"L 63.981244 115.285026 \n",
"L 67.032818 115.652848 \n",
"L 70.084378 116.012922 \n",
"L 73.135938 116.365258 \n",
"L 76.187512 116.709846 \n",
"L 79.239071 117.046692 \n",
"L 82.290631 117.375794 \n",
"L 85.342191 117.697154 \n",
"L 88.393757 118.01077 \n",
"L 91.44531 118.31664 \n",
"L 94.496869 118.614767 \n",
"L 95.47338 118.70769 \n",
"L 97.548436 118.896748 \n",
"L 100.599996 119.167362 \n",
"L 103.651563 119.430562 \n",
"L 106.703122 119.686349 \n",
"L 109.754682 119.934722 \n",
"L 112.806249 120.175678 \n",
"L 115.857816 120.409221 \n",
"L 118.909375 120.635353 \n",
"L 121.960942 120.854069 \n",
"L 125.012502 121.065367 \n",
"L 128.064069 121.269256 \n",
"L 131.115628 121.465731 \n",
"L 134.167188 121.654788 \n",
"L 137.218755 121.836432 \n",
"L 140.270314 122.010663 \n",
"L 143.321874 122.17748 \n",
"L 143.605786 122.192311 \n",
"L 146.373441 122.330983 \n",
"L 149.425 122.476769 \n",
"L 152.476564 122.615441 \n",
"L 155.528127 122.747002 \n",
"L 158.57969 122.871455 \n",
"L 161.631253 122.988794 \n",
"L 164.682813 123.099019 \n",
"L 167.734376 123.202136 \n",
"L 170.785939 123.298142 \n",
"L 173.837499 123.387034 \n",
"L 176.889062 123.468815 \n",
"L 179.940626 123.543485 \n",
"L 182.992189 123.611044 \n",
"L 186.04375 123.671492 \n",
"L 189.095313 123.724829 \n",
"L 192.146877 123.771052 \n",
"L 195.198438 123.810167 \n",
"L 198.250001 123.842168 \n",
"L 201.301564 123.867058 \n",
"L 204.353126 123.884837 \n",
"L 207.404689 123.895502 \n",
"L 210.456251 123.899059 \n",
"L 213.507813 123.895502 \n",
"L 216.559376 123.884837 \n",
"L 219.610939 123.867058 \n",
"L 222.662501 123.842168 \n",
"L 225.714063 123.810167 \n",
"L 228.765626 123.771052 \n",
"L 231.817188 123.724829 \n",
"L 234.868751 123.671492 \n",
"L 237.920313 123.611044 \n",
"\" clip-path=\"url(#p15c6aa45f6)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_6\">\n",
" <path d=\"M 42.620312 123.810167 \n",
"L 45.671872 124.197739 \n",
"L 48.723432 124.578203 \n",
"L 51.775006 124.951556 \n",
"L 54.826565 125.317795 \n",
"L 57.878125 125.676923 \n",
"L 60.929685 126.015136 \n",
"L 63.981244 126.346516 \n",
"L 67.032818 126.671065 \n",
"L 70.084378 126.988778 \n",
"L 73.135938 127.299663 \n",
"L 76.187512 127.603712 \n",
"L 79.239071 127.900929 \n",
"L 82.290631 128.191314 \n",
"L 85.342191 128.474866 \n",
"L 88.393757 128.751587 \n",
"L 91.44531 129.021472 \n",
"L 93.070112 129.161535 \n",
"L 94.496869 129.279884 \n",
"L 97.548436 129.526437 \n",
"L 100.599996 129.766414 \n",
"L 103.651563 129.999818 \n",
"L 106.703122 130.226647 \n",
"L 109.754682 130.446903 \n",
"L 112.806249 130.66058 \n",
"L 115.857816 130.867685 \n",
"L 118.909375 131.068216 \n",
"L 121.960942 131.262171 \n",
"L 125.012502 131.449548 \n",
"L 128.064069 131.630355 \n",
"L 131.115628 131.804587 \n",
"L 134.167188 131.972242 \n",
"L 137.218755 132.133322 \n",
"L 140.270314 132.287828 \n",
"L 143.321874 132.43576 \n",
"L 146.373441 132.577118 \n",
"L 147.936485 132.646155 \n",
"L 149.425 132.70951 \n",
"L 152.476564 132.833055 \n",
"L 155.528127 132.950264 \n",
"L 158.57969 133.06114 \n",
"L 161.631253 133.165679 \n",
"L 164.682813 133.263879 \n",
"L 167.734376 133.355747 \n",
"L 170.785939 133.44128 \n",
"L 173.837499 133.520475 \n",
"L 176.889062 133.593334 \n",
"L 179.940626 133.659858 \n",
"L 182.992189 133.720047 \n",
"L 186.04375 133.773901 \n",
"L 189.095313 133.82142 \n",
"L 192.146877 133.8626 \n",
"L 195.198438 133.897448 \n",
"L 198.250001 133.925958 \n",
"L 201.301564 133.948133 \n",
"L 204.353126 133.963972 \n",
"L 207.404689 133.973474 \n",
"L 210.456251 133.976643 \n",
"L 213.507813 133.973474 \n",
"L 216.559376 133.963972 \n",
"L 219.610939 133.948133 \n",
"L 222.662501 133.925958 \n",
"L 225.714063 133.897448 \n",
"L 228.765626 133.8626 \n",
"L 231.817188 133.82142 \n",
"L 234.868751 133.773901 \n",
"L 237.920313 133.720047 \n",
"\" clip-path=\"url(#p15c6aa45f6)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_7\">\n",
" <path d=\"M 42.620312 133.897445 \n",
"L 45.671872 134.242743 \n",
"L 48.723432 134.581696 \n",
"L 51.775006 134.914323 \n",
"L 54.826565 135.240612 \n",
"L 57.878125 135.560557 \n",
"L 60.929685 135.874175 \n",
"L 63.477871 136.130768 \n",
"L 63.981244 136.179677 \n",
"L 67.032818 136.470059 \n",
"L 70.084378 136.754332 \n",
"L 73.135938 137.032489 \n",
"L 76.187512 137.304535 \n",
"L 79.239071 137.570466 \n",
"L 82.290631 137.830284 \n",
"L 85.342191 138.083987 \n",
"L 88.393757 138.331578 \n",
"L 91.44531 138.573055 \n",
"L 94.496869 138.80842 \n",
"L 97.548436 139.037672 \n",
"L 100.599996 139.260809 \n",
"L 103.651563 139.477831 \n",
"L 105.641791 139.615388 \n",
"L 106.703122 139.686256 \n",
"L 109.754682 139.88411 \n",
"L 112.806249 140.076061 \n",
"L 115.857816 140.262103 \n",
"L 118.909375 140.442243 \n",
"L 121.960942 140.616473 \n",
"L 125.012502 140.784796 \n",
"L 128.064069 140.947215 \n",
"L 131.115628 141.103726 \n",
"L 134.167188 141.254334 \n",
"L 137.218755 141.399034 \n",
"L 140.270314 141.537831 \n",
"L 143.321874 141.670719 \n",
"L 146.373441 141.797698 \n",
"L 149.425 141.918775 \n",
"L 152.476564 142.033943 \n",
"L 155.528127 142.143209 \n",
"L 158.57969 142.246565 \n",
"L 161.631253 142.344014 \n",
"L 164.682813 142.435559 \n",
"L 167.734376 142.521202 \n",
"L 170.785939 142.60093 \n",
"L 173.837499 142.674761 \n",
"L 176.889062 142.742678 \n",
"L 179.940626 142.804692 \n",
"L 182.992189 142.860804 \n",
"L 186.04375 142.911006 \n",
"L 189.095313 142.9553 \n",
"L 192.146877 142.993692 \n",
"L 195.198438 143.026174 \n",
"L 198.250001 143.052749 \n",
"L 201.301564 143.07342 \n",
"L 204.353126 143.088189 \n",
"L 207.404689 143.097049 \n",
"L 210.456251 143.1 \n",
"\" clip-path=\"url(#p15c6aa45f6)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" <path d=\"M 210.456251 143.1 \n",
"L 213.507813 143.097049 \n",
"L 216.559376 143.088189 \n",
"L 219.610939 143.07342 \n",
"L 222.662501 143.052749 \n",
"L 225.714063 143.026174 \n",
"L 228.765626 142.993692 \n",
"L 231.817188 142.9553 \n",
"L 234.868751 142.911006 \n",
"L 237.920313 142.860804 \n",
"\" clip-path=\"url(#p15c6aa45f6)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_8\">\n",
" <path d=\"M 42.620312 143.026174 \n",
"L 43.320206 143.1 \n",
"\" clip-path=\"url(#p15c6aa45f6)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"PathCollection_9\"/>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 42.620312 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 237.920313 143.1 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 42.620312 143.1 \n",
"L 237.920313 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 42.620312 7.2 \n",
"L 237.920313 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p15c6aa45f6\">\n",
" <rect x=\"42.620312\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"eta = 2\n",
"d2l.show_trace_2d(f_2d, d2l.train_2d(adagrad_2d))"
]
},
{
"cell_type": "markdown",
"id": "0fb2a2e5",
"metadata": {
"origin_pos": 8
},
"source": [
"## Implementation from Scratch\n",
"\n",
"Just like the momentum method, AdaGrad needs to maintain a state variable of the same shape as each parameter.\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "e69612f9",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:07:37.961770Z",
"iopub.status.busy": "2023-08-18T07:07:37.961458Z",
"iopub.status.idle": "2023-08-18T07:07:37.967644Z",
"shell.execute_reply": "2023-08-18T07:07:37.966629Z"
},
"origin_pos": 10,
"tab": [
"pytorch"
]
},
"outputs": [],
"source": [
"def init_adagrad_states(feature_dim):\n",
" s_w = torch.zeros((feature_dim, 1))\n",
" s_b = torch.zeros(1)\n",
" return (s_w, s_b)\n",
"\n",
"def adagrad(params, states, hyperparams):\n",
" eps = 1e-6\n",
" for p, s in zip(params, states):\n",
" with torch.no_grad():\n",
" s[:] += torch.square(p.grad)\n",
" p[:] -= hyperparams['lr'] * p.grad / torch.sqrt(s + eps)\n",
" p.grad.data.zero_()"
]
},
{
"cell_type": "markdown",
"id": "3abf8b9a",
"metadata": {
"origin_pos": 13
},
"source": [
"与 :numref:`sec_minibatch_sgd`一节中的实验相比,这里使用更大的学习率来训练模型。\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "82c63a0b",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:07:37.971150Z",
"iopub.status.busy": "2023-08-18T07:07:37.970847Z",
"iopub.status.idle": "2023-08-18T07:07:40.594904Z",
"shell.execute_reply": "2023-08-18T07:07:40.593847Z"
},
"origin_pos": 14,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loss: 0.242, 0.012 sec/epoch\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"266.957813pt\" height=\"184.455469pt\" viewBox=\"0 0 266.957813 184.455469\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:07:40.558773</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M -0 184.455469 \n",
"L 266.957813 184.455469 \n",
"L 266.957813 0 \n",
"L -0 0 \n",
"L -0 184.455469 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"L 56.50625 10.999219 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"mb4c409f97c\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#mb4c409f97c\" x=\"56.50625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 0.0 -->\n",
" <g transform=\"translate(48.554688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
"L 1344 794 \n",
"L 1344 0 \n",
"L 684 0 \n",
"L 684 794 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 105.33125 146.899219 \n",
"L 105.33125 10.999219 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#mb4c409f97c\" x=\"105.33125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 0.5 -->\n",
" <g transform=\"translate(97.379688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 154.15625 146.899219 \n",
"L 154.15625 10.999219 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#mb4c409f97c\" x=\"154.15625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 1.0 -->\n",
" <g transform=\"translate(146.204688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 202.98125 146.899219 \n",
"L 202.98125 10.999219 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#mb4c409f97c\" x=\"202.98125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 1.5 -->\n",
" <g transform=\"translate(195.029688 161.497656)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#mb4c409f97c\" x=\"251.80625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 2.0 -->\n",
" <g transform=\"translate(243.854688 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- epoch -->\n",
" <g transform=\"translate(138.928125 175.175781)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
"L 3597 1613 \n",
"L 953 1613 \n",
"Q 991 1019 1311 708 \n",
"Q 1631 397 2203 397 \n",
"Q 2534 397 2845 478 \n",
"Q 3156 559 3463 722 \n",
"L 3463 178 \n",
"Q 3153 47 2828 -22 \n",
"Q 2503 -91 2169 -91 \n",
"Q 1331 -91 842 396 \n",
"Q 353 884 353 1716 \n",
"Q 353 2575 817 3079 \n",
"Q 1281 3584 2069 3584 \n",
"Q 2775 3584 3186 3129 \n",
"Q 3597 2675 3597 1894 \n",
"z\n",
"M 3022 2063 \n",
"Q 3016 2534 2758 2815 \n",
"Q 2500 3097 2075 3097 \n",
"Q 1594 3097 1305 2825 \n",
"Q 1016 2553 972 2059 \n",
"L 3022 2063 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-70\" d=\"M 1159 525 \n",
"L 1159 -1331 \n",
"L 581 -1331 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2969 \n",
"Q 1341 3281 1617 3432 \n",
"Q 1894 3584 2278 3584 \n",
"Q 2916 3584 3314 3078 \n",
"Q 3713 2572 3713 1747 \n",
"Q 3713 922 3314 415 \n",
"Q 2916 -91 2278 -91 \n",
"Q 1894 -91 1617 61 \n",
"Q 1341 213 1159 525 \n",
"z\n",
"M 3116 1747 \n",
"Q 3116 2381 2855 2742 \n",
"Q 2594 3103 2138 3103 \n",
"Q 1681 3103 1420 2742 \n",
"Q 1159 2381 1159 1747 \n",
"Q 1159 1113 1420 752 \n",
"Q 1681 391 2138 391 \n",
"Q 2594 391 2855 752 \n",
"Q 3116 1113 3116 1747 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6f\" d=\"M 1959 3097 \n",
"Q 1497 3097 1228 2736 \n",
"Q 959 2375 959 1747 \n",
"Q 959 1119 1226 758 \n",
"Q 1494 397 1959 397 \n",
"Q 2419 397 2687 759 \n",
"Q 2956 1122 2956 1747 \n",
"Q 2956 2369 2687 2733 \n",
"Q 2419 3097 1959 3097 \n",
"z\n",
"M 1959 3584 \n",
"Q 2709 3584 3137 3096 \n",
"Q 3566 2609 3566 1747 \n",
"Q 3566 888 3137 398 \n",
"Q 2709 -91 1959 -91 \n",
"Q 1206 -91 779 398 \n",
"Q 353 888 353 1747 \n",
"Q 353 2609 779 3096 \n",
"Q 1206 3584 1959 3584 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-63\" d=\"M 3122 3366 \n",
"L 3122 2828 \n",
"Q 2878 2963 2633 3030 \n",
"Q 2388 3097 2138 3097 \n",
"Q 1578 3097 1268 2742 \n",
"Q 959 2388 959 1747 \n",
"Q 959 1106 1268 751 \n",
"Q 1578 397 2138 397 \n",
"Q 2388 397 2633 464 \n",
"Q 2878 531 3122 666 \n",
"L 3122 134 \n",
"Q 2881 22 2623 -34 \n",
"Q 2366 -91 2075 -91 \n",
"Q 1284 -91 818 406 \n",
"Q 353 903 353 1747 \n",
"Q 353 2603 823 3093 \n",
"Q 1294 3584 2113 3584 \n",
"Q 2378 3584 2631 3529 \n",
"Q 2884 3475 3122 3366 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-68\" d=\"M 3513 2113 \n",
"L 3513 0 \n",
"L 2938 0 \n",
"L 2938 2094 \n",
"Q 2938 2591 2744 2837 \n",
"Q 2550 3084 2163 3084 \n",
"Q 1697 3084 1428 2787 \n",
"Q 1159 2491 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 4863 \n",
"L 1159 4863 \n",
"L 1159 2956 \n",
"Q 1366 3272 1645 3428 \n",
"Q 1925 3584 2291 3584 \n",
"Q 2894 3584 3203 3211 \n",
"Q 3513 2838 3513 2113 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-65\"/>\n",
" <use xlink:href=\"#DejaVuSans-70\" x=\"61.523438\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"186.181641\"/>\n",
" <use xlink:href=\"#DejaVuSans-68\" x=\"241.162109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 56.50625 141.672296 \n",
"L 251.80625 141.672296 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"meb52d9cda3\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#meb52d9cda3\" x=\"56.50625\" y=\"141.672296\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0.225 -->\n",
" <g transform=\"translate(20.878125 145.471514)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 56.50625 115.53768 \n",
"L 251.80625 115.53768 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#meb52d9cda3\" x=\"56.50625\" y=\"115.53768\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0.250 -->\n",
" <g transform=\"translate(20.878125 119.336899)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 56.50625 89.403065 \n",
"L 251.80625 89.403065 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#meb52d9cda3\" x=\"56.50625\" y=\"89.403065\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 0.275 -->\n",
" <g transform=\"translate(20.878125 93.202284)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-37\" d=\"M 525 4666 \n",
"L 3525 4666 \n",
"L 3525 4397 \n",
"L 1831 0 \n",
"L 1172 0 \n",
"L 2766 4134 \n",
"L 525 4134 \n",
"L 525 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-37\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 56.50625 63.26845 \n",
"L 251.80625 63.26845 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#meb52d9cda3\" x=\"56.50625\" y=\"63.26845\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 0.300 -->\n",
" <g transform=\"translate(20.878125 67.067668)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 56.50625 37.133834 \n",
"L 251.80625 37.133834 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <g>\n",
" <use xlink:href=\"#meb52d9cda3\" x=\"56.50625\" y=\"37.133834\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 0.325 -->\n",
" <g transform=\"translate(20.878125 40.933053)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_6\">\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <g>\n",
" <use xlink:href=\"#meb52d9cda3\" x=\"56.50625\" y=\"10.999219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- 0.350 -->\n",
" <g transform=\"translate(20.878125 14.798437)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- loss -->\n",
" <g transform=\"translate(14.798438 88.607031)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-73\" d=\"M 2834 3397 \n",
"L 2834 2853 \n",
"Q 2591 2978 2328 3040 \n",
"Q 2066 3103 1784 3103 \n",
"Q 1356 3103 1142 2972 \n",
"Q 928 2841 928 2578 \n",
"Q 928 2378 1081 2264 \n",
"Q 1234 2150 1697 2047 \n",
"L 1894 2003 \n",
"Q 2506 1872 2764 1633 \n",
"Q 3022 1394 3022 966 \n",
"Q 3022 478 2636 193 \n",
"Q 2250 -91 1575 -91 \n",
"Q 1294 -91 989 -36 \n",
"Q 684 19 347 128 \n",
"L 347 722 \n",
"Q 666 556 975 473 \n",
"Q 1284 391 1588 391 \n",
"Q 1994 391 2212 530 \n",
"Q 2431 669 2431 922 \n",
"Q 2431 1156 2273 1281 \n",
"Q 2116 1406 1581 1522 \n",
"L 1381 1569 \n",
"Q 847 1681 609 1914 \n",
"Q 372 2147 372 2553 \n",
"Q 372 3047 722 3315 \n",
"Q 1072 3584 1716 3584 \n",
"Q 2034 3584 2315 3537 \n",
"Q 2597 3491 2834 3397 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"27.783203\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"88.964844\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"141.064453\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_23\">\n",
" <path d=\"M 69.52625 97.960788 \n",
"L 82.54625 119.880734 \n",
"L 95.56625 120.544314 \n",
"L 108.58625 122.34287 \n",
"L 121.60625 117.351223 \n",
"L 134.62625 121.578601 \n",
"L 147.64625 123.323166 \n",
"L 160.66625 118.92602 \n",
"L 173.68625 120.730284 \n",
"L 186.70625 122.453394 \n",
"L 199.72625 122.556154 \n",
"L 212.74625 123.026908 \n",
"L 225.76625 122.374453 \n",
"L 238.78625 123.848526 \n",
"L 251.80625 123.524787 \n",
"\" clip-path=\"url(#pdafd587b65)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"pdafd587b65\">\n",
" <rect x=\"56.50625\" y=\"10.999219\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"data_iter, feature_dim = d2l.get_data_ch11(batch_size=10)\n",
"d2l.train_ch11(adagrad, init_adagrad_states(feature_dim),\n",
" {'lr': 0.1}, data_iter, feature_dim);"
]
},
{
"cell_type": "markdown",
"id": "1903affc",
"metadata": {
"origin_pos": 15
},
"source": [
"## 简洁实现\n",
"\n",
"我们可直接使用深度学习框架中提供的AdaGrad算法来训练模型。\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "c7c10ee3",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:07:40.599038Z",
"iopub.status.busy": "2023-08-18T07:07:40.598462Z",
"iopub.status.idle": "2023-08-18T07:07:45.691770Z",
"shell.execute_reply": "2023-08-18T07:07:45.690969Z"
},
"origin_pos": 17,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loss: 0.242, 0.013 sec/epoch\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"262.1875pt\" height=\"184.455469pt\" viewBox=\"0 0 262.1875 184.455469\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:07:45.657238</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M -0 184.455469 \n",
"L 262.1875 184.455469 \n",
"L 262.1875 0 \n",
"L -0 0 \n",
"L -0 184.455469 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"L 56.50625 10.999219 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"m16a8b764b1\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m16a8b764b1\" x=\"56.50625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 0 -->\n",
" <g transform=\"translate(53.325 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 105.33125 146.899219 \n",
"L 105.33125 10.999219 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#m16a8b764b1\" x=\"105.33125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 1 -->\n",
" <g transform=\"translate(102.15 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 154.15625 146.899219 \n",
"L 154.15625 10.999219 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m16a8b764b1\" x=\"154.15625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(150.975 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 202.98125 146.899219 \n",
"L 202.98125 10.999219 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#m16a8b764b1\" x=\"202.98125\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 3 -->\n",
" <g transform=\"translate(199.8 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-33\" d=\"M 2597 2516 \n",
"Q 3050 2419 3304 2112 \n",
"Q 3559 1806 3559 1356 \n",
"Q 3559 666 3084 287 \n",
"Q 2609 -91 1734 -91 \n",
"Q 1441 -91 1130 -33 \n",
"Q 819 25 488 141 \n",
"L 488 750 \n",
"Q 750 597 1062 519 \n",
"Q 1375 441 1716 441 \n",
"Q 2309 441 2620 675 \n",
"Q 2931 909 2931 1356 \n",
"Q 2931 1769 2642 2001 \n",
"Q 2353 2234 1838 2234 \n",
"L 1294 2234 \n",
"L 1294 2753 \n",
"L 1863 2753 \n",
"Q 2328 2753 2575 2939 \n",
"Q 2822 3125 2822 3475 \n",
"Q 2822 3834 2567 4026 \n",
"Q 2313 4219 1838 4219 \n",
"Q 1578 4219 1281 4162 \n",
"Q 984 4106 628 3988 \n",
"L 628 4550 \n",
"Q 988 4650 1302 4700 \n",
"Q 1616 4750 1894 4750 \n",
"Q 2613 4750 3031 4423 \n",
"Q 3450 4097 3450 3541 \n",
"Q 3450 3153 3228 2886 \n",
"Q 3006 2619 2597 2516 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-33\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#m16a8b764b1\" x=\"251.80625\" y=\"146.899219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 4 -->\n",
" <g transform=\"translate(248.625 161.497656)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-34\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- epoch -->\n",
" <g transform=\"translate(138.928125 175.175781)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
"L 3597 1613 \n",
"L 953 1613 \n",
"Q 991 1019 1311 708 \n",
"Q 1631 397 2203 397 \n",
"Q 2534 397 2845 478 \n",
"Q 3156 559 3463 722 \n",
"L 3463 178 \n",
"Q 3153 47 2828 -22 \n",
"Q 2503 -91 2169 -91 \n",
"Q 1331 -91 842 396 \n",
"Q 353 884 353 1716 \n",
"Q 353 2575 817 3079 \n",
"Q 1281 3584 2069 3584 \n",
"Q 2775 3584 3186 3129 \n",
"Q 3597 2675 3597 1894 \n",
"z\n",
"M 3022 2063 \n",
"Q 3016 2534 2758 2815 \n",
"Q 2500 3097 2075 3097 \n",
"Q 1594 3097 1305 2825 \n",
"Q 1016 2553 972 2059 \n",
"L 3022 2063 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-70\" d=\"M 1159 525 \n",
"L 1159 -1331 \n",
"L 581 -1331 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2969 \n",
"Q 1341 3281 1617 3432 \n",
"Q 1894 3584 2278 3584 \n",
"Q 2916 3584 3314 3078 \n",
"Q 3713 2572 3713 1747 \n",
"Q 3713 922 3314 415 \n",
"Q 2916 -91 2278 -91 \n",
"Q 1894 -91 1617 61 \n",
"Q 1341 213 1159 525 \n",
"z\n",
"M 3116 1747 \n",
"Q 3116 2381 2855 2742 \n",
"Q 2594 3103 2138 3103 \n",
"Q 1681 3103 1420 2742 \n",
"Q 1159 2381 1159 1747 \n",
"Q 1159 1113 1420 752 \n",
"Q 1681 391 2138 391 \n",
"Q 2594 391 2855 752 \n",
"Q 3116 1113 3116 1747 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6f\" d=\"M 1959 3097 \n",
"Q 1497 3097 1228 2736 \n",
"Q 959 2375 959 1747 \n",
"Q 959 1119 1226 758 \n",
"Q 1494 397 1959 397 \n",
"Q 2419 397 2687 759 \n",
"Q 2956 1122 2956 1747 \n",
"Q 2956 2369 2687 2733 \n",
"Q 2419 3097 1959 3097 \n",
"z\n",
"M 1959 3584 \n",
"Q 2709 3584 3137 3096 \n",
"Q 3566 2609 3566 1747 \n",
"Q 3566 888 3137 398 \n",
"Q 2709 -91 1959 -91 \n",
"Q 1206 -91 779 398 \n",
"Q 353 888 353 1747 \n",
"Q 353 2609 779 3096 \n",
"Q 1206 3584 1959 3584 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-63\" d=\"M 3122 3366 \n",
"L 3122 2828 \n",
"Q 2878 2963 2633 3030 \n",
"Q 2388 3097 2138 3097 \n",
"Q 1578 3097 1268 2742 \n",
"Q 959 2388 959 1747 \n",
"Q 959 1106 1268 751 \n",
"Q 1578 397 2138 397 \n",
"Q 2388 397 2633 464 \n",
"Q 2878 531 3122 666 \n",
"L 3122 134 \n",
"Q 2881 22 2623 -34 \n",
"Q 2366 -91 2075 -91 \n",
"Q 1284 -91 818 406 \n",
"Q 353 903 353 1747 \n",
"Q 353 2603 823 3093 \n",
"Q 1294 3584 2113 3584 \n",
"Q 2378 3584 2631 3529 \n",
"Q 2884 3475 3122 3366 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-68\" d=\"M 3513 2113 \n",
"L 3513 0 \n",
"L 2938 0 \n",
"L 2938 2094 \n",
"Q 2938 2591 2744 2837 \n",
"Q 2550 3084 2163 3084 \n",
"Q 1697 3084 1428 2787 \n",
"Q 1159 2491 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 4863 \n",
"L 1159 4863 \n",
"L 1159 2956 \n",
"Q 1366 3272 1645 3428 \n",
"Q 1925 3584 2291 3584 \n",
"Q 2894 3584 3203 3211 \n",
"Q 3513 2838 3513 2113 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-65\"/>\n",
" <use xlink:href=\"#DejaVuSans-70\" x=\"61.523438\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"186.181641\"/>\n",
" <use xlink:href=\"#DejaVuSans-68\" x=\"241.162109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 56.50625 141.672296 \n",
"L 251.80625 141.672296 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"m6754abc477\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m6754abc477\" x=\"56.50625\" y=\"141.672296\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0.225 -->\n",
" <g transform=\"translate(20.878125 145.471514)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
"L 1344 794 \n",
"L 1344 0 \n",
"L 684 0 \n",
"L 684 794 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
"L 3169 4666 \n",
"L 3169 4134 \n",
"L 1269 4134 \n",
"L 1269 2991 \n",
"Q 1406 3038 1543 3061 \n",
"Q 1681 3084 1819 3084 \n",
"Q 2600 3084 3056 2656 \n",
"Q 3513 2228 3513 1497 \n",
"Q 3513 744 3044 326 \n",
"Q 2575 -91 1722 -91 \n",
"Q 1428 -91 1123 -41 \n",
"Q 819 9 494 109 \n",
"L 494 744 \n",
"Q 775 591 1075 516 \n",
"Q 1375 441 1709 441 \n",
"Q 2250 441 2565 725 \n",
"Q 2881 1009 2881 1497 \n",
"Q 2881 1984 2565 2268 \n",
"Q 2250 2553 1709 2553 \n",
"Q 1456 2553 1204 2497 \n",
"Q 953 2441 691 2322 \n",
"L 691 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 56.50625 115.53768 \n",
"L 251.80625 115.53768 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#m6754abc477\" x=\"56.50625\" y=\"115.53768\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0.250 -->\n",
" <g transform=\"translate(20.878125 119.336899)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 56.50625 89.403065 \n",
"L 251.80625 89.403065 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#m6754abc477\" x=\"56.50625\" y=\"89.403065\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 0.275 -->\n",
" <g transform=\"translate(20.878125 93.202284)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-37\" d=\"M 525 4666 \n",
"L 3525 4666 \n",
"L 3525 4397 \n",
"L 1831 0 \n",
"L 1172 0 \n",
"L 2766 4134 \n",
"L 525 4134 \n",
"L 525 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-37\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 56.50625 63.26845 \n",
"L 251.80625 63.26845 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#m6754abc477\" x=\"56.50625\" y=\"63.26845\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 0.300 -->\n",
" <g transform=\"translate(20.878125 67.067668)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 56.50625 37.133834 \n",
"L 251.80625 37.133834 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <g>\n",
" <use xlink:href=\"#m6754abc477\" x=\"56.50625\" y=\"37.133834\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 0.325 -->\n",
" <g transform=\"translate(20.878125 40.933053)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_6\">\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <g>\n",
" <use xlink:href=\"#m6754abc477\" x=\"56.50625\" y=\"10.999219\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- 0.350 -->\n",
" <g transform=\"translate(20.878125 14.798437)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-33\" x=\"95.410156\"/>\n",
" <use xlink:href=\"#DejaVuSans-35\" x=\"159.033203\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"222.65625\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- loss -->\n",
" <g transform=\"translate(14.798438 88.607031)rotate(-90)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-73\" d=\"M 2834 3397 \n",
"L 2834 2853 \n",
"Q 2591 2978 2328 3040 \n",
"Q 2066 3103 1784 3103 \n",
"Q 1356 3103 1142 2972 \n",
"Q 928 2841 928 2578 \n",
"Q 928 2378 1081 2264 \n",
"Q 1234 2150 1697 2047 \n",
"L 1894 2003 \n",
"Q 2506 1872 2764 1633 \n",
"Q 3022 1394 3022 966 \n",
"Q 3022 478 2636 193 \n",
"Q 2250 -91 1575 -91 \n",
"Q 1294 -91 989 -36 \n",
"Q 684 19 347 128 \n",
"L 347 722 \n",
"Q 666 556 975 473 \n",
"Q 1284 391 1588 391 \n",
"Q 1994 391 2212 530 \n",
"Q 2431 669 2431 922 \n",
"Q 2431 1156 2273 1281 \n",
"Q 2116 1406 1581 1522 \n",
"L 1381 1569 \n",
"Q 847 1681 609 1914 \n",
"Q 372 2147 372 2553 \n",
"Q 372 3047 722 3315 \n",
"Q 1072 3584 1716 3584 \n",
"Q 2034 3584 2315 3537 \n",
"Q 2597 3491 2834 3397 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-6c\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"27.783203\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"88.964844\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"141.064453\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_23\">\n",
" <path d=\"M 63.01625 84.97975 \n",
"L 69.52625 107.499811 \n",
"L 76.03625 114.165117 \n",
"L 82.54625 118.961419 \n",
"L 89.05625 120.937184 \n",
"L 95.56625 122.844344 \n",
"L 102.07625 121.537355 \n",
"L 108.58625 120.107499 \n",
"L 115.09625 122.195204 \n",
"L 121.60625 120.70043 \n",
"L 128.11625 123.229963 \n",
"L 134.62625 120.502694 \n",
"L 141.13625 121.949119 \n",
"L 147.64625 121.144704 \n",
"L 154.15625 122.270494 \n",
"L 160.66625 123.832599 \n",
"L 167.17625 123.390323 \n",
"L 173.68625 123.016308 \n",
"L 180.19625 123.332965 \n",
"L 186.70625 123.280392 \n",
"L 193.21625 123.560747 \n",
"L 199.72625 123.489864 \n",
"L 206.23625 121.585549 \n",
"L 212.74625 123.648954 \n",
"L 219.25625 123.130396 \n",
"L 225.76625 123.554931 \n",
"L 232.27625 123.436168 \n",
"L 238.78625 120.021334 \n",
"L 245.29625 123.766699 \n",
"L 251.80625 123.809279 \n",
"\" clip-path=\"url(#p50a1a60518)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 56.50625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 251.80625 146.899219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 56.50625 146.899219 \n",
"L 251.80625 146.899219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 56.50625 10.999219 \n",
"L 251.80625 10.999219 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p50a1a60518\">\n",
" <rect x=\"56.50625\" y=\"10.999219\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"trainer = torch.optim.Adagrad\n",
"d2l.train_concise_ch11(trainer, {'lr': 0.1}, data_iter)"
]
},
{
"cell_type": "markdown",
"id": "7c70fc97",
"metadata": {
"origin_pos": 20
},
"source": [
"## 小结\n",
"\n",
"* AdaGrad算法会在单个坐标层面动态降低学习率。\n",
"* AdaGrad算法利用梯度的大小作为调整进度速率的手段:用较小的学习率来补偿带有较大梯度的坐标。\n",
"* 在深度学习问题中,由于内存和计算限制,计算准确的二阶导数通常是不可行的。梯度可以作为一个有效的代理。\n",
"* 如果优化问题的结构相当不均匀,AdaGrad算法可以帮助缓解扭曲。\n",
"* AdaGrad算法对于稀疏特征特别有效,在此情况下由于不常出现的问题,学习率需要更慢地降低。\n",
"* 在深度学习问题上,AdaGrad算法有时在降低学习率方面可能过于剧烈。我们将在 :numref:`sec_adam`一节讨论缓解这种情况的策略。\n",
"\n",
"## 练习\n",
"\n",
"1. 证明对于正交矩阵$\\mathbf{U}$和向量$\\mathbf{c}$,以下等式成立:$\\|\\mathbf{c} - \\mathbf{\\delta}\\|_2 = \\|\\mathbf{U} \\mathbf{c} - \\mathbf{U} \\mathbf{\\delta}\\|_2$。为什么这意味着在变量的正交变化之后,扰动的程度不会改变?\n",
"1. 尝试对函数$f(\\mathbf{x}) = 0.1 x_1^2 + 2 x_2^2$、以及它旋转45度后的函数即$f(\\mathbf{x}) = 0.1 (x_1 + x_2)^2 + 2 (x_1 - x_2)^2$使用AdaGrad算法。它的表现会不同吗?\n",
"1. 证明[格什戈林圆盘定理](https://en.wikipedia.org/wiki/Gershgorin_circle_theorem),其中提到,矩阵$\\mathbf{M}$的特征值$\\lambda_i$在至少一个$j$的选项中满足$|\\lambda_i - \\mathbf{M}_{jj}| \\leq \\sum_{k \\neq j} |\\mathbf{M}_{jk}|$的要求。\n",
"1. 关于对角线预处理矩阵$\\mathrm{diag}^{-\\frac{1}{2}}(\\mathbf{M}) \\mathbf{M} \\mathrm{diag}^{-\\frac{1}{2}}(\\mathbf{M})$的特征值,格什戈林的定理告诉了我们什么?\n",
"1. 尝试对适当的深度网络使用AdaGrad算法,例如,:numref:`sec_lenet`中应用于Fashion-MNIST的深度网络。\n",
"1. 要如何修改AdaGrad算法,才能使其在学习率方面的衰减不那么激进?\n"
]
},
{
"cell_type": "markdown",
"id": "6fb87f9b",
"metadata": {
"origin_pos": 22,
"tab": [
"pytorch"
]
},
"source": [
"[Discussions](https://discuss.d2l.ai/t/4319)\n"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"required_libs": []
},
"nbformat": 4,
"nbformat_minor": 5
}