Files
Python/d2l/d2l-zh/pytorch/chapter_convolutional-modern/batch-norm.ipynb
T
2025-12-16 09:23:53 +08:00

2336 lines
93 KiB
Plaintext
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
{
"cells": [
{
"cell_type": "markdown",
"id": "632caf6d",
"metadata": {
"origin_pos": 0
},
"source": [
"# 批量规范化\n",
":label:`sec_batch_norm`\n",
"\n",
"训练深层神经网络是十分困难的,特别是在较短的时间内使他们收敛更加棘手。\n",
"本节将介绍*批量规范化*batch normalization :cite:`Ioffe.Szegedy.2015`,这是一种流行且有效的技术,可持续加速深层网络的收敛速度。\n",
"再结合在 :numref:`sec_resnet`中将介绍的残差块,批量规范化使得研究人员能够训练100层以上的网络。\n",
"\n",
"## 训练深层网络\n",
"\n",
"为什么需要批量规范化层呢?让我们来回顾一下训练神经网络时出现的一些实际挑战。\n",
"\n",
"首先,数据预处理的方式通常会对最终结果产生巨大影响。\n",
"回想一下我们应用多层感知机来预测房价的例子( :numref:`sec_kaggle_house`)。\n",
"使用真实数据时,我们的第一步是标准化输入特征,使其平均值为0,方差为1。\n",
"直观地说,这种标准化可以很好地与我们的优化器配合使用,因为它可以将参数的量级进行统一。\n",
"\n",
"第二,对于典型的多层感知机或卷积神经网络。当我们训练时,中间层中的变量(例如,多层感知机中的仿射变换输出)可能具有更广的变化范围:不论是沿着从输入到输出的层,跨同一层中的单元,或是随着时间的推移,模型参数的随着训练更新变幻莫测。\n",
"批量规范化的发明者非正式地假设,这些变量分布中的这种偏移可能会阻碍网络的收敛。\n",
"直观地说,我们可能会猜想,如果一个层的可变值是另一层的100倍,这可能需要对学习率进行补偿调整。\n",
"\n",
"第三,更深层的网络很复杂,容易过拟合。\n",
"这意味着正则化变得更加重要。\n",
"\n",
"批量规范化应用于单个可选层(也可以应用到所有层),其原理如下:在每次训练迭代中,我们首先规范化输入,即通过减去其均值并除以其标准差,其中两者均基于当前小批量处理。\n",
"接下来,我们应用比例系数和比例偏移。\n",
"正是由于这个基于*批量*统计的*标准化*,才有了*批量规范化*的名称。\n",
"\n",
"请注意,如果我们尝试使用大小为1的小批量应用批量规范化,我们将无法学到任何东西。\n",
"这是因为在减去均值之后,每个隐藏单元将为0。\n",
"所以,只有使用足够大的小批量,批量规范化这种方法才是有效且稳定的。\n",
"请注意,在应用批量规范化时,批量大小的选择可能比没有批量规范化时更重要。\n",
"\n",
"从形式上来说,用$\\mathbf{x} \\in \\mathcal{B}$表示一个来自小批量$\\mathcal{B}$的输入,批量规范化$\\mathrm{BN}$根据以下表达式转换$\\mathbf{x}$\n",
"\n",
"$$\\mathrm{BN}(\\mathbf{x}) = \\boldsymbol{\\gamma} \\odot \\frac{\\mathbf{x} - \\hat{\\boldsymbol{\\mu}}_\\mathcal{B}}{\\hat{\\boldsymbol{\\sigma}}_\\mathcal{B}} + \\boldsymbol{\\beta}.$$\n",
":eqlabel:`eq_batchnorm`\n",
"\n",
"在 :eqref:`eq_batchnorm`中,$\\hat{\\boldsymbol{\\mu}}_\\mathcal{B}$是小批量$\\mathcal{B}$的样本均值,$\\hat{\\boldsymbol{\\sigma}}_\\mathcal{B}$是小批量$\\mathcal{B}$的样本标准差。\n",
"应用标准化后,生成的小批量的平均值为0和单位方差为1。\n",
"由于单位方差(与其他一些魔法数)是一个主观的选择,因此我们通常包含\n",
"*拉伸参数*scale$\\boldsymbol{\\gamma}$和*偏移参数*shift$\\boldsymbol{\\beta}$,它们的形状与$\\mathbf{x}$相同。\n",
"请注意,$\\boldsymbol{\\gamma}$和$\\boldsymbol{\\beta}$是需要与其他模型参数一起学习的参数。\n",
"\n",
"由于在训练过程中,中间层的变化幅度不能过于剧烈,而批量规范化将每一层主动居中,并将它们重新调整为给定的平均值和大小(通过$\\hat{\\boldsymbol{\\mu}}_\\mathcal{B}$和${\\hat{\\boldsymbol{\\sigma}}_\\mathcal{B}}$)。\n",
"\n",
"从形式上来看,我们计算出 :eqref:`eq_batchnorm`中的$\\hat{\\boldsymbol{\\mu}}_\\mathcal{B}$和${\\hat{\\boldsymbol{\\sigma}}_\\mathcal{B}}$,如下所示:\n",
"\n",
"$$\\begin{aligned} \\hat{\\boldsymbol{\\mu}}_\\mathcal{B} &= \\frac{1}{|\\mathcal{B}|} \\sum_{\\mathbf{x} \\in \\mathcal{B}} \\mathbf{x},\\\\\n",
"\\hat{\\boldsymbol{\\sigma}}_\\mathcal{B}^2 &= \\frac{1}{|\\mathcal{B}|} \\sum_{\\mathbf{x} \\in \\mathcal{B}} (\\mathbf{x} - \\hat{\\boldsymbol{\\mu}}_{\\mathcal{B}})^2 + \\epsilon.\\end{aligned}$$\n",
"\n",
"请注意,我们在方差估计值中添加一个小的常量$\\epsilon > 0$,以确保我们永远不会尝试除以零,即使在经验方差估计值可能消失的情况下也是如此。估计值$\\hat{\\boldsymbol{\\mu}}_\\mathcal{B}$和${\\hat{\\boldsymbol{\\sigma}}_\\mathcal{B}}$通过使用平均值和方差的噪声(noise)估计来抵消缩放问题。\n",
"乍看起来,这种噪声是一个问题,而事实上它是有益的。\n",
"\n",
"事实证明,这是深度学习中一个反复出现的主题。\n",
"由于尚未在理论上明确的原因,优化中的各种噪声源通常会导致更快的训练和较少的过拟合:这种变化似乎是正则化的一种形式。\n",
"在一些初步研究中, :cite:`Teye.Azizpour.Smith.2018`和 :cite:`Luo.Wang.Shao.ea.2018`分别将批量规范化的性质与贝叶斯先验相关联。\n",
"这些理论揭示了为什么批量规范化最适应$50 \\sim 100$范围中的中等批量大小的难题。\n",
"\n",
"另外,批量规范化层在”训练模式“(通过小批量统计数据规范化)和“预测模式”(通过数据集统计规范化)中的功能不同。\n",
"在训练过程中,我们无法得知使用整个数据集来估计平均值和方差,所以只能根据每个小批次的平均值和方差不断训练模型。\n",
"而在预测模式下,可以根据整个数据集精确计算批量规范化所需的平均值和方差。\n",
"\n",
"现在,我们了解一下批量规范化在实践中是如何工作的。\n",
"\n",
"## 批量规范化层\n",
"\n",
"回想一下,批量规范化和其他层之间的一个关键区别是,由于批量规范化在完整的小批量上运行,因此我们不能像以前在引入其他层时那样忽略批量大小。\n",
"我们在下面讨论这两种情况:全连接层和卷积层,他们的批量规范化实现略有不同。\n",
"\n",
"### 全连接层\n",
"\n",
"通常,我们将批量规范化层置于全连接层中的仿射变换和激活函数之间。\n",
"设全连接层的输入为x,权重参数和偏置参数分别为$\\mathbf{W}$和$\\mathbf{b}$,激活函数为$\\phi$,批量规范化的运算符为$\\mathrm{BN}$。\n",
"那么,使用批量规范化的全连接层的输出的计算详情如下:\n",
"\n",
"$$\\mathbf{h} = \\phi(\\mathrm{BN}(\\mathbf{W}\\mathbf{x} + \\mathbf{b}) ).$$\n",
"\n",
"回想一下,均值和方差是在应用变换的\"相同\"小批量上计算的。\n",
"\n",
"### 卷积层\n",
"\n",
"同样,对于卷积层,我们可以在卷积层之后和非线性激活函数之前应用批量规范化。\n",
"当卷积有多个输出通道时,我们需要对这些通道的“每个”输出执行批量规范化,每个通道都有自己的拉伸(scale)和偏移(shift)参数,这两个参数都是标量。\n",
"假设我们的小批量包含$m$个样本,并且对于每个通道,卷积的输出具有高度$p$和宽度$q$。\n",
"那么对于卷积层,我们在每个输出通道的$m \\cdot p \\cdot q$个元素上同时执行每个批量规范化。\n",
"因此,在计算平均值和方差时,我们会收集所有空间位置的值,然后在给定通道内应用相同的均值和方差,以便在每个空间位置对值进行规范化。\n",
"\n",
"### 预测过程中的批量规范化\n",
"\n",
"正如我们前面提到的,批量规范化在训练模式和预测模式下的行为通常不同。\n",
"首先,将训练好的模型用于预测时,我们不再需要样本均值中的噪声以及在微批次上估计每个小批次产生的样本方差了。\n",
"其次,例如,我们可能需要使用我们的模型对逐个样本进行预测。\n",
"一种常用的方法是通过移动平均估算整个训练数据集的样本均值和方差,并在预测时使用它们得到确定的输出。\n",
"可见,和暂退法一样,批量规范化层在训练模式和预测模式下的计算结果也是不一样的。\n",
"\n",
"## (**从零实现**)\n",
"\n",
"下面,我们从头开始实现一个具有张量的批量规范化层。\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "042456d3",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:16:16.084651Z",
"iopub.status.busy": "2023-08-18T07:16:16.083898Z",
"iopub.status.idle": "2023-08-18T07:16:18.925904Z",
"shell.execute_reply": "2023-08-18T07:16:18.924662Z"
},
"origin_pos": 2,
"tab": [
"pytorch"
]
},
"outputs": [],
"source": [
"import torch\n",
"from torch import nn\n",
"from d2l import torch as d2l\n",
"\n",
"\n",
"def batch_norm(X, gamma, beta, moving_mean, moving_var, eps, momentum):\n",
" # 通过is_grad_enabled来判断当前模式是训练模式还是预测模式\n",
" if not torch.is_grad_enabled():\n",
" # 如果是在预测模式下,直接使用传入的移动平均所得的均值和方差\n",
" X_hat = (X - moving_mean) / torch.sqrt(moving_var + eps)\n",
" else:\n",
" assert len(X.shape) in (2, 4)\n",
" if len(X.shape) == 2:\n",
" # 使用全连接层的情况,计算特征维上的均值和方差\n",
" mean = X.mean(dim=0)\n",
" var = ((X - mean) ** 2).mean(dim=0)\n",
" else:\n",
" # 使用二维卷积层的情况,计算通道维上(axis=1)的均值和方差。\n",
" # 这里我们需要保持X的形状以便后面可以做广播运算\n",
" mean = X.mean(dim=(0, 2, 3), keepdim=True)\n",
" var = ((X - mean) ** 2).mean(dim=(0, 2, 3), keepdim=True)\n",
" # 训练模式下,用当前的均值和方差做标准化\n",
" X_hat = (X - mean) / torch.sqrt(var + eps)\n",
" # 更新移动平均的均值和方差\n",
" moving_mean = momentum * moving_mean + (1.0 - momentum) * mean\n",
" moving_var = momentum * moving_var + (1.0 - momentum) * var\n",
" Y = gamma * X_hat + beta # 缩放和移位\n",
" return Y, moving_mean.data, moving_var.data"
]
},
{
"cell_type": "markdown",
"id": "c51f9a6e",
"metadata": {
"origin_pos": 5
},
"source": [
"我们现在可以[**创建一个正确的`BatchNorm`层**]。\n",
"这个层将保持适当的参数:拉伸`gamma`和偏移`beta`,这两个参数将在训练过程中更新。\n",
"此外,我们的层将保存均值和方差的移动平均值,以便在模型预测期间随后使用。\n",
"\n",
"撇开算法细节,注意我们实现层的基础设计模式。\n",
"通常情况下,我们用一个单独的函数定义其数学原理,比如说`batch_norm`。\n",
"然后,我们将此功能集成到一个自定义层中,其代码主要处理数据移动到训练设备(如GPU)、分配和初始化任何必需的变量、跟踪移动平均线(此处为均值和方差)等问题。\n",
"为了方便起见,我们并不担心在这里自动推断输入形状,因此我们需要指定整个特征的数量。\n",
"不用担心,深度学习框架中的批量规范化API将为我们解决上述问题,我们稍后将展示这一点。\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f9b0ce07",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:16:18.930541Z",
"iopub.status.busy": "2023-08-18T07:16:18.929642Z",
"iopub.status.idle": "2023-08-18T07:16:18.937402Z",
"shell.execute_reply": "2023-08-18T07:16:18.936597Z"
},
"origin_pos": 7,
"tab": [
"pytorch"
]
},
"outputs": [],
"source": [
"class BatchNorm(nn.Module):\n",
" # num_features:完全连接层的输出数量或卷积层的输出通道数。\n",
" # num_dims2表示完全连接层,4表示卷积层\n",
" def __init__(self, num_features, num_dims):\n",
" super().__init__()\n",
" if num_dims == 2:\n",
" shape = (1, num_features)\n",
" else:\n",
" shape = (1, num_features, 1, 1)\n",
" # 参与求梯度和迭代的拉伸和偏移参数,分别初始化成1和0\n",
" self.gamma = nn.Parameter(torch.ones(shape))\n",
" self.beta = nn.Parameter(torch.zeros(shape))\n",
" # 非模型参数的变量初始化为0和1\n",
" self.moving_mean = torch.zeros(shape)\n",
" self.moving_var = torch.ones(shape)\n",
"\n",
" def forward(self, X):\n",
" # 如果X不在内存上,将moving_mean和moving_var\n",
" # 复制到X所在显存上\n",
" if self.moving_mean.device != X.device:\n",
" self.moving_mean = self.moving_mean.to(X.device)\n",
" self.moving_var = self.moving_var.to(X.device)\n",
" # 保存更新过的moving_mean和moving_var\n",
" Y, self.moving_mean, self.moving_var = batch_norm(\n",
" X, self.gamma, self.beta, self.moving_mean,\n",
" self.moving_var, eps=1e-5, momentum=0.9)\n",
" return Y"
]
},
{
"cell_type": "markdown",
"id": "4e7484c9",
"metadata": {
"origin_pos": 10
},
"source": [
"## 使用批量规范化层的 LeNet\n",
"\n",
"为了更好理解如何[**应用`BatchNorm`**],下面我们将其应用(**于LeNet模型**) :numref:`sec_lenet`)。\n",
"回想一下,批量规范化是在卷积层或全连接层之后、相应的激活函数之前应用的。\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "89ca8ab0",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:16:18.940903Z",
"iopub.status.busy": "2023-08-18T07:16:18.940366Z",
"iopub.status.idle": "2023-08-18T07:16:18.966572Z",
"shell.execute_reply": "2023-08-18T07:16:18.965740Z"
},
"origin_pos": 12,
"tab": [
"pytorch"
]
},
"outputs": [],
"source": [
"net = nn.Sequential(\n",
" nn.Conv2d(1, 6, kernel_size=5), BatchNorm(6, num_dims=4), nn.Sigmoid(),\n",
" nn.AvgPool2d(kernel_size=2, stride=2),\n",
" nn.Conv2d(6, 16, kernel_size=5), BatchNorm(16, num_dims=4), nn.Sigmoid(),\n",
" nn.AvgPool2d(kernel_size=2, stride=2), nn.Flatten(),\n",
" nn.Linear(16*4*4, 120), BatchNorm(120, num_dims=2), nn.Sigmoid(),\n",
" nn.Linear(120, 84), BatchNorm(84, num_dims=2), nn.Sigmoid(),\n",
" nn.Linear(84, 10))"
]
},
{
"cell_type": "markdown",
"id": "c9e088bf",
"metadata": {
"origin_pos": 15
},
"source": [
"和以前一样,我们将[**在Fashion-MNIST数据集上训练网络**]。\n",
"这个代码与我们第一次训练LeNet :numref:`sec_lenet`)时几乎完全相同,主要区别在于学习率大得多。\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "a0c4988d",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:16:18.970436Z",
"iopub.status.busy": "2023-08-18T07:16:18.969896Z",
"iopub.status.idle": "2023-08-18T07:17:04.740786Z",
"shell.execute_reply": "2023-08-18T07:17:04.739449Z"
},
"origin_pos": 16,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loss 0.273, train acc 0.899, test acc 0.807\n",
"32293.9 examples/sec on cuda:0\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"238.965625pt\" height=\"180.65625pt\" viewBox=\"0 0 238.965625 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:17:04.694266</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 238.965625 180.65625 \n",
"L 238.965625 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 30.103125 143.1 \n",
"L 225.403125 143.1 \n",
"L 225.403125 7.2 \n",
"L 30.103125 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 51.803125 143.1 \n",
"L 51.803125 7.2 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"m69556207a2\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m69556207a2\" x=\"51.803125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(48.621875 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 95.203125 143.1 \n",
"L 95.203125 7.2 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#m69556207a2\" x=\"95.203125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 4 -->\n",
" <g transform=\"translate(92.021875 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-34\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 138.603125 143.1 \n",
"L 138.603125 7.2 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m69556207a2\" x=\"138.603125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 6 -->\n",
" <g transform=\"translate(135.421875 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-36\" d=\"M 2113 2584 \n",
"Q 1688 2584 1439 2293 \n",
"Q 1191 2003 1191 1497 \n",
"Q 1191 994 1439 701 \n",
"Q 1688 409 2113 409 \n",
"Q 2538 409 2786 701 \n",
"Q 3034 994 3034 1497 \n",
"Q 3034 2003 2786 2293 \n",
"Q 2538 2584 2113 2584 \n",
"z\n",
"M 3366 4563 \n",
"L 3366 3988 \n",
"Q 3128 4100 2886 4159 \n",
"Q 2644 4219 2406 4219 \n",
"Q 1781 4219 1451 3797 \n",
"Q 1122 3375 1075 2522 \n",
"Q 1259 2794 1537 2939 \n",
"Q 1816 3084 2150 3084 \n",
"Q 2853 3084 3261 2657 \n",
"Q 3669 2231 3669 1497 \n",
"Q 3669 778 3244 343 \n",
"Q 2819 -91 2113 -91 \n",
"Q 1303 -91 875 529 \n",
"Q 447 1150 447 2328 \n",
"Q 447 3434 972 4092 \n",
"Q 1497 4750 2381 4750 \n",
"Q 2619 4750 2861 4703 \n",
"Q 3103 4656 3366 4563 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-36\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 182.003125 143.1 \n",
"L 182.003125 7.2 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#m69556207a2\" x=\"182.003125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 8 -->\n",
" <g transform=\"translate(178.821875 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-38\" d=\"M 2034 2216 \n",
"Q 1584 2216 1326 1975 \n",
"Q 1069 1734 1069 1313 \n",
"Q 1069 891 1326 650 \n",
"Q 1584 409 2034 409 \n",
"Q 2484 409 2743 651 \n",
"Q 3003 894 3003 1313 \n",
"Q 3003 1734 2745 1975 \n",
"Q 2488 2216 2034 2216 \n",
"z\n",
"M 1403 2484 \n",
"Q 997 2584 770 2862 \n",
"Q 544 3141 544 3541 \n",
"Q 544 4100 942 4425 \n",
"Q 1341 4750 2034 4750 \n",
"Q 2731 4750 3128 4425 \n",
"Q 3525 4100 3525 3541 \n",
"Q 3525 3141 3298 2862 \n",
"Q 3072 2584 2669 2484 \n",
"Q 3125 2378 3379 2068 \n",
"Q 3634 1759 3634 1313 \n",
"Q 3634 634 3220 271 \n",
"Q 2806 -91 2034 -91 \n",
"Q 1263 -91 848 271 \n",
"Q 434 634 434 1313 \n",
"Q 434 1759 690 2068 \n",
"Q 947 2378 1403 2484 \n",
"z\n",
"M 1172 3481 \n",
"Q 1172 3119 1398 2916 \n",
"Q 1625 2713 2034 2713 \n",
"Q 2441 2713 2670 2916 \n",
"Q 2900 3119 2900 3481 \n",
"Q 2900 3844 2670 4047 \n",
"Q 2441 4250 2034 4250 \n",
"Q 1625 4250 1398 4047 \n",
"Q 1172 3844 1172 3481 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-38\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 225.403125 143.1 \n",
"L 225.403125 7.2 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#m69556207a2\" x=\"225.403125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(219.040625 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- epoch -->\n",
" <g transform=\"translate(112.525 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
"L 3597 1613 \n",
"L 953 1613 \n",
"Q 991 1019 1311 708 \n",
"Q 1631 397 2203 397 \n",
"Q 2534 397 2845 478 \n",
"Q 3156 559 3463 722 \n",
"L 3463 178 \n",
"Q 3153 47 2828 -22 \n",
"Q 2503 -91 2169 -91 \n",
"Q 1331 -91 842 396 \n",
"Q 353 884 353 1716 \n",
"Q 353 2575 817 3079 \n",
"Q 1281 3584 2069 3584 \n",
"Q 2775 3584 3186 3129 \n",
"Q 3597 2675 3597 1894 \n",
"z\n",
"M 3022 2063 \n",
"Q 3016 2534 2758 2815 \n",
"Q 2500 3097 2075 3097 \n",
"Q 1594 3097 1305 2825 \n",
"Q 1016 2553 972 2059 \n",
"L 3022 2063 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-70\" d=\"M 1159 525 \n",
"L 1159 -1331 \n",
"L 581 -1331 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2969 \n",
"Q 1341 3281 1617 3432 \n",
"Q 1894 3584 2278 3584 \n",
"Q 2916 3584 3314 3078 \n",
"Q 3713 2572 3713 1747 \n",
"Q 3713 922 3314 415 \n",
"Q 2916 -91 2278 -91 \n",
"Q 1894 -91 1617 61 \n",
"Q 1341 213 1159 525 \n",
"z\n",
"M 3116 1747 \n",
"Q 3116 2381 2855 2742 \n",
"Q 2594 3103 2138 3103 \n",
"Q 1681 3103 1420 2742 \n",
"Q 1159 2381 1159 1747 \n",
"Q 1159 1113 1420 752 \n",
"Q 1681 391 2138 391 \n",
"Q 2594 391 2855 752 \n",
"Q 3116 1113 3116 1747 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6f\" d=\"M 1959 3097 \n",
"Q 1497 3097 1228 2736 \n",
"Q 959 2375 959 1747 \n",
"Q 959 1119 1226 758 \n",
"Q 1494 397 1959 397 \n",
"Q 2419 397 2687 759 \n",
"Q 2956 1122 2956 1747 \n",
"Q 2956 2369 2687 2733 \n",
"Q 2419 3097 1959 3097 \n",
"z\n",
"M 1959 3584 \n",
"Q 2709 3584 3137 3096 \n",
"Q 3566 2609 3566 1747 \n",
"Q 3566 888 3137 398 \n",
"Q 2709 -91 1959 -91 \n",
"Q 1206 -91 779 398 \n",
"Q 353 888 353 1747 \n",
"Q 353 2609 779 3096 \n",
"Q 1206 3584 1959 3584 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-63\" d=\"M 3122 3366 \n",
"L 3122 2828 \n",
"Q 2878 2963 2633 3030 \n",
"Q 2388 3097 2138 3097 \n",
"Q 1578 3097 1268 2742 \n",
"Q 959 2388 959 1747 \n",
"Q 959 1106 1268 751 \n",
"Q 1578 397 2138 397 \n",
"Q 2388 397 2633 464 \n",
"Q 2878 531 3122 666 \n",
"L 3122 134 \n",
"Q 2881 22 2623 -34 \n",
"Q 2366 -91 2075 -91 \n",
"Q 1284 -91 818 406 \n",
"Q 353 903 353 1747 \n",
"Q 353 2603 823 3093 \n",
"Q 1294 3584 2113 3584 \n",
"Q 2378 3584 2631 3529 \n",
"Q 2884 3475 3122 3366 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-68\" d=\"M 3513 2113 \n",
"L 3513 0 \n",
"L 2938 0 \n",
"L 2938 2094 \n",
"Q 2938 2591 2744 2837 \n",
"Q 2550 3084 2163 3084 \n",
"Q 1697 3084 1428 2787 \n",
"Q 1159 2491 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 4863 \n",
"L 1159 4863 \n",
"L 1159 2956 \n",
"Q 1366 3272 1645 3428 \n",
"Q 1925 3584 2291 3584 \n",
"Q 2894 3584 3203 3211 \n",
"Q 3513 2838 3513 2113 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-65\"/>\n",
" <use xlink:href=\"#DejaVuSans-70\" x=\"61.523438\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"186.181641\"/>\n",
" <use xlink:href=\"#DejaVuSans-68\" x=\"241.162109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 30.103125 119.556491 \n",
"L 225.403125 119.556491 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"m5ed06a1768\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m5ed06a1768\" x=\"30.103125\" y=\"119.556491\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0.4 -->\n",
" <g transform=\"translate(7.2 123.35571)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
"L 1344 794 \n",
"L 1344 0 \n",
"L 684 0 \n",
"L 684 794 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-34\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 30.103125 92.539376 \n",
"L 225.403125 92.539376 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#m5ed06a1768\" x=\"30.103125\" y=\"92.539376\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0.6 -->\n",
" <g transform=\"translate(7.2 96.338595)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-36\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 30.103125 65.522261 \n",
"L 225.403125 65.522261 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#m5ed06a1768\" x=\"30.103125\" y=\"65.522261\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 0.8 -->\n",
" <g transform=\"translate(7.2 69.32148)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-38\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 30.103125 38.505146 \n",
"L 225.403125 38.505146 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#m5ed06a1768\" x=\"30.103125\" y=\"38.505146\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 1.0 -->\n",
" <g transform=\"translate(7.2 42.304365)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 30.103125 11.488031 \n",
"L 225.403125 11.488031 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <g>\n",
" <use xlink:href=\"#m5ed06a1768\" x=\"30.103125\" y=\"11.488031\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 1.2 -->\n",
" <g transform=\"translate(7.2 15.28725)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 12.743125 13.377273 \n",
"L 17.083125 43.683501 \n",
"L 21.423125 58.273821 \n",
"L 25.763125 67.759169 \n",
"L 30.103125 73.832896 \n",
"L 34.443125 103.5807 \n",
"L 38.783125 105.540886 \n",
"L 43.123125 107.10832 \n",
"L 47.463125 108.288922 \n",
"L 51.803125 109.751123 \n",
"L 56.143125 116.300996 \n",
"L 60.483125 116.379102 \n",
"L 64.823125 117.584692 \n",
"L 69.163125 118.058771 \n",
"L 73.503125 118.796969 \n",
"L 77.843125 121.113726 \n",
"L 82.183125 123.499347 \n",
"L 86.523125 124.505609 \n",
"L 90.863125 124.038593 \n",
"L 95.203125 124.329098 \n",
"L 99.543125 126.659169 \n",
"L 103.883125 127.611292 \n",
"L 108.223125 128.05468 \n",
"L 112.563125 128.212204 \n",
"L 116.903125 128.210662 \n",
"L 121.243125 129.99772 \n",
"L 125.583125 130.453666 \n",
"L 129.923125 130.83043 \n",
"L 134.263125 130.837851 \n",
"L 138.603125 131.15863 \n",
"L 142.943125 133.731654 \n",
"L 147.283125 133.5983 \n",
"L 151.623125 133.617532 \n",
"L 155.963125 133.313086 \n",
"L 160.303125 133.398724 \n",
"L 164.643125 133.998651 \n",
"L 168.983125 134.586114 \n",
"L 173.323125 134.935325 \n",
"L 177.663125 134.629911 \n",
"L 182.003125 134.561439 \n",
"L 186.343125 136.922727 \n",
"L 190.683125 136.677852 \n",
"L 195.023125 136.234547 \n",
"L 199.363125 136.1579 \n",
"L 203.703125 136.177425 \n",
"L 208.043125 134.555978 \n",
"L 212.383125 135.46046 \n",
"L 216.723125 135.860701 \n",
"L 221.063125 136.230399 \n",
"L 225.403125 136.645454 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <path d=\"M 12.743125 92.653894 \n",
"L 17.083125 84.053864 \n",
"L 21.423125 79.559245 \n",
"L 25.763125 76.444635 \n",
"L 30.103125 74.471681 \n",
"L 34.443125 64.170507 \n",
"L 38.783125 63.418285 \n",
"L 43.123125 62.70723 \n",
"L 47.463125 62.360123 \n",
"L 51.803125 61.827671 \n",
"L 56.143125 59.578586 \n",
"L 60.483125 59.651562 \n",
"L 64.823125 59.13324 \n",
"L 69.163125 58.733739 \n",
"L 73.503125 58.461788 \n",
"L 77.843125 57.905734 \n",
"L 82.183125 56.934582 \n",
"L 86.523125 56.532274 \n",
"L 90.863125 56.60338 \n",
"L 95.203125 56.426499 \n",
"L 99.543125 56.053247 \n",
"L 103.883125 55.570478 \n",
"L 108.223125 55.32348 \n",
"L 112.563125 55.230856 \n",
"L 116.903125 55.262512 \n",
"L 121.243125 54.414077 \n",
"L 125.583125 54.273737 \n",
"L 129.923125 54.077262 \n",
"L 134.263125 54.147431 \n",
"L 138.603125 53.967942 \n",
"L 142.943125 53.268904 \n",
"L 147.283125 53.46538 \n",
"L 151.623125 53.351236 \n",
"L 155.963125 53.26329 \n",
"L 160.303125 53.213714 \n",
"L 164.643125 53.257677 \n",
"L 168.983125 52.98261 \n",
"L 173.323125 52.786134 \n",
"L 177.663125 52.822623 \n",
"L 182.003125 52.905268 \n",
"L 186.343125 52.134957 \n",
"L 190.683125 52.168639 \n",
"L 195.023125 52.337047 \n",
"L 199.363125 52.384762 \n",
"L 203.703125 52.358172 \n",
"L 208.043125 52.651408 \n",
"L 212.383125 52.645795 \n",
"L 216.723125 52.400668 \n",
"L 221.063125 52.25565 \n",
"L 225.403125 52.117269 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"line2d_23\">\n",
" <path d=\"M 30.103125 71.452518 \n",
"L 51.803125 66.18418 \n",
"L 73.503125 70.385342 \n",
"L 95.203125 66.251723 \n",
"L 116.903125 66.8461 \n",
"L 138.603125 60.524095 \n",
"L 160.303125 57.930452 \n",
"L 182.003125 70.088154 \n",
"L 203.703125 60.929352 \n",
"L 225.403125 64.509119 \n",
"\" clip-path=\"url(#pb48bb8adde)\" style=\"fill: none; stroke-dasharray: 9.6,2.4,1.5,2.4; stroke-dashoffset: 0; stroke: #008000; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 30.103125 143.1 \n",
"L 30.103125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 225.403125 143.1 \n",
"L 225.403125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 30.103125 143.1 \n",
"L 225.403125 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 30.103125 7.2 \n",
"L 225.403125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"legend_1\">\n",
" <g id=\"patch_7\">\n",
" <path d=\"M 37.103125 59.234375 \n",
"L 114.871875 59.234375 \n",
"Q 116.871875 59.234375 116.871875 57.234375 \n",
"L 116.871875 14.2 \n",
"Q 116.871875 12.2 114.871875 12.2 \n",
"L 37.103125 12.2 \n",
"Q 35.103125 12.2 35.103125 14.2 \n",
"L 35.103125 57.234375 \n",
"Q 35.103125 59.234375 37.103125 59.234375 \n",
"z\n",
"\" style=\"fill: #ffffff; opacity: 0.8; stroke: #cccccc; stroke-linejoin: miter\"/>\n",
" </g>\n",
" <g id=\"line2d_24\">\n",
" <path d=\"M 39.103125 20.298438 \n",
"L 49.103125 20.298438 \n",
"L 59.103125 20.298438 \n",
"\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- train loss -->\n",
" <g transform=\"translate(67.103125 23.798438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-74\" d=\"M 1172 4494 \n",
"L 1172 3500 \n",
"L 2356 3500 \n",
"L 2356 3053 \n",
"L 1172 3053 \n",
"L 1172 1153 \n",
"Q 1172 725 1289 603 \n",
"Q 1406 481 1766 481 \n",
"L 2356 481 \n",
"L 2356 0 \n",
"L 1766 0 \n",
"Q 1100 0 847 248 \n",
"Q 594 497 594 1153 \n",
"L 594 3053 \n",
"L 172 3053 \n",
"L 172 3500 \n",
"L 594 3500 \n",
"L 594 4494 \n",
"L 1172 4494 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-72\" d=\"M 2631 2963 \n",
"Q 2534 3019 2420 3045 \n",
"Q 2306 3072 2169 3072 \n",
"Q 1681 3072 1420 2755 \n",
"Q 1159 2438 1159 1844 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2956 \n",
"Q 1341 3275 1631 3429 \n",
"Q 1922 3584 2338 3584 \n",
"Q 2397 3584 2469 3576 \n",
"Q 2541 3569 2628 3553 \n",
"L 2631 2963 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-61\" d=\"M 2194 1759 \n",
"Q 1497 1759 1228 1600 \n",
"Q 959 1441 959 1056 \n",
"Q 959 750 1161 570 \n",
"Q 1363 391 1709 391 \n",
"Q 2188 391 2477 730 \n",
"Q 2766 1069 2766 1631 \n",
"L 2766 1759 \n",
"L 2194 1759 \n",
"z\n",
"M 3341 1997 \n",
"L 3341 0 \n",
"L 2766 0 \n",
"L 2766 531 \n",
"Q 2569 213 2275 61 \n",
"Q 1981 -91 1556 -91 \n",
"Q 1019 -91 701 211 \n",
"Q 384 513 384 1019 \n",
"Q 384 1609 779 1909 \n",
"Q 1175 2209 1959 2209 \n",
"L 2766 2209 \n",
"L 2766 2266 \n",
"Q 2766 2663 2505 2880 \n",
"Q 2244 3097 1772 3097 \n",
"Q 1472 3097 1187 3025 \n",
"Q 903 2953 641 2809 \n",
"L 641 3341 \n",
"Q 956 3463 1253 3523 \n",
"Q 1550 3584 1831 3584 \n",
"Q 2591 3584 2966 3190 \n",
"Q 3341 2797 3341 1997 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-69\" d=\"M 603 3500 \n",
"L 1178 3500 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 3500 \n",
"z\n",
"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 4134 \n",
"L 603 4134 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6e\" d=\"M 3513 2113 \n",
"L 3513 0 \n",
"L 2938 0 \n",
"L 2938 2094 \n",
"Q 2938 2591 2744 2837 \n",
"Q 2550 3084 2163 3084 \n",
"Q 1697 3084 1428 2787 \n",
"Q 1159 2491 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2956 \n",
"Q 1366 3272 1645 3428 \n",
"Q 1925 3584 2291 3584 \n",
"Q 2894 3584 3203 3211 \n",
"Q 3513 2838 3513 2113 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-20\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-73\" d=\"M 2834 3397 \n",
"L 2834 2853 \n",
"Q 2591 2978 2328 3040 \n",
"Q 2066 3103 1784 3103 \n",
"Q 1356 3103 1142 2972 \n",
"Q 928 2841 928 2578 \n",
"Q 928 2378 1081 2264 \n",
"Q 1234 2150 1697 2047 \n",
"L 1894 2003 \n",
"Q 2506 1872 2764 1633 \n",
"Q 3022 1394 3022 966 \n",
"Q 3022 478 2636 193 \n",
"Q 2250 -91 1575 -91 \n",
"Q 1294 -91 989 -36 \n",
"Q 684 19 347 128 \n",
"L 347 722 \n",
"Q 666 556 975 473 \n",
"Q 1284 391 1588 391 \n",
"Q 1994 391 2212 530 \n",
"Q 2431 669 2431 922 \n",
"Q 2431 1156 2273 1281 \n",
"Q 2116 1406 1581 1522 \n",
"L 1381 1569 \n",
"Q 847 1681 609 1914 \n",
"Q 372 2147 372 2553 \n",
"Q 372 3047 722 3315 \n",
"Q 1072 3584 1716 3584 \n",
"Q 2034 3584 2315 3537 \n",
"Q 2597 3491 2834 3397 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-74\"/>\n",
" <use xlink:href=\"#DejaVuSans-72\" x=\"39.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"80.322266\"/>\n",
" <use xlink:href=\"#DejaVuSans-69\" x=\"141.601562\"/>\n",
" <use xlink:href=\"#DejaVuSans-6e\" x=\"169.384766\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"232.763672\"/>\n",
" <use xlink:href=\"#DejaVuSans-6c\" x=\"264.550781\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"292.333984\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"353.515625\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"405.615234\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_25\">\n",
" <path d=\"M 39.103125 34.976562 \n",
"L 49.103125 34.976562 \n",
"L 59.103125 34.976562 \n",
"\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- train acc -->\n",
" <g transform=\"translate(67.103125 38.476562)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-74\"/>\n",
" <use xlink:href=\"#DejaVuSans-72\" x=\"39.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"80.322266\"/>\n",
" <use xlink:href=\"#DejaVuSans-69\" x=\"141.601562\"/>\n",
" <use xlink:href=\"#DejaVuSans-6e\" x=\"169.384766\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"232.763672\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"264.550781\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"325.830078\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"380.810547\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_26\">\n",
" <path d=\"M 39.103125 49.654688 \n",
"L 49.103125 49.654688 \n",
"L 59.103125 49.654688 \n",
"\" style=\"fill: none; stroke-dasharray: 9.6,2.4,1.5,2.4; stroke-dashoffset: 0; stroke: #008000; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"text_14\">\n",
" <!-- test acc -->\n",
" <g transform=\"translate(67.103125 53.154688)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-74\"/>\n",
" <use xlink:href=\"#DejaVuSans-65\" x=\"39.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"100.732422\"/>\n",
" <use xlink:href=\"#DejaVuSans-74\" x=\"152.832031\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"192.041016\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"223.828125\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"285.107422\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"340.087891\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"pb48bb8adde\">\n",
" <rect x=\"30.103125\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"lr, num_epochs, batch_size = 1.0, 10, 256\n",
"train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)\n",
"d2l.train_ch6(net, train_iter, test_iter, num_epochs, lr, d2l.try_gpu())"
]
},
{
"cell_type": "markdown",
"id": "4987131a",
"metadata": {
"origin_pos": 18
},
"source": [
"让我们来看看从第一个批量规范化层中学到的[**拉伸参数`gamma`和偏移参数`beta`**]。\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "055a3583",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:17:04.745528Z",
"iopub.status.busy": "2023-08-18T07:17:04.744678Z",
"iopub.status.idle": "2023-08-18T07:17:04.755775Z",
"shell.execute_reply": "2023-08-18T07:17:04.754582Z"
},
"origin_pos": 20,
"tab": [
"pytorch"
]
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor([0.4863, 2.8573, 2.3190, 4.3188, 3.8588, 1.7942], device='cuda:0',\n",
" grad_fn=<ReshapeAliasBackward0>),\n",
" tensor([-0.0124, 1.4839, -1.7753, 2.3564, -3.8801, -2.1589], device='cuda:0',\n",
" grad_fn=<ReshapeAliasBackward0>))"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"net[1].gamma.reshape((-1,)), net[1].beta.reshape((-1,))"
]
},
{
"cell_type": "markdown",
"id": "da5c2465",
"metadata": {
"origin_pos": 23
},
"source": [
"## [**简明实现**]\n",
"\n",
"除了使用我们刚刚定义的`BatchNorm`,我们也可以直接使用深度学习框架中定义的`BatchNorm`。\n",
"该代码看起来几乎与我们上面的代码相同。\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "b8604933",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:17:04.759625Z",
"iopub.status.busy": "2023-08-18T07:17:04.758859Z",
"iopub.status.idle": "2023-08-18T07:17:04.769251Z",
"shell.execute_reply": "2023-08-18T07:17:04.768076Z"
},
"origin_pos": 25,
"tab": [
"pytorch"
]
},
"outputs": [],
"source": [
"net = nn.Sequential(\n",
" nn.Conv2d(1, 6, kernel_size=5), nn.BatchNorm2d(6), nn.Sigmoid(),\n",
" nn.AvgPool2d(kernel_size=2, stride=2),\n",
" nn.Conv2d(6, 16, kernel_size=5), nn.BatchNorm2d(16), nn.Sigmoid(),\n",
" nn.AvgPool2d(kernel_size=2, stride=2), nn.Flatten(),\n",
" nn.Linear(256, 120), nn.BatchNorm1d(120), nn.Sigmoid(),\n",
" nn.Linear(120, 84), nn.BatchNorm1d(84), nn.Sigmoid(),\n",
" nn.Linear(84, 10))"
]
},
{
"cell_type": "markdown",
"id": "5a054ab6",
"metadata": {
"origin_pos": 28
},
"source": [
"下面,我们[**使用相同超参数来训练模型**]。\n",
"请注意,通常高级API变体运行速度快得多,因为它的代码已编译为C++或CUDA,而我们的自定义代码由Python实现。\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "add53e76",
"metadata": {
"execution": {
"iopub.execute_input": "2023-08-18T07:17:04.772567Z",
"iopub.status.busy": "2023-08-18T07:17:04.772282Z",
"iopub.status.idle": "2023-08-18T07:17:54.677901Z",
"shell.execute_reply": "2023-08-18T07:17:54.676931Z"
},
"origin_pos": 29,
"tab": [
"pytorch"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loss 0.267, train acc 0.902, test acc 0.708\n",
"50597.3 examples/sec on cuda:0\n"
]
},
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"238.965625pt\" height=\"180.65625pt\" viewBox=\"0 0 238.965625 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
" <metadata>\n",
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
" <cc:Work>\n",
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
" <dc:date>2023-08-18T07:17:54.611775</dc:date>\n",
" <dc:format>image/svg+xml</dc:format>\n",
" <dc:creator>\n",
" <cc:Agent>\n",
" <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
" </cc:Agent>\n",
" </dc:creator>\n",
" </cc:Work>\n",
" </rdf:RDF>\n",
" </metadata>\n",
" <defs>\n",
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
" </defs>\n",
" <g id=\"figure_1\">\n",
" <g id=\"patch_1\">\n",
" <path d=\"M 0 180.65625 \n",
"L 238.965625 180.65625 \n",
"L 238.965625 0 \n",
"L 0 0 \n",
"L 0 180.65625 \n",
"z\n",
"\" style=\"fill: none\"/>\n",
" </g>\n",
" <g id=\"axes_1\">\n",
" <g id=\"patch_2\">\n",
" <path d=\"M 30.103125 143.1 \n",
"L 225.403125 143.1 \n",
"L 225.403125 7.2 \n",
"L 30.103125 7.2 \n",
"z\n",
"\" style=\"fill: #ffffff\"/>\n",
" </g>\n",
" <g id=\"matplotlib.axis_1\">\n",
" <g id=\"xtick_1\">\n",
" <g id=\"line2d_1\">\n",
" <path d=\"M 51.803125 143.1 \n",
"L 51.803125 7.2 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_2\">\n",
" <defs>\n",
" <path id=\"m621c46cd9a\" d=\"M 0 0 \n",
"L 0 3.5 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#m621c46cd9a\" x=\"51.803125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_1\">\n",
" <!-- 2 -->\n",
" <g transform=\"translate(48.621875 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
"L 3431 531 \n",
"L 3431 0 \n",
"L 469 0 \n",
"L 469 531 \n",
"Q 828 903 1448 1529 \n",
"Q 2069 2156 2228 2338 \n",
"Q 2531 2678 2651 2914 \n",
"Q 2772 3150 2772 3378 \n",
"Q 2772 3750 2511 3984 \n",
"Q 2250 4219 1831 4219 \n",
"Q 1534 4219 1204 4116 \n",
"Q 875 4013 500 3803 \n",
"L 500 4441 \n",
"Q 881 4594 1212 4672 \n",
"Q 1544 4750 1819 4750 \n",
"Q 2544 4750 2975 4387 \n",
"Q 3406 4025 3406 3419 \n",
"Q 3406 3131 3298 2873 \n",
"Q 3191 2616 2906 2266 \n",
"Q 2828 2175 2409 1742 \n",
"Q 1991 1309 1228 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-32\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_2\">\n",
" <g id=\"line2d_3\">\n",
" <path d=\"M 95.203125 143.1 \n",
"L 95.203125 7.2 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_4\">\n",
" <g>\n",
" <use xlink:href=\"#m621c46cd9a\" x=\"95.203125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_2\">\n",
" <!-- 4 -->\n",
" <g transform=\"translate(92.021875 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
"L 825 1625 \n",
"L 2419 1625 \n",
"L 2419 4116 \n",
"z\n",
"M 2253 4666 \n",
"L 3047 4666 \n",
"L 3047 1625 \n",
"L 3713 1625 \n",
"L 3713 1100 \n",
"L 3047 1100 \n",
"L 3047 0 \n",
"L 2419 0 \n",
"L 2419 1100 \n",
"L 313 1100 \n",
"L 313 1709 \n",
"L 2253 4666 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-34\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_3\">\n",
" <g id=\"line2d_5\">\n",
" <path d=\"M 138.603125 143.1 \n",
"L 138.603125 7.2 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_6\">\n",
" <g>\n",
" <use xlink:href=\"#m621c46cd9a\" x=\"138.603125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_3\">\n",
" <!-- 6 -->\n",
" <g transform=\"translate(135.421875 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-36\" d=\"M 2113 2584 \n",
"Q 1688 2584 1439 2293 \n",
"Q 1191 2003 1191 1497 \n",
"Q 1191 994 1439 701 \n",
"Q 1688 409 2113 409 \n",
"Q 2538 409 2786 701 \n",
"Q 3034 994 3034 1497 \n",
"Q 3034 2003 2786 2293 \n",
"Q 2538 2584 2113 2584 \n",
"z\n",
"M 3366 4563 \n",
"L 3366 3988 \n",
"Q 3128 4100 2886 4159 \n",
"Q 2644 4219 2406 4219 \n",
"Q 1781 4219 1451 3797 \n",
"Q 1122 3375 1075 2522 \n",
"Q 1259 2794 1537 2939 \n",
"Q 1816 3084 2150 3084 \n",
"Q 2853 3084 3261 2657 \n",
"Q 3669 2231 3669 1497 \n",
"Q 3669 778 3244 343 \n",
"Q 2819 -91 2113 -91 \n",
"Q 1303 -91 875 529 \n",
"Q 447 1150 447 2328 \n",
"Q 447 3434 972 4092 \n",
"Q 1497 4750 2381 4750 \n",
"Q 2619 4750 2861 4703 \n",
"Q 3103 4656 3366 4563 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-36\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_4\">\n",
" <g id=\"line2d_7\">\n",
" <path d=\"M 182.003125 143.1 \n",
"L 182.003125 7.2 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_8\">\n",
" <g>\n",
" <use xlink:href=\"#m621c46cd9a\" x=\"182.003125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_4\">\n",
" <!-- 8 -->\n",
" <g transform=\"translate(178.821875 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-38\" d=\"M 2034 2216 \n",
"Q 1584 2216 1326 1975 \n",
"Q 1069 1734 1069 1313 \n",
"Q 1069 891 1326 650 \n",
"Q 1584 409 2034 409 \n",
"Q 2484 409 2743 651 \n",
"Q 3003 894 3003 1313 \n",
"Q 3003 1734 2745 1975 \n",
"Q 2488 2216 2034 2216 \n",
"z\n",
"M 1403 2484 \n",
"Q 997 2584 770 2862 \n",
"Q 544 3141 544 3541 \n",
"Q 544 4100 942 4425 \n",
"Q 1341 4750 2034 4750 \n",
"Q 2731 4750 3128 4425 \n",
"Q 3525 4100 3525 3541 \n",
"Q 3525 3141 3298 2862 \n",
"Q 3072 2584 2669 2484 \n",
"Q 3125 2378 3379 2068 \n",
"Q 3634 1759 3634 1313 \n",
"Q 3634 634 3220 271 \n",
"Q 2806 -91 2034 -91 \n",
"Q 1263 -91 848 271 \n",
"Q 434 634 434 1313 \n",
"Q 434 1759 690 2068 \n",
"Q 947 2378 1403 2484 \n",
"z\n",
"M 1172 3481 \n",
"Q 1172 3119 1398 2916 \n",
"Q 1625 2713 2034 2713 \n",
"Q 2441 2713 2670 2916 \n",
"Q 2900 3119 2900 3481 \n",
"Q 2900 3844 2670 4047 \n",
"Q 2441 4250 2034 4250 \n",
"Q 1625 4250 1398 4047 \n",
"Q 1172 3844 1172 3481 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-38\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"xtick_5\">\n",
" <g id=\"line2d_9\">\n",
" <path d=\"M 225.403125 143.1 \n",
"L 225.403125 7.2 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_10\">\n",
" <g>\n",
" <use xlink:href=\"#m621c46cd9a\" x=\"225.403125\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_5\">\n",
" <!-- 10 -->\n",
" <g transform=\"translate(219.040625 157.698438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
"L 1825 531 \n",
"L 1825 4091 \n",
"L 703 3866 \n",
"L 703 4441 \n",
"L 1819 4666 \n",
"L 2450 4666 \n",
"L 2450 531 \n",
"L 3481 531 \n",
"L 3481 0 \n",
"L 794 0 \n",
"L 794 531 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
"Q 1547 4250 1301 3770 \n",
"Q 1056 3291 1056 2328 \n",
"Q 1056 1369 1301 889 \n",
"Q 1547 409 2034 409 \n",
"Q 2525 409 2770 889 \n",
"Q 3016 1369 3016 2328 \n",
"Q 3016 3291 2770 3770 \n",
"Q 2525 4250 2034 4250 \n",
"z\n",
"M 2034 4750 \n",
"Q 2819 4750 3233 4129 \n",
"Q 3647 3509 3647 2328 \n",
"Q 3647 1150 3233 529 \n",
"Q 2819 -91 2034 -91 \n",
"Q 1250 -91 836 529 \n",
"Q 422 1150 422 2328 \n",
"Q 422 3509 836 4129 \n",
"Q 1250 4750 2034 4750 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"63.623047\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_6\">\n",
" <!-- epoch -->\n",
" <g transform=\"translate(112.525 171.376563)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-65\" d=\"M 3597 1894 \n",
"L 3597 1613 \n",
"L 953 1613 \n",
"Q 991 1019 1311 708 \n",
"Q 1631 397 2203 397 \n",
"Q 2534 397 2845 478 \n",
"Q 3156 559 3463 722 \n",
"L 3463 178 \n",
"Q 3153 47 2828 -22 \n",
"Q 2503 -91 2169 -91 \n",
"Q 1331 -91 842 396 \n",
"Q 353 884 353 1716 \n",
"Q 353 2575 817 3079 \n",
"Q 1281 3584 2069 3584 \n",
"Q 2775 3584 3186 3129 \n",
"Q 3597 2675 3597 1894 \n",
"z\n",
"M 3022 2063 \n",
"Q 3016 2534 2758 2815 \n",
"Q 2500 3097 2075 3097 \n",
"Q 1594 3097 1305 2825 \n",
"Q 1016 2553 972 2059 \n",
"L 3022 2063 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-70\" d=\"M 1159 525 \n",
"L 1159 -1331 \n",
"L 581 -1331 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2969 \n",
"Q 1341 3281 1617 3432 \n",
"Q 1894 3584 2278 3584 \n",
"Q 2916 3584 3314 3078 \n",
"Q 3713 2572 3713 1747 \n",
"Q 3713 922 3314 415 \n",
"Q 2916 -91 2278 -91 \n",
"Q 1894 -91 1617 61 \n",
"Q 1341 213 1159 525 \n",
"z\n",
"M 3116 1747 \n",
"Q 3116 2381 2855 2742 \n",
"Q 2594 3103 2138 3103 \n",
"Q 1681 3103 1420 2742 \n",
"Q 1159 2381 1159 1747 \n",
"Q 1159 1113 1420 752 \n",
"Q 1681 391 2138 391 \n",
"Q 2594 391 2855 752 \n",
"Q 3116 1113 3116 1747 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6f\" d=\"M 1959 3097 \n",
"Q 1497 3097 1228 2736 \n",
"Q 959 2375 959 1747 \n",
"Q 959 1119 1226 758 \n",
"Q 1494 397 1959 397 \n",
"Q 2419 397 2687 759 \n",
"Q 2956 1122 2956 1747 \n",
"Q 2956 2369 2687 2733 \n",
"Q 2419 3097 1959 3097 \n",
"z\n",
"M 1959 3584 \n",
"Q 2709 3584 3137 3096 \n",
"Q 3566 2609 3566 1747 \n",
"Q 3566 888 3137 398 \n",
"Q 2709 -91 1959 -91 \n",
"Q 1206 -91 779 398 \n",
"Q 353 888 353 1747 \n",
"Q 353 2609 779 3096 \n",
"Q 1206 3584 1959 3584 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-63\" d=\"M 3122 3366 \n",
"L 3122 2828 \n",
"Q 2878 2963 2633 3030 \n",
"Q 2388 3097 2138 3097 \n",
"Q 1578 3097 1268 2742 \n",
"Q 959 2388 959 1747 \n",
"Q 959 1106 1268 751 \n",
"Q 1578 397 2138 397 \n",
"Q 2388 397 2633 464 \n",
"Q 2878 531 3122 666 \n",
"L 3122 134 \n",
"Q 2881 22 2623 -34 \n",
"Q 2366 -91 2075 -91 \n",
"Q 1284 -91 818 406 \n",
"Q 353 903 353 1747 \n",
"Q 353 2603 823 3093 \n",
"Q 1294 3584 2113 3584 \n",
"Q 2378 3584 2631 3529 \n",
"Q 2884 3475 3122 3366 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-68\" d=\"M 3513 2113 \n",
"L 3513 0 \n",
"L 2938 0 \n",
"L 2938 2094 \n",
"Q 2938 2591 2744 2837 \n",
"Q 2550 3084 2163 3084 \n",
"Q 1697 3084 1428 2787 \n",
"Q 1159 2491 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 4863 \n",
"L 1159 4863 \n",
"L 1159 2956 \n",
"Q 1366 3272 1645 3428 \n",
"Q 1925 3584 2291 3584 \n",
"Q 2894 3584 3203 3211 \n",
"Q 3513 2838 3513 2113 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-65\"/>\n",
" <use xlink:href=\"#DejaVuSans-70\" x=\"61.523438\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"125\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"186.181641\"/>\n",
" <use xlink:href=\"#DejaVuSans-68\" x=\"241.162109\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"matplotlib.axis_2\">\n",
" <g id=\"ytick_1\">\n",
" <g id=\"line2d_11\">\n",
" <path d=\"M 30.103125 118.642556 \n",
"L 225.403125 118.642556 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_12\">\n",
" <defs>\n",
" <path id=\"mc92bba050c\" d=\"M 0 0 \n",
"L -3.5 0 \n",
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </defs>\n",
" <g>\n",
" <use xlink:href=\"#mc92bba050c\" x=\"30.103125\" y=\"118.642556\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_7\">\n",
" <!-- 0.4 -->\n",
" <g transform=\"translate(7.2 122.441775)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
"L 1344 794 \n",
"L 1344 0 \n",
"L 684 0 \n",
"L 684 794 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-34\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_2\">\n",
" <g id=\"line2d_13\">\n",
" <path d=\"M 30.103125 92.630015 \n",
"L 225.403125 92.630015 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_14\">\n",
" <g>\n",
" <use xlink:href=\"#mc92bba050c\" x=\"30.103125\" y=\"92.630015\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_8\">\n",
" <!-- 0.6 -->\n",
" <g transform=\"translate(7.2 96.429234)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-36\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_3\">\n",
" <g id=\"line2d_15\">\n",
" <path d=\"M 30.103125 66.617474 \n",
"L 225.403125 66.617474 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_16\">\n",
" <g>\n",
" <use xlink:href=\"#mc92bba050c\" x=\"30.103125\" y=\"66.617474\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_9\">\n",
" <!-- 0.8 -->\n",
" <g transform=\"translate(7.2 70.416693)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-30\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-38\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_4\">\n",
" <g id=\"line2d_17\">\n",
" <path d=\"M 30.103125 40.604934 \n",
"L 225.403125 40.604934 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_18\">\n",
" <g>\n",
" <use xlink:href=\"#mc92bba050c\" x=\"30.103125\" y=\"40.604934\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_10\">\n",
" <!-- 1.0 -->\n",
" <g transform=\"translate(7.2 44.404152)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"ytick_5\">\n",
" <g id=\"line2d_19\">\n",
" <path d=\"M 30.103125 14.592393 \n",
"L 225.403125 14.592393 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_20\">\n",
" <g>\n",
" <use xlink:href=\"#mc92bba050c\" x=\"30.103125\" y=\"14.592393\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"text_11\">\n",
" <!-- 1.2 -->\n",
" <g transform=\"translate(7.2 18.391611)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-31\"/>\n",
" <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
" <use xlink:href=\"#DejaVuSans-32\" x=\"95.410156\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_21\">\n",
" <path d=\"M 12.743125 13.377273 \n",
"L 17.083125 44.380515 \n",
"L 21.423125 59.699336 \n",
"L 25.763125 68.032228 \n",
"L 30.103125 74.443428 \n",
"L 34.443125 103.238984 \n",
"L 38.783125 106.697713 \n",
"L 43.123125 107.93026 \n",
"L 47.463125 109.305601 \n",
"L 51.803125 110.232821 \n",
"L 56.143125 117.48204 \n",
"L 60.483125 116.821938 \n",
"L 64.823125 116.392961 \n",
"L 69.163125 117.65391 \n",
"L 73.503125 118.441025 \n",
"L 77.843125 124.268517 \n",
"L 82.183125 122.543142 \n",
"L 86.523125 122.662327 \n",
"L 90.863125 123.121934 \n",
"L 95.203125 123.541077 \n",
"L 99.543125 125.448467 \n",
"L 103.883125 125.728083 \n",
"L 108.223125 126.451788 \n",
"L 112.563125 127.038452 \n",
"L 116.903125 127.31891 \n",
"L 121.243125 127.878426 \n",
"L 125.583125 129.376324 \n",
"L 129.923125 130.164087 \n",
"L 134.263125 130.019464 \n",
"L 138.603125 130.035596 \n",
"L 142.943125 130.878784 \n",
"L 147.283125 131.667441 \n",
"L 151.623125 131.13423 \n",
"L 155.963125 131.653062 \n",
"L 160.303125 131.922779 \n",
"L 164.643125 131.469092 \n",
"L 168.983125 132.460443 \n",
"L 173.323125 132.546275 \n",
"L 177.663125 132.86345 \n",
"L 182.003125 133.315089 \n",
"L 186.343125 134.95568 \n",
"L 190.683125 135.220623 \n",
"L 195.023125 135.135746 \n",
"L 199.363125 135.294664 \n",
"L 203.703125 134.900771 \n",
"L 208.043125 136.922727 \n",
"L 212.383125 136.551566 \n",
"L 216.723125 135.887731 \n",
"L 221.063125 136.065999 \n",
"L 225.403125 135.987509 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"line2d_22\">\n",
" <path d=\"M 12.743125 95.648093 \n",
"L 17.083125 85.751782 \n",
"L 21.423125 80.557706 \n",
"L 25.763125 77.657995 \n",
"L 30.103125 75.483415 \n",
"L 34.443125 65.164646 \n",
"L 38.783125 63.775596 \n",
"L 43.123125 63.445899 \n",
"L 47.463125 62.959461 \n",
"L 51.803125 62.570357 \n",
"L 56.143125 59.803019 \n",
"L 60.483125 60.14893 \n",
"L 64.823125 60.285854 \n",
"L 69.163125 59.857068 \n",
"L 73.503125 59.544231 \n",
"L 77.843125 57.435687 \n",
"L 82.183125 58.235608 \n",
"L 86.523125 58.242814 \n",
"L 90.863125 57.997794 \n",
"L 95.203125 57.844745 \n",
"L 99.543125 57.208683 \n",
"L 103.883125 57.068157 \n",
"L 108.223125 56.679006 \n",
"L 112.563125 56.492538 \n",
"L 116.903125 56.459577 \n",
"L 121.243125 56.462811 \n",
"L 125.583125 55.91692 \n",
"L 129.923125 55.489936 \n",
"L 134.263125 55.557496 \n",
"L 138.603125 55.596828 \n",
"L 142.943125 55.879086 \n",
"L 147.283125 55.354814 \n",
"L 151.623125 55.313377 \n",
"L 155.963125 55.054844 \n",
"L 160.303125 54.987701 \n",
"L 164.643125 55.046736 \n",
"L 168.983125 54.69542 \n",
"L 173.323125 54.607141 \n",
"L 177.663125 54.58462 \n",
"L 182.003125 54.424096 \n",
"L 186.343125 53.479325 \n",
"L 190.683125 53.760378 \n",
"L 195.023125 53.767585 \n",
"L 199.363125 53.6739 \n",
"L 203.703125 53.884336 \n",
"L 208.043125 52.819932 \n",
"L 212.383125 52.960458 \n",
"L 216.723125 53.230701 \n",
"L 221.063125 53.303667 \n",
"L 225.403125 53.403104 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"line2d_23\">\n",
" <path d=\"M 30.103125 75.773889 \n",
"L 51.803125 63.04075 \n",
"L 73.503125 81.886836 \n",
"L 95.203125 62.442462 \n",
"L 116.903125 65.095741 \n",
"L 138.603125 56.017364 \n",
"L 160.303125 67.527913 \n",
"L 182.003125 67.892089 \n",
"L 203.703125 60.023295 \n",
"L 225.403125 78.622262 \n",
"\" clip-path=\"url(#p3636248d04)\" style=\"fill: none; stroke-dasharray: 9.6,2.4,1.5,2.4; stroke-dashoffset: 0; stroke: #008000; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"patch_3\">\n",
" <path d=\"M 30.103125 143.1 \n",
"L 30.103125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_4\">\n",
" <path d=\"M 225.403125 143.1 \n",
"L 225.403125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_5\">\n",
" <path d=\"M 30.103125 143.1 \n",
"L 225.403125 143.1 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"patch_6\">\n",
" <path d=\"M 30.103125 7.2 \n",
"L 225.403125 7.2 \n",
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"legend_1\">\n",
" <g id=\"patch_7\">\n",
" <path d=\"M 37.103125 59.234375 \n",
"L 114.871875 59.234375 \n",
"Q 116.871875 59.234375 116.871875 57.234375 \n",
"L 116.871875 14.2 \n",
"Q 116.871875 12.2 114.871875 12.2 \n",
"L 37.103125 12.2 \n",
"Q 35.103125 12.2 35.103125 14.2 \n",
"L 35.103125 57.234375 \n",
"Q 35.103125 59.234375 37.103125 59.234375 \n",
"z\n",
"\" style=\"fill: #ffffff; opacity: 0.8; stroke: #cccccc; stroke-linejoin: miter\"/>\n",
" </g>\n",
" <g id=\"line2d_24\">\n",
" <path d=\"M 39.103125 20.298438 \n",
"L 49.103125 20.298438 \n",
"L 59.103125 20.298438 \n",
"\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
" </g>\n",
" <g id=\"text_12\">\n",
" <!-- train loss -->\n",
" <g transform=\"translate(67.103125 23.798438)scale(0.1 -0.1)\">\n",
" <defs>\n",
" <path id=\"DejaVuSans-74\" d=\"M 1172 4494 \n",
"L 1172 3500 \n",
"L 2356 3500 \n",
"L 2356 3053 \n",
"L 1172 3053 \n",
"L 1172 1153 \n",
"Q 1172 725 1289 603 \n",
"Q 1406 481 1766 481 \n",
"L 2356 481 \n",
"L 2356 0 \n",
"L 1766 0 \n",
"Q 1100 0 847 248 \n",
"Q 594 497 594 1153 \n",
"L 594 3053 \n",
"L 172 3053 \n",
"L 172 3500 \n",
"L 594 3500 \n",
"L 594 4494 \n",
"L 1172 4494 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-72\" d=\"M 2631 2963 \n",
"Q 2534 3019 2420 3045 \n",
"Q 2306 3072 2169 3072 \n",
"Q 1681 3072 1420 2755 \n",
"Q 1159 2438 1159 1844 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2956 \n",
"Q 1341 3275 1631 3429 \n",
"Q 1922 3584 2338 3584 \n",
"Q 2397 3584 2469 3576 \n",
"Q 2541 3569 2628 3553 \n",
"L 2631 2963 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-61\" d=\"M 2194 1759 \n",
"Q 1497 1759 1228 1600 \n",
"Q 959 1441 959 1056 \n",
"Q 959 750 1161 570 \n",
"Q 1363 391 1709 391 \n",
"Q 2188 391 2477 730 \n",
"Q 2766 1069 2766 1631 \n",
"L 2766 1759 \n",
"L 2194 1759 \n",
"z\n",
"M 3341 1997 \n",
"L 3341 0 \n",
"L 2766 0 \n",
"L 2766 531 \n",
"Q 2569 213 2275 61 \n",
"Q 1981 -91 1556 -91 \n",
"Q 1019 -91 701 211 \n",
"Q 384 513 384 1019 \n",
"Q 384 1609 779 1909 \n",
"Q 1175 2209 1959 2209 \n",
"L 2766 2209 \n",
"L 2766 2266 \n",
"Q 2766 2663 2505 2880 \n",
"Q 2244 3097 1772 3097 \n",
"Q 1472 3097 1187 3025 \n",
"Q 903 2953 641 2809 \n",
"L 641 3341 \n",
"Q 956 3463 1253 3523 \n",
"Q 1550 3584 1831 3584 \n",
"Q 2591 3584 2966 3190 \n",
"Q 3341 2797 3341 1997 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-69\" d=\"M 603 3500 \n",
"L 1178 3500 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 3500 \n",
"z\n",
"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 4134 \n",
"L 603 4134 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6e\" d=\"M 3513 2113 \n",
"L 3513 0 \n",
"L 2938 0 \n",
"L 2938 2094 \n",
"Q 2938 2591 2744 2837 \n",
"Q 2550 3084 2163 3084 \n",
"Q 1697 3084 1428 2787 \n",
"Q 1159 2491 1159 1978 \n",
"L 1159 0 \n",
"L 581 0 \n",
"L 581 3500 \n",
"L 1159 3500 \n",
"L 1159 2956 \n",
"Q 1366 3272 1645 3428 \n",
"Q 1925 3584 2291 3584 \n",
"Q 2894 3584 3203 3211 \n",
"Q 3513 2838 3513 2113 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-20\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-6c\" d=\"M 603 4863 \n",
"L 1178 4863 \n",
"L 1178 0 \n",
"L 603 0 \n",
"L 603 4863 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" <path id=\"DejaVuSans-73\" d=\"M 2834 3397 \n",
"L 2834 2853 \n",
"Q 2591 2978 2328 3040 \n",
"Q 2066 3103 1784 3103 \n",
"Q 1356 3103 1142 2972 \n",
"Q 928 2841 928 2578 \n",
"Q 928 2378 1081 2264 \n",
"Q 1234 2150 1697 2047 \n",
"L 1894 2003 \n",
"Q 2506 1872 2764 1633 \n",
"Q 3022 1394 3022 966 \n",
"Q 3022 478 2636 193 \n",
"Q 2250 -91 1575 -91 \n",
"Q 1294 -91 989 -36 \n",
"Q 684 19 347 128 \n",
"L 347 722 \n",
"Q 666 556 975 473 \n",
"Q 1284 391 1588 391 \n",
"Q 1994 391 2212 530 \n",
"Q 2431 669 2431 922 \n",
"Q 2431 1156 2273 1281 \n",
"Q 2116 1406 1581 1522 \n",
"L 1381 1569 \n",
"Q 847 1681 609 1914 \n",
"Q 372 2147 372 2553 \n",
"Q 372 3047 722 3315 \n",
"Q 1072 3584 1716 3584 \n",
"Q 2034 3584 2315 3537 \n",
"Q 2597 3491 2834 3397 \n",
"z\n",
"\" transform=\"scale(0.015625)\"/>\n",
" </defs>\n",
" <use xlink:href=\"#DejaVuSans-74\"/>\n",
" <use xlink:href=\"#DejaVuSans-72\" x=\"39.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"80.322266\"/>\n",
" <use xlink:href=\"#DejaVuSans-69\" x=\"141.601562\"/>\n",
" <use xlink:href=\"#DejaVuSans-6e\" x=\"169.384766\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"232.763672\"/>\n",
" <use xlink:href=\"#DejaVuSans-6c\" x=\"264.550781\"/>\n",
" <use xlink:href=\"#DejaVuSans-6f\" x=\"292.333984\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"353.515625\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"405.615234\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_25\">\n",
" <path d=\"M 39.103125 34.976562 \n",
"L 49.103125 34.976562 \n",
"L 59.103125 34.976562 \n",
"\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"text_13\">\n",
" <!-- train acc -->\n",
" <g transform=\"translate(67.103125 38.476562)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-74\"/>\n",
" <use xlink:href=\"#DejaVuSans-72\" x=\"39.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"80.322266\"/>\n",
" <use xlink:href=\"#DejaVuSans-69\" x=\"141.601562\"/>\n",
" <use xlink:href=\"#DejaVuSans-6e\" x=\"169.384766\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"232.763672\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"264.550781\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"325.830078\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"380.810547\"/>\n",
" </g>\n",
" </g>\n",
" <g id=\"line2d_26\">\n",
" <path d=\"M 39.103125 49.654688 \n",
"L 49.103125 49.654688 \n",
"L 59.103125 49.654688 \n",
"\" style=\"fill: none; stroke-dasharray: 9.6,2.4,1.5,2.4; stroke-dashoffset: 0; stroke: #008000; stroke-width: 1.5\"/>\n",
" </g>\n",
" <g id=\"text_14\">\n",
" <!-- test acc -->\n",
" <g transform=\"translate(67.103125 53.154688)scale(0.1 -0.1)\">\n",
" <use xlink:href=\"#DejaVuSans-74\"/>\n",
" <use xlink:href=\"#DejaVuSans-65\" x=\"39.208984\"/>\n",
" <use xlink:href=\"#DejaVuSans-73\" x=\"100.732422\"/>\n",
" <use xlink:href=\"#DejaVuSans-74\" x=\"152.832031\"/>\n",
" <use xlink:href=\"#DejaVuSans-20\" x=\"192.041016\"/>\n",
" <use xlink:href=\"#DejaVuSans-61\" x=\"223.828125\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"285.107422\"/>\n",
" <use xlink:href=\"#DejaVuSans-63\" x=\"340.087891\"/>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" </g>\n",
" <defs>\n",
" <clipPath id=\"p3636248d04\">\n",
" <rect x=\"30.103125\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
" </clipPath>\n",
" </defs>\n",
"</svg>\n"
],
"text/plain": [
"<Figure size 252x180 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"d2l.train_ch6(net, train_iter, test_iter, num_epochs, lr, d2l.try_gpu())"
]
},
{
"cell_type": "markdown",
"id": "27877eed",
"metadata": {
"origin_pos": 30
},
"source": [
"## 争议\n",
"\n",
"直观地说,批量规范化被认为可以使优化更加平滑。\n",
"然而,我们必须小心区分直觉和对我们观察到的现象的真实解释。\n",
"回想一下,我们甚至不知道简单的神经网络(多层感知机和传统的卷积神经网络)为什么如此有效。\n",
"即使在暂退法和权重衰减的情况下,它们仍然非常灵活,因此无法通过常规的学习理论泛化保证来解释它们是否能够泛化到看不见的数据。\n",
"\n",
"在提出批量规范化的论文中,作者除了介绍了其应用,还解释了其原理:通过减少*内部协变量偏移*(internal covariate shift)。\n",
"据推测,作者所说的*内部协变量转移*类似于上述的投机直觉,即变量值的分布在训练过程中会发生变化。\n",
"然而,这种解释有两个问题:\n",
"1、这种偏移与严格定义的*协变量偏移*(covariate shift)非常不同,所以这个名字用词不当;\n",
"2、这种解释只提供了一种不明确的直觉,但留下了一个有待后续挖掘的问题:为什么这项技术如此有效?\n",
"本书旨在传达实践者用来发展深层神经网络的直觉。\n",
"然而,重要的是将这些指导性直觉与既定的科学事实区分开来。\n",
"最终,当你掌握了这些方法,并开始撰写自己的研究论文时,你会希望清楚地区分技术和直觉。\n",
"\n",
"随着批量规范化的普及,*内部协变量偏移*的解释反复出现在技术文献的辩论,特别是关于“如何展示机器学习研究”的更广泛的讨论中。\n",
"Ali Rahimi在接受2017年NeurIPS大会的“接受时间考验奖”(Test of Time Award)时发表了一篇令人难忘的演讲。他将“内部协变量转移”作为焦点,将现代深度学习的实践比作炼金术。\n",
"他对该示例进行了详细回顾 :cite:`Lipton.Steinhardt.2018`,概述了机器学习中令人不安的趋势。\n",
"此外,一些作者对批量规范化的成功提出了另一种解释:在某些方面,批量规范化的表现出与原始论文 :cite:`Santurkar.Tsipras.Ilyas.ea.2018`中声称的行为是相反的。\n",
"\n",
"然而,与机器学习文献中成千上万类似模糊的说法相比,内部协变量偏移没有更值得批评。\n",
"很可能,它作为这些辩论的焦点而产生共鸣,要归功于目标受众对它的广泛认可。\n",
"批量规范化已经被证明是一种不可或缺的方法。它适用于几乎所有图像分类器,并在学术界获得了数万引用。\n",
"\n",
"## 小结\n",
"\n",
"* 在模型训练过程中,批量规范化利用小批量的均值和标准差,不断调整神经网络的中间输出,使整个神经网络各层的中间输出值更加稳定。\n",
"* 批量规范化在全连接层和卷积层的使用略有不同。\n",
"* 批量规范化层和暂退层一样,在训练模式和预测模式下计算不同。\n",
"* 批量规范化有许多有益的副作用,主要是正则化。另一方面,”减少内部协变量偏移“的原始动机似乎不是一个有效的解释。\n",
"\n",
"## 练习\n",
"\n",
"1. 在使用批量规范化之前,我们是否可以从全连接层或卷积层中删除偏置参数?为什么?\n",
"1. 比较LeNet在使用和不使用批量规范化情况下的学习率。\n",
" 1. 绘制训练和测试准确度的提高。\n",
" 1. 学习率有多高?\n",
"1. 我们是否需要在每个层中进行批量规范化?尝试一下?\n",
"1. 可以通过批量规范化来替换暂退法吗?行为会如何改变?\n",
"1. 确定参数`beta`和`gamma`,并观察和分析结果。\n",
"1. 查看高级API中有关`BatchNorm`的在线文档,以查看其他批量规范化的应用。\n",
"1. 研究思路:可以应用的其他“规范化”转换?可以应用概率积分变换吗?全秩协方差估计可以么?\n"
]
},
{
"cell_type": "markdown",
"id": "19fc75ac",
"metadata": {
"origin_pos": 32,
"tab": [
"pytorch"
]
},
"source": [
"[Discussions](https://discuss.d2l.ai/t/1874)\n"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"required_libs": []
},
"nbformat": 4,
"nbformat_minor": 5
}