Update

2025-12-16 09:23:53 +08:00
parent 19138d3cc1
commit 9e7efd0626
409 changed files with 272713 additions and 241 deletions
@@ -0,0 +1,596 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "b4873f83",
+   "metadata": {
+    "origin_pos": 0
+   },
+   "source": [
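+    "# Automatic Differentiation\n",
+    ":label:`sec_autograd`\n",
+    "\n",
+    "As explained in :numref:`sec_calculus`, differentiation is a crucial step in nearly all deep learning optimization algorithms.\n",
+    "Computing these derivatives is straightforward and requires only basic calculus,\n",
+    "but for complex models, working out the updates by hand is tedious and error-prone.\n",
+    "\n",
+    "Deep learning frameworks expedite this work by computing derivatives automatically, a technique known as *automatic differentiation*.\n",
+    "In practice, based on the model we design, the system builds a *computational graph*\n",
+    "that tracks which data are combined through which operations to produce the output.\n",
+    "Automatic differentiation enables the system to subsequently backpropagate gradients.\n",
+    "Here, *backpropagate* means tracing back through the computational graph and filling in the partial derivatives with respect to each parameter.\n",
+    "\n",
+    "## A Simple Example\n",
+    "\n",
+    "As a toy example, (**suppose that we want to differentiate the function $y=2\\mathbf{x}^{\\top}\\mathbf{x}$ with respect to the column vector $\\mathbf{x}$**).\n",
+    "To start, we create the variable `x` and assign it an initial value.\n"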
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "98cd8a9e",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:31.627945Z",
+     "iopub.status.busy": "2023-08-18T07:07:31.627424Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.686372Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.685559Z"
+    },
+    "origin_pos": 2,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor([0., 1., 2., 3.])"
+      ]
+     },
+     "execution_count": 1,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import torch\n",
+    "\n",
+    "x = torch.arange(4.0)\n",
+    "x"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ec430520",
+   "metadata": {
+    "origin_pos": 5
+   },
+   "source": [
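+    "[**Before we calculate the gradient of $y$ with respect to $\\mathbf{x}$, we need a place to store it.**]\n",
+    "Importantly, we do not allocate new memory every time we take a derivative with respect to a parameter,\n",
+    "because we often update the same parameters thousands of times, and allocating new memory each time could quickly exhaust it.\n",
+    "Note that the gradient of a scalar-valued function with respect to a vector $\\mathbf{x}$ is itself a vector with the same shape as $\\mathbf{x}$.\n"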
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "e27a5df4",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:32.690633Z",
+     "iopub.status.busy": "2023-08-18T07:07:32.689882Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.694159Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.693367Z"
+    },
+    "origin_pos": 7,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "x.requires_grad_(True)  # Same as x = torch.arange(4.0, requires_grad=True)\n",
+    "x.grad  # The default value is None"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "bd993524",
+   "metadata": {
+    "origin_pos": 10
+   },
+   "source": [
+    "(**Now we calculate $y$.**)\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "4c3f80b7",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:32.698006Z",
+     "iopub.status.busy": "2023-08-18T07:07:32.697167Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.705385Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.704593Z"
+    },
+    "origin_pos": 12,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor(28., grad_fn=<MulBackward0>)"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "y = 2 * torch.dot(x, x)\n",
+    "y"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "35523dbc",
+   "metadata": {
+    "origin_pos": 15
+   },
+   "source": [
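+    "Since `x` is a vector of length 4, taking the dot product of `x` with itself and doubling it yields the scalar output that we assign to `y`.\n",
+    "Next, [**we automatically compute the gradient of `y` with respect to each component of `x` by calling the backpropagation function**] and print the gradient.\n"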
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "a1c3a419",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:32.708698Z",
+     "iopub.status.busy": "2023-08-18T07:07:32.708196Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.713924Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.713091Z"
+    },
+    "origin_pos": 17,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor([ 0.,  4.,  8., 12.])"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "y.backward()\n",
+    "x.grad"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dca6a271",
+   "metadata": {
+    "origin_pos": 20
+   },
+   "source": [
+    "The gradient of the function $y=2\\mathbf{x}^{\\top}\\mathbf{x}$ with respect to $\\mathbf{x}$ should be $4\\mathbf{x}$.\n",
+    "Let us quickly verify that this gradient was computed correctly.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "b8493d0a",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:32.718858Z",
+     "iopub.status.busy": "2023-08-18T07:07:32.718156Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.724091Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.723104Z"
+    },
+    "origin_pos": 22,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor([True, True, True, True])"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "x.grad == 4 * x"
+   ]
+  },
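+  {
+   "cell_type": "markdown",
+   "id": "sketch-grad-4x",
+   "metadata": {},
+   "source": [
+    "As a hedged aside, the value $4\\mathbf{x}$ can also be checked by hand by writing the dot product out componentwise.\n",
+    "The index $i$ and the dimension $n$ are notation introduced here for illustration only:\n",
+    "\n",
+    "$$y = 2\\mathbf{x}^{\\top}\\mathbf{x} = 2\\sum_{i=1}^{n} x_i^2, \\qquad \\frac{\\partial y}{\\partial x_i} = 4x_i, \\qquad \\nabla_{\\mathbf{x}}\\, y = 4\\mathbf{x}.$$\n",
+    "\n",
+    "For $\\mathbf{x} = [0, 1, 2, 3]^{\\top}$ this gives $[0, 4, 8, 12]^{\\top}$, which agrees with the value of `x.grad` printed above.\n"
+   ]
+  },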
+  {
+   "cell_type": "markdown",
+   "id": "2733c623",
+   "metadata": {
+    "origin_pos": 25
+   },
+   "source": [
+    "[**Now we compute another function of `x`.**]\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "f2fcd392",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:32.729368Z",
+     "iopub.status.busy": "2023-08-18T07:07:32.728433Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.736493Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.735715Z"
+    },
+    "origin_pos": 27,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor([1., 1., 1., 1.])"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# By default, PyTorch accumulates gradients, so we need to clear the previous values\n",
+    "x.grad.zero_()\n",
+    "y = x.sum()\n",
+    "y.backward()\n",
+    "x.grad"
+   ]
+  },
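+  {
+   "cell_type": "markdown",
+   "id": "sketch-grad-accum",
+   "metadata": {},
+   "source": [
+    "The comment in the cell above notes that PyTorch accumulates gradients by default.\n",
+    "The following is a minimal sketch of what that means in practice; it is not part of the original notebook,\n",
+    "and the variable names `first`, `accumulated`, and `fresh` are chosen here purely for illustration.\n",
+    "\n",
+    "```python\n",
+    "import torch\n",
+    "\n",
+    "x = torch.arange(4.0, requires_grad=True)\n",
+    "\n",
+    "# First backward pass: the gradient of sum(x) is a vector of ones.\n",
+    "x.sum().backward()\n",
+    "first = x.grad.clone()\n",
+    "\n",
+    "# Second backward pass without zeroing: the new gradient is added to the old one.\n",
+    "x.sum().backward()\n",
+    "accumulated = x.grad.clone()\n",
+    "\n",
+    "# Reset the buffer in place, then run one more pass.\n",
+    "x.grad.zero_()\n",
+    "x.sum().backward()\n",
+    "fresh = x.grad.clone()\n",
+    "\n",
+    "print(first, accumulated, fresh)\n",
+    "```\n",
+    "\n",
+    "Under these assumptions, `accumulated` should be twice `first`, while `fresh` matches `first` again because `zero_()` cleared the buffer in between.\n"
+   ]
+  },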
+  {
+   "cell_type": "markdown",
+   "id": "58f4f459",
+   "metadata": {
+    "origin_pos": 30
+   },
+   "source": [
+    "## 非标量变量的反向传播\n",
+    "\n",
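+    "## Backward for Non-Scalar Variables\n",
+    "\n",
+    "When `y` is not a scalar, the most natural interpretation of the derivative of a vector `y` with respect to a vector `x` is a matrix.\n",
+    "For higher-order and higher-dimensional `y` and `x`, the result of differentiation can be a high-order tensor.\n",
+    "\n",
+    "However, while these more exotic objects do show up in advanced machine learning (including [**in deep learning**]),\n",
+    "when we call backward on a vector, we are usually trying to compute the derivatives of the loss function for each constituent of a batch of training examples.\n",
+    "Here, (**our intent is not to compute the differentiation matrix but rather the sum of the partial derivatives computed individually for each example in the batch.**)\n"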
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "f4e62a5d",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:32.740109Z",
+     "iopub.status.busy": "2023-08-18T07:07:32.739419Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.745803Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.744893Z"
+    },
+    "origin_pos": 32,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor([0., 2., 4., 6.])"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
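+    "# Calling backward on a non-scalar requires a gradient argument, which specifies the gradient of the differentiated function w.r.t. self.\n",
+    "# Here we only want the sum of the partial derivatives, so passing in a gradient of ones is appropriate\n",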
+    "x.grad.zero_()\n",
+    "y = x * x\n",
+    "# Same as y.backward(torch.ones(len(x)))\n",
+    "y.sum().backward()\n",
+    "x.grad"
+   ]
+  },
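+  {
+   "cell_type": "markdown",
+   "id": "sketch-vjp",
+   "metadata": {},
+   "source": [
+    "The cell above states that summing first and passing a vector of ones are equivalent.\n",
+    "A hedged sketch that compares the two routes, and adds a third with a non-uniform weight vector, is given below.\n",
+    "The weight vector `v` and the names `grad_from_sum`, `grad_from_ones`, and `weighted_grad` are illustrative assumptions, not taken from the original notebook.\n",
+    "\n",
+    "```python\n",
+    "import torch\n",
+    "\n",
+    "x = torch.arange(4.0, requires_grad=True)\n",
+    "\n",
+    "# Route 1: reduce to a scalar first, then backpropagate.\n",
+    "y = x * x\n",
+    "y.sum().backward()\n",
+    "grad_from_sum = x.grad.clone()\n",
+    "\n",
+    "# Route 2: pass a vector of ones as the gradient argument.\n",
+    "x.grad.zero_()\n",
+    "y = x * x  # rebuild the graph, since the first backward pass freed it\n",
+    "y.backward(torch.ones_like(x))\n",
+    "grad_from_ones = x.grad.clone()\n",
+    "\n",
+    "# Route 3: a non-uniform weight vector v scales each partial derivative, giving v * 2x.\n",
+    "x.grad.zero_()\n",
+    "v = torch.tensor([1.0, 0.0, 2.0, 0.5])\n",
+    "y = x * x\n",
+    "y.backward(v)\n",
+    "weighted_grad = x.grad.clone()\n",
+    "\n",
+    "print(grad_from_sum, grad_from_ones, weighted_grad)\n",
+    "```\n",
+    "\n",
+    "The first two results should coincide, which is why a vector of ones is the appropriate argument when only the sum of the partial derivatives is wanted.\n"
+   ]
+  },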
+  {
+   "cell_type": "markdown",
+   "id": "80f510c4",
+   "metadata": {
+    "origin_pos": 35
+   },
+   "source": [
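+    "## Detaching Computation\n",
+    "\n",
+    "Sometimes, we wish to [**move some calculations outside of the recorded computational graph**].\n",
+    "For example, say that `y` was calculated as a function of `x`, and that subsequently `z` was calculated as a function of both `y` and `x`.\n",
+    "Now, imagine that we want to calculate the gradient of `z` with respect to `x` but, for some reason, want to treat `y` as a constant\n",
+    "and only take into account the role that `x` played after `y` was calculated.\n",
+    "\n",
+    "Here, we can detach `y` to return a new variable `u` that has the same value as `y`\n",
+    "but discards any information about how `y` was computed in the computational graph.\n",
+    "In other words, the gradient will not flow backwards through `u` to `x`.\n",
+    "Thus, the following backpropagation function computes the partial derivative of `z=u*x` with respect to `x` while treating `u` as a constant,\n",
+    "instead of the partial derivative of `z=x*x*x` with respect to `x`.\n"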
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "8dab493d",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:32.749398Z",
+     "iopub.status.busy": "2023-08-18T07:07:32.748759Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.755280Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.754543Z"
+    },
+    "origin_pos": 37,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor([True, True, True, True])"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "x.grad.zero_()\n",
+    "y = x * x\n",
+    "u = y.detach()\n",
+    "z = u * x\n",
+    "\n",
+    "z.sum().backward()\n",
+    "x.grad == u"
+   ]
+  },
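+  {
+   "cell_type": "markdown",
+   "id": "sketch-detach-math",
+   "metadata": {},
+   "source": [
+    "As a hedged worked check of the cell above, the two candidate derivatives can be written out componentwise.\n",
+    "The indices $i$ and $j$ are notation introduced here for illustration:\n",
+    "\n",
+    "$$\\frac{\\partial}{\\partial x_i}\\sum_{j} u_j x_j = u_i = x_i^2 \\quad \\text{(with } u \\text{ held constant)}, \\qquad \\frac{\\partial}{\\partial x_i}\\sum_{j} x_j^3 = 3x_i^2 \\quad \\text{(without detaching)}.$$\n",
+    "\n",
+    "For $\\mathbf{x} = [0, 1, 2, 3]^{\\top}$ the detached version gives $[0, 1, 4, 9]^{\\top}$, whereas the undetached version would give $[0, 3, 12, 27]^{\\top}$.\n",
+    "The comparison `x.grad == u` returning all `True` is therefore consistent with `u` having been treated as a constant.\n"
+   ]
+  },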
+  {
+   "cell_type": "markdown",
+   "id": "f8fe6f9c",
+   "metadata": {
+    "origin_pos": 40
+   },
+   "source": [
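+    "Since the computation of `y` was recorded, we can subsequently invoke backpropagation on `y`\n",
+    "to get the derivative of `y=x*x` with respect to `x`, which is `2*x`.\n"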
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "271a9b3a",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:32.759344Z",
+     "iopub.status.busy": "2023-08-18T07:07:32.758633Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.764663Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.763922Z"
+    },
+    "origin_pos": 42,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor([True, True, True, True])"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "x.grad.zero_()\n",
+    "y.sum().backward()\n",
+    "x.grad == 2 * x"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fd79d12f",
+   "metadata": {
+    "origin_pos": 45
+   },
+   "source": [
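+    "## Computing the Gradient of Python Control Flow\n",
+    "\n",
+    "One benefit of using automatic differentiation is that\n",
+    "[**even if building the computational graph of a function requires passing through Python control flow (e.g., conditionals, loops, and arbitrary function calls), we can still calculate the gradient of the resulting variable**].\n",
+    "In the following snippet, both the number of iterations of the `while` loop and the outcome of the `if` statement depend on the value of the input `a`.\n"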
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "6323b2ff",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:32.769249Z",
+     "iopub.status.busy": "2023-08-18T07:07:32.768616Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.773175Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.772293Z"
+    },
+    "origin_pos": 47,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "def f(a):\n",
+    "    b = a * 2\n",
+    "    while b.norm() < 1000:\n",
+    "        b = b * 2\n",
+    "    if b.sum() > 0:\n",
+    "        c = b\n",
+    "    else:\n",
+    "        c = 100 * b\n",
+    "    return c"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "51aaf333",
+   "metadata": {
+    "origin_pos": 50
+   },
+   "source": [
+    "Let us compute the gradient.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "7719d6b6",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:32.777740Z",
+     "iopub.status.busy": "2023-08-18T07:07:32.777207Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.782254Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.781458Z"
+    },
+    "origin_pos": 52,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "a = torch.randn(size=(), requires_grad=True)\n",
+    "d = f(a)\n",
+    "d.backward()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "816a1ac2",
+   "metadata": {
+    "origin_pos": 55
+   },
+   "source": [
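+    "We can now analyze the function `f` defined above.\n",
+    "Note that it is piecewise linear in its input `a`.\n",
+    "In other words, for any `a` there exists some constant scalar `k` such that `f(a)=k*a`, where the value of `k` depends on the input `a`. Consequently, `d/a` allows us to verify that the gradient is correct.\n"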
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "id": "2595bdc0",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:07:32.785728Z",
+     "iopub.status.busy": "2023-08-18T07:07:32.785179Z",
+     "iopub.status.idle": "2023-08-18T07:07:32.790672Z",
+     "shell.execute_reply": "2023-08-18T07:07:32.789892Z"
+    },
+    "origin_pos": 57,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor(True)"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "a.grad == d / a"
+   ]
+  },
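+  {
+   "cell_type": "markdown",
+   "id": "sketch-control-flow-math",
+   "metadata": {},
+   "source": [
+    "As a hedged sketch of why this check works for a scalar input, the factor `k` can be written out explicitly.\n",
+    "The symbol $m$ is notation introduced here: it denotes the smallest positive integer with $2^{m}|a| \\geq 1000$, i.e. the total number of doublings performed by `f`:\n",
+    "\n",
+    "$$f(a) = k\\,a \\quad\\text{with}\\quad k = 2^{m} \\ (a > 0) \\quad\\text{or}\\quad k = 100 \\cdot 2^{m} \\ (a < 0), \\qquad \\frac{df}{da} = k = \\frac{f(a)}{a}.$$\n",
+    "\n",
+    "Away from the inputs at which $m$ changes, $k$ is locally constant, so the derivative equals $k$.\n",
+    "The sketch assumes $a \\neq 0$, since for $a = 0$ the `while` loop in `f` would never terminate.\n"
+   ]
+  },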
+  {
+   "cell_type": "markdown",
+   "id": "67fb5517",
+   "metadata": {
+    "origin_pos": 60
+   },
+   "source": [
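+    "## Summary\n",
+    "\n",
+    "* Deep learning frameworks can automate the calculation of derivatives: we first attach gradients to the variables with respect to which we want partial derivatives, then record the computation of the target value, execute its backpropagation function, and access the resulting gradient.\n",
+    "\n",
+    "## Exercises\n",
+    "\n",
+    "1. Why is the second derivative more expensive to compute than the first derivative?\n",
+    "1. After running the backpropagation function, immediately run it again and see what happens.\n",
+    "1. In the control flow example, we calculate the derivative of `d` with respect to `a`. What would happen if we changed the variable `a` to a random vector or matrix?\n",
+    "1. Redesign an example of finding the gradient of control flow. Run it and analyze the result.\n",
+    "1. Let $f(x)=\\sin(x)$. Plot $f(x)$ and $\\frac{df(x)}{dx}$, where the latter is computed without using $f'(x)=\\cos(x)$.\n"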
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "530f74f8",
+   "metadata": {
+    "origin_pos": 62,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "source": [
+    "[Discussions](https://discuss.d2l.ai/t/1759)\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  },
+  "required_libs": []
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
@@ -0,0 +1,59 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "fc08a2aa",
+   "metadata": {
+    "origin_pos": 0
+   },
+   "source": [
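+    "# Preliminaries\n",
+    ":label:`chap_preliminaries`\n",
+    "\n",
+    "To get started with deep learning, we first need to develop a few basic skills.\n",
+    "All machine learning methods are concerned with extracting information from data.\n",
+    "So we begin by learning the practical skills for storing, manipulating, and preprocessing data.\n",
+    "\n",
+    "Machine learning typically requires working with large datasets.\n",
+    "We can think of some datasets as tables, where the rows correspond to examples and the columns correspond to attributes.\n",
+    "Linear algebra gives us a set of techniques for working with tabular data.\n",
+    "We will not go too deep into the details but rather focus on the basics of matrix operations and their implementation.\n",
+    "\n",
+    "Deep learning is also all about optimization.\n",
+    "Given a model with some parameters, we want to find the parameter values that fit the data best.\n",
+    "Determining which way to adjust the parameters at each step of an algorithm requires a little calculus,\n",
+    "which this chapter briefly introduces.\n",
+    "Fortunately, the `autograd` package computes derivatives automatically, and this chapter covers it as well.\n",
+    "\n",
+    "Machine learning is also concerned with making predictions: given the information that we observe, what is the likely value of some unknown attribute?\n",
+    "To reason rigorously under uncertainty, we need to invoke the language of probability.\n",
+    "\n",
+    "Finally, the official documentation provides plenty of descriptions and examples beyond this book.\n",
+    "To conclude the chapter, we show how to look up the needed information in the official documentation.\n",
+    "\n",
+    "This book keeps the mathematical prerequisites to the minimum needed for a proper understanding of deep learning.\n",
+    "That does not mean the book is free of mathematics, however: this chapter gives a rapid introduction to basic and frequently used mathematics\n",
+    "so that readers can follow most of the mathematical content of the book.\n",
+    "Readers who wish to understand all of the mathematical content can further study the mathematical foundations given in the mathematics appendix of this book.\n",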
+    "\n",
+    ":begin_tab:toc\n",
+    " - [ndarray](ndarray.ipynb)\n",
+    " - [pandas](pandas.ipynb)\n",
+    " - [linear-algebra](linear-algebra.ipynb)\n",
+    " - [calculus](calculus.ipynb)\n",
+    " - [autograd](autograd.ipynb)\n",
+    " - [probability](probability.ipynb)\n",
+    " - [lookup-api](lookup-api.ipynb)\n",
+    ":end_tab:\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  },
+  "required_libs": []
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
@@ -0,0 +1,238 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "01132d59",
+   "metadata": {
+    "origin_pos": 0
+   },
+   "source": [
+    "# Documentation\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b7f72d17",
+   "metadata": {
+    "origin_pos": 2,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "source": [
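+    "Because of the length of this book, we cannot possibly introduce every single PyTorch function and class.\n",
+    "The API documentation, additional tutorials, and examples provide plenty of material beyond the book.\n",
+    "This section offers some guidance for exploring the PyTorch API.\n"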
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "97173144",
+   "metadata": {
+    "origin_pos": 4
+   },
+   "source": [
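+    "## Finding All the Functions and Classes in a Module\n",
+    "\n",
+    "To find out which functions and classes can be called in a module, we invoke the `dir` function.\n",
+    "For instance, we can (**query all the attributes of the module for generating random numbers:**)\n"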
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "8f7f4d63",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:05:30.519062Z",
+     "iopub.status.busy": "2023-08-18T07:05:30.518501Z",
+     "iopub.status.idle": "2023-08-18T07:05:31.469749Z",
+     "shell.execute_reply": "2023-08-18T07:05:31.468858Z"
+    },
+    "origin_pos": 6,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "['AbsTransform', 'AffineTransform', 'Bernoulli', 'Beta', 'Binomial', 'CatTransform', 'Categorical', 'Cauchy', 'Chi2', 'ComposeTransform', 'ContinuousBernoulli', 'CorrCholeskyTransform', 'CumulativeDistributionTransform', 'Dirichlet', 'Distribution', 'ExpTransform', 'Exponential', 'ExponentialFamily', 'FisherSnedecor', 'Gamma', 'Geometric', 'Gumbel', 'HalfCauchy', 'HalfNormal', 'Independent', 'IndependentTransform', 'Kumaraswamy', 'LKJCholesky', 'Laplace', 'LogNormal', 'LogisticNormal', 'LowRankMultivariateNormal', 'LowerCholeskyTransform', 'MixtureSameFamily', 'Multinomial', 'MultivariateNormal', 'NegativeBinomial', 'Normal', 'OneHotCategorical', 'OneHotCategoricalStraightThrough', 'Pareto', 'Poisson', 'PowerTransform', 'RelaxedBernoulli', 'RelaxedOneHotCategorical', 'ReshapeTransform', 'SigmoidTransform', 'SoftmaxTransform', 'SoftplusTransform', 'StackTransform', 'StickBreakingTransform', 'StudentT', 'TanhTransform', 'Transform', 'TransformedDistribution', 'Uniform', 'VonMises', 'Weibull', 'Wishart', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'bernoulli', 'beta', 'biject_to', 'binomial', 'categorical', 'cauchy', 'chi2', 'constraint_registry', 'constraints', 'continuous_bernoulli', 'dirichlet', 'distribution', 'exp_family', 'exponential', 'fishersnedecor', 'gamma', 'geometric', 'gumbel', 'half_cauchy', 'half_normal', 'identity_transform', 'independent', 'kl', 'kl_divergence', 'kumaraswamy', 'laplace', 'lkj_cholesky', 'log_normal', 'logistic_normal', 'lowrank_multivariate_normal', 'mixture_same_family', 'multinomial', 'multivariate_normal', 'negative_binomial', 'normal', 'one_hot_categorical', 'pareto', 'poisson', 'register_kl', 'relaxed_bernoulli', 'relaxed_categorical', 'studentT', 'transform_to', 'transformed_distribution', 'transforms', 'uniform', 'utils', 'von_mises', 'weibull', 'wishart']\n"
+     ]
+    }
+   ],
+   "source": [
+    "import torch\n",
+    "\n",
+    "print(dir(torch.distributions))"
+   ]
+  },
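+  {
+   "cell_type": "markdown",
+   "id": "sketch-dir-filter",
+   "metadata": {},
+   "source": [
+    "The listing printed above is long and mixes several kinds of names.\n",
+    "A small hedged sketch of how it could be narrowed down is shown below; the variable names `public_names` and `class_like` are illustrative,\n",
+    "and treating capitalized names as classes is only a naming convention, not a guarantee.\n",
+    "\n",
+    "```python\n",
+    "import torch\n",
+    "\n",
+    "# Keep only the names that do not start with an underscore.\n",
+    "public_names = [name for name in dir(torch.distributions) if not name.startswith('_')]\n",
+    "\n",
+    "# Capitalized names are, by convention, classes such as Normal or Uniform.\n",
+    "class_like = [name for name in public_names if name[0].isupper()]\n",
+    "\n",
+    "print(len(public_names), class_like[:5])\n",
+    "```\n",
+    "\n",
+    "Filtering on the leading underscore removes both the double-underscore names and any single-underscore names in one step.\n"
+   ]
+  },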
+  {
+   "cell_type": "markdown",
+   "id": "a6e589e9",
+   "metadata": {
+    "origin_pos": 9
+   },
+   "source": [
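+    "Generally, we can ignore names that start and end with `__` (double underscores), which are special objects in Python,\n",
+    "as well as names that start with a single `_` (single underscore), which are usually internal.\n",
+    "Based on the remaining function and attribute names, we might guess that this module offers various methods for generating random numbers,\n",
+    "including sampling from the uniform distribution (`uniform`), the normal distribution (`normal`), and the multinomial distribution (`multinomial`).\n",
+    "\n",
+    "## Finding the Usage of Specific Functions and Classes\n",
+    "\n",
+    "For more specific instructions on how to use a given function or class, we can invoke the `help` function.\n",
+    "As an example, let us [**look at the usage instructions for the `ones` function of tensors.**]\n"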
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "a16494ed",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:05:31.473606Z",
+     "iopub.status.busy": "2023-08-18T07:05:31.472946Z",
+     "iopub.status.idle": "2023-08-18T07:05:31.477780Z",
+     "shell.execute_reply": "2023-08-18T07:05:31.476938Z"
+    },
+    "origin_pos": 11,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Help on built-in function ones in module torch:\n",
+      "\n",
+      "ones(...)\n",
+      "    ones(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) -> Tensor\n",
+      "    \n",
+      "    Returns a tensor filled with the scalar value `1`, with the shape defined\n",
+      "    by the variable argument :attr:`size`.\n",
+      "    \n",
+      "    Args:\n",
+      "        size (int...): a sequence of integers defining the shape of the output tensor.\n",
+      "            Can be a variable number of arguments or a collection like a list or tuple.\n",
+      "    \n",
+      "    Keyword arguments:\n",
+      "        out (Tensor, optional): the output tensor.\n",
+      "        dtype (:class:`torch.dtype`, optional): the desired data type of returned tensor.\n",
+      "            Default: if ``None``, uses a global default (see :func:`torch.set_default_tensor_type`).\n",
+      "        layout (:class:`torch.layout`, optional): the desired layout of returned Tensor.\n",
+      "            Default: ``torch.strided``.\n",
+      "        device (:class:`torch.device`, optional): the desired device of returned tensor.\n",
+      "            Default: if ``None``, uses the current device for the default tensor type\n",
+      "            (see :func:`torch.set_default_tensor_type`). :attr:`device` will be the CPU\n",
+      "            for CPU tensor types and the current CUDA device for CUDA tensor types.\n",
+      "        requires_grad (bool, optional): If autograd should record operations on the\n",
+      "            returned tensor. Default: ``False``.\n",
+      "    \n",
+      "    Example::\n",
+      "    \n",
+      "        >>> torch.ones(2, 3)\n",
+      "        tensor([[ 1.,  1.,  1.],\n",
+      "                [ 1.,  1.,  1.]])\n",
+      "    \n",
+      "        >>> torch.ones(5)\n",
+      "        tensor([ 1.,  1.,  1.,  1.,  1.])\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "help(torch.ones)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "903c096e",
+   "metadata": {
+    "origin_pos": 14
+   },
+   "source": [
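+    "From the documentation, we can see that the `ones` function creates a new tensor with the specified shape and sets all of its elements to 1.\n",
+    "Let us [**run a quick test**] to confirm this interpretation:\n"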
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "7870b2f5",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:05:31.481310Z",
+     "iopub.status.busy": "2023-08-18T07:05:31.480685Z",
+     "iopub.status.idle": "2023-08-18T07:05:31.490398Z",
+     "shell.execute_reply": "2023-08-18T07:05:31.489581Z"
+    },
+    "origin_pos": 16,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor([1., 1., 1., 1.])"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "torch.ones(4)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dd4f531d",
+   "metadata": {
+    "origin_pos": 19
+   },
+   "source": [
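+    "In a Jupyter notebook, we can use `?` to display the documentation in another browser window.\n",
+    "For example, `list?` produces content that is almost identical to `help(list)` and displays it in a new browser window.\n",
+    "In addition, if we use two question marks, as in `list??`, the Python code implementing the function is displayed as well, when that source is available.\n",
+    "\n",
+    "## Summary\n",
+    "\n",
+    "* The official documentation provides plenty of descriptions and examples beyond this book.\n",
+    "* We can look up the usage documentation of an API by calling the `dir` and `help` functions, or by using `?` and `??` in Jupyter notebooks.\n",
+    "\n",
+    "## Exercises\n",
+    "\n",
+    "1. Look up the documentation for any function or class in the deep learning framework. Try to find the documentation on the official website of the framework as well.\n"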
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "197b3dc7",
+   "metadata": {
+    "origin_pos": 21,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "source": [
+    "[Discussions](https://discuss.d2l.ai/t/1765)\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  },
+  "required_libs": []
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
@@ -0,0 +1,303 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "ab73852c",
+   "metadata": {
+    "origin_pos": 0
+   },
+   "source": [
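+    "# Data Preprocessing\n",
+    ":label:`sec_pandas`\n",
+    "\n",
+    "To apply deep learning to real-world problems, we often begin by preprocessing raw data,\n",
+    "rather than starting from data that is already prepared in tensor format.\n",
+    "Among the popular data analysis tools in Python, the `pandas` package is commonly used.\n",
+    "Like many other extension packages in the vast Python ecosystem, `pandas` can work together with tensors.\n",
+    "In this section, we briefly walk through the steps for preprocessing raw data with `pandas` and converting it into tensor format.\n",
+    "Later chapters cover more data preprocessing techniques.\n",
+    "\n",
+    "## Reading the Dataset\n",
+    "\n",
+    "As an example, we begin by (**creating an artificial dataset that is stored in a CSV (comma-separated values) file**)\n",
+    "`../data/house_tiny.csv`.\n",
+    "Data stored in other formats can be processed in similar ways.\n",
+    "Below we write the dataset row by row into a CSV file.\n"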
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "ee72fd16",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:03:38.903209Z",
+     "iopub.status.busy": "2023-08-18T07:03:38.902351Z",
+     "iopub.status.idle": "2023-08-18T07:03:38.918117Z",
+     "shell.execute_reply": "2023-08-18T07:03:38.916775Z"
+    },
+    "origin_pos": 1,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "\n",
+    "os.makedirs(os.path.join('..', 'data'), exist_ok=True)\n",
+    "data_file = os.path.join('..', 'data', 'house_tiny.csv')\n",
+    "with open(data_file, 'w') as f:\n",
+    "    f.write('NumRooms,Alley,Price\\n')  # Column names\n",
+    "    f.write('NA,Pave,127500\\n')  # Each row represents a data example\n",
+    "    f.write('2,NA,106000\\n')\n",
+    "    f.write('4,NA,178100\\n')\n",
+    "    f.write('NA,NA,140000\\n')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f5be7568",
+   "metadata": {
+    "origin_pos": 2
+   },
+   "source": [
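+    "To [**load the raw dataset from the created CSV file**], we import the `pandas` package and invoke the `read_csv` function. This dataset has four rows and three columns, where each row describes the number of rooms (“NumRooms”), the alley type (“Alley”), and the price (“Price”) of a house.\n"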
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "5fb16e52",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:03:38.923957Z",
+     "iopub.status.busy": "2023-08-18T07:03:38.923101Z",
+     "iopub.status.idle": "2023-08-18T07:03:39.372116Z",
+     "shell.execute_reply": "2023-08-18T07:03:39.371151Z"
+    },
+    "origin_pos": 3,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "   NumRooms Alley   Price\n",
+      "0       NaN  Pave  127500\n",
+      "1       2.0   NaN  106000\n",
+      "2       4.0   NaN  178100\n",
+      "3       NaN   NaN  140000\n"
+     ]
+    }
+   ],
+   "source": [
+    "# If pandas is not installed, just uncomment the following line to install it\n",
+    "# !pip install pandas\n",
+    "import pandas as pd\n",
+    "\n",
+    "data = pd.read_csv(data_file)\n",
+    "print(data)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "30188bf5",
+   "metadata": {
+    "origin_pos": 4
+   },
+   "source": [
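+    "## Handling Missing Data\n",
+    "\n",
+    "Note that “NaN” entries are missing values.\n",
+    "[**To handle missing data, typical methods include *imputation* and *deletion*,**]\n",
+    "where imputation replaces missing values with substituted ones, while deletion simply ignores missing values.\n",
+    "(**Here we will consider imputation.**)\n",
+    "\n",
+    "By integer-location based indexing (`iloc`), we split `data` into `inputs` and `outputs`,\n",
+    "where the former takes the first two columns of `data` while the latter keeps only the last column.\n",
+    "For numerical values in `inputs` that are missing, we replace the “NaN” entries with the mean value of the same column.\n"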
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "d460a301",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:03:39.375828Z",
+     "iopub.status.busy": "2023-08-18T07:03:39.375535Z",
+     "iopub.status.idle": "2023-08-18T07:03:39.389220Z",
+     "shell.execute_reply": "2023-08-18T07:03:39.387998Z"
+    },
+    "origin_pos": 5,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "   NumRooms Alley\n",
+      "0       3.0  Pave\n",
+      "1       2.0   NaN\n",
+      "2       4.0   NaN\n",
+      "3       3.0   NaN\n"
+     ]
+    }
+   ],
+   "source": [
+    "inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]\n",
+    "inputs = inputs.fillna(inputs.mean(numeric_only=True))\n",
+    "print(inputs)"
+   ]
+  },
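+  {
+   "cell_type": "markdown",
+   "id": "sketch-deletion",
+   "metadata": {},
+   "source": [
+    "The cell above applies imputation. For comparison, here is a hedged sketch of the deletion approach mentioned earlier, which this notebook does not otherwise demonstrate.\n",
+    "It rebuilds the same four-row dataset in memory; the names `csv_text`, `rows_kept`, and `cols_kept` are illustrative assumptions.\n",
+    "\n",
+    "```python\n",
+    "import io\n",
+    "\n",
+    "import pandas as pd\n",
+    "\n",
+    "csv_text = '''NumRooms,Alley,Price\n",
+    "NA,Pave,127500\n",
+    "2,NA,106000\n",
+    "4,NA,178100\n",
+    "NA,NA,140000\n",
+    "'''\n",
+    "data = pd.read_csv(io.StringIO(csv_text))\n",
+    "\n",
+    "# Deletion by row: keep only the rows whose NumRooms value is present.\n",
+    "rows_kept = data.dropna(subset=['NumRooms'])\n",
+    "\n",
+    "# Deletion by column: keep only the columns without any missing value.\n",
+    "cols_kept = data.dropna(axis=1)\n",
+    "\n",
+    "print(rows_kept)\n",
+    "print(cols_kept)\n",
+    "```\n",
+    "\n",
+    "On a dataset this small, deletion discards half of the rows or two of the three columns, which is one reason imputation is the approach considered in this section.\n"
+   ]
+  },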
+  {
+   "cell_type": "markdown",
+   "id": "eae762a4",
+   "metadata": {
+    "origin_pos": 6
+   },
+   "source": [
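+    "[**For categorical or discrete values in `inputs`, we consider “NaN” as a category.**]\n",
+    "Since the “Alley” column only takes two types of categorical values, “Pave” and “NaN”,\n",
+    "`pandas` can automatically convert this column into two columns, “Alley_Pave” and “Alley_nan”.\n",
+    "A row whose alley type is “Pave” sets “Alley_Pave” to 1 and “Alley_nan” to 0.\n",
+    "A row with a missing alley type sets “Alley_Pave” and “Alley_nan” to 0 and 1, respectively.\n"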
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "09ab8738",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:03:39.394176Z",
+     "iopub.status.busy": "2023-08-18T07:03:39.393444Z",
+     "iopub.status.idle": "2023-08-18T07:03:39.409892Z",
+     "shell.execute_reply": "2023-08-18T07:03:39.408559Z"
+    },
+    "origin_pos": 7,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "   NumRooms  Alley_Pave  Alley_nan\n",
+      "0       3.0           1          0\n",
+      "1       2.0           0          1\n",
+      "2       4.0           0          1\n",
+      "3       3.0           0          1\n"
+     ]
+    }
+   ],
+   "source": [
+    "inputs = pd.get_dummies(inputs, dummy_na=True, dtype=int)\n",
+    "print(inputs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ea1dd875",
+   "metadata": {
+    "origin_pos": 8
+   },
+   "source": [
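+    "## Conversion to the Tensor Format\n",
+    "\n",
+    "[**Now that all the entries in `inputs` and `outputs` are numerical, they can be converted to the tensor format.**]\n",
+    "Once data are in this format, they can be further manipulated with the tensor functions introduced in :numref:`sec_ndarray`.\n"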
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "4f551c6d",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2023-08-18T07:03:39.414531Z",
+     "iopub.status.busy": "2023-08-18T07:03:39.413831Z",
+     "iopub.status.idle": "2023-08-18T07:03:40.467689Z",
+     "shell.execute_reply": "2023-08-18T07:03:40.466637Z"
+    },
+    "origin_pos": 10,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "(tensor([[3., 1., 0.],\n",
+       "         [2., 0., 1.],\n",
+       "         [4., 0., 1.],\n",
+       "         [3., 0., 1.]], dtype=torch.float64),\n",
+       " tensor([127500., 106000., 178100., 140000.], dtype=torch.float64))"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import torch\n",
+    "\n",
+    "X = torch.tensor(inputs.to_numpy(dtype=float))\n",
+    "y = torch.tensor(outputs.to_numpy(dtype=float))\n",
+    "X, y"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dbcbca0d",
+   "metadata": {
+    "origin_pos": 13
+   },
+   "source": [
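+    "## Summary\n",
+    "\n",
+    "* `pandas` is one of the commonly used data analysis packages in Python, and it can work together with tensors.\n",
+    "* When handling missing data with `pandas`, we can choose imputation or deletion depending on the situation.\n",
+    "\n",
+    "## Exercises\n",
+    "\n",
+    "Create a raw dataset with more rows and columns.\n",
+    "\n",
+    "1. Delete the column with the most missing values.\n",
+    "2. Convert the preprocessed dataset to the tensor format.\n"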
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7b8c6c96",
+   "metadata": {
+    "origin_pos": 15,
+    "tab": [
+     "pytorch"
+    ]
+   },
+   "source": [
+    "[Discussions](https://discuss.d2l.ai/t/1750)\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  },
+  "required_libs": []
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}