2025-12-16 09:23:53 +08:00

{
"cells": [
{
"cell_type": "markdown",
"id": "729b9613",
"metadata": {
"origin_pos": 0
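},
"source": [
"# Modern Recurrent Neural Networks\n",
":label:`chap_modern_rnn`\n",
"\n",
"The previous chapter introduced the basics of recurrent neural networks (RNNs),\n",
"which are better suited to handling sequence data.\n",
"We implemented an RNN-based language model on text data,\n",
"but these techniques may not be sufficient for the wide variety of sequence learning problems encountered today.\n",
"\n",
"For instance, a common problem with RNNs in practice is numerical instability.\n",
"Although we have applied tricks such as gradient clipping to mitigate it,\n",
"more sophisticated sequence model designs are still needed to address it further.\n",
"Specifically, we will introduce two widely used networks:\n",
"*gated recurrent units* (GRUs) and\n",
"*long short-term memory* (LSTM) networks.\n",
"We will then extend the RNN architecture beyond a single unidirectional hidden layer.\n",
"We will describe deep architectures with multiple hidden layers\n",
"and discuss the bidirectional design, which uses both forward and backward recurrent computations.\n",
"Modern recurrent networks frequently adopt such extensions.\n",
"When explaining these RNN variants,\n",
"we will continue to consider the language modeling problem from :numref:`chap_rnn`.\n",
"\n",
"In fact, language modeling reveals only the tip of the iceberg of what sequence learning can do.\n",
"In many sequence learning problems, such as automatic speech recognition, text-to-speech, and machine translation,\n",
"both the inputs and the outputs are sequences of arbitrary length.\n",
"To explain how to fit this type of data,\n",
"we will take machine translation as an example and introduce the RNN-based\n",
"encoder-decoder architecture and beam search, and use them to generate sequences.\n",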
"\n",
":begin_tab:toc\n",
" - [gru](gru.ipynb)\n",
" - [lstm](lstm.ipynb)\n",
" - [deep-rnn](deep-rnn.ipynb)\n",
" - [bi-rnn](bi-rnn.ipynb)\n",
" - [machine-translation-and-dataset](machine-translation-and-dataset.ipynb)\n",
" - [encoder-decoder](encoder-decoder.ipynb)\n",
" - [seq2seq](seq2seq.ipynb)\n",
" - [beam-search](beam-search.ipynb)\n",
":end_tab:\n"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"required_libs": []
},
"nbformat": 4,
"nbformat_minor": 5
}