cb0195135e
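Add the complete reinforcement learning project report, including the LaTeX source files, the generated PDF document, and visualizations of the training process. Main additions:
- The full project report (report.tex and report.pdf), detailing the implementation and experimental results of the DQN algorithm on the Atari Space Invaders game
- Visualizations of the training curves, the epsilon decay curve, and the evaluation results (PNG format)
- An updated generate_plots.py script with improved code formatting and error handling, and support for more flexible parameter configuration
- The trained best model file (dqn_best.pt) and a compressed archive of the project source code
- Auxiliary files generated by LaTeX compilation (.aux, .log)
These files make up the complete project deliverables, making it easier to reproduce the experimental results and present the project outcomes.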
43 lines
4.6 KiB
TeX
\relax
\providecommand\hyper@newdestlabel[2]{}
\providecommand\HyField@AuxAddToFields[1]{}
\providecommand\HyField@AuxAddToCoFields[2]{}
\@writefile{toc}{\contentsline {section}{\numberline {1}Introduction}{1}{section.1}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {1.1}Game Selection and Challenges}{1}{subsection.1.1}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {1.2}Motivation}{1}{subsection.1.2}\protected@file@percent }
\@writefile{toc}{\contentsline {section}{\numberline {2}Literature Review}{2}{section.2}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {2.1}Deep Reinforcement Learning in Atari Games}{2}{subsection.2.1}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {2.2}Algorithm Comparison}{2}{subsection.2.2}\protected@file@percent }
\@writefile{lot}{\contentsline {table}{\numberline {1}{\ignorespaces Comparison of reinforcement learning algorithms}}{2}{table.caption.1}\protected@file@percent }
\providecommand*\caption@xref[2]{\@setref\relax\@undefined{#1}}
\newlabel{tab:algorithm_comparison}{{1}{2}{Comparison of reinforcement learning algorithms}{table.caption.1}{}}
\@writefile{toc}{\contentsline {section}{\numberline {3}Algorithm and Implementation}{3}{section.3}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {3.1}DQN Algorithm}{3}{subsection.3.1}\protected@file@percent }
\@writefile{toc}{\contentsline {subsubsection}{\numberline {3.1.1}Q-Learning Foundation}{3}{subsubsection.3.1.1}\protected@file@percent }
\@writefile{toc}{\contentsline {subsubsection}{\numberline {3.1.2}Experience Replay}{3}{subsubsection.3.1.2}\protected@file@percent }
\@writefile{toc}{\contentsline {subsubsection}{\numberline {3.1.3}Target Network}{3}{subsubsection.3.1.3}\protected@file@percent }
\@writefile{toc}{\contentsline {subsubsection}{\numberline {3.1.4}Double DQN Extension}{3}{subsubsection.3.1.4}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {3.2}Network Architecture}{3}{subsection.3.2}\protected@file@percent }
\@writefile{lot}{\contentsline {table}{\numberline {2}{\ignorespaces Network architecture details}}{4}{table.caption.2}\protected@file@percent }
\newlabel{tab:network}{{2}{4}{Network architecture details}{table.caption.2}{}}
\@writefile{toc}{\contentsline {subsection}{\numberline {3.3}Environment Preprocessing}{4}{subsection.3.3}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {3.4}Training Details}{4}{subsection.3.4}\protected@file@percent }
\@writefile{lot}{\contentsline {table}{\numberline {3}{\ignorespaces Training hyperparameters}}{4}{table.caption.3}\protected@file@percent }
\newlabel{tab:hyperparameters}{{3}{4}{Training hyperparameters}{table.caption.3}{}}
\@writefile{toc}{\contentsline {section}{\numberline {4}Experimental Results}{4}{section.4}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {4.1}Training Performance}{4}{subsection.4.1}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1}{\ignorespaces Training curves showing reward, loss, and Q-value evolution}}{5}{figure.caption.4}\protected@file@percent }
\newlabel{fig:training_curves}{{1}{5}{Training curves showing reward, loss, and Q-value evolution}{figure.caption.4}{}}
\@writefile{toc}{\contentsline {subsection}{\numberline {4.2}Evaluation Results}{5}{subsection.4.2}\protected@file@percent }
\@writefile{lot}{\contentsline {table}{\numberline {4}{\ignorespaces Evaluation results}}{5}{table.caption.5}\protected@file@percent }
\newlabel{tab:evaluation}{{4}{5}{Evaluation results}{table.caption.5}{}}
\@writefile{toc}{\contentsline {subsection}{\numberline {4.3}Comparison with Baselines}{6}{subsection.4.3}\protected@file@percent }
\@writefile{lot}{\contentsline {table}{\numberline {5}{\ignorespaces Comparison with baselines}}{6}{table.caption.6}\protected@file@percent }
\newlabel{tab:comparison}{{5}{6}{Comparison with baselines}{table.caption.6}{}}
\@writefile{toc}{\contentsline {section}{\numberline {5}Discussion}{6}{section.5}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {5.1}Performance Analysis}{6}{subsection.5.1}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {5.2}Limitations}{6}{subsection.5.2}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {5.3}Potential Improvements}{6}{subsection.5.3}\protected@file@percent }
\@writefile{toc}{\contentsline {section}{\numberline {6}Conclusion}{7}{section.6}\protected@file@percent }
\gdef \@abspage@last{7}