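feat: add model evaluation script and update experiment report
- Add evaluate_checkpoints.py, a script for evaluating the checkpoint models saved during training
- Update generate_plots.py to support generating plots from real evaluation results
- Update the experiment report with concrete experimental results and analysis
- Add Chinese support and update author information
- Generate the evaluation results JSON file and the corresponding plots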
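The checkpoint evaluation described in the commit message can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the contents of evaluate_checkpoints.py, which this page does not show: the function names `summarize_checkpoints` and `best_checkpoint`, the JSON layout, and the reward values are all hypothetical, and the use of the population standard deviation is an assumption about how the "Std Dev" column in the report was computed.

```python
import json
import statistics


def summarize_checkpoints(rewards_by_checkpoint):
    """Aggregate per-episode rewards into mean and std for each checkpoint.

    rewards_by_checkpoint maps a checkpoint label (hypothetical, e.g. "100K")
    to the list of episode rewards collected when evaluating that checkpoint.
    """
    summary = {}
    for name, rewards in rewards_by_checkpoint.items():
        summary[name] = {
            "mean": statistics.fmean(rewards),
            # Population standard deviation; an assumption, the report does
            # not say whether population or sample std was used.
            "std": statistics.pstdev(rewards),
            "episodes": len(rewards),
        }
    return summary


def best_checkpoint(summary):
    """Return the label of the checkpoint with the highest mean reward."""
    return max(summary, key=lambda name: summary[name]["mean"])


# Made-up rewards for illustration only; these are not the reported results.
example_rewards = {
    "100K": [10.0, 20.0, 30.0],
    "600K": [15.0, 25.0, 35.0],
}

summary = summarize_checkpoints(example_rewards)
report_json = json.dumps(summary, indent=2)  # text a plotting script could load
print(report_json)
print("best checkpoint:", best_checkpoint(summary))
```

Keeping the per-checkpoint mean and standard deviation in one JSON document would let a plotting script draw the evaluation curve with error bars without re-running the agent.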
@@ -1,7 +1,7 @@
|
||||
\relax
|
||||
\providecommand\hyper@newdestlabel[2]{}
|
||||
\providecommand\HyField@AuxAddToFields[1]{}
|
||||
\providecommand\HyField@AuxAddToCoFields[2]{}
|
||||
\providecommand*\HyPL@Entry[1]{}
|
||||
\HyPL@Entry{0<</S/D>>}
|
||||
\@writefile{toc}{\contentsline {section}{\numberline {1}Introduction}{1}{section.1}\protected@file@percent }
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {1.1}Game Selection and Challenges}{1}{subsection.1.1}\protected@file@percent }
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {1.2}Motivation}{1}{subsection.1.2}\protected@file@percent }
|
||||
@@ -28,15 +28,19 @@
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {4.1}Training Performance}{4}{subsection.4.1}\protected@file@percent }
|
||||
\@writefile{lof}{\contentsline {figure}{\numberline {1}{\ignorespaces Training curves showing reward, loss, and Q-value evolution}}{5}{figure.caption.4}\protected@file@percent }
|
||||
\newlabel{fig:training_curves}{{1}{5}{Training curves showing reward, loss, and Q-value evolution}{figure.caption.4}{}}
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {4.2}Evaluation Results}{5}{subsection.4.2}\protected@file@percent }
|
||||
\@writefile{lot}{\contentsline {table}{\numberline {4}{\ignorespaces Evaluation results}}{5}{table.caption.5}\protected@file@percent }
|
||||
\newlabel{tab:evaluation}{{4}{5}{Evaluation results}{table.caption.5}{}}
|
||||
\@writefile{lof}{\contentsline {figure}{\numberline {2}{\ignorespaces Evaluation reward at different training checkpoints with standard deviation error bars}}{5}{figure.caption.5}\protected@file@percent }
|
||||
\newlabel{fig:evaluation_curve}{{2}{5}{Evaluation reward at different training checkpoints with standard deviation error bars}{figure.caption.5}{}}
|
||||
\@writefile{lof}{\contentsline {figure}{\numberline {3}{\ignorespaces Epsilon decay curve during training}}{6}{figure.caption.6}\protected@file@percent }
|
||||
\newlabel{fig:epsilon_decay}{{3}{6}{Epsilon decay curve during training}{figure.caption.6}{}}
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {4.2}Evaluation Results}{6}{subsection.4.2}\protected@file@percent }
|
||||
\@writefile{lot}{\contentsline {table}{\numberline {4}{\ignorespaces Evaluation results at different training checkpoints}}{6}{table.caption.7}\protected@file@percent }
|
||||
\newlabel{tab:evaluation}{{4}{6}{Evaluation results at different training checkpoints}{table.caption.7}{}}
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {4.3}Comparison with Baselines}{6}{subsection.4.3}\protected@file@percent }
|
||||
\@writefile{lot}{\contentsline {table}{\numberline {5}{\ignorespaces Comparison with baselines}}{6}{table.caption.6}\protected@file@percent }
|
||||
\newlabel{tab:comparison}{{5}{6}{Comparison with baselines}{table.caption.6}{}}
|
||||
\@writefile{toc}{\contentsline {section}{\numberline {5}Discussion}{6}{section.5}\protected@file@percent }
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {5.1}Performance Analysis}{6}{subsection.5.1}\protected@file@percent }
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {5.2}Limitations}{6}{subsection.5.2}\protected@file@percent }
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {5.3}Potential Improvements}{6}{subsection.5.3}\protected@file@percent }
|
||||
\@writefile{lot}{\contentsline {table}{\numberline {5}{\ignorespaces Comparison with baselines}}{6}{table.caption.8}\protected@file@percent }
|
||||
\newlabel{tab:comparison}{{5}{6}{Comparison with baselines}{table.caption.8}{}}
|
||||
\@writefile{toc}{\contentsline {section}{\numberline {5}Discussion}{7}{section.5}\protected@file@percent }
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {5.1}Performance Analysis}{7}{subsection.5.1}\protected@file@percent }
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {5.2}Limitations}{7}{subsection.5.2}\protected@file@percent }
|
||||
\@writefile{toc}{\contentsline {subsection}{\numberline {5.3}Potential Improvements}{7}{subsection.5.3}\protected@file@percent }
|
||||
\@writefile{toc}{\contentsline {section}{\numberline {6}Conclusion}{7}{section.6}\protected@file@percent }
|
||||
\gdef \@abspage@last{7}
|
||||
\gdef \@abspage@last{8}
|
||||
|
||||
@@ -0,0 +1,167 @@
|
||||
PWD D:/Code/doing_exercises/programs/外教作业外快/强化学习个人项目报告(Atari 游戏方向)/tex
|
||||
INPUT d:/settings/Language/texlive/2025/texmf.cnf
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/web2c/texmf.cnf
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-var/web2c/pdftex/pdflatex.fmt
|
||||
INPUT report.tex
|
||||
OUTPUT report.log
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/article.cls
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/article.cls
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/size11.clo
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/size11.clo
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/size11.clo
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/map/fontname/texfonts.map
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/cm/cmr10.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/inputenc.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/inputenc.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/fontenc.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/fontenc.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/jknappen/ec/ecrm1095.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics/graphicx.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics/graphicx.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics/keyval.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics/keyval.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics/graphics.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics/graphics.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics/trig.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics/trig.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics-cfg/graphics.cfg
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics-cfg/graphics.cfg
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics-cfg/graphics.cfg
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics-def/pdftex.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics-def/pdftex.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/graphics-def/pdftex.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsmath/amsmath.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsmath/amsmath.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsmath/amsopn.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsmath/amstext.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsmath/amstext.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsmath/amsgen.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsmath/amsgen.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsmath/amsbsy.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsmath/amsbsy.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsmath/amsopn.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsfonts/amsfonts.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsfonts/amsfonts.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsfonts/amssymb.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsfonts/amssymb.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/booktabs/booktabs.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/booktabs/booktabs.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/hyperref.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/hyperref.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/iftex/iftex.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/iftex/iftex.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/kvsetkeys/kvsetkeys.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/kvsetkeys/kvsetkeys.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/kvdefinekeys/kvdefinekeys.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/kvdefinekeys/kvdefinekeys.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/pdfescape/pdfescape.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/pdfescape/pdfescape.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/ltxcmds/ltxcmds.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/ltxcmds/ltxcmds.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/pdftexcmds/pdftexcmds.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/pdftexcmds/pdftexcmds.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/infwarerr/infwarerr.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/infwarerr/infwarerr.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hycolor/hycolor.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hycolor/hycolor.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/nameref.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/nameref.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/refcount/refcount.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/refcount/refcount.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/gettitlestring/gettitlestring.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/gettitlestring/gettitlestring.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/kvoptions/kvoptions.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/kvoptions/kvoptions.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/etoolbox/etoolbox.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/etoolbox/etoolbox.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/stringenc/stringenc.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/stringenc/stringenc.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/pd1enc.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/pd1enc.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/pd1enc.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/intcalc/intcalc.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/intcalc/intcalc.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/puenc.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/puenc.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/puenc.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/url/url.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/url/url.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/bitset/bitset.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/bitset/bitset.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/bigintcalc/bigintcalc.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/bigintcalc/bigintcalc.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/atbegshi/atbegshi.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/atbegshi-ltx.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/atbegshi-ltx.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/hpdftex.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/hpdftex.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/hyperref/hpdftex.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/atveryend/atveryend.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/atveryend-ltx.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/base/atveryend-ltx.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/rerunfilecheck/rerunfilecheck.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/rerunfilecheck/rerunfilecheck.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/uniquecounter/uniquecounter.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/uniquecounter/uniquecounter.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/float/float.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/float/float.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/caption/caption.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/caption/caption.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/caption/caption3.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/caption/caption3.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/caption/subcaption.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/caption/subcaption.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/geometry/geometry.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/geometry/geometry.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/iftex/ifvtex.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/generic/iftex/ifvtex.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/setspace/setspace.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/setspace/setspace.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/l3backend/l3backend-pdftex.def
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/l3backend/l3backend-pdftex.def
|
||||
INPUT ./report.aux
|
||||
INPUT ./report.aux
|
||||
INPUT report.aux
|
||||
OUTPUT report.aux
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/context/base/mkii/supp-pdf.mkii
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/context/base/mkii/supp-pdf.mkii
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/context/base/mkii/supp-pdf.mkii
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/epstopdf-pkg/epstopdf-base.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/epstopdf-pkg/epstopdf-base.sty
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg
|
||||
INPUT ./report.out
|
||||
INPUT ./report.out
|
||||
INPUT report.out
|
||||
INPUT report.out
|
||||
OUTPUT report.pdf
|
||||
INPUT ./report.out
|
||||
INPUT ./report.out
|
||||
OUTPUT report.out
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/jknappen/ec/ecrm1728.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/jknappen/ec/ecrm1200.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/cm/cmr12.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/cm/cmr8.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/cm/cmr6.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/cm/cmmi12.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/cm/cmmi8.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/cm/cmmi6.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/cm/cmsy10.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/cm/cmsy8.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/cm/cmsy6.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/cm/cmex10.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/amsfonts/cmextra/cmex8.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/amsfonts/cmextra/cmex7.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsfonts/umsa.fd
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsfonts/umsa.fd
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsfonts/umsa.fd
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/amsfonts/symbols/msam10.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/amsfonts/symbols/msam10.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/amsfonts/symbols/msam7.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsfonts/umsb.fd
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsfonts/umsb.fd
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/tex/latex/amsfonts/umsb.fd
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/amsfonts/symbols/msbm10.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/amsfonts/symbols/msbm10.tfm
|
||||
INPUT d:/settings/Language/texlive/2025/texmf-dist/fonts/tfm/public/amsfonts/symbols/msbm7.tfm
|
||||
File diff suppressed because it is too large
Binary file not shown.
Binary file not shown.
@@ -1,8 +1,9 @@
|
||||
\documentclass[11pt,a4paper]{article}
|
||||
|
||||
% Package imports
|
||||
\usepackage[utf8]{inputenc}
|
||||
\usepackage[T1]{fontenc}
|
||||
\usepackage{xeCJK}
|
||||
\usepackage{fontspec}
|
||||
\setCJKmainfont{SimSun}
|
||||
\usepackage{graphicx}
|
||||
\usepackage{amsmath}
|
||||
\usepackage{amsfonts}
|
||||
@@ -18,7 +19,7 @@
|
||||
|
||||
% Title information
|
||||
\title{Deep Q-Network for Space Invaders: \\ A Deep Reinforcement Learning Approach}
|
||||
\author{[Your Name] \\ [Your Student ID]}
|
||||
\author{刘航宇 \\ Student ID: [Your Student ID]}
|
||||
\date{\today}
|
||||
|
||||
\begin{document}
|
||||
@@ -26,7 +27,7 @@
|
||||
\maketitle
|
||||
|
||||
\begin{abstract}
|
||||
This report presents the implementation and evaluation of a Deep Q-Network (DQN) agent for playing the Atari game Space Invaders. The agent was trained from scratch using Double DQN with experience replay and target network stabilization. After 2 million training steps, the agent achieved an average score of [X] on the Space Invaders environment, demonstrating competitive performance compared to baseline methods. This report details the algorithm selection, implementation details, experimental results, and analysis of the agent's performance.
|
||||
This report presents the implementation and evaluation of a Deep Q-Network (DQN) agent for playing the Atari game Space Invaders. The agent was trained from scratch for 2 million steps using Double DQN with experience replay and target network stabilization. The best checkpoint (1.8M steps) achieved an average score of 21.50 over 20 evaluation episodes, well above a random agent ($\sim$5) but far below a human player ($\sim$200); the final model (2M steps) scored 14.60. This report details the algorithm selection, implementation details, experimental results, and analysis of the agent's performance.
|
||||
\end{abstract}
|
||||
|
||||
\section{Introduction}
|
||||
@@ -187,12 +188,12 @@ Warmup Steps & 10,000 \\
|
||||
|
||||
\subsection{Training Performance}
|
||||
|
||||
The agent was trained for 2 million steps. Key observations:
|
||||
The agent was trained for 2 million steps on an NVIDIA RTX 4060 GPU. Key observations:
|
||||
|
||||
\begin{itemize}
|
||||
\item \textbf{Initial Phase} (0-100K steps): Random exploration, average score around 10-15
|
||||
\item \textbf{Learning Phase} (100K-500K steps): Gradual improvement, score increases to 30-50
|
||||
\item \textbf{Convergence Phase} (500K-2M steps): Performance stabilizes around 100-200
|
||||
\item \textbf{Initial Phase} (0--100K steps): Random exploration with warmup, average score around 10--15
|
||||
\item \textbf{Learning Phase} (100K--600K steps): Gradual improvement, score increases to 15--19
|
||||
\item \textbf{Convergence Phase} (600K--2M steps): Performance fluctuates between 13 and 21, with the best performance at 1.8M steps
|
||||
\end{itemize}
|
||||
|
||||
\begin{figure}[H]
|
||||
@@ -202,26 +203,44 @@ The agent was trained for 2 million steps. Key observations:
|
||||
\label{fig:training_curves}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[H]
|
||||
\centering
|
||||
\includegraphics[width=0.8\textwidth]{../plots/evaluation_curve.png}
|
||||
\caption{Evaluation reward at different training checkpoints with standard deviation error bars}
|
||||
\label{fig:evaluation_curve}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[H]
|
||||
\centering
|
||||
\includegraphics[width=0.8\textwidth]{../plots/epsilon_decay.png}
|
||||
\caption{Epsilon decay curve during training}
|
||||
\label{fig:epsilon_decay}
|
||||
\end{figure}
|
||||
|
||||
\subsection{Evaluation Results}
|
||||
|
||||
The trained agent was evaluated over 20 episodes:
|
||||
The trained agent was evaluated over 20 episodes at different training checkpoints:
|
||||
|
||||
\begin{table}[H]
|
||||
\centering
|
||||
\begin{tabular}{@{}lc@{}}
|
||||
\begin{tabular}{@{}lcc@{}}
|
||||
\toprule
|
||||
\textbf{Metric} & \textbf{Value} \\
|
||||
\textbf{Checkpoint} & \textbf{Average Score} & \textbf{Std Dev} \\
|
||||
\midrule
|
||||
Average Score & [X] \\
|
||||
Standard Deviation & [Y] \\
|
||||
Maximum Score & [Z] \\
|
||||
Minimum Score & [W] \\
|
||||
100K steps & 17.80 & 5.23 \\
|
||||
600K steps & 19.00 & 4.12 \\
|
||||
1.2M steps & 18.40 & 6.22 \\
|
||||
1.8M steps & \textbf{21.50} & 4.98 \\
|
||||
2.0M steps (final) & 14.60 & 5.28 \\
|
||||
Best Model & 19.90 & 6.92 \\
|
||||
\bottomrule
|
||||
\end{tabular}
|
||||
\caption{Evaluation results}
|
||||
\caption{Evaluation results at different training checkpoints}
|
||||
\label{tab:evaluation}
|
||||
\end{table}
|
||||
|
||||
The best performance was achieved at 1.8M training steps with an average score of 21.50. The final model (2M steps) degraded to 14.60, which suggests overfitting or training instability in the later stages.
|
||||
|
||||
\subsection{Comparison with Baselines}
|
||||
|
||||
\begin{table}[H]
|
||||
@@ -231,8 +250,8 @@ Minimum Score & [W] \\
|
||||
\textbf{Method} & \textbf{Average Score} & \textbf{Training Time} \\
|
||||
\midrule
|
||||
Random Agent & $\sim$5 & N/A \\
|
||||
Our DQN & [X] & [Time] \\
|
||||
Stable-Baselines3 DQN & [SB3 Score] & [SB3 Time] \\
|
||||
Our DQN (Best) & 21.50 & $\sim$6 hours \\
|
||||
Our DQN (Final) & 14.60 & $\sim$6 hours \\
|
||||
Human Player & $\sim$200 & N/A \\
|
||||
\bottomrule
|
||||
\end{tabular}
|
||||
@@ -275,9 +294,9 @@ Future improvements could include:
|
||||
|
||||
\section{Conclusion}
|
||||
|
||||
This project successfully implemented a DQN agent for playing Space Invaders from raw pixel inputs. The agent achieved an average score of [X], demonstrating competitive performance compared to baseline methods. The implementation highlights the effectiveness of deep reinforcement learning for Atari games and provides a solid foundation for exploring more advanced algorithms.
|
||||
This project successfully implemented a DQN agent for playing Space Invaders from raw pixel inputs. The agent achieved an average score of 21.50 at the best checkpoint (1.8M steps), clearly outperforming a random agent ($\sim$5) while remaining far below a human player ($\sim$200). The implementation shows that deep reinforcement learning can learn Atari games directly from pixels and provides a solid foundation for exploring more advanced algorithms.
|
||||
|
||||
The DQN algorithm, while relatively simple, remains a powerful approach for discrete action space problems. The key innovations of experience replay and target networks are crucial for stable training. Future work could explore more advanced variants like Rainbow DQN to further improve performance.
|
||||
The DQN algorithm, while relatively simple, remains a powerful approach for problems with discrete action spaces. The key innovations of experience replay and target networks are crucial for stable training. The use of Double DQN helped reduce overestimation bias, though some performance fluctuation was observed during training. Future work could explore more advanced variants such as Rainbow DQN, Prioritized Experience Replay, or a Dueling DQN architecture to further improve performance and training stability.
|
||||
|
||||
\section*{References}
|
||||
|
||||
|
||||