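feat: add parallel training script and reward shaping to improve PPO performance
Introduce a parallel-environment training script, train_parallel_improved.py, that collects data across multiple processes in parallel.
Add a reward-shaping wrapper that adjusts the reward signal based on speed, track position, and laps completed.
Tune the neural network architecture and training parameters, including a larger rollout buffer.
Delete the old TensorBoard log files and create a new training run record.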
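As a rough illustration of the reward-shaping idea described in this message, the sketch below wraps an environment and adjusts its reward using speed, track position, and laps completed. It is not the code from this commit: the `info` keys (`speed`, `off_track`, `laps`), the coefficients in `ShapingConfig`, and the `ToyTrackEnv` stand-in are all hypothetical, and the wrapper assumes a plain `step()` that returns `(obs, reward, done, info)`.

```python
from dataclasses import dataclass


@dataclass
class ShapingConfig:
    # Illustrative coefficients only; these are assumptions, not values from the commit.
    speed_bonus: float = 0.1
    off_track_penalty: float = 1.0
    lap_bonus: float = 10.0


class RewardShapingWrapper:
    """Wraps an env whose step() returns (obs, reward, done, info)."""

    def __init__(self, env, config=None):
        self.env = env
        self.config = config or ShapingConfig()
        self._laps_seen = 0

    def reset(self):
        self._laps_seen = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        cfg = self.config
        # Reward forward progress in proportion to the reported speed.
        shaped = reward + cfg.speed_bonus * info.get("speed", 0.0)
        # Penalize steps where the car has left the track.
        if info.get("off_track", False):
            shaped -= cfg.off_track_penalty
        # Pay a one-time bonus for each newly completed lap.
        laps = info.get("laps", 0)
        if laps > self._laps_seen:
            shaped += cfg.lap_bonus * (laps - self._laps_seen)
            self._laps_seen = laps
        return obs, shaped, done, info


class ToyTrackEnv:
    """Hypothetical stand-in environment, used only to exercise the wrapper."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        info = {
            "speed": 2.0,
            "off_track": self.t == 2,          # leaves the track on step 2
            "laps": 1 if self.t >= 3 else 0,   # finishes a lap on step 3
        }
        return self.t, 1.0, self.t >= 3, info


if __name__ == "__main__":
    env = RewardShapingWrapper(ToyTrackEnv())
    env.reset()
    for _ in range(3):
        _, shaped_reward, done, _ = env.step(action=0)
        print(shaped_reward, done)
```

Keeping the shaping terms in a wrapper leaves the underlying environment untouched, so the same coefficients can be applied to every worker when data is collected in parallel.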
BIN
Binary file not shown.