chore: update project documentation, dependencies, and training script

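- Update requirements.txt: add opencv-python-headless and include uv installation instructions
- Fix line endings in CSV files (CRLF to LF)
- Update TASK_PROGRESS.md to record the parallel training implementation and WSL support
- Clean up the formatting of train_improved.py, removing redundant blank lines and comments
- Update the character encoding of the coursework requirements documents
- Add new TensorBoard log files and trained models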
2026-05-01 09:26:23 +08:00
parent 6b929e9790
commit d6860f1f15
16 changed files with 25712 additions and 25680 deletions
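
The CRLF-to-LF fix mentioned in the commit message could be done in several ways; the commit does not say which tool was used. The following is only a minimal sketch of one option in Python. The helper names `normalize_line_endings` and `convert_file` are hypothetical and do not come from this repository.

```python
from pathlib import Path


def normalize_line_endings(data: bytes) -> bytes:
    """Return data with every CRLF sequence replaced by a single LF."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path: Path) -> bool:
    """Rewrite the file in place with LF endings; return True if it changed."""
    original = path.read_bytes()
    converted = normalize_line_endings(original)
    if converted != original:
        path.write_bytes(converted)
        return True
    return False
```

Working on bytes rather than decoded text avoids touching the file's character encoding, which this commit changes separately for the coursework documents.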
@@ -1,251 +1,251 @@
-        XJTLU Entrepreneur College (Taicang) Cover Sheet
-
- Module code and Title       DTS307TC Reinforcement Learning
- School Title                School of AI and Advanced Computing
- Assignment Title            Coursework 1
- Submission Deadline         04/May/2026 23:59
- Final Word Count
- If you agree to let the university use your work anonymously for teaching
- and learning purposes, please type “yes” here.
-
-
-I certify that I have read and understood the University’s Policy for dealing with Plagiarism,
-Collusion and the Fabrication of Data (available on Learning Mall Online). With reference to this
-policy I certify that:
-
-    •   My work does not contain any instances of plagiarism and/or collusion.
-        My work does not contain any fabricated data.
-
-
-
-By uploading my assignment onto Learning Mall Online, I formally declare
-that all of the above information is true to the best of my knowledge and
-belief.
-                                         Scoring – For Tutor Use
- Student ID
-
- Stage of           Marker            Learning Outcomes Achieved （F/P/M/D）                          Final
- Marking             Code                   (please modify as appropriate)                          Score
-                               A                      B                C
- 1st Marker – red
- pen
- Moderation                        The original mark has been accepted by the moderator              Y/N
-                      IM                        (please circle as appropriate):
- – green pen        Initials
-                                   Data entry and score calculation have been checked by               Y
-                                                another tutor (please circle):
- 2nd Marker if
- needed – green
- pen
- For Academic Office Use             Possible Academic Infringement (please tick as appropriate)
- Date          Days Late                   ☐ Category A
- Received late      Penalty                                                Total Academic Infringement Penalty
-                                            ☐ Category B                   (A,B, C, D, E, Please modify where
-                                                                           necessary) _____________________
-                                            ☐ Category C
-                                            ☐ Category D
-                                            ☐ Category E
-School of Artificial Intelligence and Advanced Computing
-Xi’an Jiaotong-Liverpool University
-
-
-
-
-                             DTS307TC Reinforcement Learning
-
-                               Coursework - Individual Report
-
-Due: 04/May/2026 23:59
-Weight: 40%
-Maximum score: 40 marks
-
-
-
-
-Overview
-
-The purpose of this assignment is to gain experience in Python programming and the design of
-reinforcement learning algorithms. You are expected to implement an RL algorithm that solves a
-specific environment and provide an explanation of the algorithm’s methodology. You are expected
-to analyse your results, including challenges and your solutions.
-
-
-Learning Outcomes Assessed
-
-A: Systematically understand the fundamental concepts and principles of reinforcement learning
-B: Critically analyse real-life problem situations and expertly map them as reinforcement learning
-tasks.
-C: Mastery of Monte Carlo Methods and Temporal Difference Learning
-D: Proficiency in Deep Reinforcement Learning algorithms
-
-
-Late policy
-
-5% of the total marks available for the assessment shall be deducted from the assessment mark for
-each working day after the submission date, up to a maximum of five working days
-
-
-Avoid Plagiarism
-
-  • Do not submit work from other students.
-
-  • Do not share code/work with other students
-
-  • Do not use open-source code as it is or without proper reference.
-
-
-
-
-                                                2
-Risks
-
-   • Please read the coursework instructions and requirements carefully. Not following these instructions
-     and requirements may result in a loss of marks.
-   • The assignment must be submitted via Learning Mall. Only electronic submission is accepted
-     and no hard copy submission.
-   • All students must download their file and check that it is viewable after submission. Documents
-     may become corrupted during the uploading process (e.g. due to slow internet connections).
-     However, students are responsible for submitting a functional and correct file for assessments.
-   • Academic Integrity Policy is strictly followed.
-
-
-Individual Report (40 marks)
-
-The primary objective of this coursework is to familiarize students with the PPO algorithm using
-basic deep learning libraries, enabling them to improve their capability in transferring mathematical
-and theoretical knowledge into Python implementation, and further their understanding of the actor-
-critic algorithm.
-
-
-Algorithm Overview
-
-Proximal Policy Optimization (PPO) is a state-of-the-art reinforcement learning algorithm that optimizes
-a stochastic policy in an on-policy manner. To ensure stable training and avoid catastrophic performance
-collapse, PPO utilizes a clipped surrogate objective to prevent the policy update from stepping too
-far from the current behavior.
-
-
-The Environment: CarRacing-v3
-
-We will be using the Car Racing environment from the OpenAI Gymnasium. This environment
-features a top-down racing track where the agent must learn to navigate through tiles based on
-pixel inputs. You can find more details about this environment on their website
-(https://gymnasium.farama.org/environments/box2d/car_racing/).
-Here’s a code snippet for you to get started:
-
-import gymnasium as gym
-env = gym.make("CarRacing-v3", render_mode="rgb_array")
-env.reset()
-
-Since CarRacing-v3 is quite computationally expensive for a standard laptop (due to the pixel processing),
-you might want to consider using a gray-scaling or frame-stacking wrapper to speed up training.
-Alternatively, you can also use the lab computers, which have GPUs and have all the environment
-already set up.
-
-
-The PPO Agent
-
-You will implement an RL agent using PPO to play the CarRacing-v3 environment. The agent
-will use the standard observation and actions provided by the environment. You may edit the
-
-                                                      3
-environment to speed up your training, but your agent must still perform well in the standard
-environment. (e.g., removing the camera zoom at the beginning is allowed during training, but
-your agent should still be tested in the original environment.) You should record your training and
-evaluation process using Tensorboard. You should also record important losses and other data for
-your analysis later.
-
-
-The Report
-
-Upon completion of your implementation, you are required to submit a comprehensive technical
-report. The report should document your engineering decisions, the theoretical grounding of your
-code, and a critical analysis of the agent’s performance.
-
-  1. Introduction
-
-       • Provide a brief overview of Reinforcement Learning in the context of the CarRacing-v3
-         environment.
-       • Define the state space (pixels), action space (discrete commands), and the reward structure
-         of the task.
-
-  2. Methodology
-
-       • Mathematical Foundation: Formulate the PPO objective function. Explain the significance
-         of the clipping parameter and the probability ratio.
-       • Advantage Estimation: Describe your method for calculating advantages (e.g., standard
-         advantage vs. Generalized Advantage Estimation (GAE)).
-
-  3. Implementation Details
-
-       • Describe your implementation, including any challenges faced and how you addressed
-         them.
-       • Explain the structure of your policy and value networks.
-       • Detail the training process and hyperparameters used.
-
-  4. Results and Analysis
-
-       • Present your results (use graphs for better clarity).
-       • Discuss the performance of your agent and any trends observed.
-       • Briefly compare your custom implementation’s stability and sample efficiency against baseline
-         benchmarks (e.g., Stable-Baselines3).
-
-  5. Conclusion
-
-       • Summarize your key findings regarding the sensitivity of PPO to hyperparameter tuning
-         and the effectiveness of the actor-critic framework in continuous-input environments.
-
-     Note: All figures and plots must be clearly labeled with axes titles and legends. Raw code
-     snippets should be kept to a minimum in the report; focus on high-level logic and pseudo-
-     code where necessary.
-
-
-
-
-                                                 4
-Important Note
-
-   • Do NOT use Stable-baselines libraries or any other reinforcement learning specific libraries in
-     your implementation (You may use tensorboard for recording your results).
-
-   • Do NOT exceed the word count limit of 3000 words for each report, reference and appendix
-     excluded.
-
-   • Although you are allowed to use any generative AI tools to assist your work, please keep in mind
-     that you should be using them responsibly. (Good use: Improve your report after writing it
-     and always review its output to ensure that it is correct. Bad use: Copy-pasting an entire report
-     from AI without any effort of your own. )
-
-
-Submission Requirements
-
-Please prepare and submit the following documents:
-
-   • A cover page featuring your student ID. This page should be the first page of your report.
-
-   • A zip file containing all the source codes and your trained agent model, which should be named
-     using your full name and student ID in the following format: CW1_ID_Name.zip
-
-   • One PDF file for your report. The file should be separated from the zip file, which contains your
-     code. The files should be named in the following format: CW1_ID_Name.pdf
-
-Note that the quality of the code, the clarity of your writing, and the format/style of your report will
-be taken into consideration during the evaluation. The detailed rubric is outlined below.
-
-
-Rubric
-
- CW1 (40 marks)     Criteria                                                                                       Marks
- Code Performance   Code runs without errors and performs tasks as specified.                                      6
- Code Quality       Code is well-organized, includes meaningful comments, and uses appropriate variable names.     6
- Methodology        Comprehensive coverage of topics with detailed explanations of approaches and methodologies.   6
- Result analysis    Insightful analysis of results.                                                                6
- Report Quality     Report is well-structured, formatted, and free of grammatical errors.                          6
- Evidence of Work   All required elements are included and correct.                                                6
- Submission         Follows all requirements for submission                                                        4
-
-
-
-
-                                                           5
+        XJTLU Entrepreneur College (Taicang) Cover Sheet
+
+ Module code and Title       DTS307TC Reinforcement Learning
+ School Title                School of AI and Advanced Computing
+ Assignment Title            Coursework 1
+ Submission Deadline         04/May/2026 23:59
+ Final Word Count
+ If you agree to let the university use your work anonymously for teaching
+ and learning purposes, please type “yes” here.
+
+
+I certify that I have read and understood the University’s Policy for dealing with Plagiarism,
+Collusion and the Fabrication of Data (available on Learning Mall Online). With reference to this
+policy I certify that:
+
+    •   My work does not contain any instances of plagiarism and/or collusion.
+        My work does not contain any fabricated data.
+
+
+
+By uploading my assignment onto Learning Mall Online, I formally declare
+that all of the above information is true to the best of my knowledge and
+belief.
+                                         Scoring – For Tutor Use
+ Student ID
+
+ Stage of           Marker            Learning Outcomes Achieved （F/P/M/D）                          Final
+ Marking             Code                   (please modify as appropriate)                          Score
+                               A                      B                C
+ 1st Marker – red
+ pen
+ Moderation                        The original mark has been accepted by the moderator              Y/N
+                      IM                        (please circle as appropriate):
+ – green pen        Initials
+                                   Data entry and score calculation have been checked by               Y
+                                                another tutor (please circle):
+ 2nd Marker if
+ needed – green
+ pen
+ For Academic Office Use             Possible Academic Infringement (please tick as appropriate)
+ Date          Days Late                   ☐ Category A
+ Received late      Penalty                                                Total Academic Infringement Penalty
+                                            ☐ Category B                   (A,B, C, D, E, Please modify where
+                                                                           necessary) _____________________
+                                            ☐ Category C
+                                            ☐ Category D
+                                            ☐ Category E
+School of Artificial Intelligence and Advanced Computing
+Xi’an Jiaotong-Liverpool University
+
+
+
+
+                             DTS307TC Reinforcement Learning
+
+                               Coursework - Individual Report
+
+Due: 04/May/2026 23:59
+Weight: 40%
+Maximum score: 40 marks
+
+
+
+
+Overview
+
+The purpose of this assignment is to gain experience in Python programming and the design of
+reinforcement learning algorithms. You are expected to implement an RL algorithm that solves a
+specific environment and provide an explanation of the algorithm’s methodology. You are expected
+to analyse your results, including challenges and your solutions.
+
+
+Learning Outcomes Assessed
+
+A: Systematically understand the fundamental concepts and principles of reinforcement learning
+B: Critically analyse real-life problem situations and expertly map them as reinforcement learning
+tasks.
+C: Mastery of Monte Carlo Methods and Temporal Difference Learning
+D: Proficiency in Deep Reinforcement Learning algorithms
+
+
+Late policy
+
+5% of the total marks available for the assessment shall be deducted from the assessment mark for
+each working day after the submission date, up to a maximum of five working days
+
+
+Avoid Plagiarism
+
+  • Do not submit work from other students.
+
+  • Do not share code/work with other students
+
+  • Do not use open-source code as it is or without proper reference.
+
+
+
+
+                                                2
+Risks
+
+   • Please read the coursework instructions and requirements carefully. Not following these instructions
+     and requirements may result in a loss of marks.
+   • The assignment must be submitted via Learning Mall. Only electronic submission is accepted
+     and no hard copy submission.
+   • All students must download their file and check that it is viewable after submission. Documents
+     may become corrupted during the uploading process (e.g. due to slow internet connections).
+     However, students are responsible for submitting a functional and correct file for assessments.
+   • Academic Integrity Policy is strictly followed.
+
+
+Individual Report (40 marks)
+
+The primary objective of this coursework is to familiarize students with the PPO algorithm using
+basic deep learning libraries, enabling them to improve their capability in transferring mathematical
+and theoretical knowledge into Python implementation, and further their understanding of the actor-
+critic algorithm.
+
+
+Algorithm Overview
+
+Proximal Policy Optimization (PPO) is a state-of-the-art reinforcement learning algorithm that optimizes
+a stochastic policy in an on-policy manner. To ensure stable training and avoid catastrophic performance
+collapse, PPO utilizes a clipped surrogate objective to prevent the policy update from stepping too
+far from the current behavior.
+
+
+The Environment: CarRacing-v3
+
+We will be using the Car Racing environment from the OpenAI Gymnasium. This environment
+features a top-down racing track where the agent must learn to navigate through tiles based on
+pixel inputs. You can find more details about this environment on their website
+(https://gymnasium.farama.org/environments/box2d/car_racing/).
+Here’s a code snippet for you to get started:
+
+import gymnasium as gym
+env = gym.make("CarRacing-v3", render_mode="rgb_array")
+env.reset()
+
+Since CarRacing-v3 is quite computationally expensive for a standard laptop (due to the pixel processing),
+you might want to consider using a gray-scaling or frame-stacking wrapper to speed up training.
+Alternatively, you can also use the lab computers, which have GPUs and have all the environment
+already set up.
+
+
+The PPO Agent
+
+You will implement an RL agent using PPO to play the CarRacing-v3 environment. The agent
+will use the standard observation and actions provided by the environment. You may edit the
+
+                                                      3
+environment to speed up your training, but your agent must still perform well in the standard
+environment. (e.g., removing the camera zoom at the beginning is allowed during training, but
+your agent should still be tested in the original environment.) You should record your training and
+evaluation process using Tensorboard. You should also record important losses and other data for
+your analysis later.
+
+
+The Report
+
+Upon completion of your implementation, you are required to submit a comprehensive technical
+report. The report should document your engineering decisions, the theoretical grounding of your
+code, and a critical analysis of the agent’s performance.
+
+  1. Introduction
+
+       • Provide a brief overview of Reinforcement Learning in the context of the CarRacing-v3
+         environment.
+       • Define the state space (pixels), action space (discrete commands), and the reward structure
+         of the task.
+
+  2. Methodology
+
+       • Mathematical Foundation: Formulate the PPO objective function. Explain the significance
+         of the clipping parameter and the probability ratio.
+       • Advantage Estimation: Describe your method for calculating advantages (e.g., standard
+         advantage vs. Generalized Advantage Estimation (GAE)).
+
+  3. Implementation Details
+
+       • Describe your implementation, including any challenges faced and how you addressed
+         them.
+       • Explain the structure of your policy and value networks.
+       • Detail the training process and hyperparameters used.
+
+  4. Results and Analysis
+
+       • Present your results (use graphs for better clarity).
+       • Discuss the performance of your agent and any trends observed.
+       • Briefly compare your custom implementation’s stability and sample efficiency against baseline
+         benchmarks (e.g., Stable-Baselines3).
+
+  5. Conclusion
+
+       • Summarize your key findings regarding the sensitivity of PPO to hyperparameter tuning
+         and the effectiveness of the actor-critic framework in continuous-input environments.
+
+     Note: All figures and plots must be clearly labeled with axes titles and legends. Raw code
+     snippets should be kept to a minimum in the report; focus on high-level logic and pseudo-
+     code where necessary.
+
+
+
+
+                                                 4
+Important Note
+
+   • Do NOT use Stable-baselines libraries or any other reinforcement learning specific libraries in
+     your implementation (You may use tensorboard for recording your results).
+
+   • Do NOT exceed the word count limit of 3000 words for each report, reference and appendix
+     excluded.
+
+   • Although you are allowed to use any generative AI tools to assist your work, please keep in mind
+     that you should be using them responsibly. (Good use: Improve your report after writing it
+     and always review its output to ensure that it is correct. Bad use: Copy-pasting an entire report
+     from AI without any effort of your own. )
+
+
+Submission Requirements
+
+Please prepare and submit the following documents:
+
+   • A cover page featuring your student ID. This page should be the first page of your report.
+
+   • A zip file containing all the source codes and your trained agent model, which should be named
+     using your full name and student ID in the following format: CW1_ID_Name.zip
+
+   • One PDF file for your report. The file should be separated from the zip file, which contains your
+     code. The files should be named in the following format: CW1_ID_Name.pdf
+
+Note that the quality of the code, the clarity of your writing, and the format/style of your report will
+be taken into consideration during the evaluation. The detailed rubric is outlined below.
+
+
+Rubric
+
+ CW1 (40 marks)     Criteria                                                                                       Marks
+ Code Performance   Code runs without errors and performs tasks as specified.                                      6
+ Code Quality       Code is well-organized, includes meaningful comments, and uses appropriate variable names.     6
+ Methodology        Comprehensive coverage of topics with detailed explanations of approaches and methodologies.   6
+ Result analysis    Insightful analysis of results.                                                                6
+ Report Quality     Report is well-structured, formatted, and free of grammatical errors.                          6
+ Evidence of Work   All required elements are included and correct.                                                6
+ Submission         Follows all requirements for submission                                                        4
+
+
+
+
+                                                           5

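The brief in the hunk above asks for the PPO clipped surrogate objective and an advantage estimator such as GAE. As a rough illustration only, and not the code in train_improved.py, the two calculations might look like this in plain Python. The function names, the list-based inputs, and the default values of `gamma`, `lam`, and `clip_eps` are all assumptions made for this sketch.

```python
import math


def gae_advantages(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout of plain Python lists.

    values[t] is the critic's estimate for the state at step t, and
    last_value is the estimate for the state that follows the final step.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; bootstrapping stops where an episode ended.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Exponentially weighted sum of the TD errors that follow step t.
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def clipped_surrogate(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximise)."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the rollout policy.
        ratio = math.exp(new_lp - old_lp)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In a real training loop these quantities would be computed on tensors so that gradients can flow through the new log-probabilities, and the negative of the objective would be minimised together with a value loss and, optionally, an entropy bonus.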
@@ -1,260 +1,260 @@
-                                XJTLU Entrepreneur College (Taicang) Cover Sheet
-
-                                                                                                School of AI and Advanced
-         Module code           DTS304TC: Machine Learning                 School title
-                                                                                                Computing
-
-         Assessment title      Coursework Task 1                          Assessment type       Coursework
-
-         Submission
-                               01/May/2026 23:59
-         deadline
-
-
-I certify that I have read and understood the University's Policy for dealing with Plagiarism, Collusion and the Fabrication of Data
-(available on Learning Mall Online).
-My work does not contain any instances of plagiarism and/or collusion.
-My work does not contain any fabricated data.
-
-
-  By uploading my assignment onto Learning Mall Online, I formally declare that all of the
-            above information is true to the best of my knowledge and belief.
-                                              Scoring – For Tutor Use
-                             Student ID
-          Theory and Reflection PDF Word Count (Filled by
-                             Students)
-
-        Stage of Marking       Marker              Learning Outcomes Achieved （F/P/M/D）                           Final
-                               Code                                                                               Score
-                                                        (please modify as appropriate)
-                                                     A                   B             C
-         1st Marker – red
-               pen
-            Moderation                        The original mark has been accepted by the moderator                 Y/N
-                                  IM                       (please circle as appropriate):
-           – green pen         Initials
-                                             Data entry and score calculation have been checked by                   Y
-                                                          another tutor (please circle):
-           2nd Marker if
-         needed – green
-               pen
-          For Academic Office Use                  Possible Academic Infringement (please tick as appropriate)
-          Date      Days     Late                       ☐ Category A
-        Received     late  Penalty                                                        Total Academic Infringement Penalty
-                                                        ☐ Category B                       (A,B, C, D, E, Please modify where
-                                                                                          necessary) _____________________
-                                                        ☐ Category C
-                                                        ☐ Category D
-                                                        ☐ Category E
-                                              DTS304TC Machine Learning
-                                            Coursework - Assessment Task 1
-•     Percentage in final mark: 50%
-•     Assessment type: individual coursework
-•     Submission files: one Jupyter notebook (.ipynb), one Coursework Answer Sheet / Theory and Reflection PDF, and one
-      hidden-test CSV
-
-    Learning outcomes assessed
-•     A. Demonstrate a solid understanding of the theoretical issues related to problems that machine-learning methods try to
-      address.
-•     B. Demonstrate understanding of the properties of existing machine-learning algorithms and how they behave on practical data.
-
-
-
-    Notes
-•     Please read the coursework instructions and requirements carefully. Not following these instructions and requirements may
-      result in a loss of marks.
-•     The formal procedure for submitting coursework at XJTLU is strictly followed. Submission link on Learning Mall will be provided
-      in due course. The submission timestamp on Learning Mall will be used to check late submission.
-•     5% of the total marks available for the assessment shall be deducted from the assessment mark for each working day after the
-      submission date, up to a maximum of five working days.
-•     All modelling work must be completed individually. Discussion of general ideas is allowed, but code, experiments, and
-      notebooks must be independently developed.
-•     You may not use ChatGPT to directly generate answers for the coursework. High-scoring work must demonstrate your own
-      experimental design, controlled comparisons, failure analysis, and image-level interpretation. ChatGPT or similar tools may be
-      used only in a limited support role such as code understanding, debugging, or grammar support. They must not replace your
-      method design, ablation logic, qualitative analysis, or reflection. Generic AI-produced descriptions without matching evidence in
-      code, tables, figures, and discussion will not receive high marks.
-•     If you use AI tools or outside code in any meaningful way, you must fully understand, verify, and take ownership of every
-      method, number, figure, and written claim that appears in your submission.
-
-
-
-     Question 1: Notebook-Based Coding Exercise - Insurance Premium-Risk Classification (60
-     Marks)
-    In this coursework you will build and improve a multiclass classifier for a fictionalised health-insurance dataset. The task is
-    to predict whether each applicant belongs to a Low, Standard, or High premium-risk group before pricing a policy. The
-    dataset is intentionally realistic: it mixes numerical and categorical variables, contains missing values and dirty entries, and
-    includes some fields that require careful handling to avoid weak modelling practice or label leakage.
-    Your work should show a clear machine-learning workflow: build a sensible first pipeline, compare model families, apply
-    stronger hyperparameter optimisation, complete one compulsory improvement category plus at least one optional category,
-    carry out a compact K-Means/Gaussian Mixture Model (GMM) exploration, and then produce a hidden-test CSV using
-    validation evidence only.
-    The prediction target variable is ‘premium_risk’, and it has 3 imbalanced classes: Standard, High, Low. The dataset
-    contains 33 raw columns: admin/PII columns, synthetic noise features, 1 leakage feature, and genuine predictors.
-    Unless otherwise stated, macro-F1 is the primary validation metric because the dataset is imbalanced; accuracy is reported
-    as a secondary metric.
-    (A) Clean First Pipeline and Baseline Modelling (8 marks)
-•     Load the provided training and validation files and define a consistent target / feature setup.
-•     Handle leakage features, dirty values, missing values, and categorical variables sensibly. A compact sanity check is enough; a
-      long data-audit section is not required.
-      Important: The dataset contains a leakage feature. You must identify and remove it before proceeding to the next stage
-      of analysis; otherwise, the classification results will be severely biased by this leakage and will not be meaningful. If
-      this occurs, multiple parts of your Coursework 1 may be affected, which could significantly impact your marks.
-•     Build one baseline modelling pipeline.
-•     Report at least one validation result using accuracy and macro-F1 score and include a confusion matrix for the baseline model.
-•     Keep preprocessing consistent across train, validation, and hidden-test files.
-
-
-    (B) Controlled Comparison: Random Forest and One Boosting Model (8 marks)
-•     Using the same preprocessing pipeline, validation split, and evaluation metric (primary metric is macro-F1 also report accuracy),
-      carry out an initial controlled comparison between one Random Forest model and one boosting model.
-•     Default XGBoost is recommended because it provides a richer tuning space later, but others may also be used. Default settings
-      or only light sensible adjustments are acceptable in this section.
-•     In the notebook, report the validation result of each model and support the comparison with one or two additional analyses, such
-      as class-wise metrics, a confusion matrix, train-versus-validation behaviour, or stability / sensitivity after tuning.
-•     Your goal is not to prove that one model type always wins. Your goal is to compare the two models fairly, explain the high-level
-      learning difference between bagging and boosting, and use your own notebook evidence to give a careful, dataset-specific
-      interpretation. A generic textbook answer without reference to your own results will receive limited credit.
-    (C) Advanced Hyperparameter Optimisation (12 marks)
-•     At least one main model should be tuned with a genuinely advanced strategy such as Optuna/TPE, Bayesian optimisation,
-      Hyperopt, Ray Tune, or another comparably strong approach.
-•     Hyperparameter tuning should optimise macro-F1 score on the validation set, and the final tuned result should be reported
-      using both accuracy and macro-F1.
-•     RandomizedSearchCV alone is normally not enough for the top band.
-•     Explain briefly why your search space and optimiser are reasonable for the chosen model.
-    (D) Personalised Improvement Work (18 marks)
-    You must complete one compulsory category based on the last digit of your XJTLU student ID, plus at least one additional
-    optional category of your choice. A second optional category is recommended for stronger differentiation but is not compulsory.
-    You should report accuracy and macro-F1 for improved models and include class-wise metrics where helpful. A compact ablation
-    table should normally be included in the notebook for the personalised improvement work.
-
-     Last digit                                                    Compulsory category
-                                0-1                                Category A - Data quality and missingness
-                                2-3                                Category B - Feature representation and engineering
-                                4-5                                Category C - Imbalance and objective design
-                                6-7                                Category D - Model robustness, calibration, or ensembling
-                                8-9                                Category E - Fairness, diagnostics, or interpretability
-     Category    Examples of what may be done                                    What good evidence looks like
-     A           better missing-value strategy; MissForest or iterative          A concise before/after comparison with a short explanation
-                 imputation; sensible outlier handling; value cleaning           of why the data handling changed the result
-     B           feature crosses; grouped categories; alternative encodings;     A compact ablation showing what representation changed
-                 modest feature selection; transformations                       and whether it helped
-     C           class weighting; focal-style loss if relevant; sampling /       Clear evidence of how minority or harder classes changed,
-                 resampling; thresholding logic                                  even if overall score moved only slightly
-     D           bagging/boosting variants; calibration checks; soft voting;     A meaningful diagnostic or comparison rather than a large
-                 stacking; robustness checks                                     collection of loosely connected trials
-     E           SHAP / feature importance; subgroup-style fairness checks;      Concrete insight into model behaviour, not only screenshots
-                 error analysis; model interpretation
-    (E) K-Means and Gaussian Mixture Model (GMM) Exploration (6 marks)
-    This is a compact exploratory section. It is not the main performance section, and it does not require clusters to match the class
-    labels exactly. The aim is to show your understanding of unsupervised learning methods and your ability to interpret their results
-    carefully.
-•     Use a sensible processed numeric feature space and briefly explain what you clustered on.
-•     Explore a small range of cluster/component numbers, such as 2-8.
-•     For K-Means, provide sensible supporting evidence, such as inertia (SSE), cluster sizes, or another simple analysis.
-•     For Gaussian Mixture Model (GMM), provide sensible supporting evidence, such as component sizes, posterior
-      confidence/responsibility, or overlap/uncertainty between components.
-•     Include at least one compact table or figure comparing K-Means and GMM.
-•     If class labels are used for reference, explain clearly that unsupervised structure does not need to align exactly with supervised
-      labels.
-•     Stronger work may additionally use silhouette score, log-likelihood trends, or a simple visualization.
-
-
-    (F) Final Model Choice and Hidden-Test Export (8 marks)
-•     Choose the final model using validation evidence only.
-•     Retrain appropriately using both the training and validation datasets and generate the hidden-test CSV in the required format.
-•     Submit the hidden-test results as test_result_[your_student_id].csv. The first column must contain applicant_id, the second
-      column must contain customer_key, and the third column must contain the predicted premium_risk labels (Standard, High,
-      Low).
-      Incorrect file naming or CSV formatting may prevent automated scoring and will result in an automatic deduction of 4 marks
-      from this section.
-•     Do not tune on the hidden test and do not claim hidden test performance.
-•     Note: Hidden test score contributes only a small portion of the final marks. High leaderboard rank alone cannot compensate for
-      weak experimental design or poor documentation.
-
-
-     Coursework Answer Sheet / Theory and Reflection (PDF) - all questions below are compulsory
-     (30 Marks)
-    The Coursework Answer Sheet / Theory and Reflection PDF should not repeat the notebook section by section. All prompt areas
-    below are compulsory. The PDF must be concise, directly linked to your own notebook evidence, and no longer than 4 pages /
-    1,200 words in total. Exceeding either limit will incur a fixed deduction of 5 marks from the PDF section. You should aim to
-    demonstrate both your theoretical or algorithmic understanding and your experimental findings or practical observations and
-    clearly link your understanding of the algorithms to your experimental analysis. At least one table, figure, or metric from the
-    notebook must be referenced in each theory answer.
-
-                            Prompt area                                                       What you should do
-     1. Bagging versus boosting                                      (1) Briefly state the definitions and key theoretical properties of bagging
-                                                                     and boosting models;
-                                                                     (2) report the validation results of each model;
-                                                                     (3) support your comparison with one or two additional analyses, such as
-                                                                     class-wise metrics, a confusion matrix, train–validation behaviour, or
-                                                                     stability/sensitivity after tuning; and
-                                                                     (4) provide a careful interpretation of what this comparison suggests
-                                                                     about this dataset and how it relates to the theoretical properties of
-                                                                     bagging versus boosting methods.
-                                                                     You are not expected to prove that one model type always performs
-                                                                     better.
-     2. Hyperparameter optimisation                                  Explain why your optimiser and search space were reasonable for the
-                                                                     chosen model, which hyperparameters you expected to matter most,
-                                                                     whether the tuned results matched that intuition, and what you learned
-                                                                     from the tuning process.
-     3. K-Means versus Gaussian Mixture Model (GMM)                  Explain hard versus soft assignment and the main assumption difference
-                                                                     between K-Means and GMM. Then use your own compact evidence to
-                                                                     discuss whether the results matched your intuition and whether GMM
-                                                                     revealed anything extra, such as soft membership, uncertainty, or a
-                                                                     better fit to partial cluster structure.
-     4. Personalised reflection                                      Reflect on the compulsory category and on every optional category you
-                                                                     implemented. Highlight any unique or interesting algorithm or strategy
-                                                                     you tried, the personal challenges you faced, the effort you made to
-                                                                     address them, and the key lessons you learned. Honest reflection on a
-                                                                     neutral or negative result is acceptable if the reasoning is concrete.
-     5. AI-use declaration                                           State briefly what forms of AI assistance, if any, were used. Generic
-                                                                     AI-written theory that does not match your notebook evidence will receive
-                                                                     limited credit.
-
-
-
-    Coding Quality, Coursework Answer Sheet Quality, and Submission Guidelines (10 marks)
-
-•     Submit your Jupyter Notebook in .ipynb format. It must be well organised, include clear commentary and clean code practices,
-      and show visible outputs. Do not write a second mini-report repeating notebook content.
-      •    The notebook should be reproducible from start to finish without errors. Results cited in the PDF should be visible in the
-           notebook and should match the reported values.
-      •    If you used supplementary code outside the notebook, submit that code as well so the full workflow remains reproducible.
-•     Submit the hidden-test results as test_result_[your_student_id].csv. The first column must contain applicant_id, the second
-      column must contain customer_key, and the third column must contain the predicted premium_risk labels (Standard, High,
-      Low). Incorrect file naming or CSV formatting may prevent automated scoring and will result in an automatic deduction of 4
-      marks from this section.
-•     Submit the Coursework Answer Sheet / Theory and Reflection in PDF format. All questions in that section are compulsory. The
-      Coursework Answer Sheet / Theory and Reflection PDF must answer every required prompt, refer to your own notebook
-      evidence, and remain within 4 pages and 1,200 words in total. Exceeding either limit will incur a fixed deduction of 5 marks from
-      the PDF section.
-•     Include all required components: Jupyter notebooks (code), any additional experimental scripts or custom code, the hidden
-      test-results CSV file, and the Coursework Answer Sheet PDF. Submit all files through the Learning Mall platform. After
-      submission, download your files to verify that they can be opened and viewed correctly to ensure the submission was
-      successful.
-
-    Project Material Access Instructions
-
-    To access the complete set of materials for this project, please use the links below:
-
-        •    OneDrive Link:
-             https://1drv.ms/f/c/18f09d1a39585f84/IgCXDMbXkFYSSZUZkkTyXyZzAQ1poX9mujUqF8N3JlL0GD0?e=uNhAHq
-        •    The same coursework materials have also been uploaded to Learning Mall.
-    When extracting the materials, use the following password to unlock the zip file: DTS304TC (case-sensitive, enter in
-    uppercase).
+                                XJTLU Entrepreneur College (Taicang) Cover Sheet
+
+         Module code           DTS304TC: Machine Learning                 School title          School of AI and Advanced Computing
+
+         Assessment title      Coursework Task 1                          Assessment type       Coursework
+
+         Submission deadline   01/May/2026 23:59
+
+
+I certify that I have read and understood the University's Policy for dealing with Plagiarism, Collusion and the Fabrication of Data
+(available on Learning Mall Online).
+My work does not contain any instances of plagiarism and/or collusion.
+My work does not contain any fabricated data.
+
+
+  By uploading my assignment onto Learning Mall Online, I formally declare that all of the
+            above information is true to the best of my knowledge and belief.
+                                              Scoring – For Tutor Use
+                             Student ID
+          Theory and Reflection PDF Word Count (Filled by
+                             Students)
+
+        Stage of Marking       Marker              Learning Outcomes Achieved （F/P/M/D）                           Final
+                               Code                                                                               Score
+                                                        (please modify as appropriate)
+                                                     A                   B             C
+         1st Marker – red
+               pen
+            Moderation                        The original mark has been accepted by the moderator                 Y/N
+                                  IM                       (please circle as appropriate):
+           – green pen         Initials
+                                             Data entry and score calculation have been checked by                   Y
+                                                          another tutor (please circle):
+           2nd Marker if
+         needed – green
+               pen
+          For Academic Office Use                  Possible Academic Infringement (please tick as appropriate)
+          Date      Days     Late                       ☐ Category A
+        Received     late  Penalty                                                        Total Academic Infringement Penalty
+                                                        ☐ Category B                       (A,B, C, D, E, Please modify where
+                                                                                          necessary) _____________________
+                                                        ☐ Category C
+                                                        ☐ Category D
+                                                        ☐ Category E
+                                              DTS304TC Machine Learning
+                                            Coursework - Assessment Task 1
+•     Percentage in final mark: 50%
+•     Assessment type: individual coursework
+•     Submission files: one Jupyter notebook (.ipynb), one Coursework Answer Sheet / Theory and Reflection PDF, and one
+      hidden-test CSV
+
+    Learning outcomes assessed
+•     A. Demonstrate a solid understanding of the theoretical issues related to problems that machine-learning methods try to
+      address.
+•     B. Demonstrate understanding of the properties of existing machine-learning algorithms and how they behave on practical data.
+
+
+
+    Notes
+•     Please read the coursework instructions and requirements carefully. Not following these instructions and requirements may
+      result in a loss of marks.
+•     The formal procedure for submitting coursework at XJTLU is strictly followed. The submission link on Learning Mall will be
+      provided in due course. The submission timestamp on Learning Mall will be used to check for late submission.
+•     5% of the total marks available for the assessment shall be deducted from the assessment mark for each working day after the
+      submission date, up to a maximum of five working days.
+•     All modelling work must be completed individually. Discussion of general ideas is allowed, but code, experiments, and
+      notebooks must be independently developed.
+•     You may not use ChatGPT to directly generate answers for the coursework. High-scoring work must demonstrate your own
+      experimental design, controlled comparisons, failure analysis, and image-level interpretation. ChatGPT or similar tools may be
+      used only in a limited support role such as code understanding, debugging, or grammar support. They must not replace your
+      method design, ablation logic, qualitative analysis, or reflection. Generic AI-produced descriptions without matching evidence in
+      code, tables, figures, and discussion will not receive high marks.
+•     If you use AI tools or outside code in any meaningful way, you must fully understand, verify, and take ownership of every
+      method, number, figure, and written claim that appears in your submission.
+
+
+
+     Question 1: Notebook-Based Coding Exercise - Insurance Premium-Risk Classification (60
+     Marks)
+    In this coursework you will build and improve a multiclass classifier for a fictionalised health-insurance dataset. The task is
+    to predict whether each applicant belongs to a Low, Standard, or High premium-risk group before pricing a policy. The
+    dataset is intentionally realistic: it mixes numerical and categorical variables, contains missing values and dirty entries, and
+    includes some fields that require careful handling to avoid weak modelling practice or label leakage.
+    Your work should show a clear machine-learning workflow: build a sensible first pipeline, compare model families, apply
+    stronger hyperparameter optimisation, complete one compulsory improvement category plus at least one optional category,
+    carry out a compact K-Means/Gaussian Mixture Model (GMM) exploration, and then produce a hidden-test CSV using
+    validation evidence only.
+    The prediction target variable is ‘premium_risk’, and it has 3 imbalanced classes: Standard, High, Low. The dataset
+    contains 33 raw columns: admin/PII columns, synthetic noise features, 1 leakage feature, and genuine predictors.
+    Unless otherwise stated, macro-F1 is the primary validation metric because the dataset is imbalanced; accuracy is reported
+    as a secondary metric.
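Reviewer note on the metric choice above: the following is a minimal sketch, in plain Python, of why macro-F1 can be more informative than accuracy when classes are imbalanced. The toy labels and the helper names (`per_class_f1`, `macro_f1`, `accuracy`) are made up for illustration and are not taken from the coursework dataset.

```python
LABELS = ["Low", "Standard", "High"]


def per_class_f1(y_true, y_pred, label):
    """F1 score for one class, treating that class as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up, imbalanced toy labels: 8 Standard, 1 High, 1 Low.
y_true = ["Standard"] * 8 + ["High", "Low"]
# A degenerate "model" that always predicts the majority class.
y_pred = ["Standard"] * 10

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}  macro-F1={mf1:.3f}")
```

In this toy case the majority-class predictor looks reasonable on accuracy but scores poorly on macro-F1, because the two minority classes each contribute an F1 of zero to the unweighted mean.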
+    (A) Clean First Pipeline and Baseline Modelling (8 marks)
+•     Load the provided training and validation files and define a consistent target / feature setup.
+•     Handle leakage features, dirty values, missing values, and categorical variables sensibly. A compact sanity check is enough; a
+      long data-audit section is not required.
+      Important: The dataset contains a leakage feature. You must identify and remove it before proceeding to the next stage
+      of analysis; otherwise, the classification results will be severely biased by this leakage and will not be meaningful. If
+      this occurs, multiple parts of your Coursework 1 may be affected, which could significantly impact your marks.
+•     Build one baseline modelling pipeline.
+•     Report at least one validation result using accuracy and macro-F1 score and include a confusion matrix for the baseline model.
+•     Keep preprocessing consistent across train, validation, and hidden-test files.
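Reviewer note on section (A): one possible shape for a baseline pipeline is sketched below, assuming scikit-learn and pandas are available. The column names (`age`, `bmi`, `region`), the synthetic data generator, and the choice of logistic regression are all hypothetical stand-ins; the real dataset has different columns, and the leakage and admin columns would need to be dropped before this step.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

LABELS = ["Low", "Standard", "High"]
NUMERIC = ["age", "bmi"]      # hypothetical numeric columns
CATEGORICAL = ["region"]      # hypothetical categorical column


def make_toy_frame(n, seed):
    """Synthetic stand-in for the coursework files (not the real data)."""
    rng = np.random.default_rng(seed)
    age = rng.integers(18, 80, size=n).astype(float)
    bmi = rng.normal(27, 5, size=n)
    region = rng.choice(["north", "south", "east"], size=n)
    score = 0.04 * (age - 45) + 0.15 * (bmi - 27) + rng.normal(0, 0.5, size=n)
    risk = np.where(score > 0.8, "High", np.where(score < -0.8, "Low", "Standard"))
    df = pd.DataFrame({
        "age": age,
        "bmi": bmi,
        "region": pd.Series(region, dtype=object),
        "premium_risk": risk,
    })
    # Inject some missing values so the imputers have work to do.
    df.loc[rng.random(n) < 0.1, "bmi"] = np.nan
    df.loc[rng.random(n) < 0.1, "region"] = np.nan
    return df


# Preprocessing lives inside the pipeline, so the same fitted steps are
# applied to the train, validation, and hidden-test files.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), NUMERIC),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), CATEGORICAL),
])
baseline = Pipeline([("prep", preprocess), ("clf", LogisticRegression(max_iter=1000))])

train, valid = make_toy_frame(400, seed=0), make_toy_frame(200, seed=1)
baseline.fit(train[NUMERIC + CATEGORICAL], train["premium_risk"])
pred = baseline.predict(valid[NUMERIC + CATEGORICAL])

val_acc = accuracy_score(valid["premium_risk"], pred)
val_macro_f1 = f1_score(valid["premium_risk"], pred, average="macro")
cm = confusion_matrix(valid["premium_risk"], pred, labels=LABELS)
print(f"accuracy={val_acc:.3f}  macro-F1={val_macro_f1:.3f}")
print(cm)
```

Keeping the imputers and encoder inside one `Pipeline` object is one way to satisfy the consistency requirement: the transformations are fitted on the training data only and then reused unchanged on the other files.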
+
+
+    (B) Controlled Comparison: Random Forest and One Boosting Model (8 marks)
+•     Using the same preprocessing pipeline, validation split, and evaluation metric (the primary metric is macro-F1; also report
+      accuracy), carry out an initial controlled comparison between one Random Forest model and one boosting model.
+•     Default XGBoost is recommended because it provides a richer tuning space later, but others may also be used. Default settings
+      or only light sensible adjustments are acceptable in this section.
+•     In the notebook, report the validation result of each model and support the comparison with one or two additional analyses, such
+      as class-wise metrics, a confusion matrix, train-versus-validation behaviour, or stability / sensitivity after tuning.
+•     Your goal is not to prove that one model type always wins. Your goal is to compare the two models fairly, explain the high-level
+      learning difference between bagging and boosting, and use your own notebook evidence to give a careful, dataset-specific
+      interpretation. A generic textbook answer without reference to your own results will receive limited credit.
+    (C) Advanced Hyperparameter Optimisation (12 marks)
+•     At least one main model should be tuned with a genuinely advanced strategy such as Optuna/TPE, Bayesian optimisation,
+      Hyperopt, Ray Tune, or another comparably strong approach.
+•     Hyperparameter tuning should optimise macro-F1 score on the validation set, and the final tuned result should be reported
+      using both accuracy and macro-F1.
+•     RandomizedSearchCV alone is normally not enough for the top band.
+•     Explain briefly why your search space and optimiser are reasonable for the chosen model.
+    (D) Personalised Improvement Work (18 marks)
+    You must complete one compulsory category based on the last digit of your XJTLU student ID, plus at least one additional
+    optional category of your choice. A second optional category is recommended for stronger differentiation but is not compulsory.
+    You should report accuracy and macro-F1 for improved models and include class-wise metrics where helpful. A compact ablation
+    table should normally be included in the notebook for the personalised improvement work.
+
+     Last digit                                                    Compulsory category
+                                0-1                                Category A - Data quality and missingness
+                                2-3                                Category B - Feature representation and engineering
+                                4-5                                Category C - Imbalance and objective design
+                                6-7                                Category D - Model robustness, calibration, or ensembling
+                                8-9                                Category E - Fairness, diagnostics, or interpretability
+     Category    Examples of what may be done                                    What good evidence looks like
+     A           better missing-value strategy; MissForest or iterative          A concise before/after comparison with a short explanation
+                 imputation; sensible outlier handling; value cleaning           of why the data handling changed the result
+     B           feature crosses; grouped categories; alternative encodings;     A compact ablation showing what representation changed
+                 modest feature selection; transformations                       and whether it helped
+     C           class weighting; focal-style loss if relevant; sampling /       Clear evidence of how minority or harder classes changed,
+                 resampling; thresholding logic                                  even if overall score moved only slightly
+     D           bagging/boosting variants; calibration checks; soft voting;     A meaningful diagnostic or comparison rather than a large
+                 stacking; robustness checks                                     collection of loosely connected trials
+     E           SHAP / feature importance; subgroup-style fairness checks;      Concrete insight into model behaviour, not only screenshots
+                 error analysis; model interpretation
+    (E) K-Means and Gaussian Mixture Model (GMM) Exploration (6 marks)
+    This is a compact exploratory section. It is not the main performance section, and it does not require clusters to match the class
+    labels exactly. The aim is to show your understanding of unsupervised learning methods and your ability to interpret their results
+    carefully.
+•     Use a sensible processed numeric feature space and briefly explain what you clustered on.
+•     Explore a small range of cluster/component numbers, such as 2-8.
+•     For K-Means, provide sensible supporting evidence, such as inertia (SSE), cluster sizes, or another simple analysis.
+•     For Gaussian Mixture Model (GMM), provide sensible supporting evidence, such as component sizes, posterior
+      confidence/responsibility, or overlap/uncertainty between components.
+•     Include at least one compact table or figure comparing K-Means and GMM.
+•     If class labels are used for reference, explain clearly that unsupervised structure does not need to align exactly with supervised
+      labels.
+•     Stronger work may additionally use silhouette score, log-likelihood trends, or a simple visualization.
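Reviewer note on section (E): the sketch below shows one way to collect the kinds of evidence listed above into a compact K-Means versus GMM table, assuming scikit-learn is available. The data come from `make_blobs` purely for illustration, and the reported quantities (inertia, cluster sizes, BIC, mean maximum responsibility) are one possible selection rather than a required set.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic numeric features standing in for the processed feature space.
X_raw, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.5, random_state=0)
X = StandardScaler().fit_transform(X_raw)

rows = []
for k in range(2, 9):  # cluster / component numbers 2..8
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
    resp = gmm.predict_proba(X)  # soft assignments (responsibilities)
    rows.append({
        "k": k,
        "kmeans_inertia": float(km.inertia_),
        "kmeans_sizes": np.bincount(km.labels_, minlength=k).tolist(),
        "gmm_bic": float(gmm.bic(X)),
        # Mean of each point's largest responsibility: values near 1 mean
        # confident assignments, lower values mean overlapping components.
        "gmm_mean_max_resp": float(resp.max(axis=1).mean()),
    })

for r in rows:
    print(f"k={r['k']}  inertia={r['kmeans_inertia']:.1f}  sizes={r['kmeans_sizes']}  "
          f"BIC={r['gmm_bic']:.1f}  mean max resp={r['gmm_mean_max_resp']:.3f}")
```

K-Means gives each point exactly one cluster, so its evidence is inertia and cluster sizes; the GMM columns add a view of how confident the soft assignments are, which is the kind of contrast the theory answer on hard versus soft assignment can refer to.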
+
+
+    (F) Final Model Choice and Hidden-Test Export (8 marks)
+•     Choose the final model using validation evidence only.
+•     Retrain appropriately using both the training and validation datasets and generate the hidden-test CSV in the required format.
+•     Submit the hidden-test results as test_result_[your_student_id].csv. The first column must contain applicant_id, the second
+      column must contain customer_key, and the third column must contain the predicted premium_risk labels (Standard, High,
+      Low).
+      Incorrect file naming or CSV formatting may prevent automated scoring and will result in an automatic deduction of 4 marks
+      from this section.
+•     Do not tune on the hidden test and do not claim hidden test performance.
+•     Note: Hidden test score contributes only a small portion of the final marks. High leaderboard rank alone cannot compensate for
+      weak experimental design or poor documentation.
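Reviewer note on section (F): a minimal sketch of writing and sanity-checking the export with the Python standard library. It assumes the file should carry a header row named exactly `applicant_id`, `customer_key`, `premium_risk`; the brief only fixes the column order and contents, so the header wording is an assumption. The student ID and the example rows are hypothetical.

```python
import csv
import io

ALLOWED_LABELS = {"Standard", "High", "Low"}
# Assumed header names; the brief specifies column order, not header text.
REQUIRED_HEADER = ["applicant_id", "customer_key", "premium_risk"]


def result_filename(student_id):
    """Build the required file name test_result_[your_student_id].csv."""
    return f"test_result_{student_id}.csv"


def write_results(handle, rows):
    """Write (applicant_id, customer_key, label) rows in the required column order."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(REQUIRED_HEADER)
    for applicant_id, customer_key, label in rows:
        if label not in ALLOWED_LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        writer.writerow([applicant_id, customer_key, label])


def check_results(handle):
    """Return True if the header and every row match the expected layout."""
    reader = csv.reader(handle)
    if next(reader, None) != REQUIRED_HEADER:
        return False
    return all(len(row) == 3 and row[2] in ALLOWED_LABELS for row in reader)


# Hypothetical predictions; real rows would come from the final model.
example_rows = [("A001", "K-17", "Standard"), ("A002", "K-42", "High"), ("A003", "K-08", "Low")]

buffer = io.StringIO()  # in-memory stand-in for open(result_filename(...), "w", newline="")
write_results(buffer, example_rows)
buffer.seek(0)
is_valid = check_results(buffer)
print(result_filename("1234567"), "valid:", is_valid)
```

Running a check like this before upload is a cheap guard against the 4-mark deduction for naming or formatting errors.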
+
+
+     Coursework Answer Sheet / Theory and Reflection (PDF) - all questions below are compulsory
+     (30 Marks)
+    The Coursework Answer Sheet / Theory and Reflection PDF should not repeat the notebook section by section. All prompt areas
+    below are compulsory. The PDF must be concise, directly linked to your own notebook evidence, and no longer than 4 pages /
+    1,200 words in total. Exceeding either limit will incur a fixed deduction of 5 marks from the PDF section. You should aim to
+    demonstrate both your theoretical or algorithmic understanding and your experimental findings or practical observations and
+    clearly link your understanding of the algorithms to your experimental analysis. At least one table, figure, or metric from the
+    notebook must be referenced in each theory answer.
+
+                            Prompt area                                                       What you should do
+                                                                     (1) Briefly state the definitions and key theoretical properties of bagging
+                                                                     and boosting models;
+                                                                     (2) report the validation results of each model;
+                                                                     (3) support your comparison with one or two additional analyses, such as
+                                                                     class-wise metrics, a confusion matrix, train–validation behaviour, or
+     1. Bagging versus boosting                                      stability/sensitivity after tuning; and
+                                                                     (4) provide a careful interpretation of what this comparison suggests
+                                                                     about this dataset and how it relates to the theoretical properties of
+                                                                     bagging versus boosting methods.
+                                                                     You are not expected to prove that one model type always performs
+                                                                     better.
+     2. Hyperparameter optimisation                                  Explain why your optimiser and search space were reasonable for the
+                                                                     chosen model, which hyperparameters you expected to matter most,
+                                                                     whether the tuned results matched that intuition, and what you learned
+                                                                     from the tuning process.
+     3. K-Means versus Gaussian Mixture Model (GMM)                  Explain hard versus soft assignment and the main assumption difference
+                                                                     between K-Means and GMM. Then use your own compact evidence to
+                                                                     discuss whether the results matched your intuition and whether GMM
+                                                                     revealed anything extra, such as soft membership, uncertainty, or a
+                                                                     better fit to partial cluster structure.
+     4. Personalised reflection                                      Reflect on the compulsory category and on every optional category you
+                                                                     implemented. Highlight any unique or interesting algorithm or strategy
+                                                                     you tried, the personal challenges you faced, the effort you made to
+                                                                     address them, and the key lessons you learned. Honest reflection on a
+                                                                     neutral or negative result is acceptable if the reasoning is concrete.
+     5. AI-use declaration                                           State briefly what forms of AI assistance, if any, were used. Generic
+                                                                     AI-written theory that does not match your notebook evidence will receive
+                                                                     limited credit.
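
As a rough illustration of the hard versus soft assignment distinction named in prompt 3, the sketch below compares the two assignment rules on a single 1-D point. It is not a full K-Means or GMM fit: the centres, variances, weights, and the helper names `hard_assign` and `soft_assign` are all hypothetical values chosen for the example, and it uses only the Python standard library.

```python
import math


def hard_assign(x, centres):
    """K-Means-style hard assignment: index of the nearest centre."""
    return min(range(len(centres)), key=lambda k: (x - centres[k]) ** 2)


def soft_assign(x, means, variances, weights):
    """GMM-style soft assignment: posterior responsibility of each component."""
    densities = [
        w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
        for m, v, w in zip(means, variances, weights)
    ]
    total = sum(densities)
    return [d / total for d in densities]


# Hypothetical setup: two 1-D clusters centred at 0 and 4 with unit variance.
centres = [0.0, 4.0]
point = 1.9  # slightly closer to the first centre than to the second

label = hard_assign(point, centres)
resp = soft_assign(point, centres, variances=[1.0, 1.0], weights=[0.5, 0.5])
print(label, [round(r, 3) for r in resp])
```

The hard rule commits the point entirely to cluster 0, while the soft rule splits it roughly 0.60 to 0.40 between the two components. That split is the kind of uncertainty information the prompt asks you to look for in your own notebook evidence.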
+
+
+
+    Coding Quality, Coursework Answer Sheet Quality, and Submission Guidelines (10 marks)
+
+•     Submit your Jupyter Notebook in .ipynb format. It must be well organised, include clear commentary and clean code practices,
+      and show visible outputs. Do not write a second mini-report repeating notebook content.
+      •    The notebook should be reproducible from start to finish without errors. Results cited in the PDF should be visible in the
+           notebook and should match the reported values.
+      •    If you used supplementary code outside the notebook, submit that code as well so the full workflow remains reproducible.
+•     Submit the hidden-test results as test_result_[your_student_id].csv. The first column must contain applicant_id, the second
+      column must contain customer_key, and the third column must contain the predicted premium_risk labels (Standard, High,
+      Low). Incorrect file naming or CSV formatting may prevent automated scoring and will result in an automatic deduction of 4
+      marks from this section.
+•     Submit the Coursework Answer Sheet / Theory and Reflection in PDF format. All questions in that section are compulsory. The
+      PDF must answer every required prompt, refer to your own notebook evidence, and remain within 4 pages and 1,200 words in
+      total. Exceeding either limit will incur a fixed deduction of 5 marks from the PDF section.
+•     Include all required components: Jupyter notebooks (code), any additional experimental scripts or custom code, the
+      hidden-test results CSV file, and the Coursework Answer Sheet PDF. Submit all files through the Learning Mall platform.
+      After submission, download your files and check that they open and display correctly.
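
The CSV requirements above (file name, column order, and allowed labels) can be checked at write time with a small helper. This is a minimal sketch using only the standard library. The function name `write_test_results`, the student ID, and the example rows are hypothetical, and the header row is an assumption, since the brief fixes only the column order.

```python
import csv
import os
import tempfile

ALLOWED_LABELS = {"Standard", "High", "Low"}


def write_test_results(rows, student_id, directory="."):
    """Write (applicant_id, customer_key, premium_risk) rows in the required column order."""
    path = os.path.join(directory, f"test_result_{student_id}.csv")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Assumption: a header row with these column names is expected.
        writer.writerow(["applicant_id", "customer_key", "premium_risk"])
        for applicant_id, customer_key, label in rows:
            if label not in ALLOWED_LABELS:
                raise ValueError(f"unexpected premium_risk label: {label!r}")
            writer.writerow([applicant_id, customer_key, label])
    return path


# Hypothetical predictions and student ID, written to a temporary directory for the demo.
example_rows = [("A001", "C9001", "Standard"), ("A002", "C9002", "High")]
output_path = write_test_results(example_rows, "1234567", tempfile.gettempdir())
print(output_path)
```

In a real notebook the rows would come from your model's predictions on the hidden test set, and the file would be written next to the notebook instead of to a temporary directory.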
+
+    Project Material Access Instructions
+
+    To access the complete set of materials for this project, please use the links below:
+
+        •    OneDrive Link:
+             https://1drv.ms/f/c/18f09d1a39585f84/IgCXDMbXkFYSSZUZkkTyXyZzAQ1poX9mujUqF8N3JlL0GD0?e=uNhAHq
+        •    The same coursework materials have also been uploaded to Learning Mall.
+    When extracting the materials, use the following password to unlock the zip file: DTS304TC (case-sensitive, enter in
+    uppercase).