% elderly-heat-warning/thesis/chapters/ch2-theory.tex

\chapter{Related Theory and Technical Foundations}

\section{LSTM Neural Networks}

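Long Short-Term Memory (LSTM) networks, proposed by Hochreiter and Schmidhuber in 1997, are an important variant of recurrent neural networks (RNNs). A conventional RNN struggles to learn long-range dependencies in long sequences because its gradients vanish or explode during training; LSTM largely mitigates this problem by introducing a gating mechanism.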

\subsection{LSTM Cell Structure}

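The core of an LSTM cell is the cell state $\mathbf{C}_t$, which acts as a channel carrying information along the entire sequence and is regulated by three gates (the forget gate, the input gate, and the output gate):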

\begin{equation}
\mathbf{f}_t = \sigma(\mathbf{W}_f \cdot [\mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_f)
\end{equation}
\begin{equation}
\mathbf{i}_t = \sigma(\mathbf{W}_i \cdot [\mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_i)
\end{equation}
\begin{equation}
\tilde{\mathbf{C}}_t = \tanh(\mathbf{W}_C \cdot [\mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_C)
\end{equation}
\begin{equation}
\mathbf{C}_t = \mathbf{f}_t \odot \mathbf{C}_{t-1} + \mathbf{i}_t \odot \tilde{\mathbf{C}}_t
\end{equation}
\begin{equation}
\mathbf{o}_t = \sigma(\mathbf{W}_o \cdot [\mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_o)
\end{equation}
\begin{equation}
\mathbf{h}_t = \mathbf{o}_t \odot \tanh(\mathbf{C}_t)
\end{equation}

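Here $\sigma$ is the sigmoid activation function, $\odot$ denotes the element-wise product, $[\mathbf{h}_{t-1}, \mathbf{x}_t]$ is the concatenation of the previous hidden state and the current input, and each $\mathbf{W}$ and $\mathbf{b}$ is a learned weight matrix and bias vector. $\mathbf{f}_t$, $\mathbf{i}_t$, and $\mathbf{o}_t$ are the activation vectors of the forget, input, and output gates, and $\tilde{\mathbf{C}}_t$ is the candidate cell state. The forget gate controls how much of the previous cell state is retained, the input gate decides how much new information is written, and the output gate regulates how much the cell state contributes to the hidden state.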
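
The six equations above can be traced in a few lines of plain Python. The following is a minimal sketch rather than the implementation used in this study: the helper names \texttt{gate} and \texttt{lstm\_step}, the dictionary layout of \texttt{params}, and all numeric values are assumptions made for illustration.

\begin{verbatim}
```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gate(params, name, z, activation):
    """Apply activation(W . z + b) for the gate called `name`."""
    W, b = params[name]
    return [activation(sum(w * v for w, v in zip(row, z)) + b_j)
            for row, b_j in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; vectors are plain Python lists."""
    z = h_prev + x_t                            # concatenation [h_{t-1}, x_t]
    f = gate(params, "f", z, sigmoid)           # forget gate
    i = gate(params, "i", z, sigmoid)           # input gate
    c_tilde = gate(params, "C", z, math.tanh)   # candidate cell state
    o = gate(params, "o", z, sigmoid)           # output gate
    c_t = [f_j * c_j + i_j * g_j
           for f_j, c_j, i_j, g_j in zip(f, c_prev, i, c_tilde)]
    h_t = [o_j * math.tanh(c_j) for o_j, c_j in zip(o, c_t)]
    return h_t, c_t


# Hypothetical toy parameters: hidden size 2, input size 1, zero weights.
# A large forget bias and a very negative input bias make the cell
# copy its previous state almost unchanged.
hidden, n_in = 2, 1
zeros = [[0.0] * (hidden + n_in) for _ in range(hidden)]
params = {
    "f": (zeros, [20.0, 20.0]),
    "i": (zeros, [-20.0, -20.0]),
    "C": (zeros, [0.0, 0.0]),
    "o": (zeros, [0.0, 0.0]),
}
h_t, c_t = lstm_step([35.0], [0.0, 0.0], [0.5, -0.3], params)
```
\end{verbatim}

With these toy values the forget gate is close to 1 and the input gate close to 0, so \texttt{c\_t} stays approximately equal to the previous cell state. This is the mechanism that lets information persist over many time steps.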

\subsection{Bidirectional LSTM}

A bidirectional LSTM (BiLSTM) consists of a forward LSTM and a backward LSTM, which read the input sequence in the forward and reverse directions respectively:

\begin{equation}
\overrightarrow{\mathbf{h}}_t = \text{LSTM}_{\text{fwd}}(\mathbf{x}_t, \overrightarrow{\mathbf{h}}_{t-1})
\end{equation}
\begin{equation}
\overleftarrow{\mathbf{h}}_t = \text{LSTM}_{\text{bwd}}(\mathbf{x}_t, \overleftarrow{\mathbf{h}}_{t+1})
\end{equation}
\begin{equation}
\mathbf{h}_t^{\text{bi}} = [\overrightarrow{\mathbf{h}}_t; \overleftarrow{\mathbf{h}}_t]
\end{equation}

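At every time step, BiLSTM thus combines context from both earlier and later positions of the input sequence. This is particularly useful for meteorological time series: the temperature on a given day reflects the accumulated effect of the preceding weather and is also related to how the weather system subsequently evolves. In a forecasting setting the backward pass reads only the observed input window, so no information from beyond the forecast origin is used.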
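
The indexing of the two passes and the per-step concatenation can be sketched as follows. To keep the example short, a toy exponential-moving-average update (\texttt{ema\_step}) stands in for a real LSTM cell; it and the other function names are hypothetical and only illustrate the data flow.

\begin{verbatim}
```python
def run_direction(xs, step, h0, reverse=False):
    """Run a recurrent step over xs; states are aligned with input order."""
    order = range(len(xs) - 1, -1, -1) if reverse else range(len(xs))
    states = [None] * len(xs)
    h = h0
    for t in order:
        h = step(xs[t], h)
        states[t] = h
    return states


def bidirectional(xs, step_fwd, step_bwd, h0):
    fwd = run_direction(xs, step_fwd, h0)                 # uses h_{t-1}
    bwd = run_direction(xs, step_bwd, h0, reverse=True)   # uses h_{t+1}
    # Concatenate per time step: h_t^bi = [h_fwd_t ; h_bwd_t]
    return [f + b for f, b in zip(fwd, bwd)]


def ema_step(x_t, h_prev):
    # Toy stand-in for an LSTM cell (NOT an LSTM): moving average
    return [0.5 * h + 0.5 * x for h, x in zip(h_prev, x_t)]


xs = [[30.0], [32.0], [36.0]]        # hypothetical daily temperatures
h_bi = bidirectional(xs, ema_step, ema_step, [0.0])
```
\end{verbatim}

Each entry of \texttt{h\_bi} has twice the hidden size: the first half summarizes the sequence up to step $t$, the second half summarizes it from the end back to step $t$.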

\section{Attention Mechanisms}

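The attention mechanism was first introduced into sequence-to-sequence learning by Bahdanau et al. (2014). Its core idea is to dynamically assign different importance weights to different positions of the input sequence. The Transformer architecture proposed by Vaswani et al. (2017) builds its entire model on attention.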

\subsection{Scaled Dot-Product Attention}

Scaled dot-product attention is the basic computational unit of multi-head attention:

\begin{equation}
\text{Attention}(\mathbf{Q}, \mathbf{K}, \mathbf{V}) = \text{softmax}\left(\frac{\mathbf{Q}\mathbf{K}^T}{\sqrt{d_k}}\right)\mathbf{V}
\end{equation}

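Here $\mathbf{Q}$ (query), $\mathbf{K}$ (key), and $\mathbf{V}$ (value) are the query, key, and value matrices, and $d_k$ is the dimension of the key vectors. Dividing by $\sqrt{d_k}$ keeps the dot products from growing with $d_k$; otherwise large scores would push the softmax into its saturated region, where gradients become extremely small.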
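
A minimal pure-Python sketch of the formula above is given below. Matrices are lists of rows, and the function names and toy inputs are assumptions for illustration; a practical implementation would use a tensor library.

\begin{verbatim}
```python
import math


def softmax(row):
    m = max(row)                       # subtract the max for stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Return (output, weights) for softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    scale = math.sqrt(d_k)
    weights = [
        softmax([sum(q * k for q, k in zip(q_row, k_row)) / scale
                 for k_row in K])
        for q_row in Q
    ]
    output = [
        [sum(w * v_row[j] for w, v_row in zip(w_row, V))
         for j in range(len(V[0]))]
        for w_row in weights
    ]
    return output, weights


# Hypothetical toy input: one query that matches the first key
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0], [20.0]]
out, w = attention(Q, K, V)
```
\end{verbatim}

Each row of \texttt{w} sums to one, and the output is the corresponding weighted average of the rows of \texttt{V}; here the query places more weight on the first key, so the output lies closer to 10 than to 20.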

\subsection{Multi-Head Self-Attention}

Multi-head self-attention maps $\mathbf{Q}$, $\mathbf{K}$, and $\mathbf{V}$ through $h$ different sets of learned linear projections into $h$ subspaces and computes attention independently in each subspace:

\begin{equation}
\text{head}_i = \text{Attention}(\mathbf{Q}\mathbf{W}_i^Q, \mathbf{K}\mathbf{W}_i^K, \mathbf{V}\mathbf{W}_i^V)
\end{equation}
\begin{equation}
\text{MultiHead}(\mathbf{Q}, \mathbf{K}, \mathbf{V}) = \text{Concat}(\text{head}_1, \dots, \text{head}_h)\mathbf{W}^O
\end{equation}

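In self-attention, $\mathbf{Q} = \mathbf{K} = \mathbf{V} = \mathbf{X}$ (the input sequence). Each attention head can attend to a different aspect of the sequence in its own representation subspace; for example, some heads may focus on abrupt temperature changes while others capture long-term trends.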
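
The dimension bookkeeping behind the two equations above can be written out explicitly. The symbols $n$ (sequence length), $d_{\text{model}}$, $d_k$, and $d_v$ are introduced here for illustration and follow the convention of Vaswani et al.; the concrete sizes in the example are hypothetical and are not the configuration of this study. For an input $\mathbf{X}$ of size $n \times d_{\text{model}}$:

\begin{equation}
\begin{aligned}
\mathbf{W}_i^Q,\ \mathbf{W}_i^K &: d_{\text{model}} \times d_k, &\qquad \mathbf{W}_i^V &: d_{\text{model}} \times d_v, \\
\text{head}_i &: n \times d_v, &\qquad \text{Concat}(\text{head}_1, \dots, \text{head}_h) &: n \times h d_v, \\
\mathbf{W}^O &: h d_v \times d_{\text{model}}, &\qquad \text{MultiHead}(\mathbf{X}, \mathbf{X}, \mathbf{X}) &: n \times d_{\text{model}}.
\end{aligned}
\end{equation}

With the common choice $d_k = d_v = d_{\text{model}} / h$, for instance $d_{\text{model}} = 64$ and $h = 4$ giving $d_k = d_v = 16$, the concatenated heads again have width $d_{\text{model}}$, so the output has the same shape as the input and the total cost is comparable to a single head of full dimension.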

\section{The XGBoost Algorithm}

XGBoost (eXtreme Gradient Boosting), proposed by Chen and Guestrin in 2016, is an efficient implementation of gradient-boosted decision trees (GBDT). Its main advantages include:

\textbf{Regularized objective:} XGBoost adds a regularization term to the objective function to control model complexity:

\begin{equation}
\mathcal{L}(\phi) = \sum_i l(\hat{y}_i, y_i) + \sum_k \Omega(f_k)
\end{equation}
\begin{equation}
\Omega(f) = \gamma T + \frac{1}{2}\lambda \|\mathbf{w}\|^2
\end{equation}

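Here $l$ is a differentiable loss function, $f_k$ is the $k$-th tree, $T$ is the number of leaves, $\mathbf{w}$ is the vector of leaf weights, and $\gamma$ and $\lambda$ are regularization coefficients.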

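\textbf{Second-order Taylor expansion:} The split gain of a tree is computed from a second-order approximation of the loss function, which is more accurate than the first-order approximation used by conventional GBDT: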

\begin{equation}
\text{Gain} = \frac{1}{2}\left[\frac{(\sum_{i\in I_L} g_i)^2}{\sum_{i\in I_L} h_i + \lambda} + \frac{(\sum_{i\in I_R} g_i)^2}{\sum_{i\in I_R} h_i + \lambda} - \frac{(\sum_{i\in I} g_i)^2}{\sum_{i\in I} h_i + \lambda}\right] - \gamma
\end{equation}

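Here $g_i$ and $h_i$ are the first- and second-order derivatives of the loss with respect to the current prediction for sample $i$, $I_L$ and $I_R$ are the sets of samples assigned to the left and right child nodes by the split, and $I = I_L \cup I_R$.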
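
As an illustration of the gain formula, the sketch below evaluates two candidate splits on four samples. It assumes the squared-error loss $l = \frac{1}{2}(\hat{y} - y)^2$, for which $g_i = \hat{y}_i - y_i$ and $h_i = 1$; the function names and the data are hypothetical.

\begin{verbatim}
```python
def leaf_score(G, H, lam):
    """Structure score G^2 / (H + lambda) of a single leaf."""
    return G * G / (H + lam)


def split_gain(g, h, left_mask, lam=1.0, gamma=0.0):
    """Gain of splitting node I into I_L (mask True) and I_R (mask False)."""
    G, H = sum(g), sum(h)
    G_L = sum(gi for gi, m in zip(g, left_mask) if m)
    H_L = sum(hi for hi, m in zip(h, left_mask) if m)
    G_R, H_R = G - G_L, H - H_L
    return 0.5 * (leaf_score(G_L, H_L, lam)
                  + leaf_score(G_R, H_R, lam)
                  - leaf_score(G, H, lam)) - gamma


# Assumed squared-error loss with initial prediction 0:
# g_i = y_hat_i - y_i, h_i = 1
y = [1.0, 1.0, -1.0, -1.0]
g = [0.0 - yi for yi in y]
h = [1.0] * len(y)

good = split_gain(g, h, [True, True, False, False])   # separates the signs
bad = split_gain(g, h, [True, False, True, False])    # mixes the signs
```
\end{verbatim}

The split that separates positive from negative targets has a positive gain, whereas the split that mixes them has zero gain. A nonzero $\gamma$ would additionally require the gain to exceed that threshold before the split is worthwhile.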

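\textbf{Parallelization and feature subsampling:} XGBoost supports feature-level parallelism (candidate splits are searched over features pre-sorted by value) and column subsampling (as in random forests), which makes it considerably more efficient on large datasets.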

\section{Computing Apparent Temperature}

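Apparent temperature is the core indicator for assessing heat-related health risk. Air temperature alone does not fully reflect how the human body perceives a hot environment; humidity, wind speed, and radiation also affect apparent temperature. This study uses the following two classical formulas: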

\subsection{Magnus Formula: Relative Humidity}

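Starting from the 2 m temperature ($T$) and 2 m dew point temperature ($T_d$) obtained from ERA5-Land, relative humidity is computed with the Magnus formula: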

\begin{equation}
e_s(T) = e_0 \exp\left(\frac{17.27 \cdot T}{237.7 + T}\right)
\end{equation}
\begin{equation}
e_a(T_d) = e_0 \exp\left(\frac{17.27 \cdot T_d}{237.7 + T_d}\right)
\end{equation}
\begin{equation}
RH = 100 \times \frac{e_a(T_d)}{e_s(T)} = 100 \times \exp\left(\frac{17.27 \cdot T_d}{237.7 + T_d} - \frac{17.27 \cdot T}{237.7 + T}\right)
\end{equation}

Here $e_s$ is the saturation vapor pressure (hPa), $e_a$ is the actual vapor pressure (hPa), $e_0 \approx 6.11$ hPa is the saturation vapor pressure at 0°C, which cancels in the ratio, $RH$ is the relative humidity (\%), and temperatures are in °C.
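
The relative humidity formula can be sketched directly in Python. The function names are hypothetical, and clipping the result at 100\% is an added assumption to guard against dew points that numerically exceed the air temperature. ERA5-Land reports temperatures in kelvin, so 273.15 must be subtracted before calling the function.

\begin{verbatim}
```python
import math


def magnus_exponent(temp_c):
    """Exponent of the Magnus formula; temp_c in degrees Celsius."""
    return 17.27 * temp_c / (237.7 + temp_c)


def relative_humidity(temp_c, dewpoint_c):
    """Relative humidity (%) from air temperature and dew point in deg C."""
    rh = 100.0 * math.exp(magnus_exponent(dewpoint_c)
                          - magnus_exponent(temp_c))
    return min(rh, 100.0)   # assumption: clip numerical overshoot at 100 %


# Hypothetical sample in kelvin, converted to deg C first
t_c = 303.15 - 273.15    # 30 deg C
td_c = 293.15 - 273.15   # 20 deg C
rh = relative_humidity(t_c, td_c)   # roughly 55 %
```
\end{verbatim}

Because only the difference of the two exponents enters the result, the prefactor of the vapor pressure formula does not need to appear in the code.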

\subsection{NOAA Rothfusz Formula: Heat Index}

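The Rothfusz regression published by the U.S. National Oceanic and Atmospheric Administration (NOAA) is a standard method for computing the heat index (HI). The calculation is carried out in degrees Fahrenheit, and the result is converted back to degrees Celsius at the end: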

\begin{equation}
T_F = T_C \times 1.8 + 32
\end{equation}

When $T_F < 80$°F (about 26.7°C), the simplified formula is used:
\begin{equation}
HI_F = 0.5 \times [T_F + 61.0 + (T_F - 68.0) \times 1.2 + RH \times 0.094]
\end{equation}

When $T_F \geq 80$°F, the full Rothfusz regression is used:
\begin{equation}
\begin{aligned}
HI_F &= -42.379 + 2.04901523 \times T_F + 10.14333127 \times RH \\
     &- 0.22475541 \times T_F \times RH - 6.83783 \times 10^{-3} \times T_F^2 \\
     &- 5.481717 \times 10^{-2} \times RH^2 + 1.22874 \times 10^{-3} \times T_F^2 \times RH \\
     &+ 8.5282 \times 10^{-4} \times T_F \times RH^2 - 1.99 \times 10^{-6} \times T_F^2 \times RH^2
\end{aligned}
\end{equation}

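The standard NOAA adjustment is then applied (a correction when $RH < 13\%$ and $T_F$ lies between 80°F and 112°F), and the result is finally converted back to degrees Celsius: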
\begin{equation}
HI_C = (HI_F - 32) / 1.8
\end{equation}
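
The procedure above can be sketched as one Python function. The function name is hypothetical. The low-humidity adjustment term is not spelled out in the text; the form used here is the one given in NOAA's published description of the heat index equation, and NOAA's separate high-humidity adjustment is omitted because the text mentions only the low-humidity case.

\begin{verbatim}
```python
import math


def heat_index_c(temp_c, rh):
    """Heat index in deg C from air temperature (deg C) and RH (%)."""
    t = temp_c * 1.8 + 32.0                      # to degrees Fahrenheit
    if t < 80.0:
        # Simplified formula for mild conditions
        hi = 0.5 * (t + 61.0 + (t - 68.0) * 1.2 + rh * 0.094)
    else:
        # Full Rothfusz regression
        hi = (-42.379 + 2.04901523 * t + 10.14333127 * rh
              - 0.22475541 * t * rh - 6.83783e-3 * t * t
              - 5.481717e-2 * rh * rh + 1.22874e-3 * t * t * rh
              + 8.5282e-4 * t * rh * rh - 1.99e-6 * t * t * rh * rh)
        if rh < 13.0 and 80.0 < t < 112.0:
            # Low-humidity adjustment (form taken from NOAA's description)
            hi -= ((13.0 - rh) / 4.0) * math.sqrt(
                (17.0 - abs(t - 95.0)) / 17.0)
    return (hi - 32.0) / 1.8                     # back to degrees Celsius


hot_humid = heat_index_c(35.0, 50.0)   # about 40.7 deg C (about 105 deg F)
mild = heat_index_c(20.0, 50.0)        # about 19.4 deg C
```
\end{verbatim}

At 35°C and 50\% relative humidity the heat index is more than five degrees above the air temperature, which illustrates why air temperature alone understates heat stress in humid conditions.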

\section{Heat-Health Risk Levels}

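With reference to the high-temperature warning standards of the World Meteorological Organization (WMO) and the China Meteorological Administration, and taking into account the physiological characteristics of the elderly, this study defines four heat-health risk levels: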

\begin{table}[H]
\centering
\caption{Classification criteria for heat-health risk levels}
\begin{tabular}{cccc}
\toprule
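\textbf{Risk level} & \textbf{Label} & \textbf{Heat index threshold} & \textbf{Recommended action} \\
\midrule
Low risk (level 0) & Low & HI $<$ 32°C & Normal outdoor activity; stay hydrated \\
Moderate risk (level 1) & Moderate & 32°C $\leq$ HI $<$ 35°C & Reduce afternoon outdoor activity; keep rooms ventilated \\
High risk (level 2) & High & 35°C $\leq$ HI $<$ 38°C & Avoid outdoor activity; turn on cooling equipment \\
Severe risk (level 3) & Severe & HI $\geq$ 38°C & Stop all outdoor activity; community door-to-door checks \\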
\bottomrule
\end{tabular}
\end{table}
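
The thresholds in the table translate into a simple lookup. The sketch below is one possible encoding; the names \texttt{THRESHOLDS\_C}, \texttt{LABELS}, and \texttt{risk\_level} are hypothetical.

\begin{verbatim}
```python
from bisect import bisect_right

THRESHOLDS_C = [32.0, 35.0, 38.0]            # lower bounds of levels 1..3
LABELS = ["low", "moderate", "high", "severe"]


def risk_level(hi_c):
    """Map a heat index in deg C to (level, label) per the table."""
    # bisect_right counts the thresholds that are <= hi_c, so a value
    # exactly on a threshold falls into the higher level.
    level = bisect_right(THRESHOLDS_C, hi_c)
    return level, LABELS[level]
```
\end{verbatim}

For example, \texttt{risk\_level(36.5)} returns level 2 (high risk), and a heat index of exactly 38°C is assigned to level 3, matching the closed lower bounds in the table.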

\section{Focal Loss}

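In heat-health risk prediction, extreme events (high and severe risk) occur far less often than normal conditions (low risk), which leads to severe class imbalance. Under these conditions the standard cross-entropy loss biases the model toward predicting the majority class.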

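Focal Loss was proposed by Lin et al. (2017) for object detection. It uses a modulating factor to reduce the loss contribution of easily classified samples, forcing the model to focus on hard samples: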

\begin{equation}
\text{FL}(p_t) = -\alpha (1 - p_t)^\gamma \log(p_t)
\end{equation}

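Here $p_t$ is the probability the model assigns to the correct class, $\alpha$ is the class-balancing factor, and $\gamma$ is the focusing parameter. When $\gamma = 0$ the loss reduces to the weighted cross-entropy loss; when $\gamma > 0$, correctly classified high-confidence samples ($p_t$ close to 1) are attenuated more strongly, so the gradient signal concentrates on samples that are hard to classify. This study uses $\alpha = 0.5$ and $\gamma = 2.0$ as default parameters.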
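
The effect of the modulating factor can be checked numerically. The sketch below evaluates the formula for a single sample with the default parameters stated above; the function name and the two example probabilities are hypothetical.

\begin{verbatim}
```python
import math


def focal_loss(p_t, alpha=0.5, gamma=2.0):
    """Focal loss for one sample; p_t is the correct-class probability."""
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9)                  # confident, correct prediction
hard = focal_loss(0.1)                  # badly misclassified sample
easy_ce = focal_loss(0.9, gamma=0.0)    # weighted cross-entropy baseline
hard_ce = focal_loss(0.1, gamma=0.0)

ratio_focal = hard / easy
ratio_ce = hard_ce / easy_ce
```
\end{verbatim}

With $\gamma = 2$ the easy sample is down-weighted by the factor $(1 - 0.9)^2 = 0.01$ relative to weighted cross-entropy, so \texttt{ratio\_focal} is far larger than \texttt{ratio\_ce}: the hard sample dominates the loss much more strongly.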

\section{Flask Framework and ECharts Visualization}

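Flask is one of the most widely used lightweight web frameworks in the Python ecosystem. It follows the WSGI standard, and its core features are route decorators and the Jinja2 template engine. Its ``microframework'' design philosophy lets developers freely combine extension components. This study uses Flask to provide four RESTful API endpoints (prediction, history, statistics, and the home page); the front end and back end exchange data in JSON format.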

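ECharts is an open-source JavaScript visualization library of the Apache Software Foundation (originally developed by Baidu). It supports dozens of chart types, including line charts, bar charts, pie charts, heat maps, and gauges. Its declarative configuration syntax and rich interactive features (data zoom, tooltips, legend toggling) have made it a mainstream choice for building data dashboards. This study builds six visualization panels with ECharts 5.5.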