TY - JOUR
T1 - Improving Medium Range Severe Weather Prediction through Transformer Post-processing of AI Weather Forecasts
AU - Hua, Zhanxiang
AU - Sobash, Ryan
AU - Gagne, David John, II
AU - Sha, Yingkai
AU - Anderson-Frey, Alexandra
PY - 2025/5/1
Y1 - 2025/5/1
N2 - Improving the skill of medium-range (3-8 day) severe weather prediction
is crucial for mitigating societal impacts. This study introduces a
novel approach leveraging decoder-only transformer networks to
post-process AI-based weather forecasts, specifically from the
Pangu-Weather model, for improved severe weather guidance. Unlike
traditional post-processing methods that use a dense neural network to
predict the probability of severe weather using discrete forecast
samples, our method treats forecast lead times as sequential "tokens",
enabling the transformer to learn complex temporal relationships within
the evolving atmospheric state. We compare this approach against
post-processing of the Global Forecast System (GFS) using both a
traditional dense neural network and our transformer, as well as
configurations that exclude convective parameters to fairly evaluate the
impact of using the Pangu-Weather AI model. Results demonstrate that the
transformer-based post-processing significantly enhances forecast skill
compared to dense neural networks. Furthermore, AI-driven forecasts,
particularly Pangu-Weather initialized from high-resolution analysis,
exhibit superior performance to GFS in the medium range, even without
explicit convective parameters. Our approach offers improved accuracy
and reliability and provides interpretability through feature
attribution analysis, advancing medium-range severe weather prediction
capabilities.
AB - Improving the skill of medium-range (3-8 day) severe weather prediction
is crucial for mitigating societal impacts. This study introduces a
novel approach leveraging decoder-only transformer networks to
post-process AI-based weather forecasts, specifically from the
Pangu-Weather model, for improved severe weather guidance. Unlike
traditional post-processing methods that use a dense neural network to
predict the probability of severe weather using discrete forecast
samples, our method treats forecast lead times as sequential "tokens",
enabling the transformer to learn complex temporal relationships within
the evolving atmospheric state. We compare this approach against
post-processing of the Global Forecast System (GFS) using both a
traditional dense neural network and our transformer, as well as
configurations that exclude convective parameters to fairly evaluate the
impact of using the Pangu-Weather AI model. Results demonstrate that the
transformer-based post-processing significantly enhances forecast skill
compared to dense neural networks. Furthermore, AI-driven forecasts,
particularly Pangu-Weather initialized from high-resolution analysis,
exhibit superior performance to GFS in the medium range, even without
explicit convective parameters. Our approach offers improved accuracy
and reliability and provides interpretability through feature
attribution analysis, advancing medium-range severe weather prediction
capabilities.
KW - Atmospheric and Oceanic Physics
KW - Artificial Intelligence
KW - Machine Learning
U2 - 10.48550/arXiv.2505.11750
DO - 10.48550/arXiv.2505.11750
M3 - Article
JO - arXiv
JF - arXiv
ER -