TY - JOUR
T1 - Measuring Sharpness of AI-Generated Meteorological Imagery
AU - Ebert-Uphoff, Imme
AU - Ver Hoef, Lander
AU - Schreck, John S.
AU - Stock, Jason
AU - Molina, Maria J.
AU - McGovern, Amy
AU - Yu, Michael
AU - Petzke, Bill
AU - Hilburn, Kyle
AU - Hall, David M.
AU - Gagne, David John
AU - Campbell, William F.
AU - Radford, Jacob T.
AU - Stewart, Jebb Q.
AU - Scheuerman, Sam
PY - 2025/7
Y1 - 2025/7
N2 - Artificial intelligence (AI)-based algorithms are emerging in many meteorological applications that produce imagery as output, including for global weather forecasting models. However, the imagery produced by AI algorithms, especially by convolutional neural networks (CNNs), is often described as too blurry to look realistic, partly because CNNs tend to represent uncertainty as blurriness. This blurriness can be undesirable since it might obscure important meteorological features. More complex AI models, such as generative AI models, produce images that appear to be sharper. However, improved sharpness may come at the expense of a decline in other performance criteria, such as standard forecast verification metrics. To navigate any trade-off between sharpness and other performance metrics, it is important to quantitatively assess those other metrics along with sharpness. While there is a rich set of forecast verification metrics available for meteorological images, none of them focus on sharpness. This paper seeks to fill this gap by 1) exploring a variety of sharpness metrics from other fields, 2) evaluating properties of these metrics, 3) proposing the new concept of Gaussian blur equivalence as a tool for their uniform interpretation, and 4) demonstrating their use for sample meteorological applications, including a CNN that emulates radar imagery from satellite imagery [GOES Radar Estimation via Machine Learning to Inform NWP (GREMLIN)] and an AI-based global weather forecasting model (GraphCast). SIGNIFICANCE STATEMENT: Artificial intelligence (AI)-based estimates of meteorological images, e.g., for forecasting applications, often lack sharpness, but there are no well-established metrics to measure the sharpness of meteorological imagery. This manuscript seeks to close this gap by exploring sharpness metrics for meteorological imagery, analyzing their properties, and providing guidelines for their interpretation. We hope that the tools provided here will aid the development of AI algorithms that provide more realistic meteorological imagery.
AB - Artificial intelligence (AI)-based algorithms are emerging in many meteorological applications that produce imagery as output, including for global weather forecasting models. However, the imagery produced by AI algorithms, especially by convolutional neural networks (CNNs), is often described as too blurry to look realistic, partly because CNNs tend to represent uncertainty as blurriness. This blurriness can be undesirable since it might obscure important meteorological features. More complex AI models, such as generative AI models, produce images that appear to be sharper. However, improved sharpness may come at the expense of a decline in other performance criteria, such as standard forecast verification metrics. To navigate any trade-off between sharpness and other performance metrics, it is important to quantitatively assess those other metrics along with sharpness. While there is a rich set of forecast verification metrics available for meteorological images, none of them focus on sharpness. This paper seeks to fill this gap by 1) exploring a variety of sharpness metrics from other fields, 2) evaluating properties of these metrics, 3) proposing the new concept of Gaussian blur equivalence as a tool for their uniform interpretation, and 4) demonstrating their use for sample meteorological applications, including a CNN that emulates radar imagery from satellite imagery [GOES Radar Estimation via Machine Learning to Inform NWP (GREMLIN)] and an AI-based global weather forecasting model (GraphCast). SIGNIFICANCE STATEMENT: Artificial intelligence (AI)-based estimates of meteorological images, e.g., for forecasting applications, often lack sharpness, but there are no well-established metrics to measure the sharpness of meteorological imagery. This manuscript seeks to close this gap by exploring sharpness metrics for meteorological imagery, analyzing their properties, and providing guidelines for their interpretation. We hope that the tools provided here will aid the development of AI algorithms that provide more realistic meteorological imagery.
KW - Artificial intelligence
KW - Fourier analysis
KW - Model evaluation/performance
KW - Neural networks
U2 - 10.1175/AIES-D-24-0083.1
DO - 10.1175/AIES-D-24-0083.1
M3 - Article
VL - 4
SP - e240083
JO - Artificial Intelligence for the Earth Systems
JF - Artificial Intelligence for the Earth Systems
IS - 3
M1 - e240083
ER -