TY - JOUR
T1 - Best practices in software development for robust and reproducible geoscientific models based on insights from the Global Carbon Budget's dynamic vegetation models
AU - Gregor, Konstantin
AU - Meyer, Benjamin F.
AU - Gaida, Tillmann
AU - Justo Vasquez, Victor
AU - Bett-Williams, Karina
AU - Forrest, Matthew
AU - Darela-Filho, João P.
AU - Rabin, Sam
AU - Longo, Marcos
AU - Melton, Joe R.
AU - Nord, Johan
AU - Anthoni, Peter
AU - Bastrikov, Vladislav
AU - Colligan, Thomas
AU - Delire, Christine
AU - Dietze, Michael C.
AU - Hurtt, George
AU - Ito, Akihiko
AU - Keetz, Lasse T.
AU - Knauer, Jürgen
AU - Köster, Johannes
AU - Lin, Tzu Shun
AU - Ma, Lei
AU - Minvielle, Marie
AU - Olin, Stefan
AU - Ostberg, Sebastian
AU - Shi, Hao
AU - Schnur, Reiner
AU - Sun, Qing
AU - Thornton, Peter E.
AU - Rammig, Anja
N1 - Publisher Copyright: © 2026 Konstantin Gregor et al.
PY - 2026/3/25
Y1 - 2026/3/25
AB - Computational models play an increasingly vital role in scientific research by enabling the numerical simulation of complex processes. Such models are also fundamental in geosciences. For instance, they offer critical insights into the impacts of global change on the Earth system today and in the future. Beyond their value as research tools, models are also software products and should therefore adhere to certain established software engineering standards. However, scientists are rarely trained as software developers, which can lead to potential deficiencies in software quality like unreadable, inefficient, or erroneous code. The complexity of models, coupled with their integration into broader workflows, also often makes it challenging to reproduce results, evaluate processes, and build upon them. In this paper, we review the state and current practices of the development processes of the state-of-the-art land surface models used by the Global Carbon Budget. We combine the experience of modelers from the respective research groups with the expertise of software engineers from tech companies to outline key principles and tools for improving software quality in research. We explore four main areas: (1) model testing and validation, (2) scientific, technical, and user documentation, (3) version control, continuous integration, and code review, and (4) the portability and reproducibility of workflows. Our review reveals that while modeling communities are incorporating many best practices, significant room for improvement remains in areas such as automated testing, automated documentation, and reproducibility. Therefore, we here identify and promote essential software engineering practices, including numerous examples of practices from within the community that can serve as guidelines for other models and could help streamline processes across the entire community. We conclude with an open-source example implementation of these principles, demonstrating portable and reproducible data flows, a continuous integration setup, and web-based visualizations. This example may serve as a practical resource for model developers, users, and all scientists engaged in scientific programming.
UR - https://www.scopus.com/pages/publications/105034177930
U2 - 10.5194/gmd-19-2407-2026
DO - 10.5194/gmd-19-2407-2026
M3 - Review article
AN - SCOPUS:105034177930
SN - 1991-959X
VL - 19
SP - 2407
EP - 2436
JO - Geoscientific Model Development
JF - Geoscientific Model Development
IS - 6
ER -