
Best practices in software development for robust and reproducible geoscientific models based on insights from the Global Carbon Budget's dynamic vegetation models

  • Konstantin Gregor
  • Benjamin F. Meyer
  • Tillmann Gaida
  • Victor Justo Vasquez
  • Karina Bett-Williams
  • Matthew Forrest
  • João P. Darela-Filho
  • Sam Rabin
  • Marcos Longo
  • Joe R. Melton
  • Johan Nord
  • Peter Anthoni
  • Vladislav Bastrikov
  • Thomas Colligan
  • Christine Delire
  • Michael C. Dietze
  • George Hurtt
  • Akihiko Ito
  • Lasse T. Keetz
  • Jürgen Knauer
  • Johannes Köster
  • Tzu Shun Lin
  • Lei Ma
  • Marie Minvielle
  • Stefan Olin
  • Sebastian Ostberg
  • Hao Shi
  • Reiner Schnur
  • Qing Sun
  • Peter E. Thornton
  • Anja Rammig
  • Goto10 GmbH
  • University of Exeter
  • Met Office
  • Lawrence Berkeley National Laboratory
  • Instituto Nacional de Pesquisas Espaciais
  • Lund University
  • Science Partners
  • University of Maryland, College Park
  • NASA Goddard Space Flight Center
  • Paul Sabatier University
  • Boston University
  • The University of Tokyo
  • University of Oslo
  • University of Technology Sydney
  • University of Duisburg-Essen
  • Leibniz Association
  • CAS - Research Center for Eco-Environmental Sciences
  • University of Bern
  • Oak Ridge National Laboratory

Research output: Contribution to journal › Review article › peer-review

Abstract

Computational models play an increasingly vital role in scientific research by enabling the numerical simulation of complex processes. Such models are also fundamental in geosciences. For instance, they offer critical insights into the impacts of global change on the Earth system today and in the future. Beyond their value as research tools, models are also software products and should therefore adhere to established software engineering standards. However, scientists are rarely trained as software developers, which can lead to deficiencies in software quality such as unreadable, inefficient, or erroneous code. The complexity of models, coupled with their integration into broader workflows, also often makes it challenging to reproduce results, evaluate processes, and build upon them. In this paper, we review the state and current practices of the development processes of the state-of-the-art land surface models used by the Global Carbon Budget. We combine the experience of modelers from the respective research groups with the expertise of software engineers from tech companies to outline key principles and tools for improving software quality in research. We explore four main areas: (1) model testing and validation, (2) scientific, technical, and user documentation, (3) version control, continuous integration, and code review, and (4) the portability and reproducibility of workflows. Our review reveals that while modeling communities are incorporating many best practices, significant room for improvement remains in areas such as automated testing, automated documentation, and reproducibility. We therefore identify and promote essential software engineering practices, including numerous examples of practices from within the community that can serve as guidelines for other models and could help streamline processes across the entire community.
We conclude with an open-source example implementation of these principles, demonstrating portable and reproducible data flows, a continuous integration setup, and web-based visualizations. This example may serve as a practical resource for model developers, users, and all scientists engaged in scientific programming.

Original language: English
Pages (from-to): 2407-2436
Number of pages: 30
Journal: Geoscientific Model Development
Volume: 19
Issue number: 6
State: Published - Mar 25 2026
Externally published: Yes
