Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • An ML-based weather simulator called “GraphCast” outperforms the most accurate deterministic operational medium-range weather forecasting system in the world
  • GraphCast is an autoregressive model based on graph neural networks and a novel high-resolution multi-scale mesh representation
  • GraphCast can make 10-day forecasts, at 6-hour time intervals, of five surface variables and six atmospheric variables, each at 37 vertical pressure levels, on a 0.25-degree latitude-longitude grid
  • GraphCast is more accurate than ECMWF’s deterministic operational forecasting system, HRES
  • GraphCast can generate a 10-day forecast in under 60 seconds on Cloud TPU v4 hardware
  • ML-based forecasting scales well with data

Paper Content

Introduction

  • People factor in upcoming weather when planning activities
  • Weather bureaus provide medium-range forecasts up to 10 days
  • Machine learning can rival traditional approaches used by bureaus
  • Weather forecasting involves two components: data assimilation and forecast model
  • Data assimilation uses NWP model to infer and track weather
  • Forecast model approximates governing equations of Earth’s weather numerically
  • ML-based methods can increase accuracy with more data
  • ML-based methods are beginning to improve on NWP-based forecasting
  • ML-based methods have recently begun to be competitive with traditional NWP
  • ECMWF’s Integrated Forecasting System is the most accurate medium-range operational forecasting system
  • ML-based weather models have not been comprehensively compared to operational forecasting systems
  • GraphCast uses GNNs to autoregressively generate forecast trajectories
  • GraphCast trained on 39 years of historical weather data
  • GraphCast has greater forecasting skill than HRES on 90.0% of variables and pressure levels
  • GraphCast has greater forecasting skill than Pangu-Weather on 99.2% of targets

ERA5 dataset

  • Datasets were built from a subset of ECMWF’s ERA5 reanalysis archive
  • Reanalysis means estimating the full state of the weather globally over time
  • ERA5 is regarded as the most comprehensive and accurate reanalysis archive
  • Model predicts a total of 227 target variables
  • Variables are uniquely identified by their short name and pressure level
  • Variables include 5 surface variables and 6 atmospheric variables at each of 37 pressure levels
  • Static/external variables include information such as the geometry of the grid/mesh, orography and radiation at the top of the atmosphere
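
The count of 227 target variables follows from the other figures in this list. The sketch below only reproduces that arithmetic; the constant and function names are illustrative and do not come from the paper.

```python
# Counts quoted in the list above.
NUM_SURFACE_VARS = 5
NUM_ATMOSPHERIC_VARS = 6
NUM_PRESSURE_LEVELS = 37


def count_target_variables(surface: int, atmospheric: int, levels: int) -> int:
    """Each atmospheric variable is counted once per pressure level."""
    return surface + atmospheric * levels


total_targets = count_target_variables(
    NUM_SURFACE_VARS, NUM_ATMOSPHERIC_VARS, NUM_PRESSURE_LEVELS
)
print(total_targets)  # 227
```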

Generating a forecast

  • GraphCast takes two weather states as input and predicts the weather state at the next time step.
  • GraphCast generates a multi-step forecast autoregressively, feeding its own predictions back in as input.
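
The two points above can be sketched as a simple rollout loop. This is a minimal illustration, not the paper's implementation: `toy_step` is a hypothetical stand-in for the learned model (here just a linear extrapolation), and a weather state is reduced to a short list of numbers. The step count assumes the 6-hour interval and 10-day horizon quoted in the abstract.

```python
from typing import Callable, List

State = List[float]  # toy stand-in for a full global weather state

STEP_HOURS = 6
FORECAST_DAYS = 10
NUM_STEPS = FORECAST_DAYS * 24 // STEP_HOURS  # 40 steps of 6 hours


def rollout(
    step_fn: Callable[[State, State], State],
    prev_state: State,
    curr_state: State,
    num_steps: int,
) -> List[State]:
    """Predict num_steps states, feeding each prediction back in as input."""
    trajectory = []
    for _ in range(num_steps):
        next_state = step_fn(prev_state, curr_state)
        trajectory.append(next_state)
        # The newest prediction becomes the current state for the next step.
        prev_state, curr_state = curr_state, next_state
    return trajectory


def toy_step(prev_state: State, curr_state: State) -> State:
    """Hypothetical placeholder for the learned model: linear extrapolation."""
    return [2 * c - p for p, c in zip(prev_state, curr_state)]


forecast = rollout(toy_step, [0.0, 1.0], [1.0, 1.5], NUM_STEPS)
print(len(forecast))  # 40
```

Because each step consumes the previous output, any error made early in the trajectory is carried into later steps, which is why the number of autoregressive steps used in training matters.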

Architecture

  • GraphCast uses GNNs in an “encode-process-decode” configuration
  • GNNs are effective at learning complex physical dynamics
  • GNNs allow arbitrary patterns of spatial interactions
  • CNNs are restricted to local patches
  • Transformers can model long-range interactions but don’t scale well with large inputs
  • GraphCast uses a multi-mesh representation with homogeneous spatial resolution
  • GraphCast encoder and decoder can be applied to arbitrary mesh-like state discretizations
  • GNN-based learned simulators have been successful in many complex fluid systems and other physical domains
  • GraphCast can generate a 0.25° resolution, 10-day forecast in under 60 seconds
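
To make the “encode-process-decode” idea concrete, here is a heavily simplified sketch. It assumes scalar node features and replaces every learned component with a fixed averaging rule, so it shows only the data flow (grid to mesh, message passing on the mesh, mesh back to grid). All function names, the edge lists, and the 0.5 mixing weight are illustrative assumptions, not the paper's architecture.

```python
from typing import List, Tuple

Edge = Tuple[int, int]  # (sender index, receiver index)


def aggregate_mean(
    edges: List[Edge], sender_values: List[float], num_receivers: int
) -> List[float]:
    """Average the sender values arriving at each receiver node."""
    sums = [0.0] * num_receivers
    counts = [0] * num_receivers
    for sender, receiver in edges:
        sums[receiver] += sender_values[sender]
        counts[receiver] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def encode(grid_values, grid2mesh_edges, num_mesh_nodes):
    """Map grid-point values onto mesh nodes."""
    return aggregate_mean(grid2mesh_edges, grid_values, num_mesh_nodes)


def process(mesh_values, mesh_edges, num_rounds):
    """Run rounds of message passing along mesh edges."""
    for _ in range(num_rounds):
        messages = aggregate_mean(mesh_edges, mesh_values, len(mesh_values))
        # Fixed 50/50 mix; a learned update function would go here.
        mesh_values = [0.5 * v + 0.5 * m for v, m in zip(mesh_values, messages)]
    return mesh_values


def decode(mesh_values, mesh2grid_edges, num_grid_points):
    """Map mesh-node values back onto grid points."""
    return aggregate_mean(mesh2grid_edges, mesh_values, num_grid_points)


# Toy example: 4 grid points, 2 mesh nodes.
grid_values = [1.0, 3.0, 5.0, 7.0]
grid2mesh = [(0, 0), (1, 0), (2, 1), (3, 1)]
mesh_edges = [(0, 1), (1, 0)]
mesh2grid = [(0, 0), (0, 1), (1, 2), (1, 3)]

mesh = encode(grid_values, grid2mesh, num_mesh_nodes=2)      # [2.0, 6.0]
mesh = process(mesh, mesh_edges, num_rounds=1)               # [4.0, 4.0]
forecast_grid = decode(mesh, mesh2grid, num_grid_points=4)
print(forecast_grid)  # [4.0, 4.0, 4.0, 4.0]
```

Since `mesh_edges` is just a list of index pairs, edges of any length can be included, which is the sense in which a graph allows arbitrary patterns of spatial interaction.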

Training procedure

  • GraphCast was trained to minimize an objective function over 12-step forecasts
  • The objective function was an average of squared errors over forecast date-times, lead times, spatial locations, variables and levels
  • Data from 2018 was never observed by research team or training procedures until after model was frozen
  • Preliminary experiments showed improved performance when training data included years immediately preceding test period
  • Training GraphCast took 3 weeks on 32 Cloud TPU v4 devices using batch parallelism
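
The multi-step objective described above can be sketched as a mean squared error accumulated over an autoregressive rollout. This is an unweighted toy version for a single trajectory, assuming list-valued states; any weighting or batching used in the paper is omitted, and `persistence_step` is a hypothetical placeholder model.

```python
from typing import Callable, List

State = List[float]

TRAIN_STEPS = 12  # number of forecast steps in the objective, as quoted above


def multi_step_mse(
    step_fn: Callable[[State, State], State],
    prev_state: State,
    curr_state: State,
    targets: List[State],
) -> float:
    """Average squared error over all rollout steps and all state entries."""
    total, count = 0.0, 0
    for target in targets:
        prediction = step_fn(prev_state, curr_state)
        for p, t in zip(prediction, target):
            total += (p - t) ** 2
            count += 1
        # Later steps see the model's own prediction, not the ground truth.
        prev_state, curr_state = curr_state, prediction
    return total / count


def persistence_step(prev_state: State, curr_state: State) -> State:
    """Hypothetical placeholder model: predict no change."""
    return list(curr_state)


# If the true state never changes, persistence is perfect over all 12 steps.
constant_targets = [[1.0, 2.0] for _ in range(TRAIN_STEPS)]
loss_constant = multi_step_mse(persistence_step, [1.0, 2.0], [1.0, 2.0], constant_targets)

# If the truth drifts, the error is averaged over both steps.
loss_drift = multi_step_mse(persistence_step, [0.0], [1.0], [[1.0], [2.0]])
print(loss_constant, loss_drift)  # 0.0 0.5
```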

Model evaluation

  • Quantified skillfulness of GraphCast, ML models, and HRES using RMSE and ACC
  • RMSE measures magnitude of differences between forecasts and ground truth
  • ACC measures how well model forecasts correlate with ground truth
  • Normalized RMSE difference between model A and baseline B: (RMSE_A − RMSE_B) / RMSE_B
  • Normalized ACC difference between model A and baseline B: (ACC_A − ACC_B) / (1 − ACC_B)
  • Trained GraphCast to predict ERA5 data
  • Built separate dataset for HRES errors
  • 10 headline variables chosen from ECMWF Scorecard
  • 69 variable-level combinations evaluated
  • RMSE between forecasts and ground truths shown in Figure 2 and Figure 3
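
The metrics in this list can be sketched as follows. This is a simplified version over flat lists of values: it omits the latitude-based area weighting used for global fields, and it assumes the normalized differences are taken relative to the baseline as (model − baseline) / baseline for RMSE and (model − baseline) / (1 − baseline) for ACC. Function names are illustrative.

```python
import math
from typing import List


def rmse(forecast: List[float], truth: List[float]) -> float:
    """Root mean squared error between forecast and ground truth."""
    return math.sqrt(sum((f - t) ** 2 for f, t in zip(forecast, truth)) / len(truth))


def acc(forecast: List[float], truth: List[float], climatology: List[float]) -> float:
    """Anomaly correlation: correlation of departures from climatology."""
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    t_anom = [t - c for t, c in zip(truth, climatology)]
    numerator = sum(a * b for a, b in zip(f_anom, t_anom))
    denominator = math.sqrt(sum(a * a for a in f_anom) * sum(b * b for b in t_anom))
    return numerator / denominator


def normalized_rmse_difference(rmse_model: float, rmse_baseline: float) -> float:
    """Negative values mean the model has lower error than the baseline."""
    return (rmse_model - rmse_baseline) / rmse_baseline


def normalized_acc_difference(acc_model: float, acc_baseline: float) -> float:
    """Positive values mean the model correlates better than the baseline."""
    return (acc_model - acc_baseline) / (1.0 - acc_baseline)


truth = [1.0, 2.0, 3.0, 4.0]
climatology = [2.0, 2.0, 2.0, 2.0]
forecast = [2.0, 2.0, 4.0, 4.0]

error = rmse(forecast, truth)               # sqrt(0.5)
perfect_acc = acc(truth, truth, climatology)  # 1.0 for a perfect forecast
```

A normalized RMSE difference of -0.1, for example, would mean the model's RMSE is 10% lower than the baseline's.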

Results

  • GraphCast outperforms HRES in weather forecasting skill across 10-day forecasts
  • GraphCast has higher skill than HRES for 10 headline surface and atmospheric variables
  • GraphCast has lower error than HRES at early lead times, typically plateauing to around 10-15% after 10 days
  • GraphCast outperformed HRES on 90.0% of the 2760 variables, levels, and lead times in the evaluation set
  • GraphCast has substantially greater skill than HRES across the variables, levels, and lead times tested
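
The figure of 2760 targets is consistent with the other numbers in this summary if each of the 69 variable-level combinations is scored at every 6-hour lead time out to 10 days. The sketch below reproduces that arithmetic under this assumption; the names are illustrative.

```python
STEP_HOURS = 6
HORIZON_DAYS = 10
NUM_VARIABLE_LEVEL_COMBINATIONS = 69

# One lead time per 6-hour step over the 10-day horizon.
num_lead_times = HORIZON_DAYS * 24 // STEP_HOURS
num_targets = NUM_VARIABLE_LEVEL_COMBINATIONS * num_lead_times
print(num_lead_times, num_targets)  # 40 2760
```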

How autoregressive training affects forecast skill

  • Model performance varies with number of autoregressive steps
  • Fewer autoregressive steps better for short lead times, worse for longer lead times
  • Increasing autoregressive steps worse for short lead times, better for longer lead times
  • Combining multiple models with varying numbers of AR steps could capitalize on advantages
  • Pangu-Weather is the state-of-the-art ML-based weather forecasting model used for comparison
  • GraphCast outperforms Pangu-Weather on 99.2% of targets
  • HRES-against-ERA5 approach can lead to worse skill estimates

Discussion

  • GraphCast outperforms ECMWF’s HRES on 90.0% of 2760 metrics
  • GraphCast outperforms Pangu-Weather on 99.2% of 252 metrics
  • GraphCast captures longer-range spatial interactions than traditional NWP methods
  • GraphCast can generate a 10-day forecast in under 60 seconds
  • GraphCast expresses uncertainty over longer lead times by producing forecasts closer to the mean
  • GraphCast should not be regarded as a replacement for traditional weather forecasting methods
  • GraphCast can apply to wider range of environmental and other geo-spatiotemporal forecasting problems
  • GraphCast trained on complex, real-world data
  • GraphCast evaluated against ensemble systems
  • GraphCast evaluated on 0.25° latitude-longitude resolution
  • GraphCast data split into training and test sets
  • HRES operational forecasts evaluated with HRES-fc0 dataset
  • ERA5 used as ground truth for surface and atmospheric weather state
  • Temporal resolution of data and forecasts is 6 hours with 10-day forecast horizon
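
The point above about forecasts moving closer to the mean at longer lead times is related to the squared-error training objective: when several outcomes are equally plausible, the single prediction with the lowest expected squared error is their mean. The toy example below illustrates this general property with made-up numbers; it is not an analysis of GraphCast itself.

```python
from typing import List


def expected_squared_error(prediction: float, outcomes: List[float]) -> float:
    """Mean squared error of one prediction against equally likely outcomes."""
    return sum((prediction - o) ** 2 for o in outcomes) / len(outcomes)


# Hypothetical, equally likely future values of some quantity.
outcomes = [-4.0, 0.0, 4.0, 8.0]
mean_outcome = sum(outcomes) / len(outcomes)  # 2.0

# Search a grid of candidate predictions from -4.0 to 8.0 in steps of 0.5.
candidates = [x * 0.5 for x in range(-8, 17)]
best = min(candidates, key=lambda c: expected_squared_error(c, outcomes))
print(best, mean_outcome)  # 2.0 2.0
```

Committing to any single plausible outcome scores worse on average than predicting the mean, so a model trained on squared error has an incentive to produce smoother forecasts as uncertainty grows.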