Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- ML-based weather simulator called “GraphCast” outperforms most accurate deterministic operational medium-range weather forecasting system in the world
- GraphCast is an autoregressive model based on graph neural networks and a novel high-resolution multi-scale mesh representation
- GraphCast can make 10-day forecasts, at 6-hour time intervals, of five surface variables and six atmospheric variables, each at 37 vertical pressure levels, on a 0.25-degree latitude-longitude grid
- GraphCast is more accurate than ECMWF’s deterministic operational forecasting system, HRES
- GraphCast can generate a 10-day forecast in under 60 seconds on Cloud TPU v4 hardware
- ML-based forecasting scales well with data
Paper Content
Introduction
- People factor in upcoming weather when planning activities
- Weather bureaus provide medium-range forecasts up to 10 days
- Machine learning can rival traditional approaches used by bureaus
- Weather forecasting involves two components: data assimilation and forecast model
- Data assimilation uses NWP model to infer and track weather
- Forecast model approximates governing equations of Earth’s weather numerically
- ML-based methods can increase accuracy with more data
- ML-based methods are beginning to improve on NWP-based forecasting
- ML-based methods have recently begun to be competitive with traditional NWP
- ECMWF’s Integrated Forecasting System is the most accurate medium-range operational forecasting system
- ML-based weather models have not been comprehensively compared to operational forecasting systems
- GraphCast uses GNNs to autoregressively generate forecast trajectories
- GraphCast trained on 39 years of historical weather data
- GraphCast has greater forecasting skill than HRES on 90.0% of variables and pressure levels
- GraphCast has greater forecasting skill than Pangu-Weather on 99.2% of targets
Era5 dataset
- Datasets were built from a subset of ECMWF’s ERA5 reanalysis archive
- Reanalysis means estimating the full state of the weather globally over time
- ERA5 is regarded as the most comprehensive and accurate reanalysis archive
- Model predicts a total of 227 target variables
- Variables are uniquely identified by their short name and pressure level
- Variables include 5 surface variables and 6 atmospheric variables at each of 37 pressure levels
- Static/external variables include information such as the geometry of the grid/mesh, orography and radiation at the top of the atmosphere
Generating a forecast
- GraphCast takes two weather states as input and predicts the weather state at the next time step.
- GraphCast uses an autoregressive fashion to generate a multi-step forecast, feeding its own predictions back in as input.
Architecture
- GraphCast uses GNNs in an “encode-process-decode” configuration
- GNNs are effective at learning complex physical dynamics
- GNNs allow arbitrary patterns of spatial interactions
- CNNs are restricted to local patches
- Transformers can compute long-range computations but don’t scale well with large inputs
- GraphCast uses a multi-mesh representation with homogeneous spatial resolution
- GraphCast encoder and decoder can be applied to arbitrary mesh-like state discretizations
- GNN-based learned simulators have been successful in many complex fluid systems and other physical domains
- GraphCast can generate a 0.25°resolution, 10-day forecast in under 60 seconds
Training procedure
- GraphCast was trained to minimize an objective function over 12-step forecasts
- The objective function was an average of squared errors over forecast date-times, lead times, spatial locations, variables and levels
- Data from 2018 was never observed by research team or training procedures until after model was frozen
- Preliminary experiments showed improved performance when training data included years immediately preceding test period
- Training GraphCast took 3 weeks on 32 Cloud TPU v4 devices using batch parallelism
Model evaluation
- Quantified skillfulness of GraphCast, ML models, and HRES using RMSE and ACC
- RMSE measures magnitude of differences between forecasts and ground truth
- ACC measures how well model forecasts correlate with ground truth
- Normalized RMSE difference between model and baseline
- Normalized ACC difference
- Trained GraphCast to predict ERA5 data
- Built separate dataset for HRES errors
- 10 headline variables chosen from ECMWF Scorecard
- 69 variable-level combinations evaluated
- RMSE between forecasts and ground truths shown in Figure 2 and Figure 3
Results
- GraphCast outperforms HRES in weather forecasting skill across 10-day forecasts
- GraphCast has higher skill than HRES for 10 headline surface and atmospheric variables
- GraphCast has lower error than HRES at early lead times, typically plateauing to around 10-15% after 10 days
- GraphCast outperformed HRES on 90.0% of the 2760 variables, levels, and lead times in the evaluation set
- GraphCast has substantially greater skill than HRES across the variables, levels, and lead times tested
How autoregressive training affects forecast skill
- Model performance varies with number of autoregressive steps
- Fewer autoregressive steps better for short lead times, worse for longer lead times
- Increasing autoregressive steps worse for short lead times, better for longer lead times
- Combining multiple models with varying numbers of AR steps could capitalize on advantages
- Pangu-Weather is the state-of-the-art ML-based weather forecasting
- GraphCast outperforms Pangu-Weather on 99.2% of targets
- HRES-against-ERA5 approach can lead to worse skill estimates
Discussion
- GraphCast outperforms ECMWF’s HRES on 90.0% of 2760 metrics
- GraphCast outperforms Pangu-Weather on 99.2% of 252 metrics
- GraphCast captures longer-range spatial interactions than traditional NWP methods
- GraphCast can generate 10-day forecast in under 60 seconds
- GraphCast expresses uncertainty over longer lead times by producing forecast closer to the mean
- GraphCast should not be regarded as a replacement for traditional weather forecasting methods
- GraphCast can apply to wider range of environmental and other geo-spatiotemporal forecasting problems
- GraphCast trained on complex, real-world data
- GraphCast evaluated against ensemble systems
- GraphCast evaluated on 0.25°latitude-longitude resolution
- GraphCast data split into training and test sets
- HRES operational forecasts evaluated with HRES-fc0 dataset
- ERA5 used as ground truth for surface and atmospheric weather state
- Temporal resolution of data and forecasts is 6 hours with 10-day forecast horizon