Prediction Rollouts#

Prediction Ingredients#

Before beginning rollouts of a CREDIT model, you will need the following ingredients/files available on your machine.

🌎Initial conditions for upper air and surface variables in Zarr format. If running processed ERA5 on Derecho or Casper, you can access the processed files at /glade/campaign/cisl/aiml/credit/era5_zarr/. The y_TOTAL*.zarr and SixHourly_y_TOTAL*.zarr are at 0.28 degree grid spacing, and SixHourly_y_ONEdeg*.zarr for 1 degree data.
🌞 Dynamic forcing files covering the full period of prediction. In the current CREDIT models, the only dynamic forcing variable is top-of-atmosphere shortwave irradiance. Pre-calculated solar irradiance values integrated over 1 and 6 hour periods are available on Derecho/Casper at /glade/campaign/cisl/aiml/credit/credit_solar_nc_6h_0.25deg and credit_solar_nc_1h_0.25deg. You can calculate top of atmosphere solar irradiance for any grid and integration time period with applications/calc_global_solar.py. If you plan to issue regular predictions, we recommend pre-computing solar irradiance values for a given year of inference rather than calculating on the fly.
⛰️ Static forcing files with and without normalization. These forcing files include elements like terrain height, land-sea mask, and land-use type. Static forcing files for the initial CREDIT models are currently archived at /glade/campaign/cisl/static_scalers/. static_norm_old.nc has normalized terrain height and land sea mask, while unnormalized values are in LSM_static_variables_ERA5_zhght.nc. The unnormalized values are needed for interpolation to pressure and height levels.
Files containing the mean and standard deviation scaling values for each variable. Currently, CREDIT uses values stored in netCDF files. These are currently stored on Derecho in /glade/campaign/cisl/static_scalers/. The appropriate files to use are mean_6h_1979_2018_16lev_0.25deg.nc for the mean and std_residual_6h_1979_2018_16lev_0.25deg.nc for the combined standard deviation of each variable and the standard deviation of the temporal residual.

Realtime Rollouts#

The goal of realtime inference is to launch model forecasts from GFS or ERA5 initial conditions. The predict section of your configuration file should contain the following fields:

predict:
  mode: none
  realtime:
    forecast_start_time: "2025-04-14 12:00:00" # change to your init date
    forecast_end_time: "2025-04-24 12:00:00" # Should be sometime after init date
    forecast_timestep: "6h" # Needs to contain h for hours and should match 1 or 6 hour model.
  initial_condition_path: "/path/to/gfs_init/" # change 
  static_fields: "/Users/dgagne/data/CREDIT_data/LSM_static_variables_ERA5_zhght.nc" # Static forcing file.
  metadata: '/Users/dgagne/miles-credit/credit/metadata/era5.yaml' # Path to metadata for output
  save_forecast: '/Users/dgagne/data/wxformer_6h_test/' # path to save forecast data

If you want to use GFS initial conditions, run python applications/gfs_init.py -c <config file>. It will download fields from a GFS initial condition on model levels, which are archived for the past 10 days on the NOAA NOMADS server. GDAS Analyses and GFS initial timesteps on model levels are also available on Google Cloud back to 2021. The gfs_init.py program regrids the data onto the appropriate CREDIT grid and interpolates in the vertical from the GFS to selected CREDIT ERA5 hybrid sigma-pressure levels.

[!IMPORTANT]
gfs_init.py requires xesmf, which depends on the ESMF suite and cannot be installed from PyPI. The easist way to install xesmf without messing up your CREDIT environment is to run conda install -c conda-forge esmf then pip install xesmf after building your CREDIT environment first.

Realtime rollouts are handled by applications/rollout_realtime.py. Update the paths in the data section of the config file to point to the GFS initial conditions zarr file. rollout_realtime.py only outputs one forecast at a time.

Rollout to netCDF for ERA5 initiated forecasts#

rollout_to_netcdf.py enables you to generate forecasts for many initialization times using processed ERA5 data as initial conditions. It can be run either in serial or parallel mode with both single node and multi node support (MPI-enabled PyTorch required).

To run rollout_to_netcdf.py include the following section in your config file.

predict:
    mode: none
    forecasts:
        type: "custom"       # keep it as "custom"
        start_year: 2020     # year of the first initialization (where rollout will start)
        start_month: 1       # month of the first initialization
        start_day: 1         # day of the first initialization
        start_hours: [0, 12] # hour-of-day for each initialization, 0 for 00Z, 12 for 12Z
        duration: 32         # number of days to initialize, starting from the (year, mon, day) above
                             # duration should be divisible by the number of GPUs
                             # (e.g., duration: 384 for 365-day rollout using 32 GPUs)
        days: 10             # forecast lead time as days (1 means 24-hour forecast)

To submit the rollout script as a PBS job, use python rollout_to_netcdf.py -l 1 -c <config file>.

To issue predictions on multiple GPUs on a single node: torchrun rollout_to_netcdf.py -c <confg file>

For multi-node rollouts with MPI (MPI-enabled PyTorch required):

nodes=( $( cat $PBS_NODEFILE ) )
head_node=${nodes[0]}
head_node_ip=$(ssh $head_node hostname -i | awk '{print $1}')
export NUM_RANKS=32
MASTER_ADDR=$head_node_ip
MASTER_PORT=1234
mpiexec -n $NUM_RANKS -ppn 4 --cpu-bind none python rollout_to_netcdf.py -c <config file>

Interpolation to constant pressure and height above ground levels#

Both rollout_realtime.py and rollout_to_netcdf.py support vertical interpolation to constant pressure and constant height above ground level (AGL) levels from the hybrid sigma-pressure levels used by most models in CREDIT. To enable interpolation, add the following lines to your config file in the predict section

data:
  level_ids: [10, 30, 40, 50, 60, 70, 80, 90, 95, 100, 105, 110, 120, 130, 136, 137]
predict:
  interp_pressure:
    pressure_levels: [300.0, 500.0, 850.0, 925.0] # in hPa
    height_levels: [100.0, 500.0, 1000.0, 2000.0, 3000.0, 4000.0, 5000.0, 6000.0] # in meters

More configuration options are listed in full_state_pressure_interpolation in credit/interp.py and can be set from the config file in the interp_pressure section. The interpolation routine interpolates to pressure levels using approximately the same approach as ECMWF, although results will not be exactly the same due to slight numerical and implementation differences. The routine also calculates pressure and geopotential on all levels. Mean sea level pressure is also calculated in this routine.

Saving compressed and chunked netCDF files#

By default, the rollout scripts will save uncompressed netCDF files. These can grow to be quite large if you are producing a lot of forecasts and are saving all the fields. Space can be saved greatly by turning on netCDF compression and setting chunks that align with your preferred access pattern. Encoding options like the ones below go into the config file.

model:
    # crossformer example
    type: "crossformer"
    frames: 1                         # number of input states (default: 1)
    image_height: &height 640         # number of latitude grids (default: 640)
    image_width: &width 1280          # number of longitude grids (default: 1280)
    levels: &levels 16                # number of upper-air variable levels (default: 15)
    channels: 4                       # upper-air variable channels
predict:
  ua_var_encoding:
    zlib: True # turns on zlib compression.
    complevel: 1 # ranges from 1 to 9. 1 is faster with a lower compression ratio, 9 is slower.
    shuffle: True
    chunksizes: [1, *levels, *height, *width]

  pressure_var_encoding:
    zlib: True
    complevel: 1
    shuffle: True
    chunksizes: [ 1, 4, *height, *width] # second dim should match number of interp pres. levels
    
  height_var_encoding:
    zlib: True
    complevel: 1
    shuffle: True
    chunksizes: [ 1, 8, *height, *width] # second dim should match number of interp height levels

  surface_var_encoding:
    zlib: true
    complevel: 1
    shuffle: True
    chunksizes: [1, *height, *width]

Prediction Rollouts

Contents