credit.parser

Content:
  • credit_main_parser

  • training_data_check

  • predict_data_check

  • remove_string_by_pattern

Functions

remove_string_by_pattern(list_string, pattern)

Given a list of strings, remove those that match a given pattern.

credit_main_parser(conf[, parse_training, ...])

Parses and validates the configuration input for the CREDIT project.

training_data_check(conf[, print_summary])

Note: this function is designed for model training, NOT for rollout.

predict_data_check(conf[, print_summary])

Note: this function is designed for model rollout.

Module Contents

credit.parser.remove_string_by_pattern(list_string, pattern)

Given a list of strings, remove those that match a given pattern. Usage: remove 'time'/'datetime'/'lead_time' coordinates from a list of all coordinate names.
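The usage described above can be illustrated with a minimal sketch. It assumes that "pattern" means plain substring containment, which is consistent with one pattern ('time') removing 'time', 'datetime', and 'lead_time', but the documentation does not confirm the matching rule. The helper name remove_by_substring and the coordinate names are hypothetical stand-ins, not the library's code.

```python
def remove_by_substring(names, pattern):
    """Return the names that do NOT contain `pattern`.

    Hypothetical stand-in for remove_string_by_pattern; substring
    matching is an assumption, not confirmed behaviour.
    """
    return [name for name in names if pattern not in name]


# Illustrative coordinate names for a dataset.
coord_names = ["time", "datetime", "lead_time", "latitude", "longitude", "level"]

# Drop every time-like coordinate before comparing files.
spatial_names = remove_by_substring(coord_names, "time")
print(spatial_names)  # ['latitude', 'longitude', 'level']
```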

credit.parser.credit_main_parser(conf, parse_training=True, parse_predict=True, print_summary=False)

Parses and validates the configuration input for the CREDIT project.

This function examines the provided configuration dictionary (conf), ensures that all required fields are present, and assigns default values where necessary. It is designed to be used in various training and prediction modules within the CREDIT repository. Missing critical fields will trigger assertion errors, while others will receive default values. A standardized version of the input configuration will be returned, ensuring consistency across different applications.

Parameters:
  • conf (dict) – Configuration dictionary containing all settings for data, model, trainer, and prediction phases.

  • parse_training (bool, optional) – If True, the function will check for training-specific fields. Defaults to True.

  • parse_predict (bool, optional) – If True, the function will check for prediction-specific fields. Defaults to True.

  • print_summary (bool, optional) – If True, a summary of the parsed variables will be printed. Defaults to False.

Returns:

The standardized and validated configuration dictionary.

Return type:

dict

Raises:

AssertionError – If any critical fields are missing or invalid in the provided configuration.

Notes

This function is used in the following scripts:
  • applications/train.py

  • applications/train_multistep.py

  • applications/rollout_to_netcdf.py
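The validate-then-default behaviour described for credit_main_parser can be sketched as a toy function. Everything here is hypothetical: the name parse_conf_sketch, the top-level section keys ('data', 'model', 'trainer', 'predict'), and the optional fields are illustrations only, and the real parser checks many more fields. The sketch also copies the input dictionary, which the real function may or may not do.

```python
import copy


def parse_conf_sketch(conf, parse_training=True, parse_predict=True):
    """Toy validator: assert critical fields, fill defaults for optional ones."""
    conf = copy.deepcopy(conf)  # this sketch leaves the caller's dict untouched

    # Critical sections: a missing one raises AssertionError.
    assert "data" in conf, "conf is missing the 'data' section"
    assert "model" in conf, "conf is missing the 'model' section"

    if parse_training:
        assert "trainer" in conf, "conf is missing the 'trainer' section"
        # Hypothetical optional field that falls back to a default.
        conf["trainer"].setdefault("example_optional_flag", False)

    if parse_predict:
        assert "predict" in conf, "conf is missing the 'predict' section"
        # Hypothetical optional field that falls back to a default.
        conf["predict"].setdefault("example_output_format", "netcdf")

    return conf


# Training-only use: prediction fields are not checked.
raw_conf = {"data": {}, "model": {}, "trainer": {}}
parsed = parse_conf_sketch(raw_conf, parse_predict=False)
print(parsed["trainer"])  # {'example_optional_flag': False}
```

Turning off parse_training or parse_predict skips the corresponding checks, so a training script does not need to supply prediction settings and vice versa.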

credit.parser.training_data_check(conf, print_summary=False)

Note: this function is designed for model training, NOT for rollout.

The following items are covered:
  • All yearly files (upper-air, surface, dynamic forcing, diagnostic) cover the years requested in conf['data']['train_years'] and conf['data']['valid_years'].

  • All variables (upper-air, surface, dynamic forcing, diagnostic) exist in their corresponding files. Note: only one file of each group is checked.

  • All files (upper-air, surface, dynamic forcing, diagnostic, forcing, static, mean, std, lat_weights) have the same coordinate names and coordinate values. Note: this check covers the lat, lon, and level coordinates and ignores 'time' coordinates.

Where is it applied?
  • applications/train.py

  • applications/train_multistep.py
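The yearly-file coverage check can be pictured with the sketch below. It assumes that each yearly file carries its four-digit year in the file name and that the required years are given as an explicit list; neither assumption comes from the documentation, which does not say how years are matched to files or how train_years is expanded. The function names and file names are hypothetical.

```python
import re


def years_in_filenames(filenames):
    """Collect every standalone four-digit number found in the file names."""
    years = set()
    for name in filenames:
        # Match exactly four digits that are not part of a longer number.
        years.update(int(y) for y in re.findall(r"(?<!\d)(\d{4})(?!\d)", name))
    return years


def missing_years(filenames, required_years):
    """Return the required years that no file name covers, in sorted order."""
    return sorted(set(required_years) - years_in_filenames(filenames))


# Hypothetical yearly files for one variable group.
upper_air_files = ["upper_air_1979.zarr", "upper_air_1980.zarr", "upper_air_1981.zarr"]

print(missing_years(upper_air_files, [1979, 1980]))  # []
print(missing_years(upper_air_files, [1981, 1982]))  # [1982]
```

An empty result means the group can support the requested years; a non-empty one names the years that would need additional files.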

credit.parser.predict_data_check(conf, print_summary=False)

Note: this function is designed for model rollout.

Diagnostic variables are checked in mean and std files only.

The following items are covered:
  • All variables (upper-air, surface, dynamic forcing) exist in their corresponding files. Note: only one file of each group is checked.

  • All files (upper-air, surface, dynamic forcing, forcing, static, mean, std, lat_weights) have the same coordinate names and coordinate values. Note: this check covers the lat, lon, and level coordinates and ignores 'time' coordinates.

Where is it applied?
  • applications/rollout_to_netcdf_new.py
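The coordinate-consistency check can be sketched with plain dictionaries standing in for opened datasets. This is an illustration under stated assumptions, not the library's implementation: coordinates are modelled as a mapping from name to a list of values, time-like coordinates are dropped by substring match on 'time', and the function names, group labels, and values are hypothetical.

```python
def spatial_coords(coords, ignore_pattern="time"):
    """Drop every coordinate whose name contains `ignore_pattern`."""
    return {
        name: values
        for name, values in coords.items()
        if ignore_pattern not in name
    }


def find_coord_mismatches(reference, others):
    """Return labels of files whose non-time coordinates differ from `reference`."""
    ref = spatial_coords(reference)
    return [
        label
        for label, coords in others.items()
        if spatial_coords(coords) != ref
    ]


# Hypothetical coordinates read from one file per group.
upper_air = {
    "time": [0, 1],
    "latitude": [-90.0, 0.0, 90.0],
    "longitude": [0.0, 180.0],
    "level": [1, 2],
}
others = {
    # Same grid, no time axis: passes.
    "mean": {"latitude": [-90.0, 0.0, 90.0], "longitude": [0.0, 180.0], "level": [1, 2]},
    # Longitude values differ: reported as a mismatch.
    "std": {"latitude": [-90.0, 0.0, 90.0], "longitude": [0.0, 90.0], "level": [1, 2]},
}

mismatched = find_coord_mismatches(upper_air, others)
print(mismatched)  # ['std']
```

Comparing both names and values catches files that share a grid shape but were produced on different grids, which a shape-only check would miss.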