credit.parser

Content:

credit_main_parser
training_data_check
predict_data_check
remove_string_by_pattern

Functions

`validate_args`(function, argdict, context[, ignore])	For calling ‘function(**argdict)’. Checks that all arguments required by function exist in argdict.
`replace_nested_key`(data, key, value)	Recursively searches a nested dictionary and sets each instance of key to value.
`remove_string_by_pattern`(list_string, pattern)	Given a list of strings, remove some of them based on a given pattern.
`credit_main_parser`(conf[, parse_training, ...])	Parses and validates the configuration input for the CREDIT project.
`training_data_check`(conf[, print_summary])	Note: this function is designed for model training, NOT for rollout.
`predict_data_check`(conf[, print_summary])	Note: this function is designed for model rollout.

Module Contents

credit.parser.validate_args(function, argdict, context, ignore=[])

Prepares argdict for a call of the form ‘function(**argdict)’. Raises an error if any argument required by function is missing from argdict. Removes, with a warning, any entry in argdict that does not appear in the signature of function. ‘context’ is a string added to the warning and error messages to make them more informative. ‘ignore’ is a list of parameter names to leave in place even if they do not appear in the signature.
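The behavior described for validate_args can be approximated with the standard library's inspect module. The block below is a minimal sketch under stated assumptions, not the CREDIT implementation: the name validate_args_sketch, the KeyError exception type, the in-place deletion with a returned dict, and the make_model target function are all hypothetical.

```python
import inspect
import warnings


def validate_args_sketch(function, argdict, context, ignore=()):
    """Illustrative stand-in for the behavior described above (not the CREDIT source)."""
    params = inspect.signature(function).parameters

    # A parameter is required if it has no default and can be passed by keyword.
    required = [
        name
        for name, p in params.items()
        if p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)
    ]
    missing = [name for name in required if name not in argdict]
    if missing:
        # Assumption: the real function may raise a different exception type.
        raise KeyError(f"{context}: missing required arguments {missing}")

    # Delete entries the function does not accept, unless listed in `ignore`.
    for name in list(argdict):
        if name not in params and name not in ignore:
            warnings.warn(f"{context}: dropping unexpected argument '{name}'")
            del argdict[name]
    return argdict


def make_model(depth, width=64):  # hypothetical target function
    return depth, width


args = {"depth": 4, "width": 128, "typo_arg": 1}
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the expected warning in this demo
    cleaned = validate_args_sketch(make_model, args, context="model config")
result = make_model(**cleaned)
```

In this sketch the unknown key typo_arg is dropped before the call, so make_model(**cleaned) no longer fails with an unexpected-keyword error.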

credit.parser.replace_nested_key(data, key, value)

Recursively searches a nested dictionary and sets each instance of key to value. Behavior may be unpredictable if the original value is also a dict.
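A recursive replacement of this kind might look like the sketch below. It is an illustration only: the name replace_nested_key_sketch and the example key save_loc are hypothetical, and the sketch makes two assumptions that the real function may not share. It modifies the dictionary in place, and when a matching key holds a dict it overwrites that dict without descending into it.

```python
def replace_nested_key_sketch(data, key, value):
    """Set every occurrence of `key` in a nested dict to `value`, in place."""
    for k in data:
        if k == key:
            # Overwrite the match; a dict stored here is replaced, not searched.
            data[k] = value
        elif isinstance(data[k], dict):
            # Recurse into nested dictionaries only (lists are not traversed).
            replace_nested_key_sketch(data[k], key, value)
    return data


# Hypothetical configuration with the same key at three nesting levels.
conf = {
    "save_loc": "/old/a",
    "data": {"save_loc": "/old/b", "nested": {"save_loc": "/old/c"}},
}
replace_nested_key_sketch(conf, "save_loc", "/new")
```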

credit.parser.remove_string_by_pattern(list_string, pattern)

Given a list of strings, removes those that match a given pattern. Typical use: removing the ‘time’, ‘datetime’, and ‘lead_time’ coordinates from a list of all coordinate names.
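The coordinate-filtering use case can be sketched as follows. The matching rule is an assumption: the documentation does not say whether pattern is a substring, a glob, or a regular expression, so this sketch treats it as a plain substring. The function name and the coordinate list are hypothetical.

```python
def remove_by_pattern_sketch(list_string, pattern):
    # Assumption: a string is removed when it contains `pattern` as a substring.
    return [s for s in list_string if pattern not in s]


# Hypothetical coordinate names from a dataset.
coords = ["time", "datetime", "lead_time", "latitude", "longitude", "level"]
spatial = remove_by_pattern_sketch(coords, "time")
```

Under the substring assumption, a single pattern of "time" removes all three time-like coordinates and leaves the spatial ones.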

credit.parser.credit_main_parser(conf, parse_training=True, parse_predict=True, print_summary=False)

Parses and validates the configuration input for the CREDIT project.

This function examines the provided configuration dictionary (conf), ensures that all required fields are present, and assigns default values where necessary. It is designed to be used in various training and prediction modules within the CREDIT repository. Missing critical fields will trigger assertion errors, while others will receive default values. A standardized version of the input configuration will be returned, ensuring consistency across different applications.

Parameters:

conf (dict) – Configuration dictionary containing all settings for data, model, trainer, and prediction phases.
parse_training (bool, optional) – If True, the function will check for training-specific fields. Defaults to True.
parse_predict (bool, optional) – If True, the function will check for prediction-specific fields. Defaults to True.
print_summary (bool, optional) – If True, a summary of the parsed variables will be printed. Defaults to False.

Returns:

The standardized and validated configuration dictionary.

Return type:

dict

Raises:

AssertionError – If any critical fields are missing or invalid in the provided configuration.

Notes

This function is used in the following scripts:

applications/train_gen1.py
applications/train_multistep.py
applications/rollout_to_netcdf.py
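The general pattern described for credit_main_parser (assert that critical fields exist, fill defaults for the rest, return a standardized dictionary) can be illustrated with a small stand-in. This is a sketch, not the CREDIT parser: apart from conf[‘data’], the section names model, trainer, and predict are assumptions, the option example_flag is invented for illustration, and copying the input before editing it is also an assumption.

```python
import copy


def parse_conf_sketch(conf, parse_training=True, parse_predict=True):
    """Illustrative validate-and-default pass over a configuration dict."""
    conf = copy.deepcopy(conf)  # assumption: leave the caller's dict untouched

    # Critical sections: fail loudly when they are absent.
    assert "data" in conf, "conf is missing the 'data' section"
    assert "model" in conf, "conf is missing the 'model' section"

    if parse_training:
        assert "trainer" in conf, "conf is missing the 'trainer' section"
        # Non-critical option: fall back to a default (hypothetical key).
        conf["trainer"].setdefault("example_flag", False)

    if parse_predict:
        assert "predict" in conf, "conf is missing the 'predict' section"
        conf["predict"].setdefault("example_flag", False)

    return conf


raw = {"data": {"train_years": [1979, 2014]}, "model": {}, "trainer": {}}
parsed = parse_conf_sketch(raw, parse_predict=False)
```

Skipping the prediction checks with parse_predict=False lets a training-only configuration pass, which mirrors the purpose of the two parse_* switches.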

credit.parser.training_data_check(conf, print_summary=False)

Note: this function is designed for model training, NOT for rollout.

The following items are covered:

All yearly files (upper-air, surface, dynamic forcing, diagnostic) cover the years requested in conf[‘data’][‘train_years’] and conf[‘data’][‘valid_years’].
All variables (upper-air, surface, dynamic forcing, diagnostic) exist in their corresponding files. Note: only one file from each group is checked.
All files (upper-air, surface, dynamic forcing, diagnostic, forcing, static, mean, std, lat_weights) have the same coordinate names and coordinate values. Note: this check covers the lat, lon, and level coordinates and ignores ‘time’ coordinates.
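The coordinate-consistency item in the list above can be sketched with plain dictionaries standing in for opened data files. Everything here is hypothetical: the function name, the coordinate names and values, the substring rule for skipping time-like coordinates, and the use of exact equality when comparing values.

```python
def coords_match_sketch(reference, other, ignore_pattern="time"):
    """Compare coordinate names and values, skipping time-like coordinates."""
    ref_names = sorted(n for n in reference if ignore_pattern not in n)
    other_names = sorted(n for n in other if ignore_pattern not in n)
    if ref_names != other_names:
        return False  # the two files do not share the same coordinate names
    # Assumption: values must match exactly; a real check might use a tolerance.
    return all(list(reference[n]) == list(other[n]) for n in ref_names)


# Stand-ins for the coordinates of three files.
upper_air = {
    "time": [0, 1, 2],
    "latitude": [-90.0, 0.0, 90.0],
    "longitude": [0.0, 180.0],
    "level": [1, 2, 3],
}
static = {
    "latitude": [-90.0, 0.0, 90.0],
    "longitude": [0.0, 180.0],
    "level": [1, 2, 3],
}
shifted = dict(static, longitude=[0.0, 90.0])

same_grid = coords_match_sketch(upper_air, static)
different_grid = coords_match_sketch(upper_air, shifted)
```

The static stand-in has no time coordinate at all, yet it still matches the upper-air one because time-like names are excluded before comparing.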

Where is it applied?

applications/train_gen1.py
applications/train_multistep.py

credit.parser.predict_data_check(conf, print_summary=False)

Note: this function is designed for model rollout.

Diagnostic variables are checked in the mean and std files only.

The following items are covered:

All variables (upper-air, surface, dynamic forcing) exist in their corresponding files. Note: only one file from each group is checked.
All files (upper-air, surface, dynamic forcing, forcing, static, mean, std, lat_weights) have the same coordinate names and coordinate values. Note: this check covers the lat, lon, and level coordinates and ignores ‘time’ coordinates.
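The variable-existence item in the list above amounts to comparing the variable names requested in the configuration with those found in one file per group. A minimal sketch, assuming hypothetical group labels and variable names, and assuming the caller has already collected the names present in each sampled file:

```python
def missing_variables_sketch(requested, available):
    """Return, per group, the requested variables absent from the sampled file."""
    missing = {}
    for group, names in requested.items():
        absent = [n for n in names if n not in available.get(group, ())]
        if absent:
            missing[group] = absent
    return missing


# Hypothetical variable names requested in the configuration.
requested = {"upper_air": ["U", "V", "T"], "surface": ["SP"]}
# Hypothetical variable names found in one file of each group.
available = {"upper_air": ["U", "V", "T", "Q"], "surface": ["t2m"]}

problems = missing_variables_sketch(requested, available)
```

An empty result means every requested variable was found; a non-empty one names the group and variables to fix before starting a rollout.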

Where is it applied?

applications/rollout_to_netcdf_new.py