credit.datasets#

Submodules#

Attributes#

Functions#

__getattr__(name)

set_globals(data_config[, namespace])

Sets global variables from the provided configuration dictionary in the specified namespace.

setup_data_loading(conf)

Sets up the data loading configuration by reading and processing data paths, surface, dynamic forcing, and diagnostic files based on the given configuration.

Package Contents#

credit.datasets.logger#
credit.datasets._CLASS_SOURCES#
credit.datasets.__getattr__(name)#
credit.datasets.set_globals(data_config, namespace=None)#

Sets global variables from the provided configuration dictionary in the specified namespace.

This function updates the variables in the given namespace. If namespace is not provided, it uses the caller’s global namespace, obtained via sys._getframe(1).f_globals.

Parameters:
  • data_config (dict) – A dictionary where the keys are the global variable names and the values are the corresponding values to set.

  • namespace (dict, optional) – The namespace (or dictionary) where the global variables should be set. If not provided, the caller’s global namespace is used.

The function logs the name of each global variable as it is created.
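The behavior described above can be illustrated with a minimal sketch. This is not the package's actual implementation: the name set_globals_sketch and the log message text are assumptions made for illustration, and only the sys._getframe(1).f_globals fallback comes from the description above.

```python
import logging
import sys

logger = logging.getLogger(__name__)


def set_globals_sketch(data_config, namespace=None):
    """Hypothetical stand-in for set_globals; illustrative only."""
    # Fall back to the caller's module-level globals when no namespace is given.
    target = namespace if namespace is not None else sys._getframe(1).f_globals
    for name, value in data_config.items():
        # The log message wording is an assumption, not the library's output.
        logger.info("Creating global variable: %s", name)
        target[name] = value


# Passing an explicit dict keeps the assignment contained and easy to inspect.
ns = {}
set_globals_sketch({"history_len": 2, "forecast_len": 1}, namespace=ns)
```

Passing an explicit namespace, as in the usage lines, avoids modifying the caller's module globals as a side effect.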

credit.datasets.setup_data_loading(conf)#
Sets up the data loading configuration by reading and processing data paths, surface, dynamic forcing, and diagnostic files based on the given configuration.

The function processes the configuration dictionary (conf) and performs the following:
  • Globs and filters data files (ERA5, surface, dynamic forcing, diagnostic).

  • Determines the training and validation file sets based on specified years.

  • Sets up variables like historical data length, forecast length, and additional metadata.

  • Returns a dictionary containing all the paths and configuration details for further use.

Parameters:

conf (dict) – A dictionary containing configuration details, including data paths, variable names, forecast details, and other settings.

Returns:

A dictionary containing paths to various datasets and other configuration values used in data loading, such as:
  • all_ERA_files: All ERA5 dataset files.

  • train_files: Filtered training dataset files.

  • valid_files: Filtered validation dataset files.

  • surface_files: Surface data files, if available.

  • dyn_forcing_files: Dynamic forcing files, if available.

  • diagnostic_files: Diagnostic files, if available.

  • varname_upper_air, varname_surface, varname_dyn_forcing, etc.: Variable names for each data type.

  • history_len: Length of the history data for training.

  • forecast_len: Number of steps ahead to forecast.

  • Other configuration values related to skipping periods, one-shot learning, etc.

Return type:

data_config (dict)
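The year-based split into training and validation files can be sketched as follows. This is an assumption-laden illustration, not the code used by setup_data_loading: the helper filter_files_by_year, the example file names, the half-open year interval, and the convention that each file name contains a four-digit year are all hypothetical.

```python
import os
import re


def filter_files_by_year(files, start_year, end_year):
    """Hypothetical helper: keep files whose name holds a year in [start_year, end_year)."""
    selected = []
    for path in files:
        # Assumes the first four-digit run in the base name is the year.
        match = re.search(r"\d{4}", os.path.basename(path))
        if match and start_year <= int(match.group()) < end_year:
            selected.append(path)
    return sorted(selected)


# Hypothetical file list standing in for the result of a glob over the data path.
all_files = [
    "/data/era5_1979.zarr",
    "/data/era5_2014.zarr",
    "/data/era5_2018.zarr",
]
train_files = filter_files_by_year(all_files, 1979, 2015)
valid_files = filter_files_by_year(all_files, 2015, 2019)
```

Keeping the two year ranges disjoint ensures that no file appears in both train_files and valid_files.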