credit.trainers.trainer404

Classes

Trainer

Helper class that provides a standard way to create an ABC using inheritance.

Module Contents

class credit.trainers.trainer404.Trainer(model: torch.nn.Module, rank: int)

Bases: credit.trainers.base_trainer.BaseTrainer

Helper class that provides a standard way to create an ABC using inheritance.

train_one_epoch(epoch, conf, trainloader, optimizer, criterion, scaler, scheduler, metrics)

Train the model for one epoch.

Parameters:
  • epoch (int) – The current epoch number.

  • conf (Dict[str, Any]) – The configuration dictionary.

  • trainloader (torch.utils.data.DataLoader) – The training data loader.

  • optimizer (torch.optim.Optimizer) – The optimizer.

  • criterion (torch.nn.Module) – The loss function.

  • scaler (torch.cuda.amp.GradScaler) – The gradient scaler for mixed precision training.

  • scheduler (torch.optim.lr_scheduler.LRScheduler) – The learning rate scheduler.

  • metrics (Dict[str, Any]) – The metrics to track during training.

Returns:

A dictionary containing the training results.

Return type:

Dict[str, float]
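
The return type above is a flat mapping from names to floats. As a rough illustration of how per-batch scalars could be reduced to such a dictionary, here is a minimal sketch that assumes plain averaging over batches. The helper `average_batch_results` and the key names `train_loss` and `train_acc` are hypothetical and are not taken from the library, which may aggregate its results differently.

```python
from collections import defaultdict
from typing import Dict, Iterable


def average_batch_results(batch_results: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Average per-batch scalars into one epoch-level dictionary (hypothetical helper)."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for result in batch_results:
        for name, value in result.items():
            totals[name] += float(value)
            counts[name] += 1
    # Divide each running total by the number of batches that reported it.
    return {name: totals[name] / counts[name] for name in totals}


# Made-up per-batch values; the key names are illustrative assumptions only.
batches = [
    {"train_loss": 0.5, "train_acc": 0.25},
    {"train_loss": 0.25, "train_acc": 0.75},
]
epoch_results = average_batch_results(batches)
```

With these inputs, `epoch_results` holds one averaged float per key, which matches the documented `Dict[str, float]` shape.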

validate(epoch, conf, valid_loader, criterion, metrics)

Validate the model on the validation set.

Parameters:
  • epoch (int) – The current epoch number.

  • conf (Dict[str, Any]) – The configuration dictionary.

  • valid_loader (torch.utils.data.DataLoader) – The validation data loader.

  • criterion (torch.nn.Module) – The loss function.

  • metrics (Dict[str, Any]) – The metrics to track during validation.

Returns:

A dictionary containing the validation results.

Return type:

Dict[str, float]

fit(conf, train_loader, valid_loader, optimizer, train_criterion, valid_criterion, scaler, scheduler, metrics, rollout_scheduler=None, trial=False)

Fit the model to the data.

Parameters:
  • conf (Dict[str, Any]) – Configuration dictionary.

  • train_loader (torch.utils.data.DataLoader) – DataLoader for training data.

  • valid_loader (torch.utils.data.DataLoader) – DataLoader for validation data.

  • optimizer (torch.optim.Optimizer) – The optimizer to use for training.

  • train_criterion (torch.nn.Module) – Loss function for training.

  • valid_criterion (torch.nn.Module) – Loss function for validation.

  • scaler (torch.cuda.amp.GradScaler) – Gradient scaler for mixed precision training.

  • scheduler (torch.optim.lr_scheduler.LRScheduler) – Learning rate scheduler.

  • metrics (Dict[str, Any]) – Dictionary of metrics to track during training.

  • rollout_scheduler (Optional[Callable]) – Function to schedule rollout probability, if applicable.

  • trial (bool) – Whether this is a trial run (e.g., for hyperparameter tuning).

Returns:

Dictionary containing the best results from training.

Return type:

Dict[str, Any]
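
This page does not say how fit selects the "best results". The sketch below shows one plausible arrangement: call a training step and a validation step once per epoch, and keep the results from the epoch with the lowest monitored validation value. The function `fit_sketch`, its `monitor` argument, and the keys `train_loss`, `valid_loss`, and `epoch` are all assumptions made for illustration. The real method takes the configuration, data loaders, optimizer, and other arguments listed above, and may use a different selection rule.

```python
from typing import Any, Callable, Dict


def fit_sketch(
    train_one_epoch: Callable[[int], Dict[str, float]],
    validate: Callable[[int], Dict[str, float]],
    epochs: int,
    monitor: str = "valid_loss",  # hypothetical key name
) -> Dict[str, Any]:
    """Train and validate each epoch; keep the epoch with the lowest monitored value."""
    best: Dict[str, Any] = {}
    for epoch in range(epochs):
        train_results = train_one_epoch(epoch)
        valid_results = validate(epoch)
        # Assumption: "best" means the lowest monitored validation value so far.
        if not best or valid_results[monitor] < best[monitor]:
            best = {"epoch": epoch, **train_results, **valid_results}
    return best


# Stand-in callables with made-up numbers, used only to exercise the loop.
train_losses = [0.9, 0.6, 0.5]
valid_losses = [1.0, 0.7, 0.8]

best_results = fit_sketch(
    train_one_epoch=lambda epoch: {"train_loss": train_losses[epoch]},
    validate=lambda epoch: {"valid_loss": valid_losses[epoch]},
    epochs=3,
)
```

In this toy run the validation loss is lowest in the second epoch (index 1), so `best_results` holds that epoch's values even though the training loss keeps falling afterwards.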