credit.trainers.trainerERA5_Diffusion
Attributes
logger
Classes
TrainerERA5Diffusion: Helper class that provides a standard way to create an ABC using inheritance.
Module Contents
- credit.trainers.trainerERA5_Diffusion.logger
- class credit.trainers.trainerERA5_Diffusion.TrainerERA5Diffusion(model: torch.nn.Module, rank: int, conf: dict)
Bases: credit.trainers.base_trainer.BaseTrainer
Helper class that provides a standard way to create an ABC using inheritance.
- train_one_epoch(epoch, trainloader, optimizer, criterion, scaler, scheduler, metrics)
Trains the model for one epoch.
- Parameters:
epoch (int) – Current epoch number.
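conf (dict) – Configuration dictionary containing training settings. Listed in the docstring but absent from the method signature; conf is an argument of the class constructor.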
trainloader (DataLoader) – DataLoader for the training dataset.
optimizer (torch.optim.Optimizer) – Optimizer used for training.
criterion (callable) – Loss function used for training.
scaler (torch.cuda.amp.GradScaler) – Gradient scaler for mixed precision training.
scheduler (torch.optim.lr_scheduler._LRScheduler) – Learning rate scheduler.
metrics (callable) – Function to compute metrics for evaluation.
- Returns:
Dictionary containing training metrics and loss for the epoch.
- Return type:
dict
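The source does not say how the returned dictionary is assembled or which keys it holds. As a minimal sketch, assuming per-batch scalars are averaged into one value per epoch, the reduction could look like the following. The helper `summarize_epoch` and the key names `train_loss` and `train_mae` are hypothetical and are not taken from the CREDIT code base.

```python
from collections import defaultdict
from statistics import fmean


def summarize_epoch(batch_records):
    """Average per-batch scalars into a single per-epoch dictionary.

    Hypothetical helper for illustration; not part of the documented API.
    """
    collected = defaultdict(list)
    for record in batch_records:
        for name, value in record.items():
            collected[name].append(float(value))
    # One averaged value per key, mirroring "metrics and loss for the epoch".
    return {name: fmean(values) for name, values in collected.items()}


# Hypothetical per-batch values; the key names are illustrative only.
batches = [
    {"train_loss": 0.50, "train_mae": 0.20},
    {"train_loss": 0.30, "train_mae": 0.10},
]
epoch_summary = summarize_epoch(batches)
print(epoch_summary)
```

A plain dict of floats is easy to log or append to a per-epoch history, which is one plausible reason for a dict return type.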
- validate(epoch, valid_loader, criterion, metrics)
Validates the model on the validation dataset.
- Parameters:
epoch (int) – Current epoch number.
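conf (dict) – Configuration dictionary containing validation settings. Listed in the docstring but absent from the method signature; conf is an argument of the class constructor.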
valid_loader (DataLoader) – DataLoader for the validation dataset.
criterion (callable) – Loss function used for validation.
metrics (callable) – Function to compute metrics for evaluation.
- Returns:
Dictionary containing validation metrics and loss for the epoch.
- Return type:
dict
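To show how the two documented methods might be driven from an outer epoch loop, here is a rough sketch that uses a stand-in class with the same method signatures. `StubTrainer`, `squared_error`, the toy batches, and the keys `train_loss` and `valid_loss` are all assumptions made for illustration. The real TrainerERA5Diffusion requires a torch model, a rank, and a configuration dictionary, and its internals are not described in this page.

```python
class StubTrainer:
    """Hypothetical stand-in mirroring the two documented method signatures."""

    def train_one_epoch(self, epoch, trainloader, optimizer, criterion,
                        scaler, scheduler, metrics):
        # A real trainer would also run backward passes and optimizer steps.
        losses = [criterion(pred, target) for pred, target in trainloader]
        return {"train_loss": sum(losses) / len(losses)}

    def validate(self, epoch, valid_loader, criterion, metrics):
        losses = [criterion(pred, target) for pred, target in valid_loader]
        return {"valid_loss": sum(losses) / len(losses)}


def squared_error(pred, target):
    """Toy scalar loss used in place of a real criterion."""
    return (pred - target) ** 2


# Toy "loaders": lists of (prediction, target) pairs instead of DataLoaders.
train_batches = [(1.0, 0.0), (2.0, 1.0)]
valid_batches = [(0.5, 0.0), (1.5, 1.0)]

trainer = StubTrainer()
history = []
for epoch in range(2):
    train_results = trainer.train_one_epoch(
        epoch, train_batches, None, squared_error, None, None, None
    )
    valid_results = trainer.validate(epoch, valid_batches, squared_error, None)
    # Merge both result dicts into one record per epoch.
    history.append({"epoch": epoch, **train_results, **valid_results})

print(history)
```

Because both methods return dictionaries, their results can be merged into a single per-epoch record without further conversion.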