credit.samplers
===============

.. py:module:: credit.samplers


Attributes
----------

.. autoapisummary::

   credit.samplers.logger


Classes
-------

.. autoapisummary::

   credit.samplers.MultiStepBatchSamplerSubset
   credit.samplers.DistributedMultiStepBatchSampler


Module Contents
---------------

.. py:data:: logger

.. py:class:: MultiStepBatchSamplerSubset(dataset: torch.utils.data.Dataset, batch_size: int, index_subset, num_forecast_steps: int)

   Bases: :py:obj:`torch.utils.data.Sampler`


   Base class for all Samplers.

   Every Sampler subclass has to provide an :meth:`__iter__` method, providing a
   way to iterate over indices or lists of indices (batches) of dataset elements,
   and may provide a :meth:`__len__` method that returns the length of the returned iterators.

   .. rubric:: Example

   >>> # xdoctest: +SKIP
   >>> from typing import Iterator, List
   >>> import torch
   >>> from torch.utils.data import Sampler
   >>> class AscendingSequenceLengthSampler(Sampler[int]):
   ...     # Yields one index at a time, shortest string first.
   ...     def __init__(self, data: List[str]) -> None:
   ...         self.data = data
   ...
   ...     def __len__(self) -> int:
   ...         return len(self.data)
   ...
   ...     def __iter__(self) -> Iterator[int]:
   ...         sizes = torch.tensor([len(x) for x in self.data])
   ...         yield from torch.argsort(sizes).tolist()
   ...
   >>> class AscendingSequenceLengthBatchSampler(Sampler[List[int]]):
   ...     # Yields lists of indices (batches), shortest strings first.
   ...     def __init__(self, data: List[str], batch_size: int) -> None:
   ...         self.data = data
   ...         self.batch_size = batch_size
   ...
   ...     def __len__(self) -> int:
   ...         # Number of batches, rounding up for a partial final batch.
   ...         return (len(self.data) + self.batch_size - 1) // self.batch_size
   ...
   ...     def __iter__(self) -> Iterator[List[int]]:
   ...         sizes = torch.tensor([len(x) for x in self.data])
   ...         for batch in torch.chunk(torch.argsort(sizes), len(self)):
   ...             yield batch.tolist()

   .. note:: The :meth:`__len__` method isn't strictly required by
             :class:`~torch.utils.data.DataLoader`, but is expected in any
             calculation involving the length of a :class:`~torch.utils.data.DataLoader`.


   .. py:attribute:: dataset


   .. py:attribute:: num_forecast_steps


   .. py:attribute:: init_times


   .. py:attribute:: dt


   .. py:attribute:: index_subset


   .. py:attribute:: batch_size


   .. py:attribute:: num_start_batches


   .. py:method:: __len__()


   .. py:method:: __iter__()


.. py:class:: DistributedMultiStepBatchSampler(dataset: torch.utils.data.Dataset, batch_size: int, num_forecast_steps: int, num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed: int = 0, drop_last: bool = False)

   Bases: :py:obj:`torch.utils.data.DistributedSampler`


   Sampler that restricts data loading to a subset of the dataset.

   It is especially useful in conjunction with
   :class:`torch.nn.parallel.DistributedDataParallel`. In such a case, each
   process can pass a :class:`~torch.utils.data.DistributedSampler` instance as a
   :class:`~torch.utils.data.DataLoader` sampler, and load a subset of the
   original dataset that is exclusive to it.

   .. note::
       The dataset is assumed to be of constant size, and any instance of it is
       assumed to always return the same elements in the same order.

   :param dataset: Dataset used for sampling.
   :param num_replicas: Number of processes participating in
                        distributed training. By default, :attr:`world_size` is retrieved from the
                        current distributed group.
   :type num_replicas: int, optional
   :param rank: Rank of the current process within :attr:`num_replicas`.
                By default, :attr:`rank` is retrieved from the current distributed
                group.
   :type rank: int, optional
   :param shuffle: If ``True`` (default), sampler will shuffle the
                   indices.
   :type shuffle: bool, optional
   :param seed: Random seed used to shuffle the sampler if
                :attr:`shuffle=True`. This number should be identical across all
                processes in the distributed group. Default: ``0``.
   :type seed: int, optional
   :param drop_last: If ``True``, then the sampler will drop the
                     tail of the data to make it evenly divisible across the number of
                     replicas. If ``False``, the sampler will add extra indices to make
                     the data evenly divisible across the replicas. Default: ``False``.
   :type drop_last: bool, optional
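
   The effect of ``num_replicas``, ``rank`` and ``drop_last`` can be sketched
   in plain Python. This is an illustrative sketch of the behaviour described
   above for the parent :class:`~torch.utils.data.DistributedSampler`, with
   shuffling left out; ``partition_indices`` is a hypothetical helper, not part
   of ``credit.samplers``, and whether this subclass changes the partitioning
   is not stated here.

   .. code-block:: python

       import math


       def partition_indices(dataset_len, num_replicas, rank, drop_last=False):
           """Return the indices one rank would receive (no shuffling).

           Hypothetical helper for illustration; assumes dataset_len > 0.
           """
           if drop_last and dataset_len % num_replicas != 0:
               # Drop the tail so that every rank gets the same number of samples.
               num_samples = math.ceil((dataset_len - num_replicas) / num_replicas)
           else:
               # Round up; any shortfall is filled with extra indices below.
               num_samples = math.ceil(dataset_len / num_replicas)
           total_size = num_samples * num_replicas

           indices = list(range(dataset_len))
           if drop_last:
               indices = indices[:total_size]
           else:
               # Pad by repeating indices from the start of the list.
               padding = total_size - len(indices)
               indices += (indices * math.ceil(padding / len(indices)))[:padding]

           # Each rank takes every num_replicas-th index, starting at its own rank.
           return indices[rank:total_size:num_replicas]


       # 10 samples over 4 ranks: padded to 12 indices, or truncated to 8.
       padded = [partition_indices(10, 4, r) for r in range(4)]
       dropped = [partition_indices(10, 4, r, drop_last=True) for r in range(4)]

   With ``drop_last=False`` the last two ranks each receive one repeated index
   (``0`` and ``1``), while with ``drop_last=True`` indices ``8`` and ``9`` are
   never sampled.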

   .. warning::
       In distributed mode, calling the :meth:`set_epoch` method at
       the beginning of each epoch **before** creating the :class:`DataLoader` iterator
       is necessary to make shuffling work properly across multiple epochs. Otherwise,
       the same ordering will always be used.

   Example::

       >>> # xdoctest: +SKIP
       >>> sampler = DistributedSampler(dataset) if is_distributed else None
       >>> loader = DataLoader(dataset, shuffle=(sampler is None),
       ...                     sampler=sampler)
       >>> for epoch in range(start_epoch, n_epochs):
       ...     if is_distributed:
       ...         sampler.set_epoch(epoch)
       ...     train(loader)
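
   Why the epoch matters can be shown with a small stand-in. The parent
   :class:`~torch.utils.data.DistributedSampler` seeds its shuffle from
   ``seed + epoch``; the sketch below mimics that with Python's ``random``
   module rather than a torch generator, so the actual permutations differ
   from what torch produces. ``epoch_order`` is a hypothetical helper used
   only for illustration.

   .. code-block:: python

       import random


       def epoch_order(num_items, seed, epoch):
           """Return a shuffled index order that depends only on seed + epoch."""
           # Stand-in for seeding a torch generator with seed + epoch.
           rng = random.Random(seed + epoch)
           order = list(range(num_items))
           rng.shuffle(order)
           return order


       # Without set_epoch the epoch stays at 0, so every pass repeats the order.
       first_pass = epoch_order(100, seed=0, epoch=0)
       repeat_pass = epoch_order(100, seed=0, epoch=0)

       # Advancing the epoch changes the seed and therefore the order.
       next_epoch = epoch_order(100, seed=0, epoch=1)

   Because the order depends only on ``seed`` and the epoch, every process
   that shares the same seed computes the same permutation and then takes its
   own slice of it.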


   .. py:attribute:: batch_size


   .. py:attribute:: num_forecast_steps


   .. py:method:: __iter__()


   .. py:method:: __len__() -> int


