cerebras.modelzoo.data.common.input_utils.check_sharding_sanity
- cerebras.modelzoo.data.common.input_utils.check_sharding_sanity(examples_per_file, batch_size, num_workers, drop_last)
Checks whether the given sharding generates at least one batch.
Note that this check follows the way shard_and_shuffle_data shards the data across workers.
- Parameters
examples_per_file (list) – Total examples per file for this task.
batch_size (int) – Batch size of the model.
num_workers (int) – Number of workers to use in the dataloader.
drop_last (bool) – Whether the last incomplete batch of the dataloader is dropped.
- Raises
ValueError – If no batches are generated with the given sharding.
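The kind of condition this check guards against can be illustrated with a minimal sketch. This is not the modelzoo implementation: the function name sketch_check_sharding_sanity is hypothetical, and the sketch assumes that files are assigned to workers round-robin and that num_workers=0 means a single shard loaded in the main process. The actual behavior depends on how shard_and_shuffle_data distributes files.

```python
from typing import List


def sketch_check_sharding_sanity(
    examples_per_file: List[int],
    batch_size: int,
    num_workers: int,
    drop_last: bool,
) -> None:
    """Hypothetical stand-in for illustration, NOT the modelzoo code."""
    total = sum(examples_per_file)

    if not drop_last:
        # An incomplete final batch is kept, so any example yields a batch.
        if total == 0:
            raise ValueError("No examples available, so no batch is generated.")
        return

    # Assumption: zero workers means one shard loaded in the main process.
    num_shards = max(num_workers, 1)

    # Assumption: file i is assigned to worker i % num_shards (round-robin).
    examples_per_worker = [0] * num_shards
    for file_index, num_examples in enumerate(examples_per_file):
        examples_per_worker[file_index % num_shards] += num_examples

    # With drop_last=True a worker emits a batch only if it holds a full one.
    if all(count < batch_size for count in examples_per_worker):
        raise ValueError(
            f"No worker has at least batch_size={batch_size} examples "
            f"(per-worker counts: {examples_per_worker})."
        )
```

Under these assumptions, two files of 5 examples each pass the check with batch_size=8 and num_workers=1, because the single worker sees all 10 examples. The same files fail with num_workers=2 and drop_last=True, because each worker holds only 5 examples and its incomplete batch is dropped.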