cerebras.modelzoo.data.vision.segmentation.Hdf5BaseDataProcessor.Hdf5BaseDataProcessor

class cerebras.modelzoo.data.vision.segmentation.Hdf5BaseDataProcessor.Hdf5BaseDataProcessor(*args, **kwargs)

Bases: torch.utils.data.Dataset
An HDF5 dataset processor for the UNet HDF5 dataset. Performs on-the-fly augmentation of images and labels.
Functionality includes:
- Reading data from HDF5 files
- Augmenting data
Parameters
params (dict) – dict containing training input parameters for creating the dataset.
Expects the following fields:
“data_dir” (str or list of str): Path to dataset HDF5 files
“num_classes” (int): Number of output classes for segmentation.
“image_shape” (list of int): Expected shape of output images and labels, used in assert checks.
“loss” (str): Loss type, supported: {“bce”, “multilabel_bce”, “ssce”}
“normalize_data_method” (str): Can be one of {None, “zero_centered”, “zero_one”}
“batch_size” (int): Batch size.
“shuffle” (bool): Flag to enable data shuffling.
“shuffle_buffer” (int): Size of shuffle buffer in samples.
“shuffle_seed” (int): Shuffle seed.
“num_workers” (int): How many subprocesses to use for data loading.
“drop_last” (bool): If True and the dataset size is not divisible by the batch size, the last incomplete batch will be dropped.
“prefetch_factor” (int): Number of samples loaded in advance by each worker.
“persistent_workers” (bool): If True, the data loader will not shut down the worker processes after a dataset has been consumed once.
Methods

create_dataloader
Classmethod to create the dataloader object.
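Below is a minimal usage sketch, assuming the processor is instantiated directly from the params dict described above and that create_dataloader returns an iterable dataloader. The field values and the data path are illustrative placeholders, not defaults.

```python
from cerebras.modelzoo.data.vision.segmentation.Hdf5BaseDataProcessor import (
    Hdf5BaseDataProcessor,
)

# All values below are illustrative placeholders; adjust them to your dataset.
params = {
    "data_dir": "/path/to/unet_hdf5",       # directory containing the HDF5 files
    "num_classes": 2,                       # e.g. binary segmentation
    "image_shape": [256, 256, 1],           # must match the stored image/label shape
    "loss": "bce",                          # one of "bce", "multilabel_bce", "ssce"
    "normalize_data_method": "zero_centered",
    "batch_size": 8,
    "shuffle": True,
    "shuffle_buffer": 1000,
    "shuffle_seed": 42,
    "num_workers": 4,
    "drop_last": True,
    "prefetch_factor": 2,
    "persistent_workers": True,
}

processor = Hdf5BaseDataProcessor(params)
dataloader = processor.create_dataloader()

# Iterate over (image, label) batches produced by the dataloader.
for images, labels in dataloader:
    pass
```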