cerebras.modelzoo.data.common.h5_map_dataset.dataset.MLMHDF5Dataset
class cerebras.modelzoo.data.common.h5_map_dataset.dataset.MLMHDF5Dataset(*args, **kwargs)
Bases: cerebras.modelzoo.data.common.h5_map_dataset.dataset.HDF5Dataset
Dataset class that handles text preprocessing for BERT masked language modeling (MLM) datasets.
Parameters
params (dict) – A dictionary containing the parameters that HDF5Dataset accepts, along with the following additions:
- "data_dir" (str): the path to the directory containing the images.
- "transforms" (list[dict]): a specification of the torchvision transforms.
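As a rough illustration of the shape of the params dictionary described above, the following sketch builds one and checks the two documented types. It does not import or call the Cerebras library. The path value, the empty transforms list, and the helper name validate_params are assumptions made for this example only; the schema of each transform entry is not described here, so the list is left empty rather than guessed.

```python
# Minimal sketch, NOT part of the Cerebras API: a plain dict with the two
# add-on keys named in the parameter description, plus a hypothetical helper
# that checks the documented types (str and list[dict]).

params = {
    # Placeholder path; replace with a real directory.
    "data_dir": "./path/to/data",
    # list[dict]; the per-entry schema is not documented here, so it is left empty.
    "transforms": [],
}


def validate_params(params: dict) -> None:
    """Raise TypeError if the add-on keys do not have the documented types."""
    data_dir = params.get("data_dir")
    if not isinstance(data_dir, str):
        raise TypeError('"data_dir" must be a str')

    transforms = params.get("transforms", [])
    if not isinstance(transforms, list):
        raise TypeError('"transforms" must be a list')
    if not all(isinstance(entry, dict) for entry in transforms):
        raise TypeError('every entry in "transforms" must be a dict')


# Passes silently for the dictionary above.
validate_params(params)
```

Any further keys accepted by HDF5Dataset would sit alongside these two in the same dictionary.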
Methods
generate_sample
load_state_dict
map
state_dict
Attributes
by_sample