cerebras.modelzoo.data_preparation.nlp.hdf5_preprocessing.utils.DatasetStats

class cerebras.modelzoo.data_preparation.nlp.hdf5_preprocessing.utils.DatasetStats(num_sequences: int, num_tokens: int, detokenized_bytes: int, detokenized_chars: int, non_pad_tokens: int, loss_valid_tokens: int)

Bases: object

Attributes

num_sequences

num_tokens

detokenized_bytes

detokenized_chars

non_pad_tokens

loss_valid_tokens
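
Example

The sketch below shows how the six counters in the signature above might be constructed and summed across shards. It does not import the Cerebras package: `DatasetStats` here is a local stand-in dataclass that only mirrors the documented constructor signature, and the real class may differ. `merge_stats` is a hypothetical helper, and the numbers are made up for illustration.

```python
from dataclasses import dataclass, fields


# Local stand-in mirroring the documented constructor signature.
# Assumption: this is NOT the library class, only a same-shaped substitute.
@dataclass
class DatasetStats:
    num_sequences: int
    num_tokens: int
    detokenized_bytes: int
    detokenized_chars: int
    non_pad_tokens: int
    loss_valid_tokens: int


def merge_stats(parts):
    """Hypothetical helper: sum every counter across per-shard stats."""
    totals = {f.name: 0 for f in fields(DatasetStats)}
    for part in parts:
        for name in totals:
            totals[name] += getattr(part, name)
    return DatasetStats(**totals)


# Illustrative values only; positional order follows the signature.
shard_a = DatasetStats(2, 16, 60, 58, 14, 12)
shard_b = DatasetStats(
    num_sequences=3,
    num_tokens=24,
    detokenized_bytes=95,
    detokenized_chars=95,
    non_pad_tokens=20,
    loss_valid_tokens=17,
)

total = merge_stats([shard_a, shard_b])
print(total)
```

Iterating over `dataclasses.fields` keeps the helper in step with the field list, so it would not need changing if the stand-in gained another counter.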