cerebras.modelzoo.data_preparation.nlp.hdf5_preprocessing.utils.Reader

class cerebras.modelzoo.data_preparation.nlp.hdf5_preprocessing.utils.Reader(in_path, tokenizable_columns, multi_turn=False)

Bases: object

Methods

read_gz

read_json

read_jsongz

read_jsonl

read_jsonl_tar

read_jsonl_zst

read_parquet

read_txt

stream_data
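
The method names above suggest one reader per input format plus a stream_data entry point that walks the input path. This page does not document their signatures or return types, so the following is only a minimal sketch of that pattern using the Python standard library. SketchReader, its method bodies, the dict-per-record output, and the extension-based dispatch are all assumptions made for illustration, not the Cerebras implementation. The multi_turn argument and the tar, zst, and parquet formats are left out.

```python
import gzip
import json
import os
import tempfile
from typing import Dict, Iterator, List


class SketchReader:
    """Hypothetical stand-in for illustration; NOT the Cerebras Reader."""

    def __init__(self, in_path: str, tokenizable_columns: List[str]):
        self.in_path = in_path
        self.tokenizable_columns = tokenizable_columns

    def _select(self, record: Dict) -> Dict:
        # Keep only the columns that were requested for tokenization.
        return {k: record[k] for k in self.tokenizable_columns if k in record}

    def read_txt(self, path: str) -> Iterator[Dict]:
        # Assumption: a plain-text file maps to the first tokenizable column.
        with open(path, encoding="utf-8") as f:
            yield {self.tokenizable_columns[0]: f.read()}

    def read_jsonl(self, path: str) -> Iterator[Dict]:
        # One JSON object per non-empty line.
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    yield self._select(json.loads(line))

    def read_jsongz(self, path: str) -> Iterator[Dict]:
        # Assumption: gzip-compressed JSON Lines, opened in text mode.
        with gzip.open(path, "rt", encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    yield self._select(json.loads(line))

    def stream_data(self) -> Iterator[Dict]:
        # Dispatch on file extension; unknown extensions are skipped.
        for name in sorted(os.listdir(self.in_path)):
            path = os.path.join(self.in_path, name)
            if name.endswith(".jsonl"):
                yield from self.read_jsonl(path)
            elif name.endswith(".json.gz"):
                yield from self.read_jsongz(path)
            elif name.endswith(".txt"):
                yield from self.read_txt(path)


# Usage: build a small input directory and stream records from it.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "a.jsonl"), "w", encoding="utf-8") as f:
        f.write(json.dumps({"text": "hello", "id": 1}) + "\n")
    with open(os.path.join(demo_dir, "b.txt"), "w", encoding="utf-8") as f:
        f.write("plain text document")
    for record in SketchReader(demo_dir, ["text"]).stream_data():
        print(record)
```

Yielding records lazily from generators keeps memory use flat for large inputs, which is the usual reason to expose a streaming entry point rather than returning a list.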