cerebras.modelzoo.data_preparation.nlp.tokenizers.BPETokenizer.BPETokenizer

class cerebras.modelzoo.data_preparation.nlp.tokenizers.BPETokenizer.BPETokenizer(vocab_file, encoder_file, errors='replace', special_tokens=None)

Bases: object

Methods

add_token

bpe

decode

encode

get_token_id
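
The method names above suggest a conventional byte-pair-encoding workflow: split text into symbols, merge adjacent pairs by rank, then map the resulting pieces to integer IDs and back. The sketch below illustrates that general technique only. It does not import or reproduce the Cerebras class, and `toy_bpe`, `merge_ranks`, `token_to_id`, and `id_to_token` are hypothetical names and data invented for the example. The real class loads its merge table and vocabulary from `vocab_file` and `encoder_file`, whose formats are not described on this page.

```python
from typing import Dict, List, Tuple


def toy_bpe(word: str, merge_ranks: Dict[Tuple[str, str], int]) -> List[str]:
    """Greedily apply the lowest-ranked merge until no listed pair remains."""
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        # Pick the adjacent pair with the best (lowest) merge rank.
        best = min(pairs, key=lambda p: merge_ranks.get(p, float("inf")))
        if best not in merge_ranks:
            break  # no adjacent pair is in the merge table
        merged, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Hypothetical merge table and vocabulary, for illustration only.
merge_ranks = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2}
token_to_id = {"low": 0, "er": 1, "l": 2, "o": 3, "w": 4, "e": 5, "r": 6}
id_to_token = {i: t for t, i in token_to_id.items()}

pieces = toy_bpe("lower", merge_ranks)        # analogous in spirit to bpe()
ids = [token_to_id[p] for p in pieces]        # analogous in spirit to encode()
text = "".join(id_to_token[i] for i in ids)   # analogous in spirit to decode()
```

In this toy setup, encoding followed by decoding returns the original string. Whether the real `encode` and `decode` round-trip exactly depends on details not shown here, such as the `errors` setting and any `special_tokens`.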