cerebras.modelzoo.data.nlp.gpt.InferenceDataProcessor.get_token_ids
- cerebras.modelzoo.data.nlp.gpt.InferenceDataProcessor.get_token_ids(text, tokenizer)
Get encoded token ids from a string using the specified tokenizer.
- Parameters
text (str) – The input string.
tokenizer (PreTrainedTokenizerBase) – Tokenizer instance from the Hugging Face transformers library.
- Returns
List of token ids.
- Return type
List[int]
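- Example
The following is a minimal sketch of the behaviour described above, not the library's actual implementation. It assumes the tokenizer is callable on a string and returns a mapping with an "input_ids" entry, as Hugging Face tokenizers do. ToyTokenizer and get_token_ids_sketch are hypothetical names introduced only so the example runs without any external dependency.

```python
from typing import Dict, List


class ToyTokenizer:
    """Hypothetical stand-in for a Hugging Face tokenizer (whitespace split)."""

    def __init__(self) -> None:
        self.vocab: Dict[str, int] = {}

    def __call__(self, text: str) -> Dict[str, List[int]]:
        # Give each previously unseen whitespace-separated token the next free id.
        ids = [self.vocab.setdefault(tok, len(self.vocab)) for tok in text.split()]
        return {"input_ids": ids}


def get_token_ids_sketch(text: str, tokenizer) -> List[int]:
    # Assumption: calling the tokenizer yields a mapping that contains "input_ids".
    return list(tokenizer(text)["input_ids"])


ids = get_token_ids_sketch("hello world hello", ToyTokenizer())
print(ids)  # [0, 1, 0]
```

With the real library, one would presumably pass a PreTrainedTokenizerBase instance (for example, one loaded with transformers.AutoTokenizer.from_pretrained) as the tokenizer argument instead of the toy class.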