cerebras.modelzoo.data.nlp.bert.bert_utils
Functions
Loads the vocab file. Parameters: vocab_file (str): path to the vocab file; do_lower (bool): whether the tokens should be converted to lowercase; oov_token (str): token reserved for out-of-vocabulary tokens.
Converts text to Unicode, assuming UTF-8 input.
Creates the predictions for the masked LM objective.
Reads data from meta files.
Returns the start and end indices of whole words.
Postprocesses the CSV file.
Shards the data across the processes.
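Two of the summaries above (UTF-8 conversion and sharding across processes) describe patterns that are easy to illustrate. The sketch below is not this module's code: the names `to_unicode` and `shard_items`, their signatures, the round-robin split, and the choice to drop undecodable bytes are all assumptions made for illustration.

```python
from typing import List, Sequence, TypeVar, Union

T = TypeVar("T")


def to_unicode(text: Union[str, bytes]) -> str:
    """Return `text` as str, decoding bytes as UTF-8 (hypothetical helper)."""
    if isinstance(text, str):
        return text
    if isinstance(text, bytes):
        # Assumption: undecodable bytes are dropped instead of raising.
        return text.decode("utf-8", "ignore")
    raise TypeError(f"Unsupported text type: {type(text)!r}")


def shard_items(items: Sequence[T], num_processes: int, rank: int) -> List[T]:
    """Give process `rank` every `num_processes`-th item (round-robin)."""
    if not 0 <= rank < num_processes:
        raise ValueError("rank must be in [0, num_processes)")
    # Slicing with a step keeps shards disjoint and covers every item once.
    return list(items[rank::num_processes])
```

With five input files and two processes, `shard_items(files, 2, 0)` would take the items at indices 0, 2, and 4, and `shard_items(files, 2, 1)` the items at indices 1 and 3. The module itself may split the data differently (for example, in contiguous blocks).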
Classes
Class to store vocab-related attributes.
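As a rough illustration of how a vocab container and the vocab-file loader described under Functions might fit together, here is a minimal sketch. `VocabSketch` and `build_vocab` are hypothetical names, not the module's API. The sketch assumes one token per line, ids assigned in line order, a default `oov_token` of `"[UNK]"`, and that `do_lower` also lowercases the OOV token.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class VocabSketch:
    """Hypothetical container for vocab attributes; not the module's class."""

    token_to_id: Dict[str, int]
    oov_token: str

    def lookup(self, token: str) -> int:
        # Unknown tokens fall back to the id of the OOV token.
        return self.token_to_id.get(token, self.token_to_id[self.oov_token])


def build_vocab(
    lines: Iterable[str], do_lower: bool, oov_token: str = "[UNK]"
) -> VocabSketch:
    """Build a vocab from an iterable of lines, one token per line."""
    if do_lower:
        # Assumption: the OOV token is lowercased along with the vocab.
        oov_token = oov_token.lower()
    token_to_id: Dict[str, int] = {}
    for line in lines:
        token = line.strip()
        if not token:
            continue  # skip blank lines
        if do_lower:
            token = token.lower()
        # Keep the first id if lowercasing produces a duplicate token.
        token_to_id.setdefault(token, len(token_to_id))
    if oov_token not in token_to_id:
        raise ValueError(f"OOV token {oov_token!r} is missing from the vocab")
    return VocabSketch(token_to_id=token_to_id, oov_token=oov_token)
```

Because `build_vocab` accepts any iterable of lines, an open file handle can be passed directly, for example `build_vocab(open(vocab_file, encoding="utf-8"), do_lower=True)`.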