cerebras.modelzoo.data_preparation.nlp.write_csv_ner
File: write_csv_ner.py
Creates pre-processed CSV files for the Data Processor from the raw NER dataset CSV files.
Based on https://github.com/NVIDIA/DeepLearningExamples/blob/master/TensorFlow/LanguageModeling/BERT/run_ner.py with minor modifications
Example Usage:
python write_csv_ner.py --data_dir /cb/ml/language/datasets/blurb/data_generation/data/BC5CDR-chem/ --vocab_file /cb/ml/language/datasets/pubmed_abstracts_baseline_fulltext_vocab/Pubmed_fulltext_vocab.txt --output_dir /cb/ml/language/datasets/ner-pt/bc5cdr-chem-csv --do_lower_case
Functions
Add required command-line arguments.
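As a rough illustration of what adding these arguments might look like, the sketch below builds an `argparse` parser for the four flags that appear in the example usage. The function name `build_parser`, the help strings, and the choice to mark the three path arguments as required are assumptions made for illustration; they are not taken from `write_csv_ner.py`.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the example usage."""
    parser = argparse.ArgumentParser(
        description="Sketch of the command-line interface (assumed, not the real one)."
    )
    # Marking these three as required is an assumption of this sketch.
    parser.add_argument("--data_dir", required=True,
                        help="Directory holding the raw NER dataset files.")
    parser.add_argument("--vocab_file", required=True,
                        help="Path to the vocabulary file.")
    parser.add_argument("--output_dir", required=True,
                        help="Directory to write the pre-processed CSV files to.")
    # A boolean switch: present means True, absent means False.
    parser.add_argument("--do_lower_case", action="store_true",
                        help="Lower-case the input text before tokenizing.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args([
    "--data_dir", "raw/",
    "--vocab_file", "vocab.txt",
    "--output_dir", "out/",
    "--do_lower_case",
])
print(args.data_dir, args.vocab_file, args.output_dir, args.do_lower_case)
```

Passing an explicit list to `parse_args` keeps the example independent of how the script is launched.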
Classes
Create new instance of InputFeatures(tokens, labels)
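The summary text above matches the docstring that `collections.namedtuple` generates automatically, which suggests (but does not confirm) that `InputFeatures` is a two-field named tuple. A minimal sketch under that assumption follows; the sample tokens and label strings are made up for illustration and do not come from any dataset.

```python
from collections import namedtuple

# Assumed definition: a plain two-field named tuple.
InputFeatures = namedtuple("InputFeatures", ["tokens", "labels"])

# Hypothetical example: one label per token.
features = InputFeatures(
    tokens=["aspirin", "reduces", "fever"],
    labels=["B", "O", "O"],
)

# Fields are accessible by name or by position, and the tuple unpacks normally.
tokens, labels = features
for token, label in zip(tokens, labels):
    print(token, label)
```

A named tuple keeps each example immutable and lightweight, which suits records that are built once and then written out row by row.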