How to format data for Named Entity Recognition (NER)
An NER dataset has two parts:
- Key information file: A file that lists the entities. This list serves as the key information for the training data.
- Training data: A file (.txt or .tsv) whose columns are separated by a Tab character. One column holds the sentence, and the other columns hold the labels for the tokens in that sentence.
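To make the two files concrete, here is a minimal Python sketch that parses a small example of each and checks that they agree. Everything specific in it is an assumption made for illustration and does not come from this article: the column names `Sentence` and `Labels`, the single label column with one space-separated label per token, the `O` label for tokens that are not entities, and the one-entity-per-line layout of the key information file.

```python
import csv
import io

# Hypothetical key information file: one entity name per line (assumed layout).
KEY_INFO_TEXT = "PERSON\nCITY\n"

# Hypothetical training data: a sentence column and one label column,
# separated by a Tab character. Column names are illustrative only.
TRAINING_TEXT = (
    "Sentence\tLabels\n"
    "Alice lives in Paris\tPERSON O O CITY\n"
    "Bob visited Rome\tPERSON O CITY\n"
)


def load_entities(key_info_text):
    """Return the set of entity names, one per non-empty line."""
    return {line.strip() for line in key_info_text.splitlines() if line.strip()}


def parse_training(tsv_text):
    """Parse tab-separated rows into (tokens, labels) pairs."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    rows = []
    for row in reader:
        tokens = row["Sentence"].split()  # whitespace tokenization (assumed)
        labels = row["Labels"].split()    # one label per token (assumed)
        rows.append((tokens, labels))
    return rows


def find_problems(rows, entities, outside_label="O"):
    """List rows whose labels do not line up with tokens or known entities."""
    problems = []
    for index, (tokens, labels) in enumerate(rows):
        if len(tokens) != len(labels):
            problems.append(
                f"row {index}: {len(tokens)} tokens but {len(labels)} labels"
            )
        for label in labels:
            if label != outside_label and label not in entities:
                problems.append(f"row {index}: unknown label {label!r}")
    return problems


entities = load_entities(KEY_INFO_TEXT)
rows = parse_training(TRAINING_TEXT)
problems = find_problems(rows, entities)
```

With the sample data above, `problems` is empty. A check like this can catch a row whose label count does not match its token count, or a label that is missing from the entity list, before training starts.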