This project focuses on implementing Named Entity Recognition (NER) utilizing the RoBERTa model. NER is a crucial task in Natural Language Processing (NLP) that involves identifying and classifying entities such as names of people, organizations, locations, and more within text.
To set up the project locally, follow these steps:
-
Clone the repository:
git clone https://github.com/SagarMaddela/NER-using-Roberta-Model.git cd NER-using-Roberta-Model
-
Create a virtual environment:
python3 -m venv venv source venv/bin/activate # On Windows, use `venv\Scripts\activate`
-
Install the required dependencies:
pip install -r requirements.txt
To train and evaluate the NER model:
-
Prepare your dataset:
- Place your training and evaluation data in the
data/
directory. Ensure the data is in the correct format expected by the scripts.
- Place your training and evaluation data in the
-
Configure the model parameters:
- Adjust the configuration settings in the
config/
directory as needed.
- Adjust the configuration settings in the
-
Run the training script:
python main.py
This script will train the RoBERTa model on your dataset and evaluate its performance.
config/
: Contains configuration files for model parameters and training settings.data/
: Directory to store training and evaluation datasets.scripts/
: Includes utility scripts for data preprocessing and other tasks.main.py
: The main script to train and evaluate the NER model.requirements.txt
: Lists the Python dependencies required for the project.
Contributions are welcome! If you'd like to contribute, please fork the repository and submit a pull request. For major changes, it's advisable to open an issue first to discuss the proposed modifications.
This project is licensed under the MIT License.