Natural Language Processing (NLP) has become an essential technology across industries such as healthcare, finance, marketing, and customer service. NLP techniques enable computers to understand, analyze, and generate human language, making it possible to automate tasks such as sentiment analysis, named entity recognition, grammar & spell check, and text summarization.
This project aims to develop a production-grade system that offers multiple NLP services, leveraging the power of GitHub Actions for continuous integration and continuous deployment (CI/CD), and AWS ECR (Elastic Container Registry) for container image management. The system will be deployed on EC2 self-hosted runners, providing an efficient and scalable solution for delivering high-quality NLP services.
The NLP Hub consists of multiple NLP services, each providing specific functionality, such as sentiment analysis, named entity recognition, grammar & spell check, and text summarization. The system will be designed to handle a high volume of text data.
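As a purely illustrative example, a client could call one of these services over HTTP once the system is deployed. The route (`/sentiment`) and request body below are hypothetical placeholders, not the actual endpoints defined in the application code:

```bash
# Hypothetical request to the sentiment-analysis service.
# The /sentiment route and JSON schema are placeholders; check the
# application code for the real endpoint names and payloads.
curl -X POST http://localhost/sentiment \
  -H "Content-Type: application/json" \
  -d '{"text": "The new release works great!"}'
```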
The system will be containerized using Docker, allowing for easy deployment and scaling. Docker images will be built and pushed to AWS ECR, a fully managed container registry service that provides a secure and scalable way to store and manage Docker container images. AWS ECR will be used for image management and versioning, enabling seamless deployment of new versions of the containers.
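As a minimal sketch (not the project's exact build script), an image could be built and pushed to ECR from a shell as follows, assuming the AWS CLI and Docker are installed. The region, registry URI, and repository name are placeholders that mirror the secrets listed later in this document:

```bash
# Placeholder values – substitute your own (they mirror the secrets listed below).
AWS_REGION="us-east-1"
AWS_ECR_LOGIN_URI="<aws_account_id>.dkr.ecr.us-east-1.amazonaws.com"
ECR_REPOSITORY_NAME="nlp-hub"

# Authenticate Docker to the ECR registry
aws ecr get-login-password --region "$AWS_REGION" | \
  docker login --username AWS --password-stdin "$AWS_ECR_LOGIN_URI"

# Build, tag, and push the image
docker build -t "$ECR_REPOSITORY_NAME:latest" .
docker tag "$ECR_REPOSITORY_NAME:latest" "$AWS_ECR_LOGIN_URI/$ECR_REPOSITORY_NAME:latest"
docker push "$AWS_ECR_LOGIN_URI/$ECR_REPOSITORY_NAME:latest"
```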
The CI/CD pipeline will be implemented using GitHub Actions, which will be triggered automatically whenever changes are pushed to the GitHub repository. GitHub Actions will build the Docker images, run tests, and push the images to AWS ECR. Then, the images will be pulled from ECR and deployed on EC2 self-hosted runners, which are EC2 instances configured as GitHub Actions runners. These runners will handle the deployment of the containers, ensuring a smooth and automated CI/CD process.
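As a rough sketch (not the exact workflow definition), the deployment stage executed on the self-hosted runner essentially pulls the latest image from ECR and restarts the container. The container name is illustrative, and the variables correspond to the secrets described later:

```bash
# Sketch of the deploy step running on the EC2 self-hosted runner.
docker pull "$AWS_ECR_LOGIN_URI/$ECR_REPOSITORY_NAME:latest"

# Replace any previously running container (the name "nlp-hub" is illustrative)
docker rm -f nlp-hub 2>/dev/null || true
docker run -d --name nlp-hub -p 80:8080 \
  "$AWS_ECR_LOGIN_URI/$ECR_REPOSITORY_NAME:latest"
```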
Tech stack used in this project:

- Python
- Docker
- Deep Learning
- PyTorch
- Hugging Face
- Cloud Computing
- SMTP Server
- GitHub
- DockerHub
- S3 Bucket
- GitHub Actions
- Amazon ECR (Elastic Container Registry)
- EC2 (Self Hosted Runner)
Ensure you have Python 3.7+ installed.

Clone the project:
```bash
git clone https://github.com/Hassi34/NLP-Hub.git
```
Go to the project directory:
```bash
cd NLP-Hub
```
Create a new conda environment:
```bash
conda create -n venv python=3.10
conda activate venv
```
OR create a new Python virtual environment with pip:
```bash
virtualenv venv
source venv/Scripts/activate
```
Install dependencies:
```bash
pip install -r requirements.txt
```
Provision an EC2 instance and run the following commands in order.

Optional commands:
```bash
sudo apt-get update -y
sudo apt-get upgrade
```
Required commands:
```bash
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
newgrp docker
```
The following environment variables/secrets are required:
```bash
#AWS
AWS_ACCESS_KEY_ID=""
AWS_SECRET_ACCESS_KEY=""
AWS_REGION=""
#DockerHub
DOCKERHUB_ACCESS_TOKEN=""
DOCKERHUB_USERNAME=""
IMAGE_NAME_DOCKER_HUB=""
#Email Alerts
EMAIL_PASS=""
SERVER_EMAIL=""
EMAIL_RECIPIENTS=""
#Applicable to Actions only
AWS_ECR_LOGIN_URI=""
ECR_REPOSITORY_NAME=""
```
To run CI/CD, make sure to set up the above environment variables as secrets in GitHub Actions.
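For example, the secrets can be registered from the command line with the GitHub CLI (a sketch; they can equally be added through the repository's Settings → Secrets and variables → Actions page). The values shown are placeholders:

```bash
# Sketch: register the required GitHub Actions secrets with the GitHub CLI (gh).
gh secret set AWS_ACCESS_KEY_ID --body "<your-access-key-id>"
gh secret set AWS_SECRET_ACCESS_KEY --body "<your-secret-access-key>"
gh secret set AWS_REGION --body "us-east-1"
gh secret set AWS_ECR_LOGIN_URI --body "<aws_account_id>.dkr.ecr.us-east-1.amazonaws.com"
gh secret set ECR_REPOSITORY_NAME --body "nlp-hub"
```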
First, set up a self-hosted runner on GitHub (a registration sketch is shown after the push command below), then create a GitHub repository and push the code to the main branch:
```bash
git add . && git commit -m "first commit" && git push -u origin main
```
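The runner registration on the EC2 instance roughly looks like the following. The download and extraction commands, repository URL, and token all come from the repository's Settings → Actions → Runners → "New self-hosted runner" page, so treat this only as a sketch:

```bash
# Sketch: register the EC2 instance as a GitHub Actions self-hosted runner.
# First download and extract the runner package using the exact commands shown
# on the "New self-hosted runner" page, then configure and start it:
./config.sh --url https://github.com/<OWNER>/<REPO> --token <RUNNER_TOKEN>
./run.sh
```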
To run the following command sequence, ensure you have Docker installed on your system. If you have not already pulled the image from Docker Hub, you can do so with:
```bash
docker pull hassi34/nlp-hub
```
Once you have the Docker image, you can run the following commands to test and deploy the container. List the available images:
```bash
docker images
```
Run a Docker container on your system:
```bash
docker run --name <CONTAINER NAME> -p 80:8080 -d <IMAGE NAME OR ID>
```
Check if the container is running:
```bash
docker ps
```
If the container is running, the API services will be available on all network interfaces.
To access the API service, type localhost in the browser.
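You can also check from a terminal that the deployed container is responding; the exact response depends on the application's routes, so this is only a basic connectivity check:

```bash
# Basic check that the container answers on port 80
curl -i http://localhost
```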
In conclusion, this project aims to develop a production-grade NLP services system with CI/CD implemented using GitHub Actions and AWS ECR. The system will be scalable, secure, and highly available, and it will provide multiple NLP services to cater to different use cases. The project will leverage the power of NLP libraries, containerization, and container image management using AWS ECR, along with GitHub Actions and EC2 self-hosted runners, to deliver a robust and efficient solution for NLP applications.
MIT © Hasanain
Let's connect on LinkedIn


