This is a simple project to scrape manga pages from websites and save them to a folder on an AWS EC2 instance. It has two parts: an API and a consumer. The API is a Flask app that receives a manga chapter request and, if the chapter has not already been scraped, sends a message to an SQS queue so the consumer can scrape its pages.
SQS Message

```json
{
  "source": "manga_livre",
  "manga": "Naruto",
  "chapter": "692"
}
```
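As a sketch of how the API side might publish this message with boto3 (the queue URL below is a placeholder assumption, not part of the repository):

```python
import json

# Hypothetical queue URL; the real one belongs to the deployed SQS queue
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/manga-scraper"

def build_scrape_message(source: str, manga: str, chapter: str) -> str:
    """Serialize the SQS message body in the format shown above."""
    return json.dumps({"source": source, "manga": manga, "chapter": chapter})

def publish(body: str) -> None:
    """Send the message to SQS (requires AWS credentials in the environment)."""
    import boto3  # imported lazily so the helper above has no AWS dependency
    sqs = boto3.client("sqs")
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=body)
```

Usage: `publish(build_scrape_message("manga_livre", "Naruto", "692"))`.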
- Get a single chapter page
GET /page
| Query Param | Type | Description |
|---|---|---|
| source | string | Required. `manga_livre` or `muito_manga` |
| manga | string | Required. manga name |
| number | string | Required. chapter number |
| page | string | Required. page number |
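Assuming the API is reachable at `http://localhost:5000` (host and port are assumptions), a `GET /page` call with the `requests` library might look like:

```python
BASE_URL = "http://localhost:5000"  # assumed host/port for the Flask API

def page_params(source: str, manga: str, number: str, page: str) -> dict:
    """Build the query parameters listed in the table above."""
    return {"source": source, "manga": manga, "number": number, "page": page}

def get_page(source: str, manga: str, number: str, page: str) -> bytes:
    """Fetch one page; assumes the endpoint returns the image bytes."""
    import requests  # third-party; imported lazily to keep page_params dependency-free
    resp = requests.get(f"{BASE_URL}/page",
                        params=page_params(source, manga, number, page))
    resp.raise_for_status()
    return resp.content
```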
- Save a single chapter page on EBS
POST /page
| Form | Type | Description |
|---|---|---|
| source | string | Required. `manga_livre` or `muito_manga` |
| manga | string | Required. manga name |
| number | string | Required. chapter number |
| page | string | Required. page number |
| image | file | Required. image file |
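A sketch of uploading a scraped page via `POST /page` as multipart form data (base URL and image path are assumptions):

```python
BASE_URL = "http://localhost:5000"  # assumed host/port for the Flask API

def page_form(source: str, manga: str, number: str, page: str) -> dict:
    """Form fields listed in the table above (all strings)."""
    return {"source": source, "manga": manga, "number": number, "page": page}

def save_page(source: str, manga: str, number: str,
              page: str, image_path: str) -> int:
    """Upload one page image; returns the HTTP status code."""
    import requests  # third-party; imported lazily
    with open(image_path, "rb") as fh:
        resp = requests.post(f"{BASE_URL}/page",
                             data=page_form(source, manga, number, page),
                             files={"image": fh})
    return resp.status_code
```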
- Get a chapter
GET /chapter
| Query Param | Type | Description |
|---|---|---|
| source | string | Required. `manga_livre` or `muito_manga` |
| manga | string | Required. manga name |
| number | string | Required. chapter number |
- Variables to be set in the repository secrets

```
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_DEFAULT_REGION=
AWS_SECURITY_GROUP=
SSH_PRIVATE_KEY=
HOSTNAME=
USERNAME=
```
- Workflow to deploy to EC2 instance
- Script to configure the EC2 instance, install Docker, update nginx, and run the container
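The secrets above would be consumed by the deploy workflow. A hypothetical fragment of such a workflow is sketched below (file name, job name, and deploy script path are assumptions; only the secret names come from the list above):

```yaml
# .github/workflows/deploy.yml (hypothetical fragment)
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to EC2 over SSH
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          AWS_DEFAULT_REGION: ${{ secrets.AWS_DEFAULT_REGION }}
        run: ./scripts/deploy.sh  # assumed deploy script path
```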
Use docker-compose to run both the API and the consumer:

```
docker-compose up --build --scale manga_consumer=10 -d
```

The `--scale manga_consumer=10` flag runs 10 consumers in parallel.
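A minimal `docker-compose.yml` consistent with that command might look like this (the `manga_consumer` service name comes from the `--scale` flag; the API service name, ports, and entrypoint are assumptions):

```yaml
# docker-compose.yml (hypothetical sketch)
version: "3.8"
services:
  manga_api:          # assumed name for the Flask API service
    build: .
    ports:
      - "5000:5000"
  manga_consumer:     # name taken from the --scale flag above
    build: .
    command: python consumer.py   # assumed consumer entrypoint
```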
This project is for study purposes only, I do not encourage piracy. If you like the manga, buy it. If you want to read it for free, go to the official website. I am not responsible for any misuse of this project.
Developed by Jean Jacques Barros