This is a simple project to scrape manga pages from websites and save them to a folder on an AWS EC2 instance. It has two parts: an API and a consumer. The API is a Flask app that receives a manga chapter request and, if the chapter has not already been scraped, sends a message to an SQS queue so the consumer can scrape its pages.
SQS Message

```json
{
  "source": "manga_livre",
  "manga": "Naruto",
  "chapter": "692"
}
```
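As a sketch of how the API side might publish this message with boto3 (the queue URL below is a placeholder assumption, not part of the repository):

```python
import json

# Hypothetical queue URL; the real one belongs to the deployed SQS queue
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/manga-scraper"

def build_scrape_message(source: str, manga: str, chapter: str) -> str:
    """Serialize the SQS message body in the format shown above."""
    return json.dumps({"source": source, "manga": manga, "chapter": chapter})

def publish(body: str) -> None:
    """Send the message to SQS (requires AWS credentials in the environment)."""
    import boto3  # imported lazily so the helper above has no AWS dependency
    sqs = boto3.client("sqs")
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=body)
```

Usage: `publish(build_scrape_message("manga_livre", "Naruto", "692"))`.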
- Get a single chapter page
GET /page
| Query Param | Type | Description |
|---|---|---|
| source | string | Required. `manga_livre` or `muito_manga` |
| manga | string | Required. manga name |
| number | string | Required. chapter number |
| page | string | Required. page number |
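Assuming the API is reachable at `http://localhost:5000` (host and port are assumptions), a `GET /page` call with the `requests` library might look like:

```python
BASE_URL = "http://localhost:5000"  # assumed host/port for the Flask API

def page_params(source: str, manga: str, number: str, page: str) -> dict:
    """Build the query parameters listed in the table above."""
    return {"source": source, "manga": manga, "number": number, "page": page}

def get_page(source: str, manga: str, number: str, page: str) -> bytes:
    """Fetch one page; assumes the endpoint returns the image bytes."""
    import requests  # third-party; imported lazily to keep page_params dependency-free
    resp = requests.get(f"{BASE_URL}/page",
                        params=page_params(source, manga, number, page))
    resp.raise_for_status()
    return resp.content
```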
- Save a single chapter page on EBS
POST /page
| Form | Type | Description |
|---|---|---|
| source | string | Required. `manga_livre` or `muito_manga` |
| manga | string | Required. manga name |
| number | string | Required. chapter number |
| page | string | Required. page number |
| image | file | Required. image file |
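A sketch of uploading a scraped page via `POST /page` as multipart form data (base URL and image path are assumptions):

```python
BASE_URL = "http://localhost:5000"  # assumed host/port for the Flask API

def page_form(source: str, manga: str, number: str, page: str) -> dict:
    """Form fields listed in the table above (all strings)."""
    return {"source": source, "manga": manga, "number": number, "page": page}

def save_page(source: str, manga: str, number: str,
              page: str, image_path: str) -> int:
    """Upload one page image; returns the HTTP status code."""
    import requests  # third-party; imported lazily
    with open(image_path, "rb") as fh:
        resp = requests.post(f"{BASE_URL}/page",
                             data=page_form(source, manga, number, page),
                             files={"image": fh})
    return resp.status_code
```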
- Get a chapter
GET /chapter
| Query Param | Type | Description |
|---|---|---|
| source | string | Required. `manga_livre` or `muito_manga` |
| manga | string | Required. manga name |
| number | string | Required. chapter number |
- Variables to be set in the repository secrets

```
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_DEFAULT_REGION=
AWS_SECURITY_GROUP=
SSH_PRIVATE_KEY=
HOSTNAME=
USERNAME=
```
- Workflow to deploy to EC2 instance
- Script to configure the EC2 instance, install Docker, update nginx, and run the container
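The secrets above would be consumed by the deploy workflow. A hypothetical fragment of such a workflow is sketched below (file name, job name, and deploy script path are assumptions; only the secret names come from the list above):

```yaml
# .github/workflows/deploy.yml (hypothetical fragment)
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to EC2 over SSH
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          AWS_DEFAULT_REGION: ${{ secrets.AWS_DEFAULT_REGION }}
        run: ./scripts/deploy.sh  # assumed deploy script path
```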
Use docker-compose to run both the API and the consumer:

```
docker-compose up --build --scale manga_consumer=10 -d
```

The `--scale manga_consumer=10` flag runs 10 consumers in parallel.
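A minimal `docker-compose.yml` consistent with that command might look like this (the `manga_consumer` service name comes from the `--scale` flag; the API service name, ports, and entrypoint are assumptions):

```yaml
# docker-compose.yml (hypothetical sketch)
version: "3.8"
services:
  manga_api:          # assumed name for the Flask API service
    build: .
    ports:
      - "5000:5000"
  manga_consumer:     # name taken from the --scale flag above
    build: .
    command: python consumer.py   # assumed consumer entrypoint
```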
This project is for study purposes only, I do not encourage piracy. If you like the manga, buy it. If you want to read it for free, go to the official website. I am not responsible for any misuse of this project.
Developed by Jean Jacques Barros