Provides a very simple REST API and subscription system. You call the first endpoint with two parameters: the URL to which updates will be sent, and a list of the data you already have. This starts the collection of data from web pages. Once parsing is complete, the data is sent to the URL passed in the request parameters.
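The receiving side of the flow described above could be sketched as follows. This is only an illustration under stated assumptions: it assumes the service delivers updates as an HTTP POST with a JSON body, which this README does not confirm, and the names `UpdateReceiver`, `make_server`, `received` and the port 9000 are hypothetical and not part of this project.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory store for every payload received so far.
received = []


class UpdateReceiver(BaseHTTPRequestHandler):
    """Accepts updates, assuming they arrive as POST requests with a JSON body."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            received.append(json.loads(body))
            self.send_response(200)
        except ValueError:
            # Body was not valid JSON (or not valid UTF-8).
            self.send_response(400)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def make_server(port: int = 9000) -> HTTPServer:
    """Build a receiver bound to localhost; the port is an arbitrary choice."""
    return HTTPServer(("127.0.0.1", port), UpdateReceiver)
```

Calling `make_server().serve_forever()` would start listening; the address of that server is what you would pass as the update URL in the request.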
# Clone repo
git clone <url of this repo>
# Change dir to cloned repository
cd <repo name>
# Install dependencies
poetry install --no-root
Replace the if __name__ == '__main__' section in this file:
- bubble_parser/api.py
Replace the port in these files:
- tests/server_receiver.py
- tests/test_api.py
- With Docker
docker build -t bubble_parser .
docker run -p 8000:8000 bubble_parser
- Without Docker
# Activate virtualenv
poetry shell
# Run microservice
fastapi run bubble_parser/api.py
- pdfminer-six - working with PDF files
- aiohttp - asynchronous HTTP requests for the parser
- aiofiles - asynchronous file I/O
- fastapi - REST API framework
- and others
Collects 7320 lines of JSON data in 17 seconds.
// On success
{
"ok": true,
"message": "the proccess finish successfully",
"data": {
"articolul_10": [
{
"list_name": "1170P",
"number_order": "(9977/2022)",
"year": 2023,
"date": "03.07.2023"
},
{
"list_name": "1170P",
"number_order": "(9994/2022)",
"year": 2023,
"date": "03.07.2023"
},
{
"list_name": "1170P",
"number_order": "(10064/2022)",
"year": 2023,
"date": "03.07.2023"
}
],
"articolul_11": [
{
"list_name": "1170P",
"number_order": "(9972/2022)",
"year": 2023,
"date": "03.07.2023"
},
{
"list_name": "1170P",
"number_order": "(9975/2022)",
"year": 2023,
"date": "03.07.2023"
},
{
"list_name": "1170P",
"number_order": "(9978/2022)",
"year": 2023,
"date": "03.07.2023"
},
{
"list_name": "1170P",
"number_order": "(9997/2022)",
"year": 2023,
"date": "03.07.2023"
}
]
}
},
// Error
{
"ok": false,
"message": "<exception message>",
"data": "<raw exception data>"
},
// Wrong result
{
"ok": false,
"message": "Wrong result",
"data": "<raw result data>"
}
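A client receiving these payloads might tell the three shapes shown above apart along these lines. This is a minimal sketch, not part of the project: the function name `summarize_callback` is hypothetical, and it assumes each payload arrives as a single JSON object with the `ok`, `message` and `data` fields shown in the examples.

```python
import json
from typing import Any


def summarize_callback(raw_body: str) -> str:
    """Classify one payload as success, wrong result, or error."""
    payload: dict[str, Any] = json.loads(raw_body)

    if payload.get("ok"):
        # On success, "data" maps a section name (e.g. "articolul_10")
        # to a list of records.
        data = payload.get("data", {})
        total = sum(len(records) for records in data.values())
        return f"success: {total} records in {len(data)} sections"

    if payload.get("message") == "Wrong result":
        return "wrong result: inspect the raw data"

    # Any other failure carries the exception message.
    return f"error: {payload.get('message')}"
```

For the success example above, which has three records under `articolul_10` and four under `articolul_11`, this would report 7 records in 2 sections.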