Merge pull request #1 from NoamNol/dev
Save youtube comments to db and find links
NoamNol authored Aug 12, 2021
2 parents 50acd3d + 1aabcff commit 593e317
Showing 30 changed files with 892 additions and 0 deletions.
7 changes: 7 additions & 0 deletions .env
@@ -0,0 +1,7 @@
ASYNC_WORKERS=13

MONGODB_FLASK_USERNAME=mongodbuser
MONGODB_FLASK_PASSWORD=temp_password

MONGODB_ADMIN_USERNAME=mongodbadmin
MONGODB_ADMIN_PASSWORD=temp_admin_password
99 changes: 99 additions & 0 deletions README.md
@@ -1,2 +1,101 @@
# Findevil
Find evil content in YouTube comments

## Get started
```powershell
# On Linux, replace 'set' with 'export'
set YOUTUBE_API_KEY=<your_key>
docker-compose up -d --build
```

Before running `docker-compose up`, you can set additional optional settings:
```powershell
# On Linux, replace 'set' with 'export'
set ASYNC_WORKERS=<number>
set MONGODB_FLASK_USERNAME=<name>
set MONGODB_FLASK_PASSWORD=<password>
set MONGODB_ADMIN_USERNAME=<name>
set MONGODB_ADMIN_PASSWORD=<password>
```

### Send a request
> You can use tools like [Postman](https://www.postman.com/) to send HTTP requests.

Scan all comments of a YouTube video:
<br>
`http://127.0.0.1:5001/youtube/videos/<video_id>/comments` `[PUT]`

Or with `max` parameter:
<br>
`http://127.0.0.1:5001/youtube/videos/<video_id>/comments?max=<max>` `[PUT]`
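
As an alternative to Postman, the request can also be built from Python. The following is a minimal sketch using only the standard library. `build_scan_request` is a hypothetical helper name, and `BASE_URL` assumes the default `5001` port mapping from `docker-compose.yml`.

```python
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: the Flask container is reachable on the default mapped port.
BASE_URL = "http://127.0.0.1:5001"


def build_scan_request(video_id: str, max_comments: Optional[int] = None) -> Request:
    """Build (but do not send) the PUT request that scans a video's comments."""
    url = f"{BASE_URL}/youtube/videos/{quote(video_id, safe='')}/comments"
    if max_comments is not None:
        # Append the optional `max` query parameter.
        url += "?" + urlencode({"max": max_comments})
    return Request(url, method="PUT")


request = build_scan_request("abc123", max_comments=50)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` sends the request once the containers are running.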

### Connect to MongoDB
Now you can connect to the database and view the new data in the `flaskdb` database.
<br>
The connection string is:
`mongodb://mongodbadmin:temp_admin_password@localhost:27019`

> Replace `mongodbadmin` and `temp_admin_password` if you set a custom `MONGODB_ADMIN_USERNAME` or `MONGODB_ADMIN_PASSWORD`.
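
If a custom password contains characters such as `@` or `:`, they need to be percent-encoded inside the connection string. A small sketch of building the string in Python follows. `mongodb_uri` is a hypothetical helper, and the default port `27019` is the host port mapped in `docker-compose.yml`.

```python
from urllib.parse import quote_plus


def mongodb_uri(username: str, password: str,
                host: str = "localhost", port: int = 27019) -> str:
    """Return a MongoDB connection string with percent-encoded credentials."""
    return f"mongodb://{quote_plus(username)}:{quote_plus(password)}@{host}:{port}"


print(mongodb_uri("mongodbadmin", "temp_admin_password"))
# mongodb://mongodbadmin:temp_admin_password@localhost:27019
```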

### Clean the database
To delete the users and data in MongoDB, run:
```
docker-compose down
docker volume rm findevil_mongodbdata
```

Now you can start again with a fresh database and run `docker-compose up` as described above.

## Development
### Requirements
- Python 3.9
- Docker

### Build and run MongoDB for development
```powershell
cd mongo
docker build -t mongo-findevil:latest .
docker run --name mongodb-dev -p 27018:27017 --env-file .env.dev -d mongo-findevil:latest
```

Use port `27018` to connect to the `dev` database:
<br>
`mongodb://mongodbadmin:temp_admin_password@localhost:27018`

### Build and run Flask

Windows:
```powershell
cd flask
copy .env.dev .env
# edit your .env file...
py -m venv env
.\env\Scripts\activate
pip install -r requirements-dev.txt
flask run
```

Linux:
```shell
cd flask
cp .env.dev .env
# edit your .env file...
python3 -m venv env
source env/bin/activate
pip install -r requirements-dev.txt
flask run
```

Use port `5000` to send requests to the `dev` server:
<br>
`http://127.0.0.1:5000/youtube/videos/<video_id>/comments` `[PUT]`

### Debugging
To debug Flask, use the VS Code debugger instead of `flask run`.

## Todo
- Find evil content in comments.
- Performance: Find links in comments by using another service.
- Performance: Consider using [Aiogoogle](https://github.com/omarryhan/aiogoogle) for better async.
- Performance: Add load balancing to handle multiple requests.

50 changes: 50 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,50 @@
version: '3.7'

services:
  flask:
    build: ./flask
    container_name: flask
    image: flask-findevil:latest
    restart: unless-stopped
    environment:
      YOUTUBE_API_KEY: ${YOUTUBE_API_KEY}
      ASYNC_WORKERS: ${ASYNC_WORKERS}
      MONGODB_DATABASE: flaskdb
      MONGODB_USERNAME: ${MONGODB_FLASK_USERNAME}
      MONGODB_PASSWORD: ${MONGODB_FLASK_PASSWORD}
      MONGODB_HOSTNAME: mongodb
      MONGODB_PORT: 27017 # default port, used inside the network
    depends_on:
      - mongodb
    ports:
      - 5001:5000
    networks:
      - backend

  mongodb:
    build: ./mongo
    container_name: mongodb
    image: mongo-findevil:latest
    restart: unless-stopped
    environment:
      MONGO_INITDB_ROOT_USERNAME: ${MONGODB_ADMIN_USERNAME}
      MONGO_INITDB_ROOT_PASSWORD: ${MONGODB_ADMIN_PASSWORD}
      MONGO_INITDB_DATABASE: flaskdb
      flaskdbUser: ${MONGODB_FLASK_USERNAME}
      flaskdbPwd: ${MONGODB_FLASK_PASSWORD}
      MONGODB_DATA_DIR: /data/db
      MONGODB_LOG_DIR: /dev/null
    volumes:
      # see https://stackoverflow.com/questions/54911021/unable-to-start-docker-mongo-image-on-windows
      - mongodbdata:/data/db
    networks:
      - backend
    ports:
      - 27019:27017

networks:
  backend:
    driver: bridge
volumes:
  mongodbdata:
    driver: local
7 changes: 7 additions & 0 deletions flask/.dockerignore
@@ -0,0 +1,7 @@
.vscode/
env/
__pycache__/
*.py[cod]
.env
.gitignore
Dockerfile
11 changes: 11 additions & 0 deletions flask/.env.dev
@@ -0,0 +1,11 @@
FLASK_ENV=development

YOUTUBE_API_KEY=A...z

ASYNC_WORKERS=13

MONGODB_DATABASE=flaskdb
MONGODB_USERNAME=mongodbuser
MONGODB_PASSWORD=temp_password
MONGODB_HOSTNAME=localhost
MONGODB_PORT=27018
5 changes: 5 additions & 0 deletions flask/.flake8
@@ -0,0 +1,5 @@

[flake8]
ignore = E402, E123, W504
exclude = .git,__pycache__,docs/conf.py,old,build,dist,env
max-line-length = 100
File renamed without changes.
23 changes: 23 additions & 0 deletions flask/.vscode/launch.json
@@ -0,0 +1,23 @@
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Flask",
            "type": "python",
            "request": "launch",
            "module": "flask",
            "env": {
                "FLASK_APP": "app.py",
                "FLASK_ENV": "development"
            },
            "args": [
                "run",
                "--no-debugger"
            ],
            "jinja": true
        }
    ]
}
5 changes: 5 additions & 0 deletions flask/.vscode/settings.json
@@ -0,0 +1,5 @@
{
    "python.linting.pylintEnabled": true,
    "python.linting.enabled": true,
    "python.linting.flake8Enabled": true
}
20 changes: 20 additions & 0 deletions flask/Dockerfile
@@ -0,0 +1,20 @@
FROM python:3.9-slim-buster

# Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE=1

# Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED=1

WORKDIR /app
COPY . /app

# Install pip requirements
RUN python -m pip install -r requirements.txt

# Creates a non-root user with an explicit UID and adds permission to access the /app folder
# For more info, please refer to https://aka.ms/vscode-docker-python-configure-containers
RUN adduser -u 5678 --disabled-password --gecos "" appuser && chown -R appuser /app
USER appuser

CMD ["flask", "run", "--host", "0.0.0.0"]
4 changes: 4 additions & 0 deletions flask/app.py
@@ -0,0 +1,4 @@
from dotenv import load_dotenv
load_dotenv()

from flaskr import app # noqa
Empty file added flask/contentapi/__init__.py
Empty file.
10 changes: 10 additions & 0 deletions flask/contentapi/content_type.py
@@ -0,0 +1,10 @@
from enum import Enum


class ContentType(Enum):
    '''
    All types of content in Findevil.
    The value is the name of the corresponding table/collection in the db.
    '''
    YOUTUBE_COMMENT = 'youtube'
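
Because each member's value doubles as a collection name, a lookup can go in either direction. The sketch below repeats the enum so it runs on its own; `collection_name` is a hypothetical helper, not part of this commit.

```python
from enum import Enum


class ContentType(Enum):
    # Repeated from flask/contentapi/content_type.py so the example is self-contained.
    YOUTUBE_COMMENT = 'youtube'


def collection_name(content_type: ContentType) -> str:
    """Hypothetical helper: return the db collection that stores this content type."""
    return content_type.value


# Member -> collection name
print(collection_name(ContentType.YOUTUBE_COMMENT))

# Collection name -> member (raises ValueError for an unknown name)
print(ContentType('youtube') is ContentType.YOUTUBE_COMMENT)
```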
41 changes: 41 additions & 0 deletions flask/contentapi/helpers.py
@@ -0,0 +1,41 @@
from collections import Counter


def normalize_params(params_data: list[dict]) -> dict[str, str]:
    '''
    1. Remove missing parameters or use default value
    2. Force no more than one parameter in group (For example, only one 'filter' parameter)
    3. Fix value with a middleware function
    4. Validate parameters value
    5. Convert value to string with simple str(value), or with custom toString function
    '''
    result = {}
    groups_counter = Counter()

    for p in params_data:
        name = p['name']
        value = p['value']
        group = p.get('group')
        middleware = p.get('middleware')
        validator = p.get('validator')

        if value is None:
            default = p.get('default')
            if default is not None:
                value = default
            else:
                continue
        if group:
            groups_counter[group] += 1
            if groups_counter[group] > 1:
                raise ValueError(f"Must specify exactly one {group} parameter")
        if middleware:
            value = middleware(value)
        if validator:
            message, is_valid = validator
            if (not is_valid(value)):
                raise ValueError(f"{name}:{value} is invalid parameter ({message})")

        to_string = p.get('toString', str)
        result[name] = to_string(value)
    return result
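
A short usage sketch may make the expected shape of `params_data` clearer. The function body is repeated from the diff above (without its docstring) so the block runs on its own. The parameter names (`videoId`, `maxResults`, `pageToken`, `textFormat`) and the validator are illustrative assumptions, not taken from this commit.

```python
from collections import Counter


def normalize_params(params_data: list[dict]) -> dict[str, str]:
    # Repeated from flask/contentapi/helpers.py so the example is self-contained.
    result = {}
    groups_counter = Counter()

    for p in params_data:
        name = p['name']
        value = p['value']
        group = p.get('group')
        middleware = p.get('middleware')
        validator = p.get('validator')

        if value is None:
            default = p.get('default')
            if default is not None:
                value = default
            else:
                continue
        if group:
            groups_counter[group] += 1
            if groups_counter[group] > 1:
                raise ValueError(f"Must specify exactly one {group} parameter")
        if middleware:
            value = middleware(value)
        if validator:
            message, is_valid = validator
            if (not is_valid(value)):
                raise ValueError(f"{name}:{value} is invalid parameter ({message})")

        to_string = p.get('toString', str)
        result[name] = to_string(value)
    return result


params = normalize_params([
    {'name': 'videoId', 'value': 'abc123'},
    # No value given, so the default is used and then validated.
    {'name': 'maxResults', 'value': None, 'default': 100,
     'validator': ('must be between 1 and 100', lambda v: 1 <= v <= 100)},
    # No value and no default: the parameter is dropped.
    {'name': 'pageToken', 'value': None},
    # The middleware cleans the value before it is stringified.
    {'name': 'textFormat', 'value': ' plainText ', 'middleware': str.strip},
])
print(params)
# {'videoId': 'abc123', 'maxResults': '100', 'textFormat': 'plainText'}
```

Two entries with a non-`None` value that share the same `group` raise `ValueError`, which is how the at-most-one rule from the docstring is enforced.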
Empty file.
8 changes: 8 additions & 0 deletions flask/contentapi/youtube/base.py
@@ -0,0 +1,8 @@
from abc import ABC


class BaseApi(ABC):
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.api_service_name = "youtube"
        self.api_version = "v3"
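
The subclasses that build on `BaseApi` are not shown in this part of the diff, so the following is only a guess at how the three attributes might be combined. `CommentThreadsApi` and `list_url` are hypothetical names. The URL follows the public YouTube Data API v3 `commentThreads` endpoint, and the key is a dummy value.

```python
from abc import ABC
from urllib.parse import urlencode


class BaseApi(ABC):
    # Repeated from flask/contentapi/youtube/base.py so the example is self-contained.
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.api_service_name = "youtube"
        self.api_version = "v3"


class CommentThreadsApi(BaseApi):
    """Hypothetical subclass; not part of the commit shown here."""

    def list_url(self, video_id: str, max_results: int = 100) -> str:
        # Build the query string for a commentThreads.list call.
        query = urlencode({
            "part": "snippet",
            "videoId": video_id,
            "maxResults": max_results,
            "key": self.api_key,
        })
        return (f"https://www.googleapis.com/{self.api_service_name}/"
                f"{self.api_version}/commentThreads?{query}")


api = CommentThreadsApi(api_key="DUMMY_KEY")
print(api.list_url("abc123", max_results=20))
```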