Subtitle scraper and publisher - Academic project - HEIG-VD 2016.
Out of the box it supports the following features:
- Amazon EC2 instance creation and handling, with a 24h IP blacklist
- Subtitle download over SSH, incrementing an error count on redirection
- MovieCollection aggregator
- IMDb and opensubtitles.org scraper; fetches the YouTube video ID of the movie trailer
- Database seed from an opensubtitles.org list
- Create and update methods for WordPress posts
- Handlebars template for WordPress posts
- Publish subtitles according to a website scope stored in the database
- Image and zip file storage in the database
- WordPress posts retrieve subtitles from the database
- Comprehensive logs
- Error handling
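
The 24h IP blacklist from the feature list can be pictured with a small sketch. This is only an illustration under assumptions: the `IpBlacklist` class, its method names and the in-memory `Map` are hypothetical and not taken from the project, which may well store blacklisted addresses in MongoDB instead.

```javascript
// Hypothetical sketch: names and structure are assumptions, not the project's code.
const DAY_MS = 24 * 60 * 60 * 1000;

class IpBlacklist {
  constructor(ttlMs = DAY_MS) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // ip -> timestamp (ms) at which it was blacklisted
  }

  // Record an IP as blacklisted starting at `now`.
  add(ip, now = Date.now()) {
    this.entries.set(ip, now);
  }

  // True while the IP is inside its blacklist window; expired entries are dropped.
  isBlacklisted(ip, now = Date.now()) {
    const since = this.entries.get(ip);
    if (since === undefined) return false;
    if (now - since >= this.ttlMs) {
      this.entries.delete(ip);
      return false;
    }
    return true;
  }
}

const blacklist = new IpBlacklist();
blacklist.add('203.0.113.7', 0);
console.log(blacklist.isBlacklisted('203.0.113.7', DAY_MS - 1)); // true
console.log(blacklist.isBlacklisted('203.0.113.7', DAY_MS));     // false
```

Passing the current time as a parameter keeps the expiry logic easy to test without waiting a full day.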
Check out the WordPress demo! More than 10,000 movies posted: http://149.202.172.22
This project has been deployed on an Ubuntu 16.04 virtual server, which should now be fully compatible.
Database: MongoDB 3.2
PHP: 7.x
projectdir$ npm install app.js
A small override is needed to avoid a casting issue when custom ids are used. In lib/collection, replace line 53:

    function (str) {
      if (null == str) return this.col.id();
      return 'string' == typeof str ? this.col.id(str) : str;
    };

with this:

    function (str) { return str; };

- YouTube API key (Google dev) for YouTube trailer matching
- WordPress user/password (JSON Basic Authentication needed)
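
To see why the id override described above helps with custom ids, here is a hedged sketch. It assumes that `this.col.id(str)` tries to cast a string to a MongoDB ObjectId and fails on strings that are not 24 hex characters; `fakeCol`, `originalId`, `overriddenId` and the sample id are stand-ins invented for this illustration, not the real driver or project code.

```javascript
// Stand-in for the driver collection: `id` mimics an ObjectId cast that only
// accepts 24-character hex strings (an assumption made for this illustration).
const fakeCol = {
  id(str) {
    if (!/^[0-9a-fA-F]{24}$/.test(str)) {
      throw new Error('cannot cast "' + str + '" to ObjectId');
    }
    return { $oid: str.toLowerCase() };
  },
};

// Behaviour of the original function quoted above.
function originalId(str) {
  if (null == str) return this.col.id();
  return 'string' == typeof str ? this.col.id(str) : str;
}

// Behaviour after the override: ids are passed through untouched.
function overriddenId(str) {
  return str;
}

const collection = { col: fakeCol };
const customId = 'tt0111161-fr'; // hypothetical custom string id

let castFailed = false;
try {
  originalId.call(collection, customId);
} catch (err) {
  castFailed = true;
}
console.log(castFailed);                              // true
console.log(overriddenId.call(collection, customId)); // tt0111161-fr
```

The trade-off is that, with the override in place, string ids are never converted to ObjectIds, so any lookup by ObjectId has to be given an ObjectId explicitly.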
A regular, up-to-date WordPress install is sufficient, but the following plugins are required:
- WP REST API
- JSON Basic Authentication
- MCE Table Buttons (just for design)
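
As a rough illustration of how the WP REST API and JSON Basic Authentication plugins work together, the sketch below builds (without sending) a request that creates or updates a post. The helper `buildPostRequest`, the example site URL, the credentials and the post fields are all hypothetical; the `/wp-json/wp/v2/posts` route is the one used by version 2 of the WP REST API, and the project may target a different version.

```javascript
// Hypothetical helper: builds (but does not send) a WP REST API request.
// Passing `postId` targets an existing post (update); omitting it creates one.
function buildPostRequest(siteUrl, user, password, post, postId) {
  const base = siteUrl.replace(/\/+$/, '') + '/wp-json/wp/v2/posts';
  // HTTP Basic Authentication: base64 of "user:password".
  const token = Buffer.from(user + ':' + password).toString('base64');
  return {
    url: postId === undefined ? base : base + '/' + postId,
    method: 'POST',
    headers: {
      Authorization: 'Basic ' + token,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(post),
  };
}

const req = buildPostRequest(
  'http://example.com/',
  'user',
  'pass',
  { title: 'Some movie (French subtitles)', content: '<p>...</p>', status: 'publish' }
);
console.log(req.url);                   // http://example.com/wp-json/wp/v2/posts
console.log(req.headers.Authorization); // Basic dXNlcjpwYXNz
```

Basic Authentication sends the credentials with every request in an easily decoded form, so it is best used over HTTPS or on a trusted network.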
Also, to download subtitles from MongoDB in PHP 7, beware of the native driver change:
```php
\MongoClient -> \MongoDB\Client
\MongoCollection -> \MongoDB\Collection
```
By default, the app counts the iterations of its main loop and calls finished() when the count equals 1:
if (count == 1) finished();
templateFr.html -> Handle the case where there is no image
awsManager -> ? if blacklist
zipZupload -> Handle error if no files
dbSeeder:csv2json -> Random? conversion error
dbSeeder -> Huge file handling is not done here
lib/database -> Update and clean lib, implement $upsert, handle collection "subtitlesToHandleManualy"