Quick Guide - iScrap - Subtitle Scraper & More

iScrap

Subtitle scraper and publisher - Academic project - HEIG-VD 2016

Out of the box it supports the following features:

Amazon EC2 instance creation and handling, with a 24h IP blacklist
Subtitle download over SSH, incrementing an error count on redirection
MovieCollection aggregator
IMDb and opensubtitles.org scraper; also gets the YouTube video ID for the movie trailer
Database seed from the opensubtitles.org list
Create and update methods for WordPress posts
Handlebars template for WordPress posts
Publishes subtitles according to a website scope stored in the database
Image and zip file storage in the database
WordPress posts retrieve subtitles from the database
Comprehensive logs
Error handling
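
As an illustration of the 24h IP blacklist mentioned in the feature list, here is a minimal sketch of an in-memory blacklist whose entries expire after 24 hours. The class name `IpBlacklist`, its methods, and the sample IP are hypothetical and are not taken from the iScrap source; the real project may well store this state elsewhere (for example in MongoDB).

```javascript
// Hypothetical helper: none of these names come from the iScrap source.
const DAY_MS = 24 * 60 * 60 * 1000;

class IpBlacklist {
  constructor(ttlMs = DAY_MS) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // ip -> timestamp (ms) at which it was blacklisted
  }

  // Record an IP as blacklisted at the given time.
  add(ip, now = Date.now()) {
    this.entries.set(ip, now);
  }

  // True while the IP is still inside its blacklist window.
  isBlacklisted(ip, now = Date.now()) {
    const since = this.entries.get(ip);
    if (since === undefined) return false;
    if (now - since >= this.ttlMs) {
      this.entries.delete(ip); // expired: forget the entry
      return false;
    }
    return true;
  }
}

const blacklist = new IpBlacklist();
blacklist.add('203.0.113.7', 0);
console.log(blacklist.isBlacklisted('203.0.113.7', DAY_MS - 1)); // true
console.log(blacklist.isBlacklisted('203.0.113.7', DAY_MS));     // false
```

Passing the current time as a parameter keeps the expiry logic easy to check without waiting a full day.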

Demo

Check out the WordPress demo, with more than 10,000 movies posted: http://149.202.172.22

Tech specs

This project has been deployed on an Ubuntu 16.04 virtual server, which should be fully compatible.
Database: MongoDB 3.2
PHP: 7.x

Installation

projectdir$ npm install
projectdir$ node app.js

Monkii override

A small override is needed to avoid a casting issue when custom IDs are used. In lib/collection, replace the function at line 53:

function (str) {
  if (null == str) return this.col.id();
  return 'string' == typeof str ? this.col.id(str) : str;
};

With this:

function (str) { return str; };
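
To show why the override matters, the sketch below compares both versions of the function against a stand-in collection. `fakeCol`, `originalId`, `overriddenId`, and `tryCast` are hypothetical names, and the stand-in `id()` only approximates a driver cast by rejecting anything that is not a 24-character hex string; the sample ID `tt0111161` is just an example of a custom string ID.

```javascript
// Stand-in for the driver collection. Assumption for illustration only:
// id() accepts 24-character hex strings and rejects everything else.
const fakeCol = {
  id(str) {
    if (str === undefined) return 'generated-id';
    if (!/^[0-9a-f]{24}$/i.test(str)) {
      throw new Error('cannot cast "' + str + '" to an ObjectID');
    }
    return { objectId: str };
  },
};

// Original behaviour: strings are cast, other values pass through.
function originalId(str) {
  if (null == str) return fakeCol.id();
  return 'string' == typeof str ? fakeCol.id(str) : str;
}

// Overridden behaviour: the value is returned untouched.
function overriddenId(str) {
  return str;
}

// Run a function and report a thrown error as a string instead of crashing.
function tryCast(fn, value) {
  try {
    return fn(value);
  } catch (e) {
    return 'error: ' + e.message;
  }
}

console.log(tryCast(originalId, 'tt0111161'));   // error: cannot cast "tt0111161" to an ObjectID
console.log(tryCast(overriddenId, 'tt0111161')); // tt0111161
```

Note that the override also stops generating an ID when none is passed, so every document needs an explicit ID.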

Credential

YouTube API key (Google developer account) for YouTube trailer matching
WordPress user/password (the JSON Basic Authentication plugin is required)
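
A minimal sketch of how the WordPress credentials could be turned into a Basic Authentication header for a post-creation request. The helper name `buildCreatePostRequest`, the site URL, and the user/password values are placeholders, and the `/wp-json/wp/v2/posts` route assumes version 2 of the WP REST API; the project's own code may build its requests differently. The function only builds the request description and does not send it.

```javascript
// Hypothetical helper; the site URL, user and password below are placeholders.
function buildCreatePostRequest(siteUrl, user, password, post) {
  // Basic Authentication: base64 of "user:password".
  const token = Buffer.from(user + ':' + password).toString('base64');
  return {
    // Assumes the WP REST API v2 route for posts.
    url: siteUrl.replace(/\/+$/, '') + '/wp-json/wp/v2/posts',
    method: 'POST',
    headers: {
      Authorization: 'Basic ' + token,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(post),
  };
}

const request = buildCreatePostRequest('http://example.com/', 'user', 'pass', {
  title: 'Some movie',
  content: '<p>Subtitle details</p>',
  status: 'publish',
});
console.log(request.url);                   // http://example.com/wp-json/wp/v2/posts
console.log(request.headers.Authorization); // Basic dXNlcjpwYXNz
```

Basic Authentication sends the password with every request in an easily decoded form, so it is best limited to HTTPS or to a trusted network.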

WordPress installation

A regular, up-to-date WordPress installation is sufficient, but the following plugins are required:
WP REST API
JSON Basic Authentication
MCE Table Buttons (just for design)
Also, to download subtitles from MongoDB in PHP 7, beware of the native driver change:
```php

\MongoClient -> \MongoDB\Client
\MongoCollection -> \MongoDB\Collection
```

Usage

By default, the app counts the iterations of its main loop and finishes after the first one:

if (count == 1) finished();
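
The loop counter can be pictured with the following sketch, in which the number of iterations becomes a parameter. `runLoop`, `runOnce`, and the `maxLoops` argument are hypothetical and only illustrate the idea; the real main loop in app.js is not shown in this guide.

```javascript
// Hypothetical main loop: runOnce and finished are placeholders,
// not the application's real functions.
function runLoop(maxLoops, runOnce, finished) {
  let count = 0;
  function next() {
    // runOnce receives the iteration index and a callback to call when done.
    runOnce(count, function done() {
      count += 1;
      if (count >= maxLoops) return finished(count); // stop after maxLoops iterations
      next();
    });
  }
  next();
}

runLoop(
  1,
  function (index, done) {
    console.log('loop ' + index); // loop 0
    done();
  },
  function (total) {
    console.log('finished after ' + total + ' loop(s)'); // finished after 1 loop(s)
  }
);
```

With `maxLoops` set to 1 this matches the default check above; a larger value would let the app process more batches before stopping.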

ToDo

templateFr.html -> handle if no image
awsManager -> ? if blacklist
zipZupload -> Handle error if no files
dbSeeder:csv2json -> Random? conversion error
dbSeeder -> Huge file handling is not done here
lib/database -> Update and clean lib, implement $upsert, handle collection "subtitlesToHandleManualy"

nscheuner/ISCRAP