asdf93074/Octo

Octo

A (WIP) URL crawler + scraper, adapted from a script written to crawl goodreads.com.

Quick Start

Initialize a Crawler instance with:

  • Datasource to keep track of URLs. A built-in one for Redis is included, though you should provide its credentials in a more secure manner
  • Storage to store the parsed results
  • Parser, made of ParseStep instances, to control how the webpage should be accessed and parsed
  • Array of ParseNode instances to select specific details to be parsed from the HTML of the page

and then do:

async with crawler:
    await crawler.start()
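
The `async with` pattern above can be illustrated with a minimal, self-contained sketch. It does not use Octo's actual classes, whose constructors and signatures are not documented here. `InMemoryDatasource`, `ToyCrawler`, `fake_fetch`, and `parse_title` are hypothetical names invented for illustration, and the plain `dict` only stands in for a Storage. For the real API, see the examples folder.

```python
import asyncio


class InMemoryDatasource:
    """Hypothetical stand-in for a Datasource: tracks pending and seen URLs."""

    def __init__(self):
        self.pending = []
        self.seen = set()

    def add(self, url):
        # Queue a URL only the first time it is seen.
        if url not in self.seen:
            self.seen.add(url)
            self.pending.append(url)

    def next(self):
        # Return the next queued URL, or None when the queue is empty.
        return self.pending.pop(0) if self.pending else None


class ToyCrawler:
    """Hypothetical crawler; NOT Octo's Crawler class."""

    def __init__(self, datasource, storage, fetch, parse):
        self.datasource = datasource
        self.storage = storage  # a dict standing in for a Storage
        self.fetch = fetch      # async callable: url -> html
        self.parse = parse      # callable: html -> parsed result
        self.is_open = False

    async def __aenter__(self):
        # A real crawler might open network or database connections here.
        self.is_open = True
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Release resources even if start() raised; do not swallow errors.
        self.is_open = False
        return False

    async def start(self):
        if not self.is_open:
            raise RuntimeError("use the crawler inside 'async with'")
        # Drain the datasource, storing one parsed result per URL.
        while (url := self.datasource.next()) is not None:
            html = await self.fetch(url)
            self.storage[url] = self.parse(html)


async def fake_fetch(url):
    # Assumption: returns canned HTML instead of making a network request.
    return f"<title>page for {url}</title>"


def parse_title(html):
    # Naive title extraction, sufficient for the canned HTML above.
    return html.split("<title>")[1].split("</title>")[0]


async def main():
    datasource = InMemoryDatasource()
    datasource.add("https://example.com/a")
    datasource.add("https://example.com/a")  # duplicate, ignored
    datasource.add("https://example.com/b")
    storage = {}
    crawler = ToyCrawler(datasource, storage, fake_fetch, parse_title)
    async with crawler:
        await crawler.start()
    return storage


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Wrapping `start()` in `async with` gives the crawler a defined place to acquire and release resources such as connections, which is presumably why the library exposes it this way.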

Example

Check the examples folder for a simple way to use this library until docs are written.
