futureg-lab/mx-scraper

mx-scraper

Download image galleries or metadata from the web.

This rewrite is expected to support the previous implementation's metadata format.

The main idea was to separate the core (mx-scraper) from user-defined plugins, which was not possible in previous implementations.

Usage

# pip install beautifulsoup4

# Plugins can be specified with -p or --plugin
# By default, the plugin is inferred from the arguments
# Each plugin may have its own set of dependencies, independent of mx-scraper
# Uses bs4
mx-scraper fetch --plugin images https://www.google.com
# Uses gallery-dl
mx-scraper fetch --meta-only -v https://x.com/afmikasenpai/status/1901323062949159354
mx-scraper fetch -p gallery-dl https://x.com/afmikasenpai/status/1901323062949159354

# Alternatively, when batching terms that target different sources/plugins, a prefix (e.g. an id or name) is often required
# The prefix is plugin specific (refer to plugin_name/__init__.py :: mx_is_supported)
mx-scraper fetch --meta-only -v img:https://www.google.com https://mto.to/series/68737
mx-scraper fetch --meta-only -v nh:177013
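To illustrate how a prefix drives plugin inference, here is a minimal sketch of a plugin-side support check, modeled on the `mx_is_supported` hook referenced above. The exact signature and prefix handling are assumptions; refer to an actual plugin's `__init__.py` for the real contract.

```python
# Hypothetical sketch of a plugin's term-support check (the real hook
# lives in plugin_name/__init__.py as mx_is_supported; this signature
# is an assumption for illustration).

PREFIX = "img:"

def mx_is_supported(term: str) -> bool:
    """Return True if this plugin should handle the given term."""
    # A prefixed term explicitly targets this plugin.
    if term.startswith(PREFIX):
        return True
    # Otherwise, try to infer support from the term itself (plain URLs).
    return term.startswith("http://") or term.startswith("https://")

print(mx_is_supported("img:https://www.google.com"))  # True
```

A batched invocation can then mix terms: each plugin's check runs against every term, and the prefix disambiguates when several plugins could match the same URL.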

Commands

mx-scraper engine

Usage: mx-scraper <COMMAND>

Commands:
  fetch        Fetch a sequence of terms
  fetch-files  Fetch a sequence of terms from a collection of files
  request      Request a url
  infos        Display various informations
  server       Spawn a graphql server interfacing mx-scraper
  help         Print this message or the help of the given subcommand(s)

Options:
  -h, --help  Print help

Each fetch strategy shares the same configuration.

Features

  • CLI

    • Fetch a list of terms
    • Fetch a list of terms from a collection of files
    • Generic URL Request
      • Print as text
      • Download (--dest flag)
    • Authentications (Basic, Bearer token)
  • Cookies

    • Loading from a file (Netscape format, key-value)
    • Loading from the config (key-value)
  • Http Client/Downloader

    • Support of older mx-scraper book schema
    • Download
    • Cache support (can be disabled with --no-cache or from config)
    • Configurable Http Client (default, Flaresolverr, cfworker)
  • Plugins

    • Python plugin
      • MxRequest with runtime context (headers, cookies, auth)
    • gallery-dl extractors
    • Subprocess (e.g. imgbrd-grabber)
  • Send context from an external source (e.g. browser)

    • Cookies, UA (through --listen-cookies, will open a callback url that can receive a FetchContext object)
    • Rendered HTML page
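The Netscape cookie format mentioned above is a plain-text format with seven tab-separated fields per line. As a rough illustration (not mx-scraper's actual loader), a minimal parser looks like this:

```python
# Minimal sketch of parsing the Netscape cookie-file format (seven
# tab-separated fields: domain, subdomain flag, path, secure flag,
# expiry, name, value). Illustrative only, not mx-scraper's loader.

def parse_netscape_cookies(text: str) -> dict[str, str]:
    """Return a {name: value} mapping from Netscape-format cookie text."""
    cookies = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) == 7:
            domain, flag, path, secure, expiry, name, value = fields
            cookies[name] = value
    return cookies

sample = ".example.com\tTRUE\t/\tFALSE\t0\tsession\tabc123"
print(parse_netscape_cookies(sample))  # {'session': 'abc123'}
```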

GraphQL server

You can also use the extractors through GraphQL queries. You will have the same options as the command-line interface.

Usage: mx-scraper server [OPTIONS]

Options:
      --port <PORT>  Server port
  -h, --help         Print help
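A client talks to the server by POSTing a standard GraphQL JSON payload. The sketch below only builds such a payload; the query shape, field names, and endpoint path are invented for illustration (the actual schema is exposed by the server's playground).

```python
import json

# Hypothetical GraphQL request payload a client would POST to the
# server (e.g. to http://localhost:<PORT>/graphql -- the path is an
# assumption). The query and its fields are invented for illustration;
# consult the playground for the real schema.
query = """
query Fetch($term: String!) {
  fetch(term: $term) {
    title
    urls
  }
}
"""

payload = json.dumps({
    "query": query,
    "variables": {"term": "nh:177013"},
})
print(payload)
```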

Playground Screenshot
