oldvis_dataset

A Python package for downloading metadata and images of old visualizations in oldvis/dataset.

Installation

pip install oldvis_dataset

Usage Example

Downloading metadata of visualizations:

from oldvis_dataset import visualizations
visualizations.download(path="./visualizations.json")

Downloading images:

from oldvis_dataset import visualizations, fetch_images
visualizations.download(path="./visualizations.json")
fetch_images(metadata_path="./visualizations.json", img_dir="./images/")

Downloading images with a filtering condition:

import json
from oldvis_dataset import visualizations, fetch_images
metadata = visualizations.load()
# Download public domain images.
metadata = [d for d in metadata if d["rights"] == "public domain"]
# fetch_images reads metadata from a file, so save the filtered entries first.
path = "./visualizations.json"
with open(path, "w", encoding="utf-8") as f:
    json.dump(metadata, f, ensure_ascii=False)
fetch_images(metadata_path=path, img_dir="./images/")

API

oldvis_dataset.visualizations

oldvis_dataset.visualizations.download(path: str) -> None

Request the metadata of visualizations and store it at path. Each stored metadata entry follows the data structure ProcessedMetadataEntry (Source).

visualizations.download(path="./visualizations.json")
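
Once the file is saved, it can be read back with the standard library. The following is a minimal sketch, assuming the saved file is a JSON array of entries (the filtering example under Usage Example writes such a list with json.dump and passes it to fetch_images). The helper name read_metadata is hypothetical and not part of oldvis_dataset.

```python
import json
from typing import Any, Dict, List


def read_metadata(path: str) -> List[Dict[str, Any]]:
    """Read a metadata file such as the one saved by visualizations.download().

    Assumption: the file holds a JSON array of objects. Anything else
    raises ValueError instead of being returned silently.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError(
            f"expected a JSON array in {path}, got {type(data).__name__}"
        )
    return data


# Usage (after visualizations.download(path="./visualizations.json")):
#   entries = read_metadata("./visualizations.json")
#   print(len(entries))
```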

oldvis_dataset.visualizations.load() -> List

Request the metadata of visualizations without saving.

data = visualizations.load()
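
Before choosing a filtering condition, it can help to see which values a field actually takes. The sketch below counts the values of one field across the loaded entries. It assumes each entry is a dict, as the d["rights"] lookup in the filtering example suggests. The helper count_field and the sample entries (including the value "unknown") are made up for illustration.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_field(entries: Iterable[Dict[str, Any]], field: str = "rights") -> Counter:
    """Count how often each value of `field` occurs.

    Entries that lack the field are counted under None.
    Values must be hashable (strings, numbers, ...).
    """
    return Counter(entry.get(field) for entry in entries)


# Stand-in entries for illustration only; real ones would come from
# visualizations.load(), which needs network access.
sample = [
    {"rights": "public domain"},
    {"rights": "public domain"},
    {"rights": "unknown"},
]
print(count_field(sample).most_common())
# → [('public domain', 2), ('unknown', 1)]
```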

oldvis_dataset.authors

oldvis_dataset.authors.download(path: str) -> None

Request the metadata of authors and store it at path.

authors.download(path="./authors.json")

oldvis_dataset.authors.load() -> List

Request the metadata of authors without saving.

data = authors.load()

oldvis_dataset.fetch_images(metadata_path: str, img_dir: str) -> None

Fetch the images at the URLs listed in the visualization metadata stored at metadata_path, and store them in img_dir.

fetch_images(metadata_path="./visualizations.json", img_dir="./images/")

⚠️ Image fetching can be slow.
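
Since fetching can be slow, one option is a trial run on a few entries before fetching everything. fetch_images reads its input from a file, so the sketch below writes a truncated copy of the metadata that can then be passed as metadata_path. The helper write_subset, the file name ./subset.json, and the default limit of 10 are assumptions for illustration, not part of oldvis_dataset.

```python
import json
from typing import Any, Dict, List


def write_subset(entries: List[Dict[str, Any]], path: str, limit: int = 10) -> int:
    """Write the first `limit` entries to `path` as JSON.

    Returns the number of entries actually written, which is smaller
    than `limit` when fewer entries are available.
    """
    if limit < 0:
        raise ValueError("limit must be non-negative")
    subset = entries[:limit]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(subset, f, ensure_ascii=False)
    return len(subset)


# Usage (requires network access and the oldvis_dataset package):
#   from oldvis_dataset import visualizations, fetch_images
#   write_subset(visualizations.load(), "./subset.json", limit=10)
#   fetch_images(metadata_path="./subset.json", img_dir="./images/")
```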

oldvis_dataset.save_as_bib(metadata_path: str, bib_path: str) -> None

Convert the fetched metadata at metadata_path to BibTeX and store the result at bib_path.

save_as_bib(metadata_path="./visualizations.json", bib_path="./visualizations.bib")