unibox provides a unified interface for common file operations.

trojblue/unibox

unibox

Installation

pip install unibox

With uv:

uv tool install unibox

If you're not using Python 3.13, it's also recommended to install pandas[performance]:

pip install "pandas[performance]"

To add or remove project dependencies:

uv add requests

uv remove requests

# after adding a new package, rerun:
make setup

Usage

Import the library:

import unibox as ub

Using the Hugging Face Backend

You can load a Hugging Face dataset directly with a URI of the form hf://{username}/{dataset_repo}:

hf_dset = ub.loads("hf://incantor/aesthetic_eagle_5category_iter99")
df = hf_dset.to_pandas()

You can also upload a processed DataFrame back to Hugging Face:

df["new_col"] = "new changes"
ub.saves(df, "hf://datatmp/updated_repo")
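
To make the hf://{username}/{dataset_repo} form concrete, here is a minimal sketch of splitting such a URI into its two parts using only the standard library. The names `HfRepoRef` and `parse_hf_uri` are hypothetical and are not part of unibox; this only illustrates the URI shape and is not how the library resolves it internally.

```python
from typing import NamedTuple


class HfRepoRef(NamedTuple):
    """Hypothetical container for the two parts of an hf:// URI."""

    username: str
    dataset_repo: str


def parse_hf_uri(uri: str) -> HfRepoRef:
    """Split 'hf://{username}/{dataset_repo}' into its two components.

    Illustrative helper only; not part of unibox.
    """
    prefix = "hf://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an hf:// URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    # Require exactly two non-empty segments: username and repo name.
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected hf://{{username}}/{{dataset_repo}}, got {uri!r}")
    return HfRepoRef(username=parts[0], dataset_repo=parts[1])


ref = parse_hf_uri("hf://datatmp/updated_repo")
print(ref.username, ref.dataset_repo)
```

Validating the URI up front like this fails fast on a malformed path before any network call is attempted.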

Dev notes

Current concerns:

  1. loads(): temporary files can accumulate in the global temp directory and fill up /tmp/; concurrent calls may also conflict with each other
  2. s3_backend: currently the only backend that accepts a directory; the other backends should do the same
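
One possible way to address the first concern, sketched with the standard library only: give each call its own uniquely named temporary directory and delete it when the call finishes. `scratch_dir` and `load_via_scratch` are hypothetical names used for illustration; this is not the current unibox implementation.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def scratch_dir(prefix: str = "unibox-") -> Iterator[Path]:
    """Yield a private temp directory and delete it on exit.

    Hypothetical helper, not part of unibox. Each call gets a uniquely
    named directory, so concurrent callers do not share files.
    """
    with tempfile.TemporaryDirectory(prefix=prefix) as tmp:
        yield Path(tmp)


def load_via_scratch(payload: bytes) -> bytes:
    """Stand-in for a loader: stage bytes in a scratch file, then read them back."""
    with scratch_dir() as tmp:
        staged = tmp / "download.bin"
        staged.write_bytes(payload)
        # The directory and the staged file are removed when the block exits.
        return staged.read_bytes()


print(load_via_scratch(b"example bytes"))
```

The trade-off is that nothing is cached between calls, so a repeated load of the same remote file would be staged again.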

To get a coverage report, run:

pytest --cov=src/unibox --cov-report=term-missing tests

To build the docs:

make docs host=0.0.0.0

# or in debug mode:
make check-docs

To make a release manually:

# prerequisite: python -m pip install build twine
python -m build
twine check dist/*
twine upload dist/*

Migrating from unibox 0.4

No longer supported:

  • ub.traverses(): the handlers and exclude_extensions parameters were removed (include_extensions still works but is deprecated in favor of exts)
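
The deprecation described above can be sketched as a small shim that accepts both parameter names and warns on the old one. `resolve_exts` is a hypothetical function and not the actual code in ub.traverses(). It also assumes that an explicit exts value takes precedence when both are given, which the notes above do not state.

```python
import warnings
from typing import List, Optional, Sequence


def resolve_exts(
    exts: Optional[Sequence[str]] = None,
    include_extensions: Optional[Sequence[str]] = None,
) -> Optional[List[str]]:
    """Return the extension filter, accepting the deprecated keyword.

    Hypothetical shim for illustration; not the unibox implementation.
    """
    if include_extensions is not None:
        warnings.warn(
            "include_extensions is deprecated; use exts instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: an explicit exts wins over the deprecated keyword.
        if exts is None:
            exts = include_extensions
    return list(exts) if exts is not None else None


print(resolve_exts(exts=[".jpg", ".png"]))
```

Callers migrating from 0.4 would then only need to rename the keyword argument at their call sites.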
