Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feat: implement push/pull interface from JAC, file and s3 (docarray#1182
) * refactor: move streaming serialization into separate method Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: add binary io like protocol definition Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: ported push pull to JAC Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: protocol is not in 3.7 typing Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: make mypy happy Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: patch missing waterfall Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: jit import backends Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: implement cache in jinaai pull Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: add hubble dependency to jina group Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: better division of concerns Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: add concept of namespace Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: ignore missing hubble stubs Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: streaming protocol stubs Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: make more general buffered caching reader Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: add tests for hubble pushpull Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: add tests for file backend Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: remove hubble dependency from jina group This reverts commit b304421. Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: implement push pull for local filesystem Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: test concurrent pushes and pulls in file protocol Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: resolve concurrent pushes and pulls correctly Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: rename text to textdoc Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: added some logging Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: s3 tests Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: s3 pushpull Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: add smart open dependency Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: add smart opens silly python bound Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: update hubble tests (failing) Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: fix delete return in hubble pushpull Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * Revert "fix: add smart open dependency" This reverts commit cf78c6c. This reverts commit eb0e52b. Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: add hubble and smart open dependencies Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: mypy fixes Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * ci: allow tests to see jina auth token Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: add progress bars for streaming Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * style: blacken Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: buffer writes to s3 Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: mypy no like sequence Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: make progress bar quieter when disabled Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: skip failing tests Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: add tables when listing Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: add jina auth token to uncaped test Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: mock s3 tests with minio container Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: silly error that cost me 2 hours of life Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: use tolerance ratio in file tests Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: add caching to s3 pull Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: add log messages for unused parameters Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: take out unneeded buffering smart open already buffers Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: pick fastest protocol compression configuration for s3 Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: bump tolerance ratio for s3 test Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: reduce code duplication Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: put reader chunk size constant at top of file Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: reduce reader chunk size for memory tests Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: multipart uploads get stuck frequently lets just do big uploads for now... Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * docs: add docstrings to mixin and file backend Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * docs: add docstring for s3 and hubble backends Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * test: remove unused test Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: use literal in protocol Co-authored-by: samsja <55492238+samsja@users.noreply.github.com> Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: protocols dont need to be inherited Co-authored-by: samsja <55492238+samsja@users.noreply.github.com> Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: add make mypy happy with the literals Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: literals not in 3.7 Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: move mixin out of init file Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: move cache path resolution to utils Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * feat: cache path is only evaluated once Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: loading backends makes more sense as debug log Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * tests: add slow and internet marks Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: pin image tag Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: use abc instead of protocol for typing backends Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: revert - add hubble and smart open dependencies This reverts commit 1d1d2ee. Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: add hubble and aws dependencies Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: change all push pull mixin methods to class methods Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: misstyped class method self reference Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: rename pushpull to docstore and use more classmethods Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: separate remote backend implementations from mixin Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * fix: missed import refactor Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: change submodule name to store Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: remove list and delete from mixin Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * tests: clear all the garbage in ci account Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * tests: skip test that is broken on ci Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> * refactor: standardize naming to jac Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> --------- Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com> Co-authored-by: samsja <55492238+samsja@users.noreply.github.com>
- Loading branch information