[apache#2059] feat(client-python): Support Gravitino Virtual FileSystem in Python (apache#3528)

### What changes were proposed in this pull request?

Support the Gravitino Virtual File System (GVFS) in Python so that
Fileset storage data can be read and written. This first PR supports only HDFS.

Several popular cloud storage services and vendors already provide their
own FileSystem implementations based on
fsspec (https://filesystem-spec.readthedocs.io/en/latest/index.html):
1. S3 (https://github.com/fsspec/s3fs)
2. Azure (https://github.com/fsspec/adlfs)
3. GCS (https://github.com/fsspec/gcsfs)
4. OSS (https://github.com/fsspec/ossfs)
5. Databricks (https://github.com/fsspec/filesystem_spec/blob/master/fsspec/implementations/dbfs.py)
6. Snowflake (https://github.com/snowflakedb/snowflake-ml-python)

This PR therefore implements GVFS on top of the fsspec interface.
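
For readers unfamiliar with fsspec, here is a minimal sketch of what building a filesystem on the fsspec interface can look like. It is not the code added by this PR: the class name `PrefixMappedFileSystem`, the protocol name `demo-vfs`, and the `/virtual` and `/data/fileset` paths are all hypothetical, and fsspec's in-memory filesystem stands in for HDFS. The sketch subclasses `AbstractFileSystem`, translates a virtual path into a path on a backing filesystem, and delegates `ls` and `_open` to it.

```python
from fsspec import AbstractFileSystem
from fsspec.implementations.memory import MemoryFileSystem


class PrefixMappedFileSystem(AbstractFileSystem):
    """Illustrative only: exposes a backing filesystem under a virtual prefix.

    Every name in this class is hypothetical; this is not Gravitino's GVFS.
    """

    protocol = "demo-vfs"  # hypothetical protocol name
    cachable = False       # skip fsspec's instance cache in this demo

    def __init__(self, backing_fs, virtual_root, actual_root, **kwargs):
        super().__init__(**kwargs)
        self._fs = backing_fs
        self._virtual_root = virtual_root.rstrip("/")
        self._actual_root = actual_root.rstrip("/")

    def _to_actual(self, path):
        # Map a virtual path onto the backing filesystem's path.
        path = self._strip_protocol(path)
        root = self._virtual_root
        if path != root and not path.startswith(root + "/"):
            raise ValueError(f"{path!r} is outside {root!r}")
        return self._actual_root + path[len(root):]

    def _to_virtual(self, path):
        # Map a backing path back into the virtual namespace.
        path = "/" + path.lstrip("/")
        return self._virtual_root + path[len(self._actual_root):]

    def ls(self, path, detail=True, **kwargs):
        entries = self._fs.ls(self._to_actual(path), detail=True, **kwargs)
        out = [{**e, "name": self._to_virtual(e["name"])} for e in entries]
        return out if detail else [e["name"] for e in out]

    def _open(self, path, mode="rb", **kwargs):
        # Delegate the actual I/O to the backing filesystem.
        return self._fs.open(self._to_actual(path), mode, **kwargs)


# Usage: the in-memory filesystem plays the role of the real storage.
backing = MemoryFileSystem()
backing.pipe_file("/data/fileset/a.txt", b"hello")

vfs = PrefixMappedFileSystem(backing, "/virtual", "/data/fileset")
with vfs.open("/virtual/a.txt", "rb") as f:
    content = f.read()
names = vfs.ls("/virtual", detail=False)
```

Because callers only see the fsspec interface, the backing store can in principle be swapped (for example for an HDFS-backed fsspec filesystem) without changing code that uses `open` or `ls`.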

### Why are the changes needed?

Fix: apache#2059 

### How was this patch tested?

Added unit tests (UTs) and integration tests (ITs).

---------

Co-authored-by: xiaojiebao <xiaojiebao@xiaomi.com>
2 people authored and shaofengshi committed Jun 24, 2024
1 parent 5c15214 commit e2e84b1
Showing 8 changed files with 1,761 additions and 1 deletion.
4 changes: 4 additions & 0 deletions LICENSE.bin
@@ -355,6 +355,7 @@
XNIO API
WildFly
Confluent Kafka Streams Examples
Apache Arrow

This product bundles various third-party components also under the
Apache Software Foundation License 1.1
@@ -382,6 +383,7 @@
ParaNamer
RE2/J
ZSTD JNI
fsspec

This product bundles various third-party components also under the
MIT license
@@ -393,6 +395,8 @@
Protocol Buffers
Treelayout
Kyligence/kylinpy
elarivie/pyReaderWriterLock
tkem/cachetools

This product bundles various third-party components also under the
Common Development and Distribution License 1.0
1 change: 1 addition & 0 deletions clients/client-python/gravitino/__init__.py
@@ -13,3 +13,4 @@
from gravitino.client.gravitino_admin_client import GravitinoAdminClient
from gravitino.client.gravitino_metalake import GravitinoMetalake
from gravitino.name_identifier import NameIdentifier
from gravitino.filesystem import gvfs
4 changes: 4 additions & 0 deletions clients/client-python/gravitino/filesystem/__init__.py
@@ -0,0 +1,4 @@
"""
Copyright 2024 Datastrato Pvt Ltd.
This software is licensed under the Apache License version 2.
"""