robinhood-jim/GenericFileDB

Common Generic DataFile DB V1.0

Aims to ingest many kinds of data files (formats including CSV, JSON, XML, Avro, ORC, Parquet, Protobuf, and Apache Arrow) and to filter, group, and order their contents using plain SQL, without first loading the data into a database or a Hadoop filesystem. Data files can be ingested from local disk, HDFS, Apache VFS, AWS S3, Google Cloud Storage, MinIO, Aliyun, Tencent COS, Baidu BOS, Huawei OBS, and other sources. Files smaller than 4 GB can be processed without being written to a temporary path. ORC, Parquet, and Arrow binary files larger than 4 GB must be downloaded first.
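To make the "filter, group, and order with plain SQL" idea concrete, here is a minimal sketch of what such a query means when applied directly to file contents. It does not use this project's API, which is not documented here. The class name `CsvQuerySketch`, the sample data, and the query are all hypothetical, and the SQL is emulated with standard Java streams on an in-memory CSV string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustration only: NOT the GenericFileDB API. It emulates, with plain JDK 11
// streams, the kind of query the README describes running over a data file.
public class CsvQuerySketch {

    // Hypothetical sample data; the first line is the header.
    static final String CSV = String.join("\n",
            "city,amount",
            "Paris,10",
            "Berlin,5",
            "Paris,7",
            "Berlin,1",
            "Rome,2");

    // Emulates:
    //   SELECT city, SUM(amount) AS total
    //   FROM csv WHERE amount > 1
    //   GROUP BY city ORDER BY total DESC
    static List<Map.Entry<String, Integer>> totalsByCity(String csv) {
        Map<String, Integer> totals = csv.lines()
                .skip(1)                                   // drop the header row
                .map(line -> line.split(","))              // naive split, no quoting support
                .filter(f -> Integer.parseInt(f[1]) > 1)   // WHERE amount > 1
                .collect(Collectors.groupingBy(
                        f -> f[0],                         // GROUP BY city
                        TreeMap::new,
                        Collectors.summingInt(f -> Integer.parseInt(f[1]))));

        List<Map.Entry<String, Integer>> rows = new ArrayList<>(totals.entrySet());
        rows.sort(Map.Entry.<String, Integer>comparingByValue().reversed()); // ORDER BY total DESC
        return rows;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> row : totalsByCity(CSV)) {
            System.out.println(row.getKey() + "," + row.getValue());
        }
    }
}
```

Nothing is written to a database or a temporary file at any point: the rows are read, filtered, aggregated, and sorted in memory, which is the behavior the description above claims for files under 4 GB.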

    Development Environment
            JDK 11 or above
            Maven 3.8 or above
