ZenW00kie/koalas

Welcome to koalas, the wrapper on the wrapper for Spark. A real iteration of this should circumvent PySpark altogether, but for now we use it as our base. We've found that the PySpark syntax takes some learning and that some of the functionality from pandas doesn't necessarily exist there, so koalas looks to tackle that problem.

NOTE: This assumes you are using Databricks, which gives you a SparkSession by default. We will endeavor to add support for other environments later.

Setup:

git clone http://prdbitbucket.saccap.int/scm/gdwn/koalas.git
cd koalas
python setup.py bdist_egg

Copy the resulting .egg file from the dist directory to Databricks.

To do:

  • Cache results as each step executes rather than chaining calls (this is going to be tricky).
  • Call show() automatically when a result is returned.
  • Handle conversion of types.
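
As a rough illustration of the show()-on-return item, here is a minimal sketch. It is not koalas code: AutoShowFrame and StubFrame are hypothetical names made up for this example, and the stub only stands in for a Spark DataFrame so the snippet runs without a cluster. The only assumption about the wrapped object is that it exposes show(), and limit(n) is used for the demo.

```python
class AutoShowFrame:
    """Hypothetical wrapper (not part of koalas) around a DataFrame-like object.

    Method calls are delegated to the wrapped object. When a call returns
    something that has show(), the result is shown and re-wrapped so that
    chaining keeps working.
    """

    def __init__(self, df, auto_show=True):
        self._df = df
        self._auto_show = auto_show

    def __getattr__(self, name):
        # Only reached when normal lookup fails. Refuse private names so a
        # half-initialised instance cannot recurse forever.
        if name.startswith("_"):
            raise AttributeError(name)
        attr = getattr(self._df, name)
        if not callable(attr):
            return attr

        def wrapped(*args, **kwargs):
            result = attr(*args, **kwargs)
            if hasattr(result, "show"):
                if self._auto_show:
                    result.show()
                return AutoShowFrame(result, self._auto_show)
            # Plain values (counts, None, ...) pass through untouched.
            return result

        return wrapped


class StubFrame:
    """Stand-in for a Spark DataFrame, used only to make the sketch runnable."""

    def __init__(self, rows, log):
        self.rows = rows
        self.log = log  # records every show() call

    def limit(self, n):
        return StubFrame(self.rows[:n], self.log)

    def count(self):
        return len(self.rows)

    def show(self):
        self.log.append(list(self.rows))


if __name__ == "__main__":
    log = []
    frame = AutoShowFrame(StubFrame([1, 2, 3, 4], log))
    top = frame.limit(2)      # show() runs on the intermediate result
    print(log, top.count())
```

One thing to keep in mind: in PySpark show() is an action, so showing every intermediate result would run each step as its own job. That is probably where the caching item above comes in.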
