PyGDF

PyGDF implements the Python interface for accessing and manipulating the GPU DataFrame of the GPU Open Analytics Initiative (GoAi). We aim to provide a simple interface that is similar to the Pandas DataFrame and hides the details of GPU programming.

Read more about GoAi and the GDF

Setup

Conda

You can get a minimal conda installation with Miniconda or get the full installation with Anaconda.

You can install and update PyGDF using the conda command:

conda install -c numba -c conda-forge -c gpuopenanalytics/label/dev -c defaults pygdf=0.1.0a2

You can create and activate a development environment using the conda command:

conda env create --name pygdf_dev --file conda_environments/testing_py35.yml
source activate pygdf_dev

Install from Source

To install PyGDF from source, clone the repository, change into the cloned directory, and run the Python install command:

git clone https://github.com/gpuopenanalytics/pygdf.git
cd pygdf
python setup.py install

Note: This will not install dependencies automatically, so it is recommended to use the conda environment.
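
Because a source install does not pull in dependencies, it can help to check which packages are importable before running anything. The snippet below is a minimal sketch using only the standard library. The helper name `missing_modules` and the list of module names passed to it are assumptions for illustration, not PyGDF's official dependency list; consult the conda environment file for the real one.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed module names, for illustration only; adjust to the actual
# requirements listed in the conda environment file.
candidates = ["pygdf", "numba", "numpy", "pandas"]
missing = missing_modules(candidates)
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```

An empty result only shows that the modules can be located; it does not confirm that a working GPU or driver is present.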

Pip

Installing with pip is not supported yet. Please use conda for the time being.

Testing

This project uses py.test.

From the source root directory, with the development environment activated, run:

py.test
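
By default, py.test collects files named `test_*.py` and runs the functions in them whose names start with `test_`. As a rough sketch of what such a file looks like, the example below uses a hypothetical file name and a hypothetical helper, `column_sum`, that is not part of PyGDF; it only shows the shape of a test.

```python
# test_example.py (hypothetical file name, not part of the PyGDF test suite)


def column_sum(values):
    """Hypothetical helper: sum a column, skipping missing (None) entries."""
    return sum(v for v in values if v is not None)


def test_column_sum_skips_missing():
    # A plain assert is enough; py.test reports the failing expression.
    assert column_sum([1, None, 2]) == 3


def test_column_sum_empty():
    assert column_sum([]) == 0
```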

Getting Started

Please see the Demo Docker Repository for example notebooks on how you can utilize the GPU DataFrame.

GPU Open Analytics Initiative

The GPU Open Analytics Initiative (GoAi) seeks to foster and develop open collaboration between GPU analytics projects and products to enable data scientists to efficiently combine the best tools for their workflows. The first project of GoAi is the GPU DataFrame (GDF), which enables tabular data to be directly exchanged between libraries and applications on the GPU.

GPU DataFrame

The GPU DataFrame is a common API that enables efficient interchange of tabular data between processes running on the GPU. End-to-end computation on the GPU avoids unnecessary copying and converting of data off the GPU, reducing compute time and cost for high-performance analytics common in artificial intelligence workloads. The GPU DataFrame uses the Apache Arrow columnar data format on the GPU. Currently, only a subset of Arrow's features is supported.
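
To make the columnar idea concrete, the sketch below mimics on the CPU how an Arrow-style column of integers with missing values can be stored: one contiguous buffer of values plus a validity bitmap in which bit `i` (least-significant bit first) is 1 when entry `i` is present. This is an illustration in plain Python, not PyGDF's or Arrow's actual API; the function names `build_column`, `is_valid`, and `to_list` are made up for this example.

```python
from array import array


def build_column(values):
    """Pack a list that may contain None into (data, validity bitmap, null count)."""
    # Contiguous buffer of 64-bit integers; missing slots hold a placeholder 0.
    data = array("q", (0 if v is None else v for v in values))
    # One bit per entry, rounded up to whole bytes.
    bitmap = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v is not None:
            bitmap[i // 8] |= 1 << (i % 8)  # set bit i, least-significant bit first
    null_count = sum(v is None for v in values)
    return data, bitmap, null_count


def is_valid(bitmap, i):
    """Return True when entry i is marked present in the validity bitmap."""
    return bool((bitmap[i // 8] >> (i % 8)) & 1)


def to_list(data, bitmap):
    """Rebuild the original list, restoring None for missing entries."""
    return [data[i] if is_valid(bitmap, i) else None for i in range(len(data))]


data, bitmap, null_count = build_column([10, None, 30, 40, None])
print(to_list(data, bitmap), "nulls:", null_count)
```

Keeping values and validity in separate flat buffers is what lets libraries hand a column to one another without converting it row by row.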
