Cylon is a fast, scalable, distributed-memory parallel runtime with a pandas-like DataFrame.
- Updated Jun 9, 2024
- C++
TPC-H queries in Apache Spark SQL using the native DataFrame API
Java application that uses Apache Spark to handle both batch and streaming processing
mainframe - a lightweight dataframe library for C++
Apache Spark project for Advanced Topics on Databases course
A sandbox environment designed to simulate a pseudo-distributed Hadoop cluster with integrated Apache Spark and Kafka components. It allows developers to prototype and experiment with big data workflows, test distributed computing patterns, and explore cluster behavior in a contained virtual setup.
Python Skills Checkpoint
API converting NYC Department of Health: https://github.com/nychealth/coronavirus-data
Construct source files as per the target files in Spark using the DataFrame API and Spark
Semester assignment for ECE NTUA 3189 Advanced Topics in Database Systems
Soundhopper project - lets users skip ahead to specified sections of a track - built using Python and Jupyter Notebook.
Makes columnar Spark files easier to use
Analysis of American Time Use Survey (ATUS): https://www.kaggle.com/bls/american-time-use-survey