Spark local development environment

Overview

A simple setup for a local development environment using Spark and S3. The repository provides a set of container images with all the required libraries and configuration, as well as a docker-compose file to deploy the environment.

The scripts create a Spark standalone cluster with a Hive metastore and localstack as a mock S3 service. Furthermore, the Spark Master UI is exposed and configured as a proxy to access running applications.
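
Once inside the environment, a Spark session needs to point at the localstack S3 endpoint and the Hive metastore. The sketch below illustrates the kind of configuration involved; the service names, endpoint, metastore URI, and dummy credentials are assumptions for illustration, since in this repository the real values are baked into the container images' configuration.

# Sketch: a SparkSession wired to a Hive metastore and a localstack-backed S3 endpoint.
# Service names, ports, and credentials below are assumed values for illustration only;
# the actual settings come from the images' configuration.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("local-dev-example")
    # Hive metastore (assumed service name and default thrift port)
    .config("hive.metastore.uris", "thrift://hive-metastore:9083")
    .enableHiveSupport()
    # localstack mock S3 (assumed endpoint and dummy credentials)
    .config("spark.hadoop.fs.s3a.endpoint", "http://localstack:4566")
    .config("spark.hadoop.fs.s3a.access.key", "test")
    .config("spark.hadoop.fs.s3a.secret.key", "test")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)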

Get started

# start up environment
docker-compose up --build -d
# run a test script
docker-compose exec spark-edge bash
spark-submit src/test-spark.py
# clean up environment (including volumes)
docker-compose down -v
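
The contents of src/test-spark.py are defined by the repository; as a rough idea of the kind of smoke test that fits this setup, the hypothetical sketch below writes a small DataFrame to the mock S3 service and reads it back. The bucket name and path are made up for illustration.

# Hypothetical smoke test in the spirit of src/test-spark.py (not the actual script).
# Assumes the S3A connector is configured by the image and that "test-bucket" exists in localstack.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-smoke-test").enableHiveSupport().getOrCreate()

# Build a tiny DataFrame and round-trip it through the mock S3 service
df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])
df.write.mode("overwrite").parquet("s3a://test-bucket/smoke-test/")
spark.read.parquet("s3a://test-bucket/smoke-test/").show()

spark.stop()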
