Apache Spark with HDFS cluster within Kubernetes
-
Updated
Jul 11, 2023 - Python
Apache Spark with HDFS cluster within Kubernetes
Jupyter notebook server prepared for running Spark with Scala kernels on a remote Spark master
Simplified Hadoop Setup and Configuration Automation
Hadoop in docker cluster, created by docker-compose. Create Hadoop cluster in less than 5mins.
News Sentiment Analysis using ETL pipeline
Hdfs Block Storage System
Your go-to-cheatsheet to learn apache-Hadoop.
Modern Big Data Analysis: recommend which pair of United States airports should be connected with a high-speed passenger rail tunnel.
In this project we have used comments from reddit to play around with multiple functionalities of Apache Spark, HDFS and Docker.
This project sets up a Hadoop (v3.2.3) cluster on a virtual machine (Multipass) on macOS. It includes instructions for configuring HDFS, YARN, and uploading files via command-line and web interface.
An example of installation Apache Spark on AWS
Add a description, image, and links to the hdfs-cluster topic page so that developers can more easily learn about it.
To associate your repository with the hdfs-cluster topic, visit your repo's landing page and select "manage topics."