Processing IT Recruitment Data in an HDFS Cluster with Spark, Elasticsearch, and Kibana, deployed via Docker Compose
Updated Jan 13, 2022 - Python
An easy-to-use script that automatically adds dependency files to the spark-submit command.
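A minimal sketch of what such a helper might look like: a pure-Python function that assembles a spark-submit argument list from an entry-point script, extra `--py-files` dependencies, and `--conf` settings. The function name and signature are assumptions for illustration, not the actual script from the repository above.

```python
def build_spark_submit(app, py_files=None, conf=None):
    """Assemble a spark-submit invocation as an argument list.

    app      -- the entry-point .py script
    py_files -- extra .py/.zip/.egg dependencies passed via --py-files
    conf     -- dict of Spark properties passed via --conf key=value

    Hypothetical helper; mirrors standard spark-submit flags only.
    """
    cmd = ["spark-submit"]
    if py_files:
        cmd += ["--py-files", ",".join(py_files)]
    for key, value in (conf or {}).items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    return cmd

# Example:
# build_spark_submit("etl.py", py_files=["utils.py"],
#                    conf={"spark.executor.memory": "2g"})
```

The list form can be handed directly to `subprocess.run`, which avoids shell-quoting issues with comma-joined file lists.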
The primary objective of this study is to explore the feasibility of using machine learning algorithms to classify health insurance plans based on their coverage for routine dental services. To achieve this, I used six classification algorithms: logistic regression (LR), decision tree (DT), random forest (RF), gradient-boosted trees (GBT), SVM, and factorization machines (FM). (Tech: PySpark, SQL, Databricks, Zeppelin notebooks, Hadoop, spark-submit)
A PySpark-based ETL pipeline that extracts transaction data from a MySQL database, cleans and transforms it, aggregates monthly sales per customer, and writes the processed data to an S3 bucket in Parquet format.
Running a Python engineering file with spark-submit.
Movie Recommendation using Apache Spark MLlib
Simple Spark environment setup on Windows.