add spark-scala-quickstart #148
Closed
NiloFreitas wants to merge 2 commits into GoogleCloudDataproc:master from NiloFreitas:spark-scala-quickstart
Member
Hi @NiloFreitas thanks for the quickstart. It seems that many other files were added by mistake, among them existing notebooks, codelabs, etc. Can you please verify that only the relevant files are in the PR?
Author
Hi @davidrabinowitz. What files do you mean? I could not find what you are referring to.
This code can help a lot! ;-)
Dataproc - Spark Scala Quickstart is an effort to assist in the creation of Spark jobs written in Scala to run on Dataproc.
It provides different pre-implemented Spark jobs and technical guides to run them on GCP.
It is all based on the WordCount ETL example with common sources and sinks (Kafka, GCS, BigQuery, etc.).
It demonstrates how to run Spark jobs using Dataproc Submit, Serverless, and Workflow, and how to orchestrate them with Cloud Composer.
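
For orientation, a WordCount ETL job of the kind described above might look roughly like the following. This is a minimal sketch, not the code from this PR: the object name `WordCountSketch`, the `tokenize` helper, the two-argument layout, and the CSV sink are all assumptions made for illustration. It uses only the standard Spark `Dataset` API.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical job object; the name and argument layout are illustrative only.
object WordCountSketch {

  // Pure tokenizer, kept separate from Spark so it can be unit-tested on its own.
  def tokenize(line: String): Seq[String] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq

  def main(args: Array[String]): Unit = {
    require(args.length == 2, "usage: WordCountSketch <inputPath> <outputPath>")
    val Array(inputPath, outputPath) = args

    val spark = SparkSession.builder().appName("WordCountSketch").getOrCreate()
    import spark.implicits._

    // Extract: read text lines. Transform: split into words and count per word.
    val counts = spark.read.textFile(inputPath)
      .flatMap(line => tokenize(line))
      .groupByKey(word => word)
      .count()
      .toDF("word", "count")

    // Load: write the result as CSV (assumed sink; overwrites existing output).
    counts.write.mode("overwrite").csv(outputPath)

    spark.stop()
  }
}
```

On a Dataproc cluster the two arguments would typically be `gs://` paths, since the Cloud Storage connector is normally available there. Swapping the source or sink for Kafka or BigQuery would require the corresponding connectors, which this sketch does not cover.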