Ephemeral Hadoop clusters using Google Compute Platform
-
Updated
Mar 31, 2022 - Java
Ephemeral Hadoop clusters using Google Compute Platform
Creating an Inverted Index of words occurring in a large set of documents extracted from web pages using Hadoop MapReduce and Google Dataproc
Trino Autoscaler on Dataproc automates the scaling of Dataproc cluster based on real-time resource utilization by Trino workloads
Working examples for some components on GCP, and instructions on how to run them.
Determination of which words occur in a dataset of textbooks along with each word's occurrence count identification with the help of Google Cloud Platform based Dataproc cluster formation.
Add a description, image, and links to the dataproc topic page so that developers can more easily learn about it.
To associate your repository with the dataproc topic, visit your repo's landing page and select "manage topics."