An Apache Spark application that demonstrates an end-to-end use case: data acquisition, transformation, model training, and deployment.
- setting.sh: Pull a dataset of accelerometer data from here.
- input.py: Load the dataset and store it as a DataFrame.
- transfrom.py: Transform the data into Parquet files.
- train.py: Train the model with PySpark ML and save the model file.
- deploy.py: Deploy the model to Watson on IBM Cloud.
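The load, transform, and train steps above can be sketched roughly as follows. This is a minimal illustration, not the code in the scripts themselves: the column names (`x`, `y`, `z`, `class`), the choice of `RandomForestClassifier`, and the helper names `build_pipeline` and `run` are all assumptions made for the example. The PySpark imports are kept inside the functions so the module can be imported without a running JVM.

```python
# Hypothetical sketch: column names, paths, and the classifier are assumptions,
# not taken from the repository's scripts.

FEATURE_COLS = ["x", "y", "z"]  # assumed accelerometer axes
LABEL_COL = "class"             # assumed label column


def build_pipeline(feature_cols=FEATURE_COLS, label_col=LABEL_COL):
    """Index the label, assemble the feature vector, and add a classifier."""
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import RandomForestClassifier
    from pyspark.ml.feature import StringIndexer, VectorAssembler

    indexer = StringIndexer(inputCol=label_col, outputCol="label")
    assembler = VectorAssembler(inputCols=list(feature_cols), outputCol="features")
    classifier = RandomForestClassifier(labelCol="label", featuresCol="features")
    return Pipeline(stages=[indexer, assembler, classifier])


def run(csv_path, parquet_path, model_path):
    """Load a CSV, store it as Parquet, train on the Parquet data, save the model."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("accelerometer-pipeline").getOrCreate()

    # Load: read the raw dataset into a DataFrame.
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Transform: persist the DataFrame as Parquet files.
    df.write.mode("overwrite").parquet(parquet_path)

    # Train: fit the pipeline on the Parquet data and save the fitted model.
    train_df = spark.read.parquet(parquet_path)
    model = build_pipeline().fit(train_df)
    model.write().overwrite().save(model_path)

    spark.stop()
    return model_path
```

A call such as `run("data/accelerometer.csv", "data/accelerometer.parquet", "models/rf")` would exercise all three steps; those paths are placeholders.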
Exception: Java gateway process exited before sending its port number
- Make sure you have Java 8 installed (macOS):
brew tap adoptopenjdk/openjdk
brew install --cask adoptopenjdk8
- Find your Java 8 home directory, then add these two lines:
import os
os.environ['JAVA_HOME'] = "/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home"
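If you would rather not hard-code the path, one possible alternative is to ask macOS for the Java 8 home through the `/usr/libexec/java_home` utility. The sketch below assumes that utility is present (it ships with macOS); the function names `find_java8_home` and `ensure_java_home` are made up for this example. It returns `None` instead of raising when no Java 8 installation is found.

```python
import os
import subprocess


def find_java8_home(helper="/usr/libexec/java_home"):
    """Return the Java 8 home reported by macOS, or None if it cannot be found."""
    if not os.path.exists(helper):
        return None  # not macOS, or the helper is missing
    try:
        result = subprocess.run(
            [helper, "-v", "1.8"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None  # no matching JDK, or the helper could not be executed
    path = result.stdout.strip()
    return path or None


def ensure_java_home():
    """Set JAVA_HOME only if it is unset; return the value now in effect."""
    if not os.environ.get("JAVA_HOME"):
        home = find_java8_home()
        if home:
            os.environ["JAVA_HOME"] = home
    return os.environ.get("JAVA_HOME")
```

Call `ensure_java_home()` before creating the Spark session, since the Java gateway reads `JAVA_HOME` when it starts.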