wesslen/Code-Tutorials-for-SOPHI

SOPHI Code

Introduction

This repository provides template code for running Spark on SOPHI. The code will include a mixture of Scala, PySpark and SparkR.

Code

Topics
Twitter Gnip SQL-DataFrame Manipulation with PySpark
Twitter Gnip Summary Count Files with PySpark
Twitter Gnip Latent Dirichlet Allocation with Scala

How to access SOPHI

To access SOPHI, you must have an active UNCC ID username (student, faculty or staff) and be connected to the UNCC network, either directly (eduroam) or through VPN. See this link on how to set up VPN access.

SOPHI's Hue interface is available at https://cci-hadoopm3.uncc.edu.

To start, open that link and, when prompted, enter your UNCC ID and password.
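If the Hue page does not load, the usual cause is that the machine is not on the UNCC network or VPN. The sketch below is one way to check that from a local Python interpreter before opening the browser. It is only an illustration: the helper names `host_and_port`, `can_reach` and `report` are hypothetical, and it assumes Hue is served on the default HTTPS port (443), since the URL above gives no explicit port.

```python
import socket
from urllib.parse import urlsplit

HUE_URL = "https://cci-hadoopm3.uncc.edu"  # Hue URL given in this README


def host_and_port(url):
    """Split a URL into (hostname, port); default to 443 for https, else 80."""
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def report(url=HUE_URL):
    """Build a one-line status message for the given URL (hypothetical helper)."""
    host, port = host_and_port(url)
    if can_reach(host, port):
        return f"{host}:{port} is reachable; open {url} in a browser."
    return f"{host}:{port} is not reachable; check eduroam or the VPN connection."
```

Calling `print(report())` from a local Python session prints the status line. A successful TCP connection only shows that the host is reachable; logging in still requires a valid UNCC ID and password.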

How to open a Notebook

Within SOPHI, click the "Notebook" button on the top ribbon and click the "+ Notebook" button to create a new Notebook.

Inside the new Notebook, start a PySpark or Scala session. SparkR sessions are not available yet.
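Once a PySpark session has started, a short first cell can confirm that it actually runs jobs. The sketch below assumes the session exposes a SparkContext under the name `sc`, which is common for Hue notebooks but is not confirmed here for SOPHI. The function name `spark_smoke_test` is hypothetical.

```python
def spark_smoke_test(sc, n=10):
    """Sum the squares of 0..n-1 through Spark to confirm the session runs jobs.

    `sc` is assumed to be the SparkContext provided by the notebook session.
    """
    rdd = sc.parallelize(range(n))         # distribute a small local range
    return rdd.map(lambda x: x * x).sum()  # lazy transformation, then an action
```

In a notebook cell, `spark_smoke_test(sc)` should return 285 with the default `n=10`. If the call hangs or raises an error, the session most likely did not start correctly.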

Further Links
