Jvardas/spark-project


Spark Project

This repository is part of a project for a master's course at the Department of Informatics and Telecommunications of the University of Athens. Details about the course can be found here, along with a description of it.

What is Spark?

Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.

For more information, see the main repository on which this project is based, available here.

Information about the Master's Course Μ111 - Big Data

The course deals with contemporary issues related to the principles and systems of Big Data management. The topics we will examine are:

  • The Map-Reduce programming model and systems such as Hadoop, HBase, and others.
  • The HDFS file storage system. The Spark and TensorFlow systems.
  • Message and streaming systems (e.g. Kafka and Samza).
  • Key-value stores.
  • Similarity detection techniques (similarity search, locality-sensitive hashing).
  • Techniques for analysing links in large graphs (PageRank, Hubs & Authorities). Clustering; recommender systems.
  • Computational advertising issues.
  • The course includes presentation and study of research topics as well as practical application of these topics.
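
As a rough illustration of the Map-Reduce programming model named in the first topic, here is a minimal single-process sketch in plain Python. It is not code from this repository or from the course, and it does not use Hadoop or Spark. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_word_count`) and the sample input are hypothetical, chosen only to show the three stages.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and yield its (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle_phase(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    """Sum the counts collected for one word."""
    return sum(values)


def run_word_count(lines):
    """Run the three stages in sequence on an in-memory list of lines."""
    pairs = map_phase(lines, word_count_mapper)
    groups = shuffle_phase(pairs)
    return reduce_phase(groups, sum_reducer)


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    print(run_word_count(["spark and hadoop", "spark streaming"]))
```

In a real cluster the map and reduce stages run in parallel on many machines and the shuffle moves data across the network. This sketch keeps everything in memory so that only the data flow is visible.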

About

Analyze the data found at https://www.kaggle.com/chicago/chicago-311-service-requests with Spark and visualize the results.
