BrooksIan/SacWomenInData

Data Science in Apache Spark

Exploring the Global Terrorism Database Dataset

Level: Moderate

Language: Scala

Requirements:

Author: Ian Brooks

Follow LinkedIn - Ian Brooks PhD

Context

Information on more than 150,000 Terrorist Attacks

The Global Terrorism Database (GTD) is an open-source database including information on terrorist attacks around the world from 1970 through 2015 (with annual updates planned for the future). The GTD includes systematic data on domestic as well as international terrorist incidents that have occurred during this time period and now includes more than 150,000 cases. The database is maintained by researchers at the National Consortium for the Study of Terrorism and Responses to Terrorism (START), headquartered at the University of Maryland. More Information

Instructions

  1. Using the provided link, please download the HDP Sandbox. For assistance, please use the following tutorial.

  2. Using the provided link, please download the Global Terrorism Database CSV file from Kaggle. Note: You will need a Kaggle account.

  3. Using the provided link, please download the Zeppelin Note.

  4. Launch HDP Sandbox.

  5. Log into Apache Ambari as User: raj_ops & Password: raj_ops

  6. In Ambari, select "Files View" and upload the GTD CSV file to the /tmp/ directory. For assistance, please use the following tutorial.

  7. In Zeppelin, import the Zeppelin Note JSON file downloaded in step 3. For assistance, please use the following tutorial.
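
Once the CSV is in /tmp/ and the note is open, a first Zeppelin paragraph might look roughly like the sketch below. It is not taken from the provided note. The helper names `loadGtd` and `attacksPerYear` are hypothetical, and the column name `iyear` is an assumption about the CSV header, so check it against `printSchema()` on your copy of the file.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Zeppelin's Spark interpreter already provides a session; getOrCreate() reuses it.
// Outside Zeppelin, add .master("local[*]") to the builder to run locally.
val session: SparkSession =
  SparkSession.builder().appName("GTDExploration").getOrCreate()

// Hypothetical helper: read a CSV with a header row and let Spark infer column types.
def loadGtd(path: String): DataFrame =
  session.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv(path)

// Hypothetical helper: count incidents per year.
// ASSUMPTION: the year column is named `iyear`.
def attacksPerYear(gtd: DataFrame): DataFrame =
  gtd.groupBy(col("iyear")).count().orderBy(col("iyear"))
```

To use it, pass the path of the file uploaded in step 6, for example `val gtd = loadGtd("/tmp/<your-file>.csv")`, then call `gtd.printSchema()` to confirm the column names and `attacksPerYear(gtd).show(50)` to print the yearly counts. Schema inference makes an extra pass over the file, which is acceptable for a dataset of this size on the Sandbox.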

License

Unlike all other Apache projects, which use the Apache License, this project uses an advanced and modern license named The Star And Thank Author License (SATA). Please see the LICENSE file for more information.
