This project is part of BGU University's Distributed System Programming course, Assignment 2.
The project implements a MapReduce algorithm.
It is implemented in Java, using Amazon Web Services (AWS) and the Hadoop framework.
Instructions Assignment 2
In this assignment you will generate a knowledge base for a Hebrew word-prediction system, based on
the Google 3-Gram Hebrew dataset, using Amazon Elastic MapReduce (EMR).
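To give a feel for the kind of aggregation such a knowledge base starts from, here is a minimal plain-Java sketch (no Hadoop) that sums occurrences per 3-gram across years. It is not the project's actual step code. The record layout (tab-separated, with the 3-gram first, the year second and the occurrence count third) is an assumption about the dataset, and the class name `TrigramCountSketch` and the Latin sample words are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TrigramCountSketch {

    // Sums the occurrence counts of every 3-gram over all years.
    // Assumed record layout (hypothetical): "w1 w2 w3<TAB>year<TAB>occurrences<TAB>..."
    static Map<String, Long> countTrigrams(List<String> records) {
        Map<String, Long> totals = new TreeMap<>();
        for (String record : records) {
            String[] fields = record.split("\t");
            if (fields.length < 3) {
                continue; // malformed record: not enough columns
            }
            String[] words = fields[0].trim().split("\\s+");
            if (words.length != 3) {
                continue; // not a 3-gram
            }
            long occurrences;
            try {
                occurrences = Long.parseLong(fields[2].trim());
            } catch (NumberFormatException e) {
                continue; // occurrence column is not a number
            }
            // "Reduce" side: accumulate the count under the normalized 3-gram key.
            totals.merge(String.join(" ", words), occurrences, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample records, not taken from the real dataset.
        List<String> sample = Arrays.asList(
                "a b c\t1990\t3\t2\t1",
                "a b c\t1991\t4\t3\t2",
                "a b d\t1990\t5\t4\t3",
                "broken line");
        System.out.println(countTrigrams(sample)); // prints {a b c=7, a b d=5}
    }
}
```

In a real EMR job the parsing would sit in a mapper and the summation in a reducer (or combiner); the sketch only shows the logic on an in-memory list.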
Outputs Examples
- Configure AWS credentials on your machine.
- Create an S3 bucket with the name specified at `App` line 25.
- Create a `jar` for each step (5 steps). When creating a JAR file, ensure that the `META-INF/MANIFEST.MF` file specifies the appropriate main class.
- Using the file system, rename the jars to `Step1`, `Step2`, ... (exact names).
- In the S3 bucket, create a `jars` folder.
- Upload the jars to `<bucketName>/jars`.
- For the demo: the `arbix.txt` file is in `<bucketName>`. This file is used as an example input.
- Run `App`.
- Output will be in `<bucketName>/outputs/` after a successful run.
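Before uploading, it can help to confirm that each step jar really declares a main class in its manifest. The following is a minimal sketch using only the standard `java.util.jar` API; the class name `JarMainClassCheck` is hypothetical and not part of this project.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

class JarMainClassCheck {

    // Returns the Main-Class value from META-INF/MANIFEST.MF,
    // or null if the jar has no manifest or no Main-Class entry.
    static String mainClassOf(Path jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the paths of the step jars as command-line arguments.
        for (String arg : args) {
            String mainClass = mainClassOf(Paths.get(arg));
            System.out.println(arg + " -> "
                    + (mainClass == null ? "NO Main-Class entry" : mainClass));
        }
    }
}
```

Run it with the paths of the five jars as arguments; any jar reported without a `Main-Class` entry should be rebuilt before it is uploaded.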
Bucket Structure At Start:
Bucket Jars Structure At Start:

Note: make sure that the S3 bucket does not contain an output or log folder before the run.
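The pre-run check in the note can be sketched as a simple prefix test over a listing of the bucket's object keys. This is only an illustration: how the listing is obtained is left out, the class name `PreRunBucketCheck` is hypothetical, and the prefixes `outputs/` and `logs/` as well as the sample keys are assumptions (only `outputs/` appears in the steps above; adjust them to the folder names `App` actually uses).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PreRunBucketCheck {

    // Returns every object key that sits under one of the reserved prefixes,
    // i.e. leftovers from a previous run that should be deleted first.
    static List<String> leftoverKeys(List<String> objectKeys, List<String> reservedPrefixes) {
        List<String> leftovers = new ArrayList<>();
        for (String key : objectKeys) {
            for (String prefix : reservedPrefixes) {
                if (key.startsWith(prefix)) {
                    leftovers.add(key);
                    break; // one matching prefix is enough
                }
            }
        }
        return leftovers;
    }

    public static void main(String[] args) {
        // Made-up bucket listing for illustration only.
        List<String> keys = Arrays.asList(
                "jars/Step1", "arbix.txt", "outputs/part-r-00000", "logs/run.txt");
        // Assumed folder names; change them to match the real bucket layout.
        List<String> reserved = Arrays.asList("outputs/", "logs/");

        List<String> leftovers = leftoverKeys(keys, reserved);
        if (leftovers.isEmpty()) {
            System.out.println("Bucket is clean, safe to run.");
        } else {
            System.out.println("Delete before running: " + leftovers);
        }
    }
}
```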
