|
3 | 3 |
|
4 | 4 | ## Introduction
|
5 | 5 | This repository contains source code for the assignments of Udacity's course, [Introduction to Hadoop and MapReduce](https://www.udacity.com/course/ud617), which was unveiled on 15th November, 2013.<br>
|
6 |
| -This is a short course by Cloudera guys in association with Udacity. Instructors are Sarah Sproehnle and Ian Wrigley, both from Cloudera and Gundega Dekena, Course Developer is from Udacity.<br> |
| 6 | +This is a short course by [Cloudera](http://www.cloudera.com) in association with [Udacity](https://www.udacity.com). The instructors for this course are Sarah Sproehnle and Ian Wrigley, both from Cloudera, and Gundega Dekena, Course Developer at Udacity.<br> |
7 | 7 |
|
8 | 8 | The course does not mandate any programming language for writing Hadoop MapReduce jobs, but the instructors mainly taught Hadoop MapReduce jobs using `Python` [i.e. with the Hadoop Streaming approach for running jobs].<br>
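
A minimal sketch of what a Hadoop Streaming job written in Python can look like: the mapper and the reducer are plain programs that read lines from standard input and write tab-separated key/value lines to standard output. The input layout (tab-separated, key in the first column, numeric value in the second), the function names, and the single-script `map` / `reduce` switch are all assumptions made for illustration; this is not the course's code, nor the code in this repository.

```python
import sys
from itertools import groupby


def map_lines(lines, key_col=0, value_col=1, sep="\t"):
    """Mapper step: yield (key, value) string pairs, skipping malformed rows.

    The column positions are hypothetical; adjust them to the real input.
    """
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) <= max(key_col, value_col):
            continue  # row has too few columns
        try:
            float(fields[value_col])
        except ValueError:
            continue  # value is not numeric
        yield fields[key_col], fields[value_col]


def reduce_pairs(pairs):
    """Reducer step: sum the values of each key.

    Pairs must arrive sorted by key, which is what Hadoop's shuffle and
    sort phase provides between the mapper and the reducer.
    """
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(float(value) for _, value in group)


def main(argv):
    mode = argv[1] if len(argv) > 1 else ""
    if mode == "map":
        for key, value in map_lines(sys.stdin):
            print("%s\t%s" % (key, value))
    elif mode == "reduce":
        split = (line.rstrip("\n").split("\t", 1) for line in sys.stdin)
        pairs = (pair for pair in split if len(pair) == 2)
        for key, total in reduce_pairs(pairs):
            print("%s\t%s" % (key, total))


if __name__ == "__main__":
    main(sys.argv)
```

Outside Hadoop, the same flow can be approximated with a shell pipeline such as `cat input.tsv | python streaming_sketch.py map | sort | python streaming_sketch.py reduce`, where both file names are hypothetical and `sort` stands in for the shuffle and sort phase.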
|
9 | 9 |
|
10 |
| -I have developed Hadoop MapReduce code for the 2 problem statements [3 questions each] in 2 programming languages; `Python` and `Java`.<br> |
| 10 | +I have developed Hadoop MapReduce code for the 2 problem statements [3 questions each] in 2 programming languages: `Python` as well as `Java`.<br> |
11 | 11 |
|
12 |
| -## Instructions for Virtual Machine download / setup |
13 |
| -Please refer [instructions document](IntroductiontoHadoopandMapreduce-VMsetup.doc) provided by Course Instructors for details on the setup required for running these examples.<br> |
14 |
| -As mentioned in the above document, Virtual Machine image with Hadoop installed and preconfigured, can be downloaded from [Udacity website](http://content.udacity-data.com/courses/ud617/Cloudera-Udacity-Training-VM-4.1.1.c.zip). |
| 12 | +## Instructions for Virtual Machine download and setup |
| 13 | +Please refer to the [instructions document](IntroductiontoHadoopandMapreduce-VMsetup.doc) provided by the Course Instructors for details on the Hadoop Virtual Machine [VM henceforth] setup required for running these examples.<br> |
| 14 | +As mentioned in the above document, the VM image, with Hadoop installed and preconfigured, can be downloaded from the [Udacity CDN](http://content.udacity-data.com/courses/ud617/Cloudera-Udacity-Training-VM-4.1.1.c.zip). |
15 | 15 |
|
16 |
| -Please be forewarned, the size of this compressed VM archive is 1.7 GB. And it does not uncompress with either 7-Zip or Windows default Zip utility. Please use WinRAR or WinZip or even Cygwin unzip to uncompress the same, if you are on Windows. On other Operating Systems, probably `unzip` command might work just fine. Uncompressed size of this VM is 4.2 GB. |
| 16 | +Please be forewarned that the compressed VM archive is 1.7 GB. It also does not uncompress with either 7-Zip or the default Windows Zip utility. If you are on a Windows platform, you might have to use WinRAR, WinZip, or even Cygwin `unzip` to uncompress it. On other Operating Systems, the `unzip` command will probably work just fine. The uncompressed size of this VM is 4.2 GB. |
17 | 17 |
|
18 |
| -Credentials to login to this Virtual Machine are: `training` / `training`. You will not need `root` access for any of the assignments of this Course. |
| 18 | +Credentials to log in to this Virtual Machine are: `training` / `training`. You will not need `root` access for any of the assignments of this Course. In case you do need it, the password for `root` is `training`. |
19 | 19 |
|
20 |
| -Please ensure that you configure the VM to at least 1.5 GB of RAM. It might run much better with 2 GB though. |
| 20 | +Please ensure that you configure the VM with at least 1.5 GB of RAM in [VMware Player](https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_player/6_0). It might run much better with 2 GB though. I have used VMware Player v5.0.2; the latest version as of this writing [i.e. 28th November, 2013] is v6.0.1. |
21 | 21 |
|
22 | 22 | ## Data
|
23 | 23 | ### Input Files
|
24 |
| -Input files for the problem statements [ProblemStatement#1](ProblemStatement1/0_Input) and [ProblemStatement#2](ProblemStatement2/0_Input) have also been uploaded to GitHub.<br> |
| 24 | +~~Input files for the problem statements [ProblemStatement#1](ProblemStatement1/0_Input) and [ProblemStatement#2](ProblemStatement2/0_Input) have also been uploaded to GitHub.~~<br> |
25 | 25 |
|
26 | 26 | > *Update at 11/27/2013 10:00:26 PM IST*: Had to remove these input files from the repo, as the GitHub Windows client is not able to sync the repo with these compressed archives [or rather, it gets badly stuck on illegitimate characters].<br>
|
27 | 27 |
|
28 |
| -These input compressed archives can also be downloaded from Udacity servers. Look [here](http://content.udacity-data.com/courses/ud617/purchases.txt.gz) for input file for Problem Statement 1 and [here](http://content.udacity-data.com/courses/ud617/access_log.gz) for Problem Statement 2.<br> |
| 28 | +These compressed input archives can also be downloaded from Udacity servers. Please check [here](http://content.udacity-data.com/courses/ud617/purchases.txt.gz) for the input file for Problem Statement 1 and [here](http://content.udacity-data.com/courses/ud617/access_log.gz) for Problem Statement 2.<br> |
29 | 29 | These links are also mentioned in the instructions document provided by Udacity Course Instructors.
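
Once an archive such as `purchases.txt.gz` has been downloaded, it has to be decompressed before it can be inspected or copied into HDFS. The following is a small sketch of doing that with Python's standard library, assuming Python 3 and that the archive already sits on local disk; the helper names `gunzip` and `head` are hypothetical and not part of this repository.

```python
import gzip
import shutil
from itertools import islice


def gunzip(src_path, dest_path):
    """Decompress a .gz file to dest_path in a streaming fashion."""
    with gzip.open(src_path, "rb") as src, open(dest_path, "wb") as dest:
        shutil.copyfileobj(src, dest)  # copies in chunks, not all at once
    return dest_path


def head(path, n=5):
    """Return the first n lines of a text file, without trailing newlines."""
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]
```

For example, `gunzip("purchases.txt.gz", "purchases.txt")` followed by `head("purchases.txt")` gives a quick look at the record layout before writing a mapper against it.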
|
30 | 30 |
|
31 | 31 | ### Output Files
|
@@ -159,4 +159,4 @@ Please check [`acc_p2q3.txt`](ProblemStatement2/2_ExecLogs/Java/acc_p2q3.txt) an
|
159 | 159 |
|
160 | 160 | ## License
|
161 | 161 | Copyright © 2013 Prashanth Babu.<br>
|
162 |
| -Licensed under the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0). |
| 162 | +Licensed under the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0). |