This repo contains raw Java source files written for our Big Data course final project.
We implemented matrix operations using Hadoop MapReduce and tested them on a Linux Hadoop cluster using IntelliJ IDEA.
Implemented in:
- MatrixOperationMapper.java
- MatrixOperationReducer.java
- MatrixOperationDriver.java
Supported operations:
- Matrix Addition
- Matrix Subtraction
- Matrix Multiplication
- (Also experimented with Division and Mod in Notes/)
Input format: CSV files in the input/ folder.
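To illustrate how element-wise addition and subtraction fit the MapReduce model, here is a minimal sketch in plain Java that runs without Hadoop. It is not the code from MatrixOperationMapper.java or MatrixOperationReducer.java. The class name `MatrixAddSketch`, the record layout `matrix,row,col,value`, and the matrix names `A` and `B` are all assumptions made for this example; the actual CSV layout in input/ may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free illustration of the map -> shuffle -> reduce flow.
public class MatrixAddSketch {

    // "Map" step: turn one CSV record "matrix,row,col,value" into a
    // (cell, value) pair keyed by "row,col". The record layout is assumed.
    static Map.Entry<String, Double> map(String line, boolean subtract) {
        String[] parts = line.trim().split(",");
        String matrix = parts[0].trim();
        String cell = parts[1].trim() + "," + parts[2].trim();
        double value = Double.parseDouble(parts[3].trim());
        // For A - B, negate the entries of the second matrix (named "B" here).
        if (subtract && matrix.equals("B")) {
            value = -value;
        }
        return Map.entry(cell, value);
    }

    // "Reduce" step: sum every value that arrived under the same cell key.
    static double reduce(List<Double> values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    // Local stand-in for the shuffle: group mapped pairs by key, then reduce.
    static Map<String, Double> run(List<String> lines, boolean subtract) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (String line : lines) {
            Map.Entry<String, Double> kv = map(line, subtract);
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        Map<String, Double> result = new TreeMap<>();
        grouped.forEach((cell, values) -> result.put(cell, reduce(values)));
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "A,0,0,1", "A,0,1,2", "A,1,0,3", "A,1,1,4",
            "B,0,0,5", "B,0,1,6", "B,1,0,7", "B,1,1,8");
        System.out.println("A + B: " + run(lines, false));
        System.out.println("A - B: " + run(lines, true));
    }
}
```

Keying by cell works for addition and subtraction because each output cell depends on exactly one entry from each matrix. Multiplication needs a different keying scheme, since each output cell depends on a whole row of one matrix and a whole column of the other.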
Files in: InvertedIndexFiles/javaClasses/
Reads from multiple text files and builds an inverted index using MapReduce.
Example:
file1.txt: hello world
file2.txt: world is beautiful
file3.txt: hello again
Result:
hello -> file1.txt, file3.txt
world -> file1.txt, file2.txt
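The example above can be reproduced with a small Hadoop-free sketch of the same map and reduce steps. This is an illustration only, not the code in InvertedIndexFiles/javaClasses/. The class name `InvertedIndexSketch`, the lowercasing, and the whitespace tokenization are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical, Hadoop-free illustration of building an inverted index.
public class InvertedIndexSketch {

    // "Map" step: emit one (word, fileName) pair per word in the file's text.
    // Lowercasing and splitting on whitespace are assumptions of this sketch.
    static List<String[]> map(String fileName, String text) {
        List<String[]> pairs = new ArrayList<>();
        for (String word : text.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new String[] {word, fileName});
            }
        }
        return pairs;
    }

    // "Reduce" step: join the sorted, de-duplicated file names for one word.
    static String reduce(TreeSet<String> fileNames) {
        return String.join(", ", fileNames);
    }

    // Local stand-in for the shuffle: group file names by word, then reduce.
    static Map<String, String> build(Map<String, String> files) {
        Map<String, TreeSet<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            for (String[] pair : map(file.getKey(), file.getValue())) {
                grouped.computeIfAbsent(pair[0], k -> new TreeSet<>()).add(pair[1]);
            }
        }
        Map<String, String> index = new TreeMap<>();
        grouped.forEach((word, names) -> index.put(word, reduce(names)));
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("file1.txt", "hello world");
        files.put("file2.txt", "world is beautiful");
        files.put("file3.txt", "hello again");
        build(files).forEach((word, names) -> System.out.println(word + " -> " + names));
    }
}
```

Run on the three example files, this prints the `hello` and `world` lines shown above, along with entries for the remaining words (`again`, `beautiful`, `is`).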
- Java
- Hadoop (MapReduce)
- CSV File Input/Output
- Linux Terminal
- IntelliJ IDEA (Linux version - ideaIC)
Only source code files are available — this is not a complete IntelliJ project folder.
| Folder / File | Description |
|---|---|
| input/ | CSV matrix files for MapReduce processing |
| MatrixOperations/ | Main implementation of matrix operations |
| InvertedIndexFiles/ | Inverted index logic and test files |
| Notes/ | Extra experiments, helpers, and backup logic |
| *.xml.txt | IntelliJ project configs saved from Linux |
These are raw `.java` files. You’ll need to compile and run them inside a Hadoop-compatible environment.
Steps:
- Copy the `.java` files into your IDE (e.g., IntelliJ)
- Compile and package them into a JAR
- Run the MapReduce job using:
hadoop jar YourJarFile.jar MainDriver input/ output/
Here YourJarFile.jar and MainDriver are placeholders for your own JAR and driver class (for the matrix job, MatrixOperationDriver, qualified with its package name if it declares one). Hadoop will not start a job whose output directory already exists, so remove output/ between runs.
This repo contains source code only, not a full runnable IntelliJ project.
Originally built and executed on Linux using IntelliJ IDEA Community Edition (ideaIC).
✅ Working source code
❌ Not plug-and-play without Hadoop setup
MIT License