- selectreducer.py : this does the reducing job.
- selectmapper.py : this does the mapping job.
- implementation.py : this file contains the implementations of the SQL queries and the loading function, which calls the mapper and reducer.
- select
- project
- load
- min
- max
- sum
- where
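
As a rough illustration of how the operations listed above could split across a map step and a reduce step, here is a minimal sketch in plain Python. It is not the code in selectmapper.py or selectreducer.py; the function names `select_mapper` and `aggregate_reducer`, the `schema` format, and the `predicate` argument are all assumptions made for this example, and it runs locally without Hadoop.

```python
import csv


def select_mapper(lines, schema, columns, predicate=None):
    """Map step (hypothetical): parse CSV lines, apply WHERE, project columns.

    schema is a list of (column_name, cast) pairs, e.g. [("age", int)].
    predicate is an optional function taking a row dict and returning bool.
    """
    for record in csv.reader(lines):
        # Cast each raw string to the type declared for its column.
        row = {name: cast(value) for (name, cast), value in zip(schema, record)}
        if predicate is None or predicate(row):
            yield tuple(row[column] for column in columns)


def aggregate_reducer(rows, func):
    """Reduce step (hypothetical): fold single-column rows into one value."""
    values = [row[0] for row in rows]
    if not values:
        return None
    if func == "min":
        return min(values)
    if func == "max":
        return max(values)
    if func == "sum":
        return sum(values)
    raise ValueError(f"unsupported aggregate: {func}")


if __name__ == "__main__":
    data = ["alice,30", "bob,25", "carol,41"]
    schema = [("name", str), ("age", int)]
    # select name from table where age > 26
    print(list(select_mapper(data, schema, ["name"], lambda r: r["age"] > 26)))
    # max over the age column
    print(aggregate_reducer(select_mapper(data, schema, ["age"]), "max"))
```

Filtering and projection are row-local, so they fit naturally in the map step, while min, max, and sum need to see every value and therefore sit in the reduce step.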
hadoop python3
python3 implementation.py
- LOAD
load bigdata/csv_file_name.csv AS (column_name1:data_type,column_name2:data_type,...)
- SELECT
select column_name1 from table_name where condition
Note : in the LOAD statement, data_type can be int, float, or str.
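
To make the LOAD syntax concrete, the sketch below shows one way such a statement could be split into a file path and a typed column list. It is only an illustration under stated assumptions: `parse_load`, `LOAD_RE`, and `TYPE_MAP` are hypothetical names and are not taken from implementation.py, whose actual parsing may differ.

```python
import re

# Hypothetical mapping from the data_type names in the note above to Python types.
TYPE_MAP = {"int": int, "float": float, "str": str}

# Matches: load <path> AS (<name>:<type>,<name>:<type>,...)
LOAD_RE = re.compile(r"^\s*load\s+(\S+)\s+as\s*\((.*)\)\s*$", re.IGNORECASE)


def parse_load(statement):
    """Return (path, [(column_name, python_type), ...]) for a LOAD statement."""
    match = LOAD_RE.match(statement)
    if match is None:
        raise ValueError(f"not a valid LOAD statement: {statement!r}")
    path, columns = match.groups()
    schema = []
    for item in columns.split(","):
        name, _, type_name = item.strip().partition(":")
        name, type_name = name.strip(), type_name.strip()
        if not name or type_name not in TYPE_MAP:
            raise ValueError(f"bad column definition: {item!r}")
        schema.append((name, TYPE_MAP[type_name]))
    return path, schema


if __name__ == "__main__":
    print(parse_load("load bigdata/people.csv AS (name:str,age:int)"))
```

Rejecting unknown type names up front keeps a malformed schema from surfacing later as a confusing cast error during the map step.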
- Implement select and project
- Implement aggregate functions MAX, COUNT, SUM
- Load the CSV file onto Hadoop.