This project involved extending an existing relational algebra engine called SimpleRA, which provides basic SQL-like functionality such as loading tables, selecting data, and performing simple operations. SimpleRA simulates how a database engine works under memory constraints (only 10 blocks of main memory), and it uses block-based file I/O to mimic disk-based storage systems.
Over the course of three phases, we extended SimpleRA to support:
- Matrix operations
- External sorting and aggregation
- Hash-based joins
- Index-based query optimization
This allowed us to work with both tables and matrices, implement advanced queries like GROUP BY and JOIN, and optimize execution with indexing structures similar to those in real-world databases.
Added functionality for working with square matrices (integer-only):
-
SOURCE <filename>
Execute all queries listed in a.rafile from the/datadirectory. -
LOAD MATRIX <matrix name>
Load a matrix from a.csvfile and store it block-wise. -
PRINT MATRIX <matrix name>
Print matrix content (up to 20×20 for large matrices). -
EXPORT MATRIX <matrix name>
Save the matrix as a.csvfile in/data. -
ROTATE <matrix name>
Rotate matrix 90° clockwise (in-place). -
CROSSTRANSPOSE <matrix1> <matrix2>
Transpose and swap two matrices (in-place). -
CHECKANTISYM <matrix1> <matrix2>
Verify if matrices are anti-symmetric (A = -Bᵀ).
Added external-memory algorithms and query operations on tables:
-
SORT <table> BY col1, col2 IN ASC|DESC,...
K-way merge sort using 10 memory blocks (in-place). -
<result> <- ORDER BY col ASC|DESC ON <table>
Output a new table ordered by specified column. -
<result> <- GROUP BY attr FROM <table> HAVING AGG(attr2) op value RETURN AGG(attr3)
Implemented aggregate filters using:- Aggregates:
SUM,AVG,MAX,MIN,COUNT - Operators:
==, <, <=, >, >=
- Aggregates:
-
<result> <- JOIN table1, table2 ON col1, col2>
Partitioned Hash Join for efficient equi-joins.
Introduced an indexing structure (e.g., B+ Tree or Hash Index) to speed up conditional queries:
-
res_table <- SEARCH FROM <table> WHERE <condition>
Efficient indexed search (supports==, !=, <, <=, >, >=). -
INSERT INTO <table> (col1 = val1, col2 = val2, ...)
Insert new records with default values for unspecified columns. -
UPDATE <table> WHERE condition SET col = value
Update values for matching rows. -
DELETE FROM <table> WHERE condition
Delete matching rows from table.
- Understood internals of database systems (block I/O, memory limits)
- Extended SQL-like operations for both matrices and relational tables
- Implemented join, aggregation, and sorting using external algorithms
- Integrated indexing structures for faster query execution
We use make to compile all the files and creste the server executable. make is used primarily in Linux systems, so those of you who want to use Windows will probably have to look up alternatives (I hear there are ways to install make on Windows). To compile
cd into the SimpleRA directory
cd SimpleRA
cd into the soure directory (called src)
cd src
To compile
make clean
make
Post compilation, an executable names server will be created in the src directory
./server