IRLib

Features

A simple information retrieval (IR) library that indexes files and retrieves keywords using scoring algorithms.
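The idea of indexing files and scoring keyword matches can be sketched roughly as follows. This is only an illustration in Python, not IRLib's actual API: the function names (`tokenize`, `build_index`, `retrieve`), the sample corpus, and the choice of TF-IDF as the scoring algorithm are all assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(corpus):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in corpus.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def retrieve(index, n_docs, query):
    """Rank documents by summed TF-IDF over the query terms (assumed scoring scheme)."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up sample corpus, for illustration only
corpus = {
    "a.txt": "the quick brown fox",
    "b.txt": "the lazy dog",
    "c.txt": "the quick dog barks",
}
index = build_index(corpus)
ranking = retrieve(index, len(corpus), "quick fox")
```

Here `ranking` lists `a.txt` ahead of `c.txt`, since `a.txt` matches both query terms and "fox" is the rarer of the two. Storing term frequencies in the postings keeps the lookup cost proportional to the number of documents that contain a query term rather than to the size of the corpus.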

Classes

- [x] Corpus
- [x] Indexer
- [x] Retriever
- [x] Scorer
- [x] Tokenizer

First Priority

- [x] Translate Corpus into Inverted Index.
- [x] Clean up existing code base.
  - [x] Ensure class files and folder names are updated accordingly.
  - [x] Ensure class files are not named the same as namespaces, for the sake of using conventions.
- [x] Update the Indexer class. <--

Second Priority

- [ ] Add S3 capabilities.
- [ ] Update Retriever class.
- [ ] Build a Tokenizer class (to tokenize, lemmatize, and normalize each document).