jmuoghalu/EECS767_CourseProject

EECS767_CourseProject

Instructions for Offline Searching

  1. Navigate into the source directory

  2. To run basic searching, run the following command:

     python3 step4_offline_driver.py
    
  3. To run term proximity searching, run the following command:

     python3 step5_offline_driver.py
    
  4. To run relevance feedback searching, run the following command:

     python3 step6_offline_driver.py
    

Instructions for submitting queries are printed to the terminal at runtime.

The Online Version of This Search Engine Is Available Here.

Running the Document Processor and Creating an Inverted Index

  1. Place a folder of HTML documents in the file_cache/unprocessed/ directory.

  2. Let the name of this folder be customFolder.

  3. Navigate into the source directory.

  4. Open a Python 3 interpreter:

     python3
    
  5. Run the following commands:

     import docproc, indexer
    
     dp = docproc.DocProcessor()
     dp.runDocProc("../file_cache/unprocessed/customFolder")
     iic = indexer.InvertedIndex()
     iic.createInvertedIndex("../file_cache/processed/customFolder")
    

    The new inverted index will be written to the data directory as customFolder_index.txt.
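
For readers unfamiliar with the data structure, the following is a minimal, self-contained sketch of what an inverted index holds. It is not the code in indexer.py: the functions tokenize and build_inverted_index, the sample document names, and the term-frequency layout are all assumptions made for illustration, and the project's actual tokenization and file format may differ.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to {document name: term frequency}.

    `docs` is an assumed dict of document name -> plain text that has
    already been extracted from the HTML.
    """
    index = defaultdict(dict)
    for name, text in docs.items():
        for term in tokenize(text):
            # Count how many times the term occurs in this document.
            index[term][name] = index[term].get(name, 0) + 1
    return dict(index)


if __name__ == "__main__":
    # Toy documents, invented for this example.
    docs = {
        "a.html": "Search engines index documents",
        "b.html": "An inverted index maps terms to documents",
    }
    index = build_inverted_index(docs)
    print(index["index"])   # postings for the term "index"
    print(index["search"])  # postings for the term "search"
```

Looking up a query term in such a mapping returns the documents that contain it directly, which is why a search engine builds the index once ahead of time instead of scanning every document per query.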
