A Java-based web search engine that:
- Crawls websites (GeeksforGeeks, JavaTpoint, TutorialsPoint, Oracle, Guru99, BeginnersBook) to fetch data
- Finds keywords using the KMP (Knuth-Morris-Pratt) string-matching algorithm
- Searches for patterns in the database using InvertedIndex
- Ranks web pages by word frequency
- Finds patterns using regular expressions
- Converts HTML web pages to plain text
- Analyzes word frequencies using hash tables
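
To make the keyword-matching step concrete, here is a minimal sketch of KMP string matching in Java. It is an illustration only, not the code in this repository's KMP.java: the class name `KmpSketch` and the methods `buildFailureTable` and `findAll` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative KMP matcher; names are hypothetical, not taken from KMP.java.
class KmpSketch {

    // failure[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] buildFailureTable(String pattern) {
        int[] failure = new int[pattern.length()];
        int k = 0; // length of the currently matched prefix
        for (int i = 1; i < pattern.length(); i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = failure[k - 1]; // fall back to a shorter prefix
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            failure[i] = k;
        }
        return failure;
    }

    // Returns the start index of every occurrence of pattern in text.
    static List<Integer> findAll(String text, String pattern) {
        List<Integer> matches = new ArrayList<>();
        if (pattern.isEmpty()) {
            return matches; // assumption: an empty pattern matches nothing
        }
        int[] failure = buildFailureTable(pattern);
        int k = 0; // number of pattern characters matched so far
        for (int i = 0; i < text.length(); i++) {
            while (k > 0 && text.charAt(i) != pattern.charAt(k)) {
                k = failure[k - 1];
            }
            if (text.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            if (k == pattern.length()) {
                matches.add(i - k + 1);
                k = failure[k - 1]; // keep going to find overlapping matches
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("java and javascript", "java")); // prints [0, 9]
    }
}
```

The failure table lets the scan continue without re-reading text characters after a mismatch, so a search takes time proportional to the text length plus the pattern length.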

Project structure:

search-engine
│   README.md
│   .classpath
│   .project
│   .gitignore
│
├───src
│   └───searchEngine
│           Controller.java -> Main File
│           Crawler.java
│           EditDistance.java
│           Helper.java
│           In.java
│           InvertedIndex.java
│           KMP.java
│           Menu.java
│           SpellCheck.java
│           StdOut.java
│
├───bin
│   └───searchEngine
│           Controller.class
│           Crawler.class
│           EditDistance.class
│           Helper.class
│           In.class
│           InvertedIndex.class
│           KMP.class
│           Menu.class
│           SpellCheck.class
│           StdOut.class
│
├───WebContent
│
└───WebPages
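
As a rough illustration of how an inverted index backed by hash tables can rank pages by word frequency, the sketch below maps each word to the pages that contain it, together with a per-page count. It is not the repository's InvertedIndex.java: the class `InvertedIndexSketch`, its methods `addPage` and `search`, and the page names are all hypothetical, and the tokenization rule (lowercase, split on non-alphanumeric characters) is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative inverted index; names are hypothetical, not taken from InvertedIndex.java.
class InvertedIndexSketch {

    // word -> (page name -> number of occurrences of the word on that page)
    private final Map<String, Map<String, Integer>> index = new HashMap<>();

    // Tokenizes the page text and records one occurrence per token.
    void addPage(String pageName, String text) {
        for (String word : text.toLowerCase().split("[^a-z0-9]+")) {
            if (word.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            index.computeIfAbsent(word, w -> new HashMap<>())
                 .merge(pageName, 1, Integer::sum);
        }
    }

    // Returns pages containing the word, most frequent first (ties broken by name).
    List<String> search(String word) {
        Map<String, Integer> postings =
                index.getOrDefault(word.toLowerCase(), new HashMap<>());
        List<String> pages = new ArrayList<>(postings.keySet());
        pages.sort((a, b) -> {
            int byCount = Integer.compare(postings.get(b), postings.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return pages;
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.addPage("a.html", "Java is fun. Java is fast."); // hypothetical page
        idx.addPage("b.html", "Learn Java today");           // hypothetical page
        System.out.println(idx.search("java")); // prints [a.html, b.html]
    }
}
```

Storing counts inside the index means a query needs only one hash lookup followed by a sort of the matching pages, rather than a rescan of every page's text.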