NLP_based_information_retrieval_system

Information retrieval methods find the documents most relevant to a given query. The text of web pages can be modeled with several approaches, including Boolean models, vector space models, and probabilistic models. This project uses the vector space approach, specifically the Doc2Vec (or word2vec) embedding technique.
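
To make the difference between a Boolean model and a vector space model concrete, here is a minimal sketch in plain Python. It is not the project's code: it uses raw term counts instead of Doc2Vec embeddings, and the tokenizer, function names, and toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def boolean_match(query, doc):
    """Boolean AND model: a document matches only if it contains every query term."""
    return set(tokenize(query)) <= set(tokenize(doc))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Vector space model: score every document, then sort by descending score."""
    q = Counter(tokenize(query))
    scores = [(i, cosine(q, Counter(tokenize(d)))) for i, d in enumerate(docs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy corpus, invented for illustration only.
docs = [
    "neural word embeddings for document retrieval",
    "boolean retrieval with inverted indexes",
    "cooking pasta at home",
]
query = "word embeddings retrieval"

matches = [boolean_match(query, d) for d in docs]  # yes/no answer per document
ranking = rank(query, docs)                        # graded answer: (index, score)
```

The Boolean model only says whether a document matches, while the vector space model gives every document a score, so partially matching documents can still be ranked. An embedding-based system keeps the same ranking step but replaces the term-count vectors with dense learned vectors.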

This project develops an information retrieval system based on the word embedding technique Doc2Vec (or word2vec). Both the documents and the query are represented as embedding vectors, and the similarity between the query vector and each document vector is computed with the cosine similarity measure. To measure the effectiveness of the system, we used the TREC test collection (dataset), available at https://trec.nist.gov/data.html
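
The README does not say which effectiveness measures were computed. As a hedged illustration, the sketch below shows precision at k and (mean) average precision, two measures commonly reported on TREC-style collections. It assumes a ranked list of document IDs per query (for example, the output of the cosine ranking) and a set of relevant IDs per query. The query IDs, document IDs, and judgments are hypothetical and are not taken from any TREC dataset.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are judged relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


def average_precision(ranked_ids, relevant_ids):
    """Sum of precision at the rank of each retrieved relevant document,
    divided by the total number of relevant documents."""
    if not relevant_ids:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids)


def mean_average_precision(runs, qrels):
    """Average of per-query AP; `runs` and `qrels` are dicts keyed by query ID."""
    if not runs:
        return 0.0
    ap_values = [
        average_precision(ranked, qrels.get(query_id, set()))
        for query_id, ranked in runs.items()
    ]
    return sum(ap_values) / len(ap_values)


# Hypothetical system output and relevance judgments (illustration only).
runs = {"q1": ["d3", "d1", "d7", "d2"], "q2": ["d5", "d4"]}
qrels = {"q1": {"d3", "d2"}, "q2": {"d4"}}

p_at_2 = precision_at_k(runs["q1"], qrels["q1"], 2)
ap_q1 = average_precision(runs["q1"], qrels["q1"])
map_score = mean_average_precision(runs, qrels)
```

Precision at k looks only at the top of the ranking, whereas average precision rewards placing every relevant document early, so the two are usually reported together.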

Repository files: Final_project_information_retrieval_system.ipynb, README.md

zakaria-aabbou/NLP_based_information_retrieval_system
