Alexandre Lemonnier <alexandre.lemonnier@epita.fr>
Eliott Bouhana <eliott.bouhana@epita.fr>
Sarah Gutierez <sarah.gutierez@epita.fr>
Victor Simonin <victor.simonin@epita.fr>
The teacher's GitHub repository can be found here.
Our repository for the project has the following architecture:
* 1_NaiveBayes: contains the subject and the notebook for our first lab, on the Naive Bayes classifier.
* 2_FastText: contains the subject and the notebook for our second lab, on the FastText library.
* 3_RNN: contains the subject and the notebook for our third lab, on using an RNN and an LSTM to classify text.
* requirements.txt
* README.md
Each lab notebook contains the work done during the lab, with comments that describe our technical choices and analyse our results.
The Naive Bayes classifier is a simple yet effective machine learning algorithm for classification. It is based on Bayes' theorem, which states that the posterior probability of a class given the observed features is proportional to the prior probability of the class multiplied by the likelihood of the features given that class. The classifier makes the "naive" assumption that the features are conditionally independent given the class, which is rarely true of real-world data. However, this assumption makes training and prediction very cheap, and the classifier often remains accurate in practice, particularly for text classification.
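As an illustration of the idea, here is a minimal sketch of a multinomial Naive Bayes text classifier in plain Python. It is not the code from our notebook: the function names, the toy sentences, and the choice of add-one (Laplace) smoothing are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in zip(documents, labels):
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(model, tokens):
    """Return the class maximising log P(c) + sum of log P(w | c)."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token not in vocabulary:
                continue  # skip words never seen during training
            # Add-one smoothing keeps unseen (word, class) pairs away from log(0).
            numerator = word_counts[label][token] + 1
            denominator = total_words + len(vocabulary)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented for illustration only.
train_docs = [
    "great movie loved it".split(),
    "wonderful acting great plot".split(),
    "terrible movie hated it".split(),
    "boring plot awful acting".split(),
]
train_labels = ["pos", "pos", "neg", "neg"]
model = train_naive_bayes(train_docs, train_labels)
```

With this toy model, `predict(model, "great wonderful movie".split())` returns `"pos"`. Working in log space avoids numerical underflow when many small probabilities are multiplied.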
FastText is a library for text classification and word representation learning. Its classifier assigns text documents to one or more categories by averaging the embeddings of the words and word n-grams in a document and feeding the result to a linear classifier. It builds on ideas from word2vec and is designed to train quickly on large text corpora.
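The supervised mode of the library reads one example per line, with each label prefixed by `__label__`. The sketch below shows one way to produce that format; the helper names and the sample sentences are hypothetical, and the training call is left as a comment because it requires the `fasttext` package and a file on disk.

```python
def to_fasttext_line(label, text):
    """Format one example as '__label__<label> <text>' on a single line."""
    cleaned = " ".join(text.split())  # collapse newlines and repeated spaces
    return f"__label__{label} {cleaned}"


def write_training_file(examples, path):
    """Write (label, text) pairs to a file in the supervised input format."""
    with open(path, "w", encoding="utf-8") as handle:
        for label, text in examples:
            handle.write(to_fasttext_line(label, text) + "\n")


# Sample data, invented for illustration only.
examples = [
    ("positive", "A great movie,\nreally enjoyable"),
    ("negative", "Boring and too long"),
]
lines = [to_fasttext_line(label, text) for label, text in examples]

# With the fasttext package installed, training could then look like:
#   import fasttext
#   write_training_file(examples, "train.txt")
#   model = fasttext.train_supervised(input="train.txt")
#   predicted_labels, probabilities = model.predict("a great movie")
```

Collapsing whitespace matters here because a newline inside a document would otherwise be read as the start of a new example.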
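An RNN (recurrent neural network) is a neural network designed to model sequential data: it processes the input one element at a time and maintains a hidden state that carries information from earlier elements to later ones. It can be used for tasks such as language modeling and machine translation.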
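To make the recurrence concrete, here is a minimal sketch of a plain (Elman-style) RNN in pure Python. The weight values and the two-element input sequence are arbitrary assumptions chosen for illustration; a real model would learn the weights and would normally use a tensor library.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, b_h):
    """One step: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)."""
    h_new = []
    for i in range(len(b_h)):
        total = b_h[i]
        total += sum(w * v for w, v in zip(w_xh[i], x))
        total += sum(w * v for w, v in zip(w_hh[i], h_prev))
        h_new.append(math.tanh(total))
    return h_new


def run_rnn(sequence, w_xh, w_hh, b_h):
    """Feed the sequence element by element and return the last hidden state."""
    h = [0.0] * len(b_h)  # initial hidden state
    for x in sequence:
        h = rnn_step(x, h, w_xh, w_hh, b_h)
    return h


# Arbitrary weights for a 2-dimensional input and a 2-dimensional hidden state.
w_xh = [[0.5, -0.3], [0.1, 0.8]]
w_hh = [[0.2, 0.0], [-0.4, 0.3]]
b_h = [0.0, 0.0]

sequence = [[1.0, 0.0], [0.0, 1.0]]
h_forward = run_rnn(sequence, w_xh, w_hh, b_h)
h_reversed = run_rnn(list(reversed(sequence)), w_xh, w_hh, b_h)
```

Here `h_forward` and `h_reversed` differ even though both runs see the same two inputs, which shows that the hidden state depends on the order of the sequence. For classification, the final hidden state would typically be passed to an output layer.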
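An LSTM (long short-term memory network) is a type of RNN designed to model long-term dependencies: gates control what is written to, kept in, and read from an internal cell state, which mitigates the vanishing-gradient problem of plain RNNs. It can be used for tasks such as speech recognition and text classification.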
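The gating mechanism can be sketched with a single LSTM unit whose input, hidden state, and cell state are all scalars. This is a simplification for illustration only: the parameter dictionary, its key names, and the hand-picked values are assumptions, whereas a real LSTM uses learned weight matrices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step with scalar input, hidden state and cell state."""

    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much of the candidate to write
    o = gate("output", sigmoid)        # how much of the cell state to expose
    g = gate("candidate", math.tanh)   # candidate value to write
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hand-picked parameters (w_x, w_h, b): forget gate almost fully open,
# input gate almost fully closed, so the cell state should be preserved.
params = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}

h, c = 0.0, 0.7
for _ in range(50):
    h, c = lstm_step(1.0, h, c, params)
```

After 50 steps the cell state `c` is still very close to its initial value of 0.7, because the forget gate stays near 1 and the input gate near 0. This additive update of the cell state is what lets an LSTM carry information over many time steps.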