CommonLit, Inc., is a nonprofit education technology organization serving over 20 million teachers and students with free digital reading and writing lessons for grades 3-12. Together with Georgia State University, an R1 public research university in Atlanta, CommonLit is challenging Kagglers to improve readability rating methods. In this competition, we built algorithms to rate the complexity of reading passages for grade 3-12 classroom use. To do so, we used state-of-the-art machine learning tools with a dataset that includes readers from a wide variety of age groups and a large collection of texts taken from various domains.
This repository contains three versions of our experiments:
EDA + LSTM + CNN: In this version, we started with an exploratory data analysis (EDA) step, which allowed us to explore the data and extract valuable insights. We then trained an LSTM model, which gave a good starting result compared with the benchmark results of other Kaggle competitors. Finally, we tested a CNN model, which gave a slightly better result than the LSTM.
RoBERTa model: We trained a RoBERTa model on our dataset, then fine-tuned it on our target, and finally combined it with an LGBM model to produce our predictions.
BERT model: We followed the same approach as for the RoBERTa model, except that we combined BERT with an SVM model to produce our predictions.
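The RoBERTa and BERT versions above both pair a transformer with a classical regressor. The following is a minimal sketch of that second stage only, not the notebooks' actual code: it assumes passage embeddings have already been extracted, substitutes random vectors for them, and fits scikit-learn's `SVR` on top. The helper names `rmse` and `fit_regression_head`, the hyperparameters, and the use of RMSE as the validation score are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR


def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fit_regression_head(embeddings, targets, seed=0):
    """Fit an SVR on fixed-size passage embeddings; return it with hold-out RMSE."""
    X_train, X_val, y_train, y_val = train_test_split(
        embeddings, targets, test_size=0.2, random_state=seed
    )
    # Hyperparameters are illustrative guesses, not tuned values.
    head = SVR(kernel="rbf", C=10.0)
    head.fit(X_train, y_train)
    return head, rmse(y_val, head.predict(X_val))


# Placeholder data: random vectors stand in for transformer embeddings,
# and the targets are a noisy linear function of them.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 32))
weights = rng.normal(size=32)
targets = 0.1 * (embeddings @ weights) + rng.normal(scale=0.05, size=200)

head, val_rmse = fit_regression_head(embeddings, targets)
print(f"validation RMSE: {val_rmse:.3f}")
```

Swapping `SVR` for an LGBM regressor would give the analogous head for the RoBERTa version; the split and scoring logic would stay the same.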
- Kaggle platform.
- Python.
- PyTorch.
- TensorFlow.
- BERT
- RoBERTa
- LSTM
- CNN
- LGBM
- SVM
To get a local copy up and running, follow these steps.
- Open terminal
- Clone this project by the command:
$ git clone git@github.com:Taher-web-dev/CommonLit-Readability-Prize.git
- Then go to the main folder using the next command:
$ cd CommonLit-Readability-Prize
- An IDE to edit and run the code (we use Jupyter Notebook 🔥).
- Git to version your work.
- Data science practitioners.
- Anyone interested in NLP topics.
👤 Taher Haggui
- GitHub: @TaherHaggui
- LinkedIn: @TaherHaggui
Contributions, issues, and feature requests are welcome!
Give a ⭐️ if you like this project!
- Kaggle platform 💘 (https://www.kaggle.com/)
- My family's support 🙌
This project is CommonLit licensed.