Handling every stage of an NLP project in French is not straightforward. After some research, I found or wrote several methods that you can reuse in your own NLP projects.
- sentence cleaning: using regex
- sentence correction: correction.py
- tokenization: summary_token.py
- lemmatization: summary_lemma.py
- French synonym lookup: syn_french.py
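The regex-based cleaning step has no dedicated file in the list above, so here is a minimal sketch of what it could look like. The function name `clean_sentence` and the exact rules (lowercasing, keeping apostrophes and hyphens, dropping other punctuation) are assumptions for illustration, not necessarily what this repository does.

```python
import re
import unicodedata


def clean_sentence(sentence: str) -> str:
    """Hypothetical cleaner: lowercase and strip punctuation, keeping accents."""
    # Normalize to NFC so accented letters such as "é" are single code points.
    text = unicodedata.normalize("NFC", sentence).lower()
    # Map the typographic apostrophe to the ASCII one used in elisions (l', c').
    text = text.replace("\u2019", "'")
    # Replace everything except word characters, whitespace, apostrophes,
    # and hyphens with a space. In Python 3, \w matches accented letters.
    text = re.sub(r"[^\w\s'-]", " ", text)
    # Collapse runs of whitespace into a single space.
    return re.sub(r"\s+", " ", text).strip()


print(clean_sentence("Bonjour,   le MONDE ! C'est l'été."))
# bonjour le monde c'est l'été
```

Apostrophes and hyphens are kept on purpose, because they carry meaning in French (elisions such as "l'été", compounds such as "peut-être") and are better handled at the tokenization step.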
After these benchmarks, you can find a function named SENTENCE_TO_CORRECT_WORDS in the file all.py that uses these methods to get French tokens from a French sentence.
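To give an idea of how such a sentence-to-tokens pipeline can be chained, here is a rough sketch. It is not the code of all.py: the name `sentence_to_tokens_sketch`, the elision regex, and the `correct` and `lemmatize` hooks are all hypothetical, and the hooks stand in for whatever correction and lemmatization methods you pick.

```python
import re

# Hypothetical token pattern: split common elided forms (c', d', l', qu', ...)
# from the next word, otherwise keep words with inner apostrophes or hyphens.
TOKEN_RE = re.compile(r"\b(?:[cdjlmnst]|qu)'|\w+(?:['-]\w+)*")


def sentence_to_tokens_sketch(sentence, correct=None, lemmatize=None):
    """Lowercase, tokenize, then apply optional per-token hooks."""
    text = sentence.lower().replace("\u2019", "'")
    tokens = TOKEN_RE.findall(text)
    if correct is not None:
        # Placeholder for a spelling-correction step.
        tokens = [correct(tok) for tok in tokens]
    if lemmatize is not None:
        # Placeholder for a lemmatization step.
        tokens = [lemmatize(tok) for tok in tokens]
    return tokens


# Toy lemma table, for illustration only.
toy_lemmas = {"est": "être"}
print(sentence_to_tokens_sketch(
    "C'est l'été aujourd'hui !",
    lemmatize=lambda tok: toy_lemmas.get(tok, tok),
))
# ["c'", 'être', "l'", 'été', "aujourd'hui"]
```

The regex is deliberately simple: longer elided forms such as "lorsqu'il" stay as a single token, which a real tokenizer would usually split.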
You can also find my search engine, which uses preprocessing and semantic similarity, here
You can find them here. Depending on which methods you want to test, you can install only the ones you need.