This is an NLP project that identifies the poet who wrote a given poem. The training set consists of poems by Molavi, Hafez, and Ferdowsi. First, we built a dictionary (vocabulary) of all words that occur more than twice in the training set. We then built a unigram and a bigram model for each poet by estimating the probabilities from that poet's poems, and applied a backoff model for smoothing.
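The training steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `<unk>` token for out-of-vocabulary words, whitespace tokenization, and the backoff constants `alpha` and `eps` are all assumptions. The backoff shown is a simple fixed-weight scheme (use the bigram estimate when the bigram was seen, otherwise fall back to a down-weighted unigram estimate), which does not produce normalized probabilities; the project may use a different backoff variant.

```python
import math
from collections import Counter

UNK = "<unk>"  # assumed placeholder for words outside the dictionary


def build_vocab(poems, min_count=3):
    """Keep words occurring more than twice (count >= 3) in all training poems."""
    counts = Counter(w for poem in poems for w in poem.split())
    return {w for w, c in counts.items() if c >= min_count}


def tokenize(poem, vocab):
    """Whitespace tokenization; out-of-vocabulary words map to UNK."""
    return [w if w in vocab else UNK for w in poem.split()]


def train(poems, vocab):
    """Count unigrams and bigrams for one poet's poems."""
    unigrams, bigrams = Counter(), Counter()
    for poem in poems:
        tokens = tokenize(poem, vocab)
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams, sum(unigrams.values())


def backoff_prob(prev, word, model, alpha=0.4, eps=1e-6):
    """Bigram estimate if seen, else down-weighted unigram, else a small floor."""
    unigrams, bigrams, total = model
    if bigrams[(prev, word)] > 0:
        return bigrams[(prev, word)] / unigrams[prev]
    if unigrams[word] > 0:
        return alpha * unigrams[word] / total
    return eps


def log_score(poem, model, vocab, alpha=0.4, eps=1e-6):
    """Log score of a poem under one poet's model."""
    unigrams, _, total = model
    tokens = tokenize(poem, vocab)
    if not tokens:
        return 0.0
    # The first word has no left context, so it is scored with the unigram model.
    first = unigrams[tokens[0]] / total if unigrams[tokens[0]] > 0 else eps
    score = math.log(first)
    for prev, word in zip(tokens, tokens[1:]):
        score += math.log(backoff_prob(prev, word, model, alpha, eps))
    return score


def classify(poem, models, vocab):
    """Return the poet whose model gives the poem the highest log score."""
    return max(models, key=lambda poet: log_score(poem, models[poet], vocab))
```

To use the sketch, build the vocabulary once from all training poems, call `train` once per poet, collect the results in a dictionary keyed by poet name, and pass that dictionary to `classify`. Scores are summed in log space to avoid underflow on long poems.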
We then applied the resulting models to the test set, in which each poem is paired with a number indicating which poet it belongs to. The models predicted the correct poet for 87 percent of the test poems.
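The evaluation described above amounts to predicting a poet for every test poem and counting how many predictions match the numeric label. Here is a minimal sketch under stated assumptions: `predict_poet` and `accuracy` are hypothetical names, the test set is assumed to be a list of `(poem, label)` pairs, and `scorers` is assumed to map each numeric label to a function that returns that poet's score for a poem. How the numbers map to poets is not stated in the source, so the labels are treated as opaque values.

```python
def predict_poet(poem, scorers):
    """Return the label whose scoring function rates the poem highest.

    scorers: dict mapping a numeric poet label to a function poem -> score
    (for example, a log-likelihood under that poet's language model).
    """
    return max(scorers, key=lambda label: scorers[label](poem))


def accuracy(test_set, scorers):
    """Fraction of (poem, label) pairs for which the predicted label is correct."""
    if not test_set:
        raise ValueError("test_set must contain at least one poem")
    correct = sum(
        1 for poem, label in test_set if predict_poet(poem, scorers) == label
    )
    return correct / len(test_set)
```

Multiplying the returned fraction by 100 gives a percentage comparable to the figure reported above.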