Update README.md

minasmz · Nov 22, 2021 · 23b1a67
1 parent 893d354
commit 23b1a67
Showing 1 changed file with 2 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -1,4 +1,6 @@
 # Persian-Summarization
+
+For more information, please refer to [our article](http://conf.kntu.ac.ir/cnf_papers/csicc2021/articleFiles2/r_411_201229073907.pdf), and cite it if you find it helpful in your work.
 # Statistical and semantic text summarizer for the Persian language
 
 This project summarizes text in the Persian language. It uses the text summarization module of the [Gensim Python library](https://github.com/RaRe-Technologies/gensim) to implement the [TextRank algorithm](https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf). The algorithm treats each sentence as a node in a graph and returns the nodes that are most strongly related to the other nodes (sentences). In other words, it picks the most important nodes using statistical calculations alone and ignores the semantics of the sentences. For instance, if different words are used to express the same meaning, it does not recognize this and treats them as different, although they are not. To solve this problem and bring semantics into the result, I trained a doc2vec model with doc2vec.py in Gensim, using the [Hamshahri corpus](http://dbrg.ut.ac.ir/hamshahri/) as the training set. The doc2vec model is included in the repository (my_model_sents_from_res2.doc2vec). I use this model to calculate the similarity of two sentences and weight the graph edges with it (instead of the tf-idf-based weighting used in Gensim), and then return the result of the TextRank algorithm.
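
The idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code from this repository: the function names (`cosine`, `textrank`, `summarize`) are hypothetical, the sentence vectors are hand-written placeholders standing in for vectors a doc2vec model would produce, and negative similarities are clipped to zero so that edge weights stay non-negative.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def textrank(vectors, damping=0.85, iterations=50):
    """Score each sentence by weighted PageRank over a similarity graph."""
    n = len(vectors)
    # Edge weight = semantic similarity of the two sentence vectors.
    # Assumption: negative similarities are clipped to 0, no self-loops.
    weights = [
        [max(cosine(vectors[i], vectors[j]), 0.0) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to w(j, i).
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) / n + damping * rank)
        scores = updated
    return scores


def summarize(sentences, vectors, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(vectors)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Placeholder vectors; in practice they would come from the trained doc2vec model.
sentences = ["sentence A", "sentence B", "sentence C", "unrelated sentence"]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]
scores = textrank(vectors)
print(summarize(sentences, vectors, k=3))
```

Because the edge weights come from vector similarity rather than word overlap, two sentences that express the same meaning with different words can still be strongly connected, provided the model places their vectors close together.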