Classification-using-Text-Mining

Document clustering with RapidMiner

This project clusters documents. The data set consists of articles on diverse topics together with their abstracts. The goal is to let a reader identify the subject of each article in advance, without having to read the articles one by one. To do this we apply text mining techniques to the abstracts: "empty" words (stop words such as articles) that give no weight to the text are removed from each abstract, the remaining text is standardized to capital letters, and the classification is then carried out. The rmp file is the RapidMiner process.
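
For readers without RapidMiner, the preprocessing described above (stop-word removal followed by conversion to capital letters) can be sketched in plain Python. This is only an illustration and not the RapidMiner process itself: the `STOP_WORDS` list, the function names `preprocess` and `term_counts`, and the sample sentence are all assumptions made for this example, and the stop-word list used in the rmp file may differ.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption; the RapidMiner process
# may use a different, larger list).
STOP_WORDS = {
    "a", "an", "the", "of", "and", "or", "in", "on", "to",
    "is", "are", "for", "with", "that", "this",
}


def preprocess(abstract):
    """Tokenize an abstract, drop stop words, and upper-case the rest."""
    # Keep alphabetic tokens only; punctuation and digits are discarded.
    tokens = re.findall(r"[A-Za-z]+", abstract)
    kept = [t for t in tokens if t.lower() not in STOP_WORDS]
    return [t.upper() for t in kept]


def term_counts(abstract):
    """Count how often each remaining term occurs in the abstract."""
    return Counter(preprocess(abstract))


if __name__ == "__main__":
    sample = "The clustering of documents is a task in text mining."
    print(preprocess(sample))
    print(term_counts(sample))
```

The term counts produced this way are the kind of word-level representation a clustering or classification step can work from; in this repository that step is handled by the RapidMiner process in the rmp file.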