Commit 103e903

Text cleaning - stemming of the text and removing extra spaces
1 parent 95e748f commit 103e903

1 file changed: +4 -2 lines changed

Part 7 - Natural Language Processing/NLP.R

Lines changed: 4 additions & 2 deletions
@@ -53,7 +53,9 @@ corpus = tm_map(corpus, removePunctuation)
 # 4. non-relevant words -> remove
 corpus = tm_map(corpus, removeWords, stopwords())
 
+# 5. stemming - reduce the total number of words -> getting the root of each word
+corpus = tm_map(corpus, stemDocument)
 
-
-
+# 6. extra spaces -> remove (extra spaces left from removing numbers for example)
+corpus = tm_map(corpus, stripWhitespace)
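
As a rough illustration of what the two added steps do, here is a minimal R sketch on a made-up two-document corpus. It assumes the `tm` and `SnowballC` packages are installed (`stemDocument` relies on SnowballC for the stemmer). The sample sentences and the `toy` and `cleaned` names are hypothetical and are not taken from NLP.R.

```r
# Illustrative sketch only; not part of the commit.
library(tm)         # corpus handling and tm_map transformations
library(SnowballC)  # stemmer backend used by stemDocument

# Hypothetical sample texts containing numbers and uneven spacing
toy = VCorpus(VectorSource(c("loved   the 2  pizzas",
                             "waiting   for 30 minutes")))

toy = tm_map(toy, removeNumbers)    # drop digits, as in the earlier cleaning steps
toy = tm_map(toy, stemDocument)     # step 5: reduce each word to its stem
toy = tm_map(toy, stripWhitespace)  # step 6: collapse runs of whitespace to one space

# Pull the cleaned text back out as a character vector and inspect it
cleaned = sapply(toy, as.character)
print(cleaned)
```

Running `stripWhitespace` last, as the commit does, means any gaps left behind by the earlier removal steps are cleaned up in a single pass.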
