uozcan12/Twitter-Data-Normalization

Twitter-Data-Normalization

Software

  • Python 3.4

Sample Dataset

Project Steps

  • Analyze the dataset and find the common mistakes users make when writing tweets ✅
  • Tokenize the tweets with NLTK ✅
  • Analyze the words and:
  1. identify and correct emphasized words ✅
  2. add missing letters to words ✅
  3. correct Turkish SMS words ✅
  4. identify emojis ✅
  5. identify mentions ✅
  6. identify hashtags ✅
  7. identify urls ✅
  8. identify punctuation ✅
  9. identify symbols ✅
  10. correct accent marks ✅
  11. remove extra whitespace ✅
  12. build a deasciifier ✅
  • Test the results ✅
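
A few of the steps above (identifying mentions, hashtags, and URLs, correcting emphasized words, expanding SMS words, and removing extra whitespace) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it uses only the standard `re` module instead of NLTK so that it runs without extra installs, it assumes that "emphasized" means a letter repeated three or more times, and the names `SMS_WORDS`, `classify`, `normalize_word`, and `normalize_tweet` are hypothetical.

```python
import re

# Patterns for the token types named in the steps above.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Assumption: an "emphasized" word repeats the same character 3+ times.
REPEAT_RE = re.compile(r"(\w)\1{2,}")
SPACE_RE = re.compile(r"\s+")

# Hypothetical, tiny lookup table; a real one would be much larger.
SMS_WORDS = {"slm": "selam", "tmm": "tamam", "nbr": "ne haber"}


def classify(token):
    """Return 'url', 'mention', 'hashtag', or 'word' for a single token."""
    if URL_RE.fullmatch(token):
        return "url"
    if MENTION_RE.fullmatch(token):
        return "mention"
    if HASHTAG_RE.fullmatch(token):
        return "hashtag"
    return "word"


def normalize_word(token):
    """Collapse repeated letters, then expand a known SMS abbreviation."""
    collapsed = REPEAT_RE.sub(r"\1", token)
    # Note: str.lower() is not Turkish-locale aware (I/ı, İ/i).
    return SMS_WORDS.get(collapsed.lower(), collapsed)


def normalize_tweet(text):
    """Squeeze whitespace, keep special tokens as-is, normalize the rest."""
    tokens = SPACE_RE.sub(" ", text).strip().split(" ")
    return " ".join(
        tok if classify(tok) != "word" else normalize_word(tok)
        for tok in tokens
    )


print(normalize_tweet("slm   @ali  çoook güzelll #istanbul https://t.co/abc"))
```

Collapsing every long run to a single letter is a simplification: it would damage words that legitimately contain a doubled letter inside a longer run, so a dictionary check would be a natural next step.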
