parmarmanojkumar/DataScience_CapStone

DataScience CapStone

Repository to track all the work of the capstone project

Details of Repository:

Created by Manojkumar Parmar for the DataScience capstone project.

DataSet

This training data gets you started and will be the basis for most of the capstone. To begin, download the data from the Coursera site, not from external websites.

Your initial exploration of the data and your modeling steps will be performed on this data set. Later in the capstone, if you find additional data sets that may be useful for building your model, you may use them.

Tasks to accomplish

  1. Obtaining the data - Can you download the data and load/manipulate it in R?
  2. Familiarizing yourself with NLP and text mining - Learn about the basics of natural language processing and how it relates to the data science process you have learned in the Data Science Specialization.
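
As a rough illustration of the two tasks above, here is a minimal sketch in base R of loading a text file and doing some first manipulation on it. It assumes the data is plain text with one entry per line. The helper names `load_text` and `tokenize` are hypothetical, and the small temporary file only stands in for the real data set, which must still be downloaded from the Coursera site.

```r
# Hypothetical helper: read a plain-text file and return its lines.
# n = -1L reads the whole file; a smaller n reads only the first n lines.
load_text <- function(path, n = -1L) {
  con <- file(path, open = "r", encoding = "UTF-8")
  on.exit(close(con))
  readLines(con, n = n, warn = FALSE, skipNul = TRUE)
}

# Hypothetical helper: lower-case a line and split it into word tokens.
# Anything other than a-z and the apostrophe is treated as a separator,
# which is a simplifying assumption for English text.
tokenize <- function(line) {
  words <- strsplit(tolower(line), "[^a-z']+")[[1]]
  words[nzchar(words)]
}

# Stand-in for the real data: a tiny temporary file written for this example.
sample_path <- tempfile(fileext = ".txt")
writeLines(c("The quick brown fox.", "It's a fine day, isn't it?"), sample_path)

text_lines <- load_text(sample_path)
tokens <- unlist(lapply(text_lines, tokenize))
word_counts <- sort(table(tokens), decreasing = TRUE)

length(text_lines)   # number of lines read
nchar(text_lines)    # characters per line
head(word_counts)    # most frequent tokens
```

On the real files, passing a small `n` to `load_text` first may be a sensible way to look at the data before reading everything into memory.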

Questions to consider

  1. What do the data look like?
  2. Where do the data come from?
  3. Can you think of any other data sources that might help you in this project?
  4. What are the common steps in natural language processing?
  5. What are some common issues in the analysis of text data?
  6. What is the relationship between NLP and the concepts you have learned in the Specialization?
