In this section, you learned about the data science process and set up a professional data science environment on your computer.
- There is a lot to learn about data science, but most of the models involved do one of the following things:
  - Predicting a continuous value (regression)
  - Predicting a category (classification)
  - Identifying unusual data (anomaly detection)
  - Generating recommendations
- Data science is not just about selecting and tuning machine learning models. Much of the value comes from understanding the business needs and formulating the problem thoughtfully. Most of the effort goes into the early stages: finding, cleaning, exploring, and simplifying the data so it's ready for your models.
- It's important to use professional tools:
  - Jupyter Notebook is a great environment for combining your notes and your code
  - Git allows you to keep track of your changes
  - GitHub allows you to share your changes with your team
  - Conda virtual environments ensure that the libraries you install for one project won't break another project you are working on