title: The Data Science Track
subtitle:
author: Jeffrey Leek
job: Johns Hopkins Bloomberg School of Public Health
logo: bloomberg_shield.png
framework: io2012
highlighter: highlight.js
hitheme: tomorrow
url:
  lib: ../../librariesNew
  assets: ../../assets
widgets: mathjax
mode: selfcontained

Why do data science?

"It is not the critic who counts: not the man who points out how the strong man stumbles or where the doer of deeds could have done better. The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood, who strives valiantly, who errs and comes up short again and again, because there is no effort without error or shortcoming, but who knows the great enthusiasms, the great devotions, who spends himself for a worthy cause; who, at the best, knows, in the end, the triumph of high achievement, and who, at the worst, if he fails, at least he fails while daring greatly, so that his place shall never be with those cold and timid souls who knew neither victory nor defeat."

Theodore Roosevelt, 26th President of the United States

Statistics and the science game


The key challenge in data science

"Ask yourselves, what problem have you solved, ever, that was worth solving, where you knew all of the given information in advance? Where you didn’t have a surplus of information and have to filter it out, or you didn’t have insufficient information and have to go find some?"

Dan Meyer, Mathematics Educator

The key word in data science is not data; it is science


About us

Data-intensive statistics in biology and medicine

Why data science?

http://www.economist.com/node/15579717


Why data science?

http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation


Why statistical data science?

http://www.nytimes.com/2009/08/06/technology/06stats.html?_r=0


Why are you lucky?


Why are you lucky?

Heritage Health Prize


Why R?

http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?pagewanted=all


Why R?

  • It is free
  • It has a comprehensive set of packages
    • Data access
    • Data cleaning
    • Analysis
    • Data reporting
  • It has one of the best development environments: RStudio (http://www.rstudio.com/)
  • It has an amazing ecosystem of developers
  • Packages are easy to install and "play nicely together"

Who is a data scientist?

[Daryl Morey](http://en.wikipedia.org/wiki/Daryl_Morey)

Who is a data scientist?

[Hilary Mason](http://www.hilarymason.com/)

Who is a data scientist?

Daphne Koller


Who is a data scientist?

Nate Silver


Our goal

Drew Conway


Plus jobs

http://radar.oreilly.com/2011/09/building-data-science-teams.html


This course

  • Introducing you to the track
  • Getting tools set up
  • Giving you basic background