epasseto/UdacityFirstProject

UdacityFirstProject

Project Submission for First Project of Udacity Data Science Course - Write a Data Science Project

Weblog here

Weblog project here

Description:

This project uses data from the Stack Overflow Survey for the years 2011 and 2020. The main idea is to compare which programming languages different groups of users prefer, how those preferences change over time, and what to expect from the next generation regarding programming language usage.

Insights and steps

Some basic questions about how programmers' preferences have evolved over time can say something about the future of programming languages:

  • what is the general evolution of Stack Overflow users over time;
  • which is the most preferred language;
  • and how it has evolved over time;
  • whether the US reality in programming language usage differs from the rest of the world;
  • what the trend is, based on Generation Z preferences.
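As an illustration of how the "most preferred language" question can be approached, here is a minimal sketch with Pandas. It is not taken from udacourse.py: the function name `language_share` is hypothetical, and it assumes the languages are stored in a single column of semicolon-separated answers (assumed here to be called `LanguageWorkedWith`). The 2011 file may be laid out differently and need its own reshaping first.

```python
import pandas as pd


def language_share(df, column="LanguageWorkedWith", sep=";"):
    """Return the share of respondents that mention each language.

    Assumption: `column` holds one string per respondent, with the
    languages separated by `sep`. Respondents who left the question
    empty are ignored.
    """
    answers = df[column].dropna()
    # One row per (respondent, language) pair, then count each language.
    counts = answers.str.split(sep).explode().str.strip().value_counts()
    # Divide by the number of respondents who answered the question.
    return counts / len(answers)


if __name__ == "__main__":
    # Tiny made-up sample, only to show the shape of the result.
    sample = pd.DataFrame(
        {"LanguageWorkedWith": ["Python;SQL", "Python", None, "C;Python"]}
    )
    print(language_share(sample))
```

Running the same function on each survey year, or on a subset such as US respondents only, gives shares that can be compared directly.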

What you will find here

  1. A library of about 20 functions for automatically preprocessing and reshaping dataframes:
  • you can adapt them as you wish;
  • and use Jupyter Notebook, or another Python IDE, to fork the project and test your changes.
  2. A fully documented Jupyter Notebook that calls the functions and shows the results:
  • steps and challenges are documented in this Notebook;
  • graphs illustrate my findings.
  3. Both .csv datasets necessary for running the project:
  • the readme for the 2020 survey is also attached to this commit;
  • you can find everything related to these datasets, including licences for using the data, here

Licence of this commit:

This software is released under the MIT Licence. The complete description of the licence can be read on Wikipedia

Author:

Eduardo Passeto epasseto@gmail.com


Versions:

  • 0.1 and 0.2 pre-alpha, which were incomplete versions, made between April and May 2021;
  • version 1.0, from July 2021, is the current working version of the system.

Necessary files for running the project:

  1. Input files:
  • 2011 Stack Overflow Survey Results 2011.csv → for 2011 data
  • survey_results_public.csv → for 2020 data
  2. Jupyter Notebook:
  • First_Projectn.ipynb → Jupyter Notebook (Python 3.X)
  3. Python library:
  • udacourse.py → Python 3.X functions collection
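The two input files can be loaded as in the sketch below. This is only an assumption about how to read them, not the code used in the Notebook: the helper `load_surveys` is hypothetical, and the files may need extra `read_csv` options (for example a different encoding or header row) depending on how they were exported.

```python
from pathlib import Path

import pandas as pd

# File names as listed in this README.
FILE_2011 = "2011 Stack Overflow Survey Results 2011.csv"
FILE_2020 = "survey_results_public.csv"


def load_surveys(data_dir="."):
    """Load both survey files from `data_dir` and return (df_2011, df_2020)."""
    data_dir = Path(data_dir)
    # Fail early, with a clear message, if an input file is not in place.
    missing = [
        name for name in (FILE_2011, FILE_2020)
        if not (data_dir / name).is_file()
    ]
    if missing:
        raise FileNotFoundError(f"Missing input files: {missing}")
    # low_memory=False reads each file in one pass, which avoids
    # mixed-type warnings on wide survey files.
    df_2011 = pd.read_csv(data_dir / FILE_2011, low_memory=False)
    df_2020 = pd.read_csv(data_dir / FILE_2020, low_memory=False)
    return df_2011, df_2020
```

Calling `load_surveys()` from the folder that holds both .csv files returns one dataframe per survey year.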

External libraries:

  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • math (part of the Python standard library)
  • time (part of the Python standard library)
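Before opening the Notebook, it can help to check that the third-party libraries are installed. The sketch below is an optional helper, not part of udacourse.py: the name `missing_libraries` is hypothetical, and it only checks that each module can be found, not which version is installed.

```python
import importlib.util

# Third-party modules used by the project, by import name.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn"]


def missing_libraries(names=REQUIRED):
    """Return the module names from `names` that cannot be found."""
    return [
        name for name in names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Please install:", ", ".join(missing))
    else:
        print("All required libraries were found.")
```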
