DataOnATangent/README.md

Welcome Banner

hello_world

About Me

  • 😄 Pronouns: She/Her/Hers
  • 🔭 I’m currently working on: Tableau Certification
  • ❤️ My favorite language: SQL
  • 🌱 I’m currently learning: neural nets and Mandarin
  • 👯 I’m always looking to collaborate with: scientists from any field
  • 💬 Ask me about: anything, I am happy to help
  • 🌍 I support: Latinas in Tech, AllStar Code, The Foundation to Decrease Worldsuck
  • 💜 Interests: philosophy, travel, dachshunds, internet culture, video games, Star Trek
  • ⚡ Fun fact: My ultimate dream is to be on Star Trek and don a yellow uniform. 🖖

🛠  Tech Stack

  • 👾   Python PostgreSQL MS Excel
  • 🌐   HTML5 CSS JavaScript
  • ⚙️   Git GitHub Markdown
  • 💻   Windows iOS

📝 Recent Projects

An NLP project that predicts review/company ratings from the text of Glassdoor reviews. Models tested include KNN, random forest, XGBoost, and LightGBM, among others. Data was web-scraped from Glassdoor using Selenium.
Libraries Utilized: Numpy, Pandas, Matplotlib, Seaborn, Statsmodels, Sklearn, NLTK, XGBoost, Selenium

A study of CO2 emission averages that uses the forecasting models ARMA, ARIMA, and SARIMA to predict CO2 levels in the coming years. Data was sourced from NOAA and is based on weekly average measurements. I hope to use this to highlight the need for further conservation efforts.
Libraries Utilized: Numpy, Pandas, Matplotlib, Seaborn, Statsmodels, Sklearn, PMDARIMA

A case study using multiple classification models to predict a user's occupation from the features found on their OKCupid dating profile. Models tested include random forest, AdaBoost, and KNN, among others. Final predictions were made using logistic regression. Data was sourced from OKCupid.com for the San Francisco area.
Libraries Utilized: Scikit-Learn, Pandas, Statsmodels, Numpy, Matplotlib, Seaborn, Scipy

A linear regression modeling project to predict housing prices in King County, WA, USA. The project sought to increase accuracy through feature engineering, one-hot encoding, and feature selection.
Libraries Utilized: Scikit-Learn, Pandas, Statsmodels, Numpy, Matplotlib, Seaborn
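
For readers unfamiliar with one-hot encoding, here is a minimal sketch of the idea in plain Python. It is not the code from the project itself: the `one_hot_encode` helper, the `condition` column, and the sample rows are all hypothetical and chosen only for illustration. The first category is dropped so the indicator columns are not perfectly collinear in a linear regression.

```python
def one_hot_encode(rows, column, drop_first=True):
    """Replace a categorical column with 0/1 indicator columns.

    Hypothetical helper for illustration; not taken from the project.
    """
    # Sort so the column order is deterministic.
    categories = sorted({row[column] for row in rows})
    if drop_first:
        # Drop one level to avoid perfectly collinear indicators.
        categories = categories[1:]
    encoded = []
    for row in rows:
        # Copy every field except the categorical one.
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded


# Made-up sample rows, not real King County data.
houses = [
    {"sqft": 1180, "condition": "average"},
    {"sqft": 2570, "condition": "good"},
    {"sqft": 770, "condition": "fair"},
]
encoded = one_hot_encode(houses, "condition")
```

Here "average" becomes the baseline level, so a house in average condition has zeros in both `condition_fair` and `condition_good`.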

Exploratory data analysis project of Yelp API data for Flatiron School's Data Science Immersive Program.
Libraries Utilized: Pandas, Numpy, Matplotlib, Seaborn

🤝🏻  Connect with Me


LinkedIn  Twitter  Medium  Gmail 


stats card guy


visitors

Pinned

  1. Representative_Profiles_Machine_Learning_Project Public

    Jupyter Notebook 2

  2. Kings_County_Seattle_Housing_Project Public

    A project looking at how various factors affect housing prices to predict the expected value of a new house.

    Jupyter Notebook

  3. Yelp_API_ETL_Project Public

    Jupyter Notebook 1

  4. Astroraf/CO2_predictions Public

    Jupyter Notebook 1