Predict which Tweets are about real disasters and which ones are not.
I'm bringing my Kaggle competition work over to this repository! On Kaggle, I have several datasets and notebooks built around unconventional challenges, such as my sentiment analysis of Rick and Morty scripts (https://www.kaggle.com/code/isabelgonalves/an-lise-de-sentimentos?kernelSessionId=225967123), which I think you'll enjoy!
Kaggle has been an amazing platform for developing a more creative and critical approach to data analysis. Come find me there too!
Welcome to this Kaggle competition!
This challenge is ideal for data scientists looking to get started with Natural Language Processing (NLP). The dataset is manageable in size, and all analysis can be performed in Kaggle Notebooks, a free, cloud-based Jupyter environment that requires no setup.
Twitter has become a crucial communication channel during emergencies. With smartphones widely available, people can report disasters in real time. This has led organizations such as disaster relief agencies and news outlets to explore ways to automatically monitor Twitter for relevant information.
In this project, we develop machine learning models that classify tweets as disaster-related or not. Exploratory data analysis helps us understand the data, TF-IDF vectorization turns the tweet text into numeric features, and logistic regression makes the predictions.
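To make the TF-IDF idea concrete, here is a minimal from-scratch sketch using only the Python standard library. The toy tweets are invented for illustration and are not taken from the competition data. The weighting shown (term frequency times log of inverse document frequency) is one common textbook variant; libraries typically apply smoothed versions, so the exact numbers in the notebooks may differ.

```python
import math
from collections import Counter

# Toy tweets, invented for illustration (not from the competition data).
tweets = [
    "forest fire near the town",
    "evacuation ordered after the fire",
    "this new album is fire",
    "i love the summer sun",
]


def tokenize(text):
    """Lowercase and split on whitespace; a real pipeline would clean text further."""
    return text.lower().split()


docs = [tokenize(t) for t in tweets]
n_docs = len(docs)

# Document frequency: number of tweets that contain each term at least once.
doc_freq = Counter(term for doc in docs for term in set(doc))


def tfidf(doc):
    """Return {term: weight} with tf = count / len(doc) and idf = log(N / df)."""
    counts = Counter(doc)
    return {
        term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
        for term, count in counts.items()
    }


weights = [tfidf(doc) for doc in docs]

# Show the highest-weighted term of each tweet.
for tweet, w in zip(tweets, weights):
    top = max(w, key=w.get)
    print(f"{tweet!r}: top term = {top!r}")
```

In this sketch, a word like "fire" that appears in several tweets receives a lower weight than a rarer word like "forest", which is what lets a classifier focus on the more distinctive terms.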
This repository includes:
Interactive visualizations using Matplotlib & Plotly
Word clouds for frequent terms in disaster vs. non-disaster tweets
Feature engineering with TF-IDF
A classification model using Logistic Regression
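The last two items can be sketched together as a single pipeline. This is a minimal illustration that assumes scikit-learn is installed; it is not necessarily the exact code used in the notebooks. The example tweets, labels, and parameter choices (`stop_words="english"`, `max_iter=1000`) are assumptions made for the sketch.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy examples invented for illustration; 1 = disaster, 0 = not a disaster.
train_texts = [
    "Forest fire near the town, residents evacuated",
    "Earthquake shakes the city, buildings damaged",
    "Flood waters rising, roads closed by police",
    "This new album is fire, listening all day",
    "I love the summer sun and the beach",
    "Traffic was a disaster but I made it to the party",
]
train_labels = [1, 1, 1, 0, 0, 0]

# TF-IDF turns each tweet into a sparse numeric vector;
# logistic regression then learns a weight for each term.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(train_texts, train_labels)

new_tweets = [
    "Wild fire spreading fast, residents told to leave",
    "Best pizza in town tonight",
]
predictions = model.predict(new_tweets)
probabilities = model.predict_proba(new_tweets)[:, 1]  # P(disaster)

for tweet, label, prob in zip(new_tweets, predictions, probabilities):
    print(f"{tweet!r} -> label={label}, P(disaster)={prob:.2f}")
```

Wrapping both steps in one pipeline means the vectorizer is fitted only on the training text, so the same vocabulary and weights are reused when scoring new tweets. With the real competition data, the toy lists would be replaced by the tweet text and label columns of the training file.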
This repository will be continuously updated with new queries, methods, and insights. Contributions and suggestions are always welcome!
For questions or collaborations, feel free to reach out via:
🔗 LinkedIn: https://www.linkedin.com/in/belcruz/
📧 Email: isabel.gon.adm@gmail.com
📌 Author: Bel – Technical Data Analyst, Government of São Paulo, Brazil