This repository contains two main sections:
- My Personal Data Portfolio: A detailed analysis of my music listening habits, collected via Apple Music.
- Fuel Prices Data Analysis: A deeper exploration of a public dataset provided by the French government, focusing on fuel prices across cities in France.
Each part showcases different aspects of data analysis and visualization, allowing me to demonstrate my skills in data wrangling, cleaning, and visual exploration.
In the first section, I analyze my personal music listening data extracted from Apple Music. The goal is to build a data-driven portfolio that highlights my skills in data visualization and interpretation while working with a familiar dataset: my own listening behavior.
The dataset includes the following relevant columns:
- Date Played: The exact date on which a song was played. This column allows for time-based analyses, such as identifying patterns across different days, weeks, or months.
- Play Count: The number of times a particular track was played. This is crucial for understanding which songs or artists I listen to the most, and it can help identify potential correlations with other variables like play duration.
- Skip Count: The number of times a track was skipped before it finished. This could be useful in identifying songs that I tend to abandon, possibly indicating a lack of interest or mood changes during listening sessions.
- Play Duration Seconds: The total time (in seconds) that I listened to a track during a session. This helps analyze the total time spent listening to music and could be used to evaluate listening patterns over long periods.
- Year: The year in which the track was played. This column allows for multi-year trends to be analyzed, providing insights into how my listening habits have evolved over time.
- Month: The specific month of the year when the track was played. Monthly trends can be analyzed to explore if there are seasonal patterns in my music consumption.
- Day: The day of the month when a track was played. This can be combined with the month and year data for precise tracking of my listening history.
- Day of Week: The day of the week (Monday, Tuesday, etc.) when a track was played. This allows for exploring weekly patterns in my listening habits, for example, determining whether I listen to more music on weekends compared to weekdays.
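As a minimal sketch of how the calendar columns above relate to `Date Played`, the snippet below derives them with Pandas. The sample rows and the `YYYY-MM-DD` date format are assumptions made for illustration; the actual Apple Music export may store dates differently.

```python
import pandas as pd

# Hypothetical sample rows; values and date format are placeholders,
# not taken from the real export.
plays = pd.DataFrame({
    "Date Played": ["2023-01-02", "2023-01-07", "2023-02-14"],
    "Play Count": [3, 1, 2],
    "Skip Count": [0, 1, 0],
    "Play Duration Seconds": [540, 95, 410],
})

# Parse the date once, then derive every calendar column from it.
plays["Date Played"] = pd.to_datetime(plays["Date Played"], format="%Y-%m-%d")
plays["Year"] = plays["Date Played"].dt.year
plays["Month"] = plays["Date Played"].dt.month
plays["Day"] = plays["Date Played"].dt.day
plays["Day of Week"] = plays["Date Played"].dt.day_name()

print(plays[["Date Played", "Year", "Month", "Day", "Day of Week"]])
```

Deriving the columns from a single parsed date keeps them consistent with each other, which matters when grouping by week or month later.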
Using this dataset, I will generate various visualizations to uncover insights into my listening behavior, such as:
- Which days of the week or months see the most music activity.
- How frequently I listen to specific artists or genres.
- How much time I spend listening to music over the course of a week, month, or year.
- Correlations between track play counts and play durations, and how often I skip certain tracks.
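The questions above could be approached with a few Pandas aggregations, sketched below under stated assumptions. The sample values are placeholders, and `skips_per_play` is a hypothetical metric introduced only for this illustration.

```python
import pandas as pd

# Hypothetical sample rows using the column names listed earlier.
plays = pd.DataFrame({
    "Day of Week": ["Monday", "Monday", "Saturday", "Saturday", "Sunday"],
    "Play Count": [2, 1, 4, 3, 5],
    "Skip Count": [1, 0, 0, 1, 0],
    "Play Duration Seconds": [300, 180, 900, 600, 1200],
})

# Total listening time per weekday, converted from seconds to hours.
hours_by_day = (
    plays.groupby("Day of Week")["Play Duration Seconds"]
    .sum()
    .div(3600)
    .sort_values(ascending=False)
)

# Pearson correlation between play counts and play durations.
corr = plays["Play Count"].corr(plays["Play Duration Seconds"])

# Hypothetical metric: total skips divided by total plays.
skips_per_play = plays["Skip Count"].sum() / plays["Play Count"].sum()

print(hours_by_day)
print(f"correlation: {corr:.2f}, skips per play: {skips_per_play:.2f}")
```

The same `groupby` pattern works for `Month` or `Year` by swapping the grouping column.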
The goal of this section is not just to explore personal data but also to showcase my ability to visualize data creatively and extract meaningful insights from it.
The second section of this project focuses on a public dataset provided by the French government. This dataset contains fuel prices across different cities in France, allowing for a wide range of analyses.
In this analysis, I will:
- Compare fuel prices across different cities and regions in France.
- Explore the evolution of fuel prices over time.
- Identify regional trends or disparities in fuel pricing.
- Perform statistical analyses to understand factors influencing price variations (e.g., geographical location, type of fuel, time of year).
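One way to start the city comparison is a pivot table of mean prices, sketched below. The column names (`city`, `fuel`, `price_eur_per_l`) and all prices are assumptions made up for illustration; the government dataset may use a different schema.

```python
import pandas as pd

# Hypothetical rows; prices are invented placeholders, not real observations.
prices = pd.DataFrame({
    "city": ["Paris", "Paris", "Lyon", "Lyon", "Lille", "Lille"],
    "fuel": ["Gazole", "SP95", "Gazole", "SP95", "Gazole", "SP95"],
    "price_eur_per_l": [1.85, 1.95, 1.75, 1.88, 1.70, 1.84],
})

# Mean price per city (rows) and fuel type (columns).
mean_by_city_fuel = prices.pivot_table(
    index="city", columns="fuel", values="price_eur_per_l", aggfunc="mean"
)

# Gap between each city and the average across cities, per fuel type.
gap_to_average = mean_by_city_fuel - mean_by_city_fuel.mean()

print(mean_by_city_fuel)
print(gap_to_average.round(3))
```

Positive values in `gap_to_average` mark cities that are more expensive than the average for that fuel, which gives a first view of regional disparities.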
This part of the project will leverage data cleaning, exploratory data analysis (EDA), and advanced visualizations to identify patterns and trends within the dataset. Some of the key methods and plots will include:
- Time series analysis to observe how fuel prices fluctuate over months or years.
- Geographical mapping to visualize price differences across cities or regions.
- Bar charts and scatter plots to compare prices between different fuel types.
The project relies on the following tools:
- Python: For data cleaning, analysis, and visualization.
- Pandas: To manipulate and prepare the datasets.
- Plotly: For creating interactive visualizations.
- Jupyter Notebook: To structure and document the workflow.
Both sections of this project aim to demonstrate my ability to handle different types of data, perform thorough analysis, and communicate insights effectively through visualizations.
By pairing a personal dataset with a public one, I hope to show data manipulation, analysis, and visualization techniques that carry over to a variety of real-world scenarios.