Throughout this case study, I will be taking on the role of a junior analyst tasked with providing insights on Cyclistic, a fictional bike company located in Chicago. Utilizing each phase of the data analysis process (Ask, Prepare, Process, Analyze, Share, and Act), I will leverage data to answer key business questions, contributing to insightful, data driven decisions.
Data Source: divvy_tripdata (Data for this case study utilizes the following months: August 2024-July 2025)
SQL Queries:
1) Combining the Data
2) Exploring the Data
3) Cleaning the Data
4) Analyzing the Data
Visualizations: Tableau
Cyclistic: A bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use the bikes to commute to work each day.
Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members.
Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the pricing flexibility helps Cyclistic attract more customers, Lily Moreno (director of marketing and my manager) believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, Moreno believes there is a solid opportunity to convert casual riders into members. She notes that casual riders are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs.
Moreno has set a clear goal: Design marketing strategies aimed at converting casual riders into annual members. In order to do that, however, the team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics. Moreno and her team are interested in analyzing the Cyclistic historical bike trip data to identify trends.
Design marketing strategies aimed at converting casual riders into annual members.
Three questions will guide the future marketing program:
- How do annual members and casual riders use Cyclistic bikes differently?
- Why would casual riders buy Cyclistic annual memberships?
- How can Cyclistic use digital media to influence casual riders to become members?
For this case study, I have been assigned the first question to answer: How do annual members and casual riders use Cyclistic bikes differently?
Now that we have determined our business task and key question at hand, it's time to take the steps in preparing the data for eventual cleaning and analysis. As mentioned previously, we will be using the divvy_tripdata data to access historical ride information for our analysis. Within the trip data, there are files for each month beginning at April 2020 and ending at August 2025. For this case study, we will be using the ride data from August 2024 through July 2025. Each file has the same naming convention of YYYYMM-divvy-tripdata and includes the same information such as ride id, bike type, start times and stations, end times and stations, and whether the rider is a member or a casual. The data has been made available by Motivate International Inc. under this license.
This is public data that you can use to explore how different customer types are using Cyclistic bikes. But note that data-privacy issues prohibit you from using riders’ personally identifiable information. This means that we won’t be able to connect pass purchases to credit card numbers to determine if casual riders live in the Cyclistic service area or if they have purchased multiple.
During this phase, it's important to recognize where the data is located and how it is organzied, allowing you to better underdstand how to work with the data efficently. Taking into account the integrity of the data and how the information provided helps answer your question is also important, as this will contribute to a reliable analysis.
With the data prepared and downloaded, it's now time to begin the processing phase. Looking at the 12 Cyclistic files we have downloaded, there is a total count of ~5.6 million rows of data. Since Microsoft Excel can only have ~1 million rows of data within a worksheet (not meant to manage large amounts of data), we will be using BigQuery to combine the various datasets into one before cleaning and analyzing.
SQL Query: Combining the Data
In this step, we have downloaded each of the 12 months of data into BigQuery. Now that they are located within the warehouse, we will use SQL code to combine each of the datasets into one new dataset called combined_trips, containing 5,611,500 rows of data for the entire year.
SQL Query: Exploring the Data
Using our new dataset (combined_trips), we can now use SQL code to explore the data, helping us gain better insights on it's content. Exploring the data gives us a head start on narrowing down what to include/exclude for our analysis, while also allowing us to find any errors or inconsistencies that will need to be addressed during the cleaning phase. Below are some of the results from the queries I've run to familiarize myself with the data:
1. Total Number of Rows
2. Column Names and Data Types
As shown below, there are 3 different data types in the combined_trips dataset with the ride_id column as the primary key.
3. Total Null Values in Each Column
Below we can see the number of null values in each column. Knowing this information is vital, since we will not be utilizing any null values for our analysis. We now know which columns need to be addressed during our cleaning process.
4. Duplicate ride_id's & Length of ride_id's
There are no duplicate ride_id's in our combined_trips dataset, meaning we won't have to address the ride_id column duirng our cleaning. We have also confirmed every ride_id has the same amount of values (16 total).
5. Number of Trips per Rideable Type
There are 3 different rideable types that are observed, with electric bikes being the most popular, and electric scooters being the least popular.
6. Number of Trips per Rider Type
As shown, there are two types of riders: members and casuals, with members being the more popular type.
7. Rides Less Than a Minute or More Than a Day
Because there isn't much data associated to a ride that lasts less than a minute long, it's important to identify how many of these cases there are so we can remove them when cleaning. Also, rides that last longer than a day will also be removed, as these are most likely people who forgot to end their ride when arriving to their end station.
SQL Query: Cleaning the Data
Now that we have a better understanding of the combined_trips dataset, it's time to begin cleaning our dataset to prepare for analysis. The cleaned data will be inserted into a new dataset called cleaned_combined_trips. Based on our findings in the previous step, we will be performing the following:
- Removing any rows that have null values.
- Removing any rides that are less than a minute or longer than a day.
Additionally, we will be adding 3 more columns into this new dataset:
- The ride length in minutes will be added as ride_length for each ride that was taken.
- The day of the week each ride took place will be added as day_of_week.
- The month that each ride took place will be added as month.
After using SQL code to perform each of these actions, we are left with a brand new dataset called cleaned_combined_trips, containing a total of 3,766,232 rows.
SQL Query: Analyzing the Data
Visualizations: Tableau
With our data cleaned and our new cleaned_combined_trips dataset created, we have now reached the analyze and share phases of the process. During these phases, we will use SQL queries to generate results that can help answer our assigned question: How do annual members and casual riders use Cyclistic bikes differently?
Once we have these queries created, we will leverage Tableau to translate these findings into clear, meaningful visuals to communicate our findings in a simple, convenient manner.
1. 🚲Ride Distribution & Ride Duration: Members vs Casuals
Observation:
- Members make up the majority of total rides, showing their role as Cyclistic’s most consistent customers.
- Casual riders take fewer trips but ride for longer durations.
Interpretation:
- Members treat Cyclistic as a quick, reliable commuting option.
- Casual riders use Cyclistic for leisure and recreational trips.
2. 📅 Ride Patterns by Day of Week & Month (Line Chart)
Observation:
- Casual riders show clear seasonal and weekly fluctuations: they ride more in warmer months (summer peaks) and heavily on weekends.
- Members maintain a steady pattern across months and weekdays, with a smaller but consistent increase in summer.
- Casual ridership drops sharply during colder months, while members continue riding year-round.
Interpretation:
- Casual riders are more influenced by weather and leisure time, aligning with recreational use.
- Members rely on Cyclistic for commuting or daily transportation needs, explaining their stable ridership across seasons and weekdays.
- Weekday vs weekend differences highlight purpose of use: commuting vs leisure.
3. 🚴 Rideable Type Preference (Bar Chart + Pie Chart)
Observation:
- Members overwhelmingly prefer classic bikes, logging most of their trips on this rideable type.
- Casual riders also favor classic bikes (778,527 rides), but they show a stronger relative interest in electric bikes (572,489 rides) compared to members.
- Electric scooters see minimal use from both groups.
Interpretation:
- Classic bikes remain the primary mode of transport for both members and casuals, which shows that the traditional option continues to be the backbone of Cyclistic’s ridership.
- However, casual riders experiment with electric bikes at a higher rate than members, suggesting that electrics may be appealing to more flexible, occasional users.
- Since electric scooters represent only a small fraction of trips, they likely don’t play a major role in influencing membership decisions.
4. ⏳Ride Duration Breakdown
Observation:
- Most rides (both member and casual) fall in the 6–30 minute range, which represents typical short-distance trips.
- Members have a much higher concentration in the 0–30 minute buckets, while casuals appear more frequently in the 31–60+ minute categories.
- Casuals are far more likely than members to take extended trips of over an hour.
Interpretation:
- Members use the service for short, consistent trips — supporting the idea of commuting or routine travel.
- Casual riders engage in longer exploratory or leisure rides, reinforcing the recreational nature of their usage.
- Extended ride times from casuals could also signal they’re less concerned about time-based pricing since they ride occasionally.
5. 🗺️ Start Station Map (Members vs Casuals)
**Interactive map, please click the following link to explore further: Tableau
Observation:
- Member rides are concentrated near commuter hubs such as downtown Chicago, train stations, and major business districts.
- Casual rides start more often near tourist destinations (lakefront, parks, and popular attractions).
- The density and size of points reflect higher overall casual activity at recreational areas compared to member-focused commuting hubs.
Interpretation:
- Members typically start rides close to where they live or commute (workplaces, transit stations).
- Casuals are more likely to start rides in areas where leisure activities begin, such as near beaches, parks, or bike paths.
- This shows distinct purpose of trip origins: transportation vs leisure.
6. 🗺️ End Station Map (Members vs Casuals)
**Interactive map, please click the following link to explore further: Tableau
Observation:
- Member rides often end close to where they began or at another commuter hub, suggesting point-to-point commuting.
- Casual rides frequently end in popular leisure destinations (parks, beaches, museums, shopping areas).
- Casuals also show more ride dispersion, with varied end points compared to members’ concentrated patterns.
Interpretation:
- Members are purpose-driven riders, using bikes to connect consistent start and end points (home ↔ work).
- Casuals treat the system as a flexible, exploratory service, often ending rides in entertainment or leisure districts.
- This reinforces that members are routine commuters, while casuals are experience-seekers.
Now that we have answered our assigned question (How do annual members and casual riders use Cyclistic bikes differently?), it's time to tie everything together and address the business task: Design marketing startegies aimed at converting casual riders into annual members.
1. Location-Based Marketing
- Place ads, QR codes, or digital signage at start/end stations popular with casuals (parks, beaches, tourist areas).
- Use app notifications when casuals start/end rides in these stations, prompting trial memberships.
2. Weekend & Seasonal Campaigns
- Offer weekend-only memberships or seasonal passes tailored to casual riders’ peak times.
- Promote annual memberships during summer months with discounts or limited-time offers.
- Electric Bike Incentives
- Since casuals use electric bikes more proportionally, create membership perks that reduce e-bike surcharges.
- Position membership as a way to save money on frequent e-bike usage.
3. Highlight Cost Savings for Long Rides
- Casual riders often take longer rides that cost more per trip.
- Marketing should emphasize the unlimited nature of membership, showing cost comparisons: “5 long rides per month cost the same as a membership.”
4. Leverage Tourism Partnerships
- Partner with Chicago tourism boards, hotels, and attractions to bundle memberships into visitor packages.
- Encourage repeat visitors to consider annual memberships with cross-promotions.
5. Commuter Conversion
- For casuals who ride frequently near downtown or train stations, highlight commuter-friendly benefits (no surge costs, guaranteed bike availability).
By aligning marketing strategies with casual riders’ behavioral differences (when, where, and how they ride), Cyclistic can create targeted campaigns that directly address the value of becoming a member, hopefully translating into a high conversion rate. This data-driven approach ensures the company focuses on the most promising opportunities for conversion and sustainable growth.
- Cleaned & modeled ~4M trip records in BigQuery.
- Built 6 dashboards: Rider Distribution/Average Length, Weekly/Monthly Trends, Rideable Types, Ride Duration, Start Map, End Maps.








