Google Data Analytics Professional Certificate – Capstone Project
This project explores bike-share usage patterns to understand how casual riders and annual members use Cyclistic bikes differently.
The objective is to support a marketing strategy focused on converting casual riders into annual members.
- R programming (dplyr, ggplot2, lubridate, janitor)
- Data cleaning and preparation
- Exploratory data analysis
- Data visualization
- Business reporting and communication
Data was provided by Motivate International Inc.
Datasets:
- Divvy Trips 2019 Q1
- Divvy Trips 2020 Q1
Data cleaning and preparation were performed in R using tidyverse packages.
Key steps included:
- Standardizing inconsistent column names
- Converting text timestamps to datetime format
- Creating `ride_length` and `day_of_week` variables
- Removing invalid trips (negative or longer than 24 hours)
- Merging datasets (combined size ~792,000 records)
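
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the two tibbles are tiny invented stand-ins for the quarterly files, the 2019 column names (`trip_id`, `start_time`, `end_time`, `usertype`) and the `Subscriber`/`Customer` labels are assumptions about the raw export, and the order of operations is one reasonable choice among several.

```r
library(dplyr)
library(lubridate)

# Toy stand-ins for the two quarterly files. All values are invented, and the
# column names are assumptions about the raw exports; check them against the
# real CSVs before reusing this.
q1_2019 <- tibble(
  trip_id    = c(1001, 1002, 1003),
  start_time = c("2019-01-01 00:04:37", "2019-01-05 10:00:00", "2019-01-07 08:15:00"),
  end_time   = c("2019-01-01 00:11:07", "2019-01-05 09:50:00", "2019-01-07 08:27:30"),
  usertype   = c("Subscriber", "Customer", "Subscriber")
)

q1_2020 <- tibble(
  ride_id       = c("A1", "B2"),
  started_at    = c("2020-01-04 14:00:00", "2020-01-06 07:30:00"),
  ended_at      = c("2020-01-04 14:40:00", "2020-01-08 09:00:00"),
  member_casual = c("casual", "member")
)

# Standardize the 2019 column names and rider labels to the 2020 convention.
# ride_id is converted to character so both tables share one type.
q1_2019 <- q1_2019 %>%
  rename(
    ride_id       = trip_id,
    started_at    = start_time,
    ended_at      = end_time,
    member_casual = usertype
  ) %>%
  mutate(
    ride_id       = as.character(ride_id),
    member_casual = if_else(member_casual == "Subscriber", "member", "casual")
  )

# Merge, parse the text timestamps, derive the new variables, and drop
# trips that are negative or longer than 24 hours.
all_trips <- bind_rows(q1_2019, q1_2020) %>%
  mutate(
    started_at  = ymd_hms(started_at),
    ended_at    = ymd_hms(ended_at),
    ride_length = as.numeric(difftime(ended_at, started_at, units = "mins")),
    day_of_week = wday(started_at, label = TRUE, abbr = FALSE)
  ) %>%
  filter(ride_length >= 0, ride_length <= 24 * 60)
```

In this toy data, one trip ends before it starts and one runs longer than 24 hours, so both are dropped by the final filter. `ride_length` is stored in minutes here; the original analysis may have used a different unit.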
- Casual riders take longer rides
  - Casual: 36.5 minutes
  - Member: 11.4 minutes
- Casual riders use the service more on weekends
  - Particularly on Saturdays and Sundays
- Members ride mainly on weekdays
  - Consistent with commuter usage
- Casual ride duration increases significantly on weekends
  - Indicates leisure or tourism-driven behavior
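
Comparisons like these can be computed with grouped summaries. The sketch below assumes a cleaned table named `all_trips` with `member_casual`, `day_of_week`, and `ride_length` (in minutes) columns; the table name and the toy values are illustrative only and do not reproduce the averages reported above.

```r
library(dplyr)

# Toy data only: invented values standing in for the cleaned trip table.
all_trips <- tibble(
  member_casual = c("casual", "casual", "member", "member", "member", "casual"),
  day_of_week   = c("Saturday", "Sunday", "Monday", "Tuesday", "Monday", "Monday"),
  ride_length   = c(40, 50, 10, 12, 14, 20)  # minutes
)

# Average ride length and ride count per rider type.
by_type <- all_trips %>%
  group_by(member_casual) %>%
  summarise(
    mean_ride_length = mean(ride_length),
    rides            = n(),
    .groups          = "drop"
  )

# Ride count and average ride length per rider type and day of the week.
by_type_day <- all_trips %>%
  group_by(member_casual, day_of_week) %>%
  summarise(
    rides            = n(),
    mean_ride_length = mean(ride_length),
    .groups          = "drop"
  )
```

`by_type` supports the member vs. casual comparison, while `by_type_day` covers both the weekday/weekend ride counts and the change in casual ride duration on weekends.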
- Average ride duration by day of the week
- Member vs. casual ride length
- Number of rides by day of the week
(Plots available in the images/ folder.)
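
As a rough illustration of how a chart such as "number of rides by day of the week" might be built with ggplot2, here is a sketch on toy data. The table name `all_trips`, the labels, and the commented-out output filename are assumptions for the example, not the project's actual plotting code.

```r
library(dplyr)
library(ggplot2)

# Toy data only: invented rows standing in for the cleaned trip table.
all_trips <- tibble(
  member_casual = c("casual", "casual", "casual", "member", "member", "member"),
  day_of_week   = factor(
    c("Saturday", "Saturday", "Monday", "Monday", "Monday", "Saturday"),
    levels = c("Monday", "Saturday")
  )
)

# Count rides per rider type and day.
rides_by_day <- all_trips %>%
  count(member_casual, day_of_week, name = "rides")

# Side-by-side bars make the member vs. casual contrast visible per day.
p <- ggplot(rides_by_day, aes(x = day_of_week, y = rides, fill = member_casual)) +
  geom_col(position = "dodge") +
  labs(
    title = "Number of rides by day of the week",
    x     = "Day of week",
    y     = "Rides",
    fill  = "Rider type"
  )

# Hypothetical output path; uncomment to write the plot to disk.
# ggsave("images/rides_by_day.png", p, width = 8, height = 5)
```

Storing `day_of_week` as a factor with explicit levels keeps the days in calendar order on the x-axis instead of alphabetical order.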
- Offer weekend-focused membership promotions
- Use experience-based marketing aimed at leisure riders
- Expand partnerships with employers to increase weekday member usage
- Column name inconsistencies between years → resolved with `rename()`
- Timestamp fields imported as text → corrected with `ymd_hms()`
- `ride_id` format mismatch → converted to character
- tidyverse dependency issues on Linux → resolved by installing packages individually
- Removed invalid trips (>24 hours or negative duration)
Final versions of the project reports are available below: