MHC × MTA Inaugural Datathon Project 2024
This project analyzes Fair Fares ridership data in NYC to understand how expanding eligibility could improve access to affordable public transportation. Using Python, SQL, and Tableau, we identify usage patterns that inform policy recommendations for making public transit more equitable for all New Yorkers.
The Fair Fares NYC program provides discounted MetroCards to eligible residents, currently those with household incomes at or below 120% of the Federal Poverty Level (FPL). Our analysis evaluates the potential impact of expanding eligibility to 200% FPL, with a focused examination of six key neighborhoods:
- Elmhurst/Jackson Heights
- Flushing
- Sunset Park
- Brownsville
- Morrisania
- Highbridge
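To make the 120% and 200% FPL thresholds concrete, here is a minimal sketch of the income-limit arithmetic. The function names and the `PLACEHOLDER_FPL` amount are illustrative assumptions, not official figures; real FPL amounts depend on household size and year.

```python
def income_limit(fpl_amount, percent_of_fpl):
    """Annual income limit for a threshold expressed as a percent of FPL."""
    return fpl_amount * percent_of_fpl / 100


def is_eligible(annual_income, fpl_amount, percent_of_fpl):
    """True if the income is at or below the given percent-of-FPL limit."""
    return annual_income <= income_limit(fpl_amount, percent_of_fpl)


# Hypothetical FPL amount, chosen only to keep the arithmetic easy to follow.
PLACEHOLDER_FPL = 15_000

current_limit = income_limit(PLACEHOLDER_FPL, 120)   # current threshold
expanded_limit = income_limit(PLACEHOLDER_FPL, 200)  # proposed threshold

# A household earning 25,000 would only qualify under the expanded threshold.
newly_eligible = (
    not is_eligible(25_000, PLACEHOLDER_FPL, 120)
    and is_eligible(25_000, PLACEHOLDER_FPL, 200)
)
print(current_limit, expanded_limit, newly_eligible)
```

Households with incomes between the two limits are the group the proposed expansion would newly cover.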
How can expanding Fair Fares eligibility criteria to 200% of the FPL improve access to affordable public transportation in underserved NYC neighborhoods? Which subway lines, bus routes, and stations should be prioritized for expansion to maximize equity and usage?
- Usage Pattern Analysis
  - Identified peak usage times (8:00 AM and 6:00 PM)
  - Discovered 1.77x higher weekday vs weekend usage
  - Mapped network effects between connected stations
- Neighborhood Accessibility Analysis
  - Found highest adoption in Morrisania (9.57%) and Highbridge (8.50%)
  - Identified critical transfer points in each focus area
  - Mapped geographic spread of program adoption
- CUNY Campus Impact
  - Analyzed ridership patterns around 2-year and 4-year institutions
  - Identified peak academic hours usage
  - Evaluated bus-subway integration near campuses
- Policy Insights
  - Developed data-driven expansion recommendations
  - Identified high-impact transfer points
  - Created temporal usage profiles
- Python: Advanced data processing with memory optimization
- SQL: Complex queries for pattern analysis
- Visualization: Interactive maps and statistical charts
- Jupyter: Documented analysis workflow
- Version Control: Git-based collaboration
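The temporal findings listed above (peak usage hours and the weekday vs weekend ratio) can be computed along the following lines. This is a minimal sketch, not the project's actual code: the `(timestamp, ridership)` record layout, the function names, and the toy data are all assumptions made for illustration, and it uses only the standard library so it runs without the project's dependencies.

```python
from collections import defaultdict
from datetime import datetime


def hourly_profile(records):
    """Return {hour: mean ridership over all records in that hour}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, riders in records:
        totals[timestamp.hour] += riders
        counts[timestamp.hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in totals}


def peak_hours(profile, top_n=2):
    """Return the top_n hours with the highest mean ridership, busiest first."""
    return sorted(profile, key=profile.get, reverse=True)[:top_n]


def weekday_weekend_ratio(records):
    """Mean daily ridership on weekdays divided by mean daily ridership on weekends."""
    daily = defaultdict(float)
    for timestamp, riders in records:
        daily[timestamp.date()] += riders
    weekday = [total for day, total in daily.items() if day.weekday() < 5]
    weekend = [total for day, total in daily.items() if day.weekday() >= 5]
    if not weekday or not weekend:
        raise ValueError("need at least one weekday and one weekend day")
    return (sum(weekday) / len(weekday)) / (sum(weekend) / len(weekend))


# Toy records invented for illustration; they are not MTA data.
sample = [
    (datetime(2024, 3, 4, 8), 200),   # Monday
    (datetime(2024, 3, 4, 13), 90),
    (datetime(2024, 3, 4, 18), 180),
    (datetime(2024, 3, 9, 8), 100),   # Saturday
    (datetime(2024, 3, 9, 13), 70),
    (datetime(2024, 3, 9, 18), 65),
]

profile = hourly_profile(sample)
print(peak_hours(profile))
print(weekday_weekend_ratio(sample))
```

Averaging per calendar day before taking the ratio keeps the comparison fair even though a week has five weekdays and only two weekend days.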
mhcXmta_datathon_project/
├── data/
│ ├── raw/ # Original MTA datasets
│ │ ├── readme.md # Access to large datasets from Dropbox
│ ├── processed/ # Cleaned and optimized data
│ ├── readme.md # Access to large datasets from Dropbox
│ ├── additional_reports/ # Supporting documentation
│ ├── Fair-Fares-Expansion-Full-Report.pdf
│ ├── Public-Transportation-Subsidies-and-Racial-Equity.pdf
│
├── notebooks/
│ ├── 01_exploratory_analysis.ipynb # Initial data exploration
│ ├── 02_neighborhood_analysis.ipynb # Geographic patterns
│ ├── 03_cuny_analysis.ipynb # Campus impact
│ ├── 04_visualizations.ipynb # Complex pattern analysis
│
├── sql/
│ ├── subway_ridership_queries.sql # Subway analysis queries
│ ├── bus_ridership_queries.sql # Bus pattern queries
│
├── tableau/
│ ├── Comparison of Geographics.twb # Tableau work
│ ├── subway_chart.png
│ ├── subway_map_1.png
│ ├── subway_map_2.png
│ ├── bus_treemap.png
│ ├── NYC Aging Service Providers.cpg # All these files for practice
│ ├── NYC Aging Service Providers.dbf
│ ├── NYC Aging Service Providers.prj
│ ├── NYC Aging Service Providers.qmd
│ ├── NYC Aging Service Providers.shp
│ ├── NYC Aging Service Providers.shx
│
├── results/
│ ├── charts/ # Statistical visualizations
│ │ ├── eda_viz1s.png # EDA visualization for Subway (s)
│ │ ├── eda_viz1b.png # EDA visualization for Bus (b)
│ │ ├── eda_viz2s.png
│ │ ├── eda_viz2b.png
│ │ ├── eda_viz3s.png
│ │ ├── eda_viz3b.png
│ │ ├── eda_viz4s.png
│ │ ├── eda_viz4b.png
│ │ ├── eda_viz5.png
│ │ ├── bus&subway_viz.png
│ │ ├── bus&subway_neighborhood_viz.png
│ │ ├── bus&subway_peak_k/share_viz.png
│ │ ├── bus&subway_cuny_viz.png
│ │ ├── cuny_stations_viz.png
│ │ ├── 4_year_transfer_times_viz.png
│ │ ├── 2_year_transfer_times_viz.png
│ │ ├── network_analysis_viz.png
│ │ ├── time_series_viz.png
│ │ ├── cross-system_transfers_viz.png
│ ├── maps/ # Interactive geographic analysis
│ ├── fair_fares_heatmap.html
│
├── colab/
│ ├── main.ipynb # Coding performed during Datathon Day 1
│ ├── refined_incomplete_work.ipynb # Personal practice
│
├── config/ # Analysis parameters
│ ├── settings.json
│
├── docs/ # Technical documentation
│ ├── methodology.md
│ ├── references.md
│
├── readme.md
├── .gitignore
├── code.py
├── requirements.txt
├── environment.yml
└── LICENSE
- Implemented chunked processing for 10GB+ datasets
- Developed memory-efficient analysis pipelines
- Created optimized data structures
- Mapped station connectivity patterns
- Analyzed transfer behaviors
- Identified usage correlation clusters
- Discovered peak usage periods
- Mapped seasonal variations
- Analyzed weekday/weekend differences
- Evaluated bus-subway coordination
- Analyzed transfer efficiencies
- Mapped system synchronization
- Strong correlation in Fair Fares usage between connected stations
- Higher adoption rates in areas with efficient transfers
- Clear geographic spread patterns
- Morning peak: 8:00 AM (2x average ridership)
- Evening peak: 6:00 PM (1.8x average)
- October shows highest monthly usage
- 98% correlation between bus and subway patterns
- 3-hour offset between mode peaks
- Higher transfer rates on bus routes
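The chunked processing and the bus vs subway correlation mentioned above could be combined roughly as follows. This is a sketch under stated assumptions: the column names (`mode`, `hour`, `ridership`), the helper names, and the toy CSV are hypothetical, and the standard-library `csv` module stands in for whatever tooling the notebooks actually use. The point is that only one chunk of rows is held in memory at a time while per-hour totals accumulate.

```python
import csv
import io
from itertools import islice
from math import sqrt


def iter_chunks(reader, chunk_size):
    """Yield lists of at most chunk_size rows so only one chunk is in memory."""
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def hourly_totals(csv_file, chunk_size=100_000):
    """Stream a CSV and accumulate ridership per (mode, hour)."""
    totals = {}
    reader = csv.DictReader(csv_file)
    for chunk in iter_chunks(reader, chunk_size):
        for row in chunk:
            key = (row["mode"], int(row["hour"]))
            totals[key] = totals.get(key, 0.0) + float(row["ridership"])
    return totals


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy CSV invented for illustration; a real run would pass an open file.
toy_csv = io.StringIO(
    "mode,hour,ridership\n"
    "subway,7,10\nbus,7,6\n"
    "subway,8,30\nbus,8,14\n"
    "subway,9,20\nbus,9,11\n"
    "subway,8,10\nbus,8,6\n"
)

totals = hourly_totals(toy_csv, chunk_size=3)  # tiny chunks to exercise the loop
hours = sorted({hour for _, hour in totals})
subway = [totals[("subway", h)] for h in hours]
bus = [totals[("bus", h)] for h in hours]
correlation = pearson(subway, bus)
print(hours, subway, bus, round(correlation, 3))
```

Because the aggregation keeps only a small dictionary of totals, memory use depends on the number of distinct (mode, hour) pairs rather than on the size of the input file.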
Our analysis demonstrates that expanding Fair Fares eligibility to 200% FPL could significantly improve transit accessibility for working New Yorkers. Key opportunities include optimizing transfer points, adjusting service timing, and enhancing cross-mode integration. Strategic implementation focusing on high-impact areas could maximize the program's effectiveness while maintaining operational efficiency.
- Clone the repository:
    git clone https://github.com/BasirS/mhcXmta_datathon_project.git
    cd mhcXmta_datathon_project
- Install dependencies:
    pip install -r requirements.txt
- Run Jupyter Notebooks: Open the notebooks (.ipynb files) in your preferred environment (Google Colab, JupyterLab, etc.) and run them in sequential order.
- Run SQL scripts: Use your SQL client to run the provided SQL queries for detailed analysis.
- Open Tableau workbooks: Use Tableau Desktop to load the .twb files and explore the dashboards and other visualizations.
- Explore the interactive visualizations in results/maps!
- "Making Fair Fares More Fair for More People"
- "Public Transportation Subsidies and Racial Equity"
- NYC Open Data portal
To share these insights, I created a presentation that summarizes the findings and includes all visualizations:
Together, we're making public transportation more equitable for all New Yorkers!