AkashB29/city_pollution_analytics

City Pollution Analytics - Snowflake Project

Project Overview

A real-time air quality and weather monitoring system. It ingests pollution data from the WAQI API and weather data from the OpenWeatherMap API into Snowflake, then visualizes both through an interactive Streamlit dashboard.

Architecture

Data Sources

  • WAQI API: Provides the real-time Air Quality Index (AQI) along with PM2.5, PM10, NO2, SO2, CO, and O3 measurements
  • OpenWeatherMap API: Provides weather data including temperature, humidity, wind speed, and pressure
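
As a rough illustration of the fields listed above, the helpers below flatten one WAQI feed response and one OpenWeatherMap current-weather response into flat records. The JSON paths (`data.aqi`, `data.iaqi.<pollutant>.v`, `main.temp`, `wind.speed`) follow the response shapes those APIs publish. The function names and output keys are assumptions made for this sketch and are not taken from ingest_data.py.

```python
from typing import Any, Dict, Optional

# Pollutant keys as they appear under "iaqi" in a WAQI feed response.
POLLUTANTS = ("pm25", "pm10", "no2", "so2", "co", "o3")


def _to_float(value: Any) -> Optional[float]:
    """Return value as a float, or None when it is missing or non-numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def parse_waqi(payload: Dict[str, Any]) -> Dict[str, Optional[float]]:
    """Flatten a WAQI feed response into one record (hypothetical helper)."""
    if payload.get("status") != "ok":
        raise ValueError(f"WAQI returned status {payload.get('status')!r}")
    data = payload["data"]
    iaqi = data.get("iaqi", {})
    record = {"aqi": _to_float(data.get("aqi"))}
    for name in POLLUTANTS:
        # A station may not report every pollutant; missing ones become None.
        record[name] = _to_float(iaqi.get(name, {}).get("v"))
    return record


def parse_weather(payload: Dict[str, Any]) -> Dict[str, Optional[float]]:
    """Flatten an OpenWeatherMap current-weather response (hypothetical helper)."""
    main = payload.get("main", {})
    wind = payload.get("wind", {})
    return {
        "temperature": _to_float(main.get("temp")),
        "humidity": _to_float(main.get("humidity")),
        "pressure": _to_float(main.get("pressure")),
        "wind_speed": _to_float(wind.get("speed")),
    }
```

Returning `None` for absent readings keeps the record shape constant, which makes it simpler to load into fixed-column fact tables.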

Data Pipeline

  1. Ingestion Script (ingest_data.py):

    • Fetches pollution and weather data for 3 Indian cities: Bangalore, Delhi, and Mumbai
    • Writes the data to Snowflake tables via UPSERT operations, so re-ingesting a reading updates the existing row rather than adding a duplicate
    • Runs periodically to build up time-series data
  2. Database Schema (Snowflake):

    • CITY_DIM: Dimension table with city metadata (name, country, coordinates)
    • POLLUTION_FACT: Fact table with AQI and pollutant measurements
    • WEATHER_FACT: Fact table with weather observations
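
Snowflake has no `UPSERT` statement; an upsert is normally written as a `MERGE`. The sketch below builds such a statement for one row. The table name comes from the schema above, but the column names (`CITY_ID`, `OBSERVED_AT`, `AQI`, `PM25`) and the helper `build_merge_sql` are assumptions for illustration, since the real columns are not listed here. The `%s` placeholders assume the `pyformat` parameter style.

```python
from typing import Sequence


def build_merge_sql(
    table: str, key_cols: Sequence[str], value_cols: Sequence[str]
) -> str:
    """Build a single-row MERGE (upsert) statement with %s placeholders.

    key_cols identify an existing row; value_cols are overwritten on a match.
    Identifiers are interpolated directly, so pass only trusted names.
    """
    if not key_cols or not value_cols:
        raise ValueError("need at least one key column and one value column")
    all_cols = [*key_cols, *value_cols]
    # One-row source built from bound parameters, one per column.
    source = ", ".join(f"%s AS {col}" for col in all_cols)
    on_clause = " AND ".join(f"t.{col} = s.{col}" for col in key_cols)
    set_clause = ", ".join(f"{col} = s.{col}" for col in value_cols)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{col}" for col in all_cols)
    return (
        f"MERGE INTO {table} t "
        f"USING (SELECT {source}) s "
        f"ON {on_clause} "
        f"WHEN MATCHED THEN UPDATE SET {set_clause} "
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


# Hypothetical column layout for POLLUTION_FACT.
sql = build_merge_sql("POLLUTION_FACT", ["CITY_ID", "OBSERVED_AT"], ["AQI", "PM25"])
```

The resulting string would be passed to a database cursor together with the row values in the same column order (keys first, then values).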

Frontend

  • Streamlit Dashboard (app.py):
    • Interactive city selector
    • Real-time visualization of AQI trends
    • Temperature and humidity charts
    • Comprehensive pollution data table with statistics
    • Responsive layout with multiple metrics
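
The statistics shown with the pollution data table could come from a small aggregation like the one below. This is a dependency-free sketch: `summarize` and the row layout are hypothetical, and app.py may well compute the same figures with Pandas instead.

```python
from statistics import mean
from typing import Any, Dict, Optional, Sequence


def summarize(
    rows: Sequence[Dict[str, Any]], fields: Sequence[str]
) -> Dict[str, Dict[str, Optional[float]]]:
    """Compute min, max, mean, and count per field, skipping missing values."""
    summary: Dict[str, Dict[str, Optional[float]]] = {}
    for field in fields:
        # Ignore rows where the field is absent or explicitly None.
        values = [row[field] for row in rows if row.get(field) is not None]
        if values:
            summary[field] = {
                "min": min(values),
                "max": max(values),
                "mean": mean(values),
                "count": len(values),
            }
        else:
            summary[field] = {"min": None, "max": None, "mean": None, "count": 0}
    return summary
```

Each inner dictionary maps directly onto one row of a summary table or onto a set of metric widgets.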

Key Technologies

  • Snowflake: Data warehouse for scalable storage and fast querying
  • Python: For data ingestion and API integration
  • Streamlit: Rapid dashboard development framework
  • Plotly: Interactive data visualizations
  • Pandas: Data manipulation and transformation

Data Flow

WAQI API → Ingest Script → Snowflake → Streamlit Dashboard
OpenWeatherMap API ↗                          ↓
                                        Interactive Charts

How It Works

  1. The ingest script fetches live API data for the 3 cities
  2. The data is cleaned, transformed, and loaded into Snowflake
  3. The dashboard queries Snowflake for the latest metrics
  4. Users select a city and view its real-time trends
  5. Accumulated historical data enables trend analysis
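
The load-then-query cycle in the steps above can be tried locally with a minimal sketch. It uses an in-memory SQLite database purely as a stand-in for Snowflake so that it runs without credentials; the table layout and function names are assumptions, and SQLite's `ON CONFLICT ... DO UPDATE` plays the role a `MERGE` would play in Snowflake.

```python
import sqlite3
from typing import List, Optional, Tuple


def init_db(conn: sqlite3.Connection) -> None:
    """Create a simplified pollution fact table keyed by city and timestamp."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS pollution_fact (
            city TEXT NOT NULL,
            observed_at TEXT NOT NULL,  -- ISO-8601 UTC, so text order is time order
            aqi REAL,
            PRIMARY KEY (city, observed_at)
        )
        """
    )


def load(conn: sqlite3.Connection, city: str, observed_at: str, aqi: float) -> None:
    """Upsert one reading: insert it, or overwrite the row with the same key."""
    conn.execute(
        "INSERT INTO pollution_fact (city, observed_at, aqi) VALUES (?, ?, ?) "
        "ON CONFLICT (city, observed_at) DO UPDATE SET aqi = excluded.aqi",
        (city, observed_at, aqi),
    )


def latest(conn: sqlite3.Connection, city: str) -> Optional[Tuple[str, float]]:
    """Return the most recent (observed_at, aqi) for a city, or None."""
    return conn.execute(
        "SELECT observed_at, aqi FROM pollution_fact "
        "WHERE city = ? ORDER BY observed_at DESC LIMIT 1",
        (city,),
    ).fetchone()


def history(conn: sqlite3.Connection, city: str) -> List[Tuple[str, float]]:
    """Return all readings for a city in time order, for trend charts."""
    return conn.execute(
        "SELECT observed_at, aqi FROM pollution_fact "
        "WHERE city = ? ORDER BY observed_at",
        (city,),
    ).fetchall()
```

Because the primary key is the city plus the observation time, running the ingest step twice for the same reading leaves one row, while each new timestamp extends the history used for trend analysis.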

Key Features

  • ✅ Real-time data ingestion from multiple APIs
  • ✅ Dimensional data modeling (star schema)
  • ✅ Time-series analysis of air quality trends
  • ✅ Multi-city comparison capability
  • ✅ Interactive visualizations with drill-down capability
  • ✅ Comprehensive pollution metrics summary
