Comprehensive data warehouse and business intelligence solution demonstrating enterprise-level capabilities for Electric Vehicle (EV) charging infrastructure analytics
This project demonstrates a complete end-to-end data engineering and business intelligence implementation using Microsoft SQL Server stack. It transforms raw EV charging data into actionable insights through a sophisticated data warehouse architecture, supporting enterprise-level analytics and decision-making.
- Enterprise Architecture: Comprehensive design following Kimball dimensional modeling principles
- Multi-Source Integration: CSV, TXT, Excel, and SQL Server data sources
- Advanced Analytics: SSAS multidimensional cube with complex calculations
- Interactive Dashboards: Power BI with live SSAS connectivity
- Performance Optimization: Sub-second query response on large datasets
- Industry Application: Real-world EV charging business scenarios
This project uses the Palo Alto EV Charging Station Usage Open Data from Kaggle:
- Dataset: EV Charging Station Usage of California City
- Source: City of Palo Alto Open Data Portal
- Size: 85.45 MB, 291,000+ charging session records
- Time Period: 2011-2021 (10 years of charging data)
- License: U.S. Government Works
- 33 columns including charging sessions, energy consumption, GHG savings
- Geographic data: Station locations, addresses, coordinates
- Temporal data: Start/end times, duration, time zones
- Technical data: Port types, plug types, EVSE IDs
- Environmental data: Energy (kWh), GHG savings, gasoline displacement
- User data: User IDs, postal codes, counties
- US Holidays 2017-2020: JSON format for holiday impact analysis
- Weather Data: Palo Alto weather conditions
- User Information: User demographics and location data
The rapidly growing EV charging industry generates massive volumes of transactional data that must be analyzed for:
- Revenue Optimization: Dynamic pricing strategies and demand management
- Operational Efficiency: Station utilization and maintenance planning
- Customer Intelligence: User behavior analysis and retention strategies
- Infrastructure Planning: Network expansion and capacity management
- Real-time Analytics: Sub-second query performance on large datasets
- Multi-dimensional Analysis: Weather, holiday, and geographic impact assessment
- Scalable Architecture: Enterprise-grade design supporting future growth
- Actionable Insights: Data-driven decision making capabilities
Data Sources -> Staging Layer -> ETL Processing -> Data Warehouse -> OLAP Cube -> BI Analytics
| Component | Technology | Purpose |
|---|---|---|
| Database Engine | SQL Server 2019+ | Data storage and processing |
| ETL | SSIS 2019+ | Data integration and transformation |
| OLAP | SSAS 2019+ | Multidimensional analytics |
| BI | Power BI Desktop | Interactive dashboards |
| Analysis | Excel 2016+ | OLAP pivot analysis |
EV-Charging-Data-Warehouse-BI-System/
|
|--- database/ # Database schema and data
| |--- schema/ # Star schema implementation
| |--- data_sources/ # Raw data files
| |--- backup/ # Database backups
|
|--- etl/ # SSIS packages and documentation
| |--- ssis_packages/ # ETL package files
| |--- etl_documentation.md # Comprehensive ETL guide
|
|--- olap_cube/ # SSAS implementation
| |--- ssas_project/ # Analysis Services project
| |--- ssas_documentation.md # OLAP cube documentation
|
|--- powerbi/ # Power BI dashboards
| |--- EV_Charging_All_Reports.pbix
| |--- powerbi_documentation.md # Dashboard guide
| |--- reports/ # Report specifications
|
|--- docs/ # Project documentation
| |--- architecture/ # System architecture
| |--- powerbi_screenshots/ # Dashboard screenshots
| |--- references.md # Technical references
|
|--- analysis/ # Excel OLAP analysis
| |--- excel_olap/ # Excel pivot files
| |--- olap_operations.md # OLAP operations guide
- SQL Server 2019+ with SSIS, SSAS, and SSMS
- Power BI Desktop (latest version)
- Microsoft Excel 2016+ (for OLAP analysis)
- Windows Server (recommended for production)
-- 1. Create database
CREATE DATABASE EV_Charging_DW;
GO
-- 2. Execute schema creation
-- Run: database/schema/table_definitions.sql
-- Run: database/schema/DimDate_for_dw.sql
-- 3. Load initial data
-- Execute database/backup/restore procedures
-- Or run ETL packages to populate from source files-- 1. Deploy to SSIS Catalog
CREATE DATABASE SSISDB;
GO
-- 2. Create folder structure
EXEC [catalog].[create_folder]
@folder_name = N'EV_Charging_DWH';
-- 3. Deploy packages from etl/ssis_packages/
-- Use SSDT or Integration Services Deployment Wizard<!-- 1. Deploy Analysis Services project -->
<!-- Use SQL Server Data Tools (SSDT) -->
<!-- Deploy olap_cube/ssas_project/ to SSAS instance -->
<!-- 2. Process cube -->
<!-- Use SQL Server Management Studio -->
<!-- Right-click cube -> Process -> Process Full -->- Open
powerbi/EV_Charging_All_Reports.pbix - Configure SSAS connection:
- Server:
<your-ssas-server> - Database:
EV_Charging_Analysis
- Server:
- Refresh data connections
- Enable live connection mode
- Grain: One row per charging segment per port usage event
- Measures: Energy_kWh, Fee, GHG_Savings, Gasoline_Savings
- Volume: 100,000+ charging session records
- Update Pattern: Append-only with nightly loads
| Dimension | Purpose | Key Features |
|---|---|---|
| DimDate | Time analysis | Multiple calendars, seasonal flags |
| DimStation | Geographic analysis | Hierarchies, market segmentation |
| DimUser | Customer analysis | SCD Type 2 for location history |
| DimPort | Equipment analysis | Performance metrics, maintenance |
| DimWeather | Environmental analysis | Weather impact correlations |
| DimHoliday | Event analysis | Holiday impact factors |
- EV_Load_Source_To_Staging.dtsx - Multi-source data ingestion
- EV_Staging_Data_Profiling.dtsx - Data quality validation
- EV_Load_Staging_To_DW.dtsx - Dimensional model transformation
- EV_Accumulating_Update.dtsx - SCD Type 2 processing
| Source | Format | Volume | Processing |
|---|---|---|---|
| Charging Sessions | CSV | 100K+ records | Daily batch |
| User Information | TXT | 15K+ records | Daily incremental |
| Weather Data | Excel | 365+ days | Daily updates |
| Holiday Calendar | SQL Server | 50+ records | Annual maintenance |
- Data Quality Framework: Automated validation and profiling
- Error Handling: Comprehensive error staging and recovery
- Performance Optimization: Bulk operations and parallel processing
- Audit Trail: Complete execution logging and monitoring
Detailed ETL documentation available in etl/etl_documentation.md
- Dimensions: 6 conformed dimensions with hierarchies
- Measure Groups: 1 fact table with 15+ calculated measures
- Perspectives: 3 role-based views (Executive, Operations, Finance)
- Calculations: 20+ business metrics and KPIs
- Aggregations: 30% aggregation coverage, 85% query improvement
- Partitions: Time-based partitioning for efficient processing
- Storage Mode: MOLAP for optimal performance
- Processing Strategy: Incremental updates for current data
-- Time Intelligence
YTD Energy Consumption = SUM(YTD(...), [Energy Consumption])
QoQ Revenue Growth = ([Revenue] - ParallelPeriod([Revenue])) / ParallelPeriod([Revenue])
-- Business KPIs
Station Utilization = [Energy Consumption] / [Station Capacity]
Revenue per kWh = [Revenue] / [Energy Consumption]
Customer Retention = ActiveCustomers / TotalCustomersComplete OLAP documentation available in olap_cube/ssas_documentation.md
- KPI Cards: Revenue, Energy, Stations, Satisfaction
- Trend Analysis: 24-month rolling trends
- Geographic Overview: Revenue by state/region
- Business Insights: Growth opportunities and market analysis
- Station Performance Matrix: Utilization and efficiency metrics
- Equipment Analysis: Port performance and maintenance needs
- Peak Usage Analysis: Time-based utilization patterns
- Operational KPIs: Downtime, maintenance, efficiency
- Cascading Slicers: Dynamic filtering hierarchy
- Multi-dimensional Analysis: Cross-filtering between visuals
- Drill-Through Capabilities: Detailed transaction analysis
- Time Series Analysis: Hierarchical date navigation
- Station-Level Deep Analysis: Individual station performance details
- Transaction-Level Insights: Detailed charging session breakdown
- User Behavior Analysis: Individual charging patterns
- Performance Metrics: Station-specific KPIs and trends
- Live Connection: DirectQuery to SSAS cube
- Real-time Data: Sub-second query response
- Mobile Responsive: Cross-platform accessibility
- Scheduled Refresh: Automated data updates
Complete Power BI documentation available in powerbi/powerbi_documentation.md
| Query Type | Response Time | Volume | Complexity |
|---|---|---|---|
| Simple Aggregation | <1 second | 291K+ records | Low |
| Complex Calculation | <2 seconds | 291K+ records | Medium |
| Multi-dimensional | <3 seconds | 291K+ records | High |
| Large Dataset | <5 seconds | 291K+ records | High |
- ETL Throughput: 500K+ records/hour
- Cube Processing: 15-30 minutes full process
- Incremental Updates: 2-5 minutes
- Data Freshness: Daily updates available
- Concurrent Users: 10+ simultaneous analysts
- Data Volume: Supports 291K+ charging sessions
- Storage Efficiency: Optimized indexing and storage
- Growth Capacity: Designed for future expansion
- Architecture Overview - Complete system architecture
- ETL Implementation - Detailed SSIS package documentation
- OLAP Cube Design - SSAS implementation guide
- Power BI Dashboards - Dashboard and visualization guide
- Database Schema - Star schema technical details
- Business Value & Learning Outcomes - Educational impact and technical achievements
- OLAP Operations - Excel OLAP analysis examples
- Data Source: Kaggle EV Charging Dataset (modified for educational purposes)
- Architecture: Based on Kimball dimensional modeling principles
- Technology: Microsoft SQL Server ecosystem
- Total Records Processed: 291,000+ charging sessions
- ETL Packages: 4 specialized packages
- Dimensions: 6 conformed dimensions
- Reports: Interactive Power BI dashboards
- Cube: Multidimensional SSAS implementation





