sandaliz/EV-Charging-Data-Warehouse-BI-System

EV Charging Data Warehouse & BI System

SQL Server SSIS SSAS Power BI

A comprehensive data warehouse and business intelligence solution demonstrating enterprise-level capabilities for Electric Vehicle (EV) charging infrastructure analytics.


Executive Summary

This project demonstrates a complete end-to-end data engineering and business intelligence implementation on the Microsoft SQL Server stack. It transforms raw EV charging data into actionable insights through a dimensional data warehouse, an SSAS cube, and Power BI dashboards, supporting enterprise-level analytics and decision-making.

Key Achievements

  • Enterprise Architecture: Comprehensive design following Kimball dimensional modeling principles
  • Multi-Source Integration: CSV, TXT, Excel, and SQL Server data sources
  • Advanced Analytics: SSAS multidimensional cube with complex calculations
  • Interactive Dashboards: Power BI with live SSAS connectivity
  • Performance Optimization: Sub-second response for simple aggregations over 291K+ records (see Performance Benchmarks)
  • Industry Application: Real-world EV charging business scenarios

Data Source

Primary Dataset

This project uses the Palo Alto EV Charging Station Usage Open Data from Kaggle:

  • Dataset: EV Charging Station Usage of California City
  • Source: City of Palo Alto Open Data Portal
  • Size: 85.45 MB, 291,000+ charging session records
  • Time Period: 2011-2021 (10 years of charging data)
  • License: U.S. Government Works

Dataset Features

  • 33 columns including charging sessions, energy consumption, GHG savings
  • Geographic data: Station locations, addresses, coordinates
  • Temporal data: Start/end times, duration, time zones
  • Technical data: Port types, plug types, EVSE IDs
  • Environmental data: Energy (kWh), GHG savings, gasoline displacement
  • User data: User IDs, postal codes, counties

Additional Data Sources

  • US Holidays 2017-2020: JSON format for holiday impact analysis
  • Weather Data: Palo Alto weather conditions
  • User Information: User demographics and location data

Business Context

Industry Challenge

The rapidly growing EV charging industry generates massive volumes of transactional data that must be analyzed for:

  • Revenue Optimization: Dynamic pricing strategies and demand management
  • Operational Efficiency: Station utilization and maintenance planning
  • Customer Intelligence: User behavior analysis and retention strategies
  • Infrastructure Planning: Network expansion and capacity management

Solution Value

  • Interactive Analytics: Sub-second response for simple aggregate queries on the full dataset
  • Multi-dimensional Analysis: Weather, holiday, and geographic impact assessment
  • Scalable Architecture: Enterprise-grade design supporting future growth
  • Actionable Insights: Data-driven decision making capabilities

System Architecture

High-Level Architecture

Data Sources -> Staging Layer -> ETL Processing -> Data Warehouse -> OLAP Cube -> BI Analytics

Technology Stack

| Component | Technology | Purpose |
|---|---|---|
| Database Engine | SQL Server 2019+ | Data storage and processing |
| ETL | SSIS 2019+ | Data integration and transformation |
| OLAP | SSAS 2019+ | Multidimensional analytics |
| BI | Power BI Desktop | Interactive dashboards |
| Analysis | Excel 2016+ | OLAP pivot analysis |

Data Flow

[Figure: ETL pipeline architecture]


Project Structure

EV-Charging-Data-Warehouse-BI-System/
|
|--- database/                    # Database schema and data
|    |--- schema/                 # Star schema implementation
|    |--- data_sources/           # Raw data files
|    |--- backup/                 # Database backups
|
|--- etl/                         # SSIS packages and documentation
|    |--- ssis_packages/          # ETL package files
|    |--- etl_documentation.md    # Comprehensive ETL guide
|
|--- olap_cube/                   # SSAS implementation
|    |--- ssas_project/           # Analysis Services project
|    |--- ssas_documentation.md   # OLAP cube documentation
|
|--- powerbi/                     # Power BI dashboards
|    |--- EV_Charging_All_Reports.pbix
|    |--- powerbi_documentation.md # Dashboard guide
|    |--- reports/                # Report specifications
|
|--- docs/                        # Project documentation
|    |--- architecture/           # System architecture
|    |--- powerbi_screenshots/    # Dashboard screenshots
|    |--- references.md           # Technical references
|
|--- analysis/                    # Excel OLAP analysis
|    |--- excel_olap/             # Excel pivot files
|    |--- olap_operations.md      # OLAP operations guide

Quick Start

Prerequisites

  • SQL Server 2019+ with SSIS, SSAS, and SSMS
  • Power BI Desktop (latest version)
  • Microsoft Excel 2016+ (for OLAP analysis)
  • Windows Server (recommended for production)

Installation Steps

1. Database Setup

-- 1. Create database
CREATE DATABASE EV_Charging_DW;
GO

-- 2. Execute schema creation
-- Run: database/schema/table_definitions.sql
-- Run: database/schema/DimDate_for_dw.sql

-- 3. Load initial data
-- Execute database/backup/restore procedures
-- Or run ETL packages to populate from source files
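
The contents of DimDate_for_dw.sql are not shown here, so the following is only a rough Python sketch of what populating a date dimension typically involves. The column names (DateKey, Quarter, IsWeekend, Season) and the meteorological season mapping are assumptions made for illustration, not the project's actual schema.

```python
from datetime import date, timedelta


def season_for(month: int) -> str:
    """Northern-hemisphere meteorological season (an assumption for this sketch)."""
    if month in (12, 1, 2):
        return "Winter"
    if month in (3, 4, 5):
        return "Spring"
    if month in (6, 7, 8):
        return "Summer"
    return "Fall"


def build_dim_date(start: date, end: date) -> list[dict]:
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            # Integer yyyymmdd surrogate key (hypothetical column name).
            "DateKey": current.year * 10000 + current.month * 100 + current.day,
            "FullDate": current.isoformat(),
            "Year": current.year,
            "Quarter": (current.month - 1) // 3 + 1,
            "Month": current.month,
            "DayOfWeek": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
            "IsWeekend": current.isoweekday() >= 6,
            "Season": season_for(current.month),
        })
        current += timedelta(days=1)
    return rows


dim_date_2020 = build_dim_date(date(2020, 1, 1), date(2020, 12, 31))
print(len(dim_date_2020), "rows generated")
```

Rows produced this way could be bulk-inserted into the warehouse; generating the dimension up front keeps date logic out of report queries.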

2. SSIS Package Deployment

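-- 1. Create the SSIS Catalog (SSISDB)
-- In SSMS: right-click "Integration Services Catalogs" -> Create Catalog
-- (requires CLR integration to be enabled; a plain CREATE DATABASE SSISDB
--  does not produce a working catalog)

-- 2. Create folder structure
USE SSISDB;
GO
EXEC [catalog].[create_folder]
    @folder_name = N'EV_Charging_DWH';
GO

-- 3. Deploy packages from etl/ssis_packages/
-- Use SSDT or Integration Services Deployment Wizard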

3. SSAS Cube Deployment

  1. Deploy the Analysis Services project
    • Use SQL Server Data Tools (SSDT)
    • Deploy olap_cube/ssas_project/ to the SSAS instance
  2. Process the cube
    • Use SQL Server Management Studio
    • Right-click the cube -> Process -> Process Full

4. Power BI Setup

  1. Open powerbi/EV_Charging_All_Reports.pbix
  2. Configure SSAS connection:
    • Server: <your-ssas-server>
    • Database: EV_Charging_Analysis
  3. Refresh data connections
  4. Enable live connection mode

Data Model Overview

Star Schema Design

[Figure: Star schema diagram]

Fact Table: FactChargingSessions

  • Grain: One row per charging segment per port usage event
  • Measures: Energy_kWh, Fee, GHG_Savings, Gasoline_Savings
  • Volume: 100,000+ charging session records
  • Update Pattern: Append-only with nightly loads

Dimension Tables

| Dimension | Purpose | Key Features |
|---|---|---|
| DimDate | Time analysis | Multiple calendars, seasonal flags |
| DimStation | Geographic analysis | Hierarchies, market segmentation |
| DimUser | Customer analysis | SCD Type 2 for location history |
| DimPort | Equipment analysis | Performance metrics, maintenance |
| DimWeather | Environmental analysis | Weather impact correlations |
| DimHoliday | Event analysis | Holiday impact factors |
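
To make the star-schema idea concrete, here is a minimal sketch that builds a cut-down version of the model in an in-memory SQLite database and runs a typical star join. Only the fact table name and its four measures come from this README; the key columns, the reduced DimDate and DimStation layouts, and the sample rows are invented for illustration, and SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimDate (
    DateKey INTEGER PRIMARY KEY,      -- hypothetical yyyymmdd key
    Year    INTEGER NOT NULL,
    Month   INTEGER NOT NULL
);
CREATE TABLE DimStation (
    StationKey  INTEGER PRIMARY KEY,  -- hypothetical surrogate key
    StationName TEXT NOT NULL,
    City        TEXT NOT NULL
);
CREATE TABLE FactChargingSessions (
    DateKey          INTEGER NOT NULL REFERENCES DimDate(DateKey),
    StationKey       INTEGER NOT NULL REFERENCES DimStation(StationKey),
    Energy_kWh       REAL NOT NULL,
    Fee              REAL NOT NULL,
    GHG_Savings      REAL NOT NULL,
    Gasoline_Savings REAL NOT NULL
);
""")

# Made-up sample rows, not taken from the Palo Alto dataset.
conn.executemany("INSERT INTO DimDate VALUES (?, ?, ?)",
                 [(20200115, 2020, 1), (20200220, 2020, 2)])
conn.executemany("INSERT INTO DimStation VALUES (?, ?, ?)",
                 [(1, "Station A", "Palo Alto"), (2, "Station B", "Palo Alto")])
conn.executemany(
    "INSERT INTO FactChargingSessions VALUES (?, ?, ?, ?, ?, ?)",
    [
        (20200115, 1, 10.0, 2.5, 4.0, 1.0),
        (20200115, 2, 6.0, 1.5, 2.5, 0.75),
        (20200220, 1, 8.0, 2.0, 3.5, 1.0),
    ],
)


def energy_by_station_month(connection: sqlite3.Connection) -> list[tuple]:
    """Star join: aggregate fact measures sliced by two dimensions."""
    return connection.execute("""
        SELECT s.StationName, d.Year, d.Month,
               SUM(f.Energy_kWh) AS TotalEnergy,
               SUM(f.Fee)        AS TotalFee
        FROM FactChargingSessions AS f
        JOIN DimDate    AS d ON d.DateKey    = f.DateKey
        JOIN DimStation AS s ON s.StationKey = f.StationKey
        GROUP BY s.StationName, d.Year, d.Month
        ORDER BY s.StationName, d.Year, d.Month
    """).fetchall()


star_join_result = energy_by_station_month(conn)
for row in star_join_result:
    print(row)
```

The fact table holds only keys and additive measures, so every analytical question becomes a join to one or more dimensions followed by a GROUP BY.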

ETL Implementation

Package Portfolio

  1. EV_Load_Source_To_Staging.dtsx - Multi-source data ingestion
  2. EV_Staging_Data_Profiling.dtsx - Data quality validation
  3. EV_Load_Staging_To_DW.dtsx - Dimensional model transformation
  4. EV_Accumulating_Update.dtsx - SCD Type 2 processing
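
The SCD Type 2 processing mentioned above runs inside an SSIS package whose internals are not shown here. As a rough illustration of the technique itself, the Python sketch below expires the current dimension row and appends a new version whenever a tracked attribute changes. The DimUserRow fields (user_key, postal_code, valid_from, valid_to) are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DimUserRow:
    user_key: int                      # surrogate key (hypothetical)
    user_id: str                       # business key from the source system
    postal_code: str                   # tracked attribute
    valid_from: date
    valid_to: Optional[date] = None    # None means "still current"

    @property
    def is_current(self) -> bool:
        return self.valid_to is None


def apply_scd2(dim: list[DimUserRow], user_id: str,
               postal_code: str, load_date: date) -> None:
    """Expire the current row and append a new version if the attribute changed."""
    current = next(
        (r for r in dim if r.user_id == user_id and r.is_current), None
    )
    if current is not None and current.postal_code == postal_code:
        return  # unchanged: keep the existing current row
    if current is not None:
        current.valid_to = load_date  # close the old version
    next_key = max((r.user_key for r in dim), default=0) + 1
    dim.append(DimUserRow(next_key, user_id, postal_code, load_date))


dim_user: list[DimUserRow] = []
apply_scd2(dim_user, "U1", "94301", date(2019, 1, 1))  # first load
apply_scd2(dim_user, "U1", "94301", date(2019, 6, 1))  # unchanged: no new row
apply_scd2(dim_user, "U1", "94306", date(2020, 3, 1))  # moved: new version
for row in dim_user:
    print(row)
```

Because old versions are closed rather than overwritten, facts loaded before the change keep pointing at the surrogate key that was valid at the time.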

Data Sources

| Source | Format | Volume | Processing |
|---|---|---|---|
| Charging Sessions | CSV | 100K+ records | Daily batch |
| User Information | TXT | 15K+ records | Daily incremental |
| Weather Data | Excel | 365+ days | Daily updates |
| Holiday Calendar | SQL Server | 50+ records | Annual maintenance |

Key Features

  • Data Quality Framework: Automated validation and profiling
  • Error Handling: Comprehensive error staging and recovery
  • Performance Optimization: Bulk operations and parallel processing
  • Audit Trail: Complete execution logging and monitoring
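
As an illustration of the validation and error-staging pattern listed above, the following Python sketch splits incoming rows into a clean set and a rejected set annotated with the reason. The field names (station_name, energy_kwh) and the two rules are assumptions for the example; the real checks live in the SSIS packages.

```python
def validate_session(row: dict) -> list[str]:
    """Return a list of rule violations for one staged row (empty if clean)."""
    errors = []
    if not row.get("station_name"):
        errors.append("missing station_name")
    try:
        energy = float(row.get("energy_kwh", ""))
        if energy < 0:
            errors.append("negative energy_kwh")
    except (TypeError, ValueError):
        errors.append("energy_kwh is not numeric")
    return errors


def split_valid_and_errors(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Route clean rows onward and send failing rows to an error staging list."""
    valid, rejected = [], []
    for row in rows:
        errors = validate_session(row)
        if errors:
            # Keep the original values and record why the row was rejected.
            rejected.append({**row, "error_reason": "; ".join(errors)})
        else:
            valid.append(row)
    return valid, rejected


staged_rows = [
    {"station_name": "Station A", "energy_kwh": "5.2"},
    {"station_name": "", "energy_kwh": "abc"},
    {"station_name": "Station B", "energy_kwh": "-1"},
]
valid_rows, error_rows = split_valid_and_errors(staged_rows)
print(len(valid_rows), "valid,", len(error_rows), "rejected")
```

Keeping rejected rows with their reason, instead of dropping them, is what makes later recovery and reprocessing possible.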

Detailed ETL documentation available in etl/etl_documentation.md


OLAP Cube Implementation

Cube Features

  • Dimensions: 6 conformed dimensions with hierarchies
  • Measure Groups: 1 fact table with 15+ calculated measures
  • Perspectives: 3 role-based views (Executive, Operations, Finance)
  • Calculations: 20+ business metrics and KPIs

Performance Optimizations

  • Aggregations: 30% aggregation coverage, 85% query improvement
  • Partitions: Time-based partitioning for efficient processing
  • Storage Mode: MOLAP for optimal performance
  • Processing Strategy: Incremental updates for current data
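
Time-based partitioning usually means binding each cube partition to a source query that covers one date range. The sketch below generates one such query per year; it assumes an integer yyyymmdd DateKey column on the fact table and a naming scheme of the form FactChargingSessions_<year>, neither of which is confirmed by this README.

```python
def yearly_partition_queries(
    first_year: int,
    last_year: int,
    fact_table: str = "FactChargingSessions",
    date_column: str = "DateKey",  # hypothetical yyyymmdd integer key
) -> dict[str, str]:
    """Map a partition name to the source query that feeds it, one per year."""
    queries = {}
    for year in range(first_year, last_year + 1):
        lower = year * 10000 + 101     # January 1 as yyyymmdd
        upper = year * 10000 + 1231    # December 31 as yyyymmdd
        queries[f"{fact_table}_{year}"] = (
            f"SELECT * FROM {fact_table} "
            f"WHERE {date_column} BETWEEN {lower} AND {upper}"
        )
    return queries


partition_queries = yearly_partition_queries(2019, 2020)
for name, sql in partition_queries.items():
    print(name, "->", sql)
```

With non-overlapping ranges like these, only the partition for the current period needs reprocessing after a nightly load, which is what keeps incremental updates short.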

Key Calculations

-- Time Intelligence
YTD Energy Consumption = SUM(YTD(...), [Energy Consumption])
QoQ Revenue Growth = ([Revenue] - ParallelPeriod([Revenue])) / ParallelPeriod([Revenue])

-- Business KPIs
Station Utilization = [Energy Consumption] / [Station Capacity]
Revenue per kWh = [Revenue] / [Energy Consumption]
Customer Retention = ActiveCustomers / TotalCustomers
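
The formulas above are written as simplified pseudo-expressions. To show the arithmetic they describe, here is a small Python sketch of three of them. The function names and the sample figures are made up for illustration, and the cube itself would express these in MDX.

```python
from typing import Optional


def revenue_per_kwh(revenue: float, energy_kwh: float) -> Optional[float]:
    """Revenue divided by energy; None when no energy was delivered."""
    return revenue / energy_kwh if energy_kwh else None


def period_growth(current: float, previous: float) -> Optional[float]:
    """(current - previous) / previous, as in the QoQ Revenue Growth formula."""
    return (current - previous) / previous if previous else None


def year_to_date(monthly: dict[tuple[int, int], float],
                 year: int, month: int) -> float:
    """Sum values from January through the given month of the same year."""
    return sum(
        value for (y, m), value in monthly.items()
        if y == year and m <= month
    )


# Made-up sample figures keyed by (year, month).
monthly_energy = {
    (2019, 12): 90.0,
    (2020, 1): 100.0,
    (2020, 2): 150.0,
    (2020, 3): 125.0,
}

ytd_feb = year_to_date(monthly_energy, 2020, 2)
qoq = period_growth(current=1250.0, previous=1000.0)
rate = revenue_per_kwh(revenue=50.0, energy_kwh=200.0)
print(ytd_feb, qoq, rate)
```

Returning None for a zero denominator mirrors the usual practice of showing an empty cell instead of a division error when a period has no activity.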

Complete OLAP documentation available in olap_cube/ssas_documentation.md


Power BI Dashboards

Dashboard Portfolio

1. Executive Dashboard

  • KPI Cards: Revenue, Energy, Stations, Satisfaction
  • Trend Analysis: 24-month rolling trends
  • Geographic Overview: Revenue by state/region
  • Business Insights: Growth opportunities and market analysis

[Figure: Executive Dashboard screenshot]

2. Operations Dashboard

  • Station Performance Matrix: Utilization and efficiency metrics
  • Equipment Analysis: Port performance and maintenance needs
  • Peak Usage Analysis: Time-based utilization patterns
  • Operational KPIs: Downtime, maintenance, efficiency

[Figure: Operations Dashboard screenshot]

3. Interactive Analysis

  • Cascading Slicers: Dynamic filtering hierarchy
  • Multi-dimensional Analysis: Cross-filtering between visuals
  • Drill-Through Capabilities: Detailed transaction analysis
  • Time Series Analysis: Hierarchical date navigation

[Figure: Interactive Analysis screenshot]

4. Drill-Through Report

  • Station-Level Deep Analysis: Individual station performance details
  • Transaction-Level Insights: Detailed charging session breakdown
  • User Behavior Analysis: Individual charging patterns
  • Performance Metrics: Station-specific KPIs and trends

[Figure: Drill-Through Report screenshot]

Technical Features

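  • Live Connection: Power BI live connection to the SSAS cube (queries run against the cube; no data is imported into the report)
  • Live Data: Visuals query the cube directly; simple aggregations return in under a second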
  • Mobile Responsive: Cross-platform accessibility
  • Scheduled Refresh: Automated data updates

Complete Power BI documentation available in powerbi/powerbi_documentation.md


Performance Benchmarks

Query Performance

| Query Type | Response Time | Volume | Complexity |
|---|---|---|---|
| Simple Aggregation | <1 second | 291K+ records | Low |
| Complex Calculation | <2 seconds | 291K+ records | Medium |
| Multi-dimensional | <3 seconds | 291K+ records | High |
| Large Dataset | <5 seconds | 291K+ records | High |

Processing Performance

  • ETL Throughput: 500K+ records/hour
  • Cube Processing: 15-30 minutes full process
  • Incremental Updates: 2-5 minutes
  • Data Freshness: Daily updates available

System Scalability

  • Concurrent Users: 10+ simultaneous analysts
  • Data Volume: Supports 291K+ charging sessions
  • Storage Efficiency: Optimized indexing and storage
  • Growth Capacity: Designed for future expansion


Attribution

  • Data Source: Kaggle EV Charging Dataset (modified for educational purposes)
  • Architecture: Based on Kimball dimensional modeling principles
  • Technology: Microsoft SQL Server ecosystem

Quick Reference

Project Statistics

  • Total Records Processed: 291,000+ charging sessions
  • ETL Packages: 4 specialized packages
  • Dimensions: 6 conformed dimensions
  • Reports: Interactive Power BI dashboards
  • Cube: Multidimensional SSAS implementation
