This project provides a comprehensive sample data warehouse designed for Direct‑to‑Home (DTH) television service analytics. It includes SQL Server scripts to build a star schema, sample CSV datasets, analytical queries, and diagrams to explore churn analysis, customer engagement, advertising performance, and more.
-
Self-contained SQL Scripts
DW_Tables.sqlcreates all required dimension, fact, and aggregated tables and includes ETL logic for monthly summaries.
-
Sample Data for Instant Use
- CSV files aligned with the schema enable direct loading without external data dependencies.
-
Analytics Query Library
Queries.sqlincludes pre-built queries for churn analysis, customer loyalty scoring, content feedback, promotion impact, and cohort-based retention.
-
Presentation & Documentation
- ER diagrams, schema images, and a PowerPoint/PDF report describe the overall architecture and information flow.
DTH-Data-Warehouse/
├── DW_Tables.sql # Schema creation & ETL
├── Queries.sql # Sample analytics queries
├── customer_dimension.csv # Sample dimension data
├── plan_dimension.csv # More CSVs for each table
├── DWdiagrams/ # ER diagrams and visuals
├── Schema.png # Star schema overview
├── Report_Slides.pdf # Documentation/presentation
└── README.md # This file
All scripts and datasets are placed in the root directory for ease of use.
This project uses a star schema centered around subscription and engagement activities with rich supporting dimensions.
| Table Name | Description |
|---|---|
Customer_dimension |
Subscriber demographics & contact info |
Plan_dimension |
Plan names, pricing, and included channels |
Time_dimension |
Date hierarchy (day, week, month, quarter) |
Channel_dimension |
Channel metadata and genres |
Content_dimension |
Program or episode details |
Reason_dimension |
Reasons for churn or unsubscription |
Promotion_dimension |
Promotional campaign metadata |
Event_dimension |
Major events or seasonal triggers |
Ad_exposure_dimension |
Advertisement details and impressions |
Genre_dimension |
Genre category mapping |
| Table Name | Description |
|---|---|
Subscription_fact |
Subscription records and plan history |
Unsubscription_fact |
Customer churn details with reasons |
Feedback_fact |
Feedback mapped to plans and channels |
Customer_engagement_fact |
Ad exposure, content viewing, engagement |
Monthly_aggregate_fact |
Pre-computed monthly KPIs |
Series_monthly_aggregate_fact |
Series-level monthly summaries |
Example: customer_dimension.csv
customer_id, customer_name, customer_email, customer_address, customer_city, customer_zipcode
Example: plan_dimension.csv
plan_id, plan_name, plan_price, channel_package, plan_duration
These column structures match insert orders used in the SQL scripts.
git clone https://github.com/shiven424/DTH-Data-WarehouseAlternatively, download the ZIP and extract it locally.
-
Open SQL Server Management Studio (SSMS).
-
Execute
DW_Tables.sqlto:- Create all dimension, fact, and aggregate tables.
- Optionally populate them with sample data.
You can load the sample CSVs into their respective tables using:
- SSMS Import Data Wizard, or
BULK INSERTstatements (included inDW_Tables.sql).
Example tables:
Customer_dimensionPlan_dimensionChannel_dimension
Run the ETL section in DW_Tables.sql to populate:
Monthly_aggregate_factSeries_monthly_aggregate_fact
These contain pre-computed KPIs for faster analytics.
- Open
Queries.sqlin SSMS. - Run any of the provided analytical queries.
Example analyses include:
- Churn timing
- Feedback sentiment
- Promotion lift
- Viewer engagement trends
You can connect the database to a BI tool such as:
- Power BI
- Tableau
- Google Data Studio
Use them to create dashboards and visualize key metrics using the sample data.
Query: Average Days to Churn by Reason
SELECT
r.reason_category,
r.reason_description,
ROUND(AVG(DATEDIFF(day, s.subscription_start_date, u.unsubscription_date)), 1) AS avg_days_to_unsub
FROM Unsubscription_fact u
JOIN Subscription_fact s
ON u.customer_id = s.customer_id
AND u.unsubscription_date BETWEEN s.subscription_start_date AND s.subscription_end_date
JOIN Reason_dimension r
ON u.reason_id = r.reason_id
GROUP BY r.reason_category, r.reason_description
ORDER BY avg_days_to_unsub DESC;Other queries compute:
- Channel loyalty scores
- Promotion effectiveness
- Viewer cohort behavior
- Feedback trends by genre or content
Navigate to the DWdiagrams/ folder to explore:
- Star schema diagram
- Information package visualizations
Schema.pngfor a quick overview
These assets assist in understanding relationships and metric derivations.