Ethan07914/GitHubDW

GitHub Market Insight & Compliance DW

Overview

This project models the public GitHub dataset on BigQuery as a star schema. The resulting warehouse provides a single source of truth for analysing market share and open-source legal compliance.

  • Market Insight: Identify which programming languages are gaining popularity.
  • Legal Compliance: Track the percentage of repositories utilising a valid license.
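The two metrics above can be sketched in a few lines of Python. This is a minimal illustration, not the project's dbt or Looker Studio logic: the rows are made-up sample data, the column names (`repo_name`, `language`, `license`) are assumptions, and "valid license" is assumed here to mean any non-empty license value. It also computes a single snapshot; tracking which languages are gaining popularity would require comparing shares over time.

```python
from collections import Counter

# Hypothetical sample rows (not taken from the dataset); column names are assumed.
repos = [
    {"repo_name": "org/a", "language": "Python", "license": "mit"},
    {"repo_name": "org/b", "language": "Python", "license": None},
    {"repo_name": "org/c", "language": "Go", "license": "apache-2.0"},
    {"repo_name": "org/d", "language": "Rust", "license": "mit"},
]


def language_share(rows):
    """Return each language's share of repositories as a percentage."""
    counts = Counter(row["language"] for row in rows)
    total = sum(counts.values())
    return {lang: 100.0 * n / total for lang, n in counts.items()}


def license_coverage(rows):
    """Return the percentage of repositories with a non-empty license.

    Assumption: any non-empty license value counts as valid.
    """
    if not rows:
        return 0.0
    licensed = sum(1 for row in rows if row["license"])
    return 100.0 * licensed / len(rows)


shares = language_share(repos)
coverage = license_coverage(repos)
print(shares)
print(f"{coverage:.1f}% of repositories have a license")
```

In the warehouse itself these aggregations would more likely be expressed as SQL over the fact and dimension tables; the Python version only shows the arithmetic.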

Business Objective

The Data Warehouse supports two business processes at GitHub: Market Intelligence and Legal Compliance.

  • Monitoring Market Intelligence lets developers identify where to focus their feature development.
  • Monitoring Legal Compliance lets GitHub check whether open-source repositories are being used lawfully.

Dataflow

  • Source: The BigQuery public dataset github_repos.
  • Staging: Relevant source tables are selected in staging models and written to a BigQuery dataset.
  • Warehouse: Staged data is transformed into fact and dimension tables in the BigQuery data warehouse.
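The staging-to-warehouse step can be illustrated with a small in-memory sketch of star-schema modelling. It is an assumption-laden example rather than the project's actual models: the staged rows, the column names, the `build_dimension` and `build_fact` helpers, and the "unknown" placeholder for missing values are all hypothetical. In the project itself this transformation is done by dbt models running on BigQuery.

```python
# Hypothetical staged rows; values and column names are made up for illustration.
staged = [
    {"repo_name": "org/a", "language": "Python", "license": "mit", "bytes": 1200},
    {"repo_name": "org/b", "language": "Python", "license": None, "bytes": 300},
    {"repo_name": "org/c", "language": "Go", "license": "mit", "bytes": 800},
]

MISSING = "unknown"  # assumed placeholder member for missing attribute values


def attribute(row, column):
    """Return the column value, or the placeholder when it is missing."""
    value = row[column]
    return value if value is not None else MISSING


def build_dimension(rows, column):
    """Map each distinct value of `column` to a surrogate key (1, 2, ...) in first-seen order."""
    keys = {}
    for row in rows:
        keys.setdefault(attribute(row, column), len(keys) + 1)
    return keys


def build_fact(rows, dim_language, dim_license):
    """Replace descriptive attributes with surrogate keys, keeping the measure."""
    return [
        {
            "repo_name": row["repo_name"],
            "language_key": dim_language[attribute(row, "language")],
            "license_key": dim_license[attribute(row, "license")],
            "bytes": row["bytes"],
        }
        for row in rows
    ]


dim_language = build_dimension(staged, "language")
dim_license = build_dimension(staged, "license")
fact_repo_language = build_fact(staged, dim_language, dim_license)
print(dim_language, dim_license)
print(fact_repo_language)
```

Keeping descriptive attributes in dimensions and only keys and measures in the fact table is what lets a dashboard slice the same measure by language or by license without duplicating text values.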

Tech Stack

  • BigQuery (Data Warehouse)
  • dbt-core / dbt-bigquery (Transformation)
  • Looker Studio (Visualisation)

Dashboard

[Images: two dashboard screenshots]

dbt lineage graph

[Image: dbt lineage graph]

Data Model

  • Conceptual, Logical and Physical data models were created using draw.io.
[Image: data model diagram]
