AI-Powered Continuous Repository Health Analysis & Developer Intelligence Platform
- Overview
- Key Features
- Architecture
- Technology Stack
- System Workflows
- Metrics & Scoring
- Installation & Setup
- Environment Configuration
- Deployment
- Team & Contributions
- Screenshots
- License
CodeHealth-AI is a comprehensive developer intelligence platform that provides deep, automated insights into repository health, maintainability, and long-term sustainability. By combining static code analysis, commit behavior analytics, distributed job processing, and AI-driven insights, CodeHealth-AI helps development teams understand where their codebase stands today and what to improve next.
The platform operates continuously, analyzing repositories on every push and pull request, tracking quality trends over time, and proactively alerting teams when critical thresholds are breached.
- Continuous Analysis: Automated re-analysis on every push/PR using distributed workers
- Real-Time Observability: Live dashboards tracking code health trends over time
- AI-Powered Insights: Human-readable explanations and actionable recommendations
- Proactive Alerting: Custom threshold-based notifications via email and in-app
- Enterprise-Grade Architecture: Multi-server deployment with job queues and pub/sub messaging
- Comprehensive Metrics: From cyclomatic complexity to bus factor risk analysis
- Static Code Metrics: Cyclomatic complexity, Halstead volume, maintainability index
- File-Level Risk Scoring: Identifies refactoring priorities with actionable reasons
- Technical Debt Quantification: Estimates refactoring effort in developer-days
- Distribution Analysis: Visualizes code quality spread across the repository
- Commit Pattern Analysis: Tracks velocity trends, consistency, and activity ratios
- Bus Factor Assessment: Identifies knowledge concentration risks
- PR Velocity Metrics: Measures review time, merge frequency, and throughput
- Contributor Analytics: Evaluates team participation and collaboration patterns
- Time-Series Dashboards: Code health score, quality metrics, and activity over time
- Activity Heatmaps: Visualizes contribution patterns across days and hours
- Push/Pull Tracking: Real-time monitoring of repository events
- Trend Detection: Identifies improving, stable, or degrading metrics
- Custom Thresholds: Define acceptable ranges for any metric
- Multi-Channel Notifications: In-app alerts and email notifications
- Breach Detection: Automatic alerts when metrics fall below thresholds
- Configurable Rules: Set different alert levels per repository
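A threshold rule reduces to a simple range check per metric. A minimal sketch of breach detection (the rule shape and metric names here are illustrative, not the platform's actual schema):

```javascript
// Return the rules a metrics snapshot violates.
// Rule shape ({ metric, min, max, level }) is hypothetical, for illustration only.
function detectBreaches(metrics, rules) {
  return rules.filter((rule) => {
    const value = metrics[rule.metric];
    if (value === undefined) return false; // unknown metric: no alert
    if (rule.min !== undefined && value < rule.min) return true;
    if (rule.max !== undefined && value > rule.max) return true;
    return false;
  });
}

// Example: health score below its floor triggers an alert; complexity is fine.
const breaches = detectBreaches(
  { healthScore: 52, avgComplexity: 9 },
  [
    { metric: 'healthScore', min: 70, level: 'warning' },
    { metric: 'avgComplexity', max: 12, level: 'critical' },
  ]
);
// breaches contains only the healthScore rule
```

Per-repository alert levels then become a matter of storing a different rule list per repo.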
- Executive Summaries: High-level repository health assessments
- Strengths & Weaknesses: Automated identification of quality patterns
- Actionable Recommendations: Specific guidance on what to improve
- Natural Language Explanations: Makes complex metrics understandable
CodeHealth-AI employs a distributed, multi-server architecture designed for scalability, fault tolerance, and real-time processing.
┌────────────────────────────────────────────────────────────────────┐
│                            CLIENT LAYER                            │
│  ┌────────────────────────────────────────────────────────────┐    │
│  │                 Next.js Frontend (Vercel)                  │    │
│  │   - React Components        - D3.js Visualizations         │    │
│  │   - Real-time Updates       - WebSocket Client             │    │
│  └────────────────────────────────────────────────────────────┘    │
└────────────────────────────────────────────────────────────────────┘
                         │ HTTPS / WebSocket
┌────────────────────────────────────────────────────────────────────┐
│                          APPLICATION LAYER                         │
│  ┌────────────────────────────────────────────────────────────┐    │
│  │                Express API Server (Heroku)                 │    │
│  │   - REST Endpoints          - GitHub OAuth                 │    │
│  │   - WebSocket Server        - Redis Pub/Sub                │    │
│  │   - Authentication          - Job Scheduling               │    │
│  └────────────────────────────────────────────────────────────┘    │
└────────────────────────────────────────────────────────────────────┘
                         │ Job Queue
┌────────────────────────────────────────────────────────────────────┐
│                          PROCESSING LAYER                          │
│  ┌────────────────────────────────────────────────────────────┐    │
│  │                   BullMQ Workers (Heroku)                  │    │
│  │   - Distributed Job Processing                             │    │
│  │   - Analysis Orchestration                                 │    │
│  │   - Metric Aggregation                                     │    │
│  │   - Cron Jobs (DB Cleanup, PR Metrics)                     │    │
│  └────────────────────────────────────────────────────────────┘    │
└────────────────────────────────────────────────────────────────────┘
                         │ Analysis Requests
┌────────────────────────────────────────────────────────────────────┐
│                           ANALYSIS LAYER                           │
│  ┌────────────────────────────────────────────────────────────┐    │
│  │              FastAPI Analysis Engine (Heroku)              │    │
│  │   - Python Static Analysis                                 │    │
│  │   - Radon, Lizard Integration                              │    │
│  │   - AI Insight Generation                                  │    │
│  │   - Metric Calculation                                     │    │
│  └────────────────────────────────────────────────────────────┘    │
└────────────────────────────────────────────────────────────────────┘
                         │
┌────────────────────────────────────────────────────────────────────┐
│                             DATA LAYER                             │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐     │
│  │  Supabase       │  │  Redis          │  │  GitHub API     │     │
│  │  PostgreSQL     │  │  - Job Queue    │  │  - Webhooks     │     │
│  │  - Persistence  │  │  - Pub/Sub      │  │  - Repository   │     │
│  │  - Auth         │  │  - Caching      │  │    Data         │     │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘     │
└────────────────────────────────────────────────────────────────────┘
1. Multi-Server Deployment
- Frontend (Vercel): Edge-optimized delivery, automatic scaling
- API Server (Heroku): Centralized business logic and authentication
- Workers (Heroku): Isolated job processing with dedicated resources
- Analysis Engine (Heroku): Python-based static analysis with AI integration
2. Distributed Job Processing
- BullMQ: Redis-backed job queue for reliable task distribution
- Worker Isolation: Prevents analysis workloads from blocking API requests
- Concurrency Control: Configurable parallel job execution
- Deadline Management: Timeout handling for long-running analyses
3. Real-Time Communication
- Redis Pub/Sub: Event broadcasting between workers and API server
- WebSockets: Live updates pushed to frontend clients
- Event-Driven Updates: Instant dashboard refreshes on analysis completion
4. Data Flow
- Write-Heavy Operations: Analysis results written to Supabase
- Read-Heavy Operations: Cached in Redis for fast dashboard loads
- Metric Aggregation: Computed in workers, stored for historical tracking
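The read path above is a standard cache-aside pattern; the `metrics:repo:{repoId}` key format appears in the workflow diagrams later in this document. A synchronous sketch with in-memory stand-ins for Redis and Supabase (the stand-ins and the aggregation are illustrative only):

```javascript
// Cache-aside read: serve dashboard metrics from cache when present,
// otherwise compute from the database and cache the result.
// `cache` and `db` are plain Maps standing in for Redis and Supabase.
function getRepoMetrics(repoId, cache, db) {
  const key = `metrics:repo:${repoId}`;
  if (cache.has(key)) return { source: 'cache', data: cache.get(key) };
  const rows = db.get(repoId) ?? [];
  // Aggregate on miss (the real system computes this in workers, off the request path).
  const data = { fileCount: rows.length };
  cache.set(key, data);
  return { source: 'db', data };
}

const cache = new Map();
const db = new Map([[42, [{ file: 'a.py' }, { file: 'b.ts' }]]]);
const first = getRepoMetrics(42, cache, db);  // miss: reads db, fills cache
const second = getRepoMetrics(42, cache, db); // hit: served from cache
```

Invalidation happens on write: when a push lands, the key is deleted so the next read rebuilds it.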
| Technology | Version | Purpose |
|---|---|---|
| Next.js | 15.4.10 | Application framework with App Router |
| React | 19.1.0 | UI component library |
| TypeScript | 5.x | Type-safe development |
| Zustand | 5.0.7 | State management |
| Recharts | 3.5.1 | Line charts and metric graphs |
| D3.js | 7.9.0 | Advanced visualizations (radar, heatmaps) |
| Chart.js | 4.5.1 | Gauge and distribution charts |
| GSAP | 3.13.0 | Animations and transitions |
| Socket.io-client | 4.8.1 | Real-time communication |
| Tailwind CSS | 4.x | Utility-first CSS framework |
| Axios | 1.11.0 | HTTP client |
| Technology | Purpose |
|---|---|
| Express.js | REST API framework |
| BullMQ | Distributed job queue and worker management |
| Redis | Job queue, caching, pub/sub messaging |
| Supabase | PostgreSQL database and authentication |
| Socket.io | WebSocket server for real-time events |
| Octokit | GitHub API client |
| JWT | Authentication tokens |
| Nodemailer | Email notifications |
| Node-cron | Scheduled jobs (DB cleanup, metric collection) |
| Technology | Purpose |
|---|---|
| FastAPI | High-performance async API framework |
| Radon | Cyclomatic complexity and maintainability analysis |
| Together AI | AI-powered insight generation |
| Google Gemini | Natural language processing |
| JWT | GitHub App authentication |
| Service | Purpose |
|---|---|
| Vercel | Frontend hosting with edge network |
| Heroku | API, workers, and Python server hosting |
| Supabase | Managed PostgreSQL database |
| Redis Cloud | Managed Redis for queue and cache |
| GitHub Apps | Repository webhooks and permissions |
%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#1a1a1a','primaryTextColor':'#fff','primaryBorderColor':'#7C3AED','lineColor':'#10B981','secondaryColor':'#1a1a1a','tertiaryColor':'#1a1a1a','background':'#0d1117','mainBkg':'#1a1a1a','secondBkg':'#1a1a1a','lineColor':'#8B5CF6','border1':'#7C3AED','border2':'#10B981','note':'#1e293b','noteBorder':'#7C3AED','noteBkgColor':'#1e293b','noteTextColor':'#fff','actorBorder':'#7C3AED','actorBkg':'#1a1a1a','actorTextColor':'#fff','actorLineColor':'#10B981','signalColor':'#10B981','signalTextColor':'#fff','labelBoxBkgColor':'#1e293b','labelBoxBorderColor':'#7C3AED','labelTextColor':'#fff','loopTextColor':'#fff','activationBorderColor':'#7C3AED','activationBkgColor':'#1e293b','sequenceNumberColor':'#fff'}}}%%
sequenceDiagram
autonumber
participant User as User
participant Frontend as Frontend
participant ExpressAPI as Express API
participant Redis as Redis Cache
participant PythonService as Python Service
participant ClaudeAI as Claude AI
participant GitHub as GitHub API
participant BullMQ as BullMQ Queue
participant ASTWorker as AST Worker
participant DB as PostgreSQL
participant SocketIO as Socket.IO
Note over User,ExpressAPI: PHASE 1: INITIALIZATION
User->>+Frontend: Click "Initialize Repo"
Frontend->>+ExpressAPI: GET /:repoId/initialize
alt Repo not found
ExpressAPI-->>Frontend: 404 Not Found
else Already initialized
ExpressAPI-->>Frontend: 200 Already initialized
else Max limit (>2 repos)
ExpressAPI-->>Frontend: 400 Max limit reached
end
ExpressAPI->>DB: Update repo (initialised=true, status=processing)
ExpressAPI->>DB: Create activity log
ExpressAPI->>ExpressAPI: Build payload (repoId, installationId, owner, repoName, branch)
ExpressAPI->>ExpressAPI: fullRepoAnalyse() async via setImmediate
ExpressAPI-->>-Frontend: 200 Analysis in progress
deactivate Frontend
Note over ExpressAPI,PythonService: PHASE 2: DISPATCH TO PYTHON SERVICE
ExpressAPI->>ExpressAPI: Validate payload
ExpressAPI->>+PythonService: POST /v1/internal/analysis/full-repo
Note over PythonService,GitHub: PHASE 3: DATA COLLECTION
PythonService->>PythonService: full_repo_analysis()
par Parallel GitHub API Calls
PythonService->>GitHub: get_installation_token()
PythonService->>GitHub: fetch_repo_code()
PythonService->>GitHub: get_contributors()
PythonService->>GitHub: get_issues()
PythonService->>GitHub: get_all_pr()
PythonService->>GitHub: get_all_commits()
PythonService->>GitHub: get_releases()
PythonService->>GitHub: get_repo_metadata()
end
PythonService->>+ClaudeAI: analyze_commits() - AI analysis
ClaudeAI-->>-PythonService: Commit insights
PythonService->>PythonService: Count files (.py, .js, .jsx, .ts, .tsx)
Note over PythonService,Redis: PHASE 4: INITIALIZE TRACKING
PythonService->>+ExpressAPI: POST /scanning/initialize-analysis (totalFiles)
ExpressAPI->>Redis: Store totalFiles count
ExpressAPI->>ExpressAPI: startAnalysisPolling() - Interval: 15s
deactivate ExpressAPI
Note over PythonService,ExpressAPI: PHASE 5: SEND METADATA
par Parallel Metadata Uploads
PythonService->>ExpressAPI: POST /scanning/commits
PythonService->>ExpressAPI: POST /scanning/commits-analysis
PythonService->>ExpressAPI: POST /scanning/repo-metadata
PythonService->>ExpressAPI: POST /scanning/contributors
end
Note over PythonService,DB: PHASE 6A: PYTHON FILE PROCESSING
loop Batch Processing (size: 50)
PythonService->>PythonService: analyze_py_code() - Calculate CC, MI, LOC, Halstead
PythonService->>+ExpressAPI: POST /scanning/python-batch
ExpressAPI->>DB: Bulk insert RepoFileMetrics
ExpressAPI->>ExpressAPI: triggerBackgroundAnalysis()
deactivate ExpressAPI
end
Note over PythonService,BullMQ: PHASE 6B: JS/TS FILE QUEUEING
loop Batch Queueing (size: 50)
PythonService->>+ExpressAPI: POST /scanning/enqueue-batch
ExpressAPI->>BullMQ: Add jobs to filesQueue
BullMQ->>Redis: Store job data
deactivate ExpressAPI
end
deactivate PythonService
Note over ASTWorker,SocketIO: PHASE 7: WORKER PROCESSING
loop Concurrent Processing (Workers: 5)
ASTWorker->>+BullMQ: Fetch next job
deactivate BullMQ
ASTWorker->>ASTWorker: Validate file extension
ASTWorker->>ASTWorker: analyzeFile(content) - CC, MI, LOC, Halstead
ASTWorker->>DB: Upsert RepoFileMetrics
ASTWorker->>SocketIO: Emit analysis_update
SocketIO-->>Frontend: Real-time progress update
ASTWorker->>Redis: Increment completed count
alt All files processed
ASTWorker->>ExpressAPI: Worker "completed" event
ExpressAPI->>ExpressAPI: triggerBackgroundAnalysis()
end
end
Note over ExpressAPI,DB: PHASE 8: POLLING MECHANISM
loop Every 15 seconds
ExpressAPI->>+DB: Query RepoFileMetrics.count()
DB-->>-ExpressAPI: Current count
alt Count >= expectedTotal
ExpressAPI->>Redis: Stop polling
ExpressAPI->>DB: Update status=completed
ExpressAPI->>SocketIO: Emit completion notification
SocketIO-->>Frontend: Analysis completed!
ExpressAPI->>ExpressAPI: triggerAlertScan()
end
end
Note over ExpressAPI,DB: PHASE 9: BACKGROUND ANALYSIS
ExpressAPI->>+DB: Query all RepoFileMetrics
ExpressAPI->>DB: Query commit data
DB-->>-ExpressAPI: All metrics data
ExpressAPI->>ExpressAPI: Calculate: Repo metrics, Commit patterns, Distributions, Health score
ExpressAPI->>DB: Upsert RepositoryAnalysis
ExpressAPI->>Redis: Cache analysis results
ExpressAPI->>DB: Create trend record
ExpressAPI->>DB: Create notification
Note over User,DB: PHASE 10: COMPLETION & DISPLAY
Frontend->>+ExpressAPI: Fetch dashboard data
ExpressAPI->>+DB: Query analysis data
DB-->>-ExpressAPI: Metrics & analysis
ExpressAPI-->>-Frontend: Dashboard data
Frontend->>User: Display results
Note over User,DB: ANALYSIS COMPLETE
Process Steps:
- User initiates analysis through frontend
- API validates request and creates analysis record
- Job queued in BullMQ with repository metadata
- Worker picks up job and fetches file tree from GitHub
- Files sent to Python analysis engine in batches
- Python calculates complexity, maintainability, and risk scores
- Worker aggregates results and computes repository-level metrics
- Results persisted to database
- Completion event published via Redis Pub/Sub
- Frontend notified via WebSocket
- Dashboard auto-refreshes with new data
%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#1a1a1a','primaryTextColor':'#fff','primaryBorderColor':'#7C3AED','lineColor':'#10B981','secondaryColor':'#1a1a1a','tertiaryColor':'#1a1a1a','background':'#0d1117','mainBkg':'#1a1a1a','secondBkg':'#1a1a1a','lineColor':'#8B5CF6','border1':'#7C3AED','border2':'#10B981','note':'#1e293b','noteBorder':'#7C3AED','noteBkgColor':'#1e293b','noteTextColor':'#fff','actorBorder':'#7C3AED','actorBkg':'#1a1a1a','actorTextColor':'#fff','actorLineColor':'#10B981','signalColor':'#10B981','signalTextColor':'#fff','labelBoxBkgColor':'#1e293b','labelBoxBorderColor':'#7C3AED','labelTextColor':'#fff','loopTextColor':'#fff','activationBorderColor':'#7C3AED','activationBkgColor':'#1e293b','sequenceNumberColor':'#fff'}}}%%
sequenceDiagram
autonumber
participant GitHub as GitHub Webhook
participant Webhook as Webhook Handler
participant DB as PostgreSQL
participant Redis as Redis Cache
participant SocketIO as Socket.IO
participant PythonScan as Python Scan Service
participant PythonAnalysis as Python Analysis Service
participant GitHubAPI as GitHub API
participant BullMQ as BullMQ Queue
participant ASTWorker as AST Worker
Note over GitHub,Webhook: PHASE 1: WEBHOOK RECEPTION
GitHub->>+Webhook: Push Event Received
Webhook->>Webhook: Extract payload (repository, commits, branch, pusher)
Webhook->>+DB: Fetch Project.userId
DB-->>-Webhook: User ID
Note over Webhook,DB: PHASE 2: BRANCH VALIDATION
alt Not default branch
Webhook->>DB: Create RepoPushEvent (analytics only)
Webhook-->>GitHub: Return {skipped: true, reason: "branch-policy"}
else Default branch
Note over Webhook,DB: Continue processing
end
Note over Webhook,DB: PHASE 3: FILE CHANGE ANALYSIS
Webhook->>Webhook: Analyze commits: added, modified, removed files
Webhook->>Webhook: Deduplicate files (remove conflicts)
alt Files removed
Webhook->>DB: Delete from RepoFileMetrics
end
Note over Webhook,DB: PHASE 4: STORE COMMITS & ANALYTICS
Webhook->>DB: Commit.bulkCreate (sha, message, author, committer, timestamps)
Webhook->>DB: Update RepositoryAnalysis (totalCommits, lastCommit)
Webhook->>DB: Create RepoPushEvent (commitCount, branch, pushedAt)
Webhook->>DB: Create activity log "{pusher} pushed commits"
Webhook->>Redis: Invalidate cache (metrics:repo:{repoId})
Note over Webhook,SocketIO: PHASE 5: EMIT NOTIFICATIONS
Webhook->>SocketIO: Emit notification to user
SocketIO-->>Webhook: Notification sent
Webhook->>DB: Create notification "New push on {repo}"
Note over Webhook,PythonScan: PHASE 6: TRIGGER PUSH SCAN (Parallel)
Webhook->>Webhook: processPushScan() - Build ScanJobData
Webhook->>+PythonScan: POST /v3/internal/pushScan/run
PythonScan->>PythonScan: ScanFiles() - Initialize scan
Note over PythonScan,GitHubAPI: PHASE 7: FETCH CHANGED FILES
PythonScan->>+GitHubAPI: get_installation_token() - GitHub App auth
GitHubAPI-->>-PythonScan: Access token
PythonScan->>+GitHubAPI: fetch_changed_files_code() - Get file content
GitHubAPI-->>-PythonScan: File contents
PythonScan->>PythonScan: Filter files (.py, .js, .ts, .jsx, .tsx)
Note over PythonScan,DB: PHASE 8: PROCESS PYTHON FILES
alt Python files found
loop For each Python file
PythonScan->>PythonScan: analyze_py_code() - Calculate metrics
end
PythonScan->>+Webhook: POST /scanning/python-batch (metrics array)
Webhook->>DB: Bulk insert/update RepoFileMetrics
Webhook->>Webhook: triggerBackgroundAnalysis() - Python path
deactivate Webhook
end
Note over PythonScan,BullMQ: PHASE 9: PROCESS JS/TS FILES
alt JS/TS files found
PythonScan->>+Webhook: POST /scanning/enqueue-batch (isPushEvent=true)
Webhook->>BullMQ: Add jobs to filesQueue
BullMQ->>Redis: Store job data
deactivate Webhook
end
deactivate PythonScan
Note over Webhook,PythonAnalysis: PHASE 10: TRIGGER PUSH ANALYSIS (Parallel)
Webhook->>Webhook: processPushAnalysis() - Build analysis payload
Webhook->>+PythonAnalysis: POST /v1/internal/analysis/run
PythonAnalysis->>PythonAnalysis: push_analyze_repo() - Initialize analysis
Note over PythonAnalysis,Webhook: PHASE 11: IMPACT ANALYSIS
PythonAnalysis->>PythonAnalysis: seed_impact() - Calculate impact score for changed files
PythonAnalysis->>PythonAnalysis: seed_prioritization() - Rank files by risk
PythonAnalysis->>PythonAnalysis: Calculate final score (impact vs threshold)
Note over PythonAnalysis,DB: PHASE 12: STORE PUSH METRICS
PythonAnalysis->>+Webhook: POST /scanning/pushMetric (impact, prioritization, score)
Webhook->>DB: Create PushAnalysisMetrics
Webhook->>DB: Store: impact score, risk analysis, impacted files, candidates
deactivate Webhook
deactivate PythonAnalysis
Note over ASTWorker,SocketIO: PHASE 13: WORKER PROCESSING (JS/TS)
loop Concurrent Processing (Workers: 5)
ASTWorker->>+BullMQ: Fetch next job
deactivate BullMQ
ASTWorker->>ASTWorker: Validate file extension
ASTWorker->>ASTWorker: analyzeFile(content) - Calculate CC, MI, LOC, Halstead
ASTWorker->>DB: Upsert RepoFileMetrics
ASTWorker->>SocketIO: Emit analysis_update
SocketIO-->>Webhook: Real-time progress update
ASTWorker->>Redis: Increment status.completed
alt All files for repo completed
ASTWorker->>Webhook: Worker "completed" event
Webhook->>Webhook: triggerBackgroundAnalysis() - Worker path
end
end
Note over Webhook,DB: PHASE 14: BACKGROUND ANALYSIS (Convergence)
Note over Webhook: Both Python path & Worker path converge here
Webhook->>+DB: Query RepoFileMetrics
Webhook->>DB: Query commit data
DB-->>-Webhook: All metrics data
Webhook->>Webhook: Calculate aggregates: Repo metrics, Commit patterns, Distributions, Health score
Webhook->>DB: Upsert RepositoryAnalysis (all calculated metrics)
Webhook->>Redis: Cache in Redis (metrics:repo:{repoId}, TTL: 24h)
Webhook->>DB: Create trend record (healthScore, technicalDebt, velocityTrend)
Webhook->>Webhook: triggerAlertScan() - Check for new alerts
Note over GitHub,DB: PHASE 15: COMPLETION
Webhook-->>-GitHub: Push Analysis Complete!
Process Steps:
- GitHub sends webhook on push event
- API validates signature and extracts commit data
- Push analysis job queued with changed file list
- Worker identifies modified files
- Only changed files re-analyzed by Python engine
- File metrics updated incrementally
- Repository health score recalculated
- Push metrics stored (timestamp, commit count, file changes)
- Alert system checks if thresholds breached
- Real-time update pushed to connected clients
%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#1a1a1a','primaryTextColor':'#fff','primaryBorderColor':'#7C3AED','lineColor':'#10B981','secondaryColor':'#1a1a1a','tertiaryColor':'#1a1a1a','background':'#0d1117','mainBkg':'#1a1a1a','secondBkg':'#1a1a1a','lineColor':'#8B5CF6','border1':'#7C3AED','border2':'#10B981','note':'#1e293b','noteBorder':'#7C3AED','noteBkgColor':'#1e293b','noteTextColor':'#fff','actorBorder':'#7C3AED','actorBkg':'#1a1a1a','actorTextColor':'#fff','actorLineColor':'#10B981','signalColor':'#10B981','signalTextColor':'#fff','labelBoxBkgColor':'#1e293b','labelBoxBorderColor':'#7C3AED','labelTextColor':'#fff','loopTextColor':'#fff','activationBorderColor':'#7C3AED','activationBkgColor':'#1e293b','sequenceNumberColor':'#fff'}}}%%
sequenceDiagram
autonumber
participant GitHub as GitHub Webhook
participant Webhook as Webhook Handler
participant DB as PostgreSQL
participant Redis as Redis Cache
participant SocketIO as Socket.IO
participant Handler as PR Handler
participant Metrics as Metrics Aggregator
participant PythonService as Python Analysis Service
participant GitHubAPI as GitHub API
Note over GitHub,Webhook: PHASE 1: WEBHOOK RECEPTION
GitHub->>Webhook: Pull Request Event (pull_request)
Webhook->>Webhook: Validate event type === "pull_request"
Webhook->>Webhook: Extract payload: action, repoId, prNumber, sender, head/base refs/shas
Note over Webhook,DB: PHASE 2: PROJECT LOOKUP
Webhook->>DB: Find Project by repoId (initialised: true)
DB-->>Webhook: Project found
Note over Webhook,SocketIO: PHASE 3: ACTION ROUTING
alt Action: "closed" + merged = true
Webhook->>Redis: Invalidate cache (metrics:repo:{repoId})
Webhook->>SocketIO: Send notification "PR merged"
SocketIO-->>Webhook: Notification sent
Webhook->>DB: Create notification in DB
else Action: "opened"
Webhook->>SocketIO: Send notification "PR opened"
SocketIO-->>Webhook: Notification sent
Webhook->>DB: Create notification in DB
else Action: "synchronize"
Webhook->>SocketIO: Send notification "PR updated"
SocketIO-->>Webhook: Notification sent
Webhook->>DB: Create notification in DB
else Action: "reopened"
Webhook->>SocketIO: Send notification "PR reopened"
SocketIO-->>Webhook: Notification sent
end
Note over Webhook,Handler: PHASE 4: MAIN HANDLER
Webhook->>Handler: handlePullRequest(payload)
Handler->>Handler: Validate payload (repoFullName, prNumber, installationId)
alt Invalid payload
Handler-->>Webhook: Return {skipped: true, reason: "invalid-payload"}
end
Note over Handler,DB: PHASE 5: PR ANALYTICS
Handler->>DB: Fetch Project userId
DB-->>Handler: User ID
Handler->>DB: Find existing PullRequestActivity (repoId + prNumber)
DB-->>Handler: Existing PR or null
alt Action: "opened"
Handler->>DB: Upsert PullRequestActivity (state: "open", createdAtGitHub, reviewCount)
Handler->>DB: Create activity log "{sender} opened PR"
Handler->>Metrics: Trigger aggregateRepoPRMetrics (async)
else Action: "reopened"
Handler->>DB: Upsert PullRequestActivity (state: "open")
Handler->>DB: Create activity log "{sender} reopened PR"
else Action: "synchronize"
Handler->>DB: Update existing (state: "open")
Handler->>DB: Create activity log "{sender} updated PR"
else Action: "closed"
Handler->>DB: Update/Create (state: "merged" or "closed", timeToMerge, closedAtGitHub)
Handler->>DB: Create activity log "{sender} closed/merged PR"
Handler->>Metrics: Trigger aggregateRepoPRMetrics (async)
end
Note over Metrics,DB: PHASE 6: METRICS AGGREGATION (Async)
Metrics->>DB: Query all PullRequestActivity for repo
DB-->>Metrics: All PR data
Metrics->>Metrics: Calculate aggregated metrics:<br/>• PRs opened/merged/closed<br/>• Open/stale PRs<br/>• Time to merge stats<br/>• Time to first review<br/>• Review distribution
Metrics->>DB: Upsert PRVelocityMetrics
Note over Handler,PythonService: PHASE 7: ACTIONABLE CHECK
Handler->>Handler: Check if actionable (opened/reopened/synchronize)
alt Not actionable (closed)
Handler-->>Webhook: Return {skipped: true, reason: "non-actionable"}
else Actionable
Note over Handler,PythonService: PHASE 8: ANALYSIS PREPARATION
Handler->>Handler: Create job data:<br/>• repoFullName, repoId<br/>• installationId<br/>• prNumber, head, base<br/>• isFromFork
Handler->>Handler: analyzePullRequest(jobData)
Note over Handler,PythonService: PHASE 9: PYTHON SERVICE CALL
Handler->>PythonService: POST /v1/internal/analysis/pr (timeout: 15 min)
PythonService->>PythonService: pull_analyze_repo() - Initialize analysis
Note over PythonService,GitHubAPI: PHASE 10: FETCH PR DATA
PythonService->>GitHubAPI: get_installation_token() - GitHub App auth
GitHubAPI-->>PythonService: Access token
PythonService->>GitHubAPI: Fetch PR files and changes
GitHubAPI-->>PythonService: PR file data
Note over PythonService: PHASE 11: CODE ANALYSIS
PythonService->>PythonService: analyze_pr_opened():<br/>• Calculate risk/complexity<br/>• Detect security issues<br/>• Check for tests/docs<br/>• Generate suggestions
Note over PythonService,Handler: PHASE 12: RETURN ANALYSIS
PythonService-->>Handler: Analysis results:<br/>• score, summary<br/>• metrics, annotations<br/>• suggestions, warnings<br/>• recommended reviewers
Note over Handler,DB: PHASE 13: SAVE ANALYSIS
Handler->>DB: Find or create PullRequestAnalysis
Handler->>DB: Update analysis data:<br/>• risk score<br/>• complexity metrics<br/>• security findings<br/>• suggestions<br/>• reviewer recommendations
DB-->>Handler: Analysis saved
Note over Handler,Webhook: PHASE 14: COMPLETION
Handler-->>Webhook: Return response:<br/>{enqueued: true,<br/>repoFullName, prNumber,<br/>action, jobId}
end
Webhook-->>GitHub: PR Analysis Complete!
Note over Handler,Webhook: ERROR HANDLING
alt Analysis Error
PythonService-->>Handler: Error response
Handler->>Handler: Log error
Handler-->>Webhook: Return error response
else Timeout Error
PythonService-->>Handler: Timeout (15 min)
Handler->>Handler: Log timeout
Handler-->>Webhook: Return error response
end
Process Steps:
- GitHub webhook triggered on PR open/update/close
- API extracts PR number, author, reviewers, file changes
- PR analysis job queued
- Worker fetches diff and identifies impact scope
- Python analyzes code changes in PR context
- PR metadata stored with analysis results
- PR velocity metrics calculated (time to merge, review time)
- Aggregated PR metrics updated in database
- Frontend notified of PR activity
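The velocity metrics in steps 7-8 reduce to simple aggregation over PR activity records. A sketch (the record shape here is illustrative, not the actual `PullRequestActivity` schema):

```javascript
// Aggregate PR velocity from activity records. The record shape
// ({ state, createdAt, mergedAt }) is a stand-in for illustration.
function aggregatePrVelocity(prs) {
  const merged = prs.filter((pr) => pr.state === 'merged' && pr.mergedAt);
  const hoursToMerge = merged.map(
    (pr) => (new Date(pr.mergedAt) - new Date(pr.createdAt)) / 36e5 // ms -> hours
  );
  const avg = hoursToMerge.length
    ? hoursToMerge.reduce((a, b) => a + b, 0) / hoursToMerge.length
    : 0;
  return {
    opened: prs.length,
    merged: merged.length,
    avgTimeToMergeHours: avg,
  };
}

const stats = aggregatePrVelocity([
  { state: 'merged', createdAt: '2024-01-01T00:00:00Z', mergedAt: '2024-01-01T12:00:00Z' },
  { state: 'merged', createdAt: '2024-01-02T00:00:00Z', mergedAt: '2024-01-03T00:00:00Z' },
  { state: 'open',   createdAt: '2024-01-04T00:00:00Z' },
]);
// stats.avgTimeToMergeHours === 18 (average of 12h and 24h)
```

Time-to-first-review and stale-PR counts follow the same pattern over review timestamps.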
The worker server runs scheduled jobs for maintenance and metric collection:
Cron Job Schedule:
| Job | Schedule | Purpose |
|---|---|---|
| Push Activity Aggregation | Every hour (0 * * * *) | Consolidate push frequency, commit patterns, and activity metrics |
| PR Velocity Calculation | Every 2 hours (0 */2 * * *) | Aggregate PR metrics, calculate average review times and merge velocity |
| Notification Cleanup | Daily at midnight (0 0 * * *) | Clean up old and read notifications from the database |
| Resolved Alerts Deletion | Daily at 2:00 AM (0 2 * * *) | Delete resolved alerts older than 7 days to maintain database performance |
Implementation:
// Node-cron job definitions in worker server
cron.schedule('0 * * * *', async () => { await aggregatePushActivity(); });
cron.schedule('0 */2 * * *', async () => { await aggregatePRVelocity(); });
cron.schedule('0 0 * * *', async () => { await notif_cleanUp(); });
cron.schedule('0 2 * * *', async () => { await deleteResolvedAlerts(7); });

The health score is a weighted composite (0-100) calculated from four dimensions:
Health Score = (Code Quality Γ 0.45)
+ (Development Activity Γ 0.25)
+ (Bus Factor Γ 0.15)
+ (Community Γ 0.15)
Rating Scale:
- 85-100: Excellent - Exemplary quality and active development
- 70-84: Good - Solid foundation with minor improvements needed
- 55-69: Fair - Acceptable but requires attention in some areas
- 40-54: Needs Improvement - Significant issues, refactoring recommended
- 0-39: Critical - Severe problems requiring immediate intervention
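The weighting and the rating bands above translate directly into code; a sketch (all dimension inputs assumed to be on a 0-100 scale):

```javascript
// Weighted composite health score from the four dimensions.
function healthScore({ codeQuality, activity, busFactor, community }) {
  return 0.45 * codeQuality + 0.25 * activity + 0.15 * busFactor + 0.15 * community;
}

// Map a score onto the rating bands.
function rating(score) {
  if (score >= 85) return 'Excellent';
  if (score >= 70) return 'Good';
  if (score >= 55) return 'Fair';
  if (score >= 40) return 'Needs Improvement';
  return 'Critical';
}

const score = healthScore({ codeQuality: 80, activity: 70, busFactor: 65, community: 50 });
// 0.45*80 + 0.25*70 + 0.15*65 + 0.15*50 = 70.75 -> "Good"
```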
Derived from static analysis metrics:
File-Level Risk Score:
Risk Score = (0.35 Γ Normalized Complexity)
+ (0.35 Γ Normalized Halstead Volume)
+ (0.25 Γ Maintainability Penalty)
+ (0.05 Γ LOC Penalty)
Metrics Explained:
- Cyclomatic Complexity
  - Measures the number of independent code paths
  - Threshold: 12
  - Higher values indicate harder-to-test code
- Halstead Volume
  - Quantifies the cognitive effort required to understand code
  - Threshold: 1200
  - Based on operator/operand frequency
- Maintainability Index
  - Composite score (0-100) for ease of maintenance
  - Inverted to create a penalty: 100 - MI
  - Lower MI increases risk
- Lines of Code (LOC)
  - Penalty for files exceeding 150 source lines
  - Large files violate single responsibility
Technical Debt Score:
- Average risk score across all files
- Lower values indicate healthier codebase
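Putting the formula and thresholds together in code; the normalization (clamping each ratio against its threshold to [0, 1]) is an assumption for illustration, while the weights and thresholds come from the formulas above:

```javascript
const clamp01 = (x) => Math.min(Math.max(x, 0), 1);

// File-level risk score sketch. Clamped-ratio normalization is assumed;
// weights (0.35/0.35/0.25/0.05) and thresholds (12, 1200, 150) are from above.
function fileRiskScore({ complexity, halsteadVolume, maintainability, loc }) {
  const normComplexity = clamp01(complexity / 12);
  const normHalstead = clamp01(halsteadVolume / 1200);
  const miPenalty = clamp01((100 - maintainability) / 100);
  const locPenalty = clamp01(Math.max(loc - 150, 0) / 150);
  return 0.35 * normComplexity + 0.35 * normHalstead +
         0.25 * miPenalty + 0.05 * locPenalty;
}

// Technical debt: average risk score across all files.
function technicalDebt(files) {
  if (files.length === 0) return 0;
  return files.reduce((sum, f) => sum + fileRiskScore(f), 0) / files.length;
}

const risky = fileRiskScore({ complexity: 24, halsteadVolume: 2400, maintainability: 20, loc: 300 });
// 0.35*1 + 0.35*1 + 0.25*0.8 + 0.05*1 = 0.95
```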
Evaluates commit patterns and velocity:
Activity Score = (0.6 Γ Activity Intensity)
+ (0.4 Γ Consistency Score)
Components:
- Recent Commits: Logarithmic scaling of 30-day commit count
- Consistency: Based on coefficient of variation
- Activity Ratio: Days with commits / total days
- Velocity Trend: Increasing, stable, or decreasing
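A sketch of the composite. Only the 0.6/0.4 weighting and the use of logarithmic scaling are stated above; the saturation point (assumed here to be 100 commits per 30 days) and the use of activity ratio as the consistency proxy are illustrative assumptions:

```javascript
// Development activity sketch; scaling constants are assumptions.
function activityScore({ commitsLast30Days, daysWithCommits }) {
  // Logarithmic intensity, saturating at an assumed 100 commits / 30 days.
  const intensity = Math.min(Math.log1p(commitsLast30Days) / Math.log1p(100), 1) * 100;
  // Consistency proxy: fraction of the window's days with at least one commit.
  const consistency = (daysWithCommits / 30) * 100;
  return 0.6 * intensity + 0.4 * consistency;
}

const quiet = activityScore({ commitsLast30Days: 3, daysWithCommits: 2 });
const busy = activityScore({ commitsLast30Days: 90, daysWithCommits: 22 });
// busy > quiet; log scaling rewards early commits more than marginal ones
```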
Assesses knowledge concentration risk:
Top Contributor Ratio = Top Contributor Commits / Total Commits
Risk Level:
- High (35 pts): Ratio > 70%
- Medium (65 pts): Ratio 50-70%
- Low (90 pts): Ratio < 50%
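The bands map directly to a scoring function:

```javascript
// Bus factor points from the top-contributor commit ratio, per the bands above.
function busFactorScore(topContributorCommits, totalCommits) {
  const ratio = totalCommits === 0 ? 0 : topContributorCommits / totalCommits;
  if (ratio > 0.7) return { risk: 'High', points: 35, ratio };
  if (ratio >= 0.5) return { risk: 'Medium', points: 65, ratio };
  return { risk: 'Low', points: 90, ratio };
}

const concentrated = busFactorScore(80, 100); // one person owns 80% of commits
const balanced = busFactorScore(30, 100);     // knowledge is spread out
```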
Measures external engagement:
Community = (0.4 Γ log(1 + stars) Γ 18)
+ (0.35 Γ log(1 + forks) Γ 22)
+ (0.25 Γ log(1 + watchers) Γ 20)
Logarithmic scaling prevents bias toward very popular repositories.
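As a direct transcription of the formula (whether the result is capped at 100 is not specified here):

```javascript
// Community engagement with logarithmic scaling; constants from the formula above.
function communityScore({ stars, forks, watchers }) {
  return 0.4 * Math.log(1 + stars) * 18 +
         0.35 * Math.log(1 + forks) * 22 +
         0.25 * Math.log(1 + watchers) * 20;
}

const small = communityScore({ stars: 10, forks: 2, watchers: 5 });
const huge = communityScore({ stars: 50000, forks: 8000, watchers: 2000 });
// huge scores higher, but nowhere near 5000x despite 5000x the stars
```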
- Node.js 18.x or higher
- Python 3.11 or higher
- Redis 6.x or higher
- PostgreSQL 14.x or higher (via Supabase)
- GitHub App configured with webhook permissions
# Clone the repository
git clone https://github.com/your-org/codehealth-ai.git
cd codehealth-ai

# Frontend setup
cd frontend
npm install
# Create .env.local file (see Environment Configuration)
cp .env.example .env.local
npm run dev
# Frontend runs on http://localhost:3000

# Backend setup
cd backend
npm install
# Create .env file
cp .env.example .env
# Start API server
npm run dev
# API runs on http://localhost:5000

# Analysis engine setup
cd analysis-engine
pip install -r requirements.txt
# Create .env file
cp .env.example .env
# Start FastAPI server
uvicorn main:app --reload --port 8000
# Analysis engine runs on http://localhost:8000

# Redis: using Docker
docker run -d -p 6379:6379 redis:latest
# Or install locally
# macOS
brew install redis
redis-server
# Ubuntu
sudo apt-get install redis-server
sudo service redis-server start

# Authentication URLs
NEXT_PUBLIC_GITHUB_AUTH_URL=http://localhost:5000/auth/github
NEXT_PUBLIC_GOOGLE_AUTH_URL=http://localhost:5000/auth/google
NEXT_PUBLIC_LOGIN_URL=http://localhost:5000/auth/login
NEXT_PUBLIC_SIGNUP_URL=http://localhost:5000/auth/signup
# API Endpoints
NEXT_PUBLIC_AXIOS_API_URL=http://localhost:5000/api
NEXT_PUBLIC_SOCKET_URL=http://localhost:5000
# Frontend URL
NEXT_PUBLIC_Frontend_URL=http://localhost:3000
# GitHub App Permissions
NEXT_PUBLIC_GITHUB_PERMISSION_URL=https://github.com/apps/your-app-slug/installations/new
# Development Mode
NEXT_PUBLIC_USE_MOCK_DATA=false

# Server Configuration
PORT=5000
# Google OAuth
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
GOOGLE_REDIRECT_URI=http://localhost:5000/auth/google/callback
# GitHub OAuth
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret
GITHUB_REDIRECT_URI=http://localhost:5000/auth/github/callback
# GitHub App Configuration
GITHUB_APP_REDIRECT_URI=http://localhost:3000/dashboard
APP_ID=your_github_app_id
APP_SECRET=your_app_secret
PRIVATE_KEY=path/to/private-key.pem
GITHUB_WEBHOOK_SECRET=your_webhook_secret
GITHUB_APP_SLUG=your-app-slug
# GitHub App Private Key (inline)
GITHUB_PRIVATE_KEY="-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEA...
-----END RSA PRIVATE KEY-----"
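A practical note on the inline key format above: platform dashboards and CLIs often flatten multi-line values into a single line containing literal `\n` escapes, so code reading the key typically normalizes it before use. A minimal hypothetical helper (not part of the CodeHealth-AI codebase):

```python
import os

def load_private_key(var: str = "GITHUB_PRIVATE_KEY") -> str:
    """Return the PEM key from the environment, normalizing escaped newlines.

    Values stored with literal "\\n" sequences are converted back to real
    newlines; values already containing real newlines pass through unchanged.
    (Hypothetical helper, shown only to illustrate the gotcha.)
    """
    raw = os.environ.get(var, "")
    return raw.replace("\\n", "\n")
```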
# Database Configuration
DATABASE_URL=postgresql://user:password@host:5432/database
DATABASE_PASS=your_database_password
# JWT Secret
JWT_SECRET=your_jwt_secret_key
# Frontend URL
FRONTEND_URL=http://localhost:3000
WEB_APP_REDIRECT_URI=http://localhost:3000/auth/callback
# Email Configuration
NODEMAILER_PASSKEY=your_gmail_app_password
# Redis Configuration
REDIS_PASSWORD=your_redis_password
REDIS_HOST=localhost
REDIS_PORT=6379
# Worker Configuration
ANALYSIS_CONCURRENCY=5
ANALYSIS_DEADLINE_MS=600000
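The two worker settings above tune how the analysis workers behave: how many jobs run in parallel, and the per-job time budget (600000 ms = 10 minutes). A hedged sketch of how such values might be parsed, using the defaults shown in this listing (illustrative only; the actual backend may read them differently):

```python
import os

def worker_settings(env=os.environ) -> dict:
    """Parse worker tuning knobs with the defaults from the .env listing.

    ANALYSIS_CONCURRENCY: number of analysis jobs a worker runs in parallel.
    ANALYSIS_DEADLINE_MS: per-job time budget in milliseconds.
    (Illustrative parsing, not taken from the CodeHealth-AI source.)
    """
    return {
        "concurrency": int(env.get("ANALYSIS_CONCURRENCY", "5")),
        "deadline_ms": int(env.get("ANALYSIS_DEADLINE_MS", "600000")),
    }
```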
# Python Analysis Engine URL
ANALYSIS_INTERNAL_URL=http://localhost:8000
# Backend Server URL (for workers)
BACKEND_SERVER=http://localhost:5000
# Ngrok (for local webhook testing)
NGROK_AUTHTOKEN=your_ngrok_token

# Server Configuration
PORT=8000
# GitHub App
GITHUB_APP_ID=your_github_app_id
GITHUB_PRIVATE_KEY='-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEA...
-----END RSA PRIVATE KEY-----'
# AI APIs
TOGETHER_API_KEY=your_together_ai_key
GEMINI_API_KEY=your_primary_gemini_key
GEMINI_API_KEY2=your_backup_gemini_key
# Express Backend URL
EXPRESS_URL=http://localhost:5000

┌───────────────────────────────────────────────────────────┐
│ Frontend (Vercel)                                         │
│ - Automatic deployments from main branch                  │
│ - Edge network CDN                                        │
│ - Environment variables via Vercel dashboard              │
└───────────────────────────────────────────────────────────┘
                             │
┌───────────────────────────────────────────────────────────┐
│ API Server (Heroku)                                       │
│ - Procfile: web: node src/server.js                       │
│ - Config vars set via Heroku CLI/dashboard                │
│ - Dyno type: Standard-1X                                  │
└───────────────────────────────────────────────────────────┘
                             │
┌───────────────────────────────────────────────────────────┐
│ Worker Server (Heroku)                                    │
│ - Procfile: worker: node src/workers/index.js             │
│ - Separate dyno from API server                           │
│ - Dyno type: Standard-2X (more CPU for analysis)          │
└───────────────────────────────────────────────────────────┘
                             │
┌───────────────────────────────────────────────────────────┐
│ Analysis Engine (Heroku)                                  │
│ - Procfile: web: uvicorn main:app --host 0.0.0.0          │
│ - Python buildpack                                        │
│ - Dyno type: Standard-1X                                  │
└───────────────────────────────────────────────────────────┘
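Because each service keeps its own config vars, a quick fail-fast check before deploying can catch a missing variable early instead of at runtime. A minimal sketch, assuming an illustrative subset of the variables from Environment Configuration:

```python
import os

# Illustrative subset of the backend variables listed above.
REQUIRED_VARS = [
    "DATABASE_URL",
    "JWT_SECRET",
    "REDIS_HOST",
    "GITHUB_WEBHOOK_SECRET",
]

def missing_vars(env=os.environ) -> list:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Example: report anything missing before pushing to Heroku/Vercel.
missing = missing_vars()
if missing:
    print("Missing required environment variables:", ", ".join(missing))
```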
Frontend (Vercel):
cd frontend
vercel --prod

Backend (Heroku):
cd backend
heroku create codehealth-api
git push heroku main
heroku config:set KEY=VALUE

Workers (Heroku):
cd workers
heroku create codehealth-workers
git push heroku main
heroku ps:scale worker=1

Python (Heroku):
cd analysis-engine
heroku create codehealth-analysis
heroku buildpacks:set heroku/python
git push heroku main

Set all environment variables using:
heroku config:set VARIABLE_NAME=value --app app-name

For Vercel:
vercel env add VARIABLE_NAME

This project was developed by a two-member team with clearly defined responsibilities:
Role: Frontend Engineer, UI/UX Designer & Deployment
Responsibilities:
- Complete frontend development using Next.js
- UI/UX design and component architecture
- Data visualization implementation with D3.js and Recharts
- Dashboard design and development
- WebSocket client integration for real-time updates
- Observability interface development
- Alert configuration UI
- Repository connection flow
- Authentication UI (GitHub/Google OAuth)
- Responsive design and mobile optimization
- Complete deployment
Key Contributions:
- Designed intuitive, data-rich dashboards
- Implemented complex visualizations (heatmaps, radar charts, distributions)
- Created real-time update system for metrics
- Built seamless GitHub integration flow
- Optimized frontend performance for large datasets
Role: Backend Engineer & System Architect
Responsibilities:
- Complete backend architecture and implementation
- Metric design and scoring algorithms
- Data modeling and database schema
- GitHub integrations and webhook handling
- BullMQ worker implementation and job orchestration
- Redis pub/sub messaging system
- Alert system and email notifications
- Cron job scheduling for maintenance tasks
- Python analysis engine integration
- API endpoint design and implementation
- Authentication and authorization logic
Key Contributions:
- Designed distributed job processing architecture
- Implemented real-time metric calculation pipeline
- Built automated re-analysis system on push/PR events
- Created technical debt scoring algorithms
- Developed worker isolation strategy for scalability
Comprehensive repository health overview with key metrics and trends
Custom threshold-based alert setup interface
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Built by Kalash Thakare and Jayesh Rajbhar.
