diff --git a/.claude/active/CLAUDE.feature-smartcache.md b/.claude/active/CLAUDE.feature-smartcache.md deleted file mode 100644 index c2608a3..0000000 --- a/.claude/active/CLAUDE.feature-smartcache.md +++ /dev/null @@ -1,423 +0,0 @@ -# Feature Plan: Smart Cache - Project-Level Caching with Subfolder Extraction - -## Overview - -Cache entire project appraisals once, then extract subfolder contributions from cache on subsequent requests without calling the worker again. - -## Current Architecture - -``` -Client → API → Check Redis cache (folder-specific key) - ↓ (hit) - Return cached JSON - ↓ (miss) - Send AppraisalRequest to SQS → Return 202 Processing - ↓ - Worker receives request - ↓ - Clone if needed - ↓ - Appraise FOLDER only - ↓ - Store JSON in Redis (folder key) - ↓ - Notify via Faye -``` - -**Current cache key format**: `appraisal:{owner}/{project}/{folder_path}` - -**Problem**: Each subfolder requires a separate worker appraisal, even though: -- The worker clones the entire repo anyway -- Git blame runs on all files during appraisal -- Subfolder data is a subset of root appraisal data - -## Proposed Architecture - -``` -Client → API → Check Redis cache (project root key) - ↓ (hit) - Extract subfolder from cached root → Return JSON - ↓ (miss) - Send AppraisalRequest to SQS (always root) → Return 202 Processing - ↓ - Worker receives request - ↓ - Clone if needed - ↓ - Appraise ROOT folder - ↓ - Store JSON in Redis (root key only) - ↓ - Notify via Faye -``` - -**New cache key format**: `appraisal:{owner}/{project}/` (root only) - -**Benefits**: -- Single worker request caches all subfolder data -- Subsequent subfolder requests are cache hits -- Reduced SQS messages, worker load, and latency - -## Key Changes - -### 1. Worker Changes - -- **Modify**: Always appraise root folder (ignore folder_path in request) -- **Modify**: Cache key always uses root path (empty string) -- **Keep**: Progress reporting unchanged - -### 2. Web API Changes - -- **Add**: Subfolder extraction logic from cached root appraisal -- **Modify**: Cache lookup uses root key, then extracts subfolder -- **Modify**: AppraisalRequest always requests root (folder_path = "") -- **Add**: New domain/infrastructure for subfolder extraction - -### 3. Cache Strategy - -- **Single key per project**: `appraisal:{owner}/{project}/` -- **Value**: Full project `FolderContributions` JSON (root) -- **TTL**: 1 day (unchanged) -- **Extraction**: API extracts subfolder from cached root on demand - -## Discussion Points - -### Feasibility Assessment - -**Pros:** - -1. ⚠️ **Reduced worker calls**: One appraisal serves all subfolder requests -2. ⚠️ **Lower latency**: Subfolders served from cache after first request -3. ⚠️ **Simpler cache model**: One key per project vs many keys per folder -4. ⚠️ **Less SQS traffic**: Fewer worker requests - -**Concerns to Address:** - -1. ~~⚠️ **Cache size**: Root appraisal JSON larger than subfolder-only~~ **RESOLVED: ~30KB is practical** - -2. ~~⚠️ **Extraction complexity**: How to extract subfolder from root?~~ **RESOLVED: Navigate tree via Roar deserialization** - -3. ~~⚠️ **First request latency**: Root appraisal takes longer than subfolder~~ **ACCEPTED: Same as current (full clone happens anyway)** - -4. ⚠️ **Subfolder not found**: What if requested subfolder doesn't exist? - - Return nil from extraction, service returns 404 - -5. ~~⚠️ **Representer changes**: How to serialize extracted subfolder?~~ **RESOLVED: Re-serialize OpenStruct subtree via representer** - -## Implementation Phases - -### Phase 1: Analysis & Design ✅ - -- [x] Analyze `FolderContributions` structure for subfolder extraction -- [x] Measure typical root appraisal JSON sizes (~30KB practical) -- [x] Design subfolder extraction algorithm (tree traversal via Roar deserialization) -- [x] Decide on location for extraction logic (Representer) -- [x] Document decisions in this file - -### Phase 2: Refactor - Create `Request::Appraisal` and `Messaging::AppraisalJob` ✅ - -**Replace `Request::ProjectPath` with `Request::Appraisal`:** - -- [x] Create `Request::Appraisal` in `app/application/requests/` - - Subsumes `ProjectPath` functionality (owner_name, project_name, folder_name) - - Add `project_fullname` method - - Add `cache_key` method (single source of truth) - - Add `root_request?` helper -- [x] Update controller to use `Request::Appraisal` instead of `ProjectPath` -- [x] Update `Service::FetchOrRequestAppraisal` to use `request.cache_key` -- [x] Remove duplicate `appraisal_cache_key` helper from service -- [x] Delete old `Request::ProjectPath` -- [x] Add unit tests for `Request::Appraisal` - -**Consolidate worker job DTO:** - -- [x] Create `Messaging::AppraisalJob` in `app/infrastructure/messaging/` -- [x] Create `Representer::AppraisalJob` (replaces `AppraisalRequest`) -- [x] Update service to use `Messaging::AppraisalJob` -- [x] Update worker `JobReporter` to use new naming -- [x] Delete legacy `Response::CloneRequest` and `Response::AppraisalRequest` -- [x] Delete legacy `Representer::CloneRequest` and `Representer::AppraisalRequest` -- [x] Update tests and documentation -- [x] Verify all existing tests pass (88 tests, 0 failures) - -### Phase 3: Representer Subfolder Extraction ✅ - -**Add extraction methods to `Representer::FolderContributions`:** - -- [x] Add `find_subfolder(root_ostruct, folder_path)` - tree traversal helper -- [x] Add `extract_subfolder(json_string, folder_path)` - returns OpenStruct or nil -- [x] Add `extract_subfolder_json(json_string, folder_path)` - returns JSON string or nil -- [x] Handle edge cases: root path (empty string), path not found, trailing slashes -- [x] Unit tests for extraction methods (16 tests in `folder_contributions_extraction_spec.rb`) - -### Phase 4: Smart Cache Key and Worker Request ✅ - -**API owns smart cache strategy; worker remains general-purpose:** - -- [x] Update `Request::Appraisal#cache_key` for root-only behavior - - `cache_key` now always returns root key: `appraisal:{owner}/{project}/` - - `folder_name` still available for subfolder extraction -- [x] Update `FetchOrRequestAppraisal` to always send root to worker - - Added `ROOT_FOLDER_PATH = ''` constant in API service - - Worker receives root path, appraises faithfully -- [x] Worker unchanged - remains general-purpose (appraises whatever folder it's told) -- [x] Update unit tests for smart cache behavior - -### Phase 5: API Service Subfolder Extraction ✅ - -**Modify `FetchOrRequestAppraisal` to extract subfolders from cache:** - -- [x] On cache hit: extract requested subfolder from cached root JSON - - Added `extract_subfolder` and `rebuild_appraisal_json` helper methods - - Root requests return cached JSON unchanged - - Subfolder requests extract and rebuild JSON with correct `folder_path` -- [x] Handle subfolder-not-found (return 404) - - Returns `NO_FOLDER_ERR` with `:not_found` status -- [x] Update service unit/integration tests - - Added 2 integration tests: subfolder extraction (happy) and subfolder not found (sad) - -### Phase 6: Cleanup & Testing ✅ - -- [x] End-to-end acceptance tests for smart cache flow -- [x] Test scenarios: root request, subfolder request, nested subfolder, invalid path -- [x] Verify old folder-specific cache keys are ignored (TTL expiration) -- [x] Update `.claude/CLAUDE.md` documentation with new architecture -- [x] Update session log with completion status - -## Session Log - -### Session 1 - -- Initial discussion of smart cache feature -- Updated CLAUDE.md to reflect current project state -- Created this planning document -- Answered all 6 Open Questions: - - Q1: FolderContributions has nested tree structure; JSON preserves it via recursive representer - - Q2: Measured JSON sizes (~30KB for root project); root-only caching is practical - - Q3: Extraction logic in Representer (presentation layer) - keeps entity worker-only - - Q4: Tree traversal is O(depth); Roar `from_json` handles nested deserialization automatically - - Q5: Cache invalidation unchanged; smart cache simplifies (one key per project) - - Q6: Create `Request::Appraisal` subsuming `ProjectPath` with cache key methods -- Made 5 decisions (see Decisions Made section) -- Refined Implementation Phases (6 phases) with Phase 2 as refactor -- Committed: `8c020a3` - docs: complete Phase 1 planning for smart cache feature -- **Status**: Phase 1 COMPLETE - -### Session 2 - -- Implemented Phase 2 refactoring: - - Created `Request::Appraisal` (replaces `Request::ProjectPath`) - - Created `Messaging::AppraisalJob` DTO (replaces `Response::AppraisalRequest`) - - Created `Representer::AppraisalJob` (replaces `Representer::AppraisalRequest`) - - Updated controller, service, and worker to use new objects - - Deleted legacy code: `CloneRequest`, `AppraisalRequest` (response/representer) - - Updated CLAUDE.md documentation - - All 88 tests passing -- Restructured test tasks: - - `rake spec` now runs only unit + integration tests (77 tests, no worker required) - - `rake spec:all` runs all tests including acceptance (88 tests, requires worker) - - `bash spec/acceptance_tests` starts worker and calls `spec:all` - - Updated CLAUDE.md testing documentation -- Committed: `cfac067` - tests: restructure test tasks for worker-independent development -- **Status**: Phase 2 COMPLETE - -### Session 3 - -- Implemented Phase 3 subfolder extraction: - - Added `extract_subfolder(json_string, folder_path)` class method to `Representer::FolderContributions` - - Added `extract_subfolder_json(json_string, folder_path)` for JSON string output - - Added private `find_subfolder` tree traversal helper - - Handles edge cases: root path, nil, trailing/leading slashes, non-existent folders, error appraisals - - Created JSON fixtures in `spec/fixtures/json/` (not wiped by `vcr:wipe`) - - Added 16 unit tests in `folder_contributions_extraction_spec.rb` - - All 93 tests passing (77 original + 16 new) -- Implemented Phase 4 smart cache (API-owned strategy): - - `Request::Appraisal#cache_key` now always returns root key - - `FetchOrRequestAppraisal` always sends root folder_path to worker - - Worker unchanged - remains general-purpose (appraises whatever it's told) - - Design: API owns smart cache strategy; worker stays simple - - Updated unit tests for smart cache behavior - - All 93 tests passing -- **Status**: Phase 4 COMPLETE; ready for Phase 5 - -### Session 4 - -- Implemented Phase 5 API service subfolder extraction: - - Added `extract_subfolder(cached_json, folder_name)` helper to `FetchOrRequestAppraisal` - - Added `rebuild_appraisal_json` helper for JSON reconstruction - - Root requests return cached JSON unchanged - - Subfolder requests extract, rebuild JSON with updated `folder_path` and `folder` - - Returns 404 when subfolder not found in cached root - - Design decision: Keep extraction as helper method (not separate step) - it's an implementation detail of cache lookup -- Added 2 integration tests: - - HAPPY: extract subfolder from cached root on cache hit - - SAD: return 404 when subfolder not found in cache -- All 95 tests passing (93 original + 2 new integration) -- **Status**: Phase 5 COMPLETE; ready for Phase 6 - -### Session 5 - -- Refactored `FetchOrRequestAppraisal` to separate caching from folder extraction: - - Renamed `check_cache` → `check_project_appraisal_cache` - - Created separate step `extract_folder_from_appraisal_on_cache_hit` - - Renamed `request_appraisal_worker` → `request_appraisal_worker_on_cache_miss` - - Key insight: "folder not found" is a bad request (404), not a cache miss -- Moved `rebuild_appraisal_json` to `Representer::Appraisal.rebuild_with_extracted_folder` - - Fixed `::JSON` namespace issue (avoids conflict with `Representable::JSON`) -- Implemented Phase 6 acceptance tests: - - Added smart cache test: extract multiple subfolders from cached root - - Added invalid subfolder test: returns 404 after root is cached - - Used actual YPBT-app folders (spec, models, services) -- Verified JSON fixtures in `spec/fixtures/json/` are correct and in use: - - `sample_appraisal.json`: used in 3 integration tests + 13 unit tests - - `error_appraisal.json`: used in 1 unit test (error handling) -- Updated CLAUDE.md documentation with smart cache architecture: - - Updated Worker-Based Appraisal Architecture diagram - - Updated cache key format documentation (root-only) - - Updated FetchOrRequestAppraisal step names - - Updated data flow example for viewing contributions -- All 107 tests passing -- **Status**: Phase 6 COMPLETE; Smart Cache feature COMPLETE - ---- - -## Open Questions - -1. ~~How is `FolderContributions` structured? Can subfolders be extracted?~~ **ANSWERED: See Decision 1** -2. ~~What is the typical JSON size for root vs subfolder appraisals?~~ **ANSWERED: ~30KB root; practical for Redis** -3. ~~Should extraction happen in domain layer or infrastructure?~~ **ANSWERED: See Decision 2** -4. ~~How to handle deep nested paths efficiently?~~ **ANSWERED: See Decision 3** -5. ~~Cache invalidation strategy unchanged?~~ **ANSWERED: See Decision 4** -6. ~~Should cache-key generation be consolidated with request parsing?~~ **ANSWERED: See Decision 5** - ---- - -## Decisions Made - -### Decision 1: Navigate Cached Tree for Subfolder Extraction - -- JSON preserves nested `FolderContributions` structure via recursive representer -- Subfolder extraction will **navigate the deserialized tree** (not filter/reconstruct) -- **Rationale**: Leverages existing JSON structure; no reconstruction needed - -### Decision 2: Extraction Logic in Presentation Layer (Representer) - -- Extraction logic will reside in `Representer::FolderContributions` as class methods -- Tentative interface: - - `extract_subfolder(json_string, folder_path)` → OpenStruct or nil - - `extract_subfolder_json(json_string, folder_path)` → JSON string or nil -- **Rationale**: - - Representer already knows the JSON shape - - This is representation logic, not business logic - - Keeps service layer thin - - Avoids needing `FolderContributions` entity in API (stays worker-only) -- Alternative considered: Domain entity - rejected because (a) would require API to have the entity, (b) extraction is representation concern not business logic -- **⚠️ Concern noted**: Representer-based traversal may be brittle to changes in the entity structure. If `FolderContributions` entity changes how it organizes subfolders or paths, the representer extraction logic could break. If this becomes an issue, reconsider having the entity own the traversal logic (would require sharing entity between API and worker). - -### Decision 3: Roar Deserialization is Sufficient - -- Roar's `from_json` already handles nested deserialization automatically -- Deserializes to `OpenStruct` tree (not domain entities) - sufficient for traversal -- No mapper needed (JSON → Entity conversion would be extra work with no benefit) -- API only needs to pass through JSON to client; OpenStruct sufficient for subfolder lookup -- **Verified**: Tested with cached `YPBT-app` project - full nested tree deserializes correctly - -### Decision 4: Cache Invalidation Strategy Unchanged - -- TTL-based expiration remains: 1 day for success, 10 seconds for errors -- Manual wipe via `rake cache:wipe` unchanged -- Smart cache **simplifies** invalidation: one key per project instead of many -- No automatic invalidation on repo updates (deferred to future feature) -- **Future considerations** (out of scope for this feature): - - Webhook-based invalidation on GitHub push - - Manual wipe + re-appraisal workflow - -### Decision 5: Create `Request::Appraisal` Subsuming `ProjectPath` - -- New `Request::Appraisal` object replaces `Request::ProjectPath` -- **Naming**: `Request::Appraisal` (not `Request::AppraisalRequest` - avoids redundancy) - - Clean pairing: `Request::Appraisal` (what client asks for) vs `Value::Appraisal` (what worker produces) -- Owns cache key generation as single source of truth -- Provides: - - `owner_name`, `project_name`, `folder_name` (from ProjectPath) - - `project_fullname` method - - `cache_key` method (currently folder-specific, will become root-only) - - `root_request?` helper -- **Rationale**: - - Semantic clarity: Request (what is asked for) vs Appraisal (result of work) - - Removes duplicate cache key logic from service - - Centralizes "what is being requested" concept - - Stays in application layer where request objects belong -- **Implementation**: Completed in Phase 2 -- **Note**: `Value::Appraisal#cache_key` remains for worker use (result caching) - -### Decision 6: Create `Messaging::AppraisalJob` for Worker Queue - -- New `Messaging::AppraisalJob` DTO replaces `Response::AppraisalRequest` -- **Naming**: "Job" clarifies it's a work payload, not a response -- **Location**: `app/infrastructure/messaging/` (alongside Queue) -- **Pattern**: Data Transfer Object (DTO) for SQS transport -- Serialized via `Representer::AppraisalJob` -- **Cleanup**: Deleted legacy `CloneRequest` and associated representer - ---- - -## Quick Reference - -### Current Cache Key Format - -``` -appraisal:{owner}/{project}/{folder_path} -``` - -Examples: -- Root: `appraisal:ISS-SOA/codepraise-api/` -- Subfolder: `appraisal:ISS-SOA/codepraise-api/app/domain/` - -### Proposed Cache Key Format - -``` -appraisal:{owner}/{project}/ -``` - -Examples: -- All requests use: `appraisal:ISS-SOA/codepraise-api/` - -### Key Files to Modify - -**Worker:** - -- `workers/application/services/appraise_project.rb` -- `app/domain/contributions/values/appraisal.rb` - -**API:** - -- `app/application/services/fetch_or_request_appraisal.rb` -- New: subfolder extraction logic (location TBD) - -**Tests:** - -- `spec/tests/unit/appraisal_spec.rb` -- `spec/tests/integration/services_spec.rb` - ---- - -## Notes - -### Commit Practices - -- **Summarize changes before requesting commit permission** - provide brief summary by folder/file for user review BEFORE asking to commit -- **Separate concerns into distinct commits** - e.g., test setup changes vs feature changes -- **User is author, Claude is co-author** - use `Co-Authored-By: Claude ` -- **Use conventional commit messages** - `feat:`, `fix:`, `refactor:`, `docs:`, etc. - - `feat:` only for changes to external API/service features - - `refactor:` for internal changes (new domain objects, infrastructure, etc.) - -### Implementation Practices - -- **Seek consent before moving to next phase** - summarize completed work, commit, then ask to proceed -- **Include coverage report in commits** - after successful tests, amend `coverage/.resultset.json` to coding commits - -### Planning Practices - -- **IMPORTANT: Pause after each question** - wait for explicit user approval before proceeding to the next question during planning discussions diff --git a/.claude/CLAUDE.md b/CLAUDE.md similarity index 97% rename from .claude/CLAUDE.md rename to CLAUDE.md index 09c4798..eeef66a 100644 --- a/.claude/CLAUDE.md +++ b/CLAUDE.md @@ -367,12 +367,10 @@ workers/ ├── domain/contributions/ # Entities, values, lib for git analysis ├── infrastructure/ │ ├── git/ # Git gateway, repositories, mappers -│ └── messaging/ # Faye progress publisher -├── presentation/values/ # Progress monitor calculations +│ └── faye/ # Faye server gateway and progress mapper ├── application/ │ ├── controllers/worker.rb # Shoryuken entry point -│ ├── services/appraise_project.rb -│ └── requests/job_reporter.rb +│ └── services/appraise_project.rb └── shoryuken*.yml # Queue configs ``` @@ -388,10 +386,7 @@ workers/ - Clones repo if not exists locally - Runs git blame analysis via `Mapper::Contributions` - Stores serialized JSON in Redis with TTL - -- `requests/job_reporter.rb`: Manages progress reporting - - Deserializes AppraisalJob from JSON - - Publishes progress updates to Faye channel + - Reports progress via symbols (`:started`, `:cloning_done`, `:appraising_started`, etc.) **Worker Infrastructure Layer (`workers/infrastructure/`):** @@ -408,14 +403,13 @@ workers/ - `FileContributionsMapper`: Maps file data to `Entity::FileContributions` - `BlameContributor`: Maps git contributor info to `Entity::Contributor` - `PorcelainParser`: Parses git blame porcelain format -- `messaging/progress_publisher.rb`: Sends progress to Faye + - `CloneMapper`: Maps git clone output lines to progress symbols +- `faye/faye_server.rb`: Gateway to Faye server - POSTs JSON messages to `/faye` endpoint - Channel format: `/{request_id}` - -**Worker Presentation Layer (`workers/presentation/`):** - -- `values/progress_monitor.rb`: Tracks progress phases - - `AppraisalMonitor`: Full appraisal progress (15-50% clone, 55-85% appraise, 90-100% cache) +- `faye/progress_mapper.rb`: Maps progress symbols to percentages + - Single source of truth for all symbol → percentage mappings + - Symbols: `:started`, `:cloning_*`, `:appraising_*`, `:caching_*`, `:finished` **Configuration Files:** @@ -655,11 +649,9 @@ Tests are organized by scope in `spec/tests/`: - Contributions domain (git analysis) → `workers/domain/contributions/` - Git operations → `workers/infrastructure/git/` -- Faye messaging → `workers/infrastructure/messaging/` -- Progress tracking → `workers/presentation/values/` +- Faye gateway and progress mapping → `workers/infrastructure/faye/` - Worker entry point → `workers/application/controllers/worker.rb` - Worker services → `workers/application/services/` -- Request handling → `workers/application/requests/` **Adding a New API Endpoint:** 1. Define service object in `app/application/services/` using Dry::Transaction diff --git a/coverage/.resultset.json b/coverage/.resultset.json index 2aa2c7e..e7bbd58 100644 --- a/coverage/.resultset.json +++ b/coverage/.resultset.json @@ -41,7 +41,7 @@ 4, null, 1, - 24, + 25, null, null ] @@ -284,7 +284,7 @@ null, 1, 1, - 60, + 56, null, null, null, @@ -310,11 +310,11 @@ null, null, 1, - 67, + 59, null, null, 1, - 79, + 71, null, null, null, @@ -2397,6 +2397,93 @@ null ] }, + "/Users/soumyaray/Sync/Dropbox/ossdev/projects/codepraise/api-codepraise/workers/infrastructure/faye/faye_server.rb": { + "lines": [ + null, + null, + 1, + null, + 1, + null, + 1, + 1, + 0, + 0, + null, + null, + 1, + 0, + 0, + 0, + null, + null, + null, + null, + 0, + null, + 0, + null, + null, + 1, + null, + 1, + 0, + null, + null, + null, + null + ] + }, + "/Users/soumyaray/Sync/Dropbox/ossdev/projects/codepraise/api-codepraise/workers/infrastructure/faye/progress_mapper.rb": { + "lines": [ + null, + null, + 1, + null, + null, + 1, + 1, + null, + null, + null, + null, + null, + null, + null, + null, + null, + null, + null, + null, + 1, + 0, + null, + null, + null, + 1, + 0, + 0, + null, + null, + null, + null, + 1, + 0, + 0, + null, + null, + null, + 1, + 0, + 0, + 0, + null, + null, + null, + null, + null + ] + }, "/Users/soumyaray/Sync/Dropbox/ossdev/projects/codepraise/api-codepraise/workers/infrastructure/git/gateway/git_command.rb": { "lines": [ null, @@ -2497,6 +2584,39 @@ null ] }, + "/Users/soumyaray/Sync/Dropbox/ossdev/projects/codepraise/api-codepraise/workers/infrastructure/git/mappers/clone_mapper.rb": { + "lines": [ + null, + null, + 1, + null, + null, + 1, + 1, + null, + null, + null, + null, + null, + null, + null, + null, + null, + 1, + 8, + 30, + null, + 3, + null, + null, + null, + 1, + 2, + null, + null, + null + ] + }, "/Users/soumyaray/Sync/Dropbox/ossdev/projects/codepraise/api-codepraise/workers/infrastructure/git/mappers/contributions_mapper.rb": { "lines": [ null, @@ -2964,111 +3084,6 @@ null ] }, - "/Users/soumyaray/Sync/Dropbox/ossdev/projects/codepraise/api-codepraise/workers/infrastructure/messaging/progress_publisher.rb": { - "lines": [ - null, - null, - 1, - null, - 1, - null, - 1, - 1, - 0, - 0, - null, - null, - 1, - 0, - 0, - 0, - null, - null, - null, - null, - 0, - null, - 0, - null, - null, - 1, - null, - 1, - 0, - null, - null, - null, - null - ] - }, - "/Users/soumyaray/Sync/Dropbox/ossdev/projects/codepraise/api-codepraise/workers/presentation/values/progress_monitor.rb": { - "lines": [ - null, - null, - 1, - null, - null, - 1, - 1, - null, - null, - null, - null, - null, - null, - null, - null, - null, - 1, - 0, - null, - null, - 1, - 0, - null, - null, - 1, - 0, - null, - null, - 1, - 0, - null, - null, - 1, - 0, - null, - null, - null, - null, - null, - null, - null, - null, - 1, - 1, - null, - null, - null, - null, - null, - null, - null, - null, - null, - null, - null, - 1, - 0, - null, - null, - 1, - 0, - null, - null, - null - ] - }, "/Users/soumyaray/Sync/Dropbox/ossdev/projects/codepraise/api-codepraise/spec/helpers/vcr_helper.rb": { "lines": [ null, @@ -3094,8 +3109,8 @@ null, 1, 35, - 2795, - 2795, + 2860, + 2860, null, null, 35, @@ -3140,13 +3155,13 @@ 1, null, 1, - 44, + 40, null, null, null, 1, - 64, - 64, + 56, + 56, null, null ] @@ -4426,11 +4441,11 @@ null, 1, 1, - 7, - 7, + 3, + 3, null, null, - 7, + 3, null, null, null, @@ -4445,12 +4460,12 @@ null, null, null, - 7, - 7, + 3, + 3, null, null, 1, - 7, + 3, null, null, 1, @@ -4487,7 +4502,8 @@ null, null, null, - 1, + null, + null, 1, 1, 1, @@ -4495,19 +4511,31 @@ null, 1, 1, + null, + null, + 1, 1, null, null, 1, 1, + null, + null, + 1, 1, null, null, 1, 1, + null, + null, + 1, 1, null, null, + 1, + 1, + null, null, null, 1, @@ -4683,20 +4711,6 @@ null, null, 1, - 4, - null, - null, - null, - null, - null, - null, - null, - 4, - 4, - null, - null, - null, - 1, 3, 4, null, @@ -4727,6 +4741,6 @@ ] } }, - "timestamp": 1766550348 + "timestamp": 1766594799 } } diff --git a/require_worker.rb b/require_worker.rb index 7ae1c41..f28d55b 100644 --- a/require_worker.rb +++ b/require_worker.rb @@ -3,7 +3,7 @@ # Requires all ruby files in specified worker folders # Worker has its own DDD layers parallel to the API: # - domain: contributions entities/values/lib -# - infrastructure: git gateway/repositories/mappers, messaging +# - infrastructure: git gateway/repositories/mappers, faye server/mapper # - presentation: progress monitor values # - application: controllers, services, requests # Params: diff --git a/spec/tests/unit/worker_appraise_spec.rb b/spec/tests/unit/worker_appraise_spec.rb index aff4764..643ee6b 100644 --- a/spec/tests/unit/worker_appraise_spec.rb +++ b/spec/tests/unit/worker_appraise_spec.rb @@ -67,26 +67,39 @@ end end - describe 'scale_clone_progress helper' do - it 'should scale Cloning to 25' do - service = Appraiser::Service::AppraiseProject.new - _(service.send(:scale_clone_progress, 'Cloning into...')).must_equal 25 - end +end - it 'should scale Receiving to 40' do - service = Appraiser::Service::AppraiseProject.new - _(service.send(:scale_clone_progress, 'Receiving objects: 50%')).must_equal 40 - end +describe 'Unit test of CodePraise::CloneMapper' do + it 'should map Cloning to :cloning_started' do + _(CodePraise::CloneMapper.map('Cloning into...')).must_equal :cloning_started + end - it 'should scale Checking to 50' do - service = Appraiser::Service::AppraiseProject.new - _(service.send(:scale_clone_progress, 'Checking connectivity...')).must_equal 50 - end + it 'should map remote: to :cloning_remote' do + _(CodePraise::CloneMapper.map('remote: Counting objects')).must_equal :cloning_remote + end - it 'should default unknown stages to 30' do - service = Appraiser::Service::AppraiseProject.new - _(service.send(:scale_clone_progress, 'Unknown stage')).must_equal 30 - end + it 'should map Receiving to :cloning_receiving' do + _(CodePraise::CloneMapper.map('Receiving objects: 50%')).must_equal :cloning_receiving + end + + it 'should map Resolving to :cloning_resolving' do + _(CodePraise::CloneMapper.map('Resolving deltas: 100%')).must_equal :cloning_resolving + end + + it 'should map Checking to :cloning_done' do + _(CodePraise::CloneMapper.map('Checking connectivity...')).must_equal :cloning_done + end + + it 'should return nil for unknown lines' do + _(CodePraise::CloneMapper.map('Unknown stage')).must_be_nil + end + + it 'should return default for unknown lines with map_or_default' do + _(CodePraise::CloneMapper.map_or_default('Unknown stage')).must_equal :cloning_started + end + + it 'should return custom default when specified' do + _(CodePraise::CloneMapper.map_or_default('Unknown', :custom)).must_equal :custom end end diff --git a/workers/application/controllers/worker.rb b/workers/application/controllers/worker.rb index 537ef23..4afb7b3 100644 --- a/workers/application/controllers/worker.rb +++ b/workers/application/controllers/worker.rb @@ -31,9 +31,11 @@ def self.config = Figaro.env Shoryuken.sqs_client_receive_message_opts = { wait_time_seconds: 20 } shoryuken_options queue: config.WORKER_QUEUE_URL, auto_delete: true - def perform(_sqs_msg, request) - job = JobReporter.new(request, Worker.config) - perform_appraisal(job) + def perform(_sqs_msg, request_json) + job = deserialize_job(request_json) + progress_mapper = build_progress_mapper(job.id) + + perform_appraisal(job, progress_mapper) rescue CodePraise::GitRepo::Errors::CannotOverwriteLocalGitRepo # worker should crash fail early - only catch errors we expect! puts 'CLONE EXISTS -- ignoring request' @@ -41,19 +43,30 @@ def perform(_sqs_msg, request) private - def perform_appraisal(job) + def deserialize_job(request_json) + CodePraise::Representer::AppraisalJob + .new(OpenStruct.new) + .from_json(request_json) + end + + def build_progress_mapper(channel_id) + faye_server = FayeServer.new(Worker.config, channel_id) + ProgressMapper.new(faye_server) + end + + def perform_appraisal(job, progress_mapper) gitrepo = CodePraise::GitRepo.new(job.project, Worker.config) Service::AppraiseProject.new.call( project: job.project, - folder_path: job.folder_path, + folder_path: job.folder_path || '', config: Worker.config, gitrepo: gitrepo, - progress: job.progress_callback + progress: progress_mapper ) # Keep sending finished status to any latecoming subscribers - job.report_each_second(5) { AppraisalMonitor.finished_percent } + progress_mapper.report_each_second(5, :finished) end end end diff --git a/workers/application/requests/job_reporter.rb b/workers/application/requests/job_reporter.rb deleted file mode 100644 index 1a3b2ab..0000000 --- a/workers/application/requests/job_reporter.rb +++ /dev/null @@ -1,36 +0,0 @@ -# frozen_string_literal: true - -# Note: ProgressPublisher is loaded via require_worker (infrastructure/messaging) - -module Appraiser - # Reports job progress to client - class JobReporter - attr_reader :project, :folder_path - - def initialize(job_json, config) - job = CodePraise::Representer::AppraisalJob - .new(OpenStruct.new) - .from_json(job_json) - - @project = job.project - @folder_path = job.folder_path || '' - @publisher = ProgressPublisher.new(config, job.id) - end - - def report(msg) - @publisher.publish msg - end - - def report_each_second(seconds, &operation) - seconds.times do - sleep(1) - report(operation.call) - end - end - - # Returns a proc that can be passed to services for progress reporting - def progress_callback - ->(percent) { report(percent.to_s) } - end - end -end diff --git a/workers/application/services/appraise_project.rb b/workers/application/services/appraise_project.rb index 1db3382..157dd38 100644 --- a/workers/application/services/appraise_project.rb +++ b/workers/application/services/appraise_project.rb @@ -19,7 +19,7 @@ class AppraiseProject # folder_path: String path to folder being appraised # config: Worker config with Redis URL # gitrepo: GitRepo instance - # progress: Proc to report progress + # progress: ProgressMapper instance to report progress via symbols def prepare_inputs(input) # Convert OpenStruct project to Entity::Project for Value::Appraisal @@ -31,15 +31,15 @@ def prepare_inputs(input) end def clone_repo(input) - input[:progress].call(15) # STARTED + input[:progress].report(:started) if input[:gitrepo].exists_locally? - input[:progress].call(50) # Skip to post-clone + input[:progress].report(:cloning_done) else input[:gitrepo].clone_locally do |line| - # Scale clone progress from 15 to 50 - percent = scale_clone_progress(line) - input[:progress].call(percent) + # Use CloneMapper to convert git output to progress symbol + symbol = CodePraise::CloneMapper.map_or_default(line) + input[:progress].report(symbol) end end @@ -60,13 +60,13 @@ def appraise_contributions(input) # Skip if already errored in clone step return Success(input) if input[:appraisal]&.error? - input[:progress].call(55) # Starting appraisal + input[:progress].report(:appraising_started) folder = CodePraise::Mapper::Contributions .new(input[:gitrepo]) .for_folder(input[:folder_path]) - input[:progress].call(85) # Appraisal complete + input[:progress].report(:appraising_done) # Build successful appraisal value object input[:appraisal] = CodePraise::Value::Appraisal.success( @@ -90,7 +90,7 @@ def appraise_contributions(input) end def cache_result(input) - input[:progress].call(90) # Caching + input[:progress].report(:caching_started) appraisal = input[:appraisal] json = CodePraise::Representer::Appraisal.new(appraisal).to_json @@ -98,32 +98,18 @@ def cache_result(input) cache = CodePraise::Cache::Remote.new(input[:config]) cache.set(appraisal.cache_key, json, ttl: appraisal.ttl) - input[:progress].call(100) # FINISHED + input[:progress].report(:finished) Success(input) rescue StandardError => e - # Cache failure - still report 100% so client can retry + # Cache failure - still report finished so client can retry puts "CACHE ERROR: #{e.message}" - input[:progress].call(100) + input[:progress].report(:finished) Success(input) end private - # Scale git clone progress (15-50 range) - def scale_clone_progress(line) - clone_stages = { - 'Cloning' => 25, - 'remote' => 35, - 'Receiving' => 40, - 'Resolving' => 45, - 'Checking' => 50 - } - - first_word = line.match(/^[A-Za-z]+/).to_s - clone_stages[first_word] || 30 - end - # Convert OpenStruct project (from JSON) to Entity::Project def build_project_entity(project_ostruct) owner = build_member_entity(project_ostruct.owner) diff --git a/workers/infrastructure/messaging/progress_publisher.rb b/workers/infrastructure/faye/faye_server.rb similarity index 89% rename from workers/infrastructure/messaging/progress_publisher.rb rename to workers/infrastructure/faye/faye_server.rb index beef60b..58512fa 100644 --- a/workers/infrastructure/messaging/progress_publisher.rb +++ b/workers/infrastructure/faye/faye_server.rb @@ -3,8 +3,8 @@ require 'http' module Appraiser - # Publishes progress as percent to Faye endpoint - class ProgressPublisher + # Gateway to Faye server for publishing progress messages + class FayeServer def initialize(config, channel_id) @config = config @channel_id = channel_id diff --git a/workers/infrastructure/faye/progress_mapper.rb b/workers/infrastructure/faye/progress_mapper.rb new file mode 100644 index 0000000..3e489de --- /dev/null +++ b/workers/infrastructure/faye/progress_mapper.rb @@ -0,0 +1,46 @@ +# frozen_string_literal: true + +module Appraiser + # Maps progress symbols to percentages and publishes via FayeServer + # Single source of truth for all progress phase percentages + class ProgressMapper + PHASES = { + started: 15, + cloning_started: 20, + cloning_remote: 35, + cloning_receiving: 40, + cloning_resolving: 45, + cloning_done: 50, + appraising_started: 55, + appraising_done: 85, + caching_started: 90, + finished: 100 + }.freeze + + def initialize(faye_server) + @faye_server = faye_server + end + + # Map a progress symbol to its percentage + def map(symbol) + PHASES.fetch(symbol) do + raise ArgumentError, "Unknown progress phase: #{symbol}" + end + end + + # Report progress by mapping symbol to percentage and publishing + def report(symbol) + percentage = map(symbol) + @faye_server.publish(percentage.to_s) + end + + # Report the same symbol repeatedly for a number of seconds + def report_each_second(seconds, symbol) + seconds.times do + sleep(1) + report(symbol) + end + end + + end +end diff --git a/workers/infrastructure/git/mappers/clone_mapper.rb b/workers/infrastructure/git/mappers/clone_mapper.rb new file mode 100644 index 0000000..ce2551c --- /dev/null +++ b/workers/infrastructure/git/mappers/clone_mapper.rb @@ -0,0 +1,29 @@ +# frozen_string_literal: true + +module CodePraise + # Maps git clone output lines to progress symbols + # Used by Service::AppraiseProject to report fine-grained clone progress + module CloneMapper + CLONE_PATTERNS = { + /^Cloning/i => :cloning_started, + /^remote:/i => :cloning_remote, + /^Receiving/i => :cloning_receiving, + /^Resolving/i => :cloning_resolving, + /^Checking/i => :cloning_done + }.freeze + + # Map a git clone output line to a progress symbol + # Returns nil if line doesn't match any known pattern + def self.map(line) + CLONE_PATTERNS.each do |pattern, symbol| + return symbol if line.match?(pattern) + end + nil + end + + # Map a git clone output line, returning a default symbol if no match + def self.map_or_default(line, default = :cloning_started) + map(line) || default + end + end +end diff --git a/workers/presentation/values/progress_monitor.rb b/workers/presentation/values/progress_monitor.rb deleted file mode 100644 index cfd7ffd..0000000 --- a/workers/presentation/values/progress_monitor.rb +++ /dev/null @@ -1,64 +0,0 @@ -# frozen_string_literal: true - -module Appraiser - # Infrastructure to clone while yielding progress - # Legacy module for backwards compatibility - module CloneMonitor - CLONE_PROGRESS = { - 'STARTED' => 15, - 'Cloning' => 30, - 'remote' => 70, - 'Receiving' => 85, - 'Resolving' => 95, - 'Checking' => 100, - 'FINISHED' => 100 - }.freeze - - def self.starting_percent - CLONE_PROGRESS['STARTED'].to_s - end - - def self.finished_percent - CLONE_PROGRESS['FINISHED'].to_s - end - - def self.progress(line) - CLONE_PROGRESS[first_word_of(line)].to_s - end - - def self.percent(stage) - CLONE_PROGRESS[stage].to_s - end - - def self.first_word_of(line) - line.match(/^[A-Za-z]+/).to_s - end - end - - # Progress phases for full appraisal flow (clone + appraise + cache) - # Progress distribution: - # Clone: 0-50% (or skip if already cloned) - # Appraise: 50-90% - # Cache: 90-100% - module AppraisalMonitor - PHASES = { - started: 15, - cloning: 25, - clone_receiving: 40, - clone_resolving: 45, - clone_done: 50, - appraising: 55, - appraise_done: 85, - caching: 90, - finished: 100 - }.freeze - - def self.starting_percent - PHASES[:started].to_s - end - - def self.finished_percent - PHASES[:finished].to_s - end - end -end