LEANDERANTONY
diff --git a/DEVLOG.md b/DEVLOG.md
Lines changed: 42 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 19 additions & 5 deletions
diff --git a/docs/adr/ADR-010-single-pass-review-corrections-and-task-tuned-model-budgets.md b/docs/adr/ADR-010-single-pass-review-corrections-and-task-tuned-model-budgets.md
Lines changed: 79 additions & 0 deletions
diff --git a/docs/adr/README.md b/docs/adr/README.md
Lines changed: 1 addition & 0 deletions
diff --git a/docs/recommended-changes-status-2026-03-16.md b/docs/recommended-changes-status-2026-03-16.md
Lines changed: 16 additions & 7 deletions
diff --git a/src/agents/fit_agent.py b/src/agents/fit_agent.py
Lines changed: 5 additions & 24 deletions
@@ -258,6 +258,48 @@ Persistent per-user usage storage, saved artifact history, and quotas are intent
 - Added `src/usage_store.py` to persist assisted usage events in Supabase Postgres for authenticated users.
 - Extended `src/openai_service.py` with an optional usage-event callback so persistence stays transport-agnostic.
 - Wired authenticated usage-event recording from `src/ui/workflow.py` without leaking Streamlit concerns into the service layer.
+
+## Day 18: Single-Pass Review-Correction Workflow
+
+- Removed the live `ProfileAgent` and `JobAgent` stages from the supervised workflow because they were mostly restating deterministic inputs without adding enough value for the latency cost.
+- Simplified the active orchestrator path to:
+  - fit
+  - tailoring
+  - strategy
+  - review
+  - resume generation
+- Removed the bounded rerun loop that previously sent tailoring and strategy back through another full pass after review.
+- Changed Review so it can return direct corrections for tailoring and strategy, and the orchestrator now feeds those corrected outputs straight into final resume generation.
+- Removed interview-theme style outputs that were adding contract size without being core to the current product output.
+- Updated the UI, payload layer, and report rendering so they reflect the smaller workflow and the direct-correction review model.
+- Verified the redesign with focused workflow, prompt, builder, and UI test coverage.
+
+## Day 19: Model Routing And Output Budget Tuning
+
+- Rebalanced reasoning effort by task based on real runtime logs instead of keeping one default posture for every agent.
+- Changed the active routing defaults to:
+  - `fit`: `gpt-5-mini-2025-08-07` with `low` reasoning
+  - `tailoring`: `gpt-5-mini-2025-08-07` with `medium` reasoning
+  - `strategy`: `gpt-5-mini-2025-08-07` with `low` reasoning
+  - `review`: `gpt-5.4` with `medium` reasoning
+  - `resume_generation`: `gpt-5.4` with `medium` reasoning
+- Increased the Review output budget to start at 4000 tokens so the stage does not immediately fall into retry-on-truncation for corrected JSON payloads.
+- Reduced oversized output caps where observed usage made the previous limits unnecessary:
+  - `fit`: 1600
+  - `strategy`: 1500
+  - `resume_generation`: 3000
+- Kept `tailoring` at 3200 and `review` at 4000 because they still carry the heaviest grounded payloads in the current flow.
+- Verified the new routing and cap defaults with targeted orchestration and OpenAI-service tests.
+
+## Day 20: Review Approval Semantics And Backward Compatibility
+
+- Clarified Review semantics so `approved` now means the final corrected output is safe to use, not that the incoming tailoring or strategy draft was perfect before correction.
+- Added `unresolved_issues` to the review contract so the app can distinguish between:
+  - issues found in the incoming draft
+  - blockers that still remain after correction
+- Updated UI and report labels to show `Approved After Corrections` when Review repaired the output successfully.
+- Added backward-compatible access patterns so older saved or in-memory `ReviewAgentOutput` objects without `unresolved_issues` do not crash the app.
+- Logged PDF-output quality as a follow-up documentation item because export aesthetics still need a dedicated pass even though workflow runtime is now much healthier.
 - Added the `usage_events` SQL schema and RLS policies in `docs/supabase-usage-events.sql`.
 - Kept assisted requests resilient: usage persistence failures are logged but do not break the user-facing AI response.
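The Day 20 entry describes two behaviours that may be easier to follow with a small example: reading `unresolved_issues` safely from older review objects, and deriving the `Approved After Corrections` label. The sketch below is illustrative only. The two dataclasses, the `issues` field, the helper names, and the `Needs Attention` / `Approved` labels are hypothetical stand-ins, not the project's actual `ReviewAgentOutput` contract or UI strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LegacyReviewOutput:
    """Hypothetical stand-in for an older review object without unresolved_issues."""
    approved: bool
    issues: List[str] = field(default_factory=list)  # assumed name: issues found in the incoming draft


@dataclass
class CurrentReviewOutput(LegacyReviewOutput):
    """Hypothetical stand-in for the newer contract that adds unresolved_issues."""
    unresolved_issues: List[str] = field(default_factory=list)


def get_unresolved_issues(review) -> List[str]:
    # getattr with a default keeps older objects from raising AttributeError.
    return list(getattr(review, "unresolved_issues", None) or [])


def review_status_label(review) -> str:
    # Assumed labelling rule; only "Approved After Corrections" is named in the devlog.
    if not review.approved or get_unresolved_issues(review):
        return "Needs Attention"
    if review.issues:
        return "Approved After Corrections"
    return "Approved"
```

Under these assumptions, an older object with draft issues but no `unresolved_issues` attribute still resolves to `Approved After Corrections` instead of crashing.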
 
 
@@ -25,17 +25,15 @@ The repository now follows the same product-style structure as the GitHub Portfo
 - Generate a deterministic fit snapshot against the active job description
 - Produce first-pass resume-tailoring guidance from grounded profile and JD signals
 - Run a supervised specialist-agent workflow on demand:
-  - profile
-  - job
   - fit
   - tailoring
   - strategy
   - review
   - resume generation
-- Run a bounded review-revision loop before the final resume artifact is generated
+- Let the review stage directly correct tailoring and strategy outputs before final resume generation instead of rerunning the full workflow loop
 - Use OpenAI when configured, with deterministic fallback when it is not
 - Route different assisted tasks to different model tiers instead of relying on one global model
-- Route GPT-5 reasoning effort by task, with medium effort for normal tasks and high effort for review, resume generation, and grounded application Q&A
+- Route GPT-5 reasoning effort by task, with low effort on fit and strategy, medium effort on tailoring, review, and final resume generation, and higher-trust routing kept for the final grounding stages
 - Use the OpenAI Responses API for assisted JSON generation and usage tracking
 - Build a tailored resume artifact from the current workflow state
 - Preview the tailored resume directly in the app before export
@@ -55,7 +53,7 @@ The repository now follows the same product-style structure as the GitHub Portfo
 
 ## Current Status
 
-The app is still an MVP, but it is now a coherent authenticated workflow product rather than only a deterministic prototype. Resume parsing, JD structuring, deterministic fit analysis, supervised specialist-agent orchestration, bounded review-driven revision, tailored resume generation, report generation, preview-before-download flows, export packaging, model-aware assisted routing, grounded in-app assistance, Google sign-in, persisted usage tracking, daily quotas, and 24-hour saved workspace reloads are all working.
+The app is still an MVP, but it is now a coherent authenticated workflow product rather than only a deterministic prototype. Resume parsing, JD structuring, deterministic fit analysis, supervised specialist-agent orchestration, direct review-driven correction, tailored resume generation, report generation, preview-before-download flows, export packaging, model-aware assisted routing, grounded in-app assistance, Google sign-in, persisted usage tracking, daily quotas, and 24-hour saved workspace reloads are all working.
 
 The active product scope is intentionally focused:
 
@@ -212,6 +210,22 @@ The current OpenAI Responses API integration also includes runtime safeguards fo
 - one retry with a higher output-token budget when a response is incomplete because the original output budget was exhausted
 - longer client timeouts plus SDK retries to reduce transient read-timeout failures
 
+Current default assisted routing is intentionally asymmetric:
+
+- `fit`: GPT-5 Mini with `low` reasoning
+- `tailoring`: GPT-5 Mini with `medium` reasoning
+- `strategy`: GPT-5 Mini with `low` reasoning
+- `review`: GPT-5.4 with `medium` reasoning
+- `resume_generation`: GPT-5.4 with `medium` reasoning
+
+Current default output-token caps are also tuned by task rather than kept uniform:
+
+- `fit`: 1600
+- `tailoring`: 3200
+- `strategy`: 1500
+- `review`: 4000
+- `resume_generation`: 3000
+
 The current app also does not require Supabase for a first hosted deploy. If Supabase is not configured yet, the app can still run the non-authenticated product shell and deterministic workflow. Google sign-in, saved-workspace reloads, and account-level quotas remain inactive until Supabase is configured.
 
 If you plan to deploy before creating Supabase, set `AUTH_REQUIRED_FOR_ASSISTED_WORKFLOW=false` so the AI-assisted workflow button is not blocked by the missing login layer. Once Supabase exists, turn it back on if you want assisted usage tied to authenticated accounts.
 
@@ -0,0 +1,79 @@
+# ADR-010: Single-Pass Review Corrections and Task-Tuned Model Budgets
+
+## Status
+
+Accepted
+
+## Context
+
+The earlier supervised workflow had grown into a more expensive sequence than the product needed:
+
+- `ProfileAgent` and `JobAgent` were generating summaries that mostly restated deterministic candidate and JD structure already available elsewhere in the system
+- the orchestrator ran a bounded revision loop that resent tailoring, strategy, and review through another full pass when review rejected the first draft
+- Review was functioning as both a gate and a revision trigger, but the product goal had shifted toward direct grounded correction rather than repeated full-pipeline iteration
+- real runtime logs showed that the second tailoring + strategy + review pass was one of the largest contributors to total latency
+- the final high-trust stages were worth more model budget than the early summarization stages, but the routing defaults had not yet been tightened around that reality
+
+At the same time, the product still needed:
+
+- deterministic fit analysis as the grounding backbone
+- a strong Tailoring step because that stage carries the most content-heavy rewrite work
+- a strong Review step that can reject or repair unsupported wording
+- a final Resume Generation step that turns the reviewed output into the export-ready artifact
+
+## Decision
+
+Adopt a single-pass supervised workflow with direct review corrections and task-tuned reasoning / output budgets.
+
+The accepted workflow is:
+
+1. `fit`
+2. `tailoring`
+3. `strategy`
+4. `review`
+5. `resume_generation`
+
+Implementation details:
+
+1. remove live `ProfileAgent` and `JobAgent` execution from the active orchestrator path
+2. keep deterministic `CandidateProfile`, `JobDescription`, `FitAnalysis`, and `TailoredResumeDraft` as the source-of-truth inputs
+3. make Review return direct corrected tailoring and strategy outputs when repairs are straightforward
+4. stop rerunning the entire tailoring / strategy / review chain after review feedback
+5. define review approval in terms of the final corrected state, not only the cleanliness of the incoming draft
+6. route earlier cheaper stages to cheaper reasoning levels than the final grounding stages
+7. tune output-token caps by observed usage instead of using one oversized default for every task
+
+The current routing defaults following this decision are:
+
+- `fit`: `gpt-5-mini-2025-08-07` with `low` reasoning and a 1600-token output cap
+- `tailoring`: `gpt-5-mini-2025-08-07` with `medium` reasoning and a 3200-token output cap
+- `strategy`: `gpt-5-mini-2025-08-07` with `low` reasoning and a 1500-token output cap
+- `review`: `gpt-5.4` with `medium` reasoning and a 4000-token output cap
+- `resume_generation`: `gpt-5.4` with `medium` reasoning and a 3000-token output cap
+
+## Alternatives Considered
+
+### 1. Keep the full Profile + Job + Fit + Tailoring + Strategy + Review + Resume Generation stack
+
+Rejected because Profile and Job were not adding enough unique value relative to the deterministic data they were summarizing, while still costing additional sequential model latency.
+
+### 2. Keep the revision loop but reduce model size only
+
+Rejected because the largest avoidable cost was architectural: repeated full-pipeline passes. Model tuning alone would not remove that structural latency.
+
+### 3. Remove Review entirely and trust Tailoring / Strategy output directly
+
+Rejected because Review is still the main grounding defense against unsupported claims, inferred tenure, and overstated tooling experience.
+
+### 4. Lower every stage to the same cheapest reasoning tier
+
+Rejected because the stages do not have the same risk profile. Review and final resume generation justify more careful reasoning than early fit and strategy summarization.
+
+## Consequences
+
+- the workflow becomes materially faster because it removes redundant live stages and the second-pass loop
+- deterministic inputs remain the grounding backbone even though the live agent count is smaller
+- Review becomes a direct correcting editor rather than only a rejection gate
+- the meaning of `approved` must reflect the final corrected output state, which required updates to UI and report wording
+- model routing becomes easier to reason about because costlier reasoning is reserved for the stages that materially affect grounding and final export quality
+- PDF output quality remains a separate follow-up concern; the workflow and routing changes improve runtime and correctness, but they do not solve visual export polish by themselves
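The routing defaults listed in the ADR can be read as a per-task lookup table. The following is a minimal sketch of that idea, assuming a simple frozen dataclass. `TaskRoute`, `DEFAULT_ROUTES`, `WORKFLOW_ORDER`, and `route_for` are hypothetical names introduced for illustration; only the model identifiers, reasoning levels, and token caps come from the ADR text, and the real configuration in the codebase may be organised differently.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TaskRoute:
    """Hypothetical container for one task's routing defaults."""
    model: str
    reasoning_effort: str
    max_output_tokens: int


# Values copied from the ADR; the structure itself is an assumption.
DEFAULT_ROUTES: Dict[str, TaskRoute] = {
    "fit": TaskRoute("gpt-5-mini-2025-08-07", "low", 1600),
    "tailoring": TaskRoute("gpt-5-mini-2025-08-07", "medium", 3200),
    "strategy": TaskRoute("gpt-5-mini-2025-08-07", "low", 1500),
    "review": TaskRoute("gpt-5.4", "medium", 4000),
    "resume_generation": TaskRoute("gpt-5.4", "medium", 3000),
}

# Single-pass stage order from the Decision section.
WORKFLOW_ORDER = ("fit", "tailoring", "strategy", "review", "resume_generation")


def route_for(task: str) -> TaskRoute:
    """Return the default route for a task, failing loudly on unknown names."""
    try:
        return DEFAULT_ROUTES[task]
    except KeyError:
        raise ValueError(f"No routing default for task {task!r}") from None


if __name__ == "__main__":
    for task in WORKFLOW_ORDER:
        route = route_for(task)
        print(task, route.model, route.reasoning_effort, route.max_output_tokens)
```

Keeping the table in one place makes the asymmetry visible at a glance: the two final stages use the larger model, while `fit` and `strategy` stay on the cheaper tier with `low` reasoning.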
@@ -11,6 +11,7 @@ This directory tracks the architectural decisions that shape the AI Job Applicat
 - [ADR-007: Remove LinkedIn import from active product scope](ADR-007-remove-linkedin-import-from-active-product-scope.md)
 - [ADR-008: Two-mode grounded assistant panel](ADR-008-two-mode-grounded-assistant-panel.md)
 - [ADR-009: Google sign-in via Supabase for persistent identity](ADR-009-google-sign-in-via-supabase-for-persistent-identity.md)
+- [ADR-010: Single-pass review corrections and task-tuned model budgets](ADR-010-single-pass-review-corrections-and-task-tuned-model-budgets.md)
 
 ## Superseded
 
 
@@ -249,18 +249,26 @@ Checkpoint:
 
 ## 18. Revision Loop
 
-Status: `Already implemented earlier`
+Status: `Superseded by later workflow simplification`
 
-Changes present in the current codebase:
+Earlier state:
+
+- `ApplicationOrchestrator` previously reran tailoring, strategy, and review in a bounded revision loop
+- review feedback was injected back into `TailoringAgent.run(...)` as `revision_requests`
+- revision pass history was preserved on `review_history`
+- the loop was capped by `max_revision_passes`
+
+Current state:
 
-- `ApplicationOrchestrator` reruns tailoring, strategy, and review in a bounded revision loop
-- review feedback is injected back into `TailoringAgent.run(...)` as `revision_requests`
-- revision pass history is preserved on `review_history`
-- the loop is capped by `max_revision_passes`
+- the bounded rerun loop was removed in favor of one single-pass workflow
+- Review now applies direct corrections to tailoring and strategy outputs instead of sending the whole flow through another pass
+- `review_history` remains only as a compatibility field, not as an active revision-loop record for the current live flow
 
 Current evidence:
 
 - `src/agents/orchestrator.py`
+- `src/agents/review_agent.py`
+- `docs/adr/ADR-010-single-pass-review-corrections-and-task-tuned-model-budgets.md`
 
 ## 19. Application Strategy Agent
 
@@ -269,7 +277,7 @@ Status: `Already implemented earlier`
 Changes present in the current codebase:
 
 - `StrategyAgent` exists as a first-class agent under `src/agents/strategy_agent.py`
-- the orchestrator runs it on each revision pass
+- the orchestrator runs it once in the active single-pass workflow
 - its output is included in workflow payloads, UI rendering, report generation, and resume generation context
 
 Current evidence:
@@ -484,6 +492,7 @@ Current state:
 
 These are the next practical checks to run in the app after the Supabase bootstrap update.
 
+0. Review the generated PDF outputs themselves and improve the visual/layout quality, because the current exported documents still look off even when the workflow data and runtime are behaving correctly.
 1. Sign in with a normal non-internal account and confirm the daily quota panel renders without warnings or silent fallback.
 2. Verify the saved workspace flow still works normally after the updated bootstrap SQL, including reload and download regeneration.
 3. Do one final spot-check with a normal non-internal account so the persisted quota panel, assisted run, and post-run quota refresh all behave correctly end to end.
 
@@ -4,9 +4,7 @@
     CandidateProfile,
     FitAnalysis,
     FitAgentOutput,
-    JobAgentOutput,
     JobDescription,
-    ProfileAgentOutput,
 )
 
 from .common import coerce_string, coerce_string_list, unique_strings
@@ -21,16 +19,12 @@ def run(
         candidate_profile: CandidateProfile,
         job_description: JobDescription,
         fit_analysis: FitAnalysis,
-        profile_output: ProfileAgentOutput,
-        job_output: JobAgentOutput,
     ) -> FitAgentOutput:
         if self._openai_service and self._openai_service.is_available():
             prompt = build_fit_agent_prompt(
                 candidate_profile,
                 job_description,
                 fit_analysis,
-                profile_output,
-                job_output,
             )
             payload = self._openai_service.run_json_prompt(
                 prompt["system"],
@@ -44,15 +38,14 @@ def run(
                 fit_summary=coerce_string(payload.get("fit_summary")),
                 top_matches=coerce_string_list(payload.get("top_matches"), limit=4),
                 key_gaps=coerce_string_list(payload.get("key_gaps"), limit=4),
-                interview_themes=coerce_string_list(payload.get("interview_themes"), limit=4),
             )
-        return self._fallback(fit_analysis, profile_output, job_output)
+        return self._fallback(fit_analysis, candidate_profile, job_description)
 
     @staticmethod
     def _fallback(
         fit_analysis: FitAnalysis,
-        profile_output: ProfileAgentOutput,
-        job_output: JobAgentOutput,
+        candidate_profile: CandidateProfile,
+        job_description: JobDescription,
     ) -> FitAgentOutput:
         fit_summary = (
             "{label} for {role} with a score of {score}/100. {experience}".format(
@@ -63,28 +56,16 @@ def _fallback(
             )
         )
         top_matches = unique_strings(
-            fit_analysis.strengths + profile_output.evidence_highlights + job_output.priority_skills,
+            fit_analysis.strengths + fit_analysis.matched_hard_skills + candidate_profile.skills,
             limit=4,
         )
         key_gaps = unique_strings(
-            fit_analysis.gaps + fit_analysis.missing_hard_skills + profile_output.cautions,
+            fit_analysis.gaps + fit_analysis.missing_hard_skills + fit_analysis.missing_soft_skills,
             limit=4,
         )
-        interview_themes = []
-        if fit_analysis.matched_hard_skills:
-            interview_themes.append(
-                "Prepare concrete stories around " + ", ".join(fit_analysis.matched_hard_skills[:3]) + "."
-            )
-        if fit_analysis.missing_hard_skills:
-            interview_themes.append(
-                "Frame a credible upskilling plan for " + ", ".join(fit_analysis.missing_hard_skills[:3]) + "."
-            )
-        if not interview_themes:
-            interview_themes.append("Prepare outcome-focused examples from your strongest recent work.")
 
         return FitAgentOutput(
             fit_summary=fit_summary,
             top_matches=top_matches,
             key_gaps=key_gaps,
-            interview_themes=unique_strings(interview_themes, limit=4),
         )
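For readers following the `_fallback` change above, the sketch below reproduces how the new `top_matches` and `key_gaps` lists are assembled from deterministic inputs only. It is a standalone approximation: `FitAnalysisStub`, `CandidateStub`, and `fallback_lists` are hypothetical stand-ins, and this `unique_strings` is an assumed implementation (order-preserving de-duplication with a cap), since the real helper in `.common` is not shown in the diff.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


def unique_strings(values: Iterable[str], limit: int) -> List[str]:
    """Assumed behaviour: keep first occurrences in order, capped at `limit`."""
    seen = set()
    result: List[str] = []
    for value in values:
        if len(result) >= limit:
            break
        cleaned = value.strip()
        if not cleaned or cleaned in seen:
            continue
        seen.add(cleaned)
        result.append(cleaned)
    return result


@dataclass
class FitAnalysisStub:
    """Hypothetical subset of FitAnalysis fields used by the fallback."""
    strengths: List[str] = field(default_factory=list)
    gaps: List[str] = field(default_factory=list)
    matched_hard_skills: List[str] = field(default_factory=list)
    missing_hard_skills: List[str] = field(default_factory=list)
    missing_soft_skills: List[str] = field(default_factory=list)


@dataclass
class CandidateStub:
    """Hypothetical subset of CandidateProfile fields used by the fallback."""
    skills: List[str] = field(default_factory=list)


def fallback_lists(fit: FitAnalysisStub, candidate: CandidateStub) -> Tuple[List[str], List[str]]:
    # Mirrors the list concatenations in the diff: no agent outputs are needed.
    top_matches = unique_strings(
        fit.strengths + fit.matched_hard_skills + candidate.skills,
        limit=4,
    )
    key_gaps = unique_strings(
        fit.gaps + fit.missing_hard_skills + fit.missing_soft_skills,
        limit=4,
    )
    return top_matches, key_gaps
```

Because strengths are concatenated first, they take priority over raw candidate skills when the four-item cap is reached.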