7 changes: 5 additions & 2 deletions README.ko.md
```diff
@@ -64,9 +64,12 @@ Studio evidence and jobs are in-memory, and when the local server process is restarted
 
 - macOS ONNX Runtime CPU smoke: validates the Lab -> C++ Runtime CLI -> ONNX Runtime CPU execution -> Lab job result ingestion path.
 - Jetson Orin Nano TensorRT smoke: captures evidence of the C++ Runtime CLI executing the Forge manifest + TensorRT engine artifact.
-- YOLOv8n real image benchmark:
-  - TensorRT Jetson: mean `9.9375 ms`, p99 `15.5231 ms`, FPS `100.6293`
+- Local Studio demo evidence:
+  - TensorRT Jetson FP16 25W: mean `10.066401 ms`, p95 `15.476641 ms`, p99 `15.548438 ms`, FPS `99.340373`
   - ONNX Runtime CPU: mean `45.4299 ms`, p99 `49.2128 ms`, FPS `22.0119`
+- Jetson Evidence Track:
+  - TensorRT Jetson FP16 15W: mean `10.799106 ms`, p95 `15.438690 ms`, p99 `15.529218 ms`, FPS `92.600262`
+  - Because power mode is part of the run configuration, the 15W/25W results are interpreted as system evidence, not a same-condition regression.
 - Runtime source model identity polish: even for TensorRT `model.engine` execution, the Forge manifest's `source_model.path` is preferred, so `compare_key=yolov8n__b1__h640w640__fp32` can be kept.
 
 ## Installation and Quick Start
```
47 changes: 28 additions & 19 deletions README.md
```diff
@@ -65,17 +65,20 @@ Interview one-liner: **InferEdge is an end-to-end inference validation pipeline
 
 ---
 
-## Real Inference Benchmark Result
+## Current Validation Evidence
 
-YOLOv8n was validated with a real OpenCV image-input benchmark: InferEdgeRuntime generated compare-ready JSON results, and InferEdgeLab automatically grouped and compared them by `compare_key` and `backend_key`.
+YOLOv8n is validated through the current Local Studio evidence fixtures and Jetson Evidence Track result JSONs.
+InferEdgeRuntime generates compare-ready JSON results, and InferEdgeLab groups and compares them by `compare_key`, `backend_key`, precision, and run context.
 
-| Backend | Input Mode | Mean ms | P99 ms | FPS |
-|---|---|---:|---:|---:|
-| TensorRT Jetson | image | 9.9375 | 15.5231 | 100.6293 |
-| ONNX Runtime CPU | image | 45.4299 | 49.2128 | 22.0119 |
+| Evidence | Backend | Precision | Power Mode | Mean ms | P95 ms | P99 ms | FPS |
+|---|---|---|---|---:|---:|---:|---:|
+| Local Studio baseline | ONNX Runtime CPU | FP32 | n/a | 45.4299 | n/a | 49.2128 | 22.0119 |
+| Local Studio candidate | TensorRT Jetson | FP16 | 25W | 10.066401 | 15.476641 | 15.548438 | 99.340373 |
+| Jetson power-mode evidence | TensorRT Jetson | FP16 | 15W | 10.799106 | 15.438690 | 15.529218 | 92.600262 |
 
-TensorRT Jetson was 4.6x faster than ONNX Runtime CPU in this real image input benchmark.
-The benchmark uses end-to-end Runtime latency, not trtexec GPU-only latency.
+The current Local Studio demo shows TensorRT Jetson FP16 25W as about 4.51x faster than the ONNX Runtime CPU FP32 baseline.
+The Jetson 15W/25W comparison is tracked as system evidence because power mode changes the run configuration.
+These measurements use InferEdgeRuntime end-to-end Runtime latency, not `trtexec` GPU-only latency.
 The full pipeline portfolio summary is available at [docs/portfolio/inferedge_pipeline_portfolio.md](docs/portfolio/inferedge_pipeline_portfolio.md), and the detailed Runtime comparison report is available at [docs/portfolio/runtime_compare_yolov8n.md](docs/portfolio/runtime_compare_yolov8n.md).
 The final local-first validation completion pass is summarized in [docs/portfolio/final_validation_completion.md](docs/portfolio/final_validation_completion.md).
 The YOLOv8 COCO subset accuracy demo is documented in [docs/portfolio/yolov8_coco_subset_evaluation.md](docs/portfolio/yolov8_coco_subset_evaluation.md).
```
```diff
@@ -100,12 +103,12 @@ Recommended demo flow:
 
 Verified demo fixture values:
 
-| Backend | Device | Mean ms | P99 ms | FPS | Compare Key |
-|---|---|---:|---:|---:|---|
-| ONNX Runtime | CPU | 45.4299 | 49.2128 | 22.0119 | `yolov8n__b1__h640w640__fp32` |
-| TensorRT | Jetson | 9.9375 | 15.5231 | 100.6293 | `yolov8n__b1__h640w640__fp32` |
+| Backend | Device | Precision | Power Mode | Mean ms | P95 ms | P99 ms | FPS | Compare Key |
+|---|---|---|---|---:|---:|---:|---:|---|
+| ONNX Runtime | CPU | FP32 | n/a | 45.4299 | n/a | 49.2128 | 22.0119 | `yolov8n__b1__h640w640__fp32` |
+| TensorRT | Jetson | FP16 | 25W | 10.066401 | 15.476641 | 15.548438 | 99.340373 | `yolov8n__b1__h640w640__fp16` |
 
-Studio reports this as a `4.57x` TensorRT speedup for the bundled demo pair.
+Studio reports this as about a `4.51x` TensorRT speedup for the bundled demo pair.
 AIGuard remains optional in this local Studio path; if Guard evidence is not loaded, the deployment decision explains that the Lab comparison is available but diagnosis evidence is not provided.
 The same demo flow also surfaces a small `yolov8_coco` evaluation report summary: 10 images, 89 ground-truth boxes, mAP@50 `0.1410`, precision `0.2941`, recall `0.1685`, structural validation `passed`.
 It also includes problem-case summaries for annotation-missing review, invalid detection structure blocking, contract shape mismatch blocking, and latency regression review.
```
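As a hedged sanity check on the `yolov8_coco` summary above: the raw match counts are not quoted in this diff, so the integers below are inferred from the reported ratios rather than taken from the repo. Precision `0.2941` and recall `0.1685` against 89 ground-truth boxes are consistent with roughly 15 matched detections out of 51 predictions.

```python
# Inferred counts only: 15 matches and 51 predictions are NOT stated in this
# diff; they are the smallest integers consistent with the reported metrics.
matched_detections = 15
total_predictions = 51
ground_truth_boxes = 89

precision = matched_detections / total_predictions  # 15 / 51 = 0.2941...
recall = matched_detections / ground_truth_boxes    # 15 / 89 = 0.1685...
print(f"precision={precision:.4f}, recall={recall:.4f}")
```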
````diff
@@ -153,16 +156,22 @@ This is a compact example of the structured result shape that InferEdgeRuntime e
 
 ```json
 {
-  "compare_key": "yolov8n__b1__h640w640__fp32",
+  "compare_key": "yolov8n__b1__h640w640__fp16",
   "backend_key": "tensorrt__jetson",
-  "mean_ms": 9.9375,
-  "p99_ms": 15.5231,
-  "fps_value": 100.6293,
+  "mean_ms": 10.066401,
+  "p95_ms": 15.476641,
+  "p99_ms": 15.548438,
+  "fps_value": 99.340373,
   "success": true,
   "status": "success",
+  "run_config": {
+    "power_mode": "25W",
+    "jetson_clocks": "on"
+  },
   "extra": {
-    "input_mode": "image",
-    "input_preprocess": "opencv_bgr_to_rgb_resize_float32_nchw"
+    "input_mode": "dummy",
+    "precision": "fp16",
+    "power_mode": "25W"
   }
 }
 ```
````
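To make the comparison mechanics concrete, here is a minimal sketch of how a Lab-style consumer could derive the demo speedup from two result files in the shape above. The file names and the `load_result` helper are hypothetical; only the JSON fields (`backend_key`, `mean_ms`) come from the documented shape.

```python
import json

def load_result(path: str) -> dict:
    # Hypothetical helper: read one compare-ready Runtime result JSON
    # shaped like the example above.
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

# Illustrative file names, not actual repo fixture paths.
baseline = load_result("onnxruntime_cpu_fp32.json")       # mean_ms 45.4299
candidate = load_result("tensorrt_jetson_fp16_25w.json")  # mean_ms 10.066401

# 45.4299 / 10.066401 ~= 4.51, matching the Studio speedup display.
speedup = baseline["mean_ms"] / candidate["mean_ms"]
print(f"{candidate['backend_key']} is {speedup:.2f}x faster "
      f"than {baseline['backend_key']}")
```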
5 changes: 3 additions & 2 deletions docs/portfolio/final_validation_completion.md
```diff
@@ -39,8 +39,9 @@ InferEdge is complete for the current portfolio milestone when it can replay a l
 Runtime demo pair:
 
 - ONNX Runtime CPU: 45.4299 ms mean / 49.2128 ms p99 / 22.0119 FPS
-- TensorRT Jetson: 9.9375 ms mean / 15.5231 ms p99 / 100.6293 FPS
-- Studio speedup display: about 4.57x faster
+- TensorRT Jetson FP16 25W: 10.066401 ms mean / 15.548438 ms p99 / 99.340373 FPS
+- Jetson FP16 15W power-mode evidence: 10.799106 ms mean / 15.529218 ms p99 / 92.600262 FPS
+- Studio speedup display: about 4.51x faster for the ONNX Runtime CPU FP32 vs TensorRT Jetson FP16 25W demo pair
 
 YOLOv8 COCO subset evaluation:
 
```
6 changes: 3 additions & 3 deletions docs/portfolio/inferedge_1page_architecture.md
```diff
@@ -39,12 +39,12 @@ ONNX model
 - `/api/analyze` in-memory job workflow
 - Lab `worker_request` / `worker_response` boundary
 - Lab -> Runtime dev-only minimal execution smoke using `yolov8n.onnx` (ONNX Runtime CPU, success, mean about 47.97 ms, p95 about 51.80 ms, about 20.85 FPS)
-- Jetson Orin Nano TensorRT Runtime smoke using Forge manifest + TensorRT engine artifact (success, manifest applied, mean about 14.00 ms, p99 about 15.50 ms, about 71.44 FPS)
-- Local Studio demo evidence replay at `/studio` using bundled ONNX Runtime CPU and TensorRT Jetson result fixtures: 45.4299 ms vs 9.9375 ms mean latency, 49.2128 ms vs 15.5231 ms p99, 22.0119 vs 100.6293 FPS, and a 4.57x TensorRT speedup for the demo pair
+- Jetson Orin Nano TensorRT Runtime smoke using Forge manifest + TensorRT engine artifact, now recorded as Jetson Evidence Track fixtures for FP16 25W and 15W power modes
+- Local Studio demo evidence replay at `/studio` using bundled ONNX Runtime CPU FP32 and TensorRT Jetson FP16 25W result fixtures: 45.4299 ms vs 10.066401 ms mean latency, 49.2128 ms vs 15.548438 ms p99, 22.0119 vs 99.340373 FPS, and about a 4.51x TensorRT speedup for the demo pair
 - Runtime source-model identity polish for manifest-backed TensorRT engine results (`model.engine` can still keep `compare_model_name=yolov8n` and `compare_key=yolov8n__b1__h640w640__fp32`)
 - Runtime `worker_request` validation and `worker_response` dry-run export
 - Forge worker/runtime summary
-- AIGuard provenance mismatch diagnosis
+- AIGuard evidence diagnosis cases for provenance mismatch, bbox collapse, score saturation, temporal instability, and normal/pass paths
 - Lab decision/report guard evidence smoke
 - all repo README pipeline summaries synced
 
```
23 changes: 12 additions & 11 deletions docs/portfolio/inferedge_pipeline_portfolio.md
```diff
@@ -80,30 +80,31 @@ The benchmark workflow is:
 `compare_key` identifies the comparison group for the same model, input shape, and precision.
 `backend_key` identifies the actual backend and device combination, such as `onnxruntime__cpu` or `tensorrt__jetson`.
 
-## 5. Real Image Input Validation Result
+## 5. Current Local Studio Demo Evidence
 
-This validation used YOLOv8n with real image input:
+The current Local Studio demo evidence uses bundled Runtime result fixtures so the comparison can be replayed in a browser without a live Jetson session:
 
 - Model: YOLOv8n
-- Input Mode: image
 - Input Shape: `1x3x640x640`
-- `compare_key`: `yolov8n__b1__h640w640__fp32`
-- `input_preprocess`: `opencv_bgr_to_rgb_resize_float32_nchw`
+- ONNX baseline `compare_key`: `yolov8n__b1__h640w640__fp32`
+- TensorRT candidate `compare_key`: `yolov8n__b1__h640w640__fp16`
+- TensorRT power mode: `25W`
 
-| Backend | Input Mode | Mean ms | P99 ms | FPS | Status |
-|---|---|---:|---:|---:|---|
-| TensorRT Jetson | image | 9.9375 | 15.5231 | 100.6293 | success |
-| ONNX Runtime CPU | image | 45.4299 | 49.2128 | 22.0119 | success |
+| Backend | Precision | Power Mode | Mean ms | P95 ms | P99 ms | FPS | Status |
+|---|---|---|---:|---:|---:|---:|---|
+| TensorRT Jetson | FP16 | 25W | 10.066401 | 15.476641 | 15.548438 | 99.340373 | success |
+| ONNX Runtime CPU | FP32 | n/a | 45.4299 | n/a | 49.2128 | 22.0119 | success |
 
 - Total compare groups: 1
 - Comparable groups count: 1
 - Skipped groups count: 0
 - Fastest backend: `tensorrt__jetson`
 - Slowest backend: `onnxruntime__cpu`
-- Speedup ratio: `4.6x`
-- ONNX Runtime is 4.6x slower than TensorRT.
+- Speedup ratio: about `4.51x`
+- ONNX Runtime CPU is about 4.51x slower than TensorRT Jetson FP16 25W for this demo pair.
 
 The Runtime latency is end-to-end wall-clock latency and should not be directly compared with trtexec GPU-only latency.
+The historical OpenCV real-image input benchmark remains documented in `runtime_compare_yolov8n.md`, while Local Studio now uses the explicit FP16/25W evidence fixture above.
 
 ## 6. Technical Contribution
 
```
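The `compare_key` / `backend_key` scheme described in this hunk can be illustrated with a small sketch. The format strings are inferred from the documented example keys such as `yolov8n__b1__h640w640__fp16` and `tensorrt__jetson`; the actual Runtime code may assemble them differently.

```python
def build_compare_key(model: str, batch: int, height: int,
                      width: int, precision: str) -> str:
    # Key format inferred from examples like "yolov8n__b1__h640w640__fp32".
    return f"{model}__b{batch}__h{height}w{width}__{precision}"

def build_backend_key(backend: str, device: str) -> str:
    # Key format inferred from examples like "onnxruntime__cpu".
    return f"{backend}__{device}"

assert build_compare_key("yolov8n", 1, 640, 640, "fp16") == "yolov8n__b1__h640w640__fp16"
assert build_backend_key("tensorrt", "jetson") == "tensorrt__jetson"
```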
21 changes: 11 additions & 10 deletions docs/portfolio/inferedge_pipeline_portfolio_pdf.md
```diff
@@ -70,24 +70,25 @@ InferEdgeLab
 
 ---
 
-## Page 3. Real Benchmark Result & Contribution
+## Page 3. Current Demo Evidence & Contribution
 
-### Real Image Input Benchmark
+### Local Studio Demo Evidence
 
 - Model: YOLOv8n
-- Input Mode: image
 - Input Shape: `1x3x640x640`
-- `compare_key`: `yolov8n__b1__h640w640__fp32`
-- `input_preprocess`: `opencv_bgr_to_rgb_resize_float32_nchw`
+- ONNX baseline `compare_key`: `yolov8n__b1__h640w640__fp32`
+- TensorRT candidate `compare_key`: `yolov8n__b1__h640w640__fp16`
+- TensorRT power mode: `25W`
 
-| Backend | Input Mode | Mean ms | P99 ms | FPS | Status |
-|---|---|---:|---:|---:|---|
-| TensorRT Jetson | image | 9.9375 | 15.5231 | 100.6293 | success |
-| ONNX Runtime CPU | image | 45.4299 | 49.2128 | 22.0119 | success |
+| Backend | Precision | Power Mode | Mean ms | P99 ms | FPS | Status |
+|---|---|---|---:|---:|---:|---|
+| TensorRT Jetson | FP16 | 25W | 10.066401 | 15.548438 | 99.340373 | success |
+| ONNX Runtime CPU | FP32 | n/a | 45.4299 | 49.2128 | 22.0119 | success |
 
-TensorRT Jetson was 4.6x faster than ONNX Runtime CPU in this real image input benchmark.
+TensorRT Jetson FP16 25W was about 4.51x faster than ONNX Runtime CPU FP32 in the current Local Studio demo evidence.
 
 Runtime latency is measured as end-to-end wall-clock latency and should not be directly compared with trtexec GPU-only latency.
+The historical real-image input benchmark remains documented separately in `runtime_compare_yolov8n.md`.
 
 ### Technical Contribution
 
```
9 changes: 7 additions & 2 deletions docs/portfolio/inferedge_pipeline_status.md
```diff
@@ -97,6 +97,7 @@ The current cross-repository loop is covered by documentation, fixtures, and smo
 - AIGuard worker provenance mismatch diagnosis
 - Lab deployment decision/report evidence smoke for AIGuard worker provenance diagnosis
 - Local Studio local-first workflow UI for viewing Forge -> Runtime -> Lab -> optional AIGuard state, creating in-memory analyze jobs, importing Runtime result JSON, replaying bundled demo evidence, comparing backends, and inspecting Lab-owned deployment decision context
+- Local Studio portfolio demo evidence for ONNX Runtime CPU, TensorRT Jetson FP16 25W, Jetson FP16 15W power-mode evidence, and AIGuard diagnosis cases
 - YOLOv8 COCO subset evaluation report generated from 10 local images and 89 converted COCO-style person annotations, with metric backend `simplified`, mAP@50 0.1410, precision 0.2941, recall 0.1685, and structural validation passed
 - Validation problem case fixtures for annotation-missing review, invalid detection structure blocking, and contract shape mismatch blocking
 
```
```diff
@@ -105,7 +106,10 @@ This means the current product boundary is testable without running the producti
 InferEdge now has two runtime execution evidence paths:
 
 1. macOS ONNX Runtime CPU smoke through Lab's dev-only Runtime execution path using `yolov8n.onnx`. The smoke created Lab job `job_9e2321179256`, called the C++ Runtime CLI through Lab's subprocess path, executed ONNX Runtime on CPU with FP32, and ingested the resulting JSON back into the Lab job result. Runtime reported input shape `[1, 3, 640, 640]`, output shape `[1, 84, 8400]`, `warmup=1`, `runs=5`, benchmark status success, mean latency about 47.97 ms, p50 about 46.95 ms, p95/p99 about 51.80 ms, and about 20.85 FPS. The resulting `deployment_decision` was `unknown`, which is expected for direct Runtime execution before Lab compare/report.
-2. Jetson Orin Nano TensorRT smoke using a Forge-generated manifest and TensorRT engine artifact executed by the C++ Runtime CLI. The manual Jetson smoke ran on Linux `5.15.148-tegra` / `aarch64` from `~/InferEdge-Runtime`, using Forge manifest `/home/risenano01/InferEdgeForge/builds/yolov8n__jetson__tensorrt__jetson_fp16/manifest.json` and artifact `/home/risenano01/InferEdgeForge/builds/yolov8n__jetson__tensorrt__jetson_fp16/model.engine`. The result JSON was `results/jetson/yolov8n_jetson_tensorrt_manifest_smoke.json` and reported `success: true`, `status: success`, `engine_backend: tensorrt`, `device_name: jetson`, `manifest_applied: true`, input shape `[1, 3, 640, 640]`, output shape `[1, 84, 8400]`, mean latency about 14.00 ms, p99 about 15.50 ms, and about 71.44 FPS.
+2. Jetson Orin Nano TensorRT smoke using a Forge-generated manifest and TensorRT engine artifact executed by the C++ Runtime CLI. The current Jetson Evidence Track records TensorRT FP16 short-smoke results with tegrastats summaries for both 25W and 15W power modes:
+   - 25W result: `results/jetson_evidence/yolov8n_trt_fp16_25w_20260504T170039Z.json`, mean `10.066401 ms`, p95 `15.476641 ms`, p99 `15.548438 ms`, FPS `99.340373`.
+   - 15W result: `results/jetson_evidence/yolov8n_trt_fp16_15w_20260504T171959Z.json`, mean `10.799106 ms`, p95 `15.438690 ms`, p99 `15.529218 ms`, FPS `92.600262`.
+   - The 15W vs 25W comparison is treated as system evidence because power mode changes the run configuration; it is not interpreted as same-condition model regression.
 
 Compare-key polish status: this limitation has been resolved in InferEdgeRuntime #37. When a Forge manifest is applied, Runtime now prefers `manifest.source_model.path` for compare naming, so a TensorRT artifact path such as `model.engine` can still produce `compare_model_name=yolov8n` and `compare_key=yolov8n__b1__h640w640__fp32`. This improves provenance and compare-readiness; it does not add production SaaS worker infrastructure.
```
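A minimal sketch of the power-mode policy above, assuming result dictionaries shaped like the README example; the function name and labels are hypothetical, not Lab's actual API.

```python
def classify_result_pair(a: dict, b: dict) -> str:
    # Hypothetical helper: only results with a matching compare_key and a
    # matching run configuration (e.g. the same power mode) are treated as
    # a same-condition regression pair; otherwise they stay system evidence.
    same_key = a.get("compare_key") == b.get("compare_key")
    same_power = (a.get("run_config", {}).get("power_mode")
                  == b.get("run_config", {}).get("power_mode"))
    if same_key and same_power:
        return "same_condition_regression_pair"
    return "system_evidence"
```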

```diff
@@ -131,10 +135,11 @@ This does not mean production SaaS is complete.
 - Runtime compare-key identity polish for manifest-backed engine artifacts
 - Guided end-to-end demo entrypoint for portfolio and interview walkthroughs
 - Local Studio at `/studio` for a local-first browser view of Run / Import / Demo Evidence / Compare / Decision / Jetson Helper workflows
+- Jetson Evidence Track short-smoke fixtures with TensorRT FP16 25W and 15W power-mode context, tegrastats summaries, and Lab-compatible Runtime JSON import
 - Contract/preset validation demo with `yolov8_coco`, COCO annotation loading, `--metric-backend simplified` by default, optional `pycocotools` backend contract, structural validation, and JSON/Markdown/HTML report fixtures
 - Problem-case validation reports that make skipped accuracy, invalid output structure, contract mismatch, and latency regression visible in Local Studio
 - Cross-repo fixture compatibility across Forge, Runtime, Lab, and AIGuard
-- Rule/evidence based provenance mismatch diagnosis
+- Rule/evidence based AIGuard diagnosis, including normal/pass, bbox collapse/blocked, score saturation/blocked, temporal instability/review_required, and provenance mismatch cases
 
 ### Planned Later
 
```