diff --git a/docs/portfolio/final_validation_completion.md b/docs/portfolio/final_validation_completion.md
index 64fbd60..4a71b06 100644
--- a/docs/portfolio/final_validation_completion.md
+++ b/docs/portfolio/final_validation_completion.md
@@ -60,6 +60,7 @@ YOLOv8 COCO subset evaluation:
 | annotation missing | review |
 | invalid detection structure | blocked |
 | contract shape mismatch | blocked |
+| latency regression | review_required |
 
 ## Remaining Future Work
 
diff --git a/docs/portfolio/inferedge_pipeline_status.md b/docs/portfolio/inferedge_pipeline_status.md
index ca0758b..6a5745b 100644
--- a/docs/portfolio/inferedge_pipeline_status.md
+++ b/docs/portfolio/inferedge_pipeline_status.md
@@ -97,7 +97,7 @@ The current cross-repository loop is covered by documentation, fixtures, and smo
 - AIGuard worker provenance mismatch diagnosis
 - Lab deployment decision/report evidence smoke for AIGuard worker provenance diagnosis
 - Local Studio local-first workflow UI for viewing Forge -> Runtime -> Lab -> optional AIGuard state, creating in-memory analyze jobs, importing Runtime result JSON, replaying bundled demo evidence, comparing backends, and inspecting Lab-owned deployment decision context
-- YOLOv8 COCO subset evaluation report generated from 10 local images and 89 converted COCO-style person annotations, with mAP@50 0.1410, precision 0.2941, recall 0.1685, and structural validation passed
+- YOLOv8 COCO subset evaluation report generated from 10 local images and 89 converted COCO-style person annotations, with metric backend `simplified`, mAP@50 0.1410, precision 0.2941, recall 0.1685, and structural validation passed
 - Validation problem case fixtures for annotation-missing review, invalid detection structure blocking, and contract shape mismatch blocking
 
 This means the current product boundary is testable without running the production worker infrastructure.
 
@@ -131,8 +131,8 @@ This does not mean production SaaS is complete.
 - Runtime compare-key identity polish for manifest-backed engine artifacts
 - Guided end-to-end demo entrypoint for portfolio and interview walkthroughs
 - Local Studio at `/studio` for a local-first browser view of Run / Import / Demo Evidence / Compare / Decision / Jetson Helper workflows
-- Contract/preset validation demo with `yolov8_coco`, COCO annotation loading, simplified accuracy metrics, structural validation, and JSON/Markdown/HTML report fixtures
-- Problem-case validation reports that make skipped accuracy, invalid output structure, and contract mismatch visible in Local Studio
+- Contract/preset validation demo with `yolov8_coco`, COCO annotation loading, `--metric-backend simplified` by default, optional `pycocotools` backend contract, structural validation, and JSON/Markdown/HTML report fixtures
+- Problem-case validation reports that make skipped accuracy, invalid output structure, contract mismatch, and latency regression visible in Local Studio
 - Cross-repo fixture compatibility across Forge, Runtime, Lab, and AIGuard
 - Rule/evidence based provenance mismatch diagnosis
 
diff --git a/docs/portfolio/inferedge_portfolio_submission.md b/docs/portfolio/inferedge_portfolio_submission.md
index b8b2653..e6aae4d 100644
--- a/docs/portfolio/inferedge_portfolio_submission.md
+++ b/docs/portfolio/inferedge_portfolio_submission.md
@@ -112,8 +112,8 @@ Recent validation evidence:
 - Runtime compare-key identity polish: InferEdgeRuntime now preserves Forge manifest source model identity for compare naming. If `manifest.source_model.path` is `models/onnx/yolov8n.onnx` and the explicit TensorRT artifact path is `model.engine`, Runtime can keep `compare_model_name=yolov8n` and `compare_key=yolov8n__b1__h640w640__fp32`.
 - Guided demo entrypoint: `scripts/demo_pipeline_full.sh` summarizes the full Forge -> Runtime -> Lab -> optional AIGuard flow and can print the Jetson TensorRT Runtime command without claiming production worker or SaaS readiness.
 - Local Studio demo evidence: `/studio` can load bundled ONNX Runtime CPU and TensorRT Jetson Runtime result fixtures from `examples/studio_demo`, keep the demo pair selectable in Recent jobs while the local server process is alive, and show TensorRT Jetson vs ONNX Runtime CPU comparison in the browser. The fixture-backed evidence records ONNX Runtime CPU at mean 45.4299 ms / p99 49.2128 ms / 22.0119 FPS and TensorRT Jetson at mean 9.9375 ms / p99 15.5231 ms / 100.6293 FPS, a 4.57x TensorRT speedup for this demo pair.
-- YOLOv8 COCO subset evaluation: a 10-image local person-detection subset with 89 ground-truth boxes is converted into a COCO-style annotation fixture and evaluated through the `yolov8_coco` preset. The generated report records mAP@50 0.1410, precision 0.2941, recall 0.1685, and structural validation passed. This is documented as subset workflow evidence, not a full COCO benchmark claim.
-- Validation problem cases: the demo bundle includes annotation-missing, invalid detection structure, and contract shape mismatch reports. These show that InferEdge records review/block evidence explicitly instead of presenting every validation path as successful.
+- YOLOv8 COCO subset evaluation: a 10-image local person-detection subset with 89 ground-truth boxes is converted into a COCO-style annotation fixture and evaluated through the `yolov8_coco` preset. The generated report records metric backend `simplified`, mAP@50 0.1410, precision 0.2941, recall 0.1685, and structural validation passed. This is documented as subset workflow evidence, not a full COCO benchmark claim. `pycocotools` remains an optional explicit backend.
+- Validation problem cases: the demo bundle includes annotation-missing, invalid detection structure, contract shape mismatch, and latency regression reports. These show that InferEdge records review/block evidence explicitly instead of presenting every validation path as successful.
 
 The direct Runtime execution result includes `deployment_decision`. Its `unknown` value is expected before Lab compare/report because the worker response has not yet been compared by Lab.
 
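The compare-key identity polish above can be sketched in a few lines. This is a minimal illustration only, assuming a manifest dict with a `source_model.path` field as described in the submission doc; the function name `build_compare_key` and its parameters are hypothetical, not the Runtime API:

```python
from pathlib import Path


def build_compare_key(manifest: dict, batch: int, height: int, width: int, precision: str) -> str:
    """Hypothetical sketch: derive the compare model name from the Forge
    manifest's source model path (e.g. "models/onnx/yolov8n.onnx" -> "yolov8n")
    rather than from the TensorRT artifact filename (e.g. "model.engine")."""
    source_path = manifest["source_model"]["path"]
    model_name = Path(source_path).stem  # strips directory and extension
    return f"{model_name}__b{batch}__h{height}w{width}__{precision}"


key = build_compare_key(
    {"source_model": {"path": "models/onnx/yolov8n.onnx"}},
    batch=1, height=640, width=640, precision="fp32",
)
print(key)  # yolov8n__b1__h640w640__fp32
```

The point of keying on the manifest source path instead of the engine artifact path is that every converted artifact named `model.engine` would otherwise collapse into the same compare name and lose model identity across compares.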