2 changes: 2 additions & 0 deletions README.md
@@ -83,6 +83,7 @@ YOLOv8n was validated with a real OpenCV image-input benchmark: InferEdgeRuntime
TensorRT Jetson was 4.6x faster than ONNX Runtime CPU in this real image-input benchmark.
The benchmark uses end-to-end Runtime latency, not trtexec GPU-only latency.
The full pipeline portfolio summary is available at [docs/portfolio/inferedge_pipeline_portfolio.md](docs/portfolio/inferedge_pipeline_portfolio.md), and the detailed Runtime comparison report is available at [docs/portfolio/runtime_compare_yolov8n.md](docs/portfolio/runtime_compare_yolov8n.md).
The YOLOv8 COCO subset accuracy demo is documented in [docs/portfolio/yolov8_coco_subset_evaluation.md](docs/portfolio/yolov8_coco_subset_evaluation.md).

## Local Studio Demo Evidence

@@ -100,6 +101,7 @@ Verified demo fixture values:

Studio reports this as a `4.57x` TensorRT speedup for the bundled demo pair.
AIGuard remains optional in this local Studio path; if Guard evidence is not loaded, the deployment decision explains that the Lab comparison is available but diagnosis evidence is not provided.
The same demo flow also surfaces a small `yolov8_coco` evaluation report summary: 10 images, 89 ground-truth boxes, mAP@50 `0.1410`, precision `0.2941`, recall `0.1685`, structural validation `passed`.
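
As a quick sanity check on that figure, a minimal sketch, assuming only the fixture means quoted in this repository (the variable names are illustrative, not Studio code):

```python
# Hedged sketch: reproduce the demo-pair speedup from the bundled fixture means.
# The latency values below are the ones recorded for examples/studio_demo.
onnxruntime_cpu_mean_ms = 45.4299
tensorrt_jetson_mean_ms = 9.9375

speedup = onnxruntime_cpu_mean_ms / tensorrt_jetson_mean_ms
print(f"TensorRT speedup: {speedup:.2f}x")  # -> TensorRT speedup: 4.57x
```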

---

2 changes: 2 additions & 0 deletions docs/portfolio/inferedge_pipeline_status.md
@@ -96,6 +96,7 @@ The current cross-repository loop is covered by documentation, fixtures, and smo
- AIGuard worker provenance mismatch diagnosis
- Lab deployment decision/report evidence smoke for AIGuard worker provenance diagnosis
- Local Studio local-first workflow UI for viewing Forge -> Runtime -> Lab -> optional AIGuard state, creating in-memory analyze jobs, importing Runtime result JSON, replaying bundled demo evidence, comparing backends, and inspecting Lab-owned deployment decision context
- YOLOv8 COCO subset evaluation report generated from 10 local images and 89 converted COCO-style person annotations, with mAP@50 0.1410, precision 0.2941, recall 0.1685, and structural validation passed

This means the current product boundary is testable without running the production worker infrastructure.

@@ -125,6 +126,7 @@ Demo readiness: `scripts/demo_pipeline_full.sh` now provides a guided end-to-end
- Runtime compare-key identity polish for manifest-backed engine artifacts
- Guided end-to-end demo entrypoint for portfolio and interview walkthroughs
- Local Studio at `/studio` for a local-first browser view of Run / Import / Demo Evidence / Compare / Decision / Jetson Helper workflows
- Contract/preset validation demo with `yolov8_coco`, COCO annotation loading, simplified accuracy metrics, structural validation, and JSON/Markdown/HTML report fixtures
- Cross-repo fixture compatibility across Forge, Runtime, Lab, and AIGuard
- Rule/evidence based provenance mismatch diagnosis

1 change: 1 addition & 0 deletions docs/portfolio/inferedge_portfolio_submission.md
@@ -112,6 +112,7 @@ Recent validation evidence:
- Runtime compare-key identity polish: InferEdgeRuntime now preserves Forge manifest source model identity for compare naming. If `manifest.source_model.path` is `models/onnx/yolov8n.onnx` and the explicit TensorRT artifact path is `model.engine`, Runtime can keep `compare_model_name=yolov8n` and `compare_key=yolov8n__b1__h640w640__fp32`. A sketch of this naming rule appears after this list.
- Guided demo entrypoint: `scripts/demo_pipeline_full.sh` summarizes the full Forge -> Runtime -> Lab -> optional AIGuard flow and can print the Jetson TensorRT Runtime command without claiming production worker or SaaS readiness.
- Local Studio demo evidence: `/studio` can load bundled ONNX Runtime CPU and TensorRT Jetson Runtime result fixtures from `examples/studio_demo`, keep the demo pair selectable in Recent jobs while the local server process is alive, and show TensorRT Jetson vs ONNX Runtime CPU comparison in the browser. The fixture-backed evidence records ONNX Runtime CPU at mean 45.4299 ms / p99 49.2128 ms / 22.0119 FPS and TensorRT Jetson at mean 9.9375 ms / p99 15.5231 ms / 100.6293 FPS, a 4.57x TensorRT speedup for this demo pair.
- YOLOv8 COCO subset evaluation: a 10-image local person-detection subset with 89 ground-truth boxes is converted into a COCO-style annotation fixture and evaluated through the `yolov8_coco` preset. The generated report records mAP@50 0.1410, precision 0.2941, recall 0.1685, and structural validation passed. This is documented as subset workflow evidence, not a full COCO benchmark claim.
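
A minimal sketch of the compare-key naming described in the first item above, assuming illustrative function and field names rather than InferEdgeRuntime's actual API:

```python
# Hedged sketch of the manifest-backed compare naming. Names are illustrative.
from pathlib import Path

def derive_compare_key(manifest_source_model_path: str,
                       batch: int, height: int, width: int,
                       precision: str) -> tuple[str, str]:
    # Keep the Forge manifest source model identity even when the explicit
    # TensorRT artifact path is something generic like "model.engine".
    compare_model_name = Path(manifest_source_model_path).stem
    compare_key = f"{compare_model_name}__b{batch}__h{height}w{width}__{precision}"
    return compare_model_name, compare_key

name, key = derive_compare_key("models/onnx/yolov8n.onnx", 1, 640, 640, "fp32")
print(name)  # yolov8n
print(key)   # yolov8n__b1__h640w640__fp32
```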

The direct Runtime execution result includes `deployment_decision`. Its `unknown` value is expected before Lab compare/report because the worker response has not yet been compared by Lab.

43 changes: 43 additions & 0 deletions docs/portfolio/yolov8_coco_subset_evaluation.md
@@ -0,0 +1,43 @@
# YOLOv8 COCO Subset Evaluation Demo

This document records a small local-first accuracy evaluation demo for InferEdgeLab.
It is not a full COCO benchmark and should not be presented as production model validation.

## Scope

- Preset: `yolov8_coco`
- Model: YOLOv8n ONNX Runtime CPU
- Demo input: 10 local person-detection images
- Annotation source: local YOLO txt labels converted into a compact COCO-style annotation fixture (a conversion sketch follows this list)
- Raw images: intentionally not committed
- Annotation fixture: `examples/validation_demo/subset/yolov8_coco_subset_annotations.json`
- Evaluation report: `examples/validation_demo/subset/yolov8_coco_subset_evaluation.json`
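
The annotation-source item above mentions a conversion step; the following is a minimal sketch of that kind of YOLO-txt-to-COCO conversion, assuming a hypothetical file layout and a single shared image size, not the exact fixture schema used here:

```python
# Hedged sketch: convert YOLO txt labels into a compact COCO-style fixture.
# Paths, field names, and the fixed image size are illustrative assumptions.
import json
from pathlib import Path

def yolo_txt_to_coco(label_dir: str, image_size: tuple[int, int], out_path: str) -> None:
    width, height = image_size
    images, annotations = [], []
    ann_id = 0
    for img_id, label_file in enumerate(sorted(Path(label_dir).glob("*.txt"))):
        images.append({"id": img_id, "file_name": f"{label_file.stem}.jpg",
                       "width": width, "height": height})
        for line in label_file.read_text().splitlines():
            # YOLO format: class cx cy w h, all normalized to [0, 1]
            cls, cx, cy, w, h = map(float, line.split())
            box_w, box_h = w * width, h * height
            x = cx * width - box_w / 2
            y = cy * height - box_h / 2
            annotations.append({"id": ann_id, "image_id": img_id,
                                "category_id": int(cls) + 1,  # person-only subset
                                "bbox": [x, y, box_w, box_h],
                                "area": box_w * box_h, "iscrowd": 0})
            ann_id += 1
    coco = {"images": images, "annotations": annotations,
            "categories": [{"id": 1, "name": "person"}]}
    Path(out_path).write_text(json.dumps(coco, indent=2))
```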

## Result

| Metric | Value |
|---|---:|
| Samples | 10 |
| Ground-truth boxes | 89 |
| Post-NMS detections checked | 51 |
| mAP@50 | 0.1410 |
| mAP@50-95 | 0.0873 |
| Precision | 0.2941 |
| Recall | 0.1685 |
| F1 score | 0.2143 |
| Structural validation | passed |
| Contract input shape | passed |
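
As a consistency check, the F1 score in the table follows directly from the reported precision and recall:

```python
# Hedged sketch: recompute F1 from the reported precision and recall.
precision = 0.2941
recall = 0.1685

f1 = 2 * precision * recall / (precision + recall)
print(f"{f1:.4f}")  # ~0.2143, matching the table
```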

## Interpretation

This demo proves that InferEdgeLab can load COCO-style annotations, run the YOLOv8 detection evaluator, compute simplified accuracy metrics, validate detection output structure, and emit JSON/Markdown/HTML reports.
The numbers are intentionally documented as a small subset result only.
They are useful as portfolio workflow evidence, not as a claim of full COCO accuracy.
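
For illustration, a minimal sketch of the kind of simplified IoU@0.5 matching such an evaluator can use; this is an assumption about the general technique, not InferEdgeLab's actual implementation:

```python
# Hedged sketch of simplified IoU@0.5 matching for precision/recall.
# Boxes are [x, y, w, h] in pixels, COCO-style. Not InferEdgeLab's evaluator.
def iou(a, b):
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def precision_recall(detections, ground_truth, thr=0.5):
    matched, tp = set(), 0
    for det in detections:  # assumed pre-sorted by confidence, post-NMS
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(ground_truth):
            overlap = iou(det, gt)
            if j not in matched and overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_iou >= thr:
            matched.add(best_j)
            tp += 1
    precision = tp / len(detections) if detections else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```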

The relatively low recall is expected for this tiny local subset because the images are night beach/crowd scenes with many small person boxes.
That is useful for the portfolio: it shows that the validation pipeline records uncomfortable evidence instead of hiding it.

## Local Studio Link

Local Studio's `Load Demo Evidence` flow now returns this evaluation report summary together with the existing ONNX Runtime CPU vs TensorRT Jetson latency pair.
The Studio path remains local-first and does not upload raw images or add database, queue, auth, or production SaaS features.
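
A minimal sketch of what that summary could look like; the field names are assumptions, and only the values are taken from this document:

```python
# Hedged sketch: hypothetical shape of the yolov8_coco summary surfaced by
# the Load Demo Evidence flow. Field names are illustrative assumptions.
yolov8_coco_summary = {
    "preset": "yolov8_coco",
    "samples": 10,
    "ground_truth_boxes": 89,
    "map50": 0.1410,
    "precision": 0.2941,
    "recall": 0.1685,
    "structural_validation": "passed",
}
```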