AI-Ocean · GHoon-Lee · May 15, 2026 · May 15, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,19 @@
 # Changelog
 
+## Unreleased
+
+- Hardened `gua status` and `gua stop` so a stale PID file cannot make them
+  act on an unrelated live process.
+- Clarified report output by explaining sample units, classification rules,
+  interval-dependent GPU-hours, and heatmap density.
+- Split §2 from generic "Waste" into idle-held capacity and truly-idle
+  capacity. The equivalent-GPU figures now use GPUs present in the report
+  window instead of the entire database.
+- Made §4 Top identities aggregate by identity/GPU/tick before converting to
+  GPU-hours, so reports may show lower per-user GPU-hours when one user has
+  multiple processes on the same GPU at the same tick.
+- Added a warning when NVML process-list visibility is unavailable for a GPU.
+
 ## 1.0.1 - 2026-05-15
 
 - Made `gua` the documented command surface for daemon, report, demo, and doctor output.

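A minimal sketch of the kind of check the first Unreleased entry describes, for reviewers who want to reason about PID reuse. It is not the code in this PR: the `/proc`-based lookup (Linux only), the function names, and the exact argv match on `gpu_usage_audit` and `daemon` are all assumptions made for illustration.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical markers: assumes the collector was started as
# `python -m gpu_usage_audit daemon`, so both appear as separate argv items.
EXPECTED_ARGS = ("gpu_usage_audit", "daemon")


def read_cmdline(pid: int) -> list[str]:
    """Return the argv of a live process via /proc, or [] if it cannot be read."""
    try:
        raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    except OSError:  # process gone, no /proc, or no permission
        return []
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


def is_managed_collector(argv: list[str]) -> bool:
    """True only if every expected marker is present as its own argv item."""
    return all(marker in argv for marker in EXPECTED_ARGS)


def resolve_pid(pid_file: Path) -> int | None:
    """Return the collector PID, or clear a stale PID file and return None."""
    try:
        pid = int(pid_file.read_text().strip())
    except FileNotFoundError:
        return None
    except ValueError:
        pid = -1  # unparsable content is treated as stale
    if pid > 0 and is_managed_collector(read_cmdline(pid)):
        return pid
    pid_file.unlink(missing_ok=True)  # points at a missing or unrelated process
    return None
```

With a helper like this, `stop` would send SIGTERM only when `resolve_pid` returns a PID, and `status` would report "not running" after the stale file is cleared.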
diff --git a/README.md b/README.md
@@ -81,34 +81,44 @@ $ gua report --since 1h --interval 30s
 gua — lab-a100 (bare, driver 560.35.05)  Window: 1:00:00
 
 §1 Headline
+  basis: one sample = one GPU card at one daemon tick
+  rules: active >=10% util; idle-held <10% util with >100 MB process memory
   █████████▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░
   active       █   15.7%
   idle-held    ▒   45.1%       ← this is the number conventional tools miss
   truly-idle   ░   39.2%
   (51 samples)
 
-§2 Waste
-  ~0.43 GPU-hours idle, ~2.53 GPUs equivalently unused
+§2 Idle capacity
+  converted from card-ticks to GPU-hours using the report --interval
+  idle-held: ~0.31 GPU-hours, ~1.53 GPUs equivalently unavailable
+  truly-idle: ~0.12 GPU-hours, ~1.00 GPUs equivalently free
 
 §3 Per-GPU
+  per-card share of samples in the same three states
   GPU-0     active  47.1%  idle-held  35.3%  truly-idle  17.6%
   GPU-1     active   0.0%  idle-held 100.0%  truly-idle   0.0%
   GPU-2     active   0.0%  idle-held   0.0%  truly-idle 100.0%
 
 §4 Top identities
-  identity              gpu-hours   idle-held
-  alice                      0.42       42.9%
-  bob                        0.28      100.0%
+  one identity counts once per GPU/tick after its processes are summed
+  identity              gpu-hours   idle-held   samples
+  alice                      0.42       42.9%        51
+  bob                        0.28      100.0%        34
 
 §5 Time-of-day heatmap (UTC)
+  darker means higher active share; blank means no samples
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
   Mon               .
 ```
 
 The 3-bar collapses every card × every tick over the window into the
 active / idle-held / truly-idle split. **`idle-held` rows are the
 embarrassing category**: a process is holding GPU memory but the SM
-utilization is below 10%.
+utilization is below 10%. §2 converts those card-ticks into GPU-hours
+with `--interval`; §4 groups process rows by identity, GPU, and tick
+before ranking users, so multiple same-user processes on one GPU/tick
+count once.
 
 ## Demo (no GPU required)
 
@@ -185,7 +195,7 @@ point remains installed for compatibility, but new examples use `gua`.
 | -------- | ----------------------------------------------------------- |
 | `daemon` | Starts the collector in the background. Samples real NVML telemetry on every tick and writes to a new database. NVIDIA host required. |
 | `start`  | Alias for `gua daemon`. |
-| `status` | Shows whether the background collector PID is still running. |
+| `status` | Shows whether the background collector PID is still running. Also clears a stale PID file when it points to a missing or unrelated process. |
 | `stop`   | Stops the background collector with SIGTERM. |
 | `report` | One-shot read against the accumulated database. Safe to run **while the daemon is still writing** — SQLite WAL mode handles the concurrency. |
 | `demo`   | Self-contained showcase. Records N fake ticks and immediately prints the report. No GPU, no second shell, no operational meaning — just to see the output shape. |
@@ -213,6 +223,8 @@ By default, `gua daemon` returns after the collector starts. Each tick is
 written to the log file; on shutdown the cumulative row count is written
 there too. `gua daemon --foreground` prints the tick summaries directly
 to the terminal and exits on Ctrl+C, SIGTERM, or `systemctl stop`.
+`gua status` and `gua stop` verify that the PID file points to the
+managed collector before acting on it; stale PID files are cleared.
 
 ### `report`
 
@@ -227,7 +239,7 @@ gua report [--db PATH] [--since D] [--interval D] [--width N]
   of oldest sample), so passing a huge `--since` is the same as "all
   data". Units: `ms`, `s`, `m`, `h`, `d` (no `w`; use `7d`).
 - `--interval D` (default `30s`) — **must match what the daemon used**.
-  This is how §2 (Waste) and §4 (Top identities) convert tick counts
+  This is how §2 (Idle capacity) and §4 (Top identities) convert tick counts
   to GPU-hours. Mismatched intervals → wrong GPU-hours.
 - `--width N` (default `60`) — width of the §1 three-bar in characters.
 

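The classification rules and the `--interval` conversion in the README diff above can be sketched as follows. The thresholds come from the §1 "rules" line; the function names and the assumption that process memory is already summed per card and tick are illustrative, not taken from `report.py`.

```python
from collections import Counter

ACTIVE_UTIL_PCT = 10.0   # "active >=10% util" from the §1 rules line
HELD_MEMORY_MB = 100.0   # "idle-held <10% util with >100 MB process memory"


def classify(util_pct: float, proc_mem_mb: float) -> str:
    """Classify one sample (one GPU card at one daemon tick)."""
    if util_pct >= ACTIVE_UTIL_PCT:
        return "active"
    if proc_mem_mb > HELD_MEMORY_MB:
        return "idle-held"
    return "truly-idle"


def gpu_hours(card_ticks: int, interval_s: float) -> float:
    """Convert a count of card-ticks to GPU-hours for a given tick interval."""
    return card_ticks * interval_s / 3600.0


# Made-up samples: (utilization %, summed process memory in MB).
samples = [(47.0, 2048.0), (3.0, 512.0), (0.0, 0.0), (9.9, 100.0)]
counts = Counter(classify(util, mem) for util, mem in samples)

print(counts["active"], counts["idle-held"], counts["truly-idle"])
print(gpu_hours(120, 30), gpu_hours(120, 60))
```

The last line shows why a mismatched interval matters: the same 120 card-ticks become 1.0 GPU-hours at `30s` but 2.0 at `60s`.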
diff --git a/projects/bare-metal-1.0/handoff.ko.md b/projects/bare-metal-1.0/handoff.ko.md
@@ -4,81 +4,78 @@
 
 ## What to read first when picking this up
 
-- `projects/bare-metal-1.0/plan.ko.md`: source of truth for the scope and the PR A-D plan.
-- `projects/bare-metal-1.0/status.ko.md`: current done/pending state and the last verification results.
-- `README.md`: the actual user documentation and the release/install/runbook surface.
-- `pyproject.toml`: current package version and dependency policy.
+- `projects/bare-metal-1.0/status.ko.md`: current completion state, 1.0.1 verification results, and cleanup review findings.
+- `README.md`: the actual user documentation and the release/install/runbook/report surface.
+- `src/gpu_usage_audit/__main__.py`: `gua` CLI, background daemon lifecycle, PID handling.
+- `src/gpu_usage_audit/report.py`: report SQL aggregation.
+- `src/gpu_usage_audit/render.py`: human-readable report output.
 - `.github/workflows/release.yml`: tag release, GitHub Release, and PyPI publish path.
 
 ## Fixed decisions
 
 - 1.0 targets only a single local bare-metal NVIDIA host.
-- Kubernetes, Slurm, Docker/Podman fallback, remote nodes, and managed
-  `gua start/status/stop/uninstall` are excluded from the 1.0 user surface.
+- Kubernetes, Slurm, Docker/Podman fallback, remote nodes, and cluster-wide reports are out of scope for 1.0.
 - `nvidia-ml-py` is a default dependency.
 - The `gpu-usage-audit[nvml]` extra is kept as an empty alias for compatibility.
 - The DB schema stays at v1: `host`, `gpu_sample`, `proc_sample`.
 - The default DB is `/tmp/gua.db`.
+- `gua daemon` runs in the background by default.
+- `gua daemon --foreground` is for systemd and debugging.
+- `gua start` is an alias for `gua daemon`.
+- `gua status` and `gua stop` manage the background collector through its PID file.
 - `daemon` fails if the DB file already exists.
 - `report` fails if the DB file does not exist.
-- The `gua` user surface keeps only `doctor`.
-- The auto-runtime proposal/project docs were deleted. Restarting the Kubernetes/Slurm/Docker/Podman
-  extensions requires a new proposal.
+- `daemon` and `demo` always record the host row's `env_kind` as `"bare"`.
+- The auto-runtime proposal/project docs were deleted. Restarting the Kubernetes/Slurm/Docker/Podman
+  extensions requires a new proposal.
 
 ## Current state
 
 - PR A: implemented in PR #9.
 - PR B: implemented in PR #10.
-- Post-1.0 cleanup: done. Removed the auto-runtime docs and the leftover
-  `RuntimePlan`/env detection code.
-- PR C: implemented in release prep.
-- PR D: in progress. The version is bumped to `1.0.0` and the local build/wheel smoke
-  passed. NVIDIA host acceptance and the tag publish remain.
+- Post-1.0 cleanup: completed in PR #11.
+- Bare-metal 1.0 release: completed in PR #12 and tag `v1.0.0`.
+- 1.0.1 command surface/background daemon release: completed in PR #13 and tag `v1.0.1`.
+- GitHub Release `v1.0.1`: published.
+- PyPI `gpu-usage-audit 1.0.1`: published.
+- NVIDIA host acceptance: the user confirmed that collection works correctly on a real host.
 
-The last local verification passed in full.
+## Last local verification
 
 ```sh
 uv run ruff check
 uv run ruff format --check
 uv run mypy
 uv run pytest
-uv build --out-dir /tmp/gua-dist-1.0.0-prep
-bash scripts/smoke-dist-wheel.sh /tmp/gua-dist-1.0.0-prep/gpu_usage_audit-1.0.0-py3-none-any.whl
-env GITHUB_REF_NAME=v1.0.0 uv run python scripts/check-tag-version.py
+uv build --out-dir /tmp/gua-dist-1.0.1-status
+bash scripts/smoke-dist-wheel.sh /tmp/gua-dist-1.0.1-status/gpu_usage_audit-1.0.1-py3-none-any.whl
+env GITHUB_REF_NAME=v1.0.1 uv run python scripts/check-tag-version.py
 ```
 
-After cleanup the results are `pytest` 107 passed, `mypy` 25 source files, and `ruff format`
-26 files. Release prep verifies the build and wheel smoke under
-`/tmp/gua-dist-1.0.0-prep`.
+Results: `pytest` 114 passed, `mypy` 25 source files, `ruff format` 26 files.
+
+## Direction of the current cleanup PR
+
+- Because PID reuse can leave `/tmp/gua.pid` pointing at another process, `status`/`stop` first
+  verify that the PID is really the managed `gpu_usage_audit daemon` process.
+- Report §2 separates `idle-held` from `truly-idle` instead of lumping all low-util samples into "waste".
+- Report §4 first collapses rows to identity/GPU/tick, not per process row, before computing per-user GPU-hours.
+- The report output itself briefly states what a sample is, the classification rules, the `--interval` dependency, and how to read the heatmap.
+- A failed NVML process-list query can undercount idle-held, so it is surfaced as a warning.
+- The `projects/bare-metal-1.0/*` docs are updated to the 1.0.1-complete state.
 
 ## Caveats
 
-- The current local dev machine is not an NVIDIA host. `gua doctor` reporting unsupported
-  is expected.
-- `/tmp/gua.db` already exists. The default-path daemon test failing because of this file
-  is the expected behavior.
-- Real 1.0 acceptance can only be closed on an NVIDIA bare-metal host.
-- `daemon` and `demo` always record the host row's `env_kind` as `"bare"`. 1.0 does
-  not detect container/k8s runtimes.
-- Before closing PR C, do not stop at reading the docs: check that the DB-exists/DB-missing
-  error UX gives the same message in the README and in the CLI output.
-- Before creating the tag in PR D, `env GITHUB_REF_NAME=v1.0.0 uv run python
-  scripts/check-tag-version.py` must pass.
+- The current local dev machine is not an NVIDIA host. `gua doctor` reporting unsupported is expected.
+- `/tmp/gua.db` already exists, so the default-path daemon run being refused is expected behavior.
+- `report --interval` must equal the daemon's collection interval for GPU-hours to be correct.
+- The SQLite WAL sidecars (`*.db-wal`, `*.db-shm`) are cleaned up when the last connection closes.
+- If 1.0.2 is cut, `env GITHUB_REF_NAME=v1.0.2 uv run python scripts/check-tag-version.py`
+  must pass.
 
 ## Recommended order for the next session
 
 1. First check for user changes with `git status --short`.
-2. Read `projects/bare-metal-1.0/status.ko.md` and check what has changed since the last verification.
-3. Run the NVIDIA host acceptance.
-4. Merge the release prep PR into main.
-5. Before pushing the `v1.0.0` tag, rerun the commands below.
-
-```sh
-uv run ruff check
-uv run ruff format --check
-uv run mypy
-uv run pytest
-uv build
-bash scripts/smoke-dist-wheel.sh
-env GITHUB_REF_NAME=v1.0.0 uv run python scripts/check-tag-version.py
-```
+2. Check the cleanup PR's CI results and review comments.
+3. If needed, polish the report wording once more so real operators can read it easily.
+4. If a patch release is needed after the merge, handle the version bump and changelog in a separate PR.
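The README table and the WAL note in the handoff file both rely on SQLite's write-ahead log letting a reader proceed while a writer has an open transaction. A small stand-alone sketch of that behavior with the standard `sqlite3` module follows; the `tick` table and the temporary path are hypothetical and unrelated to the `gua` schema.

```python
import sqlite3
import tempfile
from pathlib import Path

db_path = Path(tempfile.mkdtemp()) / "demo.db"

writer = sqlite3.connect(db_path)
mode = writer.execute("PRAGMA journal_mode=WAL").fetchall()[0][0]
writer.execute("CREATE TABLE tick (n INTEGER)")  # hypothetical table
writer.execute("INSERT INTO tick VALUES (1)")
writer.commit()

# Start a second write and leave it uncommitted, as a collector mid-tick might.
writer.execute("INSERT INTO tick VALUES (2)")

# A separate connection can still read the last committed state.
reader = sqlite3.connect(db_path)
seen_during_write = reader.execute("SELECT COUNT(*) FROM tick").fetchall()[0][0]

writer.commit()
seen_after_commit = reader.execute("SELECT COUNT(*) FROM tick").fetchall()[0][0]

reader.close()
writer.close()
print(mode, seen_during_write, seen_after_commit)
```

The reader sees one row while the second insert is pending and two rows after the commit, which is the property `gua report` depends on when it runs against a live daemon.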
diff --git a/projects/bare-metal-1.0/status.ko.md b/projects/bare-metal-1.0/status.ko.md
@@ -4,73 +4,81 @@
 
 ## Summary
 
-Bare Metal 1.0 has been scoped to target only a single NVIDIA bare-metal
-host. PR A/B/C and the post-1.0 cleanup are done, and PR D release prep is
-in progress.
+Bare Metal 1.0 has been released through 1.0.1, targeting only a single NVIDIA
+bare-metal host. The `v1.0.1` GitHub Release and PyPI publish are complete, and the
+user also confirmed that telemetry collection works on a real NVIDIA host.
 
-The worktree was clean when cleanup started.
+The current work is a post-1.0.1 code-quality cleanup. The main focus is background daemon
+PID safety, making report semantics visible, and keeping internal docs consistent.
 
 ## Implementation status
 
 | Area | Status | Notes |
 | --- | --- | --- |
-| Scope reset | Done | Removed the Kubernetes/Slurm/Docker/remote/managed runtime surface. |
+| Scope reset | Done | Removed the Kubernetes/Slurm/Docker/remote runtime surface. |
 | `gua doctor` | Done | Diagnoses only the current machine's `/dev/nvidia*`, `nvidia-smi -L`, NVML, and DB path. |
 | Packaging UX | Done | `nvidia-ml-py` is a default dependency and the `nvml` extra is an empty compatibility alias. |
-| `daemon`/`report` DB UX | Implemented | Default DB is `/tmp/gua.db`; daemon refuses an existing DB and report refuses a missing DB. |
-| README bare-metal docs | Done | Includes the 2-shell flow, a systemd example, and operational notes. |
-| Post-1.0 cleanup | Done | Removed the auto-runtime proposal/project docs, k8s/docker env detection, and `RuntimePlan` leftovers. |
-| PR C closure | Done | The runbook and the DB-exists/DB-missing UX are reflected in the README and CLI. |
-| PR D release prep | In progress | Package version is `1.0.0`; local build/wheel smoke done, tag publish remains. |
-| NVIDIA host acceptance | Unverified | The local machine has no NVIDIA device/driver, so the real host collection loop could not be checked. |
+| `gua` command surface | Done | Provides `doctor`, `daemon`, `start`, `status`, `stop`, `report`, `demo`. |
+| Background daemon UX | Done | `gua daemon` runs in the background by default; `--foreground` is for systemd/debugging. |
+| `daemon`/`report` DB UX | Done | Default DB is `/tmp/gua.db`; daemon refuses an existing DB and report refuses a missing DB. |
+| README bare-metal docs | Done | Install, runbook, systemd example, and operational notes are current as of 1.0.1. |
+| Release | Done | `v1.0.1` tag, GitHub Release, and PyPI publish complete. |
+| NVIDIA host acceptance | Done | Confirmed that collection works on a real NVIDIA host. |
 
-## Verification results
+## Latest check results
 
-2026-05-15 release prep local verification:
+2026-05-15 1.0.1 status check:
 
 ```sh
 git status --short
 uv run ruff check
 uv run ruff format --check
 uv run mypy
 uv run pytest
-uv build --out-dir /tmp/gua-dist-1.0.0-prep
-bash scripts/smoke-dist-wheel.sh /tmp/gua-dist-1.0.0-prep/gpu_usage_audit-1.0.0-py3-none-any.whl
-env GITHUB_REF_NAME=v1.0.0 uv run python scripts/check-tag-version.py
+env GITHUB_REF_NAME=v1.0.1 uv run python scripts/check-tag-version.py
+uv build --out-dir /tmp/gua-dist-1.0.1-status
+bash scripts/smoke-dist-wheel.sh /tmp/gua-dist-1.0.1-status/gpu_usage_audit-1.0.1-py3-none-any.whl
 ```
 
 Results:
 
-- `git status --short`: only release prep changes present.
+- Working tree clean.
 - `ruff check`: pass.
 - `ruff format --check`: 26 files already formatted.
 - `mypy`: no issues in 25 source files.
-- `pytest`: 107 passed.
+- `pytest`: 114 passed.
+- tag-version check: `v1.0.1` matches the `pyproject.toml` version.
 - `uv build`: sdist/wheel build succeeded.
 - wheel smoke: succeeded.
-- tag-version check: `v1.0.0` matches the `pyproject.toml` version.
-
-2026-05-15 release prep changes:
-
-- Updated the package version in `pyproject.toml` / `uv.lock` to `1.0.0`.
-- Updated the README status and GitHub Release asset examples to `v1.0.0`.
-- Added 1.0.0 release notes to `CHANGELOG.md`.
-
-## Changes in this cleanup
-
-- Deleted `proposals/design-auto-runtime*.md`.
-- Deleted `projects/auto-runtime-audit/plan*.md`.
-- Deleted `src/gpu_usage_audit/env.py` and `tests/test_env.py`.
-- `daemon`/`demo` record the host `env_kind` as `"bare"` directly, per the 1.0 contract.
-- Removed the `RuntimePlan` model. `gua doctor` keeps only host/unsupported,
-  reasons, blockers, and warnings in an internal `DoctorPlan`.
-- Removed the post-1.0 placeholder fields `scheduler`, `telemetry`,
-  `confidence`, `required_privileges`, and `actions` from the `DoctorPlan` JSON.
+- Release workflow: `v1.0.1` success.
+- PyPI latest: `gpu-usage-audit 1.0.1`.
+
+## What changed in 1.0.1
+
+- Made `gua` the documented command surface.
+- `gua daemon` starts the collector in the background.
+- `gua daemon --foreground` is kept for systemd and debugging.
+- Added `gua start`, `gua status`, and `gua stop`.
+- The README install/run/report examples now use `gua`.
+
+## Current cleanup review findings
+
+- If `gua stop` trusts only the number in `/tmp/gua.pid` and sends SIGTERM, PID reuse can make
+  it hit another process. It must confirm the PID is really a `python -m gpu_usage_audit daemon`
+  process.
+- If the §2 report merges `idle-held` and `truly-idle` into "idle/waste", the product message
+  blurs. Capacity that users cannot use must be separated from capacity that is actually free.
+- §4 Top identities can overcount when it counts process rows directly and one user has several
+  processes on the same GPU/tick. Rows must first be collapsed to identity/GPU/tick.
+- The report output itself should better explain what a "sample" means, the thresholds, and
+  the `--interval` dependency.
+- When the NVML process list cannot be read, a low-util GPU can look `truly-idle`, so
+  at minimum a warning is needed.
 
 ## Local `doctor` status
 
-The current dev machine is not an NVIDIA host, so `unsupported` is the expected
-result of `uv run gua doctor --json`.
+The current dev machine is not an NVIDIA host, so `unsupported` is the expected
+result of `uv run gua doctor`.
 
 Observed blockers:
 
@@ -79,18 +87,11 @@ env GITHUB_REF_NAME=v1.0.0 uv run python scripts/check-tag-version.py
 - NVML init failed: `libnvidia-ml.so.1` not found.
 - `/tmp/gua.db` already exists, so the daemon does not start on the default path.
 
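-This result is a limitation of the local environment and is not treated as a product regression.
-Real acceptance must be rerun on an NVIDIA bare-metal host.
+This result is a limitation of the local environment and is not treated as a product regression.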
 
 ## Next steps
 
-1. Run the acceptance commands on an NVIDIA host.
-2. Merge the release prep PR into main.
-3. Push the `v1.0.0` tag to run the GitHub Release and PyPI publish workflows.
-
-```sh
-uv tool install gpu-usage-audit
-gua doctor
-gpu-usage-audit daemon --interval 30s
-gpu-usage-audit report --since 1h --interval 30s
-```
+1. Apply PID verification, report visibility, and doc consistency in the cleanup PR.
+2. Rerun `uv run ruff check`, `uv run ruff format --check`, `uv run mypy`, and
+   `uv run pytest`.
+3. 필요하면 1.0.2 patch release 후보로 묶는다.
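The §4 overcounting concern in the review findings above can be illustrated with a small sketch. It assumes process rows shaped as `(tick, gpu_index, identity, pid)`; that layout and the function name are hypothetical, not the `proc_sample` schema or the SQL in `report.py`, and the sketch only deduplicates rows rather than summing per-process memory.

```python
from collections import defaultdict

# Hypothetical process rows: (tick, gpu_index, identity, pid).
rows = [
    (0, 0, "alice", 101),
    (0, 0, "alice", 102),  # second alice process on the same GPU and tick
    (0, 1, "alice", 103),
    (1, 0, "alice", 101),
    (0, 0, "bob", 201),
]


def gpu_hours_by_identity(rows, interval_s):
    """Collapse rows to identity/GPU/tick first, then convert to GPU-hours."""
    card_ticks = {(identity, gpu, tick) for tick, gpu, identity, _pid in rows}
    per_identity = defaultdict(int)
    for identity, _gpu, _tick in card_ticks:
        per_identity[identity] += 1
    return {who: n * interval_s / 3600.0 for who, n in per_identity.items()}


# A one-hour interval keeps the arithmetic easy to read.
hours = gpu_hours_by_identity(rows, interval_s=3600)
print(hours["alice"], hours["bob"])
```

Counting raw rows would credit alice with 4 GPU-hours here; collapsing to identity/GPU/tick gives 3, which is why the changelog warns that per-user figures may drop after this change.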