Regenerate single-file output when its source grows; fix success-message accuracy (#221 follow-up)#253
Regenerate single-file output when its source grows; fix success-message accuracy (#221 follow-up)#253cboos wants to merge 3 commits into
Conversation
…essage (#221 follow-up) The default single-file output path (no `-o`, detail-suffixed naming like `<file>.low.html`) rode the version-only staleness skip: `is_outdated()` compares only the embedded tool version, not the source's freshness. So a single `.jsonl` that grew between runs (an in-progress session re-exported with the same tool version) was wrongly reported "current" and served stale HTML. #237 only forced regeneration for the explicit `-o file` case. Bug 1: add a source-mtime term to the single-file `else` branch in convert_jsonl_to — regenerate when the source file is newer than the existing output. Scoped to a file source (a directory source has no single mtime and is already handled by the cache / `cache_was_updated` path). The version-marker check is kept for tool-version bumps. Bug 2: on a skip the CLI printed "Successfully converted" on top of the converter's own "is current, skipping regeneration" line. convert_jsonl_to now reports whether it actually (re)wrote the output via a non-breaking optional `regenerated` out-parameter (the `Path` return contract that ~20 callers rely on is unchanged); the CLI prints the success line only on a real regeneration. Tests: flip test_single_file_mode_regeneration_behavior (it enshrined the stale behavior as "expected") to assert regeneration on a grown source; add TestSingleFileStaleness covering the exact CLI repro — grown source regenerates, unchanged source skips with no redundant success line. Both fail on the prior code. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ty, mtime rationale Non-blocking review follow-ups on the #221 single-file-staleness fix: 1. Directory-branch symmetry: the skip-suppression of the success line was gated only on the single-file branch, so a directory whose combined output is current (Phase-1b early-exit) still printed "Successfully combined N transcript files ... and generated M individual session files" on a pure skip. The `regenerated` out-param already carries the bool — extend the same guard to the directory branch so skip-messaging is consistent for single-file AND directory inputs. 2. Readability: `regenerated != [False]` -> `any(regenerated)`. 3. Mtime rationale: add a comment explaining why the *output* file's mtime is the correct, sufficient, persistence-free basis in single-file mode (no cache DB to store the source mtime as the directory path does), and that both share the same filesystem-granularity limit. Test: add test_directory_skip_suppresses_success_line pinning note 1 (RED on the prior file-only guard, which printed the misleading combined line on a no-op directory re-run). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…bined no silent run) The skip-suppression added earlier gated the CLI success line on the COMBINED-output write only (`did_regenerate`). But per-session generation is an independent axis: `--combined no` writes only individual session files and never a combined transcript, so `did_regenerate` stayed False and the run printed NOTHING despite producing every session file — a regression vs. the prior unconditional "Successfully combined N transcript files ... and generated M individual session files" line. The same gap silently dropped the session count whenever the combined was current but a session was (re)written. `_generate_individual_session_files` already returns a regenerated count; it was discarded. Capture it and OR it into the `regenerated` out-param so the gate means "combined OR any session file (re)written". Pure skips (single-file or a fully-current directory) still report False and stay quiet, preserving the original double-message fix. Surfaced by a self-review pass (/code-review) on the branch; not caught by the diff-scoped manual review because it only manifests under --combined no / session-only regeneration, which the review didn't execute. Test: test_combined_no_confirms_session_generation — `--combined no` writes session files AND confirms the work (RED before: the run printed only Processing/Loading with no summary line). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthrough
ChangesRegeneration tracking and messaging
Sequence Diagram(s)sequenceDiagram
participant CLI
participant convert_jsonl_to
participant FileSystem
CLI->>convert_jsonl_to: convert_jsonl_to(input_path, output_path, regenerated=[])
convert_jsonl_to->>FileSystem: compare output mtime vs input/version marker
alt all current, no regeneration needed
convert_jsonl_to->>convert_jsonl_to: regenerated.append(False)
convert_jsonl_to-->>CLI: return output_path
CLI->>CLI: was_regenerated = False, suppress success print
else regeneration required
convert_jsonl_to->>FileSystem: write combined HTML and/or session files
convert_jsonl_to->>convert_jsonl_to: regenerated.append(True)
convert_jsonl_to-->>CLI: return output_path
CLI->>CLI: was_regenerated = True, print success message
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
claude_code_log/cli.py (1)
1343-1361: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winHandle
--combined noas a session-only success path.When
write_combinedis false,convert_jsonl_to()deliberately returnsoutput_pathwithout writing that combined file. This branch is now reachable becausewas_regeneratedbecomes true from session-file writes, so the CLI reportsSuccessfully combined ... to {output_path}andclick.launch()can target a path that does not exist.Suggested fix
- if not was_regenerated: - pass - elif input_path.is_file(): + if not was_regenerated: + pass + elif input_path.is_file(): click.echo(f"Successfully converted {input_path} to {output_path}") else: jsonl_count = len(list(input_path.glob("*.jsonl"))) - if write_individual: + if write_combined and write_individual: ext = get_file_extension(output_format) session_files = list(input_path.glob(f"session-*.{ext}")) click.echo( f"Successfully combined {jsonl_count} transcript files from {input_path} to {output_path} and generated {len(session_files)} individual session files" ) - else: + elif write_combined: click.echo( f"Successfully combined {jsonl_count} transcript files from {input_path} to {output_path}" ) + else: + ext = get_file_extension(output_format) + session_files = list(input_path.glob(f"session-*.{ext}")) + click.echo( + f"Successfully generated {len(session_files)} individual session files from {jsonl_count} transcript files in {input_path}" + ) - if open_browser: + if open_browser and write_combined: click.launch(str(output_path))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@claude_code_log/cli.py` around lines 1343 - 1361, The success handling in cli.py’s convert_jsonl_to flow is treating session-only regeneration as a combined-output success, which can report a nonexistent output_path and launch it in the browser. Update the branch around was_regenerated, input_path.is_file(), and write_individual so that when write_combined is false the CLI uses a session-only success message and only calls click.launch on a real existing file path, while preserving the combined-file messaging when output_path was actually written.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/test_html_regeneration.py`:
- Around line 331-352: The regeneration assertion in this test is too dependent
on `time.sleep(0.1)` and can flake on coarse mtime filesystems; update the
`convert_jsonl_to_html` setup so the source file is explicitly made newer than
the HTML output using `os.utime(...)`, matching the approach already used in
`test_output_explicit.py`. Keep the checks around `skip_calls`,
`output_file.stat().st_mtime`, and the `"Single file mode test."` content, but
ensure the test advances the source timestamp in a filesystem-safe way before
calling `convert_jsonl_to_html`.
---
Outside diff comments:
In `@claude_code_log/cli.py`:
- Around line 1343-1361: The success handling in cli.py’s convert_jsonl_to flow
is treating session-only regeneration as a combined-output success, which can
report a nonexistent output_path and launch it in the browser. Update the branch
around was_regenerated, input_path.is_file(), and write_individual so that when
write_combined is false the CLI uses a session-only success message and only
calls click.launch on a real existing file path, while preserving the
combined-file messaging when output_path was actually written.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f13ccd68-ef0e-4edc-8d40-7b4af1b1b440
📒 Files selected for processing (4)
claude_code_log/cli.pyclaude_code_log/converter.pytest/test_html_regeneration.pytest/test_output_explicit.py
| # Modify the source: even without a cache, a single file that grows | ||
| # (e.g. an in-progress session) MUST regenerate — the output is now | ||
| # older than its source (issue #221 follow-up). Previously single | ||
| # file mode wrongly skipped on the version marker alone. | ||
| time.sleep(0.1) | ||
| new_message = '{"type":"user","timestamp":"2025-07-03T16:40:00Z","parentUuid":null,"isSidechain":false,"userType":"human","cwd":"/tmp","sessionId":"test_session","version":"1.0.0","uuid":"single_file_msg","message":{"role":"user","content":[{"type":"text","text":"Single file mode test."}]}}\n' | ||
| with open(jsonl_file, "a", encoding="utf-8") as f: | ||
| f.write(new_message) | ||
|
|
||
| # Single file mode doesn't have cache, so it should still skip based on version | ||
| with patch("builtins.print") as mock_print: | ||
| convert_jsonl_to_html(jsonl_file) | ||
| mock_print.assert_any_call( | ||
| "HTML file single_test.html is current, skipping regeneration" | ||
| ) | ||
|
|
||
| # Verify file wasn't regenerated (this is expected behavior for single file mode) | ||
| assert output_file.stat().st_mtime == original_mtime | ||
| # No "skipping" line: it regenerated because the source is newer. | ||
| skip_calls = [ | ||
| c | ||
| for c in mock_print.call_args_list | ||
| if c.args and "skipping regeneration" in str(c.args[0]) | ||
| ] | ||
| assert not skip_calls, f"unexpected skip after source grew: {skip_calls}" | ||
|
|
||
| # The output was rewritten (newer mtime) and now contains the new message. | ||
| assert output_file.stat().st_mtime > original_mtime | ||
| assert "Single file mode test." in output_file.read_text(encoding="utf-8") |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Make this regeneration check coarse-timestamp safe.
This test only sleeps 0.1s before appending, but the production path keys off mtimes. On 1-second-granularity filesystems that can leave the source and HTML in the same tick, so this rerun still skips and stat().st_mtime > original_mtime flakes. test/test_output_explicit.py already avoids that by forcing the source mtime strictly newer with os.utime(...); this case should do the same.
🧰 Tools
🪛 ast-grep (0.44.0)
[warning] 336-336: File path is request-/variable-derived; validate and normalize to prevent path traversal.
Context: open(jsonl_file, "a", encoding="utf-8")
Note: [CWE-22] Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal').
(open-filename-from-request)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/test_html_regeneration.py` around lines 331 - 352, The regeneration
assertion in this test is too dependent on `time.sleep(0.1)` and can flake on
coarse mtime filesystems; update the `convert_jsonl_to_html` setup so the source
file is explicitly made newer than the HTML output using `os.utime(...)`,
matching the approach already used in `test_output_explicit.py`. Keep the checks
around `skip_calls`, `output_file.stat().st_mtime`, and the `"Single file mode
test."` content, but ensure the test advances the source timestamp in a
filesystem-safe way before calling `convert_jsonl_to_html`.
Summary
Follow-up to #221/#237. Two user-visible staleness/messaging bugs in the
default single-file conversion path, plus message-accuracy fixes surfaced
while reviewing the change.
Bug 1 — stale HTML on a grown single-file source
claude-code-log <file>.jsonl [--detail low]writes<file>[.low].html, thenre-runs were skipped purely on the embedded tool version
(
renderer.is_outdated), which ignores source freshness. So an in-progresssession that grew between runs was wrongly reported "current" and served stale
HTML. #237 only forced regeneration for the explicit
-o filecase.Fix: add a source-mtime term to the single-file/no-cache branch —
regenerate when the source file is newer than the existing output. Scoped to a
file source (a directory source has no single mtime and is handled by the
cache path). The version-marker check is kept for tool-version bumps.
Bug 2 — misleading success message
On a skip, the CLI printed
Successfully converted …on top of theconverter's
… is current, skipping regeneration. The converter now reportswhether it actually (re)wrote output via a non-breaking optional
regeneratedout-param (the
Pathreturn is unchanged — ~20 callers unaffected); the CLIprints the success line only on a real regeneration. Applied symmetrically to
the directory branch (
Successfully combined …).Message-accuracy fix
The regeneration gate initially tracked only the combined write, so
--combined no(which writes only per-session files, never a combined) fellcompletely silent despite producing every session file — and the per-session
count was dropped whenever the combined was current but a session was
rewritten. The gate now also counts
_generate_individual_session_files'sreturn, i.e. "combined or any session file (re)written". Pure skips still
stay quiet.
Tests
TestSingleFileStaleness(test_output_explicit.py): grown source regenerates;unchanged source skips with no redundant success line; directory no-op skip
suppresses the combined success line;
--combined noconfirms itsper-session work.
test_single_file_mode_regeneration_behavior, which previouslyenshrined the stale behavior as "expected".
All four behavioral tests were verified RED on the prior code and GREEN after.
Not addressed here
--no-cacheon a directory with a grown source can still serve stale HTML(the source-mtime term is scoped to file sources). This is pre-existing, not
introduced here, and tracked separately.
🤖 Generated with Claude Code
Summary by CodeRabbit