Regenerate single-file output when its source grows; fix success-message accuracy (#221 follow-up) by cboos · Pull Request #253 · daaain/claude-code-log

cboos · 2026-06-30T22:20:24Z

Summary

Follow-up to #221/#237. Two user-visible staleness/messaging bugs in the
default single-file conversion path, plus message-accuracy fixes surfaced
while reviewing the change.

Bug 1 — stale HTML on a grown single-file source

claude-code-log <file>.jsonl [--detail low] writes <file>[.low].html, then
re-runs were skipped purely on the embedded tool version
(renderer.is_outdated), which ignores source freshness. So an in-progress
session that grew between runs was wrongly reported "current" and served stale
HTML. #237 only forced regeneration for the explicit -o file case.

Fix: add a source-mtime term to the single-file/no-cache branch —
regenerate when the source file is newer than the existing output. Scoped to a
file source (a directory source has no single mtime and is handled by the
cache path). The version-marker check is kept for tool-version bumps.

Bug 2 — misleading success message

On a skip, the CLI printed Successfully converted … on top of the
converter's … is current, skipping regeneration. The converter now reports
whether it actually (re)wrote output via a non-breaking optional regenerated
out-param (the Path return is unchanged — ~20 callers unaffected); the CLI
prints the success line only on a real regeneration. Applied symmetrically to
the directory branch (Successfully combined …).

Message-accuracy fix

The regeneration gate initially tracked only the combined write, so
--combined no (which writes only per-session files, never a combined) fell
completely silent despite producing every session file — and the per-session
count was dropped whenever the combined was current but a session was
rewritten. The gate now also counts _generate_individual_session_files's
return, i.e. "combined or any session file (re)written". Pure skips still
stay quiet.

Tests

TestSingleFileStaleness (test_output_explicit.py): grown source regenerates;
unchanged source skips with no redundant success line; directory no-op skip
suppresses the combined success line; --combined no confirms its
per-session work.
Flipped test_single_file_mode_regeneration_behavior, which previously
enshrined the stale behavior as "expected".

All four behavioral tests were verified RED on the prior code and GREEN after.

Not addressed here

--no-cache on a directory with a grown source can still serve stale HTML
(the source-mtime term is scoped to file sources). This is pre-existing, not
introduced here, and tracked separately.

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Improved regeneration handling so the CLI no longer shows a success message when output is already current and nothing was rebuilt.
- Default single-file output now refreshes correctly when the source changes.
- Directory-based runs now suppress misleading success text during skip scenarios while still reporting when session files are generated.
Tests
- Expanded coverage for stale vs. current outputs, single-file regeneration, directory skips, and combined-off session file generation.

…essage (#221 follow-up) The default single-file output path (no `-o`, detail-suffixed naming like `<file>.low.html`) rode the version-only staleness skip: `is_outdated()` compares only the embedded tool version, not the source's freshness. So a single `.jsonl` that grew between runs (an in-progress session re-exported with the same tool version) was wrongly reported "current" and served stale HTML. #237 only forced regeneration for the explicit `-o file` case. Bug 1: add a source-mtime term to the single-file `else` branch in convert_jsonl_to — regenerate when the source file is newer than the existing output. Scoped to a file source (a directory source has no single mtime and is already handled by the cache / `cache_was_updated` path). The version-marker check is kept for tool-version bumps. Bug 2: on a skip the CLI printed "Successfully converted" on top of the converter's own "is current, skipping regeneration" line. convert_jsonl_to now reports whether it actually (re)wrote the output via a non-breaking optional `regenerated` out-parameter (the `Path` return contract that ~20 callers rely on is unchanged); the CLI prints the success line only on a real regeneration. Tests: flip test_single_file_mode_regeneration_behavior (it enshrined the stale behavior as "expected") to assert regeneration on a grown source; add TestSingleFileStaleness covering the exact CLI repro — grown source regenerates, unchanged source skips with no redundant success line. Both fail on the prior code. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ty, mtime rationale Non-blocking review follow-ups on the #221 single-file-staleness fix: 1. Directory-branch symmetry: the skip-suppression of the success line was gated only on the single-file branch, so a directory whose combined output is current (Phase-1b early-exit) still printed "Successfully combined N transcript files ... and generated M individual session files" on a pure skip. The `regenerated` out-param already carries the bool — extend the same guard to the directory branch so skip-messaging is consistent for single-file AND directory inputs. 2. Readability: `regenerated != [False]` -> `any(regenerated)`. 3. Mtime rationale: add a comment explaining why the *output* file's mtime is the correct, sufficient, persistence-free basis in single-file mode (no cache DB to store the source mtime as the directory path does), and that both share the same filesystem-granularity limit. Test: add test_directory_skip_suppresses_success_line pinning note 1 (RED on the prior file-only guard, which printed the misleading combined line on a no-op directory re-run). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…bined no silent run) The skip-suppression added earlier gated the CLI success line on the COMBINED-output write only (`did_regenerate`). But per-session generation is an independent axis: `--combined no` writes only individual session files and never a combined transcript, so `did_regenerate` stayed False and the run printed NOTHING despite producing every session file — a regression vs. the prior unconditional "Successfully combined N transcript files ... and generated M individual session files" line. The same gap silently dropped the session count whenever the combined was current but a session was (re)written. `_generate_individual_session_files` already returns a regenerated count; it was discarded. Capture it and OR it into the `regenerated` out-param so the gate means "combined OR any session file (re)written". Pure skips (single-file or a fully-current directory) still report False and stay quiet, preserving the original double-message fix. Surfaced by a self-review pass (/code-review) on the branch; not caught by the diff-scoped manual review because it only manifests under --combined no / session-only regeneration, which the review didn't execute. Test: test_combined_no_confirms_session_generation — `--combined no` writes session files AND confirms the work (RED before: the run printed only Processing/Loading with no summary line). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-30T22:20:43Z

📝 Walkthrough

Walkthrough

convert_jsonl_to() gains an optional regenerated out-parameter populated across its early-exit, paginated, non-paginated, and per-session paths. The CLI uses this to compute was_regenerated and suppress success messages when output was skipped. Tests are updated and extended to verify this behavior across single-file, directory, and combined-no scenarios.

Changes

Regeneration tracking and messaging

Layer / File(s)	Summary
Converter regeneration tracking `claude_code_log/converter.py`	`convert_jsonl_to` accepts a `regenerated` out-parameter, tracks `did_regenerate`/`sessions_regenerated` across cache-hit early exit, paginated and non-paginated generation (with new `source_is_newer` mtime check), and per-session file generation.
CLI success message gating `claude_code_log/cli.py`	CLI passes a `regenerated` list to `convert_jsonl_to`, computes `was_regenerated`, and only prints success messages when regeneration actually occurred.
Regeneration behavior tests `test/test_html_regeneration.py`, `test/test_output_explicit.py`	Existing single-file-mode test updated to expect regeneration on newer source instead of version-marker-based skip; new `TestSingleFileStaleness` class covers grown/unchanged source, directory rerun, and `--combined no` scenarios.

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant convert_jsonl_to
  participant FileSystem

  CLI->>convert_jsonl_to: convert_jsonl_to(input_path, output_path, regenerated=[])
  convert_jsonl_to->>FileSystem: compare output mtime vs input/version marker
  alt all current, no regeneration needed
    convert_jsonl_to->>convert_jsonl_to: regenerated.append(False)
    convert_jsonl_to-->>CLI: return output_path
    CLI->>CLI: was_regenerated = False, suppress success print
  else regeneration required
    convert_jsonl_to->>FileSystem: write combined HTML and/or session files
    convert_jsonl_to->>convert_jsonl_to: regenerated.append(True)
    convert_jsonl_to-->>CLI: return output_path
    CLI->>CLI: was_regenerated = True, print success message
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

daaain/claude-code-log#237: Both PRs modify convert_jsonl_to()'s regeneration/skip logic, directly overlapping in the converter's caching behavior.

Poem

A rabbit checked the mtime twice,
"Skip or write?" — such careful advice.
No more false cheers when nothing's new,
Just honest hops when work is true. 🐇
Thump thump — regenerated, verified!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes the main changes: single-file regeneration on source growth and corrected success-message behavior.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch dev/single-file-staleness

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

claude_code_log/cli.py (1)

1343-1361: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Handle --combined no as a session-only success path.

When write_combined is false, convert_jsonl_to() deliberately returns output_path without writing that combined file. This branch is now reachable because was_regenerated becomes true from session-file writes, so the CLI reports Successfully combined ... to {output_path} and click.launch() can target a path that does not exist.

Suggested fix

-        if not was_regenerated:
-            pass
-        elif input_path.is_file():
+        if not was_regenerated:
+            pass
+        elif input_path.is_file():
             click.echo(f"Successfully converted {input_path} to {output_path}")
         else:
             jsonl_count = len(list(input_path.glob("*.jsonl")))
-            if write_individual:
+            if write_combined and write_individual:
                 ext = get_file_extension(output_format)
                 session_files = list(input_path.glob(f"session-*.{ext}"))
                 click.echo(
                     f"Successfully combined {jsonl_count} transcript files from {input_path} to {output_path} and generated {len(session_files)} individual session files"
                 )
-            else:
+            elif write_combined:
                 click.echo(
                     f"Successfully combined {jsonl_count} transcript files from {input_path} to {output_path}"
                 )
+            else:
+                ext = get_file_extension(output_format)
+                session_files = list(input_path.glob(f"session-*.{ext}"))
+                click.echo(
+                    f"Successfully generated {len(session_files)} individual session files from {jsonl_count} transcript files in {input_path}"
+                )
 
-        if open_browser:
+        if open_browser and write_combined:
             click.launch(str(output_path))

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@claude_code_log/cli.py` around lines 1343 - 1361, The success handling in
cli.py’s convert_jsonl_to flow is treating session-only regeneration as a
combined-output success, which can report a nonexistent output_path and launch
it in the browser. Update the branch around was_regenerated,
input_path.is_file(), and write_individual so that when write_combined is false
the CLI uses a session-only success message and only calls click.launch on a
real existing file path, while preserving the combined-file messaging when
output_path was actually written.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/test_html_regeneration.py`:
- Around line 331-352: The regeneration assertion in this test is too dependent
on `time.sleep(0.1)` and can flake on coarse mtime filesystems; update the
`convert_jsonl_to_html` setup so the source file is explicitly made newer than
the HTML output using `os.utime(...)`, matching the approach already used in
`test_output_explicit.py`. Keep the checks around `skip_calls`,
`output_file.stat().st_mtime`, and the `"Single file mode test."` content, but
ensure the test advances the source timestamp in a filesystem-safe way before
calling `convert_jsonl_to_html`.

---

Outside diff comments:
In `@claude_code_log/cli.py`:
- Around line 1343-1361: The success handling in cli.py’s convert_jsonl_to flow
is treating session-only regeneration as a combined-output success, which can
report a nonexistent output_path and launch it in the browser. Update the branch
around was_regenerated, input_path.is_file(), and write_individual so that when
write_combined is false the CLI uses a session-only success message and only
calls click.launch on a real existing file path, while preserving the
combined-file messaging when output_path was actually written.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f13ccd68-ef0e-4edc-8d40-7b4af1b1b440

📥 Commits

Reviewing files that changed from the base of the PR and between 8c974bf and 4d7f498.

📒 Files selected for processing (4)

claude_code_log/cli.py
claude_code_log/converter.py
test/test_html_regeneration.py
test/test_output_explicit.py

coderabbitai · 2026-06-30T22:25:31Z

+        # Modify the source: even without a cache, a single file that grows
+        # (e.g. an in-progress session) MUST regenerate — the output is now
+        # older than its source (issue #221 follow-up). Previously single
+        # file mode wrongly skipped on the version marker alone.
        time.sleep(0.1)
        new_message = '{"type":"user","timestamp":"2025-07-03T16:40:00Z","parentUuid":null,"isSidechain":false,"userType":"human","cwd":"/tmp","sessionId":"test_session","version":"1.0.0","uuid":"single_file_msg","message":{"role":"user","content":[{"type":"text","text":"Single file mode test."}]}}\n'
        with open(jsonl_file, "a", encoding="utf-8") as f:
            f.write(new_message)

-        # Single file mode doesn't have cache, so it should still skip based on version
        with patch("builtins.print") as mock_print:
            convert_jsonl_to_html(jsonl_file)
-            mock_print.assert_any_call(
-                "HTML file single_test.html is current, skipping regeneration"
-            )
-
-        # Verify file wasn't regenerated (this is expected behavior for single file mode)
-        assert output_file.stat().st_mtime == original_mtime
+            # No "skipping" line: it regenerated because the source is newer.
+            skip_calls = [
+                c
+                for c in mock_print.call_args_list
+                if c.args and "skipping regeneration" in str(c.args[0])
+            ]
+            assert not skip_calls, f"unexpected skip after source grew: {skip_calls}"
+
+        # The output was rewritten (newer mtime) and now contains the new message.
+        assert output_file.stat().st_mtime > original_mtime
+        assert "Single file mode test." in output_file.read_text(encoding="utf-8")


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Make this regeneration check coarse-timestamp safe.

This test only sleeps 0.1s before appending, but the production path keys off mtimes. On 1-second-granularity filesystems that can leave the source and HTML in the same tick, so this rerun still skips and stat().st_mtime > original_mtime flakes. test/test_output_explicit.py already avoids that by forcing the source mtime strictly newer with os.utime(...); this case should do the same.

🧰 Tools

🪛 ast-grep (0.44.0)

[warning] 336-336: File path is request-/variable-derived; validate and normalize to prevent path traversal.
Context: open(jsonl_file, "a", encoding="utf-8")
Note: [CWE-22] Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal').

(open-filename-from-request)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/test_html_regeneration.py` around lines 331 - 352, The regeneration assertion in this test is too dependent on `time.sleep(0.1)` and can flake on coarse mtime filesystems; update the `convert_jsonl_to_html` setup so the source file is explicitly made newer than the HTML output using `os.utime(...)`, matching the approach already used in `test_output_explicit.py`. Keep the checks around `skip_calls`, `output_file.stat().st_mtime`, and the `"Single file mode test."` content, but ensure the test advances the source timestamp in a filesystem-safe way before calling `convert_jsonl_to_html`.

cboos and others added 3 commits July 1, 2026 00:17

cboos mentioned this pull request Jun 30, 2026

--no-cache on a directory serves stale HTML when a source grows #254

Open

coderabbitai Bot reviewed Jun 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Regenerate single-file output when its source grows; fix success-message accuracy (#221 follow-up)#253

Regenerate single-file output when its source grows; fix success-message accuracy (#221 follow-up)#253
cboos wants to merge 3 commits into
mainfrom
dev/single-file-staleness

cboos commented Jun 30, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 30, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

cboos commented Jun 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Bug 1 — stale HTML on a grown single-file source

Bug 2 — misleading success message

Message-accuracy fix

Tests

Not addressed here

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

cboos commented Jun 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 30, 2026 •

edited

Loading