feat: base provider abstraction + Claude adapter (phase 1) by iansherr · Pull Request #242 · daaain/claude-code-log

iansherr · 2026-06-25T20:09:25Z

Replaces #225 — this is the clean base abstraction, split out per maintainer feedback.

What's included

BaseProvider ABC + SessionInfo dataclass: the contract all providers implement
Factory helpers for transcript entry creation (~150 lines of boilerplate eliminated)
ProviderRegistry with auto-discovery and instantiation
ClaudeProvider wrapping existing ~/.claude/ functionality (behavioral no-op)
Unified discovery API (discovery.py) — single entry point for all providers
Pyright: 0 errors, 0 warnings (strict mode)
docs/ symlinks preserved — no copy noise
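
For readers who have not opened the diff, here is a minimal sketch of what a provider contract along these lines could look like. It is not the code from this PR: the `SessionInfo` field names, the `name` attribute, `is_available()`, and the toy `EmptyProvider` are assumptions made for illustration. Only `BaseProvider`, `SessionInfo`, `get_data_dir`, `discover_sessions`, and `load_session(session_id, max_messages)` are named in the thread.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Iterator, Optional


@dataclass
class SessionInfo:
    """Metadata for one discovered session (field names are assumptions)."""

    provider: str
    session_id: str
    path: Path
    updated_at: Optional[str] = None


class BaseProvider(ABC):
    """Contract every provider implements (hypothetical shape)."""

    name: str = "base"

    @abstractmethod
    def get_data_dir(self) -> Optional[Path]:
        """Return the provider's data directory, or None if absent."""

    @abstractmethod
    def discover_sessions(self) -> Iterator[SessionInfo]:
        """Yield metadata for every session the provider can see."""

    @abstractmethod
    def load_session(
        self, session_id: str, max_messages: Optional[int] = None
    ) -> Iterator[Any]:
        """Yield transcript entries for one session."""

    def is_available(self) -> bool:
        """Assumed rule: a provider is usable when its data directory exists."""
        return self.get_data_dir() is not None


class EmptyProvider(BaseProvider):
    """Toy provider with no data, used only to exercise the contract."""

    name = "empty"

    def get_data_dir(self) -> Optional[Path]:
        return None

    def discover_sessions(self) -> Iterator[SessionInfo]:
        return iter(())

    def load_session(
        self, session_id: str, max_messages: Optional[int] = None
    ) -> Iterator[Any]:
        raise FileNotFoundError(f"Session {session_id} not found")
```

Because the three core methods are abstract, instantiating `BaseProvider` directly raises `TypeError`, so a new adapter cannot silently omit part of the contract.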

What's not included (coming as separate PRs)

AGY adapter (next — with proper entry threading + dropped types addressed)
Codex, OpenCode, Gemini adapters
--provider CLI flag wiring
--all-providers path filtering composition

Verification

All existing tests pass (2207 on this branch, up from 2138). No behavior change — ClaudeProvider delegates entirely to the existing converter.py + renderer.py pipeline.

Summary by CodeRabbit

New Features
- Added support for discovering and loading chat sessions through a unified provider system.
- Introduced automatic session discovery across available sources, plus per-provider session counts and lookup.
- Added support for Claude session data, including listing sessions from local project folders and opening individual transcripts.
- Expanded the public provider interface to standardize session metadata and transcript handling.

- BaseProvider ABC + SessionInfo dataclass for multi-provider support
- Factory helpers: make_user_entry, make_assistant_entry, make_thinking_entry, make_tool_use_entry, make_tool_result_entry, extract_text, file_mtime_iso
- ClaudeProvider wrapping existing claude-code-log functionality
- ProviderRegistry with auto-discovery and lazy instantiation
- discover_providers() entry point for provider discovery
- discover_all_sessions() and discover_sessions_by_provider() for unified access
- 0 Pyright errors, 2207 tests pass
- Docs symlinks preserved (not replaced with copies)

iansherr · 2026-06-25T20:09:33Z

@cboos You make a good point that we're landing a lot. I've been chewing away at making this less of a POC and more of a feature hub, so we can easily add more providers over time if wanted. I'm updating this PR as the main hub, then will submit AGY as a separate PR. Once you're happy with those, I can refine the other providers (Codex, OpenCode, etc.).

What's included in this PR:

BaseProvider ABC + SessionInfo dataclass
Factory helpers for transcript entry creation (~150 lines of boilerplate eliminated)
ProviderRegistry with auto-discovery + instantiation
ClaudeProvider wrapping existing ~/.claude/ functionality
Unified discovery API (discovery.py)
Pyright: 0 errors, 0 warnings
docs/ symlinks preserved (not replaced with copies)

All existing tests pass (2207 on this branch). No behavior change — ClaudeProvider delegates entirely to the existing converter.py + renderer.py pipeline.

Edited to add: This is my first substantive PR to a project. Thanks for your patience as I keep learning the ropes.

coderabbitai · 2026-06-25T20:12:28Z

📝 Walkthrough

Adds shared provider abstractions and session metadata, a Claude provider implementation, a registry for provider discovery and loading, and public discovery helpers plus package exports.

Changes

Provider discovery and registry

Layer / File(s)	Summary
Session model and provider contract `claude_code_log/providers/base.py`	`SessionInfo`, transcript builders, and `BaseProvider` define the shared session and provider contract used by the discovery flow.
Registry storage and lookup `claude_code_log/providers/registry.py`	`ProviderRegistry` stores provider instances and classes, instantiates registered classes, and exposes provider lookup and availability queries.
Claude provider discovery and loading `claude_code_log/providers/claude.py`	`ClaudeProvider` resolves `~/.claude/projects`, discovers non-`agent-` JSONL sessions, and loads matching transcripts from project subfolders.
Registry session flow and bootstrap `claude_code_log/providers/registry.py`	`ProviderRegistry` discovers sessions, counts sessions per provider, loads sessions by provider name, and `discover_providers()` registers `ClaudeProvider`.
Public discovery helpers and exports `claude_code_log/discovery.py`, `claude_code_log/providers/__init__.py`	`discover_all_sessions`, `discover_sessions_by_provider`, `get_session_stats`, and `load_session` delegate through the registry, and the providers package re-exports the public surface.
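
The Claude discovery row above can be illustrated with a short sketch. It assumes a layout of one subfolder per project under `~/.claude/projects`, each holding `<session_id>.jsonl` files, and it skips files whose names start with `agent-`. The function name `iter_claude_session_files` is hypothetical and not taken from the PR.

```python
from pathlib import Path
from typing import Iterator


def iter_claude_session_files(projects_dir: Path) -> Iterator[Path]:
    """Yield session JSONL files found in each project subfolder.

    Files named "agent-*.jsonl" are skipped, mirroring the
    "non-agent- JSONL sessions" rule in the summary table.
    """
    if not projects_dir.is_dir():
        return
    for project_dir in sorted(projects_dir.iterdir()):
        if not project_dir.is_dir():
            # Stray files at the top level are not sessions.
            continue
        for jsonl_file in sorted(project_dir.glob("*.jsonl")):
            if jsonl_file.name.startswith("agent-"):
                continue
            yield jsonl_file
```

A caller would pass something like `Path.home() / ".claude" / "projects"`; a missing directory simply yields nothing instead of raising.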

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant discover_all_sessions
  participant discover_providers
  participant ProviderRegistry
  participant ClaudeProvider

  Caller->>discover_all_sessions: request session enumeration
  discover_all_sessions->>discover_providers: build registry
  discover_providers->>ProviderRegistry: register ClaudeProvider
  discover_providers->>ProviderRegistry: instantiate_registered()
  discover_all_sessions->>ProviderRegistry: get_available_providers()
  ProviderRegistry->>ClaudeProvider: discover_sessions()
  ClaudeProvider-->>ProviderRegistry: SessionInfo values
  ProviderRegistry-->>discover_all_sessions: SessionInfo values
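
The call order in the diagram can be traced with a small runnable sketch. `FakeProvider` stands in for `ClaudeProvider` and returns canned session ids; `register_class`, `is_available`, and the `name` attribute are assumptions, while `discover_providers`, `instantiate_registered`, `get_available_providers`, and `discover_all_sessions` follow the names used in the walkthrough.

```python
from typing import Dict, Iterator, List, Optional, Type


class FakeProvider:
    """Stand-in for ClaudeProvider; yields canned session ids."""

    name = "fake"

    def is_available(self) -> bool:
        return True

    def discover_sessions(self) -> Iterator[str]:
        yield from ("s1", "s2")


class ProviderRegistry:
    def __init__(self) -> None:
        self._classes: Dict[str, Type[FakeProvider]] = {}
        self._providers: Dict[str, FakeProvider] = {}

    def register_class(self, provider_class: Type[FakeProvider]) -> None:
        # Hypothetical method name: store the class, do not instantiate yet.
        self._classes[provider_class.name] = provider_class

    def instantiate_registered(self) -> None:
        for name, provider_class in self._classes.items():
            self._providers[name] = provider_class()

    def get_available_providers(self) -> List[FakeProvider]:
        return [p for p in self._providers.values() if p.is_available()]


def discover_providers() -> ProviderRegistry:
    """Build a registry, register the known provider, instantiate it."""
    registry = ProviderRegistry()
    registry.register_class(FakeProvider)
    registry.instantiate_registered()
    return registry


def discover_all_sessions(provider_name: Optional[str] = None) -> Iterator[str]:
    """Yield sessions from every available provider, optionally filtered."""
    for provider in discover_providers().get_available_providers():
        if provider_name is None or provider.name == provider_name:
            yield from provider.discover_sessions()
```

Calling `list(discover_all_sessions())` walks the same path as the diagram: the registry is built first, and only providers reported as available are asked for sessions.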

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

A bunny hopped through logs tonight,
Found? No, just paw prints, neat and light. 🐰
New sessions bloom in registry rows,
While Claude on moonbeam gently glows.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 35.29% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly reflects the main changes: a new base provider abstraction and a Claude provider adapter.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

Warning

⚠️ This pull request shows signs of AI-generated slop (defensive_cruft). It has been flagged by CodeRabbit slop detection and should be reviewed carefully.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (6)

claude_code_log/providers/registry.py (2)
68-70: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Add an explicit return type annotation for consistency.

Every other method on ProviderRegistry is annotated; load_session returns provider.load_session(...), i.e. Iterator[TranscriptEntry]. Annotating it keeps the public surface consistent and explicit.
Proposed change
     def load_session(
         self, provider_name: str, session_id: str, max_messages: Optional[int] = None
-    ):
+    ) -> Iterator[TranscriptEntry]:
(requires importing TranscriptEntry, e.g. by adding it to the existing from .base import ... line)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@claude_code_log/providers/registry.py` around lines 68 - 70, Add an explicit
return type annotation to ProviderRegistry.load_session for consistency with the
rest of the class. Update the load_session method signature in ProviderRegistry
to declare the Iterator[TranscriptEntry] return type, and make sure
TranscriptEntry is imported alongside the existing base provider symbols so the
public API stays explicit and consistent.
29-36: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Silently swallowing provider init failures hampers diagnosability.

A bare except Exception: pass means a misconfigured or broken provider disappears with no trace, making field issues very hard to debug. At minimum log the failure (with provider name) at warning/debug level.
Proposed change
     def instantiate_registered(self) -> None:
-        for provider_class in self._provider_classes.values():
+        for name, provider_class in self._provider_classes.items():
             try:
                 provider = provider_class()
                 self.register(provider)
-            except Exception:
-                # Skip providers that fail to initialize
-                pass
+            except Exception:
+                logger.warning(
+                    "Skipping provider %r: failed to initialize", name, exc_info=True
+                )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@claude_code_log/providers/registry.py` around lines 29 - 36, The
instantiate_registered method in the registry currently swallows provider
initialization failures with a bare except, which hides broken providers; update
this path to log the exception at warning or debug level and include the
provider class name before skipping it. Use the existing instantiate_registered
and register flow in ProviderRegistry to identify the failing provider, and keep
the fallback behavior of continuing with the remaining providers.
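
A standalone sketch of the logging behavior suggested above, written as a free function so it runs without the rest of the registry. The function name `instantiate_all` and the two toy provider classes are hypothetical; the `logger.warning(..., exc_info=True)` call is the one from the proposed change.

```python
import logging
from typing import Callable, Dict

logger = logging.getLogger(__name__)


def instantiate_all(
    provider_classes: Dict[str, Callable[[], object]],
) -> Dict[str, object]:
    """Instantiate each provider, logging and skipping any that fail."""
    providers: Dict[str, object] = {}
    for name, provider_class in provider_classes.items():
        try:
            providers[name] = provider_class()
        except Exception:
            # exc_info=True attaches the traceback to the log record,
            # so the failure is visible without aborting discovery.
            logger.warning(
                "Skipping provider %r: failed to initialize", name, exc_info=True
            )
    return providers


class BrokenProvider:
    def __init__(self) -> None:
        raise RuntimeError("missing config")


class WorkingProvider:
    pass
```

With `{"broken": BrokenProvider, "working": WorkingProvider}` as input, only the working provider is returned and the broken one leaves a warning with its traceback.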
claude_code_log/providers/base.py (1)
49-50: 🎯 Functional Correctness | 🔵 Trivial | 💤 Low value

file_mtime_iso emits a timezone-naive local timestamp.

datetime.fromtimestamp(...) without a tz argument produces a naive local-time value, so created_at/updated_at strings lack an offset. Once sessions from multiple providers (or machines in different zones) are sorted/compared, this is ambiguous. Consider anchoring to UTC.
Proposed change
-from datetime import datetime
+from datetime import datetime, timezone
@@
 def file_mtime_iso(path: Path) -> str:
-    return datetime.fromtimestamp(path.stat().st_mtime).isoformat()
+    return datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc).isoformat()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@claude_code_log/providers/base.py` around lines 49 - 50, The file_mtime_iso
helper currently returns a timezone-naive local timestamp, so the
created_at/updated_at values are ambiguous across machines and providers. Update
file_mtime_iso in base.py to produce a timezone-aware UTC ISO string by
anchoring datetime.fromtimestamp to UTC, and keep the change localized to this
helper so callers continue using file_mtime_iso unchanged.
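
A runnable version of the UTC-anchored helper, plus a small check for whether a timestamp string carries an offset. `is_aware` is a hypothetical helper added only for the demonstration.

```python
from datetime import datetime, timezone
from pathlib import Path


def file_mtime_iso(path: Path) -> str:
    """Return the file's modification time as a UTC ISO 8601 string."""
    mtime = path.stat().st_mtime
    return datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat()


def is_aware(timestamp: str) -> bool:
    """True when the ISO string parses to a timezone-aware datetime."""
    return datetime.fromisoformat(timestamp).tzinfo is not None
```

Strings produced this way end in `+00:00`, so values from machines in different zones compare against the same reference.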
claude_code_log/discovery.py (3)
45-55: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Add a docstring to get_session_stats.

Every other public function in this module has a docstring; get_session_stats is the only one without one. Document the return shape ({provider_name: session_count}) for consistency.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@claude_code_log/discovery.py` around lines 45 - 55, Add a docstring to
get_session_stats in discovery.py so it matches the other public functions in
the module. Describe that it returns a dict mapping provider names to session
counts, and note the return shape as {provider_name: session_count} for
consistency.
9-29: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Consider delegating to the registry to avoid duplicated discovery logic.

discover_all_sessions (no filter) and get_session_stats reimplement the iterate-available-providers / yield-from-discover_sessions flow that ProviderRegistry already exposes (discover_all_sessions, discover_sessions_by_provider). The provider-name filter here is the only added value; the rest could route through the registry to keep a single source of truth for availability and iteration semantics.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@claude_code_log/discovery.py` around lines 9 - 29, discover_all_sessions
currently duplicates the provider iteration and availability checks that
ProviderRegistry already owns. Update discover_all_sessions to delegate to the
registry’s discover_all_sessions and discover_sessions_by_provider helpers,
keeping only the provider-name filter logic here. Use discover_providers,
ProviderRegistry, and discover_sessions_by_provider to centralize availability
and iteration semantics in one place.
58-69: 📐 Maintainability & Code Quality | 🔵 Trivial

Add a return annotation to load_session.

claude_code_log/discovery.py:58 should mirror the provider API with -> Iterator[TranscriptEntry]; TranscriptEntry is already available from .providers.base, so the wrapper can expose the same contract explicitly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@claude_code_log/discovery.py` around lines 58 - 69, Add an explicit return
type to load_session in discovery.py so the wrapper matches the provider API;
update the function signature to return Iterator[TranscriptEntry], using the
already imported TranscriptEntry from .providers.base, and keep the body
delegating through discover_providers().load_session(provider_name, session_id).

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bed43211-85e1-4f3e-ab40-9fe78af5e58e

📥 Commits

Reviewing files that changed from the base of the PR and between 4bd633f and 997da83.

📒 Files selected for processing (5)

claude_code_log/discovery.py
claude_code_log/providers/__init__.py
claude_code_log/providers/base.py
claude_code_log/providers/claude.py
claude_code_log/providers/registry.py

coderabbitai · 2026-06-25T20:19:51Z

+    def load_session(
+        self, session_id: str, max_messages: Optional[int] = None
+    ) -> Iterator[TranscriptEntry]:
+        from claude_code_log.converter import load_transcript
+
+        data_dir = self.get_data_dir()
+        if data_dir is None:
+            raise ValueError("Claude data directory not found")
+
+        for project_dir in data_dir.iterdir():
+            if not project_dir.is_dir():
+                continue
+            jsonl_file = project_dir / f"{session_id}.jsonl"
+            if jsonl_file.exists():
+                return iter(load_transcript(jsonl_file))
+
+        raise FileNotFoundError(f"Session {session_id} not found")


🎯 Functional Correctness | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect load_transcript signature/return type
fd -t f 'converter.py' claude_code_log
ast-grep run --pattern 'def load_transcript($$$): $$$' --lang python $(fd -t f 'converter.py' claude_code_log)

Repository: daaain/claude-code-log

Length of output: 189

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "== outline converter.py =="
ast-grep outline claude_code_log/converter.py --view expanded || true
echo
echo "== search load_transcript / max_messages usages =="
rg -n "load_transcript|max_messages" claude_code_log

Repository: daaain/claude-code-log

Length of output: 4261

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "== read converter.py relevant section =="
wc -l claude_code_log/converter.py
sed -n '1,260p' claude_code_log/converter.py
echo
echo "== read provider/registry relevant sections =="
rg -n "def load_session|max_messages|load_transcript" claude_code_log/providers claude_code_log | sed -n '1,200p'

Repository: daaain/claude-code-log

Length of output: 12573

🏁 Script executed:

#!/bin/bash
set -euo pipefail
sed -n '1,260p' claude_code_log/converter.py
echo
sed -n '1,220p' claude_code_log/providers/claude.py
echo
rg -n "max_messages|load_session\\(" claude_code_log

Repository: daaain/claude-code-log

Length of output: 12196

max_messages needs to be enforced in claude_code_log/providers/claude.py::load_session

max_messages is accepted and forwarded by the registry, but this method always returns load_transcript(jsonl_file) in full. That leaves large sessions unbounded and makes the cap parameter a no-op. Either pass the limit through to the transcript loader or stop reading once the cap is reached.
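
One way to honor the cap is `itertools.islice`, sketched below. This assumes only that `load_transcript` returns an iterable of entries, as the `iter(load_transcript(jsonl_file))` call in the diff implies; `cap_entries` is a hypothetical helper name.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def cap_entries(
    entries: Iterable[T], max_messages: Optional[int] = None
) -> Iterator[T]:
    """Yield at most max_messages entries; None means no cap."""
    if max_messages is None:
        return iter(entries)
    # islice stops pulling from the source once the cap is reached.
    return islice(entries, max_messages)
```

Inside `load_session` this would read `return cap_entries(load_transcript(jsonl_file), max_messages)`. Note that if the loader builds the whole list before returning, this only bounds what the caller sees, not how much of the file is read; `islice` also raises `ValueError` for a negative cap.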

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In `@claude_code_log/providers/claude.py` around lines 40 - 56, The load_session method in claude_code_log/providers/claude.py ignores the max_messages parameter and always returns the full transcript from load_transcript(jsonl_file). Update load_session to enforce the cap by passing max_messages through to the transcript loading path if supported, or by truncating iteration after max_messages entries before returning. Keep the change localized to load_session and any directly related transcript-loading helper it calls.

cboos · 2026-06-28T09:18:42Z

Thanks for taking up the challenge. However, the abstraction alone might not be the best choice as the first step, since we can't be sure we've nailed it perfectly without at least one example to validate it. In particular, I'm a bit concerned about the parentUuid=None parts.

The suggestion in #225 was indeed: land the abstraction plus one fully-realized provider — properly typed, threading entries, and modeling tool calls/results — as the quality bar.
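
To make "threading entries" concrete, here is a minimal sketch under one assumption: each entry carries a `uuid` and a `parentUuid` that points at the entry before it, so only the first entry in a session has `parentUuid=None`. The `Entry` dataclass and `thread_entries` helper are hypothetical and are not the project's real models.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional
from uuid import uuid4


@dataclass
class Entry:
    """Hypothetical minimal entry; only the threading fields are modeled."""

    text: str
    uuid: str
    parentUuid: Optional[str]


def thread_entries(texts: Iterable[str]) -> Iterator[Entry]:
    """Link each entry to the one before it; only the first has no parent."""
    parent: Optional[str] = None
    for text in texts:
        entry = Entry(text=text, uuid=str(uuid4()), parentUuid=parent)
        parent = entry.uuid
        yield entry
```

A provider that emits every entry with `parentUuid=None` loses this chain, which is presumably the concern raised above.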

So, I think it's nice to separate the addition of the providers layer as a single commit as you did already, but on top of that, in this PR, you could just add the agy example in one or more additional commits, exercising the different aspects of the support, so as to fully shape the abstraction together with one instantiation.

Never mind, I just noticed #243...

cboos · 2026-06-28T22:27:45Z

(Claude) Reviewed as the base for #243 — clean, merge-ready foundation. It's purely additive (5 new files, zero changes to existing code), so the Claude path is untouched and the docs symlinks are preserved (addressing the symlink breakage from #225). ClaudeProvider.load_session delegates to the existing load_transcript, so it's a true behavioral no-op. pyright 0/0/0 (the 274 from #225 are gone), ty at baseline, 2207 tests pass. Clean ABC with typed factory helpers, and the registry encapsulates instantiation (fixing the earlier reportPrivateUsage). Full review + the forward direction (structural tool rendering toward parity with Claude's log output) is on #243.

iansherr mentioned this pull request Jun 25, 2026

feat: AGY (Antigravity) provider — with threading + full type coverage #243

Open

coderabbitai Bot reviewed Jun 25, 2026

cboos mentioned this pull request Jun 28, 2026

feat: Add multi-provider session viewer support (Codex, Gemini, OpenCode, agy) #225

Closed

iansherr wants to merge 1 commit into daaain:main from iansherr:feat/base-abstraction
