
[WIP]LangSmithを使用している箇所をWeaveに変更#44

Open
os1ma wants to merge 6 commits into main from
feature/weave

Conversation

@os1ma
Contributor

@os1ma os1ma commented Jan 13, 2026

Summary by CodeRabbit

  • New Features

    • Introduced execution tracing with Weave (streams carry a call ID).
    • Centralized prompt management and added prompt publishing.
    • Added a Weave-based evaluation workflow, evaluation scripts, and a set of scorers.
    • Added a dataset-creation UI for documentation and related pages.
  • Other

    • Updated the environment template settings (changed the external service key configuration).

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai

coderabbitai bot commented Jan 13, 2026

Walkthrough

Migrates tracing from a LangSmith base to a Weave base. Replaces the @traceable decorator on the RAG chains with @weave.op, introduces a WeaveCallId type so that each stream first yields the Weave call ID, and adds Weave/WANDB to the environment variables and dependencies.
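The streaming contract described above can be sketched on the consumer side as follows. The class shapes follow the PR's `base.py`; `fake_stream` is a stand-in for a real chain, which would be decorated with `@weave.op` and use `weave.require_current_call().id` instead of the hard-coded ID:

```python
# Minimal sketch of the Context / AnswerToken / WeaveCallId stream contract.
class Context:
    def __init__(self, documents):
        self.documents = documents

class AnswerToken:
    def __init__(self, token):
        self.token = token

class WeaveCallId:
    def __init__(self, weave_call_id):
        self.weave_call_id = weave_call_id

def fake_stream():
    # A real chain would be decorated with @weave.op and yield
    # WeaveCallId(weave.require_current_call().id) here.
    yield WeaveCallId("call-123")
    yield Context(["doc-a", "doc-b"])
    for tok in ["Hello", " world"]:
        yield AnswerToken(tok)

def consume(stream):
    # The UI iterates the stream and dispatches on chunk type; the
    # WeaveCallId arrives first and is kept for later feedback calls.
    call_id, context, answer = None, [], ""
    for chunk in stream:
        if isinstance(chunk, WeaveCallId):
            call_id = chunk.weave_call_id
        elif isinstance(chunk, Context):
            context = chunk.documents
        elif isinstance(chunk, AnswerToken):
            answer += chunk.token
    return call_id, context, answer
```

This mirrors why the call ID must come first: the page needs it before any tokens arrive so feedback can be attached to the trace.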

Changes

Cohort / File(s) Summary
Environment & Dependencies
.env.template, pyproject.toml
Removed the LangChain/LangSmith variables from .env.template and added WANDB_API_KEY and WEAVE_PROJECT_NAME. Added weave, wandb, and litellm to pyproject.toml.
RAG Chain Base Abstraction
app/advanced_rag/chains/base.py
Added a new WeaveCallId class. Extended the yield type of BaseRAGChain.stream to `Context | AnswerToken | WeaveCallId`.
RAG Chain Implementations
app/advanced_rag/chains/*
app/advanced_rag/chains/hyde.py, .../multi_query.py, .../naive.py, .../rerank.py, .../route.py, .../rag_fusion.py, .../hybrid.py
Changed @traceable to @weave.op in each chain. Each chain now imports weave and yields the current Weave call ID as a WeaveCallId at the start of stream. Internal prompt handling is delegated to the prompts in app.prompts.
Prompts module added and organized
app/prompts/*
app/prompts/__init__.py, app/prompts/rag_prompts.py, app/prompts/evaluation_prompts.py, app/prompts/documentation_quality_prompt.py
Added and exported Weave StringPrompt definitions for RAG and evaluation, and provided various publish_* functions. Prompts are now centrally managed and can be referenced from chains and evaluations.
Streamlit Pages — Weave initialization added
pages/*.py (multiple: day1_3_indexing.py, day1_4_rag.py, day1_5_naive_agent.py, day2_2_advanced_rag.py, day2_5_naive_agent.py, day2_6_mcp.py, day2_7_custom_mcp.py, day3_2_documentation_agent.py, day3_3_checkpointer.py, day3_4_human_in_the_loop.py, day3_5_form.py, day3_6_agent_inbox.py, day3_7_supervisor.py, day3_x_create_documentation_dataset.py)
Added import weave and weave.init(os.getenv("WEAVE_PROJECT_NAME")) to many pages. Changed @traceable to @weave.op in a few of them.
Evaluation & Feedback pages
pages/day2_3_evaluation.py, pages/day2_4_advanced_rag_feedback.py, pages/day2_x_create_dataset.py
Reimplemented the Predictor as a Weave Model using @weave.op. In the feedback flow, replaced run_id with weave_call_id and updated reactions and scores to go through the Weave API. Simplified dataset creation to Weave Dataset.from_pandas / publish.
Notebooks
notebooks/*.ipynb (e.g. day1_2_langchain_and_rag_basics.ipynb, day3_1_langgraph.ipynb, day3_8_agent_design_pattern.ipynb)
Replaced the LangSmith calls inside the notebooks with Weave and added weave.init(...). day3_8 switches CLI invocations to programmatic API calls.
Documentation Agent & Evaluations
app/documentation_agent/agent.py, evals/*
Exposed DocumentationAgent's run methods via @weave.op. Added many Weave-based Predictor/Scorer/Evaluation workflows to evals (new modules, classes, and a CLI).

Sequence Diagram(s)

sequenceDiagram
  participant User as Client
  participant UI as Streamlit Page
  participant Chain as RAG Chain (weave.op)
  participant Weave as Weave Control Plane
  participant Model as LLM
  participant Store as VectorStore

  User->>UI: send question
  UI->>Chain: call stream(question) `@weave.op`
  Chain->>Weave: require_current_call() (weave_call_id)
  Chain-->>UI: yield WeaveCallId(weave_call_id)
  Chain->>Store: retrieve/embeddings(query)
  Store-->>Chain: documents (Context)
  Chain-->>UI: yield Context
  Chain->>Model: model.stream(generate_answer_prompt_text)
  Model-->>Chain: AnswerToken chunks
  Chain-->>UI: yield AnswerToken...
  Note over Weave,Chain: calls are traced and published via Weave
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • segavvy

Poem

🐰 Stitching code in the morning meadow,
the thread of each call glows softly,
WeaveCallId hops out first,
and the chains sing out in single file,
a little rabbit, hopping a toast.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.94% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title, "[WIP]LangSmithを使用している箇所をWeaveに変更", clearly summarizes the main change (the migration from LangSmith to Weave) and is consistent with the changeset as a whole.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/weave

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
notebooks/day3_1_langgraph.ipynb (1)

686-704: The kernelspec / Python version metadata changes tend to create diff noise; please confirm they are intentional

kernelspec.display_name and language_info.version have changed, so this should follow the team's policy on whether notebook metadata is pinned. If the change is unintentional, reverting it is recommended.

pages/day2_3_evaluation.py (3)

44-60: Minor: typo in ground_truch_source, and KeyError resilience for metadata["source"]
The variable name on line 54 is presumably meant to be ground_truth_source. In addition, r.metadata["source"] raises on a missing key, so switching to .get("source") is safer.


118-144: Avoid passing list[Document] directly as {context} in answer_hallucination, and stop shadowing the output variable

Line 131 passes output["context"] (a list[Document]) to the template, but Document objects are rendered in repr() form when stringified, which is not appropriate for an evaluation prompt. Join each document's metadata (such as source) and page_content into an explicit string, and avoid overwriting the output variable to improve readability.

Proposed fix (stringify context + tidy variable names)
 def answer_hallucination(output: dict[str, Any], question: str, answer: str) -> int:
     prompt = ChatPromptTemplate.from_template(_answer_hallucination_prompt)
@@
     model_with_structure = model.with_structured_output(AnswerHallucinationOutput)

+    context_docs: list[Document] = output["context"]
+    context_text = "\n\n".join(
+        f"SOURCE: {d.metadata.get('source','')}\n{d.page_content}" for d in context_docs
+    )
+
     prompt_value = prompt.invoke(
         {
             "input": question,
-            "context": output["context"],
+            "context": context_text,
             "output": output["answer"],
             "reference_outputs": answer,
         }
     )
-    output: AnswerHallucinationOutput = model_with_structure.invoke(prompt_value)  # type: ignore[assignment]
+    judged: AnswerHallucinationOutput = model_with_structure.invoke(prompt_value)  # type: ignore[assignment]

     # ハルシネーションのある場合は0、ない場合は1を返す
-    if output.hallucination:
+    if judged.hallucination:
         score = 0
     else:
         score = 1
     return score

149-191: asyncio.run() cannot be called directly in a Streamlit environment; a new event loop must be created on a separate thread.

Because Streamlit itself already runs an asyncio event loop, the asyncio.run(evaluation.evaluate(predictor)) call on line 190 fails with an "event loop already running" error. This is a known compatibility issue that occurs regardless of the runtime environment (local / Cloud / Notebook).

Recommended fix pattern:

  • Use threading.Thread to create a new event loop on a separate thread and run the Weave Evaluation there
  • Store the evaluation result in st.session_state and update the UI from the main thread
  • Control worker concurrency via the WEAVE_PARALLELISM environment variable to avoid rate limits and resource spikes
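The worker-thread pattern suggested above can be sketched without Weave or Streamlit; `run_evaluation` is a hypothetical coroutine standing in for `evaluation.evaluate(predictor)`:

```python
import asyncio
import threading

# Hypothetical coroutine standing in for evaluation.evaluate(predictor).
async def run_evaluation() -> dict:
    await asyncio.sleep(0)
    return {"score": 1.0}

def run_in_new_loop(coro):
    """Run a coroutine on a fresh event loop in a worker thread.

    This avoids "event loop already running" errors in environments
    (such as Streamlit) that already own a loop on the main thread.
    """
    result = {}

    def worker():
        # asyncio.run creates and tears down a brand-new loop in this thread.
        result["value"] = asyncio.run(coro)

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result["value"]
```

In the page, the returned result would then be written into `st.session_state` before updating the UI.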
pages/day2_4_advanced_rag_feedback.py (1)

102-124: Replace the repeated weave.init() call on line 117 with weave.get_client()

weave.init() is a process-global initialization and should not be called multiple times within the same process. After the first initialization (line 33), use weave.get_client() to obtain the client.

client = weave.get_client()
call = client.get_call(weave_call_id)

Because weave.init() replaces the global Weave client/settings on later calls, re-calling it when submitting feedback leads to unnecessary re-initialization. weave.get_client() is the recommended API for retrieving the already-initialized client instance.

app/advanced_rag/chains/route.py (1)

75-82: documents can end up uninitialized when route takes an unexpected value
To guard against cases such as the LLM output deviating from the schema or failing conversion, adding an explicit else: raise gives the code a safe landing spot.

Proposed diff
         if route == Route.langsmith_document:
             documents = self.langsmith_document_retriever.invoke(question)
         elif route == Route.web:
             documents = self.web_retriever.invoke(question)
+        else:
+            raise ValueError(f"Unknown route: {route!r}")

         # 検索結果を返す
         yield Context(documents=documents)
app/advanced_rag/chains/rerank.py (1)

24-45: Risk of logging input documents via @weave.op

_rerank() receives a documents parameter and sends its text content to Cohere. Because the @weave.op decorator logs input arguments by default, the same document content is also recorded in Weave, enlarging the information-leak surface.

We recommend using Weave's postprocess_inputs parameter to mask or drop sensitive fields, e.g. @weave.op(postprocess_inputs=lambda inputs: {**inputs, "documents": "REDACTED"}) or equivalent. Alternatively, enable redact_pii at weave.init() time, or register individual fields via sanitize.add_redact_key().
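A named redaction helper is easier to test than the inline lambda above. `redact_documents` is an illustrative name; Weave calls the `postprocess_inputs` function with the op's input arguments before logging them:

```python
# Hypothetical redaction helper for use with @weave.op(postprocess_inputs=...).
def redact_documents(inputs: dict) -> dict:
    # Copy so the original inputs passed to the op are untouched;
    # only the logged representation is masked.
    redacted = dict(inputs)
    if "documents" in redacted:
        redacted["documents"] = "REDACTED"
    return redacted

# In the chain it would be applied as (not executed here, requires weave.init):
# @weave.op(postprocess_inputs=redact_documents)
# def _rerank(self, question: str, documents: list) -> list: ...
```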

app/advanced_rag/chains/rag_fusion.py (1)

39-43: The @weave.op decorator records Document contents and metadata to Weave; this needs handling

Currently both _reciprocal_rank_fusion (lines 39-43) and the stream method (line 67) log Document objects, metadata included, straight to Weave. If the document contents include PII, confidential data, or personal information, it is recorded in the tracing system as-is.

Weave provides the following mitigations:

  • @weave.op(postprocess_inputs=fn) to mask or filter input arguments
  • @weave.op(tracing_sample_rate=0.1) to reduce what gets recorded via sampling
  • The WEAVE_DISABLED=true environment variable, or disabling recording entirely at weave initialization

Depending on your data-protection requirements, consider one of the following:

  1. Mask Document objects via postprocess_inputs (e.g. drop metadata, summarize content)
  2. Limit the sampling rate via tracing_sample_rate
  3. Disable recording with WEAVE_DISABLED in production
🤖 Fix all issues with AI agents
In @app/advanced_rag/chains/base.py:
- Around line 17-27: The stream() method was extended to yield WeaveCallId, so
update the reduce_fn used when consuming that stream (e.g., reduce_fn in
BaseRAGChain or any implementing class) to accept WeaveCallId values and
explicitly ignore them during aggregation; change the reducer signature to
accept Union[Context, AnswerToken, WeaveCallId] (or add a parameter typed for
WeaveCallId) and add a branch that returns the accumulator unchanged when the
item is a WeaveCallId, leaving normal handling for Context and AnswerToken
intact.

In @app/advanced_rag/chains/rag_fusion.py:
- Around line 78-80: The stream method in RAGFusionRAGChain (decorated as
@weave.op name="rag_fusion") violates BaseRAGChain.stream's generator contract
by not yielding a WeaveCallId first; import WeaveCallId and modify
RAGFusionRAGChain.stream to yield a WeaveCallId as the first yielded value
(matching other chains like naive/hyde/multi_query/rerank/route) before emitting
Context or AnswerToken so the generator type becomes Generator[Context |
AnswerToken | WeaveCallId, None, None] and callers can uniformly handle the
initial call id.

In @notebooks/day1_2_langchain_and_rag_basics.ipynb:
- Around line 303-318: Remove the insecure print of the API key prefix (the call
print(os.environ["WANDB_API_KEY"][:3])) and replace the existing assert
os.environ["WANDB_API_KEY"] with a safe presence check using
os.environ.get("WANDB_API_KEY") that raises a clear error (or ValueError) if
missing; ensure only existence is validated before calling
weave.init("training-llm-app") so no part of the WANDB_API_KEY is ever printed
or logged.
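A testable sketch of the safe presence check described in the prompt above (`require_env` is an illustrative helper name):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, raising a clear error
    if it is unset, without ever printing any part of the secret."""
    value = os.environ.get(name)
    if not value:
        raise ValueError(f"{name} is not set; add it to your .env file")
    return value

# Usage before initializing Weave (not executed here):
# require_env("WANDB_API_KEY")
# weave.init("training-llm-app")
```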

In @notebooks/day3_1_langgraph.ipynb:
- Around line 28-38: Replace the unconditional call to
weave.init("training-llm-app") with a conditional check using weave.get_client()
so initialization only happens when no client exists: call
weave.init("training-llm-app") only if weave.get_client() returns None;
reference weave.finish() if you need to support teardown-and-reinit scenarios.

In @notebooks/day3_8_agent_design_pattern.ipynb:
- Around line 203-216: ReflectionManager currently may attempt to write to a
file under a missing directory in save_reflection; update
ReflectionManager.__init__ to ensure the parent directory of self.file_path is
created before any writes (e.g., use pathlib to create the parent with
parents=True and exist_ok=True), or alternatively check/create the parent
directory at the start of save_reflection so FileNotFoundError cannot occur.
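A minimal sketch of the directory-handling fix described above; only the path logic is shown, while the real ReflectionManager in the notebook also holds reflection entries and serialization logic:

```python
from pathlib import Path

class ReflectionManager:
    """Sketch: ensure the parent directory exists before any write."""

    def __init__(self, file_path: str) -> None:
        self.file_path = Path(file_path)
        # Create missing parent directories up front so save_reflection
        # never raises FileNotFoundError on a fresh checkout.
        self.file_path.parent.mkdir(parents=True, exist_ok=True)

    def save_reflection(self, text: str) -> None:
        self.file_path.write_text(text, encoding="utf-8")
```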

In @pages/day1_3_indexing.py:
- Line 2: The Streamlit app currently calls weave.init() on every rerun (likely
inside app()), so wrap the initialization to run once: move weave.init() into a
helper decorated with @st.cache_resource (e.g., def init_weave(): weave.init();
return True and call init_weave() from app()), or guard it with st.session_state
(e.g., if "weave_inited" not in st.session_state: weave.init();
st.session_state["weave_inited"]=True). Ensure references to weave.init() and
the app() entry point are updated so initialization is only performed on first
run.
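The run-once guard described above can be sketched independently of Streamlit; `init_fn` stands in for `weave.init`, and the `@st.cache_resource` variant is shown in comments:

```python
# Module-level flag implementing a run-once guard.
_initialized = False

def init_once(init_fn) -> bool:
    """Call init_fn only the first time; subsequent calls are no-ops.
    Returns True if initialization ran on this call."""
    global _initialized
    if not _initialized:
        init_fn()
        _initialized = True
        return True
    return False

# Streamlit variant (assumes `import streamlit as st` and `import weave`):
# @st.cache_resource
# def init_weave():
#     weave.init("training-llm-app")
#     return True
```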

In @pages/day2_4_advanced_rag_feedback.py:
- Around line 31-43: In app(), guard calls to show_context and show_answer when
st.session_state.state.context or .answer can be None: before calling
show_context(st.session_state.state.context) and
show_answer(st.session_state.state.answer) (the else branch around the existing
display logic in app and the block covering lines ~73-100), check that the value
is not None (or provide a safe fallback like an empty string) and skip or render
a placeholder if it is None; update the SessionState usage so any downstream
code that assumes non-None values only runs after these checks.

In @pages/day2_x_create_dataset.py:
- Around line 3-6: The call to weave.publish(dataset) in the button handler
lacks exception handling, so network/auth/validation/timeouts can crash the UI;
wrap the publish call (the block that runs when the Streamlit button is pressed)
in a try/except and show errors to the user (e.g., via st.error) and optionally
use a spinner (st.spinner) while awaiting; catch broad exceptions around
weave.publish(dataset) and display the exception message, and consider adding
retry logic or a specific message for auth/timeout failures to the user.
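The guarded publish described above can be sketched with stand-ins so it is testable offline; `publish_fn` plays the role of `weave.publish` and `report_error` the role of `st.error`:

```python
def safe_publish(publish_fn, dataset, report_error) -> bool:
    """Publish a dataset, reporting failures to the UI instead of crashing.

    Catches broad exceptions because network, auth, validation, and
    timeout errors can all surface from the publish call.
    """
    try:
        publish_fn(dataset)
        return True
    except Exception as exc:
        report_error(f"Failed to publish dataset: {exc}")
        return False
```

In the page this would be wrapped in `with st.spinner(...)` and called from the button handler.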

In @pages/day3_2_documentation_agent.py:
- Around line 1-4: Wrap the non-idempotent weave.init call with Streamlit
caching: create a cached initializer (e.g., add a function named init_weave()
decorated with @st.cache_resource that calls load_dotenv(override=True) if
desired and returns weave.init("training-llm-app")), then call init_weave() from
app() instead of calling weave.init() directly; alternatively move load_dotenv()
to module scope and only cache weave.init() via the @st.cache_resource-decorated
init_weave to prevent multiple weave.init invocations across Streamlit reruns.

In @pages/day3_3_checkpointer.py:
- Around line 4-7: The module currently initializes Weave in a way that can leak
sensitive tool/checkpoint outputs across users; change initialization to respect
an env var flag and enable PII redaction: read WEAVE_DISABLED (or a config) and
skip calling weave.init() if set true; when calling weave.init() from the
Streamlit app() or a guarded init function, pass settings={"redact_pii": True}
to enable Presidio redaction; wrap the weave.init(...) call in a try/except to
catch and log initialization errors (do not crash the app) and fall back to
disabled mode so tracing is not sent.
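A testable sketch of the guarded initializer described above. `init_fn` stands in for the real call so the logic runs without a W&B account; the `settings={"redact_pii": True}` form is taken from the review suggestion and is only shown in a comment:

```python
import os

def maybe_init_tracing(init_fn, env=None) -> bool:
    """Initialize tracing unless disabled, falling back to disabled on error."""
    env = os.environ if env is None else env
    # Respect an opt-out flag so tracing can be turned off per environment.
    if str(env.get("WEAVE_DISABLED", "")).lower() == "true":
        return False
    try:
        # e.g. weave.init("training-llm-app", settings={"redact_pii": True})
        init_fn()
        return True
    except Exception:
        # Fall back to disabled tracing rather than crashing the app.
        return False
```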

In @pages/day3_5_form.py:
- Around line 4-8: The code currently calls weave.init("training-llm-app")
inside app(), causing repeated initialization on every Streamlit rerun; move the
initialization out of app() and perform it exactly once by guarding at module
load or via a one-time flag (e.g., st.session_state["weave_initialized"]) before
calling weave.init; additionally honor an environment variable (e.g.,
WEAVE_INIT_ENABLED) to disable initialization when set to "false" and wrap the
call in try/except to log or handle errors so failures don’t crash the app or
leak PII—update references to weave.init, the app() function, and use
st.session_state (or a module-level boolean) plus os.environ to implement these
changes.

In @pages/day3_7_supervisor.py:
- Around line 112-114: Wrap the global Weave initialization in a Streamlit
cached resource so it runs once per server process and fails gracefully: replace
the direct call to weave.init("training-llm-app") with a helper function (e.g.,
get_weave_client or init_weave) decorated with @st.cache_resource that calls
load_dotenv(override=True) and weave.init(...), catches exceptions, logs a
warning to Streamlit (or sets a flag in st.session_state) and returns None or a
safe noop client on failure; then call that helper from the app instead of
calling weave.init directly so re-runs don’t reinitialize global state and
errors don’t crash the app.

In @pyproject.toml:
- Around line 33-35: The dependencies wandb and weave are using open-ended >=
ranges which harms reproducibility; update the pyproject.toml entries for
"wandb" and "weave" to use strict pins (recommended: wandb==0.23.1 and
weave==0.52.22) or, if you prefer an alternative, either commit the lockfile
(uv.lock) to fully freeze transitive deps or use an upper-bounded range (e.g.,
wandb>=0.23.1,<0.24 and weave>=0.52.22,<0.53); modify the two lines referencing
wandb and weave accordingly.
🧹 Nitpick comments (10)
notebooks/day3_8_agent_design_pattern.ipynb (1)

254-271: The notebook metadata reflects a local environment.

The kernel name was changed to .venv and the Python version was downgraded from 3.11.13 to 3.11.2, which suggests local development settings were committed. To keep the environment consistent across the team, keep the original kernel settings, or confirm the change if it is intentional.

💡 To keep the original metadata
  "kernelspec": {
-   "display_name": ".venv",
+   "display_name": "training-llm-application-development",
   "language": "python",
   "name": "python3"
  },
   "pygments_lexer": "ipython3",
-   "version": "3.11.2"
+   "version": "3.11.13"
  }
pages/day2_5_naive_agent.py (1)

1-4: weave.init() should be guarded against Streamlit reruns (ensure one-time initialization)

This code calls weave.init("training-llm-app") inside the app() function (line 84), but because app() itself runs at module level (line 142), init executes on every script rerun that Streamlit performs in response to user interaction. Weave's official guidance is to call weave.init only once per Python process (global initialization), so a guard mechanism is needed.

Proposed fix (using @st.cache_resource or st.session_state)
 import streamlit as st
 import weave
 from dotenv import load_dotenv
@@
 def app() -> None:
     load_dotenv(override=True)
-    weave.init("training-llm-app")
+    @st.cache_resource
+    def _init_weave_once() -> None:
+        weave.init("training-llm-app")
+    _init_weave_once()

Or the st.session_state pattern:

 def app() -> None:
     load_dotenv(override=True)
+    if "weave_inited" not in st.session_state:
+        st.session_state.weave_inited = True
+        weave.init("training-llm-app")
pages/day1_5_naive_agent.py (1)

1-4: Centralize the hard-coded project name via an environment variable

"training-llm-app" is scattered across 16 page files; centralizing it in an environment variable makes switching between dev/prod environments and separating course material easier. The official Weave docs also recommend managing the project name via environment variables such as WEAVE_PROJECT_NAME. weave.init() is safe to call multiple times, but reading from a single configuration source improves maintainability.

Example (read from an environment variable)
+import os
 import streamlit as st
 import weave
 from dotenv import load_dotenv
@@
 def app() -> None:
     load_dotenv(override=True)
-    weave.init("training-llm-app")
+    weave.init(os.getenv("WEAVE_PROJECT_NAME", "training-llm-app"))
pages/day3_6_agent_inbox.py (1)

97-97: Typo: intrrupt_thread → interrupt_thread

The variable name is missing an e. This is not something changed in this PR, but consider fixing it as a refactoring opportunity.

♻️ Proposed fix
-            intrrupt_thread: InterruptThread = next(
+            interrupt_thread: InterruptThread = next(
                 filter(
                     lambda t: t.thread_id == ui_state.selected_thread_id,
                     interrupt_threads,
                 )
             )
             st.write("### 質問")
-            st.write(intrrupt_thread.question)
+            st.write(interrupt_thread.question)
             st.write("### 回答ドラフト")
             final_answer = st.text_area(
-                "回答", value=intrrupt_thread.draft_answer, height=400
+                "回答", value=interrupt_thread.draft_answer, height=400
             )

             submit = st.button("回答送信")

             if not final_answer or not submit:
                 return

-            thread_id = intrrupt_thread.thread_id
+            thread_id = interrupt_thread.thread_id
pages/day3_4_human_in_the_loop.py (1)

5-6: Cache the initialization so weave.init() does not run on every Streamlit rerun

Streamlit re-executes the script on every user interaction, so running weave.init("training-llm-app") each time causes startup latency and unnecessary re-initialization. The official docs recommend initializing Weave in a Streamlit environment exactly once per server process using @st.cache_resource, which is more robust and efficient than st.session_state.

Recommended pattern
+import streamlit as st
 import weave
 from dotenv import load_dotenv

+@st.cache_resource
+def init_weave():
+    return weave.init("training-llm-app")
+
 def app(thread_id: str | None = None) -> None:
     load_dotenv(override=True)
-    weave.init("training-llm-app")
+    client = init_weave()
pages/day1_4_rag.py (1)

4-5: Confirmed that @weave.op supports generator functions

The Iterator[BaseMessageChunk] return value of stream_rag() is supported by @weave.op. Weave logs the output as the generator is consumed, and in this file st.write_stream() consumes the stream properly, so there is no tracing issue (confirmed against the Weave docs and the advanced_rag chains in this codebase).

That said, guarding weave.init() against Streamlit reruns is still a worthwhile optimization:

Suggestion (init guard example)
def app() -> None:
    load_dotenv(override=True)
-   weave.init("training-llm-app")
+   if "weave_initialized" not in st.session_state:
+       weave.init("training-llm-app")
+       st.session_state.weave_initialized = True
app/advanced_rag/chains/hyde.py (1)

39-45: An exception from require_current_call() can take down the whole stream
This is fine at the WIP stage, but if callers are not guaranteed to always run in a Weave environment, a fallback (e.g. WeaveCallId(None)) is worth considering.

Proposed diff (fallback example)
 @weave.op(name="hyde")
 def stream(
     self, question: str
 ) -> Generator[Context | AnswerToken | WeaveCallId, None, None]:
-    current_call = weave.require_current_call()
-    yield WeaveCallId(weave_call_id=current_call.id)
+    try:
+        current_call = weave.require_current_call()
+        yield WeaveCallId(weave_call_id=current_call.id)
+    except Exception:
+        # Keep the stream minimally functional even if Weave is uninitialized (policy needs confirmation)
+        yield WeaveCallId(weave_call_id=None)
app/advanced_rag/chains/naive.py (1)

3-8: The same Weave setup code is duplicated across chains
Moving the require_current_call() call and WeaveCallId construction into a helper on the base class would also centralize the exception-handling policy in one place.

app/advanced_rag/chains/rerank.py (1)

60-66: Yielding WeaveCallId first is good, but confirm the exception handling
Fine if weave.init is guaranteed to have run; otherwise a fallback is worth considering.

app/advanced_rag/chains/multi_query.py (1)

3-10: The Weave setup pattern is scattered across chains; recommend consolidating
Placing a shared "yield the call ID" generator/helper on BaseRAGChain would make spec changes and exception handling easier.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d651fde and 23458ff.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (28)
  • .env.template
  • app/advanced_rag/chains/base.py
  • app/advanced_rag/chains/hyde.py
  • app/advanced_rag/chains/multi_query.py
  • app/advanced_rag/chains/naive.py
  • app/advanced_rag/chains/rag_fusion.py
  • app/advanced_rag/chains/rerank.py
  • app/advanced_rag/chains/route.py
  • notebooks/day1_2_langchain_and_rag_basics.ipynb
  • notebooks/day3_1_langgraph.ipynb
  • notebooks/day3_8_agent_design_pattern.ipynb
  • pages/day1_3_indexing.py
  • pages/day1_4_rag.py
  • pages/day1_5_naive_agent.py
  • pages/day2_2_advanced_rag.py
  • pages/day2_3_evaluation.py
  • pages/day2_4_advanced_rag_feedback.py
  • pages/day2_5_naive_agent.py
  • pages/day2_6_mcp.py
  • pages/day2_7_custom_mcp.py
  • pages/day2_x_create_dataset.py
  • pages/day3_2_documentation_agent.py
  • pages/day3_3_checkpointer.py
  • pages/day3_4_human_in_the_loop.py
  • pages/day3_5_form.py
  • pages/day3_6_agent_inbox.py
  • pages/day3_7_supervisor.py
  • pyproject.toml
🧰 Additional context used
🧬 Code graph analysis (9)
app/advanced_rag/chains/hyde.py (2)
app/advanced_rag/chains/base.py (5)
  • AnswerToken (12-14)
  • BaseRAGChain (22-27)
  • Context (7-9)
  • WeaveCallId (17-19)
  • stream (24-27)
app/advanced_rag/chains/naive.py (1)
  • stream (34-50)
pages/day2_4_advanced_rag_feedback.py (3)
pages/day2_2_advanced_rag.py (1)
  • app (10-58)
app/advanced_rag/chains/base.py (4)
  • AnswerToken (12-14)
  • Context (7-9)
  • WeaveCallId (17-19)
  • stream (24-27)
app/advanced_rag/factory.py (1)
  • create_rag_chain (27-32)
app/advanced_rag/chains/base.py (5)
app/advanced_rag/chains/hyde.py (1)
  • stream (40-60)
app/advanced_rag/chains/multi_query.py (1)
  • stream (50-76)
app/advanced_rag/chains/naive.py (1)
  • stream (34-50)
app/advanced_rag/chains/rerank.py (1)
  • stream (61-80)
app/advanced_rag/chains/route.py (1)
  • stream (62-89)
app/advanced_rag/chains/route.py (2)
app/advanced_rag/chains/base.py (4)
  • AnswerToken (12-14)
  • Context (7-9)
  • WeaveCallId (17-19)
  • stream (24-27)
app/advanced_rag/chains/naive.py (1)
  • stream (34-50)
app/advanced_rag/chains/naive.py (1)
app/advanced_rag/chains/base.py (3)
  • AnswerToken (12-14)
  • Context (7-9)
  • WeaveCallId (17-19)
app/advanced_rag/chains/rag_fusion.py (1)
app/advanced_rag/chains/base.py (3)
  • AnswerToken (12-14)
  • BaseRAGChain (22-27)
  • Context (7-9)
app/advanced_rag/chains/rerank.py (4)
app/advanced_rag/chains/base.py (5)
  • AnswerToken (12-14)
  • BaseRAGChain (22-27)
  • Context (7-9)
  • WeaveCallId (17-19)
  • stream (24-27)
app/advanced_rag/chains/multi_query.py (1)
  • stream (50-76)
app/advanced_rag/chains/naive.py (1)
  • stream (34-50)
app/advanced_rag/chains/hybrid.py (1)
  • stream (79-104)
app/advanced_rag/chains/multi_query.py (2)
app/advanced_rag/chains/base.py (5)
  • AnswerToken (12-14)
  • BaseRAGChain (22-27)
  • Context (7-9)
  • WeaveCallId (17-19)
  • stream (24-27)
app/advanced_rag/chains/naive.py (1)
  • stream (34-50)
pages/day2_3_evaluation.py (2)
app/advanced_rag/chains/base.py (3)
  • AnswerToken (12-14)
  • BaseRAGChain (22-27)
  • Context (7-9)
app/advanced_rag/factory.py (1)
  • create_rag_chain (27-32)
🔇 Additional comments (19)
notebooks/day3_8_agent_design_pattern.ipynb (8)

10-20: LGTM!

The dotenv load is implemented correctly. With override=True, existing environment variables are updated appropriately.


21-31: Weave initialization looks good.

The Weave tracing initialization is implemented correctly. Please make sure required environment variables such as WANDB_API_KEY are set in the .env file.


72-86: LGTM!

Initializing the LLM via init_chat_model is appropriate. temperature=0.0 yields reproducible output.


100-104: LGTM!

The switch from CLI execution to programmatic Python code is appropriate. It improves readability and ease of debugging.


127-131: LGTM!

Uses the goal variable from the previous cell, which fits the notebook's flow.


147-151: LGTM!

The added type annotation optimized_response: str improves readability.


174-180: LGTM!

The SinglePathPlanGeneration implementation follows a pattern consistent with the other cells.


239-251: LGTM!

The RoleBasedCooperation implementation is appropriate. A trailing empty cell is common in notebooks and not a problem.

pages/day3_7_supervisor.py (1)

2-2: import weave is correct (the official W&B Weave package)

The weave package on PyPI is the official W&B Weave Python distribution. Installation via pip install weave also follows the official documentation's recommendation. The import path and dependency are as expected.

pages/day2_6_mcp.py (1)

5-5: LGTM!

Weave initialization is placed appropriately (right after load_dotenv()) and follows a pattern consistent with the other page files.

Also applies to: 77-79

pages/day2_2_advanced_rag.py (2)

2-2: LGTM!

The Weave initialization pattern is consistent and placed appropriately.

Also applies to: 10-12


42-58: Please verify how WeaveCallId is handled in the stream.

According to the AI-generated summary, RAG chain streams now yield WeaveCallId in addition to Context and AnswerToken. The current code does not handle WeaveCallId explicitly, but since it exists for tracing this appears to be an intentional design.

Please confirm that WeaveCallId is silently ignored.

pages/day3_6_agent_inbox.py (1)

4-6: LGTM!

The load_dotenv and weave imports/initialization were added and follow a pattern consistent with the other page files.

Also applies to: 64-67

pages/day2_7_custom_mcp.py (1)

5-5: LGTM!

Weave initialization is placed appropriately and follows a pattern consistent with pages/day2_6_mcp.py.

Also applies to: 69-71

.env.template (1)

1-4: Adding WANDB_API_KEY is appropriate. WANDB_ENTITY / WANDB_PROJECT are unnecessary.

Since the project name is specified in code via weave.init("training-llm-app"), the WANDB_ENTITY and WANDB_PROJECT environment variables are not required. They are only useful if you want to override the project name via the environment, which the current implementation does not do. WANDB_API_KEY alone is sufficient.

app/advanced_rag/chains/naive.py (1)

33-39: Yielding WeaveCallId at the head of the stream is nicely unified
Matching the other chains' ordering (CallId → Context → Token) keeps the streams easy to handle.

app/advanced_rag/chains/route.py (1)

61-67: Yielding WeaveCallId first is consistent with the other chains
However, the policy for require_current_call() exceptions (fail vs. fall back) should be unified across all chains.

app/advanced_rag/chains/multi_query.py (1)

49-55: The CallId → retrieval → answer stream ordering is consistent
It keeps event handling on the consumer side simple.

app/advanced_rag/chains/hyde.py (1)

3-8: Recommend dismissal: the code implementation and project configuration are in a correct state

Verification results:

  • weave.op officially supports Python generator functions (using yield), automatically detecting and tracing iterator functions
  • weave.require_current_call() is a safe usage pattern when called inside a @weave.op decorator
  • Every entry point in this project (pages/*.py, notebooks/) explicitly calls weave.init("training-llm-app") at app startup, so initialization is guaranteed

The implementation pattern in hyde.py is consistent with route.py, rerank.py, naive.py, and multi_query.py and follows Weave's recommended usage. The implicit-dependency concern is already covered by the implementation pattern.

Comment on lines +17 to 27
class WeaveCallId:
def __init__(self, weave_call_id: str | None):
self.weave_call_id = weave_call_id


class BaseRAGChain(ABC):
@abstractmethod
def stream(self, question: str) -> Generator[Context | AnswerToken, None, None]:
def stream(
self, question: str
) -> Generator[Context | AnswerToken | WeaveCallId, None, None]:
pass

⚠️ Potential issue | 🟠 Major

Extending reduce_fn() to accept WeaveCallId alongside the stream() type extension is safer
Now that BaseRAGChain.stream() is expected to yield WeaveCallId, any path where reduce_fn() consumes the same stream (or the collected chunks) drifts out of sync in types/implementation (lines 24-26 vs line 30). WeaveCallId can safely be ignored during aggregation, so skipping it explicitly prevents accidents.

Proposed fix (aggregate while ignoring WeaveCallId)
-def reduce_fn(chunks: Sequence[Context | AnswerToken]) -> Any:
+def reduce_fn(chunks: Sequence[Context | AnswerToken | WeaveCallId]) -> Any:
     context: Sequence[Document] = []
     answer: str = ""

     for chunk in chunks:
         if isinstance(chunk, Context):
             context = chunk.documents

-        if isinstance(chunk, AnswerToken):
+        elif isinstance(chunk, AnswerToken):
             answer += chunk.token
+        elif isinstance(chunk, WeaveCallId):
+            continue

     return {
         "context": context,
         "answer": answer,
     }

Also applies to: 30-44

🤖 Prompt for AI Agents
In @app/advanced_rag/chains/base.py around lines 17 - 27, The stream() method
was extended to yield WeaveCallId, so update the reduce_fn used when consuming
that stream (e.g., reduce_fn in BaseRAGChain or any implementing class) to
accept WeaveCallId values and explicitly ignore them during aggregation; change
the reducer signature to accept Union[Context, AnswerToken, WeaveCallId] (or add
a parameter typed for WeaveCallId) and add a branch that returns the accumulator
unchanged when the item is a WeaveCallId, leaving normal handling for Context
and AnswerToken intact.

Comment on lines +78 to 80
@weave.op(name="rag_fusion")
def stream(self, question: str) -> Generator[Context | AnswerToken, None, None]:
# 検索クエリを生成する

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find and examine the base.py file
find . -name "base.py" -path "*/advanced_rag/chains/*" | head -5

Repository: GenerativeAgents/training-llm-application-development

Length of output: 130


🏁 Script executed:

# Check the structure of advanced_rag/chains directory
ls -la app/advanced_rag/chains/ 2>/dev/null | head -20

Repository: GenerativeAgents/training-llm-application-development

Length of output: 744


🏁 Script executed:

# Get the BaseRAGChain class definition and stream method
rg -A 10 "class BaseRAGChain" app/advanced_rag/chains/base.py

Repository: GenerativeAgents/training-llm-application-development

Length of output: 390


🏁 Script executed:

# Check what WeaveCallId is and where it's defined
rg "WeaveCallId" app/advanced_rag/chains/

Repository: GenerativeAgents/training-llm-application-development

Length of output: 1836


🏁 Script executed:

# Check the current rag_fusion.py implementation
cat -n app/advanced_rag/chains/rag_fusion.py

Repository: GenerativeAgents/training-llm-application-development

Length of output: 4046


🏁 Script executed:

# Check if WeaveCallId is imported in rag_fusion.py
rg "import.*WeaveCallId|from.*WeaveCallId" app/advanced_rag/chains/rag_fusion.py

Repository: GenerativeAgents/training-llm-application-development

Length of output: 79


🏁 Script executed:

# Check the stream method in rag_fusion.py around line 78-80
sed -n '70,100p' app/advanced_rag/chains/rag_fusion.py | cat -n

Repository: GenerativeAgents/training-llm-application-development

Length of output: 1585


BaseRAGChain.stream's type contract is violated: WeaveCallId is never yielded

The abstract method BaseRAGChain.stream is declared to return Generator[Context | AnswerToken | WeaveCallId, None, None], but RAGFusionRAGChain.stream never yields a WeaveCallId. All other chain implementations (naive, hyde, multi_query, rerank, route) yield a WeaveCallId first; only this chain differs. This also fails type checking and prevents callers from handling every chain uniformly.

Please add the WeaveCallId import and yield it at the start of the stream.

🤖 Prompt for AI Agents
In @app/advanced_rag/chains/rag_fusion.py around lines 78 - 80, The stream
method in RAGFusionRAGChain (decorated as @weave.op name="rag_fusion") violates
BaseRAGChain.stream's generator contract by not yielding a WeaveCallId first;
import WeaveCallId and modify RAGFusionRAGChain.stream to yield a WeaveCallId as
the first yielded value (matching other chains like
naive/hyde/multi_query/rerank/route) before emitting Context or AnswerToken so
the generator type becomes Generator[Context | AnswerToken | WeaveCallId, None,
None] and callers can uniformly handle the initial call id.
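A minimal sketch of that fix, assuming `WeaveCallId` simply wraps weave's current call id. The call-id lookup is stubbed here so the example stands alone; in the real chain it would come from the weave SDK (e.g. `weave.get_current_call()`) inside the `@weave.op`-decorated method:

```python
# Sketch: yield a WeaveCallId first, then Context/AnswerToken, matching the
# Generator[Context | AnswerToken | WeaveCallId, None, None] contract.
from dataclasses import dataclass
from typing import Generator, Union


@dataclass
class WeaveCallId:
    call_id: str


@dataclass
class Context:
    documents: list[str]


@dataclass
class AnswerToken:
    token: str


def _current_call_id() -> str:
    # Stand-in for fetching the active weave call id inside a @weave.op.
    return "call-123"


def stream(question: str) -> Generator[Union[WeaveCallId, Context, AnswerToken], None, None]:
    # Yield the call id first so callers can link the stream to its trace.
    yield WeaveCallId(call_id=_current_call_id())
    yield Context(documents=[f"doc for {question}"])
    for token in ["Hello", " world"]:
        yield AnswerToken(token=token)
```

Because the id arrives before any content, a consumer can capture it with a single `next()` call and then iterate the rest of the stream normally.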

Comment on lines 303 to 318
"# APIキーの先頭3文字が表示されれば、環境変数に設定できています\n",
"print(os.environ[\"WANDB_API_KEY\"][:3])\n",
"assert os.environ[\"WANDB_API_KEY\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Weaveの初期化\n",
"import weave\n",
"\n",
"weave.init(\"training-llm-app\")"
]

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, find the notebook file
find . -name "*day1_2_langchain_and_rag_basics.ipynb" -type f

Repository: GenerativeAgents/training-llm-application-development

Length of output: 146


🏁 Script executed:

# Search for official W&B/Weave documentation on API authentication best practices
# This will help verify if printing the API key prefix is actually discouraged

Repository: GenerativeAgents/training-llm-application-development

Length of output: 79


🌐 Web query:

Weights & Biases official documentation API key authentication best practices notebook validation

💡 Result:

Summary — W&B API key authentication best practices + notebook tips

Key-auth basics

  • W&B uses API keys (personal or service-account) for all API/SDK requests; send as Authorization: Bearer <API_KEY> or configure via the wandb SDK (wandb.login / WANDB_API_KEY). [1][2][3]

Create & rotate

  • Create keys in User Settings (personal) or Team → Service Accounts (service account keys). Full secret is shown only once — store it securely or generate a new key if lost. Rotate and revoke exposed keys immediately. [1][4]

Storage & handling (recommended)

  • Never hard-code or commit keys. Use a secrets manager (AWS Secrets Manager, HashiCorp Vault, Azure Key Vault, Google Secret Manager), a password manager, or OS keychains for local use. Pass keys to processes via environment variables (e.g., WANDB_API_KEY) — but avoid in-line exports that appear in process listings. [1][2]

Service accounts & least privilege

  • Use service-account keys for automation/CI and personal keys for interactive use. Prefer minimal-scope service accounts and team-level service accounts for shared automation; store their keys in your secret manager and limit access. [4][5]

SDK/HTTP usage & verification

  • With SDK: call wandb.login(key=..., verify=True) to verify credentials if needed. For direct HTTP calls use Authorization: Bearer <API_KEY>. Keep SDK up-to-date (newer keys require recent SDK versions). [3][1][6]

Notebook-specific validation & safety

  • Don’t embed secrets in notebooks. Instead:
    • Load keys from environment variables or a secrets manager at runtime (e.g., os.environ["WANDB_API_KEY"] or via a CI-injected secret). [1]
    • Use .env files only locally (and never commit); add to .gitignore. [1]
    • For shared notebooks (Colab / JupyterHub), use platform secrets features (Colab snippets, notebook server secret mounts, or CI artifact tokens) rather than cell text. [1][2]
  • Before committing/notebook-sharing: run a secret scan (git-secrets, detect-secrets, or built-in repo scanners) and remove any accidental keys. Revoke any exposed key immediately and replace it. [1]

Operational security & monitoring

  • Log and monitor API usage; set alerts for unusual activity and enforce team policies for key use and rotation. If a key is compromised, delete it from settings and create a replacement. [1][4]

Quick references (official docs)

  • API keys, creation & storage: W&B docs — Create API key & Store API keys securely. [1]
  • Authentication for APIs / headers and SDK login(): API docs & login() reference. [3][6]
  • Service accounts / team automation: Service accounts guide. [4]

If you want, I can:

  • produce a short notebook-safe code snippet showing secure WANDB login from environment/secret manager,
  • or a checklist for pre-sharing notebook validation and automated secret scanning.

🌐 Web query:

Weave W&B initialization authentication security API key handling

💡 Result:

Summary and secure best-practices for Weave (W&B Weave) initialization and API key handling

  1. How Weave authenticates
  • Weave uses your Weights & Biases account; behind the scenes logging/traces go to a W&B project. You authenticate W&B either with wandb.login(...) or by setting the WANDB_API_KEY environment variable before calling weave.init(...). [1][2][3]
  2. Secure key-handling recommendations (practical)
  • Never hard-code keys in source or notebooks; use environment variables instead (WANDB_API_KEY). [1][3]
  • In interactive notebooks, prompt securely (getpass) so keys aren’t echoed or stored in cell outputs. [3]
  • In headless/CI, store the key in your CI/CD secret store (GitHub Actions Secrets, GitLab CI variables, Travis/Circle secrets) and inject as WANDB_API_KEY at runtime. [1][4]
  • Use service-account (team) API keys for automated systems rather than personal keys; create keys with minimal required scope and a descriptive name so they can be rotated/revoked. [1]
  • Keep keys out of version control (.gitignore .env files) and scan commits for accidental leaks. Rotate keys immediately if exposed. [1]
  • Don’t log or print the full API key anywhere (avoid sending it into telemetry or trace payloads). Use key IDs/metadata for audit logs. [1]
  • For production/cloud deployments, prefer cloud secret managers (AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, HashiCorp Vault) and grant services only necessary IAM roles. Inject secrets into containers/pods via secrets mounts or environment injection at runtime. (W&B supports standard env-var auth.) [1][3]
  3. How to initialize safely in code
  • Example (recommended patterns):
    • Notebook: prompt with getpass and set os.environ["WANDB_API_KEY"] before weave.init(...). [3]
    • Headless/CI: rely on environment variable WANDB_API_KEY set by your secret manager/CI, then call weave.init("team/project") (or use wandb.login(host=..., key=os.getenv("WANDB_API_KEY")) if you need custom host). [1][2][4]
  4. Operational hygiene
  • Use per-service keys, rotate regularly, restrict to team/project where possible, revoke compromised keys, and monitor W&B audit logs for suspicious activity. [1]

References

  • W&B quickstart / API key guidance. [1]
  • Weave (weave.init) reference and behavior. [2]
  • Weave “Intro to Traces” / secure getpass notebook example. [3]
  • Example customer onboarding (env var usage and wandb.login examples). [4]

If you want, I can: (a) show a short secure code snippet for notebook or CI, or (b) draft a checklist for rotating/revoking W&B API keys in your org. Which would you prefer?


🏁 Script executed:

# Read the notebook to check lines 303-318 and 593-607
python3 << 'EOF'
import json

with open('./notebooks/day1_2_langchain_and_rag_basics.ipynb', 'r') as f:
    notebook = json.load(f)

# Get all cells and find content around lines 303-318 and 593-607
# In Jupyter notebooks, "lines" refer to cell content, not file lines
# Let's examine cells systematically

print("=== EXAMINING CELLS FOR WANDB_API_KEY AND WEAVE REFERENCES ===\n")

for i, cell in enumerate(notebook['cells']):
    if cell['cell_type'] == 'code':
        content = ''.join(cell['source'])
        if 'WANDB_API_KEY' in content or 'weave' in content:
            print(f"\n--- CELL {i} ---")
            print(content)
            print()

EOF

Repository: GenerativeAgents/training-llm-application-development

Length of output: 839


Do not display any part of WANDB_API_KEY in notebook output

The line in Cell 19, print(os.environ["WANDB_API_KEY"][:3]), violates security best practices. Weights & Biases' official guidance recommends never including API keys in logs or print output, and the saved execution output of a notebook shared as training material should not contain this information.

Change it to a presence check only, as in:

Suggested fix
import os
assert os.environ.get("WANDB_API_KEY"), "WANDB_API_KEY is not set"

or

import os
if not os.environ.get("WANDB_API_KEY"):
    raise ValueError("WANDB_API_KEY environment variable is required")
🤖 Prompt for AI Agents
In @notebooks/day1_2_langchain_and_rag_basics.ipynb around lines 303 - 318,
Remove the insecure print of the API key prefix (the call
print(os.environ["WANDB_API_KEY"][:3])) and replace the existing assert
os.environ["WANDB_API_KEY"] with a safe presence check using
os.environ.get("WANDB_API_KEY") that raises a clear error (or ValueError) if
missing; ensure only existence is validated before calling
weave.init("training-llm-app") so no part of the WANDB_API_KEY is ever printed
or logged.

Comment on lines +28 to +38
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import weave\n",
"\n",
"weave.init(\"training-llm-app\")"
]
},

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

weave.init() Jupyter notebook best practices multiple calls idempotent

💡 Result:

Short answer: call weave.init() once per Jupyter kernel session (at the top of your notebook). It's safe to call again, but re-calling re-initializes global tracking (so prefer a single init, or call finish() before re-init). To avoid accidental re-inits, check get_client() and only call init if it returns None.

Recommended pattern (concise):

  • Put initialization in the first cell: import weave; if weave.get_client() is None: weave.init("my-project").
  • If you need to stop logging: weave.finish() and then you may call weave.init(...) again.
  • Avoid multiple differing inits in the same session (they change global tracking settings / project target).

References:

  • Weave API reference (init, get_client, finish). [1]
  • Weave quick-start / README (example init usage). [2]

Sources:
[1] Weights & Biases Weave API docs (weave.init, get_client, finish). (docs.wandb.ai)
[2] Weave (wandb) README / quick start. (github.com)

If you want, I can paste a one-cell snippet you can copy into your notebook.

Citations:


🌐 Web query:

weave.init Weights & Biases documentation initialization

💡 Result:

weave.init initializes Weave (W&B Weave) tracking and connects it to a Weights & Biases project.

Key points

  • Signature (Python): weave.init(project_name: str, *, settings: dict|UserSettings|None = None, autopatch_settings: None|AutopatchSettings = None, global_postprocess_inputs: callable|None = None, global_postprocess_output: callable|None = None, global_attributes: dict|None = None) -> WeaveClient. It returns a WeaveClient and sets up global logging so you don’t need to keep the return value. [1]

  • Minimal example:
    import weave
    weave.init("my-project-name") # uses your default W&B entity if no team specified. [2]

  • Common options:

    • settings: client configuration (e.g., implicitly_patch_integrations). [1][4]
    • global_attributes: metadata applied to all traces. [5]
    • autopatch_settings: deprecated — prefer explicit integration patching. [1][4]
    • global_postprocess_inputs / global_postprocess_output: functions to transform inputs/outputs globally. [1]
  • Integrations/patching:

    • By default Weave implicitly patches supported integrations (can be disabled via settings or env var WEAVE_IMPLICITLY_PATCH_INTEGRATIONS=false). If disabled, call weave.integrations.patch_openai(), etc., to enable tracing. [4]
  • Lifecycle:

    • weave.finish() stops logging; call weave.init(...) again to resume. [1]

Sources
[1] Weave Python SDK reference — weave.init.
[2] Weave quickstart / GitHub README (example usage).
[3] Weave integrations / implicit patching (settings, autopatch deprecation, explicit patch functions).


🌐 Web query:

Jupyter notebook cell execution weave.init repeated calls safe

💡 Result:

Short answer: Yes — calling weave.init() multiple times in the same Jupyter kernel is safe, but be aware of side effects: each init creates/returns a Weave client and (re)enables automatic tracing/logging. If you want to stop logging before re-initializing, call weave.finish() first to avoid duplicate/overlapping logging. Sources: Weave README and SDK docs. [1][2]

Sources

  • Weave README / quickstart (shows import + weave.init usage). [1]
  • Weave Python SDK docs (weave.finish() stops logging; weave.init returns a client you can reuse). [2]

Use a guarded weave.init() call in the first cell

In a workflow where notebook cells may be executed multiple times, duplicate initialization via weave.init() needs to be prevented. The officially recommended pattern is to check whether a client already exists via weave.get_client(), rather than using a flag variable:

import weave

if weave.get_client() is None:
    weave.init("training-llm-app")

With this guard, initialization happens safely exactly once no matter how many times the cell is run. If needed, you can also call weave.finish() and then re-initialize.

🤖 Prompt for AI Agents
In @notebooks/day3_1_langgraph.ipynb around lines 28 - 38, Replace the
unconditional call to weave.init("training-llm-app") with a conditional check
using weave.get_client() so initialization only happens when no client exists:
call weave.init("training-llm-app") only if weave.get_client() returns None;
reference weave.finish() if you need to support teardown-and-reinit scenarios.

Comment on lines +203 to +216
"from app.agent_design_pattern.common.reflection_manager import (\n",
" ReflectionManager,\n",
" TaskReflector,\n",
")\n",
"from app.agent_design_pattern.self_reflection.main import ReflectiveAgent\n",
"\n",
"reflection_manager = ReflectionManager(file_path=\"tmp/self_reflection_db.json\")\n",
"task_reflector = TaskReflector(llm=llm, reflection_manager=reflection_manager)\n",
"agent = ReflectiveAgent(\n",
" llm=llm, reflection_manager=reflection_manager, task_reflector=task_reflector\n",
")\n",
"\n",
"result = agent.run(query=\"カレーライスの作り方\")\n",
"print(result)"

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "*.py" | xargs rg -l "class ReflectionManager" | head -5

Repository: GenerativeAgents/training-llm-application-development

Length of output: 152


🏁 Script executed:

cat -n ./app/agent_design_pattern/common/reflection_manager.py

Repository: GenerativeAgents/training-llm-application-development

Length of output: 6347


The directory-creation step is missing.

When ReflectionManager's save_reflection() method (line 72) writes the file, a FileNotFoundError occurs if the tmp/ directory does not exist. The current implementation never creates the directory, so either add pathlib.Path(self.file_path).parent.mkdir(parents=True, exist_ok=True) in the __init__ method, or create the tmp/ directory manually before running.

🤖 Prompt for AI Agents
In @notebooks/day3_8_agent_design_pattern.ipynb around lines 203 - 216,
ReflectionManager currently may attempt to write to a file under a missing
directory in save_reflection; update ReflectionManager.__init__ to ensure the
parent directory of self.file_path is created before any writes (e.g., use
pathlib to create the parent with parents=True and exist_ok=True), or
alternatively check/create the parent directory at the start of save_reflection
so FileNotFoundError cannot occur.
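A minimal sketch of the `__init__` fix, under the assumption that `ReflectionManager` stores its JSON path as `self.file_path`; the `reflections` dict and the body of `save_reflection` are simplified stand-ins for the real class:

```python
# Sketch: ensure the parent directory (e.g. tmp/) exists before any write,
# so save_reflection() cannot raise FileNotFoundError on a fresh checkout.
import json
from pathlib import Path


class ReflectionManager:
    def __init__(self, file_path: str) -> None:
        self.file_path = file_path
        # Create tmp/ (and any missing parents) up front; no-op if it exists.
        Path(self.file_path).parent.mkdir(parents=True, exist_ok=True)
        self.reflections: dict[str, str] = {}

    def save_reflection(self) -> None:
        # Safe now even when the tmp/ directory did not exist beforehand.
        with open(self.file_path, "w", encoding="utf-8") as f:
            json.dump(self.reflections, f, ensure_ascii=False)
```

Doing the `mkdir` once in `__init__` keeps `save_reflection()` free of filesystem checks on every call.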

Comment on lines 1 to 4
import streamlit as st
import weave
from dotenv import load_dotenv
from langchain.chat_models import init_chat_model

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cd / && find . -name "day3_2_documentation_agent.py" -type f 2>/dev/null | head -5

Repository: GenerativeAgents/training-llm-application-development

Length of output: 152


🏁 Script executed:

git ls-files | grep -i "day3_2" | head -10

Repository: GenerativeAgents/training-llm-application-development

Length of output: 132


🏁 Script executed:

git ls-files | grep -E "pages/.*\.py$" | head -20

Repository: GenerativeAgents/training-llm-application-development

Length of output: 783


🏁 Script executed:

cat -n pages/day3_2_documentation_agent.py | head -30

Repository: GenerativeAgents/training-llm-application-development

Length of output: 1119


🌐 Web query:

Weave library init() idempotency Streamlit best practices

💡 Result:

Short answer

  • weave.init() initializes Weave globally (sets up the global client and autopatching of integrations) and is intended to be called once per process. [1][2]
  • In Streamlit (which reruns your script on every interaction) call weave.init exactly once by placing it at top-level and/or protecting it with Streamlit’s resource-singleton caching (st.cache_resource). That avoids repeated init/patching errors and double-patching of integrations. [2][3]

Recommended pattern (minimal)

  • Put this near the top of your script (or in a module imported once):

    import streamlit as st
    import weave

    @st.cache_resource
    def init_weave():
        return weave.init("my-team/my-project", autopatch_settings={...})

    client = init_weave()

Notes / tips

  • Use weave.init(..., autopatch_settings=...) or settings to disable/limit implicit patching if you want explicit control over which LLM libs are instrumented. [2]
  • If you need to toggle tracing at runtime, initialize once with the desired settings (re-initializing with different settings can cause HTTP/patching errors — see community reports). If you must change settings, restart the process. [1][4]
  • st.cache_resource replaces the old experimental_singleton; it ensures the init call runs once per server process even across Streamlit reruns. [3]

Sources

  • Weave (W&B) Python API / init docs. [1]
  • Weave integrations / implicit patching guidance. [2]
  • Streamlit caching / st.cache_resource documentation and examples. [3]
  • Community reports about errors when re-initializing/attempting to disable tracing at runtime. [4]

🏁 Script executed:

# Verify if there are any other Streamlit pages that might have similar patterns
rg "weave\.init" pages/ --type py -A 2 -B 2

Repository: GenerativeAgents/training-llm-application-development

Length of output: 4518


🏁 Script executed:

# Check if st.cache_resource is already used elsewhere in the project
rg "@st\.cache_resource" . --type py -l

Repository: GenerativeAgents/training-llm-application-development

Length of output: 79


weave.init()@st.cache_resource でラップし、Streamlit rerun での多重初期化を防ぐ必要があります

現在のコード(app() 関数内で weave.init("training-llm-app") を呼び出し)は、Streamlit の rerun トリガーごとに weave.init() を複数回実行し、HTTP/patching エラーが発生します。weave.init() はプロセス当たり 1 回限定の非べき等操作です。

以下のパターンで修正してください:

@st.cache_resource
def init_weave():
    load_dotenv(override=True)
    return weave.init("training-llm-app")

def app() -> None:
    init_weave()
    # ... rest of app

または load_dotenv() はモジュールレベルで 1 回実行し、weave.init だけをキャッシュすることもできます。

🤖 Prompt for AI Agents
In @pages/day3_2_documentation_agent.py around lines 1 - 4, Wrap the
non-idempotent weave.init call with Streamlit caching: create a cached
initializer (e.g., add a function named init_weave() decorated with
@st.cache_resource that calls load_dotenv(override=True) if desired and returns
weave.init("training-llm-app")), then call init_weave() from app() instead of
calling weave.init() directly; alternatively move load_dotenv() to module scope
and only cache weave.init() via the @st.cache_resource-decorated init_weave to
prevent multiple weave.init invocations across Streamlit reruns.

Comment on lines 4 to 7
import streamlit as st
import weave
from dotenv import load_dotenv
from langchain.chat_models import init_chat_model

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# ファイルの実際の内容を確認
if [ -f "pages/day3_3_checkpointer.py" ]; then
  echo "=== Lines 1-20 ===" && head -20 pages/day3_3_checkpointer.py && echo "" && echo "=== Lines 100-115 ===" && sed -n '100,115p' pages/day3_3_checkpointer.py
else
  find . -name "*checkpointer*" -type f
fi

Repository: GenerativeAgents/training-llm-application-development

Length of output: 1300


🌐 Web query:

Weave data masking redaction PII filtering disable tracing options

💡 Result:

Summary of Weave options for data masking / PII redaction and disabling tracing

  1. Built‑in PII redaction (Presidio integration)
  • Weave Python SDK supports a Sensitive Data Protection feature that uses Microsoft Presidio. Enable with weave.init(..., settings={"redact_pii": True}); optionally set redact_pii_fields to control which entity types are redacted. Requires installing presidio-analyzer/presidio-anonymizer. [1]
  2. Custom redaction / sanitize keys
  • You can add custom keys to redact (API keys, tokens, headers, internal IDs) via the sanitize helpers (sanitize.add_redact_key). Weave also redacts common keys by default (e.g., "api_key", "auth_headers", "authorization"). You can also post‑process inputs/outputs when autopatching integrations to perform custom redaction. [1][2]
  3. Regex / local methods
  • The docs show examples of using regex or libraries like Presidio/Faker locally before data is sent to traces to mask or replace PII. Use these for custom patterns or anonymization workflows. [3]
  4. Disable tracing / control instrumentation
  • Globally disable tracing via environment var WEAVE_DISABLED=true.
  • Disable per-client via weave.init(..., settings={"disabled": True}).
  • Temporarily disable within code using the context manager set_tracing_enabled(False) from weave.trace.context.call_context.
  • For autopatching (automatic LLM library instrumentation) disable implicit autopatching with implicitly_patch_integrations=False and then selectively patch integrations; you can supply postprocess hooks to redact inputs/outputs for specific integrations. [2][4]

Quick example snippets

  • Enable Presidio redaction:
    weave.init("my-project", settings={"redact_pii": True})
    (install presidio-analyzer/presidio-anonymizer first). [1]

  • Add custom redact key:
    from weave.utils import sanitize
    sanitize.add_redact_key("token") . [1]

  • Disable tracing globally:
    export WEAVE_DISABLED=true (or use weave.init(..., settings={"disabled": True}) or with set_tracing_enabled(False)). [2]

Sources
[1] Weave — Redacting PII / Sensitive Data Protection (Weave docs). (weave-docs.wandb.ai)
[2] Weave — Tracing Basics / autopatching and disabling tracing (Weave docs). (weave-docs.wandb.ai)
[3] Weave — Handling and Redacting PII (examples: regex, Presidio). (weave-docs.wandb.ai)
[4] Weave — Tracing FAQs (disable tracing examples & postprocess hooks). (docs.wandb.ai)

If you want, tell me which language (Python or TypeScript), which integrations you use (OpenAI/Anthropic/etc.), and whether you run on‑prem or cloud—I can give a concrete config snippet for your setup.

Citations:


🌐 Web query:

Streamlit multi-user session isolation global state initialization

💡 Result:

Short answer + best practices:

  • st.session_state is per-user-session and is the correct place to keep per-user state; initialize with a guard like:
    if "foo" not in st.session_state: st.session_state["foo"] = initial_value. [1]

  • Never reassign st.session_state (e.g. st.session_state = {}) — that can break session isolation for other users. Use st.session_state.clear() or delete individual keys instead. (Community bug/FAQ and examples). [5]

  • Global singletons (st.experimental_singleton / module-level globals) are shared across all users and must be thread-safe; don’t use them to store per-user data. If you need shared resources (DB engine, model), keep only thread-safe connectors or factories in singletons and create per-session objects from them. [4][3]

  • For DB sessions / per-user connections: create a new session per Streamlit session and store it in st.session_state (or use st.session_state.setdefault("db", make_session())). This avoids cross-session transaction conflicts. [1][6]

  • Beware custom threads: Streamlit runs a script thread per session; custom threads cannot access session context safely. Use st.session_state and Streamlit-provided patterns for concurrency or pass results back into the session via thread-safe queues and st.experimental_rerun patterns. [3]

  • Use keys for widgets and multipage patterns; when initializing complex state, prefer setdefault or the pattern:
    if "initialized" not in st.session_state:
        do_init()
        st.session_state.initialized = True
    to run one-time per-session initialization. [1][2]

Key references:

  1. Streamlit Session State docs. [1]
  2. Widget / multipage recommendations (keys, placeholders). [2]
  3. Streamlit threading / script threads guidance. [3]
  4. Singleton/shared-object guidance (discussion & cache behavior). [4]
  5. Community note about reassigning session_state vs .clear(). [5]
  6. StackOverflow / examples for SQLAlchemy per-session storage in st.session_state. [6]

If you want, I can turn this into a minimal code template that implements safe per-session initialization, DB-session-in-st.session_state, and a thread-safe shared singleton for a model/engine.


Operationally protect sensitive checkpoint/tool output from tracing (disable switch + redaction)

This page handles tool execution results and checkpoint contents (which may contain sensitive data), so trace submission to Weave needs to be controllable. weave.init() is called inside the Streamlit app() function, but it actually initializes module-level state, so in a multi-user environment it is shared.

The following measures are recommended:

  • Allow tracing to be disabled via the WEAVE_DISABLED=true environment variable or settings
  • Enable Weave's Presidio PII redaction (settings={"redact_pii": True})
  • Add error handling for init() failures
Suggested implementation
+import os
 from typing import Annotated, Any
 from uuid import uuid4
@@
 def app() -> None:
     load_dotenv(override=True)
-    weave.init("training-llm-app")
+    # Check the Weave disable flag
+    if os.getenv("WEAVE_DISABLED", "").lower() not in {"1", "true", "yes"}:
+        try:
+            weave.init(
+                os.getenv("WEAVE_PROJECT", "training-llm-app"),
+                settings={
+                    "redact_pii": True,  # automatic PII masking via Presidio
+                }
+            )
+        except Exception as e:
+            st.warning(f"Weave initialization failed (tracing disabled): {e}")
🤖 Prompt for AI Agents
In @pages/day3_3_checkpointer.py around lines 4 - 7, The module currently
initializes Weave in a way that can leak sensitive tool/checkpoint outputs
across users; change initialization to respect an env var flag and enable PII
redaction: read WEAVE_DISABLED (or a config) and skip calling weave.init() if
set true; when calling weave.init() from the Streamlit app() or a guarded init
function, pass settings={"redact_pii": True} to enable Presidio redaction; wrap
the weave.init(...) call in a try/except to catch and log initialization errors
(do not crash the app) and fall back to disabled mode so tracing is not sent.

Comment on lines 4 to 8
import streamlit as st
import weave
from dotenv import load_dotenv
from langchain.chat_models import init_chat_model
from langchain_core.prompts import ChatPromptTemplate

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check the actual code in the file
cat -n pages/day3_5_form.py | head -80

Repository: GenerativeAgents/training-llm-application-development

Length of output: 2959


🌐 Web query:

weave.init idempotent Streamlit multiple calls

💡 Result:

Streamlit reruns your script multiple times, so call weave.init only once (guard it). Simple patterns:

  1. Use st.session_state guard (recommended)

    import streamlit as st
    import weave

    if "weave_inited" not in st.session_state:
        weave.init("my-project")
        st.session_state["weave_inited"] = True

  2. Or use a module-level boolean so repeated imports/reruns don't re-init:

    import weave
    _WEAVE_INITED = globals().get("_WEAVE_INITED", False)
    if not _WEAVE_INITED:
        weave.init("my-project")
        globals()["_WEAVE_INITED"] = True

Either approach makes weave.init idempotent across Streamlit reruns. Streamlit’s rerun behavior is documented and common (so guarding init is necessary). [1][2]

Sources:

  • Weave / W&B Weave usage examples showing weave.init usage in Streamlit apps. [1]
  • Streamlit docs / discussion explaining script reruns and why you must guard one-time initialization. [2]

🌐 Web query:

Weave Python W&B initialization best practices idempotency

💡 Result:

Here are concise best practices for making Weave (W&B) initialization in Python idempotent and safe for production code.

Key facts

  • weave.init() initializes logging globally; you don’t need to hold the return value. Repeated calls can change global settings. [1][2][3]

Best practices

  1. Initialize exactly once per process — call weave.init(...) early and centrally (e.g., app entrypoint or inside if name == "main"). [1][2]
  2. Guard repeated init calls with a simple idempotency check (module-level flag or try/except) so repeated imports or re-execution won’t reinitialize.
  3. For multi-process / forked environments, initialize after process spawn (in worker entry) to avoid sharing sockets/clients. Use if name == "main" or worker-level init.
  4. Pass explicit project/settings (or settings={"disabled": True}) rather than relying on implicit defaults so repeated inits don’t change behavior unexpectedly. [1][2]
  5. Keep W&B API key and environment configuration external (WANDB_API_KEY / wandb.login) and avoid re-auth in repeated inits. [2][3]
  6. Use weave’s global_postprocess_* / autopatch_settings or explicit op patching rather than ad-hoc re-init changes to alter behavior at runtime. [1][2]

Minimal idempotent pattern (example)

  • Central init module (safe to import multiple times):

    import weave

    _WEAVE_INITIALIZED = False

    def init_weave(project="myproj", settings=None, **kwargs):
        global _WEAVE_INITIALIZED
        if _WEAVE_INITIALIZED:
            return
        weave.init(project, settings=settings, **kwargs)
        _WEAVE_INITIALIZED = True

Notes / references

  • weave.init is global and used to route weave.op traces to a W&B project. Initialize centrally and avoid repeated inits. [1][2]
  • Weave quickstarts / examples show initializing once per script and decorating functions with @weave.op for tracing. [2][3]
  • Official Weave GitHub and docs contain examples and more details on settings and integrations. [3]

References
[1] Weave Python SDK docs (weave.init signature and global behavior).
[2] Weave + W&B guide: using Weave with W&B training runs / quickstart.
[3] wandb/weave GitHub repo and quickstart examples.


Fix weave.init() being called repeatedly on Streamlit reruns

The current code (line 77) calls weave.init("training-llm-app") inside the app() function, but Streamlit re-executes the entire script on every user interaction, so this initialization runs again and again. Calling Weave's init() multiple times can cause unexpected behavior or configuration changes, so initialization must be limited to a single call. In addition, this code sends the contents of the input form (the user's question) to an external trace, which carries a risk of leaking PII or internal confidential information, so adding an environment-variable opt-out and error handling is advisable.

Suggested fix
 import sqlite3
 from uuid import uuid4

 import streamlit as st
 import weave
 from dotenv import load_dotenv
@@
 def app() -> None:
     load_dotenv(override=True)
-    weave.init("training-llm-app")
+    @st.cache_resource
+    def _init_weave_once() -> None:
+        try:
+            weave.init("training-llm-app")
+        except Exception as e:
+            st.warning(f"Weave initialization failed (tracing disabled): {e}")
+
+    _init_weave_once()
🤖 Prompt for AI Agents
In @pages/day3_5_form.py around lines 4 - 8, The code currently calls
weave.init("training-llm-app") inside app(), causing repeated initialization on
every Streamlit rerun; move the initialization out of app() and perform it
exactly once by guarding at module load or via a one-time flag (e.g.,
st.session_state["weave_initialized"]) before calling weave.init; additionally
honor an environment variable (e.g., WEAVE_INIT_ENABLED) to disable
initialization when set to "false" and wrap the call in try/except to log or
handle errors so failures don’t crash the app or leak PII—update references to
weave.init, the app() function, and use st.session_state (or a module-level
boolean) plus os.environ to implement these changes.

Comment on lines 112 to 114
    load_dotenv(override=True)
    weave.init("training-llm-app")


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Weave weave.init() idempotent multiple calls safe

💡 Result:

Short answer: Yes — calling weave.init() multiple times is safe, but not a no-op: it (re)initializes global tracking and returns a WeaveClient. Subsequent calls will replace the global logging configuration (project name, settings, autopatch/global hooks). If you just want to check whether tracking is active, use weave.get_client() (or keep the returned client). See Weave docs for init/finish behavior. [1][2]

Sources:

  • Weave init / global logging (init, finish, get_client). [1]
  • Weave tracing / automatic LLM tracking (weave.init usage). [2]

References:
[1] Weave Python SDK reference — init / finish / get_client.
[2] Weave tracing basics / automatic tracking (weave.init example).


🌐 Web query:

Streamlit cache_resource external library initialization Weave W&B

💡 Result:

Short answer — use @st.cache_resource to create a shared, singleton resource that initializes the external library once per server process, ensure the initializer is idempotent/thread-safe, and handle unhashable args / reinitialization with Streamlit features (validate, ttl, clear). Example + notes below.

Example (W&B or Weave client):

  • Use a function that calls the library init and returns the client, and decorate it with st.cache_resource.
  • If the init function takes non-hashable objects, prefix those arg names with an underscore or omit them so Streamlit can cache correctly.
  • Use validate to re-init if the cached object becomes invalid, or ttl to auto-expire; call func.clear() or st.cache_resource.clear() to force reinit.

Example code (conceptual):
import streamlit as st
import wandb # or weave client

@st.cache_resource(ttl=3600, validate=lambda c: c.is_alive())
def get_wandb(run_id: str, _config=None):
    # make init idempotent
    wandb.init(id=run_id, reinit=True, config=_config)
    return wandb

wb = get_wandb("my-run", _config={"lr":0.01})

Key caveats

  • Cached objects are shared across users/sessions and must be thread-safe. Use session_state if you need per-session resources. [1]
  • Async objects aren’t officially supported; caching async initializers can cause event-loop errors. [1]
  • Streamlit hashes function source + hashable args; use underscore-prefixed args for unhashable inputs, or provide hash_funcs. [1][2]
  • W&B/wrap libraries that perform global side effects (network auth, global state) should be initialized with idempotent flags (e.g., wandb.init(reinit=True)) or guarded to avoid double-init errors; use validate/clear/ttl to recover. (Apply same pattern to Weave.)

Sources

  • Streamlit st.cache_resource docs (behavior, ttl, validate, hash rules, thread-safety, clearing). [1]
  • Streamlit blog on cache_resource vs cache_data and recommended usage. [2]

References
[1] Streamlit st.cache_resource docs. [2] Streamlit blog: introducing st.cache_resource / st.cache_data.


🏁 Script executed:

cat -n pages/day3_7_supervisor.py | sed -n '105,125p'

Repository: GenerativeAgents/training-llm-application-development

Length of output: 835


Guard weave.init() with @st.cache_resource and continue gracefully when initialization fails

The weave.init() on line 113 is called on every Streamlit re-run; it is not a no-op and reinitializes the global configuration each time. To avoid this, the safe approach is to use Streamlit's recommended pattern, @st.cache_resource, so that initialization happens only once per server process, and to show a warning and continue when initialization fails (st.session_state is not shared across multiple users, so @st.cache_resource is the appropriate choice for a shared global resource like this).

Recommended implementation
+@st.cache_resource
+def init_weave():
+    try:
+        weave.init("training-llm-app")
+        return weave.get_client()
+    except Exception as e:
+        st.warning(f"Weave の初期化に失敗しました(トレーシング無効): {e}")
+        return None
+
 def app() -> None:
     load_dotenv(override=True)
-    weave.init("training-llm-app")
+    init_weave()
🤖 Prompt for AI Agents
In @pages/day3_7_supervisor.py around lines 112 - 114, Wrap the global Weave
initialization in a Streamlit cached resource so it runs once per server process
and fails gracefully: replace the direct call to weave.init("training-llm-app")
with a helper function (e.g., get_weave_client or init_weave) decorated with
@st.cache_resource that calls load_dotenv(override=True) and weave.init(...),
catches exceptions, logs a warning to Streamlit (or sets a flag in
st.session_state) and returns None or a safe noop client on failure; then call
that helper from the app instead of calling weave.init directly so re-runs don’t
reinitialize global state and errors don’t crash the app.

Comment on lines +33 to 35
    "wandb>=0.23.1",
    "weave>=0.52.22",
]

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

weave library installation documentation dependency pinning recommendations

💡 Result:

Summary and recommendations for installing/pinning dependencies for the Weave (Weights & Biases) Python library

Key references

  • Weave GitHub README (prereqs & install). [1]
  • Weave docs / example notebooks showing common companion packages (openai, wandb, pandas, torch, faiss, etc.). [2]
  • PyPI/piwheels package listings (recent Weave releases & supported Python versions). [3]

Concise recommendations

  1. Python and isolation
    • Use Python 3.10+ in a fresh virtualenv or venv (Weave requires Python ≥3.10). [1][3]
  2. Pin Weave itself
    • Pin the Weave wheel to a specific released version (e.g., weave==0.52.22) in your requirements.txt or constraints file to avoid unintentional upgrades. Check the exact latest version on PyPI/GitHub before pinning. [3][1]
  3. Pin direct companion libs used by your project
    • Explicitly pin packages you use with Weave (examples commonly used in Weave examples: wandb, openai, pandas, torch, faiss-gpu, streamlit, litellm, pydantic). Match versions that are known-compatible in your test matrix. Do not rely on implicit transitive pins. [2]
  4. Use a constraints file for transitive dependencies
    • Generate a pip constraints file (e.g., pip-compile, pip freeze --all in a known-good env, or use Poetry/Poetry lock) and install with --constraint to ensure reproducible transitive dependency resolution across environments.
  5. Test matrix
    • Run CI that installs from your pinned requirements + constraints and executes smoke tests (import weave, weave.init, and the code paths you rely on) so you catch incompatible transitive updates early.
  6. Optional: loosen vs. strict pins
    • For libraries where you accept minor upgrades, use ~= (e.g., pandas~=2.1) or >= with an upper bound (e.g., torch>=2.1,<3.0). For weave itself and critical companions, prefer exact pins in production.
  7. Reproducible build artifact
    • Produce a lockfile or a wheelhouse for deployments (private index or wheel cache) and document the pinned set and the date/version used.
  8. Keep an upgrade cadence
    • Schedule regular upgrades (e.g., monthly or quarterly) where you update pins in a branch, run full tests, and then promote.

If you want, I can:

  • create a starter requirements.txt + constraints example for a typical Weave + wandb + openai project, or
  • look up the current latest Weave release and produce concrete pins (requires confirming live versions).

🌐 Web query:

wandb weave compatibility version requirements

💡 Result:

Short answer: follow W&B docs — use wandb >= 0.22.3 and weave >= 0.52.17 for current SDK compatibility (notably required for new API key lengths); see W&B docs and SDK release notes. [1][2]

Sources:

  • W&B docs (SDK version compatibility mentioning wandb v0.22.3+ and weave v0.52.17+). [1]
  • W&B SDK / release notes referencing weave integration and related compatibility details. [2]

Only wandb and weave use >=, which hurts reproducibility. Please unify the dependency pinning policy.

While the other dependencies are mostly pinned strictly (==), these two alone use >=, so behavior can easily vary between environments. The official documentation recommends exact version pins (==) for weave itself and its companion libraries (such as wandb) when using weave in production.

Please unify the policy using one of the following options:

  • Recommended: pin both packages strictly with == (e.g., wandb==0.23.1 and weave==0.52.22)
  • Alternative: commit the uv.lock file to reproducibly freeze all dependencies
  • Alternative: use upper-bounded ranges (e.g., wandb>=0.23.1,<0.24 and weave>=0.52.22,<0.53)
Suggestion
     "wandb>=0.23.1",
     "weave>=0.52.22",
+    "wandb==0.23.1",
+    "weave==0.52.22",
🤖 Prompt for AI Agents
In @pyproject.toml around lines 33 - 35, The dependencies wandb and weave are
using open-ended >= ranges which harms reproducibility; update the
pyproject.toml entries for "wandb" and "weave" to use strict pins (recommended:
wandb==0.23.1 and weave==0.52.22) or, if you prefer an alternative, either
commit the lockfile (uv.lock) to fully freeze transitive deps or use an
upper-bounded range (e.g., wandb>=0.23.1,<0.24 and weave>=0.52.22,<0.53); modify
the two lines referencing wandb and weave accordingly.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pages/day2_2_advanced_rag.py (1)

44-60: The WeaveCallId import and handling are missing

Unlike pages/day2_4_advanced_rag_feedback.py, this code does not import WeaveCallId and does not handle it explicitly in the stream processing. All chain implementations yield a WeaveCallId first, but the current code skips it implicitly via the isinstance checks.

Following the day2_4 implementation, either import and handle WeaveCallId explicitly, or add a comment explaining why it is not handled.

if isinstance(chunk, WeaveCallId):
    weave_call_id = chunk.weave_call_id
    # weave_call_idを使用して処理を行う
app/advanced_rag/chains/hybrid.py (1)

69-70: This is inconsistent with the other chain modules.

Other chain modules such as naive.py have migrated to the @weave.op decorator and yield WeaveCallId, but hybrid.py still uses @traceable.

  • @traceable(name="hybrid", reduce_fn=reduce_fn) → @weave.op(name="hybrid")
  • Add WeaveCallId to the return type
  • Use weave.require_current_call() to yield a WeaveCallId first
♻️ Suggested fix
-from langsmith import traceable
+import weave

-from app.advanced_rag.chains.base import AnswerToken, BaseRAGChain, Context, reduce_fn
+from app.advanced_rag.chains.base import AnswerToken, BaseRAGChain, Context, WeaveCallId

-    @traceable(name="hybrid", reduce_fn=reduce_fn)
-    def stream(self, question: str) -> Generator[Context | AnswerToken, None, None]:
+    @weave.op(name="hybrid")
+    def stream(self, question: str) -> Generator[Context | AnswerToken | WeaveCallId, None, None]:
+        current_call = weave.require_current_call()
+        yield WeaveCallId(weave_call_id=current_call.id)
+
         # 並列で検索する準備
🤖 Fix all issues with AI agents
In `@evals/day3_documentation_agent/predictor.py`:
- Around line 35-36: The ChatOpenAI instantiation in predictor.py is passing an
invalid reasoning_effort value ("none"); remove the reasoning_effort argument or
set it to one of the supported values ("minimal", "low", "medium", "high") only
when using a reasoning-capable model. Update the code around ChatOpenAI(...)
(the line creating llm) to either omit reasoning_effort entirely for
non-reasoning models like "gpt-4.1" or conditionally set reasoning_effort to a
valid string when self.model_name identifies a reasoning model (e.g., o1/o3),
ensuring DocumentationAgent(...) still receives the llm and k as before.

In `@evals/day3_documentation_agent/scorers.py`:
- Around line 127-148: The score method is async but calls
llm_with_structure.invoke(prompt_value) synchronously, blocking asynchronous
processing; fix this by replacing the synchronous call inside score with an
awaitable one (recommended: await llm_with_structure.ainvoke(prompt_value)), or
alternatively move the synchronous execution to a separate thread with await
asyncio.to_thread(lambda: llm_with_structure.invoke(prompt_value)), importing
asyncio if needed and keeping the caller's variable names (llm_with_structure,
prompt_value, score) as-is in the replacement.

In `@pages/day2_2_advanced_rag.py`:
- Line 14: weave.init(os.getenv("WEAVE_PROJECT_NAME")) is being called inside
app(), causing it to run on every Streamlit re-run; move the initialization out
of the app() function to module-level (or a one-time guarded initializer) so it
executes only once on import/startup; locate the weave.init call in the file and
relocate it so app() no longer calls weave.init (or wrap it with a guard like an
already_inited flag) to match the pattern used across other pages.

In `@pages/day2_3_evaluation.py`:
- Around line 54-56: Rename the misspelled variable ground_truch_source to
ground_truth_source in the assignment (currently set to context) so it matches
usage elsewhere; update any subsequent references to ground_truch_source to the
corrected ground_truth_source identifier, keeping the surrounding lines
(output_context and search_result_sources assignments) unchanged.

In `@pages/day2_5_naive_agent.py`:
- Line 86: The call to weave.init(os.getenv("WEAVE_PROJECT_NAME")) inside app()
causes repeated initialization on each Streamlit rerun; wrap this so it only
runs once (e.g., check a guard before calling weave.init — use a module-level
flag like INITIALIZED_WEAVE or Streamlit's st.session_state, or use a
weave-provided is_initialized() if available) and move the guarded call so that
app() invokes weave.init only when the guard indicates initialization hasn't
happened; update references in app() to rely on the single initialization and
set the flag after successful weave.init().

In `@pages/day2_6_mcp.py`:
- Line 80: Streamlit re-runs cause weave.init() to be called repeatedly; guard
the initialization by storing a flag in st.session_state (e.g.,
st.session_state["weave_inited"]) and only call
weave.init(os.getenv("WEAVE_PROJECT_NAME")) when that flag is missing/false,
then set the flag to true; update the spot around asyncio.run(app())/weave.init
usage in pages/day2_6_mcp.py to check the session state flag before initializing
so weave.init runs exactly once per session.

In `@pages/day2_7_custom_mcp.py`:
- Line 72: weave.init(os.getenv("WEAVE_PROJECT_NAME")) can receive None if
WEAVE_PROJECT_NAME is not set; update the call to validate or provide a default
before invoking weave.init: read the env via os.getenv("WEAVE_PROJECT_NAME"), if
it's None or empty raise a clear error (or fallback to a safe default string)
and then pass that validated string into weave.init; reference the weave.init
call and the os.getenv("WEAVE_PROJECT_NAME") usage when making the change.
♻️ Duplicate comments (7)
pages/day3_5_form.py (1)

78-78: weave.init() is called repeatedly on Streamlit re-runs

Calling weave.init() inside app() means initialization happens over and over, because Streamlit re-executes the script on every user interaction. Limit initialization to a single call using st.session_state or @st.cache_resource.

Also, os.getenv("WEAVE_PROJECT_NAME") may return None, so setting a default value or adding validation is recommended.

pages/day3_3_checkpointer.py (1)

108-108: Repeated initialization on Streamlit re-runs and protection of sensitive data

Calling weave.init() inside app() is repeated on every Streamlit re-run. This page also handles tool execution results and checkpoint contents, so consider PII protection for the traces sent to Weave.

As pointed out in a previous review, limiting initialization via st.session_state, enabling settings={"redact_pii": True}, and adding error handling are recommended.

notebooks/day1_2_langchain_and_rag_basics.ipynb (1)

303-305: Do not display any part of WANDB_API_KEY in notebook output

print(os.environ["WANDB_API_KEY"][:3]) on line 304 violates security best practices. Not even a fragment of an API key should appear in the output.

Proposed fix
-# APIキーの先頭3文字が表示されれば、環境変数に設定できています
-print(os.environ["WANDB_API_KEY"][:3])
-assert os.environ["WANDB_API_KEY"]
+# 環境変数が設定されていることを確認
+assert os.environ.get("WANDB_API_KEY"), "WANDB_API_KEY is not set"
notebooks/day3_8_agent_design_pattern.ipynb (1)

199-212: Missing directory creation

When using ReflectionManager(file_path="tmp/self_reflection_db.json"), a FileNotFoundError can occur if the tmp/ directory does not exist. Consider adding directory creation to ReflectionManager's initialization, or creating the directory beforehand in the notebook.

pages/day2_4_advanced_rag_feedback.py (1)

96-100: Guard against the cases where context / answer are None

In the else branch, if st.session_state.state.context is None, show_context(None) is called and for doc in context: on lines 25-26 raises a TypeError. This can happen, for example, when the first stream is interrupted by an exception, so a guard is needed.

Proposed fix
     else:
         context = st.session_state.state.context
-        show_context(context)
+        if context:
+            show_context(context)
         st.write("### 回答")
-        st.write(st.session_state.state.answer)
+        st.write(st.session_state.state.answer or "")
app/advanced_rag/chains/rag_fusion.py (2)

11-12: The WeaveCallId import is missing

For consistency with the other chain implementations (naive, hyde, multi_query, rerank, route), add WeaveCallId to the imports.

-from app.advanced_rag.chains.base import AnswerToken, BaseRAGChain, Context
+from app.advanced_rag.chains.base import AnswerToken, BaseRAGChain, Context, WeaveCallId

58-59: Violation of the BaseRAGChain.stream type contract: the WeaveCallId implementation is missing

The stream method's return type does not include WeaveCallId, and the method does not yield a WeaveCallId. This violates BaseRAGChain's abstract method contract (Generator[Context | AnswerToken | WeaveCallId, None, None]).

🐛 Proposed fix
     @weave.op(name="rag_fusion")
-    def stream(self, question: str) -> Generator[Context | AnswerToken, None, None]:
+    def stream(
+        self, question: str
+    ) -> Generator[Context | AnswerToken | WeaveCallId, None, None]:
+        current_call = weave.require_current_call()
+        yield WeaveCallId(weave_call_id=current_call.id)
+
         # 検索クエリを生成する
🧹 Nitpick comments (11)
pages/day3_x_create_documentation_dataset.py (1)

10-12: Handle the case where the environment variable is unset

os.getenv("WEAVE_PROJECT_NAME") returns None when the environment variable is not set, and weave.init(None) may behave unexpectedly.

Since the existing day2_x_create_dataset.py uses the same pattern, addressing this uniformly across the whole project is recommended.

♻️ Suggestion: add environment-variable validation
 def app() -> None:
     load_dotenv(override=True)
-    weave.init(os.getenv("WEAVE_PROJECT_NAME"))
+    project_name = os.getenv("WEAVE_PROJECT_NAME")
+    if not project_name:
+        st.error("WEAVE_PROJECT_NAME 環境変数が設定されていません。")
+        return
+    weave.init(project_name)
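The validation suggested in the diff above can be factored into a small stdlib-only helper; `require_env` is a hypothetical name, not part of the repo:

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, failing fast when unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Each page could then call `weave.init(require_env("WEAVE_PROJECT_NAME"))` and get a clear error instead of passing None through.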
app/advanced_rag/chains/hybrid.py (1)

17-18: Consider whether _reciprocal_rank_fusion should also migrate to @weave.op.

The internal function _reciprocal_rank_fusion still uses @traceable. To keep the migration consistent, consider moving this function to Weave as well, or removing the decorator if tracing is unnecessary.
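For reference, reciprocal rank fusion itself is a small algorithm: each document's fused score is the sum of 1/(k + rank) over the ranked lists it appears in. A stand-alone sketch (the `k=60` default and the list-of-ids signature are assumptions; the repo's `_reciprocal_rank_fusion` may differ):

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked id lists; higher fused score ranks first."""
    scores: dict[str, float] = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```

Documents that appear near the top of several lists accumulate the largest scores, which is what makes the fusion robust to any single retriever's ranking.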

pages/day1_4_rag.py (1)

13-21: Consider centralizing the prompt.

This file defines _prompt_template locally, but its content closely resembles the generate_answer_prompt defined in app/prompts/rag_prompts.py. Since the other RAG chain modules use the external prompts, consider using generate_answer_prompt for consistency.

♻️ Suggested fix
+from app.prompts import generate_answer_prompt
-
-_prompt_template = '''
-以下の文脈だけを踏まえて質問に回答してください。
-
-文脈: """
-{context}
-"""
-
-質問: {question}
-'''

# stream_rag 関数内で:
-    prompt = ChatPromptTemplate.from_template(_prompt_template)
-    prompt_value = prompt.invoke({"question": query, "context": documents})
+    prompt_text = generate_answer_prompt.format(context=documents, question=query)
notebooks/day1_2_langchain_and_rag_basics.ipynb (1)

308-314: Add error handling for the case where WEAVE_PROJECT_NAME is unset

os.getenv("WEAVE_PROJECT_NAME") returns None when the environment variable is not set. Since the behavior of weave.init(None) is unclear, validating beforehand or setting a default value is recommended.

Proposed fix
# Weaveの初期化
import weave

project_name = os.getenv("WEAVE_PROJECT_NAME")
assert project_name, "WEAVE_PROJECT_NAME is not set"
weave.init(project_name)
notebooks/day3_8_agent_design_pattern.ipynb (1)

21-27: Add error handling for the case where WEAVE_PROJECT_NAME is unset

As in the other notebooks, os.getenv("WEAVE_PROJECT_NAME") may return None. Validating beforehand is recommended.

Proposed fix
import os

import weave

project_name = os.getenv("WEAVE_PROJECT_NAME")
assert project_name, "WEAVE_PROJECT_NAME is not set"
weave.init(project_name)
pages/day2_4_advanced_rag_feedback.py (2)

74-95: Consider moving the answer state update outside the loop

st.session_state.state.answer = answer on line 95 sits inside the for loop. It works, but only the final value needs to be stored, so moving it outside the loop makes the intent of the code clearer.

Proposed fix
             if isinstance(chunk, AnswerToken):
                 if not answer_start:
                     answer_start = True
                     st.write("### 回答")
                     placeholder = st.empty()

                 answer += chunk.token
                 placeholder.write(answer)

-            st.session_state.state.answer = answer
+        st.session_state.state.answer = answer

118-124: weave.init() is called twice. A more efficient implementation is possible.

weave.init() is already called on line 34, but it is called again on line 118. Since weave.init() initializes global Weave logging state, a single call per process is recommended.

On line 118, either use weave.get_client() to obtain the already-initialized client, or store and reuse the return value from line 34. In a Streamlit application, also consider caching the line 34 initialization with the @st.cache_resource decorator.

Proposed fix

Method 1: using weave.get_client()

             client = weave.init(os.getenv("WEAVE_PROJECT_NAME"))
             call = client.get_call(weave_call_id)

Change it to the following:

-            client = weave.init(os.getenv("WEAVE_PROJECT_NAME"))
+            client = weave.get_client()
             call = client.get_call(weave_call_id)

Method 2: caching the initialization in Streamlit

+@st.cache_resource
 def app() -> None:
     load_dotenv(override=True)
-    weave.init(os.getenv("WEAVE_PROJECT_NAME"))
+    client = weave.init(os.getenv("WEAVE_PROJECT_NAME"))
+    return client

Then, at line 118:

-            client = weave.init(os.getenv("WEAVE_PROJECT_NAME"))
             call = client.get_call(weave_call_id)
app/advanced_rag/chains/route.py (1)

55-59: Optional: consider adding an else clause to the routing branch

The Route enum currently has only two values, but as a defense against future extensions or unexpected model outputs, consider adding an else clause that raises an error or defines a default behavior.

♻️ Example suggestion
         # ルーティングに応じて検索
         if route == Route.langsmith_document:
             documents = self.langsmith_document_retriever.invoke(question)
         elif route == Route.web:
             documents = self.web_retriever.invoke(question)
+        else:
+            raise ValueError(f"Unexpected route: {route}")
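The defensive else branch proposed above follows a general pattern for enum dispatch; this self-contained sketch (with a simplified `Route` and a string-returning `retrieve`, both illustrative) shows the idea:

```python
from enum import Enum

class Route(Enum):
    langsmith_document = "langsmith_document"
    web = "web"

def retrieve(route: Route, question: str) -> str:
    """Dispatch on the route, failing loudly on any value added to Route later."""
    if route == Route.langsmith_document:
        return f"docs:{question}"
    elif route == Route.web:
        return f"web:{question}"
    else:
        # Defensive default: surfaces unexpected enum values instead of
        # silently returning None from the function.
        raise ValueError(f"Unexpected route: {route}")
```

Without the else clause, an unhandled enum value would fall through and the caller would receive None, which is harder to debug than an immediate ValueError.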
pages/day2_3_evaluation.py (1)

107-112: Consider handling the case where the WEAVE_PROJECT_NAME environment variable is unset.

If os.getenv("WEAVE_PROJECT_NAME") returns None, the behavior of weave.init(None) is unclear. Setting a default value or explicitly checking that the environment variable is present is recommended.

♻️ Proposed fix
 def app() -> None:
     load_dotenv(override=True)
-    weave.init(os.getenv("WEAVE_PROJECT_NAME"))
+    project_name = os.getenv("WEAVE_PROJECT_NAME")
+    if not project_name:
+        st.error("WEAVE_PROJECT_NAME 環境変数が設定されていません。")
+        return
+    weave.init(project_name)
evals/day2_advanced_rag/run_evaluation.py (1)

112-117: Add error handling for the case where the WEAVE_PROJECT_NAME environment variable is unset.

os.getenv("WEAVE_PROJECT_NAME") may return None.

♻️ Proposed fix
 async def run_evaluations(chains: list[str], model_name: str, dataset_name: str):
     """全チェーンの評価を単一の非同期コンテキストで実行"""
 
-    weave.init(os.getenv("WEAVE_PROJECT_NAME"))
+    project_name = os.getenv("WEAVE_PROJECT_NAME")
+    if not project_name:
+        raise ValueError("WEAVE_PROJECT_NAME 環境変数が設定されていません。")
+    weave.init(project_name)
evals/day3_documentation_agent/run_evaluation.py (1)

32-36: Add error handling for the case where the WEAVE_PROJECT_NAME environment variable is unset.

As with the other evaluation script (evals/day2_advanced_rag/run_evaluation.py), adding validation of the environment variable is recommended.

♻️ Proposed fix
 async def run_evaluations(models: list[str], dataset_name: str):
     """全モデルの評価を単一の非同期コンテキストで実行"""
 
-    weave.init(os.getenv("WEAVE_PROJECT_NAME"))
+    project_name = os.getenv("WEAVE_PROJECT_NAME")
+    if not project_name:
+        raise ValueError("WEAVE_PROJECT_NAME 環境変数が設定されていません。")
+    weave.init(project_name)

Comment on lines +127 to +148
@weave.op
async def score(
    self, output: dict[str, Any], user_request: str, expected_output: str
) -> dict:
    requirements_doc = output.get("requirements_doc", "")

    prompt = ChatPromptTemplate.from_template(
        document_quality_evaluation_prompt.content
    )

    model = init_chat_model(model=self.model_id, model_provider="openai")
    llm_with_structure = model.with_structured_output(QualityEvaluation)

    prompt_value = prompt.invoke(
        {
            "user_request": user_request,
            "expected_output": expected_output,
            "requirements_doc": requirements_doc,
        }
    )

    result: QualityEvaluation = llm_with_structure.invoke(prompt_value)

⚠️ Potential issue | 🟠 Major

A synchronous invoke() is used inside an async method.

The score method is async, but llm_with_structure.invoke() is a synchronous call. This can block other asynchronous tasks. Consider using ainvoke(), or wrapping the call with asyncio.to_thread().

🔧 Proposed fix (using ainvoke)
     @weave.op
     async def score(
         self, output: dict[str, Any], user_request: str, expected_output: str
     ) -> dict:
         requirements_doc = output.get("requirements_doc", "")
 
         prompt = ChatPromptTemplate.from_template(
             document_quality_evaluation_prompt.content
         )
 
         model = init_chat_model(model=self.model_id, model_provider="openai")
         llm_with_structure = model.with_structured_output(QualityEvaluation)
 
         prompt_value = prompt.invoke(
             {
                 "user_request": user_request,
                 "expected_output": expected_output,
                 "requirements_doc": requirements_doc,
             }
         )
 
-        result: QualityEvaluation = llm_with_structure.invoke(prompt_value)
+        result: QualityEvaluation = await llm_with_structure.ainvoke(prompt_value)
🤖 Prompt for AI Agents
In `@evals/day3_documentation_agent/scorers.py` around lines 127 - 148, the
score method is async but calls llm_with_structure.invoke(prompt_value)
synchronously, blocking asynchronous processing; fix this by replacing the
synchronous call inside score with an awaitable one (recommended: await
llm_with_structure.ainvoke(prompt_value)) or, alternatively, move the
synchronous execution to a separate thread with await asyncio.to_thread(lambda:
llm_with_structure.invoke(prompt_value)), importing asyncio if needed and
keeping the caller's variable names (llm_with_structure, prompt_value, score)
as-is in the replacement.
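The asyncio.to_thread alternative mentioned in the comment can be illustrated with stdlib-only code; `slow_invoke` is a stand-in for the blocking LangChain `invoke` call:

```python
import asyncio
import time

def slow_invoke(prompt: str) -> str:
    """Stand-in for a blocking LLM call such as llm_with_structure.invoke()."""
    time.sleep(0.01)  # simulate network latency
    return f"result for {prompt}"

async def score(prompt: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(slow_invoke, prompt)

async def main() -> list[str]:
    # Both scores run concurrently instead of serializing on the blocking call.
    return await asyncio.gather(score("a"), score("b"))
```

With the plain synchronous invoke(), the two score calls would run back to back on the event loop; with to_thread (or ainvoke, where available) they overlap.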


def app() -> None:
    load_dotenv(override=True)
    weave.init(os.getenv("WEAVE_PROJECT_NAME"))

⚠️ Potential issue | 🟠 Major

weave.init() is called repeatedly on Streamlit re-runs

As in the other page files, weave.init() inside app() is called on every Streamlit re-run. Applying a consistent pattern across the whole project is recommended:

Proposed fix
 def app() -> None:
     load_dotenv(override=True)
-    weave.init(os.getenv("WEAVE_PROJECT_NAME"))
+
+    if "weave_initialized" not in st.session_state:
+        weave.init(os.getenv("WEAVE_PROJECT_NAME", "training-llm-app"))
+        st.session_state["weave_initialized"] = True
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    weave.init(os.getenv("WEAVE_PROJECT_NAME"))

def app() -> None:
    load_dotenv(override=True)

    if "weave_initialized" not in st.session_state:
        weave.init(os.getenv("WEAVE_PROJECT_NAME", "training-llm-app"))
        st.session_state["weave_initialized"] = True
🤖 Prompt for AI Agents
In `@pages/day2_2_advanced_rag.py` at line 14,
weave.init(os.getenv("WEAVE_PROJECT_NAME")) is being called inside app(),
causing it to run on every Streamlit re-run; move the initialization out of the
app() function to module-level (or a one-time guarded initializer) so it
executes only once on import/startup; locate the weave.init call in the file and
relocate it so app() no longer calls weave.init (or wrap it with a guard like an
already_inited flag) to match the pattern used across other pages.

Comment on lines +54 to +56
    output_context: list[Document] = output["context"]
    search_result_sources: list[str] = [r.metadata["source"] for r in output_context]
    ground_truch_source: str = reference_outputs["context"]
    ground_truch_source: str = context

⚠️ Potential issue | 🟡 Minor

Fix the typo in the variable name.

ground_truch_source → ground_truth_source

🔧 Proposed fix
     output_context: list[Document] = output["context"]
     search_result_sources: list[str] = [r.metadata["source"] for r in output_context]
-    ground_truch_source: str = context
+    ground_truth_source: str = context
 
-    if ground_truch_source in search_result_sources:
+    if ground_truth_source in search_result_sources:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    output_context: list[Document] = output["context"]
    search_result_sources: list[str] = [r.metadata["source"] for r in output_context]
    ground_truch_source: str = reference_outputs["context"]
    ground_truch_source: str = context

    output_context: list[Document] = output["context"]
    search_result_sources: list[str] = [r.metadata["source"] for r in output_context]
    ground_truth_source: str = context

    if ground_truth_source in search_result_sources:
🤖 Prompt for AI Agents
In `@pages/day2_3_evaluation.py` around lines 54 - 56, Rename the misspelled
variable ground_truch_source to ground_truth_source in the assignment (currently
set to context) so it matches usage elsewhere; update any subsequent references
to ground_truch_source to the corrected ground_truth_source identifier, keeping
the surrounding lines (output_context and search_result_sources assignments)
unchanged.


def app() -> None:
    load_dotenv(override=True)
    weave.init(os.getenv("WEAVE_PROJECT_NAME"))

⚠️ Potential issue | 🟠 Major

weave.init() is called repeatedly on Streamlit re-runs

weave.init() inside the app() function is called every time Streamlit re-executes the script. Limit initialization to a single call with the following pattern:

Proposed fix
 def app() -> None:
     load_dotenv(override=True)
-    weave.init(os.getenv("WEAVE_PROJECT_NAME"))
+
+    if "weave_initialized" not in st.session_state:
+        weave.init(os.getenv("WEAVE_PROJECT_NAME", "training-llm-app"))
+        st.session_state["weave_initialized"] = True
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    weave.init(os.getenv("WEAVE_PROJECT_NAME"))

def app() -> None:
    load_dotenv(override=True)

    if "weave_initialized" not in st.session_state:
        weave.init(os.getenv("WEAVE_PROJECT_NAME", "training-llm-app"))
        st.session_state["weave_initialized"] = True
🤖 Prompt for AI Agents
In `@pages/day2_5_naive_agent.py` at line 86, The call to
weave.init(os.getenv("WEAVE_PROJECT_NAME")) inside app() causes repeated
initialization on each Streamlit rerun; wrap this so it only runs once (e.g.,
check a guard before calling weave.init — use a module-level flag like
INITIALIZED_WEAVE or Streamlit's st.session_state, or use a weave-provided
is_initialized() if available) and move the guarded call so that app() invokes
weave.init only when the guard indicates initialization hasn't happened; update
references in app() to rely on the single initialization and set the flag after
successful weave.init().
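The guard above can be exercised without Streamlit: in this sketch a plain dict stands in for st.session_state and a counter stands in for weave.init() (both stand-ins are assumptions for illustration, not the real APIs):

```python
# Stand-ins, assumed for illustration: a counter instead of weave.init,
# a dict instead of st.session_state.
init_calls = 0

def fake_weave_init(project_name: str) -> None:
    global init_calls
    init_calls += 1

def app(session_state: dict) -> None:
    # Guard: initialize tracing once per session, even across script reruns.
    if "weave_initialized" not in session_state:
        fake_weave_init("training-llm-app")
        session_state["weave_initialized"] = True

state: dict = {}
for _ in range(3):  # simulate three Streamlit reruns sharing one session
    app(state)
print(init_calls)  # → 1
```

Because st.session_state survives reruns within a browser session, the real guard behaves the same way: only the first rerun pays the initialization cost.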


async def app() -> None:
load_dotenv(override=True)
weave.init(os.getenv("WEAVE_PROJECT_NAME"))

⚠️ Potential issue | 🟠 Major

Streamlit reruns call weave.init() repeatedly

asyncio.run(app()) executes on every Streamlit script rerun, so weave.init() is also called repeatedly. Use st.session_state to limit initialization to a single call.

Proposed fix
 async def app() -> None:
     load_dotenv(override=True)
-    weave.init(os.getenv("WEAVE_PROJECT_NAME"))
+
+    if "weave_initialized" not in st.session_state:
+        weave.init(os.getenv("WEAVE_PROJECT_NAME", "training-llm-app"))
+        st.session_state["weave_initialized"] = True
🤖 Prompt for AI Agents
In `@pages/day2_6_mcp.py` at line 80, Streamlit re-runs cause weave.init() to be
called repeatedly; guard the initialization by storing a flag in
st.session_state (e.g., st.session_state["weave_inited"]) and only call
weave.init(os.getenv("WEAVE_PROJECT_NAME")) when that flag is missing/false,
then set the flag to true; update the spot around asyncio.run(app())/weave.init
usage in pages/day2_6_mcp.py to check the session state flag before initializing
so weave.init runs exactly once per session.


async def app() -> None:
load_dotenv(override=True)
weave.init(os.getenv("WEAVE_PROJECT_NAME"))

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

weave.init None project name behavior documentation

💡 Result:

In W&B Weave, weave.init expects a string project identifier:

  • Signature: weave.init(project_name: str, ...)—it “logs to the specified project.” If you omit the team/entity in the string, Weave uses your default entity. Passing None is not a documented/valid value for project_name (it’s typed as str), so you should treat it as an error case and provide a real project name. [1]
  • If you want to avoid hard-coding <entity>/<project>, you can typically rely on W&B defaults and/or environment variables (commonly WANDB_ENTITY, WANDB_PROJECT) so you only pass what’s necessary (but still pass a real string project name to weave.init). [2]
  • If your goal is “no project / don’t log”, disable tracing instead (for example via WEAVE_DISABLED=true). [3]

Sources: [1] [2] [3]


🏁 Script executed:

# Search for weave.init usage patterns in the codebase
rg "weave\.init" -A 2 -B 2

Repository: GenerativeAgents/training-llm-application-development

Length of output: 6937


🏁 Script executed:

# Check the specific file for context and imports
cat -n pages/day2_7_custom_mcp.py | head -80

Repository: GenerativeAgents/training-llm-application-development

Length of output: 3125


🏁 Script executed:

# Search for WEAVE_PROJECT_NAME configuration
rg "WEAVE_PROJECT_NAME" -B 2 -A 2

Repository: GenerativeAgents/training-llm-application-development

Length of output: 6937


🏁 Script executed:

# Check for .env configuration or setup instructions
find . -type f \( -name ".env*" -o -name "*.md" -o -name "*.txt" \) -exec grep -l "WEAVE_PROJECT_NAME" {} \;

Repository: GenerativeAgents/training-llm-application-development

Length of output: 112


🏁 Script executed:

# Check .env.template for WEAVE_PROJECT_NAME configuration
cat .env.template

Repository: GenerativeAgents/training-llm-application-development

Length of output: 179


🏁 Script executed:

# Check if there's a .env file in the repo (may not be committed)
ls -la | grep -E "\.env|\.env\."

Repository: GenerativeAgents/training-llm-application-development

Length of output: 162


🏁 Script executed:

# Look for documentation or setup instructions mentioning WEAVE_PROJECT_NAME
rg "WEAVE_PROJECT_NAME" -t md -t txt

Repository: GenerativeAgents/training-llm-application-development

Length of output: 79


weave.init()None が渡される可能性があります。環境変数の設定を確認してください。

WEAVE_PROJECT_NAME 環境変数が .env ファイルに設定されていない場合、os.getenv()None を返します。W&B Weave の weave.init() は文字列型の引数を要求するため、None を渡すと型エラーが発生します。.env.template では WEAVE_PROJECT_NAME= が空の状態で定義されており、ユーザーがセットアップ時に値を入力する必要があります。環境変数が正しく設定されていることを確認するか、デフォルト値を提供することを検討してください。

🤖 Prompt for AI Agents
In `@pages/day2_7_custom_mcp.py` at line 72,
weave.init(os.getenv("WEAVE_PROJECT_NAME")) can receive None if
WEAVE_PROJECT_NAME is not set; update the call to validate or provide a default
before invoking weave.init: read the env via os.getenv("WEAVE_PROJECT_NAME"), if
it's None or empty raise a clear error (or fallback to a safe default string)
and then pass that validated string into weave.init; reference the weave.init
call and the os.getenv("WEAVE_PROJECT_NAME") usage when making the change.
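The validation the prompt describes can be sketched as a small helper (require_env is a hypothetical name, not part of Weave or this repository):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, failing fast if unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value

# Usage (sketch): weave.init(require_env("WEAVE_PROJECT_NAME"))
os.environ["WEAVE_PROJECT_NAME"] = "training-llm-app"
print(require_env("WEAVE_PROJECT_NAME"))  # → training-llm-app
```

Failing fast with a clear message is friendlier than letting weave.init() surface a type error deep inside the library.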


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@evals/day2_advanced_rag/run_evaluation.py`:
- Around line 1-12: Replace all full-width Japanese parentheses '(' and ')' with
ASCII parentheses '(' and ')' in the module docstring and any other string
literals that currently use them (e.g., the docstring text "7種類のRAG Chain(naive,
hyde, multi_query, rag_fusion, rerank, route, hybrid)" and the other occurrences
noted), or alternatively add a per-line ruff ignore (e.g., "# ruff: noqa:
RUF002") next to those lines; ensure the updates target the module-level
docstring and the same string literals so Ruff no longer flags RUF002.

In `@evals/day3_documentation_agent/run_evaluation.py`:
- Around line 68-78: Replace the full-width Japanese parentheses in the help
strings for the argparse arguments to avoid RUF001: update the help for the
"--model" argument (parser.add_argument for "--model") and the "--models"
argument (parser.add_argument for "--models") to use ASCII/half-width
parentheses instead of 全角括弧, or alternatively add a ruff rule disable comment if
you intentionally want to keep full-width characters; ensure you change only the
help text values for those two add_argument calls.

In `@evals/day3_documentation_agent/scorers.py`:
- Around line 1-8: The module docstring uses fullwidth Japanese parentheses
which triggers Ruff RUF002; replace all fullwidth parentheses "(" and ")" with
ASCII parentheses "(" and ")" across the file (notably the docstring lines
describing SectionCompletenessScorer, RequirementsCoverageScorer,
DocumentQualityScorer and the other occurrence near the
SectionCompletenessScorer usage around line 65) so lint passes, or alternatively
add a localized ruff noqa/disable comment if you intend to keep fullwidth
characters; update occurrences referencing the class names
SectionCompletenessScorer, RequirementsCoverageScorer, and
DocumentQualityScorer.
- Around line 97-110: The Field descriptions for comprehensiveness_score,
specificity_score, and consistency_score use full-width parentheses which
trigger Ruff RUF001; update the description strings for comprehensiveness_score,
specificity_score, and consistency_score in the scorer model to use ASCII
half-width parentheses "(0.0-1.0)" instead of the full-width ones to satisfy the
linter.

Comment on lines +1 to +12
"""
Advanced RAG Chain Type 評価スクリプト

7種類のRAG Chain(naive, hyde, multi_query, rag_fusion, rerank, route, hybrid)を
一括評価し、Weave UIで性能差を比較する。

使用例:
uv run python -m evals.day2_advanced_rag.run_evaluation
uv run python -m evals.day2_advanced_rag.run_evaluation --chains naive hyde
uv run python -m evals.day2_advanced_rag.run_evaluation --model gpt-4.1-nano

データセットの登録は pages/day2_x_create_dataset.py を参照

⚠️ Potential issue | 🟡 Minor

Full-width parentheses are flagged by Ruff (RUF002)

RUF002 detects full-width parentheses as ambiguous characters. Besides line 4, the same issue appears on lines 68-69, 90, and 93-94, so if CI runs ruff, either replace them with half-width parentheses or disable the rule.

🛠 Example: replacing with half-width parentheses
-7種類のRAG Chain(naive, hyde, multi_query, rag_fusion, rerank, route, hybrid)を
+7種類のRAG Chain (naive, hyde, multi_query, rag_fusion, rerank, route, hybrid) を
🧰 Tools
🪛 Ruff (0.14.14)

4-4: Docstring contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF002)


4-4: Docstring contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF002)

🤖 Prompt for AI Agents
In `@evals/day2_advanced_rag/run_evaluation.py` around lines 1 - 12, Replace all
full-width Japanese parentheses '(' and ')' with ASCII parentheses '(' and ')'
in the module docstring and any other string literals that currently use them
(e.g., the docstring text "7種類のRAG Chain(naive, hyde, multi_query, rag_fusion,
rerank, route, hybrid)" and the other occurrences noted), or alternatively add a
per-line ruff ignore (e.g., "# ruff: noqa: RUF002") next to those lines; ensure
the updates target the module-level docstring and the same string literals so
Ruff no longer flags RUF002.
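Rather than hand-editing each occurrence, the normalization can be scripted; a minimal sketch using str.translate (the helper name normalize_parens is hypothetical):

```python
# Map the fullwidth parentheses Ruff flags (RUF001/RUF002) to their
# ASCII counterparts; other characters pass through unchanged.
FULLWIDTH_TO_ASCII = str.maketrans({"(": "(", ")": ")"})

def normalize_parens(text: str) -> str:
    return text.translate(FULLWIDTH_TO_ASCII)

print(normalize_parens("7種類のRAG Chain(naive, hyde)"))
# → 7種類のRAG Chain(naive, hyde)
```

Running such a pass over docstrings and help text keeps the Japanese content intact while satisfying the linter.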

Comment on lines +68 to +78
parser = argparse.ArgumentParser(description="Documentation Agent 評価")
parser.add_argument(
"--model",
type=str,
help="評価するモデル(単一)",
)
parser.add_argument(
"--models",
nargs="+",
help="評価するモデル(複数)",
)

⚠️ Potential issue | 🟡 Minor

Full-width parentheses in the help strings are flagged by Ruff (RUF001)

The help strings on lines 72 and 77 may be flagged by RUF001. If CI runs ruff, consider replacing them with half-width parentheses or disabling the rule.

🛠 Example: replacing with half-width parentheses
-        help="評価するモデル(単一)",
+        help="評価するモデル (単一)",
@@
-        help="評価するモデル(複数)",
+        help="評価するモデル (複数)",
🧰 Tools
🪛 Ruff (0.14.14)

72-72: String contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF001)


72-72: String contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF001)


77-77: String contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF001)


77-77: String contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF001)

🤖 Prompt for AI Agents
In `@evals/day3_documentation_agent/run_evaluation.py` around lines 68 - 78,
Replace the full-width Japanese parentheses in the help strings for the argparse
arguments to avoid RUF001: update the help for the "--model" argument
(parser.add_argument for "--model") and the "--models" argument
(parser.add_argument for "--models") to use ASCII/half-width parentheses instead
of full-width parentheses, or alternatively add a ruff rule disable comment if you intentionally
want to keep full-width characters; ensure you change only the help text values
for those two add_argument calls.

Comment on lines +1 to +8
"""
Documentation Agent 評価用スコアラー

3種類のスコアラーを提供:
- SectionCompletenessScorer: 7セクションが全て含まれているか(ルールベース)
- RequirementsCoverageScorer: キー要件がカバーされているか(キーワードマッチ)
- DocumentQualityScorer: 品質評価(LLM-as-judge)
"""

⚠️ Potential issue | 🟡 Minor

Full-width parentheses are flagged by Ruff (RUF002)

RUF002 detects full-width parentheses as ambiguous characters. Besides lines 5-7, the same issue appears on line 65, so if CI runs ruff, either replace them with half-width parentheses or disable the rule.

🧰 Tools
🪛 Ruff (0.14.14)

5-5: Docstring contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF002)


5-5: Docstring contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF002)


6-6: Docstring contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF002)


6-6: Docstring contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF002)


7-7: Docstring contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF002)


7-7: Docstring contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF002)

🤖 Prompt for AI Agents
In `@evals/day3_documentation_agent/scorers.py` around lines 1 - 8, The module
docstring uses fullwidth Japanese parentheses which triggers Ruff RUF002;
replace all fullwidth parentheses "(" and ")" with ASCII parentheses "(" and ")"
across the file (notably the docstring lines describing
SectionCompletenessScorer, RequirementsCoverageScorer, DocumentQualityScorer and
the other occurrence near the SectionCompletenessScorer usage around line 65) so
lint passes, or alternatively add a localized ruff noqa/disable comment if you
intend to keep fullwidth characters; update occurrences referencing the class
names SectionCompletenessScorer, RequirementsCoverageScorer, and
DocumentQualityScorer.

Comment on lines +97 to +110
comprehensiveness_score: float = Field(
..., ge=0.0, le=1.0, description="網羅性スコア(0.0-1.0)"
)
comprehensiveness_reason: str = Field(..., description="網羅性の評価理由")

specificity_score: float = Field(
..., ge=0.0, le=1.0, description="具体性スコア(0.0-1.0)"
)
specificity_reason: str = Field(..., description="具体性の評価理由")

consistency_score: float = Field(
..., ge=0.0, le=1.0, description="整合性スコア(0.0-1.0)"
)
consistency_reason: str = Field(..., description="整合性の評価理由")

⚠️ Potential issue | 🟡 Minor

Full-width parentheses in the Field descriptions are flagged by Ruff (RUF001)

The description strings on lines 98, 103, and 108 are RUF001 targets. If CI runs ruff, standardize on half-width parentheses.

🛠 Example: replacing with half-width parentheses
-        ..., ge=0.0, le=1.0, description="網羅性スコア(0.0-1.0)"
+        ..., ge=0.0, le=1.0, description="網羅性スコア (0.0-1.0)"
@@
-        ..., ge=0.0, le=1.0, description="具体性スコア(0.0-1.0)"
+        ..., ge=0.0, le=1.0, description="具体性スコア (0.0-1.0)"
@@
-        ..., ge=0.0, le=1.0, description="整合性スコア(0.0-1.0)"
+        ..., ge=0.0, le=1.0, description="整合性スコア (0.0-1.0)"
🧰 Tools
🪛 Ruff (0.14.14)

98-98: String contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF001)


98-98: String contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF001)


103-103: String contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF001)


103-103: String contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF001)


108-108: String contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF001)


108-108: String contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF001)

🤖 Prompt for AI Agents
In `@evals/day3_documentation_agent/scorers.py` around lines 97 - 110, The Field
descriptions for comprehensiveness_score, specificity_score, and
consistency_score use full-width parentheses which trigger Ruff RUF001; update
the description strings for comprehensiveness_score, specificity_score, and
consistency_score in the scorer model to use ASCII half-width parentheses
"(0.0-1.0)" instead of the full-width ones to satisfy the linter.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@evals/day2_advanced_rag/run_evaluation.py`:
- Around line 118-119: Ensure weave.init is not called with None by validating
os.getenv("WEAVE_PROJECT_NAME") before use: fetch the env var into a local
(e.g., project_name), if it's None either set a sensible default or raise/exit
with a clear error, then call weave.init(project_name); keep subsequent dataset
retrieval (weave.ref(dataset_name).get()) unchanged but only after successful
init.
🧹 Nitpick comments (3)
evals/day2_advanced_rag/run_evaluation.py (3)

39-41: The model and chain are re-created on every predict call

During evaluation, init_chat_model and create_rag_chain are called for each sample, so the overhead becomes significant with a large dataset.

Because weave.Model cannot hold non-serializable objects as attributes, consider caching with functools.lru_cache or a class variable.

♻️ Caching example
from functools import lru_cache

@lru_cache(maxsize=16)
def _get_chain(chain_name: str, model_name: str):
    model = init_chat_model(model=model_name, model_provider="openai")
    return create_rag_chain(chain_name=chain_name, model=model)

Call it inside predict:

chain = _get_chain(self.chain_name, self.model_name)

74-84: ContextEntityRecallScorer is re-created on every score call

With many evaluation samples, the overhead of creating a new instance each time adds up. Initializing the scorer in __init__ and reusing it inside score is more efficient.

♻️ Proposed improvement
 class ContextRecallScorer(weave.Scorer):
     model_id: str = "openai/gpt-4.1-nano"
+    _inner_scorer: ContextEntityRecallScorer | None = None
+
+    def model_post_init(self, __context):
+        self._inner_scorer = ContextEntityRecallScorer(model_id=self.model_id)

     @weave.op
     async def score(self, output: dict[str, Any], answer: str) -> dict:
-        scorer = ContextEntityRecallScorer(model_id=self.model_id)
-        result = await scorer.score(
+        result = await self._inner_scorer.score(
             output=answer,
             context=output["context_str"],
         )

100-112: HallucinationFreeScorer is likewise instantiated on every call

As with ContextRecallScorer, initializing the scorer in __init__ or model_post_init is recommended.

Comment on lines +118 to +119
weave.init(os.getenv("WEAVE_PROJECT_NAME"))
dataset = weave.ref(dataset_name).get()

⚠️ Potential issue | 🟡 Minor

weave.init may receive None when the environment variable is unset

If WEAVE_PROJECT_NAME is not set, os.getenv returns None. Explicit validation or setting a default value is recommended.

🛠️ Proposed fix
-    weave.init(os.getenv("WEAVE_PROJECT_NAME"))
+    project_name = os.getenv("WEAVE_PROJECT_NAME")
+    if not project_name:
+        raise ValueError("WEAVE_PROJECT_NAME environment variable is not set")
+    weave.init(project_name)

Or accept it as a CLI argument:

     parser.add_argument("--dataset", default="training-llm-app")
+    parser.add_argument("--project", default=None, help="Weave project name (defaults to WEAVE_PROJECT_NAME env var)")
🤖 Prompt for AI Agents
In `@evals/day2_advanced_rag/run_evaluation.py` around lines 118 - 119, Ensure
weave.init is not called with None by validating os.getenv("WEAVE_PROJECT_NAME")
before use: fetch the env var into a local (e.g., project_name), if it's None
either set a sensible default or raise/exit with a clear error, then call
weave.init(project_name); keep subsequent dataset retrieval
(weave.ref(dataset_name).get()) unchanged but only after successful init.
