A LangChain integration package for Doubleword.
This package wires Doubleword's OpenAI-compatible inference API
(https://api.doubleword.ai/v1) into LangChain and LangGraph as both real-time
chat / embedding models and transparently-batched variants powered by
autobatcher.
The batched variants are required to access models that Doubleword exposes only via the batch API, and they cut cost on workloads that fan out many concurrent calls — typically the case in LangGraph agents.
Install with pip:

```bash
pip install langchain-doubleword
```

The API key is resolved via three paths, in precedence order:
1. Explicit constructor argument:

   ```python
   ChatDoubleword(model="...", api_key="sk-...")
   ```

2. Environment variable:

   ```bash
   export DOUBLEWORD_API_KEY=sk-...
   ```

3. `~/.dw/credentials.toml` — the same file written by Doubleword's CLI tooling. The active account is selected by `~/.dw/config.toml`'s `active_account` field, and that account's `inference_key` is used:

   ```toml
   # ~/.dw/config.toml
   active_account = "work"
   ```

   ```toml
   # ~/.dw/credentials.toml
   [accounts.work]
   inference_key = "sk-..."
   ```

To use a non-active account from your credentials file, set `DOUBLEWORD_API_KEY` directly to that account's `inference_key` — there is no `account=` selector on the model itself.
`ChatDoubleword` is a drop-in chat model. Use it in any LangChain or LangGraph workflow that expects a `BaseChatModel`.
```python
from langchain_doubleword import ChatDoubleword

llm = ChatDoubleword(model="your-model-name")
response = llm.invoke("Explain bismuth in three sentences.")
print(response.content)
```

`ChatDoublewordBatch` offers the same interface, but every concurrent `.ainvoke()` call is collected by autobatcher and submitted via Doubleword's batch endpoint. It is async-only: sync `.invoke()` raises.
Use this when:
- The model you want is batch-only (some Doubleword-hosted models do not expose a real-time chat endpoint).
- You're running a LangGraph workflow with parallel branches and want ~50% cost savings via batch pricing.
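The coalescing behavior can be sketched with a toy micro-batcher. This is an illustrative stand-in for what autobatcher does, not its real implementation; the knob names mirror the constructor arguments described below.

```python
# Toy size-or-window request coalescing, in the spirit of autobatcher.
import asyncio

class MicroBatcher:
    def __init__(self, batch_size=3, batch_window_seconds=0.05):
        self.batch_size = batch_size
        self.batch_window_seconds = batch_window_seconds
        self._pending = []      # queued (request, future) pairs
        self._flush_task = None  # pending window timer, if any

    async def submit(self, request):
        fut = asyncio.get_running_loop().create_future()
        self._pending.append((request, fut))
        if len(self._pending) >= self.batch_size:
            self._flush()  # size cap reached: submit immediately
        elif self._flush_task is None:
            # First queued request starts the window timer.
            self._flush_task = asyncio.get_running_loop().call_later(
                self.batch_window_seconds, self._flush
            )
        return await fut

    def _flush(self):
        if self._flush_task is not None:
            self._flush_task.cancel()
            self._flush_task = None
        batch, self._pending = self._pending, []
        # Stand-in for one batch-API call covering every queued request.
        for request, fut in batch:
            fut.set_result(f"echo:{request}")

async def demo():
    b = MicroBatcher()
    return await asyncio.gather(*[b.submit(f"q{i}") for i in range(5)])

print(asyncio.run(demo()))
# → ['echo:q0', 'echo:q1', 'echo:q2', 'echo:q3', 'echo:q4']
```

With `batch_size=3`, the first three concurrent calls flush as one batch immediately; the remaining two wait out the window and flush together.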
```python
import asyncio

from langchain_doubleword import ChatDoublewordBatch

llm = ChatDoublewordBatch(model="batch-only-model")

async def main():
    # Concurrent calls collected into a single batch under the hood.
    results = await asyncio.gather(*[
        llm.ainvoke(f"Summarize chapter {i}") for i in range(50)
    ])
    for r in results:
        print(r.content)

asyncio.run(main())
```

Four `autobatcher.BatchOpenAI` knobs are exposed as constructor arguments:
| Argument | Default | Purpose |
|---|---|---|
| `batch_size` | `1000` | Submit a batch when this many requests are queued. |
| `batch_window_seconds` | `10.0` | Submit a batch after this many seconds even if the size cap is not reached. |
| `poll_interval_seconds` | `5.0` | How often autobatcher polls for batch completion. |
| `completion_window` | `"24h"` | Doubleword batch completion window. `"1h"` is more expensive but faster. |
```python
llm = ChatDoublewordBatch(
    model="your-model",
    batch_size=250,            # smaller batches for fast-turnaround LangGraph nodes
    batch_window_seconds=2.5,  # don't make latency-sensitive calls wait 10s
    completion_window="1h",    # pay more, finish quicker
)
```

The same arguments are available on `DoublewordEmbeddingsBatch`.
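The two polling-related knobs amount to a wait loop like the following. This is a hand-written sketch of the idea, not autobatcher's actual code; `get_status` is a hypothetical callable standing in for a batch-status lookup.

```python
# Sketch of poll_interval_seconds / completion_window semantics.
import time

def wait_for_batch(get_status, poll_interval_seconds=5.0,
                   completion_window_seconds=24 * 3600):
    """Poll get_status() until the batch reaches a terminal state or the window expires."""
    deadline = time.monotonic() + completion_window_seconds
    while time.monotonic() < deadline:
        status = get_status()  # stand-in for a batch-API status request
        if status in ("completed", "failed", "cancelled"):
            return status
        time.sleep(poll_interval_seconds)  # the poll_interval_seconds knob
    return "expired"  # completion_window elapsed without a terminal status
```

A smaller `poll_interval_seconds` detects completion sooner at the cost of more status requests.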
```python
from langchain_doubleword import DoublewordEmbeddings, DoublewordEmbeddingsBatch

embed = DoublewordEmbeddings(model="your-embedding-model")
vec = embed.embed_query("hello world")

# Or, transparently batched:
batch_embed = DoublewordEmbeddingsBatch(model="your-embedding-model")
# vecs = await batch_embed.aembed_documents([...])
```

`ChatDoubleword` and `ChatDoublewordBatch` are standard `BaseChatModel` implementations, so they slot into any LangGraph node:
```python
from langgraph.graph import StateGraph, END

from langchain_doubleword import ChatDoublewordBatch

llm = ChatDoublewordBatch(model="your-model")

async def call_model(state):
    response = await llm.ainvoke(state["messages"])
    return {"messages": [response]}

graph = StateGraph(dict)
graph.add_node("model", call_model)
graph.set_entry_point("model")
graph.add_edge("model", END)
app = graph.compile()
```

When several model nodes execute in parallel (e.g. via `Send` or fan-out edges), autobatcher collects their requests into a single batch.
| Argument | Env var | Default |
|---|---|---|
| `api_key` | `DOUBLEWORD_API_KEY` | required |
| `base_url` | `DOUBLEWORD_API_BASE` | `https://api.doubleword.ai/v1` |
| `model` | — | required |
All other arguments accepted by `langchain_openai.ChatOpenAI` are forwarded unchanged (`temperature`, `max_tokens`, `model_kwargs`, `timeout`, etc.).
MIT