Draft
33 commits
a3cf34a
Staging branch for LLM module
eb8680 Oct 9, 2025
b57a18d
Move LLM interface code from `robotl` (#358)
jfeser Oct 9, 2025
8170a64
Implement basic tool calling (#366)
jfeser Oct 10, 2025
bcbf7bb
Merge branch 'master' into staging-llm
eb8680 Oct 10, 2025
66a5eb4
Merge branch 'master' into staging-llm
jfeser Oct 20, 2025
ab9e2fe
enable strict mode for tool calling (#375)
jfeser Oct 20, 2025
661cab8
add structured generation and remove unused `decode` operation (#376)
jfeser Oct 20, 2025
02c4378
implemented support for class methods in `Template.define` (#377)
kiranandcode Oct 24, 2025
d9d1782
Revert "implemented support for class methods in `Template.define` (#…
kiranandcode Oct 24, 2025
1053fdd
Add support for methods in `Template.define` (#377) (#378)
kiranandcode Oct 26, 2025
54efb77
Adding a lower-level event and a logger example (#382)
datvo06 Oct 28, 2025
657924e
Add support for tools returning images (#385)
kiranandcode Oct 29, 2025
68af295
Implement Caching Handler for LLM (#392)
datvo06 Nov 12, 2025
b9207b4
implement first to k-ahead sampler (#412)
kiranandcode Nov 24, 2025
41b52b4
Add inheritable class for stateful templates (#416)
jfeser Nov 26, 2025
248ff6e
Support multiple providers (via `litellm`) (#418)
kiranandcode Dec 1, 2025
e4c0d99
store source of generated functions in `__src__` attribute (#403)
kiranandcode Dec 2, 2025
5cb8e89
Adds type-based encoding and support for legacy APIs (#411)
kiranandcode Dec 2, 2025
1f50599
Add LLM Integration tests to the workflows. (#420)
kiranandcode Dec 3, 2025
8118a8f
Merge master into llm-staging (#423)
jfeser Dec 4, 2025
62e45a4
Fix `staging-llm` diff against `master` (#426)
eb8680 Dec 5, 2025
1c37637
Implement a RetryHandler for LLM module (#428)
datvo06 Dec 9, 2025
bb5bded
Merge `master` into `staging-llm` again (#443)
eb8680 Dec 12, 2025
44d7d12
Implements a unified `encode`ing/`decode`ing pipeline for `llm` (#442)
kiranandcode Dec 15, 2025
931d507
Initial version of Lexical Context Collection - Collecting Tools and …
datvo06 Dec 15, 2025
8530fd0
Update `staging-llm` from `master` (#457)
eb8680 Dec 22, 2025
bae8d02
Convert `Template` into an operation (#424)
jfeser Dec 29, 2025
3311d1b
Fail when encoding terms or operations (#474)
jfeser Dec 29, 2025
23f95ef
Implemented record and replay fixtures for LLM calls (#467)
kiranandcode Dec 31, 2025
2094f22
Remove program synthesis code (#475)
jfeser Dec 31, 2025
05b28ef
Disables direct recursion on templates by default (#466)
kiranandcode Dec 31, 2025
d91d4c9
drop k-ahead sampler (#479)
jfeser Dec 31, 2025
e3e8c7e
Document `Template` and `Tool` (#478)
jfeser Jan 1, 2026
4 changes: 4 additions & 0 deletions .github/workflows/lint.yml
@@ -19,6 +19,10 @@ jobs:
         with:
           enable-cache: true

+      - name: Install pandoc
+        run: |
+          sudo apt install -y pandoc
+
       - name: Install dependencies
         run: |
           uv sync --all-extras --dev
4 changes: 4 additions & 0 deletions .github/workflows/test.yml
@@ -22,6 +22,10 @@ jobs:
         with:
           enable-cache: true

+      - name: Install pandoc
+        run: |
+          sudo apt install -y pandoc
+
       - name: Install Python dependencies
         run: |
           uv sync --all-extras --dev --python ${{ matrix.python-version }}
34 changes: 34 additions & 0 deletions .github/workflows/test_llm.yml
@@ -0,0 +1,34 @@
name: LLM Integration Tests

on:
  pull_request:
    branches:
      - master
      - 'staging-*'
  # Allow manual trigger
  workflow_dispatch:

jobs:
  test-llm:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["'3.13'"]
    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true

      - name: Install Python dependencies
        run: |
          uv sync --all-extras --dev --python ${{ matrix.python-version }}

      - name: Run LLM integration tests
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          uv run pytest tests/test_handlers_llm_provider.py -v --tb=short
4 changes: 4 additions & 0 deletions .github/workflows/test_notebooks.yml
@@ -26,6 +26,10 @@ jobs:
         with:
           enable-cache: true

+      - name: Install pandoc
+        run: |
+          sudo apt install -y pandoc
+
       - name: Install Python packages
         run: |
           uv sync --all-extras --dev
5 changes: 5 additions & 0 deletions Makefile
@@ -1,3 +1,5 @@
+.PHONY: lint format test test-notebooks rebuild-fixtures FORCE
+
 lint: FORCE
 	./scripts/lint.sh

@@ -10,4 +12,7 @@ test: lint FORCE
 test-notebooks: lint FORCE
 	./scripts/test_notebooks.sh

+rebuild-fixtures:
+	REBUILD_FIXTURES=true uv run pytest tests/test_handlers_llm_provider.py
+
 FORCE:
144 changes: 144 additions & 0 deletions docs/source/beam.py
@@ -0,0 +1,144 @@
"""This example demonstrates a beam search over a program that uses a `choose`
effect for nondeterminism and `score` effect to weigh its choices.

"""

import functools
import heapq
import random
from collections.abc import Callable
from dataclasses import dataclass
from pprint import pprint

from effectful.ops.semantics import fwd, handler
from effectful.ops.syntax import ObjectInterpretation, defop, implements


@defop
def choose[T](choices: list[T]) -> T:
    result = random.choice(choices)
    print(f"choose({choices}) = {result}")
    return result


@defop
def score(value: float) -> None:
    pass


class Suspend(Exception): ...


class ReplayIntp(ObjectInterpretation):
    def __init__(self, trace):
        self.trace = trace
        self.step = 0

    @implements(choose)
    def _(self, *args, **kwargs):
        if self.step < len(self.trace):
            result = self.trace[self.step][1]
            self.step += 1
            return result
        return fwd()


class TraceIntp(ObjectInterpretation):
    def __init__(self):
        self.trace = []

    @implements(choose)
    def _(self, *args, **kwargs):
        result = fwd()
        self.trace.append(((args, kwargs), result))
        return result


class ScoreIntp(ObjectInterpretation):
    def __init__(self):
        self.score = 0.0

    @implements(score)
    def _(self, value):
        self.score += value


class ChooseOnceIntp(ObjectInterpretation):
    def __init__(self):
        self.is_first_call = True

    @implements(choose)
    def _(self, *args, **kwargs):
        if not self.is_first_call:
            raise Suspend

        self.is_first_call = False
        return fwd()


@dataclass
class BeamCandidate[S, T]:
    """Represents a candidate execution path in beam search."""

    trace: list[S]
    score: float
    in_progress: bool
    result: T | None

    def __lt__(self, other: "BeamCandidate[S, T]") -> bool:
        return self.score < other.score

    def expand[**P](self, model_fn: Callable[P, T], *args: P.args, **kwargs: P.kwargs):
        in_progress = False
        result = None
        score_intp = ScoreIntp()
        trace_intp = TraceIntp()
        with (
            handler(score_intp),
            handler(ChooseOnceIntp()),
            handler(ReplayIntp(self.trace)),
            handler(trace_intp),
        ):
            try:
                result = model_fn(*args, **kwargs)
            except Suspend:
                in_progress = True

        return BeamCandidate(trace_intp.trace, score_intp.score, in_progress, result)


def beam_search[**P, S, T](
    model_fn: Callable[P, T], beam_width=3
) -> Callable[P, BeamCandidate[S, T]]:
    @functools.wraps(model_fn)
    def wrapper(*args, **kwargs):
        beam = [BeamCandidate([], 0.0, True, None)]

        while True:
            expandable = [c for c in beam if c.in_progress] * beam_width
            if not expandable:
                return beam

            new_candidates = [c.expand(model_fn, *args, **kwargs) for c in expandable]

            for c in new_candidates:
                heapq.heappushpop(beam, c) if len(
                    beam
                ) >= beam_width else heapq.heappush(beam, c)

    return wrapper


if __name__ == "__main__":

    def model():
        s1 = choose(range(100))
        score(s1)
        s2 = choose(range(-100, 100))
        score(s2)
        s3 = choose(range(-100, 100))
        score(s3)
        return s3

    result: BeamCandidate = beam_search(model)()
    pprint(result)
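
The pruning step in `beam_search` relies on `heapq` being a min-heap: once the beam holds `beam_width` entries, `heappushpop` pushes the new candidate and evicts the lowest-scoring one, so the highest scores survive. A minimal standalone sketch of that idea on plain numbers follows; `keep_top_k` is a hypothetical helper written for illustration and is not part of this PR.

```python
import heapq


def keep_top_k(scores, k):
    """Keep the k largest scores using a bounded min-heap.

    Hypothetical helper mirroring the heappush / heappushpop pattern
    used in the beam.py example; not part of the PR itself.
    """
    heap = []
    for s in scores:
        if len(heap) >= k:
            # Push s, then pop the smallest element (possibly s itself).
            heapq.heappushpop(heap, s)
        else:
            heapq.heappush(heap, s)
    return sorted(heap, reverse=True)


top = keep_top_k([5, 1, 9, 3, 7], 3)
print(top)  # [9, 7, 5]
```

In `beam.py` the same pattern is applied to `BeamCandidate` objects, which compare by `score` through `__lt__`.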
26 changes: 26 additions & 0 deletions docs/source/beam_search_example.rst
@@ -0,0 +1,26 @@
Angelic Nondeterminism
======================

Here we give an example of *angelic nondeterminism* in effectful [#f1]_.
Our model is a nondeterministic program that makes choices using a ``choose`` effect and uses a ``score`` effect to sum up a final score.
We implement a beam search that optimizes this final score as a handler for the ``choose`` and ``score`` effects.

The beam search works by running the model until it reaches a ``choose``, at which point the continuation is captured.
This continuation is resumed multiple times with different values from ``choose`` to expand the beam.
The intermediate score is used to rank the beam candidates.

Because Python does not have support for first-class continuations, we use *thermometer continuations* [#f2]_.
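A thermometer continuation emulates a captured continuation by recording the result of every nondeterministic operation
(essentially, the model is rerun from the start, replaying the recorded ``choose`` effects).
If ``choose`` is the only source of nondeterminism, then recording the result of each ``choose`` and replaying it on a rerun
brings the model back to the same program point, which has the same effect as resuming the captured continuation.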
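
The replay idea can be sketched without effectful at all. The following is a minimal illustration in plain Python, not the implementation used in ``beam.py``: ``run_with_replay``, the ``choose`` callback passed into ``program``, and the explicit ``rng`` argument are all hypothetical names introduced for this sketch. Each call reruns the program from the start, replays the recorded choices, makes exactly one new choice, and suspends at the next one.

```python
import random


class Suspend(Exception):
    """Raised when a run reaches a second not-yet-recorded choice."""


def run_with_replay(program, trace, rng):
    """Rerun `program` from the start, replay `trace`, then make one new choice.

    Returns (extended_trace, finished, result). Illustrative only.
    """
    new_trace = list(trace)
    step = 0
    made_new_choice = False

    def choose(choices):
        nonlocal step, made_new_choice
        if step < len(trace):
            # Replay a choice recorded by an earlier run.
            value = trace[step]
            step += 1
            return value
        if made_new_choice:
            # Only one fresh choice per run; stop here and resume later.
            raise Suspend
        made_new_choice = True
        value = rng.choice(list(choices))
        new_trace.append(value)
        return value

    try:
        result = program(choose)
    except Suspend:
        return new_trace, False, None
    return new_trace, True, result


def program(choose):
    a = choose([1, 2, 3])
    b = choose([10, 20])
    return a + b


# Drive one execution path to completion, one choice per rerun.
rng = random.Random(0)
trace, finished, result = [], False, None
while not finished:
    trace, finished, result = run_with_replay(program, trace, rng)
print(trace, result)

# "Resume" the same suspended point twice, as a beam search would.
branch_a = run_with_replay(program, [2], random.Random(1))
branch_b = run_with_replay(program, [2], random.Random(2))
```

Resuming from the same recorded prefix more than once is what lets a search expand one suspended candidate into several successors.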

.. literalinclude:: ./beam.py
   :language: python

References
----------

.. [#f1] Li, Z., Solar-Lezama, A., Yue, Y., and Zheng, S., "EnCompass: Enhancing Agent Programming with Search Over Program Execution Paths", 2025. https://arxiv.org/abs/2512.03571

.. [#f2] James Koppel, Gabriel Scherer, and Armando Solar-Lezama. 2018. Capturing the future by replaying the past (functional pearl). Proc. ACM Program. Lang. 2, ICFP, Article 76 (September 2018), 29 pages. https://doi.org/10.1145/3236771
4 changes: 1 addition & 3 deletions docs/source/conf.py
@@ -12,10 +12,8 @@

 import os
 import sys
-from typing import List

 sys.path.insert(0, os.path.abspath("../../"))
-import sphinx_rtd_theme  # noqa: E402

 # -- Project information -----------------------------------------------------

@@ -69,7 +67,7 @@
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path.
-exclude_patterns: List[str] = []
+exclude_patterns: list[str] = []


 # -- Options for HTML output -------------------------------------------------
22 changes: 21 additions & 1 deletion docs/source/effectful.rst
@@ -15,7 +15,6 @@ Syntax
    :members:
    :undoc-members:

-.. autofunction:: effectful.ops.syntax.defterm(value: T) -> Expr[T]
 .. autofunction:: effectful.ops.syntax.defdata(value: Term[T]) -> Expr[T]

 Semantics
@@ -41,6 +40,27 @@ Handlers
    :undoc-members:


+LLM
+^^^
+
+.. automodule:: effectful.handlers.llm
+   :members:
+   :undoc-members:
+
+Encoding
+""""""""
+
+.. automodule:: effectful.handlers.llm.encoding
+   :members:
+   :undoc-members:
+
+Providers
+"""""""""
+
+.. automodule:: effectful.handlers.llm.providers
+   :members:
+   :undoc-members:
+
 Jax
 ^^^

1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -16,6 +16,7 @@ Table of Contents
    minipyro_example
    lambda_example
    semi_ring_example
+   beam_search_example

 .. toctree::
    :maxdepth: 2
5 changes: 3 additions & 2 deletions docs/source/lambda_.py
@@ -1,5 +1,6 @@
 import functools
-from typing import Annotated, Callable
+from collections.abc import Callable
+from typing import Annotated

 from effectful.ops.semantics import coproduct, evaluate, fvsof, fwd, handler
 from effectful.ops.syntax import Scoped, defdata, defop, syntactic_eq
@@ -102,7 +103,7 @@ def sort_add(x: Expr[int], y: Expr[int]) -> Expr[int]:
         case Term(add_, (a, Term(vx, ()))), Term(vy, ()) if add_ == add and id(vx) > id(
             vy
         ):
-            return (a + vy()) + vx()  # type: ignore
+            return (a + vy()) + vx()
         case _:
             return fwd()
