87 changes: 87 additions & 0 deletions skills/pyem-model-generator/README.md
@@ -0,0 +1,87 @@
# pyem-model-generator skill

Use this skill to generate standalone computational cognitive model code and a matching parameter-recovery notebook.

## What this skill generates

Given a task/model description, the skill generates files in a **single directory**:

- `{modclass}_utils.py`
- `{model_name}.py`
- `{model_name}.ipynb`

The generated model file follows a consistent contract:

- attributes: `mod_desc`, `mod_spec`, `mod_id`, `MODEL`
- functions: `mod_params`, `mod_sim`, `mod_fit`
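
A generated module can be checked against this contract mechanically. The sketch below is illustrative only: `missing_contract_names` is a hypothetical helper that is not part of pyEM or this skill, and the `SimpleNamespace` object stands in for a real imported model module.

```python
from __future__ import annotations

from types import SimpleNamespace

REQUIRED_ATTRIBUTES = ("mod_desc", "mod_spec", "mod_id", "MODEL")
REQUIRED_FUNCTIONS = ("mod_params", "mod_sim", "mod_fit")


def missing_contract_names(module) -> list[str]:
    """Return contract names that are absent or, for functions, not callable."""
    missing = [name for name in REQUIRED_ATTRIBUTES if not hasattr(module, name)]
    missing += [
        name
        for name in REQUIRED_FUNCTIONS
        if not callable(getattr(module, name, None))
    ]
    return missing


# Stand-in for an imported model module (hypothetical); a real check would pass
# the result of importlib.import_module("<model_name>") instead.
fake_module = SimpleNamespace(
    mod_desc="demo",
    mod_spec={},
    mod_id="demo",
    MODEL=object(),
    mod_params=lambda nsubj: None,
    mod_sim=lambda params: None,
)
print(missing_contract_names(fake_module))  # ['mod_fit']
```

An empty list means every name in the contract is present.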

## Required references bundled with the skill

- `references/rl.json`
- `references/bayes.json`
- `references/glm.json`
- `references/modelclass-utils-template.py`
- `references/model-file-template.py`
- `references/example-notebook-template.json`
- `references/parameter-recovery-notebook.md`
- `references/pyem-runtime-contract.md`

## Quick start (first-time users)

1. Describe your task and model in plain language (or equations).
2. Ask the skill to generate:
   - `{modclass}_utils.py`
   - `{model_name}.py`
   - `{model_name}.ipynb`
3. If details are missing, answer the skill’s follow-up questions.
4. Review generated files and run your analysis workflow.

## Notes on generated files

### Shared utils file

`{modclass}_utils.py` should define shared helpers used across model files:

- `_alloc_sim`, `_alloc_fit`
- `ModelSpec`, `ParamDef`
- `spec_to_id`, `build_params`
- `PARAM_REGISTRY`
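
The utils template itself is not shown in this README, so the following is only a guess at the shape two of these helpers could take. The `ModelSpec` field names follow the keyword arguments used in `references/model-file-template.py` (`id`, `spec`, `desc`, `params`, `sim`, `fit`); the dataclass form and the ID scheme in `spec_to_id` are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical container bundling one model's metadata and functions."""

    id: str
    spec: dict
    desc: str
    params: Callable[..., Any]
    sim: Callable[..., Any]
    fit: Callable[..., Any]


def spec_to_id(spec: dict) -> str:
    """Hypothetical ID scheme: join class, component, and parameter names."""
    parts = []
    for model_class, components in spec.items():
        for component, param_names in components.items():
            parts.append(f"{model_class}-{component}-" + "-".join(param_names))
    return "_".join(parts)


example_spec = {"rl": {"softmax": ["beta"], "rw": ["alpha"]}}
print(spec_to_id(example_spec))  # rl-softmax-beta_rl-rw-alpha
```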

### Model file

`{model_name}.py` imports math helpers from pyEM:

```python
from pyem.utils.math import norm2alpha, norm2beta, softmax, calc_fval
```

It imports shared helpers from `{modclass}_utils.py`:

```python
from {modclass}_utils import _alloc_sim, _alloc_fit, ModelSpec, spec_to_id, build_params
```

### Notebook file

The notebook template uses:

```python
from pyem.api import EMModel
```

and follows a simulation → fit → recovery plot workflow.
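
To make the recovery step concrete, here is a minimal sketch that does not use pyEM at all. It assumes recovery is summarized as a per-parameter Pearson correlation between true and estimated values; the "estimates" are just the true values plus noise, and `recovery_correlations` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
nsubj = 100
param_names = ["beta", "alpha"]

# Simulated ground truth, one row per subject and one column per parameter.
true_params = rng.normal(size=(nsubj, len(param_names)))
# Stand-in for fitted estimates: the true values plus estimation noise.
estimated_params = true_params + rng.normal(scale=0.3, size=true_params.shape)


def recovery_correlations(true_vals, est_vals, names):
    """Pearson correlation between true and estimated values, per parameter."""
    return {
        name: float(np.corrcoef(true_vals[:, j], est_vals[:, j])[0, 1])
        for j, name in enumerate(names)
    }


recovery = recovery_correlations(true_params, estimated_params, param_names)
for name, r in recovery.items():
    print(f"{name}: r = {r:.2f}")
```

Correlations near 1 indicate that the fitting procedure can recover the parameters that generated the data.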

## Example prompt

```text
Use pyem-model-generator.
Generate standalone files in one directory:
- social_utils.py
- social_rw.py
- social_rw.ipynb

Task: three-option social learning task with 4 blocks x 12 trials and 100 agents.
Model: dual-value update equations for self and other values with softmax choice.
Include parameter recovery plots in the notebook.
Ask follow-up questions before generation if any details are ambiguous.
```
96 changes: 96 additions & 0 deletions skills/pyem-model-generator/SKILL.md
@@ -0,0 +1,96 @@
---
name: pyem-model-generator
description: Generate standalone computational cognitive model modules and example notebooks from free-text or reference specs, using a shared `modclass_utils.py` contract and per-model files with `mod_desc`, `mod_spec`, `mod_id`, `MODEL`, `mod_params`, `mod_sim`, and `mod_fit`.
---

# pyem-model-generator

Generate all outputs into the **current working directory** (flat layout).

## Required local references

- `references/rl.json`
- `references/bayes.json`
- `references/glm.json`
- `references/modelclass-utils-template.py`
- `references/model-file-template.py`
- `references/example-notebook-template.json`
- `references/parameter-recovery-notebook.md`
- `references/pyem-runtime-contract.md`

Do not require repository path conventions like `pyem/models/...` or `examples/...`.

## Output layout (flat)

Write files in one directory:

- `{modclass}_utils.py`
- `{model_name}.py` (one or more model files)
- `{model_name}.ipynb` (or one notebook per model class)
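
As a sketch of the flat layout, the expected paths can be derived from the model class and model names. `expected_outputs` is a hypothetical helper, and it assumes one notebook per model rather than one per model class.

```python
from pathlib import Path


def expected_outputs(modclass, model_names, out_dir="."):
    """List the files a generation run should write into one flat directory."""
    base = Path(out_dir)
    files = [base / f"{modclass}_utils.py"]
    for name in model_names:
        files.append(base / f"{name}.py")
        files.append(base / f"{name}.ipynb")
    return files


paths = expected_outputs("social", ["social_rw"])
print([p.name for p in paths])
# ['social_utils.py', 'social_rw.py', 'social_rw.ipynb']
```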

## Clarification behavior

If required details are missing, ask concise follow-up questions before generation:

1. Task structure (`nsubjects`, `nblocks`, `ntrials`, choices, outcomes).
2. Parameter names/transforms/bounds/priors.
3. Equations (state update and choice rule).
4. Variant list and naming.
5. Desired output filenames.
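
One way to picture this behavior is a lookup from missing details to questions. The dictionary keys and question wording below are assumptions chosen for the example, not a schema defined by the skill.

```python
from __future__ import annotations

# Hypothetical keys for the five required details, each with its question.
REQUIRED_DETAILS = {
    "task_structure": "What are nsubjects, nblocks, ntrials, the choices, and the outcomes?",
    "parameters": "Which parameters are there, with which transforms, bounds, and priors?",
    "equations": "What are the state-update equations and the choice rule?",
    "variants": "Which model variants should be generated, and how are they named?",
    "filenames": "What should the output files be called?",
}


def follow_up_questions(request: dict) -> list[str]:
    """Return one question for every required detail that is missing or empty."""
    return [
        question
        for key, question in REQUIRED_DETAILS.items()
        if not request.get(key)
    ]


request = {
    "task_structure": "4 blocks x 12 trials, 100 agents, 3 options",
    "equations": "dual-value update with softmax choice",
}
for question in follow_up_questions(request):
    print("-", question)
```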

## Free-text parsing workflow

When given prose/equations:

1. Extract task flow, tensors, equations, and variants.
2. Normalize symbol names to valid Python variables.
3. Preserve equation intent in `mod_sim`/`mod_fit`.
4. Resolve ambiguities via targeted questions.
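
Step 2 can be sketched as follows. The Greek-letter table, the `p_` prefix for names that start with a digit, and the trailing underscore for Python keywords are all illustrative choices, not rules defined by the skill.

```python
import keyword
import re
import unicodedata

# Small illustrative mapping; a real table would cover more symbols.
GREEK = {"α": "alpha", "β": "beta", "λ": "lambda", "σ": "sigma"}


def to_identifier(symbol: str) -> str:
    """Convert a free-text symbol into a valid, non-keyword Python name."""
    for greek, name in GREEK.items():
        symbol = symbol.replace(greek, name)
    # Drop remaining non-ASCII characters, then collapse anything that is not
    # a letter, digit, or underscore into a single underscore.
    ascii_text = unicodedata.normalize("NFKD", symbol).encode("ascii", "ignore").decode()
    name = re.sub(r"\W+", "_", ascii_text).strip("_").lower()
    if not name:
        raise ValueError(f"cannot derive an identifier from {symbol!r}")
    if name[0].isdigit():
        name = f"p_{name}"
    if keyword.iskeyword(name):
        name += "_"
    return name


for raw in ["β", "learning rate", "λ", "2nd weight"]:
    print(repr(raw), "->", to_identifier(raw))
```

Note that `λ` maps to `lambda_`, because `lambda` is a reserved word in Python.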

## Shared utility heuristic (required)

Create one shared `{modclass}_utils.py` file containing only:

- `_alloc_sim`
- `_alloc_fit`
- `ModelSpec`
- `ParamDef`
- `spec_to_id`
- `build_params`
- `PARAM_REGISTRY`

Each `{model_name}.py` should import shared helpers with:

```python
from {modclass}_utils import _alloc_sim, _alloc_fit, ModelSpec, spec_to_id, build_params
```

## Per-model file contract

Each generated `{model_name}.py` must include:

- attributes: `mod_desc`, `mod_spec`, `mod_id`, `MODEL`
- functions: `mod_params`, `mod_sim`, `mod_fit`

Each model file should import math helpers directly from pyem:

```python
from pyem.utils.math import norm2alpha, norm2beta, softmax, calc_fval
```
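
For orientation, the sketch below shows the kind of computation these helpers support in a Rescorla-Wagner model with softmax choice. The three functions are stand-ins written for this example and are not pyEM's implementations: the sigmoid forms are assumptions, the 20.0 ceiling mirrors the bounds check in the model template's `mod_fit`, and the `(values, beta)` argument order mirrors how the template calls `softmax`.

```python
import numpy as np


def unit_sigmoid(x):
    """Stand-in for a norm2alpha-style transform: real line -> (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def scaled_sigmoid(x, max_val=20.0):
    """Stand-in for a norm2beta-style transform: real line -> (0, max_val)."""
    return max_val / (1.0 + np.exp(-x))


def softmax_choice(values, beta):
    """Choice probabilities from values and an inverse temperature."""
    z = beta * (values - np.max(values))  # subtract the max for numerical stability
    expz = np.exp(z)
    return expz / expz.sum()


alpha = unit_sigmoid(0.0)    # 0.5
beta = scaled_sigmoid(0.0)   # 10.0
ev = np.array([0.5, 0.5])
p = softmax_choice(ev, beta)  # equal values give equal probabilities

# One Rescorla-Wagner update after choosing option 0 and receiving reward 1.
c, r = 0, 1.0
pe = r - ev[c]
ev_next = ev.copy()
ev_next[c] = ev[c] + alpha * pe
print(p, ev_next)  # [0.5 0.5] [0.75 0.5]
```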

## Notebook requirements

Generate the notebook from `references/example-notebook-template.json` and ensure it imports:

```python
from pyem.api import EMModel
```

Do not use `from scipy.optimize import minimize` in generated notebooks.
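
Both requirements can be verified on the generated notebook. The sketch below assumes the notebook has been loaded into a dict with a top-level `cells` list, as in the Jupyter notebook format; `notebook_import_issues` is a hypothetical helper, and the plain substring match is a deliberate simplification.

```python
from __future__ import annotations


def notebook_import_issues(notebook: dict) -> list[str]:
    """Report a missing EMModel import or a forbidden scipy minimize import."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # A cell's source may be stored as one string or as a list of lines.
        chunks.append(source if isinstance(source, str) else "".join(source))
    code = "\n".join(chunks)

    issues = []
    if "from pyem.api import EMModel" not in code:
        issues.append("missing: from pyem.api import EMModel")
    if "from scipy.optimize import minimize" in code:
        issues.append("forbidden: from scipy.optimize import minimize")
    return issues


demo = {
    "cells": [
        {
            "cell_type": "code",
            "source": ["import numpy as np\n", "from scipy.optimize import minimize\n"],
        }
    ]
}
print(notebook_import_issues(demo))
```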

## Generation steps

1. Select the closest anchor from `references/rl.json`, `references/bayes.json`, `references/glm.json`.
2. Generate `{modclass}_utils.py` from `references/modelclass-utils-template.py`.
3. Generate each `{model_name}.py` from `references/model-file-template.py`.
4. Generate notebook(s) from `references/example-notebook-template.json` and `references/parameter-recovery-notebook.md`.
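
Step 4 involves filling placeholders such as `{model_name}` in the notebook's cell templates. A minimal sketch of that substitution follows; `render_cells` is a hypothetical helper, the template entries are shortened stand-ins, and a complete notebook may need fields beyond those set here.

```python
def render_cells(cell_templates, values):
    """Fill known placeholders in cell templates and return notebook-style cells."""
    cells = []
    for template in cell_templates:
        source = []
        for line in template["source"]:
            # Targeted replacement: str.format would also try to fill braces
            # that belong to the generated code, such as f-string fields.
            for key, value in values.items():
                line = line.replace("{" + key + "}", str(value))
            source.append(line)
        cell = {"cell_type": template["cell_type"], "metadata": {}, "source": source}
        if template["cell_type"] == "code":
            cell["execution_count"] = None
            cell["outputs"] = []
        cells.append(cell)
    return cells


templates = [
    {"cell_type": "markdown", "source": ["# {model_title}\n"]},
    {
        "cell_type": "code",
        "source": [
            'script_fn = "{model_name}"\n',
            'module = importlib.import_module(f"{script_fn}")',
        ],
    },
]
values = {"model_title": "Social RW", "model_name": "social_rw"}
cells = render_cells(templates, values)
print("".join(cells[1]["source"]))
```

Replacing only the known keys leaves `{script_fn}` untouched, which matters because it is part of an f-string in the generated code.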
18 changes: 18 additions & 0 deletions skills/pyem-model-generator/references/bayes.json
@@ -0,0 +1,18 @@
{
  "model_class": "bayes",
  "utils_file": "modclass_utils.py",
  "models": [
    {
      "model_name": "bayes_fish",
      "model_file": "bayes_fish.py",
      "notebook_file": "bayes_fish.ipynb",
      "required_attributes": ["mod_desc", "mod_spec", "mod_id", "MODEL"],
      "required_functions": ["mod_params", "mod_sim", "mod_fit"],
      "shared_import": "from modclass_utils import _alloc_sim, _alloc_fit, ModelSpec, spec_to_id, build_params",
      "math_import": "from pyem.utils.math import norm2alpha, norm2beta, softmax, calc_fval",
      "parameters": ["beta", "lambda1"],
      "sim_outputs": ["params", "choices", "observations", "posterior", "nll"],
      "fit_outputs": ["npl", "nll", "all"]
    }
  ]
}
@@ -0,0 +1,73 @@
{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cell_templates": [
    {
      "cell_type": "markdown",
      "source": [
        "# {model_title}\\n",
        "\\n",
        "## {task_title}\\n",
        "This notebook demonstrates simulation, fitting, and parameter recovery."
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "import importlib\\n",
        "import numpy as np\\n",
        "import matplotlib.pyplot as plt\\n",
        "from pyem.api import EMModel"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "script_fn = \"{model_name}\"\\n",
        "nsubj = {nsubjects}\\n",
        "nblocks = {nblocks}\\n",
        "ntrials = {ntrials}\\n",
        "module = importlib.import_module(f\"{script_fn}\")\\n",
        "MODEL = module.MODEL\\n",
        "print(script_fn, end=\"\\n\")\\n",
        "print(MODEL.id, end=\"\\n\\n\")\\n",
        "print(MODEL.desc, end=\"\\n\\n\")\\n",
        "mod_params, mod_sim, mod_fit = MODEL.params, MODEL.sim, MODEL.fit\\n",
        "param_names, param_xform, true_params = mod_params(nsubj)"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "sim_outp = mod_sim(true_params, nblocks=nblocks, ntrials=ntrials)\\n",
        "sim_data = [[sim_outp['choices'][i, ...], sim_outp['rewards'][i, ...]] for i in range(nsubj)]\\n",
        "len(sim_data)"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "model = EMModel(all_data=sim_data, fit_func=mod_fit, param_names=param_names, param_xform=param_xform)\\n",
        "result = model.fit(verbose=1)\\n",
        "result"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "fig = model.plot_recovery({'true_params': true_params, 'estimated_params': model.outfit['params']})\\n",
        "fig"
      ]
    }
  ]
}
18 changes: 18 additions & 0 deletions skills/pyem-model-generator/references/glm.json
@@ -0,0 +1,18 @@
{
  "model_class": "glm",
  "utils_file": "modclass_utils.py",
  "models": [
    {
      "model_name": "glm_linear",
      "model_file": "glm_linear.py",
      "notebook_file": "glm_linear.ipynb",
      "required_attributes": ["mod_desc", "mod_spec", "mod_id", "MODEL"],
      "required_functions": ["mod_params", "mod_sim", "mod_fit"],
      "shared_import": "from modclass_utils import _alloc_sim, _alloc_fit, ModelSpec, spec_to_id, build_params",
      "math_import": "from pyem.utils.math import norm2alpha, norm2beta, softmax, calc_fval",
      "parameters": ["w0", "w1", "sigma"],
      "sim_outputs": ["params", "X", "y", "pred", "nll"],
      "fit_outputs": ["npl", "nll", "all"]
    }
  ]
}
92 changes: 92 additions & 0 deletions skills/pyem-model-generator/references/model-file-template.py
@@ -0,0 +1,92 @@
"""Template for one generated model module."""

from __future__ import annotations

import numpy as np

from pyem.utils.math import calc_fval, norm2alpha, norm2beta, softmax
from modclass_utils import (
    ModelSpec,
    _alloc_fit,
    _alloc_sim,
    build_params,
    spec_to_id,
)


mod_desc = """Replace with concise model description."""
mod_spec = {"rl": {"softmax": ["beta"], "rw": ["alpha"]}}
mod_id = spec_to_id(mod_spec)


def mod_params(nsubj: int, rng: np.random.Generator | None = None):
    """Generate parameter names, transforms, and true parameters."""
    return build_params(["beta", "alpha"], nsubj, rng)


def mod_sim(params: np.ndarray, nblocks: int = 4, ntrials: int = 12, **kwargs):
    """Simulate behavior for this model variant."""
    nsubj = params.shape[0]
    dat = _alloc_sim(nsubj, nblocks, ntrials, nchoices=2)
    rng = np.random.default_rng(kwargs.get("seed", None))

    beta = params[:, 0]
    alpha = params[:, 1]

    for s in range(nsubj):
        for b in range(nblocks):
            dat["ev"][s, b, 0, :] = 0.5
            for t in range(ntrials):
                p = softmax(dat["ev"][s, b, t, :], beta[s])
                c = rng.choice([0, 1], p=p)
                r = float(rng.integers(0, 2))
                dat["choices"][s, b, t] = "A" if c == 0 else "B"
                dat["rewards"][s, b, t] = r
                dat["ch_prob"][s, b, t, :] = p
                dat["pe"][s, b, t] = r - dat["ev"][s, b, t, c]
                dat["ev"][s, b, t + 1, :] = dat["ev"][s, b, t, :]
                dat["ev"][s, b, t + 1, c] = (
                    dat["ev"][s, b, t, c] + alpha[s] * dat["pe"][s, b, t]
                )
                dat["nll"][s] += -np.log(p[c] + 1e-12)

    dat["params"] = params
    return dat


def mod_fit(params, choices, rewards, prior=None, output="npl"):
    """Fit objective (npl/nll) with optional diagnostics."""
    beta = float(norm2beta(params[0]))
    alpha = float(norm2alpha(params[1]))

    if not (1e-5 <= beta <= 20.0) or not (0.0 <= alpha <= 1.0):
        return 1e7

    nblocks, ntrials = rewards.shape
    dat = _alloc_fit(nblocks, ntrials, nchoices=2)

    for b in range(nblocks):
        dat["ev"][b, 0, :] = 0.5
        for t in range(ntrials):
            c = 0 if choices[b, t] == "A" else 1
            p = softmax(dat["ev"][b, t, :], beta)
            r = rewards[b, t]
            dat["ch_prob"][b, t, :] = p
            dat["pe"][b, t] = r - dat["ev"][b, t, c]
            dat["ev"][b, t + 1, :] = dat["ev"][b, t, :]
            dat["ev"][b, t + 1, c] = dat["ev"][b, t, c] + alpha * dat["pe"][b, t]
            dat["nll"] += -np.log(p[c] + 1e-12)

    if output == "all":
        return {"params": [beta, alpha], **dat}
    return calc_fval(dat["nll"], np.asarray(params), prior=prior, output=output)


MODEL = ModelSpec(
    id=mod_id,
    spec=mod_spec,
    desc=mod_desc,
    params=mod_params,
    sim=mod_sim,
    fit=mod_fit,
)