shawnrhoads · Apr 26, 2026
diff --git a/skills/pyem-model-generator/README.md b/skills/pyem-model-generator/README.md
@@ -0,0 +1,87 @@
+# pyem-model-generator skill
+
+Use this skill to generate standalone computational cognitive model code and a matching parameter-recovery notebook.
+
+## What this skill generates
+
+Given a task/model description, the skill generates files in a **single directory**:
+
+- `{modclass}_utils.py`
+- `{model_name}.py`
+- `{model_name}.ipynb`
+
+The generated model file follows a consistent contract:
+
+- attributes: `mod_desc`, `mod_spec`, `mod_id`, `MODEL`
+- functions: `mod_params`, `mod_sim`, `mod_fit`
+
+## Required references bundled with the skill
+
+- `references/rl.json`
+- `references/bayes.json`
+- `references/glm.json`
+- `references/modelclass-utils-template.py`
+- `references/model-file-template.py`
+- `references/example-notebook-template.json`
+- `references/parameter-recovery-notebook.md`
+- `references/pyem-runtime-contract.md`
+
+## Quick start (first-time users)
+
+1. Describe your task and model in plain language (or equations).
+2. Ask the skill to generate:
+   - `{modclass}_utils.py`
+   - `{model_name}.py`
+   - `{model_name}.ipynb`
+3. If details are missing, answer the skill’s follow-up questions.
+4. Review generated files and run your analysis workflow.
+
+## Notes on generated files
+
+### Shared utils file
+
+`{modclass}_utils.py` should define shared helpers used across model files:
+
+- `_alloc_sim`, `_alloc_fit`
+- `ModelSpec`, `ParamDef`
+- `spec_to_id`, `build_params`
+- `PARAM_REGISTRY`
+
+### Model file
+
+`{model_name}.py` imports math helpers from pyEM:
+
+```python
+from pyem.utils.math import norm2alpha, norm2beta, softmax, calc_fval
+```
+
+It also imports the shared helpers from `{modclass}_utils.py`:
+
+```python
+from {modclass}_utils import _alloc_sim, _alloc_fit, ModelSpec, spec_to_id, build_params
+```
+
+### Notebook file
+
+The notebook template uses:
+
+```python
+from pyem.api import EMModel
+```
+
+and follows a simulation → fit → recovery plot workflow.
+
+## Example prompt
+
+```text
+Use pyem-model-generator.
+Generate standalone files in one directory:
+- social_utils.py
+- social_rw.py
+- social_rw.ipynb
+
+Task: three-option social learning task with 4 blocks x 12 trials and 100 agents.
+Model: dual-value update equations for self and other values with softmax choice.
+Include parameter recovery plots in the notebook.
+Ask follow-up questions before generation if any details are ambiguous.
+```
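Note on the README contract: the required attributes and functions of a generated model file can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper `check_model_contract` that is not part of pyEM or of this skill; the stand-in module only illustrates the check.

```python
import types

REQUIRED_ATTRIBUTES = ("mod_desc", "mod_spec", "mod_id", "MODEL")
REQUIRED_FUNCTIONS = ("mod_params", "mod_sim", "mod_fit")


def check_model_contract(module):
    """Return a list of contract violations for a generated model module."""
    problems = []
    for name in REQUIRED_ATTRIBUTES:
        if not hasattr(module, name):
            problems.append(f"missing attribute: {name}")
    for name in REQUIRED_FUNCTIONS:
        if not callable(getattr(module, name, None)):
            problems.append(f"missing or non-callable function: {name}")
    return problems


# Stand-in module for illustration only; a real check would pass the result of
# importlib.import_module("social_rw") for a generated file.
demo = types.ModuleType("demo_model")
demo.mod_desc = "demo"
demo.mod_spec = {"rl": {"softmax": ["beta"]}}
demo.mod_id = "demo_id"
demo.MODEL = object()
demo.mod_params = lambda nsubj: None
demo.mod_sim = lambda params: None
# mod_fit is deliberately left out so the check reports it.
print(check_model_contract(demo))
```

Running a check like this before opening the notebook would surface a missing `mod_fit` or `MODEL` early, rather than as an `AttributeError` in the middle of the recovery workflow.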
diff --git a/skills/pyem-model-generator/SKILL.md b/skills/pyem-model-generator/SKILL.md
@@ -0,0 +1,96 @@
+---
+name: pyem-model-generator
+description: Generate standalone computational cognitive model modules and example notebooks from free-text or reference specs, using a shared `modclass_utils.py` contract and per-model files with `mod_desc`, `mod_spec`, `mod_id`, `MODEL`, `mod_params`, `mod_sim`, and `mod_fit`.
+---
+
+# pyem-model-generator
+
+Generate all outputs into the **current working directory** (flat layout).
+
+## Required local references
+
+- `references/rl.json`
+- `references/bayes.json`
+- `references/glm.json`
+- `references/modelclass-utils-template.py`
+- `references/model-file-template.py`
+- `references/example-notebook-template.json`
+- `references/parameter-recovery-notebook.md`
+- `references/pyem-runtime-contract.md`
+
+Do not require repository path conventions like `pyem/models/...` or `examples/...`.
+
+## Output layout (flat)
+
+Write files in one directory:
+
+- `{modclass}_utils.py`
+- `{model_name}.py` (one or more model files)
+- `{model_name}.ipynb` (or one notebook per model class)
+
+## Clarification behavior
+
+If required details are missing, ask concise follow-up questions before generation:
+
+1. Task structure (`nsubjects`, `nblocks`, `ntrials`, choices, outcomes).
+2. Parameter names/transforms/bounds/priors.
+3. Equations (state update and choice rule).
+4. Variant list and naming.
+5. Desired output filenames.
+
+## Free-text parsing workflow
+
+When given prose/equations:
+
+1. Extract task flow, tensors, equations, and variants.
+2. Normalize symbol names to valid Python variables.
+3. Preserve equation intent in `mod_sim`/`mod_fit`.
+4. Resolve ambiguities via targeted questions.
+
+## Shared utility heuristic (required)
+
+Create one shared `{modclass}_utils.py` file containing only:
+
+- `_alloc_sim`
+- `_alloc_fit`
+- `ModelSpec`
+- `ParamDef`
+- `spec_to_id`
+- `build_params`
+- `PARAM_REGISTRY`
+
+Each `{model_name}.py` should import shared helpers with:
+
+```python
+from {modclass}_utils import _alloc_sim, _alloc_fit, ModelSpec, spec_to_id, build_params
+```
+
+## Per-model file contract
+
+Each generated `{model_name}.py` must include:
+
+- attributes: `mod_desc`, `mod_spec`, `mod_id`, `MODEL`
+- functions: `mod_params`, `mod_sim`, `mod_fit`
+
+Each model file should import math helpers directly from pyEM:
+
+```python
+from pyem.utils.math import norm2alpha, norm2beta, softmax, calc_fval
+```
+
+## Notebook requirements
+
+Generate each notebook from `references/example-notebook-template.json` and ensure it imports:
+
+```python
+from pyem.api import EMModel
+```
+
+Do not use `from scipy.optimize import minimize` in generated notebooks.
+
+## Generation steps
+
+1. Select the closest anchor from `references/rl.json`, `references/bayes.json`, or `references/glm.json`.
+2. Generate `{modclass}_utils.py` from `references/modelclass-utils-template.py`.
+3. Generate each `{model_name}.py` from `references/model-file-template.py`.
+4. Generate notebook(s) from `references/example-notebook-template.json` and `references/parameter-recovery-notebook.md`.
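Note on step 2 of the free-text parsing workflow ("Normalize symbol names to valid Python variables"): SKILL.md does not say how this is done. One possible approach is sketched below. The function name `normalize_symbol`, the `GREEK` table, and the keyword-suffix rule are all assumptions made for illustration; the suffix `1` is chosen only to resemble the `lambda1` parameter name in `references/bayes.json`.

```python
import keyword
import re

# Hypothetical mapping; extend it for the symbols that appear in a given spec.
GREEK = {"α": "alpha", "β": "beta", "λ": "lambda", "σ": "sigma"}


def normalize_symbol(symbol):
    """Map a free-text symbol to a valid, non-keyword Python identifier."""
    text = "".join(GREEK.get(ch, ch) for ch in symbol.strip())
    # Collapse every run of non-word characters into a single underscore.
    text = re.sub(r"\W+", "_", text).strip("_")
    if not text:
        raise ValueError(f"cannot normalize symbol: {symbol!r}")
    if text[0].isdigit():
        text = "p_" + text  # identifiers cannot start with a digit
    if keyword.iskeyword(text):
        text += "1"  # assumed convention, e.g. lambda -> lambda1
    return text


for raw in ["α", "λ", "β_other", "Q self", "2nd-rate"]:
    print(raw, "->", normalize_symbol(raw))
```

Case is preserved on purpose, so a symbol such as `Q self` becomes `Q_self` rather than being folded into a different variable.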
diff --git a/skills/pyem-model-generator/references/bayes.json b/skills/pyem-model-generator/references/bayes.json
@@ -0,0 +1,18 @@
+{
+  "model_class": "bayes",
+  "utils_file": "modclass_utils.py",
+  "models": [
+    {
+      "model_name": "bayes_fish",
+      "model_file": "bayes_fish.py",
+      "notebook_file": "bayes_fish.ipynb",
+      "required_attributes": ["mod_desc", "mod_spec", "mod_id", "MODEL"],
+      "required_functions": ["mod_params", "mod_sim", "mod_fit"],
+      "shared_import": "from modclass_utils import _alloc_sim, _alloc_fit, ModelSpec, spec_to_id, build_params",
+      "math_import": "from pyem.utils.math import norm2alpha, norm2beta, softmax, calc_fval",
+      "parameters": ["beta", "lambda1"],
+      "sim_outputs": ["params", "choices", "observations", "posterior", "nll"],
+      "fit_outputs": ["npl", "nll", "all"]
+    }
+  ]
+}
diff --git a/skills/pyem-model-generator/references/example-notebook-template.json b/skills/pyem-model-generator/references/example-notebook-template.json
@@ -0,0 +1,73 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 5,
+  "metadata": {
+    "kernelspec": {
+      "display_name": "Python 3",
+      "language": "python",
+      "name": "python3"
+    },
+    "language_info": {
+      "name": "python"
+    }
+  },
+  "cell_templates": [
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# {model_title}\n",
+        "\n",
+        "## {task_title}\n",
+        "This notebook demonstrates simulation, fitting, and parameter recovery."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "import importlib\n",
+        "import numpy as np\n",
+        "import matplotlib.pyplot as plt\n",
+        "from pyem.api import EMModel"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "script_fn = \"{model_name}\"\n",
+        "nsubj = {nsubjects}\n",
+        "nblocks = {nblocks}\n",
+        "ntrials = {ntrials}\n",
+        "module = importlib.import_module(script_fn)\n",
+        "MODEL = module.MODEL\n",
+        "print(script_fn, end=\"\\n\")\n",
+        "print(MODEL.id, end=\"\\n\\n\")\n",
+        "print(MODEL.desc, end=\"\\n\\n\")\n",
+        "mod_params, mod_sim, mod_fit = MODEL.params, MODEL.sim, MODEL.fit\n",
+        "param_names, param_xform, true_params = mod_params(nsubj)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "sim_outp = mod_sim(true_params, nblocks=nblocks, ntrials=ntrials)\n",
+        "sim_data = [[sim_outp['choices'][i, ...], sim_outp['rewards'][i, ...]] for i in range(nsubj)]\n",
+        "len(sim_data)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "model = EMModel(all_data=sim_data, fit_func=mod_fit, param_names=param_names, param_xform=param_xform)\n",
+        "result = model.fit(verbose=1)\n",
+        "result"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "fig = model.plot_recovery({'true_params': true_params, 'estimated_params': model.outfit['params']})\n",
+        "fig"
+      ]
+    }
+  ]
+}
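Note on the notebook template: `cell_templates` is not itself a notebook, so something has to expand it into an `.ipynb` file. The sketch below shows one way that could work, using only the standard library. `fill` and `build_notebook` are hypothetical names, and the small `demo_template` stands in for the real file; nothing here is taken from pyEM.

```python
import json


def fill(text, values):
    """Substitute {key} placeholders without touching any other braces."""
    for key, value in values.items():
        text = text.replace("{" + key + "}", str(value))
    return text


def build_notebook(template, values):
    """Expand `cell_templates` into a notebook dict in nbformat 4.5 layout."""
    cells = []
    for index, cell in enumerate(template["cell_templates"]):
        out = {
            "id": f"cell-{index}",  # nbformat 4.5 expects an id on every cell
            "cell_type": cell["cell_type"],
            "metadata": {},
            "source": [fill(line, values) for line in cell["source"]],
        }
        if cell["cell_type"] == "code":
            out["execution_count"] = None
            out["outputs"] = []
        cells.append(out)
    return {
        "nbformat": template["nbformat"],
        "nbformat_minor": template["nbformat_minor"],
        "metadata": template["metadata"],
        "cells": cells,
    }


# Reduced stand-in for references/example-notebook-template.json.
demo_template = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cell_templates": [
        {"cell_type": "markdown", "source": ["# {model_title}\n"]},
        {"cell_type": "code", "source": ["nsubj = {nsubjects}\n", "d = {'a': 1}"]},
    ],
}
notebook = build_notebook(demo_template, {"model_title": "Social RW", "nsubjects": 100})
text = json.dumps(notebook, indent=1)  # this string is what would be written to disk
```

Plain `str.replace` is used instead of `str.format` because the code cells contain literal braces (dict literals, for example) that `str.format` would try to interpret as fields.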
diff --git a/skills/pyem-model-generator/references/glm.json b/skills/pyem-model-generator/references/glm.json
@@ -0,0 +1,18 @@
+{
+  "model_class": "glm",
+  "utils_file": "modclass_utils.py",
+  "models": [
+    {
+      "model_name": "glm_linear",
+      "model_file": "glm_linear.py",
+      "notebook_file": "glm_linear.ipynb",
+      "required_attributes": ["mod_desc", "mod_spec", "mod_id", "MODEL"],
+      "required_functions": ["mod_params", "mod_sim", "mod_fit"],
+      "shared_import": "from modclass_utils import _alloc_sim, _alloc_fit, ModelSpec, spec_to_id, build_params",
+      "math_import": "from pyem.utils.math import norm2alpha, norm2beta, softmax, calc_fval",
+      "parameters": ["w0", "w1", "sigma"],
+      "sim_outputs": ["params", "X", "y", "pred", "nll"],
+      "fit_outputs": ["npl", "nll", "all"]
+    }
+  ]
+}
diff --git a/skills/pyem-model-generator/references/model-file-template.py b/skills/pyem-model-generator/references/model-file-template.py
@@ -0,0 +1,92 @@
+"""Template for one generated model module."""
+
+from __future__ import annotations
+
+import numpy as np
+
+from pyem.utils.math import calc_fval, norm2alpha, norm2beta, softmax
+from modclass_utils import (
+    ModelSpec,
+    _alloc_fit,
+    _alloc_sim,
+    build_params,
+    spec_to_id,
+)
+
+
+mod_desc = """Replace with concise model description."""
+mod_spec = {"rl": {"softmax": ["beta"], "rw": ["alpha"]}}
+mod_id = spec_to_id(mod_spec)
+
+
+def mod_params(nsubj: int, rng: np.random.Generator | None = None):
+    """Generate parameter names, transforms, and true parameters."""
+    return build_params(["beta", "alpha"], nsubj, rng)
+
+
+def mod_sim(params: np.ndarray, nblocks: int = 4, ntrials: int = 12, **kwargs):
+    """Simulate behavior for this model variant."""
+    nsubj = params.shape[0]
+    dat = _alloc_sim(nsubj, nblocks, ntrials, nchoices=2)
+    rng = np.random.default_rng(kwargs.get("seed", None))
+
+    beta = params[:, 0]
+    alpha = params[:, 1]
+
+    for s in range(nsubj):
+        for b in range(nblocks):
+            dat["ev"][s, b, 0, :] = 0.5
+            for t in range(ntrials):
+                p = softmax(dat["ev"][s, b, t, :], beta[s])
+                c = rng.choice([0, 1], p=p)
+                r = float(rng.integers(0, 2))  # placeholder: coin-flip reward, independent of choice
+                dat["choices"][s, b, t] = "A" if c == 0 else "B"
+                dat["rewards"][s, b, t] = r
+                dat["ch_prob"][s, b, t, :] = p
+                dat["pe"][s, b, t] = r - dat["ev"][s, b, t, c]
+                dat["ev"][s, b, t + 1, :] = dat["ev"][s, b, t, :]
+                dat["ev"][s, b, t + 1, c] = (
+                    dat["ev"][s, b, t, c] + alpha[s] * dat["pe"][s, b, t]
+                )
+                dat["nll"][s] += -np.log(p[c] + 1e-12)
+
+    dat["params"] = params
+    return dat
+
+
+def mod_fit(params, choices, rewards, prior=None, output="npl"):
+    """Fit objective (npl/nll) with optional diagnostics."""
+    beta = float(norm2beta(params[0]))
+    alpha = float(norm2alpha(params[1]))
+
+    if not (1e-5 <= beta <= 20.0) or not (0.0 <= alpha <= 1.0):
+        return 1e7  # large penalty for out-of-bounds parameters
+
+    nblocks, ntrials = rewards.shape
+    dat = _alloc_fit(nblocks, ntrials, nchoices=2)
+
+    for b in range(nblocks):
+        dat["ev"][b, 0, :] = 0.5
+        for t in range(ntrials):
+            c = 0 if choices[b, t] == "A" else 1
+            p = softmax(dat["ev"][b, t, :], beta)
+            r = rewards[b, t]
+            dat["ch_prob"][b, t, :] = p
+            dat["pe"][b, t] = r - dat["ev"][b, t, c]
+            dat["ev"][b, t + 1, :] = dat["ev"][b, t, :]
+            dat["ev"][b, t + 1, c] = dat["ev"][b, t, c] + alpha * dat["pe"][b, t]
+            dat["nll"] += -np.log(p[c] + 1e-12)
+
+    if output == "all":
+        return {"params": [beta, alpha], **dat}
+    return calc_fval(dat["nll"], np.asarray(params), prior=prior, output=output)
+
+
+MODEL = ModelSpec(
+    id=mod_id,
+    spec=mod_spec,
+    desc=mod_desc,
+    params=mod_params,
+    sim=mod_sim,
+    fit=mod_fit,
+)