shawnrhoads · shawnrhoads · Apr 25, 2026 · Apr 26, 2026 · Apr 26, 2026
diff --git a/skills/pyem-model-generator/README.md b/skills/pyem-model-generator/README.md
@@ -0,0 +1,42 @@
+# pyem-model-generator skill
+
+Use this skill to scaffold computational cognitive models for pyEM that are not included in base `pyem`. It accepts either a filled template or a free-text task/model description with equations.
+
+## What it generates
+
+- `pyem/models/{model_class}.py`
+  - `{model_name}_sim(params, nblocks, ntrials, **kwargs)`
+  - `{model_name}_fit(params, *, prior=None, output="npl", **kwargs)`
+- `examples/{model_class}.ipynb`
+  - model/task description
+  - simulation and fit demo
+  - parameter recovery plot (like `examples/rl.ipynb`)
+
+## Included templates
+
+- `template.json` (simple main template)
+- `rl.json`, `bayes.json`, `glm.json` (reference anchors)
+- `references/example-notebook-template.json` (parameter recovery notebook template)
+
+## How to use
+
+1. Copy and fill `template.json`.
+2. Provide it to the skill in your prompt.
+3. Answer follow-up questions if anything is missing.
+4. Ask the skill to generate the model module and notebook.
+
+
+## Offline resources
+
+If the runtime does not include full `pyem` source files, use:
+
+- `references/pyem-runtime-contract.md`
+- `references/parameter-recovery-notebook.md`
+
+These provide enough contract detail to generate pyEM-compatible sim/fit functions and notebook recovery plots.
+
+
+## Free-text description support
+
+If you provide prose and equations instead of a filled template, the skill parses the text into `description_input.extracted_spec`.
+Use `description_examples.social_signals` in `template.json` as a worked example of this conversion.
diff --git a/skills/pyem-model-generator/SKILL.md b/skills/pyem-model-generator/SKILL.md
@@ -0,0 +1,177 @@
+---
+name: pyem-model-generator
+description: Generate new computational cognitive model modules for pyEM and matching example notebooks. Use when asked to add a model not included in base pyEM, scaffold `pyem/models/{model_class}.py` with `{model_name}_sim(params, nblocks, ntrials, **kwargs)` and `{model_name}_fit(params, *, prior=None, output="npl", **kwargs)`, and produce `examples/{model_class}.ipynb`. Trigger this skill when the user wants pyEM-style imports, parameter transformations (e.g., `norm2alpha`, `norm2beta`), output dictionaries, model variants, or RL/Bayes/GLM-aligned structure.
+---
+
+# pyem-model-generator
+
+Generate pyEM-compatible model code using patterns in:
+
+- `pyem/models/rl.py`
+- `pyem/models/bayes.py`
+- `pyem/models/glm.py`
+
+## Offline/resource mode
+
+When full `pyem` package files are unavailable, load:
+
+- `references/pyem-runtime-contract.md` for utility and fit contracts.
+- `references/parameter-recovery-notebook.md` for notebook structure and plotting requirements.
+- `references/example-notebook-template.json` for a ready-to-fill notebook cell template.
+
+In offline mode, follow these references instead of guessing utility behavior.
+
+
+## Mandatory clarification behavior
+
+If any required information is missing or ambiguous, ask the user concise follow-up questions before generating code.
+
+Required items to confirm:
+
+1. `model_class`, `model_name`, and target module path.
+2. Simulation task inputs (at minimum `nblocks` and `ntrials`; plus task-specific arrays).
+3. Parameter list with transform/bounds and semantic role.
+4. Sim output keys and fit output modes.
+5. Variant definitions (if requested).
+
+
+## Converting free-text model descriptions into code
+
+When the user provides prose/equations instead of a filled template:
+
+1. Copy the text into `template.description_input.raw_text`.
+2. Extract structured fields into `template.description_input.extracted_spec`:
+   - task flow (stimulus, choice set, outcomes, feedback),
+   - tensor shapes (subject/block/trial/option),
+   - latent state names (`Q_self`, `Q_other`, etc.),
+   - update equations,
+   - choice policy equation(s),
+   - variant catalog and parameter toggles.
+3. Normalize equation variables into valid Python names and map them to parameter definitions.
+4. If any mapping is ambiguous (for example sign conventions or variant naming), ask targeted follow-up questions before code generation.
+5. Generate sim/fit using the extracted spec and preserve the user's intended equations exactly.
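Step 3 above (normalizing equation variables into valid Python names) could be sketched roughly as follows. This is an illustration only, not code from this PR: `to_python_name` and the small `GREEK` table are hypothetical, and a real implementation would need a fuller symbol map and a collision check.

```python
import keyword
import re

# Hypothetical, deliberately small symbol table; extend as needed.
GREEK = {"α": "alpha", "β": "beta", "γ": "gamma", "λ": "lambda", "ω": "omega"}


def to_python_name(symbol: str) -> str:
    """Turn an equation symbol such as 'Q^{self}' or 'λ' into a valid identifier."""
    # Spell out Greek letters so the result stays ASCII.
    text = "".join(GREEK.get(ch, ch) for ch in symbol.strip())
    # Collapse any run of non-word characters (spaces, ^, braces) into one underscore.
    text = re.sub(r"\W+", "_", text).strip("_")
    # Identifiers cannot be empty or start with a digit.
    if not text or text[0].isdigit():
        text = "p_" + text
    # Avoid reserved words such as `lambda`.
    if keyword.iskeyword(text):
        text += "_"
    return text


names = [to_python_name(s) for s in ["Q^{self}", "α", "λ", "w other"]]
```

Any symbol whose normalized name collides with another one is a case for the follow-up question in step 4 rather than for silent renaming.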
+
+### Example: social signals description
+
+For a text description of a “social signals task” with three options (A/B/C), dual value tracks (`Q_self`, `Q_other`), and policy variants, produce:
+
+- arrays shaped `(nsubjects, nblocks, ntrials, 3)` for option-level values/probabilities,
+- update rules for `Q_self` and `Q_other` with separate learning-rate/valence parameters where requested,
+- base policy `softmax(beta * (w_self * Q_self + w_other * Q_other))`,
+- arbitration variants `p = (1-omega) * softmax(beta * Q_self) + omega * softmax(beta * Q_other)`,
+- variant names tracked in template `variants.variant_names` and parsed parameter switches in `description_input.extracted_spec.variant_rules`.
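The two policy forms listed above differ in where the mixing happens: the base policy mixes values before a single softmax, while the arbitration variant mixes the probabilities of two separate softmaxes. The sketch below illustrates this with plain NumPy. It is not code from this PR: the function names are hypothetical, the numbers are made up, and the local `softmax_probs` is a stand-in that makes no assumption about the signature of pyEM's own `softmax` helper.

```python
import numpy as np


def softmax_probs(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis (local stand-in)."""
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def weighted_policy(q_self, q_other, beta, w_self, w_other):
    """Base policy: mix the values first, then apply one softmax."""
    return softmax_probs(beta * (w_self * q_self + w_other * q_other))


def arbitration_policy(q_self, q_other, beta, omega):
    """Arbitration variant: one softmax per value track, then mix the probabilities."""
    p_self = softmax_probs(beta * q_self)
    p_other = softmax_probs(beta * q_other)
    return (1.0 - omega) * p_self + omega * p_other


# Illustrative values for three options (A/B/C); not taken from any dataset.
q_self = np.array([0.2, 0.5, 0.8])
q_other = np.array([0.9, 0.1, 0.4])

p_base = weighted_policy(q_self, q_other, beta=3.0, w_self=0.7, w_other=0.3)
p_arb = arbitration_policy(q_self, q_other, beta=3.0, omega=0.3)
```

Both functions return a length-3 probability vector. With `omega = 0`, or with `w_self = 1` and `w_other = 0`, each reduces to a plain softmax over `beta * Q_self`.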
+
+## pyEM import and function format (pseudocode)
+
+```python
+import numpy as np
+from ..utils.math import softmax, norm2alpha, norm2beta, calc_fval
+
+
+def {model_name}_sim(
+    params: np.ndarray,
+    nblocks: int = 4,
+    ntrials: int = 12,
+    **kwargs,
+) -> dict:
+    """Simulate behavior for one model family."""
+    n_subjects = params.shape[0]
+    rng = np.random.default_rng(kwargs.get("seed", None))
+
+    beta = params[:, 0]   # natural-space for simulation
+    alpha = params[:, 1]  # natural-space for simulation
+
+    choices = np.empty((n_subjects, nblocks, ntrials), dtype=object)
+    rewards = np.zeros((n_subjects, nblocks, ntrials), dtype=float)
+    ev = np.zeros((n_subjects, nblocks, ntrials + 1, 2), dtype=float)
+    pe = np.zeros((n_subjects, nblocks, ntrials), dtype=float)
+    nll = np.zeros((n_subjects, nblocks, ntrials), dtype=float)
+
+    for s in range(n_subjects):
+        for b in range(nblocks):
+            ev[s, b, 0, :] = 0.5
+            for t in range(ntrials):
+                p = softmax(ev[s, b, t, :], beta[s])
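+                c = rng.choice([0, 1], p=p)
+                r = 0.0  # replace with task-specific outcome logic
+                choices[s, b, t] = "A" if c == 0 else "B"  # labels decoded by the fit function
+                rewards[s, b, t] = r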
+                pe[s, b, t] = r - ev[s, b, t, c]
+                ev[s, b, t + 1, :] = ev[s, b, t, :]
+                ev[s, b, t + 1, c] = ev[s, b, t, c] + alpha[s] * pe[s, b, t]
+                nll[s, b, t] = -np.log(p[c] + 1e-12)
+
+    return {
+        "params": params,
+        "choices": choices,
+        "rewards": rewards,
+        "EV": ev,
+        "PE": pe,
+        "nll": nll,
+    }
+
+
+def {model_name}_fit(
+    params,
+    *,
+    prior=None,
+    output: str = "npl",
+    **kwargs,
+):
+    """Compute fit objective compatible with pyEM."""
+    beta = float(norm2beta(params[0]))
+    alpha = float(norm2alpha(params[1]))
+
+    if not (1e-5 <= beta <= 20.0):
+        return 1e7
+    if not (0.0 <= alpha <= 1.0):
+        return 1e7
+
+    choices = kwargs["choices"]
+    rewards = kwargs["rewards"]
+    nblocks, ntrials = rewards.shape
+
+    ev = np.zeros((nblocks, ntrials + 1, 2), dtype=float)
+    pe = np.zeros((nblocks, ntrials), dtype=float)
+    nll = 0.0
+
+    for b in range(nblocks):
+        ev[b, 0, :] = 0.5
+        for t in range(ntrials):
+            c = 0 if choices[b, t] == "A" else 1
+            p = softmax(ev[b, t, :], beta)
+            r = rewards[b, t]
+            pe[b, t] = r - ev[b, t, c]
+            ev[b, t + 1, :] = ev[b, t, :]
+            ev[b, t + 1, c] = ev[b, t, c] + alpha * pe[b, t]
+            nll += -np.log(p[c] + 1e-12)
+
+    if output == "all":
+        return {"params": np.array([beta, alpha]), "EV": ev, "PE": pe, "nll": nll}
+
+    return calc_fval(nll, params, prior=prior, output=output)
+```
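In the template above, the fit function passes `params` through `norm2alpha` and `norm2beta`, whereas the sim function uses natural-space values directly. The sketch below shows the general idea behind such transforms using logistic maps. The function names are hypothetical, and the upper limit of 20 is an assumption taken only from the bounds check in the template; pyEM's actual helpers may be defined differently.

```python
import numpy as np


def to_unit_interval(x):
    """Logistic map from the real line to (0, 1), e.g. for a learning rate."""
    return 1.0 / (1.0 + np.exp(-x))


def to_scaled_interval(x, max_val=20.0):
    """Logistic map from the real line to (0, max_val), e.g. for an inverse temperature."""
    return max_val / (1.0 + np.exp(-x))


def from_unit_interval(p):
    """Inverse of to_unit_interval (the logit function)."""
    return np.log(p / (1.0 - p))


raw = np.array([-2.0, 0.0, 2.0])   # unconstrained values, as an optimizer would see them
alpha = to_unit_interval(raw)      # bounded values, as the model equations use them
beta = to_scaled_interval(raw)
```

Optimizing in the unconstrained space keeps every candidate value inside the parameter's valid range after transformation, which is why the fit function transforms first and only then runs the model equations.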
+
+## Generation workflow
+
+1. Load `template.json` and, if package context is missing, load relevant files under `references/`.
+2. If the user supplied prose/equations, parse them into `description_input.extracted_spec`; if fields remain missing, ask follow-up questions and wait for answers.
+3. Generate `pyem/models/{model_class}.py` using imports, signatures, transforms, and output keys in the template.
+4. Generate `examples/{model_class}.ipynb` from `references/example-notebook-template.json` and align section order to `examples/rl.ipynb`, `examples/bayes.ipynb`, and `examples/glm.ipynb` conventions:
+   - model/task description,
+   - simulation demo,
+   - fit simulated behavior via `EMModel.recover`,
+   - parameter recovery plot with identity line and correlation per parameter.
+5. Run smoke checks: import module, run sim, run fit with `output="npl"`.
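The smoke checks in step 5 might look roughly like the sketch below. It is not part of this PR: `toy_sim` and `toy_fit` are hypothetical stand-ins for a generated `{model_name}_sim` and `{model_name}_fit`, the import step is omitted because there is no generated module here, and the toy fit ignores `prior` and `output` instead of calling `calc_fval`.

```python
import numpy as np


def toy_sim(params, nblocks=2, ntrials=5, **kwargs):
    """Hypothetical stand-in for a generated sim function."""
    rng = np.random.default_rng(kwargs.get("seed"))
    shape = (params.shape[0], nblocks, ntrials)
    return {
        "params": params,
        "choices": rng.choice(np.array(["A", "B"]), size=shape),
        "rewards": rng.integers(0, 2, size=shape).astype(float),
    }


def toy_fit(params, *, prior=None, output="npl", **kwargs):
    """Hypothetical stand-in for a generated fit function; returns a scalar."""
    p_a = 1.0 / (1.0 + np.exp(-params[0]))  # probability of choosing "A"
    chose_a = kwargs["choices"] == "A"
    return float(-np.sum(np.where(chose_a, np.log(p_a), np.log(1.0 - p_a))))


def smoke_check(sim_fn, fit_fn, n_subjects=3, n_params=1, seed=0):
    """Run sim, then fit the first subject's data; fail if either contract is broken."""
    rng = np.random.default_rng(seed)
    params = rng.normal(size=(n_subjects, n_params))
    sim = sim_fn(params, nblocks=2, ntrials=5, seed=seed)
    assert isinstance(sim, dict), "sim should return a dict"
    # Slice out subject 0 from every per-subject array except the parameters.
    data = {key: value[0] for key, value in sim.items() if key != "params"}
    fval = fit_fn(params[0], prior=None, output="npl", **data)
    assert np.isscalar(fval) and np.isfinite(fval), "fit should return a finite scalar"
    return fval


objective = smoke_check(toy_sim, toy_fit)
```

For a real generated module, the two stand-ins would be replaced by the imported sim/fit pair, and the data keys passed to the fit function would follow the template's sim output keys.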
+
+## Optional reference alignment
+
+If needed, consult `rl.json`, `bayes.json`, and `glm.json` to mirror existing style and output contracts.
+
+## Notebook generation without repository access
+
+When no local `examples/` notebooks are accessible:
+
+1. Load `references/parameter-recovery-notebook.md`.
+2. Load `references/example-notebook-template.json`.
+3. Fill placeholders and write a valid `.ipynb` (nbformat 4).
+4. Ensure imports include `EMModel` and the generated sim/fit functions.
+5. Ensure final cells run simulation, fitting, and recovery plotting end-to-end.
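Step 3 above (filling placeholders and writing a valid nbformat 4 notebook) could be sketched as below, using only the standard library. This is an assumption-laden illustration, not the skill's implementation: `fill` and `build_notebook` are hypothetical helpers, and the miniature `template` only mimics the `cell_templates` layout of `references/example-notebook-template.json`. Exact-token replacement is used instead of `str.format` so that braces inside f-strings in the cell source are left alone.

```python
import json

# Miniature template in the same assumed shape as the reference template file.
template = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"language_info": {"name": "python"}},
    "cell_templates": [
        {"cell_type": "markdown", "source": ["# {model_title}\n", "Demo notebook."]},
        {"cell_type": "code", "source": ["nblocks = {nblocks}\n", 'print(f"{nblocks * 2} half-blocks")']},
    ],
}


def fill(text: str, values: dict) -> str:
    """Replace exact `{name}` tokens only; other braces (e.g. f-strings) stay intact."""
    for name, value in values.items():
        text = text.replace("{" + name + "}", str(value))
    return text


def build_notebook(template: dict, values: dict) -> dict:
    """Convert `cell_templates` into an nbformat-4 style `cells` list."""
    cells = []
    for i, cell in enumerate(template["cell_templates"]):
        new_cell = {
            "cell_type": cell["cell_type"],
            "id": f"cell-{i}",  # cell ids are part of nbformat 4.5
            "metadata": {},
            "source": [fill(line, values) for line in cell["source"]],
        }
        if cell["cell_type"] == "code":
            # Code cells additionally carry an execution count and an outputs list.
            new_cell["execution_count"] = None
            new_cell["outputs"] = []
        cells.append(new_cell)
    return {
        "nbformat": template["nbformat"],
        "nbformat_minor": template["nbformat_minor"],
        "metadata": template["metadata"],
        "cells": cells,
    }


notebook = build_notebook(template, {"model_title": "Toy model", "nblocks": 4})
notebook_json = json.dumps(notebook, indent=1)
```

Writing `notebook_json` to a path such as `examples/{model_class}.ipynb` would complete the step; validating the result with the `nbformat` package, if it is installed, is a reasonable extra check.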
diff --git a/skills/pyem-model-generator/bayes.json b/skills/pyem-model-generator/bayes.json
@@ -0,0 +1,35 @@
+{
+  "source": "pyem/models/bayes.py",
+  "module": "pyem.models.bayes",
+  "imports": [
+    "import numpy as np",
+    "from ..utils.math import norm2alpha, calc_fval"
+  ],
+  "helpers": [
+    {
+      "name": "_generate_fishp",
+      "signature": "_generate_fishp(lambda1: float, n_fish: int) -> np.ndarray"
+    }
+  ],
+  "functions": [
+    {
+      "name": "bayes_sim",
+      "kind": "sim",
+      "signature": "bayes_sim(params: np.ndarray, nblocks: int = 10, ntrials: int = 15, n_fish: int = 3) -> dict",
+      "param_space": "natural",
+      "state_keys": ["choices", "observations", "probabilities", "ponds"],
+      "output_keys": ["params", "choices", "observations", "probabilities", "ponds"]
+    },
+    {
+      "name": "bayes_fit",
+      "kind": "fit",
+      "signature": "bayes_fit(params, choices, observations, prior=None, output: str = 'npl')",
+      "transform_map": [
+        {"index": 0, "name": "lambda1", "transform": "norm2alpha", "bounds": [0.0, 1.0]}
+      ],
+      "output_modes": ["npl", "nll", "all"],
+      "all_output_keys": ["params", "nll"],
+      "objective_call": "calc_fval(nll, params, prior=prior, output=output)"
+    }
+  ]
+}
diff --git a/skills/pyem-model-generator/glm.json b/skills/pyem-model-generator/glm.json
@@ -0,0 +1,40 @@
+{
+  "source": "pyem/models/glm.py",
+  "module": "pyem.models.glm",
+  "imports": [
+    "import numpy as np",
+    "from scipy.stats import norm",
+    "from scipy.special import expit",
+    "from ..utils.math import norm2alpha, calc_fval"
+  ],
+  "functions": [
+    {
+      "name": "glm_sim",
+      "kind": "sim",
+      "signature": "glm_sim(params: np.ndarray, ntrials: int = 100)",
+      "param_space": "natural",
+      "output_shape": "tuple",
+      "output_keys": ["X", "Y"]
+    },
+    {
+      "name": "glm_fit",
+      "kind": "fit",
+      "signature": "glm_fit(params, X, Y, prior=None, output: str = 'npl')",
+      "transform_map": [],
+      "output_modes": ["npl", "nll", "all"],
+      "all_output_keys": ["params", "predicted_y", "negll", "BIC"],
+      "objective_call": "calc_fval(negll, params, prior=prior, output=output)"
+    },
+    {
+      "name": "glm_decay_fit",
+      "kind": "fit",
+      "signature": "glm_decay_fit(params, X, Y, prior=None, output: str = 'npl', decay: str = 'twostep')",
+      "transform_map": [
+        {"index": -1, "name": "gamma", "transform": "norm2alpha", "bounds": [0.0, 1.0]}
+      ],
+      "output_modes": ["npl", "nll", "all"],
+      "all_output_keys": ["params", "predicted_y", "nll", "BIC"],
+      "objective_call": "calc_fval(negll, params, prior=prior, output=output)"
+    }
+  ]
+}
diff --git a/skills/pyem-model-generator/references/example-notebook-template.json b/skills/pyem-model-generator/references/example-notebook-template.json
@@ -0,0 +1,81 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 5,
+  "metadata": {
+    "kernelspec": {
+      "display_name": "Python 3",
+      "language": "python",
+      "name": "python3"
+    },
+    "language_info": {
+      "name": "python"
+    }
+  },
+  "cell_templates": [
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# {model_title}\n",
+        "\n",
+        "## {task_title}\n",
+        "This notebook demonstrates simulation, fitting, and parameter recovery."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "import numpy as np\n",
+        "import matplotlib.pyplot as plt\n",
+        "from pyem.api import EMModel\n",
+        "from pyem.models.{model_class} import {model_name}_sim, {model_name}_fit"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "rng = np.random.default_rng({random_seed})\n",
+        "nsubjects = {nsubjects}\n",
+        "nblocks = {nblocks}\n",
+        "ntrials = {ntrials}\n",
+        "true_params = np.column_stack([\n",
+        "    rng.uniform({p1_low}, {p1_high}, nsubjects),\n",
+        "    rng.uniform({p2_low}, {p2_high}, nsubjects),\n",
+        "])"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "sim = {model_name}_sim(true_params, nblocks=nblocks, ntrials=ntrials)\n",
+        "sim.keys()"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "em = EMModel({model_name}_fit, prior='laplace')\n",
+        "recovery = em.recover(sim, {model_name}_fit, n_jobs=1)\n",
+        "recovery.keys()"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "fitted = np.asarray(recovery['mfit'])\n",
+        "param_names = {param_names}\n",
+        "n_params = true_params.shape[1]\n",
+        "fig, axes = plt.subplots(1, n_params, figsize=(4 * n_params, 4))\n",
+        "for i, ax in enumerate(np.atleast_1d(axes)):\n",
+        "    ax.scatter(true_params[:, i], fitted[:, i], alpha=0.7)\n",
+        "    lo = min(true_params[:, i].min(), fitted[:, i].min())\n",
+        "    hi = max(true_params[:, i].max(), fitted[:, i].max())\n",
+        "    ax.plot([lo, hi], [lo, hi], 'k--', linewidth=1)\n",
+        "    r = np.corrcoef(true_params[:, i], fitted[:, i])[0, 1]\n",
+        "    ax.set_title(f\"{param_names[i]} (r={r:.2f})\")\n",
+        "    ax.set_xlabel('True')\n",
+        "    ax.set_ylabel('Recovered')\n",
+        "plt.tight_layout()"
+      ]
+    }
+  ]
+}