14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,20 @@ The format is based on Keep a Changelog and this project follows Semantic Versio

 ## [Unreleased]

+## [0.6.4] - 2026-04-02
+
+### Fixed
+- MCP-backed task completion finalization now only performs a direct provider-side status update when the active profile permits `C2_remote_bounded` transactional effects, avoiding false task failures under the default `C1_local_write` profile while preserving explicit remote close-out for C2-capable runs.
+- Corrected MCP provider documentation to reflect the actual default profile behavior for direct provider-side remote writes.
+
+## [0.6.3] - 2026-04-02
+
+### Fixed
+- Single-task scope validation now accepts compacted tasklists that move the selected completed item into the `Completed` summary section, avoiding false scope violations after compaction.
+- Inner-loop `--continue` now preserves selected-task identity so resumed builder/reviewer work cannot drift to a different task after a mid-task halt.
+- Successful MCP-backed task runs now explicitly mark the selected remote task done after commit, instead of relying only on the builder prompt to do it.
+- Automatic compaction on tracked tasklists now finalizes its own tasklist rewrite immediately, preventing compaction-only changes from leaking into the next task's diff, review, or commit.
+
 ## [0.6.2] - 2026-03-31

 ### Fixed
4 changes: 3 additions & 1 deletion docs/providers/mcp.md
@@ -144,7 +144,9 @@ Any MCP server your agent supports works — the `mcp_server` value is passed th

 ## Effect policy

-All MCP operations are classified as `EffectClass.transactional` (C2). Under the default `DEV_IMPLEMENTATION` profile these are permitted. If you have a stricter policy configured (e.g. `C1_LOCAL_WRITE` only), MCP operations will raise `CapabilityViolation` — update `.millstone/policy.toml` to allow `C2_REMOTE_BOUNDED` effects.
+Millstone-classified MCP write operations are `EffectClass.transactional` (`C2_remote_bounded`). The default `dev_implementation` profile is `C1_local_write`, so direct provider writes that millstone issues itself are not permitted unless you configure a `C2_remote_bounded` profile with `transactional` effects allowlisted.
+
+In the default profile, prompt-driven MCP actions performed by the coding agent can still run because the agent is executing those tool calls from the rendered prompt, not through millstone's direct provider-effect path. Orchestrator-driven follow-up writes, such as explicit post-commit remote task completion, are skipped unless the active profile allows those transactional effects.
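
The gating described in the new paragraphs can be illustrated with a minimal sketch. This is not millstone's implementation: `PROFILE_ALLOWED`, `permits`, `finalize_remote_task`, and the `local_write` enum member are hypothetical names introduced only for illustration. Only `EffectClass.transactional`, `C1_local_write`, and `C2_remote_bounded` come from the documentation above.

```python
from enum import Enum


class EffectClass(Enum):
    # Hypothetical stand-in enum; only `transactional` is named in the docs.
    local_write = "local_write"
    transactional = "transactional"


# Hypothetical table mapping a capability tier to the effect classes it allows.
PROFILE_ALLOWED = {
    "C1_local_write": {EffectClass.local_write},
    "C2_remote_bounded": {EffectClass.local_write, EffectClass.transactional},
}


def permits(profile: str, effect: EffectClass) -> bool:
    """Return True when the given profile allows the given effect class."""
    return effect in PROFILE_ALLOWED.get(profile, set())


def finalize_remote_task(profile: str, mark_done) -> bool:
    """Run the orchestrator-driven remote close-out only when it is permitted.

    Returns False when the write is skipped; a skip is not treated as a failure.
    """
    if not permits(profile, EffectClass.transactional):
        return False
    mark_done()
    return True
```

Under these assumptions, `finalize_remote_task("C1_local_write", cb)` returns `False` without calling `cb`, which mirrors the "skipped unless the active profile allows" behavior described above, while the same call with `"C2_remote_bounded"` performs the write.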

---

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "millstone"
-version = "0.6.2"
+version = "0.6.4"
 description = "Orchestrator for agentic coding tools"
 readme = "README.md"
 requires-python = ">=3.10"
8 changes: 6 additions & 2 deletions src/millstone/artifact_providers/file.py
@@ -676,8 +676,12 @@ def get_prompt_placeholders(self) -> dict[str, str]:
                 "`- [ ]` pending, `- [x]` complete. Select the FIRST unchecked task."
             ),
             "TASKLIST_COMPLETE_INSTRUCTIONS": (
-                f"Mark exactly this one task complete by changing its `- [ ]` to "
-                f"`- [x]` in `{self.path}` and stop. Do not modify any other tasks."
+                f"Mark exactly this one task complete in `{self.path}` and stop. "
+                "Normally, change its `- [ ]` to `- [x]`. "
+                "If the file has a `Completed` summary section instead of listing older "
+                "finished tasks individually, you may instead remove the selected task from "
+                "the pending list and update only that `Completed` summary. Do not modify "
+                "any other pending tasks."
             ),
             "TASKLIST_REWRITE_INSTRUCTIONS": (
                 f"Write the entire compacted content back to `{self.path}`, "
26 changes: 22 additions & 4 deletions src/millstone/artifacts/tasklist.py
@@ -414,6 +414,10 @@ def _extract_tasks(self, content: str) -> list[tuple[bool, str]]:
             tasks.append((checked.lower() == "x", task_text.strip()))
         return tasks

+    def _has_completed_summary_section(self, content: str) -> bool:
+        """Return True when the tasklist already uses a compacted completed summary."""
+        return bool(re.search(r"^#{2,6}\s+Completed\b", content, re.MULTILINE | re.IGNORECASE))
+
     def validate_single_task_completion(
         self,
         original_content: str,
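
To make the heading pattern in `_has_completed_summary_section` concrete, here is a standalone sketch that reuses the same regular expression outside the class. The module-level function name and the `COMPLETED_HEADING` constant are illustrative, not part of the PR.

```python
import re

# Same pattern as in the hunk above: a level 2-6 Markdown heading whose
# first word is "Completed", matched case-insensitively at a line start.
COMPLETED_HEADING = re.compile(r"^#{2,6}\s+Completed\b", re.MULTILINE | re.IGNORECASE)


def has_completed_summary_section(content: str) -> bool:
    """Return True when the content contains a `Completed` heading."""
    return bool(COMPLETED_HEADING.search(content))
```

Note that a top-level `# Completed` heading does not match, because the pattern requires at least two `#` characters, and a checked task whose text merely contains the word "Completed" does not match either, because the pattern is anchored to a heading at the start of a line.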
@@ -430,7 +434,25 @@ def validate_single_task_completion(
         if not original_tasks or not new_tasks:
             return True, ""

+        first_unchecked = next(
+            (i for i, (checked, _) in enumerate(original_tasks) if not checked),
+            None,
+        )
+
         if len(new_tasks) < len(original_tasks):
+            if (
+                first_unchecked is not None
+                and len(new_tasks) == len(original_tasks) - 1
+                and (
+                    self._has_completed_summary_section(original_content)
+                    or self._has_completed_summary_section(new_content)
+                )
+            ):
+                expected_tasks = (
+                    original_tasks[:first_unchecked] + original_tasks[first_unchecked + 1 :]
+                )
+                if new_tasks == expected_tasks:
+                    return True, ""
             return (
                 False,
                 f"Task count decreased: {len(original_tasks)} -> {len(new_tasks)}. Deleting tasks is not allowed.",
@@ -457,10 +479,6 @@ def validate_single_task_completion(
                 f"Multiple tasks were marked complete ({len(checkoffs)}).",
             )

-        first_unchecked = next(
-            (i for i, (checked, _) in enumerate(original_tasks) if not checked),
-            None,
-        )
         if first_unchecked is None:
             return False, "No unchecked tasks remained, but a task was marked complete."
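
The compaction exception added to `validate_single_task_completion` can be exercised in isolation with a sketch like the following. It is a simplified reconstruction rather than the project's code: `TASK_RE` is an assumed checkbox pattern (the real `_extract_tasks` regex is not shown in this diff), and `is_compacted_completion` is a hypothetical helper name.

```python
import re

# Assumed checkbox pattern; the project's actual extraction regex may differ.
TASK_RE = re.compile(r"^\s*- \[( |x|X)\]\s+(.*)$", re.MULTILINE)
COMPLETED_RE = re.compile(r"^#{2,6}\s+Completed\b", re.MULTILINE | re.IGNORECASE)


def extract_tasks(content: str) -> list[tuple[bool, str]]:
    """Return (checked, text) pairs for every checkbox line."""
    return [(mark.lower() == "x", text.strip()) for mark, text in TASK_RE.findall(content)]


def is_compacted_completion(original: str, new: str) -> bool:
    """True when only the first unchecked task was removed and a
    `Completed` summary heading exists in either version."""
    old_tasks, new_tasks = extract_tasks(original), extract_tasks(new)
    first_unchecked = next(
        (i for i, (checked, _) in enumerate(old_tasks) if not checked), None
    )
    if first_unchecked is None or len(new_tasks) != len(old_tasks) - 1:
        return False
    if not (COMPLETED_RE.search(original) or COMPLETED_RE.search(new)):
        return False
    # Every other task must survive unchanged and in the same order.
    return new_tasks == old_tasks[:first_unchecked] + old_tasks[first_unchecked + 1:]


original = "## Completed\n3 earlier tasks finished.\n\n## Pending\n- [ ] Add retry logic\n- [ ] Write docs\n"
compacted = "## Completed\n4 earlier tasks finished.\n\n## Pending\n- [ ] Write docs\n"
wrong_task = "## Completed\n4 earlier tasks finished.\n\n## Pending\n- [ ] Add retry logic\n"
```

With these inputs, `is_compacted_completion(original, compacted)` accepts the change, while `is_compacted_completion(original, wrong_task)` rejects it because a task other than the first unchecked one was removed. Requiring an exact match against the expected remaining list keeps the exception narrow: any reordering or edit of the other tasks still falls through to the "Task count decreased" error.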
