Skip to content

refactor(v2-review): shell injection hardening, bug fixes, and test cleanup#122

Open
coketaste wants to merge 4 commits into
developfrom
coketaste/v2-quality-improve
Open

refactor(v2-review): shell injection hardening, bug fixes, and test cleanup#122
coketaste wants to merge 4 commits into
developfrom
coketaste/v2-quality-improve

Conversation

@coketaste
Copy link
Copy Markdown
Collaborator

Summary

  • Harden all shell-interpolated Docker commands with shlex.quote() to prevent injection via image names, container names, mount paths, and context paths
  • Fix TypeError crash when KFD topology is inaccessible on ROCm < 6.4.1 systems (kfd_renderDs is None)
  • Fix infinite loop in deployment monitoring when a job is cancelled (CANCELLED missing from terminal status set)
  • Consolidate pytest config into pyproject.toml as single source of truth (delete redundant pytest.ini that was silently overriding it)
  • Clean up test infrastructure: remove sys.path hack, deduplicate marker registration, fix global error handler state leak

Changes

Security hardening (shlex.quote())

File What's quoted
core/docker.py container_name, image, mount paths, cwd in docker run
execution/docker_builder.py docker_image, dockerfile, docker_context in docker build
execution/container_runner.py registry_image, local_name in docker pull/tag/rmi; mount paths in get_mount_arg()
orchestration/run_orchestrator.py image_name in docker image inspect and docker pull

Bug fixes

  • core/context.py — Guard kfd_renderDs is None before len() call, with a clear error message pointing at KFD topology permissions
  • deployment/base.py — Add DeploymentStatus.CANCELLED to the terminal set in _monitor_until_complete() so cancelled jobs don't spin forever

Test & config cleanup

  • Delete pytest.ini — was overriding all pyproject.toml [tool.pytest.ini_options] settings silently
  • pyproject.toml — Fix python_paths typo → pythonpath, add --strict-markers, filterwarnings, minversion, and full marker list (10 markers)
  • tests/conftest.py — Remove sys.path insertion (now handled by pythonpath) and pytest_configure() marker duplication
  • tests/unit/test_error_handling.py — Add setup_method/teardown_method to reset global error handler between tests
  • tests/unit/test_shell_quoting.py — 11 new tests validating quoting across all hardened code paths

Test plan

  • pytest tests/unit/ -v — all 437 tests pass (verified locally)
  • Verify shell quoting tests cover malicious inputs (; rm -rf /, $(whoami)) and safe inputs
  • Confirm no regressions in build/run workflows with normal image names and paths

…onfig cleanup

- Harden shell commands with shlex.quote() in docker.py, container_runner.py,
  docker_builder.py, and run_orchestrator.py to prevent injection via
  user-controlled values (image names, paths, container names)
- Fix TypeError when kfd_renderDs is None on restricted ROCm < 6.4.1 systems
- Add CANCELLED to deployment monitor terminal states to prevent infinite loop
- Consolidate pytest config into pyproject.toml and delete redundant pytest.ini
- Remove sys.path hack and duplicate marker registration from conftest.py
- Fix global error handler state leak in test_error_handling.py
- Add test_shell_quoting.py with 11 tests validating quoting behavior

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
@coketaste coketaste self-assigned this May 3, 2026
Copilot AI review requested due to automatic review settings May 3, 2026 21:12
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens several Docker-related shell commands against injection, fixes a couple of runtime bugs in GPU mapping and deployment monitoring, and consolidates pytest configuration into pyproject.toml while cleaning up test infrastructure.

Changes:

  • Add shlex.quote() escaping for user-controlled values interpolated into docker run/build/pull/tag/rmi/inspect commands across core execution paths.
  • Fix ROCm < 6.4.1 crash when KFD topology is inaccessible, and prevent infinite monitoring loops on cancelled deployments.
  • Consolidate pytest config into pyproject.toml, remove redundant marker registration / sys.path hacks, and add new unit tests for shell quoting.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/madengine/core/docker.py Quote container name, image, mounts, and CWD in docker run command construction.
src/madengine/execution/docker_builder.py Quote image/dockerfile/context in docker build command (but other shell interpolations remain).
src/madengine/execution/container_runner.py Quote image names in docker pull/tag/rmi and quote mount paths in mount args.
src/madengine/orchestration/run_orchestrator.py Quote image name in docker image inspect and docker pull.
src/madengine/core/context.py Raise a clear error when KFD topology is unavailable on ROCm < 6.4.1 instead of crashing.
src/madengine/deployment/base.py Treat CANCELLED as terminal to stop monitoring loops.
tests/unit/test_shell_quoting.py Add unit tests validating quoting behavior across hardened command paths.
tests/unit/test_error_handling.py Reset global error handler between tests to prevent state leakage.
tests/conftest.py Remove sys.path insertion and redundant marker registration.
pyproject.toml Make pytest configuration the source of truth, enable --strict-markers, and define markers/warnings.
pytest.ini Remove redundant config file that could override pyproject.toml.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/unit/test_shell_quoting.py Outdated
Comment thread tests/unit/test_shell_quoting.py Outdated
Comment thread src/madengine/execution/docker_builder.py
coketaste and others added 2 commits May 11, 2026 15:34
…n up test imports

Add shlex.quote() to remaining unquoted shell interpolations in
docker_builder.py (grep, docker manifest inspect, docker tag, docker push,
head commands). Remove unused pytest and tempfile imports and dead temp
file block from test_shell_quoting.py.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 11, 2026 20:56
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Comment on lines +22 to +43
@patch.object(Console, "sh", return_value="")
def test_container_name_and_image_are_quoted(self, mock_sh):
from madengine.core.docker import Docker

mock_sh.side_effect = self._make_sh_side_effect()
try:
Docker(
image=MALICIOUS_IMAGE,
container_name="evil;name",
dockerOpts="",
console=Console(shellVerbose=False),
)
except Exception:
pass

docker_run_calls = [
c for c in mock_sh.call_args_list if "docker run" in str(c)
]
assert docker_run_calls, "Expected at least one docker run call"
run_cmd = docker_run_calls[0].args[0]
assert shlex.quote("evil;name") in run_cmd
assert shlex.quote(MALICIOUS_IMAGE) in run_cmd
Comment thread pyproject.toml
testpaths = ["tests/unit", "tests/integration", "tests/e2e"]
pythonpath = ["src"]
addopts = "-v --tb=short -ra --strict-markers -W default"
minversion = "3.8"
Copilot AI review requested due to automatic review settings May 13, 2026 13:27
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

"allow_privileged_profiling": null,
"nfs_storage_class": "nfs-banff",
"local_path_storage_class": "local-path",
"storage_class": "nfs-banff",
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants