Redact git credentials at remaining URL log/serialization sites (fixes #3751)#3792

Merged: simonrosenberg merged 2 commits into main from fix-redact-remaining-url-sites on Jun 18, 2026

@simonrosenberg simonrosenberg commented Jun 18, 2026

Member

HUMAN:

Redact git credentials at remaining URL log/serialization sites


AGENT:
Verified locally: uv run pytest tests/sdk/utils/test_redact.py tests/sdk/git/test_url_redaction.py tests/workspace/test_cloud_workspace_repos.py
→ 118 passed. The broader run over tests/sdk/git tests/sdk/plugin tests/sdk/utils tests/workspace → 615 passed. I reproduced the original leaks before the fix and
confirmed each is now redacted (see How to Test).

Why

Follow-up to #2154 and #3689. Those redacted credentials embedded in
https://user:token@host URLs at most logging sites and (in #3689) on
GitCommandError.command. A few related leak vectors remained where a
user-supplied credential could still be written to logs or persisted on an
exception. This closes them so credential redaction is consistent across the
git/workspace stack (defense-in-depth; lower severity than #3689).

Summary

  • Add redact_url_credentials_in_text() to sdk/utils/redact.py — a
    substring-capable redactor (re.sub over embedded https?://user:token@host).
    The existing redact_url_credentials() is anchored (^...$), so it only
    redacts when the whole string is a URL and misses credentials embedded inside
    a larger string such as git stderr.
  • RepoSource.validate_url (repo.py): redact the URL before logging the
    http:// → https:// normalization warning (previously logged any embedded
    credential verbatim).
  • run_git_command (git/utils.py): redact stderr before logging it and
    before storing it on GitCommandError.
  • Consolidate repo.py _mask_url/_mask_token onto the central helpers so
    there is a single source of truth. The old _mask_token only replaced the exact
    token string, so it missed URL-encoded or partially echoed credentials.
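
The difference between the anchored and the substring-capable redactor can be sketched as follows. This is a minimal illustration, not the SDK's implementation: the regexes and the function names redact_whole_url / redact_in_text are assumptions chosen for this example, and the actual patterns in sdk/utils/redact.py may differ.

```python
import re

# Hypothetical patterns for illustration; the SDK's real regexes may differ.
# Userinfo: anything up to the last "@" that precedes the first "/", "?",
# "#", quote, or whitespace after the scheme.
_USERINFO = r"[^\s/?#'\"]+"
_ANCHORED = re.compile(rf"^(https?://){_USERINFO}@(.*)$")
_EMBEDDED = re.compile(rf"(https?://){_USERINFO}@")


def redact_whole_url(value: str) -> str:
    """Redact only when the entire string is a credentialed URL."""
    return _ANCHORED.sub(r"\1****@\2", value)


def redact_in_text(text: str) -> str:
    """Redact every credentialed URL found anywhere inside the text."""
    return _EMBEDDED.sub(r"\1****@", text)


stderr = "fatal: unable to access 'https://oauth2:SECRET@github.com/o/r.git/': 403"
print(redact_whole_url(stderr))  # anchored pattern does not match; text is unchanged
print(redact_in_text(stderr))    # embedded credentials are replaced with ****
```

Because the userinfo pattern cannot cross a "/", an "@" that appears only in the path (for example repo@v1) is left alone in this sketch.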

Issue Number

Closes #3751

How to Test

Run the tests:

uv run pytest tests/sdk/utils/test_redact.py \
  tests/sdk/git/test_url_redaction.py \
  tests/workspace/test_cloud_workspace_repos.py

Manual reproduction of the embedded-URL limitation that the new helper fixes:

from openhands.sdk.utils.redact import (
    redact_url_credentials,
    redact_url_credentials_in_text,
)

s = "fatal: unable to access 'https://oauth2:SECRET@github.com/o/r.git/': 403"
print(redact_url_credentials(s))          # unchanged — 'SECRET' still present
print(redact_url_credentials_in_text(s))  # redacted — 'https://****@github.com/...'

Site #1 (warning no longer leaks the credential):

import logging
logging.basicConfig(level=logging.WARNING)
from openhands.sdk.workspace.repo import RepoSource
RepoSource(url="http://oauth2:SUPERSECRET@github.com/owner/repo.git", provider="github")
# WARNING log now shows http://****@github.com/owner/repo.git

Video/Screenshots

N/A — logging/redaction change, covered by unit tests.

Type

  • Bug fix
  • Feature
  • Refactor
  • Breaking change
  • Docs / chore

Notes

redact_url_credentials() (anchored) is intentionally kept for whole-URL
arguments (e.g. each git clone arg); the new substring helper is used wherever
a credential may appear inside a larger string.
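
As a rough illustration of the whole-URL case, the sketch below redacts each argument of a git command separately before joining them for a log line. The name redact_command and the regex are hypothetical and only stand in for the anchored helper described above.

```python
import re
import shlex

# Hypothetical anchored pattern: matches only when the argument is itself a URL.
_WHOLE_URL = re.compile(r"^(https?://)[^\s/?#]+@(.*)$")


def redact_command(args: list[str]) -> str:
    """Return a loggable command line with per-argument URL credentials masked."""
    safe_args = [_WHOLE_URL.sub(r"\1****@\2", arg) for arg in args]
    # shlex.join quotes arguments that contain shell-special characters.
    return shlex.join(safe_args)


cmd = ["git", "clone", "https://oauth2:SECRET@github.com/o/r.git", "/tmp/repo"]
print(redact_command(cmd))
```

Anchoring is sufficient here because each argument is inspected on its own; free-form text such as stderr needs the substring helper instead.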


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---------|---------------|------------|-------------|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:34744d3-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-34744d3-python \
  ghcr.io/openhands/agent-server:34744d3-python

All tags pushed for this build

ghcr.io/openhands/agent-server:34744d3-golang-amd64
ghcr.io/openhands/agent-server:34744d3fe8800265662421827bee945f9328a644-golang-amd64
ghcr.io/openhands/agent-server:fix-redact-remaining-url-sites-golang-amd64
ghcr.io/openhands/agent-server:34744d3-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:34744d3-golang-arm64
ghcr.io/openhands/agent-server:34744d3fe8800265662421827bee945f9328a644-golang-arm64
ghcr.io/openhands/agent-server:fix-redact-remaining-url-sites-golang-arm64
ghcr.io/openhands/agent-server:34744d3-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:34744d3-java-amd64
ghcr.io/openhands/agent-server:34744d3fe8800265662421827bee945f9328a644-java-amd64
ghcr.io/openhands/agent-server:fix-redact-remaining-url-sites-java-amd64
ghcr.io/openhands/agent-server:34744d3-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:34744d3-java-arm64
ghcr.io/openhands/agent-server:34744d3fe8800265662421827bee945f9328a644-java-arm64
ghcr.io/openhands/agent-server:fix-redact-remaining-url-sites-java-arm64
ghcr.io/openhands/agent-server:34744d3-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:34744d3-python-amd64
ghcr.io/openhands/agent-server:34744d3fe8800265662421827bee945f9328a644-python-amd64
ghcr.io/openhands/agent-server:fix-redact-remaining-url-sites-python-amd64
ghcr.io/openhands/agent-server:34744d3-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:34744d3-python-arm64
ghcr.io/openhands/agent-server:34744d3fe8800265662421827bee945f9328a644-python-arm64
ghcr.io/openhands/agent-server:fix-redact-remaining-url-sites-python-arm64
ghcr.io/openhands/agent-server:34744d3-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:34744d3-golang
ghcr.io/openhands/agent-server:34744d3fe8800265662421827bee945f9328a644-golang
ghcr.io/openhands/agent-server:fix-redact-remaining-url-sites-golang
ghcr.io/openhands/agent-server:34744d3-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:34744d3-java
ghcr.io/openhands/agent-server:34744d3fe8800265662421827bee945f9328a644-java
ghcr.io/openhands/agent-server:fix-redact-remaining-url-sites-java
ghcr.io/openhands/agent-server:34744d3-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:34744d3-python
ghcr.io/openhands/agent-server:34744d3fe8800265662421827bee945f9328a644-python
ghcr.io/openhands/agent-server:fix-redact-remaining-url-sites-python
ghcr.io/openhands/agent-server:34744d3-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., 34744d3-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 34744d3-python-amd64) are also available if needed

Follow-up to #2154/#3689. Closes the remaining credential leak vectors in
the git/workspace stack so URL-credential redaction is consistent.

- Add `redact_url_credentials_in_text()` to sdk/utils/redact.py: a
  substring-capable redactor that strips `https?://user:token@` credentials
  embedded anywhere in a larger string. The existing `redact_url_credentials()`
  is anchored and only redacts when the entire string is a URL, so it misses
  credentials inside messages like git stderr.
- RepoSource.validate_url: redact the URL before logging the http->https
  normalization warning (was logging user-supplied creds verbatim).
- run_git_command: redact stderr before logging it and before storing it on
  GitCommandError.
- Consolidate repo.py `_mask_url`/`_mask_token` onto the central helpers
  (`redact_url_credentials` / `redact_url_credentials_in_text`) so there is a
  single source of truth; the old `_mask_token` only string-replaced the exact
  token and missed URL-encoded/partial credentials.

Tests cover the new helper (embedded-URL strings, multiple URLs, URL-encoded
creds, path-`@` non-matches) and each leak site.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions Bot commented Jun 18, 2026

Contributor

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

github-actions Bot commented Jun 18, 2026

Contributor

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

github-actions Bot commented Jun 18, 2026

Contributor

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
|------|-------|------|-------|---------|
| openhands-sdk/openhands/sdk/git/utils.py | 124 | 24 | 80% | 108–110, 198–199, 206–211, 216–217, 227–232, 242–244, 272, 378 |
| openhands-sdk/openhands/sdk/utils/redact.py | 88 | 14 | 84% | 87, 226–227, 250–256, 272–275 |
| openhands-sdk/openhands/sdk/workspace/repo.py | 233 | 149 | 36% | 72–74, 76–83, 150, 153, 157, 161, 173–175, 180, 184–185, 187–189, 192, 196, 220–222, 237, 240–241, 244–245, 248, 251–252, 261, 263, 265, 278–279, 282–285, 302–304, 306–307, 310–312, 315–317, 319, 323, 333–334, 336–340, 352, 358–359, 361–363, 368–372, 374–375, 377–378, 380, 382–383, 386–388, 390–391, 394, 397–399, 401–402, 409–410, 413–415, 418–421, 440–442, 445–452, 456–458, 460–461, 463–466, 468–472, 475–478, 481–483, 485–486, 488–489, 496–497, 499–501, 503–505, 507 |
| TOTAL | 32323 | 8439 | 73% | |

@simonrosenberg simonrosenberg marked this pull request as ready for review June 18, 2026 14:18

all-hands-bot commented Jun 18, 2026

Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@all-hands-bot all-hands-bot left a comment

Collaborator

🟢 Taste Rating: Good taste - The fix is small, centralized, and covers the remaining real leak sites without changing public behavior.

I reviewed the changed production paths and the new regression coverage. The new substring redactor handles embedded https?://user:token@host URLs in git stderr/log text, while the existing whole-URL helper remains available for URL arguments. The repo clone and git command call sites now redact before logging or persisting error text, and the tests exercise the important leak vectors.

Validation run: uv run pytest tests/sdk/utils/test_redact.py tests/sdk/git/test_url_redaction.py tests/workspace/test_cloud_workspace_repos.py -q → 118 passed.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW
    Narrow defense-in-depth logging/redaction change with no dependency changes and no API surface changes. The main compatibility-sensitive behavior is preserving stored repo URLs and command execution while only sanitizing log/error surfaces, which the tests cover.

VERDICT:
Worth merging: Core logic is sound; merge after the remaining standard CI checks finish green.

KEY INSIGHT:
Centralizing substring URL credential redaction eliminates the remaining special-case leak paths without broadening clone or git-command behavior.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

@all-hands-bot all-hands-bot left a comment

Collaborator

✅ QA Report: PASS

The PR achieves its redaction goal: I reproduced credential leaks on the base commit and verified the same SDK/git/workspace operations redact credentials on the PR commit.

Does this PR achieve its stated goal?

Yes. The stated goal was to redact credentials at the remaining URL log/exception sites in the git/workspace stack. Running an SDK consumer probe showed the base commit leaked SUPERSECRET, oauth2, and a partially echoed token in RepoSource warnings, GitCommandError.stderr/logs, and clone_repos failure logs; the PR commit redacted those same paths to http(s)://****@... while preserving expected behavior such as HTTP→HTTPS normalization and clone failure reporting.

| Phase | Result |
|-------|--------|
| Environment Setup | make build completed successfully and installed the uv dev environment. |
| CI Status | ⏳ Observed via gh pr checks: 23 success, 7 in progress, 3 skipped; I did not rerun tests. |
| Functional Verification | ✅ Before/after SDK probe reproduced base leaks and confirmed PR redaction. |

Functional Verification

Test 1: SDK substring redaction helper for embedded URL text

Step 1 — Reproduce / establish baseline (without the fix):

Checked out base 89fd946d6dc0046bb6902ccc1415723063d5ea95 and ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_redaction_probe.py.

Relevant output:

## anchored helper on embedded URL: LEAK: SUPERSECRET, oauth2
fatal: unable to access 'https://oauth2:SUPERSECRET@github.com/owner/repo.git/': 403
## substring helper availability: MISSING
ImportError

This confirms the old public helper only handled whole URL strings and did not provide substring redaction for git-style stderr text.

Step 2 — Apply the PR's changes:

Checked out PR head 34744d3fe8800265662421827bee945f9328a644.

Step 3 — Re-run with the fix in place:

Ran the same command.

Relevant output:

## anchored helper on embedded URL: LEAK: SUPERSECRET, oauth2
fatal: unable to access 'https://oauth2:SUPERSECRET@github.com/owner/repo.git/': 403
## substring helper on embedded URL: REDACTED
fatal: unable to access 'https://****@github.com/owner/repo.git/': 403

This shows the new substring-capable helper redacts embedded credentials while the anchored helper remains unchanged as described.

Test 2: RepoSource HTTP normalization warning no longer leaks credentials

Step 1 — Reproduce / establish baseline (without the fix):

On the base commit, the probe constructed RepoSource(url="http://oauth2:SUPERSECRET@github.com/owner/repo.git") as an SDK user would.

Relevant output:

## RepoSource normalization warning: LEAK: SUPERSECRET, oauth2
stored_https=True
WARNING:Converting HTTP URL to HTTPS for security: http://oauth2:SUPERSECRET@github.com/owner/repo.git

This confirms the bug: normalization worked, but the warning logged both username and token.

Step 2 — Apply the PR's changes:

Checked out PR head 34744d3fe8800265662421827bee945f9328a644.

Step 3 — Re-run with the fix in place:

Ran the same probe.

Relevant output:

## RepoSource normalization warning: REDACTED
stored_https=True
WARNING:Converting HTTP URL to HTTPS for security: http://****@github.com/owner/repo.git

This confirms the warning is now safe while preserving the HTTPS normalization behavior.
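
The behavior verified in this test can be sketched in isolation: normalize the scheme, but pass only a redacted copy of the URL to the logger. The function normalize_url, the logger name, and the regex are hypothetical; the real logic lives in RepoSource.validate_url and is not reproduced here. Returning the credentialed URL unchanged apart from the scheme is also an assumption of this sketch.

```python
import logging
import re

logger = logging.getLogger("repo_sketch")  # hypothetical logger name

# Hypothetical substring pattern; the SDK's real regex may differ.
_EMBEDDED = re.compile(r"(https?://)[^\s/?#'\"]+@")


def normalize_url(url: str) -> str:
    """Upgrade http:// to https://, logging only a redacted copy of the input."""
    if url.startswith("http://"):
        logger.warning(
            "Converting HTTP URL to HTTPS for security: %s",
            _EMBEDDED.sub(r"\1****@", url),
        )
        return "https://" + url[len("http://"):]
    return url
```

Only the log message is sanitized in this sketch; the value handed back to the caller is not altered beyond the scheme change.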

Test 3: run_git_command exception/log stderr redaction

Step 1 — Reproduce / establish baseline (without the fix):

On the base commit, the probe called run_git_command(...) against a real temporary executable that exited non-zero and wrote a git-style credential-bearing URL to stderr.

Relevant output:

## run_git_command failing stderr: LEAK: SUPERSECRET, oauth2
exception.stderr=fatal: Authentication failed for 'https://oauth2:SUPERSECRET@github.com/owner/repo.git/'
logs=ERROR:Git command failed: /tmp/tmpujnh6b7h/fake-git clone 'https://****@github.com/owner/repo.git' /tmp/tmpujnh6b7h/repo. Exit code: 128. Stderr: fatal: Authentication failed for 'https://oauth2:SUPERSECRET@github.com/owner/repo.git/'

This shows the command argument was already redacted, but stderr still leaked through both the exception payload and error log.

Step 2 — Apply the PR's changes:

Checked out PR head 34744d3fe8800265662421827bee945f9328a644.

Step 3 — Re-run with the fix in place:

Ran the same probe.

Relevant output:

## run_git_command failing stderr: REDACTED
exception.stderr=fatal: Authentication failed for 'https://****@github.com/owner/repo.git/'
logs=ERROR:Git command failed: /tmp/tmpdh06pqcc/fake-git clone 'https://****@github.com/owner/repo.git' /tmp/tmpdh06pqcc/repo. Exit code: 128. Stderr: fatal: Authentication failed for 'https://****@github.com/owner/repo.git/'

This confirms the remaining stderr leak path is fixed.
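
A minimal sketch of the pattern this test exercises: sanitize stderr once, then use the sanitized text for both the log line and the exception. CommandError, run_command, and the regex are hypothetical stand-ins for GitCommandError, run_git_command, and the SDK helper, whose real signatures are not shown in this PR.

```python
import logging
import re
import subprocess

logger = logging.getLogger("git_sketch")  # hypothetical logger name

# Hypothetical substring pattern; the SDK's real regex may differ.
_EMBEDDED = re.compile(r"(https?://)[^\s/?#'\"]+@")


class CommandError(Exception):
    """Hypothetical stand-in for GitCommandError; carries redacted stderr."""

    def __init__(self, message: str, stderr: str) -> None:
        super().__init__(message)
        self.stderr = stderr


def run_command(args: list[str]) -> str:
    """Run a command; on failure, log and raise with stderr already redacted."""
    result = subprocess.run(args, capture_output=True, text=True)
    if result.returncode != 0:
        # Redact before the text reaches any log or exception attribute.
        safe_stderr = _EMBEDDED.sub(r"\1****@", result.stderr.strip())
        logger.error(
            "Command failed. Exit code: %d. Stderr: %s",
            result.returncode,
            safe_stderr,
        )
        raise CommandError(f"exit code {result.returncode}", stderr=safe_stderr)
    return result.stdout
```

Redacting at the point where stderr is first read means no later consumer of the exception or the log can see the raw text.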

Test 4: clone_repos failure logging handles partially echoed credentials

Step 1 — Reproduce / establish baseline (without the fix):

On the base commit, the probe called clone_repos([RepoSource(url="owner/repo", provider="github")], ...) with a token fetcher returning tok@secret, exercising the clone failure path with real git behavior.

Relevant output:

## clone_repos failing stderr: LEAK: secret
success_count=0, failed_repos=['owner/repo']
logs=INFO:[clone] Cloning 1 repository(ies)...
INFO:[clone] Cloning owner/repo (github) -> repo/
WARNING:[clone] Failed: Cloning into '/tmp/tmpvmdawx3d/repos/repo'...
fatal: unable to access 'https://secret@github.com/owner/repo.git/': URL rejected: Bad hostname

This confirms the old exact token replacement missed a partially echoed credential component.

Step 2 — Apply the PR's changes:

Checked out PR head 34744d3fe8800265662421827bee945f9328a644.

Step 3 — Re-run with the fix in place:

Ran the same probe.

Relevant output:

## clone_repos failing stderr: REDACTED
success_count=0, failed_repos=['owner/repo']
logs=INFO:[clone] Cloning 1 repository(ies)...
INFO:[clone] Cloning owner/repo (github) -> repo/
WARNING:[clone] Failed: Cloning into '/tmp/tmp76vrerj1/repos/repo'...
fatal: unable to access 'https://****@github.com/owner/repo.git/': URL rejected: Bad hostname

This confirms the broader URL redaction also covers partially echoed credential strings in clone failures.
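
Why exact-token replacement misses this case can be shown with a small sketch. mask_exact imitates the old exact-match approach and mask_urls imitates a pattern-based redactor; both names and the regex are hypothetical, and the input string is the stderr line from the probe output above.

```python
import re

# Hypothetical substring pattern; the SDK's real regex may differ.
_EMBEDDED = re.compile(r"(https?://)[^\s/?#'\"]+@")


def mask_exact(text: str, token: str) -> str:
    """Old-style masking: replaces only verbatim occurrences of the token."""
    return text.replace(token, "****")


def mask_urls(text: str) -> str:
    """Pattern-based masking: replaces whatever sits in the URL userinfo slot."""
    return _EMBEDDED.sub(r"\1****@", text)


token = "tok@secret"
stderr = (
    "fatal: unable to access 'https://secret@github.com/owner/repo.git/': "
    "URL rejected: Bad hostname"
)
print(mask_exact(stderr, token))  # 'secret' survives: the full token never appears
print(mask_urls(stderr))          # userinfo is masked regardless of its content
```

The pattern-based version does not need to know the token at all, which is what makes it robust to partial or re-encoded echoes.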

Issues Found

None.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

@simonrosenberg simonrosenberg merged commit 0a13ebd into main Jun 18, 2026
36 of 37 checks passed
@simonrosenberg simonrosenberg deleted the fix-redact-remaining-url-sites branch June 18, 2026 14:30


Development

Successfully merging this pull request may close these issues.

Redact git credentials at remaining URL log/serialization sites (follow-up to #2154/#3689)

2 participants