ci: enforce anonymous commit author identity #113
Pull request overview
Adds a GitHub Actions workflow that verifies PR commit metadata uses GitHub’s @users.noreply.github.com email format to reduce the risk of accidentally exposing personal/corporate email addresses in commit history.
Changes:
- Introduces a new `pull_request` workflow to scan commit author emails in the PR range and fail if any are not `@users.noreply.github.com`.
    echo
    echo "Fix locally:"
    echo " git config user.email USER@users.noreply.github.com"
    echo " git rebase -i \$base # squash/edit to apply new identity"
The remediation hint prints `git rebase -i $base`, but `$base` is not defined in a developer’s local shell (it’s only a variable inside this CI step). Consider printing the actual base SHA (expand `${base}`) or suggesting a more actionable target like `origin/main` so the instructions are directly usable.
Suggested change:

    - echo " git rebase -i \$base # squash/edit to apply new identity"
    + echo " git rebase -i ${base} # squash/edit to apply new identity"
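The difference between the escaped and expanded forms can be seen in isolation. This is a minimal sketch, assuming a placeholder SHA in `base` and a hypothetical helper named `print_hint`; neither comes from the workflow itself.

```shell
#!/usr/bin/env bash
# Placeholder value (assumption); in the workflow this comes from the
# pull_request event payload.
base="0123abc"

# Hypothetical helper: prints both variants of the remediation hint.
print_hint() {
  # Escaped: the reader sees the literal text "$base", which is undefined
  # in their own shell.
  echo "  git rebase -i \$base"
  # Expanded: the reader sees the SHA that the CI step compared against.
  echo "  git rebase -i ${base}"
}

print_hint
```

Printing the expanded SHA keeps the hint copy-pasteable. A branch name such as `origin/main` would serve the same purpose, assuming that is the PR's target branch.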
    base="${{ github.event.pull_request.base.sha }}"
    head="${{ github.event.pull_request.head.sha }}"
    bad=$(git log --pretty='%h %ae' "${base}..${head}" \
      | grep -v 'users\.noreply\.github\.com$' || true)
The regex `users\.noreply\.github\.com$` matches any address that merely ends with that suffix (e.g., `someone@badusers.noreply.github.com`). To avoid false passes, anchor it at the `@` boundary, i.e. require `@users.noreply.github.com` at the end (and optionally match case-insensitively).
Suggested change:

    - | grep -v 'users\.noreply\.github\.com$' || true)
    + | grep -Eiv '@users\.noreply\.github\.com$' || true)
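The effect of the `@` anchor can be checked against a few sample addresses. This is a sketch only: `is_noreply` is a hypothetical helper and the addresses are made-up examples, not values from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the address ends in
# "@users.noreply.github.com" (compared case-insensitively).
is_noreply() {
  printf '%s\n' "$1" | grep -Eiq '@users\.noreply\.github\.com$'
}

# Made-up sample addresses: a valid noreply address, a look-alike domain,
# and a valid address in mixed case.
for email in \
  "1234+octocat@users.noreply.github.com" \
  "someone@badusers.noreply.github.com" \
  "Dev@Users.NoReply.GitHub.com"; do
  if is_noreply "$email"; then
    echo "ok   $email"
  else
    echo "bad  $email"
  fi
done
```

With the anchored pattern, only the look-alike `badusers` address is reported as `bad`; the unanchored original would have accepted it.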
    bad=$(git log --pretty='%h %ae' "${base}..${head}" \
      | grep -v 'users\.noreply\.github\.com$' || true)
The `bad=$(git log ... | grep ... || true)` pipeline can silently succeed even if `git log` fails (e.g., invalid/missing base/head SHAs), because the pipeline status comes from `grep` and `|| true` masks non-zero exits. This could let PRs with non-noreply emails pass without being checked. Split the commands so `git log` errors fail the step (or enable `pipefail` and only suppress `grep`'s exit code when it’s the expected “no matches”).
Suggested change:

    - bad=$(git log --pretty='%h %ae' "${base}..${head}" \
    -   | grep -v 'users\.noreply\.github\.com$' || true)
    + authors=$(git log --pretty='%h %ae' "${base}..${head}")
    + bad=$(printf '%s\n' "$authors" | grep -v 'users\.noreply\.github\.com$' || true)
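The masking described above can be reproduced without a repository. In this sketch, `list_emails` is a hypothetical stand-in for a failing `git log` call, and `check_masked` and `check_split` are illustrative names for the two shapes, not code from the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a `git log` call that fails (e.g. unknown SHA):
# it prints nothing and returns a non-zero status.
list_emails() {
  return 128
}

# Original shape: the pipeline's status is grep's, and `|| true` hides that
# as well, so the producer's failure is lost and `bad` is simply empty.
check_masked() {
  local bad
  bad=$(list_emails | grep -v '@users\.noreply\.github\.com$' || true)
  [ -z "$bad" ]
}

# Split shape: the producer's exit status is tested on its own before any
# filtering happens.
check_split() {
  local authors bad
  authors=$(list_emails) || return 2
  bad=$(printf '%s\n' "$authors" | grep -v '@users\.noreply\.github\.com$' || true)
  [ -z "$bad" ]
}

if check_masked; then echo "masked: passed"; else echo "masked: failed"; fi
if check_split;  then echo "split: passed";  else echo "split: failed";  fi
```

The masked variant reports a pass even though nothing was checked, while the split variant fails. In a CI step that runs with errexit enabled, a plain `authors=$(...)` assignment would abort the step in the same way; the explicit `|| return 2` here only makes that behavior visible outside such a step.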
    bad=$(git log --pretty='%h %ae' "${base}..${head}" \
      | grep -v 'users\.noreply\.github\.com$' || true)
    if [ -n "$bad" ]; then
      echo "::error::PR contains commits with non-noreply author email:"
      echo "$bad"
      echo
      echo "Fix locally:"
      echo " git config user.email USER@users.noreply.github.com"
      echo " git rebase -i \$base # squash/edit to apply new identity"
      exit 1
    fi
    echo "All commit author emails OK"
This only checks the commit author email (`%ae`). If the goal is preventing email PII in git metadata, consider also checking the committer email (`%ce`) since it can differ (e.g., amended/cherry-picked commits) and could still leak a non-noreply address.
Suggested change:

    - bad=$(git log --pretty='%h %ae' "${base}..${head}" \
    -   | grep -v 'users\.noreply\.github\.com$' || true)
    - if [ -n "$bad" ]; then
    -   echo "::error::PR contains commits with non-noreply author email:"
    -   echo "$bad"
    -   echo
    -   echo "Fix locally:"
    -   echo " git config user.email USER@users.noreply.github.com"
    -   echo " git rebase -i \$base # squash/edit to apply new identity"
    -   exit 1
    - fi
    - echo "All commit author emails OK"
    + bad=""
    + while read -r hash author_email committer_email; do
    +   case "$author_email" in
    +     *@users.noreply.github.com) author_ok=1 ;;
    +     *) author_ok=0 ;;
    +   esac
    +   case "$committer_email" in
    +     *@users.noreply.github.com) committer_ok=1 ;;
    +     *) committer_ok=0 ;;
    +   esac
    +   if [ "$author_ok" -eq 0 ] || [ "$committer_ok" -eq 0 ]; then
    +     bad="${bad}${hash} author=${author_email} committer=${committer_email}"$'\n'
    +   fi
    + done <<EOF
    + $(git log --pretty='%h %ae %ce' "${base}..${head}")
    + EOF
    + if [ -n "$bad" ]; then
    +   echo "::error::PR contains commits with non-noreply author or committer email:"
    +   printf '%s' "$bad"
    +   echo
    +   echo "Fix locally:"
    +   echo " git config user.email USER@users.noreply.github.com"
    +   echo " git rebase -i \$base # squash/edit to apply new identity"
    +   exit 1
    + fi
    + echo "All commit author and committer emails OK"
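The author-plus-committer check can also be exercised on canned input rather than a live `git log`. This is a sketch under stated assumptions: `is_noreply` and `find_offenders` are hypothetical helper names, the sample hashes and addresses are invented, and blank input lines are skipped so that an empty commit range reports nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the address ends in
# "@users.noreply.github.com" (case-sensitive glob match).
is_noreply() {
  case "$1" in
    *@users.noreply.github.com) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: reads "hash author_email committer_email" lines on
# stdin and prints one line per commit whose author or committer address is
# not a noreply address. Blank lines are skipped.
find_offenders() {
  local hash author_email committer_email
  while read -r hash author_email committer_email; do
    [ -n "$hash" ] || continue
    if ! is_noreply "$author_email" || ! is_noreply "$committer_email"; then
      echo "$hash author=$author_email committer=$committer_email"
    fi
  done
}

# Invented sample input in the shape produced by:
#   git log --pretty='%h %ae %ce'
find_offenders <<'EOF'
a1b2c3d 1+dev@users.noreply.github.com 1+dev@users.noreply.github.com
e4f5a6b 1+dev@users.noreply.github.com dev@corp.example
EOF
```

Only the second sample commit is reported, because its committer address differs from its author address. Reading from stdin keeps the filter separate from the `git log` call, so the caller can check that command's exit status before filtering.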
Adds a CI check that fails any PR containing commits with non-`@users.noreply.github.com` author emails. Prevents accidental PII leaks from local git config defaulting to a corp/personal email.

Co-authored-by: fitz123 <fitz123@users.noreply.github.com>