chore: add version scanner to CI by chalmerlowe · Pull Request #17424 · googleapis/google-cloud-python

chalmerlowe · 2026-06-11T13:40:06Z

Note

This is a WIP to test whether the version scanner can protect against regressions that reintroduce references to recently removed EOL runtimes or dependencies.

gemini-code-assist

Code Review

This pull request introduces an Automated Dependency Version Scanner tool, which includes the core scanner script, a benchmarking utility, regex configuration rules, and a comprehensive test suite. The reviewer feedback highlights several key areas for improvement: resolving .scannerignore paths relative to the target directory being scanned, supporting wildcard matching in ignore patterns using fnmatch, cleaning up redundant code in the --stdout output block, and making file paths in integration and benchmark tests robust by resolving them relative to __file__.

gemini-code-assist · 2026-06-11T13:42:36Z

+    # Load ignore file from script directory (Option A)
+    script_dir = os.path.dirname(os.path.abspath(__file__))
+    ignore_file_path = os.path.join(script_dir, ".scannerignore")
+    ignore_dirs = load_ignore_file(ignore_file_path)


The README states that users can create a .scannerignore file in the directory being scanned (usually the repo root). However, the current implementation only loads .scannerignore from the script's directory (scripts/version_scanner/), meaning any .scannerignore placed in the target directory (e.g., the repository root) will be ignored.

We should update this to load .scannerignore from the target path (args.path), and fall back to the script's directory if not found.

Suggested change

-    # Load ignore file from script directory (Option A)
-    script_dir = os.path.dirname(os.path.abspath(__file__))
-    ignore_file_path = os.path.join(script_dir, ".scannerignore")
-    ignore_dirs = load_ignore_file(ignore_file_path)
+    # Load ignore file from the target directory being scanned, falling back to the script directory
+    ignore_file_path = os.path.join(args.path, ".scannerignore")
+    if not os.path.exists(ignore_file_path):
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        ignore_file_path = os.path.join(script_dir, ".scannerignore")
+    ignore_dirs = load_ignore_file(ignore_file_path)
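
The lookup order proposed in this comment can be sketched in isolation. This is a minimal sketch, assuming a hypothetical helper named `resolve_ignore_file` that is not part of the PR; only the `.scannerignore` filename and the target-first, script-directory-second order come from the review comment.

```python
import os


def resolve_ignore_file(target_path, script_dir, name=".scannerignore"):
    """Return the ignore file in target_path if present, else the one in script_dir.

    Hypothetical helper for illustration; the suggestion above inlines this logic.
    """
    candidate = os.path.join(target_path, name)
    if os.path.exists(candidate):
        return candidate
    # Fall back to the defaults shipped next to the scanner script.
    return os.path.join(script_dir, name)
```

Keeping the fallback means a scan of a directory without its own `.scannerignore` still picks up the defaults stored beside the script.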

gemini-code-assist · 2026-06-11T13:42:36Z

+    if args.stdout:
+        print("\n=== CSV Output ===")
+        import io
+        output = io.StringIO()
+        write_csv_report(output_path, all_matches, github_repo=args.github_repo, branch=args.branch) # this writes to the file, but we want stdout too
+        # Let's just read the file and print it
+        with open(output_path, 'r', encoding='utf-8') as f:
+            print(f.read(), end='')


If --stdout is enabled, the script currently imports io, creates an unused io.StringIO() object, and redundantly calls write_csv_report a second time (it was already called on line 679). We can simplify this block by removing the unused import, the unused variable, and the redundant function call, and simply read and print the already generated CSV file.

Suggested change

-    if args.stdout:
-        print("\n=== CSV Output ===")
-        import io
-        output = io.StringIO()
-        write_csv_report(output_path, all_matches, github_repo=args.github_repo, branch=args.branch) # this writes to the file, but we want stdout too
-        # Let's just read the file and print it
-        with open(output_path, 'r', encoding='utf-8') as f:
-            print(f.read(), end='')
+    if args.stdout:
+        print("\n=== CSV Output ===")
+        # Let's just read the file and print it
+        with open(output_path, 'r', encoding='utf-8') as f:
+            print(f.read(), end='')

References

Remove duplicate lines of code, especially duplicate assertions in tests, to keep the codebase clean and avoid redundancy.
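
An alternative to re-reading the file is to render the CSV text once and send the same string to both destinations. The following is a minimal sketch of that idea, not the reviewer's suggestion and not the PR's code: `render_csv`, `emit_report`, and the column names are hypothetical, and the real `write_csv_report` is not reproduced here.

```python
import csv
import io


def render_csv(rows, fieldnames):
    """Render rows (a list of dicts) to a CSV string. Hypothetical helper."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def emit_report(rows, fieldnames, output_path, to_stdout=False):
    """Write the report file once and optionally echo the same text to stdout."""
    text = render_csv(rows, fieldnames)
    # newline="" stops the text layer from translating the "\n" terminators.
    with open(output_path, "w", encoding="utf-8", newline="") as f:
        f.write(text)
    if to_stdout:
        print("\n=== CSV Output ===")
        print(text, end="")
    return text
```

With this shape, the file and the console output cannot drift apart, because both come from one rendered string.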

gemini-code-assist · 2026-06-11T13:42:36Z

+        # Prune ignore directories (case-insensitive)
+        dirs[:] = [d for d in dirs if d.lower() not in ignore_lower]
+
+        # Filter ignore files (case-insensitive)
+        files = [f for f in files if f.lower() not in ignore_lower]


The .scannerignore file contains wildcard patterns (e.g., *.jpg, *.png). However, the current filtering logic uses direct string membership (not in ignore_lower), which does not support wildcards or globbing. To support wildcard matching as described in the README, we can use fnmatch.fnmatchcase from the standard library.

Note: You will also need to add import fnmatch at the top of the file.

Suggested change

-        # Prune ignore directories (case-insensitive)
-        dirs[:] = [d for d in dirs if d.lower() not in ignore_lower]
-
-        # Filter ignore files (case-insensitive)
-        files = [f for f in files if f.lower() not in ignore_lower]
+        # Prune ignore directories (case-insensitive)
+        dirs[:] = [d for d in dirs if not any(fnmatch.fnmatchcase(d.lower(), pattern) for pattern in ignore_lower)]
+
+        # Filter ignore files (case-insensitive)
+        files = [f for f in files if not any(fnmatch.fnmatchcase(f.lower(), pattern) for pattern in ignore_lower)]
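
The difference between set membership and wildcard matching is easy to see on a small example. This is a minimal sketch, assuming a hypothetical `is_ignored` helper and a sample pattern list; `*.jpg` and `*.png` are mentioned in the review comment, while `node_modules` and the file names are made up for illustration.

```python
import fnmatch


def is_ignored(name, patterns):
    """Return True if name matches any ignore pattern, case-insensitively.

    Both sides are lowercased first, so fnmatchcase (which never normalizes
    case itself) behaves the same on every platform.
    """
    lowered = name.lower()
    return any(fnmatch.fnmatchcase(lowered, p.lower()) for p in patterns)


patterns = ["*.jpg", "*.png", "node_modules"]  # sample entries, not the PR's actual file
files = ["Logo.PNG", "setup.py", "photo.jpg"]
kept = [f for f in files if not is_ignored(f, patterns)]
```

Patterns without wildcard characters still match only the exact name, so plain directory entries keep their current behavior.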

gemini-code-assist · 2026-06-11T13:42:36Z

+    # Paths to real tools
+    scanner_path = os.path.abspath("version_scanner.py")
+    config_path = os.path.abspath("regex_config.yaml")
+
+    # Static data directory
+    data_dir = os.path.abspath("tests/data")


The integration test currently uses relative paths like version_scanner.py and tests/data assuming the test is run from the scripts/version_scanner/ directory. If the tests are run from the repository root (which is common in CI/CD environments), these paths will resolve incorrectly and the test will fail.

We can make the paths robust and directory-independent by resolving them relative to __file__.

Suggested change

-    # Paths to real tools
-    scanner_path = os.path.abspath("version_scanner.py")
-    config_path = os.path.abspath("regex_config.yaml")
-
-    # Static data directory
-    data_dir = os.path.abspath("tests/data")
+    # Paths to real tools resolved relative to this test file
+    test_dir = os.path.dirname(os.path.abspath(__file__))
+    scanner_path = os.path.abspath(os.path.join(test_dir, "../../version_scanner.py"))
+    config_path = os.path.abspath(os.path.join(test_dir, "../../regex_config.yaml"))
+
+    # Static data directory
+    data_dir = os.path.abspath(os.path.join(test_dir, "../data"))
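
The same anchoring idea can be factored into a small helper so each test does not repeat the `os.path` calls. This is a minimal sketch, assuming a hypothetical `path_relative_to` function; the directory layout in the usage note is inferred from the `../../` and `../data` segments in the suggestion and may not match the PR exactly.

```python
import os


def path_relative_to(anchor_file, *parts):
    """Resolve parts against the directory containing anchor_file.

    Hypothetical helper; in a test module, pass __file__ as anchor_file.
    """
    anchor_dir = os.path.dirname(os.path.abspath(anchor_file))
    return os.path.abspath(os.path.join(anchor_dir, *parts))
```

In a test module this would be called as `path_relative_to(__file__, "..", "..", "version_scanner.py")`, and the result does not depend on the directory from which the test runner was started.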

gemini-code-assist · 2026-06-11T13:42:36Z

+    (packages_dir / "pkg1").mkdir()
+    (packages_dir / "pkg1" / "test.py").write_text("version = '3.7'\n")
+
+    scanner_path = "version_scanner.py"


Similar to the integration test, using a hardcoded relative path "version_scanner.py" will cause the test to fail if executed from any directory other than scripts/version_scanner/. We should resolve the path relative to __file__ to make the test robust.

Suggested change

-    scanner_path = "version_scanner.py"
+    test_dir = os.path.dirname(os.path.abspath(__file__))
+    scanner_path = os.path.abspath(os.path.join(test_dir, "../../version_scanner.py"))

chalmerlowe added 30 commits April 29, 2026 08:21

feat(scripts): Add dependency version scanner tool

f446ff7

perf(search): Apply bot suggestions for regex optimization and imports

256b048

refactor(benchmark): Use tempfile for unique names and safe cleanup

1010399

refactor(benchmark): Remove redundant directory check

68f61ee

test(integration): Check exit code of subprocess in integration test

cc960b4

test(unit): Remove redundant and brittle test_regex_patterns

a4ad9ce

test(unit): Move import yaml to top of file

2743957

refactor(benchmark): Remove redundant directory check in main

47450bb

test(unit): Remove duplicate import yaml from function

c777e44

feat(version_scanner): handle invalid format strings in config and ad…

8aab801

…d tests

feat(version_scanner): handle PermissionError when reading config fil…

f63053c

…e and add tests

feat(version_scanner): extract read_package_file and handle file errors

2af97b3

refactor(version_scanner): simplify target resolution and remove dupl…

cb29438

…ication

feat(version_scanner): add format_match_for_csv helper and tests

ea0e8be

feat(version_scanner): integrate GitHub link generation into CSV report

a8824af

feat(version_scanner): default output to results directory

baafb74

feat(version_scanner): ignore version_scanner directory during scan

a1cc08e

feat(version_scanner): broaden version regex and add case insensitivity

3ceea9b

feat(version_scanner): strip newlines from matched strings

d756c07

feat(version_scanner): add word boundaries and truncate long context …

075d04b

…lines

feat(version_scanner): add console summary table

85e9ff5

feat(version_scanner): add .scannerignore file support

5c8f673

feat(version_scanner): move ignore defaults to .scannerignore file

efb3331

docs(version_scanner): add README.md

bf39072

docs(version_scanner): update README options and CLI help strings

9d9ce22

feat(version_scanner): set default for --github-repo

14e4dcc

feat(version_scanner): default config path to script directory

7fc03ca

feat(version_scanner): support case-insensitive file ignores and add …

f64eac4

…changelog.md

feat(version_scanner): update small package list for demos

fc47dd6

Merge remote-tracking branch 'origin/main' into feat/add-version-scanner

95f6f19

chalmerlowe added 21 commits April 30, 2026 09:59

Merge branch 'origin/main' into feat/add-version-scanner

761def6

feat(version_scanner): add combined_version_string rule and use word …

9289c8c

…boundaries for explicit_version_string

feat(scanner): add ability to detect ignore pragma

d771258

feat(scanner): move .scannerignore to script directory and update loo…

bafae70

…kup logic

chore(scanner): ignore repositories.bzl in scanner

94174bb

feat(scanner): add filename scanning support

d652dbf

docs(scanner): update README with known issues and add binary ignores…

a1188c8

… to .scannerignore

docs(version-scanner): merge migration guide into README.md

0a6ae92

Merge branch 'main' into feat/add-version-scanner

7cdbe72

chore(version_scanner): add Apache 2.0 copyright headers

303906d

feat(version_scanner): implement lazy optional dependencies for googl…

919ae7e

…e api clients

docs(version_scanner): update README with setup, scope, limitations, …

a7907a9

…and package-file details

feat(version_scanner): implement generic subdirectory filtering and l…

208aa74

…ayout-agnostic package naming

docs(version_scanner): add disclaimer regarding prompt usage and LLM …

4287c04

…limits

feat: complete Phase 1 of version_scanner project

2f229ab

Adds distinct exit codes, --stdout support, workflow file, and fixes 3.10 truncation.

test: fix ConfigManager signature and regex assertions

3e0dbfb

test: fix regex rule compilation in tests

73b9fbf

fix: prevent truncation in sys.version_info.minor regexes

d11e606

fix: force string format in CSV output to prevent spreadsheet truncation

e3af193

build: update version_scanner.yml triggers to match repo standards

514dfef

chore: test workflow on push

a1b609b

gemini-code-assist Bot reviewed Jun 11, 2026

chalmerlowe closed this Jun 11, 2026

chalmerlowe wants to merge 51 commits into main from feat/add-version-scanner
