fix: normalize USPTO ODP search response to results/totalHits contract #24

Open
edwardwang-detecteng wants to merge 1 commit into RobThePCGuy:main from edwardwang-detecteng:fix/uspto-search-response-normalization

@edwardwang-detecteng

Problem

USPTOClient._make_request returns the raw USPTO Open Data Portal search
envelope, which carries results under patentFileWrapperDataBag and the match
count under count. Every caller, however, reads results / totalHits:
search_patents_simple, get_patent_by_number, get_patent_by_application,
get_recent_patents, and check_api_status_detailed. The result is that
searches silently return zero results even when the API responds with matches.
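
The failure mode can be reproduced in isolation. The sketch below assumes callers read the response with `dict.get` and empty defaults (the call sites are not shown in this PR), and the record contents are placeholders rather than real ODP fields.

```python
# Raw envelope shaped like the ODP search response described above.
# The records are placeholders, not real USPTO fields.
raw_envelope = {
    "count": 2,
    "patentFileWrapperDataBag": [{"id": "A"}, {"id": "B"}],
}


def read_like_a_caller(response):
    """Hypothetical caller that expects the results/totalHits contract."""
    return response.get("results", []), response.get("totalHits", 0)


results, total_hits = read_like_a_caller(raw_envelope)
print(results, total_hits)  # prints: [] 0
```

The API reported two matches, but the caller sees none because neither expected key exists in the raw envelope.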

Fix

Add USPTOClient._normalize_search_response, applied in _make_request
immediately after JSON parsing, which maps the current ODP envelope onto the
documented results / totalHits contract. It runs before the success log
and empty-result diagnostics so they observe a stable shape too. Existing
results / totalHits keys are never overwritten, and non-search payloads
pass through unchanged.
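
As a rough illustration of the mapping described here, the following standalone sketch mirrors the lines quoted in the review thread for mcp_server/uspto_api.py. It is written as a free function rather than the `USPTOClient` method, and the final `return normalized` is an assumption, since the quoted excerpt ends before the return.

```python
def normalize_search_response(result):
    """Map an ODP-style search envelope onto results/totalHits (sketch)."""
    # Non-dict payloads and payloads without the data bag pass through as-is.
    if not isinstance(result, dict) or "patentFileWrapperDataBag" not in result:
        return result

    # Copy so the caller's dictionary is not mutated.
    normalized = dict(result)
    # setdefault leaves any pre-existing results/totalHits keys untouched.
    normalized.setdefault("results", result.get("patentFileWrapperDataBag") or [])
    normalized.setdefault("totalHits", result.get("count", 0))
    return normalized  # assumed; not visible in the quoted excerpt


mapped = normalize_search_response(
    {"count": 1, "patentFileWrapperDataBag": [{"id": "A"}]}
)
kept = normalize_search_response(
    {"count": 5, "patentFileWrapperDataBag": [{"id": "A"}], "results": ["x"]}
)
passthrough = normalize_search_response({"status": "ok"})
```

Here `mapped` gains `results` and `totalHits`, `kept` retains its original `results` value, and `passthrough` is returned unchanged because it carries no data bag.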

Tests

Adds tests/test_uspto_api.py covering the envelope mapping, empty data bag,
no-override of pre-existing keys, and passthrough of non-search payloads.
ruff check is clean on the changed files; the new tests pass.

The USPTO Open Data Portal search endpoint returns the result set under
`patentFileWrapperDataBag` with the match count under `count`, but the
client and all of its callers (search_patents_simple, get_patent_by_number,
get_patent_by_application, get_recent_patents, check_api_status_detailed)
read `results` and `totalHits`. As a result every search silently returned
zero results.

Add `_normalize_search_response` to map the current ODP envelope onto the
documented `results`/`totalHits` contract, applied in `_make_request`
immediately after JSON parsing so logging, empty-result diagnostics, and the
return value all observe a stable shape. Existing `results`/`totalHits` keys
are preserved, and non-search payloads pass through unchanged.

Add tests covering the mapping, empty data bag, passthrough, and no-override
behavior.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a response normalization helper _normalize_search_response in mcp_server/uspto_api.py to map the USPTO ODP search envelope (patentFileWrapperDataBag and count) to a stable client contract (results and totalHits), along with corresponding unit tests. The review feedback suggests refining the normalization logic to correctly handle cases where the patentFileWrapperDataBag key is completely missing from the API response (e.g., when only count is returned) and splitting the tests to explicitly cover this scenario.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread mcp_server/uspto_api.py
Comment on lines +336 to +341
    if not isinstance(result, dict) or "patentFileWrapperDataBag" not in result:
        return result

    normalized = dict(result)
    normalized.setdefault("results", result.get("patentFileWrapperDataBag") or [])
    normalized.setdefault("totalHits", result.get("count", 0))

medium

The current implementation returns the response unchanged if "patentFileWrapperDataBag" is completely missing from the dictionary (due to the check on line 336). However, the test docstring for test_normalize_empty_databag_yields_empty_results states: 'A missing/empty data bag normalizes to an empty result list and zero hits.'

To truly support a missing "patentFileWrapperDataBag" (e.g., if the API returns only {"count": 0} or if the key is omitted on empty results), we should allow normalization if either "patentFileWrapperDataBag" or "count" is present.

Additionally, using result.get("count") or 0 instead of result.get("count", 0) is safer in case "count" is explicitly set to None in the API response.
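
The difference between the two forms only shows up when the key is present with a `None` value. A small sketch, using a made-up payload to illustrate the point:

```python
# Hypothetical payload where the API sends an explicit null for count.
payload = {"count": None}

# The default in dict.get applies only when the key is absent,
# so an explicit None is returned as-is.
with_default = payload.get("count", 0)

# "or 0" also replaces falsy values such as None with 0.
with_or = payload.get("count") or 0

# When the key is missing entirely, both forms agree.
missing_default = {}.get("count", 0)
missing_or = {}.get("count") or 0
```

With this payload `with_default` is `None` while `with_or` is `0`; for the missing key both are `0`.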

Suggested change
    - if not isinstance(result, dict) or "patentFileWrapperDataBag" not in result:
    -     return result
    - normalized = dict(result)
    - normalized.setdefault("results", result.get("patentFileWrapperDataBag") or [])
    - normalized.setdefault("totalHits", result.get("count", 0))
    + if not isinstance(result, dict) or ("patentFileWrapperDataBag" not in result and "count" not in result):
    +     return result
    + normalized = dict(result)
    + normalized.setdefault("results", result.get("patentFileWrapperDataBag") or [])
    + normalized.setdefault("totalHits", result.get("count") or 0)

Comment thread tests/test_uspto_api.py
Comment on lines +30 to +37
    def test_normalize_empty_databag_yields_empty_results():
        """A missing/empty data bag normalizes to an empty result list and zero hits."""
        normalized = USPTOClient._normalize_search_response(
            {"count": 0, "patentFileWrapperDataBag": []}
        )

        assert normalized["results"] == []
        assert normalized["totalHits"] == 0

medium

To match the updated normalization logic and fully cover the scenario where "patentFileWrapperDataBag" is completely missing from the response, we should split this test into two: one for an empty data bag and one for a completely missing data bag.

Suggested change
    - def test_normalize_empty_databag_yields_empty_results():
    -     """A missing/empty data bag normalizes to an empty result list and zero hits."""
    -     normalized = USPTOClient._normalize_search_response(
    -         {"count": 0, "patentFileWrapperDataBag": []}
    -     )
    -     assert normalized["results"] == []
    -     assert normalized["totalHits"] == 0
    + def test_normalize_empty_databag_yields_empty_results():
    +     """An empty data bag normalizes to an empty result list and zero hits."""
    +     normalized = USPTOClient._normalize_search_response(
    +         {"count": 0, "patentFileWrapperDataBag": []}
    +     )
    +     assert normalized["results"] == []
    +     assert normalized["totalHits"] == 0
    + def test_normalize_missing_databag_yields_empty_results():
    +     """A completely missing data bag normalizes to an empty result list and zero hits."""
    +     normalized = USPTOClient._normalize_search_response(
    +         {"count": 0}
    +     )
    +     assert normalized["results"] == []
    +     assert normalized["totalHits"] == 0
