fix: normalize USPTO ODP search response to results/totalHits contract #24
The USPTO Open Data Portal search endpoint returns the result set under `patentFileWrapperDataBag` with the match count under `count`, but the client and all of its callers (search_patents_simple, get_patent_by_number, get_patent_by_application, get_recent_patents, check_api_status_detailed) read `results` and `totalHits`. As a result every search silently returned zero results. Add `_normalize_search_response` to map the current ODP envelope onto the documented `results`/`totalHits` contract, applied in `_make_request` immediately after JSON parsing so logging, empty-result diagnostics, and the return value all observe a stable shape. Existing `results`/`totalHits` keys are preserved, and non-search payloads pass through unchanged. Add tests covering the mapping, empty data bag, passthrough, and no-override behavior.
Code Review
This pull request introduces a response normalization helper _normalize_search_response in mcp_server/uspto_api.py to map the USPTO ODP search envelope (patentFileWrapperDataBag and count) to a stable client contract (results and totalHits), along with corresponding unit tests. The review feedback suggests refining the normalization logic to correctly handle cases where the patentFileWrapperDataBag key is completely missing from the API response (e.g., when only count is returned) and splitting the tests to explicitly cover this scenario.

    if not isinstance(result, dict) or "patentFileWrapperDataBag" not in result:
        return result

    normalized = dict(result)
    normalized.setdefault("results", result.get("patentFileWrapperDataBag") or [])
    normalized.setdefault("totalHits", result.get("count", 0))

The current implementation returns the response unchanged if "patentFileWrapperDataBag" is completely missing from the dictionary (due to the check on line 336). However, the test docstring for test_normalize_empty_databag_yields_empty_results states: 'A missing/empty data bag normalizes to an empty result list and zero hits.'
To truly support a missing "patentFileWrapperDataBag" (e.g., if the API returns only {"count": 0} or if the key is omitted on empty results), we should allow normalization if either "patentFileWrapperDataBag" or "count" is present.
Additionally, using result.get("count") or 0 instead of result.get("count", 0) is safer in case "count" is explicitly set to None in the API response.
Suggested change:

    -if not isinstance(result, dict) or "patentFileWrapperDataBag" not in result:
    -    return result
    -normalized = dict(result)
    -normalized.setdefault("results", result.get("patentFileWrapperDataBag") or [])
    -normalized.setdefault("totalHits", result.get("count", 0))
    +if not isinstance(result, dict) or ("patentFileWrapperDataBag" not in result and "count" not in result):
    +    return result
    +normalized = dict(result)
    +normalized.setdefault("results", result.get("patentFileWrapperDataBag") or [])
    +normalized.setdefault("totalHits", result.get("count") or 0)

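The point about an explicit `None` can be checked in isolation. The following is a minimal sketch, not code from the PR: the two helper names are hypothetical, and the payloads are made up to show how `dict.get` with a default differs from `dict.get(...) or 0`.

```python
def total_hits_default(payload):
    # dict.get only falls back to the default when the key is absent,
    # so an explicit None stored under "count" is returned as-is.
    return payload.get("count", 0)


def total_hits_coalesced(payload):
    # `or 0` additionally replaces None (and any other falsy value) with 0.
    return payload.get("count") or 0


explicit_null = {"count": None}  # made-up payload for illustration
print(total_hits_default(explicit_null))    # None
print(total_hits_coalesced(explicit_null))  # 0
```

Both forms agree when the key is missing or holds a positive integer; they differ only when the value is present but falsy.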

    def test_normalize_empty_databag_yields_empty_results():
        """A missing/empty data bag normalizes to an empty result list and zero hits."""
        normalized = USPTOClient._normalize_search_response(
            {"count": 0, "patentFileWrapperDataBag": []}
        )

        assert normalized["results"] == []
        assert normalized["totalHits"] == 0

To match the updated normalization logic and fully cover the scenario where "patentFileWrapperDataBag" is completely missing from the response, we should split this test into two: one for an empty data bag and one for a completely missing data bag.
Suggested change:

    -def test_normalize_empty_databag_yields_empty_results():
    -    """A missing/empty data bag normalizes to an empty result list and zero hits."""
    -    normalized = USPTOClient._normalize_search_response(
    -        {"count": 0, "patentFileWrapperDataBag": []}
    -    )
    -    assert normalized["results"] == []
    -    assert normalized["totalHits"] == 0
    +def test_normalize_empty_databag_yields_empty_results():
    +    """An empty data bag normalizes to an empty result list and zero hits."""
    +    normalized = USPTOClient._normalize_search_response(
    +        {"count": 0, "patentFileWrapperDataBag": []}
    +    )
    +    assert normalized["results"] == []
    +    assert normalized["totalHits"] == 0
    +
    +def test_normalize_missing_databag_yields_empty_results():
    +    """A completely missing data bag normalizes to an empty result list and zero hits."""
    +    normalized = USPTOClient._normalize_search_response(
    +        {"count": 0}
    +    )
    +    assert normalized["results"] == []
    +    assert normalized["totalHits"] == 0

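To see why the second test needs the updated condition, the two guards can be compared side by side. This is a standalone sketch under assumptions: the function names are hypothetical stand-ins for the method under review, the bodies copy the snippets quoted in this review, and the final `return normalized` is assumed because the quoted snippets stop before it.

```python
def normalize_original(result):
    """Guard as first submitted: only maps when the data bag key is present."""
    if not isinstance(result, dict) or "patentFileWrapperDataBag" not in result:
        return result
    normalized = dict(result)
    normalized.setdefault("results", result.get("patentFileWrapperDataBag") or [])
    normalized.setdefault("totalHits", result.get("count", 0))
    return normalized  # assumed: not shown in the quoted snippet


def normalize_suggested(result):
    """Guard from the suggestion: maps when either the data bag or count is present."""
    if not isinstance(result, dict) or (
        "patentFileWrapperDataBag" not in result and "count" not in result
    ):
        return result
    normalized = dict(result)
    normalized.setdefault("results", result.get("patentFileWrapperDataBag") or [])
    normalized.setdefault("totalHits", result.get("count") or 0)
    return normalized  # assumed: not shown in the quoted snippet


count_only = {"count": 0}  # made-up payload with the data bag omitted
print(normalize_original(count_only))   # {'count': 0}
print(normalize_suggested(count_only))  # {'count': 0, 'results': [], 'totalHits': 0}
```

With the original guard, `normalized["results"]` in the new test would raise `KeyError`. One trade-off worth weighing: under the broader guard, any dict payload with a top-level `count` key gains `results` and `totalHits`, whether or not it came from the search endpoint.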
Problem
`USPTOClient._make_request` returns the raw USPTO Open Data Portal search envelope, which carries results under `patentFileWrapperDataBag` and the match count under `count`. Every caller, however, reads `results`/`totalHits`: `search_patents_simple`, `get_patent_by_number`, `get_patent_by_application`, `get_recent_patents`, and `check_api_status_detailed`. The result is that searches silently return zero results even when the API responds with matches.
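The failure mode is easy to reproduce without touching the network. The sketch below is illustrative only: the payload contents are placeholders, and `read_like_a_caller` is a hypothetical stand-in that assumes callers read the two keys with empty defaults, which is consistent with the silent-zero behavior described but not confirmed by the source.

```python
# Made-up payload in the envelope shape described above; the records are placeholders.
raw_envelope = {
    "count": 2,
    "patentFileWrapperDataBag": [{"id": "record-1"}, {"id": "record-2"}],
}


def read_like_a_caller(response):
    """Hypothetical caller: reads the documented keys, defaulting when absent."""
    results = response.get("results", [])
    total_hits = response.get("totalHits", 0)
    return results, total_hits


results, total_hits = read_like_a_caller(raw_envelope)
print(results, total_hits)  # [] 0
```

No exception is raised, which is why the mismatch looks like an empty search rather than an error.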
Fix
Add `USPTOClient._normalize_search_response`, applied in `_make_request` immediately after JSON parsing, which maps the current ODP envelope onto the documented `results`/`totalHits` contract. It runs before the success log and empty-result diagnostics so they observe a stable shape too. Existing `results`/`totalHits` keys are never overwritten, and non-search payloads pass through unchanged.
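The ordering described here (parse, normalize, then log) might look roughly like the following. This is a sketch under assumptions, not the code in `mcp_server/uspto_api.py`: `handle_response_body`, the logger name, and the log message are hypothetical, and the normalizer is a module-level stand-in for the method.

```python
import json
import logging

logger = logging.getLogger("uspto_sketch")  # hypothetical logger name


def normalize_search_response(result):
    """Stand-in for the helper: map the ODP envelope onto results/totalHits."""
    if not isinstance(result, dict) or "patentFileWrapperDataBag" not in result:
        return result  # non-search payloads pass through unchanged
    normalized = dict(result)  # shallow copy; the parsed payload is not mutated
    # setdefault leaves any pre-existing results/totalHits keys untouched.
    normalized.setdefault("results", result.get("patentFileWrapperDataBag") or [])
    normalized.setdefault("totalHits", result.get("count", 0))
    return normalized


def handle_response_body(body):
    """Hypothetical slice of a request method: parse, normalize, then log."""
    result = normalize_search_response(json.loads(body))
    # Logging runs after normalization, so it reads the stable keys.
    if isinstance(result, dict) and "results" in result:
        logger.info(
            "search returned %d of %s hits",
            len(result["results"]),
            result.get("totalHits"),
        )
    return result
```

For example, `handle_response_body('{"count": 1, "patentFileWrapperDataBag": [{"id": "a"}]}')` returns a dict that keeps the original two keys and adds `results` and `totalHits`.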
Tests
Adds `tests/test_uspto_api.py` covering the envelope mapping, empty data bag, no-override of pre-existing keys, and passthrough of non-search payloads. `ruff check` is clean on the changed files; the new tests pass.
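The no-override and passthrough cases are not quoted on this page, so here is a sketch of what such tests could look like. It is not the content of `tests/test_uspto_api.py`: the test names and payloads are hypothetical, and a module-level stand-in replaces `USPTOClient._normalize_search_response` so the block runs on its own.

```python
def normalize_search_response(result):
    """Stand-in mirroring the helper's described behavior."""
    if not isinstance(result, dict) or "patentFileWrapperDataBag" not in result:
        return result
    normalized = dict(result)
    normalized.setdefault("results", result.get("patentFileWrapperDataBag") or [])
    normalized.setdefault("totalHits", result.get("count", 0))
    return normalized


def test_existing_keys_are_not_overridden():
    """Pre-existing results/totalHits win over the ODP envelope values."""
    payload = {
        "results": ["kept"],
        "totalHits": 99,
        "count": 1,
        "patentFileWrapperDataBag": [{"id": "ignored"}],
    }
    normalized = normalize_search_response(payload)
    assert normalized["results"] == ["kept"]
    assert normalized["totalHits"] == 99


def test_non_search_payload_passes_through():
    """A payload without the data bag key is returned as the same object."""
    payload = {"status": "ok"}
    assert normalize_search_response(payload) is payload
```

Run with `pytest`, both tests should pass against the stand-in; pointing them at the real method would only require swapping the call.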