fix: return 400 for Elasticsearch query parse errors instead of 500 #106
Merged
…ead of 500
Malformed Lucene syntax (e.g. MARC subject headings with trailing `)-`) causes ES to return HTTP 400. Previously any non-200 ES response in the search and random paths was blindly mapped to SearchFailure → HTTP 500.
Added a SearchQueryParseFailure response type, wired it through the ES client and registry, and added end-to-end test coverage.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Walkthrough
The changes introduce explicit handling for Elasticsearch query parse failures (HTTP 400 errors) by adding a new response type.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant SearchRegistry
participant ElasticSearchClient
participant Elasticsearch
Client->>SearchRegistry: SearchQuery (with malformed q param)
SearchRegistry->>ElasticSearchClient: Execute search
ElasticSearchClient->>Elasticsearch: HTTP request
Elasticsearch-->>ElasticSearchClient: HTTP 400 Bad Request
ElasticSearchClient-->>SearchRegistry: SearchQueryParseFailure
SearchRegistry-->>Client: 400 BadRequest + ValidationFailure(message)
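The client-side step in the diagram (ES returns 400, the client emits `SearchQueryParseFailure`) can be sketched roughly as below. This is a minimal illustration, not the project's code: the `SearchResponse` trait, the `statusCode` field on `ElasticSearchHttpError`, and the `mapSearchError` helper are assumptions made for the sketch; only the names `SearchFailure`, `SearchQueryParseFailure`, and `ElasticSearchHttpError` come from the PR.

```scala
object EsStatusMappingSketch {
  // Simplified stand-ins for the project's protocol types (assumed shapes).
  sealed trait SearchResponse
  case object SearchFailure extends SearchResponse
  case object SearchQueryParseFailure extends SearchResponse

  // Assumption: the error carries the ES status code as an Int.
  // The PR only shows the pattern `ElasticSearchHttpError(_)`.
  final case class ElasticSearchHttpError(statusCode: Int)

  // Hypothetical helper: match 400 explicitly before the wildcard,
  // so the status code is no longer discarded.
  def mapSearchError(error: ElasticSearchHttpError): SearchResponse =
    error match {
      case ElasticSearchHttpError(400) => SearchQueryParseFailure
      case ElasticSearchHttpError(_)   => SearchFailure
    }
}
```

The ordering of the two cases is the point: with only the wildcard case, every non-200 response collapses into `SearchFailure`, which is the behavior this PR changes.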
Estimated code review effort: 3 (Moderate), ~20 minutes
Pre-merge checks: 2 passed
Summary
- Malformed Lucene syntax in the `q` parameter (e.g. MARC bibliographic subject headings with a trailing `)-`, like `García,, Eloy,, 1958)-`) causes Elasticsearch to return HTTP 400. Previously `ElasticSearchClient.processSearch` and `processRandom` used a wildcard `case ElasticSearchHttpError(_)` that discarded the status code and mapped all non-200 ES responses to `SearchFailure` → HTTP 500.
- Added `SearchQueryParseFailure` to `SearchProtocol` and updated `processSearch`/`processRandom` in `ElasticSearchClient` to emit it when ES returns 400 (mirroring how `processFetch` already handles 404).
- Wired `SearchQueryParseFailure` through `SearchRegistryBehavior` → `ValidationFailure("The q parameter contains invalid search syntax.")` → HTTP 400 with JSON body for search and random paths; added a defensive handler in the fetch path.
- Added `MockEsClientQueryParseError` and an end-to-end test asserting `/v2/items?q=invalid%29-` returns 400.
Root cause
Observed in production: a Basque library scraper (`Jakarta Commons-HttpClient/3.1`) was sending MARC subject headings as Lucene queries, generating 7 HTTP 500s that should have been 400s. The ES 400 responses were silently swallowed and re-emitted as 500s.
Test plan
- `sbt "compile; test"` passes (274/274)
- Elasticsearch query parse error → return BadRequest for /v2/items
- `SearchFailure` (non-400 ES errors) unchanged
🤖 Generated with Claude Code
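For readers unfamiliar with the test setup, here is a rough sketch of what a mock that simulates an ES 400 might look like, and of how the test URL's `q` value decodes. The `EsClient` trait, `searchStatus`, `decodeQ`, and `apiStatus` are hypothetical names invented for this illustration; only `MockEsClientQueryParseError` and the query string `invalid%29-` come from the PR, and the real mock almost certainly has a different shape.

```scala
import java.net.URLDecoder
import java.nio.charset.StandardCharsets

object MockQueryParseErrorSketch {
  // Hypothetical minimal client interface: returns the ES HTTP status only.
  trait EsClient {
    def searchStatus(query: String): Int
  }

  // Named after the mock in the PR; this version always answers as if
  // Elasticsearch had rejected the query with HTTP 400.
  object MockEsClientQueryParseError extends EsClient {
    def searchStatus(query: String): Int = 400
  }

  // URL-decode the raw `q` value; "%29" is the encoding of ")".
  def decodeQ(raw: String): String =
    URLDecoder.decode(raw, StandardCharsets.UTF_8.name())

  // Collapses the whole pipeline into one step for illustration:
  // 200 passes through, ES 400 surfaces as 400, anything else becomes 500.
  def apiStatus(client: EsClient, rawQ: String): Int =
    client.searchStatus(decodeQ(rawQ)) match {
      case 200 => 200
      case 400 => 400
      case _   => 500
    }
}
```

With this sketch, `decodeQ("invalid%29-")` yields `invalid)-`, the trailing `)-` pattern described above, and `apiStatus(MockEsClientQueryParseError, "invalid%29-")` yields 400.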
Overview
This PR fixes incorrect HTTP status codes returned when Elasticsearch receives malformed Lucene query syntax in the `q` parameter. Previously, Elasticsearch 400 responses were converted to HTTP 500 errors; the fix ensures they now return HTTP 400 with a descriptive validation error message. This resolves an issue where certain real-world queries (e.g., MARC subject headings with trailing punctuation like `García,, Eloy,, 1958)-`) produced 500 errors instead of 400 validation failures.
Changes
Core Changes
- Added the `SearchQueryParseFailure` response type to the `SearchProtocol` sealed trait to distinguish ES 400 errors from other search failures.
- Updated `ElasticSearchClient.processSearch()` and `processRandom()` to explicitly match HTTP 400 status codes from Elasticsearch and respond with `SearchQueryParseFailure` (consistent with how `processFetch` handles 404 errors); all other HTTP errors continue to return `SearchFailure`.
- Wired `SearchQueryParseFailure` through `SearchRegistryBehavior` to convert it to `ValidationFailure("The q parameter contains invalid search syntax.")` for the `/v2/items` search and random endpoints, producing HTTP 400 responses with JSON error bodies. Added a defensive handler in the fetch path for completeness.
Testing
- Added `MockEsClientQueryParseError` to simulate Elasticsearch 400 responses in tests.
- Added an end-to-end test asserting that `/v2/items?q=invalid%29-` returns HTTP 400 with an appropriate error message.
Public API Changes
Modifies the public-facing API response status: the `/v2/items` search and random endpoints now return HTTP 400 (BadRequest) instead of HTTP 500 (InternalServerError) when the `q` parameter contains invalid Lucene query syntax. The error response body format remains consistent with other validation errors.
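The registry-side mapping that produces the new status code could look something like the sketch below. It is an assumption-laden illustration rather than the actual `SearchRegistryBehavior`: the `RegistryResponse` trait, the `InternalFailure` object, and the helpers `toRegistryResponse` and `httpStatus` are hypothetical, while `ValidationFailure`, `SearchQueryParseFailure`, `SearchFailure`, and the message text are taken from the PR description.

```scala
object RegistryMappingSketch {
  // Simplified stand-ins for the search protocol responses (assumed shapes).
  sealed trait SearchResponse
  case object SearchFailure extends SearchResponse
  case object SearchQueryParseFailure extends SearchResponse

  // Hypothetical registry-level responses; only ValidationFailure is named in the PR.
  sealed trait RegistryResponse
  final case class ValidationFailure(message: String) extends RegistryResponse
  case object InternalFailure extends RegistryResponse

  // Message text quoted from the PR description.
  val InvalidQueryMessage = "The q parameter contains invalid search syntax."

  // Parse failures become validation failures; other failures stay internal errors.
  def toRegistryResponse(response: SearchResponse): RegistryResponse =
    response match {
      case SearchQueryParseFailure => ValidationFailure(InvalidQueryMessage)
      case SearchFailure           => InternalFailure
    }

  // Validation failures surface as 400, internal failures as 500.
  def httpStatus(response: RegistryResponse): Int =
    response match {
      case ValidationFailure(_) => 400
      case InternalFailure      => 500
    }
}
```

Keeping the parse failure as its own case at every layer is what lets the API distinguish a client mistake (400) from a genuine backend fault (500) without inspecting error strings.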