Skip to content

feat(xref): allow empty-term searches when specs are provided#497

Open
marcoscaceres wants to merge 12 commits into
mainfrom
feat/empty-xref-search
Open

feat(xref): allow empty-term searches when specs are provided#497
marcoscaceres wants to merge 12 commits into
mainfrom
feat/empty-xref-search

Conversation

@marcoscaceres

Copy link
Copy Markdown
Collaborator

Summary

  • Browse all terms for a spec without providing a search term
  • collectBySpecs() aggregates entries, applies type/for filters
  • 1000-entry result limit for safety
  • 7 new test cases

Closes #278
Closes #208

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds support for “browse” queries in xref search so callers can omit the search term when constraining by spec (and optionally filtering by type/for), with a safety cap on returned entries and new unit tests.

Changes:

  • Allow term to be omitted on the GET endpoint when specs are provided, defaulting term to "".
  • Add a search fallback path for empty term + specs that aggregates entries by spec and applies type/for filtering with a 1000-result cap.
  • Add test fixtures and new test cases covering empty-term browsing scenarios.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

File Description
routes/xref/lib/search.ts Implements empty-term/spec browsing via collectBySpecs() and adds a 1000-result cap.
routes/xref/search.get.ts Makes term optional for GET requests when specs are present and defaults missing term to empty string.
tests/routes/xref/lib/search.test.js Adds tests for empty-term browsing and expands cache clearing between suites.
tests/routes/xref/lib/data-by-spec.js Introduces a bySpec fixture used by the new browsing tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread routes/xref/search.get.ts Outdated
Comment thread tests/routes/xref/lib/data-by-spec.js Outdated
Comment thread tests/routes/xref/lib/search.test.js
Comment thread tests/routes/xref/lib/search.test.js
Comment thread routes/xref/lib/search.ts
Comment thread routes/xref/lib/search.ts Outdated
Comment thread routes/xref/lib/search.ts

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread routes/xref/lib/search.ts
Comment thread routes/xref/lib/search.ts

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/routes/xref/lib/search.test.js
Comment thread routes/xref/lib/search.ts
Comment thread tests/routes/xref/lib/data-by-spec.js Outdated
Enables browsing all terms for a spec or filtering by type without
specifying a search term. Useful for discovering terms and debugging
why specific terms can't be found.

Closes #278
Closes #208

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread routes/xref/search.get.ts Outdated
Comment thread routes/xref/lib/search.ts

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/routes/xref/lib/search.test.js

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@marcoscaceres

Copy link
Copy Markdown
Collaborator Author

@copilot can you address the unresolved comments?

… to production shape

- resolveSpecKey: iterate over specmap groups (current/snapshot) to find
  entry.shortname, fixing broken store.specmap?.[spec] flat-map assumption
- data-by-spec.js: rekey all entries by series shortname (not versioned spec id)
  to match production specs.json shape, merge svg2 into svg, merge css-cascade-3/4
  into css-cascade
- search.test.js: add minimal specmap to test store stub matching production
  { [group]: { [specid]: { shortname, url, title } } } shape; add test that
  verifies versioned spec ids (css-lists-3, web-bluetooth-1) resolve via specmap

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/dc6faf8f-7ab5-4633-a750-dc884e1aa53c

Copilot AI commented Apr 27, 2026

Copy link
Copy Markdown
Contributor

@copilot can you address the unresolved comments?

Addressed in bc5238e:

  • resolveSpecKey: fixed to iterate over specmap groups (e.g., current/snapshot) and look up entry.shortname — the previous code treated specmap as a flat {specid → string} map which doesn't match the actual {group → {specid → {shortname, url, title}}} shape.
  • data-by-spec.js: rekeyed all entries by series shortname (e.g., referrer-policy, css-cascade, css-lists, web-bluetooth, wai-aria, svg) to match production specs.json shape; merged css-cascade-3/css-cascade-4 under css-cascade and svg2 into svg.
  • search.test.js: added a minimal specmap to the test store stub in production shape, and added a new test case verifying that versioned spec ids (css-lists-3, web-bluetooth-1) are resolved to their series shortnames via specmap when browsing with an empty term.

@marcoscaceres

Copy link
Copy Markdown
Collaborator Author

@copilot Can you review this and fix any issues that come up?

- filter(): apply .slice(0, 1000) on the empty-term browse result before
  returning so the MemCache stores a bounded list (not the full unbounded
  DataEntry[]), fixing the memory safety concern
- searchOne(): remove the duplicate post-cache slice(0,1000) guards that
  were the only cap, now redundant since filter() is already bounded
- search.get.ts: remove !types?.length from the 400-guard so that
  ?type=... without specs or term returns a proper 400 instead of
  silently falling through to an empty-string term search that returns
  misleading results; update error message accordingly

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/eafb7347-faf9-4e79-a820-4546c4b7b010

Copilot AI commented Apr 27, 2026

Copy link
Copy Markdown
Contributor

@copilot Can you review this and fix any issues that come up?

Done in f7fa003. Two remaining issues were addressed:

  1. Cache safety (routes/xref/lib/search.ts): moved the 1000-entry .slice(0, 1000) cap inside filter() at the end of the empty-term branch (after type/for filtering), so the value stored in MemCache is always bounded. Removed the duplicate post-cache slices from searchOne().

  2. Route validation (routes/xref/search.get.ts): tightened the 400 guard to typeof term === "undefined" && !specs?.length?type=... without specs or term now correctly returns 400 instead of silently falling into the regular empty-string term search (which returned misleading results).

@marcoscaceres

Copy link
Copy Markdown
Collaborator Author

@copilot Can you confirm that all your comments have been addressed? Please resolve them if you’re happy. Can you fix anything outstanding from your feedback that hasn’t been addressed or anything else you find?

Copilot AI commented Apr 27, 2026

Copy link
Copy Markdown
Contributor

@copilot Can you confirm that all your comments have been addressed? Please resolve them if you’re happy. Can you fix anything outstanding from your feedback that hasn’t been addressed or anything...

All review comments have been addressed across commits bc5238e and f7fa003. Here's the full status:

Resolved by code changes:

  • data-by-spec.js fixture rekeyed from versioned spec IDs to series shortnames (production shape) — bc5238e
  • Test store updated with a specmap in production { [group]: { [specid]: { shortname, url, title } } } shape — bc5238e
  • resolveSpecKey() correctly iterates over specmap groups to resolve versioned IDs — bc5238e
  • Versioned spec ID resolution test added (css-lists-3, web-bluetooth-1 → resolved via specmap) — bc5238e
  • filterBySpecType() dedup condition fixed to include uri (item.spec === it.spec && item.type === it.type && item.uri === it.uri) — 06be305
  • 1000-entry cap moved inside filter() so the MemCache value is bounded — f7fa003
  • Route validation tightened: 400 returned only when both term and specs are absent — f7fa003

Accepted design decisions (no code changes needed):

  • Types-only browsing without specs is intentionally not supported — empty-term browse is scoped by specs to prevent unbounded results
  • The 1000-cap sits after type/for filtering but before filterBySpecType/filterPreferLatestVersion; type/for filtering narrows the set first, making the cap unlikely to affect preference ordering in practice

All 48 tests pass.

@marcoscaceres marcoscaceres marked this pull request as ready for review April 29, 2026 03:14
- Remove required attribute from term input
- Allow form submission when specs are provided without a term
- Use entry.term for cite generation when search term is empty
- Include 'term' in requested fields so browse results have term data
- Fall back to entry.spec for shortname display
- Auto-open Advanced Search when URL has specs/types/for params
- Remove options.includes guard from select() so URL-driven
  selections work before metadata options are loaded
@marcoscaceres marcoscaceres changed the title feat(xref): allow empty-term searches when specs or types are provided feat(xref): allow empty-term searches when specs are provided May 3, 2026
marcoscaceres and others added 6 commits May 4, 2026 21:30
* refactor: use Promise.withResolvers() (Node 22+)

Replace new Promise((resolve, reject) => { ... }) with the cleaner
Promise.withResolvers() pattern in background-task-queue.ts (2 sites)
and sh.ts (1 site).

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: address all PR review feedback on sh.ts and background-task-queue.ts

- sh.ts: Remove broken duplicate Promise.withResolvers + new Promise code; use
  Promise.withResolvers() properly; switch exit->close for complete stdio flush;
  add error event handler for spawn failures; make rejection shapes consistent
  (both reject with Error instances with command/stdout/stderr/code fields)
- background-task-queue: Store split2 stream refs so off() targets the right
  stream and unpipe on cleanup; add worker error/exit handling in run() to
  prevent deadlocks on unexpected worker failures; add worker error/exit
  handling in activate() to prevent indefinite hangs during initialization

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/4640ebb0-d6ac-4269-b7bf-e563038f39e1

* fix: add job ID to worker unexpected-exit error message for easier debugging

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/4640ebb0-d6ac-4269-b7bf-e563038f39e1

* fix: cross-remove error/exit listeners in run() to prevent double lock release

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/5a7c6559-a65f-44e5-865b-b8a2f0b8c955

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
express-rate-limit v8.4+ rejects `trust proxy = true` because it allows
anyone to spoof X-Forwarded-For and bypass rate limiting
(ERR_ERL_PERMISSIVE_TRUST_PROXY). Setting it to 2 trusts exactly the two
known proxy hops (Cloudflare → nginx → Express) while still resolving
the correct client IP for logging and rate limiting.
* chore: upgrade TypeScript from 5.6.3 to 6.0.3

- Update typescript to ^6.0.3 and @types/node to ^24.12.2
- Change tsconfig module/moduleResolution from esnext/node to nodenext
- Fix catch clause variables typed as unknown (stricter in TS 6)
- Replace @ts-ignore with @ts-expect-error
- Fix definite assignment assertion for Logger.result
- Fix view-engine type compatibility with express

* Apply suggestions from code review

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* refactor: type engine directly as EngineCallback, remove async and cast

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/50c73c28-bbb9-43db-8a58-d5c85ae17fd3

* refactor: simplify engine body, let TS infer params from EngineCallback type

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/50c73c28-bbb9-43db-8a58-d5c85ae17fd3

* fix: resolve TaskModule TS errors, extract getErrnoCode helper

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/a465b3f6-aa1e-4a52-9998-11d12704f748

* test: add coverage for getErrnoCode utility

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/a465b3f6-aa1e-4a52-9998-11d12704f748

* fix(xref): include options.all in search cache key to prevent cache poisoning

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/9ff1c977-d8d7-467c-937f-9353541e3b87

* fix: address Sid's review feedback

- Re-throw HTTPError directly in group.ts catch block instead of
  double-wrapping it
- Revert cache key change in search.ts (behavior change, not a TS
  upgrade fix; should be a separate PR)

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
* feat(api/baseline): add web-features/baseline data endpoint

Adds endpoints for serving Baseline browser availability data:
- GET /api/baseline — full web-features dataset
- GET /api/baseline/:feature — single feature lookup
- POST /api/baseline/search — spec-URL-based feature search

Data sourced from web-platform-dx/web-features GitHub releases.
Enables ReSpec's implementationStatus feature (speced/respec#5174)
without requiring unpkg.com dependency.

Closes #490

* fix(api/baseline): return flat {result: [...]} from search endpoint

Client expects {result: WebFeatureEntry[]}, server was returning
Record<string, {id, feature}[]>. Flatten features, deduplicate by
ID, and wrap in {result: [...]}.

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix(api/baseline): address unresolved review comments

- search.post.ts: return 503 when store.data is null (service unavailable)
- search.post.ts: reject empty/whitespace-only strings in specs array
- store.ts: normalizeUrl() now also clears search params (query string)
- scraper.ts: add ETag/If-None-Match support to skip unnecessary re-downloads
- scraper.ts: use atomic write (temp file + rename) to prevent corrupt data on interrupted writes
- Add tests: normalizeUrl, feature route (moved/split/404), search.post validation

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/41823caf-5a5e-40fc-8bf7-fca753fb37ab

* fix(api/baseline): replace existsSync with try-catch for ETag read, add return type doc

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/41823caf-5a5e-40fc-8bf7-fca753fb37ab

* fix(api/baseline): update.ts returns 400 for non-published webhook actions, fix XSS

- Return 400 (not 200) for non-published webhook actions, consistent with
  caniuse/update.ts and respec/builds/update.ts patterns
- Set res.locals.reason = "action-not-published" for structured logging
- Use a static message instead of echoing req.body.action to avoid
  reflected XSS (CodeQL js/reflected-xss)

Agent-Logs-Url: https://github.com/speced/respec-web-services/sessions/5af37242-1fe0-4bdd-b54b-c45ab0c4120f

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
The baseline worker import was missing the .ts extension required by
TypeScript 6 with nodenext module resolution, causing a build failure.
Move the 1000-entry cap from filter() to searchOne(), after
filterBySpecType and filterPreferLatestVersion have run. This ensures
preferred entries (current over snapshot, latest version) are retained
before the result is capped.

Add clarifying comments that types-only browsing (no term, no specs) is
deliberately unsupported, and improve the route validation message.

Addresses Copilot review threads on PR #497.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow empty searches (when other things are given) Allow showing all the terms for a spec

3 participants