Cosmetology - Support staff searching by date of birth for providers#1311
Cosmetology - Support staff searching by date of birth for providers#1311isabeleliassen merged 14 commits intocsg-org:mainfrom
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughConverts provider indexing to one OpenSearch document per license (includes dateOfBirth), adds DOB query permission checks, switches to multi-document generation/indexing with composite documentId and delete-by-provider support, and removes compact/jurisdiction summary notification email fields across schemas, APIs, docs, and tests. Changes
Sequence Diagram(s)sequenceDiagram
participant Stream as DynamoDB Stream
participant Lambda as ProviderUpdateIngest Lambda
participant Utility as ProviderRecordUtility
participant Dynamo as DynamoDB
participant OpenSearch as OpenSearchClient
rect rgba(220,220,255,0.5)
Stream->>Lambda: Stream event (INSERT / MODIFY / REMOVE)
alt INSERT / MODIFY
Lambda->>Utility: generate_provider_opensearch_documents(compact, provider_id)
Utility->>Utility: emit list of license documents (dateOfBirth, privileges, documentId)
Lambda->>OpenSearch: bulk_index(index, documents, id_field='documentId')
else REMOVE
Lambda->>OpenSearch: delete_provider_documents(index, provider_id)
Lambda->>Dynamo: read remaining provider records
Lambda->>Utility: generate_provider_opensearch_documents(...) for remaining
Lambda->>OpenSearch: bulk_index(remaining_documents, id_field='documentId')
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (2)
backend/cosmetology-app/lambdas/python/search/handlers/provider_update_ingest.py (1)
144-153: Consider defensive handling for document ID parsing.The code assumes
doc_idcontains#when extractingprovider_id. While the current implementation should always produce composite IDs with#, defensive coding would handle unexpected formats.🛡️ Suggested defensive improvement
doc_id = index_result.get('_id', '') - provider_id = doc_id.split('#')[0] if '#' in doc_id else doc_id + # documentId format: {providerId}#{jurisdiction}#{licenseType} + provider_id = doc_id.split('#')[0] if doc_id else '' + if not provider_id: + logger.error('Unable to extract provider_id from document_id', document_id=doc_id) + continue🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cosmetology-app/lambdas/python/search/handlers/provider_update_ingest.py` around lines 144 - 153, The doc_id parsing assumes a '#' is present and doesn't guard against None/empty or unexpected formats; update the handling around index_result, doc_id and provider_id so you first coerce doc_id to a safe string (or use a fallback empty string), then extract provider_id using a single-split defensive approach (e.g., split with max 1) so it returns the whole doc_id when '#' is absent, and ensure failed_providers[compact].add(provider_id) uses that safe provider_id; apply this change where index_result, doc_id, provider_id and failed_providers[compact].add are referenced.backend/cosmetology-app/lambdas/python/search/handlers/search.py (1)
238-239: Harden scope regex matching with escaped inputs and full-string match.
compactandactionare interpolated into a regex pattern unescaped. Escaping dynamic segments and usingfullmatchmakes the check more robust and avoids regex edge cases.🔧 Suggested patch
-from re import match +from re import escape, fullmatch @@ - jurisdiction_scope_pattern = rf'.+/{compact}\.{action}$' - return any(match(jurisdiction_scope_pattern, scope) for scope in scopes) + jurisdiction_scope_pattern = rf'.+/{escape(compact)}\.{escape(str(action))}' + return any(fullmatch(jurisdiction_scope_pattern, scope) for scope in scopes)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cosmetology-app/lambdas/python/search/handlers/search.py` around lines 238 - 239, The regex builds jurisdiction_scope_pattern using unescaped dynamic values (compact, action) and uses match(), which risks unintended regex behavior; escape compact and action with re.escape before interpolating and switch to full-string matching (re.fullmatch) or anchor the pattern with ^...$ to ensure exact scope matches when iterating over scopes (the variables/functions to edit are jurisdiction_scope_pattern, compact, action, and the match(...) call that checks scopes).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@backend/cosmetology-app/lambdas/python/common/cc_common/data_model/provider_record_util.py`:
- Around line 469-476: There is trailing whitespace at the end of the
logger.debug call inside the block that checks include_inactive_privileges and
most_recent_license.compactEligibility; remove the trailing space(s) after the
string arguments in that logger.debug invocation (the one referencing
most_recent_license.jurisdiction and most_recent_license.licenseType) so the
file passes whitespace linting and the function behavior in
provider_record_util.py (around the include_inactive_privileges /
CompactEligibilityStatus.ELIGIBLE check and appending to
most_recent_licenses_for_each_type) remains unchanged.
In
`@backend/cosmetology-app/lambdas/python/common/tests/unit/test_data_model/test_schema/test_license.py`:
- Line 359: The docstring "LicenseOpenSearchDocumentSchema should inherit
expiration status correction from LicenseExpirationStatusMixin." is too long and
must be wrapped to satisfy the line-length linter; edit the test docstring in
test_license (the triple-quoted string containing that sentence) and break it
into multiple shorter lines (or split into two sentences) within the same triple
quotes so each line is under the configured max length, preserving the original
wording and punctuation.
In
`@backend/cosmetology-app/lambdas/python/common/tests/unit/test_data_model/test_schema/test_provider.py`:
- Line 82: The long triple-quoted test docstring referencing
"ProviderOpenSearchDocumentSchema" is exceeding the max line length; split that
docstring into multiple shorter lines (wrap the sentence into two or more lines)
within the same triple quotes so the content remains the same but no single line
is too long; locate the docstring in the test for
ProviderOpenSearchDocumentSchema (the docstring starting with
"ProviderOpenSearchDocumentSchema should NOT include top-level private fields
like dateOfBirth or ssnLastFour.") and reflow it into shorter lines to satisfy
the linter.
In
`@backend/cosmetology-app/lambdas/python/common/tests/unit/test_provider_record_util.py`:
- Around line 1-3: The import names in test_provider_record_util.py are not
alphabetically ordered; update the unittest.mock import to sort the imported
symbols (replace "from unittest.mock import MagicMock, patch, ANY" with them in
alphabetical order: ANY, MagicMock, patch) and ensure imports follow standard
grouping (stdlib imports only here) with correct ordering.
In `@backend/cosmetology-app/lambdas/python/search/handlers/search.py`:
- Around line 66-68: The DOB permission check currently only inspects
body.get('query', {}) via _validate_date_of_birth_permission, so callers can
still reference dateOfBirth in other searchable clauses like 'sort' (and
anywhere else in the request body); update both call sites that invoke
_validate_date_of_birth_permission (the one using body.get('query', {}) and the
later one around lines 242-254) to pass a merged/combined searchable payload
(e.g., combine body.get('query', {}), body.get('sort', {}), and any other
searchable clauses such as filters/aggregations into a single dict) or otherwise
ensure _validate_date_of_birth_permission is called for each relevant clause;
ensure you still pass the same compact and get_event_scopes(event) arguments so
permission logic is unchanged.
- Around line 204-223: The function _query_references_field uses substring
matching ("field_name in key") causing false positives and misses cases where
field names appear as values (e.g., {"exists": {"field": "dateOfBirth"}});
change the dict key check to exact equality (key == field_name) and extend the
recursion to also detect field-name occurrences when a dict value is a string
equal to field_name or when a dict has common field-reference keys (e.g.,
"field", "name") whose values equal field_name; keep the existing recursion for
nested dicts/lists so _query_references_field reliably finds exact key matches
and value-based references.
In
`@backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py`:
- Around line 522-525: The test function definition line for
test_search_with_date_of_birth_query_allowed_for_jurisdiction_level_read_private_scope
is longer than 120 characters; break the signature across multiple lines so it
obeys the linter (put the parameters on their own indented line(s), e.g. split
after the function name and place "self, mock_opensearch_client" on the next
line with proper indentation and the closing parenthesis on its own line) while
preserving the docstring and function body.
In `@backend/cosmetology-app/lambdas/python/search/utils.py`:
- Around line 16-32: Fix the docstring in
generate_provider_opensearch_documents: correct the typo "Becuase" to "Because"
and remove/normalize the extra blank lines/whitespace between sentences so the
paragraph lines 22–24 are wrapped consistently (single blank line between
description and parameter section) and adhere to project's docstring formatting
rules; keep the same content and exception mentions but ensure spacing and
spelling are clean.
---
Nitpick comments:
In
`@backend/cosmetology-app/lambdas/python/search/handlers/provider_update_ingest.py`:
- Around line 144-153: The doc_id parsing assumes a '#' is present and doesn't
guard against None/empty or unexpected formats; update the handling around
index_result, doc_id and provider_id so you first coerce doc_id to a safe string
(or use a fallback empty string), then extract provider_id using a single-split
defensive approach (e.g., split with max 1) so it returns the whole doc_id when
'#' is absent, and ensure failed_providers[compact].add(provider_id) uses that
safe provider_id; apply this change where index_result, doc_id, provider_id and
failed_providers[compact].add are referenced.
In `@backend/cosmetology-app/lambdas/python/search/handlers/search.py`:
- Around line 238-239: The regex builds jurisdiction_scope_pattern using
unescaped dynamic values (compact, action) and uses match(), which risks
unintended regex behavior; escape compact and action with re.escape before
interpolating and switch to full-string matching (re.fullmatch) or anchor the
pattern with ^...$ to ensure exact scope matches when iterating over scopes (the
variables/functions to edit are jurisdiction_scope_pattern, compact, action, and
the match(...) call that checks scopes).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b0b8a474-eccc-4812-abf6-dc73fd393f7c
📒 Files selected for processing (19)
backend/cosmetology-app/lambdas/python/common/cc_common/data_model/provider_record_util.pybackend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/license/api.pybackend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/provider/api.pybackend/cosmetology-app/lambdas/python/common/tests/unit/test_data_model/test_schema/test_license.pybackend/cosmetology-app/lambdas/python/common/tests/unit/test_data_model/test_schema/test_provider.pybackend/cosmetology-app/lambdas/python/common/tests/unit/test_provider_record_util.pybackend/cosmetology-app/lambdas/python/search/handlers/manage_opensearch_indices.pybackend/cosmetology-app/lambdas/python/search/handlers/populate_provider_documents.pybackend/cosmetology-app/lambdas/python/search/handlers/provider_update_ingest.pybackend/cosmetology-app/lambdas/python/search/handlers/search.pybackend/cosmetology-app/lambdas/python/search/opensearch_client.pybackend/cosmetology-app/lambdas/python/search/tests/__init__.pybackend/cosmetology-app/lambdas/python/search/tests/function/__init__.pybackend/cosmetology-app/lambdas/python/search/tests/function/test_manage_opensearch_indices.pybackend/cosmetology-app/lambdas/python/search/tests/function/test_populate_provider_documents.pybackend/cosmetology-app/lambdas/python/search/tests/function/test_provider_update_ingest.pybackend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.pybackend/cosmetology-app/lambdas/python/search/tests/unit/test_opensearch_client.pybackend/cosmetology-app/lambdas/python/search/utils.py
backend/cosmetology-app/lambdas/python/common/cc_common/data_model/provider_record_util.py
Show resolved
Hide resolved
...cosmetology-app/lambdas/python/common/tests/unit/test_data_model/test_schema/test_license.py
Outdated
Show resolved
Hide resolved
...osmetology-app/lambdas/python/common/tests/unit/test_data_model/test_schema/test_provider.py
Outdated
Show resolved
Hide resolved
backend/cosmetology-app/lambdas/python/common/tests/unit/test_provider_record_util.py
Show resolved
Hide resolved
backend/cosmetology-app/lambdas/python/search/handlers/search.py
Outdated
Show resolved
Hide resolved
backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
backend/cosmetology-app/lambdas/python/search/handlers/search.py (2)
66-67:⚠️ Potential issue | 🟠 MajorDOB permission validation still misses non-
queryclauses.Line 67 only passes
body['query']into the guard, sodateOfBirthcan still be referenced fromsortwithoutreadPrivate. That leaves a hole in the privacy check this change is adding.🔧 Suggested patch
- _validate_date_of_birth_permission(body.get('query', {}), compact, get_event_scopes(event)) + _validate_date_of_birth_permission(body, compact, get_event_scopes(event))-def _validate_date_of_birth_permission(query: dict, compact: str, scopes: set[str]) -> None: +def _validate_date_of_birth_permission(request_body: dict, compact: str, scopes: set[str]) -> None: @@ - :param query: The OpenSearch query body + :param request_body: The validated request body @@ - if _query_references_field(query, 'dateOfBirth') and not _caller_has_read_private_scope(compact, scopes): + dob_referenced = ( + _query_references_field(request_body.get('query', {}), 'dateOfBirth') + or _query_references_field(request_body.get('sort', []), 'dateOfBirth') + ) + if dob_referenced and not _caller_has_read_private_scope(compact, scopes): raise CCInvalidRequestException('Searching by dateOfBirth requires readPrivate permission')Also applies to: 242-252
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cosmetology-app/lambdas/python/search/handlers/search.py` around lines 66 - 67, The DOB permission guard currently only checks body.get('query', {}) via _validate_date_of_birth_permission, which misses references in other clauses (e.g., 'sort'); update the call to inspect the entire request body (or explicitly relevant clauses like 'query', 'sort', 'aggregations', etc.) so any occurrence of "dateOfBirth" is detected before proceeding. In practice, modify the invocation of _validate_date_of_birth_permission to accept and scan the full body (or a combined dict of all searchable/sortable clauses) and keep passing compact and get_event_scopes(event) unchanged; also apply the same fix to the other occurrences noted around the 242-252 region where _validate_date_of_birth_permission is used.
204-216:⚠️ Potential issue | 🟠 MajorMatch field references by field segment, not raw substring.
field_name in keyis too loose: it can false-positive on names likedateOfBirthYear, and it still misses DSL shapes where the field name is carried as a value, e.g.{"exists": {"field": "dateOfBirth"}}. That makes the auth check both over- and under-inclusive.🔧 Suggested patch
+def _matches_field_reference(candidate: str, field_name: str) -> bool: + return candidate == field_name or candidate.endswith(f'.{field_name}') + + def _query_references_field(obj, field_name: str) -> bool: @@ if isinstance(obj, dict): for key, value in obj.items(): - if field_name in key: + if _matches_field_reference(key, field_name): return True + if key in {'field', 'name'} and isinstance(value, str) and _matches_field_reference(value, field_name): + return True + if isinstance(value, str) and _matches_field_reference(value, field_name): + return True if _query_references_field(value, field_name): return True🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cosmetology-app/lambdas/python/search/handlers/search.py` around lines 204 - 216, The _query_references_field function currently uses a raw substring test ("field_name in key") which causes false positives and misses DSL shapes where the field appears as a value (e.g., {"exists": {"field": "dateOfBirth"}}); update _query_references_field to (1) when iterating dict keys, split each key on '.' and compare each segment for exact equality to field_name instead of substring matching, (2) when a dict value is itself a dict, explicitly check for keys named "field" (and possibly "name") whose value equals field_name and return True, and (3) continue recursive traversal for nested dicts and lists and ensure you return True immediately when a match is found and False only after full traversal. This preserves recursion and fixes both over- and under-inclusion in auth checks.
🧹 Nitpick comments (2)
backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py (1)
295-316: Assert thatdateOfBirthis removed from the list response.This fixture includes
licenses[0]['dateOfBirth'], but the test only verifies top-level redaction. Adding an explicit assertion on the nested license would lock in the privacy guarantee this PR is supposed to preserve.🔧 Suggested patch
self.assertEqual(1, len(provider['licenses'])) self.assertEqual('oh', provider['licenses'][0]['jurisdiction']) self.assertEqual('cosmetologist', provider['licenses'][0]['licenseType']) + self.assertNotIn('dateOfBirth', provider['licenses'][0]) # Verify private fields were stripped self.assertNotIn('ssnLastFour', provider)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py` around lines 295 - 316, The test currently verifies top-level redaction but doesn't assert that nested license fields are stripped; update the test in test_search_providers.py (the block using providers = body['providers'] and provider = providers[0]) to explicitly assert that the nested license entry does not include dateOfBirth—e.g. add an assertion against provider['licenses'][0] (or iterate provider['licenses']) using self.assertNotIn('dateOfBirth', provider['licenses'][0]) to lock in the privacy guarantee for nested license objects.backend/cosmetology-app/lambdas/python/common/cc_common/data_model/provider_record_util.py (1)
483-508: Normalize compact jurisdictions before emitting privileges.
home_jurisdictionis lowercased, but each value fromlive_compact_jurisdictionsis compared and stored as-is. Lowercasing the loop value once here would keep the home-state skip and emitted privilege records stable if config ever drifts to mixed case.🔧 Suggested patch
- for jurisdiction in live_jurisdictions_for_compact: + for jurisdiction in (j.lower() for j in live_jurisdictions_for_compact): if jurisdiction == home_jurisdiction: continueBased on learnings, "In provider_record_util.py (and related Python modules under backend/cosmetology-app), rely on and enforce the existing normalization pattern where jurisdiction values are stored in lowercase. Verify that config.live_compact_jurisdictions.get(compact, []) yields lowercase values and ensure any comparisons or storage also use lowercase (or enforce lowercase normalization at the data surface) to remain consistent with the app-wide normalization strategy."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cosmetology-app/lambdas/python/common/cc_common/data_model/provider_record_util.py` around lines 483 - 508, The loop over live_jurisdictions_for_compact must normalize each jurisdiction to lowercase before comparing and emitting privileges to match the existing pattern (home_jurisdiction = most_recent_license.jurisdiction.lower()); change the loop to assign something like jur = jurisdiction.lower() and use jur for the home-state skip, for calls to get_adverse_action_records_for_privilege and get_investigation_records_for_privilege, and for the 'jurisdiction' and 'licenseJurisdiction' fields in the privilege_dict so emitted records remain stable if config values are mixed-case; also ensure any other uses in this block (e.g., license_type_abbr, compact) continue to use the normalized jur variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@backend/cosmetology-app/lambdas/python/common/cc_common/data_model/provider_record_util.py`:
- Around line 629-672: provider_dict currently gets shallow-copied into every
document (via provider_dict = self.get_provider_record().to_dict() and doc =
dict(provider_dict)), which preserves provider-level license summary fields
(licenseJurisdiction, licenseStatus, compactEligibility, dateOfExpiration)
across non-home license docs; update the documents construction in the loop
(where license_dict, license_privileges and doc are built) to either remove
those provider-level license summary keys from doc before appending or overwrite
them with values derived from license_record (use license_record.to_dict() /
license_dict values) so each per-license OpenSearch doc reflects the correct
license-specific summary instead of the provider-level summary.
---
Duplicate comments:
In `@backend/cosmetology-app/lambdas/python/search/handlers/search.py`:
- Around line 66-67: The DOB permission guard currently only checks
body.get('query', {}) via _validate_date_of_birth_permission, which misses
references in other clauses (e.g., 'sort'); update the call to inspect the
entire request body (or explicitly relevant clauses like 'query', 'sort',
'aggregations', etc.) so any occurrence of "dateOfBirth" is detected before
proceeding. In practice, modify the invocation of
_validate_date_of_birth_permission to accept and scan the full body (or a
combined dict of all searchable/sortable clauses) and keep passing compact and
get_event_scopes(event) unchanged; also apply the same fix to the other
occurrences noted around the 242-252 region where
_validate_date_of_birth_permission is used.
- Around line 204-216: The _query_references_field function currently uses a raw
substring test ("field_name in key") which causes false positives and misses DSL
shapes where the field appears as a value (e.g., {"exists": {"field":
"dateOfBirth"}}); update _query_references_field to (1) when iterating dict
keys, split each key on '.' and compare each segment for exact equality to
field_name instead of substring matching, (2) when a dict value is itself a
dict, explicitly check for keys named "field" (and possibly "name") whose value
equals field_name and return True, and (3) continue recursive traversal for
nested dicts and lists and ensure you return True immediately when a match is
found and False only after full traversal. This preserves recursion and fixes
both over- and under-inclusion in auth checks.
---
Nitpick comments:
In
`@backend/cosmetology-app/lambdas/python/common/cc_common/data_model/provider_record_util.py`:
- Around line 483-508: The loop over live_jurisdictions_for_compact must
normalize each jurisdiction to lowercase before comparing and emitting
privileges to match the existing pattern (home_jurisdiction =
most_recent_license.jurisdiction.lower()); change the loop to assign something
like jur = jurisdiction.lower() and use jur for the home-state skip, for calls
to get_adverse_action_records_for_privilege and
get_investigation_records_for_privilege, and for the 'jurisdiction' and
'licenseJurisdiction' fields in the privilege_dict so emitted records remain
stable if config values are mixed-case; also ensure any other uses in this block
(e.g., license_type_abbr, compact) continue to use the normalized jur variable.
In
`@backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py`:
- Around line 295-316: The test currently verifies top-level redaction but
doesn't assert that nested license fields are stripped; update the test in
test_search_providers.py (the block using providers = body['providers'] and
provider = providers[0]) to explicitly assert that the nested license entry does
not include dateOfBirth—e.g. add an assertion against provider['licenses'][0]
(or iterate provider['licenses']) using self.assertNotIn('dateOfBirth',
provider['licenses'][0]) to lock in the privacy guarantee for nested license
objects.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 85773fd6-e081-47e3-848d-01347416108e
📒 Files selected for processing (8)
backend/cosmetology-app/lambdas/python/common/cc_common/data_model/provider_record_util.pybackend/cosmetology-app/lambdas/python/common/tests/unit/test_data_model/test_schema/test_license.pybackend/cosmetology-app/lambdas/python/common/tests/unit/test_data_model/test_schema/test_provider.pybackend/cosmetology-app/lambdas/python/common/tests/unit/test_provider_record_util.pybackend/cosmetology-app/lambdas/python/search/handlers/search.pybackend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.pybackend/cosmetology-app/lambdas/python/search/tests/unit/test_opensearch_client.pybackend/cosmetology-app/lambdas/python/search/utils.py
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/cosmetology-app/lambdas/python/search/tests/unit/test_opensearch_client.py
- backend/cosmetology-app/lambdas/python/common/tests/unit/test_data_model/test_schema/test_provider.py
backend/cosmetology-app/lambdas/python/common/cc_common/data_model/provider_record_util.py
Show resolved
Hide resolved
There was a problem hiding this comment.
♻️ Duplicate comments (1)
backend/cosmetology-app/lambdas/python/search/handlers/search.py (1)
204-222:⚠️ Potential issue | 🟠 MajorHarden field-reference detection in
_query_references_fieldto avoid bypasses.Line 214 uses substring key matching and misses value-based field references (for example,
exists.fieldpatterns). This can allow DOB queries withoutreadPrivatein supported DSL shapes.🔧 Suggested patch
def _query_references_field(obj, field_name: str) -> bool: @@ if isinstance(obj, dict): for key, value in obj.items(): - if field_name in key: + if key == field_name or key.endswith(f'.{field_name}'): return True + if key in {'field', 'name'} and isinstance(value, str): + if value == field_name or value.endswith(f'.{field_name}'): + return True + if key == 'fields' and isinstance(value, list): + if any(isinstance(v, str) and (v == field_name or v.endswith(f'.{field_name}')) for v in value): + return True if _query_references_field(value, field_name): return True🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cosmetology-app/lambdas/python/search/handlers/search.py` around lines 204 - 222, The current _query_references_field uses substring matching on dict keys which can both false-positive and miss value-based references (e.g., "exists.field"); update the function to (1) match dict keys exactly or by dot-segment (check equality and last-dot segment == field_name) instead of arbitrary substring, (2) also inspect scalar string values for field-reference patterns (e.g., "field_name", "exists.field_name", or any suffix after a dot == field_name), and (3) keep the existing recursive traversal for lists/dicts so value-based references are discovered; apply these changes inside _query_references_field and preserve its return-True short-circuit behavior.
🧹 Nitpick comments (1)
backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py (1)
307-316: Add an explicit DOB-redaction assertion in the list response test.This block checks sanitization, but it doesn’t assert that
licenses[*].dateOfBirthis excluded from list-view results (a core privacy requirement in this PR). Please add a negative assertion here.✏️ Suggested patch
self.assertEqual(1, len(provider['licenses'])) self.assertEqual('oh', provider['licenses'][0]['jurisdiction']) self.assertEqual('cosmetologist', provider['licenses'][0]['licenseType']) + self.assertNotIn('dateOfBirth', provider['licenses'][0]) # Verify private fields were stripped self.assertNotIn('ssnLastFour', provider)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py` around lines 307 - 316, The test currently verifies sanitization for provider in the list response but misses asserting DOB redaction; update the test in test_search_providers.py (the list-response case that uses the local variable provider and its 'licenses' array) to explicitly assert that 'dateOfBirth' is not present on any license (e.g., assertNotIn 'dateOfBirth' in provider['licenses'][0] or loop through provider['licenses'] and assertNotIn 'dateOfBirth' for each license) so the list-view response guarantees DOB is stripped.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@backend/cosmetology-app/lambdas/python/search/handlers/search.py`:
- Around line 204-222: The current _query_references_field uses substring
matching on dict keys which can both false-positive and miss value-based
references (e.g., "exists.field"); update the function to (1) match dict keys
exactly or by dot-segment (check equality and last-dot segment == field_name)
instead of arbitrary substring, (2) also inspect scalar string values for
field-reference patterns (e.g., "field_name", "exists.field_name", or any suffix
after a dot == field_name), and (3) keep the existing recursive traversal for
lists/dicts so value-based references are discovered; apply these changes inside
_query_references_field and preserve its return-True short-circuit behavior.
---
Nitpick comments:
In
`@backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py`:
- Around line 307-316: The test currently verifies sanitization for provider in
the list response but misses asserting DOB redaction; update the test in
test_search_providers.py (the list-response case that uses the local variable
provider and its 'licenses' array) to explicitly assert that 'dateOfBirth' is
not present on any license (e.g., assertNotIn 'dateOfBirth' in
provider['licenses'][0] or loop through provider['licenses'] and assertNotIn
'dateOfBirth' for each license) so the list-view response guarantees DOB is
stripped.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3e81948e-a41c-494a-8f0b-802e1aff8f9f
📒 Files selected for processing (2)
backend/cosmetology-app/lambdas/python/search/handlers/search.pybackend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py (1)
545-594: Strengthen rejection tests by asserting no OpenSearch call occurs.These two deny-path tests validate
400, but they don’t prove authorization validation short-circuits before search execution. Patchhandlers.search.opensearch_clientand assertassert_not_called()for both tests.✏️ Proposed hardening for denial-path tests
- def test_search_with_date_of_birth_query_rejected_without_read_private_scope(self): + `@patch`('handlers.search.opensearch_client') + def test_search_with_date_of_birth_query_rejected_without_read_private_scope( + self, mock_opensearch_client + ): @@ self.assertEqual(400, response['statusCode']) body = json.loads(response['body']) self.assertIn('dateOfBirth', body['message']) + mock_opensearch_client.search.assert_not_called() - def test_search_with_nested_date_of_birth_query_rejected_without_read_private_scope(self): + `@patch`('handlers.search.opensearch_client') + def test_search_with_nested_date_of_birth_query_rejected_without_read_private_scope( + self, mock_opensearch_client + ): @@ self.assertEqual(400, response['statusCode']) body = json.loads(response['body']) self.assertIn('dateOfBirth', body['message']) + mock_opensearch_client.search.assert_not_called()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py` around lines 545 - 594, Update both test_search_with_date_of_birth_query_rejected_without_read_private_scope and test_search_with_nested_date_of_birth_query_rejected_without_read_private_scope to patch handlers.search.opensearch_client (e.g., with unittest.mock.patch or self.patch) before calling search_api_handler, call search_api_handler as before, then assert the patched opensearch_client was not called (assert_not_called()/assert_not_called) in addition to existing statusCode and message assertions so the tests confirm authorization short-circuits before any OpenSearch activity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/cosmetology-app/lambdas/python/search/handlers/search.py`:
- Around line 204-227: The helper _query_references_field misses standalone
string values (e.g., list items like ["dateOfBirth"]), so add an early branch to
check string scalars: if obj is a str return True when obj == field_name or
obj.endswith('.' + field_name); keep the existing dict/list recursion for other
types. Update the function _query_references_field to perform this string check
before handling dicts/lists so direct string list elements are detected and
cannot bypass permission checks.
In
`@backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py`:
- Around line 295-318: The test misses asserting that date of birth is redacted
from list responses: update the assertion block in test_search_providers.py
(after extracting provider = providers[0]) to assert that 'dateOfBirth' is not
present on the provider's license entry created by _create_mock_provider_hit();
specifically check provider['licenses'][0] does not contain 'dateOfBirth'
(and/or provider does not contain a top-level 'dateOfBirth') so the list-view
DOB redaction requirement is enforced.
---
Nitpick comments:
In
`@backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py`:
- Around line 545-594: Update both
test_search_with_date_of_birth_query_rejected_without_read_private_scope and
test_search_with_nested_date_of_birth_query_rejected_without_read_private_scope
to patch handlers.search.opensearch_client (e.g., with unittest.mock.patch or
self.patch) before calling search_api_handler, call search_api_handler as
before, then assert the patched opensearch_client was not called
(assert_not_called()/assert_not_called) in addition to existing statusCode and
message assertions so the tests confirm authorization short-circuits before any
OpenSearch activity.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ca90827d-1ed2-42a0-a818-a5d13f16a4b9
📒 Files selected for processing (2)
backend/cosmetology-app/lambdas/python/search/handlers/search.pybackend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py
backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py
Show resolved
Hide resolved
|
@jlkravitz This is now ready for your review, thanks |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py (1)
547-596: Strengthen unauthorized DOB tests with explicit OpenSearch short-circuit assertions.These two rejection tests assert HTTP 400, but they don’t currently prove the handler blocked before calling OpenSearch. Please patch
opensearch_clienthere as well and assertsearch.assert_not_called()(as you already do in later negative cases) to lock in the authorization boundary.✅ Suggested test hardening diff
- def test_search_with_date_of_birth_query_rejected_without_read_private_scope(self): + `@patch`('handlers.search.opensearch_client') + def test_search_with_date_of_birth_query_rejected_without_read_private_scope(self, mock_opensearch_client): """Test that a query containing dateOfBirth returns 400 when the caller only has readGeneral scope.""" from handlers.search import search_api_handler + self._when_testing_mock_opensearch_client(mock_opensearch_client) @@ self.assertEqual(400, response['statusCode']) body = json.loads(response['body']) self.assertIn('dateOfBirth', body['message']) + self.assertIn('readPrivate', body['message']) + mock_opensearch_client.search.assert_not_called() - def test_search_with_nested_date_of_birth_query_rejected_without_read_private_scope(self): + `@patch`('handlers.search.opensearch_client') + def test_search_with_nested_date_of_birth_query_rejected_without_read_private_scope( + self, mock_opensearch_client + ): """Test that deeply nested dateOfBirth references are caught and rejected.""" from handlers.search import search_api_handler + self._when_testing_mock_opensearch_client(mock_opensearch_client) @@ self.assertEqual(400, response['statusCode']) body = json.loads(response['body']) self.assertIn('dateOfBirth', body['message']) + self.assertIn('readPrivate', body['message']) + mock_opensearch_client.search.assert_not_called()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py` around lines 547 - 596, In the two tests test_search_with_date_of_birth_query_rejected_without_read_private_scope and test_search_with_nested_date_of_birth_query_rejected_without_read_private_scope, patch the opensearch client used by handlers.search.search_api_handler (e.g., the mock named opensearch_client or search) before invoking search_api_handler and add an assertion that search.assert_not_called() after asserting the 400 response; this ensures the handler rejected the request and did not call the OpenSearch client's search method.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@backend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py`:
- Around line 547-596: In the two tests
test_search_with_date_of_birth_query_rejected_without_read_private_scope and
test_search_with_nested_date_of_birth_query_rejected_without_read_private_scope,
patch the opensearch client used by handlers.search.search_api_handler (e.g.,
the mock named opensearch_client or search) before invoking search_api_handler
and add an assertion that search.assert_not_called() after asserting the 400
response; this ensures the handler rejected the request and did not call the
OpenSearch client's search method.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6c008415-cb78-4a3d-9997-ad8ad07340ff
⛔ Files ignored due to path filters (1)
backend/cosmetology-app/lambdas/nodejs/yarn.lockis excluded by!**/yarn.lock,!**/*.lock
📒 Files selected for processing (3)
backend/cosmetology-app/lambdas/nodejs/package.jsonbackend/cosmetology-app/lambdas/python/search/handlers/search.pybackend/cosmetology-app/lambdas/python/search/tests/function/test_search_providers.py
✅ Files skipped from review due to trivial changes (1)
- backend/cosmetology-app/lambdas/nodejs/package.json
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/cosmetology-app/lambdas/python/search/handlers/search.py
jlkravitz
left a comment
There was a problem hiding this comment.
Couple small questions! Otherwise, looks good.
backend/cosmetology-app/lambdas/python/common/tests/unit/test_provider_record_util.py
Outdated
Show resolved
Hide resolved
backend/cosmetology-app/lambdas/python/common/cc_common/data_model/provider_record_util.py
Show resolved
Hide resolved
There was a problem hiding this comment.
🧹 Nitpick comments (1)
backend/cosmetology-app/docs/design/README.md (1)
406-406: Consider documenting the composite documentId format explicitly.While you reference
search/utils.pyfor document construction, it would be helpful to explicitly state the composite documentId format here in the documentation for quick reference. Based on the implementation, the format is{providerId}#{jurisdiction}#{licenseType}.Additionally, the statement "same overall shape as the provider detail API with
readGeneralpermission" could be clarified: while the top-level document structure matches the provider detail API, the nestedlicensesarray includes additional private fields (such asdateOfBirth) that are indexed specifically to enable authorized staff searches but are not returned in list-view API responses.📝 Suggested clarification
-Each indexed document corresponds to **one license** and uses the same overall shape as the provider detail API with `readGeneral` permission. See the [application code](../../lambdas/python/search/handlers/manage_opensearch_indices.py) for the current mapping definition. Document construction (one sanitized document per license, including composite `documentId`) is implemented in [search/utils.py](../../lambdas/python/search/utils.py). +Each indexed document corresponds to **one license** and uses the same overall top-level shape as the provider detail API with `readGeneral` permission, while the nested `licenses` array includes additional private fields (e.g., `dateOfBirth`) indexed to enable authorized staff searches. Each document is assigned a composite `documentId` in the format `{providerId}#{jurisdiction}#{licenseType}` to uniquely identify license documents. See the [application code](../../lambdas/python/search/handlers/manage_opensearch_indices.py) for the current mapping definition. Document construction is implemented in [search/utils.py](../../lambdas/python/search/utils.py).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cosmetology-app/docs/design/README.md` at line 406, The README should explicitly state the composite documentId format and clarify the difference between the top-level shape and nested license fields: add a sentence under the indexed document paragraph that the composite documentId used by search/utils.py is formatted as {providerId}#{jurisdiction}#{licenseType}, and expand the "same overall shape as the provider detail API with `readGeneral` permission" line to note that while the top-level structure matches the provider detail API (see manage_opensearch_indices.py for mapping), the nested `licenses` array contains additional private/indexed fields (e.g., `dateOfBirth`) used for authorized-staff search indexing but not returned in list-view API responses.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@backend/cosmetology-app/docs/design/README.md`:
- Line 406: The README should explicitly state the composite documentId format
and clarify the difference between the top-level shape and nested license
fields: add a sentence under the indexed document paragraph that the composite
documentId used by search/utils.py is formatted as
{providerId}#{jurisdiction}#{licenseType}, and expand the "same overall shape as
the provider detail API with `readGeneral` permission" line to note that while
the top-level structure matches the provider detail API (see
manage_opensearch_indices.py for mapping), the nested `licenses` array contains
additional private/indexed fields (e.g., `dateOfBirth`) used for
authorized-staff search indexing but not returned in list-view API responses.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b2ae19e3-09b6-4f75-a0da-ea1f4c6bf111
📒 Files selected for processing (1)
backend/cosmetology-app/docs/design/README.md
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/cosmetology-app/lambdas/python/common/requirements-dev.txt`:
- Line 11: The dev requirements file pins attrs==26.1.0 which conflicts with
attrs==25.4.0 in requirements.txt; update the attrs pin in
backend/cosmetology-app/lambdas/python/common/requirements-dev.txt to match the
runtime pin (attrs==25.4.0) OR update the runtime requirements.txt to the
desired version and then regenerate the lockfile from the synchronized .in
inputs so both requirements.txt and requirements-dev.txt have the same attrs
version and the lockfile is consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f3cd9605-8f16-4085-aa18-c755f0472bf0
📒 Files selected for processing (15)
backend/cosmetology-app/lambdas/python/cognito-backup/requirements-dev.txtbackend/cosmetology-app/lambdas/python/cognito-backup/requirements.txtbackend/cosmetology-app/lambdas/python/common/requirements-dev.txtbackend/cosmetology-app/lambdas/python/common/requirements.txtbackend/cosmetology-app/lambdas/python/compact-configuration/requirements-dev.txtbackend/cosmetology-app/lambdas/python/custom-resources/requirements-dev.txtbackend/cosmetology-app/lambdas/python/data-events/requirements-dev.txtbackend/cosmetology-app/lambdas/python/disaster-recovery/requirements-dev.txtbackend/cosmetology-app/lambdas/python/provider-data-v1/requirements-dev.txtbackend/cosmetology-app/lambdas/python/search/requirements-dev.txtbackend/cosmetology-app/lambdas/python/search/requirements.txtbackend/cosmetology-app/lambdas/python/staff-user-pre-token/requirements-dev.txtbackend/cosmetology-app/lambdas/python/staff-users/requirements-dev.txtbackend/cosmetology-app/requirements-dev.txtbackend/cosmetology-app/requirements.txt
✅ Files skipped from review due to trivial changes (14)
- backend/cosmetology-app/lambdas/python/staff-user-pre-token/requirements-dev.txt
- backend/cosmetology-app/lambdas/python/cognito-backup/requirements-dev.txt
- backend/cosmetology-app/lambdas/python/cognito-backup/requirements.txt
- backend/cosmetology-app/lambdas/python/custom-resources/requirements-dev.txt
- backend/cosmetology-app/lambdas/python/search/requirements.txt
- backend/cosmetology-app/lambdas/python/search/requirements-dev.txt
- backend/cosmetology-app/requirements-dev.txt
- backend/cosmetology-app/lambdas/python/provider-data-v1/requirements-dev.txt
- backend/cosmetology-app/lambdas/python/common/requirements.txt
- backend/cosmetology-app/lambdas/python/disaster-recovery/requirements-dev.txt
- backend/cosmetology-app/requirements.txt
- backend/cosmetology-app/lambdas/python/data-events/requirements-dev.txt
- backend/cosmetology-app/lambdas/python/staff-users/requirements-dev.txt
- backend/cosmetology-app/lambdas/python/compact-configuration/requirements-dev.txt
backend/cosmetology-app/lambdas/python/common/requirements-dev.txt
Outdated
Show resolved
Hide resolved
|
@jlkravitz The dependency check is failing for one dev transitive dependency used by pytest, This should be ready for another review. Thanks |
jlkravitz
left a comment
There was a problem hiding this comment.
@isabeleliassen This is good to merge!
The cosmetology compact needs to allow authenticated staff users to search practitioners by their date of birth. This updates their OpenSearch ingest process to include the date of birth field in the data we index into the OpenSearch domain. We also need to update the authenticated search API to now check specifically for permissions of the caller and only allow them to include the date of birth in the query if they have a readPrivate permission.
We also needed to update the way we are documenting records in OpenSearch to support native pagination. For JCC, we document the full list of license records nested within a provider object. This works because the search list view for JCC displays providers, so pagination lines up. For Cosmetology, the search list view will display line items per license, not provider, so pagination was not lined up since OpenSearch matches page size on top level provider hits, and provides may have more than one license, so you may end up with more hits than you requested.
To address this, we are now indexing one provider document per license into OpenSearch. This ensures that the hits returned from the search API will match with the requested page size.
In addition to the search updates, this PR includes the removal of the 'compactSummaryReportNotificationEmails' and 'jurisdictionSummaryReportNotificationEmails' configuration fields for compacts and jurisdictions, as the Cosmetology compact will not be sending out any financial summary reports.
Requirements List
Testing List
Closes #1310
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores