Feature: Add attribute-based exclusion filtering to the aggregation engine #249

@cgbarlow

Task: Add attribute-based exclusion filtering to the aggregation engine (ADR-212)

Context

The aggregation engine (ADR-212 / SPEC-212-a) walks smart-markdown diagrams and
rolls up structured per-use values. The global “Shopping list” profile aggregates
ingredient elements from a meal-plan diagram into a grocery list grouped by aisle.

Grocery elements can carry a Seasoning attribute (attributes/Seasoning/type = "true").
Roughly 27 elements are flagged this way (e.g. Sugar, Flour, Olive oil, Parsley,
salt, pepper, herbs). Users do not want these in the shopping list output because
they are pantry staples bought infrequently.

Problem

There is currently NO way to exclude elements by attribute. The inner traversal
only supports skip_blank_values. Attempting to add an exclude_if_attribute key
to profile_data.traversal.inner results in the key being silently stripped on
save (the create/update path drops unrecognised keys), and the aggregate run
includes the flagged elements regardless. Confirmed empirically: a profile with
the candidate key returns identical output to the unfiltered profile, with Sugar
and Olive oil still present.

Goal

Add a first-class, attribute-based exclusion filter to the inner traversal so a
profile can drop elements matching an attribute condition (specifically, but not
limited to, Seasoning = true).

Proposed schema addition

Extend traversal.inner with an optional exclude_filters array. Each filter:

{
  "path": "attributes/Seasoning/type",
  "op": "equals",
  "value": "true"
}
  • path (string, required): attribute path on the collected element, same
    path grammar already used by value_attribute_path / bucket_attribute_path.
  • op (string, required): one of equals, not_equals, exists, not_exists,
    truthy, falsy. Start with equals and truthy as the minimum viable set;
    truthy should treat “true”/“1”/“yes” (case-insensitive) as match.
  • value (string, optional): required for equals / not_equals, ignored otherwise.
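The per-op behaviour described above can be sketched as follows. This is a minimal illustration, not the engine's code: `resolve_path`, `is_truthy`, and `filter_matches` are hypothetical names, the element is assumed to be a nested dict addressed by a slash-separated path, and the handling of a missing attribute (no match for `equals`, match for `not_equals` and `falsy`) is an assumption the spec would still need to confirm.

```python
from typing import Any, Mapping, Optional

# Strings that the "truthy" op treats as a match (compared case-insensitively).
TRUTHY_STRINGS = {"true", "1", "yes"}


def resolve_path(element: Mapping[str, Any], path: str) -> Optional[Any]:
    """Walk a slash-separated path through nested dicts.

    Returns None if any segment is missing. The real path grammar is whatever
    value_attribute_path already uses; nested dicts are an assumption here.
    """
    node: Any = element
    for segment in path.split("/"):
        if not isinstance(node, Mapping) or segment not in node:
            return None
        node = node[segment]
    return node


def is_truthy(raw: Any) -> bool:
    """True for "true"/"1"/"yes" in any letter case; False for everything else."""
    return raw is not None and str(raw).strip().lower() in TRUTHY_STRINGS


def filter_matches(element: Mapping[str, Any], flt: Mapping[str, str]) -> bool:
    """Return True if the element matches a single exclude filter."""
    raw = resolve_path(element, flt["path"])
    op = flt["op"]
    if op == "exists":
        return raw is not None
    if op == "not_exists":
        return raw is None
    if op == "truthy":
        return is_truthy(raw)
    if op == "falsy":
        # Assumption: falsy is the exact negation of truthy, so a missing
        # or blank attribute counts as falsy.
        return not is_truthy(raw)
    if op == "equals":
        return raw is not None and str(raw) == flt["value"]
    if op == "not_equals":
        # Assumption: a missing attribute is "not equal" to any value.
        return raw is None or str(raw) != flt["value"]
    raise ValueError(f"unknown op: {op!r}")
```

Note that `equals` is case-sensitive in this sketch while `truthy` is not, which is one reason the acceptance criteria use `truthy` for the Seasoning flag.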

Semantics: an element is EXCLUDED from aggregation if it matches ANY filter in the
array (OR logic). Filters apply after token collection and before value summing,
so excluded elements never contribute a row or a quantity. The array is optional
and defaults to empty, so existing profiles see no behaviour change.
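The ordering (collect, then filter, then sum) and the OR logic might look like the sketch below. Everything here is hypothetical scaffolding: `drop_excluded`, `sum_by_name`, the flat element shape, and the toy matcher `toy_truthy` stand in for whatever the engine actually uses.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List, Mapping, Sequence

Element = Mapping[str, Any]
Filter = Mapping[str, str]
Matcher = Callable[[Element, Filter], bool]


def drop_excluded(
    elements: Iterable[Element],
    exclude_filters: Sequence[Filter],
    matches: Matcher,
) -> List[Element]:
    """Keep only elements that match none of the filters (OR logic)."""
    if not exclude_filters:
        # Empty or absent array: no behaviour change for existing profiles.
        return list(elements)
    return [
        el for el in elements
        if not any(matches(el, flt) for flt in exclude_filters)
    ]


def sum_by_name(elements: Iterable[Element]) -> Dict[str, float]:
    """Toy stand-in for value summing: total quantity per element name."""
    totals: Dict[str, float] = defaultdict(float)
    for el in elements:
        totals[el["name"]] += el["quantity"]
    return dict(totals)


def toy_truthy(element: Element, flt: Filter) -> bool:
    """Toy matcher: looks up flt["path"] as a flat key, ignoring flt["op"]."""
    raw = element.get("attributes", {}).get(flt["path"])
    return str(raw).lower() in {"true", "1", "yes"}


collected = [
    {"name": "Rice", "quantity": 200.0, "attributes": {}},
    {"name": "Sugar", "quantity": 50.0,
     "attributes": {"attributes/Seasoning/type": "true"}},
    {"name": "Rice", "quantity": 100.0, "attributes": {}},
]
filters = [{"path": "attributes/Seasoning/type", "op": "truthy"}]

# Filtering happens before summing, so Sugar contributes neither a row
# nor a quantity to the totals.
totals = sum_by_name(drop_excluded(collected, filters, toy_truthy))
```

Because the filtered list is produced before any summing, anything computed downstream (provenance, per-source breakdown, multiplier scaling) would naturally see only the post-filter set.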

Prefer an array over a single object so future use cases (exclude by aisle,
exclude optional items) compose without another schema change.

Required changes

  1. Schema / validation (SPEC-212-a + whatever Pydantic / JSON-schema model
    backs profile_data): add exclude_filters to the inner-traversal model so it
    survives create_aggregation_profile and update_aggregation_profile round-trips.
    This is the key fix: right now the persistence layer drops the field.
    Add validation: reject unknown op values with a clear error rather than
    silently ignoring them.
  2. Engine: in the inner traversal, after resolving each candidate element,
    evaluate exclude_filters and skip matched elements. Make sure provenance,
    per-source breakdown, and multiplier scaling all see the post-filter set.
  3. No change to output formatting, grouping, or the outer traversal.
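As a rough sketch of the validation in step 1, written in plain Python because the backing model (Pydantic, JSON schema, or hand-rolled) is not yet confirmed. `ProfileValidationError` and `validate_exclude_filters` are hypothetical names, and the error wording is illustrative only.

```python
from typing import Any, Dict, List, Mapping

ALLOWED_OPS = {"equals", "not_equals", "exists", "not_exists", "truthy", "falsy"}
OPS_REQUIRING_VALUE = {"equals", "not_equals"}


class ProfileValidationError(ValueError):
    """Raised when profile_data.traversal.inner fails validation (hypothetical)."""


def validate_exclude_filters(inner: Mapping[str, Any]) -> List[Dict[str, Any]]:
    """Validate and return exclude_filters from an inner-traversal dict.

    A missing key yields an empty list, so existing profiles are unaffected.
    Valid entries are returned as given rather than stripped.
    """
    raw = inner.get("exclude_filters", [])
    if not isinstance(raw, list):
        raise ProfileValidationError("exclude_filters must be an array")

    cleaned: List[Dict[str, Any]] = []
    for i, flt in enumerate(raw):
        where = f"exclude_filters[{i}]"
        if not isinstance(flt, Mapping):
            raise ProfileValidationError(f"{where} must be an object")
        path, op = flt.get("path"), flt.get("op")
        if not isinstance(path, str) or not path:
            raise ProfileValidationError(f"{where}.path is required")
        if not isinstance(op, str) or op not in ALLOWED_OPS:
            # Reject loudly instead of silently ignoring the filter.
            raise ProfileValidationError(
                f"{where}.op: unknown op {op!r}; expected one of {sorted(ALLOWED_OPS)}"
            )
        if op in OPS_REQUIRING_VALUE and not isinstance(flt.get("value"), str):
            raise ProfileValidationError(f"{where}.value is required for op {op!r}")
        entry: Dict[str, Any] = {"path": path, "op": op}
        if "value" in flt:
            # Keep value even when the op ignores it, so round-trips are lossless.
            entry["value"] = flt["value"]
        cleaned.append(entry)
    return cleaned
```

If the backing model turns out to be Pydantic, the equivalent change would be declaring the field on the inner-traversal model; the checks above only illustrate which inputs should be accepted and which rejected.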

Tests

  • Unit: filter evaluation for each op, including missing-attribute and
    blank-value edge cases.
  • Round-trip: create a profile with exclude_filters, fetch it, assert the field
    persists unchanged (this is the regression that currently fails).
  • Engine integration: aggregate a meal plan where some referenced ingredients are
    flagged with attributes/Seasoning/type = "true"; assert those elements are
    absent from output and that non-flagged elements (and their summed quantities)
    are unaffected.
  • Backward compat: existing profiles with no exclude_filters produce byte-identical
    output to before the change.

Acceptance criteria

  • A “Shopping list (no seasonings)” profile with
    exclude_filters: [{ "path": "attributes/Seasoning/type", "op": "truthy" }]
    excludes Sugar, Olive oil, etc. from the aggregated output.
  • The field persists through create and update.
  • All existing aggregation profiles behave identically.

Notes / non-goals

  • Do not hard-code “Seasoning”. The filter must be generic attribute-based.
  • Do not change how seasoning elements are stamped into recipe diagrams.
  • Document the new field in SPEC-212-a and bump the relevant version per the repo’s
    ADR/versioning convention.

Verify before running

  • The exact module that backs profile_data validation (Pydantic, JSON schema, or
    hand-rolled). That is where the silent-drop happens and is the crux of the fix.
  • The repo’s version-bump and ADR-update convention, so the change follows it
    rather than guessing.
