Resolve three output-changing correctness defects (CPP redundancy filter, TreeModel seeding, BH p-values) under ADR-0032 + regolden #343

## Problem

The July 2026 correctness audit found three defects that are real but **change
published output**, so — unlike the low-risk batch — each needs the same
governance before any code lands: a declared **ADR-0032** equivalence tier and a
**regolden** of the affected regression anchors (ADR-0015 pattern). They are
independent defects, grouped here only because they share that decision gate and
must **not** be folded into the low-risk correctness batch.

### 1. CPP redundancy filter compares digit-characters, not positions

`filtering_info_` builds each feature's position set with `set(x)`, where `x` is the
comma-joined position string (e.g. `"11,12,…,20"`), so the position-overlap gate
compares the characters of that string (`'0'`–`'9'` and `','`) instead of integer
positions. Any two features with multi-digit positions therefore look ~100%
overlapping, and the positional decorrelation the algorithm documents effectively
never fires: redundancy reduction degenerates into a pure
scale-correlation-within-category filter.
Reproduced: `TMD-Segment(1,2)` (pos 11–20) vs `TMD-Segment(2,2)` (pos 21–30) gives true
overlap 0.0, computed 1.0, and the second feature is dropped. The `CPP.simplify`
path has the identical defect. This has been the shipped behavior since the file's inception.
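
A minimal sketch of the failure mode, using the two position ranges from the reproduction above. The helpers `overlap_fraction` and `parse_positions` are hypothetical names, and the overlap metric (intersection over the smaller set) is an assumption for illustration; the actual gate in `_redundancy_filter.py` may compute overlap differently.

```python
def overlap_fraction(a: set, b: set) -> float:
    """Hypothetical overlap metric: shared elements relative to the smaller set."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def parse_positions(pos: str) -> set:
    """Parse a comma-joined position string into a set of integer positions."""
    return {int(p) for p in pos.split(",") if p.strip()}


# Comma-joined position strings for positions 11-20 and 21-30
pos_a = ",".join(str(i) for i in range(11, 21))
pos_b = ",".join(str(i) for i in range(21, 31))

# Defective path: set() over a string yields its characters, so both
# features reduce to the same set of ten digits plus the comma
buggy = overlap_fraction(set(pos_a), set(pos_b))

# Corrected path: split on commas and compare integer positions
fixed = overlap_fraction(parse_positions(pos_a), parse_positions(pos_b))

print(buggy, fixed)  # 1.0 0.0
```

Under this assumed metric the character-level comparison reports full overlap for two disjoint ranges, while the parsed comparison reports none.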

### 2. TreeModel does no per-round seeding → zero uncertainty under a fixed seed

`fit_tree_based_models`' round loop passes a constant `random_state` to the RFE
`RandomForestClassifier` and the importance-model kwargs, so under a fixed seed
every round fits identical estimators; `feat_importance_std` (and `predict_proba`'s
`pred_std`) collapse to exactly 0 and rounds 2..N are wasted. This hits the
encouraged reproducibility path and contradicts the "average across training
rounds enhances robustness" claim. `ShapModel` already solves this with per-round
`random_state + round_idx` reseeding (`_seed_model_kwargs`).
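
The effect can be sketched without any model library. In the snippet below, `seed_round_kwargs` is a hypothetical helper that follows the `random_state + round_idx` idea described above (it is not the actual `_seed_model_kwargs` code), and `toy_importance` is a stand-in for a fitted estimator whose result depends only on its seed.

```python
import random
import statistics
from typing import Optional


def seed_round_kwargs(kwargs: dict, random_state: Optional[int], round_idx: int) -> dict:
    """Return a copy of kwargs with a per-round seed; no-op when random_state is None."""
    seeded = dict(kwargs)
    if random_state is not None:
        seeded["random_state"] = random_state + round_idx
    return seeded


def toy_importance(random_state=None, **_ignored) -> float:
    """Stand-in for one round's feature importance; determined entirely by the seed."""
    return random.Random(random_state).random()


def run_rounds(n_rounds: int, random_state: Optional[int], per_round: bool) -> list:
    """Collect one importance value per round, with or without per-round reseeding."""
    values = []
    for i in range(n_rounds):
        if per_round:
            kwargs = seed_round_kwargs({}, random_state, i)
        else:
            kwargs = {"random_state": random_state}  # constant seed every round
        values.append(toy_importance(**kwargs))
    return values


constant = run_rounds(5, 42, per_round=False)
reseeded = run_rounds(5, 42, per_round=True)

std_constant = statistics.pstdev(constant)  # identical rounds, so exactly 0
std_reseeded = statistics.pstdev(reseeded)  # rounds differ, so non-zero
```

Because each round's seed is derived deterministically from the base seed, two runs with the same base seed still produce the same sequence of rounds, which is the property the `fit(seed) == fit(seed)` acceptance criterion relies on.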

### 3. BH-adjusted p-values omit the monotonicity step

`_bh_corrected_pvalues` computes `sorted_pvals * n / ranks` and clips to 1 but
omits the reverse cumulative-minimum, so `p_val_fdr_bh` deviates from canonical
Benjamini–Hochberg (e.g. `statsmodels.stats.multitest.multipletests(pvals, method='fdr_bh')`)
in non-monotone regions, where the reported values are inflated (overly
conservative). This does not affect selection (ranking uses
`abs_auc`/`abs_mean_dif`), only the reported column.
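
A small worked example shows where the two variants diverge. This is a pure-Python sketch, not the code in `_utils_feature_stat.py`; `bh_adjust` and its `enforce_monotone` flag are hypothetical names, and the four p-values are an illustrative fixture chosen so that the raw step-up values are non-monotone.

```python
def bh_adjust(pvals, enforce_monotone=True):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])  # indices by ascending p

    # Raw step-up values p_(k) * n / k for rank k = 1..n, clipped to 1
    corrected = [min(pvals[idx] * n / rank, 1.0)
                 for rank, idx in enumerate(order, start=1)]

    if enforce_monotone:
        # Reverse cumulative minimum: each value is capped by every value
        # at a higher rank, which makes the sequence non-decreasing
        for k in range(n - 2, -1, -1):
            corrected[k] = min(corrected[k], corrected[k + 1])

    # Scatter back to the original input order
    adjusted = [0.0] * n
    for k, idx in enumerate(order):
        adjusted[idx] = corrected[k]
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.5]

without_step = bh_adjust(pvals, enforce_monotone=False)
# approx. [0.04, 0.0533, 0.06, 0.5]: p = 0.03 gets a larger value than p = 0.04

with_step = bh_adjust(pvals)
# approx. [0.04, 0.0533, 0.0533, 0.5]: non-decreasing in p
```

Only the entry for `p = 0.03` changes (from about 0.06 to about 0.0533), so the omission can only inflate reported values, never deflate them.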

## Goal

Correct all three, each landing under a declared ADR-0032 tier with a regolden of
the affected anchors — **after** a maintainer decision per defect.

## Decision needed (HITL)

Each defect changes published output and requires maintainer sign-off + a regolden
before code. Decide, per defect, the ADR-0032 tier and the quality band (for #1)
on the canonical `DOM_GSEC` cell.

## Requirements

**CPP redundancy filter (#1)**
- [ ] `_backend/cpp/_filters/_redundancy_filter.py` and `_backend/cpp/_simplify.py`
      — parse `COL_POSITION` via `.split(",")` (optionally `map(int, …)`).
- [ ] Declare the ADR-0032 tier (T3) + documented quality band; regolden the CPP anchor.

**TreeModel per-round seeding (#2)**
- [ ] `explainable_ai/_backend/tree_model/tree_model_fit.py` — per-round kwargs with
      `random_state + i` for the RFE `RandomForestClassifier` and the importance
      models (no-op when `random_state is None`), mirroring `_seed_model_kwargs`.
- [ ] Regression anchor freezing the seeded importance mean + asserting non-zero std.

**BH p-value monotonicity (#3)**
- [ ] `_backend/cpp/_utils_feature_stat.py` — apply
      `np.minimum.accumulate(corrected[::-1])[::-1]` before scattering back.
- [ ] Anchor the `p_val_fdr_bh` column on the canonical cell.

## KPIs / Acceptance criteria
- [ ] #1: disjoint-TMD-half features retained (overlap 0 → not dropped); kept-feature
      set on the canonical cell re-frozen with the band stated numerically.
- [ ] #2: `feat_importance_std` non-zero under a fixed seed while `fit(seed) ==
      fit(seed)` still holds; anchor frozen.
- [ ] #3: matches `statsmodels` `fdr_bh` within a stated numerical tolerance on a
      non-monotone fixture; selection unchanged; anchor frozen.
- [ ] Each change carries a `versionchanged` note.

## Scope / non-goals
- All three are output-changing; kept out of the low-risk correctness batch.
- No performance changes (ADR-0033: program closed).

