Skip to content

classification #5 - multi-label InferenceValues and conversion logic#925

Open
zale144 wants to merge 3 commits into
alek/engn-5153-4-worker-dictionary-ingestion-validationfrom
alek/engn-5154-5-internal-inference-representation-adapters
Open

classification #5 - multi-label InferenceValues and conversion logic#925
zale144 wants to merge 3 commits into
alek/engn-5153-4-worker-dictionary-ingestion-validationfrom
alek/engn-5154-5-internal-inference-representation-adapters

Conversation

@zale144

@zale144 zale144 commented Mar 4, 2026

Copy link
Copy Markdown
Contributor

Purpose of Changes and their Description

add get methods and conversion logic for multi-label InferenceValues

Link(s) to Ticket(s) or Issue(s) resolved by this PR

https://linear.app/alloralabs/issue/ENGN-5154/5-internal-inference-representation-and-adapters

Are these changes tested and documented?

  • If tested, please describe how. If not, why tests are not needed.
  • If documented, please describe where. If not, describe why docs are not needed.
  • Added to Unreleased section of CHANGELOG.md?

Still Left Todo

Fill this out if this is a Draft PR so others can help.


Summary by cubic

Introduces a simplified InferenceValues (DecArray) and adapters that normalize single- and multi-label inferences, padding to the epoch label registry. Worker getters and active-inference aggregation now use padded, deterministically ordered values. Addresses Linear ENGN-5154.

  • New Features
    • InferenceValues is now a DecArray with validation (length vs registry, NaN/finite).
    • Converters: proto → internal (SINGLE enforces len≤1 and requires Value == Values[0]; MULTI pads with zeros to the registry length, rejects overlong), internal → labeled values; plus getters for block/latest snapshots → []InferenceValues.
    • Added GetWorkersLatestInferencesByTopicIdValuesPadded: SINGLE canonicalizes to Values len=1 and syncs scalar; MULTI pads to registry length; results are sorted by inferer. closeActiveInferencesSet now uses this to build the active inferer map.
    • Tests cover conversion, padding, arity rules, zero-length registry edge cases, and worker getter behavior.

Written for commit 0d77963. Summary will update on new commits.

@zale144 zale144 self-assigned this Mar 4, 2026

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cubic analysis

5 issues found across 5 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="x/emissions/keeper/actor_utils/worker.go">

<violation number="1" location="x/emissions/keeper/actor_utils/worker.go:248">
P2: Redundant KV store read: `closeActiveInferencesSet` re-fetches the topic from state, but the caller `CloseWorkerNonce` already has the full `topic types.Topic` object and only passes `topic.Id`. Pass the topic directly to avoid the unnecessary store read.</violation>
</file>

<file name="x/emissions/keeper/inference_values_test.go">

<violation number="1" location="x/emissions/keeper/inference_values_test.go:11">
P1: After `s.SetupTest()` inside the subtest, `ctx`, `k`, `topicId`, `w1`, and `w2` are stale — they were captured from the outer scope before the suite was reset. You need to re-fetch them after each `SetupTest()` call, otherwise `setArity` and `registerLabels` operate on the pre-reset store, and all assertions compare against outdated values.

Add re-initialization after `s.SetupTest()`:
```go
s.SetupTest()
ctx = s.Ctx()
k = s.EmissionsKeeper()
creator = s.Addrs(0)
topicId = s.SetupTopic(creator)
w1 = s.AddrsStr(0)
w2 = s.AddrsStr(1)
```</violation>
</file>

<file name="x/emissions/keeper/inference_values.go">

<violation number="1" location="x/emissions/keeper/inference_values.go:108">
P1: Silent zero-padding when `len(inf.Values) < regLen` may violate the "array length matches epoch registry" invariant. According to linked Linear issue ENGN-5154, adapters should enforce alignment — a worker submitting fewer values than the registry expects would get zeros silently injected, which could corrupt inference results. Consider requiring an exact length match (`len(inf.Values) != regLen`) unless partial submissions are an intentional design choice.</violation>

<violation number="2" location="x/emissions/keeper/inference_values.go:174">
P2: Redundant `GetEpochLabelRegistry` calls: `InferenceValuesFromInferencesSnapshot` fetches the registry once, then calls `InferenceValuesFromProto` per inference, which fetches the same registry again for multi-arity topics. For a snapshot with N inferences, this results in N+1 store reads instead of 1. Consider either passing the pre-fetched registry into the per-inference conversion, or extracting the per-inference logic that doesn't need to re-fetch.</violation>
</file>

<file name="x/emissions/keeper/keeper_test.go">

<violation number="1" location="x/emissions/keeper/keeper_test.go:6558">
P2: `w1` and `w2` are captured before the subtest loop but `s.SetupTest()` inside each `s.Run` regenerates all addresses with new random keys. These variables then hold stale addresses that are not funded or whitelisted in the new test context. Move these assignments inside the `s.Run` closure, after `s.SetupTest()`.</violation>
</file>

Linked issue analysis

Linked issue: ENGN-5154: 5. Internal inference representation & adapters

Status Acceptance criteria Notes
Introduce InferenceValues (internal inference representation) Added InferenceValues struct and methods
Adapter: proto → internal (convert stored Inference to InferenceValues) Implemented InferenceValuesFromProto with checks
Adapter: worker dictionary → internal (worker/latest -> internal) Added GetWorkersLatestInferencesByTopicIdValuesPadded and snapshot converter
Adapter: internal → labeled dictionary (internal -> labeled values) Implemented ToLabeledValues to produce LabeledValue slices
Enforce invariant: array length matches epoch label registry Validation and padding against registry enforced
Enforce invariant: scalar vs array consistency for SINGLE topics Rejects len(values)>1 and reconciles scalar/array
Remove dependency on TopicType in math paths No changes removing TopicType dependency found
⚠️ All math code consumes InferenceValues (acceptance) Internal type and adapters added but math usage not updated
Adapters enforce alignment and invariants (acceptance) Adapters validate/pad and reject invalid shapes
Legacy scalar paths remain intact (acceptance) Scalar-only inferences canonicalized and supported
Architecture diagram
sequenceDiagram
    participant W as Worker Actor (actor_utils)
    participant K as Keeper (emissions)
    participant DB as KV Store (Inferences/Topics/Registry)

    Note over W,DB: NEW: Closing Active Inferences Set (Epoch End)

    W->>K: closeActiveInferencesSet(topicId, nonce, workers)
    K->>DB: GetTopic(topicId)
    DB-->>K: topic (arity: SINGLE or MULTI)

    rect rgb(23, 37, 84)
    Note right of K: NEW: GetWorkersLatestInferencesByTopicIdValuesPadded
    K->>K: Initialize inferences collection
    
    loop For each Worker Address
        K->>DB: GetWorkerLatestInferenceByTopicId(worker)
        DB-->>K: inference (Value, Values[])

        alt SINGLE Arity
            K->>K: CHANGED: Align scalar 'Value' with 'Values[0]'
            Note over K: Ensure Values length is 1
        else MULTI Arity
            K->>DB: GetEpochLabelRegistry(topicId, nonce)
            DB-->>K: registry (Labels[])
            
            alt Inference values > Registry labels
                K-->>W: Error (Logic mismatch)
            else Inference values < Registry labels
                K->>K: NEW: Pad Values with ZeroDec to match Registry length
            end
        end
    end
    
    K->>K: CHANGED: Sort inferences by Inferer address (Deterministic)
    end

    K->>DB: InsertActiveInferences(topicId, height, inferences)
    DB-->>K: Success
    K-->>W: activeInfererMap, inferences
Loading

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread x/emissions/keeper/inference_values_test.go
Comment thread x/emissions/keeper/inference_values.go
Comment thread x/emissions/keeper/actor_utils/worker.go Outdated
Comment thread x/emissions/keeper/inference_values.go Outdated
Comment thread x/emissions/keeper/keeper_test.go Outdated

@xmariachi xmariachi left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A concern on how labels get managed.

active = append(active, &infCopy)
}
case types.TopicOutputArity_TOPIC_OUTPUT_ARITY_MULTI:
reg, err := k.GetEpochLabelRegistry(ctx, topic.Id, nonce)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a query that uses this function, gets called at the middle of the worker submission window?
I guess it will pad with whatever the "current" epoch label registry is.

Are labels added / removed from the registry as they get added to / removed from the active set? If not removed then the size of the epoch label registry (until it gets closed in CloseWorkerNonce) is unbounded. This may be problematic. So it sounds like we should do the former (add/remove labels to the registry as we go).

@zale144 zale144 mentioned this pull request Apr 9, 2026
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants