classification #5 - multi-label InferenceValues and conversion logic#925
Conversation
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
There was a problem hiding this comment.
cubic analysis
5 issues found across 5 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="x/emissions/keeper/actor_utils/worker.go">
<violation number="1" location="x/emissions/keeper/actor_utils/worker.go:248">
P2: Redundant KV store read: `closeActiveInferencesSet` re-fetches the topic from state, but the caller `CloseWorkerNonce` already has the full `topic types.Topic` object and only passes `topic.Id`. Pass the topic directly to avoid the unnecessary store read.</violation>
</file>
<file name="x/emissions/keeper/inference_values_test.go">
<violation number="1" location="x/emissions/keeper/inference_values_test.go:11">
P1: After `s.SetupTest()` inside the subtest, `ctx`, `k`, `topicId`, `w1`, and `w2` are stale — they were captured from the outer scope before the suite was reset. You need to re-fetch them after each `SetupTest()` call, otherwise `setArity` and `registerLabels` operate on the pre-reset store, and all assertions compare against outdated values.
Add re-initialization after `s.SetupTest()`:
```go
s.SetupTest()
ctx = s.Ctx()
k = s.EmissionsKeeper()
creator = s.Addrs(0)
topicId = s.SetupTopic(creator)
w1 = s.AddrsStr(0)
w2 = s.AddrsStr(1)
```</violation>
</file>
<file name="x/emissions/keeper/inference_values.go">
<violation number="1" location="x/emissions/keeper/inference_values.go:108">
P1: Silent zero-padding when `len(inf.Values) < regLen` may violate the "array length matches epoch registry" invariant. According to linked Linear issue ENGN-5154, adapters should enforce alignment — a worker submitting fewer values than the registry expects would get zeros silently injected, which could corrupt inference results. Consider requiring an exact length match (`len(inf.Values) != regLen`) unless partial submissions are an intentional design choice.</violation>
<violation number="2" location="x/emissions/keeper/inference_values.go:174">
P2: Redundant `GetEpochLabelRegistry` calls: `InferenceValuesFromInferencesSnapshot` fetches the registry once, then calls `InferenceValuesFromProto` per inference, which fetches the same registry again for multi-arity topics. For a snapshot with N inferences, this results in N+1 store reads instead of 1. Consider either passing the pre-fetched registry into the per-inference conversion, or extracting the per-inference logic that doesn't need to re-fetch.</violation>
</file>
<file name="x/emissions/keeper/keeper_test.go">
<violation number="1" location="x/emissions/keeper/keeper_test.go:6558">
P2: `w1` and `w2` are captured before the subtest loop but `s.SetupTest()` inside each `s.Run` regenerates all addresses with new random keys. These variables then hold stale addresses that are not funded or whitelisted in the new test context. Move these assignments inside the `s.Run` closure, after `s.SetupTest()`.</violation>
</file>
Linked issue analysis
Linked issue: ENGN-5154: 5. Internal inference representation & adapters
| Status | Acceptance criteria | Notes |
|---|---|---|
| ✅ | Introduce InferenceValues (internal inference representation) | Added InferenceValues struct and methods |
| ✅ | Adapter: proto → internal (convert stored Inference to InferenceValues) | Implemented InferenceValuesFromProto with checks |
| ✅ | Adapter: worker dictionary → internal (worker/latest -> internal) | Added GetWorkersLatestInferencesByTopicIdValuesPadded and snapshot converter |
| ✅ | Adapter: internal → labeled dictionary (internal -> labeled values) | Implemented ToLabeledValues to produce LabeledValue slices |
| ✅ | Enforce invariant: array length matches epoch label registry | Validation and padding against registry enforced |
| ✅ | Enforce invariant: scalar vs array consistency for SINGLE topics | Rejects len(values)>1 and reconciles scalar/array |
| ❌ | Remove dependency on TopicType in math paths | No changes removing TopicType dependency found |
| All math code consumes InferenceValues (acceptance) | Internal type and adapters added but math usage not updated | |
| ✅ | Adapters enforce alignment and invariants (acceptance) | Adapters validate/pad and reject invalid shapes |
| ✅ | Legacy scalar paths remain intact (acceptance) | Scalar-only inferences canonicalized and supported |
Architecture diagram
sequenceDiagram
participant W as Worker Actor (actor_utils)
participant K as Keeper (emissions)
participant DB as KV Store (Inferences/Topics/Registry)
Note over W,DB: NEW: Closing Active Inferences Set (Epoch End)
W->>K: closeActiveInferencesSet(topicId, nonce, workers)
K->>DB: GetTopic(topicId)
DB-->>K: topic (arity: SINGLE or MULTI)
rect rgb(23, 37, 84)
Note right of K: NEW: GetWorkersLatestInferencesByTopicIdValuesPadded
K->>K: Initialize inferences collection
loop For each Worker Address
K->>DB: GetWorkerLatestInferenceByTopicId(worker)
DB-->>K: inference (Value, Values[])
alt SINGLE Arity
K->>K: CHANGED: Align scalar 'Value' with 'Values[0]'
Note over K: Ensure Values length is 1
else MULTI Arity
K->>DB: GetEpochLabelRegistry(topicId, nonce)
DB-->>K: registry (Labels[])
alt Inference values > Registry labels
K-->>W: Error (Logic mismatch)
else Inference values < Registry labels
K->>K: NEW: Pad Values with ZeroDec to match Registry length
end
end
end
K->>K: CHANGED: Sort inferences by Inferer address (Deterministic)
end
K->>DB: InsertActiveInferences(topicId, height, inferences)
DB-->>K: Success
K-->>W: activeInfererMap, inferences
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
xmariachi
left a comment
There was a problem hiding this comment.
A concern on how labels get managed.
| active = append(active, &infCopy) | ||
| } | ||
| case types.TopicOutputArity_TOPIC_OUTPUT_ARITY_MULTI: | ||
| reg, err := k.GetEpochLabelRegistry(ctx, topic.Id, nonce) |
There was a problem hiding this comment.
What happens if a query that uses this function, gets called at the middle of the worker submission window?
I guess it will pad with whatever the "current" epoch label registry is.
Are labels added / removed from the registry as they get added to / removed from the active set? If not removed then the size of the epoch label registry (until it gets closed in CloseWorkerNonce) is unbounded. This may be problematic. So it sounds like we should do the former (add/remove labels to the registry as we go).
Purpose of Changes and their Description
add get methods and conversion logic for multi-label InferenceValues
Link(s) to Ticket(s) or Issue(s) resolved by this PR
https://linear.app/alloralabs/issue/ENGN-5154/5-internal-inference-representation-and-adapters
Are these changes tested and documented?
Unreleasedsection ofCHANGELOG.md?Still Left Todo
Fill this out if this is a Draft PR so others can help.
Summary by cubic
Introduces a simplified
InferenceValues(DecArray) and adapters that normalize single- and multi-label inferences, padding to the epoch label registry. Worker getters and active-inference aggregation now use padded, deterministically ordered values. Addresses Linear ENGN-5154.InferenceValuesis now aDecArraywith validation (length vs registry, NaN/finite).Value == Values[0]; MULTI pads with zeros to the registry length, rejects overlong), internal → labeled values; plus getters for block/latest snapshots →[]InferenceValues.GetWorkersLatestInferencesByTopicIdValuesPadded: SINGLE canonicalizes toValueslen=1 and syncs scalar; MULTI pads to registry length; results are sorted by inferer.closeActiveInferencesSetnow uses this to build the active inferer map.Written for commit 0d77963. Summary will update on new commits.