Realtime topology termination fast-path is unreachable in the current operator rewrite #88

@lanycrost

Summary

Bobrapet contains a DAG fast-path for realtime shutdown based on ConditionDegraded=True with ReasonTopologyTerminated. That path marks running realtime steps as failed and allows the DAG to advance immediately.

In the current rewrite, the read-side logic and tests exist, but I could not find production controller code in this repo that actually sets ReasonTopologyTerminated for a StoryRun.

This makes the fast-path effectively dead.

Current read-side behavior

internal/controller/runs/dag.go:433-460

  • When story.Spec.Pattern.IsRealtime() and the StoryRun has ConditionDegraded=True with ReasonTopologyTerminated, the DAG:
    • treats main phase as done
    • marks non-terminal realtime steps as failed with "realtime topology terminated"
    • continues into compensation/finally/terminal flow
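
The read-side behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in dag.go: the `Condition` and `StepState` types, the phase strings, the `"Degraded"` condition type value, and the function names are all hypothetical stand-ins, and the sketch treats every step in the map as a realtime step. Only `ReasonTopologyTerminated = "TopologyTerminated"` is taken from the issue.

```go
package main

// Condition is a hypothetical stand-in for a Kubernetes-style status
// condition; the real StoryRun status type may differ.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// StepState is a hypothetical per-step status record.
type StepState struct {
	Phase   string // assumed phases: "Pending", "Running", "Succeeded", "Failed"
	Message string
}

const (
	ConditionDegraded        = "Degraded"           // assumed value
	ReasonTopologyTerminated = "TopologyTerminated" // pkg/conditions/conditions.go:119
)

// topologyTerminated reports whether the conditions contain
// Degraded=True with reason TopologyTerminated.
func topologyTerminated(conds []Condition) bool {
	for _, c := range conds {
		if c.Type == ConditionDegraded && c.Status == "True" &&
			c.Reason == ReasonTopologyTerminated {
			return true
		}
	}
	return false
}

// isTerminal reports whether a step phase is already final.
func isTerminal(phase string) bool {
	return phase == "Succeeded" || phase == "Failed"
}

// applyFastPath fails every non-terminal step and returns true when the
// main phase should be treated as done. It is a no-op (returning false)
// for non-realtime stories or when the signal is absent.
func applyFastPath(realtime bool, conds []Condition, steps map[string]*StepState) bool {
	if !realtime || !topologyTerminated(conds) {
		return false
	}
	for _, s := range steps {
		if !isTerminal(s.Phase) {
			s.Phase = "Failed"
			s.Message = "realtime topology terminated"
		}
	}
	return true
}
```

In this sketch the fast-path only fires when some writer has already set the condition, which is exactly the piece this issue reports as missing.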

Evidence that the fast-path exists but is not wired

Added condition constant:

  • pkg/conditions/conditions.go:119
    • ReasonTopologyTerminated = "TopologyTerminated"

DAG logic:

  • internal/controller/runs/dag.go:433-460

Tests covering the behavior:

  • internal/controller/runs/dag_test.go:1603-1687
  • internal/controller/runs/dag_test.go:1707-1786

However, a repo-wide search found only:

  • the condition constant
  • the DAG check
  • the DAG tests
  • an unrelated generic degraded output path in dag.go

I did not find production code in the mounted rewrite that sets ConditionDegraded=True with ReasonTopologyTerminated on StoryRuns.

Impact

  • The operator contains a realtime fast-finish design that is not reachable
  • Cancelled realtime StoryRuns fall back to the slower cancel loop instead of advancing through the DAG immediately
  • The code/tests suggest intended behavior that operators cannot actually observe in production

Acceptance criteria

  • Wire a production controller path that sets ConditionDegraded=True with ReasonTopologyTerminated when realtime topology has actually terminated, or remove/replace the dead fast-path
  • Add integration coverage that proves a realtime StoryRun reaches the DAG fast-path from live controller signals instead of only from synthetic test status
  • Ensure the resulting signal is emitted exactly once and is safe to consume across requeues
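
One possible shape for the "exactly once, safe across requeues" criterion is an idempotent setter that reports whether the status actually changed, so the controller persists status and emits any event only on the first transition. The sketch below is an assumption-laden illustration: the `Condition` type, the `"Degraded"` value, and `markTopologyTerminated` are hypothetical names, not existing Bobrapet code. A real implementation would more likely use `metav1.Condition` with the apimachinery condition helpers instead of these hand-rolled types.

```go
package main

// Condition is a hypothetical stand-in for a Kubernetes-style status condition.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

const (
	ConditionDegraded        = "Degraded"           // assumed value
	ReasonTopologyTerminated = "TopologyTerminated" // pkg/conditions/conditions.go:119
)

// markTopologyTerminated records Degraded=True with reason
// TopologyTerminated and reports whether the conditions changed.
// Repeated calls (for example on requeue) return changed=false, so a
// caller can skip the status update and event emission.
func markTopologyTerminated(conds []Condition, msg string) ([]Condition, bool) {
	for i := range conds {
		if conds[i].Type != ConditionDegraded {
			continue
		}
		if conds[i].Status == "True" && conds[i].Reason == ReasonTopologyTerminated {
			return conds, false // already recorded; nothing to do
		}
		// Degraded exists for another reason or status: overwrite it.
		conds[i].Status = "True"
		conds[i].Reason = ReasonTopologyTerminated
		conds[i].Message = msg
		return conds, true
	}
	// No Degraded condition yet: append one.
	return append(conds, Condition{
		Type:    ConditionDegraded,
		Status:  "True",
		Reason:  ReasonTopologyTerminated,
		Message: msg,
	}), true
}
```

A reconciler following this pattern would call the setter whenever it observes that the realtime topology has terminated, and write status only when the second return value is true.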

Metadata

Labels

  • area/operator: Bobrapet controller or CRD-level change.
  • kind/bug: Unexpected behaviour or regression that needs fixing.
  • priority/high: Important issue to schedule soon.
