
Conversation

@PhilippMatthes (Member) commented Dec 29, 2025

In a companion pull request we implemented a Cortex filtering pipeline for KVM. That pipeline uses the Hypervisor CRD as the single source of truth to determine on which hypervisors a VM can be scheduled. To complete that implementation, we extended the Hypervisor CRD in a separate pull request, adding new fields and removing outdated ones. The new fields need to be autodiscovered by the KVM node agent, which this pull request implements. The following fields are now populated:

Support filtering based on hypervisor type and other capabilities:

  • Export the hypervisor type, architecture, supported devices, supported CPU modes, and supported features

Capacity filtering:

  • Aggregate the allocated and total available capacity and populate the corresponding fields

(Bonus)

  • Add NUMA cell capacity and allocation information so we can implement NUMA-sensitive initial placement

When done:

  • Test with an SSH-forwarded libvirt socket

Note

The scope of this PR is to establish a minimum viable scheduling pipeline in Cortex with the least amount of changes possible. Refactorings of the Hypervisor CRD spec can follow if needed.
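The capacity aggregation step can be sketched as follows. This is a simplified illustration, not the agent's actual code: the type and field names (`DomainInfo`, `CPUs`, `MemoryKiB`, `aggregate`) are hypothetical, and the real implementation works with `resource.Quantity` values from k8s.io/apimachinery rather than raw KiB integers.

```go
package main

import "fmt"

// DomainInfo is an illustrative stand-in for the per-domain allocation
// data the node agent collects from libvirt (names are hypothetical).
type DomainInfo struct {
	CPUs      uint64 // allocated vCPUs
	MemoryKiB uint64 // allocated memory in KiB
}

// aggregate sums the per-domain allocations into the totals that would
// be written to the hypervisor CRD's allocated-capacity fields.
func aggregate(domains []DomainInfo) (cpus, memKiB uint64) {
	for _, d := range domains {
		cpus += d.CPUs
		memKiB += d.MemoryKiB
	}
	return cpus, memKiB
}

func main() {
	domains := []DomainInfo{
		{CPUs: 2, MemoryKiB: 2080768}, // 2032Mi
		{CPUs: 1, MemoryKiB: 2080768},
	}
	cpus, mem := aggregate(domains)
	fmt.Printf("allocated: cpu=%d memory=%dKi\n", cpus, mem)
}
```

Together with the node's total capacity from the libvirt capabilities, these sums are what a capacity filter needs to decide whether a VM still fits.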

@github-actions commented

Merging this branch changes the coverage (1 decrease, 5 increase)

| Impacted Packages | Coverage Δ | 🤖 |
|---|---|---|
| github.com/cobaltcore-dev/kvm-node-agent/internal/controller | 23.42% (+2.04%) | 👍 |
| github.com/cobaltcore-dev/kvm-node-agent/internal/emulator | 0.00% (ø) | |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt | 11.08% (+2.58%) | 👍 |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/capabilities | 63.64% (-3.86%) | 👎 |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/domcapabilities | 71.79% (+71.79%) | 🌟 |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/dominfo | 62.50% (+62.50%) | 🌟 |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/util | 100.00% (+100.00%) | 🌟 |

Coverage by file

Changed files (no unit tests)

| Changed File | Coverage Δ | Total | Covered | Missed | 🤖 |
|---|---|---|---|---|---|
| github.com/cobaltcore-dev/kvm-node-agent/internal/controller/hypervisor_controller.go | 34.91% (+1.57%) | 106 (+13) | 37 (+6) | 69 (+7) | 👍 |
| github.com/cobaltcore-dev/kvm-node-agent/internal/emulator/libvirt.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/capabilities/client.go | 63.64% (+2.35%) | 33 (+2) | 21 (+2) | 12 | 👍 |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/capabilities/schema.go | 0.00% (-88.89%) | 0 (-9) | 0 (-8) | 0 (-1) | 💀 💀 💀 💀 💀 |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/domcapabilities/client.go | 71.79% (+71.79%) | 39 (+39) | 28 (+28) | 11 (+11) | 🌟 |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/domcapabilities/example.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/domcapabilities/schema.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/dominfo/client.go | 62.50% (+62.50%) | 48 (+48) | 30 (+30) | 18 (+18) | 🌟 |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/dominfo/example.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/dominfo/schema.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/interface.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/interface_mock.go | 35.00% (+3.75%) | 120 (+24) | 42 (+12) | 78 (+12) | 👍 |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/libvirt.go | 0.00% (ø) | 35 (+2) | 0 | 35 (+2) | |
| github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/util/util.go | 100.00% (+100.00%) | 9 (+9) | 9 (+9) | 0 | 🌟 |

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/cobaltcore-dev/kvm-node-agent/internal/controller/hypervisor_controller_test.go
  • github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/capabilities/client_test.go
  • github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/domcapabilities/client_test.go
  • github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/domcapabilities/schema_test.go
  • github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/dominfo/client_test.go
  • github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/dominfo/schema_test.go
  • github.com/cobaltcore-dev/kvm-node-agent/internal/libvirt/util/util_test.go

@PhilippMatthes (Member Author)

Technically this is ready for review, but I want to make sure everything is implemented correctly and will check with an SSH-forwarded libvirt socket once I get an available hypervisor.

@PhilippMatthes PhilippMatthes marked this pull request as ready for review January 2, 2026 12:57
PhilippMatthes added a commit to cobaltcore-dev/cortex that referenced this pull request Jan 5, 2026
## Background

For virtual machines spawned on the KVM hypervisor, we no longer want to use Nova and Placement as the source of truth. Instead, filters should use the Hypervisor CRD exposed by the [hypervisor operator](https://github.com/cobaltcore-dev/openstack-hypervisor-operator) and populated by the [node agent](https://github.com/cobaltcore-dev/kvm-node-agent). This contribution replaces the implementation of all filters that were originally ported from Nova accordingly. Afterward, we can disable the filters in Nova one by one, moving the compute placement logic over to Cortex.

> [!TIP]
> You can use the newly added [mirror tool](93fdcc0) to mirror hypervisor resources from our compute cluster over to the local cluster.

## Completion

- [x] ~internal/scheduling/decisions/nova/plugins/filters/filter_compute_capabilities.go~ (REMOVED)
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_capabilities.go (NEW)
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_correct_az.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_external_customer.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_has_accelerators.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_has_enough_capacity.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_has_requested_traits.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_host_instructions.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_maintenance.go (NEW)
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_packed_virtqueue.go
- [x] ~internal/scheduling/decisions/nova/plugins/filters/filter_project_aggregates.go~ (REMOVED)
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_allowed_projects.go (NEW)
- [x] ~internal/scheduling/decisions/nova/plugins/filters/filter_disabled.go~ (REMOVED)
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_status_conditions.go (NEW)

## Dependencies

> [!NOTE]
> The scope of this PR is to establish a minimum viable scheduling pipeline with the current state. Extensive refactorings, for example of the filter for requested traits, are out of scope.

Hypervisor operator PR: cobaltcore-dev/openstack-hypervisor-operator#217
KVM node agent PR: cobaltcore-dev/kvm-node-agent#40
@PhilippMatthes (Member Author)

Tested result with a real hypervisor:

```yaml
apiVersion: kvm.cloud.sap/v1
kind: Hypervisor
# ...
status:
  capabilities:
    cpuArch: x86_64
    cpus: "128"
    hostTopology:
    - capacity:
        cpu: "64"
        memory: 528110060Ki
      id: 0
    - capacity:
        cpu: "64"
        memory: 528456396Ki
      id: 1
    memory: 1056566456Ki
  conditions:
  - lastTransitionTime: "2026-01-05T13:01:25Z"
    message: ""
    reason: DomainInfoClientGetSucceeded
    status: "True"
    type: DomainInfoClientConnection
  - lastTransitionTime: "2026-01-05T13:01:25Z"
    message: ""
    reason: DomainCapabilitiesClientGetSucceeded
    status: "True"
    type: DomainCapabilitiesClientConnection
  domainCapabilities:
    arch: x86_64
    hypervisorType: ch
    supportedCpuModes:
    - mode/host-passthrough
    supportedDevices:
    - video
    - video/none
    supportedFeatures: []
  domainInfos:
  - allocation:
      cpu: "2"
      memory: 2032Mi
    cpuCells:
    - 0
    memoryCells:
    - 0
    name: # omitted
    uuid: # omitted
  - allocation:
      cpu: "1"
      memory: 2032Mi
    cpuCells:
    - 0
    memoryCells:
    - 0
    name: # omitted
    uuid: # omitted
  # ...
```
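Given such a status, a NUMA-sensitive filter could compare per-cell free capacity against a new VM's request. The following is a minimal sketch under simplified stand-in types for `hostTopology` and `domainInfos`: all names (`Cell`, `Domain`, `freeCPUPerCell`) are hypothetical, raw integers stand in for `resource.Quantity`, and the even split of vCPUs across cells is an assumption, not the actual pinning semantics.

```go
package main

import "fmt"

// Cell and Domain are simplified stand-ins for the CRD's hostTopology
// and domainInfos entries (field names are illustrative).
type Cell struct {
	ID  int
	CPU uint64 // total CPUs in this NUMA cell
}

type Domain struct {
	CPU      uint64 // allocated vCPUs
	CPUCells []int  // NUMA cells the domain's vCPUs occupy
}

// freeCPUPerCell subtracts each domain's vCPUs (split evenly across its
// cells, a simplifying assumption) from the per-cell totals, yielding
// the free CPUs a NUMA-sensitive filter could compare against a new
// VM's request.
func freeCPUPerCell(cells []Cell, domains []Domain) map[int]uint64 {
	free := make(map[int]uint64, len(cells))
	for _, c := range cells {
		free[c.ID] = c.CPU
	}
	for _, d := range domains {
		if len(d.CPUCells) == 0 {
			continue
		}
		share := d.CPU / uint64(len(d.CPUCells))
		for _, id := range d.CPUCells {
			free[id] -= share
		}
	}
	return free
}

func main() {
	// Mirrors the example status above: two 64-CPU cells, two domains
	// (2 + 1 vCPUs) both on cell 0.
	cells := []Cell{{ID: 0, CPU: 64}, {ID: 1, CPU: 64}}
	domains := []Domain{
		{CPU: 2, CPUCells: []int{0}},
		{CPU: 1, CPUCells: []int{0}},
	}
	fmt.Println(freeCPUPerCell(cells, domains)) // cell 0: 61 free, cell 1: 64 free
}
```

A filter would then admit the host only if some cell (or combination of cells, for multi-cell guests) can satisfy the request.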

@PhilippMatthes (Member Author)

Will polish this a bit more tomorrow. I'm not happy yet with storing the domain infos in a list, which could explode on a bigger hypervisor.

Comment on lines +19 to +28
```
github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229101148-5c49ce751841 h1:CQTvuKSm1YnALv5gJP2NkX5/3gz6qludor89PJ1eibw=
github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229101148-5c49ce751841/go.mod h1:i/YQm59sAvilkgTFpKc+elMIf/KzkdimnXMd13P3V9s=
github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229103057-906a154e6429 h1:1E4S42PyC1fsCJ2kjJ2qu+Ryk2vc7C0D1IInDaZWJGU=
github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229103057-906a154e6429/go.mod h1:i/YQm59sAvilkgTFpKc+elMIf/KzkdimnXMd13P3V9s=
github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229104931-d99e352a3886 h1:Tqvuis23JJnTJMhtL1zo5dqlV6THlNzsS+IfDzWTsRg=
github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229104931-d99e352a3886/go.mod h1:i/YQm59sAvilkgTFpKc+elMIf/KzkdimnXMd13P3V9s=
github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229115749-52d9308090a6 h1:yjxe8xMx3T2ZR8Vq9NqH332xoUXFAGhzZu/MLD34j0Q=
github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229115749-52d9308090a6/go.mod h1:i/YQm59sAvilkgTFpKc+elMIf/KzkdimnXMd13P3V9s=
github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251230105055-37950dd7ff29 h1:2tPhnOy0tPv49xLuk1i/0mvPwOneWE+oK/yP8s4GKZY=
github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251230105055-37950dd7ff29/go.mod h1:i/YQm59sAvilkgTFpKc+elMIf/KzkdimnXMd13P3V9s=
```
Contributor
Can you run `go mod tidy`?

```diff
 totalCpus := resource.NewQuantity(0, resource.DecimalSI)
 for _, cell := range in.Host.Topology.CellSpec.Cells {
-	mem, err := cell.Memory.AsQuantity()
+	mem, err := util.MemoryToResource(cell.Memory.Value, cell.Memory.Unit)
```
Contributor
Just noticed, there is `quantity.ParseQuantity` in the k8s apimachinery package; doesn't it do the same already?
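For context: `ParseQuantity` (in `k8s.io/apimachinery/pkg/api/resource`) parses strings like `2032Mi` or `528110060Ki` directly. Below is a stdlib-only toy sketch of just the binary-suffix subset of that parsing, to illustrate the overlap; `parseBinaryQuantity` is a hypothetical helper, and the real `ParseQuantity` additionally handles decimal SI suffixes, signs, and exponent notation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBinaryQuantity converts strings such as "2032Mi" or "528110060Ki"
// into a byte count. It is a toy illustration of a small subset of what
// resource.ParseQuantity (k8s.io/apimachinery/pkg/api/resource) covers.
func parseBinaryQuantity(s string) (uint64, error) {
	suffixes := map[string]uint64{
		"Ki": 1 << 10, "Mi": 1 << 20, "Gi": 1 << 30, "Ti": 1 << 40,
	}
	for suf, mult := range suffixes {
		if strings.HasSuffix(s, suf) {
			n, err := strconv.ParseUint(strings.TrimSuffix(s, suf), 10, 64)
			if err != nil {
				return 0, err
			}
			return n * mult, nil
		}
	}
	// No recognized suffix: treat the whole string as plain bytes.
	return strconv.ParseUint(s, 10, 64)
}

func main() {
	b, err := parseBinaryQuantity("2032Mi")
	if err != nil {
		panic(err)
	}
	fmt.Println(b) // 2130706432 (2032 * 1024 * 1024)
}
```

Reusing the apimachinery parser would avoid maintaining this conversion logic by hand, which seems to be the reviewer's point.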
