Auto-discover additional fields for cortex filtering #40
Merging this branch changes the coverage (1 decrease, 5 increase)
Technically this is ready for review, but I want to make sure everything is implemented correctly and will check with an SSH-forwarded libvirt socket once I get an available hypervisor.
## Background

For virtual machines spawned on the KVM hypervisor, we no longer want to use nova and placement as the source of truth. Instead, filters should use the hypervisor CRD exposed by the [hypervisor operator](github.com/cobaltcore-dev/openstack-hypervisor-operator) and populated by the [node agent](https://github.com/cobaltcore-dev/kvm-node-agent). This contribution replaces the implementation of all filters that were originally ported from nova accordingly. Afterward, we can disable filters in nova one by one, moving the compute placement logic over to cortex.

> [!TIP]
> You can use the newly added [mirror tool](93fdcc0) to mirror hypervisor resources from our compute cluster over to the local cluster.

## Completion

- [x] ~internal/scheduling/decisions/nova/plugins/filters/filter_compute_capabilities.go~ (REMOVED)
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_capabilities.go (NEW)
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_correct_az.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_external_customer.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_has_accelerators.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_has_enough_capacity.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_has_requested_traits.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_host_instructions.go
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_maintenance.go (NEW)
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_packed_virtqueue.go
- [x] ~internal/scheduling/decisions/nova/plugins/filters/filter_project_aggregates.go~ (REMOVED)
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_allowed_projects.go (NEW)
- [x] ~internal/scheduling/decisions/nova/plugins/filters/filter_disabled.go~ (REMOVED)
- [x] internal/scheduling/decisions/nova/plugins/filters/filter_status_conditions.go (NEW)

## Dependencies

> [!NOTE]
> The scope of this PR is to establish a minimum viable scheduling pipeline with the current state. Extensive refactorings, for example of the filter for requested traits, are out of scope.

- Hypervisor operator PR: cobaltcore-dev/openstack-hypervisor-operator#217
- KVM node agent PR: cobaltcore-dev/kvm-node-agent#40
Tested result with real hypervisor:

    apiVersion: kvm.cloud.sap/v1
    kind: Hypervisor
    # ...
    status:
      capabilities:
        cpuArch: x86_64
        cpus: "128"
        hostTopology:
        - capacity:
            cpu: "64"
            memory: 528110060Ki
          id: 0
        - capacity:
            cpu: "64"
            memory: 528456396Ki
          id: 1
        memory: 1056566456Ki
      conditions:
      - lastTransitionTime: "2026-01-05T13:01:25Z"
        message: ""
        reason: DomainInfoClientGetSucceeded
        status: "True"
        type: DomainInfoClientConnection
      - lastTransitionTime: "2026-01-05T13:01:25Z"
        message: ""
        reason: DomainCapabilitiesClientGetSucceeded
        status: "True"
        type: DomainCapabilitiesClientConnection
      domainCapabilities:
        arch: x86_64
        hypervisorType: ch
        supportedCpuModes:
        - mode/host-passthrough
        supportedDevices:
        - video
        - video/none
        supportedFeatures: []
      domainInfos:
      - allocation:
          cpu: "2"
          memory: 2032Mi
        cpuCells:
        - 0
        memoryCells:
        - 0
        name: # omitted
        uuid: # omitted
      - allocation:
          cpu: "1"
          memory: 2032Mi
        cpuCells:
        - 0
        memoryCells:
        - 0
        name: # omitted
        uuid: # omitted
      # ...
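
To make the relationship between `hostTopology` and `domainInfos` in the status above more concrete, here is a minimal sketch of how free memory per NUMA cell could be derived from those two lists. The `Cell` and `Domain` types and the `freeMemoryPerCell` function are hypothetical and not part of the node agent or the CRD; the sketch also assumes that a domain's memory is split evenly across its listed `memoryCells`, which the source does not state.

```go
package main

import "fmt"

// Cell is a hypothetical, simplified view of one hostTopology entry.
type Cell struct {
	ID       int
	MemoryKi int64 // capacity in KiB
}

// Domain is a hypothetical, simplified view of one domainInfos entry.
type Domain struct {
	MemoryKi    int64 // allocated memory in KiB
	MemoryCells []int // NUMA cells the memory is placed on
}

// freeMemoryPerCell subtracts each domain's allocation from the cells it
// uses. Assumption: memory is split evenly across the listed cells.
func freeMemoryPerCell(cells []Cell, domains []Domain) map[int]int64 {
	free := make(map[int]int64, len(cells))
	for _, c := range cells {
		free[c.ID] = c.MemoryKi
	}
	for _, d := range domains {
		if len(d.MemoryCells) == 0 {
			continue // nothing to attribute
		}
		share := d.MemoryKi / int64(len(d.MemoryCells))
		for _, id := range d.MemoryCells {
			free[id] -= share
		}
	}
	return free
}

func main() {
	// Values taken from the status output above (2032Mi = 2032 * 1024 Ki).
	cells := []Cell{{ID: 0, MemoryKi: 528110060}, {ID: 1, MemoryKi: 528456396}}
	domains := []Domain{
		{MemoryKi: 2032 * 1024, MemoryCells: []int{0}},
		{MemoryKi: 2032 * 1024, MemoryCells: []int{0}},
	}
	free := freeMemoryPerCell(cells, domains)
	fmt.Println("cell 0 free Ki:", free[0])
	fmt.Println("cell 1 free Ki:", free[1])
}
```

Note that the two cell capacities in the output sum to the reported total (528110060Ki + 528456396Ki = 1056566456Ki).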
Will polish this a bit more tomorrow. I'm not happy yet with storing the domain infos in a list, which could explode on a bigger hypervisor.

    github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229101148-5c49ce751841 h1:CQTvuKSm1YnALv5gJP2NkX5/3gz6qludor89PJ1eibw=
    github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229101148-5c49ce751841/go.mod h1:i/YQm59sAvilkgTFpKc+elMIf/KzkdimnXMd13P3V9s=
    github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229103057-906a154e6429 h1:1E4S42PyC1fsCJ2kjJ2qu+Ryk2vc7C0D1IInDaZWJGU=
    github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229103057-906a154e6429/go.mod h1:i/YQm59sAvilkgTFpKc+elMIf/KzkdimnXMd13P3V9s=
    github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229104931-d99e352a3886 h1:Tqvuis23JJnTJMhtL1zo5dqlV6THlNzsS+IfDzWTsRg=
    github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229104931-d99e352a3886/go.mod h1:i/YQm59sAvilkgTFpKc+elMIf/KzkdimnXMd13P3V9s=
    github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229115749-52d9308090a6 h1:yjxe8xMx3T2ZR8Vq9NqH332xoUXFAGhzZu/MLD34j0Q=
    github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251229115749-52d9308090a6/go.mod h1:i/YQm59sAvilkgTFpKc+elMIf/KzkdimnXMd13P3V9s=
    github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251230105055-37950dd7ff29 h1:2tPhnOy0tPv49xLuk1i/0mvPwOneWE+oK/yP8s4GKZY=
    github.com/cobaltcore-dev/openstack-hypervisor-operator v0.0.0-20251230105055-37950dd7ff29/go.mod h1:i/YQm59sAvilkgTFpKc+elMIf/KzkdimnXMd13P3V9s=

Can you run `go mod tidy`?

    totalCpus := resource.NewQuantity(0, resource.DecimalSI)
    for _, cell := range in.Host.Topology.CellSpec.Cells {
        mem, err := cell.Memory.AsQuantity()
        mem, err := util.MemoryToResource(cell.Memory.Value, cell.Memory.Unit)

Just noticed, there is `resource.ParseQuantity` in the k8s apimachinery package (`k8s.io/apimachinery/pkg/api/resource`). Doesn't it do the same already?
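
To illustrate the question above, here is a minimal standard-library sketch of the conversion step that would still be needed before a parser such as `resource.ParseQuantity` could be used: libvirt reports memory as a value plus a unit (for example `KiB`), while Kubernetes quantity strings use suffixes such as `Ki`. The function `memoryToQuantityString` and its unit table are hypothetical, cover only a subset of binary units, and are not the `util.MemoryToResource` implementation from this PR.

```go
package main

import "fmt"

// k8sSuffix maps a subset of libvirt binary units to Kubernetes quantity
// suffixes. Assumption: only these four units need to be handled.
var k8sSuffix = map[string]string{
	"KiB": "Ki",
	"MiB": "Mi",
	"GiB": "Gi",
	"TiB": "Ti",
}

// memoryToQuantityString is a hypothetical helper that builds a quantity
// string such as "528110060Ki" from a libvirt-style value and unit.
func memoryToQuantityString(value uint64, unit string) (string, error) {
	suffix, ok := k8sSuffix[unit]
	if !ok {
		return "", fmt.Errorf("unsupported memory unit %q", unit)
	}
	return fmt.Sprintf("%d%s", value, suffix), nil
}

func main() {
	s, err := memoryToQuantityString(528110060, "KiB")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

The resulting string could then be handed to `resource.ParseQuantity`, so a dedicated helper would mainly be responsible for the unit mapping rather than for parsing.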
In this pull request, we implemented a cortex filtering pipeline for KVM. This pipeline uses the hypervisor CRD as the single source of truth to determine which hypervisors a VM can be scheduled on. To complete this implementation, we extended the hypervisor CRD in this pull request. The hypervisor CRD pull request added fields and removed outdated ones; the new fields need to be auto-discovered by the KVM node agent. The following fields are now populated:
Support filtering based on hypervisor type and other capabilities:
Capacity filtering:
(Bonus)
When done:
> [!NOTE]
> The scope of this PR is to establish a minimum viable scheduling pipeline in cortex, with the least amount of changes possible. Refactorings of the hypervisor crd spec can follow if needed.
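
As a rough illustration of the kind of filtering the description refers to (by hypervisor type and by capacity), the sketch below chains two filters over a list of candidate hypervisors. All types and function names here (`Hypervisor`, `Request`, `Filter`, `runPipeline`, and the two filter functions) are hypothetical and simplified; they do not reflect the actual cortex filter interfaces or the CRD schema.

```go
package main

import "fmt"

// Hypervisor is a hypothetical, flattened view of the CRD status fields
// a filter might look at.
type Hypervisor struct {
	Name           string
	HypervisorType string
	FreeCPUs       int64
	FreeMemoryKi   int64
}

// Request is a hypothetical VM placement request.
type Request struct {
	HypervisorType string // empty means "any type"
	CPUs           int64
	MemoryKi       int64
}

// Filter reports whether a hypervisor remains a candidate for the request.
type Filter func(h Hypervisor, r Request) bool

// hasHypervisorType keeps hosts whose type matches the requested one.
func hasHypervisorType(h Hypervisor, r Request) bool {
	return r.HypervisorType == "" || h.HypervisorType == r.HypervisorType
}

// hasEnoughCapacity keeps hosts with enough free CPU and memory.
func hasEnoughCapacity(h Hypervisor, r Request) bool {
	return h.FreeCPUs >= r.CPUs && h.FreeMemoryKi >= r.MemoryKi
}

// runPipeline returns the names of hosts that pass every filter.
func runPipeline(hosts []Hypervisor, r Request, filters ...Filter) []string {
	var names []string
	for _, h := range hosts {
		ok := true
		for _, f := range filters {
			if !f(h, r) {
				ok = false
				break
			}
		}
		if ok {
			names = append(names, h.Name)
		}
	}
	return names
}

func main() {
	// Example data; host names and the "qemu" type are made up.
	hosts := []Hypervisor{
		{Name: "node-a", HypervisorType: "ch", FreeCPUs: 4, FreeMemoryKi: 8 * 1024 * 1024},
		{Name: "node-b", HypervisorType: "qemu", FreeCPUs: 64, FreeMemoryKi: 64 * 1024 * 1024},
		{Name: "node-c", HypervisorType: "ch", FreeCPUs: 1, FreeMemoryKi: 8 * 1024 * 1024},
	}
	req := Request{HypervisorType: "ch", CPUs: 2, MemoryKi: 2032 * 1024}
	fmt.Println(runPipeline(hosts, req, hasHypervisorType, hasEnoughCapacity))
}
```

Keeping each check as a separate function mirrors the one-file-per-filter layout of the filter list, so individual filters can be enabled or disabled independently.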