fix(api): update InternalRBACRules SPIFFE identifiers to nico-* prefix #1907
Open
shayan1995 wants to merge 1 commit into
After the carbide → nico platform rename, all deployed services present
SPIFFE identifiers with the nico-* prefix, but InternalRBACRules in
crates/api/src/auth/internal_rbac_rules.rs still matched against hardcoded
carbide-* strings. Every internal service-to-api gRPC call failed mTLS
authorization with HTTP 403, silently breaking all service-to-service
communication.
Update each RulePrincipal → Principal::SpiffeServiceIdentifier mapping
(plus the corresponding test fixtures in the same file) to use the
nico-* prefix:
carbide-dns -> nico-dns
carbide-dhcp -> nico-dhcp
carbide-ssh-console -> nico-ssh-console
carbide-ssh-console-rs -> nico-ssh-console-rs
carbide-pxe -> nico-pxe
carbide-bmc-proxy -> nico-bmc-proxy
carbide-hardware-health -> nico-hardware-health
carbide-flow -> nico-flow
carbide-maintenance-jobs -> nico-maintenance-jobs
carbide-dsx-exchange-consumer -> nico-dsx-exchange-consumer
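The renames listed above could be kept in one place along these lines. This is a minimal sketch, not the contents of internal_rbac_rules.rs: the enum shapes, the ALL array, and the service_identifier and to_principal helpers are assumptions for illustration. The variant names follow the PR description, and pairing each variant with a service name in list order is also an assumption.

```rust
/// Hypothetical stand-in for the internal principals touched by this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RulePrincipal {
    Dns,
    Dhcp,
    Ssh,
    SshRs,
    Pxe,
    BmcProxy,
    Health,
    Flow,
    MaintenanceJobs,
    DsxExchangeConsumer,
}

/// Hypothetical stand-in for the authenticated caller identity.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Principal {
    SpiffeServiceIdentifier(String),
}

impl RulePrincipal {
    /// Every variant, so a test can iterate over all of them.
    pub const ALL: [RulePrincipal; 10] = [
        RulePrincipal::Dns,
        RulePrincipal::Dhcp,
        RulePrincipal::Ssh,
        RulePrincipal::SshRs,
        RulePrincipal::Pxe,
        RulePrincipal::BmcProxy,
        RulePrincipal::Health,
        RulePrincipal::Flow,
        RulePrincipal::MaintenanceJobs,
        RulePrincipal::DsxExchangeConsumer,
    ];

    /// Service name after the rename (assumed pairing, see lead-in).
    pub fn service_identifier(self) -> &'static str {
        match self {
            RulePrincipal::Dns => "nico-dns",
            RulePrincipal::Dhcp => "nico-dhcp",
            RulePrincipal::Ssh => "nico-ssh-console",
            RulePrincipal::SshRs => "nico-ssh-console-rs",
            RulePrincipal::Pxe => "nico-pxe",
            RulePrincipal::BmcProxy => "nico-bmc-proxy",
            RulePrincipal::Health => "nico-hardware-health",
            RulePrincipal::Flow => "nico-flow",
            RulePrincipal::MaintenanceJobs => "nico-maintenance-jobs",
            RulePrincipal::DsxExchangeConsumer => "nico-dsx-exchange-consumer",
        }
    }

    /// Builds the principal that an RBAC rule would be matched against.
    pub fn to_principal(self) -> Principal {
        Principal::SpiffeServiceIdentifier(self.service_identifier().to_string())
    }
}
```

Keeping the strings in a single exhaustive match means that adding a variant without a service name fails to compile, although it still does not tie the strings to what is actually deployed.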
Failure mode before this fix: inbound gRPC from e.g. nico-dns to nico-api
surfaced as
WARN auth::internal_rbac_rules — principal SpiffeServiceIdentifier("nico-dns")
not authorized for method LookupRecordLegacy — no matching rule
with no TLS-level error, masking the root cause. Impact spanned DNS
resolution, DHCP lease lookups, PXE GetCloudInitInstructions, SSH console
access, hardware health reporting, and maintenance job scheduling — every
internal principal that authenticates via SpiffeServiceIdentifier.
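The failure mode described above comes down to an exact string comparison. The sketch below illustrates that under stated assumptions: the Rules type, with_prefix, and is_authorized are hypothetical names, and the real rule table in internal_rbac_rules.rs may be structured differently. Only the method name LookupRecordLegacy and the two prefixes come from this PR.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the authenticated caller identity.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Principal {
    SpiffeServiceIdentifier(String),
}

/// Hypothetical rule table: gRPC method name -> principals allowed to call it.
pub struct Rules {
    allowed: HashMap<String, Vec<Principal>>,
}

impl Rules {
    /// Builds a one-rule table whose DNS principal uses the given prefix.
    pub fn with_prefix(prefix: &str) -> Self {
        let mut allowed = HashMap::new();
        allowed.insert(
            "LookupRecordLegacy".to_string(),
            vec![Principal::SpiffeServiceIdentifier(format!("{}-dns", prefix))],
        );
        Rules { allowed }
    }

    /// A call is authorized only if the exact principal appears in the rule.
    pub fn is_authorized(&self, method: &str, principal: &Principal) -> bool {
        self.allowed
            .get(method)
            .map_or(false, |principals| principals.contains(principal))
    }
}
```

With this shape, a caller presenting nico-dns is rejected by a table built with the carbide prefix and accepted by one built with the nico prefix. The TLS handshake is not involved in either outcome, which is consistent with the absence of a TLS-level error.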
Follow-up (not in this PR): these identifiers are stringly-typed with no
compile-time link to the actual deployed service names. Worth deriving
them from a shared constant or asserting consistency in an integration
test that round-trips each principal through cert subject + RBAC lookup.
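One possible shape for that follow-up is sketched here. The service_names module and the unmatched_identifiers helper are hypothetical and do not exist in the repository. A real integration test would take the deployed names from the Helm values or from certificate subjects rather than from the same constants.

```rust
/// Hypothetical shared constants that both the RBAC rules and the
/// deployment tooling could reference.
pub mod service_names {
    pub const DNS: &str = "nico-dns";
    pub const DHCP: &str = "nico-dhcp";
    pub const SSH_CONSOLE: &str = "nico-ssh-console";
    pub const SSH_CONSOLE_RS: &str = "nico-ssh-console-rs";
    pub const PXE: &str = "nico-pxe";
    pub const BMC_PROXY: &str = "nico-bmc-proxy";
    pub const HARDWARE_HEALTH: &str = "nico-hardware-health";
    pub const FLOW: &str = "nico-flow";
    pub const MAINTENANCE_JOBS: &str = "nico-maintenance-jobs";
    pub const DSX_EXCHANGE_CONSUMER: &str = "nico-dsx-exchange-consumer";

    pub const ALL: &[&str] = &[
        DNS,
        DHCP,
        SSH_CONSOLE,
        SSH_CONSOLE_RS,
        PXE,
        BMC_PROXY,
        HARDWARE_HEALTH,
        FLOW,
        MAINTENANCE_JOBS,
        DSX_EXCHANGE_CONSUMER,
    ];
}

/// Returns every rule identifier that matches no deployed service name.
/// An empty result means the rule strings and the deployment agree.
pub fn unmatched_identifiers<'a>(rule_ids: &[&'a str], deployed: &[&str]) -> Vec<&'a str> {
    rule_ids
        .iter()
        .copied()
        .filter(|id| !deployed.iter().any(|d| *d == *id))
        .collect()
}
```

A test could then assert that unmatched_identifiers returns an empty list for the identifiers used in the rules, so a stale carbide-* string would be reported by name instead of surfacing as a runtime authorization failure.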
Fixes NVIDIA#1891
Description
After the carbide → NICo rename, deployed services present nico-* SPIFFE identifiers, but InternalRBACRules in crates/api/src/auth/internal_rbac_rules.rs still matched against hardcoded carbide-* strings. Every internal service-to-api gRPC call failed mTLS authorization with HTTP 403.
Updates all 21 hardcoded carbide-* strings (production + test fixtures) to nico-* so that RulePrincipal::{Dns, Dhcp, Ssh, SshRs, Pxe, BmcProxy, Health, Flow, MaintenanceJobs, DsxExchangeConsumer} match the SPIFFE identifiers presented by deployed nico-* services.
Type of Change
Related Issues (Optional)
Fixes #1891
Breaking Changes
Testing
Verified that the deployed serviceNames in helm/charts/nico-*/values.yaml match the updated rule strings (nico-dns, nico-dhcp, nico-pxe, nico-bmc-proxy, nico-hardware-health, nico-ssh-console-rs, nico-dsx-exchange-consumer, nico-flow).
Additional Notes
These identifiers are stringly-typed with no compile-time link to the actual deployed service names. A follow-up should either derive them from a shared constant or add an integration test that asserts each RulePrincipal resolves to a SPIFFE identifier matching the cert subject of the corresponding deployed service.