Version
main
Describe the bug.
NOTE: Written with assistance from Claude Sonnet 4.6
Summary
After the platform rename from carbide to NICo, all deployed services present
SPIFFE identifiers using the nico-* prefix (e.g. nico-dns, nico-dhcp).
However, InternalRBACRules in crates/api/src/auth/internal_rbac_rules.rs
still matched against hardcoded carbide-* strings. Every internal service-to-api
gRPC call failed mTLS authorization with HTTP 403, silently breaking all
service-to-service communication.
Affected services
All internal principals that authenticate via SpiffeServiceIdentifier:
| Service | Old (broken) identifier | Expected identifier |
| --- | --- | --- |
| DNS | carbide-dns | nico-dns |
| DHCP | carbide-dhcp | nico-dhcp |
| SSH Console | carbide-ssh-console | nico-ssh-console |
| SSH Console RS | carbide-ssh-console-rs | nico-ssh-console-rs |
| PXE | carbide-pxe | nico-pxe |
| Hardware Health | carbide-hardware-health | nico-hardware-health |
| RLA / Flow | carbide-rla | nico-rla |
| Maintenance Jobs | carbide-maintenance-jobs | nico-maintenance-jobs |
| DSX Exchange Consumer | carbide-dsx-exchange-consumer | nico-dsx-exchange-consumer |
Root cause
InternalRBACRules::principal_to_rule_principal() maps each RulePrincipal
variant to a Principal::SpiffeServiceIdentifier string. These strings were
hardcoded at implementation time and never updated when services were renamed:
// crates/api/src/auth/internal_rbac_rules.rs
RulePrincipal::Dns => {
    Principal::SpiffeServiceIdentifier("carbide-dns".to_string()) // wrong after rename
}
When nico-api validates an inbound gRPC call from nico-dns, it resolves the
presented SPIFFE URI spiffe://forge.local/forge-system/nico-dns → extracts
service name nico-dns → compares against carbide-dns → no match → 403.
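The comparison described above can be sketched in a few lines. This is a minimal illustration and not the actual nico-api code: the helper names `service_name_from_spiffe_uri` and `is_authorized` are hypothetical, and it assumes the service name is simply the last path segment of the SPIFFE URI.

```rust
/// Hypothetical helper (not from the nico-api codebase): returns the last
/// path segment of a SPIFFE URI, assumed here to be the service name.
fn service_name_from_spiffe_uri(uri: &str) -> Option<&str> {
    // Reject anything that is not a spiffe:// URI.
    let rest = uri.strip_prefix("spiffe://")?;
    // Drop the trust domain (everything before the first '/').
    let (_trust_domain, path) = rest.split_once('/')?;
    // Take the last path segment and reject an empty one.
    path.rsplit('/').next().filter(|s| !s.is_empty())
}

/// Hypothetical check: an exact string comparison between the presented
/// service name and the name the rule expects.
fn is_authorized(presented_uri: &str, expected_service: &str) -> bool {
    service_name_from_spiffe_uri(presented_uri) == Some(expected_service)
}
```

With the URI from this report, `is_authorized("spiffe://forge.local/forge-system/nico-dns", "carbide-dns")` is false. The certificate is valid and the handshake succeeds, so the caller only sees the 403.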
Impact
- nico-dns → nico-api: LookupRecordLegacy denied; DNS resolution for provisioned hosts broken
- nico-dhcp → nico-api: DHCP lease lookups denied; host boot broken
- nico-pxe → nico-api: GetCloudInitInstructions denied; PXE boot broken
- nico-ssh-console / nico-ssh-console-rs → nico-api: SSH console access denied
- nico-hardware-health / nico-rla → nico-api: health reporting and maintenance scheduling denied
- All failures surface as 403 with no indication that the SPIFFE identifier is the cause: no certificate error, no TLS handshake failure
Fix
Update all carbide-* strings to nico-* in InternalRBACRules:
File: crates/api/src/auth/internal_rbac_rules.rs
Prevention
These identifiers are stringly-typed and have no compile-time link to the actual
service names. Consider:
- Deriving SPIFFE identifiers from a shared constant or config rather than duplicating strings in both the service cert configuration and InternalRBACRules
- Adding an integration test that verifies each RulePrincipal variant resolves to a SPIFFE identifier that matches the cert subject of the corresponding deployed service
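One possible shape for both suggestions is sketched below. Everything in it is an assumption made for illustration: the `SERVICE_PREFIX` constant, the three-variant `RulePrincipal` enum (the real enum has more variants and may be laid out differently), and the `check_against_deployed` helper, which stands in for an integration test that would read the names from the deployed certificates.

```rust
/// Hypothetical single source of truth for the platform prefix.
const SERVICE_PREFIX: &str = "nico";

/// Illustrative subset of principals; not the real enum definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RulePrincipal {
    Dns,
    Dhcp,
    Pxe,
}

impl RulePrincipal {
    /// Every variant, so a test can iterate over all of them.
    const ALL: [RulePrincipal; 3] =
        [RulePrincipal::Dns, RulePrincipal::Dhcp, RulePrincipal::Pxe];

    /// Service-specific suffix; the prefix is added in one place only.
    fn suffix(self) -> &'static str {
        match self {
            RulePrincipal::Dns => "dns",
            RulePrincipal::Dhcp => "dhcp",
            RulePrincipal::Pxe => "pxe",
        }
    }

    /// SPIFFE service name derived from the shared prefix.
    fn spiffe_service_name(self) -> String {
        format!("{}-{}", SERVICE_PREFIX, self.suffix())
    }
}

/// Returns an error naming the first principal whose derived identifier is
/// missing from the list of service names taken from the deployed certs.
fn check_against_deployed(deployed: &[&str]) -> Result<(), String> {
    for principal in RulePrincipal::ALL {
        let name = principal.spiffe_service_name();
        if !deployed.contains(&name.as_str()) {
            return Err(format!("{:?} expects {}, which is not deployed", principal, name));
        }
    }
    Ok(())
}
```

Under these assumptions a future rename becomes a one-line change to `SERVICE_PREFIX`. If the deployed names and the derived names ever diverge, `check_against_deployed` fails at test time rather than as a 403 in production.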
Minimum reproducible example
### Steps to reproduce
#### Live deployment
1. Deploy `nico-api` and `nico-dns` via `setup.sh`
2. Trigger any DNS lookup for a provisioned host (e.g. attempt a PXE boot). DNS on the underlying k8s host was also observed to break, because all DNS requests go through the `nico-dns` pod.
3. Observe in `nico-api` logs:
WARN auth::internal_rbac_rules — principal SpiffeServiceIdentifier("nico-dns") \
not authorized for method LookupRecordLegacy — no matching rule