CNF-22724: Add External Secrets Operator reference configuration#808
CNF-22724: Add External Secrets Operator reference configuration#808rdiscala wants to merge 1 commit into
Conversation
|
@rdiscala: This pull request references CNF-22724 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: rdiscala The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Enterprise Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR adds comprehensive External Secrets Operator integration to the telco-core and telco-hub environments. It registers ESO as an optional component in core-baseline.yaml, defines reference installation and configuration manifests across both environments, provides templated secret provisioning for cluster pull secrets and node BMC credentials, implements network policies, and includes example overlays with documentation. ChangesExternal Secrets Operator Integration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
78341ba to
2b1daf9
Compare
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@telco-core/configuration/core-baseline.yaml`:
- Around line 55-62: Add a core-side NetworkPolicy manifest and include it in
the baseline alongside the existing ESO manifests (the ones referenced as
esoNS.yaml / esoOperatorgroup.yaml / esoSubscription.yaml) so ESO operator pods
are constrained; create a new manifest (e.g.,
reference-crs/optional/external-secrets/eso-networkpolicy.yaml) that targets the
ESO operator namespace/pods (use namespaceSelector/podSelector matching the ESO
deployment labels) and define egress rules limited to approved Vault
endpoints/namespaces (allow egress to Vault service FQDNs/IPs and required ports
like 8200 and/or to a Vault namespace selector), then add a new - path:
reference-crs/optional/external-secrets/eso-networkpolicy.yaml entry in
core-baseline.yaml next to the existing ESO paths so the NetworkPolicy is
applied whenever ESO is enabled.
In
`@telco-core/configuration/reference-crs/optional/external-secrets/esoClusterSecretStore.yaml`:
- Around line 3-16: The ClusterSecretStore "ztp-secret-provider" currently uses
a shared long‑lived token (tokenSecretRef -> vault-token) and is cluster‑scoped,
which lets any ExternalSecret reference it and read arbitrary Vault paths; fix
by scoping credentials and adding guardrails: replace or supplement the single
ClusterSecretStore "ztp-secret-provider" with per‑cluster or per‑namespace
SecretStore resources (or create cluster stores per cluster name) that use
distinct tokenSecretRef secrets (not the shared vault-token), and/or restrict
the Vault provider config to limit allowed Vault path prefixes (so remoteRef.key
is constrained); additionally add RBAC/OPA/Gatekeeper policy to deny
ExternalSecret objects from referencing the global "ztp-secret-provider"
ClusterSecretStore (or require a specific label/annotation on allowed stores)
and update any templates that reference "ztp-secret-provider" to point to the
new scoped stores.
In `@telco-hub/configuration/kustomization.yaml`:
- Around line 13-14: Remove the example external-secrets overlay entry (the list
item "- example-overlays-config/external-secrets/") from the kustomization
resources so the example patches with placeholder Vault endpoints are not
applied by default; instead keep it as a commented example or document it in
README so users opt-in explicitly. Ensure the kustomization.resources array no
longer includes the external-secrets example entry and verify kustomize builds
without it.
In
`@telco-hub/configuration/reference-crs-kube-compare/optional/external-secrets/esoOperatorgroup.yaml`:
- Around line 1-8: OperatorGroup "openshift-external-secrets-operator" is
missing the sync-wave annotation so it can be created in the correct order; add
a metadata.annotations entry (e.g., "kluctl.io/sync-wave" or the cluster's
sync-wave key) with a value between the Namespace (-45) and Subscription (-40)
(for example "-42") under the OperatorGroup resource (metadata -> annotations)
to ensure it is applied between namespace and subscription creation.
In
`@telco-hub/configuration/reference-crs-kube-compare/optional/external-secrets/esoSubscription.yaml`:
- Line 14: The hub ESO subscription in esoSubscription.yaml currently sets
installPlanApproval: Automatic which conflicts with the core ESO subscriptions
that use installPlanApproval: Manual; either change the hub subscription's
installPlanApproval value in the telco-hub ESO subscription resource (the
installPlanApproval field in esoSubscription.yaml / the Subscription resource)
to Manual to match core, or add a short ESO-specific justification adjacent to
the hub subscription (or in the ESO README) explicitly stating the security
rationale for allowing Automatic upgrades (include who approves, risk
mitigations, and conditions), and ensure the change references the same
Subscription resource name used in the file so reviewers can verify consistency.
In
`@telco-hub/configuration/reference-crs/optional/external-secrets/esoNetworkPolicy.yaml`:
- Around line 13-22: The NetworkPolicy currently only allows egress to Vault on
TCP/8200 so DNS queries are blocked and hostname resolution for Vault fails; add
an additional egress rule alongside the existing egress entry that permits DNS
traffic (port 53, both UDP and TCP) to your cluster DNS (e.g. target the
kube-system namespace via namespaceSelector matching
kubernetes.io/metadata.name: kube-system or the appropriate DNS service IP
range) so that functions relying on DNS name resolution (the existing
policyTypes/egress and egress->to/namespaceSelector/ports entries) can resolve
Vault hostnames.
In
`@telco-hub/configuration/reference-crs/optional/external-secrets/esoSubscription.yaml`:
- Line 14: The install plan approval is set to Automatic which permits
unattended ESO operator upgrades; change the installPlanApproval field from
Automatic to Manual to require manual approval of operator InstallPlans so
upgrades from stable-v1 are controlled and vetted (update the
installPlanApproval value in the ESO subscription manifest where the
installPlanApproval key is defined).
In `@telco-hub/configuration/reference-crs/optional/external-secrets/README.md`:
- Around line 23-26: Mismatch: the oc create secret command creates vault-token
in external-secrets-operator but the ESO manifests in this PR expect the secret
in external-secrets; update the README.md example so the vault-token secret is
created in the external-secrets namespace (use -n external-secrets) or,
alternatively, adjust the ESO manifests to reference external-secrets-operator
consistently—ensure the secret name vault-token and the target namespace
referenced by the ESO resources match exactly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: d9d0237e-0787-4705-9af4-ce86942b7709
📒 Files selected for processing (36)
telco-core/configuration/core-baseline.yamltelco-core/configuration/reference-crs-kube-compare/metadata.yamltelco-core/configuration/reference-crs-kube-compare/optional/external-secrets/esoClusterSecretStore.yamltelco-core/configuration/reference-crs-kube-compare/optional/external-secrets/esoExternalSecretsConfig.yamltelco-core/configuration/reference-crs-kube-compare/optional/external-secrets/esoNS.yamltelco-core/configuration/reference-crs-kube-compare/optional/external-secrets/esoOperatorgroup.yamltelco-core/configuration/reference-crs-kube-compare/optional/external-secrets/esoSubscription.yamltelco-core/configuration/reference-crs/optional/external-secrets/esoClusterSecretStore.yamltelco-core/configuration/reference-crs/optional/external-secrets/esoExternalSecretsConfig.yamltelco-core/configuration/reference-crs/optional/external-secrets/esoNS.yamltelco-core/configuration/reference-crs/optional/external-secrets/esoOperatorgroup.yamltelco-core/configuration/reference-crs/optional/external-secrets/esoSubscription.yamltelco-hub/configuration/example-overlays-config/external-secrets/cluster-secret-store-patch.yamltelco-hub/configuration/example-overlays-config/external-secrets/kustomization.yamltelco-hub/configuration/example-overlays-config/external-secrets/network-policy-patch.yamltelco-hub/configuration/kustomization.yamltelco-hub/configuration/reference-crs-kube-compare/metadata.yamltelco-hub/configuration/reference-crs-kube-compare/optional/external-secrets/esoClusterSecretStore.yamltelco-hub/configuration/reference-crs-kube-compare/optional/external-secrets/esoClusterTemplates.yamltelco-hub/configuration/reference-crs-kube-compare/optional/external-secrets/esoExternalSecretsConfig.yamltelco-hub/configuration/reference-crs-kube-compare/optional/external-secrets/esoNS.yamltelco-hub/configuration/reference-crs-kube-compare/optional/external-secrets/esoNetworkPolicy.yamltelco-hub/configuration/reference-crs-kube-compare/optional/external-secrets/esoNodeTemplates.yamltelco-hub/configuration/reference-crs-kube-compare/optional/external-secrets/esoOperatorgroup.yamltelco-hub/configuration/reference-crs-kube-compare/optional/external-secrets/esoSubscription.yamltelco-hub/configuration/reference-crs/kustomization.yamltelco-hub/configuration/reference-crs/optional/external-secrets/README.mdtelco-hub/configuration/reference-crs/optional/external-secrets/esoClusterSecretStore.yamltelco-hub/configuration/reference-crs/optional/external-secrets/esoClusterTemplates.yamltelco-hub/configuration/reference-crs/optional/external-secrets/esoExternalSecretsConfig.yamltelco-hub/configuration/reference-crs/optional/external-secrets/esoNS.yamltelco-hub/configuration/reference-crs/optional/external-secrets/esoNetworkPolicy.yamltelco-hub/configuration/reference-crs/optional/external-secrets/esoNodeTemplates.yamltelco-hub/configuration/reference-crs/optional/external-secrets/esoOperatorgroup.yamltelco-hub/configuration/reference-crs/optional/external-secrets/esoSubscription.yamltelco-hub/configuration/reference-crs/optional/external-secrets/kustomization.yaml
| name: openshift-external-secrets-operator | ||
| source: redhat-operators-disconnected | ||
| sourceNamespace: openshift-marketplace | ||
| installPlanApproval: Automatic |
There was a problem hiding this comment.
Use manual install plan approval to prevent unvetted operator drift.
Line 14 sets installPlanApproval: Automatic, which allows unattended ESO upgrades from stable-v1. In this stack, that increases rollout risk and can introduce unvalidated operator/operand behavior changes. Align this with controlled promotion (Manual) for safer hub lifecycle management.
Suggested patch
- installPlanApproval: Automatic
+ installPlanApproval: ManualAs per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| installPlanApproval: Automatic | |
| installPlanApproval: Manual |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@telco-hub/configuration/reference-crs/optional/external-secrets/esoSubscription.yaml`
at line 14, The install plan approval is set to Automatic which permits
unattended ESO operator upgrades; change the installPlanApproval field from
Automatic to Manual to require manual approval of operator InstallPlans so
upgrades from stable-v1 are controlled and vetted (update the
installPlanApproval value in the ESO subscription manifest where the
installPlanApproval key is defined).
Source: Coding guidelines
There was a problem hiding this comment.
I think installPlanApproval should remain Automatic as ArgoCD is the gate keeper, owning the life cycle of the operator. @imiller0 thoughts?
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
a433a86 to
1dfafb4
Compare
| annotations: | ||
| workload.openshift.io/allowed: management |
There was a problem hiding this comment.
Can you confirm if the pods in ESO are annotated for workload partitioning? As part of the platform they should be but this is a new operator for Telco use case and likely they have not been added yet. The annotation is:
target.workload.openshift.io/management: '{"effect":"PreferredDuringScheduling"}'
There was a problem hiding this comment.
@imiller0 they are not annotated. Shall I open a request on Jira to the upstream owners?
| kind: OperatorGroup | ||
| metadata: | ||
| annotations: | ||
| olm.operatorframework.io/bundle-install-timeout: "10m" |
There was a problem hiding this comment.
I don't see this annotation defined in OLM, is it valid?
There was a problem hiding this comment.
Good catch. This was based on an invalid annotation that was already present and fixed in 0eb3ede. I've amended the annotation.
| operator: Exists | ||
| provider: | ||
| vault: | ||
| server: "https://vault.example.com:8200" |
There was a problem hiding this comment.
At a minimum this field needs to be templated to allow user variation here
There was a problem hiding this comment.
Good point. Fixed.
| kind: OperatorGroup | ||
| metadata: | ||
| annotations: | ||
| olm.operatorframework.io/bundle-install-timeout: "10m" |
There was a problem hiding this comment.
Same here and other instances. I don't think this is a valid annotation.
| annotations: | ||
| siteconfig.open-cluster-management.io/sync-wave: "1" | ||
| spec: | ||
| refreshPolicy: CreatedOnce |
There was a problem hiding this comment.
This would imply that any change to the bmc credentials (or pull secret) would need some procedure by the user in the event that a node/cluster needs to be redeployed. What would that procedure look like? Delete the secret and it would be re-created? Actually I think if the BMH (handled below) is deleted that ACM will delete the secret anyway. Can you verify and add a note here if that is the case. If not be sure the documentation (README in this repo and official docs) for this feature includes a description of how to handle this case.
There was a problem hiding this comment.
Interesting point. After thinking about it, I believe OnChange is a better fit. It supports on-demand refresh using an annotation (documented as a comment). Caveat: it does not support automatic refresh when the secret changes in the vault. I've also evaluated Periodic but discarded the idea since 1) a refresh based on a periodic check induces potential downtime when the secret is a credential, 2) a potential thundering herd problem causing a DOS on the external vault in very large clusters, requiring a careful approach in staggering the update interval. OnChange is simpler and safer as a general recommendation. I've made the change and added the update-via-annotation as a scenario in our test suite.
| data: | ||
| - secretKey: .dockerconfigjson | ||
| remoteRef: | ||
| key: clusters/{{ "{{ .Spec.ClusterName }}" }}/pull-secret |
There was a problem hiding this comment.
This is essentially a "contract" that we are creating which the vault needs to conform to. This needs to be clearly noted in the RDS documentation (thank you for noting clearly in the readme here!)
There was a problem hiding this comment.
OK. I've clarified this further in the documentation. I've also added a section for it in the changes to the reference-design-specifications project, which I will soon push.
| kind: NetworkPolicy | ||
| metadata: | ||
| name: allow-secret-store-egress | ||
| namespace: external-secrets |
There was a problem hiding this comment.
Is this supposed to be external-secrets-operator? Here and other instances of esoNetworkPolicy
7abf27f to
c82a74d
Compare
Add ESO as an optional operator for the telco-hub and telco-core RDS, providing secure secret management via external stores (Vault example). Hub configuration includes: - Namespace, OperatorGroup, Subscription, ExternalSecretsConfig, ClusterSecretStore (ztp-secret-provider) - ClusterInstance template ConfigMaps for pull secret and BMC credentials ExternalSecrets - kube-compare metadata (allOrNoneOf, optional) - example-overlays-config with ClusterSecretStore patch - README documenting usage, vault path conventions, and templateRefs integration Core configuration includes ESO in the PolicyGenerator baseline (included by default, not commented out). Co-authored-by: Claude
c82a74d to
159d27c
Compare
Add ESO as an optional operator for the telco-hub and telco-core RDS, providing secure secret management via external stores (Vault example).
Hub configuration includes:
Core configuration includes ESO in the PolicyGenerator baseline (included by default, not commented out).
Co-authored-by: Claude