CNF-14592: add LokiStack configuration CRs to the optional reference section#756
CNF-14592: add LokiStack configuration CRs to the optional reference section#756rdiscala wants to merge 2 commits into
Conversation
|
@rdiscala: This pull request references CNF-14592 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds optional OpenShift Loki operator manifests, S3 secret template, LokiStack CR, ClusterLogForwarder that routes audit/infrastructure logs to Loki, and commented policy integration references. ChangesLoki Logging Stack Setup
🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@telco-core/configuration/reference-crs/optional/logging/LokiStack.yaml`:
- Around line 22-23: Replace the placeholder in the LokiStack.yaml manifest by
setting the effectiveDate field (currently "effectiveDate:
\"<yyyy>-<mm>-<dd>\"") to a real date in YYYY-MM-DD format (approximately two
months before your deployment date); ensure the value is a valid date string and
preserved under the same key so the version/effectiveDate schema remains intact.
- Line 27: Replace the hardcoded storageClassName value under the persistent
volume spec (storageClassName: "ocs-storagecluster-ceph-rbd") with a
configurable placeholder (e.g. <storageClassName>) and add a comment next to the
placeholder indicating it must be set to the cluster's available StorageClass
(or document that OCS is required) so deployments are portable; update any
references to storageClassName in the LokiStack.yaml template to consume the
placeholder when rendering/templating.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: d8e5c20e-b228-446f-a3ff-8f2c0e724800
📒 Files selected for processing (8)
telco-core/configuration/reference-crs/optional/logging/ClusterLogForwarder-to_amend.yamltelco-core/configuration/reference-crs/optional/logging/ClusterLogForwarder.yamltelco-core/configuration/reference-crs/optional/logging/LokiOperatorGroup.yamltelco-core/configuration/reference-crs/optional/logging/LokiOperatorNS.yamltelco-core/configuration/reference-crs/optional/logging/LokiOperatorStatus.yamltelco-core/configuration/reference-crs/optional/logging/LokiSecret.yamltelco-core/configuration/reference-crs/optional/logging/LokiStack.yamltelco-core/configuration/reference-crs/optional/logging/LokiSubscription.yaml
|
Any plan to update also:
|
7caea0d to
4940da0
Compare
|
@yprokule good point. I've updated the files. Let me know if there is something else that needs to be amended. |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
telco-core/configuration/reference-crs/optional/logging/ClusterLogForwarder-LokiStack.yaml (1)
42-45: ⚡ Quick winRemove the commented-out status section.
Including a
statussection in a manifest template is confusing since status fields are managed by the controller, not user-configured. This commented block serves no functional or documentation purpose and should be removed.♻️ Proposed fix to remove status section
serviceAccount: name: collector -status: - `#conditions`: - `#-` type: observability.openshift.io/Valid - # status: "True"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@telco-core/configuration/reference-crs/optional/logging/ClusterLogForwarder-LokiStack.yaml` around lines 42 - 45, Remove the commented-out status block from the manifest: delete the commented lines starting with "status:" and the following commented "#conditions:" and "#- type: observability.openshift.io/Valid" / "# status: \"True\"" entries in ClusterLogForwarder-LokiStack.yaml so the template contains no status section (status is controller-managed and should not appear, even commented).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In
`@telco-core/configuration/reference-crs/optional/logging/ClusterLogForwarder-LokiStack.yaml`:
- Around line 42-45: Remove the commented-out status block from the manifest:
delete the commented lines starting with "status:" and the following commented
"#conditions:" and "#- type: observability.openshift.io/Valid" / "# status:
\"True\"" entries in ClusterLogForwarder-LokiStack.yaml so the template contains
no status section (status is controller-managed and should not appear, even
commented).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: e9ae3f63-c476-478c-925c-d2f099d24f31
📒 Files selected for processing (9)
telco-core/configuration/core-baseline.yamltelco-core/configuration/core-overlay.yamltelco-core/configuration/reference-crs/optional/logging/ClusterLogForwarder-LokiStack.yamltelco-core/configuration/reference-crs/optional/logging/LokiOperatorGroup.yamltelco-core/configuration/reference-crs/optional/logging/LokiOperatorNS.yamltelco-core/configuration/reference-crs/optional/logging/LokiOperatorStatus.yamltelco-core/configuration/reference-crs/optional/logging/LokiSecret.yamltelco-core/configuration/reference-crs/optional/logging/LokiStack.yamltelco-core/configuration/reference-crs/optional/logging/LokiSubscription.yaml
✅ Files skipped from review due to trivial changes (5)
- telco-core/configuration/reference-crs/optional/logging/LokiOperatorNS.yaml
- telco-core/configuration/core-baseline.yaml
- telco-core/configuration/reference-crs/optional/logging/LokiSecret.yaml
- telco-core/configuration/reference-crs/optional/logging/LokiSubscription.yaml
- telco-core/configuration/reference-crs/optional/logging/LokiOperatorStatus.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- telco-core/configuration/reference-crs/optional/logging/LokiOperatorGroup.yaml
- telco-core/configuration/reference-crs/optional/logging/LokiStack.yaml
| outputRefs: | ||
| - kafka-open | ||
|
|
||
| # Cluster Logging with LokiStack (optional - use instead of the above ClusterLogForwarder) |
There was a problem hiding this comment.
should it be used instead of or in addition to ? My understanding is that in addition to sending logs to central location store some of them locally for easier/faster troubleshooting so existing ClusterLogForwarder CR should be updated with this use case.
There was a problem hiding this comment.
Good catch. I've amended the configuration so that the LokiStack configuration is part of ClusterLogForwarder.
4940da0 to
b7d0ccc
Compare
|
/assign @imiller0 |
| kind: OperatorGroup | ||
| metadata: | ||
| name: loki-operator | ||
| namespace: openshift-operators-redhat |
There was a problem hiding this comment.
Needs a standard annotation for olm bundle unpack retry:
operatorframework.io/bundle-unpack-min-retry-interval: 10m
| metadata: | ||
| name: loki-operator.openshift-operators-redhat | ||
| annotations: | ||
| ran.openshift.io/ztp-deploy-wave: "2" |
There was a problem hiding this comment.
No ztp-deploy-wave annotations in Core (we will be removing them from RAN eventually as well)
| kind: OperatorGroup | ||
| metadata: | ||
| name: loki-operator | ||
| namespace: openshift-operators-redhat |
There was a problem hiding this comment.
Does Loki not have a namespace of its own for installation? If this is the documented reference from monitoring team it is fine, just want to confirm
There was a problem hiding this comment.
Good question. I've followed the Red Hat OpenShift Logging documentation (6.2 and 6.5) for which states:
name: openshift-operators-redhat: Use this namespace for Red Hat Operators. Avoid using openshift-operators, which might contain untrusted community Operators and cause metric conflicts.
The netobserv-loki namespace is used for network observability, but our use case is local log storage, so openshift-operators-redhat seems the right choice for me.
| namespace: openshift-logging | ||
| stringData: | ||
| access_key_id: "<access key>" | ||
| access_key_secret: "<access key secret>" |
There was a problem hiding this comment.
Add note that secrets should be managed via secure method such as External Secrets Operator
| storage: | ||
| schemas: | ||
| - version: v13 | ||
| effectiveDate: "<yyyy>-<mm>-<dd>" # Set to approximately two months before deployment date |
There was a problem hiding this comment.
What does this control and what are the consequences of setting it incorrectly? Is it sufficient to place an arbitrary date here (eg ocp release date)? or does this truly need to be set by the user?
There was a problem hiding this comment.
This parameter controls the date from when this version of the schema will become active. Setting it incorrectly (e.g. in the future, or too recently in the past) may result in Loki refusing to ingest logs. It also cannot be changed retroactively.
I've added some guidance as a comment.
There was a problem hiding this comment.
BTW, 2 months was suggested in the proposed CRs. It's mentioned in third-party guides. Maybe it's too far in the past. 2 weeks could be sufficient.
| secret: | ||
| name: logging-loki-s3 | ||
| type: s3 | ||
| storageClassName: "ocs-storagecluster-ceph-rbd" # Set to the cluster's available StorageClass |
There was a problem hiding this comment.
How does this relate to the storage config in the Secret? Are these the same endpoint, or completely independent or ??
There was a problem hiding this comment.
These are two independent entries for two different types and uses of storage. The secret is used to access an S3-compatible long-term storage for compressed log files. The storage class is used instead for short-term buffering (for performance and recovery in case of crash).
Added comments that clarify their use.
There was a problem hiding this comment.
Within RDS Core storageClassed are created later in the process (ODF configuration is done in the core-overlay.yaml file file LokiStack.yaml is referenced in the core-baseline.yaml, which means that this referenced storageClass must be created outside? of the RDS policies? Or shall we recommend using SC from the ODF that's configured by the RDS Core manifests and re-shuffle configuration of LockStack resource?
@imiller0 thoughts?
| source: redhat-operators-disconnected # For disconnected environments. | ||
| sourceNamespace: openshift-marketplace | ||
| status: | ||
| # state: AtLatestKnown # Expected status |
There was a problem hiding this comment.
Should include this line (uncomment) so that the operator status is verified prior to showing compliant status on the policy.
| # - name: loki-local | ||
| # type: lokiStack |
There was a problem hiding this comment.
If I understand correctly, in this form the reference will have 2 cluster log forwarder instances running simultaneously. One sending to loki locally and the other sending via kafka to a remote endpoint. Both are scraping audit and infrastructure logs.
Would it be more optimal to have a single ClusterLogForwarder with options for either loki output or kafka or both?
Is there an advantage/disadvantage to either approach?
There was a problem hiding this comment.
There's only one ClusterLogForwarder named instance in openshift-logging. It is patched by PolicyGenerator to add multiple outputs for Kafka and Loki. The latter is provided here in commented form, as it is optional. When it is uncommented, there is still one ClusterLogForwarder, but with the aforementioned two outputs.
The ClusterLogForwarder-LokiStack.yaml manifest is provided here only as an example.
There are no real advantages to having two ClusterLogForwarders, except perhaps redundancy and potential competition between the two outputs for the resources of the single instance.
|
/retest |
1 similar comment
|
/retest |
|
Can you fix lintchecks please. It's failing the check of cluster-compare vs reference CRs. |
b7d0ccc to
1c235e1
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: rdiscala The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
1c235e1 to
5130bbe
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@telco-core/configuration/reference-crs/optional/logging/LokiOperatorStatus.yaml`:
- Around line 10-11: Remove (comment out) the standalone "status:" key in the
example CR so it does not render as status: null; specifically edit the
LokiOperatorStatus.yaml example and comment out the entire status: line (and
leave the nested components commented/unset) so the status field is absent
unless explicitly populated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: 697a0716-e83e-4a70-858b-9131670e576b
📒 Files selected for processing (12)
telco-core/configuration/core-baseline.yamltelco-core/configuration/core-overlay.yamltelco-core/configuration/reference-crs-kube-compare/compare_ignoretelco-core/configuration/reference-crs/optional/logging/ClusterLogForwarder-LokiStack.yamltelco-core/configuration/reference-crs/optional/logging/LokiOperatorGroup.yamltelco-core/configuration/reference-crs/optional/logging/LokiOperatorNS.yamltelco-core/configuration/reference-crs/optional/logging/LokiOperatorStatus.yamltelco-core/configuration/reference-crs/optional/logging/LokiSecret.yamltelco-core/configuration/reference-crs/optional/logging/LokiStack.yamltelco-core/configuration/reference-crs/optional/logging/LokiSubscription.yamltelco-core/install/extra-manifests/Network.yamltelco-hub/install/openshift/openshift/Network.yaml
✅ Files skipped from review due to trivial changes (5)
- telco-core/configuration/reference-crs-kube-compare/compare_ignore
- telco-core/install/extra-manifests/Network.yaml
- telco-core/configuration/core-baseline.yaml
- telco-hub/install/openshift/openshift/Network.yaml
- telco-core/configuration/core-overlay.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
- telco-core/configuration/reference-crs/optional/logging/LokiOperatorGroup.yaml
- telco-core/configuration/reference-crs/optional/logging/LokiSecret.yaml
- telco-core/configuration/reference-crs/optional/logging/LokiStack.yaml
- telco-core/configuration/reference-crs/optional/logging/ClusterLogForwarder-LokiStack.yaml
- telco-core/configuration/reference-crs/optional/logging/LokiOperatorNS.yaml
- telco-core/configuration/reference-crs/optional/logging/LokiSubscription.yaml
53bf731 to
e240341
Compare
There was a problem hiding this comment.
is this one needed? why not to update existing telco-core/configuration/reference-crs/optional/logging/ClusterLogForwarder.yaml
There was a problem hiding this comment.
Valid point. I provided a specific reference implementation for LokiStack as an alternative to the standard configuration. It might be better to merge it into the standard one, while keeping the Kafka part commented. I'll ask @imiller0 to weigh in on this one.
| # status: | ||
| # components: | ||
| # refs: | ||
| # - kind: Subscription | ||
| # namespace: openshift-operators-redhat | ||
| # conditions: | ||
| # - type: CatalogSourcesUnhealthy | ||
| # status: "False" | ||
| # - kind: InstallPlan | ||
| # namespace: openshift-operators-redhat | ||
| # conditions: | ||
| # - type: Installed | ||
| # status: "True" | ||
| # - kind: ClusterServiceVersion | ||
| # namespace: openshift-operators-redhat | ||
| # conditions: | ||
| # - type: Succeeded | ||
| # status: "True" | ||
| # reason: InstallSucceeded |
There was a problem hiding this comment.
this should be uncommented, correct?
There was a problem hiding this comment.
You are right. It should be uncommented. I'll fix that.
There was a problem hiding this comment.
this is unrelated change so no need to merge it with logging CRs
01132d1 to
6edbba3
Compare
|
@yprokule thanks for your review. I've addressed your comments. |
|
Hello @yprokule. I've addressed your comments. Would you mind validating that the PR is good to go? |
| # components: | ||
| # refs: | ||
| # - kind: Subscription | ||
| # namespace: openshift-operators-redhat | ||
| # conditions: | ||
| # - type: CatalogSourcesUnhealthy | ||
| # status: "False" | ||
| # - kind: InstallPlan | ||
| # namespace: openshift-operators-redhat | ||
| # conditions: | ||
| # - type: Installed | ||
| # status: "True" | ||
| # - kind: ClusterServiceVersion | ||
| # namespace: openshift-operators-redhat | ||
| # conditions: | ||
| # - type: Succeeded | ||
| # status: "True" | ||
| # reason: InstallSucceeded |
There was a problem hiding this comment.
shouldn't this part be uncommented?
There was a problem hiding this comment.
I left the status commented out as an example. This is consistent with the already existing ClusterLogOperatorStatus.yaml added by @imiller0 . If you'd prefer status to be uncommented, I'd need to make the change to the other file as well, for consistency. What do you think?
6edbba3 to
1c3c30e
Compare
Add Loki Operator installation and LokiStack log storage CRs for the telco-core optional logging configuration: - LokiStack CR with 1x.extra-small sizing, S3 storage, v13 schema, and 5-day retention - Loki Operator Namespace, OperatorGroup, Subscription, and installation status verification CR - S3 object storage Secret template - Update ClusterLogForwarder to enable local Loki storage by default with audit and infrastructure log forwarding, plus commented examples for external system integration (Kafka, Splunk, etc.) - Update core-overlay.yaml PolicyGenerator to demonstrate adding external forwarding to the base Loki configuration - Add kube-compare reference templates for validation This implements "optional but enabled by default" pattern: CRs are in the optional/ folder and commented in PolicyGenerators, but when enabled provide a complete working Loki configuration that supports both local CNF vendor troubleshooting and external OSS/BSS forwarding. Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
1c3c30e to
182ff45
Compare
Add Loki Operator installation and LokiStack log storage CRs for the telco-core optional logging configuration:
Co-authored-by: Claude