Improve logging in VolumeReplicationGroup controller #2318
Conversation
asn1809 left a comment
Nice, this helps make the logs easier to read.
    v.log.Info("Separate PVCs using only StorageClass provisioner")
} else {
    v.log.Info("Separate PVCs using peerClasses")
}
This looks like a debugging log about how we handle the PVCs. Move it to debug level?
Done
We hardly use debug level, and adding logs at that level is not my choice. This instance can be root-caused by inspecting the VRG resource and looking at the peerClass elements, so I would remove this log.
Removed
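For context, a minimal sketch of what a debug-level line would have looked like with logr verbosity levels, assuming v.log is a logr.Logger as elsewhere in the controller; stdr is only a stand-in backend so the snippet runs on its own, and the controller would keep its existing logger:

package main

import (
	"log"
	"os"

	"github.com/go-logr/stdr"
)

func main() {
	// Stand-in logger; the controller would keep using its existing logger.
	logger := stdr.New(log.New(os.Stderr, "", log.LstdFlags))

	// Visible by default.
	logger.Info("Separate PVCs using peerClasses")

	// Hidden unless verbosity is raised, e.g. via stdr.SetVerbosity(1) here,
	// or a higher log level in a zap-backed controller setup.
	logger.V(1).Info("Separate PVCs using only StorageClass provisioner")
}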
    return nil, fmt.Errorf("no %s found to match provisioner and schedule", objType)
case 1:
    v.log.Info(fmt.Sprintf("Found %s that matches provisioner and schedule %s/%s", objType,
        storageClass.Provisioner, v.instance.Spec.Async.SchedulingInterval))
Do we log the result of this search later?
Logging what we found once makes sense. If we don't log it elsewhere, this may be a debug log that helps in understanding what the system is doing.
@nirs the problem here is that we are logging it for each PVC, and then we see the following (all logs for only one VRG with 10 PVCs during one reconcile request):
2025-10-29T11:37:36.113Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24"}
2025-10-29T11:37:36.113Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24"}
2025-10-29T11:37:36.113Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24"}
2025-10-29T11:37:36.113Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24"}
2025-10-29T11:37:36.113Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24"}
2025-10-29T11:37:36.113Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24"}
2025-10-29T11:37:36.113Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24"}
2025-10-29T11:37:36.113Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24"}
2025-10-29T11:37:36.113Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24"}
2025-10-29T11:37:36.113Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24"}
2025-10-29T11:37:36.206Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24", "State": "primary"}
2025-10-29T11:37:36.206Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24", "State": "primary"}
2025-10-29T11:37:36.206Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24", "State": "primary"}
2025-10-29T11:37:36.206Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24", "State": "primary"}
2025-10-29T11:37:36.206Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24", "State": "primary"}
2025-10-29T11:37:36.206Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24", "State": "primary"}
2025-10-29T11:37:36.206Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24", "State": "primary"}
2025-10-29T11:37:36.206Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24", "State": "primary"}
2025-10-29T11:37:36.206Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24", "State": "primary"}
2025-10-29T11:37:36.206Z INFO vrg controller/vrg_volrep.go:1451 Found VolumeGroupReplicationClass that matches provisioner and schedule openshift-storage.rbd.csi.ceph.com/5m {"vrg": {"name":"cg-app-1","namespace":"openshift-dr-ops"}, "rid": "31b5bd24", "State": "primary"}
OK, so the per-PVC logs do not add any value; we can remove them.
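As a minimal sketch of the direction agreed here (with hypothetical names, not Ramen's actual helpers): log the matched class once per reconcile and keep the per-PVC loop quiet about it.

package main

import (
	"log"
	"os"

	"github.com/go-logr/logr"
	"github.com/go-logr/stdr"
)

// pvc is a stand-in for the PVCs handled in a single reconcile.
type pvc struct{ name string }

// protectPVCs logs the matched class once, instead of once per PVC.
func protectPVCs(logger logr.Logger, className, provisioner, schedule string, pvcs []pvc) {
	logger.Info("Found VolumeGroupReplicationClass that matches provisioner and schedule",
		"class", className, "provisioner", provisioner, "schedule", schedule)

	for _, p := range pvcs {
		_ = p // per-PVC work goes here, without repeating the "found class" log
	}
}

func main() {
	logger := stdr.New(log.New(os.Stderr, "", log.LstdFlags))
	protectPVCs(logger, "vgrc-sample", "openshift-storage.rbd.csi.ceph.com", "5m",
		[]pvc{{"busybox-pvc-1"}, {"busybox-pvc-2"}})
}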
// this. If not found, then reconcile request would not be sent
selector, err := metav1.LabelSelectorAsSelector(&pvcSelector.LabelSelector)
if err != nil {
    log1.Error(err, "Failed to get the label selector from VolumeReplicationGroup")
Is this an expected condition or an error?
@nirs according to the comment for this call, it can happen and is not an error:
// continue if we fail to get the labels for this object hoping
// that pvc might actually belong to some other vrg instead of
// this. If not found, then reconcile request would not be sent
The log line indicates a failure; the continue is a best-case continue, just in case this PVC belongs to a different VRG. We should not get an error when we attempt to get the selector.
@ShyamsundarR I didn't understand: do you suggest retaining it or removing it?
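For reference, a self-contained sketch of the call under discussion; LabelSelectorAsSelector only fails on a malformed selector, which a VRG that passed API validation should not carry (the selector value here is illustrative):

package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Shaped like the selector a VRG's pvcSelector would carry.
	ls := metav1.LabelSelector{
		MatchLabels: map[string]string{"workloadpattern": "simple_io_pvc"},
	}

	// Fails only for malformed selectors (bad operator or values); the
	// controller logs the error and continues to the next VRG.
	selector, err := metav1.LabelSelectorAsSelector(&ls)
	if err != nil {
		fmt.Println("failed to get the label selector:", err)
		return
	}

	fmt.Println("selector:", selector.String())
}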
log1.Info("Found VolumeReplicationGroup with matching labels or owner",
    "vrg", vrgNamespacedName.String(), "selector", selector,
    "namespaces selected", pvcSelector.NamespaceNames,
    "label match", labelMatch, "owner match", ownerMatch)
Looks like a debug log - do we log the same info later?
@nirs We are printing debug logs also, so the Ramen log is overloaded with these messages! It causes wrap-around.
Here is an example from a relaxed QE setup:
Line 590: 2025-10-29T11:37:26.448Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-1","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 591: 2025-10-29T11:37:26.449Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-1","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 596: 2025-10-29T11:37:26.527Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-2","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 597: 2025-10-29T11:37:26.527Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-2","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 601: 2025-10-29T11:37:26.553Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-6","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 602: 2025-10-29T11:37:26.553Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-6","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 605: 2025-10-29T11:37:26.720Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-7","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 606: 2025-10-29T11:37:26.721Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-7","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 611: 2025-10-29T11:37:26.764Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-3","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 612: 2025-10-29T11:37:26.765Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-3","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 616: 2025-10-29T11:37:26.788Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-4","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 617: 2025-10-29T11:37:26.789Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-4","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 620: 2025-10-29T11:37:26.826Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-5","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 621: 2025-10-29T11:37:26.827Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-5","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 626: 2025-10-29T11:37:26.855Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-8","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 627: 2025-10-29T11:37:26.856Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-8","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 631: 2025-10-29T11:37:26.874Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-10","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 632: 2025-10-29T11:37:26.875Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-10","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 635: 2025-10-29T11:37:26.905Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-9","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
Line 636: 2025-10-29T11:37:26.906Z INFO pvcmap.VolumeReplicationGroup controller/volumereplicationgroup_controller.go:359 Found VolumeReplicationGroup with matching labels or owner {"pvc": {"name":"busybox-pvc-9","namespace":"cg-app-1"}, "vrg": "cg-app-1", "vrg": "openshift-dr-ops/cg-app-1", "selector": "workloadpattern in (simple_io_pvc)", "namespaces selected": ["cg-app-1"], "label match": true, "owner match": false}
All of these are from a single second.
This looks like an overly verbose debug log; it is not clear why we need to know that we found a VRG for the PVC.
In general we need to log when we change the state of the system, for example when we start using a PVC, when the PVC state changes, etc.
Exactly, I can't see how this log helps us debug the system, which is why I suggest removing it.
This log line indicates why a VRG was reconciled, i.e., which PVC triggered the VRG reconcile.
This is useful in the steady state, where PVC events (create/update/delete) would be minimal, so knowing what triggered the VRG reconcile helps at times in troubleshooting.
The log example given is from when an application is relocated, failed over, or deployed with DR enabled for the first time, as then all PVCs are created and bound within a couple of reconciles.
@ShyamsundarR Usually issues don’t occur in a steady state. This PR was created to prevent ramen log wrap-around, after which we lose the logs completely and can’t investigate the problem.
This message is printed for every PVC and can be very noisy. If you feel it’s important to keep it, I can revert this change.
@ELENAGER We need more data to tell how bad this log is. In general we want to delete (or move to debug level, hidden by default) logs that add little value and consume a lot of space, leading to wrapped log files.
The way I analyzed logs in the past was to import them into a sqlite database and then query the database.
This is the tool I used. It is built for vdsm logs, but modifying it for ramen logs should not be hard.
https://github.com/oVirt/vdsm/blob/master/contrib/logdb
Example commit showing some queries:
http://github.com/oVirt/vdsm/commit/886224ada622fcfc5e5de1aaeae030ae40ad0556
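A rough sketch of the same idea for Ramen's zap console logs, assuming the mattn/go-sqlite3 driver and the timestamp/level/logger/caller/message layout shown in the excerpts above; the regex and schema are illustrative, not a finished tool. Queries such as SELECT message, COUNT(*) FROM logs GROUP BY message ORDER BY 2 DESC then show which lines dominate the file.

package main

import (
	"bufio"
	"database/sql"
	"log"
	"os"
	"regexp"

	_ "github.com/mattn/go-sqlite3" // assumed sqlite driver dependency
)

// Matches lines like:
// 2025-10-29T11:37:36.113Z  INFO  vrg  controller/vrg_volrep.go:1451  message {...}
var lineRE = regexp.MustCompile(`^(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*)$`)

func main() {
	db, err := sql.Open("sqlite3", "ramen-logs.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS logs(
		ts TEXT, level TEXT, logger TEXT, caller TEXT, message TEXT)`); err != nil {
		log.Fatal(err)
	}

	scanner := bufio.NewScanner(os.Stdin) // e.g. go run importlogs.go < ramen.log
	for scanner.Scan() {
		m := lineRE.FindStringSubmatch(scanner.Text())
		if m == nil {
			continue // skip lines that do not look like zap console output
		}
		if _, err := db.Exec(`INSERT INTO logs VALUES(?,?,?,?,?)`,
			m[1], m[2], m[3], m[4], m[5]); err != nil {
			log.Fatal(err)
		}
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}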
I retained this log line
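For illustration, a sketch of the logging shape being retained, in a controller-runtime map function; matchingVRGs is a hypothetical stub standing in for the real selector, namespace, and owner checks in Ramen's pvcMapFunc:

package main

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// matchingVRGs is a hypothetical stub for the real lookup (selector +
// namespace + owner checks) done by the controller's pvcMapFunc.
func matchingVRGs(ctx context.Context, pvc client.Object) []types.NamespacedName {
	return []types.NamespacedName{{Namespace: "openshift-dr-ops", Name: "cg-app-1"}}
}

// pvcMapFunc sketches the retained log: one line per matching VRG, keyed by
// the PVC that triggered the event, explaining why the VRG is reconciled.
func pvcMapFunc(ctx context.Context, obj client.Object) []reconcile.Request {
	log := ctrl.Log.WithName("pvcmap").WithValues(
		"pvc", client.ObjectKeyFromObject(obj).String())

	requests := []reconcile.Request{}

	for _, vrg := range matchingVRGs(ctx, obj) {
		log.Info("Found VolumeReplicationGroup with matching labels or owner",
			"vrg", vrg.String())

		requests = append(requests, reconcile.Request{NamespacedName: vrg})
	}

	return requests
}

func main() {
	_ = pvcMapFunc(context.Background(), &corev1.PersistentVolumeClaim{})
}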
CreateFunc: func(e event.CreateEvent) bool {
    log.Info("Create event for PersistentVolumeClaim")
    o := e.Object
    log.Info("PVC Create event", "namespace", o.GetNamespace(), "name", o.GetName())
We can make the log shorter by using "PVC created".
Done
}

log.Info("Update event for PersistentVolumeClaim")
log.Info("PVC Update event", "namespace", oldPVC.GetNamespace(), "name", oldPVC.GetName())
PVC updated
Done
DeleteFunc: func(e event.DeleteEvent) bool {
    o := e.Object
    log.Info("PVC Delete", "namespace", o.GetNamespace(), "name", o.GetName())
    log.Info("PVC Delete event", "namespace", o.GetNamespace(), "name", o.GetName())
PVC deleted
Done
4a1e9c6 to f938449 (Compare)
log.Info("Create event for PersistentVolumeClaim")
o := e.Object
log.Info("PVC created", "namespace", o.GetNamespace(), "name", o.GetName())
You're using past-tense verbs like "created" and "deleted," but these predicates run before reconciliation; they only observe Kubernetes events, not completed actions. Phrasing like "Create event" reflects what's actually happening.
Also, there's no real need to log the object's namespaced name at this stage; the reconciler logs that when processing the object, so removing it here would keep the logs cleaner and less noisy.
Suggested change:
- log.Info("Create event for PersistentVolumeClaim")
- o := e.Object
- log.Info("PVC created", "namespace", o.GetNamespace(), "name", o.GetName())
+ log.Info("Create event for PersistentVolumeClaim")
I think “PVC created” is accurate because by the time we receive the ADDED event, the PVC object already exists in the cluster (persisted in etcd). We’re reacting to a completed creation, not an in-progress action. Using “Create event” might imply it hasn’t happened yet.
Regarding the namespaced name, I’d prefer to keep it for easier debugging and correlation, but open to revisiting if logs become too noisy.
Retaining the "event" word in the log is useful: we are receiving events, and when looking at logs it is more easily understood.
But more broadly, we should drop all logs in this function and rely on the logs from pvcMapFunc, which decides whether a PVC is of interest for reconciling a VRG.
Logs here have never carried the PVC NamespacedName, and most often the log about which VRG was triggered for a reconcile was more useful for debugging (as that carried the PVC name).
As this is a global, system-wide PVC watcher, we should drop the logs in this function.
The "event" word is retained. I still prefer to keep the event logs, in spite of having the logs from the pvcMapFunc.
}

log.Info("Update event for PersistentVolumeClaim")
log.Info("PVC updated", "namespace", oldPVC.GetNamespace(), "name", oldPVC.GetName())
Same.
Suggested change:
- log.Info("PVC updated", "namespace", oldPVC.GetNamespace(), "name", oldPVC.GetName())
+ log.Info("Update event for PersistentVolumeClaim")
DeleteFunc: func(e event.DeleteEvent) bool {
    o := e.Object
    log.Info("PVC Delete", "namespace", o.GetNamespace(), "name", o.GetName())
    log.Info("PVC deleted", "namespace", o.GetNamespace(), "name", o.GetName())
Same.
Suggested change:
- log.Info("PVC deleted", "namespace", o.GetNamespace(), "name", o.GetName())
+ log.Info("Delete event for PersistentVolumeClaim", "namespace", o.GetNamespace(), "name", o.GetName())
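For completeness, a sketch of the predicate with the event-phrased messages suggested above; whether the namespace/name fields stay per event is the open question in this thread, and the stdr logger is only there to make the snippet self-contained:

package main

import (
	"log"
	"os"

	"github.com/go-logr/logr"
	"github.com/go-logr/stdr"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// pvcPredicate logs each PVC event in the "X event for PersistentVolumeClaim"
// form discussed above and lets every event through to the map function.
func pvcPredicate(logger logr.Logger) predicate.Funcs {
	return predicate.Funcs{
		CreateFunc: func(e event.CreateEvent) bool {
			logger.Info("Create event for PersistentVolumeClaim",
				"namespace", e.Object.GetNamespace(), "name", e.Object.GetName())
			return true
		},
		UpdateFunc: func(e event.UpdateEvent) bool {
			logger.Info("Update event for PersistentVolumeClaim",
				"namespace", e.ObjectOld.GetNamespace(), "name", e.ObjectOld.GetName())
			return true // the real predicate compares old and new before requeueing
		},
		DeleteFunc: func(e event.DeleteEvent) bool {
			logger.Info("Delete event for PersistentVolumeClaim",
				"namespace", e.Object.GetNamespace(), "name", e.Object.GetName())
			return true
		},
	}
}

func main() {
	_ = pvcPredicate(stdr.New(log.New(os.Stderr, "", log.LstdFlags)))
}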
pruthvitd left a comment
Smart tradeoff — reducing log noise while keeping critical events visible longer before rotation. Thanks for addressing this.
2193397 to 6446830 (Compare)
6446830 to 3c6fa88 (Compare)
[]string{VRGConditionReasonUnused},
    subconditions...)
msg = fmt.Sprintf("updated %s status to %s", conditionName, finalCondition.Status)
msg := fmt.Sprintf("updated %s status to %s", conditionName, finalCondition.Status)
I would get rid of this condition log printing; the in-memory condition is updated in the VRG once the callers are done, so having it in the logs may be superfluous.
There are times when seeing how the condition progressed in the logs has been useful, but given the VRG status and the current state of the system, this log is more noise and can be removed.
@ShyamsundarR I honestly prefer to retain this log. I already deleted the duplicate line (see above), so we have reduced the noise. Sometimes being able to follow the progression, and not only the final result, is useful.
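One possible middle ground, sketched here with hypothetical condition values and assuming a recent apimachinery where meta.SetStatusCondition reports whether anything changed: log the progression only on actual transitions, so it stays traceable without repeating an identical line every reconcile.

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Stand-in for vrg.Status.Conditions; type names and values are illustrative.
	conditions := []metav1.Condition{}

	cond := metav1.Condition{
		Type:    "DataReady",
		Status:  metav1.ConditionTrue,
		Reason:  "Replicating",
		Message: "PVCs in the VolumeReplicationGroup are replicating",
	}

	// SetStatusCondition returns true only when the condition changed, so the
	// "updated ... status to ..." line fires on transitions, not every reconcile.
	if meta.SetStatusCondition(&conditions, cond) {
		fmt.Printf("updated %s status to %s\n", cond.Type, cond.Status)
	}
}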
}

if slices.Contains(*vrg.Spec.ProtectedNamespaces, obj.GetNamespace()) {
    log1.Info("Found VolumeReplicationGroup with matching namespace",
This log line again tells us which event triggered a VRG reconcile. It is generally useful to know this when looking at logs, as ideally only an external trigger should cause a VRG reconciliation (apart from any exponential backoff from a previous reconcile).
Retained
rgs, ok := obj.(*ramendrv1alpha1.ReplicationGroupSource)
if !ok {
    log.Info("map function received not a replication group source resource")
We should drop other such log lines in the functions above this one. While I have never seen this log, so removing it will not contribute to reducing log noise as such, it is not essential either.
Done
1824948 to e9e2fe9 (Compare)
08496f3 to 7e0f2e1 (Compare)
Signed-off-by: Elena Gershkovich <elenage@il.ibm.com>
7e0f2e1 to 0586abe (Compare)
Remove noisy/non-informative logs, add context to critical log entries, and log at a higher scope than PVC to avoid repeated lines.