Improve logging for DatasourceEvent field access errors, fix error_raw type#730
Improve logging for DatasourceEvent field access errors, fix error_raw type#730
Conversation
…w type Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>
📝 WalkthroughWalkthroughModified Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/utils/datasource_event.go (1)
156-176:⚠️ Potential issue | 🟠 MajorScope field accessor cache by datasource, not only event type.
Line 156caches bye.EventTypeonly. With multiple datasources producing the same event type, a field accessor can be reused across datasources, causing incorrect field decoding.💡 Proposed fix
+func (e *DatasourceEvent) fieldCacheKey() string { + return string(e.EventType) + ":" + e.Datasource.Name() +} + func (e *DatasourceEvent) getFieldAccessor(fieldName string) datasource.FieldAccessor { if e == nil { return missingFieldAccessor } - cacheVal, ok := fieldCaches.Load(e.EventType) + cacheKey := e.fieldCacheKey() + cacheVal, ok := fieldCaches.Load(cacheKey) if !ok { - cacheVal, _ = fieldCaches.LoadOrStore(e.EventType, &sync.Map{}) + cacheVal, _ = fieldCaches.LoadOrStore(cacheKey, &sync.Map{}) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/utils/datasource_event.go` around lines 156 - 176, The cache key is currently just e.EventType so field accessors can be wrongly shared across different datasources; change the caching to include the datasource identity (e.g., use a composite key of e.EventType + datasource ID or scope a per-datasource map) so each datasource has its own sync.Map; update usages around fieldCaches, the lookup code that currently does fieldCaches.Load(e.EventType), the LoadOrStore path, and where you call e.Datasource.GetField / m.LoadOrStore to ensure you first get or create a per-datasource cache (or use a nested map keyed by datasource) and only then look up/store fieldName, returning missingFieldAccessor unchanged when datasource is nil or GetField returns nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/utils/datasource_event.go`:
- Around line 729-731: The SyscallEventType branch currently returns a hardcoded
PID 0 which breaks correlation; replace it by extracting the PID from the
syscall event payload and falling back to other available places: check
evt.Syscall.Pid (or evt.Syscall.ProcessID) first, then fallback to
evt.Process.Pid (or evt.GetProcess().Pid), then evt.Context/Metadata PID fields
if present, and only return a sentinel (e.g. -1) if no PID can be resolved;
update the SyscallEventType case accordingly instead of returning 0.
---
Outside diff comments:
In `@pkg/utils/datasource_event.go`:
- Around line 156-176: The cache key is currently just e.EventType so field
accessors can be wrongly shared across different datasources; change the caching
to include the datasource identity (e.g., use a composite key of e.EventType +
datasource ID or scope a per-datasource map) so each datasource has its own
sync.Map; update usages around fieldCaches, the lookup code that currently does
fieldCaches.Load(e.EventType), the LoadOrStore path, and where you call
e.Datasource.GetField / m.LoadOrStore to ensure you first get or create a
per-datasource cache (or use a nested map keyed by datasource) and only then
look up/store fieldName, returning missingFieldAccessor unchanged when
datasource is nil or GetField returns nil.
| case SyscallEventType: | ||
| // FIXME this is a temporary workaround until the gadget has proc enrichment | ||
| containerPid, err := e.getFieldAccessor("runtime.containerPid").Uint32(e.Data) | ||
| if err != nil { | ||
| logger.L().Warning("GetPID - runtime.containerPid field not found in event type", helpers.String("eventType", string(e.EventType))) | ||
| return 0 | ||
| } | ||
| return containerPid | ||
| return 0 |
There was a problem hiding this comment.
Avoid returning a hardcoded PID for syscall events.
Line 731 forces PID to 0 for every SyscallEventType, which can break event correlation and attribution.
💡 Proposed fix
case SyscallEventType:
- // FIXME this is a temporary workaround until the gadget has proc enrichment
- return 0
+ // Keep best-effort PID extraction for correlation.
+ pidValue, err := e.getFieldAccessor("runtime.containerPid").Uint32(e.Data)
+ if err == nil && pidValue != 0 {
+ return pidValue
+ }
+ pidValue, err = e.getFieldAccessor("proc.pid").Uint32(e.Data)
+ if err != nil {
+ logger.L().Warning("GetPID - error reading syscall pid fields", helpers.String("eventType", string(e.EventType)), helpers.Error(err))
+ return 0
+ }
+ return pidValue📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| case SyscallEventType: | |
| // FIXME this is a temporary workaround until the gadget has proc enrichment | |
| containerPid, err := e.getFieldAccessor("runtime.containerPid").Uint32(e.Data) | |
| if err != nil { | |
| logger.L().Warning("GetPID - runtime.containerPid field not found in event type", helpers.String("eventType", string(e.EventType))) | |
| return 0 | |
| } | |
| return containerPid | |
| return 0 | |
| case SyscallEventType: | |
| // Keep best-effort PID extraction for correlation. | |
| pidValue, err := e.getFieldAccessor("runtime.containerPid").Uint32(e.Data) | |
| if err == nil && pidValue != 0 { | |
| return pidValue | |
| } | |
| pidValue, err = e.getFieldAccessor("proc.pid").Uint32(e.Data) | |
| if err != nil { | |
| logger.L().Warning("GetPID - error reading syscall pid fields", helpers.String("eventType", string(e.EventType)), helpers.Error(err)) | |
| return 0 | |
| } | |
| return pidValue |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/utils/datasource_event.go` around lines 729 - 731, The SyscallEventType
branch currently returns a hardcoded PID 0 which breaks correlation; replace it
by extracting the PID from the syscall event payload and falling back to other
available places: check evt.Syscall.Pid (or evt.Syscall.ProcessID) first, then
fallback to evt.Process.Pid (or evt.GetProcess().Pid), then evt.Context/Metadata
PID fields if present, and only return a sentinel (e.g. -1) if no PID can be
resolved; update the SyscallEventType case accordingly instead of returning 0.
Summary by CodeRabbit
Release Notes