Improve logging for DatasourceEvent field access errors, fix error_raw type#730

Merged: matthyx merged 1 commit into main from fix-getters on Feb 27, 2026
Conversation

@matthyx (Contributor) commented Feb 27, 2026

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Enhanced error logging with detailed diagnostic messages when field retrieval fails
    • Improved robustness of field access operations to gracefully handle unavailable fields
    • Strengthened fallback mechanisms to prevent incorrect caching of null or missing values across data readers

…w type

Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>
coderabbitai bot commented Feb 27, 2026

Walkthrough

Modified pkg/utils/datasource_event.go to strengthen error handling and logging for field access operations. Added nil-check in getFieldAccessor to prevent caching nil results, replaced field-not-found logs with error-reading logs across numerous getter methods, adjusted endpoint reader error paths, and introduced a defensive missingFieldAccessor fallback mechanism.

Changes

Cohort / File(s) | Summary
Field Accessor Error Handling (pkg/utils/datasource_event.go) | Added nil-check to getFieldAccessor to prevent caching nil field results; introduced missingFieldAccessor fallback for cases where fields are not present in the currently configured datasource; reworked error handling in GetDstEndpoint and related endpoint readers to return zero-value structs with error-logged messages.
Standardized Error Logging Across Getters (pkg/utils/datasource_event.go) | Replaced "field not found in event type" logs with "error reading field" logs across more than 40 getter methods (addresses, args, attr_size, buf, cap, cmd, comm, container variants, cwd, dns fields, endpoint fields, ecs fields, error fields, exePath, exitCode, extra, flags, gid, hostNetwork, identifier, module, pid variants, proto, port readers, signal, socket fields, src/dst fields, syscall variants, timestamp, type, uid, and others); augmented logs with error details via helpers.Error(err).
SyscallEventType GetPid Adjustment (pkg/utils/datasource_event.go) | Modified GetPid to return 0 for SyscallEventType instead of attempting to read runtime.containerPid (temporary workaround).
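
The nil-check and fallback summarized above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: FieldAccessor, Datasource, realAccessor, missingAccessor and errMissingField are hypothetical stand-ins, and the real datasource.FieldAccessor interface is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// FieldAccessor is a hypothetical, reduced stand-in for datasource.FieldAccessor.
type FieldAccessor interface {
	Uint32(data map[string]uint32) (uint32, error)
}

// realAccessor reads one named field from the event data.
type realAccessor struct{ name string }

func (a realAccessor) Uint32(data map[string]uint32) (uint32, error) {
	v, ok := data[a.name]
	if !ok {
		return 0, fmt.Errorf("field %q has no value", a.name)
	}
	return v, nil
}

// errMissingField is returned by every read on the fallback accessor.
var errMissingField = errors.New("field not present in datasource")

// missingAccessor is the defensive fallback: reads fail with an error
// instead of dereferencing a nil accessor.
type missingAccessor struct{}

func (missingAccessor) Uint32(map[string]uint32) (uint32, error) {
	return 0, errMissingField
}

// Datasource is a hypothetical datasource that knows a fixed set of fields.
type Datasource struct{ fields map[string]bool }

// GetField returns nil for unknown fields, the case the nil-check guards.
func (d *Datasource) GetField(name string) FieldAccessor {
	if d == nil || !d.fields[name] {
		return nil
	}
	return realAccessor{name: name}
}

var cache sync.Map // field name -> FieldAccessor

// getFieldAccessor caches only accessors that actually resolved.
func getFieldAccessor(ds *Datasource, name string) FieldAccessor {
	if cached, ok := cache.Load(name); ok {
		return cached.(FieldAccessor)
	}
	acc := ds.GetField(name)
	if acc == nil {
		// Not cached, so a later lookup can still succeed.
		return missingAccessor{}
	}
	actual, _ := cache.LoadOrStore(name, acc)
	return actual.(FieldAccessor)
}

func main() {
	ds := &Datasource{fields: map[string]bool{"pid": true}}
	data := map[string]uint32{"pid": 42}
	pid, err := getFieldAccessor(ds, "pid").Uint32(data)
	fmt.Println(pid, err)
	_, err = getFieldAccessor(ds, "uid").Uint32(data)
	fmt.Println(err)
}
```

Because the fallback is never stored, a field that is missing at first lookup is re-resolved on the next call, and callers get an error they can log rather than a cached nil.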

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • slashben
  • YakirOren

Poem

🐰 A rabbit's fields now safely read,
With nil-checks guarding ahead,
No more lost logs in the dark—
Each error leaves a gleaming mark,
Error-reading logs now lead the way! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately summarizes the main changes: improving logging for DatasourceEvent field access errors and fixing the error_raw type, which aligns with the primary modifications in the changeset.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


@matthyx matthyx added the release Create release label Feb 27, 2026
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/utils/datasource_event.go (1)

156-176: ⚠️ Potential issue | 🟠 Major

Scope field accessor cache by datasource, not only event type.

Line 156 caches by e.EventType only. With multiple datasources producing the same event type, a field accessor can be reused across datasources, causing incorrect field decoding.

💡 Proposed fix
+func (e *DatasourceEvent) fieldCacheKey() string {
+	return string(e.EventType) + ":" + e.Datasource.Name()
+}
+
 func (e *DatasourceEvent) getFieldAccessor(fieldName string) datasource.FieldAccessor {
 	if e == nil {
 		return missingFieldAccessor
 	}
 
-	cacheVal, ok := fieldCaches.Load(e.EventType)
+	cacheKey := e.fieldCacheKey()
+	cacheVal, ok := fieldCaches.Load(cacheKey)
 	if !ok {
-		cacheVal, _ = fieldCaches.LoadOrStore(e.EventType, &sync.Map{})
+		cacheVal, _ = fieldCaches.LoadOrStore(cacheKey, &sync.Map{})
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/utils/datasource_event.go` around lines 156 - 176, The cache key is
currently just e.EventType so field accessors can be wrongly shared across
different datasources; change the caching to include the datasource identity
(e.g., use a composite key of e.EventType + datasource ID or scope a
per-datasource map) so each datasource has its own sync.Map; update usages
around fieldCaches, the lookup code that currently does
fieldCaches.Load(e.EventType), the LoadOrStore path, and where you call
e.Datasource.GetField / m.LoadOrStore to ensure you first get or create a
per-datasource cache (or use a nested map keyed by datasource) and only then
look up/store fieldName, returning missingFieldAccessor unchanged when
datasource is nil or GetField returns nil.
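
One way to scope the cache per datasource, as this comment suggests, is a struct-valued key rather than a concatenated string. The sketch below is an assumption-laden illustration: cacheKey and fieldCacheFor are hypothetical names, and the datasource is identified by a plain string because the real datasource type is not shown in this thread.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheKey scopes a field cache to one (event type, datasource) pair.
// Hypothetical type: the PR itself keys the cache by event type only.
type cacheKey struct {
	eventType  string
	datasource string
}

var fieldCaches sync.Map // cacheKey -> *sync.Map (field name -> accessor)

// fieldCacheFor returns the cache for the given pair, creating it on first use.
func fieldCacheFor(eventType, datasource string) *sync.Map {
	key := cacheKey{eventType: eventType, datasource: datasource}
	if m, ok := fieldCaches.Load(key); ok {
		return m.(*sync.Map)
	}
	// LoadOrStore keeps a single winner if two goroutines race here.
	m, _ := fieldCaches.LoadOrStore(key, &sync.Map{})
	return m.(*sync.Map)
}

func main() {
	a := fieldCacheFor("exec", "ds-a")
	b := fieldCacheFor("exec", "ds-b")
	a.Store("pid", "accessor from ds-a")
	_, leaked := b.Load("pid")
	fmt.Println("same cache on repeat lookup:", a == fieldCacheFor("exec", "ds-a"))
	fmt.Println("entry leaked across datasources:", leaked)
}
```

A comparable struct key avoids the ambiguity that string concatenation can introduce when either part contains the separator character.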

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cc5fca4 and 46e5df8.

📒 Files selected for processing (1)
  • pkg/utils/datasource_event.go

Comment on lines 729 to +731
 	case SyscallEventType:
 		// FIXME this is a temporary workaround until the gadget has proc enrichment
-		containerPid, err := e.getFieldAccessor("runtime.containerPid").Uint32(e.Data)
-		if err != nil {
-			logger.L().Warning("GetPID - runtime.containerPid field not found in event type", helpers.String("eventType", string(e.EventType)))
-			return 0
-		}
-		return containerPid
+		return 0

⚠️ Potential issue | 🟠 Major

Avoid returning a hardcoded PID for syscall events.

Line 731 forces PID to 0 for every SyscallEventType, which can break event correlation and attribution.

💡 Proposed fix
 	case SyscallEventType:
-		// FIXME this is a temporary workaround until the gadget has proc enrichment
-		return 0
+		// Keep best-effort PID extraction for correlation.
+		pidValue, err := e.getFieldAccessor("runtime.containerPid").Uint32(e.Data)
+		if err == nil && pidValue != 0 {
+			return pidValue
+		}
+		pidValue, err = e.getFieldAccessor("proc.pid").Uint32(e.Data)
+		if err != nil {
+			logger.L().Warning("GetPID - error reading syscall pid fields", helpers.String("eventType", string(e.EventType)), helpers.Error(err))
+			return 0
+		}
+		return pidValue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/utils/datasource_event.go` around lines 729 - 731, The SyscallEventType
branch currently returns a hardcoded PID 0 which breaks correlation; replace it
by extracting the PID from the syscall event payload and falling back to other
available places: check evt.Syscall.Pid (or evt.Syscall.ProcessID) first, then
fallback to evt.Process.Pid (or evt.GetProcess().Pid), then evt.Context/Metadata
PID fields if present, and only return a sentinel (e.g. -1) if no PID can be
resolved; update the SyscallEventType case accordingly instead of returning 0.
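
The best-effort fallback this comment asks for can be expressed as an ordered list of candidate fields. The following is only a sketch under stated assumptions: readUint32 and firstPid are hypothetical helpers, a plain map stands in for the event data, and the field names runtime.containerPid and proc.pid are taken from the proposed fix above without confirmation that the gadget populates them.

```go
package main

import (
	"errors"
	"fmt"
)

// readUint32 is a hypothetical stand-in for
// e.getFieldAccessor(name).Uint32(e.Data).
func readUint32(data map[string]uint32, name string) (uint32, error) {
	v, ok := data[name]
	if !ok {
		return 0, fmt.Errorf("error reading field %q", name)
	}
	return v, nil
}

// firstPid tries each candidate field in order and returns the first
// non-zero value. When nothing resolves it returns 0 and an error that
// describes why, so the caller can log it.
func firstPid(data map[string]uint32, candidates ...string) (uint32, error) {
	var errs []error
	for _, name := range candidates {
		v, err := readUint32(data, name)
		if err != nil {
			errs = append(errs, err)
			continue
		}
		if v != 0 {
			return v, nil
		}
	}
	if len(errs) == 0 {
		return 0, errors.New("all candidate pid fields were zero")
	}
	return 0, errors.Join(errs...)
}

func main() {
	data := map[string]uint32{"proc.pid": 7}
	pid, err := firstPid(data, "runtime.containerPid", "proc.pid")
	fmt.Println(pid, err)
}
```

Returning the joined errors keeps the "error reading field" detail available for logging, while the caller still decides whether 0 is an acceptable result.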

@matthyx matthyx merged commit c0d8b1d into main Feb 27, 2026
27 checks passed
@matthyx matthyx deleted the fix-getters branch February 27, 2026 11:28