ROX-30437: resolve host path via inode tracking #158

Molter73 · 2025-11-19T13:57:03Z

Description

At startup, this patch walks through the configured paths we have to
monitor and associates each file's path with its inode. This first
implementation is inherently flawed and is meant to show how the
approach would look once it is fully fleshed out.

Checklist

Investigated and inspected CI test results
Updated documentation accordingly

Automated testing

Added unit tests
Added integration tests
Added regression tests

If any of these don't apply, please comment below.

Testing Performed

The added tests validate that events generated on an overlayfs file
correctly show both the event on the upper layer and the access to the
underlying FS. They also validate that a path mounted into a container
resolves to the correct host path.

While developing these tests, it became painfully obvious that getting
information about a process running inside a container is not
straightforward. Because containers tend to be fairly static, we should
be able to build that information by hand in the test and still have
everything work correctly. To minimize changes to existing tests, the
default Process constructor now takes fields directly, and a new
from_proc class method builds a Process object from /proc. Additionally,
getting the pid of a process in a container is virtually impossible, so
the pid check is now optional.

Molter73 · 2025-11-19T14:11:50Z

This is an alternative to #149. Compared to that implementation, this one has a few benefits:

The kernel-side code is simpler: it does a single copy from the inode storage map into the event buffer when an entry is available.
We are no longer limited by the verifier when collecting path components. #149 walks up the dentry list, and if it hits the 16-component limit it misses the host path completely (since the path is constructed from the file to the root).
We don't need to emit all events to userspace for filtering.

However, there are also some downsides:

Currently, only files that exist when fact starts up are tracked.
Renaming a file causes it to keep emitting events with the now-incorrect host path.
Similarly, renaming a file to a path that is not monitored leaves it emitting events when it shouldn't.
We add a 4K buffer to every inode that falls in a monitored path. This might not seem like much, but it adds up quickly when monitoring a large number of files.
If the configuration of monitored directories changes, the inode store is not updated in any way.

IMO, though significant, the limitations of this implementation can be addressed in subsequent PRs. The current implementation should be enough for our current target of monitoring 4 system files that should never be removed or renamed.
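
To put the 4K-per-inode cost in perspective, here is a rough back-of-the-envelope sketch. The names `PATH_BUFFER_BYTES` and `path_buffer_bytes` are made up for illustration, and the estimate only counts the path buffers themselves, not any per-entry overhead of the inode storage map.

```rust
/// Size of the per-inode path buffer mentioned above (4K).
/// Hypothetical constant, not taken from the patch.
const PATH_BUFFER_BYTES: u64 = 4096;

/// Lower bound on the memory used by path buffers when
/// `tracked_inodes` inodes fall under monitored paths.
/// Ignores any bookkeeping overhead of the map itself.
fn path_buffer_bytes(tracked_inodes: u64) -> u64 {
    tracked_inodes * PATH_BUFFER_BYTES
}
```

For the current target of 4 files this is 16 KiB, while a million tracked files would already need roughly 3.8 GiB for the buffers alone.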

JoukoVirtanen · 2025-12-03T00:16:59Z

fact-ffi/src/c/inode.c

+
+int32_t add_path(int32_t map_fd, const char* path, const char* host_path) {
+  int fd = open(path, O_RDONLY);
+  if (fd <= 0) {


Suggested change

-  if (fd <= 0) {
+  if (fd < 0) {

erthalion · 2025-12-05T10:21:50Z

Cargo.lock

-checksum = "deec109607ca693028562ed836a5f1c4b8bd77755c4e132fc5ce11b0b6211ae7"
+checksum = "b97463e1064cb1b1c1384ad0a0b9c8abd0988e2a91f52606c80ef14aadb63e36"
 dependencies = [
+ "find-msvc-tools",


What's the purpose of this dependency?

erthalion · 2025-12-05T10:26:01Z

fact-ebpf/src/bpf/main.c

  }

-  if (!is_monitored(path)) {
+  if (host_path == NULL && !is_monitored(path)) {


Why should we ignore events if the host_path was not resolved?

erthalion · 2025-12-05T10:33:59Z

fact-ffi/src/c/inode.c

+
+  long res = syscall(SYS_bpf, BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
+  if (res == -EEXIST) {
+    res = 0;


Some debug log output in this case?

erthalion · 2025-12-05T10:39:09Z

fact-ffi/src/inode_store.rs

+}
+
+fn path_to_cstring(path: &Path) -> anyhow::Result<CString> {
+    let path = path.as_os_str().to_string_lossy();


Looks like Path has its own to_string_lossy that does the same thing as the OsStr one, so why convert?

Regarding the lossy part -- I assume it will cause problems with non-UTF-8 file names, correct?
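
To illustrate the non-UTF-8 concern, here is a minimal sketch of a lossless conversion on Unix. The function name `path_to_cstring_lossless` is hypothetical and this is not the code from the patch; it relies only on `OsStrExt::as_bytes` and `CString::new` from the standard library, and returns `NulError` rather than `anyhow::Result` to stay dependency-free.

```rust
use std::ffi::{CString, NulError};
use std::os::unix::ffi::OsStrExt;
use std::path::Path;

/// Hypothetical lossless variant of `path_to_cstring` (Unix only).
/// The raw bytes of the path are copied as-is, so invalid UTF-8
/// sequences are preserved instead of being replaced with U+FFFD.
/// Fails only if the path contains an interior NUL byte.
fn path_to_cstring_lossless(path: &Path) -> Result<CString, NulError> {
    CString::new(path.as_os_str().as_bytes())
}
```

With `to_string_lossy`, a file name containing an invalid byte would be rewritten with the replacement character, so the resulting C string would name a different (most likely nonexistent) file.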

erthalion · 2025-12-05T10:51:33Z

fact/src/fs_walker.rs

+
+pub fn walk_path(inode_store: &mut MapData, path: &Path) -> anyhow::Result<()> {
+    if path.is_dir() {
+        for entry in (path.read_dir()?).flatten() {


Looks like is_dir resolves symlinks, so it's probably worth adding some loop prevention logic in the future.
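
One possible shape for such loop prevention, as a sketch only: remember the (device, inode) pair of every directory already visited and skip repeats. The function `walk_without_loops` is hypothetical, it collects paths instead of updating the inode store, and it uses an explicit stack rather than the recursion in `walk_path`.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::{Path, PathBuf};

/// Hypothetical walker: returns every non-directory entry under `root`,
/// following symlinks but visiting each directory at most once.
fn walk_without_loops(root: &Path) -> io::Result<Vec<PathBuf>> {
    // (device, inode) pairs of directories we have already descended into.
    let mut visited = HashSet::new();
    let mut files = Vec::new();
    let mut stack = vec![root.to_path_buf()];

    while let Some(path) = stack.pop() {
        // fs::metadata follows symlinks, just like Path::is_dir.
        let meta = match fs::metadata(&path) {
            Ok(meta) => meta,
            // Dangling symlink or entry removed while walking: skip it.
            Err(_) => continue,
        };

        if meta.is_dir() {
            // insert() returns false if the directory was seen before,
            // which is what breaks symlink cycles.
            if !visited.insert((meta.dev(), meta.ino())) {
                continue;
            }
            for entry in fs::read_dir(&path)?.flatten() {
                stack.push(entry.path());
            }
        } else {
            files.push(path);
        }
    }

    Ok(files)
}
```

Keying on (device, inode) rather than on the path string means a cycle is caught no matter which chain of symlinks leads back to the directory.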

erthalion · 2025-12-05T10:53:50Z

fact-ffi/src/c/inode.c

+  union bpf_attr attr;
+  memset(&attr, 0, sizeof(attr));
+  attr.map_fd = map_fd;
+  attr.key = (unsigned long long)&fd;


I don't get it, we search in inode_store by the inode:

bpf_inode_storage_get(&inode_store, file->f_inode, NULL, 0);

but store by the file descriptor?

Molter73 · 2025-12-05

That's how inode storage works. When accessing it from userspace via the bpf syscall, you have no way to reference the inode pointer itself (because it lives in kernel memory), so you pass a file descriptor for the file and the kernel resolves the inode under the hood:
https://github.com/torvalds/linux/blob/2061f18ad76ecaddf8ed17df81b8611ea88dbddd/kernel/bpf/bpf_inode_storage.c#L99

On the kernel side, however, you do have the inode pointer, so you can use it directly:
https://github.com/torvalds/linux/blob/2061f18ad76ecaddf8ed17df81b8611ea88dbddd/kernel/bpf/bpf_inode_storage.c#L128-L129

Molter73 · 2025-12-15T14:42:47Z

Superseded by #166

Molter73 added 2 commits November 19, 2025 14:56

Molter73 mentioned this pull request Nov 19, 2025

ROX-30437: refine host path algorithm #149

Closed

5 tasks

JoukoVirtanen reviewed Dec 3, 2025

erthalion reviewed Dec 5, 2025

Molter73 closed this Dec 15, 2025
