Skip to content

Conversation

@Molter73
Copy link
Collaborator

Description

This patch makes it so, at start up, we walk through the configured
paths we have to monitor and add the path to the inode of every file we
find. This first implementation is inherently flawed and meant as a way
to show how it would look once it is fully fleshed out.

Checklist

  • Investigated and inspected CI test results
  • Updated documentation accordingly

Automated testing

  • Added unit tests
  • Added integration tests
  • Added regression tests

If any of these don't apply, please comment below.

Testing Performed

Added tests will validate events generated on an overlayfs file properly
shows the event on the upper layer and the access to the underlying FS.
They also validate a mounted path on a container resolves to the correct
host path.

While developing these tests, it became painfully obvious getting the
information of the process running inside the container is not
straightforward. Because containers tend to be fairly static, we should
be able to manually create the information statically in the test and
still have everything work correctly. In order to minimize the amount of
changes on existing tests, the default Process constructor now takes
fields directly and there is a from_proc class method that builds a new
Process object from /proc. Additionally, getting the pid of a process in
a container is virtually impossible, so we make the pid check optional.

This patch makes it so, at start up, we walk through the configured
paths we have to monitor and add the path to the inode of every file we
find. This first implementation is inherently flawed and meant as a way
to show how it would look once it is fully fleshed out.
Added tests will validate events generated on an overlayfs file properly
shows the event on the upper layer and the access to the underlying FS.
They also validate a mounted path on a container resolves to the correct
host path.

While developing these tests, it became painfully obvious getting the
information of the process running inside the container is not
straightforward. Because containers tend to be fairly static, we should
be able to manually create the information statically in the test and
still have everything work correctly. In order to minimize the amount of
changes on existing tests, the default Process constructor now takes
fields directly and there is a from_proc class method that builds a new
Process object from /proc. Additionally, getting the pid of a process in
a container is virtually impossible, so we make the pid check optional.
@Molter73
Copy link
Collaborator Author

Molter73 commented Nov 19, 2025

This is an alternative implementation to #149. Compared to that implementation, there are a few benefits to this implementation:

  • We use simpler code on the kernel side, doing a single copy from the contents of the inode storage map to the event buffer it is accessible.
  • We are not limited by the verifier in collecting path components by walking up the dentry list, if we hit the 16 component limit we will completely miss the host path (since the path is constructed from the file to the root).
  • We don't need to emit all events to userspace for filtering.

However, there are also some downsides:

  • Currently only files that exist when fact starts up are tracked.
  • Renaming a file would cause it to keep emitting events with the incorrect host path.
  • Similarly, renaming a file to a path that is not monitored will have it still emitting events when it shouldn't.
  • We are adding a 4K buffer to every inode that falls in a monitored path, this might not seem like much, but it can add up pretty quickly if we are monitoring large numbers of files.
  • If configuration of monitored directories is changed, the inode store is not updated in any way.

IMO, though significant the limitations of this implementation can be improved in subsequent PRs, the current implementation should be enough for our current target of monitoring 4 system files that should never be removed/renamed.


int32_t add_path(int32_t map_fd, const char* path, const char* host_path) {
int fd = open(path, O_RDONLY);
if (fd <= 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (fd <= 0) {
if (fd < 0) {

checksum = "deec109607ca693028562ed836a5f1c4b8bd77755c4e132fc5ce11b0b6211ae7"
checksum = "b97463e1064cb1b1c1384ad0a0b9c8abd0988e2a91f52606c80ef14aadb63e36"
dependencies = [
"find-msvc-tools",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of this dependency?

}

if (!is_monitored(path)) {
if (host_path == NULL && !is_monitored(path)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we ignore events if the host_path was not resolved?


long res = syscall(SYS_bpf, BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
if (res == -EEXIST) {
res = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some debug log output in this case?

}

fn path_to_cstring(path: &Path) -> anyhow::Result<CString> {
let path = path.as_os_str().to_string_lossy();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like Path has its own to_string_lossy that does the same thing as OsStr, why convert?

Regarding lossy part -- I assume it will cause problems with non-UTF8 file names, correct?


pub fn walk_path(inode_store: &mut MapData, path: &Path) -> anyhow::Result<()> {
if path.is_dir() {
for entry in (path.read_dir()?).flatten() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like is_dir resolves symlinks, I guess it's worth adding some loop prevention logic in the future.

union bpf_attr attr;
memset(&attr, 0, sizeof(attr));
attr.map_fd = map_fd;
attr.key = (unsigned long long)&fd;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get it, we search in inode_store by the inode:

bpf_inode_storage_get(&inode_store, file->f_inode, NULL, 0);

but store by the file descriptor?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's how the inode storage works. When accessing from userspace via the bpf syscall you don't have a way to access the inode pointer itself (because it lives on kernel memory), so you give it the file descriptor to the file and the kernel resolves the inode under the hood:
https://github.com/torvalds/linux/blob/2061f18ad76ecaddf8ed17df81b8611ea88dbddd/kernel/bpf/bpf_inode_storage.c#L99

But on kernel side, you do have the inode pointer, so you can use that directly:
https://github.com/torvalds/linux/blob/2061f18ad76ecaddf8ed17df81b8611ea88dbddd/kernel/bpf/bpf_inode_storage.c#L128-L129

@Molter73
Copy link
Collaborator Author

Superseded by #166

@Molter73 Molter73 closed this Dec 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants