Skip to content

Scan dir once#658

Open
doke2 wants to merge 11 commits intoossec:masterfrom
doke2:scan_dir_once
Open

Scan dir once#658
doke2 wants to merge 11 commits intoossec:masterfrom
doke2:scan_dir_once

Conversation

@doke2
Copy link
Copy Markdown
Contributor

@doke2 doke2 commented Aug 21, 2015

Don't decend into monitored subdirectories of monitored directories.
If you you have both /var and /var/log monitored, the /var scan won't
decend into /var/log. /var/log is only scanned once, with the
right options.

For example, this would only scan /etc/sysconfig with report changes.
<directories check_all="yes">/etc</directories>
<directories check_all="yes" report_changes="yes">/etc/sysconfig</directories>

For example, this would avoid summing the log files.
<directories check_all="yes">/var</directories>
<directories check_owner="yes" check_perm="yes" check_sum="no">/var/log </directories>

Review on Reviewable

doke2 added 2 commits August 20, 2015 19:23
So you you have both /var and /var/log monitored, the /var scan won't
decend into /var/log.  /var/log is only scanned once, with the
right options.

For example, this would only scan /etc/sysconfig with report changes.
    <directories check_all="yes">/etc</directories>
    <directories check_all="yes" report_changes="yes">/etc/sysconfig</directories>

For example, this would avoid summing the log files.
    <directories check_all="yes">/var</directories>
    <directories check_owner="yes" check_perm="yes" check_sum="no">/var/log</directories>
@mstarks01
Copy link
Copy Markdown
Contributor

It seems that this changes the default behavior. Wouldn't it be best to add it as a configurable option?

@doke2
Copy link
Copy Markdown
Contributor Author

doke2 commented Aug 24, 2015

That makes sense. Do you think it should be a compile time option, or a check_something option to the directories tag, or both?

doke2 added 7 commits August 24, 2015 14:59
…also

target directories.  So if you have both /var and /var/log, and apply this
option to /var, then /var/log will only be scanned once.  This make a huge
disk I/O difference if it gets into your logs.  It also lets you apply
different options to a subdirectory, ie check_sum="no".
@mstarks01
Copy link
Copy Markdown
Contributor

I would expect something like <recurse>no</recurse> in the syscheck section of ossec.conf. That's just off the top of my head--there may be a more elegant name. This way, existing configurations would not be affected.

@ddpbsd
Copy link
Copy Markdown
Member

ddpbsd commented Aug 25, 2015

Since the current behavior of defining directories multiple times produces undesirable results, I don't know if this really needs a configuration option. I haven't tested this pull, but if it fixes the interaction, I'm fine with it changing a currently broken operation.

@mstarks01
Copy link
Copy Markdown
Contributor

Perhaps I am misunderstanding this. If I currently have /etc defined, then I'll receive alerts if /etc/foo/bar changes. With this patch, wouldn't I have to manually enumerate all current and future subdirectories and define them individually? Wouldn't I have to maintain the configuration file manually as people add directories?

@ddpbsd
Copy link
Copy Markdown
Member

ddpbsd commented Aug 25, 2015

The way I understand it (again, I haven't looked at it much) is that you can define /etc to check_all, but /etc/important_file to change ownership. The size and md5 of important_file wouldn't cause alerts, but a change in owner would. Everything else should be monitored as before.

@doke2
Copy link
Copy Markdown
Contributor Author

doke2 commented Aug 25, 2015

My intent is very close to what Dan described, except it just applies to subdirectories, not files (that might make sense as future work).

syscheck currently itterates through the list of directories. On each one, it does a recursive scan of all subdirectories and files, similar to unix find(1). If you have both /etc and /etc/important_dir defined in the config, the itteration loop will do a find in each. So things in /etc/important_dir get scanned twice, and entered into the database twice.

This change is supposed to let it realize that it has a listed starting directory of /etc/important_dir, and skip that while scanning /etc. Then /etc/important_dir will be scanned later when the itteration loop gets to it. All of it's sibling directories under /etc would be scanned as normal.

This lets you have different check options on /etc/important_dir, ie report_changes. It also lets you scan all of /var, but avoid checksumming the big log files. That was my main reason for writing this, syscheck was eating most of my disk I/O bandwidth, repeatedly sha1 and md5 summing my tomcat logs.

I've added a skip_subdir="yes" option to the directories tag. That lets people decide if they want this or not. This does use more cpu to do the directory name comparisons.

@doke2
Copy link
Copy Markdown
Contributor Author

doke2 commented Aug 25, 2015

I wasn't clear about the cpu usage. This change uses more cpu to compare each subdirectory found with the list of top level starting directories. So having it optional can reduce the cpu used by syscheck.

This works by having the recursive filesystem crawl strcmp() every directory if finds with the list of target directories. That could be made more efficent, at a cost in complexity. Maybe I should do the subdir checks between the top level starting directories at the beginning and keep a list of subdirs of each dir? Then while crawling /usr, I wouldn't be tring to strcmp /usr/local, /usr/share, ... with /etc, /etc/important_dir, /usr ...

@mstarks01
Copy link
Copy Markdown
Contributor

This sounds good. <ignore> not really meaning ignore (from a scanning pov) also sort of gets addressed as a side-effect.

@doke2
Copy link
Copy Markdown
Contributor Author

doke2 commented Aug 25, 2015

I think ignore in syscheck was fixed somewhere between 2.7.1 and 2.8.2. In 2.8.2, ignore seems to apply to syscheck scanning, not just reporting.

@dcid
Copy link
Copy Markdown

dcid commented Feb 25, 2016

@doke2 That depends where you apply it. On the manager-side, it is reporting only. On the agent side, it is ignored on the scanning time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants