RFC: config: aggregate mpm/spm options under the detect node v1#14902
RFC: config: aggregate mpm/spm options under the detect node v1#14902
Conversation
There was a problem hiding this comment.
Pull request overview
This RFC proposes aggregating MPM (Multi-Pattern Matcher) and SPM (Single-Pattern Matcher) configuration options under nested nodes within the detect section. The changes move configuration options from the root level and flat detect structure to a more hierarchical organization under detect.mpm and detect.spm.
Changes:
- Moved
mpm-algotodetect.mpm.algowith enhanced inline documentation - Moved
detect.sgh-mpm-contexttodetect.mpm.sgh-context - Reorganized caching options from flat
detect.sgh-mpm-caching*to nesteddetect.mpm.cachestructure withenabled,path, andmax-agekeys (following PR #14838 suggestions) - Moved
spm-algotodetect.spm.algowith inline documentation
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| mpm: | ||
| # Select the multi pattern algorithm you want to run for scan/search | ||
| # in the engine. | ||
| # | ||
| # The supported algorithms are: | ||
| # "ac" - Aho-Corasick, default implementation | ||
| # "ac-ks" - Aho-Corasick, "Ken Steele" variant | ||
| # "hs" - Hyperscan, available when built with Hyperscan support | ||
| # | ||
| # The default value of "auto" will use "hs" if Hyperscan is | ||
| # available, "ac" otherwise. | ||
| # | ||
| # The mpm you choose also decides the distribution of mpm contexts for | ||
| # signature groups, specified by the conf - "detect.mpm.sgh-context". | ||
| # Selecting "ac" as the mpm would require "detect.mpm.sgh-context" | ||
| # to be set to "single", because of ac's memory requirements, unless the | ||
| # ruleset is small enough to fit in memory, in which case one can | ||
| # use "full" with "ac". The rest of the mpms can be run in "full" mode. | ||
| algo: auto | ||
| sgh-context: auto | ||
| cache: | ||
| # Cache MPM contexts to the disk to avoid rule compilation at the startup. | ||
| # Cache files are created in the standard library directory. | ||
| enabled: yes | ||
| path: @e_sghcachedir@ | ||
| # Maximum age for cached MPM databases before they are pruned. | ||
| # Accepts a combination of time units (s,m,h,d,w,y). | ||
| # Omit to use the default, 0 to disable. | ||
| # max-age: 7d | ||
| spm: | ||
| # Select the matching algorithm you want to use for single-pattern searches. | ||
| # | ||
| # Supported algorithms are "bm" (Boyer-Moore) and "hs" (Hyperscan, only | ||
| # available if Suricata has been built with Hyperscan support). | ||
| # | ||
| # The default of "auto" will use "hs" if available, otherwise "bm". | ||
| algo: auto |
There was a problem hiding this comment.
The configuration paths have been changed in the YAML file (e.g., mpm-algo moved to detect.mpm.algo), but the C code still reads from the old configuration paths. The code in src/detect-engine-mpm.c line 917 uses SCConfGet("mpm-algo", ...), which will not find the value at the new location detect.mpm.algo. Similar issues exist for:
spm-algo(used insrc/util-spm.cline 72)detect.sgh-mpm-context(used insrc/detect-engine.cline 2793)detect.sgh-mpm-caching(used insrc/detect-engine.cline 2458)detect.sgh-mpm-caching-path(used insrc/detect-engine.cline 2470)detect.sgh-mpm-caching-max-age(used insrc/detect-engine.cline 2555)
This PR needs corresponding C code changes to read from the new configuration paths, or it needs backward compatibility support to handle both old and new paths during a transition period.
| # Omit to use the default, 0 to disable. | ||
| # sgh-mpm-caching-max-age: 7d | ||
| mpm: | ||
| # Select the multi pattern algorithm you want to run for scan/search |
There was a problem hiding this comment.
Grammar issue: "you want to run for scan/search in the engine" is awkward. Should be "you want to run for scanning/searching in the engine" or "you want to use for pattern matching in the engine".
| # Select the multi pattern algorithm you want to run for scan/search | |
| # Select the multi pattern algorithm you want to use for pattern matching |
| mpm: | ||
| # Select the multi pattern algorithm you want to run for scan/search | ||
| # in the engine. | ||
| # | ||
| # The supported algorithms are: | ||
| # "ac" - Aho-Corasick, default implementation | ||
| # "ac-ks" - Aho-Corasick, "Ken Steele" variant | ||
| # "hs" - Hyperscan, available when built with Hyperscan support | ||
| # | ||
| # The default value of "auto" will use "hs" if Hyperscan is | ||
| # available, "ac" otherwise. | ||
| # | ||
| # The mpm you choose also decides the distribution of mpm contexts for | ||
| # signature groups, specified by the conf - "detect.mpm.sgh-context". | ||
| # Selecting "ac" as the mpm would require "detect.mpm.sgh-context" | ||
| # to be set to "single", because of ac's memory requirements, unless the | ||
| # ruleset is small enough to fit in memory, in which case one can | ||
| # use "full" with "ac". The rest of the mpms can be run in "full" mode. | ||
| algo: auto | ||
| sgh-context: auto | ||
| cache: | ||
| # Cache MPM contexts to the disk to avoid rule compilation at the startup. | ||
| # Cache files are created in the standard library directory. | ||
| enabled: yes | ||
| path: @e_sghcachedir@ | ||
| # Maximum age for cached MPM databases before they are pruned. | ||
| # Accepts a combination of time units (s,m,h,d,w,y). | ||
| # Omit to use the default, 0 to disable. | ||
| # max-age: 7d | ||
| spm: | ||
| # Select the matching algorithm you want to use for single-pattern searches. | ||
| # | ||
| # Supported algorithms are "bm" (Boyer-Moore) and "hs" (Hyperscan, only | ||
| # available if Suricata has been built with Hyperscan support). | ||
| # | ||
| # The default of "auto" will use "hs" if available, otherwise "bm". | ||
| algo: auto |
There was a problem hiding this comment.
Documentation in multiple files (doc/userguide/configuration/suricata-yaml.rst, doc/userguide/performance/hyperscan.rst, doc/userguide/performance/tuning-considerations.rst, etc.) still references the old configuration paths like mpm-algo, spm-algo, detect.sgh-mpm-context, sgh-mpm-caching, etc. These documentation files need to be updated to reflect the new nested structure under detect.mpm and detect.spm.
|
Information: QA skipped due to tag. Set to force a run. Pipeline = skip |
| cache: | ||
| # Cache MPM contexts to the disk to avoid rule compilation at the startup. | ||
| # Cache files are created in the standard library directory. | ||
| enabled: yes |
There was a problem hiding this comment.
Not really specific to this PR, but is the cache enabled by default even if this is commented out?
There was a problem hiding this comment.
yes, starting in Suricata 8, if the caching folder is available/writeable.
There was a problem hiding this comment.
We can comment it out then. This helps us make sure defaults are actually defaults, and not just a default because they are set that way in the configuration file.
As a follow-up on Victor's suggestions in #14838 I present the aggregated MPM/SPM configuration options under the detect node. I picked:
Victor only suggested aggregating the caching options, but I thought I would take it a step further with the other MPM/SPM options too.
Redmine ticket https://redmine.openinfosecfoundation.org/issues/8324
The main question is whether this is "too much" or if you see it as a valid improvement. I like it this way, but I understand it might bring some unneeded hurdles when converting from one major version to another. On the other hand, upgrading to a new major version should be done with caution, ideally manually, so I don't see it as a big deal.