Add the capability to report the actual OS memory usage (RSS) for the `pytorch_inference` process, similar to what was implemented for `autodetect` in #2846.
## Background

PR #2846 introduced reporting of actual memory usage (via `getrusage` RSS) for the `autodetect` process. This provides valuable insight into the real memory footprint of anomaly detection jobs as reported by the OS, rather than relying solely on internal memory tracking.
The `pytorch_inference` process currently:
- Has the infrastructure to report RSS values via `writeProcessStats()` (called on demand via an `E_ProcessStats` control message)
- Uses `CProcessStats::residentSetSize()` and `CProcessStats::maxResidentSetSize()` in `Main.cc`
- Does not periodically report this information back to the Java process
## Proposed Changes
- Add periodic reporting of system memory usage for `pytorch_inference`, similar to how `autodetect` updates the `E_TSADSystemMemoryUsage` and `E_TSADMaxSystemMemoryUsage` program counters.
- Include the RSS values in the output stream so they can be consumed by the Java side. Options include:
  - Adding new fields to an existing result type
  - Creating a new periodic stats message
  - Extending the `E_ProcessStats` response to be sent periodically
- The values to report:
  - `system_memory_bytes` - current resident set size (`CProcessStats::residentSetSize()`)
  - `max_system_memory_bytes` - peak resident set size (`CProcessStats::maxResidentSetSize()`)
## Files likely to be modified

- `bin/pytorch_inference/Main.cc` - Add periodic memory reporting
- `bin/pytorch_inference/CResultWriter.cc` / `CResultWriter.h` - Potentially extend the output format
- `bin/pytorch_inference/CCommandParser.cc` / `CCommandParser.h` - If new message types are needed
## Relates to