refactor Callback implementation #677

Open
ekapadi wants to merge 1 commit into neutrons:next from ekapadi:EWM16074_live_data_memory_leak


@ekapadi
Contributor

@ekapadi ekapadi commented Apr 28, 2026

Description of work

This PR deals with a major memory-allocation issue in SNAPRed caused by the previous implementation and usage of the Callback class by MantidSnapper.

The old implementation was creating a new Callback class every time callback(clazz) was called, then dynamically attaching delegated methods to that class before returning an instance. This meant we were repeatedly allocating distinct callback class objects, each with its own type-object overhead and its own set of generated forwarding methods. The memory issue appears to have come from both the volume of those class allocations and the fact that the generated class objects persisted longer than expected. This refactor fixes that by moving to a stable top-level Callback, generating forwarding behavior via CallbackMeta, and caching one subclass per wrapped type instead of recreating classes on every call.

Explanation of work

This commit includes the following changes:

  • Replace the define-class-in-closure pattern with a cached per-type Callback subclass.
  • Use CallbackMeta to generate forwarding magic methods at class creation time instead of patching them on afterward.
  • Expand magic-method pass-through so wrapped primitive types and other built-ins behave more transparently.
  • Centralize forwarding behavior in _FORWARDED_MAGIC_METHODS and _make_forwarder() to simplify maintenance.
  • Tighten attribute access/assignment handling to better separate internal state from forwarded behavior.
  • Preserve the existing “not populated” safeguards while improving debugging via _wrapped_type, __repr__, and __str__.
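As a rough illustration of how these pieces might fit together, here is a minimal sketch under stated assumptions. The names `CallbackMeta`, `_FORWARDED_MAGIC_METHODS`, `_make_forwarder`, `_wrapped_type`, and `callback(clazz)` come from this description; the method list, the `update()`/`get()` methods, the `_SUBCLASS_CACHE` dictionary, and the error message are hypothetical stand-ins, and the attribute access/assignment handling is omitted.

```python
from typing import Dict

# Illustrative subset only; the PR's actual list is not reproduced here.
_FORWARDED_MAGIC_METHODS = ("__add__", "__mul__", "__len__", "__getitem__")


def _make_forwarder(name: str):
    """Build a method that delegates `name` to the wrapped value."""
    def forwarder(self, *args, **kwargs):
        return getattr(self.get(), name)(*args, **kwargs)

    forwarder.__name__ = name
    return forwarder


class CallbackMeta(type):
    def __new__(mcs, clsname, bases, namespace, **kwargs):
        # Generate forwarders once, on the root class; subclasses inherit them.
        if not any(isinstance(base, CallbackMeta) for base in bases):
            for name in _FORWARDED_MAGIC_METHODS:
                namespace.setdefault(name, _make_forwarder(name))
        return super().__new__(mcs, clsname, bases, namespace, **kwargs)


class Callback(metaclass=CallbackMeta):
    _wrapped_type: type = object

    def __init__(self):
        self._populated = False
        self._value = None

    def update(self, value):  # hypothetical setter name
        self._value = value
        self._populated = True

    def get(self):  # hypothetical getter name
        if not self._populated:
            raise RuntimeError("Callback has not been populated")
        return self._value

    def __repr__(self):
        state = repr(self._value) if self._populated else "<not populated>"
        return f"{type(self).__name__}({state})"


_SUBCLASS_CACHE: Dict[type, type] = {}  # assumed cache: one subclass per type


def callback(clazz: type) -> Callback:
    sub = _SUBCLASS_CACHE.get(clazz)
    if sub is None:
        sub = CallbackMeta(
            f"Callback[{clazz.__name__}]", (Callback,), {"_wrapped_type": clazz}
        )
        _SUBCLASS_CACHE[clazz] = sub
    return sub()


cb = callback(int)
cb.update(5)
print(cb + 3)    # 8
print(repr(cb))  # Callback[int](5)
```

Because Python looks up magic methods on the type rather than on the instance, the forwarders have to live on the class. Generating them in the metaclass keeps that in one place instead of patching each new class after creation.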

Note: the next change was not strictly necessary, but it makes testing on a laptop a bit easier:

  • LoadLiveData algorithm deletion: match 'MantidSnapper' changes from EWM13905

To test

New unit tests were added to cover the new changes.

Dev testing

Two new test scripts are provided at tests/cis_tests/live_data_memory_leak.py and tests/cis_tests/live_data_memory_leak_v2.py. The former is a minimal reproducer of the metadata-only loop from the live-data pane; the latter is a mantid-only version of the same thing.

These scripts will run the metadata-only part of SNAPRed's live-data loop, and display a memory-usage summary after each execution. Before these changes, the former script would only run for 20 cycles or so before triggering an OOM-kill on a laptop with 32 GB of memory. After these changes, I've successfully run the script for over 12 hours of cycling.
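For readers who want a similar per-cycle summary in their own reproducer, the following is a minimal sketch and not the contents of the test scripts. It assumes Linux, where the resident set size is reported as `VmRSS` in `/proc/self/status`. `rss_kb` and `run_cycles` are hypothetical helper names, and `step` stands in for one pass of the metadata-only loop.

```python
import time


def rss_kb(status_path="/proc/self/status"):
    """Return this process's resident set size in kB (Linux only)."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("VmRSS:"):
                # Line looks like: "VmRSS:   2819044 kB"
                return int(line.split()[1])
    raise RuntimeError("VmRSS not found in " + status_path)


def run_cycles(step, cycles, report=print):
    """Call `step` repeatedly, reporting elapsed time and RSS after each call."""
    start = time.monotonic()
    for _ in range(cycles):
        step()
        report(f"{time.monotonic() - start:.4f} seconds: RSS {rss_kb()} kB")
```

A steadily rising RSS column across many cycles is the signal to look for. A few cycles of growth followed by a plateau is more likely allocator warm-up.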

For convenience, I've attached my version of dev.yml. I generally use a mirror of the relevant /SNS tree. Note that in the following you must set LD_PRELOAD; otherwise Mantid's extensive memory fragmentation will confuse the test results. (I actually used jemalloc for my tests, but tbbmalloc should work as well.)

The commands to run the test script should be:

export LD_PRELOAD=${HOME}/mambaforge/envs/mantid-developer/lib/libtbbmalloc.so
env=dev tests/cis_tests/live_data_memory_leak.py

Link to EWM item

EWM#16074

Verification

  • the author has read the EWM story and acceptance criteria
  • the reviewer has read the EWM story and acceptance criteria
  • the reviewer certifies the acceptance criteria below reflect the criteria in EWM

Acceptance Criteria

This list is for ease of reference, and does not replace reading the EWM story as part of the review. Verify this list matches the EWM story before reviewing.

  • acceptance criterion 1
  • acceptance criterion 2

@ekapadi
Contributor Author

ekapadi commented Apr 28, 2026

dev.yml

@ekapadi ekapadi force-pushed the EWM16074_live_data_memory_leak branch from 797f39c to dea2a2d on April 28, 2026 10:22
@codecov

codecov Bot commented Apr 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.42%. Comparing base (db8da75) to head (5b40f3e).

Additional details and impacted files
@@            Coverage Diff             @@
##             next     #677      +/-   ##
==========================================
+ Coverage   96.37%   96.42%   +0.05%     
==========================================
  Files          78       78              
  Lines        7138     7160      +22     
==========================================
+ Hits         6879     6904      +25     
+ Misses        259      256       -3     
Flag Coverage Δ
integration 49.28% <61.90%> (-0.02%) ⬇️
unittests 96.14% <100.00%> (+0.05%) ⬆️


@ekapadi ekapadi force-pushed the EWM16074_live_data_memory_leak branch 2 times, most recently from 096267a to 2650fb4 on April 28, 2026 10:30
@walshmm
Collaborator

walshmm commented Apr 29, 2026

logs from running tests/cis_tests/live_data_memory_leak.py on this branch against some live data (25 iterations):

(base) [wqp@analysis-node22 SNAPRed]$ export LD_PRELOAD=${HOME}/git/mantid/.pixi/envs/default/lib/libtbbmalloc.so && env=kort pixi run python ./tests/cis_tests/live_data_memory_leak.py 
2026-04-29 13:19:19 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:19:29 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
9.8196 seconds: RSS 2819044 kB
2026-04-29 13:19:59 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:20:06 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
47.8014 seconds: RSS 3407248 kB
2026-04-29 13:20:37 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:20:44 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
85.3226 seconds: RSS 4090900 kB
2026-04-29 13:21:14 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:21:22 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
122.9452 seconds: RSS 4094924 kB
2026-04-29 13:21:52 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 3184.9 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 12897.2 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 2231.1 microseconds)
2026-04-29 13:22:00 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
161.2318 seconds: RSS 4078112 kB
2026-04-29 13:22:30 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:22:37 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
198.6535 seconds: RSS 4083300 kB
2026-04-29 13:23:08 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:23:16 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
237.7523 seconds: RSS 4083700 kB
2026-04-29 13:23:47 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:23:55 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
276.7369 seconds: RSS 4751496 kB
2026-04-29 13:24:26 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 4835.5 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 7382.2 microseconds)
2026-04-29 13:24:33 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
314.5559 seconds: RSS 4929084 kB
2026-04-29 13:25:03 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:25:11 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
352.8536 seconds: RSS 5526340 kB
2026-04-29 13:25:42 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:25:50 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
391.3630 seconds: RSS 6194036 kB
2026-04-29 13:26:20 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:26:28 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
429.1808 seconds: RSS 6148340 kB
2026-04-29 13:26:58 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:27:05 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
466.8648 seconds: RSS 6797404 kB
2026-04-29 13:27:36 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:27:44 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
505.0951 seconds: RSS 6947604 kB
2026-04-29 13:28:14 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 3490.5 microseconds)
2026-04-29 13:28:23 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
545.0874 seconds: RSS 7610408 kB
2026-04-29 13:28:54 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 1889.6 microseconds)
2026-04-29 13:29:03 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
584.3631 seconds: RSS 8134644 kB
2026-04-29 13:29:33 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:29:40 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
621.7414 seconds: RSS 8291608 kB
2026-04-29 13:30:11 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:30:18 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
659.4562 seconds: RSS 7628048 kB
2026-04-29 13:30:48 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:30:56 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
697.4128 seconds: RSS 7484724 kB
2026-04-29 13:31:26 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 2737.1 microseconds)
2026-04-29 13:31:34 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
735.5804 seconds: RSS 8073492 kB
2026-04-29 13:32:05 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:32:12 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
773.5111 seconds: RSS 8223768 kB
2026-04-29 13:32:42 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 1792.6 microseconds)
2026-04-29 13:32:50 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
811.4310 seconds: RSS 8229576 kB
2026-04-29 13:33:20 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:33:28 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
849.3718 seconds: RSS 8124744 kB
2026-04-29 13:33:58 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:34:05 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
886.9621 seconds: RSS 8811236 kB
2026-04-29 13:34:36 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 11609.6 microseconds)
2026-04-29 13:34:46 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
927.4324 seconds: RSS 8913496 kB
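
To compare runs like these without reading every line, the RSS samples can be pulled out of the log text. This is a minimal sketch that assumes the `<elapsed> seconds: RSS <n> kB` line format shown above. `parse_rss` and `growth_kb_per_cycle` are hypothetical helper names, and the sample text is the first three RSS lines of the log above plus one unrelated line.

```python
import re

# Matches summary lines such as "9.8196 seconds: RSS 2819044 kB".
_RSS_LINE = re.compile(r"^(\d+\.\d+) seconds: RSS (\d+) kB$")


def parse_rss(log_text):
    """Return (elapsed_seconds, rss_kb) pairs, skipping all other log lines."""
    samples = []
    for line in log_text.splitlines():
        match = _RSS_LINE.match(line.strip())
        if match:
            samples.append((float(match.group(1)), int(match.group(2))))
    return samples


def growth_kb_per_cycle(samples):
    """Average RSS change per cycle between the first and last sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return (samples[-1][1] - samples[0][1]) / (len(samples) - 1)


sample_log = """\
9.8196 seconds: RSS 2819044 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 3184.9 microseconds)
47.8014 seconds: RSS 3407248 kB
85.3226 seconds: RSS 4090900 kB
"""

samples = parse_rss(sample_log)
print(len(samples))                  # 3
print(growth_kb_per_cycle(samples))  # 635928.0
```

The endpoint average is a crude measure. RSS can dip between cycles, so a fit over all samples would be more robust for long runs.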

Logs on next (Also 25 iterations):

(base) [wqp@analysis-node22 SNAPRed]$ export LD_PRELOAD=${HOME}/git/mantid/.pixi/envs/default/lib/libtbbmalloc.so && env=kort pixi run python ./tests/cis_tests/live_data_memory_leak.py 
2026-04-29 13:38:49 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:38:58 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
9.2378 seconds: RSS 2946756 kB
2026-04-29 13:39:29 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:39:36 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
47.1145 seconds: RSS 3593608 kB
2026-04-29 13:40:07 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:40:15 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
85.4126 seconds: RSS 3806796 kB
2026-04-29 13:40:45 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:40:52 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
122.4194 seconds: RSS 3648356 kB
2026-04-29 13:41:22 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:41:30 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
160.2185 seconds: RSS 3578988 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 6218.9 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 3466.9 microseconds)
2026-04-29 13:42:00 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 2700.5 microseconds)
2026-04-29 13:42:09 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
199.5095 seconds: RSS 3587128 kB
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:42:40 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 2417.3 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 6648.8 microseconds)
2026-04-29 13:42:47 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
237.4405 seconds: RSS 3624404 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 6965.6 microseconds)
2026-04-29 13:43:18 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:43:25 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
275.8400 seconds: RSS 4274604 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 2754.1 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 9356.2 microseconds)
2026-04-29 13:43:56 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:44:04 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
314.4730 seconds: RSS 4429660 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 3162 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 6334.9 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 9170.8 microseconds)
2026-04-29 13:44:35 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 4979.1 microseconds)
2026-04-29 13:44:42 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
352.6590 seconds: RSS 4358664 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 12657.4 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 2673.3 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 9420.6 microseconds)
2026-04-29 13:45:13 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:45:21 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
391.6170 seconds: RSS 4402144 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 6220.8 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 3430.9 microseconds)
2026-04-29 13:45:52 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 12192.8 microseconds)
2026-04-29 13:46:00 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
430.5162 seconds: RSS 4501196 kB
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:46:31 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:46:38 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
468.6695 seconds: RSS 4403288 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 5970.4 microseconds)
2026-04-29 13:47:09 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:47:17 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
507.4282 seconds: RSS 4373840 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 10358.5 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 3807.4 microseconds)
2026-04-29 13:47:48 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 6965.5 microseconds)
2026-04-29 13:47:55 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
545.8884 seconds: RSS 5067740 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 1918.1 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 5238.7 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 2178.5 microseconds)
2026-04-29 13:48:26 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:48:33 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
584.0530 seconds: RSS 5324868 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 4447.1 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 5742.6 microseconds)
2026-04-29 13:49:04 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:49:12 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
622.4917 seconds: RSS 5218228 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 6329.4 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 2221.2 microseconds)
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:49:43 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:49:51 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
661.2068 seconds: RSS 5878096 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 2445.4 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 2873.3 microseconds)
2026-04-29 13:50:21 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 11844.8 microseconds)
2026-04-29 13:50:28 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
698.6068 seconds: RSS 6045328 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 4397.4 microseconds)
2026-04-29 13:51:00 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:51:09 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
739.2486 seconds: RSS 6048612 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 3502.7 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 2025 microseconds)
2026-04-29 13:51:39 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:51:47 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
777.3068 seconds: RSS 5989712 kB
2026-04-29 13:52:18 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 1819.6 microseconds)
2026-04-29 13:52:25 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
816.0557 seconds: RSS 5987952 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 8553.8 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 7483.4 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 6285.1 microseconds)
2026-04-29 13:52:56 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:53:04 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
854.8704 seconds: RSS 5988012 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 11210.2 microseconds)
2026-04-29 13:53:35 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:53:42 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
893.1472 seconds: RSS 5844504 kB
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 2538.4 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 7581.5 microseconds)
2026-04-29 13:54:14 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - LoadLiveData - load live-data chunk
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
2026-04-29 13:54:22 - INFO     - snapred.backend.recipe.algorithm.MantidSnapper - DeleteWorkspace - delete temporary workspace
932.2345 seconds: RSS 5845780 kB
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:E for devId=3, pvId=1; skipping.
SNSLiveEventDataListener-[Error] Ignoring duplicate process variable BL3:SE:Potentiostat:I for devId=3, pvId=2; skipping.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 1 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Error] Ignoring variable value packet for device 3, variable 2 because we haven't received a device descriptor packet for it.
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179646 (TofF: 8092.3 microseconds)
SNSLiveEventDataListener-[Warning] Invalid pixel ID: 1179645 (TofF: 11931.2 microseconds)

These were both run on analysis. I'm not sure if it's dependent on what is being spat out by the current live run, but I'm not sure this is properly highlighting the memory leak.

2819044, 3407248, 4090900, 4094924, 4078112, 4083300, 4083700, 4751496, 4929084, 5526340, 6194036, 6148340, 6797404, 6947604, 7610408, 8134644, 8291608, 7628048, 7484724, 8073492, 8223768, 8229576, 8124744, 8811236, 8913496

VS

```
2946756, 3593608, 3806796, 3648356, 3578988, 3587128, 3624404, 4274604, 4429660, 4358664, 4402144, 4501196, 4403288, 4373840, 5067740, 5324868, 5218228, 5878096, 6045328, 6048612, 5989712, 5987952, 5988012, 5844504, 5845780
```

next just happened to be better?

@ekapadi
Contributor Author

ekapadi commented Apr 30, 2026

The logs are normal. We have some clean-up to do (in the IDF and with DAQ) regarding the PV initialization. As per my comment on Slack: could you please run with `jemalloc` instead (`tbbmalloc` will fragment)? Regarding the results: I think you may still be in the region before the fragmentation starts to get bad. I'd like to see about 200 cycles. However, be "polite" on analysis: when things get bad you can easily consume more than 64 GB, so you need to check back often enough to make sure that's not happening. (Otherwise, this is quite an unexpected result, and I'm not sure what's going on! :( )
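One way to stay "polite" on a shared node is to have the loop check its own resident set size and stop before it grows too large. The following is only a sketch under stated assumptions: it reads `VmRSS` from `/proc/self/status` (Linux-only), and the helper names `parse_rss_kb`, `current_rss_kb`, and `check_rss` are hypothetical, not part of SNAPRed or the test script.

```python
import re


def parse_rss_kb(status_text: str) -> int:
    """Return the VmRSS value (in kB) found in the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def current_rss_kb() -> int:
    """Read this process's resident set size (Linux-only)."""
    with open("/proc/self/status") as status_file:
        return parse_rss_kb(status_file.read())


def check_rss(rss_kb: int, limit_kb: int) -> None:
    """Raise MemoryError once the measured RSS exceeds the chosen limit."""
    if rss_kb > limit_kb:
        raise MemoryError(f"RSS {rss_kb} kB exceeds limit {limit_kb} kB")
```

Inside the cycling loop, a call such as `check_rss(current_rss_kb(), 32 * 1024 * 1024)` would abort the run at roughly 32 GB; the limit is an arbitrary illustrative value.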

@walshmm
Collaborator

walshmm commented Apr 30, 2026

I ran both again for about an hour each:
this branch got up to 15090056 kB
next got up to 14642748 kB

@ekapadi
Contributor Author

ekapadi commented May 5, 2026

Also adding here (from Slack): when you turn on `tracemalloc` in the test script, the results show why I made the modifications that I did. The main issue was that the `Callback` class was itself being defined inside of a closure (each pass-through was a distinct scope). I had also thought that the main fragmentation issue was that the callbacks were keeping references around longer than they needed to, but I'm now convinced that is not actually what was going on. Regardless, I think we should go forward with the changes in this PR, and spin off the fragmentation issue into another story. Here is a summary of my test results. `jemalloc` is loaded via `LD_PRELOAD`, and `<this branch>` and `<next>` are each run for 250 cycles of the script:

==============================================
EWM16074_live_data_memory_leak 347756ab1f (libjemalloc.so LD_PRELOAD)

DeleteWorkspace-[Notice] DeleteWorkspace successful, Duration 0.00 seconds
8890.9876 seconds: RSS 1742600 kB
-----------------------------------------------------------
--------------- Memory-allocation traces ------------------
-----------------------------------------------------------
[ Top 10 ]
/home/ux0/workspaces/SNAPRed/.pixi/envs/default/lib/python3.12/collections/__init__.py:508: size=55.1 KiB, count=242, average=233 B
/home/ux0/workspaces/SNAPRed/tests/cis_tests/live_data_memory_leak.py:96: size=50.7 KiB, count=2, average=25.4 KiB
/home/ux0/workspaces/SNAPRed/src/snapred/backend/dao/RunMetadata.py:342: size=49.7 KiB, count=1149, average=44 B
<frozen importlib._bootstrap_external>:757: size=49.2 KiB, count=971, average=52 B
/home/ux0/workspaces/SNAPRed/src/snapred/backend/dao/RunMetadata.py:334: size=39.1 KiB, count=500, average=80 B
/home/ux0/workspaces/SNAPRed/src/snapred/backend/dao/RunMetadata.py:331: size=39.1 KiB, count=500, average=80 B
/home/ux0/workspaces/SNAPRed/src/snapred/backend/recipe/algorithm/MantidSnapper.py:156: size=29.3 KiB, count=250, average=120 B
/home/ux0/workspaces/SNAPRed/src/snapred/backend/recipe/algorithm/MantidSnapper.py:308: size=21.9 KiB, count=250, average=90 B
/home/ux0/workspaces/SNAPRed/src/snapred/backend/dao/RunMetadata.py:340: size=21.2 KiB, count=376, average=58 B
/home/ux0/workspaces/SNAPRed/.pixi/envs/default/lib/python3.12/re/__init__.py:224: size=20.8 KiB, count=454, average=47 B
-----------------------------------------------------------
(snapred) (base) [ux0@LAP135679 SNAPRed]$ 

==============================================
next f65ca12829825 (libjemalloc.so LD_PRELOAD)

DeleteWorkspace-[Notice] DeleteWorkspace successful, Duration 0.00 seconds
8901.5263 seconds: RSS 2945964 kB
-----------------------------------------------------------
--------------- Memory-allocation traces ------------------
-----------------------------------------------------------
[ Top 10 ]
/home/ux0/workspaces/SNAPRed/src/snapred/meta/Callback.py:2: size=2270 KiB, count=21921, average=106 B
/home/ux0/workspaces/SNAPRed/src/snapred/meta/Callback.py:66: size=824 KiB, count=21096, average=40 B
/home/ux0/workspaces/SNAPRed/src/snapred/meta/Callback.py:31: size=601 KiB, count=1463, average=421 B
/home/ux0/workspaces/SNAPRed/src/snapred/meta/Callback.py:10: size=387 KiB, count=144, average=2755 B
/home/ux0/workspaces/SNAPRed/tests/cis_tests/live_data_memory_leak.py:96: size=50.7 KiB, count=2, average=25.4 KiB
/home/ux0/workspaces/SNAPRed/src/snapred/backend/dao/RunMetadata.py:342: size=50.7 KiB, count=1168, average=44 B
<frozen importlib._bootstrap_external>:757: size=49.1 KiB, count=974, average=52 B
/home/ux0/workspaces/SNAPRed/src/snapred/backend/dao/RunMetadata.py:334: size=39.1 KiB, count=500, average=80 B
/home/ux0/workspaces/SNAPRed/src/snapred/backend/dao/RunMetadata.py:331: size=39.1 KiB, count=500, average=80 B
/home/ux0/workspaces/SNAPRed/.pixi/envs/default/lib/python3.12/logging/__init__.py:457: size=36.2 KiB, count=309, average=120 B
-----------------------------------------------------------
(snapred) (base) [ux0@LAP135679 SNAPRed]$ 
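For reference, traces in the format shown above can be produced with the standard-library `tracemalloc` module. This is a minimal sketch, not the code in the test script: the helper name `top_allocations` and the `retained` list (a stand-in for the code under test) are assumptions made for illustration.

```python
import tracemalloc


def top_allocations(snapshot, limit=10):
    """Format the largest allocation sites in a snapshot, grouped by source line."""
    stats = snapshot.statistics("lineno")
    return [str(stat) for stat in stats[:limit]]


tracemalloc.start()
retained = [bytes(1000) for _ in range(100)]  # stand-in for the code under test
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

print("[ Top 10 ]")
for line in top_allocations(snapshot):
    print(line)
```

Each printed line has the form `<file>:<line>: size=..., count=..., average=...`, which is why the repeated `Callback.py` entries stand out in the `next` results.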

walshmm
walshmm previously approved these changes May 7, 2026
Collaborator

@walshmm walshmm left a comment

Was able to reproduce the callback results. Approved!

on next:

-----------------------------------------------------------
--------------- Memory-allocation traces ------------------
-----------------------------------------------------------
[ Top 10 ]
/SNS/users/wqp/git/SNAPRed/src/snapred/meta/Callback.py:2: size=2200 KiB, count=21151, average=107 B
/SNS/users/wqp/git/SNAPRed/src/snapred/meta/Callback.py:66: size=802 KiB, count=20544, average=40 B
/SNS/users/wqp/git/SNAPRed/src/snapred/meta/Callback.py:31: size=592 KiB, count=1422, average=426 B
/SNS/users/wqp/git/SNAPRed/src/snapred/meta/Callback.py:10: size=382 KiB, count=140, average=2791 B
/SNS/users/wqp/git/SNAPRed/./tests/cis_tests/live_data_memory_leak.py:96: size=50.7 KiB, count=2, average=25.4 KiB
/SNS/users/wqp/git/SNAPRed/src/snapred/backend/dao/RunMetadata.py:342: size=50.6 KiB, count=1168, average=44 B
<frozen importlib._bootstrap_external>:757: size=49.3 KiB, count=976, average=52 B
/SNS/users/wqp/git/SNAPRed/.pixi/envs/qa/lib/python3.12/threading.py:293: size=46.0 KiB, count=124, average=380 B
/SNS/users/wqp/git/SNAPRed/src/snapred/backend/dao/RunMetadata.py:334: size=39.1 KiB, count=500, average=80 B
/SNS/users/wqp/git/SNAPRed/src/snapred/backend/dao/RunMetadata.py:331: size=39.1 KiB, count=500, average=80 B
-----------------------------------------------------------

on branch:

--------------- Memory-allocation traces ------------------
-----------------------------------------------------------
[ Top 10 ]
/SNS/users/wqp/git/SNAPRed/./tests/cis_tests/live_data_memory_leak.py:96: size=50.7 KiB, count=2, average=25.4 KiB
/SNS/users/wqp/git/SNAPRed/src/snapred/backend/dao/RunMetadata.py:342: size=50.2 KiB, count=1149, average=45 B
<frozen importlib._bootstrap_external>:757: size=49.5 KiB, count=973, average=52 B
/SNS/users/wqp/git/SNAPRed/.pixi/envs/qa/lib/python3.12/threading.py:293: size=46.0 KiB, count=124, average=380 B
/SNS/users/wqp/git/SNAPRed/src/snapred/backend/dao/RunMetadata.py:334: size=39.1 KiB, count=500, average=80 B
/SNS/users/wqp/git/SNAPRed/src/snapred/backend/dao/RunMetadata.py:331: size=39.1 KiB, count=500, average=80 B
/SNS/users/wqp/git/SNAPRed/.pixi/envs/qa/lib/python3.12/collections/__init__.py:508: size=32.7 KiB, count=148, average=226 B
/SNS/users/wqp/git/SNAPRed/src/snapred/backend/recipe/algorithm/MantidSnapper.py:156: size=28.7 KiB, count=245, average=120 B
/SNS/users/wqp/git/SNAPRed/.pixi/envs/qa/lib/python3.12/re/__init__.py:224: size=24.1 KiB, count=525, average=47 B
/SNS/users/wqp/git/SNAPRed/src/snapred/backend/recipe/algorithm/MantidSnapper.py:308: size=21.9 KiB, count=250, average=90 B
-----------------------------------------------------------

@ekapadi ekapadi force-pushed the EWM16074_live_data_memory_leak branch 2 times, most recently from b6082ec to dd452c1 Compare May 12, 2026 17:42
The old implementation was creating a new `Callback` class every time `callback(clazz)` was called, then dynamically attaching delegated methods to that class before returning an instance. This meant we were repeatedly allocating distinct callback class objects, each with its own type-object overhead and its own set of generated forwarding methods. The memory issue appears to have come from both the volume of those class allocations and the fact that the generated class objects persisted longer than expected. This refactor fixes that by moving to a stable top-level `Callback`, generating forwarding behavior via `CallbackMeta`, and caching one subclass per wrapped type instead of recreating classes on every call.
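To illustrate why the old pattern allocates so many type objects, here is a minimal stand-in (not the original SNAPRed code): a `class` statement inside a function body is executed on every call, so each call yields a distinct class. The function name `callback_in_closure` is hypothetical.

```python
def callback_in_closure(clazz):
    """Hypothetical stand-in for the old pattern: the class is defined per call."""

    class Callback:
        _wrapped_type = clazz

    return Callback()


a = callback_in_closure(int)
b = callback_in_closure(int)

# Same name and same wrapped type, but two separate class objects.
print(type(a).__name__, type(b).__name__)
print(type(a) is type(b))  # False
```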

  - Replace the define-class-in-closure pattern with a cached per-type `Callback` subclass.
  - Use `CallbackMeta` to generate forwarding magic methods at class creation time instead of patching them on afterward.
  - Expand magic-method pass-through so wrapped primitive types and other built-ins behave more transparently.
  - Centralize forwarding behavior in `_FORWARDED_MAGIC_METHODS` and `_make_forwarder()` to simplify maintenance.
  - Tighten attribute access/assignment handling to better separate internal state from forwarded behavior.
  - Preserve the existing “not populated” safeguards while improving debugging via `_wrapped_type`, `__repr__`, and `__str__`.
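
As an illustration of the caching pattern described above, here is a minimal sketch, not the PR's actual code. The contents of `_FORWARDED_MAGIC_METHODS`, the `update()`/`get()` methods, the `_callback_class()` helper, and the use of `functools.lru_cache` are all assumptions made for the example; only the names `Callback`, `CallbackMeta`, `_make_forwarder`, `_wrapped_type`, and `callback(clazz)` come from the description.

```python
from functools import lru_cache

# Hypothetical subset; the PR's real list of forwarded methods is not shown here.
_FORWARDED_MAGIC_METHODS = ("__add__", "__len__", "__getitem__", "__int__", "__float__")


def _make_forwarder(name):
    """Build a method that delegates `name` to the wrapped value."""

    def forwarder(self, *args, **kwargs):
        return getattr(self.get(), name)(*args, **kwargs)

    forwarder.__name__ = name
    return forwarder


class CallbackMeta(type):
    """Attach forwarding magic methods when a class is created."""

    def __new__(mcls, clsname, bases, namespace):
        for name in _FORWARDED_MAGIC_METHODS:
            namespace.setdefault(name, _make_forwarder(name))
        return super().__new__(mcls, clsname, bases, namespace)


class Callback(metaclass=CallbackMeta):
    _wrapped_type = object

    def __init__(self):
        self._val = None
        self._populated = False

    def update(self, val):  # assumed method name
        self._val = val
        self._populated = True

    def get(self):  # assumed method name
        if not self._populated:
            raise AttributeError("Callback has not been populated yet")
        return self._val


@lru_cache(maxsize=None)
def _callback_class(clazz):
    # Runs once per wrapped type; later calls reuse the cached subclass.
    return CallbackMeta(f"Callback[{clazz.__name__}]", (Callback,), {"_wrapped_type": clazz})


def callback(clazz):
    """Return a fresh instance of the cached subclass for `clazz`."""
    return _callback_class(clazz)()
```

With this shape, repeated calls such as `callback(int)` allocate only a small instance each time, while the class object and its forwarders are created once per wrapped type.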

Note (this next change was not strictly necessary, but it makes testing on a laptop a bit easier):

  - `LoadLiveData` algorithm deletion: match 'MantidSnapper' changes from EWM13905

To test:

Two new test scripts are provided at `tests/cis_tests/live_data_memory_leak.py` and `tests/cis_tests/live_data_memory_leak_v2.py`.  The former is a minimal reproducer of the metadata-only loop from the live-data pane; the latter is a mantid-only version of the same thing.

These scripts will run the metadata-only part of SNAPRed's live-data loop, and display a memory-usage summary after each execution.  Before these changes, the former script would only run for 20 cycles or so before triggering an OOM-kill on a laptop with 32 GB of memory.  After these changes, I've successfully run the script for over 12 hours of cycling.
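
For readers unfamiliar with this kind of per-cycle summary, the following is a rough sketch of how one can be produced with the standard-library `tracemalloc` module. It is not the content of the test scripts: `run_cycle()` is a hypothetical stand-in for one metadata-only cycle, and the deliberately growing `leak` list exists only so the summary has something to show.

```python
import tracemalloc


def run_cycle(leak):
    # Hypothetical stand-in for one metadata-only live-data cycle.
    leak.append(bytearray(1024))


def top_allocations(limit=5):
    """Return the largest allocation sites, grouped by file and line."""
    snapshot = tracemalloc.take_snapshot()
    return snapshot.statistics("lineno")[:limit]


tracemalloc.start()
leak = []
for cycle in range(3):
    run_cycle(leak)
    print(f"cycle {cycle}")
    for stat in top_allocations():
        # Each entry prints as "<file>:<line>: size=..., count=..., average=..."
        print(stat)
    print("-" * 59)
tracemalloc.stop()
```

An allocation site whose `size` and `count` keep growing from one cycle to the next is the usual sign of a leak.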

For convenience, I've attached my version of `dev.yml`.  I generally use a mirror of the relevant `/SNS` tree.  Note that in the following you must set `LD_PRELOAD`; otherwise, Mantid's extensive memory fragmentation will confuse the test results.  (I actually used `jemalloc` for my tests, but `tbbmalloc` should work as well.)

The commands to run the test script should be:
``` bash
export LD_PRELOAD=${HOME}/mambaforge/envs/mantid-developer/lib/libtbbmalloc.so
env=dev tests/scripts/live_data_memory_leak.py
```
@ekapadi ekapadi force-pushed the EWM16074_live_data_memory_leak branch from dd452c1 to 5b40f3e Compare May 12, 2026 21:21