Skip to content

fix: prevent DDPQueue unblock stack overflow via depth-based microtask breaking#11

Closed
welkinwong wants to merge 1 commit into
SkySignalAPM:mainfrom
welkinwong:main
Closed

fix: prevent DDPQueue unblock stack overflow via depth-based microtask breaking#11
welkinwong wants to merge 1 commit into
SkySignalAPM:mainfrom
welkinwong:main

Conversation

@welkinwong

Copy link
Copy Markdown

Summary

Fixed a stack overflow edge case in DDPQueueCollector.wrapUnblock() where very deep synchronous unblock chains could trigger RangeError: Maximum call stack size exceeded.

Added a depth guard with a hardcoded threshold (1000); once reached, the next unblock hop is deferred via queueMicrotask to break the current call stack safely.

Added a regression test for deep chained unblock wrappers to verify no stack overflow and exactly-once execution of the base unblock.

Fixed #7

Test Plan

Run:

npm test -- --grep "deep synchronous unblock chains|wrapUnblock \(Bug #7 regression\)|Cross-Collector: DDPQueue \+ MethodTracer interaction"

@mvogttech

Copy link
Copy Markdown
Contributor

Hi @welkinwong , thanks for the PR and for digging into this!

I took a look at the change and had a question — are you running the latest version of the agent that includes the _skySignalDDPQueueWrapped guard? That guard prevents sessions from being double-wrapped on hot code push or agent restart, which should keep the real-world unblock chain depth to 2-3 at most.

I ask because the depth-based microtask approach has a subtle downside — queueMicrotask defers the final originalUnblock() call, which breaks the synchronous contract that Meteor's DDP queue relies on to advance to the next message. It works in the test, but could cause unexpected timing issues in production.

@welkinwong

Copy link
Copy Markdown
Author

@mvogttech Hi, thanks for the follow-up.

I’ve installed the latest version (1.0.23), but I still hit:

RangeError: Maximum call stack size exceeded (in DDPQueueCollector.wrapUnblock path).
In our production workload this error also causes server instability (after it starts happening continuously, the server behavior becomes abnormal).

Our project characteristic is that we call this.unblock() in essentially all method and publish handlers. Under normal runtime traffic, this leads to repeated errors continuously, so we currently cannot keep SkySignal enabled in production.
Before this is fully resolved, could you consider adding a configuration option to disable only wrapUnblock behavior (while keeping other queue metrics if possible)? That would let us keep using SkySignal with reduced functionality instead of disabling the whole agent.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// retries. Once nesting gets too deep, defer one hop to break recursion.
if (self.unblockDepth >= 1000) {
queueMicrotask(() => {
originalUnblock();

Copilot AI Mar 7, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When originalUnblock() is deferred via queueMicrotask, any exception it throws will become an unhandled promise rejection (Node.js will emit an unhandledRejection or uncaughtException depending on the Node version), since there is no try/catch around the call inside the microtask. In the synchronous path, exceptions propagate to the caller. This inconsistency means that in the deferred path, errors from deep in the unblock chain will be silently lost or will crash the process, rather than being handled by the existing error-handling logic in the wrappers. A try/catch should be added inside the queueMicrotask callback to handle errors consistently with the synchronous path.

Suggested change
originalUnblock();
try {
originalUnblock();
} catch (error) {
// Ensure errors in the deferred path are handled and don't surface
// as unhandled rejections / uncaught exceptions.
console.error('⚠️ DDPQueueCollector: Error during deferred unblock:', error);
}

Copilot uses AI. Check for mistakes.
@mvogttech

Copy link
Copy Markdown
Contributor

Implemented in 9cd84f4

@mvogttech mvogttech closed this Mar 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

v1.0.16 still showing "DDPQueueCollector: Error in wrapped unblock: RangeError: Maximum call stack size exceeded" error

3 participants