Skip to content

nats bench js consume: large messages and high batch size causes stalls #1630

@bwerthmann

Description

@bwerthmann

Observed behavior

This is with 1MB messages...

nats cli seems to stall for >10 seconds, then the server kills the consumer due to Inactive Threshold, then CLI hangs "forever".

This works:

time nats bench js consume --stream zeek-logs --msgs 10000 --size 1m --batch 750 --clients 6
03:28:13 Starting JetStream durable consumer (callback) benchmark [acks=explicit, batch=750, clients=6, consumer=nats-bench, double-acked=false, msg-size=1.0 MiB, msgs=10,000, purge=false, sleep=0s, stream=zeek-logs]
03:28:13 [1] Starting JetStream durable consumer (callback), expecting 1,667 messages
03:28:13 [2] Starting JetStream durable consumer (callback), expecting 1,667 messages
03:28:13 [3] Starting JetStream durable consumer (callback), expecting 1,667 messages
03:28:13 [4] Starting JetStream durable consumer (callback), expecting 1,667 messages
03:28:13 [5] Starting JetStream durable consumer (callback), expecting 1,666 messages
03:28:13 [6] Starting JetStream durable consumer (callback), expecting 1,666 messages
Finished     33s [================================================================================================================================================================================================================================================================================] 100%
Finished     33s [================================================================================================================================================================================================================================================================================] 100%
Finished     33s [================================================================================================================================================================================================================================================================================] 100%
Finished     33s [================================================================================================================================================================================================================================================================================] 100%
Finished     33s [================================================================================================================================================================================================================================================================================] 100%
Finished     33s [================================================================================================================================================================================================================================================================================] 100%

  [1] 49 msgs/sec ~ 50 MiB/sec (1,667 msgs)
  [2] 49 msgs/sec ~ 50 MiB/sec (1,667 msgs)
  [3] 49 msgs/sec ~ 50 MiB/sec (1,667 msgs)
  [4] 49 msgs/sec ~ 50 MiB/sec (1,667 msgs)
  [5] 49 msgs/sec ~ 50 MiB/sec (1,666 msgs)
  [6] 49 msgs/sec ~ 49 MiB/sec (1,666 msgs)

 NATS JetStream durable consumer (callback) aggregated stats: 296 msgs/sec ~ 297 MiB/sec
 message rates min 49 | avg 49 | max 49 | stddev 0 msgs


real	0m33.817s
user	0m8.650s
sys	0m5.485s

but this does not:

time nats bench js consume --stream zeek-logs --msgs 10000 --size 1m --batch 750 --clients 7
03:29:08 Starting JetStream durable consumer (callback) benchmark [acks=explicit, batch=750, clients=7, consumer=nats-bench, double-acked=false, msg-size=1.0 MiB, msgs=10,000, purge=false, sleep=0s, stream=zeek-logs]
03:29:08 [1] Starting JetStream durable consumer (callback), expecting 1,429 messages
03:29:08 [2] Starting JetStream durable consumer (callback), expecting 1,429 messages
03:29:08 [3] Starting JetStream durable consumer (callback), expecting 1,429 messages
03:29:08 [4] Starting JetStream durable consumer (callback), expecting 1,429 messages
03:29:08 [5] Starting JetStream durable consumer (callback), expecting 1,428 messages
03:29:08 [6] Starting JetStream durable consumer (callback), expecting 1,428 messages
03:29:08 [7] Starting JetStream durable consumer (callback), expecting 1,428 messages
Consuming   33s [===========================================================================================================================================================================================================================================================================>----]  98%
Consuming   33s [===========================================================================================================================================================================================================================================================================>----]  98%
Consuming   33s [===========================================================================================================================================================================================================================================================================>----]  98%
Consuming   33s [==========================================================================================================================================================================================================================================================================>-----]  98%
Consuming   33s [===========================================================================================================================================================================================================================================================================>----]  98%
Consuming   33s [==========================================================================================================================================================================================================================================================================>-----]  98%
Consuming   33s [==========================================================================================================================================================================================================================================================================>-----]  98%
^C

real	0m57.476s
user	0m8.766s
sys	0m5.764s

Expected behavior

Ideally, nats cli should not stall during bench for >10 seconds.

However, if this is not feasible, perhaps:

  • increase the TTL of the created bench consumer.
  • if the consumer nolonger exists NATS CLI should exit

Server and client version

NATS Server: Version: 2.12.6

nats --version
0.3.2

Host environment

AWS EC2: c6a.2xlarge
Linux: 6.1.161-1
nats-server and nats-cli are on the same host.

Steps to reproduce

  1. Create a stream with 10,000 1 MB messages:
time nats bench js pub async --batch 8 --clients 4 --stream zeek-logs --msgs 10000 --multisubject --multisubjectmax 100 --size 1mb --no-progress zeek.logs
02:31:45 Starting JetStream asynchronous publisher benchmark [batch=8, clients=4, msg-size=1.0 MiB, msgs=10,000, multi-subject=true, multi-subject-max=100, multi-subject-randomize=false, purge=false, sleep=0s, stream=zeek-logs, subject=zeek.logs.*]
02:31:45 Using stream: zeek-logs
02:31:45 [1] Starting JetStream asynchronous publisher, publishing 2,500 messages
02:31:45 [2] Starting JetStream asynchronous publisher, publishing 2,500 messages
02:31:45 [3] Starting JetStream asynchronous publisher, publishing 2,500 messages
02:31:45 [4] Starting JetStream asynchronous publisher, publishing 2,500 messages

  [1] 67 msgs/sec ~ 67 MiB/sec ~ min: 19,783.95us ~ avg: 118,760.44us ~ max: 3,692,676.99us ~ P50: 42,327.54us ~ P90: 97,705.71us ~ P99: 2,238,015.27us ~ P99.9: 3,692,676.99us (2,500 msgs)
  [2] 67 msgs/sec ~ 67 MiB/sec ~ min: 16,397.59us ~ avg: 118,859.48us ~ max: 3,692,651.61us ~ P50: 42,263.08us ~ P90: 85,308.35us ~ P99: 2,244,147.92us ~ P99.9: 3,692,651.61us (2,500 msgs)
  [3] 67 msgs/sec ~ 67 MiB/sec ~ min: 11,247.30us ~ avg: 118,965.52us ~ max: 3,692,641.12us ~ P50: 42,360.10us ~ P90: 85,300.36us ~ P99: 2,239,121.35us ~ P99.9: 3,692,641.12us (2,500 msgs)
  [4] 67 msgs/sec ~ 67 MiB/sec ~ min: 11,350.68us ~ avg: 118,967.04us ~ max: 3,692,689.76us ~ P50: 42,346.89us ~ P90: 85,326.35us ~ P99: 2,239,186.70us ~ P99.9: 3,692,689.76us (2,500 msgs)

 NATS JetStream asynchronous publisher aggregated stats: 268 msgs/sec ~ 268 MiB/sec
 message rates min 67 | avg 67 | max 67 | stddev 0 msgs
 latencies per operation min 11,247.30us | avg 118,888.12us | max 3,692,689.76us | stddev 394,574.75us | P50 42,311.84us | P90 85,326.35us | P99 2,244,147.92us | P99.9: 3,692,676.99us


real	0m37.260s
user	0m0.987s
sys	0m3.805s
  1. Stream info:
nats stream info zeek-logs
Information for Stream zeek-logs created 2026-03-26 10:06:39

                Subjects: zeek.logs.*
                Replicas: 1
                 Storage: File

Options:

               Retention: Limits
         Acknowledgments: true
          Discard Policy: Old
        Duplicate Window: 2m0s
              Direct Get: true
    Allows Batch Publish: false
         Allows Counters: false
       Allows Msg Delete: true
  Allows Per-Message TTL: false
            Allows Purge: true
        Allows Schedules: false
          Allows Rollups: false

Limits:

        Maximum Messages: unlimited
     Maximum Per Subject: unlimited
           Maximum Bytes: 2.0 TiB
             Maximum Age: unlimited
    Maximum Message Size: unlimited
       Maximum Consumers: unlimited

State:

            Host Version: 2.12.6
      Required API Level: 0 hosted at level 3
                Messages: 10,000
                   Bytes: 9.8 GiB
          First Sequence: 91,249,257 @ 2026-03-30 02:31:45
           Last Sequence: 91,259,256 @ 2026-03-30 02:32:22
        Active Consumers: 0
      Number of Subjects: 100
  1. Run the bench command:
nats bench js consume --stream zeek-logs --msgs 10000 --size 1m --batch 750 --clients 7
03:49:10 Starting JetStream durable consumer (callback) benchmark [acks=explicit, batch=750, clients=7, consumer=nats-bench, double-acked=false, msg-size=1.0 MiB, msgs=10,000, purge=false, sleep=0s, stream=zeek-logs]
03:49:10 [1] Starting JetStream durable consumer (callback), expecting 1,429 messages
03:49:10 [2] Starting JetStream durable consumer (callback), expecting 1,429 messages
03:49:10 [3] Starting JetStream durable consumer (callback), expecting 1,429 messages
03:49:10 [4] Starting JetStream durable consumer (callback), expecting 1,429 messages
03:49:10 [5] Starting JetStream durable consumer (callback), expecting 1,428 messages
03:49:10 [6] Starting JetStream durable consumer (callback), expecting 1,428 messages
03:49:10 [7] Starting JetStream durable consumer (callback), expecting 1,428 messages
Consuming   32s [===============================================================================================================================================================================================================================================================================>] 100%
Consuming   32s [===============================================================================================================================================================================================================================================================================>] 100%
Consuming   33s [===============================================================================================================================================================================================================================================================================>] 100%
Consuming   33s [===============================================================================================================================================================================================================================================================================>] 100%
Consuming   33s [===============================================================================================================================================================================================================================================================================>] 100%
Consuming   33s [===============================================================================================================================================================================================================================================================================>] 100%
Consuming   33s [===============================================================================================================================================================================================================================================================================>] 100%
...
  1. Consumer info just before Inactive Threshold:

Configuration:

                    Name: nats-bench
               Pull Mode: true
          Deliver Policy: All
              Ack Policy: Explicit
                Ack Wait: 30.00s
           Replay Policy: Instant
         Max Ack Pending: 5,250
       Max Waiting Pulls: 512
      Inactive Threshold: 10.00s

State:

            Host Version: 2.12.6
      Required API Level: 0 hosted at level 3
  Last Delivered Message: Consumer sequence: 9,936 Stream sequence: 91,259,192 Last delivery: 9.59s ago
    Acknowledgment Floor: Consumer sequence: 9,936 Stream sequence: 91,259,192 Last Ack: 9.59s ago
        Outstanding Acks: 0 out of maximum 5,250
    Redelivered Messages: 0
    Unprocessed Messages: 64
           Waiting Pulls: 0 of maximum 512

Metadata

Metadata

Assignees

Labels

defectSuspected defect such as a bug or regression

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions