Add comprehensive profiling instrumentation to async_data.py by Copilot · Pull Request #371 · hlin99/LMCache

Copilot · 2026-06-16T08:03:04Z

Adds time.perf_counter() timing to both the forward thread and background gather/commit thread in AsyncDataTransferContext to pinpoint where latency is spent in the async KV store path.

Forward thread — `[FWD <request_id>]` log per `submit_store()` call

Field	Measures
`lock`	inflight_lock acquire + `_is_closing` check
`prepare`	`prepare_store()`
`block_ids`	`_single_group_block_ids()`
`alloc`	SHM view lookup or pinned staging allocation
`submit`	`commit_executor.submit()`
`bookkeep`	inflight_commits add + callback registration
`total`	full forward thread time (`_t_entry` → `_t_exit`)

Background thread — `[BG <request_id>]` log per gather/commit task

thread_start · stream_enter · event_wait · gather_launch · record · stream_exit · lock · sync · commit · total

Also instrumented

flush_inflight_gathers() — logs event count and sync duration
close() — logs gather drain and executor shutdown durations

Fixes

import time moved to module level (was inside submit_store)
Fixed missing total argument in [FWD] log call (format string had 10 specifiers, only 9 args — would raise TypeError at runtime)
used_shm_direct (outer scope) used in log instead of _used_shm_direct (closure-captured copy)

No functional changes.

hlin99 · 2026-06-16T08:08:22Z

(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] EngineCore encountered a fatal error.
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] Traceback (most recent call last):
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1158, in run_engine_core
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] engine_core.run_busy_loop()
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1199, in run_busy_loop
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] self._process_engine_step()
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1238, in _process_engine_step
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] outputs, model_executed = self.step_fn()
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 499, in step_with_batch_queue
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] exec_future = self.model_executor.execute_model(
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/uniproc_executor.py", line 119, in execute_model
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] output.result()
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/lib/python3.12/concurrent/futures/_base.py", line 449, in result
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] return self.__get_result()
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] raise self._exception
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/uniproc_executor.py", line 97, in collective_rpc
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] result = run_method(self.driver_worker, method, args, kwargs)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/serial_utils.py", line 510, in run_method
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] return func(*args, **kwargs)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/worker_base.py", line 345, in execute_model
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] return self.worker.execute_model(scheduler_output)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] return func(*args, **kwargs)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 841, in execute_model
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] output = self.model_runner.execute_model(
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] return func(*args, **kwargs)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4213, in execute_model
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] with (
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/lib/python3.12/contextlib.py", line 144, in exit
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] next(self.gen)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/kv_connector_model_runner_mixin.py", line 107, in _get_kv_connector_output
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] kv_connector.wait_for_save()
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/workspace/LMCache/lmcache/integration/vllm/lmcache_mp_connector.py", line 745, in wait_for_save
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] self.worker_adapter.batched_submit_store_requests(
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/workspace/LMCache/lmcache/integration/vllm/vllm_multi_process_adapter.py", line 1263, in batched_submit_store_requests
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] self.submit_store_request(request_id, op, event, cache_salt=salt)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/workspace/LMCache/lmcache/integration/vllm/vllm_multi_process_adapter.py", line 1177, in submit_store_request
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] future = self.transfer_ctx.submit_store(
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/workspace/LMCache/lmcache/v1/multiprocess/transfer_context/async_data.py", line 314, in submit_store
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] logger.info(
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/lib/python3.12/logging/init.py", line 1539, in info
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] self._log(INFO, msg, args, **kwargs)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/lib/python3.12/logging/init.py", line 1684, in _log
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] self.handle(record)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/lib/python3.12/logging/init.py", line 1700, in handle
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] self.callHandlers(record)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/lib/python3.12/logging/init.py", line 1762, in callHandlers
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] hdlr.handle(record)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/lib/python3.12/logging/init.py", line 1028, in handle
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] self.emit(record)
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/opentelemetry/sdk/_logs/_internal/init.py", line 617, in emit
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] logger.emit(self._translate(record))
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/local/lib/python3.12/dist-packages/opentelemetry/sdk/_logs/_internal/init.py", line 591, in _translate
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] body = record.getMessage()
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ^^^^^^^^^^^^^^^^^^^
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] File "/usr/lib/python3.12/logging/init.py", line 392, in getMessage
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] msg = msg % self.args
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] ~~~~^~~~~~~~~~~
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] TypeError: not enough arguments for format string
(EngineCore pid=108285) [2026-06-16 08:21:43,582] LMCache INFO: Unregistering kv caches (vllm_multi_process_adapter.py:1449:lmcache.integration.vllm.vllm_multi_process_adapter)
(APIServer pid=108019) ERROR 06-16 08:21:43 [async_llm.py:704] AsyncLLM output_handler failed.
(APIServer pid=108019) ERROR 06-16 08:21:43 [async_llm.py:704] Traceback (most recent call last):
(APIServer pid=108019) ERROR 06-16 08:21:43 [async_llm.py:704] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 660, in output_handler
(APIServer pid=108019) ERROR 06-16 08:21:43 [async_llm.py:704] outputs = await engine_core.get_output_async()
(APIServer pid=108019) ERROR 06-16 08:21:43 [async_llm.py:704] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=108019) ERROR 06-16 08:21:43 [async_llm.py:704] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 1030, in get_output_async
(APIServer pid=108019) ERROR 06-16 08:21:43 [async_llm.py:704] raise self._format_exception(outputs) from None
(APIServer pid=108019) ERROR 06-16 08:21:43 [async_llm.py:704] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
(APIServer pid=108019) INFO: 127.0.0.1:47432 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error

@copilot

… log

Copilot · 2026-06-16T08:11:10Z

(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] EngineCore encountered a fatal error.
(EngineCore pid=108285) ERROR 06-16 08:21:43 [core.py:1167] Traceback (most recent call last):
(Engi...

This was already fixed in commit 5d74074 — the [FWD] log call was missing the total argument (10 specifiers but only 9 args). The current HEAD has (_t_exit - _t_entry) * 1000 as the 8th positional arg, making the count correct. The error you hit was from the intermediate commit 5101739 that had the bug. Please pull the latest and retry.

Initial plan

a33075e

Copilot AI assigned Copilot and hlin99 Jun 16, 2026

Copilot started work on behalf of hlin99 June 16, 2026 08:03 View session

Add comprehensive profiling instrumentation to async_data.py

5101739

Fix missing total argument and use outer-scope used_shm_direct in FWD…

5d74074

… log

Copilot AI changed the title ~~[WIP] Add comprehensive profiling instrumentation to async_data.py~~ Add comprehensive profiling instrumentation to async_data.py Jun 16, 2026

Copilot finished work on behalf of hlin99 June 16, 2026 08:09

Copilot AI requested a review from hlin99 June 16, 2026 08:09

Copilot started work on behalf of hlin99 June 16, 2026 08:10 View session

Copilot finished work on behalf of hlin99 June 16, 2026 08:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add comprehensive profiling instrumentation to async_data.py#371

Add comprehensive profiling instrumentation to async_data.py#371
Copilot wants to merge 3 commits into
copilot/ww24-pr-async-againfrom
copilot/profilingasync-data-timing

Copilot AI commented Jun 16, 2026 •

edited

Loading

Uh oh!

hlin99 commented Jun 16, 2026

Uh oh!

Copilot AI commented Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Copilot AI commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Forward thread — [FWD <request_id>] log per submit_store() call

Background thread — [BG <request_id>] log per gather/commit task

Also instrumented

Fixes

Uh oh!

hlin99 commented Jun 16, 2026

Uh oh!

Copilot AI commented Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Copilot AI commented Jun 16, 2026 •

edited

Loading

Forward thread — `[FWD <request_id>]` log per `submit_store()` call

Background thread — `[BG <request_id>]` log per gather/commit task