Merged
176 changes: 176 additions & 0 deletions SPECS/qemu/0062-Guard-against-epoch-advance-overshoot.patch
@@ -0,0 +1,176 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Chew, Tong Liang <tong.liang.chew@intel.com>
Date: Thu, 12 Jun 2026 00:13:16 +0800
Subject: [PATCH] hw/usb: fix ~2s isochronous stall from MFINDEX epoch-advance
overshoot

xhci_calc_iso_kick() reconstructs the 64-bit target MFINDEX from an
11-bit Frame ID supplied by the guest driver:

mfindex_kick = (frame_id << 3) | (mfindex & ~0x3fff)

When the IO thread is briefly delayed (>32 ms) the chosen epoch is
already in the past, so the code unconditionally advances by one epoch
(+0x4000 = 2048 ms). Two distinct code paths can leave mfindex_kick
so far in the future that all isochronous transfers stall until the
kick timer fires, producing an audible dropout of up to ~2.1 seconds:

PATH-B — epoch boundary overflow
frame_id values near 0x7FF map to a kick close to the epoch
boundary. Adding 0x4000 produces a result up to ~2047 ms ahead of
the current MFINDEX. Per xHCI spec §4.11.2.5 a Frame ID is valid
only within ±895 ms (0x1BF8 microframes) of MFINDEX; anything
beyond that window indicates the wrong epoch was selected.

PATH-A — mid-epoch IO thread delay
When a TRB is processed in the middle of an epoch, with MFINDEX
already X microframes past the target (X in [0x2408, 0x3F5F]),
adding 0x4000 leaves mfindex_kick 0x4000 - X microframes ahead:
more than 20 ms, even though the target frame has already passed.

Fix both cases by clamping mfindex_kick to the current MFINDEX:

PATH-B: clamp when the result exceeds the ±895 ms spec validity
window, i.e. mfindex_kick > mfindex + 0x1BF8.

PATH-A: clamp when an epoch advance was applied (io_delay_applied)
and the result is still more than 20 ms (0xA0 microframes)
ahead of now.

In both cases the missed frame is dispatched immediately rather than
waiting for a timer to fire up to 2.1 seconds later.

Four trace events are added under the 'usb_xhci_iso*' wildcard to
allow operators to observe epoch-advance activity, stall prevention,
and kick-timer scheduling latency at runtime:

usb_xhci_iso_kick_io_delay - IO thread fell >32 ms behind
usb_xhci_iso_kick_epoch_clamp - PATH-B clamp applied (spec window)
usb_xhci_iso_kick_stall_risk - PATH-A clamp applied (mid-epoch)
usb_xhci_iso_kick_timer_arm - kick timer armed (any delay > 0)

Tested with a USB audio headset passed through to a Windows 11 guest
via QEMU XHCI emulation. Under IO thread scheduling pressure
(4 parallel virtio-blk streams running diskspd), ~2-second audio
dropouts occurred roughly every 3 seconds.
After this fix, no dropouts occur under the same load. The epoch
clamp trace events confirm the fix fires on every dropout-prone
mfindex_kick value that would previously have stalled the audio
stream.

Signed-off-by: Chew, Tong Liang <tong.liang.chew@intel.com>
---
hw/usb/hcd-xhci.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++---
hw/usb/trace-events | 5 ++++
2 files changed, 71 insertions(+), 3 deletions(-)
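
The thresholds quoted in the commit message all follow from one microframe lasting 125 us (8 microframes per millisecond). A minimal sketch of that arithmetic, assuming hypothetical helper names (mf_to_us, mf_to_ms, mf_to_ns) that do not exist in QEMU:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of QEMU: one microframe is 125 us. */
static inline uint64_t mf_to_us(uint64_t mf) { return mf * 125; }
static inline uint64_t mf_to_ms(uint64_t mf) { return mf >> 3; }     /* 8 mf per ms */
static inline uint64_t mf_to_ns(uint64_t mf) { return mf * 125000; } /* timer units */

/* Thresholds quoted in the commit message, checked at compile time. */
static_assert((0x100  >> 3) == 32,   "lookahead threshold is 32 ms");
static_assert((0x4000 >> 3) == 2048, "one epoch is 2048 ms");
static_assert((0x1BF8 >> 3) == 895,  "spec window is 895 ms");
static_assert((0xA0   >> 3) == 20,   "stall-risk threshold is 20 ms");

/* PATH-A lateness bounds: 0x4000 - X must be > 0xA0 and <= 0x1BF8. */
static_assert(0x4000 - 0x1BF8 == 0x2408, "lower bound of X");
static_assert(0x4000 - 0xA0 - 1 == 0x3F5F, "upper bound of X");
```

The last two assertions show where the [0x2408, 0x3F5F] range comes from: it is the set of lateness values for which an epoch advance lands inside the 895 ms window but still more than 20 ms ahead.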

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index b6411f0bda..8f84e23b79 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -1749,12 +1749,71 @@ static void xhci_calc_iso_kick(XHCIState *xhci, XHCITransfer *xfer,
xfer->mfindex_kick = asap;
}
} else {
- xfer->mfindex_kick = ((xfer->trbs[0].control >> TRB_TR_FRAMEID_SHIFT)
- & TRB_TR_FRAMEID_MASK) << 3;
+ uint32_t frame_id = (xfer->trbs[0].control >> TRB_TR_FRAMEID_SHIFT)
+ & TRB_TR_FRAMEID_MASK;
+ xfer->mfindex_kick = (uint64_t)frame_id << 3;
xfer->mfindex_kick |= mfindex & ~0x3fff;
+ bool io_delay_applied = false;
if (xfer->mfindex_kick + 0x100 < mfindex) {
+ /*
+ * The IO thread fell >32ms behind: mfindex advanced past the
+ * target frame by more than the lookahead threshold. Log this
+ * so the operator can see the delay that triggered the epoch
+ * advance (enable with --trace 'usb_xhci_iso*').
+ */
+ uint64_t late_mf = mfindex - xfer->mfindex_kick;
+ trace_usb_xhci_iso_kick_io_delay(epctx->slotid, epctx->epid,
+ frame_id,
+ xfer->mfindex_kick & 0x3fff,
+ mfindex,
+ late_mf, late_mf >> 3);
xfer->mfindex_kick += 0x4000;
+ io_delay_applied = true;
}
+ /*
+ * Guard against epoch-advance overshoot: when the IO thread is briefly
+ * delayed, mfindex can advance past frame_kick by more than the 0x100
+ * lookahead threshold above. Adding 0x4000 then places mfindex_kick
+ * ~2 seconds into the future, stalling all isochronous transfers until
+ * the kick timer fires (observed as periodic ~2.1 s USB audio/video
+ * stalls).
+ *
+ * Per xHCI spec §4.11.2.5 a Frame ID is only valid within ±895 ms
+ * (0x1BF8 microframes) of the current MFINDEX. If the result exceeds
+ * that window the epoch selection was wrong; clamp to now so the
+ * missed frame is dispatched immediately instead of stalling.
+ */
+ if (xfer->mfindex_kick > mfindex + 0x1BF8) {
+ uint64_t future_mf = xfer->mfindex_kick - mfindex;
+ trace_usb_xhci_iso_kick_epoch_clamp(epctx->slotid, epctx->epid,
+ frame_id, mfindex,
+ xfer->mfindex_kick,
+ future_mf, future_mf >> 3);
+ xfer->mfindex_kick = mfindex;
+ }
+ /*
+ * PATH-A danger zone: IO_DELAY advanced the kick by one epoch, but
+ * the result still lies far enough ahead that audio may stall before
+ * the transfer fires. This happens when the IO thread processes a
+ * TRB X microframes after its target frame, with X in
+ * [0x2408, 0x3F5F], so that 0x4000 - X exceeds 20 ms (0xA0
+ * microframes) yet stays within the 895 ms spec window checked above.
+ *
+ * Fix: clamp to now so the missed frame fires immediately, exactly
+ * as PATH-B does. The stall_risk trace event fires first so the
+ * operator can see that CPU scheduling pressure was the root cause.
+ */
+#define XHCI_ISO_STALL_RISK_MF 0xA0 /* 160 microframes = 20 ms */
+ if (io_delay_applied &&
+ xfer->mfindex_kick > mfindex + XHCI_ISO_STALL_RISK_MF) {
+ uint64_t risk_mf = xfer->mfindex_kick - mfindex;
+ trace_usb_xhci_iso_kick_stall_risk(epctx->slotid, epctx->epid,
+ frame_id, mfindex,
+ xfer->mfindex_kick,
+ risk_mf, risk_mf >> 3);
+ xfer->mfindex_kick = mfindex;
+ }
+#undef XHCI_ISO_STALL_RISK_MF
}
}

@@ -1762,8 +1821,12 @@ static void xhci_check_intr_iso_kick(XHCIState *xhci, XHCITransfer *xfer,
XHCIEPContext *epctx, uint64_t mfindex)
{
if (xfer->mfindex_kick > mfindex) {
+ uint64_t delay_mf = xfer->mfindex_kick - mfindex;
+ trace_usb_xhci_iso_kick_timer_arm(epctx->slotid, epctx->epid,
+ xfer->mfindex_kick, mfindex,
+ delay_mf, delay_mf * 125);
timer_mod(epctx->kick_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
- (xfer->mfindex_kick - mfindex) * 125000);
+ delay_mf * 125000);
xfer->running_retry = 1;
} else {
epctx->mfindex_last = xfer->mfindex_kick;
diff --git a/hw/usb/trace-events b/hw/usb/trace-events
index dd04f14add..908be77a84 100644
--- a/hw/usb/trace-events
+++ b/hw/usb/trace-events
@@ -183,6 +183,11 @@ usb_xhci_xfer_success(void *xfer, uint32_t bytes) "%p: len %d"
usb_xhci_xfer_error(void *xfer, uint32_t ret) "%p: ret %d"
usb_xhci_unimplemented(const char *item, int nr) "%s (0x%x)"
usb_xhci_enforced_limit(const char *item) "%s"
+# Isochronous kick profiling (enable with --trace 'usb_xhci_iso*')
+usb_xhci_iso_kick_io_delay(uint32_t slotid, uint32_t epid, uint32_t frame_id, uint64_t frame_kick, uint64_t mfindex, uint64_t late_mf, uint64_t late_ms) "slotid %d epid %d frame_id 0x%03x frame_kick 0x%"PRIx64" mfindex 0x%"PRIx64" late %"PRIu64" mf (%"PRIu64" ms) [IO thread delayed >32ms -- epoch advance triggered]"
+usb_xhci_iso_kick_stall_risk(uint32_t slotid, uint32_t epid, uint32_t frame_id, uint64_t mfindex, uint64_t mfindex_kick, uint64_t risk_mf, uint64_t risk_ms) "slotid %d epid %d frame_id 0x%03x mfindex 0x%"PRIx64" mfindex_kick 0x%"PRIx64" stall_risk %"PRIu64" mf (%"PRIu64" ms) -> clamped to now [PATH-A danger zone FIXED: IO thread mid-epoch, CPU scheduling pressure]"
+usb_xhci_iso_kick_epoch_clamp(uint32_t slotid, uint32_t epid, uint32_t frame_id, uint64_t mfindex, uint64_t bad_kick, uint64_t future_mf, uint64_t future_ms) "slotid %d epid %d frame_id 0x%03x mfindex 0x%"PRIx64" bad_kick 0x%"PRIx64" would_stall %"PRIu64" mf (%"PRIu64" ms) -> clamped to now [STALL AVERTED]"
+usb_xhci_iso_kick_timer_arm(uint32_t slotid, uint32_t epid, uint64_t mfindex_kick, uint64_t mfindex, uint64_t delay_mf, uint64_t delay_us) "slotid %d epid %d mfindex_kick 0x%"PRIx64" mfindex 0x%"PRIx64" delay %"PRIu64" mf (%"PRIu64" us)"

# hcd-dwc2.c
usb_dwc2_update_irq(uint32_t level) "level=%d"
--
2.43.0
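
The patched Frame ID branch can be exercised outside QEMU. The following is a simplified standalone model, not the QEMU function itself: model_iso_kick and the macro names are hypothetical, the trace calls are omitted, and only the arithmetic from the hunk above is reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPOCH_MF       0x4000u /* 14-bit microframe epoch (2048 ms) */
#define LOOKAHEAD_MF   0x100u  /* 32 ms lateness threshold */
#define SPEC_WINDOW_MF 0x1BF8u /* 895 ms Frame ID validity window */
#define STALL_RISK_MF  0xA0u   /* 20 ms PATH-A threshold */

/* Hypothetical model of the post-patch kick calculation. */
static uint64_t model_iso_kick(uint32_t frame_id, uint64_t mfindex)
{
    assert(frame_id <= 0x7ff); /* Frame ID is an 11-bit field */

    /* Rebuild the target from the Frame ID and the current epoch. */
    uint64_t kick = ((uint64_t)frame_id << 3) | (mfindex & ~(uint64_t)0x3fff);
    bool io_delay_applied = false;

    /* Target is more than 32 ms in the past: assume the next epoch. */
    if (kick + LOOKAHEAD_MF < mfindex) {
        kick += EPOCH_MF;
        io_delay_applied = true;
    }
    /* PATH-B: result lies outside the spec window, dispatch now. */
    if (kick > mfindex + SPEC_WINDOW_MF) {
        kick = mfindex;
    }
    /* PATH-A: advanced, in window, but still more than 20 ms ahead. */
    if (io_delay_applied && kick > mfindex + STALL_RISK_MF) {
        kick = mfindex;
    }
    return kick;
}
```

For example, with mfindex = 0x14200 and frame_id = 0 the advance yields 0x18000, which is outside the window and is clamped to 0x14200 (PATH-B). With mfindex = 0x17000 and frame_id = 0 the advanced value 0x18000 is inside the window but 0x1000 microframes ahead, so PATH-A clamps it to 0x17000. Note that in this model a Frame ID that legitimately wraps into the next epoch more than 20 ms ahead takes the same PATH-A branch and is also dispatched immediately.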
6 changes: 5 additions & 1 deletion SPECS/qemu/qemu.spec
@@ -446,7 +446,7 @@ Obsoletes: sgabios-bin <= 1:0.20180715git-10.fc38
Summary: QEMU is a FAST! processor emulator
Name: qemu
Version: 9.1.0
Release: 8%{?dist}
Release: 9%{?dist}
License: Apache-2.0 AND BSD-2-Clause AND BSD-3-Clause AND FSFAP AND GPL-1.0-or-later AND GPL-2.0-only AND GPL-2.0-or-later AND GPL-2.0-or-later WITH GCC-exception-2.0 AND LGPL-2.0-only AND LGPL-2.0-or-later AND LGPL-2.1-only AND LGPL-2.1-or-later AND MIT AND LicenseRef-Fedora-Public-Domain AND CC-BY-3.0
URL: http://www.qemu.org/

@@ -544,6 +544,7 @@ Patch62: CVE-2025-54567.patch
Patch63: 0059-hw-usb-host-libusb-udev-product_desc-is-non-NULL.patch
Patch64: 0060-ui-gtk-Add-HW-cursor-and-render_sync-status-to-statu.patch
Patch65: 0061-ui-gtk-check-return-value-of-gdk_seat_grab.patch
Patch66: 0062-Guard-against-epoch-advance-overshoot.patch

BuildRequires: gnupg2
BuildRequires: meson >= %{meson_version}
@@ -3542,6 +3543,9 @@ useradd -r -u 107 -g qemu -G kvm -d / -s /sbin/nologin \


%changelog
* Thu Jun 11 2026 Andy <andy.peng@intel.com> - 9.1.0-9
- Patch QEMU xHCI emulation to fix audio/video stalls during video playback in Windows VMs

* Wed Mar 25 2026 Lee Chee Yang <chee.yang.lee@intel.com> - 9.1.0-8
- Bump to rebuild with updated glibc
