@schythanyaku

NVIDIA: SAUCE: MEDIATEK: platform: Add PCIe Hotplug Driver for CX7 on DGX Spark
This driver manages the PCIe link for NVIDIA ConnectX-7 (CX7) hot-plug/unplug
on DGX Spark systems with the GB10 SoC. It disables the PCIe link
on cable removal and enables it on cable insertion.

Upstream-friendly improvements over the 6.14 driver:

  • Separated from MTK pinctrl driver into NVIDIA platform driver
  • Configuration via ACPI (_CRS and _DSD), no hardcoded values
  • Device-managed resources (devm_*) for automatic cleanup
  • Thread-safe state management with locking
  • Enhanced error handling and logging
  • Uses standard Linux kernel APIs

The driver exposes a sysfs interface to emulate cable plug in/out:
  echo 1 > /sys/devices/platform/MTKP0001:00/pcie_hotplug/debug_state  # plug in
  echo 0 > /sys/devices/platform/MTKP0001:00/pcie_hotplug/debug_state  # plug out

It also provides a runtime enable/disable switch via sysfs:
  echo 1 > /sys/devices/platform/MTKP0001:00/pcie_hotplug/hotplug_enabled  # Enable
  echo 0 > /sys/devices/platform/MTKP0001:00/pcie_hotplug/hotplug_enabled  # Disable

Hotplug handling is disabled by default and must be explicitly enabled
from userspace through this switch.
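
For tooling that prefers not to shell out to echo, the same writes can be done from a small C helper. This is only a sketch: the function names `write_attr` and `set_hotplug_enabled` are hypothetical, and it assumes the attribute accepts exactly what `echo` sends (the digit followed by a newline). The attribute directory is passed in rather than hardcoded.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a short string to an existing attribute file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_attr(const char *path, const char *value)
{
	size_t len;
	ssize_t n;
	int fd, ret = 0;

	assert(path && value);
	len = strlen(value);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, value, len);
	if (n < 0)
		ret = -errno;
	else if ((size_t)n != len)
		ret = -EIO;	/* short write: treat as an I/O error */

	if (close(fd) < 0 && ret == 0)
		ret = -errno;
	return ret;
}

/*
 * Hypothetical wrapper: write "1\n" or "0\n" to <attr_dir>/hotplug_enabled.
 * attr_dir would be the pcie_hotplug directory from the commit message,
 * e.g. /sys/devices/platform/MTKP0001:00/pcie_hotplug.
 */
static int set_hotplug_enabled(const char *attr_dir, int enable)
{
	char path[512];
	int n = snprintf(path, sizeof(path), "%s/hotplug_enabled", attr_dir);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;
	return write_attr(path, enable ? "1\n" : "0\n");
}
```

A caller would invoke `set_hotplug_enabled(dir, 1)` once at startup and check for a negative return value, which carries the errno from the failed open or write.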

It also implements uevent notifications for coordination with userspace:

  • cable plug-in:
    Report plug-in uevent (driver)
    Enable PCIe link (driver)
    Rescan CX7 devices (application)

  • cable removal:
    Report removal uevent (driver)
    Remove CX7 devices (application)
    Disable PCIe link (driver)
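
The application-side steps in the two sequences above map onto the standard PCI sysfs controls: writing 1 to /sys/bus/pci/rescan to rescan, and writing 1 to a device's remove attribute to remove it. The sketch below only builds the path for each case. The enum, the function name, and the way an event is recognized are assumptions; the commit message does not give the uevent keys, so parsing is left out, and the device address must be supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical event codes; the real uevent payload is not shown here. */
enum cable_event { CABLE_PLUG_IN, CABLE_REMOVAL };

/*
 * Build the PCI sysfs path the application would write "1" to:
 *   plug-in -> bus-wide rescan (after the driver has enabled the link)
 *   removal -> per-device remove (before the driver disables the link)
 * bdf is the PCI address of the device to remove; unused for plug-in.
 * Returns 0 on success, -1 on bad arguments or if out is too small.
 */
static int pci_action_path(enum cable_event ev, const char *bdf,
			   char *out, size_t len)
{
	int n;

	assert(out != NULL);

	switch (ev) {
	case CABLE_PLUG_IN:
		n = snprintf(out, len, "%s", "/sys/bus/pci/rescan");
		break;
	case CABLE_REMOVAL:
		if (!bdf)
			return -1;
		n = snprintf(out, len, "/sys/bus/pci/devices/%s/remove", bdf);
		break;
	default:
		return -1;
	}

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The ordering is the important part: on removal the application acts before the link goes down, and on plug-in it acts only after the link is up.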

Signed-off-by: Vaibhav Vyas <vavyas@nvidia.com>
Signed-off-by: Scott Fudally <sfudally@nvidia.com>
Signed-off-by: Surabhi Chythanya Kumar <schythanyaku@nvidia.com>

@schythanyaku force-pushed the cx7-hotplug-driver-upstream branch 2 times, most recently from 4d823d4 to ca61f87 on January 9, 2026 05:47
	for (i = 0; i < hp_dev->gpio_count; i++) {
		app_ctx = &hp_dev->pins[i];
		if (app_ctx->desc)
			gpiod_put(app_ctx->desc);
Collaborator

Is this the proper way to clean up these GPIO descriptors? It seems app_ctx->desc is set earlier via:

cx7_hp_enumerate_gpios()
    gpio_device_get_desc()

Isn't the device's cleanup path already enough?

cx7_hp_put_gpio_device()
    gpio_device_put()

Author

You're absolutely right. Thanks for pointing this out.
gpio_device_get_desc() doesn't take a reference; it just returns a pointer into the GPIO device's internal structure. Since the GPIO device reference is already released via devm_add_action_or_reset() -> cx7_hp_put_gpio_device() -> gpio_device_put(), the manual gpiod_put() calls are unnecessary.

I've removed the gpiod_put() calls from both the error cleanup path and the remove function.

I also fixed the error cleanup path by adding a sysfs_remove: label to properly clean up the sysfs group when bus_register_notifier() fails, ensuring sysfs is cleaned up before pinctrl.
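
For readers following the thread, the unwind order described here can be illustrated with a userspace mock. This is not the driver code: `mock_probe`, the `fail_step` enum, the trace buffer, and the `pinctrl_remove` label are made up for illustration, and only the `sysfs_remove` label name and the "sysfs before pinctrl" ordering come from the reply above.

```c
#include <assert.h>
#include <string.h>

/* Which setup step should fail in the mock. */
enum fail_step { FAIL_NONE, FAIL_SYSFS, FAIL_NOTIFIER };

/* Records the order in which setup and cleanup steps ran. */
static char trace[128];

static void note(const char *step)
{
	assert(strlen(trace) + strlen(step) < sizeof(trace));
	strcat(trace, step);
}

/* Mock probe: set up in order, unwind in reverse order on failure. */
static int mock_probe(enum fail_step fail)
{
	int ret;

	trace[0] = '\0';
	note("pinctrl_init;");

	ret = (fail == FAIL_SYSFS) ? -1 : 0;	/* stand-in for sysfs group creation */
	if (ret)
		goto pinctrl_remove;
	note("sysfs_create;");

	ret = (fail == FAIL_NOTIFIER) ? -1 : 0;	/* stand-in for bus_register_notifier() */
	if (ret)
		goto sysfs_remove;
	note("notifier_register;");

	return 0;

sysfs_remove:
	note("sysfs_remove;");		/* undo the most recent step first */
pinctrl_remove:
	note("pinctrl_remove;");
	return ret;
}
```

With a notifier failure the trace ends in sysfs_remove followed by pinctrl_remove, which is the ordering the added label is meant to guarantee.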


	cx7_hp_pinctrl_remove(hp_dev);

	for (i = 0; i < hp_dev->gpio_count; i++) {
Collaborator

Same question as above regarding GPIO descriptor cleanup.

@schythanyaku force-pushed the cx7-hotplug-driver-upstream branch from ca61f87 to a2dcb79 on January 12, 2026 17:38
@jamieNguyenNVIDIA
Collaborator

Acked-by: Jamie Nguyen <jamien@nvidia.com>

@clsotog
Collaborator

clsotog commented Jan 12, 2026

Acked-by: Carol L Soto <csoto@nvidia.com>
