Skip to content

Conversation

@jamieNguyenNVIDIA
Copy link
Contributor

Coresight PMU driver didn't reject events meant for other PMUs. This caused some of the Core PMU events disappearing from the output of "perf list". In addition, trying to run e.g.

 $ perf stat -e r2 sleep 1

made Coresight PMU driver to handle the event instead of letting Core PMU driver to deal with it.

Cc: stable@vger.kernel.org
Fixes: e37dfd6 ("perf: arm_cspmu: Add support for ARM CoreSight PMU driver")
Signed-off-by: Ilkka Koskinen ilkka@os.amperecomputing.com
Acked-by: Will Deacon will@kernel.org
Reviewed-by: Besar Wicaksono bwicaksono@nvidia.com
Acked-by: Mark Rutland mark.rutland@arm.com
Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com
Link: https://lore.kernel.org/r/20231103001654.35565-1-ilkka@os.amperecomputing.com
Signed-off-by: Catalin Marinas catalin.marinas@arm.com
Acked-by: Jamie Nguyen jamien@nvidia.com
Acked-by: Besar Wicaksono bwicaksono@nvidia.com
(cherry picked from commit 15c7ef7)

ianmay81 and others added 30 commits December 4, 2023 11:14
Signed-off-by: Ian May <ian.may@canonical.com>
…dversion"

This reverts commit 47d27f2.

We need to revert this to avoid regressing any modules used in Jammy.

Signed-off-by: Ian May <ian.may@canonical.com>
This reverts commit b2638e9.

This feature is not targeted for Jammy.

Signed-off-by: Ian May <ian.may@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1786013
Signed-off-by: Ian May <ian.may@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1786013
Signed-off-by: Ian May <ian.may@canonical.com>
Ignore: yes
Signed-off-by: Ian May <ian.may@canonical.com>
Signed-off-by: Ian May <ian.may@canonical.com>
Signed-off-by: Ian May <ian.may@canonical.com>
Ignore: yes
Signed-off-by: Ian May <ian.may@canonical.com>
Ignore: yes
Signed-off-by: Ian May <ian.may@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/2038972
Properties: no-test-build
Signed-off-by: Ian May <ian.may@canonical.com>
Ignore: yes
Signed-off-by: Ian May <ian.may@canonical.com>
Signed-off-by: Ian May <ian.may@canonical.com>
… a pasid support

BugLink: https://bugs.launchpad.net/bugs/2031320

When an iommu_domain is set to IOMMU_DOMAIN_IDENTITY, the driver would
skip the allocation of a CD table and set the CONFIG field of the STE
to STRTAB_STE_0_CFG_BYPASS. This works well for devices that only have
one substream, i.e. PASID disabled.

However, there could be a use case, for a pasid capable device, that
allows bypassing the translation at the default substream while still
enabling the pasid feature, which means the driver should not skip the
allocation of a CD table nor simply bypass the CONFIG field. Instead,
the S1DSS field should be set to STRTAB_STE_1_S1DSS_BYPASS and the
SHCFG field should be set to STRTAB_STE_1_SHCFG_INCOMING.

Add s1dss in struct arm_smmu_s1_cfg, to allow a configuration in the
finalise() to support this use case.

Also, according to "13.5 Summary of attribute/permission configuration
fields" in the reference manual, the SHCFG field value is irrelevant.
So, set the SHCFG field of the STE always to STRTAB_STE_1_SHCFG_INCOMING
for simplification.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pritesh Raithatha <praithatha@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Nicolin Chen <nicolinc@nvidia.com>
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Ian May <ian.may@canonical.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
…stflint and nvidia-fs

Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/1786013
Signed-off-by: Ian May <ian.may@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1786013
Signed-off-by: Ian May <ian.may@canonical.com>
Ignore: yes
Signed-off-by: Ian May <ian.may@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/2045591
Properties: no-test-build
Signed-off-by: Ian May <ian.may@canonical.com>
The current version of nvidia-fs(2.17.4-0) is failing to compile
with the 6.5 kernel.  Drop inclusion until an updated version
is available.

Signed-off-by: Ian May <ian.may@canonical.com>
Signed-off-by: Ian May <ian.may@canonical.com>
…rnel

BugLink: https://bugs.launchpad.net/bugs/2043132

With this change, the NVMe and NVMeOF driver would be enabled to support GPUDirectStorage(GDS).
The change is around nvme/nvme rdma map_data()
and unmap_data(), where the IO request is
first intercepted to check for GDS pages and
if it is a GDS page then the request is served
by GDS driver component called nvidia-fs,
else the request would be served by the standard NVMe driver code

Signed-off-by: Sourab Gupta <sougupta@nvidia.com>
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Acked-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2043132

With this change, the NFS driver would be enabled to support GPUDirectStorage(GDS).
The change is around frwr_map and frwr_unmap in the NFS driver, where the IO request
is first intercepted to check for GDS pages and if it is a GDS page then the
request is served by GDS driver component called nvidia-fs,
else the request would be served by the standard NFS driver code.

Signed-off-by: Sourab Gupta <sougupta@nvidia.com>
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Acked-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
This reverts commit b2638e9.

This feature is not targeted for Jammy.

Signed-off-by: Brad Figg <bfigg@nvidia.com>
Signed-off-by: Ian May <ian.may@canonical.com>
…ng packages that do not apply to this kernel.

Signed-off-by: Brad Figg <bfigg@nvidia.com>
Signed-off-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
Signed-off-by: Ian May <ian.may@canonical.com>
Add support for exposing rprovides data for standalone modules
too. Switch to exposing provides as a shared debian/substvar file
and use that in the templates.

Signed-off-by: Brad Figg <bfigg@nvidia.com>
Signed-off-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
Signed-off-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
Signed-off-by: Ian May <ian.may@canonical.com>
kkyarlagadda and others added 17 commits January 11, 2024 11:28
BugLink: https://bugs.launchpad.net/bugs/2048815

TPM devices may insert wait state on last clock cycle of ADDR phase.
For SPI controllers that support full-duplex transfers, this can be
detected using software by reading the MISO line. For SPI controllers
that only support half-duplex transfers, such as the Tegra QSPI, it is
not possible to detect the wait signal from software. The QSPI
controller in Tegra234 and Tegra241 implement hardware detection of the
wait signal which can be enabled in the controller for TPM devices.

The current TPM TIS driver only supports software detection of the wait
signal. To support SPI controllers that use hardware to detect the wait
signal, add the function tpm_tis_spi_transfer_half() and move the
existing code for software based detection into a function called
tpm_tis_spi_transfer_full(). SPI controllers that only support
half-duplex transfers will always call tpm_tis_spi_transfer_half()
because they cannot support software based detection. The bit
SPI_TPM_HW_FLOW is set to indicate to the SPI controller that hardware
detection is required and it is the responsibility of the SPI controller
driver to determine if this is supported or not.

For hardware flow control, CMD-ADDR-DATA messages are combined into a
single message where as for software flow control exiting method of
CMD-ADDR in a message and DATA in another is followed.

[jarkko: Fixed the function names to match the code change, and the tag
in the short summary.]
Signed-off-by: Krishna Yarlagadda <kyarlagadda@nvidia.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
(cherry picked from commit a86a42a)
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Acked-by: Ian May <ian.may@canonical.com>
Signed-off-by: Ian May <ian.may@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1786013
Signed-off-by: Ian May <ian.may@canonical.com>
Ignore: yes
Signed-off-by: Ian May <ian.may@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/2047983
Properties: no-test-build
Signed-off-by: Ian May <ian.may@canonical.com>
Signed-off-by: Ian May <ian.may@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/2049537

Add support of "Thermal fast Sampling Period (_TFP)" for passive
cooling.

As per the ACPI specification (ACPI 6.5, Section 11.4.17 "_TFP (Thermal
fast Sampling Period)", _TFP overrides _TSP ("Thermal Sampling Period"
if both are present in a Thermal zone.

Signed-off-by: Jeff Brasen <jbrasen@nvidia.com>
Co-developed-by: Sumit Gupta <sumitg@nvidia.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Sumit Gupta <sumitg@nvidia.com>
(backported from commit a2ee758 linux-next)
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Acked-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2049537

Current implementation of processor_thermal performs software throttling
in fixed steps of "20%" which can be too coarse for some platforms.
We observed some performance gain after reducing the throttle percentage.
Change the CPUFREQ thermal reduction percentage and maximum thermal steps
to be configurable. Also, update the default values of both for Nvidia
Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%"
and accordingly the maximum number of thermal steps are increased as they
are derived from the reduction percentage.

Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
Co-developed-by: Sumit Gupta <sumitg@nvidia.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Sumit Gupta <sumitg@nvidia.com>
(cherry picked from commit 310293a linux-next)
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Acked-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
…arlier

BugLink: https://bugs.launchpad.net/bugs/2049537

Allocate and device assign 'struct etmv4_drvdata' earlier during the driver
probe, ensuring that it can be retrieved in power management based runtime
callbacks if required. This will also help in dropping iomem base address
argument from the function etm4_probe() later.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230710062500.45147-2-anshuman.khandual@arm.com
(cherry picked from commit 3095e90 linux)
Signed-off-by: Tushar Dave <tdave@nvidia.com>
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Acked-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2049537

'struct etm4_drvdata' itself can carry the base address before etm4_probe()
gets called. Just drop that redundant argument from etm4_probe().

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230710062500.45147-3-anshuman.khandual@arm.com
(cherry picked from commit 4e3b9a6 linux)
Signed-off-by: Tushar Dave <tdave@nvidia.com>
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Acked-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2049537

Coresight device pid can be retrieved from its iomem base address, which is
stored in 'struct etm4x_drvdata'. This drops pid argument from etm4_probe()
and 'struct etm4_init_arg'. Instead etm4_check_arch_features() derives the
coresight device pid with a new helper coresight_get_pid(), right before it
is consumed in etm4_hisi_match_pid().

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230710062500.45147-4-anshuman.khandual@arm.com
(cherry picked from commit 5a1c709 linux)
Signed-off-by: Tushar Dave <tdave@nvidia.com>
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Acked-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2049537

Add support for handling MMIO based devices via platform driver. We need to
make sure that :

1) The APB clock, if present is enabled at probe and via runtime_pm ops
2) Use the ETM4x architecture or CoreSight architecture registers to
   identify a device as CoreSight ETM4x, instead of relying a white list of
   "Peripheral IDs"

The driver doesn't get to handle the devices yet, until we wire the ACPI
changes to move the devices to be handled via platform driver than the
etm4_amba driver.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230710062500.45147-5-anshuman.khandual@arm.com
(cherry picked from commit 73d779a linux)
Signed-off-by: Tushar Dave <tdave@nvidia.com>
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Acked-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2049537

Drop ETM4X ACPI ID from the AMBA ACPI device list, and instead just move it
inside the new ACPI devices list detected and used via platform driver.

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: linux-acpi@vger.kernel.org
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> (for ACPI specific changes)
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Link: https://lore.kernel.org/r/20230710062500.45147-7-anshuman.khandual@arm.com
(cherry picked from commit 134124a linux)
Signed-off-by: Tushar Dave <tdave@nvidia.com>
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Acked-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2049537

The peer_memory_client scheme allows a driver to register with the ib_umem
system that it has the ability to understand user virtual address ranges
that are not compatible with get_user_pages(). For instance VMAs created
with io_remap_pfn_range(), or other driver special VMA.

For ranges the interface understands it can provide a DMA mapped sg_table
for use by the ib_umem, allowing user virtual ranges that cannot be
supported by get_user_pages() to be used as umems for RDMA.

This is designed to preserve the kABI, no functions or structures are
changed, only new symbols are added:

 ib_register_peer_memory_client
 ib_unregister_peer_memory_client
 ib_umem_activate_invalidation_notifier
 ib_umem_get_peer

And a bitfield in struct ib_umem uses more bits.

This interface is compatible with the two out of tree GPU drivers:
 https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/blob/master/drivers/gpu/drm/amd/amdkfd/kfd_peerdirect.c
 https://github.com/Mellanox/nv_peer_memory/blob/master/nv_peer_mem.c

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jacob Martin <jacob.martin@canonical.com>
Acked-by: Ian May <ian.may@canonical.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
Ignore: yes
Signed-off-by: Ian May <ian.may@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/2049653
Properties: no-test-build
Signed-off-by: Ian May <ian.may@canonical.com>
Signed-off-by: Ian May <ian.may@canonical.com>
Coresight PMU driver didn't reject events meant for other PMUs.
This caused some of the Core PMU events disappearing from
the output of "perf list". In addition, trying to run e.g.

     $ perf stat -e r2 sleep 1

made Coresight PMU driver to handle the event instead of letting
Core PMU driver to deal with it.

Cc: stable@vger.kernel.org
Fixes: e37dfd6 ("perf: arm_cspmu: Add support for ARM CoreSight PMU driver")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Besar Wicaksono <bwicaksono@nvidia.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20231103001654.35565-1-ilkka@os.amperecomputing.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Besar Wicaksono <bwicaksono@nvidia.com>
(cherry picked from commit 15c7ef7)
@jamieNguyenNVIDIA
Copy link
Contributor Author

jamieNguyenNVIDIA commented Feb 2, 2024

Internal bug: N/A (ad-hoc testing)

Verified by running "perf record" with and without this patch.

With the "arm_cspmu_module" driver loaded, and without this patch, this command errors out:
cycles: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'

Under the same conditions, with the patch applied, "perf record" succeeds.

nvidia-bfigg pushed a commit that referenced this pull request Apr 24, 2024
BugLink: https://bugs.launchpad.net/bugs/2059284

[ Upstream commit fc3a553 ]

An issue occurred while reading an ELF file in libbpf.c during fuzzing:

	Program received signal SIGSEGV, Segmentation fault.
	0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	4206 in libbpf.c
	(gdb) bt
	#0 0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	#1 0x000000000094f9d6 in bpf_object.collect_relos () at libbpf.c:6706
	#2 0x000000000092bef3 in bpf_object_open () at libbpf.c:7437
	#3 0x000000000092c046 in bpf_object.open_mem () at libbpf.c:7497
	#4 0x0000000000924afa in LLVMFuzzerTestOneInput () at fuzz/bpf-object-fuzzer.c:16
	#5 0x000000000060be11 in testblitz_engine::fuzzer::Fuzzer::run_one ()
	#6 0x000000000087ad92 in tracing::span::Span::in_scope ()
	#7 0x00000000006078aa in testblitz_engine::fuzzer::util::walkdir ()
	#8 0x00000000005f3217 in testblitz_engine::entrypoint::main::{{closure}} ()
	#9 0x00000000005f2601 in main ()
	(gdb)

scn_data was null at this code(tools/lib/bpf/src/libbpf.c):

	if (rel->r_offset % BPF_INSN_SZ || rel->r_offset >= scn_data->d_size) {

The scn_data is derived from the code above:

	scn = elf_sec_by_idx(obj, sec_idx);
	scn_data = elf_sec_data(obj, scn);

	relo_sec_name = elf_sec_str(obj, shdr->sh_name);
	sec_name = elf_sec_name(obj, scn);
	if (!relo_sec_name || !sec_name)// don't check whether scn_data is NULL
		return -EINVAL;

In certain special scenarios, such as reading a malformed ELF file,
it is possible that scn_data may be a null pointer

Signed-off-by: Mingyi Zhang <zhangmingyi5@huawei.com>
Signed-off-by: Xin Liu <liuxin350@huawei.com>
Signed-off-by: Changye Wu <wuchangye@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231221033947.154564-1-liuxin350@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Manuel Diewald <manuel.diewald@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
nvidia-bfigg pushed a commit that referenced this pull request May 25, 2024
BugLink: https://bugs.launchpad.net/bugs/2059284

[ Upstream commit fc3a553 ]

An issue occurred while reading an ELF file in libbpf.c during fuzzing:

	Program received signal SIGSEGV, Segmentation fault.
	0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	4206 in libbpf.c
	(gdb) bt
	#0 0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	#1 0x000000000094f9d6 in bpf_object.collect_relos () at libbpf.c:6706
	#2 0x000000000092bef3 in bpf_object_open () at libbpf.c:7437
	#3 0x000000000092c046 in bpf_object.open_mem () at libbpf.c:7497
	#4 0x0000000000924afa in LLVMFuzzerTestOneInput () at fuzz/bpf-object-fuzzer.c:16
	#5 0x000000000060be11 in testblitz_engine::fuzzer::Fuzzer::run_one ()
	#6 0x000000000087ad92 in tracing::span::Span::in_scope ()
	#7 0x00000000006078aa in testblitz_engine::fuzzer::util::walkdir ()
	#8 0x00000000005f3217 in testblitz_engine::entrypoint::main::{{closure}} ()
	#9 0x00000000005f2601 in main ()
	(gdb)

scn_data was null at this code(tools/lib/bpf/src/libbpf.c):

	if (rel->r_offset % BPF_INSN_SZ || rel->r_offset >= scn_data->d_size) {

The scn_data is derived from the code above:

	scn = elf_sec_by_idx(obj, sec_idx);
	scn_data = elf_sec_data(obj, scn);

	relo_sec_name = elf_sec_str(obj, shdr->sh_name);
	sec_name = elf_sec_name(obj, scn);
	if (!relo_sec_name || !sec_name)// don't check whether scn_data is NULL
		return -EINVAL;

In certain special scenarios, such as reading a malformed ELF file,
it is possible that scn_data may be a null pointer

Signed-off-by: Mingyi Zhang <zhangmingyi5@huawei.com>
Signed-off-by: Xin Liu <liuxin350@huawei.com>
Signed-off-by: Changye Wu <wuchangye@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231221033947.154564-1-liuxin350@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Manuel Diewald <manuel.diewald@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants