Show VCN and JPEG busy values where VCN/JPEG activity is not supported#1
Show VCN and JPEG busy values where VCN/JPEG activity is not supported#1sputhala-amd wants to merge 7 commits into
Conversation
WalkthroughThe changes refactor the handling of VCN and JPEG GPU activity metrics in the AMD SMI data collection, unifying them under a new Changes
Sequence Diagram(s)sequenceDiagram
participant Device
participant DataSampler
participant MetricsStore
participant Tracer
Device->>DataSampler: Request sample (with dev_id)
DataSampler->>Device: Check VCN/JPEG support
alt VCN/JPEG supported
DataSampler->>MetricsStore: Collect VCN/JPEG metrics into single xcp_metrics_t
else VCN/JPEG not supported
loop For each XCP
DataSampler->>MetricsStore: Collect VCN/JPEG busy metrics per XCP
end
end
DataSampler->>Tracer: Post-process and trace metrics with correct naming/indexing
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
source/lib/rocprof-sys/library/amd_smi.cpp (2)
555-568: Complex index calculation needs documentation.The index calculation for JPEG metrics when VCN is also enabled involves summing VCN metrics across all XCPs. While the logic appears correct, it's complex and could benefit from clarification.
Add a comment explaining the index calculation:
if(_settings.vcn_activity) { + // When both VCN and JPEG are enabled, JPEG indices start after all VCN metrics. + // Calculate the total number of VCN metrics across all XCPs to offset JPEG indices. size_t total_vcn_metrics = 0; if(gpu::is_vcn_activity_supported(_dev_id)) { total_vcn_metrics = itr.m_xcp_metrics[0].vcn_busy.size(); } else { for(const auto& temp : itr.m_xcp_metrics) total_vcn_metrics += temp.vcn_busy.size(); } idx += (total_vcn_metrics - 1); }
194-263: Consider validating metric array consistency.The code assumes that when activity metrics are not supported, the XCP busy metrics will be available and valid. Consider adding validation to handle edge cases where neither metric type is available.
Add validation after metric collection:
if(get_settings(m_dev_id).vcn_activity) { // ... existing VCN collection logic ... + + // Validate that we collected at least some metrics + if(m_xcp_metrics.empty() || + std::all_of(m_xcp_metrics.begin(), m_xcp_metrics.end(), + [](const auto& m) { return m.vcn_busy.empty(); })) + { + ROCPROFSYS_VERBOSE_F(1, "No valid VCN metrics collected for device %u\n", _dev_id); + get_settings(m_dev_id).vcn_activity = false; + } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
source/lib/core/gpu.cpp(2 hunks)source/lib/core/gpu.hpp(1 hunks)source/lib/rocprof-sys/library/amd_smi.cpp(4 hunks)source/lib/rocprof-sys/library/amd_smi.hpp(2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
source/lib/core/gpu.hpp (1)
source/lib/core/gpu.cpp (10)
get_processor_handles(211-304)get_processor_handles(212-212)is_vcn_activity_supported(306-311)is_vcn_activity_supported(307-307)is_jpeg_activity_supported(313-318)is_jpeg_activity_supported(314-314)is_vcn_busy_supported(320-325)is_vcn_busy_supported(321-321)is_jpeg_busy_supported(327-332)is_jpeg_busy_supported(328-328)
🔇 Additional comments (8)
source/lib/rocprof-sys/library/amd_smi.hpp (2)
96-100: LGTM! Well-structured consolidation of metrics.The introduction of the nested
xcp_metrics_tstruct effectively consolidates VCN and JPEG busy metrics into a logical grouping, which aligns well with the PR's objective of handling XCP-specific metrics.
115-115: Good refactoring to support multiple XCP metrics.Replacing the separate VCN and JPEG metric vectors with a single vector of
xcp_metrics_tprovides better structure for handling devices with multiple XCP partitions.source/lib/core/gpu.hpp (2)
43-53: LGTM! Clear and consistent API for feature support queries.The four new boolean query functions provide a clean interface for checking device-specific support for VCN and JPEG activity/busy metrics.
59-62: Well-designed static storage for device capabilities.The static boolean vectors efficiently track per-device support status, maintaining consistency with the existing processor tracking pattern.
source/lib/core/gpu.cpp (2)
253-296: LGTM! Comprehensive metric support detection logic.The implementation correctly detects support by checking for valid values (not UINT16_MAX) in both activity and XCP busy metrics. The nested loops properly handle the different metric array structures.
309-310: Add consistent bounds checking across all support query functions.Good defensive programming with the bounds checking. This prevents potential out-of-bounds access if an invalid device ID is provided.
Also applies to: 316-317, 323-324, 330-331
source/lib/rocprof-sys/library/amd_smi.cpp (2)
192-215: LGTM! Correct VCN metric collection with proper branching.The logic correctly handles both cases:
- When VCN activity is supported: collects from
vcn_activityarray- When not supported: collects from XCP-specific
vcn_busyarraysThe use of UINT16_MAX as a sentinel value for invalid/unsupported metrics is consistent with the AMD SMI API.
417-430: LGTM! Well-structured metric name formatting.The
addendum_blklambda effectively handles the different naming conventions for regular metrics vs XCP-specific busy metrics, providing clear identification in the trace output.
On AMD-SMI in rocm 7.0 vcn_activity and jpeg_activity will not be reported where XCP(partition) stats vcn_busy and jpeg_busy are available. This causes the activity tracking to fail. The fix is to read the busy values when activity values are not supported.
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
source/lib/rocprof-sys/library/amd_smi.cpp (1)
253-260: Array bounds issue flagged in previous review.This is the same array bounds issue identified in the previous review comment. The code assumes
_gpu_metrics.xcp_stats[i]exists for alli < m_xcp_metrics.size()without proper bounds checking.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
source/lib/core/gpu.cpp(0 hunks)source/lib/rocprof-sys/library/amd_smi.cpp(4 hunks)source/lib/rocprof-sys/library/amd_smi.hpp(2 hunks)
💤 Files with no reviewable changes (1)
- source/lib/core/gpu.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
- source/lib/rocprof-sys/library/amd_smi.hpp
🔇 Additional comments (4)
source/lib/rocprof-sys/library/amd_smi.cpp (4)
192-215: LGTM! Clean conditional logic for VCN activity handling.The conditional logic correctly handles both supported and unsupported VCN activity scenarios. The fallback to XCP stats when activity metrics are unavailable addresses the ROCm 7.0 issue described in the PR objectives.
417-429: LGTM! Improved naming convention for XCP metrics.The updated
addendum_blkfunction provides better naming and indexing for XCP metrics, distinguishing between busy and activity metrics with appropriate XCP indexing.
217-232:⚠️ Potential issueVerify empty container access safety.
The code accesses
m_xcp_metrics[0]without checking if the container is empty, which could lead to undefined behavior.Add bounds checking before accessing the container:
if(m_xcp_metrics.empty()) m_xcp_metrics.push_back(metrics); else + { + if(m_xcp_metrics.empty()) + m_xcp_metrics.push_back(xcp_metrics_t{}); m_xcp_metrics[0].jpeg_busy = metrics.jpeg_busy; + }Likely an incorrect or invalid review comment.
555-568:Details
✅ Verification successful
Verify index calculation consistency.
The complex index calculation for JPEG metrics needs verification to ensure it matches the counter track creation logic. The calculation depends on VCN metrics count and could be brittle.
Run the following script to verify the consistency between counter track creation and tracing index calculations:
🏁 Script executed:
#!/bin/bash # Description: Analyze index calculation patterns for potential mismatches # Search for counter track creation patterns echo "=== Counter Track Creation Patterns ===" rg -A 10 -B 2 "counter_track::emplace.*VCN.*Activity" echo -e "\n=== Counter Track Creation Patterns for JPEG ===" rg -A 10 -B 2 "counter_track::emplace.*JPEG.*Activity" echo -e "\n=== Tracing Index Usage Patterns ===" rg -A 5 -B 5 "TRACE_COUNTER.*device_vcn_activity" echo -e "\n=== JPEG Index Calculation ===" rg -A 10 -B 5 "total_vcn_metrics.*vcn_busy\.size"Length of output: 4448
🏁 Script executed:
#!/bin/bash # Search for all counter_track::emplace calls to correlate track registration and idx offsets rg -n "counter_track::emplace" -A5 -B5 source/lib/rocprof-sys/library/amd_smi.cppLength of output: 3273
🏁 Script executed:
#!/bin/bash # Print the full TRACE_COUNTER section around VCN and JPEG to verify idx handling sed -n '500,650p' source/lib/rocprof-sys/library/amd_smi.cppLength of output: 5933
🏁 Script executed:
#!/bin/bash # Print the section where `_idx` is declared and populated sed -n '350,500p' source/lib/rocprof-sys/library/amd_smi.cppLength of output: 6988
Index calculation for JPEG metrics verified
The adjustmentidx = _idx.at(7); if(_settings.vcn_activity) idx += (total_vcn_metrics - 1);correctly re-bases JPEG’s start index to immediately follow the VCN traces (
_idx[6] + total_vcn_metrics). This holds for both the “supported” (_vcn_metrics[0].vcn_busy.size()) and “fallback” (sum of each temp.vcn_busy.size()) paths. No change required.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
source/lib/rocprof-sys/library/amd_smi.cpp (1)
437-458: Previously raised bounds checking concerns remain valid.Multiple locations access
itr.m_xcp_metrics[0]without verifying the container is not empty. While the learning indicates that VCN activity support guarantees at least one row, this only applies to the supported case. The unsupported paths and JPEG handling still require bounds checking.Also applies to: 462-481, 515-539, 560-584
🧹 Nitpick comments (1)
source/lib/rocprof-sys/library/amd_smi.cpp (1)
545-558: Consider extracting VCN metrics counting into a helper function.The index calculation logic for JPEG counters when VCN is also active is complex and could benefit from being extracted into a separate function for better readability and maintainability.
Consider creating a helper function:
size_t count_total_vcn_metrics(const data& itr, uint32_t dev_id) { size_t total = 0; if(gpu::is_vcn_activity_supported(dev_id)) { total = itr.m_xcp_metrics.empty() ? 0 : itr.m_xcp_metrics[0].vcn_busy.size(); } else { for(const auto& temp : itr.m_xcp_metrics) total += temp.vcn_busy.size(); } return total; }Then simplify the main code:
- size_t total_vcn_metrics = 0; - if(gpu::is_vcn_activity_supported(_dev_id)) - { - total_vcn_metrics = itr.m_xcp_metrics[0].vcn_busy.size(); - } - else - { - for(const auto& temp : itr.m_xcp_metrics) - total_vcn_metrics += temp.vcn_busy.size(); - } + size_t total_vcn_metrics = count_total_vcn_metrics(itr, _dev_id);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
source/lib/core/gpu.cpp(1 hunks)source/lib/rocprof-sys/library/amd_smi.cpp(4 hunks)
✅ Files skipped from review due to trivial changes (1)
- source/lib/core/gpu.cpp
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:447-468
Timestamp: 2025-06-10T02:42:26.687Z
Learning: When VCN activity is supported by AMD GPUs in the AMD SMI API, there will always be at least one row of values provided, making bounds checking for `m_xcp_metrics[0]` unnecessary in the supported case.
source/lib/rocprof-sys/library/amd_smi.cpp (1)
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:447-468
Timestamp: 2025-06-10T02:42:26.687Z
Learning: When VCN activity is supported by AMD GPUs in the AMD SMI API, there will always be at least one row of values provided, making bounds checking for `m_xcp_metrics[0]` unnecessary in the supported case.
🔇 Additional comments (3)
source/lib/rocprof-sys/library/amd_smi.cpp (3)
192-199: Well-structured helper function for filtering invalid metrics.The
fill_busy_metricslambda effectively filters out invalid UINT16_MAX values, ensuring only valid metrics are collected.
201-218: Clean refactoring of VCN metrics collection.The conditional handling based on
is_vcn_activity_supportedprovides a clear separation between unified and XCP-specific metrics collection.
247-250:⚠️ Potential issueAdd bounds checking for XCP stats array access.
The code accesses
_gpu_metrics.xcp_stats[i]without verifying thatiis within the bounds of the XCP stats array. This could lead to undefined behavior ifm_xcp_metricshas more elements than_gpu_metrics.xcp_stats.Apply this diff to add bounds checking:
else { // Add JPEG busy to existing metrics (one per XCP) - for(size_t i = 0; i < m_xcp_metrics.size() && i < std::size(_gpu_metrics.xcp_stats); ++i) + size_t xcp_count = std::min(m_xcp_metrics.size(), + std::size(_gpu_metrics.xcp_stats)); + for(size_t i = 0; i < xcp_count; ++i) { fill_busy_metrics(m_xcp_metrics[i].jpeg_busy, _gpu_metrics.xcp_stats[i].jpeg_busy); } }Likely an incorrect or invalid review comment.
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
source/lib/rocprof-sys/library/amd_smi.cpp (1)
556-599: Consider extracting index calculation logic for better maintainability.The JPEG tracing logic is correct but contains complex index calculations to offset by VCN metrics count. While functionally sound, this could benefit from extraction into a helper function.
Consider extracting the VCN metrics counting logic:
+ auto calculate_vcn_metrics_count = [&]() -> size_t { + if(!_settings.vcn_activity) return 0; + if(gpu::is_vcn_activity_supported(_dev_id)) + return itr.m_xcp_metrics[0].vcn_busy.size(); + else { + size_t total = 0; + for(const auto& temp : itr.m_xcp_metrics) + total += temp.vcn_busy.size(); + return total; + } + }; + if(_settings.jpeg_activity && !itr.m_xcp_metrics.empty()) { uint64_t idx = _idx.at(7); - if(_settings.vcn_activity) - { - size_t total_vcn_metrics = 0; - if(gpu::is_vcn_activity_supported(_dev_id)) - { - total_vcn_metrics = itr.m_xcp_metrics[0].vcn_busy.size(); - } - else - { - for(const auto& temp : itr.m_xcp_metrics) - total_vcn_metrics += temp.vcn_busy.size(); - } - idx += (total_vcn_metrics - 1); - } + size_t total_vcn_metrics = calculate_vcn_metrics_count(); + if(total_vcn_metrics > 0) + idx += (total_vcn_metrics - 1);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
source/lib/core/gpu.cpp(1 hunks)source/lib/rocprof-sys/library/amd_smi.cpp(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- source/lib/core/gpu.cpp
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:462-481
Timestamp: 2025-06-10T12:25:14.058Z
Learning: When JPEG activity is supported by AMD GPUs in the AMD SMI API, the sampling logic ensures there will always be at least one element in m_xcp_metrics, making bounds checking for m_xcp_metrics[0] unnecessary in the supported case.
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:447-468
Timestamp: 2025-06-10T02:42:26.687Z
Learning: When VCN activity is supported by AMD GPUs in the AMD SMI API, there will always be at least one row of values provided, making bounds checking for `m_xcp_metrics[0]` unnecessary in the supported case.
source/lib/rocprof-sys/library/amd_smi.cpp (2)
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:462-481
Timestamp: 2025-06-10T12:25:14.058Z
Learning: When JPEG activity is supported by AMD GPUs in the AMD SMI API, the sampling logic ensures there will always be at least one element in m_xcp_metrics, making bounds checking for m_xcp_metrics[0] unnecessary in the supported case.
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:447-468
Timestamp: 2025-06-10T02:42:26.687Z
Learning: When VCN activity is supported by AMD GPUs in the AMD SMI API, there will always be at least one row of values provided, making bounds checking for `m_xcp_metrics[0]` unnecessary in the supported case.
🔇 Additional comments (7)
source/lib/rocprof-sys/library/amd_smi.cpp (7)
192-198: LGTM! Clean helper function implementation.The
fill_busy_metricslambda effectively abstracts the filtering of invalidUINT16_MAXvalues, improving code reusability and readability throughout the sampling logic.
200-217: LGTM! Well-structured conditional logic for VCN metrics.The implementation correctly handles both supported and unsupported VCN activity cases. The fallback to XCP-specific busy values when activity is not supported is properly implemented with appropriate empty checks.
219-255: LGTM! Comprehensive JPEG metrics handling with proper bounds checking.The JPEG activity sampling logic correctly handles the integration with existing VCN metrics and includes proper bounds checking for XCP stats access on lines 246-248. The conditional logic appropriately handles both supported and unsupported cases.
409-422: LGTM! Enhanced lambda with proper parameterization.The
addendum_blklambda enhancement to support XCP indexing and busy flags provides the necessary flexibility for generating appropriate metric names across different device configurations.
439-466: LGTM! Robust counter track creation for VCN metrics.The implementation properly handles both activity-supported and XCP-specific cases with appropriate empty checks and nested iteration for multiple XCPs.
470-496: LGTM! Consistent JPEG counter track creation.The JPEG counter track creation follows the same robust pattern as VCN metrics, with proper empty checks and conditional logic based on device support.
526-554: LGTM! Clean VCN activity tracing implementation.The tracing logic correctly handles both supported and unsupported cases with proper iteration and indexing. Based on the retrieved learnings, the access to
itr.m_xcp_metrics[0]is safe when VCN activity is supported.
* Fix the AMD-SMI double initialization * Optimize VCN/JPEG sampling to store both values always
|
@coderabbitai, I pushed a fix in commit 5a90d9a , please review it. |
There was a problem hiding this comment.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
source/lib/core/gpu.cpp (1)
90-108: Unusedinit_flag– remove or wire it intostd::call_once
static std::once_flag init_flag;is declared, but never used.
Either drop the variable or guard the lambda with it:-bool -amdsmi_init() -{ - static std::once_flag init_flag; // Ensures initialization happens only once - auto _amdsmi_init = []() { +bool +amdsmi_init() +{ + static std::once_flag init_flag; + std::call_once(init_flag, []() { try { ROCPROFSYS_AMD_SMI_CALL(::amdsmi_init(AMDSMI_INIT_AMD_GPUS)); get_processor_handles(); - _amdsmi_is_initialized() = true; // Mark as initialized + _amdsmi_is_initialized() = true; } catch(std::exception& _e) { ROCPROFSYS_BASIC_VERBOSE(1, "Exception thrown initializing amd-smi: %s\n", _e.what()); _amdsmi_is_initialized() = false; } - return true; - }(); - - return _amdsmi_init; + }); + return _amdsmi_is_initialized(); }This eliminates dead code and matches the comment.
🧹 Nitpick comments (2)
source/lib/core/gpu.cpp (2)
76-85:std::once_flagand init-state helper look good, but consider usingstd::atomic_boolfor visibility
_amdsmi_is_initialized()returns a plainbool&. Afterstd::call_oncecompletes, all threads will have the necessary happens-before relation, so it is technically safe, but a non-atomic global boolean can still be surprising.
A cheapstd::atomic_bool(or evenstd::memory_order_relaxed) removes any doubt and makes intent explicit.
271-299: Metric-detection logic is clearer, but double-checks could be simplified
Thehas_validlambda is a tidy replacement for explicit loops.
Two minor nits:
v_busy_supported/j_busy_supportedare toggled inside the loop; you can break immediately after both becometrueto avoid unnecessary iterations (micro-optimisation).- Given the comment that activity is absent when busy is present, callers may want a single helper that picks whichever is supported (
activity || busy). Otherwise every caller must repeat that logic.No action required, just flagging for future polish.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
source/lib/core/gpu.cpp(3 hunks)source/lib/rocprof-sys/library/amd_smi.cpp(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- source/lib/rocprof-sys/library/amd_smi.cpp
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:462-481
Timestamp: 2025-06-10T12:25:14.111Z
Learning: When JPEG activity is supported by AMD GPUs in the AMD SMI API, the sampling logic ensures there will always be at least one element in m_xcp_metrics, making bounds checking for m_xcp_metrics[0] unnecessary in the supported case.
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:447-468
Timestamp: 2025-06-10T02:42:26.719Z
Learning: When VCN activity is supported by AMD GPUs in the AMD SMI API, there will always be at least one row of values provided, making bounds checking for `m_xcp_metrics[0]` unnecessary in the supported case.
🔇 Additional comments (1)
source/lib/core/gpu.cpp (1)
160-163: Nice use ofstd::call_oncein the public entry point
Wrapping the internalamdsmi_init()withamdsmi_oncecleanly externalises the one-time initialisation. 👍
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
source/lib/rocprof-sys/library/amd_smi.cpp (1)
395-408:addendum_blkis doing two jobs – split for clarityThe lambda now formats both plain device-level metrics and per-XCP metrics.
A tiny helper that just prints the index block ("[02]") would let you compose
the final string with simpleJOINs, remove the nestedif/else, and avoid
the duplicatedJOIN("", '[', (_i < 10 ? "0" : ""), _i, ']')logic.Not urgent, but will pay dividends when another metric type shows up.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
source/lib/core/gpu.cpp(4 hunks)source/lib/rocprof-sys/library/amd_smi.cpp(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- source/lib/core/gpu.cpp
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:462-481
Timestamp: 2025-06-10T12:25:14.111Z
Learning: When JPEG activity is supported by AMD GPUs in the AMD SMI API, the sampling logic ensures there will always be at least one element in m_xcp_metrics, making bounds checking for m_xcp_metrics[0] unnecessary in the supported case.
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:447-468
Timestamp: 2025-06-10T02:42:26.719Z
Learning: When VCN activity is supported by AMD GPUs in the AMD SMI API, there will always be at least one row of values provided, making bounds checking for `m_xcp_metrics[0]` unnecessary in the supported case.
source/lib/rocprof-sys/library/amd_smi.cpp (2)
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:462-481
Timestamp: 2025-06-10T12:25:14.111Z
Learning: When JPEG activity is supported by AMD GPUs in the AMD SMI API, the sampling logic ensures there will always be at least one element in m_xcp_metrics, making bounds checking for m_xcp_metrics[0] unnecessary in the supported case.
Learnt from: sputhala-amd
PR: sputhala-amd/rocprofiler-systems#1
File: source/lib/rocprof-sys/library/amd_smi.cpp:447-468
Timestamp: 2025-06-10T02:42:26.719Z
Learning: When VCN activity is supported by AMD GPUs in the AMD SMI API, there will always be at least one row of values provided, making bounds checking for `m_xcp_metrics[0]` unnecessary in the supported case.
🔇 Additional comments (1)
source/lib/rocprof-sys/library/amd_smi.cpp (1)
232-239: Confirm XCP loop bound is correct for all firmware versions
for(const auto& xcp : _gpu_metrics.xcp_stats)iterates the full fixed-length
array returned by the driver. On some cards onlyxcp_stat_num(or a similarly
named field) entries are valid, the rest are allUINT16_MAX.
You currently rely on thefill_busy_metricsfilter to prune invalid values,
which is fine, but double-checking the driver header for an explicit “count”
field and respecting it would avoid work and future surprises.Can you verify whether
amdsmi_gpu_metrics_texposes such a field and, if it
does, slice the loop accordingly?
rocprof-sys-instrument to avoid instrumenting around C "main" wrapper (#1322) * Add check for Fortran main * Comment change * MAIN__ -> Fortran main * Cray Compiler comment change * Add changelog and troubleshooting comments * Improve CHANGELOG.md message * Change CHANGELOG msg to be in 7.2.0 * Apply review change #1 Co-authored-by: Pratik Basyal <pratik.basyal@amd.com> * Apply review change #2 Co-authored-by: Pratik Basyal <pratik.basyal@amd.com> * Apply review change #3 Co-authored-by: Pratik Basyal <pratik.basyal@amd.com> [rocm-systems] ROCm/rocm-systems#1322 (commit f0a41b6)
=?UTF-8?q?=20Perfetto=20annotations=20to=20match=20new=20n=E2=80=A6=20(#1?= =?UTF-8?q?618)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Rename "corr_id" to "stack_id" in Perfetto annotations to match new naming in schema. Signed-off-by: David Galiffi <David.Galiffi@amd.com> * correlation_id.ancestor was not added until ROCPROFILER_VERSION 1.0 [rocm-systems] ROCm/rocm-systems#1618 (commit 4b0fb2c)
On AMD-SMI in rocm 7.0 vcn_activity and jpeg_activity will not be reported where XCP(partition) stats vcn_busy and jpeg_busy are available.
This causes the activity tracking to fail. The fix is to read the busy values when activity values are not supported.
Summary by CodeRabbit
New Features
Refactor