feat: Implement perf event groups, scaled reads, and group snapshots #22
Open
SiyuanSun0736 wants to merge 3 commits into
- Introduced `group_fd` field in the perf options structure to allow attaching BPF programs to a group of perf events.
- Updated the `ks_open_perf_event` function to accept `group_fd` and handle group event management.
- Implemented helper functions for managing active members of perf event groups, ensuring that group leaders cannot be detached while active members exist.
- Enhanced the generated code to include necessary checks and structures for handling multiplexed perf events.
- Added tests to validate the new group management features and ensure correct code generation for group-related operations.
- Introduced functions to manage performance event groups, including detection of maximum events and validation of static groups.
- Added support for new performance read functions: `read_raw`, `read_details`, and `read_group`, along with their corresponding structures and handling in the code generation.
- Enhanced the type checker to validate performance event group attachments and ensure no cycles exist in group leader relationships.
- Updated userspace code generation to track usage of new performance read functions and manage group attachments.
- Added tests for new functionality, including validation of oversized static performance event groups and code generation for new read functions.
…erformance event groups; added snapshot index printing functionality; updated userspace code generation tests to verify variable reuse logic.
Overview
This PR introduces the ability to group multiple `perf` metrics (e.g., cache misses, branch misses, cycles) into a single scheduling group. This ensures that counters observing the same workload are started and stopped together, solving the issue of misaligned results from independently managed counters.

Additionally, it brings multiplex-aware read APIs and static PMU slot limit validation, and fixes several internal userspace codegen edge cases to stabilize snapshot data consumption.
Key Features & User-Facing Changes
1. High-Level Grouping API
- `group` field: Added a high-level `group` field in `perf_options` to easily attach members to a leader.
- The `group_fd: leader.perf_fd` approach is preserved for backward compatibility.

2. Multiplex-Aware Read APIs
- `read(att)`: Now returns scaled values by default, corrected via `time_enabled / time_running` when PMU multiplexing occurs. (Matches the raw count if no multiplexing happens.)
- `read_raw(att)`: Returns the uncorrected, raw counter values.
- `read_details(att)`: Returns a struct containing `raw`, `scaled`, `time_enabled`, and `time_running`, which is useful for manual delta or rate calculations.
- `read_group(leader)`: Captures an atomic snapshot of the entire group. Returns up to 16 ID/value pairs (where `values[]` are pre-scaled according to snapshot timing) and snapshot time fields.

3. Group Lifecycle Management
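The first commit in this PR states that a group leader cannot be detached while active members exist. A minimal sketch of that rule in C, assuming a simple per-leader member count; the struct and function names are hypothetical and are not taken from the PR's generated code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one group leader (illustration only). */
struct group_leader {
    int perf_fd;            /* leader's perf event fd; -1 once detached */
    size_t active_members;  /* members currently attached to this leader */
};

/* Record that a member joined the leader's group. */
static void member_attached(struct group_leader *l)
{
    l->active_members++;
}

/* Record that a member left the group; never underflows. */
static void member_detached(struct group_leader *l)
{
    if (l->active_members > 0)
        l->active_members--;
}

/* Detach the leader only when no members remain.
 * Returns false (and leaves the leader untouched) otherwise. */
static bool leader_try_detach(struct group_leader *l)
{
    if (l->active_members > 0)
        return false;
    l->perf_fd = -1;
    return true;
}
```

With this shape, a caller would detach every member first and only then call `leader_try_detach()`; a refused detach leaves the group intact.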
4. Compile-Time PMU Slot Validation
- … `4` (or dynamically probes `sysfs`), and can be overridden via the `KERNELSCRIPT_PERF_GROUP_MAX_EVENTS` environment variable.
- `perf_type_software` and `perf_type_tracepoint` are correctly excluded from hardware PMU slot counts.

Internal & Codegen Improvements
- … `read_group()` snapshot arrays (`snapshot.ids[i]` / `snapshot.values[i]`).
- … `memcpy`, preventing invalid C generation from snapshot struct fields.
- … `for` loop counters and subsequent variables of the same name produced duplicate function-level C declarations.

Documentation & Examples
- `examples/perf_cache_miss.ks`: Refactored to use the new `group` API. Added demonstrations of `read_details()` for rate calculation and `read_group()` for iterating through snapshot `id`/`value` pairs.
- `examples/perf_page_fault.ks`: Extended to demonstrate updated perf read semantics.
- Updated `README.md`, `SPEC.md`, and `BUILTINS.md` to reflect group semantics, read interfaces, and PMU slot constraints.

Test Coverage
- … `group_fd` and high-level `group` paths.
- … `read()`, and helper generation for `read_raw()`, `read_details()`, and `read_group()`.
- … `for` loop counter variable reuse in userspace codegen.
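For reviewers, the correction that `read(att)` is described as applying under multiplexing can be sketched in C as `raw * time_enabled / time_running`. This is an illustration only: `struct perf_details`, `scale_count`, and `make_details` are hypothetical names that mirror the fields listed for `read_details(att)`, and returning 0 when `time_running` is 0 is an assumption, not behavior confirmed by this PR.

```c
#include <stdint.h>

/* Hypothetical mirror of the fields read_details(att) is described as returning. */
struct perf_details {
    uint64_t raw;
    uint64_t scaled;
    uint64_t time_enabled;
    uint64_t time_running;
};

/* Estimate the full-interval count from a counter that was only scheduled
 * on the PMU for time_running out of time_enabled. */
static uint64_t scale_count(uint64_t raw, uint64_t time_enabled,
                            uint64_t time_running)
{
    if (time_running == 0)
        return 0;   /* assumption: never scheduled, so no estimate */
    if (time_running >= time_enabled)
        return raw; /* no multiplexing: scaled value equals raw count */
    /* Floating point avoids 64-bit overflow of raw * time_enabled,
     * at the cost of rounding for very large counts. */
    return (uint64_t)((double)raw * (double)time_enabled /
                      (double)time_running);
}

/* Build a details record from one raw read. */
static struct perf_details make_details(uint64_t raw, uint64_t time_enabled,
                                        uint64_t time_running)
{
    struct perf_details d = {
        .raw = raw,
        .scaled = scale_count(raw, time_enabled, time_running),
        .time_enabled = time_enabled,
        .time_running = time_running,
    };
    return d;
}
```

For example, a counter that recorded 300 events while running for 30 of 90 enabled time units would be scaled to 900 under this sketch.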