feat:Implement perf event groups, scaled reads, and group snapshots by SiyuanSun0736 · Pull Request #22 · multikernel/kernelscript

SiyuanSun0736 · 2026-05-19T07:59:25Z

Overview

This PR introduces the ability to group multiple perf metrics (e.g., cache misses, branch misses, cycles) into a single scheduling group. This ensures that counters observing the same workload are started and stopped together, solving the issue of misaligned results from independently managed counters.

Additionally, it adds multiplex-aware read APIs and static PMU slot limit validation, and fixes several userspace codegen edge cases so that snapshot data can be consumed reliably.

Key Features & User-Facing Changes

1. High-Level Grouping API

New group field: Added a high-level group field in perf_options to easily attach members to a leader.

var cache = attach(prog, perf_options { perf_type: perf_type_hardware, perf_config: cache_misses }, 0)
var branch = attach(prog, perf_options { perf_type: perf_type_hardware, perf_config: branch_misses, group: cache }, 0)

Compatibility: The lower-level group_fd: leader.perf_fd approach is preserved for backward compatibility.

2. Multiplex-Aware Read APIs

read(att): Now returns scaled values by default. When PMU multiplexing occurs, the raw count is corrected by the ratio time_enabled / time_running; when no multiplexing happens, the result equals the raw count.
read_raw(att): Returns the uncorrected, raw counter values.
read_details(att): Returns a struct containing raw, scaled, time_enabled, and time_running, which is useful for manual delta or rate calculations.
read_group(leader): Captures an atomic snapshot of the entire group. Returns up to 16 ID/Value pairs (where values[] are pre-scaled according to snapshot timing) and snapshot time fields.
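The scaling rule behind read() and the fields exposed by read_details() can be modeled in a few lines of Python. This is only an illustrative sketch, not the code this PR generates: the names `scale_count`, `ReadDetails`, `make_details`, and `rate_per_second` are hypothetical, returning 0 for a counter that never ran is an assumption, and treating the time fields as nanoseconds follows the usual perf_event_open convention.

```python
from dataclasses import dataclass


def scale_count(raw: int, time_enabled: int, time_running: int) -> int:
    """Estimate the count over the whole enabled window of a multiplexed counter."""
    if time_running == 0:
        # Assumption: a counter that never received PMU time reports 0.
        return 0
    if time_running == time_enabled:
        # Fast path: no multiplexing, so the raw count is already exact.
        return raw
    # Slow path: extrapolate the raw count to the full enabled window.
    return raw * time_enabled // time_running


@dataclass
class ReadDetails:
    """Hypothetical mirror of the fields read_details() is described as returning."""
    raw: int
    scaled: int
    time_enabled: int
    time_running: int


def make_details(raw: int, time_enabled: int, time_running: int) -> ReadDetails:
    scaled = scale_count(raw, time_enabled, time_running)
    return ReadDetails(raw, scaled, time_enabled, time_running)


def rate_per_second(before: ReadDetails, after: ReadDetails) -> float:
    """Events per second between two samples, using deltas of the scaled count."""
    delta_events = after.scaled - before.scaled
    delta_ns = after.time_enabled - before.time_enabled
    return delta_events * 1e9 / delta_ns if delta_ns > 0 else 0.0
```

Dividing the scaled delta by the time_enabled delta is one reasonable way to derive a rate; dividing the raw delta by the time_running delta is an alternative that avoids extrapolation.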

3. Group Lifecycle Management

Group Restarts: Dynamically attaching a new member to an existing active group now triggers a disable/reset/enable sequence on the whole group, ensuring counters start from zero together.
Cascading Detach: Detaching a group leader is no longer rejected while the group still has active members. The detach now cascades and automatically detaches all active members.
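The two lifecycle rules above can be illustrated with a toy Python model. Everything here is hypothetical (the `PerfGroup` class, its method names, and the members-before-leader detach order are assumptions made for the example); the real implementation drives perf event file descriptors rather than an in-memory dictionary.

```python
class PerfGroup:
    """Toy model of a perf group: a leader plus members sharing one schedule."""

    def __init__(self, leader: str):
        self.leader = leader
        self.counts = {leader: 0}  # event name -> simulated count
        self.enabled = True
        self.log = []              # simulated control operations, in order

    def tick(self, n: int = 1) -> None:
        """Simulate n events observed by every counter while the group runs."""
        if self.enabled:
            for name in self.counts:
                self.counts[name] += n

    def attach_member(self, name: str) -> None:
        """Attach a member, then restart the whole group from zero."""
        self.counts[name] = 0
        self.enabled = False
        self.log.append("disable")
        for event in self.counts:
            self.counts[event] = 0
        self.log.append("reset")
        self.enabled = True
        self.log.append("enable")

    def detach_leader(self) -> list:
        """Cascade: detach every member, then the leader itself."""
        members = [name for name in self.counts if name != self.leader]
        detached = members + [self.leader]
        self.counts.clear()
        self.enabled = False
        return detached
```

Because the restart zeroes every counter in the group, values read after a late attach remain comparable across members.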

4. Compile-Time PMU Slot Validation

Statically visible perf groups are now evaluated during the type-checking phase to calculate hardware PMU slot consumption.
Compilation fails early if the group is too large. The limit defaults to 4 (or is probed dynamically from sysfs), and can be overridden via the KERNELSCRIPT_PERF_GROUP_MAX_EVENTS environment variable.
perf_type_software and perf_type_tracepoint are correctly excluded from hardware PMU slot counts.
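A minimal Python sketch of the slot check described above follows. It is not the type checker's actual code: the function names are hypothetical, the sysfs probe is omitted, falling back to the default on an invalid override is an assumption, and so is the simplification that every non-excluded event consumes exactly one slot.

```python
import os

DEFAULT_MAX_EVENTS = 4
ENV_OVERRIDE = "KERNELSCRIPT_PERF_GROUP_MAX_EVENTS"
# Event types described in the PR as not consuming hardware PMU slots.
NON_PMU_TYPES = {"perf_type_software", "perf_type_tracepoint"}


def max_group_events(env=None) -> int:
    """Return the slot limit, honoring the environment override when valid."""
    env = os.environ if env is None else env
    value = env.get(ENV_OVERRIDE)
    if value is not None and value.isdigit() and int(value) > 0:
        return int(value)
    # Assumption: an unset or invalid override falls back to the default.
    return DEFAULT_MAX_EVENTS


def pmu_slots_used(group_types) -> int:
    """Count the events in a group that occupy a hardware PMU slot."""
    return sum(1 for t in group_types if t not in NON_PMU_TYPES)


def validate_group(group_types, env=None) -> None:
    """Raise ValueError when a statically known group exceeds the limit."""
    used = pmu_slots_used(group_types)
    limit = max_group_events(env)
    if used > limit:
        raise ValueError(f"perf group needs {used} PMU slots, limit is {limit}")
```

For instance, a group of four hardware events plus one software event passes under the default limit, while five hardware events are rejected unless the override raises the limit.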

Internal & Codegen Improvements

Array IR Lowering: Fixed array indexing and dereferencing in IR lowering to ensure user-space C code generates correctly when iterating over read_group() snapshot arrays (snapshot.ids[i] / snapshot.values[i]).
Array Initialization: Modified non-literal array initializations to "declare first, then memcpy", preventing invalid C generation from snapshot struct fields.
Variable Declarations: Fixed an issue where reused for loop counters and subsequent variables of the same name produced duplicate function-level C declarations.
Read Helpers: Added raw/details/group perf read helpers, leveraging 128-bit intermediate values for safe multiplex scaling.
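To see why the read helpers need a wider intermediate, consider the product raw * time_enabled, which can exceed 64 bits on long-running counters. The Python sketch below emulates a wrapping 64-bit multiply and compares it with the full-width computation; the function names are hypothetical, and Python integers stand in for the 128-bit intermediate used by the generated C.

```python
MASK64 = (1 << 64) - 1


def scale_wrapping_64(raw: int, time_enabled: int, time_running: int) -> int:
    """Emulate a naive 64-bit multiply: the product wraps modulo 2**64."""
    return ((raw * time_enabled) & MASK64) // time_running


def scale_wide(raw: int, time_enabled: int, time_running: int) -> int:
    """Keep the full product, which fits in 128 bits for 64-bit inputs."""
    product = raw * time_enabled
    assert product < (1 << 128)
    return product // time_running


# About one hour of cycles on a multi-GHz core, scheduled half of the time.
RAW = 10**13                   # raw event count
TIME_ENABLED = 3_600 * 10**9   # one hour in nanoseconds
TIME_RUNNING = 1_800 * 10**9   # half an hour in nanoseconds
```

With these inputs the product is about 3.6e25, well above 2**64 (about 1.8e19), so only the wide computation yields the expected doubling of the raw count.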

Documentation & Examples

examples/perf_cache_miss.ks: Refactored to use the new group API. Added demonstrations of read_details() for rate calculation and read_group() for iterating through snapshot id/value pairs.
examples/perf_page_fault.ks: Extended to demonstrate updated perf read semantics.
Docs: Updated README.md, SPEC.md, and BUILTINS.md to reflect group semantics, read interfaces, and PMU slot constraints.
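For readers who want to see what iterating a group snapshot involves, here is a hedged Python sketch that decodes a buffer in the kernel's group read layout (nr, time_enabled, time_running, then value/id pairs, as produced with PERF_FORMAT_GROUP, PERF_FORMAT_ID, and both total-time flags). It assumes read_group() is built on that layout, which this PR description does not confirm; `decode_group_snapshot` and the returned dictionary shape are hypothetical.

```python
import struct

MAX_GROUP_EVENTS = 16  # the PR states read_group() returns up to 16 pairs


def decode_group_snapshot(buf: bytes) -> dict:
    """Decode a group read buffer into ids, scaled values, and timing fields."""
    nr, time_enabled, time_running = struct.unpack_from("=3Q", buf, 0)
    if nr > MAX_GROUP_EVENTS:
        raise ValueError(f"snapshot has {nr} events, expected at most {MAX_GROUP_EVENTS}")
    ids, values = [], []
    for i in range(nr):
        # Each entry is a (value, id) pair of unsigned 64-bit integers.
        raw, event_id = struct.unpack_from("=2Q", buf, 24 + 16 * i)
        ids.append(event_id)
        # Scale with the snapshot's own timing; 0 if the group never ran (assumption).
        values.append(raw * time_enabled // time_running if time_running else 0)
    return {
        "ids": ids,
        "values": values,
        "time_enabled": time_enabled,
        "time_running": time_running,
    }


def iter_pairs(snapshot: dict):
    """Yield (id, value) pairs, similar to walking snapshot.ids[i] and snapshot.values[i]."""
    return zip(snapshot["ids"], snapshot["values"])
```

Since every member is scaled with the same time_enabled and time_running, ratios between members of one snapshot are unaffected by multiplexing.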

Test Coverage

Added IR and codegen assertions for both group_fd and high-level group paths.
Covered member-attach group restarts, ioctl generation, and cascading leader detaches.
Covered multiplex scaling fast/slow paths for read(), and helper generation for read_raw(), read_details(), and read_group().
Covered oversized static group validation during compilation.
Added regression tests for reuse of `for` loop counter variables in userspace codegen.

- Introduced `group_fd` field in the perf options structure to allow attaching BPF programs to a group of perf events.
- Updated the `ks_open_perf_event` function to accept `group_fd` and handle group event management.
- Implemented helper functions for managing active members of perf event groups, ensuring that group leaders cannot be detached while active members exist.
- Enhanced the generated code to include necessary checks and structures for handling multiplexed perf events.
- Added tests to validate the new group management features and ensure correct code generation for group-related operations.

- Introduced functions to manage performance event groups, including detection of maximum events and validation of static groups.
- Added support for new performance read functions: `read_raw`, `read_details`, and `read_group`, along with their corresponding structures and handling in the code generation.
- Enhanced the type checker to validate performance event group attachments and ensure no cycles exist in group leader relationships.
- Updated userspace code generation to track usage of new performance read functions and manage group attachments.
- Added tests for new functionality, including validation of oversized static performance event groups and code generation for new read functions.

SiyuanSun0736 added 3 commits May 19, 2026 07:00

Enhanced the output of cache miss counts and branch miss counts for performance event groups; added snapshot index printing functionality; updated userspace code generation tests to verify variable reuse logic.

68bfab7

SiyuanSun0736 wants to merge 3 commits into multikernel:main from SiyuanSun0736:perf-group
