Skip to content

[rocpdsna] Add v4 schema support with versioned writer#5347

Closed
anujshuk-amd wants to merge 8 commits into
sna-developfrom
users/anujshuk-amd/versioned-rocpd-sna-develop_with_v4
Closed

[rocpdsna] Add v4 schema support with versioned writer#5347
anujshuk-amd wants to merge 8 commits into
sna-developfrom
users/anujshuk-amd/versioned-rocpd-sna-develop_with_v4

Conversation

@anujshuk-amd

@anujshuk-amd anujshuk-amd commented Apr 23, 2026

Copy link
Copy Markdown
Contributor

Motivation

Rocpdsna currently only supports the V3 (3.0.0) schema. As the profiling ecosystem evolves, a newer V4 (4.0.0) schema is needed to support runtime selection of schema version. Changes support...

Timestamp normalization: V4 replaces direct start/end BIGINT columns with foreign keys to a rocpd_timestamp table, enabling deduplication and cross-table timestamp correlation.

Track-based context: V4 consolidates per-row nid/pid/tid/agent_id/queue_id/stream_id columns into a single track_id FK, reducing redundancy and simplifying queries.

Normalized event metadata: V4 moves embedded JSONB call_stack and line_info out of rocpd_event into dedicated rocpd_call_stack and rocpd_line_info tables with proper FK relationships.

New V4-only tables : rocpd_info_category, rocpd_info_address_range, rocpd_info_source_code, and rocpd_info_pc support richer profiling data.

Runtime schema selection: Users can choose V3 or V4 at runtime without recompilation.
This PR implements the V4 schema writers, adds comprehensive test coverage for both schemas, and cleans up the codebase.

Technical Details

  1. V4 Schema Writer Implementation : Implemented region_writer.hpp, kernel_dispatch_writer.hpp, memory_copy_writer.hpp, memory_alloc_writer.hpp, pmc_event_writer.hpp under source/writers/schema_v4/
  2. Insert Statement Layer
  • schema_v3/insert_statements.hpp: updated V3 insert SQL to align with the versioned backend schema under backends/schema/3.0.0/
  • schema_v4/insert_statements.hpp: new V4 insert SQL covering all 27 V4 tables, including normalized timestamp, track, call stack, line info, and category tables
  1. Entity Registry & Utilities updated
  2. Test Coverage: Added new test cases for v3 and v4 schema validation.
    The rocpdsna depends on [rocpd] Schema Updates #347 PR, the latest version of rocprofiler-sdk schema changes.

JIRA ID

TBA

Test Plan

  • Build succeeds with cmake --build
  • All existing v3 unit tests pass
  • New v4 unit tests pass
  • Writer benchmarks run with v4 schema
  • CI formatting check passes (clang-format-18)

Test Result

  • Local build and v3 tests pass
  • v4 tests require updated rocprofiler-sdk-rocpd with v4 schema support

mradosav-amd and others added 8 commits April 22, 2026 23:13
## Summary

Adds a GitHub Actions workflow that enforces formatting on the rocpdsna
project. First in a planned series of CI/CD PRs against `sna-develop`.

Three jobs run on every PR/push that touches `projects/rocpdsna/**`:

| Job | Tool | Scope |
|-----|------|-------|
| `clang-format` | clang-format-18 | 77 files in `source/`, `include/`,
`tests/` (`.cpp` `.hpp` `.h` `.cc`) |
| `cmake-format` | gersemi 0.25.1 | 17 `CMakeLists.txt` and `*.cmake`
files (excluding `build/`, `external/`, `_deps/`) |
| `markdown-lint` | `DavidAnson/markdownlint-cli2-action@v10.0.1` | All
`*.md` under `projects/rocpdsna/` |

## Technical Details

- Triggers: `pull_request` (any target) + `push` to `develop` and
`sna-develop`, scoped via path filters to `projects/rocpdsna/**`.
Excludes `build/` and `docs/graphs/` (binary diagrams).
- Concurrency cancellation: superseded runs cancelled on new push.
- Sparse checkout: every job only fetches `projects/rocpdsna/` +
`.github/workflows/` for speed.
- Read-only permissions.
- Includes one prep commit (`0aed2ea`) that formats
`cmake/rocpdsna-config.cmake.in` with gersemi so the workflow is green
from day one.

Patterns follow existing `rocprofiler-systems-formatting.yml`
(clang-format / gersemi job structure) and `hipfile-clang-format.yml`
(focused single-purpose workflow naming).

## Test Plan

- [x] Local `clang-format-18 -output-replacements-xml` on all 77 source
files: 0 replacements
- [x] Local `gersemi --check` on all 17 CMake files: clean (after the
prep commit)
- [x] YAML syntax validated (`python3 -c "import yaml;
yaml.safe_load(...)"`)
- [ ] First CI run on this PR confirms all 3 jobs pass
@github-actions github-actions Bot added documentation Improvements or additions to documentation github actions Pull requests that update GitHub Actions code labels Apr 23, 2026
@anujshuk-amd anujshuk-amd changed the base branch from develop to sna-develop April 23, 2026 16:05
@anujshuk-amd anujshuk-amd changed the title Users/anujshuk amd/versioned rocpd sna develop with v4 [rocpdsna] Add v4 schema support with versioned writer Apr 23, 2026
@anujshuk-amd anujshuk-amd self-assigned this Apr 23, 2026
@anujshuk-amd anujshuk-amd requested a review from Copilot April 23, 2026 16:09
@anujshuk-amd anujshuk-amd marked this pull request as ready for review April 23, 2026 16:16

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends rocpdsna to support both ROCPD schema v3 (3.0.0) and v4 (4.0.0) at runtime by introducing v4 writers, versioned schema initialization, and additional unit/benchmark coverage.

Changes:

  • Added schema v4 writer implementation (policy + writers + common insert ops) and expanded writer API for v4-only info types.
  • Implemented runtime schema selection (v3 vs v4) in the writer via a std::variant core and versioned sqlite_backend::initialize_schema(version).
  • Added/updated schema SQL (3.0.0 + 4.0.0) and expanded unit tests/benchmarks to exercise versioned behavior and FK toggling.

Reviewed changes

Copilot reviewed 57 out of 58 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
projects/rocpdsna/tests/unit/database_test.cpp Parameterized schema init tests for v3/v4; adds FK PRAGMA behavior tests
projects/rocpdsna/tests/unit/CMakeLists.txt Adds new unit test sources and passes schema-selection compile defs
projects/rocpdsna/tests/benchmarks/writer_bench.cpp Benchmarks now construct storage/writer with an explicit schema version and FK toggle
projects/rocpdsna/tests/benchmarks/writer_parallel_bench.cpp Parallel benchmarks updated for versioned storage and FK toggle; expands event-count args
projects/rocpdsna/source/writers/schema_v4/writer_policy.hpp Introduces v4 writer policy (schema tag + writer types + insert statements binding)
projects/rocpdsna/source/writers/schema_v4/region_writer.hpp v4 region writer using track_id + timestamp IDs + event correlation
projects/rocpdsna/source/writers/schema_v4/kernel_dispatch_writer.hpp v4 kernel dispatch writer using track_id + timestamp IDs
projects/rocpdsna/source/writers/schema_v4/memory_copy_writer.hpp v4 memory copy writer using track_id + timestamp IDs
projects/rocpdsna/source/writers/schema_v4/memory_alloc_writer.hpp v4 memory allocation writer using track_id + timestamp IDs and type/level validation
projects/rocpdsna/source/writers/schema_v4/pmc_event_writer.hpp v4 PMC event writer; inserts event/sample via v4 normalized model
projects/rocpdsna/source/writers/schema_v4/info_registration_writer.hpp v4 info registration including v4-only tables (category/address/source/pc)
projects/rocpdsna/source/writers/schema_v4/common_insert_operations.hpp v4 common ops: timestamp table, track resolution, normalized call-stack/line-info materialization
projects/rocpdsna/source/writers/schema_v3/info_registration_writer.hpp Adds v4+ registration API stubs (no-op / string-only) for v3 writers
projects/rocpdsna/source/writers/schema_v3/common_insert_operations.hpp Extends maybe_insert_sample signature to pass extdata
projects/rocpdsna/source/writers/interfaces/info_registration_writer_interface.hpp Adds v4+ public registration methods (category/address/source/pc)
projects/rocpdsna/source/writer_impl.hpp Switches writer core to runtime-selected variant; exposes FK toggling and v4 APIs
projects/rocpdsna/source/writer_impl.cpp Implements versioned schema init and policy selection (v3 vs v4)
projects/rocpdsna/source/writer.cpp Plumbs new writer APIs and FK toggle to implementation
projects/rocpdsna/source/storage_impl.hpp Adds storage impl ctor overload accepting schema version
projects/rocpdsna/source/storage_impl.cpp Stores default schema version (3.0.0) and versioned ctor path
projects/rocpdsna/source/storage.cpp Adds versioned storage_t constructor
projects/rocpdsna/source/primary_key_providers.hpp Adds v4-only key providers (timestamp/category/callstack/lineinfo/address/source/pc)
projects/rocpdsna/source/entity_utility.hpp Changes missing-key behavior to throw clearer runtime errors
projects/rocpdsna/source/entity_registry.hpp Adds v4-only registries for category/address/source/pc
projects/rocpdsna/source/data_storage/schema_version.hpp Adds schema_v4_tag for compile-time tagging
projects/rocpdsna/source/data_storage/schema_v3/insert_statements.hpp Annotates/realigns v3 insert statement parameter lists for the versioned backend schema
projects/rocpdsna/source/data_storage/read_statements.hpp Fixes include path to sqlite backend header
projects/rocpdsna/source/data_storage/backends/sqlite_backend.hpp Makes schema init versioned; adds FK PRAGMA toggle API
projects/rocpdsna/source/data_storage/backends/sqlite_backend.cpp Implements versioned schema loading, local v3/v4 schema selection, and FK PRAGMA toggle
projects/rocpdsna/source/data_storage/backends/schema/4.0.0/rocpd_tables.sql Adds v4 DDL with normalized timestamps/tracks/callstack/lineinfo and new info tables
projects/rocpdsna/source/data_storage/backends/schema/4.0.0/rocpd_views.sql Adds v4 “pass-through” views to uuid-suffixed tables
projects/rocpdsna/source/data_storage/backends/schema/4.0.0/data_views.sql Adds v4 joined/query-friendly views (tracks/events/kernels/mem/etc.)
projects/rocpdsna/source/data_storage/backends/schema/4.0.0/summary_views.sql Adds v4 summary views (top_kernels/busy/top)
projects/rocpdsna/source/data_storage/backends/schema/4.0.0/rocpd_metadata.sql Adds v4 metadata insertion script (schema version + creation time)
projects/rocpdsna/source/data_storage/backends/schema/4.0.0/rocpd_indexes.sql Adds v4 index DDL
projects/rocpdsna/source/data_storage/backends/schema/4.0.0/rocpd_shema.in Template for generating v4 schema header constants
projects/rocpdsna/source/data_storage/backends/schema/3.0.0/rocpd_tables.sql Adds v3 DDL as versioned backend input
projects/rocpdsna/source/data_storage/backends/schema/3.0.0/rocpd_views.sql Adds v3 “pass-through” views
projects/rocpdsna/source/data_storage/backends/schema/3.0.0/data_views.sql Adds v3 joined/query-friendly views
projects/rocpdsna/source/data_storage/backends/schema/3.0.0/summary_views.sql Adds v3 summary views
projects/rocpdsna/source/data_storage/backends/schema/3.0.0/rocpd_indexes.sql Adds v3 index file (currently mostly commented)
projects/rocpdsna/source/data_storage/backends/schema/3.0.0/marker_views.sql Adds v3 marker views placeholder
projects/rocpdsna/source/data_storage/backends/schema/3.0.0/rocpd_shema.in Template for generating v3 schema header constants
projects/rocpdsna/source/common/types.hpp Modifies header prologue (copyright/SPDX removal)
projects/rocpdsna/source/common/string_conversions.hpp Adds get_key/to_string for new v4 info structs
projects/rocpdsna/include/writer.hpp Exposes v4-only registration APIs + SQLite FK toggle API
projects/rocpdsna/include/storage_types.hpp Adds comparisons/helpers on version_t
projects/rocpdsna/include/storage.hpp Adds versioned storage constructor
projects/rocpdsna/include/shared_types.hpp Adds v4 mapping notes and extends line info entry with extdata
projects/rocpdsna/cmake/rocprofiler-sdk-rocpd.cmake Generates schema headers per version; detects SDK schema-version API availability
projects/rocpdsna/CMakeLists.txt Propagates ROCPD_SQL_HAS_SCHEMA_VERSION define based on SDK detection

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +331 to +336
#if ROCPD_SQL_HAS_SCHEMA_VERSION
schema_kinds.push_back(ROCPD_SQL_SCHEMA_ROCPD_METADATA);
#else
// Old SDK has MARKER_VIEWS instead of METADATA
schema_kinds.push_back(ROCPD_SQL_SCHEMA_ROCPD_MARKER_VIEWS);
#endif

Copilot AI Apr 23, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initialize_schema() unconditionally references ROCPD_SQL_SCHEMA_ROCPD_MARKER_VIEWS when ROCPD_SQL_HAS_SCHEMA_VERSION is 0, but the locally-defined rocpd_sql_schema_kind_t enum (used when USE_SCHEMA_FROM_ROCPROFILER_SDK_ROCPD=0) does not define ROCPD_SQL_SCHEMA_ROCPD_MARKER_VIEWS. This will not compile in the local-schema build. Also, versioned initialization should fail fast when a v4 schema is requested but the linked SDK API cannot select schema versions, to avoid creating a v3 schema while the writers use v4 insert statements.

Copilot uses AI. Check for mistakes.
Comment on lines +52 to +54
LEFT JOIN `rocpd_info_process` P ON P.pid = T.pid
AND P.guid = T.guid
LEFT JOIN `rocpd_info_thread` TH ON TH.tid = T.tid

Copilot AI Apr 23, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the v4 schema, rocpd_track.pid and rocpd_track.tid are foreign keys to rocpd_info_process.id and rocpd_info_thread.id (see the table DDL), but this view joins them against the OS pid/tid columns (P.pid and TH.tid). This breaks the tracks view (and every downstream view that uses it) unless DB primary keys happen to match OS ids. Join on the referenced primary key columns instead (e.g., P.id = T.pid, TH.id = T.tid).

Suggested change
LEFT JOIN `rocpd_info_process` P ON P.pid = T.pid
AND P.guid = T.guid
LEFT JOIN `rocpd_info_thread` TH ON TH.tid = T.tid
LEFT JOIN `rocpd_info_process` P ON P.id = T.pid
AND P.guid = T.guid
LEFT JOIN `rocpd_info_thread` TH ON TH.id = T.tid

Copilot uses AI. Check for mistakes.
`rocpd_info_code_object` CO
INNER JOIN `rocpd_info_agent` A ON CO.agent_id = A.id
AND CO.guid = A.guid
INNER JOIN `rocpd_info_process` P ON CO.pid = P.pid

Copilot AI Apr 23, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code_objects view joins rocpd_info_code_object.pid against rocpd_info_process.pid, but rocpd_info_code_object.pid is a foreign key to rocpd_info_process.id in the v4 DDL. This join will return incorrect results unless PK ids match OS pids. Join on the referenced PK (P.id).

Suggested change
INNER JOIN `rocpd_info_process` P ON CO.pid = P.pid
INNER JOIN `rocpd_info_process` P ON CO.pid = P.id

Copilot uses AI. Check for mistakes.
JSON_EXTRACT(KS.extdata, '$.kernel_address.handle') AS kernel_address
FROM
`rocpd_info_kernel_symbol` KS
INNER JOIN `rocpd_info_process` P ON KS.pid = P.pid

Copilot AI Apr 23, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kernel_symbols view joins rocpd_info_kernel_symbol.pid against rocpd_info_process.pid, but rocpd_info_kernel_symbol.pid is a foreign key to rocpd_info_process.id in the v4 DDL. This should join on P.id to preserve referential correctness.

Suggested change
INNER JOIN `rocpd_info_process` P ON KS.pid = P.pid
INNER JOIN `rocpd_info_process` P ON KS.pid = P.id

Copilot uses AI. Check for mistakes.
Comment on lines +3 to 9
# Copyright(c) 2025 Advanced Micro Devices, Inc. All Rights Reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# of this software and associated documentation files(the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# to use, copy, modify, merge, publish, distribute, sublicense, and / or sell
# copies of the Software, and to permit persons to whom the Software is

Copilot AI Apr 23, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MIT license header text was modified (e.g., Copyright(c), missing spaces in files(the, and and / or). License blocks should remain verbatim to avoid legal ambiguity and to keep consistency with other files.

Copilot uses AI. Check for mistakes.
Comment on lines 1 to 3
#include <cstddef>
#include <functional>
#include <string>

Copilot AI Apr 23, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header had its copyright/SPDX lines removed, which makes it inconsistent with other project headers and may violate the repository’s licensing/header requirements. Restore the standard header block (or apply whatever the project’s current header convention is) rather than deleting it.

Copilot uses AI. Check for mistakes.
CREATE INDEX `rocpd_timestamp{{uuid}}_track_id_idx` ON `rocpd_timestamp{{uuid}}` ("track_id");

-- CREATE INDEX `rocpd_kernel_dispatch{{uuid}}_guid_pid_tid_idx` ON `rocpd_kernel_dispatch{{uuid}}` ("guid", "pid", "tid");
CREATE INDEX `rocpd_memory_copy{{uuid}}_guid_pid_tid_idx` ON `rocpd_memory_copy{{uuid}}` ("guid", "pid", "tid");

Copilot AI Apr 23, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This v4 index targets columns pid and tid on rocpd_memory_copy{{uuid}}, but the v4 table definition uses track_id and does not have pid/tid columns. Applying this schema will fail at init time when SQLite executes the index DDL.

Suggested change
CREATE INDEX `rocpd_memory_copy{{uuid}}_guid_pid_tid_idx` ON `rocpd_memory_copy{{uuid}}` ("guid", "pid", "tid");
CREATE INDEX `rocpd_memory_copy{{uuid}}_guid_track_id_idx` ON `rocpd_memory_copy{{uuid}}` ("guid", "track_id");

Copilot uses AI. Check for mistakes.
Comment on lines +189 to +190
INNER JOIN `rocpd_info_process` P ON P.pid = T.pid
AND N.guid = T.guid

Copilot AI Apr 23, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The threads view join condition references N.guid before N is joined, which makes the SQL invalid. Additionally, it joins P.pid = T.pid, but rocpd_info_thread.pid is a foreign key to rocpd_info_process.id in v4, so it should join on P.id (and P.guid = T.guid).

Suggested change
INNER JOIN `rocpd_info_process` P ON P.pid = T.pid
AND N.guid = T.guid
INNER JOIN `rocpd_info_process` P ON P.id = T.pid
AND P.guid = T.guid

Copilot uses AI. Check for mistakes.
Comment on lines +374 to +377
sqlite3_open(m_database_path.c_str(), &raw);
sqlite3_prepare_v2(
raw, ("SELECT COUNT(*) FROM " + table).c_str(), -1, &stmt, nullptr);
if(sqlite3_step(stmt) == SQLITE_ROW)

Copilot AI Apr 23, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

count_rows() uses sqlite3_open / sqlite3_prepare_v2 / sqlite3_step without checking return codes. If any of these fail, the test can dereference a null/invalid statement handle and crash (masking the real failure). Add assertions on the SQLite return values (and handle stmt == nullptr) so failures are reported cleanly.

Copilot uses AI. Check for mistakes.
@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

This pull request has been inactive for 25 days and will be marked as stale.

If you would like to keep this PR open, please:

  • Add new commits
  • Add a comment explaining why it should remain open

This PR will be automatically closed in 5 days if no further activity occurs.

@github-actions github-actions Bot added the Stale PR has no activity for 25+ days label Jun 7, 2026
@github-actions

Copy link
Copy Markdown
Contributor

This pull request has been automatically closed due to inactivity (30 days with no updates).

If you'd like to continue working on this, feel free to reopen the PR or create a new one.

@github-actions github-actions Bot closed this Jun 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation github actions Pull requests that update GitHub Actions code organization: ROCm Stale PR has no activity for 25+ days

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants