Using semaphore to sync with all peer processes in finalization stage #169

Merged
ammarwa merged 8 commits into develop from import/develop/ROCm_rocprofiler-sdk/huanran_finalization_processes_sync on Aug 25, 2025

@systems-assistant systems-assistant Bot commented Aug 7, 2025

Contributor

[rocprofv3] Implement synchronization using POSIX semaphore in finalization

PR Details

Associated Jira Ticket Number/Link

Previously in https://github.com/AMD-ROCm-Internal/rocprofiler-sdk-internal/pull/500

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update
  • Continuous Integration

Technical details

This PR fixes https://ontrack-internal.amd.com/browse/SWDEV-539022. In some workloads, the framework uses a fast-fail pattern during the exit stage: as soon as one process finishes, it kills all of its peer processes. For an example in vLLM, see vllm/executor/multiproc_worker_utils.py:127.

The fix uses a POSIX semaphore to synchronize with peer processes during finalization: each process waits until all of its peers have finished, then continues executing.
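
The idea of waiting for all peers with a POSIX semaphore can be sketched as a one-shot barrier. This is a minimal illustration and not the code from this PR: the `finalize_barrier` struct, the function names, the leader role, and the timeout are all assumptions made for the example. It uses unnamed semaphores in anonymous shared memory so that it works across `fork()`; unrelated processes would more typically share a named semaphore opened with `sem_open`.

```cpp
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>

// Hypothetical shared state for this sketch; not the data structure used in the PR.
struct finalize_barrier
{
    sem_t arrived;   // posted once by each process that reaches finalization
    sem_t released;  // posted by the leader, once per process, when all have arrived
};

// Wait on sem for at most timeout_sec seconds. Returns true if the wait succeeded.
bool wait_with_timeout(sem_t* sem, int timeout_sec)
{
    timespec deadline{};
    clock_gettime(CLOCK_REALTIME, &deadline);  // sem_timedwait takes an absolute time
    deadline.tv_sec += timeout_sec;
    int rc = 0;
    do
    {
        rc = sem_timedwait(sem, &deadline);
    } while(rc == -1 && errno == EINTR);  // retry if interrupted by a signal
    return rc == 0;
}

// Called by every process at finalization. Returns true once all n_procs
// processes have arrived, false if the timeout expired first.
bool sync_with_peers(finalize_barrier* b, int n_procs, bool is_leader, int timeout_sec)
{
    assert(n_procs > 0);
    sem_post(&b->arrived);  // announce that this process is done
    if(is_leader)
    {
        // Collect one arrival per process (including the leader's own) ...
        for(int i = 0; i < n_procs; ++i)
            if(!wait_with_timeout(&b->arrived, timeout_sec)) return false;
        // ... then let every process continue.
        for(int i = 0; i < n_procs; ++i)
            sem_post(&b->released);
    }
    return wait_with_timeout(&b->released, timeout_sec);
}

// Demo driver: the parent acts as leader and forks n_procs - 1 children.
// Returns the number of processes that passed the barrier, or -1 on setup failure.
int run_demo(int n_procs)
{
    void* mem = mmap(nullptr, sizeof(finalize_barrier), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if(mem == MAP_FAILED) return -1;
    auto* b = static_cast<finalize_barrier*>(mem);
    // pshared = 1 makes the semaphores usable across processes.
    if(sem_init(&b->arrived, 1, 0) != 0 || sem_init(&b->released, 1, 0) != 0) return -1;

    for(int i = 1; i < n_procs; ++i)
    {
        if(fork() == 0)
        {
            usleep(1000 * i);  // children finish their "work" at different times
            _exit(sync_with_peers(b, n_procs, false, 5) ? 0 : 1);
        }
    }

    int passed = sync_with_peers(b, n_procs, true, 5) ? 1 : 0;
    int status = 0;
    while(wait(&status) > 0)  // reap children and count clean exits
        if(WIFEXITED(status) && WEXITSTATUS(status) == 0) ++passed;

    sem_destroy(&b->arrived);
    sem_destroy(&b->released);
    munmap(mem, sizeof(finalize_barrier));
    return passed;
}
```

Calling `run_demo(4)` forks three children and reports how many of the four processes got past the barrier. The timeout is a design choice for the sketch: under a fast-fail framework a peer may be killed before it arrives, and an unbounded `sem_wait` would then hang finalization.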

Added/updated tests?

  • Yes
  • No, Does not apply to this PR.

Updated CHANGELOG?

  • Yes
  • No, Does not apply to this PR.

Added/Updated documentation?

  • Yes
  • No, Does not apply to this PR.

🔁 Imported from ROCm/rocprofiler-sdk#114
🧑‍💻 Originally authored by @rocm-devops

ammallya pushed a commit that referenced this pull request Aug 7, 2025
* Generate codecoverage comment as collapsible summary

* Tweak markdown formatting

[ROCm/rocprofiler-sdk commit: 58ecbd8]
ywang103-amd pushed a commit to ywang103-amd/rocm-systems that referenced this pull request Aug 7, 2025
For compatibility with recent rocprofiler-sdk change.

[ROCm/rocprofiler-systems commit: 2680ccc]
jayhawk-commits pushed a commit that referenced this pull request Aug 8, 2025
Update job_stats_sample.rst

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>

[ROCm/rdc commit: 4a230f0]
@huanrwan-amd huanrwan-amd force-pushed the import/develop/ROCm_rocprofiler-sdk/huanran_finalization_processes_sync branch from 4fc07a5 to a381bc7 Compare August 14, 2025 20:55
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Aug 14, 2025
Comment thread projects/rocprofiler-sdk/source/lib/rocprofiler-sdk-tool/helper.hpp Outdated
Comment thread projects/rocprofiler-sdk/source/lib/rocprofiler-sdk-tool/tool.cpp Outdated
ammarwa commented Aug 15, 2025

Collaborator

Code Coverage Report

Tests Only

code coverage tests.png

Samples Only

code coverage samples.png

Tests + Samples

code coverage all.png

@huanrwan-amd huanrwan-amd force-pushed the import/develop/ROCm_rocprofiler-sdk/huanran_finalization_processes_sync branch 2 times, most recently from 1983ccb to cb60831 Compare August 18, 2025 17:03
jayhawk-commits pushed a commit that referenced this pull request Aug 18, 2025
@bwelton bwelton self-assigned this Aug 19, 2025
@huanrwan-amd huanrwan-amd force-pushed the import/develop/ROCm_rocprofiler-sdk/huanran_finalization_processes_sync branch from cb60831 to cff2b99 Compare August 19, 2025 18:19

@bwelton bwelton left a comment

Contributor

I had reviewed and signed off on the original PR in the other repo. Signing off here.

@huanrwan-amd huanrwan-amd force-pushed the import/develop/ROCm_rocprofiler-sdk/huanran_finalization_processes_sync branch from cff2b99 to b0ecb7d Compare August 22, 2025 15:10
@ammarwa ammarwa force-pushed the import/develop/ROCm_rocprofiler-sdk/huanran_finalization_processes_sync branch from b0ecb7d to 33a8dab Compare August 25, 2025 13:14
@ammarwa ammarwa merged commit b645010 into develop Aug 25, 2025
33 of 37 checks passed
@ammarwa ammarwa deleted the import/develop/ROCm_rocprofiler-sdk/huanran_finalization_processes_sync branch August 25, 2025 13:57
systems-assistant Bot pushed a commit to ROCm/rocprofiler-sdk that referenced this pull request Aug 25, 2025
 finalization stage (#169)

* Using semaphore to sync with all peer processes in finalization stage

[rocprofv3] Implement synchronization using POSIX semaphore in finalization

* clang format code

* clang 11 format code

* Add process sync option for rocprofv3

* Default value of process sync is false

* Update source/lib/rocprofiler-sdk-tool/tool.cpp

Apply suggestion by Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* update according to comments

* add new line to helper.hpp
[rocm-systems] ROCm/rocm-systems#169 (commit b645010)
ammallya pushed a commit that referenced this pull request Jan 21, 2026
Bumps [requests](https://github.com/psf/requests) from 2.32.3 to 2.32.4.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](psf/requests@v2.32.3...v2.32.4)

---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.32.4
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rocshmem commit: 49f7f1b]
ammallya pushed a commit that referenced this pull request Jan 30, 2026
…ocs/sphinx (#169)

Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.21.1 to 1.22.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v1.21.1...v1.22.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-version: 1.22.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rocjpeg commit: a43e7f1]
ammallya pushed a commit that referenced this pull request Apr 9, 2026
Adds instructions on installing hipFile using the nightly packages and
building aiscp.

[ROCm/hipFile commit: 77d9fd5]