Skip to content

[hipFFTW] hipFort bindings and tests#295

Draft
marikurz-amd wants to merge 4 commits into
developfrom
users/marikurz/hipfftw
Draft

[hipFFTW] hipFort bindings and tests#295
marikurz-amd wants to merge 4 commits into
developfrom
users/marikurz/hipfftw

Conversation

@marikurz-amd

Copy link
Copy Markdown

Status: This is a draft. The Fortran API is not complete yet, but can be extended in future work. This PR is opened mainly to generate visibility and avoid duplicate work.

Motivation

This PR starts the addition of FFTW-style Fortran interfaces that allow to call hipFFTW through hipFort. This PR adds a first batch of interfaces, in particular double-precision, legacy-style interfaces for the basic and advanced API. It also adds a CMake structure that builds tests depending on the hipfort::hipfftw target being available or not.

Technical Details

  • First commit: Adds hipfort_hipfftw.F90 with the first hipFFTW Fortran interfaces (double precision, “legacy” style types; basic and new-array transforms).
  • Second commit: Adds the advanced *_many interfaces and tidies formatting and comments in that file.
  • Third commit: Adds F2003 tests for 1D C2C, R2C/C2R round-trip, and batched many cases (including interleaved, padded embed, and many-vs-individual). Updates CMake and package config so the hipfort::hipfftw target is exported and tests can build; updates Makefiles under test/ as needed.

The build only runs a test if the matching C API symbols exist in libhipfftw, so older or partial ROCm installs do not fail the whole test suite.

Test Plan

  • Configure with ROCm so hip::hipfftw and libhipfftw are found.
  • Build the hipFFTW piece and the hipfftw_* tests when the required symbols are present.
  • Run the three tests: hipfftw_c2c, hipfftw_r2c_c2r, and hipfftw_many. Expect some tests to be skipped if a symbol is not in the library yet.
  • The tests cover the majority of implemented API calls and strive to reduce duplicate testing.

Test Result

ROCm hipFFTW CMake target Tests built CTest summary hipFFTW tests run¹
7.0.0 skipped 30 30 / 30 passed 0
7.0.1 skipped 30 30 / 30 passed 0
7.0.2 skipped 30 30 / 30 passed 0
7.1.0 yes 30 30 / 30 passed 0
7.1.1 yes 30 30 / 30 passed 0
7.2.0 yes 32 32 / 32 passed 2
7.2.1 yes 32 32 / 32 passed 2

¹ On 7.2.x, CTest registered four hipFFTW-related tests; two binaries were built and executed (hipfftw_c2c, hipfftw_r2c_c2r). All passed.

Submission Checklist

…y legacy-type interfaces in double-precision for the basic and new-array transforms.
Add F2003 GPU tests covering 1D C2C, R2C/C2R round-trip, and advanced
batched many interfaces (interleaved, padded embed, many-vs-individual).
Include corresponding CMake target export and package config integration.
@marikurz-amd marikurz-amd requested a review from cgmb as a code owner April 10, 2026 11:15
@marikurz-amd marikurz-amd marked this pull request as draft April 10, 2026 11:16
@marikurz-amd marikurz-amd changed the title Draft: hipFort bindings and tests for hipFFTW hipFFTW: hipFort bindings and tests Apr 10, 2026
@marikurz-amd marikurz-amd changed the title hipFFTW: hipFort bindings and tests [hipFFTW] hipFort bindings and tests Apr 10, 2026
…n built against exports the necessary symbols.

@regan-amd regan-amd left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like the right direction to me. Most comments are cosmetic and/or simple suggestions...

I'm assuming upcoming work items include (not necessarily in this order)

  • expanding test coverage for
    • dfftw_plan_dft_2d, dfftw_plan_dft_3d, dfftw_plan_dft (and r2c/c2r equivalent);
    • dfftw_execute;
    • in-place transforms;
  • expanding interface to guru and guru64 plan creation subroutines as well along with testing thereof;
  • covering single-precision as well.

Is that correct? If not, please clarify what's not planned for (and/or what else is).

! ==============================================================================
! hipfort: FORTRAN Interfaces for GPU kernels
! ==============================================================================
! Copyright (c) 2020-2022 Advanced Micro Devices, Inc. All rights reserved.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
! Copyright (c) 2020-2022 Advanced Micro Devices, Inc. All rights reserved.
! Copyright (c) 2026 Advanced Micro Devices, Inc. All rights reserved.

!> @details Allocates and initializes an N-dimensional complex-to-complex FFT plan.
!>
!> @param[in] rank Number of dimensions.
!> @param[in] n Array of dimensions.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
!> @param[in] n Array of dimensions.
!> @param[in] n Array of lengths.

+ same comment elsewhere.

Comment on lines +94 to +105
!> @brief C interface: Create a 2D complex-to-complex FFT plan.
!>
!> @details Allocates and initializes a 2D complex-to-complex FFT plan.
!>
!> @param[in] n0 Length of the transform in the first dimension.
!> @param[in] n1 Length of the transform in the second dimension.
!> @param[in] in Input array.
!> @param[in] out Output array.
!> @param[in] sign Sign of the exponent in the formula that defines the Fourier transform.
!> @param[in] flags FFTW planner flags.
!> @return Handle to the FFT plan.
interface

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should leave the (private) binding functions undocumented and keep documentation only for the public subroutines of this module? I'm afraid this may trigger some confusion regarding the ordering of dimensions for users, e.g., w.r.t. the meaning of "{first, second, third} dimension" across the various documentation bits in this file... Unless I'm mistaken, these functions are not directly accessible to users of this module so there is no need for them to know much about...

Alternatively, if we feel strongly about keeping the private binders documented, my suggestion would be to change these by a simple comment referring them to the actual C header file hipfft/hipfftw.h if they're ever seeking more information about this. That would

  • alleviate duplication by design and naturally avoid out-of-sync documentation updates, if any;
  • implicitly indicate to users/interested readers that they're then leaving the realm of Fortran for C conventions if following the link (thereby abandoning their Fortran-specific default dimension-ordering convention).

Comment thread lib/hipfort/hipfort_hipfftw.F90
implicit none
integer(c_int64_t) :: plan
integer(c_int) :: rank
integer(c_int), dimension(rank), target :: n

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
integer(c_int), dimension(rank), target :: n
integer(c_int), dimension(rank), target, intent(in) :: n

+ same elsewhere for all subroutines' arguments of non-primary type.

end if

! Backward C2R: c2r(r2c(x)) should equal N * x
call hipCheck(hipMemcpy(dy, c_loc(hresult_c(1)), Nbytes_c, hipMemcpyHostToDevice))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This copy is unnecessary in the current form given the verifications done above.

Comment thread test/CMakeLists.txt
endif()

if(HIPFFTW_HAS_EXECUTE_DFT)
hipfort_add_test(hipfftw hipfftw_c2c f2003)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I believe you could easily enable this hipfftw_c2c test for the dfftw_execute subroutine as well (e.g., with pre-processor compilation-time defines): the execution arrays dx_f and dx_y were used at plan creation so dfftw_execute (using the plan only) should operate on those.

That would enable some minimal coverage even for the "1.0.22" version of hipFFT.

integer, intent(inout) :: nfail
integer(c_int), parameter :: NX = 4, NY = 6, LDX = 8, howmany = 2
integer(c_size_t), parameter :: total_bytes = LDX * NY * howmany * 16
complex(c_double_complex), allocatable, target :: hx(:), hresult(:)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use Fortran multi-dimensional allocatable arrays, e.g., with 3 array dimensions in this case (2 for length and 1 for batch)? Wouldn't that make things a little more readable and better align with standard Fortran coding practices? If taking the suggestion, please apply consistently throughout.

Comment on lines +32 to +33
call hipCheck(hipMalloc(dx, Nbytes))
call hipCheck(hipMalloc(dy, Nbytes))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hipFFTW is supposed to be able to handle host-residing I/O data, and my understanding was that it would be the main use case scenario in fact. If the latter is true for Fortran users as well, I would recommend testing with host-residing I/O data, too (and/or only that if test scopes must be kept small)...
That would significantly simplify the tests and align better with the most relevant use case scenario (in my understanding).

@@ -0,0 +1,3 @@
CFLAGS ?= -lhipfft -lhipfftw -lrocfft

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CFLAGS ?= -lhipfft -lhipfftw -lrocfft
CFLAGS ?= -lhipfftw -lrocfft

If actually needed, could you clarify what it is needed for? This is surprising to me.

Also, why isn't it FFLAGS?

@ronlieb ronlieb self-requested a review May 25, 2026 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants