initial OpenCL SDK review #1

@bashbaug

Hello!

Here are a few brief comments regarding the changes in the reducecpp branch:

  • It looks like C++14 is now required, whereas only C++11 was required previously. It's almost 2022, so that's probably ok, but worth confirming.
  • I didn't have TCLAP so I was unable to build the SDK after cloning. Can we mirror a copy of TCLAP in this repo to simplify building or add TCLAP as a submodule? If not, we need to update the README with the exact steps needed to build, to minimize barriers to entry.
  • Just FYI, I also cannot update the headers, ICD loader, and C++ bindings submodules in this branch.
  • Several files use the abbreviation "plat" for "platform" and "dev" for "device". I would recommend spelling these out completely, to aid understanding in the SDK.
  • The previous "add_sample" CMake function enabled testing conditionally for each sample, whereas I believe the latest changes control testing globally. Is this intended? I've found that some samples do not lend themselves well to automated testing, though having testing for all samples is a good goal.
  • If you're looking for another reduction variant, the very simplest textbook reduction usually uses atomics (although it's also usually very slow).
  • The check for subgroup support in "may_use_sub_group_reduce" doesn't look quite right: some implementations support subgroups but do not report cl_khr_subgroups because they lack subgroup independent forward progress. The OpenCL API spec recommends detecting subgroup support by checking for a non-zero value returned by the CL_DEVICE_MAX_NUM_SUB_GROUPS query.
  • Given the purpose of the sample ("demonstrate how to query various extensions applicable in the context of a reduction algorithm") I think it would be worth adding more comments describing what the different queries are doing.
  • Should there be a way to force an alternate implementation (if supported) vs. choosing the default, to compare performance or correctness? For example, could I try the vanilla or subgroup variant even if my device supports work-group collective functions?
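
To make the subgroup check and the "force a variant" idea above a bit more concrete, here is a rough sketch of the decision logic only. Everything in it is hypothetical and not taken from the SDK: `DeviceCaps`, `Variant`, `supports_sub_groups`, `is_supported`, and `choose_variant` are made-up names, and the default preference order (work-group collectives, then subgroups, then vanilla) is my assumption about what the sample does today. I'm also assuming the value of `max_num_sub_groups` has already been read with clGetDeviceInfo and CL_DEVICE_MAX_NUM_SUB_GROUPS, and that support for work-group collective functions was determined elsewhere.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical summary of the device queries; not an SDK type.
struct DeviceCaps {
    // Value returned for CL_DEVICE_MAX_NUM_SUB_GROUPS; 0 means no subgroups.
    std::uint32_t max_num_sub_groups;
    // Whether work-group collective functions are usable (queried elsewhere).
    bool work_group_collectives;
};

// Auto lets the sample pick; the others force a specific implementation.
enum class Variant { Auto, Vanilla, SubGroup, WorkGroup };

// Subgroup support is inferred from the query value, not from the
// cl_khr_subgroups extension string.
inline bool supports_sub_groups(const DeviceCaps& caps) {
    return caps.max_num_sub_groups != 0;
}

inline bool is_supported(Variant v, const DeviceCaps& caps) {
    switch (v) {
        case Variant::Auto:      return true;
        case Variant::Vanilla:   return true;
        case Variant::SubGroup:  return supports_sub_groups(caps);
        case Variant::WorkGroup: return caps.work_group_collectives;
    }
    return false;
}

// Returns the variant to run. A forced variant is honored only if the
// device supports it; otherwise an exception is thrown.
inline Variant choose_variant(const DeviceCaps& caps,
                              Variant requested = Variant::Auto) {
    if (requested != Variant::Auto) {
        if (!is_supported(requested, caps)) {
            throw std::invalid_argument(
                "requested reduction variant is not supported by this device");
        }
        return requested;
    }
    // Assumed default order: most specialized variant first.
    if (caps.work_group_collectives) return Variant::WorkGroup;
    if (supports_sub_groups(caps))   return Variant::SubGroup;
    return Variant::Vanilla;
}
```

With something like this, a command-line option could map directly to `requested`, so `choose_variant(caps, Variant::Vanilla)` would run the vanilla kernel even on a device that supports work-group collective functions, while an unsupported request fails loudly instead of silently falling back.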
