Skip to content

Enzyme integration likely causes divergent kernels #584

@vchuravy

Description

@vchuravy

While looking at #583 I noticed that the aug_fwd kernel looks like:

function aug_fwd(
        ctx,
        f::FT,
        ::Val{ModifiedBetween},
        subtape,
        ::Val{TapeType},
        args...,
    ) where {ModifiedBetween, FT, TapeType}
    # A2 = Const{Nothing} -- since f->Nothing
    forward, _ = EnzymeCore.autodiff_deferred_thunk(
        ReverseSplitModified(ReverseSplitWithPrimal, Val(ModifiedBetween)),
        TapeType,
        Const{Core.Typeof(f)},
        Const{Nothing},
        Const{Core.Typeof(ctx)},
        map(Core.Typeof, args)...,
    )

    # On the GPU: F is a per thread function
    # On the GPU: subtape::Vector
    if __validindex(ctx)
        I = __index_Global_Linear(ctx)
        subtape[I] = forward(Const(f), Const(ctx), args...)[1]
    end
    return nothing
end

This will create divergent execution of barrier operations #558 (comment)
Likely this is also broken with @kernel unsafe_indicies=true

cc: @michel2323

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No fields configured for Bug.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions