add depthwise_conv* overloads for CUDA by DhairyaLGandhi · Pull Request #22 · FluxML/NNlibCUDA.jl

DhairyaLGandhi · 2021-07-17T04:48:01Z

No description provided.

CarloLucibello · 2021-07-17T05:56:20Z

+function depthwise_conv!(y::DenseCuArray{T}, x::DenseCuArray{T}, w::DenseCuArray{T}, cdims::DepthwiseConvDims;
+               alpha = 1, beta = 0, algo = -1) where T <: CUDNNFloat
+    conv!(y, x, w, cdims; alpha, beta, algo)
+end
+
+function ∇depthwise_conv_filter!(dw::DenseCuArray{T}, x::DenseCuArray{T}, dy::DenseCuArray{T},
+                       cdims::ConvDims; alpha = 1, beta = 0, algo = -1) where T <: CUDNNFloat
+    ∇conv_filter!(dw, x, dy, cdims; alpha, beta, algo)
+end
+
+function ∇depthwise_conv_data!(dx::DenseCuArray{T}, dy::DenseCuArray{T}, w::DenseCuArray{T},
+                     cdims::ConvDims; alpha = 1, beta = 0, algo = -1) where T <: CUDNNFloat
+    ∇conv_data!(dx, dy, w, cdims; alpha, beta, algo)
+end


These don't have to be CUDA-specific: we can add them to NNlib and remove the specialized implementations (after a performance comparison).

DhairyaLGandhi · 2021-07-17

Add what to NNlib, sorry? This package is specific to GPU functionality.

CarloLucibello · 2021-07-17

Exactly these methods, but with AbstractArray arguments, i.e. falling back on conv.

DhairyaLGandhi · 2021-07-17

Umm, we probably want to retain the CPU kernels anyway. Unless Julia is explicitly launched with many threads, the runtime of grouped convolutions would scale with the number of groups.

CarloLucibello · 2021-07-17

This would be true for any implementation, specialized or not.
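The scaling point in this exchange can be illustrated with a minimal sketch in plain Julia. This is not NNlib's implementation: the names `naive_conv`, `grouped_conv`, and `depthwise` are hypothetical, and the array layout (W, H, C, N), the weight layout (kW, kH, C_in ÷ groups, C_out), and the stride-1, no-padding, no-kernel-flip convention are assumptions made only for this example. It shows a grouped convolution as a loop over independent per-group dense convolutions, so the loop runs serially over the groups unless Julia is started with more than one thread, and it treats depthwise convolution as the special case `groups == C_in`.

```julia
# Plain-Julia sketch; layouts and names are assumptions of this example,
# not NNlib's API.

# Dense "valid" cross-correlation (no kernel flip, stride 1, no padding).
function naive_conv(x::AbstractArray{T,4}, w::AbstractArray{T,4}) where T
    W, H, Cin, N = size(x)
    kW, kH, Cw, Cout = size(w)
    @assert Cw == Cin "weight and input channel counts must match"
    y = zeros(T, W - kW + 1, H - kH + 1, Cout, N)
    for n in 1:N, o in 1:Cout, c in 1:Cin, j in 1:size(y, 2), i in 1:size(y, 1)
        acc = zero(T)
        for b in 1:kH, a in 1:kW
            acc += x[i + a - 1, j + b - 1, c, n] * w[a, b, c, o]
        end
        y[i, j, o, n] += acc
    end
    return y
end

# Grouped convolution: split the channels into `groups` independent dense convs.
function grouped_conv(x::AbstractArray{T,4}, w::AbstractArray{T,4};
                      groups::Integer = 1) where T
    W, H, Cin, N = size(x)
    kW, kH, Cw, Cout = size(w)
    @assert Cin % groups == 0 && Cout % groups == 0
    cin_g, cout_g = Cin ÷ groups, Cout ÷ groups
    @assert Cw == cin_g "weights must hold C_in ÷ groups input channels"
    y = zeros(T, W - kW + 1, H - kH + 1, Cout, N)
    # Each group writes a disjoint slice of `y`, so the loop is safe to thread.
    # With a single thread (julia -t1) the groups are processed one after another.
    Threads.@threads for g in 1:groups
        cin = (g - 1) * cin_g + 1 : g * cin_g
        cout = (g - 1) * cout_g + 1 : g * cout_g
        y[:, :, cout, :] .= naive_conv(x[:, :, cin, :], w[:, :, :, cout])
    end
    return y
end

# Depthwise convolution as the special case groups == C_in.
depthwise(x, w) = grouped_conv(x, w; groups = size(x, 3))
```

For instance, `depthwise(rand(5, 5, 4, 2), rand(3, 3, 1, 8))` returns an array of size `(3, 3, 8, 2)` in which output channels 1 and 2 depend only on input channel 1. A specialized depthwise kernel can avoid the slicing and per-group dispatch shown here, which is the kind of difference a performance comparison would need to measure.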

DhairyaLGandhi · 2021-07-21

julia> x′ = rand(Float32, 28, 28, 4, 2);

julia> w′ = rand(Float32, 3, 3, 4, 30);

julia> cdims = DenseConvDims(x′, w′, groups = 4)

julia> @btime conv($x′, $w′, $cdims);
  362.792 μs (86 allocations: 736.36 KiB) # -t1
  236.368 μs (94 allocations: 831.89 KiB) # -t2
  232.137 μs (94 allocations: 831.89 KiB) # -t4

julia> @btime depthwiseconv($x′, $(permutedims(w′, (1,2,4,3))));
  348.914 μs (42 allocations: 731.03 KiB) # -t1
  156.558 μs (47 allocations: 826.53 KiB) # -t2
  161.059 μs (47 allocations: 826.53 KiB) # -t4

This is with https://github.com/DhairyaLGandhi/NNlib.jl#dg/g2 which has a couple of fixes pending a PR.

ToucheSir

Looks reasonable to me, just needs a couple tests in https://github.com/FluxML/NNlibCUDA.jl/blob/master/test/conv.jl (I know the implementation is technically covered indirectly now, but there's no guarantee these methods will forward to the conv ones forever).

add depthwise_conv* overloads for CUDA

5343eca

CarloLucibello reviewed Jul 17, 2021

Merge branch 'master' into dg/depth

c1c6154

DhairyaLGandhi mentioned this pull request Jul 22, 2021

deprecate DepthwiseConv once we have groups in standard conv FluxML/Flux.jl#1667

Closed

ToucheSir reviewed Nov 12, 2021

DhairyaLGandhi wants to merge 2 commits into master from dg/depth

3 participants
