Performance: getindex(a, i::Array{Int}) #303

Closed
denizyuret opened this issue Jul 19, 2020 · 1 comment
Labels
cuda array Stuff about CuArray. performance How fast can we go?

Comments

@denizyuret
Contributor

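This is important when computing loss. Indexing with an integer array is currently slower for CuArray (median 48.130 μs) than for KnetArray (median 29.097 μs):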

julia> k = KnetArray{Float32}(rand(10,100))

julia> c = CuArray{Float32}(rand(10,100))

julia> i = sort(rand(1:1000,100))

julia> @benchmark k[i]
BenchmarkTools.Trial: 
  memory estimate:  1.48 KiB
  allocs estimate:  23
  --------------
  minimum time:     26.711 μs (0.00% GC)
  median time:      29.097 μs (0.00% GC)
  mean time:        30.774 μs (0.00% GC)
  maximum time:     787.018 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark c[i]
BenchmarkTools.Trial: 
  memory estimate:  2.48 KiB
  allocs estimate:  84
  --------------
  minimum time:     44.229 μs (0.00% GC)
  median time:      48.130 μs (0.00% GC)
  mean time:        48.999 μs (0.00% GC)
  maximum time:     833.070 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
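
For context, here is a minimal CPU-only sketch of why integer-array indexing tends to show up in loss computation. It is an illustration under assumptions, not the Knet.jl or CUDA.jl implementation: the helper names `answer_indices`, `logsoftmax_cols`, and `nll_sketch` are hypothetical, and a plain `Matrix` stands in for the GPU array. The idea is that a negative log-likelihood loss picks one log-probability per column (the correct class), which amounts to a single `getindex` call with a sorted vector of linear indices, much like the `k[i]` and `c[i]` calls benchmarked above.

```julia
using Statistics: mean

# Hypothetical helper: linear index of the correct-class entry in each column
# of an (nclasses x ninstances) score matrix. The result is sorted ascending.
function answer_indices(labels::Vector{Int}, nclasses::Int)
    return [(j - 1) * nclasses + labels[j] for j in eachindex(labels)]
end

# Column-wise log-softmax, written out by hand for a plain matrix.
function logsoftmax_cols(scores::AbstractMatrix)
    shifted = scores .- maximum(scores; dims=1)   # subtract column max for stability
    return shifted .- log.(sum(exp.(shifted); dims=1))
end

# Negative log-likelihood: mean of the correct-class log-probabilities, negated.
function nll_sketch(scores::AbstractMatrix, labels::Vector{Int})
    logp = logsoftmax_cols(scores)
    idx = answer_indices(labels, size(scores, 1))
    return -mean(logp[idx])   # this is getindex(a, i::Array{Int})
end

scores = rand(Float32, 10, 100)   # 10 classes, 100 instances, as in the benchmark
labels = rand(1:10, 100)          # one correct class per instance
loss = nll_sketch(scores, labels)
```

Because this indexing step runs once per minibatch, its latency adds directly to every training iteration.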
@denizyuret denizyuret added the performance How fast can we go? label Jul 19, 2020
@maleadt maleadt added the cuda array Stuff about CuArray. label Jul 22, 2020
@maleadt
Member

maleadt commented Jul 27, 2020

Looks fixed with the recent round of improvements:

Knet.jl
BenchmarkTools.Trial: 
  memory estimate:  1.50 KiB
  allocs estimate:  23
  --------------
  minimum time:     11.236 μs (0.00% GC)
  median time:      12.286 μs (0.00% GC)
  mean time:        15.525 μs (3.11% GC)
  maximum time:     14.404 ms (33.53% GC)
  --------------
  samples:          10000
  evals/sample:     1

CUDA.jl
BenchmarkTools.Trial: 
  memory estimate:  1.06 KiB
  allocs estimate:  46
  --------------
  minimum time:     10.209 μs (0.00% GC)
  median time:      11.297 μs (0.00% GC)
  mean time:        12.564 μs (0.00% GC)
  maximum time:     1.114 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1

@maleadt maleadt closed this as completed Jul 27, 2020