Tags: marius311/CUDA.jl

v1.2.1

## CUDA v1.2.1

[Diff since v1.2.0](JuliaGPU/CUDA.jl@v1.2.0...v1.2.1)


**Closed issues:**
- CuArrays.zeros(T, 0) fails (JuliaGPU#81)
- CUDAnative.cos calls the base cos function in nested broadcast (JuliaGPU#102)
- CuSparseMatrixHYB * CuMatrix = nothing (JuliaGPU#256)
- Strange reordering of struct fields with dynamic parallelism (JuliaGPU#263)
- Performance: bias add (JuliaGPU#298)
- CUDA 11 libraries incorrectly looked up in artifact (JuliaGPU#300)
- CUTENSOR for windows (JuliaGPU#301)
- Performance: sum (JuliaGPU#302)
- Performance: getindex(a, i::Array{Int}) (JuliaGPU#303)
- Display for CuArray within Tuples does not respect :limit=>true (JuliaGPU#305)
- Performance: elementwise operations (JuliaGPU#307)
- Performance: perceptron (JuliaGPU#312)
- windows install error: isfile(__libcupti[]) (JuliaGPU#324)
- std with dims is not type stable (JuliaGPU#336)

**Merged pull requests:**
- Re-enable threading tests. (JuliaGPU#25) (@maleadt)
- Reorganize and simplify some includes (JuliaGPU#296) (@maleadt)
- Only run benchmarks on the master branch. (JuliaGPU#297) (@maleadt)
- Optimizations for broadcast (JuliaGPU#299) (@maleadt)
- Update manifest (JuliaGPU#304) (@github-actions[bot])
- Test runner improvements for multigpu mode (JuliaGPU#309) (@maleadt)
- Artifact improvements for CUDA 11 on Windows (JuliaGPU#310) (@maleadt)
- Optimize element-wise operations (JuliaGPU#313) (@maleadt)
- Check if reported GPU memory use is available. (JuliaGPU#314) (@maleadt)
- Update artifacts: include cusolverMg, and use Yggdrasil binaries. (JuliaGPU#315) (@maleadt)
- Specialization fixes for mapreducedim. (JuliaGPU#316) (@maleadt)
- Fix invalid conversion of pointer to signed integer. (JuliaGPU#317) (@maleadt)
- Work around (presumed) Windows driver bug in exception test. (JuliaGPU#319) (@maleadt)
- Update manifest (JuliaGPU#323) (@github-actions[bot])
- Bump CUDNN and CUTENSOR (JuliaGPU#325) (@maleadt)
- Simplify NVML discovery. (JuliaGPU#326) (@maleadt)
- Separate CURAND wrappers from Random impl. (JuliaGPU#327) (@maleadt)
- Simplify discovering binaries by using Sys.which. (JuliaGPU#328) (@maleadt)
- Add wrapper for NVML utilization rates. (JuliaGPU#329) (@maleadt)
- Attach CUSPARSE docstrings to bare methods, not empty functions. (JuliaGPU#331) (@maleadt)
- Eagerly reduce the amount of worker threads. (JuliaGPU#332) (@maleadt)
- Bump dependencies. (JuliaGPU#333) (@maleadt)
- Clean-up library wrappers [NFC] (JuliaGPU#334) (@maleadt)
- Fix CUDNN v8 discovery and loading on Windows (JuliaGPU#335) (@maleadt)
- Fix type stability of Statistics.var with dims. (JuliaGPU#337) (@maleadt)
- Fix parameter alignment for dynamic parallelism. (JuliaGPU#338) (@maleadt)
- Micro-optimize Base.fill. (JuliaGPU#339) (@maleadt)

v1.2.0

## CUDA v1.2.0

[Diff since v1.1.0](JuliaGPU/CUDA.jl@v1.1.0...v1.2.0)


**Closed issues:**
- Segmentation fault when creating CuArray of CuArray (JuliaGPU#133)
- CUDNN tests fail with CUDNN 6.0.20 (JuliaGPU#134)
- CURAND fail to initialize, code 203 (JuliaGPU#255)
- Deprecation warnings (JuliaGPU#277)
- Can we pleeeeeeeease make cu(x) eltype preserving? (JuliaGPU#278)
- On the use of @sync during benchmarking in the documentation (JuliaGPU#279)
- Example in Multiple GPUs doc fails (JuliaGPU#282)
- LLVM error: Cannot cast between two non-generic address spaces (JuliaGPU#286)

**Merged pull requests:**
- Host-side CUTENSOR (JuliaGPU#243) (@kshyatt)
- Add and document a non-blocking version of at-sync. (JuliaGPU#280) (@maleadt)
- Use a custom adaptor for cu so that adapt(CuArray) preserves element types. (JuliaGPU#281) (@maleadt)
- Check and warn for library versions. (JuliaGPU#284) (@maleadt)
- Add note about nvml dll missing (JuliaGPU#288) (@kshyatt)
- Update your PR to have tests pass (JuliaGPU#289) (@kshyatt)
- Update manifest (JuliaGPU#290) (@github-actions[bot])
- Support CUDA 11 (JuliaGPU#291) (@maleadt)
- do not open the file twice when reading the libdevice bitcode (JuliaGPU#294) (@jakebolewski)

v1.1.0

## CUDA v1.1.0

[Diff since v1.0.2](JuliaGPU/CUDA.jl@v1.0.2...v1.1.0)


**Closed issues:**
- Fix NSight detection (JuliaGPU#29)
- versioninfo() (JuliaGPU#34)
- throw_... messages: invalid call to `jl_alloc_string` (JuliaGPU#54)
- INTERNAL_ERROR during CUDNN handle creation (JuliaGPU#183)
- Improve benchmarking suite (JuliaGPU#222)
- How to load CUDA.jl conditional on the computer having a CUDA-compatible GPU? (JuliaGPU#237)
- CUSOLVER.heevd! returning Float and not Complex (JuliaGPU#238)
- Broadcasting fails with Float64 -> Int conversion (JuliaGPU#240)
- Running `] test CUDA` with `OhMyREPL` in `startup.jl` causes some tests to fail (JuliaGPU#246)
- ERROR: Your LLVM does not support the NVPTX back-end. in local project environment (JuliaGPU#249)
- CUDAnative: UndefVarError: AddrSpacePtr not defined on julia master (JuliaGPU#250)
- Error while freeing CUDA.CuPtr (JuliaGPU#254)
- Non-artifact initialization of CUDA.jl using CUDA 11 fails on Windows (JuliaGPU#262)
- Library handle creation close to OOM fails with ERROR_NOT_INITIALIZED (JuliaGPU#264)
- has(::TargetIterator, name::String) deprecation warning (JuliaGPU#271)

**Merged pull requests:**
- Add texture support from CuTextures.jl (JuliaGPU#209) (@maleadt)
- Memory pinning with interval trees (JuliaGPU#233) (@maleadt)
- Better nsys detection. (JuliaGPU#234) (@maleadt)
- CompatHelper: add new compat entry for "IntervalTrees" at version "1.0" (JuliaGPU#235) (@github-actions[bot])
- Update manifest (JuliaGPU#239) (@github-actions[bot])
- Replace slash by path separator to properly skip tests on Windows. (JuliaGPU#241) (@maleadt)
- Retry cudnnCreate on CUDNN_STATUS_INTERNAL_ERROR and CUDNN_STATUS_NOT_INITIALIZED (JuliaGPU#244) (@maleadt)
- Add issue templates (JuliaGPU#245) (@maleadt)
- Import wrapper tooling, wrap NVML (JuliaGPU#248) (@maleadt)
- Ignore some potentially unsupported NVML features. (JuliaGPU#251) (@maleadt)
- Assert NVPTX availability by just calling the initializer. (JuliaGPU#252) (@maleadt)
- Update manifest (JuliaGPU#257) (@github-actions[bot])
- Adapt to AddrSpacePtr rename. (JuliaGPU#258) (@maleadt)
- Typo in installation overview docs (JuliaGPU#260) (@clintonTE)
- Update GPUCompiler.jl (JuliaGPU#266) (@maleadt)
- Retry library initialization failure due to (badly reported) OOM. (JuliaGPU#268) (@maleadt)
- Upgrade CUTENSOR to v1.1.0. (JuliaGPU#269) (@maleadt)
- Use CUDNN from Yggdrasil. (JuliaGPU#272) (@maleadt)
- Update manifest (JuliaGPU#273) (@github-actions[bot])
- Improve local CUDA discovery for CUDA 11 (JuliaGPU#274) (@maleadt)
- Compatibility with latest LLVM and GPUCompiler (JuliaGPU#275) (@maleadt)

v1.0.2

## CUDA v1.0.2

[Diff since v1.0.1](JuliaGPU/CUDA.jl@v1.0.1...v1.0.2)


**Closed issues:**
- Dynamic generation of docs including benchmarking timings can make the numbers "weird" (JuliaGPU#11)

**Merged pull requests:**
- Documentation updates (JuliaGPU#227) (@maleadt)
- Don't extend Base.findfirst with an unrelated method. (JuliaGPU#230) (@maleadt)
- CompatHelper: bump compat for "NNlib" to "0.7" (JuliaGPU#232) (@github-actions[bot])

v1.0.1

## CUDA v1.0.1

[Diff since v1.0.0](JuliaGPU/CUDA.jl@v1.0.0...v1.0.1)

v1.0.0

## CUDA v1.0.0

[Diff since v0.1.0](JuliaGPU/CUDA.jl@v0.1.0...v1.0.0)


**Closed issues:**
- unsafe_copy3d!: srcPos and dstPos handling (JuliaGPU#27)
- Test failure on Windows (JuliaGPU#37)
- Texture memory? (JuliaGPU#46)
- Tests for the LLVM passes (JuliaGPU#52)
- Bugged Sparse Matrix-Dense matrix multiplication, where dense matrix is transposed (JuliaGPU#77)
- Stack overflow when broadcasting over empty view in CuArrays 2.x (JuliaGPU#82)
- Sparse CSC gemm wrappers actually call CSR routines (JuliaGPU#181)
- Testsuite calls startup.jl (JuliaGPU#182)
- LLVM error: Cannot cast between two non-generic address spaces (JuliaGPU#190)
- Error running CUDA in Jupyter (JuliaGPU#195)
- Floating-point Inf causes an error (JuliaGPU#205)
- mul! issue (JuliaGPU#213)

**Merged pull requests:**
- include potri and test (JuliaGPU#179) (@erathorn)
- Fix sparse-dense matmul, with transposed dense (JuliaGPU#180) (@irhum)
- Behave like Base wrt. test flags. (JuliaGPU#184) (@maleadt)
- fix sparse CSC gemm and test (closes JuliaGPU#181) (JuliaGPU#185) (@jebej)
- Add specialization functions f(ctx, x) = f(x) for GPUArrays randn! function (JuliaGPU#186) (@Ellipse0934)
- Add inplace test for rand (JuliaGPU#187) (@kshyatt)
- Fix cushow tests on Windows. (JuliaGPU#188) (@maleadt)
- More tests for CUSPARSE (JuliaGPU#189) (@kshyatt)
- fixed gels_batched! issue (JuliaGPU#191) (@clintonTE)
- Added wrappers for `cusolverDn<t>potrfBatched` (JuliaGPU#192) (@IvanYashchuk)
- Added wrappers for `cusolverDn<t>potrsBatched` (JuliaGPU#193) (@IvanYashchuk)
- Compatibility with Julia 1.5 (JuliaGPU#194) (@maleadt)
- Add gemmex wrapper and test (JuliaGPU#196) (@kshyatt)
- Fix handling of srcPos and dstPos in unsafe_copy3d! (JuliaGPU#197) (@samo-lin)
- Prefer a local CUDA installation when running on CI, reinstate Julia 1.3. (JuliaGPU#198) (@maleadt)
- Add support for mixed precision (JuliaGPU#200) (@kshyatt)
- Add texture support from CuTextures.jl (JuliaGPU#206) (@maleadt)
- Error throwing tests (JuliaGPU#207) (@kshyatt)
- Update manifest (JuliaGPU#208) (@github-actions[bot])
- A few more tests for cusparse (JuliaGPU#210) (@kshyatt)
- Specialize Base.mightalias for better broadcast performance. (JuliaGPU#211) (@maleadt)
- Fix some mul! ambiguities, and dispatch more to CUBLAS. (JuliaGPU#214) (@maleadt)
- CUDA 11 compatibility entries (JuliaGPU#221) (@maleadt)
- Benchmark suite: tune and cache params (JuliaGPU#223) (@maleadt)
- Add benchmarks (JuliaGPU#224) (@maleadt)
- Update manifest (JuliaGPU#225) (@github-actions[bot])

v0.1.0

## CUDA v0.1.0

**Closed issues:**
- Documentation: installation instructions (JuliaGPU#1)
- Faced some errors while testing cuda in Julia (JuliaGPU#3)
- facing unknown errors while compiling exact similar code for parallelization on CPU (JuliaGPU#7)

**Merged pull requests:**
- Fix typo (JuliaGPU#2) (@innerlee)
- Doc: fix comment on how memory is moving (JuliaGPU#4) (@mbauman)
- Install TagBot as a GitHub Action (JuliaGPU#8) (@JuliaTagBot)
- small grammar/typo tweaks to the introduction tutorial (JuliaGPU#12) (@KristofferC)
- Add code of other CUDA packages (JuliaGPU#14) (@maleadt)
- Initialize the memory pool (JuliaGPU#19) (@maleadt)
- Improve initialization for threading (JuliaGPU#20) (@maleadt)
- Don't use BinaryBuilder for most CI tests. (JuliaGPU#21) (@maleadt)