Tags: marius311/CUDA.jl
## CUDA v1.2.1

[Diff since v1.2.0](JuliaGPU/CUDA.jl@v1.2.0...v1.2.1)

**Closed issues:**
- CuArrays.zeros(T, 0) fails (JuliaGPU#81)
- CUDAnative.cos calls the base cos function in nested broadcast (JuliaGPU#102)
- CuSparseMatrixHYB * CuMatrix = nothing (JuliaGPU#256)
- Strange reordering of struct fields with dynamic parallelism (JuliaGPU#263)
- Performance: bias add (JuliaGPU#298)
- CUDA 11 libraries incorrectly looked up in artifact (JuliaGPU#300)
- CUTENSOR for windows (JuliaGPU#301)
- Performance: sum (JuliaGPU#302)
- Performance: getindex(a, i::Array{Int}) (JuliaGPU#303)
- Display for CuArray within Tuples does not respect :limit=>true (JuliaGPU#305)
- Performance: elementwise operations (JuliaGPU#307)
- Performance: perceptron (JuliaGPU#312)
- windows install error: isfile(__libcupti[]) (JuliaGPU#324)
- std with dims is not type stable (JuliaGPU#336)

**Merged pull requests:**
- Re-enable threading tests. (JuliaGPU#25) (@maleadt)
- Reorganize and simplify some includes (JuliaGPU#296) (@maleadt)
- Only run benchmarks on the master branch. (JuliaGPU#297) (@maleadt)
- Optimizations for broadcast (JuliaGPU#299) (@maleadt)
- Update manifest (JuliaGPU#304) (@github-actions[bot])
- Test runner improvements for multigpu mode (JuliaGPU#309) (@maleadt)
- Artifact improvements for CUDA 11 on Windows (JuliaGPU#310) (@maleadt)
- Optimize element-wise operations (JuliaGPU#313) (@maleadt)
- Check if reported GPU memory use is available. (JuliaGPU#314) (@maleadt)
- Update artifacts: include cusolverMg, and use Yggdrasil binaries. (JuliaGPU#315) (@maleadt)
- Specialization fixes for mapreducedim. (JuliaGPU#316) (@maleadt)
- Fix invalid conversion of pointer to signed integer. (JuliaGPU#317) (@maleadt)
- Work around (presumed) Windows driver bug in exception test. (JuliaGPU#319) (@maleadt)
- Update manifest (JuliaGPU#323) (@github-actions[bot])
- Bump CUDNN and CUTENSOR (JuliaGPU#325) (@maleadt)
- Simplify NVML discovery. (JuliaGPU#326) (@maleadt)
- Separate CURAND wrappers from Random impl. (JuliaGPU#327) (@maleadt)
- Simplify discovering binaries by using Sys.which. (JuliaGPU#328) (@maleadt)
- Add wrapper for NVML utilization rates. (JuliaGPU#329) (@maleadt)
- Attach CUSPARSE docstrings to bare methods, not empty functions. (JuliaGPU#331) (@maleadt)
- Eagerly reduce the amount of worker threads. (JuliaGPU#332) (@maleadt)
- Bump dependencies. (JuliaGPU#333) (@maleadt)
- Clean-up library wrappers [NFC] (JuliaGPU#334) (@maleadt)
- Fix CUDNN v8 discovery and loading on Windows (JuliaGPU#335) (@maleadt)
- Fix type stability of Statistics.var with dims. (JuliaGPU#337) (@maleadt)
- Fix parameter alignment for dynamic parallelism. (JuliaGPU#338) (@maleadt)
- Micro-optimize Base.fill. (JuliaGPU#339) (@maleadt)

## CUDA v1.2.0

[Diff since v1.1.0](JuliaGPU/CUDA.jl@v1.1.0...v1.2.0)

**Closed issues:**
- Segmentation fault when creating CuArray of CuArray (JuliaGPU#133)
- CUDNN tests fail with CUDNN 6.0.20 (JuliaGPU#134)
- CURAND fail to initialize, code 203 (JuliaGPU#255)
- Deprecation warnings (JuliaGPU#277)
- Can we pleeeeeeeease make cu(x) eltype preserving? (JuliaGPU#278)
- On the use of @sync during benchmarking in the documentation (JuliaGPU#279)
- Example in Multiple GPUs doc fails (JuliaGPU#282)
- LLVM error: Cannot cast between two non-generic address spaces (JuliaGPU#286)

**Merged pull requests:**
- Host-side CUTENSOR (JuliaGPU#243) (@kshyatt)
- Add and document a non-blocking version of at-sync. (JuliaGPU#280) (@maleadt)
- Use a custom adaptor for cu so that adapt(CuArray) preserves element types. (JuliaGPU#281) (@maleadt)
- Check and warn for library versions. (JuliaGPU#284) (@maleadt)
- Add note about nvml dll missing (JuliaGPU#288) (@kshyatt)
- Update your PR to have tests pass (JuliaGPU#289) (@kshyatt)
- Update manifest (JuliaGPU#290) (@github-actions[bot])
- Support CUDA 11 (JuliaGPU#291) (@maleadt)
- do not open the file twice when reading the libdevice bitcode (JuliaGPU#294) (@jakebolewski)

## CUDA v1.1.0

[Diff since v1.0.2](JuliaGPU/CUDA.jl@v1.0.2...v1.1.0)

**Closed issues:**
- Fix NSight detection (JuliaGPU#29)
- versioninfo() (JuliaGPU#34)
- throw_... messages: invalid call to `jl_alloc_string` (JuliaGPU#54)
- INTERNAL_ERROR during CUDNN handle creation (JuliaGPU#183)
- Improve benchmarking suite (JuliaGPU#222)
- How to load CUDA.jl conditional on the computer having a CUDA-compatible GPU? (JuliaGPU#237)
- CUSOLVER.heevd! returning Float and not Complex (JuliaGPU#238)
- Broadcasting fails with Float64 -> Int conversion (JuliaGPU#240)
- Running `] test CUDA` with `OhMyREPL` in `startup.jl` causes some tests to fail (JuliaGPU#246)
- ERROR: Your LLVM does not support the NVPTX back-end. in local project environment (JuliaGPU#249)
- CUDAnative: UndefVarError: AddrSpacePtr not defined on julia master (JuliaGPU#250)
- Error while freeing CUDA.CuPtr (JuliaGPU#254)
- Non-artifact initialization of CUDA.jl using CUDA 11 fails on Windows (JuliaGPU#262)
- Library handle creation close to OOM fails with ERROR_NOT_INITIALIZED (JuliaGPU#264)
- has(::TargetIterator, name::String) deprecation warning (JuliaGPU#271)

**Merged pull requests:**
- Add texture support from CuTextures.jl (JuliaGPU#209) (@maleadt)
- Memory pinning with interval trees (JuliaGPU#233) (@maleadt)
- Better nsys detection. (JuliaGPU#234) (@maleadt)
- CompatHelper: add new compat entry for "IntervalTrees" at version "1.0" (JuliaGPU#235) (@github-actions[bot])
- Update manifest (JuliaGPU#239) (@github-actions[bot])
- Replace slash by path separator to properly skip tests on Windows. (JuliaGPU#241) (@maleadt)
- Retry cudnnCreate on CUDNN_STATUS_INTERNAL_ERROR and CUDNN_STATUS_NOT_INITIALIZED (JuliaGPU#244) (@maleadt)
- Add issue templates (JuliaGPU#245) (@maleadt)
- Import wrapper tooling, wrap NVML (JuliaGPU#248) (@maleadt)
- Ignore some potentially unsupported NVML features. (JuliaGPU#251) (@maleadt)
- Assert NVPTX availability by just calling the initializer. (JuliaGPU#252) (@maleadt)
- Update manifest (JuliaGPU#257) (@github-actions[bot])
- Adapt to AddrSpacePtr rename. (JuliaGPU#258) (@maleadt)
- Typo in installation overview docs (JuliaGPU#260) (@clintonTE)
- Update GPUCompiler.jl (JuliaGPU#266) (@maleadt)
- Retry library initialization failure due to (badly reported) OOM. (JuliaGPU#268) (@maleadt)
- Upgrade CUTENSOR to v1.1.0. (JuliaGPU#269) (@maleadt)
- Use CUDNN from Yggdrasil. (JuliaGPU#272) (@maleadt)
- Update manifest (JuliaGPU#273) (@github-actions[bot])
- Improve local CUDA discovery for CUDA 11 (JuliaGPU#274) (@maleadt)
- Compatibility with latest LLVM and GPUCompiler (JuliaGPU#275) (@maleadt)

## CUDA v1.0.2

[Diff since v1.0.1](JuliaGPU/CUDA.jl@v1.0.1...v1.0.2)

**Closed issues:**
- Dynamic generation of docs including benchmarking timings can make the numbers "weird" (JuliaGPU#11)

**Merged pull requests:**
- Documentation updates (JuliaGPU#227) (@maleadt)
- Don't extend Base.findfirst with an unrelated method. (JuliaGPU#230) (@maleadt)
- CompatHelper: bump compat for "NNlib" to "0.7" (JuliaGPU#232) (@github-actions[bot])

## CUDA v1.0.1

[Diff since v1.0.0](JuliaGPU/CUDA.jl@v1.0.0...v1.0.1)

## CUDA v1.0.0

[Diff since v0.1.0](JuliaGPU/CUDA.jl@v0.1.0...v1.0.0)

**Closed issues:**
- unsafe_copy3d!: srcPos and dstPos handling (JuliaGPU#27)
- Test failure on Windows (JuliaGPU#37)
- Texture memory? (JuliaGPU#46)
- Tests for the LLVM passes (JuliaGPU#52)
- Bugged Sparse Matrix-Dense matrix multiplication, where dense matrix is transposed (JuliaGPU#77)
- Stack overflow when broadcasting over empty view in CuArrays 2.x (JuliaGPU#82)
- Sparse CSC gemm wrappers actually call CSR routines (JuliaGPU#181)
- Testsuite calls startup.jl (JuliaGPU#182)
- LLVM error: Cannot cast between two non-generic address spaces (JuliaGPU#190)
- Error running CUDA in Jupyter (JuliaGPU#195)
- Floating-point Inf causes an error (JuliaGPU#205)
- mul! issue (JuliaGPU#213)

**Merged pull requests:**
- include potri and test (JuliaGPU#179) (@erathorn)
- Fix sparse-dense matmul, with transposed dense (JuliaGPU#180) (@irhum)
- Behave like Base wrt. test flags. (JuliaGPU#184) (@maleadt)
- fix sparse CSC gemm and test (closes JuliaGPU#181) (JuliaGPU#185) (@jebej)
- Add speciatization functions f(ctx, x) = f(x) for GPUArrays randn! function (JuliaGPU#186) (@Ellipse0934)
- Add inplace test for rand (JuliaGPU#187) (@kshyatt)
- Fix cushow tests on Windows. (JuliaGPU#188) (@maleadt)
- More tests for CUSPARSE (JuliaGPU#189) (@kshyatt)
- fixed gels_batched! issue (JuliaGPU#191) (@clintonTE)
- Added wrappers for cusolverDn<t>potrfBatched (JuliaGPU#192) (@IvanYashchuk)
- Added wrappers for cusolverDn<t>potrsBatched (JuliaGPU#193) (@IvanYashchuk)
- Compatibility with Julia 1.5 (JuliaGPU#194) (@maleadt)
- Add gemmex wrapper and test (JuliaGPU#196) (@kshyatt)
- Fix handling of srcPos and dstPos in unsafe_copy3d! (JuliaGPU#197) (@samo-lin)
- Prefer a local CUDA installation when running on CI, reinstate Julia 1.3. (JuliaGPU#198) (@maleadt)
- Add support for mixed precision (JuliaGPU#200) (@kshyatt)
- Add texture support from CuTextures.jl (JuliaGPU#206) (@maleadt)
- Error throwing tests (JuliaGPU#207) (@kshyatt)
- Update manifest (JuliaGPU#208) (@github-actions[bot])
- A few more tests for cusparse (JuliaGPU#210) (@kshyatt)
- Specialize Base.mightalias for better broadcast performance. (JuliaGPU#211) (@maleadt)
- Fix some mul! ambiguities, and dispatch more to CUBLAS. (JuliaGPU#214) (@maleadt)
- CUDA 11 compatibility entries (JuliaGPU#221) (@maleadt)
- Benchmark suite: tune and cache params (JuliaGPU#223) (@maleadt)
- Add benchmarks (JuliaGPU#224) (@maleadt)
- Update manifest (JuliaGPU#225) (@github-actions[bot])

## CUDA v0.1.0

**Closed issues:**
- Documentation: installation instructions (JuliaGPU#1)
- Faced some errors while testing cuda in Julia (JuliaGPU#3)
- facing unknown errors while compiling exact similar code for parallelization on CPU (JuliaGPU#7)

**Merged pull requests:**
- Fix typo (JuliaGPU#2) (@innerlee)
- Doc: fix comment on how memory is moving (JuliaGPU#4) (@mbauman)
- Install TagBot as a GitHub Action (JuliaGPU#8) (@JuliaTagBot)
- small grammar/typo tweaks to the introduction tutorial (JuliaGPU#12) (@KristofferC)
- Add code of other CUDA packages (JuliaGPU#14) (@maleadt)
- Initialize the memory pool (JuliaGPU#19) (@maleadt)
- Improve initialization for threading (JuliaGPU#20) (@maleadt)
- Don't use BinaryBuilder for most CI tests. (JuliaGPU#21) (@maleadt)