Tags: vtjnash/CUDA.jl

v4.1.4

## CUDA v4.1.4

[Diff since v4.1.3](JuliaGPU/CUDA.jl@v4.1.3...v4.1.4)


**Closed issues:**
- Buggy precompilation of init-defined symbols can break CUDA_Driver_jll initialization (JuliaGPU#1798)
- Calling CUDA.set_runtime_version!() with float parameter makes CUDA.jl unusable. (JuliaGPU#1831)
- Unexpected memory allocation when using `randn!` (JuliaGPU#1856)
- The memory copy speed seems to exceed the hardware limit (JuliaGPU#1860)
- PCG produces different output on GPU (via Krylov.jl) (JuliaGPU#1864)

**Merged pull requests:**
- Fix system_driver_version on platforms not supported by CUDA_Driver_jll. (JuliaGPU#1854) (@maleadt)
- Update manifest (JuliaGPU#1861) (@github-actions[bot])

v4.1.3

## CUDA v4.1.3

[Diff since v4.1.2](JuliaGPU/CUDA.jl@v4.1.2...v4.1.3)


**Closed issues:**
- CUDA.versioninfo() triggers download of lazy artifacts (JuliaGPU#1844)

**Merged pull requests:**
- Choose parallel tests based on CPUs, not threads. (JuliaGPU#1842) (@maleadt)
- Adapt to LLVM.jl 5 and GPUCompiler.jl 0.19. (JuliaGPU#1847) (@maleadt)

v4.1.2

## CUDA v4.1.2

[Diff since v4.1.1](JuliaGPU/CUDA.jl@v4.1.1...v4.1.2)


**Closed issues:**
- Flux's `gradient` differentiating `rfft` leads to non-bit error (JuliaGPU#1835)

**Merged pull requests:**
- switch to using defined globals (JuliaGPU#1832) (@simonbyrne)
- Update manifest (JuliaGPU#1837) (@github-actions[bot])

v4.1.1

## CUDA v4.1.1

[Diff since v4.1.0](JuliaGPU/CUDA.jl@v4.1.0...v4.1.1)



**Merged pull requests:**
- Fix export of CUDABackend (JuliaGPU#1834) (@vchuravy)

v4.1.0

## CUDA v4.1.0

[Diff since v4.0.1](JuliaGPU/CUDA.jl@v4.0.1...v4.1.0)


**Closed issues:**
- ERROR: LoadError: bin\cublas64_11.dll when installing CUDA (JuliaGPU#1750)
- System-wide CUDA in LD_LIBRARY_PATH breaks CUBLAS (JuliaGPU#1755)
- CuDeviceTexture getindex breaks when executed on the CPU (JuliaGPU#1757)
- cuDNN.version can cause Julia to crash, missing `cudnn_ops_infer64_8.dll` (JuliaGPU#1777)
- cuDNN compile error "ERROR: LoadError: ArgumentError: invalid version string: local" (JuliaGPU#1783)
- "Error: No CUDA Runtime library found" for ≥v4.0.0 (JuliaGPU#1808)
- sqrt broken in kernels 'Format of __nvvm__reflect function not recognized' (JuliaGPU#1817)

**Merged pull requests:**
- Add support for CUDA 12.0. (JuliaGPU#1742) (@maleadt)
- Add more fixes and tests for CUDA toolkit 12.0 (JuliaGPU#1756) (@amontoison)
- Update manifest (JuliaGPU#1758) (@github-actions[bot])
- Fix test/cusparse/interfaces.jl (JuliaGPU#1762) (@amontoison)
- Simplify the function sig. (JuliaGPU#1763) (@N5N3)
- Update manifest (JuliaGPU#1770) (@github-actions[bot])
- Make versioninfo() resilient against NVML EPERM. (JuliaGPU#1771) (@maleadt)
- Move CUDAKernels to CUDA.jl (JuliaGPU#1772) (@vchuravy)
- [CUSPARSE] Improve conversion and tests between sparse matrices (JuliaGPU#1774) (@amontoison)
- Use geam for + and - operations with CuMatrix{<:CublasFloat} (JuliaGPU#1775) (@amontoison)
- Update manifest (JuliaGPU#1776) (@github-actions[bot])
- Update manifest (JuliaGPU#1781) (@github-actions[bot])
- Update manifest (JuliaGPU#1784) (@github-actions[bot])
- [CUSPARSE] Update preconditioners.jl (JuliaGPU#1785) (@amontoison)
- [CUSOLVER] Avoid the conversion to CSR format for reordering routines (JuliaGPU#1786) (@amontoison)
- Bump GPUCompiler. (JuliaGPU#1787) (@maleadt)
- Remove unneeded variable. (JuliaGPU#1788) (@maleadt)
- [CUSPARSE] Update conversions.jl (JuliaGPU#1791) (@amontoison)
- Update to CUDNN 8.8.1 for CUDA 12 compatibility. (JuliaGPU#1792) (@maleadt)
- Add support for CUDA 12.1 (JuliaGPU#1793) (@maleadt)
- [CUSPARSE] Interface color reordering (JuliaGPU#1794) (@amontoison)
- [CUSPARSE] Interface gtsv2 (JuliaGPU#1795) (@amontoison)
- Update manifest (JuliaGPU#1796) (@github-actions[bot])
- Adapt to GPUCompiler 0.18 (JuliaGPU#1799) (@maleadt)
- Follow `Array`'s behavior when initializing (JuliaGPU#1800) (@lcw)
- [CUSOLVER] Support A \ b for rectangular matrices (JuliaGPU#1802) (@amontoison)
- Use symbols instead of values when emitting code, when possible. (JuliaGPU#1804) (@maleadt)
- Refactor CI pipeline a little. (JuliaGPU#1805) (@maleadt)
- [CUSOLVER] Improve the dispatch for LAPACK routines (JuliaGPU#1806) (@amontoison)
- Diagonal for lower triangular of LU decomposition set incorrectly (JuliaGPU#1813) (@tgymnich)
- CompatHelper: add new compat entry for "KernelAbstractions" at version "0.9" (JuliaGPU#1824) (@github-actions[bot])
- Rebuild CUPTI API with support for STRUCT_SIZE (JuliaGPU#1827) (@vchuravy)
- Release CUDA 4.1 (JuliaGPU#1828) (@vchuravy)

v4.0.1

Bump package versions.

v4.0.0

## CUDA v4.0.0

[Diff since v3.13.1](JuliaGPU/CUDA.jl@v3.13.1...v4.0.0)


**Closed issues:**
- Missing implementation of right multiply for QR decomposition (JuliaGPU#1738)
- [CUSPARSE] Type error with mm! (JuliaGPU#1743)

**Merged pull requests:**
- Implement rmul for qr. (JuliaGPU#1739) (@maleadt)
- Update manifest (JuliaGPU#1741) (@github-actions[bot])
- Update CUSPARSE for CUDA v12.0 (JuliaGPU#1744) (@amontoison)
- Fix nvprof command (JuliaGPU#1745) (@lucifer1004)
- Update manifest (JuliaGPU#1747) (@github-actions[bot])
- Fix grammar (JuliaGPU#1748) (@lucifer1004)

v3.13.1

## CUDA v3.13.1

[Diff since v3.13.0](JuliaGPU/CUDA.jl@v3.13.0...v3.13.1)


**Closed issues:**
- CUDA.jl cuFFT underperforming against CuPy cuFFT (JuliaGPU#1682)
- Is block-spmm supported? (JuliaGPU#1736)

**Merged pull requests:**
- Introduce cuFFT plan cache; switch to auto-managed memory. (JuliaGPU#1734) (@maleadt)
- Stop pirating GPUArrays' RNG methods. (JuliaGPU#1735) (@maleadt)

v3.12.2

## CUDA v3.12.2

[Diff since v3.12.1](JuliaGPU/CUDA.jl@v3.12.1...v3.12.2)


**Closed issues:**
- CUDA.jl cuFFT underperforming against CuPy cuFFT (JuliaGPU#1682)
- Error during CUDA test (JuliaGPU#1718)
- Kernel error from bad broadcast (should be regular error?) (JuliaGPU#1720)
- Freeze into StackOverflow when `JULIA_DEBUG=CUDA` set (JuliaGPU#1721)
- Use of linear operators in CUDA.jl (JuliaGPU#1727)
- Is block-spmm supported? (JuliaGPU#1736)

**Merged pull requests:**
- Allow `copy(::RNG)` (JuliaGPU#1719) (@mcabbott)
- Update manifest (JuliaGPU#1722) (@github-actions[bot])
- Simplify CuError rendering before library initialization. (JuliaGPU#1723) (@maleadt)
- Simplify CuError rendering before library initialization (master branch version) (JuliaGPU#1724) (@maleadt)
- Make device RNG test more robust. (JuliaGPU#1725) (@maleadt)
- Rely on LLVM.jl's typed_ccall for more intrinsics. (JuliaGPU#1728) (@maleadt)
- Backports for 3.13 (JuliaGPU#1729) (@maleadt)
- Simplify CUBLAS and CUSPARSE wrappers, reducing code generated. (JuliaGPU#1730) (@maleadt)
- Add Julia 1.9 CI. (JuliaGPU#1731) (@maleadt)
- Use released dependencies. (JuliaGPU#1732) (@maleadt)
- Remove NVTX. (JuliaGPU#1733) (@maleadt)
- Introduce cuFFT plan cache; switch to auto-managed memory. (JuliaGPU#1734) (@maleadt)
- Stop pirating GPUArrays' RNG methods. (JuliaGPU#1735) (@maleadt)

v3.13.0

## CUDA v3.13.0

[Diff since v3.12.1](JuliaGPU/CUDA.jl@v3.12.1...v3.13.0)


**Closed issues:**
- Error during CUDA test (JuliaGPU#1718)
- Kernel error from bad broadcast (should be regular error?) (JuliaGPU#1720)
- Freeze into StackOverflow when `JULIA_DEBUG=CUDA` set (JuliaGPU#1721)
- Use of linear operators in CUDA.jl (JuliaGPU#1727)

**Merged pull requests:**
- Allow `copy(::RNG)` (JuliaGPU#1719) (@mcabbott)
- Update manifest (JuliaGPU#1722) (@github-actions[bot])
- Simplify CuError rendering before library initialization. (JuliaGPU#1723) (@maleadt)
- Simplify CuError rendering before library initialization (master branch version) (JuliaGPU#1724) (@maleadt)
- Make device RNG test more robust. (JuliaGPU#1725) (@maleadt)
- Rely on LLVM.jl's typed_ccall for more intrinsics. (JuliaGPU#1728) (@maleadt)
- Backports for 3.13 (JuliaGPU#1729) (@maleadt)
- Simplify CUBLAS and CUSPARSE wrappers, reducing code generated. (JuliaGPU#1730) (@maleadt)
- Add Julia 1.9 CI. (JuliaGPU#1731) (@maleadt)
- Use released dependencies. (JuliaGPU#1732) (@maleadt)
- Remove NVTX. (JuliaGPU#1733) (@maleadt)