This repository has been archived by the owner on Mar 12, 2021. It is now read-only.
Releases: JuliaGPU/CuArrays.jl
v2.2.2
CuArrays v2.2.2
v2.2.1
CuArrays v2.2.1
Merged pull requests:
- findfirst should return nothing if no matching element is found (#712) (@bsmiles32)
- Update manifest (#715) (@github-actions[bot])
- fix ambiguity for Random.seed!(rng, nothing) (#717) (@marius311)
- Actually put the workspace buffer back in the memory pool. (#718) (@maleadt)
- CompatHelper: bump compat for "CEnum" to "0.4" (#720) (@github-actions[bot])
- Update manifest (#723) (@github-actions[bot])
v2.2.0
CuArrays v2.2.0
Closed issues:
- Kernel optimizations (#445)
- mul! failing for adjoint sparse matrices (#629)
- Deadlock during memory free (#685)
- Training Halts when Using CuArrays (#691)
- CUBLAS initialization (#697)
- Performance issue with v2.1.0 compared with v1.7.3 (#701)
Merged pull requests:
- Implement a few sparse array basics (#572) (@janEbert)
- fix 1d convolution (#690) (@AStupidBear)
- Fix sparse mul! (#692) (@amontoison)
- Disable the GC after taking pool-related spinlocks. (#693) (@maleadt)
- Update manifest (#694) (@github-actions[bot])
- adds wrappers for syevjBatched/heevjBatched family of CUSOLVER functions (#695) (@electronsandstuff)
- Update manifest (#702) (@github-actions[bot])
- Repopulate the pool from freed blocks before allocating. (#704) (@maleadt)
- CEnum 0.3 compatibility. (#705) (@maleadt)
- Add direct BLAS calls for trmm! and trsm! (#707) (@aterenin)
- Add mul! for trmm! and tests. (#708) (@aterenin)
- Support and use broadcast with mapreduce. (#709) (@maleadt)
- Protect against recursive failures during init. (#710) (@maleadt)
- Perform threaded test in lock. (#711) (@maleadt)
v2.1.0
CuArrays v2.1.0
Closed issues:
- Performance regression with mapreduce (#611)
- cufunc wrappers of NNlib activation functions (#614)
- rand(UInt32) gives low-quality results (#655)
- With CuArrays 2.0, multidimensional circshift fails for integer multiples of array size shift in one or more dimensions (#657)
- similar(PermutedDimsArray(::CuArray)) isa Array (#658)
- In CuArrays v2.0, GPU operation takes hours to run for the first time (#660)
- sum!(y::CuVector, x::CuMatrix) throws InvalidIRError error (#661)
- Where can I find (#667)
- Where can I find All the using instructions of CuArrays? (#668)
- add implicit float conversion to math functions (#671)
- Multiplication between mixed types doesn't drop leading dimensions (#673)
- Very slow 4D broadcast in 2.0.1 (#677)
- Sum function is slow (#679)
- Indexing CuArrays with Empty Ranges Errors (#687)
- Sum, any, etc. with function is no longer implemented (#688)
Merged pull requests:
- Actually test error throwing for lsvqr (#604) (@kshyatt)
- Wrapper functions for NNlib (#615) (@matsueushi)
- Improve mapreduce performance (#646) (@wongalvis14)
- Improved error output during finalizer exceptions. (#651) (@maleadt)
- Fix usage discrepancies (#654) (@maleadt)
- Add support for uniform random UInt32 (#656) (@chrstphrbrns)
- Update manifest (#659) (@github-actions[bot])
- Implement GPUArrays vararg mapreduce. (#663) (@maleadt)
- Test for implicit singleton dims with mapreducedim. (#665) (@maleadt)
- Update manifest (#674) (@github-actions[bot])
- Allow scalar indexing where necessary and add a few tests (#675) (@kshyatt)
- Update manifest (#681) (@github-actions[bot])
- Update manifest (#686) (@github-actions[bot])
v2.0.1
CuArrays v2.0.1
Merged pull requests:
v2.0.0
CuArrays v2.0.0
Merged pull requests:
- implement CuIterator for batching arrays to the GPU (#467) (@jrevels)
- Specialized _var and _std functions (#612) (@merckxiaan)
- NNlib batched_mul! (#619) (@mcabbott)
- Clean-up init code. (#620) (@maleadt)
- Thread safety fixes. (#621) (@maleadt)
- Update wrappers. (#622) (@maleadt)
- Use a library handle for each thread/device. (#623) (@maleadt)
- Better error upon use of missing libraries. (#625) (@maleadt)
- Lock memory allocator operations (#626) (@maleadt)
- Thread safety of memory allocator (#627) (@maleadt)
- Use a return value-based retry scheme for all APIs failing to allocate. (#633) (@maleadt)
- Provide 5-arg mul! (#634) (@haampie)
- Update manifest (#635) (@github-actions[bot])
- Add BLAS.axpby! (#636) (@amontoison)
- Let cublasXT use host memory (#639) (@kshyatt)
- Test Julia 1.4. (#640) (@maleadt)
- 5-arg mul! with 3-arg mul! support (#641) (@haampie)
- Avoid multiple mapreduce kernel launches (#642) (@maleadt)
- GC.preserve some arrays when unsafely accessing them. (#644) (@maleadt)
- Support for Julia's multitasking. (#645) (@maleadt)
- Don't depwarn if new env vars are used. (#648) (@maleadt)
- Avoid deadlock in debug mode. (#649) (@maleadt)
v1.7.3
v1.7.3 (2020-03-06)
Closed issues:
- CI doesn't test old CUDA versions (#607)
- findmax slow (#606)
- import CuArrays always fails with CUDA 10.2.89 (but works fine with CUDA 10.0.130 and 10.1.105) (#601)
- Kernel exception in findfirst (#595)
- Test failures on Tesla K20c/CUDA 9.0 (#594)
- Linear indexing into arrays with two or more dimensions throws an obscure error during gradient calculation with Zygote (#590)
- Make CURAND seeding more consistent with Base (#589)
- mapreduce (sum, prod, etc.) fail in some cases when given a dims argument. (#583)
- Whitelist hypot (#442)
- polish mapreduce interface (#204)
- Use of mapreduce by Flux doesn't rewrite intrinsics (#154)
- Slow mapreduce compared to KnetArray (#141)
- Support for missing values (#125)
v1.7.2
v1.7.2 (2020-02-11)
Closed issues:
v1.7.1
v1.7.1 (2020-02-07)
Closed issues:
- Only show debug timings during tests on CI (#573)
- Broadcasted setindex! triggers scalar setindex! (#571)
- ODE performance benchmark (#566)
- Invalid IR for reductions along trivial dimension (#542)
- cublas error with "--math-mode=fast" (#510)
- ArgumentError when displaying a view of CuArray (#506)
- Serial mapreduce kernel broken on -g2 (#418)
- Functions not available in CUDA 8.0 (#351)
- Reexport functions from AbstractFFTs (#341)
Merged pull requests:
v1.7.0
v1.7.0 (2020-01-21)
Closed issues:
- Update dependencies (#562)
- Non-linear performance at larger batch sizes. (#555)
- Wrong return type for reductions of complex CuArrays (#550)
- ReshapedArray causes scalar indexing (#548)
- resize! broken (#547)
- exp.(x) is not type stable when x isa CuArray{Complex{Float32}} (#543)
- zero-size alloc before initialization errors (#538)
- CUDNN compatibility (#536)
- Error message on invalid adapt (#365)
Merged pull requests:
- CUTENSOR coverage fixes. (#565) (maleadt)
- Basic tests for CuTensor type (#564) (kshyatt)
- Fix and test seed methods. (#561) (maleadt)
- Revert to simpler GC.gc API. (#560) (maleadt)
- Preserve the pooled property when deriving a CuArray. (#558) (maleadt)
- Implement array resizing. (#557) (maleadt)
- Fix SplittingPool reclaim. (#556) (maleadt)
- Improve error display. (#554) (maleadt)
- Warn about version mismatches. (#553) (maleadt)
- Throw an error for unsupported element types (#552) (maleadt)
- Use the CUDAnative context getter. (#551) (maleadt)
- Add a test for #543. (#546) (maleadt)
- Avoid exponential of positive numbers in softplus implementation (#518) (vargonis)