update for CUDA 9 / 10 #5

Merged
merged 25 commits into from
Jan 15, 2020
25 commits
2cdea63
Fp16 fixes for CUDA 9 (#783)
csarofeen Jun 26, 2017
68170ce
Warp intrinsic fixes (#785)
Jun 29, 2017
e905454
Updates for CUDA 9
csarofeen Jul 19, 2017
b629e33
cuda 9 hgemm fix
soumith Aug 25, 2017
a06460a
update with CMake 3.13, and add Turing support
gkanno Nov 30, 2018
51efac5
add nvcc option for half
gkanno Nov 30, 2018
c09c92f
patch for CUDA 10
gkanno Dec 3, 2018
e7fed5b
fix cuda 10.0 patch to be able to build with 9.x
gkanno Feb 25, 2019
4b168ba
add WARP_ANY
gkanno Feb 25, 2019
f944107
fix alignment warning
Sep 11, 2017
49dc78f
disable CudaHalfTensor for workaround on CUDA 10.
gkanno Jun 4, 2019
71c4469
Allowing larger grids for THCApply shows improved performance.
csarofeen Jul 21, 2017
352b446
Fix grid size for batch cat tensor now that getApplyGrid has been cha…
csarofeen Aug 28, 2017
aee45ce
fix __launch_bounds__ parameter for Turing(7.5)
gkanno Jun 10, 2019
7165723
same to ReduceNoncontig
gkanno Jun 10, 2019
27ba716
intoroduce mask parameter to WARP_ANY
gkanno Jun 11, 2019
4d6dc70
use cudaPointerAttributes.type for checking managed mamory.
gkanno Jun 11, 2019
3424b7c
fix cutorch_isManagedPtr
gkanno Jun 11, 2019
4992a6f
fix __launch_bounds__ parameter for Turing(7.5)
gkanno Jun 10, 2019
8fbc9cb
intoroduce mask parameter to WARP_ANY
gkanno Jun 11, 2019
a7d47fc
use cudaPointerAttributes.type for checking managed mamory.
gkanno Jun 11, 2019
cd8b9ef
fix bool <-> int conversion
gkanno Jun 11, 2019
6f33031
turn off CRT warnings of MSVC.
gkanno Jun 11, 2019
e04fdce
Merge branch '2017-06-01' of https://github.com/gkanno/cutorch into 2…
gkanno Jun 11, 2019
55100c0
add Compute Capability 7.2 to SELECT_COMPUTE_ARCH
gkanno Jun 12, 2019
intoroduce mask parameter to WARP_ANY
gkanno committed Jun 11, 2019
commit 27ba716fd80427be84dd06ac5e657f4ebdf0cce0
4 changes: 2 additions & 2 deletions lib/THC/THCDeviceUtils.cuh
@@ -92,10 +92,10 @@ __device__ __forceinline__ T WARP_SHFL_DOWN(T value, unsigned int delta, int wid
 #endif
 }

-__device__ __forceinline__ bool WARP_ANY(bool cond)
+__device__ __forceinline__ bool WARP_ANY(bool cond, unsigned int mask = 0xffffffff)
 {
 #if CUDA_VERSION >= 9000
-    return (bool)__any_sync(0xffffffff, (int)cond);
+    return (bool)__any_sync(mask, (int)cond);
 #else
     return __any(cond);
 #endif
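
For reviewers who want to reason about the new `mask` argument without a GPU, here is a rough host-side model of the CUDA 9+ branch. It is only a sketch and not part of this patch: `warp_any_model`, `LanePredicates`, and `low_lanes_mask` are hypothetical names, and one `bool` per lane stands in for the per-thread `cond`. It assumes the documented vote semantics: the result is true if and only if the predicate is non-zero for at least one lane named in the mask.

```cpp
#include <array>
#include <cstdint>

// Host-side model only. On the device each lane is a thread; here each lane
// is one array slot holding that thread's predicate.
constexpr int kWarpSize = 32;
using LanePredicates = std::array<bool, kWarpSize>;

// Models `(bool)__any_sync(mask, (int)cond)`: only lanes whose bit is set in
// `mask` take part in the vote. The default mask covers all 32 lanes, which
// matches the behaviour before this commit.
inline bool warp_any_model(const LanePredicates& cond,
                           std::uint32_t mask = 0xffffffffu)
{
  for (int lane = 0; lane < kWarpSize; ++lane) {
    const bool participates = ((mask >> lane) & 1u) != 0u;
    if (participates && cond[lane]) {
      return true;
    }
  }
  return false;
}

// Hypothetical helper: mask naming the lowest `count` lanes of the warp.
inline std::uint32_t low_lanes_mask(int count)
{
  if (count <= 0) {
    return 0u;
  }
  if (count >= kWarpSize) {
    return 0xffffffffu;
  }
  return (1u << count) - 1u;
}
```

With only lane 20 holding a true predicate, `warp_any_model(cond)` is true, while `warp_any_model(cond, low_lanes_mask(16))` is false because lane 20 is outside the mask. On the device there is an extra constraint the model does not capture: every thread that calls `__any_sync` must have its own bit set in `mask`, so callers that pass a partial mask need to make sure exactly those lanes reach the call.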