Support for std::complex<__half>? #95
Comments
Hi Leo, I've been pondering this myself. One problem is that it's a device-only type, and because libcu++ is a heterogeneous library it would probably need to live outside
I'd like to see this implemented personally. There might be other work items that take precedence, but it might be doable in 2.0.0 if it doesn't diverge tremendously from
Thanks for the quick reply, Wesley! I took a quick look at the namespace convention of libcu++; do you mean the support of half complex will likely live in
Regardless of where it lands eventually, I think there's always some template and/or macro trick that lets us access the right namespace based on the real type, so I don't think it's a big deal as long as it's somewhere 🙂
Is there an expected timeline for 2.0.0? I see that complex numbers are listed in 1.4.0, so I suppose that refers to the single and double complexes.
One last query (I hope!): Will libcu++ work with NVRTC? In CuPy most of our kernels are compiled by NVRTC because, as in several NVIDIA RAPIDS libraries, the number of kernels we support is exponentially large, and there's no way to precompile them all. I am a bit worried that NVRTC is not listed among the supported compilers, although I know it's strictly a C++ compiler, so hopefully it'd just work.
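To make the "template and/or macro tricks" remark concrete, here is a minimal sketch of a type trait that picks the complex type from the real type. Everything in it is an assumption for illustration: `half_like` stands in for `__half` so the sketch builds without CUDA, and `device_only::complex_half`, `complex_of`, and `complex_of_t` are hypothetical names, not libcu++ or CuPy APIs.

```cpp
#include <complex>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for CUDA's __half (16 bits of storage), used only so
// that this sketch compiles with a host-only C++ compiler.
struct half_like {
    unsigned short bits;
};

// Hypothetical home for a half-precision complex type. The real namespace,
// if the feature is ever added, is undecided in this thread.
namespace device_only {
struct complex_half {
    half_like re;
    half_like im;
};
}  // namespace device_only

// Primary template: ordinary real types map to std::complex<Real>.
template <class Real>
struct complex_of {
    using type = std::complex<Real>;
};

// Specialization: the half stand-in maps to the type in the other namespace.
template <>
struct complex_of<half_like> {
    using type = device_only::complex_half;
};

// Convenience alias so callers never spell the namespace themselves.
template <class Real>
using complex_of_t = typename complex_of<Real>::type;

static_assert(std::is_same<complex_of_t<float>, std::complex<float>>::value,
              "float maps to std::complex<float>");
static_assert(std::is_same<complex_of_t<half_like>, device_only::complex_half>::value,
              "half_like maps to the hypothetical device-only type");

}  // namespace sketch
```

With a trait like this, generated kernel code could refer to `sketch::complex_of_t<T>` throughout, and only the specialization would need to change once the final namespace is known.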
Where it would end up is probably open for discussion, but I don't have any opinion at the moment.
As for a timeline, 2.0.0 is mid February, but there's a lot of contention for bandwidth. Things like
Correct, libcu++ has support for single and double
Almost everything in libcu++ works fine with NVRTC. There are some exceptions, but they would most likely not pop up in general use. Almost all of our heterogeneous testing today includes NVRTC passes.
Keep it noted: @cliffburdick asked for bf16 support in #153.
Closing in favor of NVIDIA/cccl#525
Hello, as suggested by @allisonvacanti, I am opening an issue for information gathering. I am wondering whether libcu++ can support half complex natively.
We are evaluating the possibility of supporting complex32 (consisting of two half-precision floating-point numbers, i.e., __half) in CuPy, see cupy/cupy#3370. This is important for us to be able to work seamlessly with CUDA libraries that provide half complex support (such as cuFFT).
Currently CuPy uses a clone of the Thrust headers for complex-number support (so we use thrust::complex<T> internally), but Thrust does not support __half natively. The only reason CuPy's internals can compile thrust::complex<__half> without any error is that we provide a conversion operator from __half to float so that the single-precision specializations are used, which is neither optimal nor desirable in terms of performance.
Thanks.