Investigate improving the error message for PTX_COMPILE error #313

ksimpson-work · 2024-12-18T15:44:42Z

If you don't pass an architecture argument to program options, compile will use the architecture of the current device, however it is possible that the arch of the current device is not able to handle the ptx compilation. On a Program compile failure, we could provide tips on what to check, depending on the error code. Personally, I see this as an enhancement.
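To make the "tips depending on the error code" idea concrete, here is a rough sketch. The `HINTS` table and `format_compile_error` helper are hypothetical names, not anything in cuda.core today. The hint texts are only illustrative, and the error code is handled as a plain string name rather than a real enum.

```python
from typing import Dict, List

# Hypothetical table: error-code name -> things worth checking.
# The keys are plain strings used for illustration only.
HINTS: Dict[str, List[str]] = {
    "NVJITLINK_ERROR_PTX_COMPILE": [
        "check that the target arch matches the device you intend to run on",
        "check that the driver is not older than the toolchain that produced the PTX",
    ],
}


def format_compile_error(code_name: str, log: str = "") -> str:
    """Build an error message with the compiler log and any known hints appended."""
    lines = [f"{code_name}: compilation failed."]
    if log.strip():
        lines.append(log.strip())
    for hint in HINTS.get(code_name, []):
        lines.append(f"  hint: {hint}")
    return "\n".join(lines)
```

Unknown codes fall through with no hints, so the message is never worse than the bare error.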

leofang · 2024-12-30T06:09:51Z

> If you don't pass an architecture argument to program options, compile will use the architecture of the current device,

For NVRTC, the default is compute_52 (it's documented here). Using the current device arch was my suggestion during a past offline chat 🙂 (and I still think we should do it)

> however it is possible that the arch of the current device is not able to handle the ptx compilation.

This is a more complex topic. Consider the two steps:

Generating PTX:
- This needs NVRTC to support the target device arch (either user-specified or current). In the past this translated to comparing the CTK/NVRTC version with the earliest CTK version that supports the device arch (all libraries need to maintain this mapping somewhere, ex: cc90 -> CTK 11.8). However, recent NVRTC has the nvrtcGetSupportedArchs API to avoid this need. We could call this API in Program.
Generating SASS:
- All drivers can JIT compile PTX to SASS. However, this could fail when the driver version is older than the toolchain (NVRTC) version that is used to generate the PTX. We've implemented this check in the test suite, and we should move it into Program.
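A minimal sketch of the two checks, written as pure functions so the logic is easy to test. The function names are hypothetical. The supported-arch list is assumed to come from nvrtcGetSupportedArchs, and the versions from whatever the driver and NVRTC report. The decoder assumes the usual integer encoding of 1000 * major + 10 * minor.

```python
from typing import Iterable, Optional, Tuple


def decode_cuda_version(value: int) -> Tuple[int, int]:
    """Split an integer CUDA version (assumed 1000 * major + 10 * minor) into a pair."""
    major, rest = divmod(value, 1000)
    return major, rest // 10


def check_arch_supported(arch: int, supported_archs: Iterable[int]) -> Optional[str]:
    """Return a hint if NVRTC cannot generate PTX for `arch` (e.g. 90), else None."""
    supported = sorted(set(supported_archs))
    if arch in supported:
        return None
    return (
        f"NVRTC cannot generate PTX for compute capability {arch}; "
        f"it supports {supported}. A newer CUDA Toolkit may be needed."
    )


def check_driver_can_jit(
    driver: Tuple[int, int], nvrtc: Tuple[int, int]
) -> Optional[str]:
    """Return a hint if the driver is older than the NVRTC that produced the PTX."""
    if driver >= nvrtc:
        return None
    return (
        f"Driver supports CUDA {driver[0]}.{driver[1]} but the PTX was generated by "
        f"NVRTC {nvrtc[0]}.{nvrtc[1]}; JIT compilation to SASS may fail."
    )
```

Program could run both before compiling and attach any non-None hint to the error it raises.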

ksimpson-work · 2024-12-31T01:04:14Z

I raised this issue before we added the _exception_manager ContextManager to the linker class, and the error message is now much better. I should have closed it then, but I agree with calling nvrtcGetSupportedArchs in Program and moving the check as well, and that would constitute a more robust fix.

Separately, I'm running into the default compute thing as I work on ProgramOptions, which uses the current device arch, while Linker currently requires it as an argument. I am going to make Linker also fall back to the current device and put that up for review.
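Roughly, the fallback I have in mind looks like this. `resolve_arch` is a hypothetical helper, and `device_cc` stands in for the (major, minor) compute capability of the current device, however it ends up being queried.

```python
from typing import Optional, Tuple


def resolve_arch(requested: Optional[str], device_cc: Tuple[int, int]) -> str:
    """Use the user-specified arch if given, else derive one from the current device."""
    if requested:
        return requested
    major, minor = device_cc
    return f"sm_{major}{minor}"
```

With this, `resolve_arch(None, (9, 0))` gives `"sm_90"`, and an explicit `"sm_80"` is passed through unchanged.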

ksimpson-work added the cuda.core, enhancement, and triage labels Dec 18, 2024

leofang assigned ksimpson-work Jan 1, 2025

leofang added the P1 label and removed the triage label Jan 1, 2025

leofang added this to the cuda.core beta 3 milestone Jan 1, 2025

ksimpson-work linked a pull request Jan 14, 2025 that will close this issue

Improve program checks #394

Draft
