Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rocFFT and hipFFT examples (part II) #160

Open
wants to merge 9 commits into
base: develop
Choose a base branch
from

Conversation

Beanavil
Copy link
Collaborator

@Beanavil Beanavil commented Sep 6, 2024

This pull request contains the second batch of the new rocFFT and hipFFT examples. Added samples:

  • rocFFT
    • complex_complex
    • complex_real
    • real_complex
  • hipFFT
    • multi_gpu

@Beanavil Beanavil self-assigned this Sep 6, 2024
@Beanavil Beanavil marked this pull request as ready for review September 6, 2024 12:26
@Beanavil Beanavil requested review from a team and dgaliffiAMD as code owners September 6, 2024 12:26
void print_nd_data(const std::vector<Tdata> data,
std::vector<Tsize> np,
const int column_width = 4,
const bool reverse_indexing = false)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be easier to understand and use if it was either renamed to row_major (since the deffault looks like column-major), or if this was replaced by an enum that spelled out exactly what the caller wanted.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm yes, this was added to support printing data with either layout. IINM the default is row-major (np[0] is assumed to be the size of the outermost dimension), and when reversing the ordering of the dimension sizes we assume a column-major layout. Changed this parameter to column_major.

@malcolmroberts
Copy link

I'm having some trouble getting the example to work with the cuda backend; cmake issues show up.

Copy link

@malcolmroberts malcolmroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cuda compilation doesn't seem to work. CI coverage is probably needed here.

@Beanavil
Copy link
Collaborator Author

Beanavil commented Sep 9, 2024

cuda compilation doesn't seem to work. CI coverage is probably needed here.

@malcolmroberts what errors are you getting? On our end (internal CI, and I also built it locally to double-check) it builds without problems

@Beanavil Beanavil force-pushed the fft-multigpu-plan branch 3 times, most recently from c08fc9f to b88aa6e Compare September 17, 2024 09:20
/// layout, the \p column_major parameter must be set to \p true for a correct interpretation
/// of the dimensions' sizes.
template<class Tdata, class Tsize>
void print_nd_data(const std::vector<Tdata> data,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data should be a const reference?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be, yes. Updated this!

@Beanavil Beanavil force-pushed the fft-multigpu-plan branch 2 times, most recently from f0b29b0 to a7478f1 Compare September 18, 2024 14:25
@Beanavil
Copy link
Collaborator Author

Beanavil commented Oct 7, 2024

@malcolmroberts @evetsso what's the status of the review for this? I think the only issues left to tackle were the ones mentioned in this comment about the CUDA build, but this is not showing up on our end so without further details I cannot move forward on that

@malcolmroberts
Copy link

Sorry for the delay - jumping back into this again.

When I configure with
cmake -DCMAKE_CXX_COMPILER=hipcc -DGPU_RUNTIME=CUDA ..
I get errors like

[ 16%] Building CUDA object multi_gpu/CMakeFiles/hipfft_multi_gpu.dir/main.cpp.o
In file included from /home/AMD/marobert/repo/rocm-examples/Libraries/hipFFT/multi_gpu/main.cpp:24:
/home/AMD/marobert/repo/rocm-examples/Libraries/hipFFT/multi_gpu/../../../Common/example_utils.hpp:50:10: fatal error: hip/hip_runtime.h: No such file or directory
   50 | #include <hip/hip_runtime.h>
      |          ^~~~~~~~~~~~~~~~~~~
```

@Beanavil
Copy link
Collaborator Author

@malcolmroberts Oh I see, but IINM that error should be happening for other samples because it seems the HIP headers are not found. I can reproduce this on my side if for instance I rename the /opt/rocm/include/hip folder to something else. Could you verify if you have the hip headers present on your machine?

@malcolmroberts
Copy link

Hey, @Beanavil ; I do have /opt/rocm/include/hip populated on my test machine.

However, if there isn't a find_package( hip REQUIRED ) then I wouldn't expect it to include or link the hip runtime. The files are .cpp, so the cmake setup isn't detecting the hip programming language, and I don't think that the runtime is going to be automatically included. I'm configuring/compiling from the Libraries/hipFFT subdirectory.

@evetsso
Copy link

evetsso commented Oct 16, 2024

@Beanavil Just to be totally clear here, are you also building these examples under CUDA?

@Beanavil
Copy link
Collaborator Author

@evetsso @malcolmroberts sorry for the delayed response, I got totally caught up on other tasks. Yes, we build the examples for CUDA and HIP backend. In fact, we use the dockerfiles provided by this repo for this, so I think if you use a container built from the cuda image the build should be successful.

I'm trying to figure out what could be triggering the failures in your end, I don't think a find_package( hip REQUIRED ) is what's missing here because IINM when we link to hipFFT we are already including the dependency on HIP. On my end, for instance, I can see that when I invoke cmake it does add the compiler option for including the hip headers (-isystem=/opt/rocm/include):

developer@2c1b9a595f95:~/rocm-examples/Libraries/hipFFT/build$ cmake --build . -v
...
[ 16%] Building CUDA object multi_gpu/CMakeFiles/hipfft_multi_gpu.dir/main.cpp.o
cd /home/developer/rocm-examples/Libraries/hipFFT/build/multi_gpu && /usr/local/cuda/bin/nvcc 
-forward-unknown-to-host-compiler  -I/home/developer/rocm-examples/Libraries/hipFFT/multi_gpu/../../../Common 
+ -isystem=/opt/rocm/include --generate-code=arch=compute_52,code=[compute_52,sm_52] -std=c++17 -MD -MT 
multi_gpu/CMakeFiles/hipfft_multi_gpu.dir/main.cpp.o -MF CMakeFiles/hipfft_multi_gpu.dir/main.cpp.o.d -x cu 
-c /home/developer/rocm-examples/Libraries/hipFFT/multi_gpu/main.cpp -o CMakeFiles/hipfft_multi_gpu.dir/main.cpp.o
...

Could you also show how is the compiler being run on your end? This is done with the -v option that I show in the previous code.

@Naraenda
Copy link
Member

@MKKnorr hi could you ping the right people to get this merged? it's been open for awhile

@malcolmroberts
Copy link

Sorry for the delay; I'm gong to try and be more on top of this PR.

So, the issue that I'm having is that, on a pure NVIDIA machine, I can install hipfft, and set up cmake paths, but when I try and compile, for example, plan_d2z, I get a compilation error that it cna't find the hip runtime. Steps are:

cd Libraries/hipFFT/plan_d2z && mkdir build && cd build
export HIP_PLATFORM=nvidia
cmake -DGPU_RUNTIME=CUDA ..
make

then I get

[ 50%] Building CUDA object CMakeFiles/hipfft_plan_d2z.dir/main.cpp.o
In file included from /home/marobert/repo/rocm-examples/Libraries/hipFFT/plan_d2z/main.cpp:24:
/home/marobert/repo/rocm-examples/Libraries/hipFFT/plan_d2z/../../../Common/example_utils.hpp:50:10: fatal error: hip/hip_runtime.h: No such file or directory
   50 | #include <hip/hip_runtime.h>
      |          ^~~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [CMakeFiles/hipfft_plan_d2z.dir/build.make:77: CMakeFiles/hipfft_plan_d2z.dir/main.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/hipfft_plan_d2z.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

This isn't a problem when compiling on AMD systems.

@Beanavil
Copy link
Collaborator Author

Beanavil commented Dec 3, 2024

@malcolmroberts could you perhaps give more details about the previous setup that was done on that machine (ROCm installation e.g.)? I'm trying to reproduce on a NVIDIA machine inside a container (using this image) and following the setup steps of our dockerfile for cuda (this one) and I cannot reproduce any build errors :(.

@malcolmroberts
Copy link

I can get this working in the docker image; trying to figure out what the difference might be in the installs.

@Naraenda
Copy link
Member

Naraenda commented Jan 6, 2025

Hi @malcolmroberts did you had the opportunity to at the build discrepancy?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants