VectorToXeGPU: Allows lowering vector.transfer_read and vector.transfer_write to XeGPU #773
base: main
Conversation
…GPU dialect that enables vector access on Intel GPU
@Scarlet1ssimo I have a general question about the placement of this code.
mlir::Value desc;
if (auto MemRefTypedSource =
        mlir::cast<mlir::TypedValue<mlir::MemRefType>>(source)) {
  desc = rewriter.create<mlir::xegpu::CreateNdDescOp>(
Shouldn't there be a check for memref rank here? XeGPU supports limited ranks.
mlir::Value desc;
if (auto MemRefTypedSource =
        mlir::cast<mlir::TypedValue<mlir::MemRefType>>(source)) {
  desc = rewriter.create<mlir::xegpu::CreateNdDescOp>(
Same as my comment above: Check rank and return failure if unsupported shape.
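A minimal sketch of the guard suggested in these two comments, assuming `op`, `source`, and `rewriter` are in scope as in the quoted snippet; the exact set of supported ranks is an assumption and should be checked against the XeGPU op definitions:

```cpp
// Hedged sketch only, not the patch's code. The 1-D/2-D restriction is an
// assumption; verify it against the xegpu.create_nd_tdesc verifier.
auto memrefType = mlir::dyn_cast<mlir::MemRefType>(source.getType());
if (!memrefType)
  return mlir::failure();
int64_t rank = memrefType.getRank();
if (rank != 1 && rank != 2)
  return rewriter.notifyMatchFailure(
      op, "unsupported memref rank for XeGPU descriptor creation");
```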
@@ -0,0 +1,114 @@
// RUN: %python_executable %imex_runner --requires=l0-runtime -i %s --pass-pipeline-file=%p/vector-to-llvm.pp \ |
This test is an integration test that runs on the GPU. I would suggest placing it somewhere under
test/Integration/Dialect/Vector/
@@ -0,0 +1,18 @@
builtin.module( |
Should move to a folder for Integration tests along with the test case above.
This patch allows lowering `vector.transfer_read` and `vector.transfer_write`, which are common when handling vectorization, to the corresponding XeGPU dialect ops. Namely, it first creates a descriptor and then applies either a `LoadNdOp` or a `StoreNdOp`.

Directly accessing a 1-D vector runs into an unknown issue: compilation and execution report no errors, but the resulting memory access pattern is wrong. As a temporary workaround, a 1-D access is first translated into an access of a `1x?` vector, which is then reshaped back to 1-D. This works as expected; hopefully the underlying issue can be fixed elsewhere.

Tested on a PVC device.
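To illustrate the 1-D workaround, here is a rough sketch of how the `1x?` round trip could look on the load side. It is not the patch's actual code; the helper names and the use of `vector.shape_cast` for the reshape are assumptions.

```cpp
#include "mlir/Dialect/Vector/IR/VectorOps.h"
#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/PatternMatch.h"

// Illustrative helpers only; names are hypothetical and not from the patch.

// Wrap a 1-D vector type into the 1x? form used for the XeGPU access.
static mlir::VectorType wrapAs1xN(mlir::VectorType vec1d) {
  return mlir::VectorType::get({1, vec1d.getNumElements()},
                               vec1d.getElementType());
}

// After the load produces a vector<1xN>, cast it back to the original
// 1-D type with vector.shape_cast.
static mlir::Value restoreTo1D(mlir::PatternRewriter &rewriter,
                               mlir::Location loc, mlir::VectorType vec1d,
                               mlir::Value loaded2d) {
  return rewriter.create<mlir::vector::ShapeCastOp>(loc, vec1d, loaded2d);
}
```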