According to https://github.com/NVIDIA/cutlass/blob/main/media/docs/cute/0x_gemm_tutorial.md, there is a mapping between the transpose notation and the matrices' layouts. However, the MMA atom gives the following layouts for TN. As I understand it, these assume that matrix A is M-major and matrix B is N-major, which contradicts that mapping.

    // (T8,V4) -> (m,k)
    using ALayout = Layout<Shape <_8,_4>,
                           Stride<_1,_8>>;
    // (T8,V4) -> (n,k)
    using BLayout = Layout<Shape <_8,_4>,
                           Stride<_1,_8>>;

Have I misunderstood anything?
Replies: 2 comments 2 replies
CUTLASS follows the convention that matrix A is logically MxK, B is logically NxK, and C is logically MxN, and they are both column-major. CUTLASS constructs the layouts based on this convention. "NT" indicates an A.col B.row instruction, while "TN" indicates an A.row B.col instruction. We can't get a real index in an … There is no problem with following the row-major convention and describing an A.row B.col instruction as "NT", as long as we get the right layout. Is this a correct understanding?
The `NT`, `TN`, `NN`, and `TT` of the instructions are often mistakenly conflated with the layout of the data as well. The instructions and their traits don't say anything about the layout of data they operate on. The different instructions can, in principle, work on any data in any layout.

The `MMA_Traits` that you show do not describe the layout of the data, they describe the partitioning pattern of the instruction. Those partitioning patterns can be applied to any tensor of data with any layout.

See a similar question here with example code: #1226
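
To make the distinction between partitioning pattern and data layout concrete, here is a minimal plain C++ sketch that does not use CuTe itself. It assumes the `ALayout` from the question maps (thread, value) to a 1-D index that is decoded as `m + k*M` over an 8x4 tile. The names `partition_a`, `offset_m_major`, `offset_k_major`, `gather`, and `fragments_match` are hypothetical helpers written for this illustration, not part of CUTLASS.

```cpp
#include <array>
#include <vector>

// Tile extents assumed for the atom in the question: T8 x V4 over an M=8, K=4 tile of A.
constexpr int M = 8;
constexpr int K = 4;

struct Coord { int m; int k; };

// Hand-written stand-in for Layout<Shape<_8,_4>, Stride<_1,_8>>:
// (t, v) -> t*1 + v*8, then decoded as idx = m + k*M (assumption stated above).
Coord partition_a(int t, int v) {
    int idx = t * 1 + v * 8;
    return Coord{idx % M, idx / M};
}

// Two different memory layouts for the same logical MxK matrix.
int offset_m_major(Coord c) { return c.m + c.k * M; }  // m is the contiguous mode
int offset_k_major(Coord c) { return c.m * K + c.k; }  // k is the contiguous mode

// Collect the four values that thread t owns, reading through a given data layout.
template <class OffsetFn>
std::array<int, 4> gather(const std::vector<int>& data, int t, OffsetFn offset) {
    std::array<int, 4> frag{};
    for (int v = 0; v < 4; ++v) {
        frag[v] = data[offset(partition_a(t, v))];
    }
    return frag;
}

// Store the same logical matrix in both layouts and check that every thread
// ends up with the same fragment either way.
bool fragments_match() {
    std::vector<int> a_m_major(M * K), a_k_major(M * K);
    for (int m = 0; m < M; ++m) {
        for (int k = 0; k < K; ++k) {
            int value = 100 * m + k;  // arbitrary value that encodes (m, k)
            a_m_major[offset_m_major(Coord{m, k})] = value;
            a_k_major[offset_k_major(Coord{m, k})] = value;
        }
    }
    for (int t = 0; t < 8; ++t) {
        if (gather(a_m_major, t, offset_m_major) != gather(a_k_major, t, offset_k_major)) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions, `partition_a` only names which logical (m, k) elements each thread owns. The memory layout enters separately through the offset function, so calling `fragments_match()` from a `main` should return `true` for both storage orders.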