Schedule Hopper mma instruction (#3278)
Stacked on #3320.

This PR:
* Schedules the MMA instruction result for the HopperMultiMatmulScheduler.
* Removes some unused methods that are no longer necessary.
* Checks that there is "no prologue", that is, that we have `gmem -LoadStoreOp-> smem -MmaOp->`. This currently cannot be done unless we create the MmaOp at definition using `fusedMultiplySum` (see #1628).
* Checks that the MmaOp output has logical order MNK. If not, a root->logical reorder should have been created at definition (maybe this should be made easier as an option in `fusedMultiplySum`).

This PR does not schedule split-K or TMA stores of the output.

---------

Co-authored-by: Ryan Spring <[email protected]>
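
For readers unfamiliar with the two checks described above, here is a minimal, self-contained sketch of what they amount to. It uses a toy IR, not the real scheduler code: `Tensor`, `OpKind`, `DimRole`, `hasNoPrologue`, and `hasMnkLogicalOrder` are hypothetical names invented for illustration. It also assumes exactly two MMA operands and no batch dimensions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy IR for illustration only; these are NOT the real classes.
enum class MemoryType { Global, Shared, Register };
enum class OpKind { None, LoadStore, Mma, Other };
enum class DimRole { M, N, K };

struct Tensor {
  MemoryType memory = MemoryType::Global;
  OpKind definition = OpKind::None;   // kind of op that produces this tensor
  std::vector<const Tensor*> inputs;  // inputs of the producing op
  std::vector<DimRole> logical;       // roles of the logical dims, outer to inner
};

// True if `operand` lives in shared memory and is produced by a single
// load/store op whose only input lives in global memory.
bool isLoadedDirectlyFromGmem(const Tensor& operand) {
  if (operand.memory != MemoryType::Shared ||
      operand.definition != OpKind::LoadStore ||
      operand.inputs.size() != 1) {
    return false;
  }
  const Tensor* src = operand.inputs[0];
  return src != nullptr && src->memory == MemoryType::Global;
}

// "No prologue": every MMA operand follows gmem -LoadStore-> smem -Mma->,
// with no other op between the load and the MMA.
// Assumption: the MMA has exactly two operands.
bool hasNoPrologue(const Tensor& mma_out) {
  if (mma_out.definition != OpKind::Mma || mma_out.inputs.size() != 2) {
    return false;
  }
  for (const Tensor* operand : mma_out.inputs) {
    if (operand == nullptr || !isLoadedDirectlyFromGmem(*operand)) {
      return false;
    }
  }
  return true;
}

// The MMA output's logical dims must be ordered M, N, K.
// Assumption: no batch dimensions are present.
bool hasMnkLogicalOrder(const Tensor& mma_out) {
  return mma_out.logical ==
      std::vector<DimRole>{DimRole::M, DimRole::N, DimRole::K};
}
```

In this sketch, inserting any other op (for example a cast) between the shared-memory load and the MMA makes `hasNoPrologue` return false, and an output whose logical dims are ordered N, M, K fails `hasMnkLogicalOrder`.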