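1. This is based on the "leftover" approach (one thread per element, with extra handling for the tail); please change it to a "grid-stride loop":
Use this when threads and blocks are used together; it works for any number of blocks and threads.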
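A minimal sketch of a grid-stride loop follows. The kernel `fill_index`, the helper `run_fill`, and the launch sizes are hypothetical names and values chosen for illustration; they are not taken from the code under review.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: writes i into arr[i] for every i in [0, n).
__global__ void fill_index(int *arr, int n) {
    // Start at this thread's global index, then advance by the total number
    // of threads in the grid until the whole array has been covered.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        arr[i] = i;
    }
}

// Hypothetical host helper: launches the kernel and copies the result back.
std::vector<int> run_fill(int n, int blocks, int threads) {
    int *d_arr = nullptr;
    cudaMalloc(&d_arr, n * sizeof(int));
    fill_index<<<blocks, threads>>>(d_arr, n);
    std::vector<int> out(n);
    // A blocking cudaMemcpy on the default stream runs after the kernel finishes.
    cudaMemcpy(out.data(), d_arr, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_arr);
    return out;
}

int main() {
    // n is deliberately not a multiple of blocks * threads.
    std::vector<int> out = run_fill(65535, 32, 128);
    printf("out[0]=%d out[65534]=%d\n", out[0], out[65534]);
    return 0;
}
```

Because the loop bound is `i < n` and the stride is the grid size, no element is skipped or written out of range, whatever launch configuration is chosen.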
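2. What is wrong here? Please fix it:
The `+=` operation is a non-atomic read-modify-write, so concurrent threads race on the same location; use an atomic add (`atomicAdd`) instead.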
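The fix might look like the sketch below, which assumes the kernel accumulates an `int` array into a single counter. The names `sum_all` and `run_sum` and the launch sizes are illustrative assumptions, not the original code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: accumulates all elements of arr into sum[0].
__global__ void sum_all(int *sum, const int *arr, int n) {
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        // A plain `sum[0] += arr[i]` lets threads overwrite each other's
        // updates; atomicAdd makes the read-modify-write indivisible.
        atomicAdd(&sum[0], arr[i]);
    }
}

// Hypothetical host helper: copies the input over, runs the kernel, returns the sum.
int run_sum(const std::vector<int> &host) {
    int n = static_cast<int>(host.size());
    int *d_arr = nullptr;
    int *d_sum = nullptr;
    cudaMalloc(&d_arr, n * sizeof(int));
    cudaMalloc(&d_sum, sizeof(int));
    cudaMemcpy(d_arr, host.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_sum, 0, sizeof(int));  // the counter must start at zero
    sum_all<<<32, 128>>>(d_sum, d_arr, n);
    int result = 0;
    cudaMemcpy(&result, d_sum, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_arr);
    cudaFree(d_sum);
    return result;
}

int main() {
    std::vector<int> ones(10000, 1);
    printf("sum = %d\n", run_sum(ones));
    return 0;
}
```

Every thread contends for the same address here, so this is correct but not fast; a per-block reduction followed by one atomic per block is a common refinement.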
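3. After switching to a "grid-stride loop", how should the parameters inside the triple angle brackets be adjusted?
Change the number of blocks to 1, so that each of the 1024 threads processes about n/1024 elements. With a grid-stride loop any block and thread count produces a correct result, so a larger block count can also be used to occupy more of the GPU.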
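As a rough illustration of how the launch parameters decouple from `n`, the sketch below runs the same grid-stride kernel under two configurations and compares the results. The kernel `double_all`, the helper `run_double`, and the factor of 32 blocks per multiprocessor are assumptions made for this example.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical grid-stride kernel: doubles every element of arr.
__global__ void double_all(int *arr, int n) {
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        arr[i] *= 2;
    }
}

// Hypothetical host helper: fills arr with 0..n-1, runs the kernel with the
// given launch configuration, and returns the result.
std::vector<int> run_double(int n, int blocks, int threads) {
    std::vector<int> host(n);
    for (int i = 0; i < n; ++i) host[i] = i;
    int *d_arr = nullptr;
    cudaMalloc(&d_arr, n * sizeof(int));
    cudaMemcpy(d_arr, host.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    double_all<<<blocks, threads>>>(d_arr, n);
    cudaMemcpy(host.data(), d_arr, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_arr);
    return host;
}

int main() {
    const int n = 100000;
    int num_sms = 1;
    // Query the multiprocessor count of device 0 to size the larger grid.
    cudaDeviceGetAttribute(&num_sms, cudaDevAttrMultiProcessorCount, 0);
    std::vector<int> a = run_double(n, 1, 1024);             // one block
    std::vector<int> b = run_double(n, 32 * num_sms, 256);   // many blocks
    printf("results match: %s\n", a == b ? "yes" : "no");
    return 0;
}
```

Both launches compute the same array; the difference is only in how much of the device is kept busy.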
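4. The "leftover" approach here fails when n is not a multiple of 1024. Why?
Integer division rounds down, so the trailing elements that do not fill a whole block are never processed. Changing the number of blocks to `(n + blockDim - 1) / blockDim` fixes this.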
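The rounding issue can be seen in a small sketch, assuming a block size of 1024 and `n = 1025`. The names `fill_guarded`, `ceil_div`, and `run_fill` are hypothetical. Once the block count is rounded up, the last block contains threads past the end of the array, so the kernel also needs an `i < n` guard.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: one thread per element, guarded against overrun.
__global__ void fill_guarded(int *arr, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;  // surplus threads in the last block do nothing
    arr[i] = i;
}

// Integer ceiling division for positive operands.
int ceil_div(int a, int b) { return (a + b - 1) / b; }

// Hypothetical host helper: pre-fills the array with -1 so that untouched
// elements are visible, then runs the kernel with the given block count.
std::vector<int> run_fill(int n, int blocks, int threads) {
    int *d_arr = nullptr;
    cudaMalloc(&d_arr, n * sizeof(int));
    cudaMemset(d_arr, 0xFF, n * sizeof(int));  // every byte 0xFF gives int -1
    fill_guarded<<<blocks, threads>>>(d_arr, n);
    std::vector<int> out(n);
    cudaMemcpy(out.data(), d_arr, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_arr);
    return out;
}

int main() {
    const int n = 1025, threads = 1024;
    std::vector<int> floor_out = run_fill(n, n / threads, threads);          // 1 block
    std::vector<int> ceil_out = run_fill(n, ceil_div(n, threads), threads);  // 2 blocks
    printf("floor: out[1024]=%d, ceil: out[1024]=%d\n",
           floor_out[1024], ceil_out[1024]);
    return 0;
}
```

With the rounded-down count the final element keeps its sentinel value, while the rounded-up count covers it.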
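5. Which step is missing before the CPU accesses the data?
Kernel launches are asynchronous, so at this point not all threads have necessarily finished; a synchronization (for example `cudaDeviceSynchronize()`) is needed before the CPU reads the data.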
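A sketch of the missing step, under the assumption that the data lives in unified memory allocated with `cudaMallocManaged` (the original code may allocate it differently). The names `fill_index` and `last_after_sync` are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical grid-stride kernel: writes i into arr[i].
__global__ void fill_index(int *arr, int n) {
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        arr[i] = i;
    }
}

// Hypothetical host helper: returns arr[n - 1] as seen by the CPU after the
// kernel has finished, or -1 if synchronization reports an error.
int last_after_sync(int n) {
    int *arr = nullptr;
    cudaMallocManaged(&arr, n * sizeof(int));  // assumption: unified memory
    fill_index<<<32, 128>>>(arr, n);
    // The launch returns immediately; wait for the GPU before touching arr
    // on the host.
    cudaError_t err = cudaDeviceSynchronize();
    int last = (err == cudaSuccess) ? arr[n - 1] : -1;
    cudaFree(arr);
    return last;
}

int main() {
    printf("last element = %d\n", last_after_sync(65536));
    return 0;
}
```

When the result is instead copied back with a blocking `cudaMemcpy` on the default stream, that copy already waits for the kernel, so the explicit call matters most when the host reads managed or mapped memory directly.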