Mesh shader emulation over draw-indirect #38
Comments
Update to strategy: cross-thread semantics is a big issue. Initial thoughts: for any write-out store for all possible values of if all varyings are written from the same thread, then that |
Initial work on Can roughly estimate thread-mapping for vertex/varyings for simple (OpenGothic) cases. |
Mesh emulation is still slower than draw-call spam in the OpenGothic case. Current ideas:
|
More numbers:
|
Since neither NVIDIA nor Intel supports compute-to-graphics overlap within the same command buffer, the new take on the runtime is:
TODO: |
Hm, the task shader appears to be a way bigger problem than I expected. In a straight indirect-based workflow one Some ideas:
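void main()
{
  // gl_LocalInvocationIndex is the flattened scalar index; gl_LocalInvocationID is a uvec3
  if(gl_LocalInvocationIndex < max_task_threads)
    task_main();
  barrier(); // make sure that the task stage is done
  // process every mesh group emitted by the task stage, one after another
  for(int i=0; i<mesh_groups; ++i)
  {
    if(gl_LocalInvocationIndex < max_mesh_threads)
      mesh_main();
  }
}
Cons: won't work reliably with inner barriers, won't work fast with a large expansion factor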
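To make the second con concrete, here is a small CPU-side C++ model of the merged workgroup above. It is only a sketch: `MergedGroupStats`, `simulateMergedGroup` and the parameter names are hypothetical and not part of the project, and it assumes one workgroup walks all of its mesh groups back-to-back, as the loop does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU-side model of the merged task+mesh workgroup.
// None of these names come from the project; they only illustrate the loop.
struct MergedGroupStats {
  uint32_t taskInvocations = 0; // lanes that ran task_main()
  uint32_t meshInvocations = 0; // total mesh_main() calls over all passes
  uint32_t serialPasses    = 0; // loop iterations executed in sequence
};

MergedGroupStats simulateMergedGroup(uint32_t workGroupSize,
                                     uint32_t maxTaskThreads,
                                     uint32_t maxMeshThreads,
                                     uint32_t meshGroups) {
  MergedGroupStats s;

  // Phase 1: task stage, only the first maxTaskThreads lanes do work.
  for(uint32_t lane = 0; lane < workGroupSize; ++lane)
    if(lane < maxTaskThreads)
      ++s.taskInvocations;

  // On the GPU a barrier() separates the two phases.

  // Phase 2: each mesh group emitted by the task stage is handled
  // by the same workgroup, one pass after another.
  for(uint32_t i = 0; i < meshGroups; ++i) {
    ++s.serialPasses;
    for(uint32_t lane = 0; lane < workGroupSize; ++lane)
      if(lane < maxMeshThreads)
        ++s.meshInvocations;
  }
  return s;
}
```

For example, `simulateMergedGroup(64, 32, 64, 8)` reports 8 serial passes: a single workgroup does the work that 8 independent mesh workgroups would otherwise share, so the cost per task group grows linearly with the expansion factor.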
|
Recent test on Intel, with a timestamp-based profiler.
|
Based on #33
The initial implementation is practically working; this ticket is to track technical debt and profiling work.
TODO:
- flat and other interpolators
- in uvec3 gl_WorkGroupID pollution
- in uvec3 gl_NumWorkGroups - polluted due to dispatch indirect
- in uvec3 gl_GlobalInvocationID // polluted, since it is a byproduct of gl_WorkGroupID
ERR (won't fix):
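On why gl_GlobalInvocationID inherits the problem: GLSL defines it as gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID, so a workgroup ID that differs from what the original mesh shader expected shifts the global ID as well. A minimal CPU-side C++ sketch of that relation follows; `UVec3` and `globalInvocationID` are illustrative names, and the "polluted" workgroup ID in the usage note is a made-up value, not something measured in the project.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using UVec3 = std::array<uint32_t, 3>;

// Mirrors the GLSL built-in relation, component by component:
//   gl_GlobalInvocationID = gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID
UVec3 globalInvocationID(const UVec3& workGroupID,
                         const UVec3& workGroupSize,
                         const UVec3& localInvocationID) {
  UVec3 result{};
  for(std::size_t i = 0; i < 3; ++i)
    result[i] = workGroupID[i] * workGroupSize[i] + localInvocationID[i];
  return result;
}
```

With a workgroup size of 64x1x1 and local ID 5, an expected workgroup ID of 2 gives a global ID of 133, while a workgroup ID of 7 (as the emulated dispatch might report) gives 453. Correcting gl_WorkGroupID would therefore correct gl_GlobalInvocationID too.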