Compute versus Hardware
Let's return to our earlier Mesh Shader versus MDI test: 498990 64×128 meshlets with no culling other than back-face culling. The era of GPU fixed-function units and an enormous number of shader types is almost over. Even the best Mesh Shader and MDI paths are slower than compute-based rasterization. A single shader type is better than 14 dedicated shader types. What we need are compute shaders and an efficient way to spawn threads from shaders. Everything else can be easily implemented at the compute shader level. A compute shader extension that allows payloads to be written atomically into an image would change everything.
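To make the last point more concrete, here is a minimal CPU-side sketch of the packing trick that compute rasterizers commonly rely on when no such extension exists: depth goes into the high 32 bits of a 64-bit word, the payload into the low 32 bits, and a single atomic min keeps the closest fragment. This is an illustration under stated assumptions, not the implementation used in the test above. The names `pack_pixel`, `atomic_min_u64`, and `VisibilityBuffer` are hypothetical, `std::atomic` stands in for GPU image atomics, and depth is assumed to be non-negative with smaller values closer to the viewer.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: pack depth (high 32 bits) and payload (low 32 bits).
// Comparing packed words as integers then orders fragments by depth first.
inline uint64_t pack_pixel(float depth, uint32_t payload) {
    // Assumption: depth is non-negative, so IEEE-754 bit patterns sort like the floats.
    assert(depth >= 0.0f);
    uint32_t depth_bits;
    std::memcpy(&depth_bits, &depth, sizeof(depth_bits));
    return (uint64_t(depth_bits) << 32) | payload;
}

inline uint32_t unpack_payload(uint64_t pixel) {
    return uint32_t(pixel & 0xffffffffu);
}

// Atomic minimum built from a compare-exchange loop
// (std::atomic has no built-in fetch_min in C++20).
inline void atomic_min_u64(std::atomic<uint64_t>& target, uint64_t value) {
    uint64_t current = target.load(std::memory_order_relaxed);
    while (value < current &&
           !target.compare_exchange_weak(current, value, std::memory_order_relaxed)) {
        // On failure, compare_exchange_weak reloads `current`; retry while we are still closer.
    }
}

// Hypothetical stand-in for a 64-bit storage image written by many threads.
struct VisibilityBuffer {
    uint32_t width, height;
    std::vector<std::atomic<uint64_t>> pixels;

    VisibilityBuffer(uint32_t w, uint32_t h)
        : width(w), height(h), pixels(std::size_t(w) * h) {
        // Clear to the farthest possible value so any fragment wins the first write.
        for (auto& p : pixels) p.store(UINT64_MAX, std::memory_order_relaxed);
    }

    // Depth test and payload write happen in one atomic operation.
    void write(uint32_t x, uint32_t y, float depth, uint32_t payload) {
        atomic_min_u64(pixels[std::size_t(y) * width + x], pack_pixel(depth, payload));
    }

    uint32_t payload_at(uint32_t x, uint32_t y) const {
        return unpack_payload(
            pixels[std::size_t(y) * width + x].load(std::memory_order_relaxed));
    }
};
```

Writing three fragments to the same pixel with depths 0.75, 0.25, and 0.5 leaves the payload of the 0.25 fragment in place, whatever order the writes arrive in. The limitation is visible in the layout: only 32 bits of payload survive next to the depth value, so anything larger has to be fetched in a later pass. That constraint is one plausible reading of why an atomic payload write into the image would matter so much.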