Argdown Document

<Quadgroup Extension>: If quad operations are split into their own extension, the potential market grows for both quad operations and subgroup operations.

[Explicit Operations]: Operations that take an explicit active mask or lane index should exist.

<Undefined Behavior>: Explicit operations, such as broadcasting from a lane index, depend on successfully querying the active lanes, which the targeted underlying APIs cannot guarantee.

<Avoiding Helps>: Implicitly active operations are useful enough on their own, and they make concerns about divergence and reconvergence irrelevant.

[Non-Uniform]: We should support the non-uniform subgroup model.

<Ubiquitous>: All WebGPU target APIs use the non-uniform model.

<Ambiguous Divergence>: When invocations in a subgroup are executing "together", how long is that guaranteed to last?

<Ambiguous Reconvergence>: Once invocations diverge, what guarantees exist about where they reconverge?

Vulkan has weak guarantees; D3D12 and Metal have none.

We can simply say that invocations never reconverge, which matches the AMD ISA and CUDA models.

<Ambiguous Forward Progress>: How do blocks affect the progress of other blocks, and how do invocations within a block affect the progress of other invocations?

D3D, Metal, and Vulkan are silent on both of these.

<Ambiguous Helpers>: Do helper invocations participate in subgroup operations?