Looking for efficient branching instructions in materials

The base materials in our project have a few expensive paths that rarely contribute anything visible to our pixels. We would like to be able to efficiently skip these paths when we know that they won’t produce anything visible.

We can figure out with only a few instructions whether our expensive path would produce something visible or not. We tried to use “if” nodes to discard these expensive paths as early as possible, but it doesn’t really help us. Looking at the produced HLSL code, all it can produce is ternary ?: operations that are applied after the expensive path has already been calculated.

In terms of HLSL functionality, this is very close to a flatten if. What we are looking for is a branch if instead. Is there any way to do the equivalent of an HLSL branching if statement? If not, are you planning to expose this in the future?
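To make the flatten/branch distinction concrete, here is a minimal HLSL sketch. It is not engine code: `ExpensivePath`, `ShadeFlatten` and `ShadeBranch` are hypothetical names made up for illustration, and the loop is only a stand-in for a costly material path.

```hlsl
// Hypothetical stand-in for an expensive material path (assumption, not from the project).
float3 ExpensivePath(float3 p)
{
    float3 acc = float3(0.0f, 0.0f, 0.0f);
    for (int i = 1; i <= 64; ++i)
    {
        acc += sin(p * (float)i) / (float)i;
    }
    return acc;
}

// Flattened: the expensive path is evaluated unconditionally and the result
// is selected afterwards, which is what a ternary ?: amounts to.
float3 ShadeFlatten(float visibility, float3 p)
{
    float3 result = float3(0.0f, 0.0f, 0.0f);
    [flatten]
    if (visibility > 0.0f)
    {
        result = ExpensivePath(p);
    }
    return result;
}

// Branched: asks the compiler for real flow control, so the expensive path
// can be skipped when the condition is false.
float3 ShadeBranch(float visibility, float3 p)
{
    float3 result = float3(0.0f, 0.0f, 0.0f);
    [branch]
    if (visibility > 0.0f)
    {
        result = ExpensivePath(p);
    }
    return result;
}
```

Keep in mind that the GPU runs pixels and vertices in groups, so a dynamic branch generally only saves time when the whole group takes the same side of the condition.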

So far, the only workaround we have found is to rewrite our expensive paths as custom expressions that contain if-return statements early in their code to skip the remaining instructions. In terms of performance, it works well: it saves us many ms of GPU time per frame. Unfortunately, it’s a lot of trouble for our artists, who are not technical enough to write HLSL shaders.
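For readers who have not seen the pattern, an early-exit custom expression could look roughly like the sketch below. The function name, the `Visibility` and `Position` inputs, and the displacement math are all hypothetical; in a Custom node only the function body would be typed in, with the inputs declared on the node.

```hlsl
// Sketch of an early-exit custom expression (all names are assumptions).
float3 ExpensiveDisplacement(float Visibility, float3 Position)
{
    // Cheap test first: leave before any expensive work when nothing would be visible.
    if (Visibility <= 0.0f)
    {
        return float3(0.0f, 0.0f, 0.0f);
    }

    // Placeholder for the expensive part of the material.
    float3 Offset = float3(0.0f, 0.0f, 0.0f);
    for (int i = 1; i <= 32; ++i)
    {
        Offset += sin(Position * (float)i) / (float)i;
    }

    // Attenuate the result by the visibility factor.
    return Offset * Visibility;
}
```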

Cheers,

I managed to implement my own branch node and I’ve got some interesting results. I was able to remove my custom expressions with if-return early exit statements and get pretty much the same results by using my branch node instead.

Here is an example of what I can do with my branch node:

Here I have an expensive material function that displaces vertices. I use the visibility function in the commented area to attenuate the vertex displacement with distance (the value goes from 1 at the camera location to 0 at the maximum distance). The visibility factor feeds A, which is compared with B. On the left pane, we can see that the comparison operation is configurable (in this case, A > B is used) and that B defaults to Const B because nothing is connected to B. When the comparison is true (within visible range), the True input expression is sent to the output. When the comparison is false, the False expression is sent to the output instead (Const False is used if no expression is provided).

The generated [code][2] really is wrapped in an if-else statement.

Note that no else statement is produced when the False pin is not connected. An optional branch or flatten argument can also be specified (this will actually use the BRANCH or FLATTEN macros).
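As a rough idea of the shape such output might take when only the True pin is connected, here is a hand-written sketch. It is not the actual generated code: the function names and the `ConstB`/`ConstFalse` locals are hypothetical, initialising the result to Const False is an assumption about how the missing else is handled, and the `BRANCH` macro is defined locally only so the snippet stands alone.

```hlsl
// Local fallback so the sketch compiles on its own (assumption: the engine
// provides an equivalent BRANCH macro).
#ifndef BRANCH
#define BRANCH [branch]
#endif

// Hypothetical stand-in for the expensive displacement function.
float3 HypotheticalDisplacement(float3 WorldPosition)
{
    float3 Offset = float3(0.0f, 0.0f, 0.0f);
    for (int i = 1; i <= 32; ++i)
    {
        Offset += sin(WorldPosition * (float)i) / (float)i;
    }
    return Offset;
}

float3 BranchNodeSketch(float A, float3 WorldPosition)
{
    const float ConstB = 0.0f;                             // B pin unconnected: Const B
    const float3 ConstFalse = float3(0.0f, 0.0f, 0.0f);    // False pin unconnected: Const False

    // Assumed handling of the missing else: start from Const False.
    float3 BranchResult = ConstFalse;

    BRANCH
    if (A > ConstB)
    {
        // True pin: the expensive expression lives inside the if block.
        BranchResult = HypotheticalDisplacement(WorldPosition);
    }
    // No else block, since nothing is connected to the False pin.

    return BranchResult;
}
```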

Finally, it’s also possible to connect branch nodes into each other. The nested if statements are resolved correctly and the code is indented appropriately. Here is an example:

This is the corresponding HLSL [code][4] generated.

Note that this solution is not perfect. It has a few drawbacks.

First, if some of the nodes feeding the True or False pins are shared with each other, or with other parts of the graph outside the scope of the branch, the common material nodes will be emitted at each of those locations: inside the if-else statement and again outside it. Removing redundant expressions is not trivial because the HLSL code is emitted in a single pass. However, I have done some tests on PC and, usually, the redundant expressions were simplified correctly by the HLSL compiler or by the driver. I didn’t check whether this is always the case or whether it works well on other platforms, so my recommendation is to avoid plugging in redundant expressions, or to make them as cheap as possible.
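The duplication described here could look something like the following hand-written sketch, where one shared expression ends up emitted three times. `SharedNoise` and `DuplicatedEmission` are hypothetical names, and the layout is only a guess at the shape of the output, not real generated code.

```hlsl
// Hypothetical shared expression used by both pins and by the rest of the graph.
float SharedNoise(float3 p)
{
    return frac(sin(dot(p, float3(12.9898f, 78.233f, 37.719f))) * 43758.5453f);
}

float3 DuplicatedEmission(float A, float3 p)
{
    // Copy 1: emitted outside the branch for the rest of the graph.
    float Outside = SharedNoise(p);

    float3 Result;
    [branch]
    if (A > 0.0f)
    {
        // Copy 2: emitted again for the True pin.
        float InTrue = SharedNoise(p);
        Result = float3(InTrue, InTrue, InTrue) * 2.0f;
    }
    else
    {
        // Copy 3: emitted again for the False pin.
        float InFalse = SharedNoise(p);
        Result = float3(InFalse, InFalse, InFalse) * 0.5f;
    }

    return Result + Outside;
}
```

Whether the three evaluations are merged back into one is left to the compiler or driver, which is why keeping shared expressions cheap is the safer option.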

Also, it’s worth mentioning that the branch result is constrained to a single expression at a time, so it can store at most a float4 value. I had a case where I had to optimize a float3 and a scalar with the same branch condition and a lot of shared logic. I ended up combining the two into a float4 expression connected to a single branch node. More complex cases might require multiple branch nodes evaluated in parallel with the same if condition. I have not evaluated this case. I suppose a good HLSL compiler would merge the duplicated conditions, but this optimization might easily be missed by some compilers.
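The float4 packing trick can be sketched as follows. The names (`PackedBranch`, `Offset`, `Mask`) and the math are hypothetical placeholders; the point is only that a float3 and a scalar share one condition and one result.

```hlsl
// Sketch: a float3 and a scalar computed under one condition and packed
// into a single float4 result (all names are assumptions).
float4 PackedBranch(float A, float3 p)
{
    float4 Packed = float4(0.0f, 0.0f, 0.0f, 0.0f);

    [branch]
    if (A > 0.0f)
    {
        // Logic shared by both outputs is computed once inside the branch.
        float3 Wave = sin(p * 4.0f);

        float3 Offset = Wave * A;                  // the float3 result
        float Mask = saturate(dot(Wave, Wave));    // the scalar result

        Packed = float4(Offset, Mask);
    }

    return Packed;
}
```

Downstream, the two values would be split again with a component mask, for example `Packed.xyz` for the float3 and `Packed.w` for the scalar.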

Overall, this branch node opens the door to many shader optimizations on our side while keeping a level of artistic control in our shaders, even for artists who are not familiar with HLSL and custom expressions. We already see some use for it in our other projects.