Vulkan allows us to bind shader resources like textures, images, storage buffers, uniform buffers and texel buffers in an incremental manner. For example, we can bind all view matrices in a single descriptor set (actually, just a single uniform buffer) and have it persist between several pipeline switches. However, it’s not super clear how descriptor sets are deemed compatible between pipelines.
NOTE: When I mention a shader below, I mean an AnyFX-style shader, meaning a single shader can contain several vertex/pixel/hull/domain/geometry/compute shader modules.
I could never get the descriptor sets to work perfectly, that is, to bind the frame-persistent descriptors once at the start of each frame and then not bind them again for the entire frame (or view). Currently, I bind my ‘shared’ descriptor sets after I start a render pass or bind a compute shader.
When binding a descriptor set, all currently bound descriptor sets with a set number lower than the one being bound have to be compatible in order to stay bound. So if we have set 0 bound, and bind set 3, then for set 0 to stay bound, it has to be compatible with the pipeline. If we switch pipelines, the descriptor sets that are compatible between the two pipelines are retained, provided they follow the previous rule. That is, if Pipeline A has sets 0, 1, 2, 3 and Pipeline B is then bound, and only sets 0 and 1 are compatible, then sets 2 and 3 are unbound and need to be bound again.
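The retention rule can be modelled in a few lines of plain C++. This is only a sketch under simplifying assumptions, not Vulkan API code: `SetLayoutId`, `retainedSetCount` and `exampleFromText` are hypothetical names, and a descriptor set layout is reduced to an integer id where equal ids stand for identically defined layouts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a descriptor set layout: two sets are treated as
// compatible when their ids are equal (i.e. identically defined layouts).
using SetLayoutId = std::uint32_t;

// Returns how many of the lowest-numbered sets stay bound when switching from
// pipeline layout `from` to pipeline layout `to`. Sets are only retained as a
// contiguous run starting at set 0; the first mismatch unbinds every set from
// that index upwards.
std::size_t retainedSetCount(const std::vector<SetLayoutId>& from,
                             const std::vector<SetLayoutId>& to)
{
    std::size_t count = 0;
    while (count < from.size() && count < to.size() && from[count] == to[count])
        ++count;
    return count;
}

// The example from the text: Pipelines A and B agree on sets 0 and 1 only,
// so sets 2 and 3 have to be bound again after the switch.
inline void exampleFromText()
{
    assert(retainedSetCount({10, 11, 12, 13}, {10, 11, 22, 23}) == 2);
}
```

Note that a mismatch in set 0 means nothing is retained, even if a higher set happens to be identical in both pipelines.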
Where do shader variables change the most? Clearly, in each individual shader. For example, take the shader billboard.fx, which has a vec4 Color and a sampler2D AlbedoMap. In AnyFX, the Color variable would be a uniform tucked away in a uniform buffer, and AlbedoMap would be its own resource. In the Vulkan implementation, they would also be assigned a set number. To avoid touching lower sets, and thereby invalidating descriptor sets, this ‘default set’ has to be numbered high enough that no other set goes above it. However, since we can’t really know the shader developer’s intention for how sets are used, the compiler can be supplied a flag, /DEFAULTSET
I also got texture arrays and indexing to work properly, so all textures are now submitted as one huge array of descriptors, and whenever an object is rendered, all that is updated is the index into the array, which is supplied in a uniform buffer. This way, we can keep the number of descriptor sets down to a minimum of one per set number per shader resource. Allocating a new resource using a certain shader expands the uniform buffer to accommodate the object-specific data.
First off is the naïve way:
Memory | Memory | Memory | Memory | Memory | Memory | Memory | Memory |
---|---|---|---|---|---|---|---|
Buffer | Buffer | Buffer | Buffer | Buffer | Buffer | Buffer | Buffer |
Object 1 | Object 2 | Object 3 | Object 4 | Object 5 | Object 6 | Object 7 | Object 8 |
This is where I was a couple of days ago, and it forced me to use one descriptor per shader state, since each shader state has its own buffer. The slightly less bad way of doing this is:
Memory | |||||||
---|---|---|---|---|---|---|---|
Buffer | Buffer | Buffer | Buffer | Buffer | Buffer | Buffer | Buffer |
Object 1 | Object 2 | Object 3 | Object 4 | Object 5 | Object 6 | Object 7 | Object 8 |
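This reduces the number of memory allocations, but doesn’t help with keeping the descriptor set count low. What we want instead is a single buffer on a single memory allocation, divided into one slot per object: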
Memory | |||||||
---|---|---|---|---|---|---|---|
Buffer | |||||||
Object 1 | Free | Free | Free | Free | Free | Free | Free |
Allocating a new object just returns a free slot.
Memory | |||||||
---|---|---|---|---|---|---|---|
Buffer | |||||||
Object 1 | Object 2 | Free | Free | Free | Free | Free | Free |
If the memory backing is full, we expand the buffer size and allocate new memory.
Memory | |||||||
---|---|---|---|---|---|---|---|
Buffer | |||||||
Object 1 | Object 2 | Object 3 | Object 4 | Object 5 | Object 6 | Object 7 | Object 8 |
After expanding to make room for a ninth object:
Memory | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Buffer | |||||||||||||||
Object 1 | Object 2 | Object 3 | Object 4 | Object 5 | Object 6 | Object 7 | Object 8 | Object 9 | Free | Free | Free | Free | Free | Free | Free |
As you can see, the buffer stays the same, meaning we can keep it bound in the descriptor set, and just change its memory backing. The only thing the shader state needs to do now is to submit the exact same descriptor state as all sibling states, but provide its own offset into the buffer.
However, since a Vulkan buffer cannot be rebound to different memory, we need to create a new buffer to bind the new memory, so we actually do have to update the descriptor set when we expand. This is only done when creating a shader state, though, which happens outside of the rendering loop anyway.
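The slot scheme in the tables above can be sketched as a small CPU-side model. It makes no Vulkan calls, and everything in it is an assumption made for illustration: the class name `UniformSlotPool`, the doubling growth policy, and the `generation` counter, which stands in for "a new buffer was created, so the descriptor set has to be rewritten".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one shared uniform buffer that hands out fixed-size
// slots to shader states.
class UniformSlotPool
{
public:
    UniformSlotPool(std::size_t slotSize, std::size_t initialSlots)
        : slotSize_(slotSize), capacity_(initialSlots)
    {
        assert(slotSize > 0 && initialSlots > 0);
        // Push high indices first so slot 0 is handed out first.
        for (std::size_t i = initialSlots; i > 0; --i) free_.push_back(i - 1);
    }

    // Returns the byte offset of a free slot, doubling the capacity if full.
    std::size_t allocate()
    {
        if (free_.empty()) grow();
        const std::size_t slot = free_.back();
        free_.pop_back();
        return slot * slotSize_;
    }

    // Returns a slot to the pool so a later allocate() can reuse it.
    void release(std::size_t offset)
    {
        assert(offset % slotSize_ == 0 && offset / slotSize_ < capacity_);
        free_.push_back(offset / slotSize_);
    }

    std::size_t capacity() const { return capacity_; }

    // Incremented on every growth; a real renderer would recreate the buffer
    // and rewrite the descriptor set whenever this changes.
    std::uint32_t generation() const { return generation_; }

private:
    void grow()
    {
        const std::size_t old = capacity_;
        capacity_ *= 2;
        for (std::size_t i = capacity_; i > old; --i) free_.push_back(i - 1);
        ++generation_;
    }

    std::size_t slotSize_;
    std::size_t capacity_;
    std::vector<std::size_t> free_;
    std::uint32_t generation_ = 0;
};
```

With eight initial slots, the ninth `allocate()` call is the one that grows the pool and bumps `generation()`, matching the step from 8 to 16 slots in the tables. In a real implementation the slot size would presumably also have to be rounded up to the device's `minUniformBufferOffsetAlignment` limit.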
Textures are bound by the shader server: each time a texture is created, it registers with the shader server, which performs a descriptor set write. The texture descriptor set must be set index 0, so that it can be shared by all shaders.
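As a rough illustration of that registration step, here is a stand-alone sketch. `TextureTable`, `PendingWrite` and the string label used to identify a texture are all hypothetical; a real shader server would record an actual descriptor write rather than a name.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record of one descriptor write: "put this texture at this
// element of the texture array in set 0".
struct PendingWrite
{
    std::uint32_t arrayIndex;
    std::string texture;
};

// Hands out array indices to textures as they register and queues the
// matching descriptor writes.
class TextureTable
{
public:
    explicit TextureTable(std::uint32_t maxTextures) : max_(maxTextures) {}

    // Registers a texture and returns the index shaders use to look it up.
    std::uint32_t registerTexture(const std::string& name)
    {
        assert(next_ < max_ && "texture array is full");
        const std::uint32_t index = next_++;
        pending_.push_back({index, name});
        return index;
    }

    // Returns the queued writes and clears the queue.
    std::vector<PendingWrite> flush()
    {
        std::vector<PendingWrite> out;
        out.swap(pending_);
        return out;
    }

private:
    std::uint32_t max_;
    std::uint32_t next_ = 0;
    std::vector<PendingWrite> pending_;
};
```

The index returned by `registerTexture` is the value an object would later store in its uniform data to select the texture.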
Consider this shader:
group(1) varblock MaterialVariables { ... };
group(1) sampler2D MaterialSampler;
group(2) r32f image2D ReadImage;
group(2) image2D WriteImage;
group(3) varblock KernelVariables { ... };
Resulting in this layout on the engine side.
Shader | ||||
---|---|---|---|---|
Descriptor set 1 | Descriptor set 1 | Descriptor set 2 | Descriptor set 2 | Descriptor set 3 |
Uniform buffer | Sampler | Image | Image | Uniform buffer |
Creating a ‘state’ of this shader would only perform an expansion of the uniform buffers in sets 1 and 3, but the sampler and two images will be directly bound to the descriptor set of the shader, meaning that any per-object texture switches would cause all objects to switch textures. We don’t want that, obviously, but we’re almost there. We can still create a state of this shader and not bind our own uniform buffers, by simply expanding the uniform buffers in sets 1 and 3 to accommodate for the per-object variables. To do this for textures, we need to apply the texture array method mentioned before.
group(0) sampler2D AllMyTextures[2048];
group(1) varblock MaterialVariables { uint MaterialTextureId; ... };
group(2) r32f image2D ReadImage;
group(2) image2D WriteImage;
group(3) varblock KernelVariables { ... };
Which results in the following layout:
Shader | ||||
---|---|---|---|---|
Descriptor set 0 | Descriptor set 1 | Descriptor set 2 | Descriptor set 2 | Descriptor set 3 |
Sampler array | Uniform buffer | Image | Image | Uniform buffer |
Now, texture selection is just a matter of uniform values: we supply a per-object value for the uniform buffer member MaterialTextureId. While this is trivial for samplers, it also leaves us asking for more. For example, how do we sample textures differently when all samplers are bound in one array? Vulkan allows a texture to be bound with an immutable sampler in the descriptor set, so that’s one option, although we supply all our sampler information in AnyFX in the shader code by doing something like:
samplerstate MaterialSamplerState { Samplers = { MaterialSampler }; Filter = Anisotropic; };
But we can’t anymore, because we no longer have MaterialSampler, and applying this sampler state to all textures in the entire engine might not be correct either. Luckily for us, the GL_KHR_vulkan_glsl extension gives us the ability to decouple textures from samplers, and to create the sampler in shader code. So I enabled AnyFX to create such a separate sampler object, although to do so one must omit the list of samplers. So the above code would be:
group(1) samplerstate MaterialSamplerState { Filter = Anisotropic; };
Which results in a separate sampler, and the descriptor sets would be:
group(0) texture2D AllMyTextures[2048]; ...
And finally, sampling the texture becomes (UV is a placeholder name for the texture coordinate):
vec4 Color = texture(sampler2D(AllMyTextures[MaterialTextureId], MaterialSamplerState), UV);
Instead of
vec4 Color = texture(AllMyTextures[MaterialTextureId], UV);
Which allows us to explicitly select, in the shader code, which sampler state to use, even if all our textures are submitted once per frame. I could also quite easily implement a list of combined image-samplers, which would allow, for example, a graphics artist to supply the texture with sampler information and have that written directly into the descriptor set, while still being able to fetch the proper sampler from the array.
For the sake of completeness, here’s the final shader layout:
Shader | |||||
---|---|---|---|---|---|
Descriptor set 0 | Descriptor set 1 | Descriptor set 1 | Descriptor set 2 | Descriptor set 2 | Descriptor set 3 |
Texture array | Sampler state | Uniform buffer | Image | Image | Uniform buffer |
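On the engine side, selecting a texture for an object then amounts to writing an index into that object's slot of the shared uniform buffer. The following is a minimal sketch of that write, assuming a CPU-visible byte array in place of mapped buffer memory; `writeMaterial` and `readMaterial` are hypothetical helpers, and only the `MaterialTextureId` member from the shader above is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical CPU-side mirror of the per-object part of MaterialVariables.
struct MaterialVariables
{
    std::uint32_t MaterialTextureId;
};

// Writes one object's variables into its slot of the shared uniform buffer.
// `storage` stands in for the mapped buffer memory and `offset` is the
// object's byte offset into it.
void writeMaterial(std::vector<std::uint8_t>& storage, std::size_t offset,
                   const MaterialVariables& vars)
{
    assert(offset + sizeof(vars) <= storage.size());
    std::memcpy(storage.data() + offset, &vars, sizeof(vars));
}

// Reads the variables back from a slot, e.g. for debugging.
MaterialVariables readMaterial(const std::vector<std::uint8_t>& storage,
                               std::size_t offset)
{
    assert(offset + sizeof(MaterialVariables) <= storage.size());
    MaterialVariables vars;
    std::memcpy(&vars, storage.data() + offset, sizeof(vars));
    return vars;
}
```

Two objects that share the same descriptor sets can thus use different textures simply by holding different indices at their respective offsets.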
So this proves we can utilize uniform buffers to select textures too, covering all our grounds in one tied up bow. Neat. Except for images, and here’s why.
Images are not switched around and messed around with like textures are, and for good reason. An image is used when a shader needs to perform a read-write to texels in the same resource, meaning that images are mostly used for random access and random writes, for post effects and the like, and are thus not as prone to changes as for example individual objects. Instead, images are mostly consistent, and can be bound during rendering engine setup. We could implement image arrays like we do texture arrays, however we must consider the HUGE amount of format combinations required to fit all cases.
Images can, like textures, be 2D, 2D multisample, 3D or Cube, just to mention the common types. We obviously have special cases like 2DArray, CubeArray and so forth, but array textures are not even used or supported in Nebula; I never saw the need for them. However, images also need a format qualifier if the image is to be read with imageLoad, meaning we would basically need a uniform array for each of the 4 ordinary types, in all permutations of formats. While possible, I deemed it a big no-no. Instead, I decided that since images are special-use resources for single-fire read-writes, a shader has to update the descriptor set each time it wants to change an image, meaning it’s more efficient to reuse the same variable within a shader and simply not perform a new binding. All in all, this shouldn’t become a problem.
What’s left to do is to have the shader loader enforce certain descriptor set layouts, so that no shader creator accidentally uses a reserved descriptor set (like 0 for textures, 1 for camera, 2 for lighting, 3 for instancing). If a shader does, it will alter a reserved descriptor set and make it incompatible, and we can’t have that, since it will simply cause manually applied descriptors to stop being bound, resulting in unpredictable behavior. Another way of solving this issue is to change the group syntax in AnyFX to something more stable and easier to validate, like a structure-like syntax, for example:
group 0
{
    sampler2D Texture;
    varblock Block
    {
        vec4 Vector;
        uint Index;
    }
}
And then assert that no group is later declared with the same index. To handle stray variables declared outside of a group, the compiler simply generates the default group, and puts all strays in there.
The only issue I have with the above syntax is the annoying level of indirection before you actually get to the meat of the shader code. I think implementing an engine side check is the way to go now, but implementing groups as a structure like above could be a valid idea, since we might want to have the same behavior in all rendering APIs. Consider this for OpenGL too, in which we can guarantee that applying a group of uniforms and textures will remain persistent if all shaders share the same declaration. Although, in OpenGL, since we don’t have descriptor sets, we must simply ensure that the location values for individual groups remain consistent.
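An engine-side check along those lines could look roughly like the sketch below. It is an assumption about how such a check might be structured, not the actual Nebula loader: `SetSignature`, `findReservedSetConflicts` and `requireNoReservedSetConflicts` are hypothetical names, and a descriptor set is described simply as an ordered list of resource kinds.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one descriptor set as seen by the loader: the
// ordered list of resource kinds declared in it.
using SetSignature = std::vector<std::string>;
using SetMap = std::map<std::uint32_t, SetSignature>;

// Returns the indices of reserved sets that a shader declares differently
// from what the engine expects. Sets the engine has not reserved are ignored.
// An empty result means the shader leaves the reserved sets compatible.
std::vector<std::uint32_t> findReservedSetConflicts(const SetMap& reserved,
                                                    const SetMap& declared)
{
    std::vector<std::uint32_t> conflicts;
    for (const auto& entry : declared)
    {
        const auto it = reserved.find(entry.first);
        if (it != reserved.end() && it->second != entry.second)
            conflicts.push_back(entry.first);
    }
    return conflicts;
}

// Loader-side guard: stop (in debug builds) when a shader redefines a
// reserved set.
void requireNoReservedSetConflicts(const SetMap& reserved, const SetMap& declared)
{
    assert(findReservedSetConflicts(reserved, declared).empty());
}
```

Returning the list of conflicting set indices, rather than a plain boolean, lets the loader report exactly which group a shader author needs to fix.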