While I’ve been working on AnyFX, I’ve also looked into tone mapping. Nebula currently encodes and decodes pseudo-HDR by simply down-scaling and up-scaling, which works rather well in most cases. However, the effect was applied uniformly across the entire screen, with no adaptation to brightness or color, and that is exactly what tone mapping solves.
I was a bit stunned by how easy it was to perform tone mapping. We need to downscale the color buffer to a 2×2 image, then calculate the average luminance into a 1×1 texture. We then simply copy that 1×1 texture so it can be read back the next frame.
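The copy step only needs a trivial pixel shader. Here is a minimal sketch of it; the texture name CurrentLum and the entry point name are assumptions on my part, not Nebula’s actual names:

Texture2D CurrentLum;       // the 1x1 average luminance written this frame (name assumed)
SamplerState DefaultSampler;

//------------------------------------------------------------------------------
/**
    Copies the 1x1 average luminance so it can be read as the previous
    frame's luminance in the next frame.
*/
float
psCopyLuminance(float4 Position : SV_POSITION0, float2 UV : TEXCOORD0) : SV_TARGET0
{
    // the source is 1x1, so sampling the center returns the single stored value
    return CurrentLum.Sample(DefaultSampler, float2(0.5f, 0.5f)).r;
}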
One way to perform the downscaling would be a sequence of post effects, each doing a simple 2×2 box average. However, since what we are doing is essentially mipmapping, we might just as well generate mips instead. The only downside is that we generate one more level than we need (the final 1×1 level of the chain), but it’s still much more efficient than running a series of consecutive downscale passes.
When the average luminance has been calculated, we just feed that value to a tone mapping operator to perform the effect. The eye adaptation part is handled during the 2×2 -> 1×1 downscale. The HLSL code for the operator is:
static const float g_fMiddleGrey = 0.6f;
static const float g_fMaxLuminance = 16.0f;

//------------------------------------------------------------------------------
/**
    Calculates HDR tone mapping
*/
float4
ToneMap(float4 vColor, float lumAvg, float4 luminance)
{
    // Calculate the luminance of the current pixel
    float fLumPixel = dot(vColor.rgb, luminance);

    // Apply the modified operator (Eq. 4)
    float fLumScaled = (fLumPixel * g_fMiddleGrey) / lumAvg;
    float fLumCompressed = (fLumScaled * (1 + (fLumScaled / (g_fMaxLuminance * g_fMaxLuminance)))) / (1 + fLumScaled);
    return float4(fLumCompressed * vColor.rgb, vColor.a);
}
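The luminance parameter holds the per-channel weights for the luminance dot product. The actual constant isn’t shown here, but a typical choice would be the Rec. 709 relative-luminance weights, and the final full-screen pass would then apply the operator roughly like this (a sketch only; the AverageLum and ColorSource names are assumptions):

// assumed weights; the constant Nebula actually uses may differ
static const float4 Luminance = float4(0.2126f, 0.7152f, 0.0722f, 0.0f);

// inside the final full-screen pixel shader (names assumed)
float lumAvg = AverageLum.Sample(DefaultSampler, float2(0.5f, 0.5f)).r;
float4 mapped = ToneMap(ColorSource.Sample(DefaultSampler, UV), lumAvg, Luminance);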
We use constants for the middle grey value and the maximum luminance. These could be parameterized, but it’s not really necessary. We calculate the average luminance using the following kernel:
//------------------------------------------------------------------------------
/**
    Performs a 2x2 kernel downscale
*/
void
psMain(float4 Position : SV_POSITION0,
    float2 UV : TEXCOORD0,
    out float result : SV_TARGET0)
{
    float2 pixelSize = GetPixelSize(ColorSource);
    float fAvg = 0.0f;

    // source should be a 512x512 texture, so we sample the 8th mip of the texture
    float sample1 = dot(ColorSource.SampleLevel(DefaultSampler, UV + float2(0.5f, 0.5f) * pixelSize, 8), Luminance);
    float sample2 = dot(ColorSource.SampleLevel(DefaultSampler, UV + float2(0.5f, -0.5f) * pixelSize, 8), Luminance);
    float sample3 = dot(ColorSource.SampleLevel(DefaultSampler, UV + float2(-0.5f, 0.5f) * pixelSize, 8), Luminance);
    float sample4 = dot(ColorSource.SampleLevel(DefaultSampler, UV + float2(-0.5f, -0.5f) * pixelSize, 8), Luminance);
    fAvg = (sample1 + sample2 + sample3 + sample4) * 0.25f;

    // read last frame's adapted luminance from the 1x1 texture
    float fAdaptedLum = PreviousLum.Sample(DefaultSampler, float2(0.5f, 0.5f)).r;

    // blend from the previous adapted luminance toward the new average,
    // clamped so we never adapt to complete darkness or to overly bright values
    result = clamp(fAdaptedLum + (fAvg - fAdaptedLum) * (1 - pow(0.98f, 30 * TimeDiff)), 0.3f, 1.0f);
}
The 0.98f can be adjusted to modify the speed at which the eye adaptation occurs. We can also adjust the factor by which we multiply the TimeDiff variable; here we use 30, but we could use any value. Modifying either value affects how quickly the adaptation converges.
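For readability, the blend in the kernel above can also be written with lerp; this is just the same expression restated, using the same variable names:

// the blend factor approaches 1 as the frame time grows, so longer frames
// pull the adapted luminance further toward the new average
float blend = 1.0f - pow(0.98f, 30.0f * TimeDiff);
result = clamp(lerp(fAdaptedLum, fAvg, blend), 0.3f, 1.0f);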
Also, in order to make the pipeline more streamlined, we first downsample the color buffer (which has a screen-relative size) to 512×512 before we generate the mipmaps. This guarantees there will be a 2×2 level, and since 512 = 2^9, simple math tells us the 2×2 level must be mip 8:
Resolution | Mip level
512×512 | Level 0
256×256 | Level 1
128×128 | Level 2
64×64 | Level 3
32×32 | Level 4
16×16 | Level 5
8×8 | Level 6
4×4 | Level 7
2×2 | Level 8
So, to conclude: first we downscale the color buffer from its variable, screen-relative size to 512×512. Then we generate mipmaps for that render target. Next we calculate the average luminance, using the time difference to blend between the previous and the current luminance. Finally, we use that luminance value to apply the operator described above. For this to be handled properly, we tone map both the bloom and the final result; if we perform bloom without the tone mapping, we get an overabundance of bloom. The difference can be seen below:
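As a side note, here is a rough sketch of how the bloom bright-pass could reuse the operator and the adapted luminance so both images agree on exposure. The texture names, the threshold constant and the entry point are assumptions for illustration, not Nebula’s actual code, and Luminance refers to the assumed weight constant from earlier:

Texture2D ColorSource;      // scene color (name assumed)
Texture2D AverageLum;       // 1x1 adapted luminance (name assumed)
SamplerState DefaultSampler;

static const float g_fBrightPassThreshold = 0.8f;   // assumed value

//------------------------------------------------------------------------------
/**
    Bright-pass for bloom, tone mapped with the same adapted luminance
    as the final image so the bloom amount stays consistent.
*/
float4
psBrightPass(float4 Position : SV_POSITION0, float2 UV : TEXCOORD0) : SV_TARGET0
{
    float4 color = ColorSource.Sample(DefaultSampler, UV);
    float lumAvg = AverageLum.Sample(DefaultSampler, float2(0.5f, 0.5f)).r;

    // tone map before thresholding so bloom reacts to the adapted exposure
    float4 mapped = ToneMap(color, lumAvg, Luminance);

    // keep only the portion above the threshold as bloom input
    float3 bright = max(mapped.rgb - g_fBrightPassThreshold, float3(0.0f, 0.0f, 0.0f));
    return float4(bright, mapped.a);
}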
// Gustav