Epidermis

I got a tip that skin is a very important thing to render properly. THE method for rendering light passing through semi-solid objects such as skin, leaves etc. is called subsurface scattering, which you can read about here: http://en.wikipedia.org/wiki/Subsurface_scattering.

The only problem with the general, near-perfect way of rendering skin with subsurface scattering is that one needs to perform the light operations per light source. That means deferred rendering is out the window, which is bad! So there is a method called SSSS, or Screen-Space Subsurface Scattering, which has seen use in engines such as Unreal Engine 3.

Our graphics artist, Samuel, thought it would be a very nice addition to Nebula if I were to add this algorithm in order to render skins properly. Here is the result:

facenosss

No Screen-Space Subsurface Scattering

facesss

With Screen-Space Subsurface Scattering

I accomplished this by adding a new MRT which renders absorption color, scatter color and a binary mask that marks the area where the SSS should take place. We need to mask out the rest because we are working in screen-space, and must therefore be very careful about exactly what we apply our algorithm to, so that ordinary static objects don't get this effect.

The algorithm itself is simply a horizontal and vertical blur, much like bloom, except the light is blurred instead of the color, weighted by the depth of nearby pixels. Combined with a Gaussian distribution which 'favors' reddish hues, this makes skin appear more life-like, since it simulates light being spread under the surface of the skin.
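
To give an idea of what the post effect does, here is a rough sketch of the horizontal pass of such a blur, assuming made-up texture and variable names (LightTexture, DepthTexture, MaskTexture, SSSWidth) and a hard-coded red-favoring kernel; the real shader reads its parameters from the per-object data buffer described below.

// Minimal sketch of the horizontal SSS blur pass, not the actual Nebula shader.
// Textures, sampler and variables below are assumed to be bound by the frame shader.
Texture2D LightTexture;
Texture2D DepthTexture;
Texture2D MaskTexture;
SamplerState DefaultSampler;

float2 PixelSize;   // 1 / screen resolution
float SSSWidth;     // blur width, read from the per-object data buffer in the real shader

float4 psSSSBlurX(float4 pos : SV_POSITION, float2 uv : TEXCOORD0) : SV_TARGET
{
    float4 color = LightTexture.Sample(DefaultSampler, uv);

    // skip pixels which aren't flagged for subsurface scattering
    if (MaskTexture.Sample(DefaultSampler, uv).r < 0.5f) return color;

    // reddish kernel: red scatters further under the skin than green and blue
    const float3 weights[3] = { float3(0.24f, 0.36f, 0.36f),
                                float3(0.32f, 0.30f, 0.30f),
                                float3(0.44f, 0.34f, 0.34f) };
    const float offsets[3] = { 0.0f, 1.0f, 2.0f };

    float depth = DepthTexture.Sample(DefaultSampler, uv).r;
    float3 blurred = color.rgb * weights[0];
    [unroll]
    for (int i = 1; i < 3; i++)
    {
        // step size shrinks with distance so the blur stays roughly constant in world space
        float2 offset = float2(offsets[i] * SSSWidth / depth, 0) * PixelSize;
        float3 a = LightTexture.Sample(DefaultSampler, uv + offset).rgb;
        float3 b = LightTexture.Sample(DefaultSampler, uv - offset).rgb;
        blurred += (a + b) * 0.5f * weights[i];
    }
    return float4(blurred, color.a);
}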

The exact implementation in Nebula uses two passes. The first is the standard boring old skinning pass, which calculates lighting, albedo, emissive and specularity. Then we render all geometry which uses the SSS process, and while doing so we output the absorption map and the scatter map. We also render a small data buffer which holds our variables, such as SSS width, SSS strength and the SSS correction factor, as well as a bit which tells us whether a certain pixel should be SSS:ed or not. Finally we apply our screen-space post effect, which runs the actual algorithm and produces the image seen above. This means we can have per-object settings for the SSS while still rendering the result in screen-space. I also fixed a minor glitch in the original implementation, which gave artifacts when a subsurfaced area was cut off by the edge of the screen: by using the mirror address mode for the sampler, pixels along the border of the screen won't suffer from wrapping samples, which removes the artifact.

The red rectangle shows an artifact which occurs when using a wrap address mode

I also applied the same technique to the HBAO shader, seeing as it previously suffered from the same problem with artifacts along the screen borders.

// Gustav

Particles, post effects and general ignorance

So I thought I could be clever and remove redundant device context switches by not setting the index buffer in the device context if the very same index buffer had been set before. Little did I know that I had, just recently, made the device context reset for each pass, so as to prevent unnecessary render targets, shader variables and shaders from staying attached to the rendering pipeline. The obvious problem was that when the same index buffer was needed twice, say first for the shadow maps and then for the actual color pass, my 'optimization' assumed it was still bound, but since the entire device context gets reset between passes, it wasn't. I chose to tell you this not because of stupidity (although it's somewhat embarrassing) but out of curiosity at what kinds of glitches may appear. The glitch was that if a light had shadows enabled, and an object which was previously shadowed moved out of the viewable area of the light source (thus no longer rendering any shadows), some other object would disappear. So whenever you doubt your visibility system, or if you have some really strange and unpredictable glitch, it might be something as simple as premature optimization, which, as we know, is the root of all evil!

Anyways! Let's get to the good news! So, we have particles again, directly in the content browser. Epic winage. More like good use of Christmas… Not only are particles in place, but so are cube maps. In effect I've also implemented a shader which uses environment mapping to get stuff shiny and pretty! Another use of cube maps is of course skyboxes, and as such our at-will artist, Samuel, has made a very pretty skybox prototype which is now in use as the default skybox! This adds a much more 'professional' look to the content browser instead of the boring single-colored background.

gun

Environment mapped asset from our casual graphics artist

The shading system is no longer dependent on Nody, so we can once again implement new shaders. For DirectX we're using the good old Effects system, which works for all our purposes, and since Microsoft seems to have decided NOT to discontinue it, we will keep using it for our DirectX implementations. The awesome part of this system is of course the flexibility of having all shading states and such directly in the shader file. Also, all shader functions and application entry points are written straight into the file, so it's really easy to implement new stuff. For the OpenGL implementation we must design some sort of 'language' which allows for the same set of functionality. This could be solved using the same Nody-ish structure where you tag code as sections, so that the code can be loaded as different objects which are then linked together using the very same file. But that's a project for the future. Only when this fundamental implementation is in place can we really design a node-based shader editor which solves the problem of letting graphics artists create new shaders.
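
As a taste of why the Effects system is convenient, here is a minimal, made-up .fx file: render states, entry points and the technique all live in the same place.

// Minimal sketch of an .fx file: states, entry points and technique in one file.
BlendState NoBlend { BlendEnable[0] = FALSE; };
DepthStencilState DepthWrite { DepthEnable = TRUE; DepthWriteMask = ALL; };
RasterizerState BackfaceCull { CullMode = BACK; };

cbuffer PerObject { float4x4 ModelViewProjection; };

float4 vsMain(float4 position : POSITION) : SV_POSITION
{
    return mul(position, ModelViewProjection);
}

float4 psMain(float4 position : SV_POSITION) : SV_TARGET
{
    return float4(1, 1, 1, 1);
}

technique11 Solid
{
    pass p0
    {
        SetBlendState(NoBlend, float4(0, 0, 0, 0), 0xFFFFFFFF);
        SetDepthStencilState(DepthWrite, 0);
        SetRasterizerState(BackfaceCull);
        SetVertexShader(CompileShader(vs_5_0, vsMain()));
        SetPixelShader(CompileShader(ps_5_0, psMain()));
    }
}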

So tessellation is also back in business. It works, and it looks good, although it's a bit shaky, since there is a very high risk of holes appearing in the mesh over mesh seams. We've tested a very quick low-poly version of a high-poly mesh with a displacement map, and it looks very good! One must still be very careful with the UVs, however, and by extension also with the displacement map. The only thing which mitigates this pain is the fact that the Unreal Engine seems to have the same problem. Until some clever fellow(s) solves this, we're going to leave it to the artist to work around. Seems safe enough.

As if this wasn't enough, I've also fixed post effects. By fixed I mean fiddled. By fiddled I mean adapted. Adapted to the DX11 render path. This means that post effect entities, an ingenious construction of Nebula (not done by me), work again: they allow post effects to be animated whenever one of these entities is encountered. A post effect entity gets triggered when the point of interest enters its volume. So in the level editor we will be able to place a post effect entity which is triggered when entered. This can make for really cool effects, such as saturation, color balance, contrast, fog etc. The first post effect entity set will be the default, so every level can have one to set its 'mood'.

And I’ve also reinstated the depth of field, which can be controlled using a post effect entity as every other post effect. I personally think it’s a bit gimmicky, but it’s serves the purpose of softly forcing the player to focus on a specific point. The only thing which bothers me with the current DoF implementation is that it uses variables to determine where the depth should be, not where it should be. This should be easily handled by sending a 2D position where in screen-space the focal point should be. The DoF range and intensity should still be a variable in my opinion, but the method with which the focal point gets determined is a bit shaky in my opinion.

Showing off the old and reimplemented DoF

hdr

Showing some modified HDR parameters, such as bloom color and bloom range

// Gustav

New tech!

While I’ve been hard at work with the Content Browser, I’ve also done tons of work on the rendering side. The first and probably coolest feature I’ve added is instancing, which basically lets us add model entities to the brim without giving an all to big loss of FPS. The bottle neck right now seems to be the visibility system. I have tested rendering 4000 1134 polygon models with varying results, from 30 FPS on my home computer to a solid 60 on a newer one. It seems that the rendering itself isn’t causing the bottle neck, but instead the CPU is bottle necked by the visibility system.

I’ve chosen not to use a texture nor a vertex stream to send the instancing transforms to the shader. Instead I render batches of 256 instances at a time, which basically divides the number of draw calls by 256. The reason why the upper limit is 256 is because the biggest  size an array can have in HLSL. No matter, it’s not really a problem seeing as cutting the number of draw calls by 256 for large amounts of objects takes the FPS from unplayable to fluent. I have plans to investigate the other methods in the future, but I’m holding off for the moment. The thing is that texture fetching would require 4 * 3 textures fetches per vertex in the shader, because the sampler only takes 4 pixel components per fetch, and a matrix consists of 4 rows, and each vertex needs ModelViewProjection, ModelView and Model. Vertex streaming seems to be the other viable option, and I will look into that if it’s necessary.

Oh, and if you've ever wondered how 4000 trucks would look in real time in a boring grid, here is a picture:

nebulainstanced


But that’s not all, nono, far from it! Every shader previously written in Nody has been converted into the old .fx format. This might seem counterproductive, but it turns out that working with shaders using Nody was far from optimal, and we made the decision to just code the shaders instead of designing them. In the future I will look into how we can use Nody to accomplish this, but with another approach.

As a result, I've re-implemented the old tessellation shader, and we've also tried it on a 'real' model with a 'real' displacement map generated from a high-poly sculpt. I don't have any images to show it, but I can assure you that it works. There is a problem though: we must use soft edges, because otherwise we get cracks in the tessellated mesh.

nebulanotessellation nebulatessellation

In the image above we have a large, very thin cuboid. The mesh was solid and intact before tessellation, but because the normals on either side of a hard edge point in different directions, the tessellated result cracks along that edge. As such, one cannot use tessellation with hard edges unless the tessellation level is zero over the seam. The tessellation shader also tessellates based on eye distance, so the further away you are, the lower the tessellation factor will be. Oh, and before I forget, the tessellation shader also works for skinned meshes.
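
The distance-based factor lives in the hull shader's patch constant function; here is a rough sketch with made-up names (Nebula's actual shader may differ), which also evaluates the factor at edge midpoints so that an edge shared by two triangles gets the same factor on both sides.

// Sketch of distance-based tessellation factors in a hull shader patch constant function.
float4 EyePos;
float MinDistance;
float MaxDistance;
float MaxTessellation;

struct HullIn
{
    float3 worldPos : TEXCOORD0;
};

struct PatchTess
{
    float edges[3] : SV_TessFactor;
    float inside   : SV_InsideTessFactor;
};

float TessFactor(float3 worldPos)
{
    // fade from MaxTessellation down to 1 between MinDistance and MaxDistance
    float dist = distance(worldPos, EyePos.xyz);
    float t = saturate((dist - MinDistance) / (MaxDistance - MinDistance));
    return lerp(MaxTessellation, 1.0f, t);
}

PatchTess HSPatchFunc(InputPatch<HullIn, 3> patch)
{
    PatchTess output;
    // evaluate per edge midpoint so shared edges get identical factors on both sides
    output.edges[0] = TessFactor((patch[1].worldPos + patch[2].worldPos) * 0.5f);
    output.edges[1] = TessFactor((patch[2].worldPos + patch[0].worldPos) * 0.5f);
    output.edges[2] = TessFactor((patch[0].worldPos + patch[1].worldPos) * 0.5f);
    output.inside = (output.edges[0] + output.edges[1] + output.edges[2]) / 3.0f;
    return output;
}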

Not only can we do this directly in Nebula and get a feel for how the result will look, but we can also render everything in wireframe, giving artists a hint of how fine the tessellation is. This, we hope, will prove useful when working with tessellated meshes. Here's a picture of the very old eagle mesh, tessellated and rendered in wireframe for your debugging pleasure.

nebulaeagletess nebulatexprev


As you can see, I've redesigned how variables are handled and displayed. On the right you can see which variables are available for the current material. Textures can be previewed by hovering over the icon, and clicking the thumbnail opens a file browser which lets you select textures from the file system.

I’ve also taken the liberty to add a shader which lets you animate UVs. It doesn’t work with keyframes, but instead using timing and angles. The shader has a set of parameters, linear direction which animates the UVs in a specified direction, angle which rotates the UVs around the UV shell center point, linear speed which determines the speed of the linear animation, and angular speed which determines the speed of the angular animation. With all these parameters, we can achieve rather good looking animations. It can also tile the texture in X and Y depending on the variables.

nebulauvanimated1 nebulauvanimated2

Above you can see the same object with different settings for the UV animation. You can also see the tile count, which depends on the NumXTiles and NumYTiles variables.

As you can probably see, the browser also allows you to change the light color, and to lock the global light so it always points in the direction of the camera. Well, I guess that's all for now!


Long time no see

So I’ve been away for a long time. Have no fear, this doesn’t mean I haven’t been busy! Quite the opposite, I’m currently working on a new iteration of the Nebula 3 toolkit. As such, I’ve begun working on a content manager. The idea is that the previous way of handling content is that you have five different applications, a level editor, a material editor (textures and shader variable editor), Nody, the batch exporter and the importer. This time around, we want to gather every application into one big tool, which lets us not only edit graphical objects live, but also avoids the fact that we need several applications to accomplish the one thing we want.

Enter the content browser: a tool which lets you review all your assets, edit them (live, of course) and import new ones. This is the tool I'm currently working on, and the idea is for it to be closely attached to the level editor and Nody. With the content browser, an artist or programmer NEVER has to switch between applications to edit, preview and update assets for their games, giving a huge increase in productivity.

While this is going on, I'm also working on a new iteration of Nody. This might sound irrelevant compared to the above, but in my opinion (remember, this is my opinion, so don't take it too seriously) it's not. The idea for Nody 2 is to separate what is actually Nody from what is specific to shaders. By that I mean there will be a framework called Nody, or Nody 2 (haven't quite decided yet), which can be used to implement a Nody application, of which a node-based shader designer might be one. Bear in mind that the basic idea of Nody is to have a node graph which simply connects inputs to outputs and produces a result. A colleague of mine had the idea to use the Nody library to implement a production line for wood sawing, while another suggested it could be used to set up behavior trees for AIs. The uses for Nody are more or less endless, but the important point is that Nody, as it looks today, is only implemented as a shader designer tool. As such, I need to either separate the 'relevant' stuff from the current version of Nody, or quite simply make everything from scratch. The reason I chose the latter is that we've found a way to incorporate Nebula concepts like smart pointers and the overall symmetric design into Qt applications, and as such I can write the new iteration of Nody with automatic memory deallocation, something which Qt more or less lacks (despite the fact that it has six different types of smart pointers).

Oh, and did I mention that I've started studying again? I'm not asking you to care, but keep in mind that progress might seem sluggish while I'm simultaneously trying to write a compiler…

Pics of the progress might appear soon!

Being wrong, again

So, I was wrong again. You can't blend normals. This should have been obvious to me, seeing as blending normals can result in two normals nulling each other out. Take a normal, let's say it's (0.5, 0.5, 0.5) pre-normalization, and blend it with the vector (-0.5, -0.5, -0.5) using a blend factor of 0.5. What do you get? That's right, zero, you get zero, resulting in complete and utter darkness. This is not good, for tons of reasons, many of which I shan't explain. Just try constructing a TBN matrix where one vector is (0, 0, 0). Not so easy! So I removed it, and sure, objects which use alpha (mainly transparent surfaces) won't get lit correctly, seeing as the lighting would have to take place per alpha-object instead of in screen-space. On the other hand, it's hardly visible, seeing as alpha objects often consist of glass, where many layers of glass give an additive effect because of reflections in the light etc., making the rendering error hard to see.


The good news is that I finally got CSM to work! It wasn't easy, and I'll explain why. As you may or may not know, Nebula saves the depth of each pixel as the length of the view-space position of that pixel. What this allows us to do is to take any arbitrary geometry in the scene and calculate a surface position which lies in the same direction that we are currently rendering. Why is this important? Well, saving depth in this manner lets us take, say, a point light and the shaded surface position it should light, determine the distance between those two points, and thus light the surface. Great! We can literally reconstruct our world-space positions by simply taking the depth buffer and multiplying it with a normalized vector pointing from our camera towards our pixel. Now comes the hard part: what do you do when you have no 3D geometry to render, but instead a full-screen quad, as is the case with our global light? After finding this guide: http://mynameismjp.wordpress.com/2010/09/05/position-from-depth-3/, I tried it out, and got some good and some not so good results.


What you see is the same scene, using the above mentioned algorithm to reconstruct the world-space position. If you're lazy, this is the general concept: in the vertex shader, get a vector pointing from the camera (that's (0,0,0) for a full-screen quad) to each frustum corner. Then, in the pixel shader, normalize this ray and take the camera position + normalized ray * sampled depth, where the sampled depth is the length of the view-space position vector. I thought the artifact could be the result of a precision error or whatnot, but the solution proved it wasn't. Instead of using a ray to recreate the position, I simply let my geometry shader (not the GS, but the shader which is used to render deferred geometry) output the world-space position to a buffer which would then hold these values. Using that texture to sample the data, the big black blob disappeared. Somehow the ray-based method for calculating the world-space position must lack accuracy, so I pondered. Normalizing the vector could not be correct. Why? Well, consider the fact that I want a vector going from the frustum origin to each pixel on the far plane. Those vectors are not all the same length, seeing as the distance straight forward is shorter than the distance to a corner, which can be derived using Pythagoras' theorem. Currently the shadows work, and I've baked the data into an extended buffer, using the alpha component for depth and the RGB components for the world-space position, to save using another render target. I'm not satisfied yet though, seeing as the explanation on mynameismjp.wordpress.com must have some validity to it, but for now, I'll keep the data in the render target.
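
For completeness, here is the ray-based reconstruction from the paragraph above as a sketch; it assumes the depth buffer stores the length of the view-space position, and that the full-screen quad's vertices carry a ray towards their frustum corner (all names are made up).

// Sketch of position-from-depth: interpolated frustum-corner ray, normalized in the
// pixel shader and scaled by the stored view-space distance.
Texture2D DepthTexture;
SamplerState DefaultSampler;
float4 CameraPosition;

struct QuadOut
{
    float4 position : SV_POSITION;
    float2 uv       : TEXCOORD0;
    float3 viewRay  : TEXCOORD1;   // interpolated ray towards the far-plane corner
};

float3 ReconstructWorldPos(QuadOut input)
{
    // depth is stored as length(viewSpacePosition), so scaling the unit ray by it
    // should land on the surface; this is the step whose accuracy is in question above
    float depth = DepthTexture.Sample(DefaultSampler, input.uv).r;
    return CameraPosition.xyz + normalize(input.viewRay) * depth;
}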

Deferred shading revisited

I haven’t really had much time to write here, been busy with lots of stuff.

When testing the rendering pipeline we've hit a couple of annoying glitches. The first was that UVs and normals for FBX meshes were corrupt. The fix for the UVs was pretty straightforward, just flip them in Y (why?! I have no idea…), but the fix for the normals wasn't quite as intuitive. First of all, I would like to say that the target platform for our fork of the engine is PC, just so you know. Many of the rendering methods previously in the engine put a lot of focus on compressing and decompressing data in order to save graphics memory and such, but seeing as problems occur oh so easily, and debugging compressed data is oh so hard, I've decided to remove most of the compression in order to get a good visualization of the rendering.

The first thing, which is ingenious, is to compress normals (bumped normals of course) into an A8R8G8B8 texture, putting the X value in the first two components (A and R) and the Y value in the other two (G and B). Z can always be recreated as z = sqrt(1 – x * x – y * y), seeing as a normal has to be normalized. Anyway, the debug view of such normals is a brilliant red and blue-green texture which is impossible to decode by sight, so what I've done is to break the compression and use the normals raw. Well, then another problem arose: raw normals would need a texture with the format R32G32B32, one 32-bit float per component, right? Well yes sir, you are correct, but too bad you can't render to such a texture! Using a simple A8R8G8B8 and just skipping the A value would give such poor precision that artifacts would be everywhere. Instead, I had to pick an A32R32G32B32 texture as my render target. Wasteful? Yes! Easy to debug? Yes! Beautiful and precise normals? Hell yes! I'd say with a score of +1, it's a go!
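
For reference, the two-channel compression and its reconstruction look roughly like this sketch (not the exact code that was removed):

// Sketch of the two-channel normal compression described above.
// X goes into two 8-bit components and Y into the other two; Z is rebuilt from the unit length.
float4 EncodeNormal(float3 n)
{
    // pack [-1, 1] into [0, 1], then split each value over two 8-bit channels
    float2 xy = n.xy * 0.5f + 0.5f;
    float2 high = floor(xy * 255.0f) / 255.0f;
    float2 low = frac(xy * 255.0f);
    return float4(high.x, low.x, high.y, low.y);
}

float3 DecodeNormal(float4 packed)
{
    float2 xy = float2(packed.x + packed.y / 255.0f, packed.z + packed.w / 255.0f) * 2.0f - 1.0f;
    // since the normal is unit length, z = sqrt(1 - x*x - y*y); the sign of z is lost
    float z = sqrt(saturate(1.0f - dot(xy, xy)));
    return float3(xy, z);
}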

Right, we have two enormous textures to render normals to (one for opaque, one for alpha), what else can we do?

Well, Nebula was aiming for a very broad variety of consoles, ranging from the DS to the PC to the PS3. I'm just shooting from the hip here, but that might be the reason why they implemented the light pre-pass method of deferred shading. The pre-pass method requires two geometry passes: one which renders normals and depth, and a second which gathers the lit buffer together with the diffuse, specular and emissive colors. That's nice, but it requires two geometry passes (which can get really heavy with lots of skinned characters). The other, more streamlined method is to render normals, depth, specular and diffuse/albedo to four textures using MRT (multiple render targets), generate light using the normals and depth, and then simply compose everything using some sort of full-screen quad. Yes! That sounds a lot better! The only problem is that we need four render targets, something which can only be done on relatively modern hardware, but not on some consoles.
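
As a sketch, the MRT output of such a geometry pass is just a struct with one entry per target; the exact layout and formats in Nebula may differ.

// Sketch of a geometry pass pixel shader output using MRT: four targets in one pass.
struct GeometryOut
{
    float4 albedo   : SV_TARGET0;
    float4 normal   : SV_TARGET1;
    float  depth    : SV_TARGET2;   // view-space distance, as described earlier
    float4 specular : SV_TARGET3;
};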

Anyway, traditional deferred shading has no built-in way to deal with alpha. That does NOT mean you can't light alpha deferred!

The solution is to render all alpha objects to their own normal, specular, albedo and depth buffers, use those for lighting separately (this requires another light pass using the alpha buffers as input), and then, in a post effect, gather both the opaque color and the alpha color and interpolate between them! Easy peasy! The way I do it is:


/// retrieve and light alpha buffers
/// (DecodeHDR, the textures, DefaultSampler and UV come from the surrounding gather shader)
float4 alphaLight = DecodeHDR(AlphaLightTexture.Sample(DefaultSampler, UV));
float4 alphaAlbedoColor = AlphaAlbedoTexture.Sample(DefaultSampler, UV);
float3 alphaSpecularColor = AlphaSpecularTexture.Sample(DefaultSampler, UV).rgb;
float4 alphaColor = alphaAlbedoColor;
/// normalize the light color and divide by its largest component, so the specular
/// highlight gets tinted by the light color without changing its intensity
float3 alphaNormedColor = normalize(alphaLight.xyz);
float alphaMaxColor = max(max(alphaNormedColor.x, alphaNormedColor.y), alphaNormedColor.z);
alphaNormedColor /= alphaMaxColor;
alphaColor.xyz *= alphaLight.xyz;
float alphaSpec = alphaLight.w;
alphaColor.xyz += alphaSpecularColor * alphaSpec * alphaNormedColor;

/// retrieve and light solid buffers, same procedure as for the alpha buffers
float4 light = DecodeHDR(LightTexture.Sample(DefaultSampler, UV));
float4 albedoColor = AlbedoTexture.Sample(DefaultSampler, UV);
float3 specularColor = SpecularTexture.Sample(DefaultSampler, UV).rgb;
float4 color = albedoColor;
float3 normedColor = normalize(light.xyz);
float maxColor = max(max(normedColor.x, normedColor.y), normedColor.z);
normedColor /= maxColor;
color.xyz *= light.xyz;
float spec = light.w;
color.xyz += specularColor * spec * normedColor;

/// clamp both results before blending
alphaColor = saturate(alphaColor);
color = saturate(color);

/// the alpha of the transparent surface decides how much of it covers the solid color
float4 mergedColor = lerp(color, alphaColor, alphaColor.a);


A simple lerp serves to blend between these two buffers, and the result, mergedColor, is written to the buffer.

Sounds good, eh? Well, there are some problems with this as well! First of all, what about the background color? Seeing as we light everything deferred, we also light the background, and that lit background then ends up as the final result of the gather pass described above, giving an unexpected result: some sort of incorrectly lit background which changes color when the camera moves (because the normals are static but the angle to the global light changes). So how do we solve this? By stencil buffering, of course! Every piece of geometry draws to the stencil buffer, so we can quite simply skip lighting and gathering any pixels outside our rendered geometry, without having to render our geometry twice! And by simply clearing the buffer which the gather shader writes to, to our preferred background color, we can have any color we like!
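
In Effects terms the idea looks something like the two hypothetical state blocks below: the geometry passes bind the first one with a non-zero stencil reference, and the lighting/gather passes bind the second one with the same reference so they only touch pixels that geometry actually wrote.

// Sketch of the stencil setup: geometry marks its pixels, the gather pass only touches marked ones.
// State names are made up for the example.
DepthStencilState GeometryMarksStencil
{
    DepthEnable = TRUE;
    DepthWriteMask = ALL;
    StencilEnable = TRUE;
    FrontFaceStencilFunc = ALWAYS;
    FrontFaceStencilPass = REPLACE;   // write the reference value wherever geometry is drawn
};

DepthStencilState GatherWhereMarked
{
    DepthEnable = FALSE;
    StencilEnable = TRUE;
    StencilWriteMask = 0;
    FrontFaceStencilFunc = EQUAL;     // only shade pixels whose stencil equals the reference value
    FrontFaceStencilPass = KEEP;
};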

So that’s solved then, alpha and opaque objects with traditional deferred shading with custom background coloring, sweet!

Oh, and I also added bloom. Easy enough: render the bright spots to a downsized buffer, blur it into an even more downsized buffer, blur it again, and again, and then sample it, et voilà, bloom!
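
The bright-pass step boils down to something like this sketch (ColorTexture, DefaultSampler and BrightThreshold are assumed names):

// Sketch of the bloom bright pass: keep only pixels above a luminance threshold.
Texture2D ColorTexture;
SamplerState DefaultSampler;
float BrightThreshold;

float4 psBrightPass(float4 pos : SV_POSITION, float2 uv : TEXCOORD0) : SV_TARGET
{
    float4 color = ColorTexture.Sample(DefaultSampler, uv);
    float luminance = dot(color.rgb, float3(0.299f, 0.587f, 0.114f));
    // everything below the threshold contributes nothing to the bloom
    return color * saturate(luminance - BrightThreshold);
}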

So, conclusion: what did we win from this, and what did we lose? We got better normals and lighting at the cost of some graphics memory. We removed half our draw calls by removing a complete set of geometry passes. We managed to optimize our per-pixel lighting with stencil buffering, which in turn gave us the ability to use a background color. And we managed to incorporate alpha into all of this, without any hassle or expensive rendering. All in all, we won!

Also, here are some pictures to celebrate this victory:

Deferred rendering with alpha. This picture shows a semi-transparent character (scary) with fully transparent regions (see knee-line) and correct specularity

Bloom!

Lighting using the A8R8G8B8 texture format (low quality)

Lighting using the A32R32G32B32 texture format (high quality)

PSSM and CSM

I’ve been hard at work during the last 4 days with global light shadowing. One might think this should be extremely simple, but there is a hitch with global ligths, they have NO position! So how does one render shadows from a source when there is no source?

The solution is found in two closely related algorithms, one called PSSM (Parallel-Split Shadow Maps) and the other CSM (Cascaded Shadow Maps). First though, I should explain what a shadow map actually is. A shadow map is a buffer of the scene rendered from the light's perspective, which means that every light source which needs to cast shadows has to render every object visible to the light into a depth-based buffer. The shadow map basically just stores depth as seen from the light's point of view. This of course means that a non-shadow-casting light source is significantly faster than a shadow-casting one.

Now when we’ve gotten that out of the way, we can go back to global lights. As you may or may not have realized, in order to render a shadow map you need a position of a light source, and a direction in which it emits light (except for the point light, which only has a position). The problem is that a only has the direction, so there is no point of view from where one can render the scene. To resolve this, there is a very non-intuitive way of doing it, and that is to split the camera view-frustum into different sections (see Cascades and Parallel-splits). The light source is then rendered from outside the scene bounding box, just to ensure every shadow-casting object gets into the buffer. In order to handle rendering the entire scene into the shadow buffer without needing an enormous buffer, one renders the different splits into different textures, but using different projection transforms to do so. So the first buffer surround the closest area of the camera, the next split or cascade overlaps a larger area and so forth, until it reaches the maximum limit which is based on a given distance. This way, shadows very very far away will be rendered with a very low resolution, seeing as the viewport for that area is very big, so each item doesn’t get that much space.

Nebula used to use this algorithm in version 2.0, without deferred rendering, so re-implementing it in Nebula 3 with DirectX 11 is a bit of a challenge. I was so in the dark I actually started to think I'd just been lucky with everything I'd ever done right, because however hard I tried I couldn't get the shader to switch between the shadow maps based on the depth of the pixel. I used a float array to send split distances from the CPU to the shader, but the comparison ALWAYS failed. After about 3 days of trying to get that part to work, I tried comparing against hard-coded values instead of the variable. To my amazement, that worked perfectly fine. By this point, I honestly started to wonder whether Nebula didn't set float arrays to the shader properly. I debugged the pixel shader in PIX, and the values were completely fine. The funny thing was that the compiled shader description clearly showed that the float array, consisting of 5 floats, was 80 bytes in size. 80! Last time I looked, a float was 4 bytes, and 5 * 4 = 20, not 80. That was when I realized what was wrong: for some reason, the compiler (or driver) seems to treat an array of float as equivalent to an array of float4, but can nonetheless fetch the values without using swizzling. When I changed the array into an ordinary float4 (obviously losing a value), the depth test worked perfectly! I'm going to be a bit careful here, because this could be simple ignorance, similar to the hull-domain shader problem, but there might be some sort of compilation problem going on, seeing as an array of float apparently had the same buffer size as a float4 array of equal length, and fetching the values always returned incorrect ones (I believe I got infinity, but it's somewhat hard to tell on a shader level).
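
For what it's worth, the HLSL constant buffer packing rules place every array element on its own 16-byte register, which matches the 80 bytes and would also explain why tightly packed data uploaded from the CPU ends up in the wrong places. A sketch of the two declarations:

// Sketch of the packing difference in a constant buffer.
cbuffer ShadowConstants
{
    // every array element starts on its own 16-byte register, so most of each element is padding
    float SplitDistancesArray[5];

    // four distances packed into a single register: this is what ended up working
    float4 SplitDistancesPacked;
};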

Right, back to PSSM. It currently looks like utter crap. It looks so bad, in fact, that I'm not even going to show you, because of shame and all that. There are two problems at the moment. The first is that all geometry gets rendered with the camera inverted, so everything is rendered backwards, resulting in very strange shadows, the funniest effect being that the object which casts shadows is shadowed by itself. The other problem (which may or may not be related to the first) is that the seams between the different split maps are clearly visible, because of the radial pattern that appears when comparing depths in screen space. If you've had the patience to read this far with the knowledge that you won't be seeing any pictures, I salute you! Oh, and we've started this year's game project, which is being made in Nebula, and it can be followed here: http://focus.gscept.com/gp2012-1/ and here: http://focus.gscept.com/gp2012-2/.

HBAO and ESM

Lately, I’ve been hard at work with the stuff I like the most, shading! First, I reimplemented the old Nebula exponential shadow maps (which looks great by the way) in DX11 for pointlights and spotlights. The direction light is a little trickier, but I will be all over that shortly. I also took the chance to remove the DSF depth buffer into a more high-precision depth-buffer. The old DSF buffer stored 8 bits normal id, 8 bits object id and 16 bits depth, whilst the new buffer stores 32 bits pure depth, removing all the halo-problems.

Bringing shadows to life sure wasn't easy, but it was well worth the while. Nebula had a limit of only 4 shadow-casting lights per frame; I figured I could boost that to 16 (newer hardware can handle it). I've also begun working on the global light shadow algorithm, and I thought I'd start by making the PSSM method work. The reason I chose PSSM is that major parts of it have already been implemented, but also because it seems like a very solid concept.

To handle AO, and to set the variables for it, I decided to make a new server for it, which sits right next to the light and shadow servers (seeing as it has to do with lighting). The method for computing screen-space ambient occlusion is called HBAO, which basically samples the depth buffer using an offset (random) texture. It uses the current point together with each sampled point to calculate not only the depth difference, but also the difference in angle, giving a strong occlusion effect if the angle between the two surface tangents differs a lot. More about the algorithm can be found here: http://www.nvidia.com/object/siggraph-2008-HBAO.html.

The introduction of the AOServer also allows for setting the AO variables live (whenever there is an interface available). Here are two pics showing the awesome new graphics.


One picture shows a real-time AO pass, and the other shows a scene with 3 shadow-casting lights. Pretty nice, right!


I've also found a couple of things I want to do with Nody. First, I changed it so one can create different types of render targets: if one wants to write to a 1, 2, 3 or 4-channel texture, one should have no problem doing so. I'm also considering making it possible to add and use render targets across frame shaders. For example, the AO pass uses the DepthBuffer from the main frame shader, and the main frame shader uses the AO buffer from the HBAO frame shader. I also want to be able to add an MRT or an RT directly to the output node, and then decide how many channels to use. When this is done, one should not be able to add new render targets or MRTs; this is to avoid silent errors which might occur if the render targets have their positions switched. Also, attaching render targets to the output nodes should let you pick from ANY frame shader, instead of just the current one.

It would also be awesome to be able to use compute shaders in Nody, as well as nodes which let you write code freely.

Plug-in, Nebula-out. Get it?

I’ve been hard at work getting a Nebula 3 plugin for Maya to have all the features we’d want. Why you may ask, isn’t the pipeline awesome as it is? Well, the answer is no, no it isn’t. It works, but it is far from smooth, and the Nebula plugin aims to address that. The plugin is currently only for Maya, but there will be a Motion Builder version as well. The plugin basically just wrap FBX exporting, and makes sure it’s suitable for Nebula by using a preset included with the Nebula distribution. It also runs the FBX batcher which converts the fbx-file to the model files and mesh files. It can also preview the mesh if it’s exported, and will export it if it doesn’t exist already. This allows for immediate feedback how the model will look in Nebula. It also tries preserve the shader variables, but it’s impossible to make it keep the material. That’s because DirectX doesn’t support setting a vertex layout with a vertex shader with a smaller input than the layout. This is a problem because converting a skin from static to skinned will cause Nebula to crash, seeing as the material is preserved between exports. So the plugin offers a way to get meshes, including characters, directly to Nebula, which is very nice indeed.

I’ve also been working with getting a complete Motion Builder scene into Nebula, and I actually got Zombie to work with all features. This means the skin, along with more than one animation clip (yay) can be loaded into Nebula seamlessly by simply saving the Motion Builder file and running the exporter. I will probably make a Nebula 3 plugin for Motion Builder as well, so we can have the exact same export and preview capabilities as in Maya.

I know I have been promising a video showing some of the stuff we've done, but I just haven't had the time! Right now I will start working on the documentation for our applications, so there are clear directions for anyone who wishes to use them (mainly our graphics artists here at Campus). The plugin already links from Maya to three different HTML docs, each of which will describe one of the tools. That's nice and all, except for the fact that the HTML docs are completely empty.

Characters continued

So I ran into a problem with characters when I tried to load the FBX example Zombie.fbx. It turns out the skeleton in that particular scene isn't keyed directly, but indirectly through an HIK rig. So when I tried to read the animations from the scene, I got nothing; every curve was a null pointer. I've tried back and forth just getting the character into the scene, but the HIK rig doesn't really NEED to follow the actual skeleton, so there is no sure way of knowing it will fit. Instead, I go into Maya and bake the simulation. This won't bake the animation per vertex, but it will make sure every joint is keyed identically to all its effectors. The reason I don't want to read characters in full is not just because of simple rig-to-joint connections, but also because of effectors: a skeleton might be linked to another HIK rig whose animations are not identical, just similar. Baking the simulation makes sure that every effector, along with every related skeleton, gets keyed into the skinned skeleton.

I really hope MotionBuilder can do the same, because otherwise we are going to be stuck using a single animation layer until the Maya FBX exporter gets support for exporting multiple takes.

I’ve also been working on getting the model files to update when one exports a previously existing Maya scene. The thing is that if one has spent lots of time modifying textures and shader variables, and then decides to tamper with the base mesh or possibly the entire scene, the model file will still retain the information previously supplied. This is a way to compensate for the fact that we can’t set shaders, textures and shader variables in Maya, but have to do it in an external program. When this is done, only meshes with identical names to those already existing in the model file will retain their attributes, all others have to be changed in the material editor.