Characters continued

So I ran into a problem with characters when I tried to load the FBX example Zombie.fbx. It turns out the skeleton in that particular scene isn’t keyed directly, but indirectly using a HIK rig. So when I tried to read the animations from the scene, I got nothing; every curve was a null pointer. I’ve gone back and forth trying to read the character straight from the scene, but the HIK rig doesn’t really NEED to follow the actual skeleton, so there is no exact way of knowing it will fit. Instead, I go into Maya and bake the simulation. This won’t bake the animation to be per-vertex; instead it makes sure every joint is keyed identically to all its effectors. The reason I don’t really want to read characters in total is not only the simple rig-to-joint connections, but also the effectors. A skeleton might be linked to another HIK rig where the animations are similar, but not identical. Baking the simulation makes sure that every effector, along with every possible related skeleton, gets keyed into the skinned skeleton.

I really hope MotionBuilder can do the same, because otherwise we are going to be stuck using a single animation layer until the Maya FBX exporter gets support for exporting multiple takes.

I’ve also been working on getting the model files to update when one exports a previously existing Maya scene. The thing is that if one has spent lots of time modifying textures and shader variables, and then decides to tamper with the base mesh or possibly the entire scene, the model file will still retain the information previously supplied. This is a way to compensate for the fact that we can’t set shaders, textures and shader variables in Maya, but have to do it in an external program. When this is done, only meshes with identical names to those already existing in the model file will retain their attributes, all others have to be changed in the material editor.
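The name-based update can be sketched like this. It is only a minimal illustration: MeshAttributes and mergeModel are hypothetical names, not the actual Nebula model-file classes, and dropping meshes that no longer exist in the scene is an assumption on my part.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-mesh state stored in a model file.
struct MeshAttributes {
    std::string material;
    std::string texture;
};

// Merge a re-exported scene into an existing model. Meshes whose names
// already exist keep their old attributes, new meshes get the defaults,
// and (by assumption) meshes that are gone from the scene are dropped.
std::map<std::string, MeshAttributes> mergeModel(
    const std::map<std::string, MeshAttributes>& existing,
    const std::vector<std::string>& exportedMeshNames,
    const MeshAttributes& defaults)
{
    std::map<std::string, MeshAttributes> merged;
    for (const std::string& name : exportedMeshNames) {
        assert(!name.empty());  // every exported mesh needs a name to match on
        const auto it = existing.find(name);
        merged[name] = (it != existing.end()) ? it->second : defaults;
    }
    return merged;
}
```

Renaming a mesh in Maya therefore counts as removing one mesh and adding another, which is why its attributes have to be set again in the material editor.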


I’ve been hard at work getting characters to work, and now they do, well sort of…

There seem to be two problems with characters at the moment. The first is that the Maya FBX exporter has no way of setting the skin and skeleton to bind pose before exporting (which is super important because the skeleton can be retrieved in bind pose, but the mesh can’t). If one doesn’t export while in bind pose, the mesh will not be relative to the skeleton, resulting in an incorrect mesh deformation. The second problem is that I can’t load the animations from the FBX examples such as Zombie.fbx. The mesh works, the skeleton works, but the animation clips don’t get loaded for some reason. I have yet to investigate this. The good news is that I actually can have an animated character! This means that the skin fragmentation, skeleton construction, animation clips (albeit only from Maya at the moment) and scene construction with a character actually work!

Now, if you found this thread because you were tearing your head off trying to understand how Maya stores joints in FBX, here is the deal. First of all, your KFbxPose is useless, seeing as you want your joints while they are in bind pose to begin with. The only thing you need is the KFbxNode for each joint, which is easily retrievable using a recursive algorithm to traverse your joints. Once you have this, all you want is the PreRotation (using KFbxNode::GetPreRotation(KFbxNode::eSOURCE_SET) ) and the current orientation, using KFbxNode::LclRotation.Get(). Your PreRotation corresponds to the Joint Orientation in Maya, and this rotation will be the basis for your joint. Now we have two vectors, where X, Y and Z correspond to the degrees of rotation around each axis. Note the use of degrees; if you want radians, this is where you want to convert. The PreRotation (or Joint Orientation) consists of three angular values, rotation around X, Y and Z, but they do not make up a free form transformation. To get a rotation matrix for these angles, you need to construct one rotation matrix each for X, Y and Z (using the axis 1,0,0 for X, 0,1,0 for Y and 0,0,1 for Z) using the axis-angle principle, where the angle is the rotation value for that axis. Multiply these three matrices together and we get our final rotation matrix for the joint.

The example code below shows what I just explained (Note: n_deg2rad converts degrees to radians):


KFbxVector4 preRotation = joint->fbxNode->GetPreRotation(KFbxNode::eSOURCE_SET);

// first calculate joint orientation
matrix44 xMat = matrix44::rotationx(n_deg2rad((float)preRotation[0]));
matrix44 yMat = matrix44::rotationy(n_deg2rad((float)preRotation[1]));
matrix44 zMat = matrix44::rotationz(n_deg2rad((float)preRotation[2]));
matrix44 totalMat = matrix44::multiply(matrix44::multiply(xMat, yMat), zMat);


That is not enough however. We also want your bone rotation at the moment of binding, which we get by getting the LclRotation as previously mentioned. Then apply the same principle to those values…


KFbxVector4 bindRotation = joint->fbxNode->LclRotation.Get();

// then calculate the bind value for the bone
matrix44 bindX = matrix44::rotationx(n_deg2rad((float)bindRotation[0]));
matrix44 bindY = matrix44::rotationy(n_deg2rad((float)bindRotation[1]));
matrix44 bindZ = matrix44::rotationz(n_deg2rad((float)bindRotation[2]));
matrix44 bindMatrix = matrix44::multiply(matrix44::multiply(bindX, bindY), bindZ);


Now we have both matrices, the bone matrix and the joint matrix. In games, we don’t really care about bones, seeing as we want a united joint matrix which we can use in our skinning shader. Anyhow, once we have our matrices, we also want to combine them to get the actual joint bind pose…


// multiply them
bindMatrix = matrix44::multiply(bindMatrix, totalMat);


If you want quaternions as rotations, which is the case with Nebula, just convert the matrix to a quaternion, otherwise keep it like this. This solution gives you the joints where they are NOT multiplied with their parents’ matrices, and this is because Nebula wants them unrelated. Nebula calculates the inverted bind pose for each joint when loading them, and simply multiplies the current joint with the parent’s inverted bind pose when evaluating the skeleton. Another way of solving this would be to get the KFbxPose, but the pose only gives you matrices which are premultiplied by all parents, which means the joints will be in world space, detached from their parents, which in turn means Nebula won’t be able to multiply them with their parents’ matrices. So, use this method if you want to skin in realtime using the algorithm JointMatrix = JointPoseInverted * (JointPose * JointParentPose).

So the conclusion is that Nebula wants the joints in local space (not multiplied by their parents), so that Nebula can multiply the parents afterwards. The reason is that the skeleton can have several animation clips running at the same time, which means the parent joints might be affected by two animations simultaneously, and thus the parent matrix cannot be pre-multiplied. This is probably the way most game engines would handle skeletons, seeing as there is minimal re-computation for maximal flexibility.
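To make the local-space idea concrete, here is a minimal sketch of evaluating a skeleton from parent-relative joints. It is not the Nebula implementation: Mat4, Joint, toWorld and skinningMatrices are hypothetical names, the matrices are row-major with row vectors (v' = v * M), and the inverse only handles rotation plus translation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major 4x4 matrix, row-vector convention, translation in the last row.
using Mat4 = std::array<float, 16>;

Mat4 identity() {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    return m;
}

Mat4 translation(float x, float y, float z) {
    Mat4 m = identity();
    m[12] = x; m[13] = y; m[14] = z;
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i * 4 + j] += a[i * 4 + k] * b[k * 4 + j];
    return r;
}

// Inverse of a rigid transform (rotation and translation only, no scale).
Mat4 inverseRigid(const Mat4& m) {
    Mat4 r = identity();
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r[i * 4 + j] = m[j * 4 + i];  // transpose the rotation part
    for (int j = 0; j < 3; ++j)
        r[12 + j] = -(m[12] * r[j] + m[13] * r[4 + j] + m[14] * r[8 + j]);
    return r;
}

struct Joint {
    int parent;  // index of the parent joint, -1 for the root
    Mat4 local;  // transform relative to the parent
};

// Multiplies every joint with its parent. Parents must precede children.
std::vector<Mat4> toWorld(const std::vector<Joint>& joints) {
    std::vector<Mat4> world(joints.size());
    for (std::size_t i = 0; i < joints.size(); ++i) {
        const int p = joints[i].parent;
        assert(p < static_cast<int>(i));
        world[i] = (p < 0) ? joints[i].local : multiply(joints[i].local, world[p]);
    }
    return world;
}

// skin[i] = inverse(bind pose in world space) * (current pose in world space)
std::vector<Mat4> skinningMatrices(const std::vector<Joint>& bind,
                                   const std::vector<Joint>& pose) {
    assert(bind.size() == pose.size());
    const std::vector<Mat4> bindWorld = toWorld(bind);
    const std::vector<Mat4> poseWorld = toWorld(pose);
    std::vector<Mat4> skin(bind.size());
    for (std::size_t i = 0; i < bind.size(); ++i)
        skin[i] = multiply(inverseRigid(bindWorld[i]), poseWorld[i]);
    return skin;
}
```

Because the parent multiplication happens in toWorld at evaluation time, the local pose of a parent can be blended from several clips first, and every child picks up the result automatically.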

I will post a video proving that it actually works when I’ve made it work for multiple animation clips (the Zombie.fbx problem).


EDIT: I found on the FBX discussion forums that all of this can be done with a single function call, KFbxNode::EvaluateLocalTransform… Thankfully, I’ve learned a lot about how it’s really done underneath the hood so I’m not bitter about it… Well OK, maybe a little bit…


The time I’ve had this past week has been focused on animations and characters in Nebula. There is a big difference between exporting characters in the new installment of Nebula compared to the old. The biggest feature is that one can have several characters in one Maya scene, and have them exported as several individual characters! There is a pretty big difference between characters and ordinary static objects in Nebula, mainly in their model files (.n3). You see, a model file describes a scene, which usually starts with a transform node describing the global bounding box for the entire scene. This node then holds all meshes in the scene, with all their corresponding values for material, texture and variables. However, characters are much different! They have another parent node, called CharacterNode, which describes the character skeleton. All meshes described within the CharacterNode are counted as skins to the skeleton, which in turn means they have to be skinnable! This means that having both characters and static objects in the same scene is impossible with the current design. One might ask why I don’t just add a root node which contains a CharacterNode with all its skins, and then have all the other nodes parallel to that node. Well, you see, Nebula has to decide whether a MODEL is a character or a static mesh, so combining both static meshes and characters would cause big problems. This also means every single skeleton needs its very own model. Currently, the batcher decides whether a Maya scene should be a character in Nebula or a static mesh. One could put all static objects into one model and every character into a separate one, if it weren’t for the problem of giving each of them a proper name! So one has to choose whether to make an ordinary static object scene or a character scene, so that’s that!

And of course, the biggest problem is getting the skeletons, animation curves and skinning to work properly, seeing how many variables there are that can go wrong. Currently I think I’ve managed to get the skeleton working properly, seeing as I can have a box using three joints, unanimated, and it looks correct. However, as soon as I apply an animation, it breaks. The image to the left shows how it looks after animation, and the right one before animation.









I also realized that Nebula only accepts skins which use 72 or fewer joints, which means that more complex models need to be split into smaller fragments, where each fragment can use 72 or fewer joints. I should have this done by the end of the week unless something very time consuming turns up.
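One simple way to do the splitting is a first-fit greedy pass over the triangles. The sketch below is an illustration under that assumption, not the fragmentation code that ends up in the batcher; Fragment and fragmentSkin are made-up names, and each triangle is represented only by the joints its vertices reference.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// The joints referenced by the vertices of one triangle.
using TriangleJoints = std::vector<int>;

struct Fragment {
    std::set<int> joints;                // distinct joints used by this fragment
    std::vector<std::size_t> triangles;  // indices into the input triangle list
};

// Put each triangle into the first fragment that stays within maxJoints
// after adding the triangle's joints, or open a new fragment if none fits.
std::vector<Fragment> fragmentSkin(const std::vector<TriangleJoints>& triangles,
                                   std::size_t maxJoints) {
    std::vector<Fragment> fragments;
    for (std::size_t t = 0; t < triangles.size(); ++t) {
        const std::set<int> needed(triangles[t].begin(), triangles[t].end());
        assert(needed.size() <= maxJoints);  // a single triangle must fit on its own
        bool placed = false;
        for (Fragment& f : fragments) {
            std::set<int> merged = f.joints;
            merged.insert(needed.begin(), needed.end());
            if (merged.size() <= maxJoints) {
                f.joints = std::move(merged);
                f.triangles.push_back(t);
                placed = true;
                break;
            }
        }
        if (!placed)
            fragments.push_back(Fragment{needed, {t}});
    }
    return fragments;
}
```

Calling fragmentSkin(triangles, 72) would give fragments that each respect the limit. First-fit does not guarantee the smallest possible number of fragments, but it is easy to reason about.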

I’ve also been collaborating with my colleagues and we’ve started wrapping our programs together, mainly by designing a central class for handling settings. For example, if I set the project directory in Nody, it should be remembered by all toolkit applications so that one doesn’t need to reset it everywhere if one is to change the working directory.


The past two weeks have all been centered around content, and how to get content into Nebula. We’ve been working on a format called Alembic, which provides easy-to-export plugins for Maya. We just recently realized though, that the Maya plugin doesn’t connect skins and skeletons, so there is no way of knowing what skin goes to what skeleton, bah! Instead, Maya animates every single vertex individually, so a 5 megabyte .mb-file becomes a 50 megabyte .abc file! Not only is it space inefficient, it’s also an enormous setback in performance when animating. Anyhow, we decided to revert back to FBX, because we realized it would be much easier to dig into FBX (mainly because there has been some work done by a couple of my old classmates). They made an exporter which allowed you to make a character in Maya, export it to FBX, and then use it in Nebula. And while that is rather nice, the application was quite difficult to expand on.

So we made a new one! It basically uses all the same things at the moment, except in a way more modular way. For example, if one has several meshes in one Maya scene, the FBX batcher will export every single mesh node to its own file, and if there is only one mesh, it will create a file with the same name as the counterpart. Also, the batcher will create a model (basically scene graph fragments) which holds every single mesh in your Maya scene as a separate mesh, thus allowing you to modify them by attaching different materials, variables etc. The only thing that is left to do with it is to allow parenting of objects. Seeing as Nebula models already handle parenting by having a node hierarchy, one could just as easily make sure the meshes come in the exact same hierarchy in the model as they do in Maya. The only problem with this is that there might be some unexpected behavior when animating. This remains to be seen, however I fear that parenting with characters will be a feature to add in the very near future.

Right, content, where was I. Yes, meshes, ok, we need meshes, and we have meshes. Although it wasn’t trouble free. Nebula has a set of tools which greatly help with mesh importing: the MeshBuilder, MeshBuilderVertex and MeshBuilderTriangle. These three classes are all you need to represent a mesh. Getting data from FBX was also trivial, but getting data into the MeshBuilder wasn’t. Since a vertex can have several UV-coordinates, and the FBX models are compressed in such a manner that they have an index list and a data list for every piece of vertex information, it would require extensive parsing just to know how many vertices one would need! Instead, the MeshBuilder has a function called inflate, which makes sure every triangle has its own unique set of vertices. Thus, one can traverse the index lists and set the data with ease, and then just remove the redundant vertices, right? WRONG! Retrieving bitangents and tangents from FBX resulted in every single vertex being unique, which in turn resulted in the MeshBuilder removing 4 vertices when cleaning up. The removal resulted in a destruction of the mesh, and this was of course not acceptable. So instead we decided to get the UVs and normals, which are only unique for some vertices, and then deflate (remove redundancies) and calculate the bitangents and tangents ourselves. No more exploding meshes = mission accomplished.
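The deflate step can be pictured as follows. This is a minimal sketch of removing redundant vertices from an inflated triangle list, not the actual MeshBuilder code; Vertex, IndexedMesh and deflate are hypothetical names, and vertices are only merged when they are bit-identical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

struct Vertex {
    float px, py, pz;  // position
    float nx, ny, nz;  // normal
    float u, v;        // texture coordinate
};

using VertexKey = std::tuple<float, float, float, float, float, float, float, float>;

// Key used for exact comparison between vertices.
static VertexKey makeKey(const Vertex& a) {
    return std::make_tuple(a.px, a.py, a.pz, a.nx, a.ny, a.nz, a.u, a.v);
}

struct IndexedMesh {
    std::vector<Vertex> vertices;        // unique vertices only
    std::vector<std::uint32_t> indices;  // three indices per triangle
};

// Takes an inflated vertex list (three vertices per triangle, nothing shared)
// and rebuilds it as unique vertices plus an index list.
IndexedMesh deflate(const std::vector<Vertex>& inflated) {
    assert(inflated.size() % 3 == 0);
    IndexedMesh mesh;
    std::map<VertexKey, std::uint32_t> seen;
    for (const Vertex& vtx : inflated) {
        const VertexKey key = makeKey(vtx);
        auto it = seen.find(key);
        if (it == seen.end()) {
            const auto index = static_cast<std::uint32_t>(mesh.vertices.size());
            it = seen.emplace(key, index).first;
            mesh.vertices.push_back(vtx);
        }
        mesh.indices.push_back(it->second);
    }
    return mesh;
}
```

This also shows why the tangent data caused trouble: if every inflated vertex carries a slightly different tangent, no two keys compare equal and next to nothing gets merged.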

This wasn’t all however. It turns out Nebula saves meshes in a compressed format, where positions are stored raw, normals, bitangents and tangents are each saved as a single int, and the texture coordinates are saved as one int. They use a packing technique where the RG and BA components of the vector describe where in tangent space the normal is pointing, which in turn gives you a decent precision. Texture coordinates are saved as two 16 bit unsigned shorts, so a complete texture coordinate is only the size of an int. Texture coordinates need more precision than normals, but they certainly don’t need 32 bits per component. Right, I thought you might want to see some proof, so I grabbed a picture for you.

This shows two models made in Maya and exported using the FBX batcher. The artifacts you might see on the inside of the “sphere” are a glitch caused by the SSAO shader. The character is dressed with the eagle texture, so that explains her being transparent in some areas.
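As an illustration of the texture coordinate packing, here is a minimal sketch of squeezing a UV pair into one 32-bit int. The mapping of [0, 1] onto the full 16-bit range is my assumption for the example; the scale Nebula actually uses may differ (for instance to leave room for tiling coordinates outside [0, 1]), and packUV and unpackUV are made-up names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Pack u into the low 16 bits and v into the high 16 bits.
// Assumption: both coordinates are in [0, 1]; anything else is clamped.
std::uint32_t packUV(float u, float v) {
    const auto quantize = [](float x) {
        const float clamped = std::min(std::max(x, 0.0f), 1.0f);
        return static_cast<std::uint32_t>(std::lround(clamped * 65535.0f));
    };
    return quantize(u) | (quantize(v) << 16);
}

// Reverse of packUV, accurate to within one step of 1/65535.
void unpackUV(std::uint32_t packed, float& u, float& v) {
    assert(&u != &v);  // the two outputs must be distinct variables
    u = static_cast<float>(packed & 0xFFFFu) / 65535.0f;
    v = static_cast<float>(packed >> 16) / 65535.0f;
}
```

With 16 bits per component, a 4096 pixel wide texture still gets 16 sub-steps per texel, which is why this is plenty for texture coordinates.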

It’s alive!

So I’ve been working on Nody, trying to get it to be as flexible and as intuitive as possible. Thus far, I can modify shaders (which in turn actually rewrites the shader code) and have them presented to me in real-time, which is pretty neat. What I wanted to do when that worked was to be able to set variables such as textures, directly from Nody, so that one can preview how a shader would look on a specific model with a specific set of variables. That’s what got me into the texture compression part (see last post).

Nody will not serve as a texturing tool, seeing as its purpose is to create shaders, and that’s it. The reason for this is that Nody is supposed to work on a per-shader level. Nebula uses 3 different levels: resource, template and instance. The first, resources, is the most general of the three; it can be a mesh, a shader, a texture, a sound file, or any other type of resource one might need. The second level, template, describes for example a model, which is a collection of resources and resource states, such as shader variables, texture attachments, sound attachments, animations, skeletons etc. The third level is instance, which is what you actually use in your game. Nody is a level 1 tool, meaning its purpose is to handle a resource, in this case a shader. A colleague of mine is currently working on a level 2 tool which will be a part of the level editor. This tool is called the material editor, and it allows a user to switch materials (not the shaders within the material), textures, variables etc. It’s basically meant to change the model file, which is used as a template for all instances. On the instance level, very little is changed resource-wise. One might want to change a certain variable, but that is pretty much as far as you go. One might want to be able to have variations without really changing the model file, and that is fine, as long as one keeps track of the variable name and sets it correctly.

When picking textures in Nody, Nody will present your working directory, where you have your image files in raw format. Whenever a texture is picked, Nody will look for the presence of an exported version, and if it doesn’t exist, calls the texture batcher to export it.

Still, one might want to be able to test a shader to every extent before deciding it’s exactly what they want, and that means variables and textures have to be testable online. By online I mean without having to restart either application. I haven’t really had the time to make a video yet, but I’m working on getting one out, so you can see how powerful this tool is. It’s really cool too.

Texture compression

My focus this past week has been to improve the usability of Nody. The most obvious thing I came up with was the ability to preview textures in the actual node window, so as to allow the user to see how different textures will look with the given shader. There were two major problems with this. First, is it wiser to use the work folder (containing a very broad mixture of texture formats) or the export folder? Second, how do I deal with the formats I’m faced with?

Turns out Qt has this covered for every single file format other than TGA and PSD, which are the work formats currently used in Nebula. I started off by basically copy-pasting a TGA loader done by the Qt crew. This loader is not available in the standard Qt package, but requires Qt3D, so I thought I’d rather not pull in more modules and instead just copy the code. Then I tried figuring out how PSD works, and while it’s pretty straightforward, it’s still very hard to figure out exactly how the bytes are laid out.

In the PSD and PSB specification, it says the PSD file consists of 4 sections: header, color modes, layer and mask information, and image data. I want the header (for sizes, bit depths and such) and the raw image data. Also, PSD files are RLE-compressed, a lossless compression method which results in a very small size if there is little to no variation in the source. In case you are too lazy to google, it works by having each scan-line (row in unfancy terms) be split into segments, where first you have one byte describing how many of the same pixel will appear in a row, followed by the data for those pixels, depending on pixel depth. So for example, I could have 15 pixels in a row which are pure red, and with an R8 G8 B8 image that would give me something like: 15 255 0 0.

PSD supports up to 56 channels, so each pixel could potentially take 56 bytes even at one byte per channel, which is a lot of information. The specification also states that each row begins with the total byte count for that row, followed by the RLE-compressed data. What’s strange about that is the need to state how many bytes you have, because you already know from each RLE-package how much you are going to receive. If I know I have 3 channels and 8-bit coloring, I can also be sure how many bytes will follow each pixel repeat counter.
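Decoding a run-length scheme like the one described above can be sketched in a few lines. Note that this follows the simplified description (a count byte followed by one pixel) and is not a drop-in PSD reader: the PackBits variant that PSD files use also allows literal, non-repeated runs. decodeRunLength is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands segments of the form [count][pixel bytes...] into raw pixels.
// Example input for R8 G8 B8: 15 255 0 0 -> fifteen pure red pixels.
std::vector<std::uint8_t> decodeRunLength(const std::vector<std::uint8_t>& encoded,
                                          std::size_t bytesPerPixel) {
    assert(bytesPerPixel > 0);
    assert(encoded.size() % (1 + bytesPerPixel) == 0);
    std::vector<std::uint8_t> pixels;
    for (std::size_t pos = 0; pos < encoded.size(); pos += 1 + bytesPerPixel) {
        const std::uint8_t count = encoded[pos];
        const auto first = encoded.begin() + static_cast<std::ptrdiff_t>(pos + 1);
        const auto last = first + static_cast<std::ptrdiff_t>(bytesPerPixel);
        for (std::uint8_t n = 0; n < count; ++n)
            pixels.insert(pixels.end(), first, last);  // repeat the pixel count times
    }
    return pixels;
}
```

A row that stores 15 red pixels followed by 2 green ones takes 8 bytes encoded and 51 bytes decoded, which is where the savings on flat images come from.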

Halfway through this though, I got some advice that I maybe shouldn’t focus on handling every single work format there was, but instead reading the exported format, DDS.

DDS means DirectDraw Surface, and is nothing more than a container for an actual texture. The DDS header, specified here: has all the information necessary to reach the actual image data. This is the part where stuff gets tricky. DDS can contain raw textures, which are easily read byte for byte. DDS can also contain compressed formats, such as DXT1, DXT3 or DXT5. If one likes to read specifications on how these compression algorithms work, one might want to visit: The compression algorithms described are called block compressions. A block consists of 16 texels, which are reduced to a data structure that takes less room than the original raw data while losing as little quality as possible. If one is too lazy to read, I can give you a fast introduction to how to decode these files.

First we have DXT1. DXT1 saves two 16-bit colors, which are the two extreme colors of a block. Then a 32-bit integer is saved, which contains the codes for all 16 texels. Each texel only needs two bits (remember BITS) to describe what color it should use. The two colors read from the image, let’s call them colorOne and colorTwo, can be blended into two more, colorThree and colorFour. Together, they can be indexed using the decimal values 0, 1, 2, 3, or binary values 00, 01, 10, 11. Remember how every texel has two bits? Well, these two bits are just what you need to count to 3. So each texel has its own two index bits masked out, and these are matched against a list of colors, so the appropriate texel can retrieve the appropriate color. I solved this by using an array, which saves the colors in order, so addressing the array with index 00 would give me colorOne, 01 colorTwo etc. Getting the indices from the actual int simply required a bit shift to the appropriate bit, and then an & with binary 11 to get their value. I almost forgot that if colorOne is not bigger than colorTwo, one has to set colorFour to 0 (fully transparent), and make colorThree the plain average of colorOne and colorTwo, instead of the one-third and two-thirds blends used otherwise. This is to accommodate alpha, and as you can see, DXT1 only supports binary alpha.
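Put together, decoding a single DXT1 block could look like the sketch below. It is a minimal illustration, not the code used in Nody: the names are made up, the blends use plain integer division (real decoders may round differently), and the three-color mode with a transparent fourth entry is selected when colorOne is not greater than colorTwo.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

struct Rgba { std::uint8_t r, g, b, a; };

// Expand a 16-bit RGB565 color to 8 bits per channel.
Rgba fromRgb565(std::uint16_t c) {
    const int r5 = (c >> 11) & 0x1F;
    const int g6 = (c >> 5) & 0x3F;
    const int b5 = c & 0x1F;
    return { static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2)),
             static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4)),
             static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2)),
             255 };
}

// Weighted blend (wa * a + wb * b) / (wa + wb) per channel, always opaque.
Rgba mix(const Rgba& a, const Rgba& b, int wa, int wb) {
    const int d = wa + wb;
    const auto ch = [&](int x, int y) {
        return static_cast<std::uint8_t>((wa * x + wb * y) / d);
    };
    return { ch(a.r, b.r), ch(a.g, b.g), ch(a.b, b.b), 255 };
}

// Decode one 8-byte DXT1 block into 16 texels, row by row.
std::array<Rgba, 16> decodeDxt1Block(const std::uint8_t* block) {
    assert(block != nullptr);
    const auto c0 = static_cast<std::uint16_t>(block[0] | (block[1] << 8));
    const auto c1 = static_cast<std::uint16_t>(block[2] | (block[3] << 8));

    std::array<Rgba, 4> palette;
    palette[0] = fromRgb565(c0);
    palette[1] = fromRgb565(c1);
    if (c0 > c1) {
        // four-color mode: two blends at one third and two thirds
        palette[2] = mix(palette[0], palette[1], 2, 1);
        palette[3] = mix(palette[0], palette[1], 1, 2);
    } else {
        // three-color mode: one average, the last entry is transparent
        palette[2] = mix(palette[0], palette[1], 1, 1);
        palette[3] = Rgba{0, 0, 0, 0};
    }

    const std::uint32_t codes = static_cast<std::uint32_t>(block[4])
                              | (static_cast<std::uint32_t>(block[5]) << 8)
                              | (static_cast<std::uint32_t>(block[6]) << 16)
                              | (static_cast<std::uint32_t>(block[7]) << 24);
    std::array<Rgba, 16> texels;
    for (int i = 0; i < 16; ++i)
        texels[i] = palette[(codes >> (2 * i)) & 0x3];  // two bits per texel
    return texels;
}
```

A full texture is then just a loop over 4x4 blocks, copying each decoded block into the right place in the output image.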

Then we have DXT3. DXT3 has the exact same color layout as DXT1. What differs is that DXT3 can handle non-binary alpha. The structure starts with a set of 8 bytes containing the alpha values of all 16 texels, 4 bits each. Once that is read, the rest is cake. When we have the color index for our texel, all we need to do is fetch the two alpha bytes for that texel’s row, multiply the second byte by 256 and add the first to create a short, and then pick out the texel’s 4 bits. Slam dunk done.
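For reference, here is a minimal sketch of reading the explicit alpha part of a DXT3 block. It reads the 4-bit values byte by byte rather than building a short per row, and scaling by 17 to reach the 0 to 255 range is a common choice rather than something taken from Nody; decodeDxt3Alpha is a made-up name.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Decode the 8 alpha bytes at the start of a DXT3 block into 16 values.
// Each byte holds two texels: the low 4 bits first, then the high 4 bits.
std::array<std::uint8_t, 16> decodeDxt3Alpha(const std::uint8_t* alphaBytes) {
    assert(alphaBytes != nullptr);
    std::array<std::uint8_t, 16> alpha{};
    for (int texel = 0; texel < 16; ++texel) {
        const int byte = alphaBytes[texel / 2];
        const int nibble = (texel % 2 == 0) ? (byte & 0x0F) : (byte >> 4);
        alpha[texel] = static_cast<std::uint8_t>(nibble * 17);  // 0..15 -> 0..255
    }
    return alpha;
}
```

The 8 color bytes that follow are decoded as in DXT1, except that DXT3 blocks are, as far as I can tell, always treated as four-color blocks since alpha is stored separately.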

DXT5 has an even more complex way of handling alpha. Instead of simply having an alpha value per texel, DXT5 stores alpha as a palette, much like the colors, and then uses interpolation to find the wanted alpha value. So the structure starts with two alpha bytes, our extreme values. Then there are 6 bytes of alpha indices. Why 6 you may ask? Well, remember how 4 bytes could index our colors in a very nice way, with 2 bits per texel? Alpha needs 3 bits (as I said before, BITS again) per texel to be used as an index. So expand 4 bytes by half and you get 6 bytes. The fancy thing about DXT5 is that alpha has to be calculated just like we calculated colors, but now we have more alpha values than we have colors. A finer grain of alpha values means our indices must be bigger, so instead of just having 00, 01, 10 and 11, we have 000, 001, 010, 011, 100, 101, 110, 111. These values are then used to sample the alpha. The tricky part with this though, is that there is no built-in type which holds 6 bytes, so we have to divide the data into one short and one int. The real problem comes when we need to get our indices, since one index straddles the border between the short and the int. The easiest way to handle this is to simply count: we start at the short, which ends after 16 bits, so when we are at bit 15 we take one bit from the short and two bits from the int. This is our transition zone. This sample code explains it in detail.


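// bit offset of the texel at row i, column j (3 bits per texel)
int alphaCodeIndex = 3 * (4 * i + j);
int alphaCode;

if (alphaCodeIndex <= 12)
    // all three bits are inside the short
    alphaCode = (alphaCode2 >> alphaCodeIndex) & 0x07;
else if (alphaCodeIndex == 15)
    // one bit from the short, two bits from the int
    alphaCode = (alphaCode2 >> 15) | ((alphaCode1 << 1) & 0x06);
else // alphaCodeIndex >= 18 && alphaCodeIndex <= 45
    // all three bits are inside the int
    alphaCode = (alphaCode1 >> (alphaCodeIndex - 16)) & 0x07;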


alphaCodeIndex is a running bit offset for the texel at row i and column j, so it always moves in three bit increments. When we go past the size of the short, we need to subtract the size of the short from the index, seeing as we need to ‘start over’ when we bit shift the int. I can’t take any credit for this clever solution, seeing as I found it at:
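For comparison, here is a minimal sketch that decodes the whole DXT5 alpha part in one go. It swaps the short-and-int split for a single 64-bit integer, so the transition zone disappears; that is a different technique from the code above, not a copy of it. It also builds the alpha palette, including the case where the first alpha is not greater than the second, in which the format reserves the last two entries for 0 and 255. decodeDxt5Alpha is a made-up name.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Decode the 8-byte alpha part of a DXT5 block into 16 alpha values.
std::array<std::uint8_t, 16> decodeDxt5Alpha(const std::uint8_t* block) {
    assert(block != nullptr);
    const int a0 = block[0];
    const int a1 = block[1];

    // Build the eight entry alpha palette from the two extreme values.
    std::array<std::uint8_t, 8> palette{};
    palette[0] = block[0];
    palette[1] = block[1];
    if (a0 > a1) {
        for (int i = 1; i <= 6; ++i)  // six interpolated values
            palette[1 + i] = static_cast<std::uint8_t>(((7 - i) * a0 + i * a1) / 7);
    } else {
        for (int i = 1; i <= 4; ++i)  // four interpolated values
            palette[1 + i] = static_cast<std::uint8_t>(((5 - i) * a0 + i * a1) / 5);
        palette[6] = 0;
        palette[7] = 255;
    }

    // Gather the 6 index bytes into one 64-bit value, lowest byte first.
    std::uint64_t codes = 0;
    for (int i = 0; i < 6; ++i)
        codes |= static_cast<std::uint64_t>(block[2 + i]) << (8 * i);

    std::array<std::uint8_t, 16> alpha{};
    for (int texel = 0; texel < 16; ++texel)
        alpha[texel] = palette[(codes >> (3 * texel)) & 0x7];  // three bits per texel
    return alpha;
}
```

Texel 5 is the one whose index starts at bit 15, so it is a handy test case for checking that both approaches agree.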

When alphaCode is retrieved, it is used to get an indexed alpha value, and the decompression is done. This is the result in Nody:


The image shows us one bump map (to the left) compressed with DXT5 and one diffuse map (to the right) compressed with DXT1 being rendered in real-time in Nody. The next step is to actually see the texture being applied whenever this happens. I couldn’t be more excited!