Release soon.

The time has finally come to start working on the public release of our Nebula3 version. In the true spirit of the days when Andre still worked at Radonlabs, we will release our work on N3 under the same 2-clause BSD license. We were thinking of integrating some of the things we are working on first, but if you go down that road you never release anything, so we decided to go ahead, release, and keep working after that. There are still some things in the pipeline, such as a fully working OpenGL4 port (using AnyFX, which Gustav is working on), a fully working Havok integration (mostly done) and a rewrite of the network layer.

Currently most of the work left is cleaning up random code, adding proper copyright/license headers, revamping the build system a bit so that it is more newcomer friendly, and above all creating some nice demos with content we have made here. It should hopefully be done by next week, so be ready!

And in other news, glad midsommar (happy Midsummer)! ;D

AnyFX progress

Designing a programming language, even one as simple as AnyFX, is hard work. It's hard work because there are so many little things you miss during initial development and planning. Anyways, here is the progress with AnyFX so far. The video below shows a shader implemented in AnyFX for OpenGL which uses vertex, hull, domain, geometry and pixel shading. The hull and domain shaders tessellate and displace the cube into a sphere, and the geometry shader is used as a single-pass wireframe renderer, as described and implemented here: http://prideout.net/blog/?p=48. In the video you can also see how I dynamically change the inner and outer tessellation factors using an AnyFX variable.

[Video: anyfx]

The next step on the list is compute shaders. When they work properly and I’m satisfied with how they are handled, I’m going to start integrating this into Nebula.

AnyFX, what the fuzz?

As a part of my studies, I've been developing a very simple programming language, very similar to Microsoft FX for effects. The difference between AnyFX and Microsoft FX is that AnyFX is generic, meaning it will work with any back-end implementation. The language works by supplying everything BESIDES the actual shader code that we need in order to render. This means that we put back-end specific code in the shader bodies. Why, you may ask? Well, it would be risky and likely poorly optimized if we were to define our own language for intrinsics, function calling conventions etc. and translate it directly to graphics assembly. Instead, we rely on the vendor-specific back-end compilers to do the heavy work for us. As such, we can very easily port our old HLSL/FX or GLSL shaders by simply copying the function bodies straight into an AnyFX file. However, this means we potentially have to provide several files in order to support different shader back-ends, and yes, that is a fair criticism. We could implement a language with several function bodies, one for each implementation, but then it wouldn't look like C anymore, and the code could get messy in a hurry. Sounds confusing? Well, here's an example:

//------------------------------------------------------------------------------
// demo.fx
// (C) 2013 Gustav Sterbrant
//------------------------------------------------------------------------------

// This is an example file to be used with the AnyFX parser and API.
profile = glsl4;

// A couple of example variable declarations
sampler2D DiffuseTexture;
sampler2D NormalTexture;

state OpaqueState;
state AlphaState
{
    DepthEnabled = true;
    BlendEnabled[0] = true;
    SrcBlend[0] = One;
    DstBlend[0] = One;
};

// A variable block containing a set of variables; this will be instantiated only once in the effects system.
// This block of variables will be shared by all other .fx files compiled during runtime with the same name and the [shared] qualifier.
varblock Transforms
{
    mat4 View;
    mat4 Projection;
};

mat4 Model;

varblock Material
{
    float SpecularIntensity = float(1.0f);
    vec4 MaterialColor = vec4(1.0f, 0.0f, 0.0f, 1.0f);
};

//------------------------------------------------------------------------------
/**
Simple vertex shader which transforms basic geometry.

The function header here complies (and has to comply) with the AnyFX standard, although the function body is written in a specific target language.

In this example the body is written in GLSL.
*/
void
vsStatic(in vec3 position, in vec2 uv, out vec2 UV)
{
    gl_Position = Projection * View * Model * vec4(position, 1.0f);
    UV = uv;
}

//------------------------------------------------------------------------------
/**
Simple pixel shader which samples the diffuse texture.

Render target outputs are selected with input/output attributes such as [color0]; declaring more outputs gives multiple render targets.

We also apply a function attribute which tells OpenGL to perform early depth testing.
*/
[earlydepth]
void
psStatic(in vec2 uv, [color0] out vec4 Color)
{
    Color = texture(DiffuseTexture, uv);
}

//------------------------------------------------------------------------------
/**
Two programs; they share shaders but not render states, and each also provides a data field readable through the API.
*/
program Solid [ string Mask = "Static"; ]
{
    vs = vsStatic();
    ps = psStatic();
    state = OpaqueState;
};

program Alpha [ string Mask = "Alpha"; ]
{
    vs = vsStatic();
    ps = psStatic();
    state = AlphaState;
};

 

So, what's fancy here? Well, first of all, we can define variables for several shader programs (yay!). A program combines a vertex shader, a pixel shader, optional hull, domain and geometry shaders, together with a render state. A render state defines everything required to prepare the graphics card for rendering: depth testing, blending, multisampling, alpha-to-coverage, stencil testing etc. Basically, for you DX folks out there, this is a Rasterizer, DepthStencil and BlendState combined into one simple object.

You may notice that we write all the variable types with the GLSL type names. However, we could just as well use float1-4, matrix1-4x1-4, i.e. the HLSL style; the compiler treats them equally. You may also notice the 'profile = glsl4', which just tells the compiler to generate GLSL code as the target. By generate code, I mean the vertex input methodology (which differs between most implementations); the profile is also used to transform the [earlydepth] qualifier into the appropriate GLSL counterpart.

We can also define variable blocks, called 'varblock', which handle groups of variables as buffers. In OpenGL this is known as a Uniform Buffer Object, and in DirectX it's a Constant Buffer. We also have fancy annotations, which allow us to attach metadata straight to the objects of interest. We can for example insert strings telling what type of UI handle we want for a specific variable, or a feature mask for our programs, etc.

Since textures are very, very special in both GLSL and HLSL, we define a combined object called sampler2D. We can also define samplers, which are handled in DirectX as shader-code-defined objects and in OpenGL as CPU-side settings. In GLSL we don't need both a texture and a sampler object to sample from a texture, but in HLSL4+ we do, so in that case the generated code will quite simply put the sampler object in the code. Finally, we can define qualifiers for variables, such as the [color0] you see in the pixel shader, which means the output goes to the 0th render target. AnyFX currently supports a plethora of qualifiers, but only one qualifier per input/output.
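To make the varblock idea a bit more concrete, here is a rough illustration (not AnyFX source, and not necessarily what the generated code looks like) of what a block like Transforms boils down to on a GL4 back-end: a single uniform buffer object that is filled in one go instead of one glUniform* call per variable. The buffer handle names and the binding point are assumptions made for this sketch.

#include <GL/glew.h>
#include <cstring>

// mirrors the 'Transforms' varblock from demo.fx; two mat4s, which lay out cleanly under std140
struct TransformsBlock
{
    float View[16];
    float Projection[16];
};

GLuint transformsUbo = 0;

void SetupTransformsUbo()
{
    glGenBuffers(1, &transformsUbo);
    glBindBuffer(GL_UNIFORM_BUFFER, transformsUbo);
    glBufferData(GL_UNIFORM_BUFFER, sizeof(TransformsBlock), NULL, GL_DYNAMIC_DRAW);
}

void UpdateTransformsUbo(const float* view, const float* projection)
{
    TransformsBlock block;
    std::memcpy(block.View, view, sizeof(block.View));
    std::memcpy(block.Projection, projection, sizeof(block.Projection));

    // upload the whole block in one call and attach it to binding point 0,
    // which the generated GLSL uniform block would be expected to use
    glBindBuffer(GL_UNIFORM_BUFFER, transformsUbo);
    glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(TransformsBlock), &block);
    glBindBufferBase(GL_UNIFORM_BUFFER, 0, transformsUbo);
}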

Anyways, to use this, we simply do this:

 

this->effect = AnyFX::EffectFactory::Instance()->CreateEffectFromFile("compiled");
this->opaqueProgram = this->effect->GetProgramByName("Solid");
this->alphaProgram = this->effect->GetProgramByName("Alpha");
this->viewVar = this->effect->GetVariableByName("View");
this->projVar = this->effect->GetVariableByName("Projection");
this->modelVar = this->effect->GetVariableByName("Model");
this->matVar = this->effect->GetVariableByName("MaterialColor");
this->specVar = this->effect->GetVariableByName("SpecularIntensity");
this->texVar = this->effect->GetVariableByName("DiffuseTexture");

 

Then:

 

// this marks the use of AnyFX, first we apply the program, which enables shaders and render states
this->opaqueProgram->Apply();

// then we update our variables, seeing as our variables are global in the API but local internally, we have to perform Apply first
this->viewVar->SetMatrix(&this->view[0][0]);
this->projVar->SetMatrix(&this->projection[0][0]);
this->modelVar->SetMatrix(&this->model[0][0]);
this->matVar->SetFloat4(color);
this->specVar->SetFloat(1.0f);
this->texVar->SetTexture(this->texture);

// finally, we tell AnyFX to commit all changes done to the variables
this->opaqueProgram->Commit();

 

Aaaand render. We have some restrictions, however. First, we must run Apply on our program before we are allowed to set the variables. This fits nicely into many game engines, since we first apply all of our shader settings, then set our per-object variables, and lastly render. We also run the Commit command, which updates all variable buffers in a batched manner; this way we don't need to update the variable block once per variable, which could seriously stress the memory bandwidth. When all of this is said and done, we can perform the rendering. We need to perform Apply first because each variable may have different binding points in different shaders: in OpenGL, each uniform has a location in a program, and since different programs may use any subset of all declared variables, the locations are likely to differ. In HLSL4+, where everything lives in constant buffers, Commit is vital since the buffers have to be updated at some point.
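To make the ordering concrete, a per-frame loop would look roughly like the sketch below. This is only an illustration: the Object struct, the opaqueObjects container and DrawMesh() are made up for the example and are not part of AnyFX.

// apply once per batch: binds the shaders and the render state
this->opaqueProgram->Apply();

for (const Object& obj : this->opaqueObjects) // hypothetical per-object container
{
    // per-object variables; only legal after Apply, since binding points differ per program
    this->modelVar->SetMatrix(&obj.transform[0][0]);
    this->matVar->SetFloat4(obj.color);

    // flush all variable/varblock changes in one batched update, then draw
    this->opaqueProgram->Commit();
    DrawMesh(obj.mesh); // stand-in for the engine's actual draw call
}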

All in all, the language allows us to move some functionality to compile time. For OpenGL, we can perform compile-time linking by simply testing whether our shaders will link together. We can also obfuscate the GLSL code, so that nobody can simply read the raw shader source and manipulate it to cheat. However, during startup we still need to compile the actual shaders before we can perform any rendering. In newer versions of OpenGL we can pre-compile program binaries and load them at runtime; this could easily be implemented straight into AnyFX if needed, but I'd rather have the shaders compiled by my graphics card so that the vendor driver can perform its specific optimizations. Microsoft seems to be discontinuing FX (for reasons unknown), but the system is still really clever and useful.
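For reference, this is roughly what such binary caching could look like in raw GL (GL 4.1 / ARB_get_program_binary) if it were ever added; it is only a sketch, not something AnyFX does today, and error handling and file I/O are left out.

#include <GL/glew.h>
#include <vector>

// Pull the driver-specific binary out of an already linked program.
// The program must have been linked with GL_PROGRAM_BINARY_RETRIEVABLE_HINT set.
std::vector<char> SaveProgramBinary(GLuint program, GLenum& format)
{
    GLint length = 0;
    glGetProgramiv(program, GL_PROGRAM_BINARY_LENGTH, &length);
    std::vector<char> binary(length);
    glGetProgramBinary(program, length, NULL, &format, binary.data());
    return binary; // write 'format' and these bytes to disk somewhere
}

// On a later run: restore the program without compiling or linking any source.
GLuint LoadProgramBinary(GLenum format, const std::vector<char>& binary)
{
    GLuint restored = glCreateProgram();
    glProgramBinary(restored, format, binary.data(), (GLsizei)binary.size());
    GLint status = 0;
    glGetProgramiv(restored, GL_LINK_STATUS, &status); // fall back to compiling from source if this fails
    return status ? restored : 0;
}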

And also, as you may or may not have figured out, this is the first step I will take to finish the OpenGL4 render module.

When I'm done with everything, and it's integrated and proven to work in Nebula, I will write down a full spec of the language grammar and qualifiers, and release it as open source.

Summer time!

So the game projects are over, phew! The critique we got this year was far milder than in previous years, which is good since we spent a good deal of time working on the tools. Of course we had some bugs during the development phase, and some problems which we want to address. Apart from that, it's time to look forward again and see what we want to change and improve. My area of interest, as you may know by now, is rendering and shading, so this is the list of things I want to fix at the moment:

  • Materials currently need to be inserted in the materials list, and also referenced EVERYWHERE we want the material to be rendered. It would be much neater if the material had some attribute by which we sort and render, so that we may say "render all materials which require lighting here", "render all materials which require subsurface scattering here" etc. This would make it super simple to add a new material to Nebula.
  • Rendering is threaded, but seems to be somewhat glitchy, since we get random deadlocks. It's also very cumbersome to get at animation state, attachments, shader variables, skeleton data etc., since we have to perform a thread sync every time we request something. So I would like to rethink the threading part, so that we can use the power of multithreaded rendering while maintaining simple access to graphics data. My current idea is to implement a separate context which allocates resources (since texture, mesh and shader allocation is going to take time), while keeping the immediate render context on the main thread. Animation and visibility queries should still be in jobs, since that works extremely smoothly and fast. This also simplifies Qt applications, since the WinProc will be in the main thread! Also, if possible, we could have another thread which only performs draw calls, so that the actual drawing doesn't directly affect game performance. This last part is similar to: http://flohofwoe.blogspot.se/2012/12/coregraphics2.html
  • Changing materials on models is quite a hassle, and it shouldn't be, since the material is only used to batch-render objects. We should be able to switch materials, and have the new material applied on the next render.
  • All models should be flat in hierarchy. This is because we want a model to be split by its materials, meaning we can easily find and set a variable for a 'node' in a model. Previously, we had to traverse the model's node hierarchy to find a node, but this shouldn't have to be the case, since we can flatten the transforms and pre-multiply them directly into the mesh. As such, we can remove a lot of the complexity of model updates by simply having a combined transform for all nodes, letting all nodes be primitive groups in a mesh, and letting each node have its own material. This still allows us to have the super clever models used in Nebula, where we can have several materials for the same model.
  • Dynamic meshes. If we are going to have a deferred rendering context which allocates resources, then we should be able to load static meshes in that thread and create dynamic meshes on the main thread, where we can also modify them however we want. As such, we could super simply implement cloth, blend shapes, destruction etc.
  • We should be able to import statically animated meshes. A very clever way to do this is to take all the animated transform nodes, convert them to a joint hierarchy, and then simply rigid-bind the meshes to their joints. This is very similar to how rigid binding is done at the moment, but in this case the artists need not create a skeleton beforehand (credit for this goes to Samuel Lundsten, our casual graphics artist).
  • Model entities are a bit too generic. This is very flexible with the current multithreaded solution, since we can simply send a GraphicsEntityMessage to the server-side ModelEntity and everything works out fine. However, whenever we need a specialized model, for example a particle effect, a character (and in the future a cloth/destructible model), we are likely to need specialized functionality, for example particle effect start/stop, character animation playback and so on; you get the idea. It's a bit more intuitive to have different types of entities for this instead of using generic model entities. Also, I don't like the way a character is ALWAYS a part of a model entity, be it initialized or not; it doesn't look that nice. Another problem is that since particle models, just like any models, have a model hierarchy, we need to traverse all particle nodes in order to start/stop the particle effect. This can be generalized much better if we have a ParticleModelEntity which handles it for us.
  • OpenGL renderer. I don't really like the DirectX API, in any shape or form. It's only usable on Windows and Xbox, and on Windows we can use OpenGL anyway, so there is really no reason to use it unless you want to develop games exclusively for Xbox. Besides, the current-gen graphics API is only available on Xbox One, so the only reason we would want to keep using DirectX would be to develop games for a set-top box. We will of course still have the DirectX 11 renderer available if we want it. Right, OpenGL! I'm almost done with AnyFX right now, and currently the only back-end implementation available is for GL4 (coincidence? I think not!). Shading was the major stumbling block, since we had no FX counterpart for OpenGL, which I hope AnyFX will supply.
  • Integrate AnyFX. Whenever I’m done with AnyFX, meaning it should be thoroughly tested with geometry shading, hull and domain shading, samplers and perhaps even compute shaders, I’m going to integrate it with Nebula in order to accomplish the goal defined above.
  • Compute shading. Whoa, this would be soooo awesome to integrate. With this, we could have GPGPU particles, or perhaps a Forward+ renderer (drool).
  • Frame shader improvements. It could prove useful to be able to declare depth-stencil buffers separately. This isn't really THAT important, but it would let us render some stuff to a separate depth buffer without requiring it to be paired with the actual render targets.
  • Implement some system which lets us use two sources for shaders. This is useful when you have an SDK with all the useful shaders from the engine, and then want to be able to import new shaders from an auxiliary source, your project for example. The same goes for materials, and materials should perhaps be defined as a resource where each material defines its shaders, the variations for each shader and its parameters; that way, adding a new material is as simple as adding a file. Currently the materials are listed in one big XML file, and this is not so neat, seeing as we might have tons of different materials, and finding a certain material takes a considerable amount of time since the current materials already take up tons of space. With this implemented, we could also split materials into SDK materials and project materials. Frame shaders, on the other hand, specify a very specific render path, and in my opinion should be exclusive to each project.
  • Enabling and disabling rendering effects is neither pretty nor working properly. The frame shaders define a complete render path, and each frame shader has a mutually exclusive set of resources. So if we want to implement some AO algorithm, we have to put it straight into the main frame shader, which is both ugly and perhaps unwanted (when considering different specs). And if we want, for example, low/medium/high AO quality, we have to define three different frame shaders, each with their own variations, which is not so nice. While I love the frame shader render path system, I dislike the inflexibility to control it during runtime. A way to handle this would be to define where in the render path we want to perform some render algorithm by supplying the render path with an algorithm handler, for example:

    <Algorithm handler="Lighting::AOServer" output="AOBuffer">
        <Texture name="Depth" value="DepthBuffer"/>
    </Algorithm>

    The render system would then call Lighting::AOServer::Instance() through a set of predefined functions, which prepare the rendering of the algorithm; the result is then written to the output. As such, we can simply talk to the AOServer and adjust our AO settings accordingly. We could also have batches within the Algorithm tag, which would let us render geometry with a specific, adjustable server handling it. The Texture tag defines an input to the handler, which works by supplying a symbolic name ("Depth" in this case) to which we bind a texture resource (here it's the render target "DepthBuffer"). Since the frame shader only deals with textures, this is the only thing we have to define in order to run our algorithm. A rough C++ sketch of what such a handler interface could look like follows after this list.

  • Visibility is currently resolved a little goofily. This is because visibility has to be resolved after we perform OnRenderBefore() on our models, which in turn is called whenever we prepare for culling. This is a bit weird, since OnRenderBefore is REQUIRED in order for a model to get a transform, but OnRenderBefore only gets called if an object is visible. This means that when an object gets created, it sits at the origin until OnRenderBefore gets called. However, since this only happens when the camera sees the object, our shadow casters never get this call, and as such they stay untransformed until they are visible to the camera! Keep in mind that the global bounding box gets updated constantly, so we don't have to point the camera at the origin just to get objects to become visible, but we do have to see them with the actual camera! Admittedly, this is very clever, since it means objects outside the visible area never get their transforms updated, and that is completely legitimate if we don't use shadows. However, when we do have shadows, we need to update models for all objects visible to the camera and to all shadow-casting lights! The current solution is a bit hacky; it works, but it can be done much nicer.
  • Point light shadows. As of right now, point lights cannot properly cast shadows. I should implement a cube map shadow renderer which utilizes geometry shading to render to a cube map in one draw pass (a rough GL setup sketch follows after this list). This also means that point light shadows cannot be baked into the big local light shadow buffer, but instead need their own buffers.
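Regarding the <Algorithm> idea a few bullets up, the handler side could be as small as the interface below. This is purely a sketch of the concept; none of these names exist in Nebula, and I have deliberately used plain C++ types rather than committing to the engine's own.

#include <string>

// hypothetical base class the frame shader would drive for each <Algorithm> tag
class AlgorithmHandler
{
public:
    virtual ~AlgorithmHandler() {}

    // called when the frame shader is parsed: binds a symbolic input name
    // (e.g. "Depth") to a concrete texture resource (the DepthBuffer render target)
    virtual void SetupInput(const std::string& semantic, unsigned textureHandle) = 0;

    // called every frame at the point in the render path where the tag sits;
    // the result ends up in the render target named by the 'output' attribute
    virtual void Render(unsigned outputRenderTarget) = 0;
};

// Lighting::AOServer would then implement this interface, and changing AO quality at
// runtime becomes a matter of poking the server instead of swapping whole frame shaders.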
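And for the point light shadows, the single-pass setup in raw GL could look roughly like this: attach an entire cube map as a layered depth attachment and let the geometry shader pick the face via gl_Layer. The resolution and names are placeholders, and the geometry shader itself is left out.

#include <GL/glew.h>

// create a depth cube map plus a framebuffer with the whole cube attached as layers
GLuint CreateCubeShadowTarget(GLuint& cubeDepth, int size = 512)
{
    glGenTextures(1, &cubeDepth);
    glBindTexture(GL_TEXTURE_CUBE_MAP, cubeDepth);
    for (int face = 0; face < 6; face++)
    {
        glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + face, 0, GL_DEPTH_COMPONENT32F,
                     size, size, 0, GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
    }
    glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

    GLuint fbo = 0;
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    // attaching the texture (rather than a single face) makes the attachment layered,
    // so a geometry shader can route each triangle to a face by writing gl_Layer = 0..5
    glFramebufferTexture(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, cubeDepth, 0);
    glDrawBuffer(GL_NONE); // depth only
    glReadBuffer(GL_NONE);
    return fbo;
}
// rendering the shadow casters once into this FBO then fills all six faces in a single pass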

Phew, quite a list! In the next update I'll probably be posting something more about AnyFX, since it will be the first thing I check off the list.

 

// Gustav