
Lighting revamp progress

I’m gradually making progress in my lighting revamp. This involves the following:

  • Making my textures have real-world albedo values
  • Doing the appropriate gamma correction throughout the pipeline
  • Coming up with a better ambient lighting model
  • Taking another look at my less-than-satisfactory HDR implementation

I’m trying to make all my rendering “gamma correct”. This means doing my lighting calculations in linear color space. Well, they’ve always assumed linear color space, but my input textures (and thus my albedo G-buffer) were in sRGB. So I am now converting the source textures from sRGB to linear, operating in linear space in my light accumulation buffer, and then doing the final gamma correction (linear to sRGB) before presenting to the screen.
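For reference, here’s what the two conversions look like. This sketch uses the standard piecewise sRGB transfer curve; a plain pow(2.2) approximation is also common, so don’t take these exact constants as what my shaders use.

```python
def srgb_to_linear(c):
    """Decode an sRGB value in [0, 1] to linear light (IEC 61966-2-1 piecewise curve)."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(c):
    """Encode a linear-light value in [0, 1] back to sRGB for display."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

# Mid-gray in sRGB (0.5) is only ~0.214 in linear light, which is why
# albedo textures authored in sRGB come out too bright if left uncorrected.
```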

I really don’t like the gamma-correct look though. I’m still in the early stages, so it’s possible I have some bugs I need to work out. But everything comes out very washed-out.

The following sets of images show before (top) and after (bottom).


A canyon at dusk.



Desert at midday.



Snowy stream with birches.


In general, snowy sunny scenes look a little better though:




For those implementing a properly gamma-correct pipeline, how do you prevent your lighting from getting so washed out?


Global illumination – Low frequency ambient occlusion

Three scenes lit only with an ambient term, including occlusion.


A couple of posts ago I mentioned I would try my hand at implementing some more general purpose low frequency ambient occlusion. The article here suggests a blurry overhead shadowmap to darken areas under trees and such. With this as inspiration, I came up with a technique I’m fairly pleased with.

My general principle is the same: I render several overhead shadow maps from various sun angles for a region. I create lightmaps of these, as if they were applied to near-ground geometry. I blend them all together, blur them, and then store the result in a texture for lookup by my ambient lighting shader.
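Roughly, the offline portion boils down to something like this NumPy sketch. The box blur and the block-average resize here are illustrative stand-ins, not the real implementation; the point is the shape of the pipeline: blend, blur, shrink, crop.

```python
import numpy as np

def build_occlusion_map(lightmaps, out_size=32):
    """Blend several single-tap shadow lightmaps, blur, and downsample.

    lightmaps: 2D float arrays (1 = lit, 0 = shadowed) rendered from
    different sun angles over a region slightly larger than the chunk.
    """
    accum = np.mean(np.stack(lightmaps), axis=0)   # blend all sun angles
    # Simple separable box blur, standing in for whatever blur is used.
    kernel = np.ones(5) / 5.0
    accum = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, accum)
    accum = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, accum)
    # Shrink by block-averaging down to a 34 x 34 grid...
    padded = out_size + 2
    fy, fx = accum.shape[0] // padded, accum.shape[1] // padded
    small = accum[:fy * padded, :fx * padded]
    small = small.reshape(padded, fy, padded, fx).mean(axis=(1, 3))
    # ...and pluck out the center 32 x 32 piece.
    return small[1:-1, 1:-1]
```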

Step by step

Here’s a view of an area of my game world with a brick wall, a canyon cliff, and some trees:





I create an orthographic projection of a 32×32 area looking down, generate a shadowmap with the “sun” roughly overhead, and then cast that onto a model of the terrain that is just slightly above the actual terrain (so that the actual terrain doesn’t cause self-shadowing everywhere). I don’t need to use any kind of fancy shadow mapping here, just a basic single tap compare will do. The resulting light map looks something like this:




That’s very crisp; let’s combine more light maps with the sun at various positions in a hemisphere above the ground:




That’s around 6 samples, and looks a little better. Let’s try 25 samples, evenly distributed around the hemisphere (incidentally see here for information on how to choose uniform sampling points on a hemisphere, using cosine weighting). Yes, this is a lot of work, and isn’t suitable for realtime. So the resulting textures are meant to be generated offline and used in realtime.
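For reference, a cosine-weighted hemisphere direction can be generated from two uniform random numbers using Malley’s method (sample the unit disc uniformly, then project up onto the hemisphere). This is a generic sketch, not my actual sampling code:

```python
import math

def cosine_weighted_hemisphere_sample(u1, u2):
    """Map two uniform [0, 1) numbers to a unit direction on the upper
    hemisphere (y up), with density proportional to cos(theta)."""
    r = math.sqrt(u1)                    # uniform point on the unit disc...
    phi = 2.0 * math.pi * u2
    x = r * math.cos(phi)
    z = r * math.sin(phi)
    y = math.sqrt(max(0.0, 1.0 - u1))    # ...projected up onto the hemisphere
    return (x, y, z)
```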




That’s looking a lot better; we can even begin to see darkness along the canyon walls. The occlusion on the ground near the brick wall is much smoother. Let’s blur it.




Now, what do I do with all this data? Remember, this is all for a 32 x 32 chunk of my 1024 x 1024 map (because I’m blurring things, I actually render an area large enough for a 34 x 34 chunk, and just pluck out the center 32 x 32 piece). I only want 32 x 32 pieces of data, so I shrink the image down to that size, and here is the result (magnified).




The above is a very coarse representation of the ambient occlusion at ground level. I can sample from this during my lighting pass and include it in the ambient lighting term.

It’s not good enough as is though, since most objects are above the ground, not right at the ground. The above image is measuring shadow at the ground. The area with the brick wall, for instance, is very dark. That would mean I would render the top of my brick wall very dark, since it’s in the dark region – even though it basically isn’t occluded at all. The same goes for my trees and such.

So what I want to do is to measure the occlusion at points higher above the ground too. So we’re basically developing not just a 2d grid of occlusion, but multiple layers of grids. My game is basically 2.5d though – there isn’t much verticality. I don’t need a full 3d cube of occlusion. Just a few layers will do.

I end up rendering a total of 3 layers (I’ll explain below why I chose 3). Here’s another section of my game world, and 3 “layers” of terrain (onto which shadows are projected) visualized in magenta:




I use fixed layer heights roughly based on how tall I expect objects to be in my game.

Why 3 layers? Well, it fits nicely in a single texture. Remember, I am using this in my ambient lighting shader. In that shader I have access to the world position of the pixel being rendered. So I can sample from the appropriate location in my 1024 x 1024 occlusion texture. However, I also need the reference ground height at this location. I could sample from the heightmap that I have available, but that would be an extra texture sampling operation. So instead I stash the reference ground height in the alpha component, and the occlusion terms for the 3 layers in the R, G and B channels. Once I figure out how high I am above the ground, I lerp between the three occlusion channels to get the value I need.

The total cost in the shader is one texture sample and 15 instructions. The per-frame cost appears to be under 0.1ms on my GeForce GT 240.


float OcclusionStrength; // I'm using 0.8 right now.
float OneOverWorldSize;  // 1 / the world size in world units.
float YDisplacementScale;
sampler OcclusionAndHeightSampler;

// These define the thicknesses of the occlusion bands/layers.
#define BAND_HEIGHT 2.0
// Band boundaries (assumed values: each band is BAND_HEIGHT thick).
#define HEIGHT_BAND_LOWER BAND_HEIGHT
#define HEIGHT_BAND_UPPER (2.0 * BAND_HEIGHT)

float GetOcclusionAtPoint(float3 worldPosition)
{
	// We don't need the DX9 half texel offset. The center of each texel represents the point in between two grid vertices,
	// not an actual grid vertex.
	float2 texCoord = worldPosition.xz * OneOverWorldSize;

	float4 sample = tex2D(OcclusionAndHeightSampler, texCoord);
	float terrainReferenceHeight = sample.a * YDisplacementScale;	// YDisplacementScale converts [0-1] into actual world values.
	float heightAboveTerrain = worldPosition.y - terrainReferenceHeight;

	// heightAboveTerrain should guide us as to the balance of the RGB channels. We'll end up lerping between two of them.
	float3 bands;
	bands.x = heightAboveTerrain < HEIGHT_BAND_LOWER;
	bands.y = (heightAboveTerrain >= HEIGHT_BAND_LOWER) && (heightAboveTerrain < HEIGHT_BAND_UPPER);
	bands.z = heightAboveTerrain >= HEIGHT_BAND_UPPER;

	float3 sampleA = float3(sample.gb, 1);
	float3 sampleB = sample.rgb;
	float3 lerpAmount = saturate(float3(HEIGHT_BAND_LOWER - heightAboveTerrain, HEIGHT_BAND_UPPER - heightAboveTerrain, HEIGHT_BAND_UPPER + BAND_HEIGHT - heightAboveTerrain) / BAND_HEIGHT);

	float occlusion = lerp(dot(bands, sampleA), dot(bands, sampleB), dot(bands, lerpAmount));

	// We could optimize this by storing the inverse in the texture to avoid some instructions.
	return 1 - (1 - occlusion) * OcclusionStrength;
}


What does it look like? Here’s the previous scene rendered with only the occlusion term:




You can see the obvious darkening at the base of the brick wall (but not at the top, because we have multiple occlusion layers!), on the lower parts of the trees, and on the steep cliff faces.

Now, without any direct sunlight, the scene would be pretty muddled (no sunlight means there is just an overhead ambient term representing light from a cloudy sky):




In the above image, the tree and plant already have some high frequency AO baked into them (which is used in the ambient lighting equation), so they look a little better (or rather different) than the surroundings. Now we include the occlusion term:




Much better.


The old way

To be fair, I did have a similar but more limited mechanism functioning previously. I ran an offline ray-caster on my terrain and baked some occlusion into the terrain itself, so I did have some darkening in corners and cliff faces. But the ray-casting only included other terrain, not all the static world objects, as the method described in this article does.

To make up for that, I had an overly complex system that, for a limited set of world objects (trees and walls, but nothing else), would draw some appropriate occlusion terms into a texture (e.g. a round blurry disc below plants and trees). This would be picked up by the terrain shader and included in the AO already baked into the terrain. So it allows some world objects to affect the terrain only.

However, the new way I’m doing things lets all (static) objects occlude all others (and it’s simpler, really).

I might keep the old way for dynamic objects causing some occlusion against terrain only.

Some more images

In my next post, I hope to talk more about other lighting changes I’ve been doing in the hopes of improving my lighting overall. For now though, here are some more images related to this post.

Here are some cylinder “probes” to help see the effect of the occlusion at various heights above the ground, next to various objects and terrain features:




Here’s a moving object placed under a tree and out in the open. There are no directional light sources in this image at all, just the overhead ambient skylight. You can see the man becomes darker under the cover of trees.




The next two images below show the subtle effect this gives. In each case, the bottom portion is the one with the occlusion term.

Note the darkening on the cliff below the conifer trees, for instance.




Here’s the AO term only:




There is sunlight in this picture, but the shaded canyon is a little darker due to less ambient light:





Doing work like this really makes you appreciate well-factored code. While I’m in the world editor, I want to be able to push a button and have the occlusion calculated for the entire world map. This requires setting up another rendering path, camera, loading all parts of the world as needed, and so on… – without affecting the “current” version of the world that is loaded. Because I’m lazy, I still had several places in my code where I was using static/global variables. I had to clean up all those code paths and remove almost all global state to get this working cleanly. Which just reinforces the belief that you will always regret having global state. There will always end up being a reason to remove it.

A pleasant surprise though, was how easy it was to get the entity component system working in this version of the world. I currently have around 15 systems operating on the entities, and there is a fair amount of code to set them all up. I was about to refactor that code to make it accessible by the occlusion-calculating code, when I realized all I needed were the RenderingSystem and ChildTransformSystem. So I just instantiate my entity component framework and add only those two systems. It was just a handful of lines of code, and a small modification to the RenderingSystem to only render objects marked as “static”.




Global illumination – comparison

In my last post I talked a bit about some cheapish global illumination I was trying out. I noted how it was a very subtle effect, and not really worth the effort.

Today I tried a much, much simpler form of “global illumination”: in addition to my standard ambient term, I include another directional light that faces the opposite direction from the sunlight. This is a much easier effect to control. I got the idea from this article.

It’s intended to simulate the sun bouncing off objects and lighting up the opposite side of the objects. It’s roughly 0.2 times as strong as the sunlight (assuming an average scene albedo of 0.2), and kind of yellowish greenish in color (assuming average scene colors).

Here are a few comparison screenshots. The top image in each is without any GI. The second image is the result produced by the SH-based GI I mentioned in the last post. The bottom shot is the simple “opposite the sunlight” term.




Obviously it’s not really a fair comparison, since the effect is much stronger. I could easily reduce the strength, although I have other plans to implement some low frequency occlusion that will likely address some of that.

The sunlight bounce light doesn’t quite face opposite the sun – the vertical part of the direction vector is reduced, so it shines more horizontally. Furthermore, the brightness is not a simple dot product of the light vector with the surface normal (which would result in full strength facing the light, and 0 strength at 90 degrees from the light). I allow light to “wrap around” beyond the 90 degree mark a bit, simulating (I think) what would happen with other reflective surfaces of various orientations behind the object.


float3 SunlightBounceDirection;
float3 SunlightBounceColor;
float WrapAround;

float3 GetSunlightBounceTerm(float3 surfaceNormal)
{
	// Basically, we'll take [-WrapAround, 1] and map it to [0, 1]
	float dp = dot(surfaceNormal, SunlightBounceDirection);
	float amount = (dp + WrapAround) / (1 + WrapAround);
	return SunlightBounceColor * saturate(amount);
}



Global illumination – Improving outdoor lighting

I came across some nice articles on this site which have inspired me to try to improve the lighting in my game.

Recently I implemented a “primitive” form of global illumination for the game (I say “primitive”, because while there are surely more advanced techniques out there, it is still far from being simple).

I basically measure reflected sunlight (one bounce) off of objects and add the effects of this to my ambient lighting term. Most of the work happens offline:

  • For every point in a world grid, render the scene in all 6 directions with sunlight only, and place these in a cube map
  • Integrate the cube map with spherical harmonic basis functions
  • Store the spherical harmonic constants in a texture where each texel corresponds to a point in the world grid (currently 1024×1024). I’m assuming a 2.5D world here (so there is just one layer).
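The integration step amounts to projecting each cube map texel (treated as a direction, a radiance value, and a solid angle) onto the SH basis functions. Here’s a minimal sketch using only the first 4 SH terms; the 0.282095 and 0.488603 constants are the standard band-0/band-1 normalization factors for real spherical harmonics, and the function shape is mine, not my actual code:

```python
import math

SH_C0 = 0.282095          # Y_0^0
SH_C1 = 0.488603          # Y_1^{-1}, Y_1^0, Y_1^1 share this constant

def sh4_basis(d):
    """First 4 real spherical-harmonic basis functions at unit direction d."""
    x, y, z = d
    return (SH_C0, SH_C1 * y, SH_C1 * z, SH_C1 * x)

def sh4_project(samples):
    """Project radiance samples onto 4 SH coefficients.
    samples: iterable of (direction, radiance, solid_angle)."""
    coeffs = [0.0, 0.0, 0.0, 0.0]
    for d, radiance, d_omega in samples:
        for i, y_i in enumerate(sh4_basis(d)):
            coeffs[i] += radiance * y_i * d_omega
    return coeffs

def sh4_eval(coeffs, d):
    """Reconstruct the low-frequency radiance approximation in direction d."""
    return sum(c * y for c, y in zip(coeffs, sh4_basis(d)))
```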

For debugging purposes, I can show the generated cube map, the cube map applied to a sphere, and the spherical harmonic constants applied to a sphere (which should approximate the low frequency color variations in the cube map). In the below image, you can see the sphere gets lit by green light reflected off the mossy cliff, and lit a bit from below by the blue water:


Cube map and results of applying it to a sphere (rendered roughly from where the sphere is located)


At runtime, using the textures that contain the SH constants, I add this term to the ambient light based on the normal of the pixel being rendered. I also scale this by the inverse of the current cloudiness (since it is supposed to represent reflected sunlight).

My plan (not yet fully implemented) was to render these offline cube maps at 8 different times during the day. Then we would use a blend of the two adjacent (in time of day) snapshots. I got this idea from the GDC 2014 presentation “Assassin’s Creed 4: Black Flag – Road to next-gen graphics”.

Of course, 9 RGB spherical harmonic terms require 27 bytes to store. If I only use the first 4 SH terms (taking a reduction in quality), then that’s still 12 bytes. At 1024 x 1024, that’s 12MB. Multiplied by 8 times of day, that’s 96MB just for the textures to store the reflected sunlight approximation. As for performance, it adds 6 texture samples to my ambient lighting equation. Offline performance is slow too – I only calculate 20×20 regions on demand right now for testing, but based on the time that takes, it would take about 7 hours to render for the entire map.
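Tallying that up (assuming 1 byte per channel):

```python
# Storage cost of the reduced, 4-term SH version.
terms, channels = 4, 3
grid_texels = 1024 * 1024
times_of_day = 8

bytes_per_texel = terms * channels                        # 12 bytes
per_snapshot_mb = grid_texels * bytes_per_texel / 2**20   # 12 MB per snapshot
total_mb = per_snapshot_mb * times_of_day                 # 96 MB total
```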

I could minimize memory usage by only placing probes in important areas and constructing the textures at runtime based on the nearest probes. But that would be a lot more work.

I suppose this would be worth it if it produced clearly superior results. But the resulting effect is quite subtle:



The lower half of the image has the global illumination effect applied. The bright sunlight reflecting off the right-hand cliff gives some extra illumination to the shadowed cliff on the left.


Overall, the effect is quite subtle for my mostly-outdoor world. Is it worth all the effort? It also does nothing to help the case when it’s cloudy and there is no direct sunlight (since this is supposed to represent reflected sunlight). As a result, things are very dull and muddled in that scenario:




Now back to some of the interesting and inspiring articles I found.

There is one about outdoor lighting that appears to produce nice results. And there is one about multi-resolution ambient occlusion.

The latter article has some really nice images, and describes techniques which are actually similar to stuff I already have (which helps validate my hacks).

Although I don’t currently use the medium frequency occlusion (standard SSAO) because it’s quite expensive, I do have things that are similar to the high and low frequency occlusion.

For many of my objects, I have ambient occlusion terms baked into the geometry (this is the high frequency occlusion). It’s most noticeable in my plants, and improves the look greatly:




You’ll also note the dark area below the plant and near the base of the brick wall. This would be similar to what the article describes as low frequency occlusion. Basically, large objects cause things to be darker below them. Currently, my implementation is specific to terrain – certain objects (trees and walls) add to an ambient occlusion term baked into the terrain. The article describes a more general approach I could use.

So I’ll be looking at some of these techniques to improve my lighting. The first thing I’ll do though is to adjust the albedos of my textures. I think a lot of them are out of whack with each other, making things look not very realistic.

Hopefully I’ll get to a point where I can produce clearly better results, and provide some before-and-after screenshots.



Water waves

Using scrolling normal maps for water gives us a nice look for tranquil lakes and such. In order to get something that looks like it’s actually interacting with the environment though, we need to actually modify the water geometry.

Sine waves are one option: basically we’d displace the height of the water vertices based upon a sine wave. A better option is Gerstner waves. They improve upon simple sine waves in that a vertex’s horizontal position is also changed, which more accurately models how real water waves look (see the linked article above for a better explanation).


Gerstner waves in my engine.


Gerstner waves are what I currently use in the water shader in my engine. I layer four of them on top of each other to create sufficient variety. At any one point, one of them is fading to zero amplitude in order to be replaced by a new random wave.

I don’t yet have them tied to the wind in my world, but the plan is to roughly align them with the current wind.


Wire-frame of Gerstner waves.


One advantage of using a strict calculated wave like this is that I can duplicate these calculations on the CPU in order to, for example, place an object in water and have it bob up and down.
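Since each wave is a closed-form function of position and time, the CPU-side version is just the same math again. Here’s a sketch of a single wave’s displacement following the usual Gerstner formulation (parameter names are illustrative, not my shader’s):

```python
import math

def gerstner_displace(px, pz, t, amplitude, wavelength, direction, steepness, speed):
    """Displacement (dx, dy, dz) of a water vertex at rest position (px, pz)
    for a single Gerstner wave at time t.

    direction: unit-length 2D tuple the wave travels along.
    steepness: 0 gives a plain sine wave; values near 1 pinch the crests.
    """
    k = 2.0 * math.pi / wavelength          # angular wavenumber
    dirx, dirz = direction
    phase = k * (dirx * px + dirz * pz) - k * speed * t
    q = steepness * amplitude               # horizontal pinch amount
    return (q * dirx * math.cos(phase),     # slide vertex toward the crest
            amplitude * math.sin(phase),    # vertical displacement
            q * dirz * math.cos(phase))
```

Layering four waves is then just summing four of these displacements per vertex, which is exactly what makes the CPU/GPU duplication (for bobbing objects) straightforward.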

Ideally I would have a water simulation that allows for real interaction with the environment: “bouncing” off cliffs and speeding up over shallow water. I haven’t yet looked very far into doing this, but I would imagine I would still use the Gerstner waves as “energy input” to the water simulation (i.e., the wind action).


iOS devices, dependent texture reads

I’ve always assumed that the term “dependent texture read” referred to the case when the texture coordinates used to fetch from one texture were calculated using the results of a previous texture fetch. This is a fairly obvious performance problem, as texture reads can be slower than ALU operations (especially if the texture is not in the cache). Thus the shader may “stall” waiting for the results it needs to proceed.

One of the problem levels with lots of bricks.

In a few of my game’s levels, I noticed that the frame rate was below 60 on the iPad 2, and that moving objects’ motion looked chunky. I put some Stopwatch counters around portions of my Update and Draw methods and looked at the values in the debugger (Xamarin Studio). The numbers didn’t really make sense (like 60ms for a single Update cycle), so I assumed this was some artifact of debugging on an iOS device. Then I ported some performance measuring code I had in another project (which displays various metrics in the actual game) and had a look.

The Update and Draw cycles took about 6ms each in the worst case, so the performance bottleneck was not on the CPU (incidentally, this is about 20x slower than on my PC). I also noticed that the Draw cycle was frequently being skipped (XNA/MonoGame does this when the GPU can’t keep up).

So my bottleneck was definitely on the GPU. The problems seemed to occur on the levels with lots of “bricks”. So I assumed it was one of three things:

  1. Simply too much overdraw (the bricks are drawn over a background)
  2. The shader used to draw the bricks is too expensive
  3. I was using too much bandwidth sending over the brick vertices every frame (they aren’t drawn from a vertex buffer)

Making the bricks really small made the perf problem go away. So that ruled out (3), and suggested either (1) or (2) was the problem.

The brick shader is special – it combines two textures: a bricky background and a moss foreground. I subvert the color channel to pass in extra information that allows me to muck around with the texture coordinates I pass to the moss texture fetch. Really, I’m just doing this because all the rendering in this game goes through XNA/MonoGame’s SpriteBatch, which uses a fixed Position/Color/TextureCoordinate vertex format. I am lazy and wanted to avoid creating a new rendering code path – thus this hack.

Bricks need two textures.

But basically my shader does:

  1. fetch from brick texture
  2. calculate moss coordinates
  3. fetch from moss texture
  4. multiply the two together

I tried removing step 2 from the shader, and suddenly the perf problem went away. It was just a handful of calculations, which should be no big deal (perhaps even hidden by the latency of the brick texture fetch), but it made a big difference performance-wise.

I was confused, so then I tried some of Xcode’s performance measuring tools. They were super easy to use (I was getting results within 30 seconds of opening the tool). One of them spits out potential performance problems.


It listed “dependent texture sampling” for the call that draws the background image. The background image is drawn with a shader very similar to the brick shader: dual texture with some calculations for one of the texture coordinates. Further research showed that “dependent texture reads” can also refer to any calculations done on texture coordinates in the pixel shader. On Apple’s website I found the following:

Dependent texture reads are supported at no performance cost on OpenGL ES 3.0–capable hardware; on other devices, dependent texture reads can delay loading of texel data, reducing performance. When a shader has no dependent texture reads, the graphics hardware may prefetch texel data before the shader executes, hiding some of the latency of accessing memory.

The iPad 2 has OpenGL ES 2.0 hardware, I believe. So I’m subject to this limitation. This isn’t really something I ever had to deal with when developing for PCs or the Xbox 360. I’m guessing this has something to do with a more primitive texture cache on less-advanced GPUs.

The fix was theoretically simple: just move the texture coordinate calculations to the vertex shader. Unfortunately this wasn’t possible in my scenario, so I ended up having to “do it properly” and basically re-implement a subset of XNA’s SpriteBatch and plumb through the extra pair of texture coordinates. With a proper set of texture coordinates passed into the shader I no longer need to do extraneous calculations in the pixel shader. Performance is back up above 60FPS.


Average hue determination – (Pokédex part 2)

How do I sort a series of images by color? See the previous post for the motivation here.

My first iteration was just to take the average of all the opaque pixels of the image. It was a start, but I knew this wasn’t going to be sufficient. That only gives me a single color. It also might give me a completely wrong in-between color if a Pokémon is half one color and half another.
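That first pass is nothing more than this (a sketch assuming pixels come in as 8-bit RGBA tuples):

```python
def average_opaque_color(pixels):
    """First-pass approach: straight average of all opaque pixels.
    pixels: iterable of (r, g, b, a) tuples with 0-255 channels."""
    opaque = [(r, g, b) for r, g, b, a in pixels if a == 255]
    n = len(opaque)
    if n == 0:
        return (0, 0, 0)
    return tuple(sum(channel) / n for channel in zip(*opaque))
```

Averaging a half-red, half-blue sprite this way yields purple, a color that appears nowhere in the image, which is exactly the in-between-color problem.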

Is Smoochum yellow or pink?

So then I thought to calculate the hue of each pixel and divide them up into buckets to make a histogram of hues. I could then choose a maximum (or multiple maxima if I wanted to classify it under multiple distinct hues).

Immediately I saw a problem though: if the main color of a Pokémon lay roughly between two buckets, samples might be split between those two buckets, leading another bucket (representing a less important color) to be the “winner”.

There’s probably a correct and complex mathematical solution to this problem, but the immediately obvious thing to do was to make buckets overlap adjacent buckets. So one pixel sample could end up in multiple adjacent buckets and help count towards both their scores.
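Here’s a sketch of that overlap idea. The bucket count and the half-weight given to each neighbor are arbitrary knobs of mine, not tuned values:

```python
NUM_BUCKETS = 12
BUCKET_WIDTH = 360.0 / NUM_BUCKETS

def hue_histogram(hues, weights=None):
    """Histogram of hues (in degrees) where each sample also adds half its
    weight to the two neighbouring buckets, so a colour sitting on a
    bucket boundary isn't split into two losing halves."""
    if weights is None:
        weights = [1.0] * len(hues)
    buckets = [0.0] * NUM_BUCKETS
    for hue, w in zip(hues, weights):
        b = int(hue // BUCKET_WIDTH) % NUM_BUCKETS
        buckets[b] += w
        buckets[(b - 1) % NUM_BUCKETS] += 0.5 * w   # spill into neighbours
        buckets[(b + 1) % NUM_BUCKETS] += 0.5 * w
    return buckets
```

With three samples straddling the 30-degree boundary and two samples elsewhere, the boundary bucket now wins instead of being split into two losers.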

So I set about implementing that, but started with only choosing one bucket just to simplify things. One of the first things I noticed was the miscategorization of many Pokémon. Umbreon, for instance, was lumped with the blue Pokémon:


Mousing over the Umbreon sprite in Photoshop, I could see it was dominated by a very unsaturated blue hue (its gray body). I kind of wanted it in the yellow category. So I added a saturation threshold below which I wouldn’t consider pixels.

This helped, but I still noticed some problems. Latias, who is pink/red, was categorized with the purple Pokes. In Photoshop, Latias’ slightly purple upper body was below my saturation threshold, so what was the problem?


Stepping through the debugger, I noticed that the saturation values being calculated for the barely purple pixels were close to 100%. It turns out that System.Drawing.Color.GetSaturation uses the HSL representation, not HSV (despite the documentation saying otherwise). I replaced its implementation with a proper HSV saturation, and the problem was fixed.
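The difference is easy to demonstrate. These two functions follow the standard HSL and HSV definitions (channels in [0, 1]):

```python
def hsl_saturation(r, g, b):
    """Saturation as System.Drawing.Color.GetSaturation computes it (HSL)."""
    mx, mn = max(r, g, b), min(r, g, b)
    if mx == mn:
        return 0.0
    lightness = (mx + mn) / 2.0
    return (mx - mn) / (mx + mn) if lightness <= 0.5 else (mx - mn) / (2.0 - mx - mn)

def hsv_saturation(r, g, b):
    """HSV saturation: chroma relative to the brightest channel."""
    mx, mn = max(r, g, b), min(r, g, b)
    return 0.0 if mx == 0 else (mx - mn) / mx

# A barely-tinted light lavender like (0.98, 0.94, 1.0) has HSL saturation
# of 1.0, but HSV saturation of only ~0.06 - exactly the Latias problem.
```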

I had to lower my saturation threshold to around 10% to correctly categorize some Pokes. But that put Zoroark amongst the purples, due to his just slightly purple dark body:


I realized that it was probably a better idea to weight the samples based on saturation. That fixed Zoroark’s problem:


But oh no: Sylveon, who had previously been categorized correctly, was now in the blues. She’s generally pink, but her blues are just saturated enough to outweigh that.


So I put a max on the saturation weight at 60%, and that was enough to tip the scales in her favor and bring her back to pinks:


But now the pink Mewtwos were in the green zone:


Inspecting in Photoshop, I saw that the sprite is ringed by a greenish outline that is dark, but highly saturated.


Of course… dark colors can be highly saturated (look here). So my final tweak was to instead weight by the product of saturation and brightness (value). Mewtwo is in good company now:


Finally, it was time to allow for a Poke to appear in multiple color categories. I played around with this for a bit, and came up with the following: I choose a bucket that is at least 50% of the biggest bucket, and whose adjacent buckets are less than itself (i.e. it must be a local maximum). It seems suitable.
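As a sketch, that selection rule looks something like this:

```python
def pick_color_categories(buckets):
    """Pick every bucket that is a local maximum (greater than both
    neighbours, wrapping around) and at least 50% of the biggest bucket."""
    biggest = max(buckets)
    n = len(buckets)
    picked = []
    for i, v in enumerate(buckets):
        if v >= 0.5 * biggest and v > buckets[(i - 1) % n] and v > buckets[(i + 1) % n]:
            picked.append(i)
    return picked
```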

The final result looks something like this, with very quick scrolling through the list:


Unfortunately, it has turned out not to be all that useful. It still takes a while to find the Pokémon you want.

Developer's blog for IceFall Games