Animation system

Following on from my last post, I’ll talk about the animation system I currently have implemented.

I was originally using something based on the XNA skinned model sample. I knew this wouldn’t be good enough eventually, because it doesn’t support blending. I was also worried about the massive number of keyframe objects it creates. Just a couple of seconds of animation – a walk and an idle animation – comes to 10,000 objects. The more objects that exist, the longer .NET garbage collections take. When I looked at the GC heap on the Xbox, animation keyframes dominated the object count above all else.

Instead of modifying my code to support these features, I thought I’d check out other existing animation libraries for XNA. The most actively maintained, fully-featured one seems to be the KiloWatt library. It supports blending, redundant keyframe removal and keyframe storage as a more minimal SQT instead of a 4×4 matrix like the XNA sample.

I additionally have the requirement that my animations are stored in bvh files separate from the model itself (since I hope to re-use them). So after downloading KiloWatt animation, I still had some work to do.

The library worked nicely, but for some reason I was not able to get it working properly with the separate bvh files. Plus, I was still left with the problem of too many keyframe objects. At this point I decided it would be less work to proceed with what I already had (which already supported separate bvh files).

Make keyframes smaller

The keyframe used in the XNA skinning sample contains:

  • int Bone (4 bytes)
  • TimeSpan Time (8 bytes)
  • Matrix Transform (64 bytes)

76 bytes each. 10,000 of these keyframes comes to about 760,000 bytes, which is getting close to a megabyte of memory!

Time and Bone index can be inferred from the containing object, so we can get rid of those. The Transform matrix can be expressed more compactly as SQT (Scale/Quaternion/Translation). This is needed for proper interpolation for animation blending anyway. In addition, we can limit ourselves to a uniform scale (in fact, it is likely that almost none of my animations will scale at all, so it would even be possible to eliminate it altogether). With those changes, my keyframe is now:

  • float Scale (4 bytes)
  • Vector3 Translation (12 bytes)
  • Quaternion Orientation (16 bytes)

Now we’re down to 32 bytes! Further compression is possible, but at the expense of higher runtime cost. Each value could probably be stored at half precision, and the Quaternion is unit length, so one of its four values can be left off and recomputed. This would get us down to 14 bytes (2 for scale, 6 for translation, 6 for orientation). But what I have now is the limit of “simple code”, so it’s fine for the time being.
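As a rough sketch of the "leave one value off" idea: a unit quaternion satisfies x² + y² + z² + w² = 1, so w can be rebuilt from the other three as long as its sign is known. The sketch below forces w to be non-negative before storing, which is safe because q and -q describe the same rotation. PackedRotation, Pack and Unpack are hypothetical names, and System.Numerics.Quaternion is used as a stand-in for the XNA type so the snippet runs on its own. Half-precision storage is not shown.

```csharp
using System;
using System.Numerics;

// Hypothetical 12-byte rotation: only x, y, z of a unit quaternion are stored.
public struct PackedRotation
{
    public float X, Y, Z;

    public static PackedRotation Pack(Quaternion q)
    {
        q = Quaternion.Normalize(q);
        // q and -q are the same rotation, so flip the sign to keep w >= 0.
        if (q.W < 0f)
            q = Quaternion.Negate(q);
        return new PackedRotation { X = q.X, Y = q.Y, Z = q.Z };
    }

    public Quaternion Unpack()
    {
        // Clamp at zero: rounding can push the squared sum slightly above 1.
        float w = (float)Math.Sqrt(Math.Max(0.0, 1.0 - (X * X + Y * Y + Z * Z)));
        return new Quaternion(X, Y, Z, w);
    }
}
```

The extra square root on every unpack is the "higher runtime cost" side of the trade.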

To convert a keyframe into a transform matrix, I use:

public void ToMatrix(out Matrix m)
{
    m = Matrix.CreateFromQuaternion(orientation);
    // Avoid needless multiplications if our scale is 1 (the common case)
    if (scale != 1f)
    {
        // Uniform scale: multiply the 3x3 rotation part of the matrix
        m.M11 *= scale;
        m.M12 *= scale;
        m.M13 *= scale;
        m.M21 *= scale;
        m.M22 *= scale;
        m.M23 *= scale;
        m.M31 *= scale;
        m.M32 *= scale;
        m.M33 *= scale;
    }
    m.Translation = position;
}

Reduce object count

The next challenge was to reduce object count, and thus heap complexity. The obvious way is to change the keyframe class into a struct (value type). There are two difficulties here: 1) we want to reduce or eliminate the needless copying that occurs with value types, and 2) we want to minimize data cache misses.

The first is fairly straightforward to achieve by factoring the code properly. One of the main things to keep in mind is that we need to keep the keyframes in native arrays and not Lists or any other dynamic type. You can pass a reference to an array element (avoiding the struct copy if you’re passing it to a function), but you cannot pass a reference to an element directly returned by the List indexer, since it returns a copy of the value in the list.
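To make the array-versus-List point concrete, here is a small sketch. The KeyFrame struct is cut down to one field, and RefDemo, DoubleScale and Run are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public struct KeyFrame
{
    public float Scale;   // other fields omitted for brevity
}

public static class RefDemo
{
    // Takes the keyframe by reference: no struct copy, and writes reach the caller's storage.
    public static void DoubleScale(ref KeyFrame k)
    {
        k.Scale *= 2f;
    }

    public static void Run(KeyFrame[] array, List<KeyFrame> list)
    {
        // Array elements are variables, so they can be passed by ref directly.
        DoubleScale(ref array[0]);

        // list[0] goes through the indexer and returns a copy;
        // "DoubleScale(ref list[0])" does not compile.
        KeyFrame copy = list[0];
        DoubleScale(ref copy);   // only the local copy changes
    }
}
```

After Run, the array element has been modified in place, while the List still holds its original value.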

The second requires more thought, but is especially important on PowerPC architectures (i.e. the Xbox). CPU memory cache misses – resulting in needing to retrieve data from main RAM – are very expensive; they could be on the order of 1000 cycles. So ideally, we want the data that is accessed together to lie together in memory. Certainly an array of structs should be good for this, because that is one chunk of contiguous memory.

The problem is that the animation data is exposed in the content pipeline as “per bone”. That is, for a particular bone you have all the keyframes for each frame of the animation. This organization is carried through in both the XNA Skinning Sample, and the KiloWatt library. An animation consists of a set of keyframes for each bone:

[Figure: animation keyframes organized per bone]

But when accessing the animation data at runtime, we are interested in examining the keyframes for all the bones at a particular time. So if we have an array of structs, we’d much prefer them to be organized so that the array contains all the keyframes for all the bones at a particular time:

[Figure: animation keyframes organized per time]

So I modified the organization of the data so that it now looks like the picture above. One potential problem here is that in some animations, some bones may have far fewer keyframes. I don’t know if this is common practice, but I can see situations where it applies. Organizing the data per time means we’ll need to insert duplicate keyframes for those bones.
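One way this reorganization could look in code is sketched below. It is not the actual content pipeline code: ClipLayout and ToPerTime are hypothetical names, the KeyFrame struct is cut down to one field, and it assumes every bone track is sampled at the same fixed rate, with a shorter track simply holding its last pose (which is where the duplicate keyframes come from).

```csharp
using System;

public struct KeyFrame
{
    public float Scale;   // translation and orientation omitted for brevity
}

public static class ClipLayout
{
    // Converts perBone[bone][frame] into one flat array indexed as
    // flat[frame * boneCount + bone]. Every track must have at least one keyframe.
    public static KeyFrame[] ToPerTime(KeyFrame[][] perBone, int frameCount)
    {
        int boneCount = perBone.Length;
        var flat = new KeyFrame[frameCount * boneCount];
        for (int frame = 0; frame < frameCount; frame++)
        {
            for (int bone = 0; bone < boneCount; bone++)
            {
                KeyFrame[] track = perBone[bone];
                // Assumption: a track shorter than the clip holds its last pose,
                // so that keyframe is duplicated for the remaining frames.
                int src = Math.Min(frame, track.Length - 1);
                flat[frame * boneCount + bone] = track[src];
            }
        }
        return flat;
    }
}
```

With this layout, all the bones for frame f sit in one contiguous run starting at flat[f * boneCount], so building a pose walks memory front to back.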

If I wanted to support separate sampling rates for individual bones eventually, I could store the bone index with the keyframe again, and keep a separate array of keyframes for the “last known keyframe” for each bone.

Animation blending

The blending operation itself is pretty straightforward. I can create a keyframe from two intermediate keyframes like so:

public KeyFrame(ref KeyFrame k1, ref KeyFrame k2, float amount)
{
    // amount = 0 gives k1, amount = 1 gives k2
    scale = MathHelper.Lerp(k1.scale, k2.scale, amount);
    Vector3.Lerp(ref k1.position, ref k2.position, amount, out position);
    Quaternion.Slerp(ref k1.orientation, ref k2.orientation, amount, out orientation);
}

The animations I have are sampled at a high enough rate that I don’t need to blend between individual frames (unless I decide I want smooth slow motion). I’m using blending to transition between two different animations (idle and walking, for instance).

The more interesting part of animation blending is how it ties into the game engine, and whether things like additive blending (which could be used to add on things like a head turning animation as someone is running) are supported. I don’t have a very advanced system yet. I just support simple transitions from one clip to another. For now this is good enough.
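A simple clip-to-clip transition of this kind could be sketched as follows. This is an illustration under assumptions, not the engine's actual code: Transition, CrossFade and Blend are hypothetical names, System.Numerics types stand in for the XNA ones so the snippet runs on its own, and the two input arrays are assumed to already hold the current frame of each clip with one keyframe per bone.

```csharp
using System;
using System.Numerics;

public struct KeyFrame
{
    public float Scale;
    public Vector3 Position;
    public Quaternion Orientation;

    // Same blend as the constructor shown earlier, written against System.Numerics.
    public static void Blend(ref KeyFrame a, ref KeyFrame b, float amount, out KeyFrame result)
    {
        result.Scale = a.Scale + (b.Scale - a.Scale) * amount;
        result.Position = Vector3.Lerp(a.Position, b.Position, amount);
        result.Orientation = Quaternion.Slerp(a.Orientation, b.Orientation, amount);
    }
}

public static class Transition
{
    // Fades from one clip's pose to another's over "duration" seconds.
    // duration is assumed to be greater than zero.
    public static void CrossFade(KeyFrame[] from, KeyFrame[] to,
                                 float elapsed, float duration, KeyFrame[] pose)
    {
        // 0 = fully the old clip, 1 = fully the new clip.
        float amount = Math.Min(Math.Max(elapsed / duration, 0f), 1f);
        for (int bone = 0; bone < pose.Length; bone++)
            KeyFrame.Blend(ref from[bone], ref to[bone], amount, out pose[bone]);
    }
}
```

Once elapsed reaches duration, the output is just the new clip's pose and the old clip can be dropped.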

