Identifying entities in the game world

In this post I’ll go over my evolving thoughts about how to identify game entities (and other resources) in the game engine.

Everywhere in the game engine are identifiers that let us access important objects. These objects might be game entities, scripts, 3d model names, etc… We can’t use pointers or live managed references to access these objects since we need to be able to persist these identifiers for save games or access from scripts. If the resource has a distinct name, then we can use strings. Strings, however, take up a lot of space, are slow to compare against, and – in the managed world – impact garbage collection cost.

So generally we use some sort of integer that can uniquely identify the item. The problem then becomes how to generate these integer identifiers and maintain them.

My first attempt at this used a giant monolithic string list. When any piece of code requested an integer identifier for a string, we would look in our list and hand back the index. If the string was not present, we would add it to the list and hand back a new id. We started from 0, and the ids simply increased as we added more strings.

Now, this requires maintenance of this monolithic string list. Once I have generated an id for a string, it can never change. That means I can never remove any string from the list (well, I could replace it with an empty string), re-order the list or anything. If any change is made to the game world that adds a new string, I need to ensure the monolithic list is correctly updated and saved out, always in sync.

While that’s doable, it still frightens me a little. If the list did get out of sync, it might produce very insidious bugs.

I had briefly thought of just using string hashes for the ids, but quickly dismissed it due to the risk of collisions (two strings producing the same hash code). However, a chapter I came across in Game Engine Architecture changed my mind. The author stated that they used a 32 bit hash for strings, and didn’t encounter a single collision in two years of development on a game. Even in the rare case a collision was encountered, it is easily detected and easily fixed by changing one of the strings.

And so I began looking into that. It would simply a lot if it would really work.

Unfortunately, the .net implementation of GetHashCode() for strings, while quick and sufficient for use in a hash table, isn’t good enough for these purposes. Ideally the hash would result in large differences in the number for small differences in input. I noticed that the values returned for similar strings were very similar.

Even more important though, is that the value returned by GetHashCode() is not guaranteed to be consistent across platforms or versions of .net. The values it produces really cannot be persisted.

So I found a CRC-32 implementation on the web and adapted it for my purposes. Since I control the implementation I can guarantee consistent results across platforms and .net versions, and the distribution of values is also much better.

In order to convince myself of the unlikelihood of collisions, I found a large (58,000) list of English words on the internet and ran them through the hash. No collisions at 32 bits. Or 31 or 30 bits. 1 at 29 bits, and 2 at 28 bits. Seemed good enough for me.

Just for kicks I ran it through the .net GetHashCode() implementation too. 1 collision at 32 bits (“nominee” and “mineshaft”), and 14 collisions at 28 bits.

So now I’ve converted most of my code that needs this to using these hash ids, eliminating a lot of the data files I needed to maintain and simplifying my code quite a bit. In many cases, I don’t even need to load the actual strings into the game – the ids are all resolved at compile time, and they strings themselves never need to be seen by the user (they can be loaded into the debug version of the game or the game editor though). I will at some point need to write some code that warns collisions, but that shouldn’t be very difficult.

I use these ids for many purposes: for variables in scripts, to identify words in NPC conversations, and to identify named game entities.

For game entities, I need to be careful. Not all game entities have names. They only need a name if they are a unique item in the world and they need to be referenced from a script. A random piece of fence, a chair, or a weapon dropped as loot in combat do not need names. They may still need a unique identifier, however. If that weapon is moved into a character’s inventory, I need some way to track that.

I don’t want two separate notions of id though, so I have decided to allocate specific zones in a 32 bit space to different types of ids. All ids with the high bit set are string hashes (yes, I’m only using 31 bits, but I’m still not worried about collisions). The bottom half is reserved for unnamed entity ids which are allocated by an ever-increasing value that I need to “maintain” during gameplay and as part of a save game (as opposed to during development). Thus I avoid collisions between the two types. (I actually have additional requirements that result in me segregating the 32 bit space into 4 different zones, but I won’t go into those details now).

Generating string hashes is certainly not cheap (you need to check each character of the string), but I don’t need to do it in tight loops anywhere. And as much as possible they are resolved at compile time.


3 comments on “Identifying entities in the game world

  1. AFAIK .net already does some magic behind the scenes which basically turn string comparisons into comparing two pointers or integers (or something). I’ve never tested this, but I would expect comparing strings to be really fast in .Net. So all this would make a whole lot of sense in C/C++, but it’s less useful in .net. (but I might be wrong)
    On the other hand .net might be able to optimize things better when you’re working with simple integers (a value type) compared to strings (a class), but anyway šŸ˜‰

    • Thanks for the reply Sander.
      I think maybe you’re talking about string interning, where .net keeps only a single instance of each unique string around (since strings are immutable). So yes, a string equality test maybe just be a pointer comparison (and of course you could implement the exact same thing in C/C++).

      So, this is just like an id system, but you don’t have control over the “ids” (which would be the .net handle to the object, I suppose). This would present a problem when serializing objects – you would still need to serialize the entire string in order to maintain consistency across instances of the program being run. And, you would require the strings be in memory (increasing the number of .net objects, slowing down garbage collection).

      The additional motivations for what I’ve described here are that: 1) we don’t need to keep the strings in memory (except when its useful for debugging), and 2) since have control over the generation of ids, this allows us to identify/serialize objects “offline” with the 32 bit number (save games, etc…).

  2. […] Clunky way to set property bag variables from C#.To identify variables I use a unique integer id, similar to that described in this post: https://mtnphil.wordpress.com/2012/01/06/identifying-entities-in-the-game-world/ […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Space Quest Historian

Let's Play's, Podcasts, and General Adventure Game Goodness

Harebrained Schemes

Developer's blog for IceFall Games

kosmonaut games

Development blog of "Bounty Road"


Turn up the rez!

bitsquid: development blog

Developer's blog for IceFall Games

Game Development by Sean

Developer's blog for IceFall Games

Lost Garden

Developer's blog for IceFall Games


Developer's blog for IceFall Games

Casey's Blog

Developer's blog for IceFall Games


Developer's blog for IceFall Games

Rendering Evolution

Developer's blog for IceFall Games

Simon schreibt.

Developer's blog for IceFall Games

Dev & Techno-phage

Do Computers Dream of Electric Developper?

- Woolfe -

Developer's blog for IceFall Games

Ferrara Fabio

Game & Application Developer, 3D Animator, Composer.

Clone of Duty: Stonehenge

First Person Shooter coming soon to the XBOX 360

Low Tide Productions

Games and other artsy stuff...


Just another WordPress.com site

Sipty's Writing

Take a look inside the mind of a game developer.

Jonas Kyratzes

Writer, game designer, filmmaker.

%d bloggers like this: