This post is going to be a bit of a grab bag about network testing and certification requirements for XBLIG games.
Currently I have temporarily suspended development on my main project, and am working on putting Tank Negotiator through the XBLIG peer review process.
I had done pretty substantial (I thought) testing on it before, and it was in pretty good shape. However, I had misunderstood (or missed) some of the evil checklist rules – particularly the ones about not forcing signed-in controllers to take part in a match. I had some issues with this during network matches, and as a result I pulled my game from peer review.
I also was popping up the storage device selector earlier than I needed to – not a fail, but not good practice either.
So I’ve been substantially revamping the way profile sign-in works (ensuring sign-in is separate from “join”), and delaying the request for a storage device for as long as possible. In doing so, the extra testing I’ve been doing has uncovered a number of other bugs – so this has been a good thing.
A host uses NetworkSession.BeginCreate in order to start a hosting session. This method has a number of overloads, but you really have no choice but to use the one that takes a list of SignedInGamers. The other overloads will automatically add all signed in gamers to the session (up to the max local player count that you specify). And that is annoying, and a reason to fail certification.
I only join the main controller index (the one that started the game). Other players can join the network session after it has been created. I provide “join” UI in the lobby.
For clients that want to join a network session, the SignedInGamers list needs to be provided during the BeginFind (which searches for available sessions) instead of BeginJoin (which joins one of the available sessions). The reason for this is that the match making service needs to know beforehand how many local players are interested in participating, since it won’t return sessions that are already too full.
This essentially means that you need some sort of “join” system – for signed in gamers to express their interest in taking part – prior to searching for network sessions.
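A minimal sketch of such a “join” roster, in plain Python rather than XNA code. `JoinRoster`, its methods, and the controller indices are hypothetical names for illustration. The idea is only that sign-in and join are separate steps, and that the roster is what gets handed to session creation or session search.

```python
class JoinRoster:
    """Tracks which signed-in controllers have opted in to the match."""

    def __init__(self, max_local_players):
        self.max_local_players = max_local_players
        self._joined = []  # controller indices, in join order

    def join(self, controller_index, signed_in):
        """Opt a controller in; returns True if it is in the roster afterwards."""
        if not signed_in:
            return False  # sign-in is a separate step and must happen first
        if controller_index in self._joined:
            return True   # joining twice is harmless
        if len(self._joined) >= self.max_local_players:
            return False  # roster is full
        self._joined.append(controller_index)
        return True

    def participants(self):
        """The list to pass along when creating or searching for a session."""
        return list(self._joined)


roster = JoinRoster(max_local_players=2)
roster.join(0, signed_in=True)   # the controller that started the game
roster.join(2, signed_in=False)  # not signed in, so it cannot join yet
roster.join(3, signed_in=True)   # pressed "join" in the lobby
print(roster.participants())     # [0, 3]
```

Controllers that are signed in but never press “join” simply never appear in the list, which is the behaviour the checklist is after.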
For joining sessions to which you have been invited, you use JoinInvited. Unfortunately there is no overload that takes a list of SignedInGamers. So in this case I really have no choice but to join all signed in players.
Another irritating feature of the XNA/Xbox network sessions is that it is not easy to remove a player from a network session after they have joined. You can certainly add players to a session. But to remove them, the only way is to sign them out completely.
Testing network play
I’ve done pretty extensive testing for network play between 2 devices – I have both automated and stress tests that exercise these code paths. But until earlier today I had, apparently, never tested 3 devices. It’s a bit shocking that I was willing to declare this game ready for peer review without ever testing that scenario!
The game stopped cold when I tried this. Luckily it was just one small issue that was causing everything to go awry.
Instead of letting you send packets to a particular machine in the session, XNA instead lets you send packets to a particular gamer (or all gamers). If you do the easy thing and send to all gamers, then you’ll (1) need to filter out the packet received on the local machine, and (2) ensure you have some sort of de-duping code for when packets are received on the remote machines – if one machine has 2 or 3 players on it, you’ll get 2 or 3 sets of information.
I decided it was safer (code-wise) to instead pick one gamer from each machine, and send the data only to those specific gamers. Unfortunately, after you make the SendData call, the PacketWriter (into which you have been writing the packet data) is automatically emptied.
So the bug in my case was that I was trying to send an empty packet to the 2nd machine. The fix was to first copy all the data in the PacketWriter to a byte array, and then write it back in if I needed to send it to subsequent machines.
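Here is a rough model of the bug and the fix, in plain Python rather than XNA code. `FakePacketWriter` is a hypothetical stand-in that only imitates the one behaviour that matters here (the buffer is emptied by every send); the gamer names and machine ids are made up.

```python
class FakePacketWriter:
    """Stand-in for a writer whose buffer is emptied by every send."""

    def __init__(self):
        self._buffer = bytearray()

    def write(self, data):
        self._buffer.extend(data)

    def snapshot(self):
        return bytes(self._buffer)

    def send_to(self, gamer, outbox):
        outbox.append((gamer, bytes(self._buffer)))
        self._buffer.clear()  # mimics the automatic emptying after a send


def one_gamer_per_remote_machine(gamers, local_machine):
    """gamers is a list of (name, machine_id); keep the first gamer per remote machine."""
    chosen = {}
    for name, machine in gamers:
        if machine != local_machine and machine not in chosen:
            chosen[machine] = name
    return list(chosen.values())


def send_to_each_machine(writer, gamers, local_machine, outbox):
    payload = writer.snapshot()  # copy the data before the first send empties it
    targets = one_gamer_per_remote_machine(gamers, local_machine)
    for i, gamer in enumerate(targets):
        if i > 0:
            writer.write(payload)  # refill for the 2nd, 3rd, ... machine
        writer.send_to(gamer, outbox)


gamers = [("Ann", "A"), ("Bob", "B"), ("Bea", "B"), ("Cy", "C")]
writer = FakePacketWriter()
writer.write(b"state")
outbox = []
send_to_each_machine(writer, gamers, local_machine="A", outbox=outbox)
print(outbox)  # [('Bob', b'state'), ('Cy', b'state')]
```

Without the `payload` copy, the second entry in `outbox` would carry an empty packet, which is exactly what the 3rd machine was receiving.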
I really wish the XNA networking API had been a little better thought out. I’m not sure if the XDK is like this too, or whether it’s simply an XNA thing (PacketWriter is obviously an XNA thing, but I’m talking about the fact that “gamers” (instead of “machines”) are the only abstraction to which you can send data).
A more subtle bug lurked a little further in the shadows. I only hit it after more than half an hour of stress testing.
The powerups in my game lie in the playing area, and can be acquired by any of the players. Two players are not allowed to acquire the same powerup, however. The host is the arbiter of who gets what. This may give some slight advantage to the host, but upon receiving a RequestObject packet (which could also be from one of the players on the host machine), it will wait a short time (which depends on the calculated latency with other machines) before making its decision. This is in case any other competing requests come in. When it’s done, the host sends a RequestObjectDecision to all machines and they update their internal state that indicates who owns the object.
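The wait-then-decide step might look something like this sketch (plain Python, not the game’s code). `PickupArbiter` is a hypothetical name, times are integer milliseconds, and the tie-break rule (the request with the earliest send time wins) is purely an assumption for illustration.

```python
class PickupArbiter:
    """Host-side arbiter: collects competing requests, decides after a short wait."""

    def __init__(self, wait_ms):
        self.wait_ms = wait_ms  # would be derived from measured latency
        self._pending = {}      # object_id -> (deadline, [(sent_at, player), ...])
        self._owners = {}       # object_id -> winning player

    def request(self, object_id, player, sent_at, now):
        if object_id in self._owners:
            return  # already decided; late requests are ignored
        if object_id not in self._pending:
            self._pending[object_id] = (now + self.wait_ms, [])
        self._pending[object_id][1].append((sent_at, player))

    def decide(self, now):
        """Return {object_id: winner} for every wait window that has closed."""
        decisions = {}
        for object_id, (deadline, requests) in list(self._pending.items()):
            if now >= deadline:
                # Assumption: earliest send time wins; ties go to first arrival.
                winner = min(requests, key=lambda r: r[0])[1]
                self._owners[object_id] = winner
                decisions[object_id] = winner
                del self._pending[object_id]
        return decisions


arbiter = PickupArbiter(wait_ms=100)
arbiter.request("bomb", "host_player", sent_at=1000, now=1000)
arbiter.request("bomb", "remote_player", sent_at=950, now=1050)
early = arbiter.decide(now=1050)  # window still open, nothing decided
late = arbiter.decide(now=1100)   # window closed
print(early, late)                # {} {'bomb': 'remote_player'}
```

Each entry returned by `decide` corresponds to one RequestObjectDecision that the host would then send to every machine.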
Once a player has an object, they can activate it. For instance, if they picked up a bomb, this will drop the bomb. I have code that assumes the bomb has an owner if it is dropped. It crashes if it doesn’t. Makes sense, right?
Except that with 3 machines, one machine can get the ActivateObject packet prior to the RequestObjectDecision packet. So:
- Machine B does a RequestObject
- Machine A, the host, sends a RequestObjectDecision
- Machine B receives the RequestObjectDecision, and the player immediately activates the object
- Machine B sends an ActivateObject
- Machine C receives the ActivateObject
- Machine C crashes, because it still hasn’t received the RequestObjectDecision that assigns an owner to the object
I’m not sure if sending my packets with SendDataOptions.ReliableInOrder instead of SendDataOptions.Reliable would address the issue (since different machines sent the RequestObjectDecision and ActivateObject packets that arrived out of order). In any case, that’s overkill. I can just include the object owner with the ActivateObject packet, and make the code resilient to setting an object’s owner multiple times.
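A minimal sketch of that fix, again in plain Python rather than the game’s code. `ObjectTable` and the handler names are hypothetical; the point is only that the ActivateObject handler carries the owner and that setting the same owner twice is harmless.

```python
class WorldObject:
    def __init__(self, object_id):
        self.object_id = object_id
        self.owner = None
        self.activated = False


class ObjectTable:
    def __init__(self, object_ids):
        self.objects = {oid: WorldObject(oid) for oid in object_ids}

    def set_owner(self, object_id, owner):
        """Idempotent: repeating the same assignment changes nothing."""
        obj = self.objects[object_id]
        if obj.owner is None:
            obj.owner = owner
        return obj.owner == owner  # False signals a conflicting assignment

    def on_request_object_decision(self, object_id, owner):
        self.set_owner(object_id, owner)

    def on_activate_object(self, object_id, owner):
        # The packet carries the owner, so it no longer matters whether
        # the decision packet has arrived yet.
        if self.set_owner(object_id, owner):
            self.objects[object_id].activated = True


# Machine C's view: ActivateObject arrives before RequestObjectDecision.
machine_c = ObjectTable(["bomb"])
machine_c.on_activate_object("bomb", owner="player_on_B")
machine_c.on_request_object_decision("bomb", owner="player_on_B")
bomb = machine_c.objects["bomb"]
print(bomb.owner, bomb.activated)  # player_on_B True
```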
I had always been testing with a SimulatedLatency of 200ms, but for some reason never with SimulatedPacketLoss. I enabled this with some trepidation, but luckily there was no adverse effect on the game. I use SendDataOptions.Reliable for all data except for player movement and weapons firing. It’s ok to lose these packets occasionally.
Test automation for system UI
I have been building a bunch of automated testing for navigating the menus, starting the game, interrupting the game, and just generally exercising all the game options. I simply fake a pre-defined series of button presses.
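The idea in miniature, as a plain Python sketch rather than my actual test harness. `ScriptedInput`, `Menu`, and the button names are hypothetical: the game code asks an input source whether a button was pressed this frame, and the test swaps in a source that replays a fixed script.

```python
class ScriptedInput:
    """Replays (frame, button) pairs as if a controller had produced them."""

    def __init__(self, script):
        self._by_frame = {}
        for frame, button in script:
            self._by_frame.setdefault(frame, set()).add(button)

    def pressed(self, frame, button):
        return button in self._by_frame.get(frame, set())


class Menu:
    def __init__(self, items):
        self.items = items
        self.index = 0
        self.chosen = None

    def update(self, frame, input_source):
        if input_source.pressed(frame, "Down"):
            self.index = (self.index + 1) % len(self.items)
        if input_source.pressed(frame, "A"):
            self.chosen = self.items[self.index]


menu = Menu(["Play", "Options", "Quit"])
script = ScriptedInput([(2, "Down"), (5, "A")])
for frame in range(10):
    menu.update(frame, script)
print(menu.chosen)  # Options
```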
The problem with this is that it doesn’t work when system UI comes up: sign in screens, storage device selectors, etc…
So I’ve been working on ways to abstract those away too. I’m using EasyStorage, so abstracting storage away might not be too difficult. I’ve decided I won’t bother though – I’ll just pretend no storage is ever selected. The storage code has been pretty robust anyway.
My test sequences involve a lot of trickiness with gamer sign-in and sign-outs though. I’d like to be able to automate testing of what happens when players sign in and out at various points. So I created a “mock” version of the SignedInGamers collection on SignedInGamer, along with the SignedIn and SignedOut events. Normally they just hook right up to the real ones. But for testing I should be able to drive these with test data (while pretending the Guide is up waiting for someone to sign in).
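Roughly what the mock looks like, sketched in plain Python rather than XNA code. `FakeSignIn` and its members are hypothetical names; in the real game, the equivalent wrapper would forward to the platform’s collection and events outside of tests.

```python
class FakeSignIn:
    """Test double for the sign-in state and its signed-in/signed-out events."""

    def __init__(self):
        self.signed_in_gamers = {}  # controller index -> gamertag
        self.on_signed_in = []      # handlers called as handler(index, gamertag)
        self.on_signed_out = []
        self.guide_visible = False

    def show_sign_in(self):
        self.guide_visible = True   # pretend the Guide is up, waiting

    def sign_in(self, index, gamertag):
        self.signed_in_gamers[index] = gamertag
        self.guide_visible = False
        for handler in self.on_signed_in:
            handler(index, gamertag)

    def sign_out(self, index):
        gamertag = self.signed_in_gamers.pop(index, None)
        if gamertag is not None:
            for handler in self.on_signed_out:
                handler(index, gamertag)


events = []
sign_in = FakeSignIn()
sign_in.on_signed_in.append(lambda i, tag: events.append(("in", i, tag)))
sign_in.on_signed_out.append(lambda i, tag: events.append(("out", i, tag)))

sign_in.show_sign_in()
sign_in.sign_in(1, "TestGamer")  # test data drives the "Guide" result
sign_in.sign_out(1)
sign_in.sign_out(1)              # second sign-out is a no-op
print(events)                    # [('in', 1, 'TestGamer'), ('out', 1, 'TestGamer')]
```

A test sequence can then interleave these calls with scripted button presses to cover sign-ins and sign-outs at awkward moments.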
I haven’t bothered with the other functionality such as ShowMarketplace yet. We’ll see if that’s worth it or not.