How do I sort a series of images by color? See the previous post for the motivation here.
My first iteration was just to take the average of all the opaque pixels of the image. It was a start, but I knew this wasn’t going to be sufficient. That only gives me a single color. It also might give me a completely wrong in-between color if a Pokémon is half one color and half another.
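To make the in-between problem concrete, here is a minimal Python sketch of that first iteration. The function name, the `(r, g, b, a)` tuple format, and the "alpha greater than zero counts as opaque" rule are my assumptions for illustration, not the original code.

```python
def average_color(pixels):
    """Mean RGB of all non-transparent pixels.

    `pixels` is an iterable of (r, g, b, a) tuples with 0-255 channels.
    Treating any alpha above zero as "opaque" is an assumption.
    """
    opaque = [(r, g, b) for r, g, b, a in pixels if a > 0]
    if not opaque:
        return None
    n = len(opaque)
    return tuple(sum(channel) / n for channel in zip(*opaque))


# Hypothetical sprite: half pure red, half pure blue, one transparent pixel.
half_and_half = [(255, 0, 0, 255)] * 2 + [(0, 0, 255, 255)] * 2 + [(0, 255, 0, 0)]
print(average_color(half_and_half))  # (127.5, 0.0, 127.5)
```

The result is a muddy purple that appears nowhere in the sprite, which is exactly the failure described above.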
So then I thought to calculate the hue of each pixel and divide them up into buckets to make a histogram of hues. I could then choose a maximum (or multiple maxima if I wanted to classify it under multiple distinct hues).
Immediately I saw a problem though: if the main color of a Pokémon lay roughly between two buckets, samples might be split between those two buckets, leading another bucket (representing a less important color) to be the “winner”.
There’s probably a correct and complex mathematical solution to this problem, but the immediately obvious thing to do was to make buckets overlap adjacent buckets. So one pixel sample could end up in multiple adjacent buckets and help count towards both their scores.
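One way the overlap could look, as a rough sketch: widen each bucket by half a bucket on both sides, so every sample scores for its own bucket and for whichever neighbor it sits closer to. The bucket count of 12 and the half-bucket overlap are assumptions I picked for the example; the idea only needs adjacent buckets to share some range.

```python
def hue_histogram(hues, bucket_count=12):
    """Count hues (degrees, 0-360) into overlapping buckets.

    Each bucket is widened by half a bucket on both sides, so every
    sample is counted in its own bucket and in the nearer neighbor.
    """
    width = 360.0 / bucket_count
    counts = [0] * bucket_count
    for hue in hues:
        position = (hue % 360.0) / width
        index = int(position) % bucket_count
        counts[index] += 1
        # Lower half of the bucket leans left, upper half leans right.
        step = -1 if position - int(position) < 0.5 else 1
        counts[(index + step) % bucket_count] += 1  # wraps around the hue circle
    return counts


# Four samples straddling the 30-degree boundary, three samples near 200 degrees.
counts = hue_histogram([28, 29, 31, 32, 200, 200, 200])
print(counts[0], counts[1], counts[6], counts[7])  # 4 4 3 3
```

Without the overlap, the main color would be split 2 and 2 and lose to the 3 samples near 200 degrees. With it, both buckets around the boundary score 4.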
So I set about implementing that, but started with only choosing one bucket just to simplify things. One of the first things I noticed was the miscategorization of many Pokémon. Umbreon, for instance, was lumped with the blue Pokémon:
Mousing over the Umbreon sprite in Photoshop, I could see it was dominated by a very unsaturated blue hue (its gray body). I kind of wanted it in the yellow category. So I added a saturation threshold below which I wouldn’t consider pixels.
This helped, but I still noticed some problems. Latias, who is pink/red, was categorized with the purple Pokes. In Photoshop, Latias’ slightly purple upper body was below my saturation threshold, so what was the problem?
Stepping through the debugger, I noticed that the saturation values being calculated for the barely purple pixels were close to 100%. It turns out that System.Drawing.Color.GetSaturation uses the HSL representation, not HSV (despite the documentation saying otherwise). I replaced its implementation with a proper HSV saturation, and the problem was fixed.
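The difference between the two saturation definitions is easy to reproduce. This sketch uses Python's standard `colorsys` module rather than the .NET API, and the pixel value is a made-up near-white purple, not one sampled from the actual Latias sprite.

```python
import colorsys


def saturations(r, g, b):
    """Return (HSL saturation, HSV saturation) for 0-255 RGB channels."""
    rf, gf, bf = r / 255.0, g / 255.0, b / 255.0
    _, _, hsl_s = colorsys.rgb_to_hls(rf, gf, bf)  # note the H, L, S order
    _, hsv_s, _ = colorsys.rgb_to_hsv(rf, gf, bf)
    return hsl_s, hsv_s


# Hypothetical barely-purple, almost white pixel.
hsl_s, hsv_s = saturations(255, 240, 255)
print(f"HSL: {hsl_s:.0%}  HSV: {hsv_s:.0%}")  # HSL: 100%  HSV: 6%
```

HSL saturation divides the channel spread by how far the lightness is from pure white or black, so very light (or very dark) pixels can come out fully saturated. HSV saturation divides by the largest channel, so the same pixel lands well below a 10% threshold.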
I had to lower my saturation threshold to around 10% to correctly categorize some Pokes. But that put Zoroark amongst the purples, due to his just slightly purple dark body:
I realized that it was probably a better idea to weight the samples based on saturation. That fixed Zoroark’s problem:
But oh no: Sylveon, who had previously been categorized correctly, was now in the blues. She’s generally pink, but her blues are just saturated enough to outweigh that.
So I capped the saturation weight at 60%, and that was enough to tip the scales in her favor and bring her back to the pinks:
But now the pink Mewtwos were in the green zone:
Inspecting in Photoshop, I saw that the sprite is ringed by a greenish outline that is dark, but highly saturated.
Of course… dark colors can be highly saturated (look here). So my final tweak was to instead weight by the product of saturation and brightness (value). Mewtwo is in good company now:
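A small sketch of the two weighting schemes, side by side. The function names and both sample colors are hypothetical, and the post doesn't say whether the 60% cap survived into the final version, so this sketch leaves it out of the product weight.

```python
import colorsys


def capped_saturation_weight(r, g, b, cap=0.6):
    """Earlier scheme: HSV saturation, capped at `cap`."""
    _, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return min(s, cap)


def saturation_value_weight(r, g, b):
    """Final scheme: HSV saturation times HSV value (brightness)."""
    _, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return s * v


dark_green_outline = (0, 40, 10)   # hypothetical: dark but fully saturated
pale_pink_body = (230, 170, 200)   # hypothetical: bright, mildly saturated

for name, weight in [("capped saturation", capped_saturation_weight),
                     ("saturation * value", saturation_value_weight)]:
    print(name, round(weight(*dark_green_outline), 3), round(weight(*pale_pink_body), 3))
```

With the capped weight, each outline pixel still counts for more than twice as much as a body pixel. With the product, the dark outline drops below the pink body, because a pixel now has to be both saturated and bright to count for much.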
Finally, it was time to allow for a Poke to appear in multiple color categories. I played around with this for a bit, and came up with the following: I choose every bucket that scores at least 50% of the biggest bucket, and whose adjacent buckets both score less than itself (i.e. it must be a local maximum). It seems suitable.
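The selection rule can be sketched like this. The function name, the `min_ratio` parameter, and the sample scores are made up for illustration. Treating the first and last buckets as neighbors is my assumption, based on hue being circular.

```python
def pick_buckets(scores, min_ratio=0.5):
    """Indices of buckets that are strict local maxima and score at
    least `min_ratio` of the top bucket. The hue axis wraps around."""
    n = len(scores)
    top = max(scores)
    if top <= 0:
        return []
    picked = []
    for i, score in enumerate(scores):
        left, right = scores[i - 1], scores[(i + 1) % n]  # i - 1 wraps via negative index
        if score >= min_ratio * top and score > left and score > right:
            picked.append(i)
    return picked


# Hypothetical weighted scores for 12 hue buckets.
print(pick_buckets([1, 8, 3, 0, 5, 2, 0, 3, 1, 0, 0, 0]))  # [1, 4]
```

Bucket 7 is a local maximum too, but at 3 it falls short of half the top score of 8, so it is dropped. One caveat of the strict comparison: two adjacent buckets with exactly equal scores would both be rejected, so real code would need some tie-break that the rule above doesn't specify.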
The final result looks something like this, with very quick scrolling through the list:
Unfortunately, it has turned out not to be all that useful. It still takes a while to find the Pokémon you want.