Thanks to leileilol, I found this Quake 2 source code review by Fabien Sanglard, which includes code for implementing a dithering filter on regular BSP textures. Copying and pasting the code was easy, but I soon noticed there were several ways to make a better and faster equivalent.
The first thing to notice, as can be seen in Fabien’s article, is that the algorithm offsets the texture in one direction and clamps it in the other, to make sure that wrong texels are not drawn at the edges of the surfaces.
My first thought was to find ways to eliminate the offset (to make the code faster and to align the pixels properly), but I soon figured out that this wasn’t enough to properly align the dithered pixels around the center of their non-dithered equivalents: the dithering kernel also had to be rewritten. However, by rewriting the kernel I essentially allowed the dithering algorithm to also use negative indexes for the texels, thus requiring the texels’ indexes to be clamped twice, once for positive and once for negative values.
So, I started thinking about how to eliminate the clamping completely.
There were several drawing functions where I wanted to implement the dithering: regular BSP surfaces, turbulent (water, slime, lava, portals) BSP surfaces, sky BSP surfaces, and SPR models. Particles don’t need it, and MDL models still have a lot of changes to be made to their rendering and drawing functions anyway.
When implementing it on turbulent surfaces and on sky surfaces, I noticed that clamping the texels on them was unnecessary, since the indexes of the texels of those surfaces are always bitmasked, and the masking wraps the indexes around in both directions. So, these were easy.
The indexes of the texels of regular BSP surfaces, however, are not bitmasked, because those texels come from the surface cache, which is not tiled. Textures may be tiled into the surface cache, but the surface cache itself isn’t tiled.
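To illustrate why the bitmask makes clamping redundant, here is a minimal sketch. It assumes power-of-two texture dimensions and two's complement integers; the helper names wrap_pow2 and sample_wrapped are hypothetical and do not come from the engine source.

```c
#include <stdint.h>

/* Hypothetical helper: wrap a texel coordinate into [0, size).
 * Assumes size is a power of two and ints are two's complement,
 * so a coordinate of -1 becomes size - 1 instead of reading out of bounds. */
static int wrap_pow2(int coord, int size)
{
    return coord & (size - 1);
}

/* Hypothetical helper: fetch a texel from a tiled, power-of-two texture.
 * Any s or t, positive or negative, lands on a valid texel. */
static uint8_t sample_wrapped(const uint8_t *tex, int width, int height,
                              int s, int t)
{
    return tex[wrap_pow2(t, height) * width + wrap_pow2(s, width)];
}
```

With a 64-texel-wide texture, a dither offset that pushes s to -1 reads column 63, and one that pushes it to 64 reads column 0, which is exactly the neighbouring texel of the tiled texture.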
The surface cache is generated by tiling a texture into it, while simultaneously applying a non-tiled lightmap to the final texels. So, my solution to avoid clamping was to expand the surface cache, padding it around its edges. The padding itself was made by wrapping the texture around (to ensure continuous tiling with other surfaces using the same texture), and by duplicating the lighting of the texels at the edges. This whole process of padding the surface cache was the hardest part, due to a number of small details I had to learn.
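The padding idea can be sketched as follows. This is a simplified model, not the engine's actual surface cache builder: it assumes one light value per surface texel and a plain multiply for lighting, and every name in it (build_padded_surface, wrap_int, clamp_int, and all parameters) is hypothetical.

```c
#include <stdint.h>

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Positive modulo, so negative coordinates wrap to the far side of the texture. */
static int wrap_int(int v, int size)
{
    int r = v % size;
    return r < 0 ? r + size : r;
}

/* Fill a padded, lit surface block.
 * out   : (surf_w + 2*pad) * (surf_h + 2*pad) bytes
 * tex   : tex_w * tex_h texels, treated as tiled
 * light : surf_w * surf_h light values, NOT tiled (simplifying assumption)
 * s0,t0 : texture coordinates of the surface's first texel */
static void build_padded_surface(uint8_t *out,
                                 const uint8_t *tex, int tex_w, int tex_h,
                                 const uint8_t *light, int surf_w, int surf_h,
                                 int s0, int t0, int pad)
{
    int out_w = surf_w + 2 * pad;

    for (int y = -pad; y < surf_h + pad; y++) {
        int ty = wrap_int(t0 + y, tex_h);        /* texture wraps around */
        int ly = clamp_int(y, 0, surf_h - 1);    /* edge lighting is duplicated */

        for (int x = -pad; x < surf_w + pad; x++) {
            int tx = wrap_int(s0 + x, tex_w);
            int lx = clamp_int(x, 0, surf_w - 1);
            int texel = tex[ty * tex_w + tx];
            int lit = light[ly * surf_w + lx];

            out[(y + pad) * out_w + (x + pad)] = (uint8_t)(texel * lit / 255);
        }
    }
}
```

Because the border is filled once when the block is built, the span drawing code can read one texel past any edge without a per-pixel clamp.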
Padding the surface cache requires using more RAM, so I’ve limited this to “mipmap level 0” (which, as far as I know, means non-mipmapped) surfaces, which are the ones closest to the camera. This also means that mipmapped surfaces (levels 1 onwards) can’t use my dithering algorithm… but they don’t need it anyway! So, mipmapped surfaces are still drawn without dithering.
That resulted in a dithered 3D drawing algorithm that is properly aligned, and smoother at the edges of the surfaces. And while doing that, I’ve reduced this algorithm to just two sums per pixel (plus a few other operations at the beginning of each span), significantly reducing its impact on the framerate.
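The actual span code isn't shown here, so the following is only one possible way to reach two additions per pixel, under stated assumptions: 16.16 fixed-point coordinates, a 2x2 kernel whose offsets sum to zero on each axis (the values are illustrative, not the ones used in the engine), and a source buffer padded by at least one texel on every side. All names are hypothetical. The trick in this sketch is to fold the kernel into two alternating step values computed once per span.

```c
#include <stdint.h>

#define FRAC_BITS 16
#define FRAC_ONE  (1 << FRAC_BITS)

/* Illustrative centered kernel, indexed [y & 1][x & 1], in 16.16 texels.
 * Each axis sums to zero, so the dithered image is not shifted on average. */
static const int32_t kernel_s[2][2] = {
    { -FRAC_ONE * 3 / 8,  FRAC_ONE * 1 / 8 },
    {  FRAC_ONE * 3 / 8, -FRAC_ONE * 1 / 8 },
};
static const int32_t kernel_t[2][2] = {
    { -FRAC_ONE * 1 / 8,  FRAC_ONE * 3 / 8 },
    {  FRAC_ONE * 1 / 8, -FRAC_ONE * 3 / 8 },
};

/* Draw one horizontal span from a padded source block.
 * s, t are unpadded 16.16 coordinates of the first pixel; pad must be >= 1. */
static void draw_dithered_span(uint8_t *dest, int count, int x, int y,
                               const uint8_t *padded, int padded_w, int pad,
                               int32_t s, int32_t t,
                               int32_t sstep, int32_t tstep)
{
    const int32_t *ks = kernel_s[y & 1];
    const int32_t *kt = kernel_t[y & 1];
    int k = x & 1;

    /* Per-span setup: step from parity k to parity 1-k, kernel included. */
    int32_t ds[2] = { sstep + ks[1] - ks[0], sstep + ks[0] - ks[1] };
    int32_t dt[2] = { tstep + kt[1] - kt[0], tstep + kt[0] - kt[1] };

    /* Bias into the padded block and apply the first pixel's offset. */
    s += pad * FRAC_ONE + ks[k];
    t += pad * FRAC_ONE + kt[k];

    for (int i = 0; i < count; i++) {
        dest[i] = padded[(t >> FRAC_BITS) * padded_w + (s >> FRAC_BITS)];
        s += ds[k];   /* the only two additions on the coordinates per pixel */
        t += dt[k];
        k ^= 1;
    }
}
```

Since every kernel offset is smaller than one texel and the coordinates are biased by the padding, the shifted values stay non-negative and inside the padded block, so the inner loop needs neither a clamp nor a mask.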