This memo could be an information dump in a Christopher Nolan movie. Zero shame.
« A pixel is not a little square,
a pixel is not a little square,
a pixel is not a little square!
(And a voxel is not a little cube) »
This text is a synthesis, with some heavy modifications to make it simpler for non-technical people, of a famous eponym technical memo written by Alvy Ray Smith in 1995 during his time at Microsoft. Alvy Ray Smith is one of the co-founders of Pixar and a genius. I tried to keep the original text where I could! Although it still lacks illustrations according to me.
In a time where voxel (or the idea that some have of voxel I should say) seems to be the next big thing for g@m3rz, I find interesting to dig out this vision again. I wrote this during my internship time at Ubisoft, where voxels were seen as the next big thing in gaming by HQ.
My purpose here is to, once and for all, rid the world of the misconception that a pixel is a little geometric square. This is not a religious issue. This is an issue that strikes right at the root of correct image computing. The little square model is simply incorrect. It harms. It gets in the way. If you find yourself thinking that a pixel is a little square, please read this paper.
I will have succeeded if you at least understand that you are using the model and why it is permissible in your case to do so (is it?).
Everything I say about little squares and pixels in the 2D case applies equally well to little cubes and voxels in 3D. The generalization is straightforward, so I won’t mention it from hereon.
The Little Square Model
The little square model pretends to represent a pixel (picture element) as a geometric square. Thus pixel (i, j) is assumed to correspond to the area of the plane bounded by the square {(x, y) | i-.5 E x E i+.5, j-.5 E y E j+.5}.
I have already, with this simple definition, entered the territory of controversy—a misguided (or at least irrelevant) controversy as I will attempt to show. There is typically an argument about whether the pixel “center” lies on the integers or the half-integers. The “half-integerists” would have pixel (i, j) correspond instead to the area of the plane {(x, y) | i £ x £ i+1., j £ y £ j+1.}.
See the little squares? Pixels would have edges and centers by this formulation.
So, What Is a Pixel?
A pixel is a point sample. It exists only at a point. For a color picture, a pixel might actually contain three samples, one for each primary color contributing to the picture at the sampling point. We can still think of this as a point sample of a color. But we cannot think of a pixel as a square—or anything other than a point.
There is a famous theorem called the Sampling Theorem, which tells us that we can reconstruct a continuous entity from a discrete entity using an appropriate reconstruction filter. That allows us to represent an image as a rectilinear array of point samples (pixels!).
There are many types of reconstruction filter out there, and the point of this memo is not to detail all of them, or to explain the maths behind them but to show you the basics.
Let’s take a picture:
Our picture has here 5x4 pixels, which contain color information. To fill the blanks, we will use a reconstruction filter, which can be modelled with its footprint:
What this shows is that this filter will consider 5 pixels around a pixel center to build color inside its footprint (this is a huge simplification but it’s the general idea). So, when you apply a filter to each pixel, you get something like this:
So, the final image will “look” like this:
If we change our reconstruction filter, for a “bigger” one, one that will consider more pixel information to build its color, such as this one:
Then the final image will “look” like this:
You normally feel that it’s a better reconstruction: there is a lot more overlap between each filter footprint, it will be more precise, just because knowledge sharing is better (image scientists could murder for that, but it’s the general idea).
Now let’s look at the worst case: a box filter. Which looks like our little square model right?
Then our image is reconstructed like this:
It is a valid reconstruction, but very poor in quality compared to the other filters, resulting in “jaggies” and general “aliasing”.
Why Is the Little square Model So Persistent?
I believe there are two principal reasons that the little square model hasn’t simply gone away:
- Geometry-based computer graphics uses it
- Video magnification of computer displays appears to show it
Geometry-based computer graphics (3D synthesis, CGI, etc.) has solved some very difficult problems over the last four decades by assuming that the world they model could be divided into little squares. Rendering is the process of converting abstract geometry into viewable pixels that can be displayed on a computer screen or written to film or video for display.
A modern computer graphics model can have millions of polygons contributing to a single image. How are all these millions of geometric things to be resolved into a regular array of pixels for display? Answer:
-
Simplify the problem by assuming the rectangular viewport on the model is divided regularly into little squares, one per final pixel.
-
Solve the often-intense hidden surface problem presented by this little square part of the viewport.
-
Average the results into a color sample.
This is, of course, exactly box filtering. And it works, even though it is low order filtering.
We probably wouldn’t be where we are today in computer graphics without this simplifying assumption. But, this is no reason to identify the model of geometric contributions to a pixel with the pixel. I often meet extremely intelligent and accomplished geometry-based computer graphicists who have leapt to the identification of the little square simplification with the pixel. This is not a plea for them to desist from use of the little square model. It is a plea for them to be aware of the simplification involved and to understand that the other half of computer picturing—the half that uses no geometry at all, the imaging half—tries to avoid this very simplification for quality reasons.
When one “magnifies” or “zooms in on” an image in most popular applications, each pixel appears to be a little square. The higher the magnification or the closer in the zoom, the bigger the little squares get.
Since I am apparently magnifying the pixel, it must be a little square, right? No, this is a false perception.
What is happening when you zoom in is this: Each point sample is being replicated MxM times, for magnification factor M. When you look at an image consisting of MxM pixels all of the same color, guess what you see: A square of that solid color!
It is not an accurate picture of the pixel below. It is a bunch of pixels approximating what you would see if a reconstruction with a box filter were performed.
To do a true zoom requires a resampling operation and is much slower than a video card can comfortably support in realtime today. So, the plea here is to please disregard the squareness of zoomed in “pixels”. You are really seeing an MxM array of point samples, not a single point sample rendered large.
But wait a minute: my monitor is made of pixels, right?
Well. Many people are aware that a color monitor often has little triads of dots that cause the perception of color at normal viewing distances. There is no fixed mapping between triads and pixels driving them. The easiest way to understand this is to consider your own graphics card.
Most modern cards support a variety of different color resolutions—eg, 640x480, 800x600, 1024x768, etc. The number of triads on your display screen do not change as you change the number of pixels driving them.
Summary
I have presented a brief but inclusive analysis of sampling and filtering. It has been shown that the little square model does not arise naturally in sampling theory, the main underpinning of everything we do. The geometry world uses it a great deal because they have had to simplify in order to accomplish. Their simplified model of contributions to a pixel should not be confused with or identified with the pixel. Magnified screen pixels that look like little squares have been shown to be a quick and dirty trick (pixel replication) by graphics boards designers, but not the truth.
In short, the little square model should be suspect whenever it is encountered. It should be used with great care, if at all, and certainly not offered to the world as “standard” in image computing.
Conclusion: A Voxel is not a little cube!
This part is me (JB Siraudin) writing.
You may have found the previous part a little obscure. After all, simplifying a pixel to a little square didn’t hurt that much the computer graphics industry or the video game industry. Many sampling or anti-aliasing strategies made up for that.
Now let’s take voxels.
Voxels are not cubes. Voxels are points on 3D grid. It is simple to represent them like cubes but they can be in any shape. If I don’t think the little square model is dangerous anymore, cubic voxels shouldn’t be the norm. Our mental shape of a voxel is determined by voxelart and Minecraft whereas we should turn to fluid simulation as an example of good use of voxels.
If we think of voxels as 3d points for sampling methods, they can be applied on top of practically any 3D representation technique, like polygon meshes. In fact, it’s already used in advanced lighting algorithms. Just as pixels, one could imagine reconstruction filters for remeshing voxellised shapes.
The vision of voxels at Ubisoft is to consider them as atoms to run large and “accurate” physics simulations to create more believable worlds and better sandboxes. I’m not saying voxels is a bad idea to accomplish this vision. I’m saying we can do it without seeing cubes everywhere. And also, not all physics simulations are made to run with an “atomic” model.
One must think of voxels as a shape for data storage and data manipulation, not as a 3D shape. And this is all what matters for future world building and simulation: data transformation. Voxels could be a strong basis for this, or not.
The real future of video game for Ubisoft is not voxels per se, it’s data-oriented design. Where data is the basis of the engine, where multiple simulations can run on the same set of data so you don’t need to manually create links between systems. I believe more in this as the key to create complex intricate systems, not to modelize everything in the world as little cubes.