Introducing "warp resize"
In an earlier blog entry on video shaders I introduced an algorithm called warpedge as an example. It's a hybrid warpsharp+resize algorithm that attempts to make edges as crisp as possible. The algorithm's output has proven interesting enough that I made a VirtualDub video filter out of it:
Warp resize, version 1.1
Warp resize, version 1.1 source code
It only works when enlarging a video, and it requires a lot of CPU power. Read on for the algorithm description.
The basic "warp sharp" algorithm
Warp sharp is the name of an algorithm that I originally found coded as a filter for an image editing application called "The GIMP." It's based on the idea that if you can identify the edges in an image, you can warp the image to shrink the edges and thus make them appear sharper. I don't remember how the original code worked, but here's one way to implement such an algorithm:
- Convert the original image to grayscale.
- Compute a gradient map from the grayscale image. For each pixel in the original image, the gradient map encodes a vector that indicates the direction and rate of fastest ascent in luminance. The usual way to do this is through a pair of edge detection filters that compute horizontal and vertical gradient. In a 3x3 window, subtracting (right-left) and (bottom-top) is one way to do this. Sobel filters are also popular:
    horizontal:        vertical:
    -1  0 +1           -1 -2 -1
    -2  0 +2            0  0  0
    -1  0 +1           +1 +2 +1
(Both filters can be computed together in 10 adds; how this is done is left as an exercise to the reader.)
- Convert the gradient map to a bump map by taking the length of each gradient vector.
- Optionally, apply a blur to the bump map.
- Compute a second gradient map from the bump map.
- Use the gradient map as a displacement map for warping the original image — that is, you use a scaled version of the gradient vector at each pixel to perturb the sampling location in the original image. In other words, given a gradient (du, dv), you retrieve (u+du*k, v+dv*k) from the source, where k is usually set such that the displacement is under a single pixel. I usually use a cardinal bicubic interpolator for the warp.
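For concreteness, here is a minimal sketch of the above pipeline in C++. It is not the filter's actual source: the Image type and constants are illustrative, a bilinear sampler stands in for the cardinal bicubic interpolator, the optional blur is omitted, and for brevity the displacement is computed from, and applied to, the same single channel (the real algorithm derives it from a grayscale copy and warps the full-color original).

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Single-channel float image, row-major; a stand-in for the real filter's buffers.
    struct Image {
        int w, h;
        std::vector<float> px;
        float at(int x, int y) const {              // clamp-to-edge addressing
            x = std::clamp(x, 0, w - 1);
            y = std::clamp(y, 0, h - 1);
            return px[y * w + x];
        }
    };

    // 3x3 Sobel gradient at (x, y): du = horizontal filter, dv = vertical filter.
    static void sobel(const Image& g, int x, int y, float& du, float& dv) {
        du = (g.at(x+1,y-1) + 2*g.at(x+1,y) + g.at(x+1,y+1))
           - (g.at(x-1,y-1) + 2*g.at(x-1,y) + g.at(x-1,y+1));
        dv = (g.at(x-1,y+1) + 2*g.at(x,y+1) + g.at(x+1,y+1))
           - (g.at(x-1,y-1) + 2*g.at(x,y-1) + g.at(x+1,y-1));
    }

    // Sample at a fractional position. The real filter uses a cardinal bicubic
    // interpolator here; bilinear keeps the sketch short.
    static float sampleBilinear(const Image& img, float u, float v) {
        int x = (int)std::floor(u), y = (int)std::floor(v);
        float fx = u - x, fy = v - y;
        float top = img.at(x, y)     * (1 - fx) + img.at(x + 1, y)     * fx;
        float bot = img.at(x, y + 1) * (1 - fx) + img.at(x + 1, y + 1) * fx;
        return top * (1 - fy) + bot * fy;
    }

    // The whole warp sharp pass. k is the displacement scale factor; it should
    // be small enough to keep the displacement under a single pixel, and its
    // sign must be chosen so the warp narrows edges rather than widening them.
    Image warpSharp(const Image& src, float k) {
        // Gradient of the source, collapsed to a bump map (vector length).
        Image bump{src.w, src.h, std::vector<float>(src.w * src.h)};
        for (int y = 0; y < src.h; ++y)
            for (int x = 0; x < src.w; ++x) {
                float du, dv;
                sobel(src, x, y, du, dv);
                bump.px[y * src.w + x] = std::sqrt(du*du + dv*dv);
            }
        // (The optional blur of the bump map is omitted.)

        // Gradient of the bump map becomes the displacement map for the warp.
        Image dst{src.w, src.h, std::vector<float>(src.w * src.h)};
        for (int y = 0; y < src.h; ++y)
            for (int x = 0; x < src.w; ++x) {
                float du, dv;
                sobel(bump, x, y, du, dv);
                dst.px[y * src.w + x] = sampleBilinear(src, x + du * k, y + dv * k);
            }
        return dst;
    }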
Problems with warp sharp
The first problem with warp sharp is that the length of the displacement map vectors is dependent upon the height of the bump map, which is in turn dependent upon the contrast of the original image. This means that the amount of narrowing of edges varies with local contrast, which produces uneven edges.
A second problem has to do with the warp. If the warp is done using straight interpolation, the algorithm will never output a pixel that is not in the infinitely interpolated version of the original image. In practice this means you can get "bumpiness" in the edges of the warped image, since the warp interpolator doesn't do edge-directed interpolation. You can often see this in near-horizontal or near-vertical lines, where the interpolator creates gray pixels when the line straddles a scan line boundary. Unfortunately, while there are edge-directed interpolation algorithms, they usually don't produce continuous output. For example, Xin Li's New Edge-Directed Interpolation (NEDI) algorithm only produces a rigid 2:1 enlargement. This doesn't help with a displacement-map based warp, which requires fine perturbations in the sampling location.
A third problem is that since warp sharp essentially narrows transition regions, it has a tendency to crystallize images into discrete regions, so that the result resembles a Voronoi diagram. Blurring the bump map decreases the filter's sensitivity to noise and mitigates this somewhat.
The warp resize algorithm
To solve the contrast sensitivity, warp resize normalizes the gradient vectors produced in the second pass. Straight normalization has the downsides of amplifying noise and failing on (0,0) vectors, so the normalization scale factor is lerped toward 1.0 as the vector becomes very short. Also, the gradient vectors are computed on the full-size version of the image, so that the normalization occurs after resampling; otherwise, the resampling operation would denormalize the gradient vectors, which would introduce artifacts into the warp on original pixel boundaries.
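A minimal sketch of that normalization step follows; the cutoff length below which the scale blends back toward 1.0 is an illustrative guess, not the filter's actual constant.

    #include <cmath>

    // Normalize a displacement vector (du, dv) in place. Very short vectors
    // would blow up under 1/length, and (0,0) has no direction at all, so
    // below `cutoff` the scale factor is lerped back toward 1.0 instead. The
    // blend is continuous at the cutoff: both branches give 1/cutoff there.
    void normalizeDisplacement(float& du, float& dv, float cutoff = 0.05f) {
        float len = std::sqrt(du*du + dv*dv);
        float scale;
        if (len >= cutoff) {
            scale = 1.0f / len;                          // true normalization
        } else {
            float t = len / cutoff;                      // 0 at (0,0), 1 at cutoff
            scale = 1.0f + t * (1.0f / cutoff - 1.0f);   // lerp 1.0 -> 1/cutoff
        }
        du *= scale;
        dv *= scale;
    }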
The issue with the warp interpolation is corrected using the following ugly hack: we know that the problem is worst between pixel boundaries, where the interpolator is "most inaccurate" with regard to edges. So what we do is compute the difference between the warped pixel and the plainly interpolated pixel, and re-add some of that difference to emphasize it. The difference is greater where the slope of the edge is higher, and thus this tends to flatten out edges and make borders crisper.
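Per pixel, the hack amounts to something like this (a hedged sketch; the 0..255 clamp assumes 8-bit samples):

    #include <algorithm>

    // Re-add part of the (warped - resized) difference to emphasize it.
    // `warped` is the displacement-warped sample, `resized` the plainly
    // interpolated one, and `e` the emphasis factor.
    inline float emphasize(float warped, float resized, float e) {
        return std::clamp(warped + e * (warped - resized), 0.0f, 255.0f);
    }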
Throw in a liberal amount of high-powered, slightly-expensive interpolation to taste.
Thus, the algorithm used by the CPU version of the filter is as follows:
- Convert the image to grayscale.
- Compute a gradient map using Sobel filters.
- Resample the gradient map to the desired size using a B-spline bicubic filter. A B-spline bicubic filter has no negative lobes and thus introduces no ringing into the gradient map, at the cost of some blurriness compared to a cardinal spline filter. Mild blurring is desirable anyway to improve the smoothness of the gradients. (Note that since both the gradient determination and B-spline resample are linear operations, grad(resample(x)) is the same as resample(grad(x)).)
- Convert the gradient map to a bump map by taking the length of each vector. This bump map shows the areas of highest intensity change — in a way, the "edges of the edges."
- Compute a displacement map from the bump map, again using Sobel filters.
- Normalize the displacement map.
- Resample the original image to the desired size using a separable cardinal spline bicubic filter. Call the resultant image R.
- Warp the original image by sampling the original image at (u+du*k, v+dv*k) using a cardinal bicubic spline filter, where (u,v) is the location of the current pixel, (du,dv) is the vector at that position in the displacement map, and k is a displacement scale factor. k is chosen such that the warp narrows regions of high intensity change (its sign depends on the sign of the gradient filters). Call the resultant image D.
- Compute D + e(D-R) as the final image, where e is the factor for emphasizing changes in the image. For the CPU filter I just used 1.0.
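As a sketch of the warp step, here is a separable cardinal bicubic sampler built on the Catmull-Rom kernel, one common choice of cardinal cubic (the filter's actual spline tension may differ), reusing the clamp-to-edge Image type from the warp sharp sketch above:

    #include <cmath>

    // Catmull-Rom weights for fractional offset t in [0,1).
    static void catmullRomWeights(float t, float w[4]) {
        float t2 = t * t, t3 = t2 * t;
        w[0] = 0.5f * (-t3 + 2*t2 - t);
        w[1] = 0.5f * (3*t3 - 5*t2 + 2);
        w[2] = 0.5f * (-3*t3 + 4*t2 + t);
        w[3] = 0.5f * (t3 - t2);
    }

    // Separable cardinal bicubic sample of a single-channel image at (u, v).
    float sampleBicubic(const Image& img, float u, float v) {
        int x = (int)std::floor(u), y = (int)std::floor(v);
        float wx[4], wy[4];
        catmullRomWeights(u - x, wx);
        catmullRomWeights(v - y, wy);
        float sum = 0.0f;
        for (int j = 0; j < 4; ++j)
            for (int i = 0; i < 4; ++i)
                sum += wy[j] * wx[i] * img.at(x - 1 + i, y - 1 + j);
        return sum;
    }

    // The warp step: sample the original at the displaced position. dispU and
    // dispV hold the normalized displacement map, k the displacement scale.
    float warpSample(const Image& src, const Image& dispU, const Image& dispV,
                     int x, int y, float k) {
        return sampleBicubic(src, x + dispU.at(x, y) * k, y + dispV.at(x, y) * k);
    }

The final image then comes from feeding the warped sample D and the plain resample R through D + e(D-R), as in the emphasis snippet earlier.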
The source code is a bit of a mess because it interleaves the first six passes together as a bank of row filters in order to increase cache locality. If you are looking into reimplementing the algorithm, I highly recommend either working from the description above or starting with the HLSL code for the original GPU filter, which is much easier to understand.
As you might guess, the warp resize algorithm is better suited to cartoon-style imagery than to natural images, although it can help on the latter too. There is also an issue with the filter sometimes introducing a gap in the center of an edge, which splits the edge in two; I haven't figured out exactly what causes this, but it is sometimes caused by the displacement vectors being too strong. What's weird is that the effect is sometimes very limited, such as a hair-width gap appearing in a 20-pixel-wide edge after 4x enlargement. I think this may have to do with the gradient vector swinging very close to (0,0) as the current pixel position crosses the true center of an edge, at which point the filter can't do anything to the image.
The CPU version is approximately one-sixth of the speed of the GPU version, even with MMX optimizations. This is comparing a Pentium M 1.86GHz against a GeForce Go 6800, so it's a little bit unbalanced; a Pentium 4 or Athlon 64 would probably narrow this gap a bit. Also, the GPU version does nasty hacks with bilinear texture sampling hardware to get the final three passes running fast enough and with a low enough instruction count to fit in pixel shader 2.0 limits, so its accuracy suffers compared to the CPU version. If you look closely, you can see some speckling and fine checkerboard patterns that aren't present in the output of the CPU filter. Increasing the precision of the GPU filter to match would narrow the gap further.