Exploring hardware overlay support in Windows 7
I finally got around to trying out the new hardware overlay support in Windows 7:
http://msdn.microsoft.com/en-us/library/dd797814(VS.85).aspx
Hardware overlays are a quirk of video hardware that has survived in spite of its lack of evolution. They're essentially a secondary display scan-out path in the video chip, intended for video display, so that a video in a window can use a display format better suited to playback than the one used by the rest of the desktop. The biggest advantages of a hardware overlay are hardware accelerated scaling and color conversion from YCbCr to RGB, formerly very expensive operations; in some cases, you also got primitive deinterlacing and some additional TV-out support. Unfortunately, driver support for them was often buggy. Windows Vista appeared to be the end of the line for overlays, as they were not supported in desktop composition mode, but guess what... they're back in Windows 7. As it turns out, hardware overlays are still valuable for a couple of reasons: you can flip them faster and asynchronously from the composited desktop (good for performance), and their image can't be captured in a screen grab operation (good for the paranoid). And this time, they have a few improvements, too.
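To give a sense of what that YCbCr-to-RGB conversion costs per pixel when done in software, here's a minimal sketch of the widely published 8-bit fixed-point approximation for studio-range Rec.601 input. The names Rgb8, clamp8, and ycbcr601_to_rgb are made up for illustration, and this is only the math, not a claim about what any particular overlay hardware does internally.

```cpp
#include <cstdint>

// Hypothetical pixel type for this sketch.
struct Rgb8 { std::uint8_t r, g, b; };

// Clamp an intermediate result to the 0-255 range of an 8-bit channel.
constexpr std::uint8_t clamp8(int v) {
    return static_cast<std::uint8_t>(v < 0 ? 0 : (v > 255 ? 255 : v));
}

// Convert one studio-range Rec.601 YCbCr sample (Y 16-235, Cb/Cr 16-240)
// to full-range RGB. Coefficients are scaled by 256; the +128 rounds
// before the divide. Out-of-range results are clamped.
constexpr Rgb8 ycbcr601_to_rgb(int y, int cb, int cr) {
    return Rgb8{
        clamp8((298 * (y - 16) + 409 * (cr - 128) + 128) / 256),
        clamp8((298 * (y - 16) - 100 * (cb - 128) - 208 * (cr - 128) + 128) / 256),
        clamp8((298 * (y - 16) + 516 * (cb - 128) + 128) / 256)
    };
}
```

With this version, studio black (16, 128, 128) comes out as RGB (0, 0, 0) and studio white (235, 128, 128) as (255, 255, 255). That's three multiply-adds and a clamp per channel for every pixel of every frame, before you even get to scaling, which is why offloading it to the scan-out path used to matter so much.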
The way you get to overlays is a bit different in Windows 7. In older versions, they were a feature of DirectDraw, and that meant you basically couldn't do anything other than lock-and-load -- DirectDraw couldn't even do color conversion, other than what the hardware overlay itself supported. This time, they're hooked up to Direct3D, which makes them a lot more useful since you can process video through DXVA and shader hardware and push the result into the overlay. The color space is now better defined, as there are flags for whether RGB output is computer RGB (0-255) or studio RGB (16-235), and whether YCbCr output is Rec.601/709 or xvYCC. So far, so good -- time to try it!
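For reference, the difference between the two RGB ranges is just a linear rescale of each channel. Here's a rough sketch of the mapping in both directions; the function names are hypothetical, and the rounding and clamping choices are my own assumptions rather than anything the API specifies.

```cpp
#include <cstdint>

// Map a computer RGB channel value (0-255) onto studio range (16-235).
// 219 is the width of the studio range; the +127 rounds to nearest.
constexpr std::uint8_t full_to_studio(std::uint8_t v) {
    return static_cast<std::uint8_t>(16 + (v * 219 + 127) / 255);
}

// Map a studio RGB channel value back to computer range. Values at or
// below 16 clamp to 0 and values at or above 235 clamp to 255.
constexpr std::uint8_t studio_to_full(std::uint8_t v) {
    return static_cast<std::uint8_t>(
        v <= 16 ? 0 : (v >= 235 ? 255 : ((v - 16) * 255 + 109) / 219));
}
```

If a display path really treats its input as studio RGB, everything from 0 through 16 should look like the same black and everything from 235 through 255 like the same white, which makes a ramp across those values a handy test pattern.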
Now, it's not easy to get VirtualDub talking to the new overlays yet, for two reasons: the existing overlay code uses DirectDraw, and the Direct3D path only supports D3D9, whereas you need Direct3D9Ex in order to use the overlays. Therefore, I ended up just writing a one-off application to test it. Well, having gotten overlays to work and tested them a bit, I have to say they're a bit underwhelming. The good news:
- You create the overlay by using D3DSWAPEFFECT_OVERLAY as the swap effect, and you get a couple of new Present() options. That's pretty painless. You can even make your implicit swap chain the overlay.
- Any time the window is translucent, temporarily scaled, or tilted as a result of animation, the overlay goes away. This includes Flip 3D (Win+Tab). Not surprising, given that overlays are 2D screen-based entities and don't have 3D mapping support. What you get instead is the color key, which is usually a reasonable dark color, i.e. not obnoxious pink.
- Direct3D chooses and paints the color key for you. That's good. However, you have to call Present() with D3DPRESENT_UPDATECOLORKEY in your WM_PAINT handler.
Now, the bad news:
- About half of the documentation on overlays in the current August 2009 DirectX SDK is TBD. Fortunately, the meanings of the constants are pretty obvious if you've used DirectDraw overlays.
- DXCapsViewer doesn't know about any of the overlay caps.
- The debug runtime doesn't seem to have been updated for overlay errors, either.
- The overlay doesn't necessarily support stretching, and the one on my video card didn't. That means you have to have a swap chain at least as large as the output size and then do a 1:1 subrect Present(), and make sure you don't blow up during window animation when you get minimized. Because reallocating the swap chain can be expensive, Microsoft recommends that you create one the size of the screen. Sigh. I suppose this isn't too bad, because half the time when the hardware vendors did stretching they did it wrong anyway.
- The sample code in MSDN recommends that you set the D3DPRESENTFLAG_VIDEO bit... you know, the one that has no documentation as to what it does, does nothing on many systems, is rumored to do something neat on TV-out on some systems, and might format the user's hard drive on another for all I know.
- The video driver set the LIMITEDRGB caps bit, but the corresponding D3DPRESENTFLAG_OVERLAY_LIMITEDRGB didn't seem to do anything -- I could still see a difference between 0-15 vs. 16 and 236-255 vs. 235, and the overlay looked the same as a non-overlay screen even on TV-out. So if you're excited about finally having studio RGB output, sorry, you're still SOL. (WHQL certified my #$*&....)
- It's nice that there is now well-specified YCbCr support. Unfortunately, they forgot to mention any YCbCr surface formats you could actually use. My video driver set all of the YCbCr caps bits, but none of the FOURCCs I tried worked, including AYUV, UYVY, YUY2, YV12, NV12, and everything else that was enumerated in the DirectDraw FOURCC list. This is made doubly difficult by the fact that you're creating a render target and have to be able to draw polygons on the target format, so it's unlikely that the goofier subsampled formats would actually work.
- You can get weird errors if display cloning is enabled. The API is supposed to return D3DERR_NOTAVAILABLE if the overlay is occupied, but I got D3DERR_OUTOFVIDEOMEMORY instead. I'm pretty sure that two displays and one render target take up far less than 256MB of video memory.
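As a rough illustration of the no-stretch workaround above, here's a sketch of computing the subrect for a 1:1 Present() from a screen-sized swap chain. Rect is a stand-in with the same field layout as the Win32 RECT so the snippet stands alone, and the function names are hypothetical; this isn't lifted from my test program.

```cpp
// Stand-in with the same field layout as the Win32 RECT, so the sketch
// compiles without <windows.h>.
struct Rect { long left, top, right, bottom; };

// Clamp a requested extent to the range [0, limit].
constexpr long clamp_extent(long want, long limit) {
    return want < 0 ? 0 : (want > limit ? limit : want);
}

// Compute the subrect to present without scaling: the client area size,
// clipped to the size of the back buffer.
constexpr Rect unscaled_present_rect(long clientW, long clientH,
                                     long bufW, long bufH) {
    return Rect{0, 0, clamp_extent(clientW, bufW), clamp_extent(clientH, bufH)};
}

// True if the rectangle covers no pixels, e.g. while the window is minimized.
constexpr bool is_empty(const Rect& r) {
    return r.right <= r.left || r.bottom <= r.top;
}
```

The idea is to use the same rectangle on both the source and destination side so that no stretch is ever requested, and to skip the Present() call entirely while is_empty() is true, which covers the minimized case.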
Disclaimer: I used Windows 7 RC for testing, since I don't have RTM.
So, what do we have at this point? Well, you can create a non-stretched RGB overlay with regular 0-255 range. That means unless you are either being hampered by a low desktop composition rate or need to prevent your image from being grabbed, it's unlikely that the overlay will have any advantages over plain blitting to the desktop. And of course, you need to use Direct3D9Ex to access overlays, which is juuuust different enough from regular Direct3D9 to be annoying. The situation might get better over time, and it may just be my crappy video card... but I'll wait until I actually see improvement.
If anyone wants to experiment with WDDM 1.1 overlays, here's the code I used to test them: d3doverlay.cpp. It's obviously not production code, but I assume if you're trying to build it you have some familiarity with setting up a project and with D3D9.