DirectX vs. OpenGL

Igor writes:

Not only that Avery, just look at yourself using platform specific DirectX API instead of OpenGL -- how did they manage that?

Uh, well, it shouldn't be too surprising. I'm primarily a Win32/x86 programmer. To some people, it's a minor miracle any time we write code that is in any way portable and doesn't declare at least one HRESULT variable per function.

In this specific case, it's more of an issue of practicality. I know both, I've used both, sometimes I prefer one over the other. I do like portable and multivendor APIs, but for me, deployment, ease of use, stability, and licensing are also factors.

Truth be told, I do like OpenGL better. It has a real spec, written in a precise manner, and the base API is better thought out than Direct3D, which has been hacked together over the years and still has garbage such as the infamous D3DERR_DEVICELOST. I learned a lot about 3D graphics by reading the OpenGL API and extension specs. It's portable, it makes easy things easy (like drawing one triangle), it's extensible, and it's faster since the application talks directly to the user-space IHV driver. Microsoft tried improving its API with Direct3D 10, but the documentation is still incomplete, it's still non-extensible, and it's even less portable than before since it's Vista-only.

Before anyone says anything else, though, theory and practice are a bit different. In practice, OpenGL drivers tend to have their own nice collections of bugs, and some are better optimized than others. Most games use Direct3D, so a lot of effort has been put into optimizing drivers and hardware for that API. Extensions also mean extension hell, and figuring out which extensions are widely available and work as advertised is just as frustrating as Direct3D caps hell. And then there's the shader issue — whereas Direct3D has standardized assembly and HLSL shader languages, OpenGL has gone through numerous vendor-specific combiner-style, assembly-style, and high-level language shader extensions.

Now, as for how this pertains to what I do, well, VirtualDub isn't exactly a 3D-centric application. It does have 3D display paths, and I have dabbled with hardware acceleration for video rendering as well, so here's my VirtualDub-centric take:

When you write OpenGL code, you don't have to write an entire framework to make sure the application doesn't blow up when the screensaver appears. With Direct3D 9 you do: I use lots of render targets, which have to be allocated in D3DPOOL_DEFAULT, and everything in that pool has to be released before a lost device can be reset. What other API requires you to free all objects of certain types at random times?
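
To give a sense of the bookkeeping involved, here is a minimal sketch of that kind of framework. It doesn't call Direct3D at all: DeviceResource, ResourceRegistry, and FakeRenderTarget are hypothetical names made up for illustration, and the comments mark where a real D3D9 program would check TestCooperativeLevel() and call Reset().

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical interface: anything that owns D3DPOOL_DEFAULT objects
// implements this so it can be told to drop and rebuild them.
class DeviceResource {
public:
    virtual ~DeviceResource() = default;
    virtual void onDeviceLost() = 0;   // release default-pool objects
    virtual void onDeviceReset() = 0;  // recreate them afterwards
};

// Hypothetical registry that fans lost/reset notifications out to resources.
class ResourceRegistry {
public:
    void add(DeviceResource* r) {
        assert(r != nullptr);
        mResources.push_back(r);
    }

    void remove(DeviceResource* r) {
        mResources.erase(std::remove(mResources.begin(), mResources.end(), r),
                         mResources.end());
    }

    // A real D3D9 program would run this once TestCooperativeLevel()
    // reports D3DERR_DEVICENOTRESET: release everything, call Reset(),
    // then rebuild. The Reset() call itself is only simulated here.
    void handleLostDevice() {
        for (DeviceResource* r : mResources) r->onDeviceLost();
        // ... IDirect3DDevice9::Reset() would go here ...
        for (DeviceResource* r : mResources) r->onDeviceReset();
    }

private:
    std::vector<DeviceResource*> mResources;
};

// Stand-in for a render target; 'allocated' models a live default-pool surface.
class FakeRenderTarget : public DeviceResource {
public:
    void onDeviceLost() override { allocated = false; ++lostCount; }
    void onDeviceReset() override { allocated = true; }

    bool allocated = true;
    int lostCount = 0;
};
```

Every object that touches the default pool has to be registered with something like this, which is why a program that only draws a quad still ends up with a framework around it.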

The shader situation on OpenGL sucks. When I was implementing OpenGL-accelerated capture support, I wanted to support older cards and looked into the NV_fragment_shader and ATI_fragment_shader extensions. Programming those directly was so painful that I wrote my own assembler for it ("asuka glc" in the build process). I later looked into Cg, and that was much improved except that it was a bugfest — the first day I broke the compiler with the vector expression (a + (a-b)*1.0). I haven't tried ARBfp or GLSL, but I'm hoping those work somewhat more reliably.

OpenGL's coordinate system makes sense: bottom-up orientation, pixel and texel centers on half-integer coordinates, normalized device coordinates are -1 to 1 in all axes. Direct3D is a mess: device coordinates are bottom-up with centers on integers — which has the nice side effect of making the projection transform viewport-dependent — but functions that take integer screen-space rects are top-down, and textures are top-down with centers on half-integers. NDC X and Y are -1 to 1 but Z is 0 to 1. Argh!
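
The pixel-center difference is easy to see in numbers. The sketch below computes the NDC position of a pixel center under each convention; the function names are made up for illustration, and the D3D9 versions assume the Direct3D 9 rasterization rules specifically.

```cpp
#include <cassert>

// NDC x of the center of pixel column i in a viewport 'width' pixels wide.
// OpenGL: pixel centers sit on half-integers in window space.
double glPixelCenterNdcX(int i, int width) {
    assert(width > 0);
    return 2.0 * (i + 0.5) / width - 1.0;
}

// Direct3D 9: pixel centers sit on integers in screen space.
double d3d9PixelCenterNdcX(int i, int width) {
    assert(width > 0);
    return 2.0 * i / width - 1.0;
}

// NDC y of the center of pixel row j. OpenGL rows count up from the bottom.
double glPixelCenterNdcY(int j, int height) {
    assert(height > 0);
    return 2.0 * (j + 0.5) / height - 1.0;
}

// Direct3D 9 rows count down from the top, while NDC y still points up.
double d3d9PixelCenterNdcY(int j, int height) {
    assert(height > 0);
    return 1.0 - 2.0 * j / height;
}
```

The two X results differ by 1/width, so the correction that lines texels up with pixels in D3D9 depends on the viewport size. That is the viewport-dependent projection mentioned above.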

In terms of off-screen rendering, OpenGL has a distinct advantage over D3D9 due to better support for readback. Reading back the results from the video card into system memory, where they can be processed by the CPU or written to disk, is a major bottleneck when using the GPU to accelerate video. Direct3D 9 has the infuriating GetRenderTargetData(), which has the stupid restriction that it can't do a subrect read and also tends to stall in unexpected ways. OpenGL not only has the more flexible glReadPixels(), but in my tests it did readback noticeably faster on both ATI and NVIDIA cards. With asynchronous readback via pixel buffer objects (PBOs) on NVIDIA hardware, the readback advantage rises to ~2x. (If you're on Vista with a WDDM driver, Direct3D 9.L can supposedly do a subrect readback via StretchRect. If anyone knows whether this is faster and whether it can be done asynchronously, I'd be interested in knowing.) I believe that NVIDIA's CUDA is also able to push into buffer objects in OpenGL, which allows for texture upload, whereas with D3D9 it can only push into a vertex buffer.
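
Since GetRenderTargetData() copies the whole render target into a system memory surface of the same size, a caller that only wants part of the image has to crop on the CPU afterwards. Here is a minimal sketch of that crop. copySubrect and its parameters are hypothetical, and a plain byte buffer stands in for the locked surface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy a w x h rectangle at (x, y) out of a full-frame
// buffer. 'pitch' is the byte distance between rows, which may be larger
// than width * bytesPerPixel, as with a locked surface.
std::vector<std::uint8_t> copySubrect(const std::uint8_t* src, std::size_t pitch,
                                      std::size_t bytesPerPixel,
                                      std::size_t x, std::size_t y,
                                      std::size_t w, std::size_t h) {
    assert(src != nullptr);
    assert((x + w) * bytesPerPixel <= pitch);

    const std::size_t rowBytes = w * bytesPerPixel;
    std::vector<std::uint8_t> dst(rowBytes * h);
    if (rowBytes == 0) return dst;  // nothing to copy

    for (std::size_t row = 0; row < h; ++row) {
        // Start of the wanted span within source row (y + row).
        const std::uint8_t* s = src + (y + row) * pitch + x * bytesPerPixel;
        std::memcpy(dst.data() + row * rowBytes, s, rowBytes);
    }
    return dst;
}
```

The crop itself is cheap. The cost is that the full frame has already been transferred before any of it can be thrown away.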

On the flip side, one of the more annoying aspects of OpenGL I've found is the tying of sampler parameters (filtering, addressing, mipmap LOD bias, etc.) to textures. For a software renderer that modifies texture data for these changes, this makes sense, but it doesn't make sense for modern hardware where these are usually sampler states. It can also make sense if you consider them intrinsic to the texture, but that then breaks down if you want to do something beyond what the hardware can support. A high quality image processor really needs to support higher quality filtering than bilinear; bilinear filtering gives a really crappy gradient map. To do this, you have to emulate the higher-order filtering using lower-order filtering, and you run into the problem that if you ever need the same texture bound with different parameters in the same pass, you're screwed, because you only have one set of parameters on the texture. I do this when rendering bicubic in VirtualDub's D3D9 driver, and I can't port that path to OpenGL because of this restriction. I looked at the source for various OpenGL binding layers, and they all seem to emulate sampler states by pushing them into texture parameters. This sucks.
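
As an illustration of emulating a higher-order filter with lower-order fetches, here is one well-known trick in 1D: a four-tap cubic B-spline folded into two linear fetches. This is a sketch of the general idea and not necessarily what VirtualDub's D3D9 driver does. The function names are made up, and sampleLinear() is a CPU stand-in for a hardware linear filter with clamp addressing.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Clamp-to-edge addressing for a 1D texture of n texels.
static int clampIndex(int i, int n) { return std::max(0, std::min(i, n - 1)); }

// Stand-in for a hardware linear filter. x is in texel units, with integer x
// landing exactly on texel x.
float sampleLinear(const std::vector<float>& tex, float x) {
    assert(!tex.empty());
    const int n = static_cast<int>(tex.size());
    const float fl = std::floor(x);
    const float f = x - fl;
    const int i = static_cast<int>(fl);
    return tex[clampIndex(i, n)] * (1.0f - f) + tex[clampIndex(i + 1, n)] * f;
}

// Uniform cubic B-spline weights for taps at i-1, i, i+1, i+2, with t in [0,1).
void bsplineWeights(float t, float w[4]) {
    assert(t >= 0.0f && t < 1.0f);
    const float t2 = t * t, t3 = t2 * t;
    w[0] = (1.0f - 3.0f * t + 3.0f * t2 - t3) / 6.0f;
    w[1] = (4.0f - 6.0f * t2 + 3.0f * t3) / 6.0f;
    w[2] = (1.0f + 3.0f * t + 3.0f * t2 - 3.0f * t3) / 6.0f;
    w[3] = t3 / 6.0f;
}

// Reference result: four point samples weighted directly.
float cubicDirect(const std::vector<float>& tex, int i, float t) {
    const int n = static_cast<int>(tex.size());
    float w[4];
    bsplineWeights(t, w);
    float sum = 0.0f;
    for (int k = 0; k < 4; ++k) sum += w[k] * tex[clampIndex(i - 1 + k, n)];
    return sum;
}

// Same result from two linear fetches: each pair of adjacent taps becomes one
// fetch placed between them, at a position set by the ratio of their weights.
float cubicViaLinear(const std::vector<float>& tex, int i, float t) {
    float w[4];
    bsplineWeights(t, w);
    const float g0 = w[0] + w[1];
    const float g1 = w[2] + w[3];
    const float x0 = static_cast<float>(i - 1) + w[1] / g0;
    const float x1 = static_cast<float>(i + 1) + w[3] / g1;
    return g0 * sampleLinear(tex, x0) + g1 * sampleLinear(tex, x1);
}
```

The folding works here because all four B-spline weights are non-negative; a kernel with negative lobes needs a different arrangement. Either way, the shader wants linear filtering for these fetches no matter how the texture is sampled elsewhere in the pass, which is where per-texture parameters get in the way.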

I guess I should say something else good about Direct3D. Well, the diagnostic tools are better, at least on Windows. I have NVPerfHUD, PIXWin, and D3D debug mode for debugging, I have NVShaderPerf, FX Composer, and GPU ShaderAnalyzer for shaders, and I have debugging symbols for the Direct3D and D3DX runtimes from Microsoft's public symbol server. For OpenGL, well, I have glGetError() which returns GL_INVALID_OPERATION if I remember to call it, GL debug mode if I'm using an NVIDIA card, and a GL debugging tool that everyone's pushing but is only available as a trial edition unless I pay about as much as a full VTune license costs.

Another annoying aspect of OpenGL is that NVIDIA's practically the only one really supporting it on the desktop side. They're pushing out all the cool functionality via extensions and have their OpenGL docs about as well updated as the Direct3D ones. ATI, well... not that they have many useful docs on their site anyway, but their OpenGL docs are way behind and their extension support in their OpenGL driver was way behind the last time I checked. They've put more effort into Direct3D support, even "extending" D3D9 with API hacks to support Fetch4 and R2VB. Disclaimer: I am an NVIDIA fanboy. I still want ATI to better support OpenGL so that it continues to be a viable competitor to Direct3D 9/10 on Windows.

With both APIs, it would be nice to have more flexibility in application structure. Even in games, I think we're long past the era where programs are all single-threaded, have nothing better to do with the CPU than spin in Present() or SwapBuffers() waiting for vertical blank, and don't need to render anything except to the screen. I want better support for multithreading, better ability to avoid unexpected stalls in the driver, the ability to detect/count/wait for vertical blank intervals without polling, and to not lose 3D rendering capability whenever someone locks their workstation or logs in remotely. What I do with 3D in VirtualDub is simple — most of the time, I draw a quad. The complexity is that I still have to write a whole lot of framework around that code to handle lost devices, shaders, hopping commands between threads, managing invisible placeholder windows, and dynamically linking to 3D APIs.


This blog was open for comments when this entry was first posted, but comments were later closed and then removed, due to spam and a migration away from the original blog software. Unfortunately, it would have been a lot of work to reformat the comments to republish them. The author thanks everyone who posted comments and added to the discussion.