¶Weird optimizer behavior
Take this simple function:
int foo(int p0, int p1) {
return p1 + ((p0 - p1) & ((p0 - p1) << 31));
}
This should generate four operations: a subtract, a shift, a bitwise AND, and an add. Well, with VS2005 it generates something a bit different:
00000000: 8B 54 24 08 mov edx,dword ptr [esp+8]
00000004: 8B 4C 24 04 mov ecx,dword ptr [esp+4]
00000008: 2B CA sub ecx,edx
0000000A: 8B C1 mov eax,ecx
0000000C: 69 C0 00 00 00 80 imul eax,eax,80000000h ???
00000012: 23 C1 and eax,ecx
00000014: 03 C2 add eax,edx
00000016: C3 ret
Somehow the optimizer changed the shift to a multiply, which is a serious pessimization and thus results in a rare case where the code is actually faster with the optimizer turned off!
Oddly enough, manually hoisting out the common subexpression (p0 - p1) fixes the problem. I've seen this behavior before in VC++ with 64-bit expressions of the form (a*b+c). My guess is that the compiler normally converts left shifts to multiplications and then converts them back later, but somehow the CSE optimization breaks this. Yet another reason that being lazy and repeating common subexpressions all over the place while relying on the optimizer to clean up your mess isn't the greatest idea.
The reason for the repeated subexpression, by the way, is because this is an expanded version of a min() macro. I called the function foo above instead of min because it's actually broken -- the left shift should be a right shift. As long as you can put up with the portability and range quirks, this strange formulation has the advantages of (a) being branchless, and (b) sharing a lot of code with a max() on the same arguments.