How not to optimize with assembly language
Time to imitate Alex Papadimoulis.
While researching some algorithms on the web, I came across a forum post in which someone had "optimized" a routine by translating it from a high-level language into assembly language. I won't post the whole function, but here's part of it:
; s1 = seed & 0xFFFF
xor eax, eax
add eax, seed
mov ebx, eax
and eax, 0FFFFh
mov eax, ebx
mov s1, eax
; s2 = (seed / 65536) & 0xFFFF
mov eax, seed
mov ebx, 10000h
xor edx, edx
div ebx
and eax, 0FFFFh
mov s2, eax
Now, I like assembly language, and the asm code is indeed a faithful translation of the original high-level code. However, it's a perfect example of how not to use assembly language to optimize. Looking at it, there are a ton of missed opportunities, such as changing the divide to a shift, removing the and operation that's a no-op, and so on. Well, except that the statement immediately before the block is this:
mov seed, 1
Basically, the fragment above computes two constant expressions -- and no, there isn't a branch target in between. In fact, due to a bug, the code wouldn't even work for any other value of seed. (Hint: errant mov.) Makes you wonder why the coder didn't just write this:
mov s1, 1
mov s2, 0
The rest of the translated function, by the way, was equally faithful and awful. The worst part was that the original source had a big comment block indicating how the algorithm could be easily sped up by an order of magnitude, which was of course ignored.
If you're going to drop to assembly language for speed, your job is to optimize based on knowledge that the compiler lacks, such as restricted ranges of input values, and not to act as a really slow compiler with no optimizer. Merely translating a routine verbatim into asm (and in this case, really bad asm) is a total waste of time.