
They called _what_ in the inner loop??

AMD just open sourced the AMD Performance Library as Framewave, which at least from my perspective seems like a good thing. Not that I'm going to attempt to use it, but I perused the source out of curiosity, and it looks like there are some useful goodies in there.

And then there's some... marginal stuff.

One thing that I wanted to look at was their 8x8 2D-IDCT source. The 8x8 2D inverse discrete cosine transform (IDCT) is popular and used in a number of video compression formats. There are a million ways to implement it quickly, and although everyone's seen Intel's AP-922 SSE2 algorithm for it by now, I hadn't seen one by AMD before. So I grab the source and dig around in the JPEG module, and I see this:

int IdctQuant_LS_SSE2(const Fw16s *pSrc, Fw8u *pDst, int dstStp, const Fw16u *pQuantInvTable)
... pedx = (Fw16s *) fwMalloc(128); //64 array of Fw16s type

Who the #*@&*( calls malloc() in an optimized IDCT routine???

It looks like there are indeed a number of well-optimized SSE2 routines in the Framewave library, but after seeing things like the above a few times I was left scratching my head a bit....
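For contrast, here's roughly what I'd expect instead. A scratch block of 64 16-bit coefficients is only 128 bytes, which fits comfortably on the stack, so there's no need to touch the heap at all. This is just a minimal sketch: dequantize_block and its parameters are names I made up for illustration, not anything from Framewave, and the loop body is a placeholder rather than a real IDCT.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an 8x8 block routine (NOT Framewave's code).
// The 128-byte scratch area lives on the stack, aligned to 16 bytes so
// that aligned SSE2 loads and stores could be used on it.
void dequantize_block(const int16_t* src, const uint16_t* quant, int16_t* dst) {
    alignas(16) int16_t scratch[64];  // 64 * 2 bytes = 128 bytes, no heap call

    // Sanity check on the alignment the compiler gave us.
    assert(reinterpret_cast<std::uintptr_t>(scratch) % 16 == 0);

    // Placeholder work: scale each coefficient by its quantizer entry.
    for (int i = 0; i < 64; ++i)
        scratch[i] = static_cast<int16_t>(src[i] * quant[i]);

    // A real routine would run the IDCT passes over scratch here.
    for (int i = 0; i < 64; ++i)
        dst[i] = scratch[i];
}
```

The stack version costs one stack pointer adjustment on entry, and it can't fail or leak the way an allocation in the inner loop can.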

Another ugliness I saw, which unfortunately isn't restricted to Framewave, is assembly language routines that have been translated to intrinsics. The result is a nasty C++ routine that has variables like "pedx" and "pesi," but with the instruction names translated so that what used to be an understandable "paddw" is now "_mm_add_epi16." I know this was a hack job for portability, but the result sure is unreadable.
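To be fair to intrinsics, they don't have to read that badly. Here's a small sketch of the same paddw operation written with names that describe the data instead of the registers it used to live in. add_rows and its parameter names are hypothetical, picked only for this example; the intrinsics themselves are the standard SSE2 ones from emmintrin.h.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Adds two rows of eight 16-bit coefficients with wraparound.
// Hypothetical helper for illustration only.
void add_rows(const int16_t* row_a, const int16_t* row_b, int16_t* row_sum) {
    // Unaligned loads, so the caller's buffers need no special alignment.
    __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(row_a));
    __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(row_b));

    // _mm_add_epi16 is the intrinsic form of paddw: eight independent
    // 16-bit adds that wrap on overflow rather than saturating.
    __m128i sum = _mm_add_epi16(a, b);

    _mm_storeu_si128(reinterpret_cast<__m128i*>(row_sum), sum);
}
```

The intrinsic name is still a mouthful, but once the variables say what they hold, a one-line comment naming the instruction is enough to keep it readable.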


Comments were open when this entry was first posted, but they were later closed and then removed, both because of spam and because of a migration away from the original blog software. Unfortunately, it would have been a lot of work to reformat the comments to republish them. The author thanks everyone who posted comments and added to the discussion.