§ ¶Computers are fast
It's amazing how much faster computers have become. Sure, they can never be fast enough for some purposes -- like, uh, video processing -- but for others, it's getting a bit ridiculous.
A linear feedback shift register is a type of sequence generator that is useful for quickly generating psuedo-random sequences of numbers. It's not a great generator, but because it only requires shifts and XORs, it's very fast. It also easily generates a maximal non-repeating sequence of size 2^N-1, which means it's also useful for generating exact coverage patterns without duplicates or dropouts, especially since the sequence is generated on the fly and isn't stored. It can be used for real-time image dissolves and static noise generation... on a 1MHz Apple II! I haven't used it in VirtualDub yet, but there are a few places where I could, such as if I needed a random dither for audio sample conversion from 16-bit to 8-bit (although I'd probably try error diffusion first).
Anyway, today I needed a 32-bit LFSR generator. An LFSR generator basically shifts new bits in that are XORs of a series of taps on the shift register, and the position of those taps is critical: if they aren't correct, the generator won't produce the maximum possible sequence. I'm too lazy to actually look up a list of primitive polynomials, though, so I usually just try a bunch of tap combinations until I get one that works. So I picked four taps and just let a test app measure the length of the sequence. It then dawned on me that in less than a minute, the computer had run through all four billion bits in the sequence. And I didn't even optimize the algorithm.
Basically, computers have gotten fast enough that sometimes it's better just to brute force test all inputs to an algorithm... it's easy, it's harder to screw up, and a passing result from one is fairly convincing. And it's lazy. I like it when companies like Intel and NVIDIA help me be lazy. I look forward to my next upgrade, which will probably include some dual core and DX10 goodness and help me be more lazy.
(Sharp readers will note that since I was able to run through the 32-bit sequence in less than a minute, I probably need a longer generator, and I can't really brute force test a 64-bit LFSR... but hey, at least it'll still be damn fast.)
Comments
Comments posted:
They're also used in old audio chips to make "white" noise, and sometimes with non-maximal-length taps to get more pleasing audio - the maximal-length ones will produce sections with repetitive runs of the same bit at some point.
I spent a while some years ago learning about them (before Wikipedia) and then determining the tap sequence for some given output, also by lazy brute force - although you can make a fair guess by looking at the run points when you know how, and mine were only 16-bit so it was fairly nippy anyway.
Maxim (link) - 22 08 07 - 05:31
How come you're not looking for a quad core?
Anon - 22 08 07 - 09:52
As the previous poster said, I'd say yeah, go to quad-core. Seeing your fondness for video processing, console emulation and the like, it's definitely worth it. I got a q6600 for 230 euros just a few days ago, and i sure as hell am happy not to have taken a e6750/e6850 instead.
Parker Lewis - 22 08 07 - 11:43
This, combined with your previous post on shaders, jogged my memory.... Do you know whether it's possible to use efficiently LFSRs in shaders, specifically in a YUV->RGB shader, to create dither for hiding banding? (Curious about regular assembly as well. The last two times I tried to implement something like it, I just made a total hash of it, since I don't understand floating point simd at all and just barely know integer.)
foxyshadis - 22 08 07 - 11:49
Quad cores aren't that prevalent in laptops yet, and I'm not really looking for a new computer right now anyway. I tend to wait a couple of years before upgrading, and I can get very far with a 1.86GHz P-M and a 6800. I might hold out until solid state drives get better and cheaper.
As for LFSRs in shaders, not likely. Computing an LFSR the traditional way is a serially dependent operation, which is basically impossible in a shader. In addition, there's the problem of requiring bit shift and XOR operations, which are difficult to do before DX10-class hardware and aren't that fast even on that (on ATI bit shifts only run on the transcendental unit). There's likely a way that the LFSR can be computed in closed form rather than iteratively, but I'm guessing that it'd be easier to use a linear congruential generator instead, which only requires multiply-add and a modulus operation (i.e. frac). In the end, I'd probably just hack something up with textures, as even on fast hardware you're not looking at much more than about 8:1 for ALU-to-texture ops, and repetition in dither is not as much of a problem as repetition in noise.
Phaeron - 22 08 07 - 15:55
Sorry about asking, but can anyone explain to me how a Linear feedback shift register works. I already looked up Linear feedback shift register on wikipedia didn't understand any of it. Googled it and ended up becoming even more confused. Just really curious on how a LFSR works.
Thank you.
mOE - 22 08 07 - 20:00
An LFSR is pretty simple: take bits from certain positions in the current value, XOR them together, and shift the new bit at the end. The bit positions you use are called the taps; the one farthest from where you shift out bits determines the maximum length. Just make sure the value doesn't start as zero (it'll lock at zero, as zero never appears in the cycle).
The variant method is to run the LFSR in reverse by shifting a bit out and XORing that in at the tap positions. I like that method better because you can generate multiple bits very quickly that way.
Phaeron - 22 08 07 - 21:24
Thanks, I'll give it a shot via textures instead of pure calculation. I guess I still need to wrap my head around 3d logic!
foxyshadis - 23 08 07 - 11:53
Phaeron, thank you so much for your quick reply. I'm going to try and write one tonight. Thanks again.
mOE - 23 08 07 - 15:08