§ ¶Today's headache: digital cameras
If you value your sanity, don't accept external files in your own programs. Parsing someone else's output is just going to give you headaches.
Okay, it's not realistic, but I can dream, can't I?
Digital cameras that produce AVI files, particularly Motion JPEG encoded ones, have been a bit of a problem for me because they sometimes produce files that are marginal or non-compliant. One common problem is that the video stream contains JPEG frames with custom Huffman tables (DHT markers), which according to Microsoft's original MJPEG spec you're not supposed to do. Instead, the Huffman tables are supposed to be omitted from the stream and fixed to a standard set, for speed and simplicity. VirtualDub's internal MJPEG decoder was written with this in mind, so it won't decode streams that have custom Huffman tables, and if you don't have an MJPEG codec installed you'll get decode errors. I haven't gotten around to rewriting the decoder so that it can handle custom tables, since it was sort of meant to be a fallback to begin with.
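To make the DHT issue concrete, here is a rough Python sketch of how one might check whether a single JPEG frame carries its own Huffman tables. It is not VirtualDub's code; the function name `has_dht_segment` is made up for illustration, and it only walks the marker segments ahead of the scan data. Note that a DHT segment being present doesn't prove the tables differ from the standard ones, only that the frame supplies tables at all.

```python
import struct

SOI = b"\xff\xd8"   # start of image
DHT = 0xC4          # define Huffman table(s)
SOS = 0xDA          # start of scan; entropy-coded data follows

def has_dht_segment(frame: bytes) -> bool:
    """Return True if a DHT segment appears before the scan data.

    Hypothetical helper for illustration: it walks the marker segments
    that precede SOS and stops at the first DHT or SOS it sees.
    """
    if frame[:2] != SOI:
        raise ValueError("not a JPEG frame (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(frame):
        if frame[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = frame[pos + 1]
        if marker == 0xFF:                # fill byte before a marker
            pos += 1
            continue
        if marker == DHT:
            return True
        if marker == SOS:
            return False
        if marker == 0x01 or 0xD0 <= marker <= 0xD9:
            pos += 2                      # standalone marker, no length field
            continue
        # Every other segment starts with a big-endian length that
        # counts the two length bytes themselves.
        (length,) = struct.unpack_from(">H", frame, pos + 2)
        if length < 2:
            raise ValueError(f"bad segment length at offset {pos}")
        pos += 2 + length
    return False
```

Run over the frames of a stream, a check like this would tell you up front whether a decoder that assumes fixed tables has any chance of working.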
Anyway, I received a sample file today from another digital camera that has a new problem, producing broken AVI files. This time it isn't the video stream, but the RIFF structure itself: the data after the first video frame (00dc) chunk is just garbage. If you open it in VirtualDub or a standard video player, it plays fine, because the outermost part of the RIFF structure is fine and that's enough to get to the index. Usually, AVI parsers use the index whenever they can, and thus they'll read the file since the index points directly to the frames. Anything that tries walking the RIFF structure, though, will barf because it's invalid within the LIST/movi chunk that holds the frames. Dumping the file, it looks like whoever wrote the camera firmware decided to save five minutes by just seeking to the next sector boundary instead of actually writing a proper JUNK padding chunk. Sigh.
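For illustration, this is roughly what "walking the RIFF structure" means, as a minimal Python sketch rather than anything taken from VirtualDub. The names `walk_chunks` and `RiffError` are invented here, and the validity checks (printable chunk IDs, sizes that fit inside the parent) are my assumptions about what a reasonably strict walker would test. An index-based reader never looks at the bytes between frames, which is why it doesn't notice the problem.

```python
import struct

class RiffError(ValueError):
    """Raised when the chunk structure cannot be followed any further."""

def walk_chunks(data: bytes, start: int, end: int, depth: int = 0):
    """Yield (depth, chunk_id, offset, size) for each chunk in data[start:end].

    RIFF and LIST chunks are descended into; their ID is reported as
    'LIST:movi' and so on. Hypothetical helper for illustration only.
    """
    pos = start
    while pos < end:
        if pos + 8 > end:
            raise RiffError(f"truncated chunk header at offset {pos}")
        fourcc, size = struct.unpack_from("<4sI", data, pos)
        # Chunk IDs are four printable ASCII characters; garbage padding
        # usually fails this test right away.
        if not all(0x20 <= b < 0x7F for b in fourcc):
            raise RiffError(f"invalid chunk ID {fourcc!r} at offset {pos}")
        body = pos + 8
        if body + size > end:
            raise RiffError(f"chunk at offset {pos} overruns its parent")
        name = fourcc.decode("ascii")
        if fourcc in (b"RIFF", b"LIST"):
            if size < 4:
                raise RiffError(f"list at offset {pos} has no type field")
            list_type = data[body:body + 4].decode("ascii", "replace")
            yield depth, f"{name}:{list_type}", pos, size
            yield from walk_chunks(data, body + 4, body + size, depth + 1)
        else:
            yield depth, name, pos, size
        # Chunk data is padded to an even length.
        pos = body + size + (size & 1)
```

On a file padded with a proper JUNK chunk, the walker steps over the padding like any other chunk. On a file where the writer simply skipped ahead to the next sector, the bytes after the first 00dc chunk get interpreted as a chunk header and the walk fails there.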
AVI, like many formats, suffers from decay due to the "works well enough" effect. In fact, just about any format in popular use will have this problem when the programs that read it don't do strict validation and the people who use those programs don't care about conformance. It's like having to deal with XML files where the angle brackets don't match because the person that wrote it figured that everyone uses regexes to parse it anyway.
Very high fault tolerance during parsing has been something to live with for many years, and not only for web designers. I just wish programs would (in a very subtle manner!) show if they found something violating the specification. A little poo-icon appearing next to your addressbar in a browser. Or a status-line message saying 'This website(video/picture/file) is either broken or not conform to the standard XYZ. Please consider contacting the author about it.'
Draget - 15 07 11 - 22:24
Basically, the first implementation of a format reader defines a minimum set of allowed mistakes. If the first implementation allowed no mistakes and held a certain share of the market (>20%), mistakes could never creep into the writer implementations.
That's why all new standards should have a fully working production library for all major platforms available on day one. Unfortunately nobody does this.
tobi - 16 07 11 - 02:06
"It's like having to deal with XML files where the angle brackets don't match because the person that wrote it figured that everyone uses regexes to parse it anyway."
Well, XML does mandate draconian error handling, though yea, for RSS/Atom feeds this was once common. But for a real mess look at HTML, where browser vendors had to reverse-engineer other browsers' error handling. HTML5 finally aims to codify this mess.
Yuhong Bao (link) - 16 07 11 - 10:31
Today's corporate America works on low cost, low cost and low cost. As soon as a product meets its minimum requirements, it's shipped out the door. You will not find the word quality in any mission statement from most companies.
With that said, I remember taking a computer science class in C in college, where they would give us small projects to work on. For every program, I had the tendency to verify the data before using it, since a user can easily input the wrong data into a variable. I did this thinking my teacher would be proud that my code was robust and stable. Instead, I got scolded because I used extra bus cycles.
Now as I see the fruits of many types of products, I sit back and realize what the corporate mentality and that teaching style produce.
PS How many Microsoft engineers have watchdogs for their endless while loops, for the few left writing code that is?
evropej - 18 07 11 - 01:42
Yes....I have noticed this as I have various digital cameras that produce mostly quite compliant MGPEG, but two that DO NOT! Awful to try and work with, one even caused issues with another very well known & respected a/v splitter. Now we add video from sources such as my eldest son's iPod & Android dumbphone, and more.
In any case, would you be interested in some of the 'strange samples'? I'll keep the file size for each sample to a limit you specify, and, say three (3) files maximum. I would love your feedback! I'm sure paracetamol & some valium will help - ha ha.
4 years now with VirtualDub as my mainstay A/V application, although slipping away a bit with x264 & mkv - yes, I have read your past (long past?) comments on this so I'll leave this subject alone (but....I just said it...says he!). In any case your DirectShow input driver @ v.0.8 from 2010-06-26 http://forums.virtualdub.org/index.php?a.. , and our friend fccHandler keeps it a useful tool in the x264 kit.
Looking forward to 1.10.0 STABLE (is it not?), and shall we say 184.108.40.206 beta 1 perhaps.
Joseph Lynch (link) - 18 07 11 - 08:37
M-JPEG typo above sry.
Just googled MJPEG & saw a comment on Wiki, how old, I don't know, interesting though: "For QuickTime formats, Apple has defined two types of coding: MJPEG-A and MJPEG-B. MJPEG-B no longer retains valid JPEG Interchange Files within it, hence it is not possible to take a frame into a JPEG file without slightly modifying the headers."
Joseph Lynch (link) - 18 07 11 - 08:41
I really like Draget's suggestion.
Writing a forgiving parser encourages even more people to violate the specification.
Dominik - 20 07 11 - 11:18
"A little poo-icon appearing next to your addressbar in a browser. Or a status-line message saying 'This website(video/picture/file) is either broken or not conform to the standard XYZ."
For Firefox, there is the Html Validator extension - https://addons.mozilla.org/firefox/addon.. - which does just this: "The number of errors of a HTML page is seen on the form of an icon in the status bar when browsing." ... and more.
I have been using it for several years now.
Lynax - 31 07 11 - 05:17