From XiphWiki
Revision as of 05:07, 19 December 2007 by Decoy (talk | contribs) (moved comments from the article)

Moving these comments from the article for further discussion:

Portable players are usually ARM, which is usually little-endian. The Macintosh is now little-endian. Obviously the PC is little-endian. Clearly there is a winner. It's long past time to stop putting the bytes in an order that makes both programmers and computers do extra work for no good reason. Don't try to hold back the tide.
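To make the "extra work" concrete, here is a minimal sketch of what a reader has to do when the stored byte order differs from the machine's. The sample value and the helper name `swap16` are made up for illustration and are not part of any OggPCM code.

```python
import struct

# One 16-bit PCM sample value, chosen only for illustration.
sample = 0x1234

# The same sample serialized in each byte order.
little = struct.pack("<h", sample)  # low byte first
big = struct.pack(">h", sample)     # high byte first


def swap16(data: bytes) -> bytes:
    """Reverse the byte order of every 16-bit sample in a buffer.

    Hypothetical helper; assumes len(data) is even.
    """
    out = bytearray(data)
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)


# A little-endian host reading big-endian data must swap before use.
native = swap16(big)
```

On a matching host the swap is skipped entirely, which is the whole of the efficiency argument; the cost on a mismatched host is one pass over the buffer.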

(the sample rate field) should be a rational with at least a 22-bit numerator and 10-bit denominator

An integer sampling rate is trouble. Audio does not always come that way. For example, audio is sometimes tied to the NTSC frame rate of 30000/1001. That 1001 can show up in the sample rate, and thus needs 10 bits. Rates with a 3 in the denominator are common too. Super Audio CD needs 22 bits to represent 2.8224 MHz. So 22 bits and 10 bits will do the job. Better would be 32 bits for both numerator and denominator of course. A float will never be quite right, though it sure beats an integer and will in fact hold exact values into the MHz. One can't express 1/3 or 1/10 as a float, so 12345.6 and 12345.6666... are undoable that way. BTW, allowing for subsonic recording would be nice.
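The bit counts above can be checked with a short sketch. It assumes "float" means a 32-bit IEEE 754 value, and the helper names `fits` and `float32_roundtrip` are invented for this example. It also shows that a rate such as 48000 × 1000/1001 has a numerator wider than 22 bits, which is an argument for the 32/32 split.

```python
from fractions import Fraction
import struct


def fits(rate: Fraction, num_bits: int, den_bits: int) -> bool:
    """True if numerator and denominator fit in the given unsigned widths."""
    return (rate.numerator.bit_length() <= num_bits
            and rate.denominator.bit_length() <= den_bits)


def float32_roundtrip(rate: Fraction) -> Fraction:
    """Store the rate as a 32-bit float and read back the exact stored value."""
    stored = struct.unpack("<f", struct.pack("<f", float(rate)))[0]
    return Fraction(stored)


sacd = Fraction(2_822_400)                    # Super Audio CD, 2.8224 MHz
pulled_down = Fraction(48_000 * 1000, 1001)   # 48 kHz slowed by 1000/1001
tenths = Fraction(61728, 5)                   # 12345.6
thirds = Fraction(37037, 3)                   # 12345.666...

sacd_ok = fits(sacd, 22, 10)                  # fits in 22 + 10 bits
pulled_down_ok = fits(pulled_down, 22, 10)    # numerator is too wide
sacd_exact = float32_roundtrip(sacd) == sacd  # integer rate survives a float
tenths_exact = float32_roundtrip(tenths) == tenths  # 12345.6 does not
```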

Decoy 04:07, 19 December 2007 (PST)

Perhaps so, but in fact people do use OggPCM on big-endian machines. Having big-endian as an option therefore makes sense, and what is suggested above would then become a recommendation on how to use the format, not a necessary limitation on it.

DSD does not have to be supported. While it can technically be viewed as high-rate PCM, in practice OggPCM aims at supporting the most common forms of conventional PCM, not much more. DSD content is rare, and there is no obvious reason why it would be generated by any of the free/open source software projects like Xiph. It would also need its own sample format tag, we would need to go into the specifics of bit packing, and so on. It seems like a whole lot of extra work and complexity for very little gain.

The physical header has already been finalized, so touching the sampling rate parameter is not really an option. Fractional sampling rates would again add complexity for little real benefit, and the option would be difficult to ignore if implemented. That's bad for embedded devices. The point about NTSC, drop-frame and the like is valid, granted, but given how imperfect such sources usually are, addressing the mismatch by simple resampling techniques should be sufficient.
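As a rough sketch of the size of that mismatch, take audio running at 48000 × 1000/1001 Hz as an illustrative case (the figures are an example, not something fixed by the format). The exact resampling ratio to 48 kHz and the drift from simply rounding the rate to an integer come out as follows.

```python
from fractions import Fraction

# Illustrative pulled-down rate: 48 kHz slowed by the NTSC factor 1000/1001.
exact_rate = Fraction(48_000 * 1000, 1001)   # about 47952.048 Hz
target_rate = Fraction(48_000)

# Ratio a resampler would apply to bring the audio to an integer rate.
resample_ratio = target_rate / exact_rate

# Alternative: store the nearest integer rate and accept the error.
rounded_rate = round(exact_rate)
relative_error = 1 - Fraction(rounded_rate) / exact_rate

# Accumulated timing drift after one hour of playback, in seconds.
drift_per_hour = relative_error * 3600
```

Under these assumptions the ratio is 1001/1000, and rounding the rate to 47952 Hz gives an error of one part per million, or 3.6 ms of drift per hour.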

Subsonic recording is IMO unnecessary generality for something intended for multimedia work.

Decoy 04:07, 19 December 2007 (PST)