OggRaw

From XiphWiki

Purpose

For the Ogg Media Framework

Within OggStream, streams are recoded (encoded, decoded, or transcoded) from one format to another. For example, a Vorbis codec plugin could be used to convert a Vorbis I stream to a PCM stream. Ogg packets of these streams are imported into and exported from OggStream, and several of these conversions can be used in sequence (plugin chaining) to obtain a desired output from any supported input within that media type.
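Plugin chaining can be pictured as a path search over the conversions that the installed plugins offer. The following is a minimal sketch only: the format names, the `PLUGINS` registry, and `find_chain` are hypothetical and are not part of any OggStream API.

```python
from collections import deque

# Hypothetical registry: each plugin converts one format name to another.
PLUGINS = [
    ("vorbis", "pcm-float32"),
    ("flac", "pcm-float64"),
    ("pcm-float64", "pcm-int16"),
    ("pcm-float32", "pcm-int16"),
]

def find_chain(source, target, plugins=PLUGINS):
    """Return the shortest list of (input, output) conversions, or None."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        fmt, chain = queue.popleft()
        if fmt == target:
            return chain
        # Try every plugin that accepts the current format as input.
        for src, dst in plugins:
            if src == fmt and dst not in seen:
                seen.add(dst)
                queue.append((dst, chain + [(src, dst)]))
    return None  # no sequence of plugins reaches the target format
```

A breadth-first search is used here so that the shortest chain is found first; a real framework might instead weigh chains by quality or CPU cost.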

Some codec plugins will support only one or two raw codecs, with other plugins providing translations between formatting options as needed. For example, if an Ogg FLAC file contains 64-bit float data and the media player attempting to play it supports only 16-bit signed int data, a second plugin could provide the conversion from 64-bit float to 16-bit signed int.
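The core of such a conversion plugin could look roughly like the sketch below. It assumes little-endian samples with a nominal range of -1.0 to 1.0, clips anything outside that range, and applies no dithering; the function name `float64_to_int16` is made up for illustration.

```python
import struct

def float64_to_int16(raw: bytes) -> bytes:
    """Convert little-endian 64-bit float samples to signed 16-bit PCM."""
    count = len(raw) // 8                      # whole samples only
    samples = struct.unpack(f"<{count}d", raw[:count * 8])
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))             # clip to the assumed nominal range
        out.append(int(round(s * 32767)))      # scale to the int16 range
    return struct.pack(f"<{count}h", *out)
```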

Having these uncompressed codecs is, thus, essential for implementing the new Ogg media framework as it depends on interchange codecs which all applications desiring to work with a certain media type can reasonably support.


For Low-CPU Storage

While losslessly compressed Ogg codecs are available for both audio and video, some applications (e.g., live recording and editing) find the higher CPU requirements of these formats less desirable than the need for additional storage capacity. Many of these applications require sync information that RIFF (.wav/.avi) and QuickTime do not sufficiently provide, whereas an uncompressed codec within Ogg provides excellent cross-bitstream syncing. Alternatively, the application may be designed around the Ogg media framework, where storing data in an uncompressed Ogg codec makes it easier to encode the data later while keeping comments, etc.


For Codec Development

As we experienced with Ogg Theora development, there is a shortage of simple raw data formats which support the capabilities being tested in codec development. Additionally, the lack of inter-codec sync information in these non-Ogg raw formats (e.g., when using .wav and yuv4mpeg2) makes debugging more difficult than it should be.

These uncompressed Ogg codecs will hopefully solve these problems, by allowing a wide variety of data formatting options and proper inter-codec syncing for testing and development. We should not be limiting ourselves to what existing raw formats support.


Design

Why Not FourCC

The RIFF/QuickTime set of codecs includes several dozen raw codecs, each supporting very specific formatting options. Many of these are special purpose and were never used on a wide scale, and many formatting options are not available in this system. This situation arose from the philosophy that if an application supports a codec, as identified by the 32-bit codec identifier (aka FourCC), it would be expected to support all the format options possible with that codec.

That system, at its heart, is what we call "FourCC". Media frameworks designed around FourCC identify a codec purely by the 32-bit identifier, without versioning information and without further formatting information, so that from a simple table of 32-bit IDs the media framework can know which plugin to use and whether the application can support it. This inevitably creates a situation where most applications must support a pool of popular uncompressed codecs, increasing the footprint and complexity of the application considerably.
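The table-driven dispatch described above can be sketched as follows. The plugin names and the `PLUGIN_TABLE` contents are hypothetical; only the packing of four ASCII characters into a 32-bit value reflects how FourCC identifiers are commonly stored.

```python
import struct

def fourcc(tag: str) -> int:
    """Pack four ASCII characters into a little-endian 32-bit identifier."""
    return struct.unpack("<I", tag.encode("ascii"))[0]

# Hypothetical dispatch table: 32-bit identifier -> plugin name.
PLUGIN_TABLE = {
    fourcc("YUY2"): "raw_yuy2_plugin",
    fourcc("I420"): "raw_i420_plugin",
}

def lookup_plugin(codec_id: int):
    """Return the plugin name for an identifier, or None if unsupported."""
    # The identifier is all the framework sees: no version, no format options.
    return PLUGIN_TABLE.get(codec_id)
```

Note that each distinct raw layout needs its own table entry and its own plugin, which is how the pool of codecs an application must support grows.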

We can do better.

Our Philosophy

Ogg, by contrast, doesn't have a unified codec identifier. Codec software is only required to accurately distinguish its own streams from those of other codecs, based on information in the first packet of each stream. Thus, we don't look up a universal identifier in a table to learn exactly which plugin to load. Instead, we pass packet 0 to our codec plugins and thereby learn which plugins support the specific version, the feature and formatting sets, and other details which cannot fit into a 32-bit identifier.
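A rough sketch of this probing approach is shown below. The classes, the `accepts` method, and `find_plugins` are hypothetical names chosen for illustration. The magic bytes follow the Vorbis I and Theora identification headers, and the version checks are examples of a plugin declining streams it does not fully support.

```python
class VorbisPlugin:
    name = "vorbis"

    def accepts(self, packet0: bytes) -> bool:
        # Vorbis I identification header: packet type 0x01, "vorbis",
        # then a 32-bit version field that is zero for Vorbis I.
        return packet0[:7] == b"\x01vorbis" and packet0[7:11] == bytes(4)

class TheoraPlugin:
    name = "theora"

    def accepts(self, packet0: bytes) -> bool:
        # Theora identification header: 0x80, "theora", then version bytes.
        # This hypothetical plugin only accepts major version 3.
        return (len(packet0) >= 10
                and packet0[:7] == b"\x80theora"
                and packet0[7] == 3)

def find_plugins(packet0: bytes, plugins):
    """Offer packet 0 to every plugin and keep those that claim the stream."""
    return [p for p in plugins if p.accepts(packet0)]
```

Because each plugin inspects the whole first packet, it can decline a stream based on version or format fields, not just on a name match.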

Because of this, codec plugins may, and can even be expected to, support only a subset of the formatting options available with a codec. A video codec plugin could, for example, support only low bitrate video, but do so very well, or support only non-interlaced video. This allows the codec specification to include clean backwards compatibility, where a Theora to VP32 plugin could be written to only support Theora streams which do not use options unavailable to VP32.

This changes the paradigm for our uncompressed codecs: unlike FourCC, we only need a handful of unified, generalized codecs which each support a wide variety of format options, perhaps far more than any application or codec plugin would ever use. This makes sense, as it eliminates artificial limitations that were previously imposed on codecs to simplify implementation, when support was expected to be binary (either none at all or complete).

Through minor revisions, we do not even need to support every possible format in the first implemented version. Values of formatting options should, however, be reserved for "extended" settings, and a minor version field should be available to specify the minimum version that can parse the meaning of those extended settings.
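One way to read this scheme is sketched below. The two-byte layout, the `EXTENDED` marker value, and the format codes are all invented for illustration and do not come from any OggRaw specification.

```python
import struct

SUPPORTED_MINOR = 1                          # hypothetical: newest minor version this parser knows
EXTENDED = 0xFF                              # hypothetical reserved value meaning "extended setting"
SAMPLE_FORMATS = {0: "int16", 1: "float32"}  # hypothetical base format codes

def can_parse(header: bytes) -> bool:
    """Decide whether this parser understands a (minor, format) header."""
    minor, fmt = struct.unpack_from("<BB", header)
    if fmt != EXTENDED:
        # Base values keep their meaning across minor revisions.
        return fmt in SAMPLE_FORMATS
    # Extended settings need a parser at least as new as the stated minor version.
    return minor <= SUPPORTED_MINOR
```

In this sketch an older parser can still handle streams written under a newer minor version, as long as they use only the base values.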

See Also