OggPCM Draft1: Difference between revisions

From XiphWiki

Latest revision as of 11:22, 10 November 2007


This is the original OggPCM draft. After a heated debate, most developers have now moved to OggPCM2.

What is it

OggPCM is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container. For the purposes of this document, the term PCM describes a digital representation of an audio signal in which amplitude samples are taken at uniform intervals and then quantized into a digital (usually binary) code. A more complete definition of PCM and related terminology can be found at Wikipedia.

Why is it

This format is intended as an interchange format, for example for use with OggStream. It is also useful for storing time-synced decoded audio/video, as opposed to using RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files as was done during Theora development. It is intended to be less complex to use than either RIFF or AIFF.

Stream Description

A stream is composed of a header packet, zero or more comment packets, and one or more data packets. Data packets may be of variable length, including zero. The only valid use of a zero-length data packet is to mark the end of stream. Data packets must contain samples for all channels. That is to say, the length of a data packet must be a multiple of the number of channels times the storage size of a single sample. For instance, for a stream containing 6 channels at 2 bytes per sample, the length of the data packet must be a multiple of 12 bytes.

The degenerate stream is a single header packet followed by the raw data packets. While this degenerate stream is not incredibly useful for long term storage or as a general purpose container, it is useful for applications where other data describing the stream is available out of band, for instance amongst cooperating applications in an inter-process communication scheme. Streams providing the extra defined comment packets are intended to be useful for long term storage and communication amongst diverse applications.

Packet Format

Header and comment packets are processed according to the value of their first byte. Packets of unknown ID should be silently ignored, which provides a convenient way to extend the format in the future without breaking the data format. An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html

The header packet contains a field indicating the number of comment packets preceding the raw data. Applications must either parse or skip exactly this many packets, in addition to the header packet, before treating the stream as raw data.
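The skipping rule above can be sketched in a few lines. This is only an illustration, assuming the packets have already been extracted from the Ogg pages by a demuxer and are available as a list of byte strings; the function name `split_packets` is hypothetical, and the count is read from byte 6 of the 16-byte stream header described in the Header Packet section.

```python
def split_packets(packets):
    """Split ordered packet payloads into (header, extra_headers, data_packets).

    Assumption: the count field holds the number of packets that follow the
    stream header and precede the raw data, as the text above describes.
    """
    header = packets[0]
    if len(header) != 16 or header[0] != 0x00 or header[1:4] != b"PCM":
        raise ValueError("first packet is not an OggPCM stream header")

    extra_count = header[6]  # 'Number of header packets preceding data'
    extra = packets[1:1 + extra_count]
    if len(extra) != extra_count:
        raise ValueError("stream ends before all header packets were seen")

    # Everything after the counted packets is treated as raw PCM data.
    data = packets[1 + extra_count:]
    return header, extra, data
```

An application that does not understand a given comment packet can simply discard the middle element of the returned tuple.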

Header Packet

Multibyte fields in the header packets are packed in big endian order, to be consistent with network byte order. A header packet contains the following fields:

Packet 0, BOS, 16 bytes
 8  0x00   Stream Header Packet ID
24  "PCM"  Codec identifier 
 -
 8  0x01   Version Major (breaks backwards compatibility to increment)
 8  0x00   Version Minor (backwards compatible, i.e., more supported format IDs)
 8  [uint] Number of header packets preceding data
 8  [uint] Number of Channels, 0 = 256
 -
16  [flag] Flags
16  [enum] PCM Format ID
 -
32  [uint] Sample Rate
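As a rough illustration of the layout above, the 16-byte header can be unpacked with Python's `struct` module using big endian fields. The names `StreamHeader` and `parse_stream_header` are hypothetical and not part of the draft; this is a sketch, not a reference parser.

```python
import struct
from collections import namedtuple

# Hypothetical container for the decoded fields of the stream header.
StreamHeader = namedtuple(
    "StreamHeader",
    "version_major version_minor header_packets channels "
    "flags format_id sample_rate",
)

def parse_stream_header(packet):
    """Decode the 16-byte stream header packet (big endian fields)."""
    if len(packet) != 16:
        raise ValueError("stream header must be exactly 16 bytes")

    # B = 8 bits, 3s = the "PCM" identifier, H = 16 bits, I = 32 bits.
    (packet_id, codec, major, minor, header_packets,
     channels, flags, format_id, sample_rate) = struct.unpack(
        ">B3sBBBBHHI", packet)

    if packet_id != 0x00 or codec != b"PCM":
        raise ValueError("not an OggPCM stream header")
    if channels == 0:
        channels = 256  # the table defines 0 as meaning 256 channels

    return StreamHeader(major, minor, header_packets, channels,
                        flags, format_id, sample_rate)
```

A caller would typically also check `version_major` before trusting the remaining fields, since a change in the major version breaks backwards compatibility.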

The flags field is defined as follows:

  Bit       Description
  15 (MSB)  Interleaved/Chunked - If set, data in the packets is "chunked" by channel. In a data
            packet containing 3 channels and 2 samples/channel, the chunked storage order would be
            001122. For the interleaved storage format (default), the order would be 012012.
  others    Reserved
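The two storage orders can be illustrated with a small helper that separates a packet's samples into per-channel lists. The function name `split_channels` is hypothetical, and the samples are assumed to be already decoded into a flat list.

```python
def split_channels(samples, channels, chunked):
    """Return one list of samples per channel.

    samples: flat list of decoded samples in packet order.
    chunked: True if the Interleaved/Chunked flag (bit 15) is set.
    """
    if len(samples) % channels != 0:
        raise ValueError("packet does not hold whole audio frames")
    per_channel = len(samples) // channels

    if chunked:
        # Chunked: all samples of channel 0, then channel 1, and so on.
        return [samples[c * per_channel:(c + 1) * per_channel]
                for c in range(channels)]
    # Interleaved (default): one sample from each channel in turn.
    return [samples[c::channels] for c in range(channels)]
```

Using the channel numbers from the table as sample values, both `split_channels([0, 0, 1, 1, 2, 2], 3, True)` and `split_channels([0, 1, 2, 0, 1, 2], 3, False)` yield the same three per-channel lists.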

Applications conforming to version 1.0 of this spec MUST:

  • set all reserved flags to false (zero) when creating these streams.
  • preserve all values of all reserved flags when reading or modifying these streams, unless the application sets the minor version field to zero, in which case the reserved flags must be set to false as well.
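One way to read the two requirements above is sketched below. The constant and function names are hypothetical; the only values taken from the draft are the position of the Interleaved/Chunked bit and the rule that reserved bits are cleared when the minor version is set to zero.

```python
FLAG_CHUNKED = 0x8000                    # bit 15 (MSB) of the 16-bit flags field
RESERVED_MASK = 0xFFFF & ~FLAG_CHUNKED   # every other bit is reserved

def flags_for_new_stream(chunked):
    """Flags for a newly created stream: reserved bits are all zero."""
    return FLAG_CHUNKED if chunked else 0

def flags_for_rewrite(original_flags, chunked, new_minor_version):
    """Flags for a stream that is being read and written back out."""
    reserved = original_flags & RESERVED_MASK
    if new_minor_version == 0:
        # Writing minor version zero requires clearing the reserved bits.
        reserved = 0
    return (FLAG_CHUNKED if chunked else 0) | reserved
```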

Comment Packets

At this time, there is only one defined comment packet.

Comment Header Packet
 8  0x01   Comment Header Packet ID
24  "PCM"  Codec Identifier
-- Continues as Vorbis's Comment Header (http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment)
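A minimal sketch of reading such a packet follows, assuming the bytes after the 4-byte prefix use the Vorbis comment layout: a length-prefixed vendor string, a comment count, and length-prefixed UTF-8 comments, with 32-bit little endian lengths. Whether the trailing Vorbis framing bit is present in OggPCM is not stated in this draft, so the sketch ignores any trailing bytes. The function name is hypothetical.

```python
import struct

def parse_comment_packet(packet):
    """Return (vendor, comments) from an OggPCM comment header packet."""
    if packet[0] != 0x01 or packet[1:4] != b"PCM":
        raise ValueError("not an OggPCM comment header packet")
    pos = 4

    # Vendor string: 32-bit little endian length, then UTF-8 bytes.
    (vendor_len,) = struct.unpack_from("<I", packet, pos)
    pos += 4
    vendor = packet[pos:pos + vendor_len].decode("utf-8")
    pos += vendor_len

    # Comment list: count, then one length-prefixed string per comment.
    (count,) = struct.unpack_from("<I", packet, pos)
    pos += 4
    comments = []
    for _ in range(count):
        (length,) = struct.unpack_from("<I", packet, pos)
        pos += 4
        comments.append(packet[pos:pos + length].decode("utf-8"))
        pos += length
    return vendor, comments
```

Note that the comment body is little endian even though the stream header packet is big endian, because the layout is borrowed from Vorbis unchanged.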

Data Packets

Data packets have no header word, which preserves the alignment of the data payload. The contents of the data packets are specified by a combination of the 'PCM Format ID' field and the 'Flags' field. The length of the data packet must be a multiple of the number of channels specified in the header times the storage size of a single sample, as specified by the 'PCM Format ID' field.
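The length rule can be checked with simple arithmetic. The helper below is a hypothetical sketch that takes the storage size as a plain argument rather than deriving it from the format ID.

```python
def frames_in_packet(packet_length, channels, bytes_per_sample):
    """Return the number of whole audio frames in a data packet.

    Raises ValueError if the length is not a multiple of
    channels * bytes_per_sample. A zero-length packet is valid
    and contains zero frames.
    """
    frame_size = channels * bytes_per_sample
    if packet_length % frame_size != 0:
        raise ValueError(
            "data packet length %d is not a multiple of %d bytes"
            % (packet_length, frame_size))
    return packet_length // frame_size
```

For a stream with 6 channels and 2 bytes per sample, a 24-byte packet holds 2 audio frames, while a 13-byte packet is rejected.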

Supported PCM Formats

Formats are identified within a header packet by a 16 bit "format type" field. While most applications will treat this as an opaque type, it is possible to discern some information about the format from the value of this field itself. Specifically, the format's storage size, in bytes, and its byte ordering can be discerned by parsing the lower 6 bits of the value. These values are exposed so that it is possible to extract individual samples without necessarily understanding the coding scheme involved. For practical purposes, due to performance concerns, most applications will choose to operate on a buffer directly, but it is nonetheless possible to work a sample at a time.

Binary Value    Meaning
..xxxx00        N/A, or data not accurately described by this scheme.
..xxxx01        Least significant byte first. Bytes are MS bit first.
..xxxx10        Most significant byte first. Bytes are MS bit first.
..xxxx11        Data is machine endian
..0000xx        Data can not be described by this bytepacking scheme.
..0001xx        Samples are stored using one byte per sample
..0010xx        Samples are stored using two bytes per sample
..0011xx        Samples are stored using three bytes per sample
..0100xx        Samples are stored using four bytes per sample
..1000xx        Samples are stored using eight bytes per sample
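The table above can be applied with two bit masks: the low 2 bits give the byte order and the next 4 bits give the storage size in bytes. The names below are hypothetical and the sketch only reports what the bits say; it does not decode samples.

```python
# Meaning of the two lowest bits of the format ID.
BYTE_ORDERS = {0: None, 1: "little", 2: "big", 3: "machine"}

def storage_info(format_id):
    """Return (bytes_per_sample, byte_order) from the lower 6 bits.

    Either element is None when the format ID says the data is not
    described by this byte-packing scheme.
    """
    byte_order = BYTE_ORDERS[format_id & 0x3]
    size = (format_id >> 2) & 0xF
    return (size if size != 0 else None), byte_order
```

For example, `storage_info(0x0009)` reports two bytes per sample in little endian order, which matches the 16 bit little endian entry in the format list.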

The remaining 10 bits describe the coding scheme used to convert the digital value to an audio signal. The following formats are defined for version 1.0 of this format. For purposes of attribution, it should be noted that these formats are the PCM formats supported by the Advanced Linux Sound Architecture (ALSA) project, and should be fairly comprehensive.

Format ID  Short Name             Description
 -- Signed integer coding (0)
 0x0004    OGGPCM_FMT_S8          Signed integer 8 bit
 0x0009    OGGPCM_FMT_S16_LE      Signed integer 16 bit little endian
 0x000A    OGGPCM_FMT_S16_BE      Signed integer 16 bit big endian
 0x000B    OGGPCM_FMT_S16         Signed integer 16 bit machine endian
 0x000D    OGGPCM_FMT_S24_3LE     Signed integer 24 bit little endian
 0x000E    OGGPCM_FMT_S24_3BE     Signed integer 24 bit big endian
 0x0011    OGGPCM_FMT_S32_LE      Signed integer 32 bit little endian
 0x0012    OGGPCM_FMT_S32_BE      Signed integer 32 bit big endian
 0x0013    OGGPCM_FMT_S32         Signed integer 32 bit machine endian
 --
 -- Unsigned integer coding (1)
 0x0044    OGGPCM_FMT_U8          Unsigned integer 8 bit
 0x0049    OGGPCM_FMT_U16_LE      Unsigned integer 16 bit little endian
 0x004A    OGGPCM_FMT_U16_BE      Unsigned integer 16 bit big endian
 0x004B    OGGPCM_FMT_U16         Unsigned integer 16 bit machine endian
 0x004D    OGGPCM_FMT_U24_3LE     Unsigned integer 24 bit little endian
 0x004E    OGGPCM_FMT_U24_3BE     Unsigned integer 24 bit big endian
 0x0051    OGGPCM_FMT_U32_LE      Unsigned integer 32 bit little endian
 0x0052    OGGPCM_FMT_U32_BE      Unsigned integer 32 bit big endian
 0x0053    OGGPCM_FMT_U32         Unsigned integer 32 bit machine endian
 --
 -- IEEE Floating point coding (2)
 0x0091    OGGPCM_FMT_FLT_LE      IEEE Float (-1,1) 32 bit little endian
 0x0092    OGGPCM_FMT_FLT_BE      IEEE Float (-1,1) 32 bit big endian
 0x0093    OGGPCM_FMT_FLT         IEEE Float (-1,1) 32 bit machine endian
 0x00A1    OGGPCM_FMT_FLT64_LE    IEEE Float (-1,1) 64 bit little endian
 0x00A2    OGGPCM_FMT_FLT64_BE    IEEE Float (-1,1) 64 bit big endian
 0x00A3    OGGPCM_FMT_FLT64       IEEE Float (-1,1) 64 bit machine endian
 --
 -- IEC958 coding (?) (3)
 0x00CD    OGGPCM_FMT_IEC958_3LE  IEC958 Subframe, 24 bit little endian
 0x00CE    OGGPCM_FMT_IEC958_3BE  IEC958 Subframe, 24 bit big endian
 0x00D1    OGGPCM_FMT_IEC958_LE   IEC958 Subframe, 32 bit little endian
 0x00D2    OGGPCM_FMT_IEC958_BE   IEC958 Subframe, 32 bit big endian
 0x00D3    OGGPCM_FMT_IEC958      IEC958 Subframe, 32 bit machine endian
 --
 -- Mu-Law coding (4)
 0x0104    OGGPCM_FMT_MU_LAW      Mu-Law
 --
 -- A-Law coding (5)
 0x0144    OGGPCM_FMT_A_LAW       A-Law
 --
 -- ADPCM coding (6)
 0x0180    OGGPCM_FMT_ADPCM       IMA ADPCM
 --
 -- GSM coding (7)
 0x01C0    OGGPCM_FMT_GSM         GSM
 --
 -- 24 bit signed integer in 32 bit storage (8)
 0x0211    OGGPCM_FMT_S24_LE      Signed integer 24 bit little endian
 0x0212    OGGPCM_FMT_S24_BE      Signed integer 24 bit big endian
 0x0213    OGGPCM_FMT_S24         Signed integer 24 bit machine endian
 --
 -- 24 bit unsigned integer in 32 bit storage (9)
 0x0251    OGGPCM_FMT_U24_LE      Unsigned integer 24 bit little endian
 0x0252    OGGPCM_FMT_U24_BE      Unsigned integer 24 bit big endian
 0x0253    OGGPCM_FMT_U24         Unsigned integer 24 bit machine endian
 --
 -- 20 bit signed integer in 24 bit storage (10)
 0x028D    OGGPCM_FMT_S20_3LE     Signed integer 20 bit little endian
 0x028E    OGGPCM_FMT_S20_3BE     Signed integer 20 bit big endian
 --
 -- 20 bit unsigned integer in 24 bit storage (11)
 0x02CD    OGGPCM_FMT_U20_3LE     Unsigned integer 20 bit little endian
 0x02CE    OGGPCM_FMT_U20_3BE     Unsigned integer 20 bit big endian
 --
 -- 18 bit signed integer in 24 bit storage (12)
 0x030D    OGGPCM_FMT_S18_3LE     Signed integer 18 bit little endian
 0x030E    OGGPCM_FMT_S18_3BE     Signed integer 18 bit big endian
 --
 -- 18 bit unsigned integer in 24 bit storage (13)
 0x034D    OGGPCM_FMT_U18_3LE     Unsigned integer 18 bit little endian
 0x034E    OGGPCM_FMT_U18_3BE     Unsigned integer 18 bit big endian
 --
 Other coding schemes supported by ALSA but not specified here:
   MPEG
 --
 TODO: ADPCM and GSM need further specification (or elimination) since these aren't really
       byte packed like the other formats here are.
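The format IDs in the list follow a regular pattern: the coding scheme number shown in parentheses occupies the upper 10 bits, above the 6 storage bits. The sketch below composes and decomposes IDs on that assumption; the function names are hypothetical, and the pattern is inferred from the listed values rather than stated as a rule in the draft.

```python
def make_format_id(coding, bytes_per_sample, endian_bits):
    """Compose a 16-bit format ID from its three bit fields.

    coding:           coding scheme number (upper 10 bits)
    bytes_per_sample: storage size field (4 bits)
    endian_bits:      0 = N/A, 1 = little, 2 = big, 3 = machine
    """
    return (coding << 6) | (bytes_per_sample << 2) | endian_bits

def coding_scheme(format_id):
    """Return the coding scheme number held in the upper 10 bits."""
    return format_id >> 6
```

For instance, coding 2 (IEEE floating point) with 8 bytes per sample and little endian order gives 0x00A1, the listed value of OGGPCM_FMT_FLT64_LE.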

Encapsulation in Ogg

Following standard terminology for uncompressed audio, an audio frame is the collection of samples for all channels for a single sampling period. For example, an audio frame for a stereo signal is a pair of sample values for the left and right channels.

The granulepos of an Ogg page indicates the presentation time of the last presentable element in the last complete packet within that page; for OggPCM, a granule is an audio frame. The granule position specified is the total number of audio frames in the stream up to and including the last complete packet in a page. Audio frames must not be split across packets. The rationale is that the position specified in the page header of the last page tells how long the data coded by the bitstream is, in audio frames, and also provides the current stream position to seeking routines. A truncated stream will still return the proper number of audio frames that can be decoded fully.
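The running total described above can be sketched as follows. The input shape is an assumption made for illustration: each page is represented only by the byte lengths of the data packets that complete on it, and the function name is hypothetical.

```python
def page_granule_positions(pages, channels, bytes_per_sample):
    """Return the granule position for each page.

    pages: list of pages, each a list of byte lengths of the data
           packets that complete on that page.
    The granule position is the total number of audio frames in the
    stream up to and including the last complete packet of the page.
    """
    frame_size = channels * bytes_per_sample
    total_frames = 0
    positions = []
    for packet_lengths in pages:
        for length in packet_lengths:
            if length % frame_size != 0:
                raise ValueError("audio frame split across packets")
            total_frames += length // frame_size
        positions.append(total_frames)
    return positions
```

Dividing the last page's granule position by the sample rate from the stream header gives the stream duration in seconds.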