# TransOgg Page

# TransOgg Page Primitive

The transOgg transport consists of a single encapsulation primitive, based on the original Ogg page. Pages are byte aligned; in the diagram below, bytes are encoded left to right, top to bottom. Multi-byte quantities are encoded in little-endian byte order.

```
         01234567 01234567 01234567 01234567
             0        1        2        3
 --    |-------- -------- -------- --------|
      0|          capture pattern          |3
       |--------|-------- -------- --------|
      4| sgmnts |   stream identification  |7
       |------|-|--------|-------- --------|
 20   8| flag |  hbytes  |      dbytes     |11
       |------|- --------|-------- --------|
     12|          32 bit checksum          |15
       |-------- -------- -------- --------|
     16|          DTS [low] word           |19
 --    |-------- -------- -------- --------|
     20?           DTS high word           |
       |-------- -------- -------- --------
       ? sequence...
       |--------
hbytes ? distance...
       |--------
       ? segment table...
       |--------
       ? delay... duration... ppflags...
 --    |--------
 --    |--------
```

## capture pattern

The capture pattern consists of four 7-bit-clean ASCII bytes: 'tOgS' (0x74, 0x4F, 0x67, 0x53 in order).

## segment count

'sgmnts' (8 bits, 0-255) indicates the number of packet segments encoded in this page. The first and/or last packets may be partial as specified by the FROM/TO flags (below).

## stream identification

The stream identification ID is a 24-bit pseudo-random number that uniquely identifies this media stream within the larger multiplexed stream. It must be unique within the current multiplexed section and also globally unique within a chained stream. The 24-bit ID is large enough to act like a weak hash, making it highly unlikely that a stream's ID number will need to be rewritten (and all of its pages re-checksummed) when multiplexing or concatenating.

## flags

'flags' defines seven bit flags (bits 0-6 of byte 8) as follows:

```
  0 FROM:   set == initial packet continued from previous page
            note: unset if page contains no packets
  1 TO  :   set == final packet continued on next page
            note: unset if page contains no packets
  2 CRC :   set == checksum applies to header and data
          unset == checksum applies to header fields only
  3 SYNC:   set == payload data begins with a syncpoint/keyframe
            note: always set for keyframeless codecs
            note: set if a keyframe/syncpoint packet is continued
                  onto the current page
  4 SEQ :   set == sequence field is present
  5 DURA:   set == full raw duration encoding present
  6 EVIL:          as specified in RFC 3514
```
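The flag assignments above can be mirrored in code as a bit-flag enumeration. This is only a sketch: the names `PageFlag` and `page_flags` are hypothetical, and it assumes that flag number n corresponds to bit n of byte 8 with bit 0 as the least significant bit.

```python
from enum import IntFlag


class PageFlag(IntFlag):
    """Hypothetical names for the seven page flags (bits 0-6 of byte 8)."""
    FROM = 1 << 0   # initial packet continued from previous page
    TO = 1 << 1     # final packet continued on next page
    CRC = 1 << 2    # checksum covers header and data
    SYNC = 1 << 3   # payload begins with a syncpoint/keyframe
    SEQ = 1 << 4    # sequence field is present
    DURA = 1 << 5   # full raw duration encoding present
    EVIL = 1 << 6   # as specified in RFC 3514


def page_flags(byte8: int) -> PageFlag:
    """Extract the flags from byte 8; bit 7 belongs to 'hbytes' and is masked off."""
    return PageFlag(byte8 & 0x7F)
```

For example, `PageFlag.SYNC in page_flags(header[8])` tests whether the distance field should be absent.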

'hbytes' (bit 7 of byte 8 and bits 0-7 of byte 9, for 9 bits total, 0-511) indicates the number of bytes spanned by the variable-length header fields (DTS high word, sequence, distance, lacing, and delay/duration/ppflags fields).

'dbytes' (16 bits, 0-65535) indicates the number of bytes of data payload.
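As an illustration of the fixed 20-byte layout, the following sketch unpacks the fields described so far. The names `FixedHeader` and `parse_fixed_header` are hypothetical. It assumes that bytes 8-9 can be read as one little-endian 16-bit word whose low 7 bits are the flags and whose upper 9 bits are 'hbytes', so that bit 7 of byte 8 is the least significant bit of 'hbytes'.

```python
import struct
from typing import NamedTuple

CAPTURE = b"tOgS"
FIXED_HEADER_BYTES = 20


class FixedHeader(NamedTuple):
    sgmnts: int      # number of packet segments on the page
    stream_id: int   # 24-bit stream identification
    flags: int       # 7 flag bits
    hbytes: int      # size of the variable-length header fields
    dbytes: int      # size of the data payload
    checksum: int    # 32-bit CRC as stored in the page
    dts_low: int     # first 32-bit word of the V32/64 DTS codeword


def parse_fixed_header(page: bytes) -> FixedHeader:
    if len(page) < FIXED_HEADER_BYTES:
        raise ValueError("need at least 20 bytes of page header")
    if page[0:4] != CAPTURE:
        raise ValueError("capture pattern not found")
    sgmnts = page[4]
    stream_id = int.from_bytes(page[5:8], "little")
    # Bytes 8-9: flags + hbytes, 10-11: dbytes, 12-15: checksum, 16-19: DTS low word.
    flag_word, dbytes, checksum, dts_low = struct.unpack_from("<HHII", page, 8)
    return FixedHeader(sgmnts, stream_id, flag_word & 0x7F, flag_word >> 7,
                       dbytes, checksum, dts_low)
```

With this reading, the complete page occupies `20 + hbytes + dbytes` bytes.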

## checksum

The checksum is a 32-bit CRC value (direct algorithm, initial value and final XOR = 0, generator polynomial = 0x04c11db7) encoded in the page header in little-endian format. The checksum is computed over the 20+hbytes header bytes, skipping the CRC bytes. When the CRC flag is set, the CRC continues over the entire page body (dbytes).
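A minimal bitwise sketch of this checksum follows. The function names are hypothetical. It assumes "direct algorithm" means the unreflected, most-significant-bit-first form of the CRC, and it reads "skipping the CRC bytes" as leaving bytes 12-15 out of the input entirely rather than treating them as zero.

```python
POLY = 0x04C11DB7


def crc32_direct(data: bytes, crc: int = 0) -> int:
    """MSB-first CRC-32 with initial value 0 and no final XOR; `crc` allows chaining."""
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ POLY) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def page_checksum(page: bytes, hbytes: int, dbytes: int, crc_flag: bool) -> int:
    """Checksum over the header (minus bytes 12-15), plus the body if CRC is set."""
    header_end = 20 + hbytes
    crc = crc32_direct(page[0:12])
    crc = crc32_direct(page[16:header_end], crc)
    if crc_flag:
        crc = crc32_direct(page[header_end:header_end + dbytes], crc)
    return crc
```

A table-driven version would be faster; the bit-at-a-time loop is used here only because it maps directly onto the parameters listed above.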

## delivery/decode-time stamp

The DTS field is a variable-length encoded delivery time stamp value, equivalent to the high bits of the granule position in the original Ogg container. The DTS value is encoded in V32/64 format and is either 4 or 8 bytes in total.

## sequence field

The sequence field is present when the SEQ flag is set; this field orders any sequence of pages that have the same DTS, such as pages without complete packets, or pages containing only packets with zero duration. The sequence value is encoded in V8/16/32 format. The first page in a sequence of pages with identical DTS does not set the SEQ flag. The second page in a sequence sets the SEQ flag and the sequence value to zero. The third page in a sequence sets the SEQ flag and the sequence value to one, etc.

## distance field

The distance field is present only if the SYNC flag is unset. It is equal to the DTS of the current page minus the DTS of the previous syncpoint packet (not page) minus one. The value is encoded in V8/16/32 format.

## segment table

Lacing values encode the length of each payload segment in the page into the segment table. Lacing values are coded in one, two or three bytes. Lacing values are coded until the total number of coded segment lengths == 'sgmnts-1'. Length of the last segment is implicit, equalling the unencoded remainder of 'dbytes', which may be zero.

Overrunning the number of declared segments (i.e., a zero run encodes past the 'sgmnts'-1 limit), or underrunning the expected number of segments (i.e., reading to the end of hbytes before seeing the expected number of segments) shall be considered an error condition rendering the page undecodable. Lacing values may encode no more than 255 segments total (including the implicit last segment), null or otherwise, in a single page.

If 'sgmnts' is zero, the page is a null page containing no data; 'dbytes' must also be zero.

## per-packet fields

The delay, duration and per-packet flags are collectively byte-aligned and fill out the remainder of the 'hbytes' span not filled by the DTS, sequence, distance and lacing fields. Within this span, the individual delay, duration and flag fields are bit-aligned using a big-endian byte-packer.
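The bit-aligned fields could be read with a small bit reader along the following lines. The class name `BitReader` is hypothetical, and the sketch assumes that "big-endian byte-packer" means bits are consumed from the most significant bit of each byte first, with each field's most significant bit written first.

```python
class BitReader:
    """Reads unsigned fields MSB-first from a byte span (assumed packing order)."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.bitpos = 0  # number of bits consumed so far

    def read(self, nbits: int) -> int:
        if self.bitpos + nbits > len(self.data) * 8:
            raise ValueError("read past end of per-packet field span")
        value = 0
        for _ in range(nbits):
            byte = self.data[self.bitpos >> 3]
            bit = (byte >> (7 - (self.bitpos & 7))) & 1
            value = (value << 1) | bit
            self.bitpos += 1
        return value
```

A field width of zero reads nothing and returns 0, which matches the case where N is zero and no values are written.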

### delay values

Delay values are written first. A field of N bits is written for every packet completed on the page. N is set in the stream metaheader; N may be zero, in which case no delay values are written. The value encoded is the PTS minus the DTS of the packet. The value is unsigned (non-negative).

### duration values

Duration values are written next, bit aligned to the end of the delay values. A duration value is written for each packet completed on the page.

If the DURA flag is unset, each duration value is written as an N bit quantity, where N is set in the metadata header. N may be zero. The value as written is interpreted according to the duration base, duration multiplier and duration table declared in the metadata header.

If the DURA flag is set, each duration value is instead encoded as a V8/16/32 value and interpreted directly against the stream's master PTS/DTS timebase.

### per-packet 'private' flags

Codec-private 'per packet' flags are encoded next. A field of N bits is written for every packet completed on the page. N is set in the stream metadata; N may be zero in which case no packet flags are written.

Any unused bits needed to fill out the last byte, such that the flags are written into an integral number of bytes, are set to zero. The hbytes field may not be used to 'pad' the flags fields with extra space; more than seven 'left over' bits ((hbytes + 4 - DTS bytes - sequence field bytes - distance field bytes - segment table bytes) * 8 - flag bits > 7) shall be considered an error rendering the page undecodable.
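The padding rule can be written out as a short calculation. The function names below are hypothetical. The sketch assumes that "flag bits" counts all bits written for the delay, duration and per-packet flag fields together, and it additionally treats a negative result (fields running past the span) as invalid, which the text does not state explicitly.

```python
def leftover_bits(hbytes: int, dts_bytes: int, seq_bytes: int,
                  dist_bytes: int, segtable_bytes: int, field_bits: int) -> int:
    """Bits remaining in the per-packet span after the bit-aligned fields.

    dts_bytes is 4 or 8; the +4 accounts for the DTS low word, which sits in
    the fixed header and is therefore not counted by hbytes.
    """
    span_bytes = hbytes + 4 - dts_bytes - seq_bytes - dist_bytes - segtable_bytes
    return span_bytes * 8 - field_bits


def padding_is_valid(leftover: int) -> bool:
    """At most seven zero bits may pad out the final byte."""
    return 0 <= leftover <= 7
```

For instance, with hbytes = 10, a 4-byte DTS, a 1-byte sequence field, no distance field, a 3-byte segment table and 45 field bits, the span is 6 bytes (48 bits) and 3 padding bits remain.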

Data payload is byte-aligned and immediately follows the last flag byte. The size of the data payload is equal to dbytes.

# Value encodings

## Lacing codewords

Lacing codewords are as follows:

```
  first byte == 0 through
                251 : stop reading, use unsigned value as sizeof packet.
                      Note that zero is a valid packet size.
             == 252 : read a second byte; packet size is the unsigned value
                      of the second byte + 252.
             == 253 : read a second byte; packet size is the unsigned value
                      of the second byte + 508.
             == 254 : read a second byte; packet size is the unsigned value
                      of the second byte + 764.
             == 255 : read a second and third unsigned byte.
                      If the second byte is < 252, packet size is
                        (second byte << 8) + third byte + 1020.
                      If the second byte == 252:
                        If the third byte is < 4, packet size is
                          (second byte << 8) + third byte + 1020.
                        If the third byte is >= 4, this indicates the
                          presence of (third byte) zero-length packets
                          in sequence.  It is always more efficient to
                          code more than three zero-length packets in
                          sequence using this three-byte signalling,
                          however muxers MAY use either encoding.
                          Demuxers MUST handle both cases.
                      If the second byte > 252, this indicates a case reserved
                        for future use; this shall render the page
                        undecodable.
```
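A decoder for these codewords, together with the segment table rules, might look like the sketch below. The function names are hypothetical. It assumes that second-byte values 0 through 251 all take the plain size branch of the three-byte form, which keeps the encodable sizes contiguous from 1020 up to 65535, and it reports running off the end of the buffer as an underrun.

```python
from typing import List, Tuple


def read_lacing(buf: bytes, pos: int) -> Tuple[List[int], int]:
    """Decode one lacing codeword at buf[pos]; return (segment sizes, next pos).

    A zero-run codeword yields several zero-length segments at once.
    """
    first = buf[pos]
    if first <= 251:
        return [first], pos + 1
    if first <= 254:
        # 252 -> +252, 253 -> +508, 254 -> +764
        return [buf[pos + 1] + 252 + 256 * (first - 252)], pos + 2
    second, third = buf[pos + 1], buf[pos + 2]
    if second <= 251 or (second == 252 and third < 4):
        return [(second << 8) + third + 1020], pos + 3
    if second == 252:
        return [0] * third, pos + 3
    raise ValueError("reserved lacing codeword; page is undecodable")


def decode_segment_table(buf: bytes, pos: int, sgmnts: int,
                         dbytes: int) -> Tuple[List[int], int]:
    """Return all segment sizes, including the implicit last one, and the next pos."""
    if sgmnts == 0:
        if dbytes != 0:
            raise ValueError("null page must have dbytes == 0")
        return [], pos
    sizes: List[int] = []
    while len(sizes) < sgmnts - 1:
        try:
            run, pos = read_lacing(buf, pos)
        except IndexError:
            raise ValueError("segment table underrun") from None
        sizes.extend(run)
    if len(sizes) > sgmnts - 1:
        raise ValueError("zero run overruns the declared segment count")
    last = dbytes - sum(sizes)
    if last < 0:
        raise ValueError("coded segment lengths exceed dbytes")
    sizes.append(last)
    return sizes, pos
```

Here `buf` would be the 'hbytes' span and `pos` the offset at which the segment table starts, after the DTS high word, sequence and distance fields.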

## V32/64 format

V32/64 is a bit-extension format that signals the codeword length using the leading bit (LSbit of the codeword) to indicate if the codeword is 32 or 64 bits.

```
initial [low] bit: 0 -> codeword is 32 bits total;
                        upper 31 bits encode an unsigned value between 0 and 2^31 - 1
                   1 -> codeword is 64 bits total;
                        upper 63 bits encode an unsigned value between 2^31 and 2^63 + 2^31 - 1
```
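The V32/64 rules can be sketched as an encoder and decoder pair. The function names are hypothetical. The sketch assumes the codeword is stored as a little-endian integer, so that the signalling bit is bit 0 of the first byte, and that the 64-bit form stores the value minus 2^31 in its upper 63 bits, as the ranges above imply.

```python
from typing import Tuple

BIAS = 1 << 31  # smallest value carried by the 64-bit form


def encode_v32_64(value: int) -> bytes:
    if 0 <= value < BIAS:
        return (value << 1).to_bytes(4, "little")
    biased = value - BIAS
    if 0 <= biased < (1 << 63):
        return ((biased << 1) | 1).to_bytes(8, "little")
    raise ValueError("value out of range for V32/64")


def decode_v32_64(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, next position)."""
    if len(buf) - pos < 4:
        raise ValueError("truncated V32/64 codeword")
    low = int.from_bytes(buf[pos:pos + 4], "little")
    if low & 1 == 0:
        return low >> 1, pos + 4
    if len(buf) - pos < 8:
        raise ValueError("truncated 64-bit V32/64 codeword")
    word = int.from_bytes(buf[pos:pos + 8], "little")
    return (word >> 1) + BIAS, pos + 8
```

Under these assumptions, a page's DTS would be decoded starting at byte 16, since the low word (bytes 16-19) and the optional high word (from byte 20) are contiguous.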

## V8/16/32 format

V8/16/32 is a bit-extension format that signals the codeword length using the leading bit[s] (LSbit[s] of the codeword) to indicate if the codeword is 8, 16 or 32 bits.

```
initial [low] bit[s]: 0  -> codeword is 8 bits total;
                            upper 7 bits encode an unsigned value between 0 and 127 (2^7 - 1)
                      10 -> codeword is 16 bits total;
                            upper 15 bits encode an unsigned value between 128 (2^7) and 32895 (2^15 + 2^7 - 1)
                      11 -> codeword is 32 bits total;
                            upper 31 bits encode an unsigned value between 32896 (2^15 + 2^7) and 2147516543 (2^31 + 2^15 + 2^7 - 1)
```

# Syncpoints / Keyframes

Streams in which not every frame serves as a syncpoint may place only one syncpoint (keyframe) packet per page. The syncpoint packet must be the first packet completed on the page (if any).