OggText

From XiphWiki
Revision as of 11:46, 8 November 2008 by Silvia (talk | contribs) (started the page)


This page describes a generic media mapping (i.e. rules for multiplexing) of "text codecs" into Ogg.

Text codecs are sequences of text chunks that have a timing relationship to an audio or video stream.

Prominent examples of such text codecs are:

  • closed captions (for the deaf)
  • subtitles
  • textual audio descriptions (for the blind)
  • karaoke
  • ticker text
  • active regions
  • metadata & semantic annotations
  • transcripts
  • lyrics
  • titles / credits

Many open formats already exist for specifying some of these, in particular closed captions and subtitles. They vary in complexity: some consist simply of a time stamp and a text, while others provide for extensive styling, graphics, and motion of the text blocks over time.

Whatever their differences, all of these codecs have to solve the same problems when they are multiplexed into Ogg. This is why this page describes generically how to multiplex text codecs into Ogg.

Codecs with existing mappings are:

  • CMML
  • Kate


Bitstream Format

A codec bitstream in Ogg consists of a sequence of header packets followed by data packets.

Header packets contain information necessary to identify and set up the codec. Data packets contain the actual codec data, in this case the time-aligned text.

When these packets are multiplexed into Ogg, they are mapped to Ogg pages. A text codec stream consists of a sequence of header pages, followed by a sequence of data pages, followed by an EOS page, which finishes the stream. The pages must be ordered so that time is non-decreasing. No data can come after the EOS page.
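The ordering rules above can be sketched as a small checker. This is only an illustration: the `Page` class, its `kind` labels ("header", "data", "eos"), and the use of seconds for `time` are assumptions made for the example, not part of any Ogg library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    kind: str    # hypothetical labels: "header", "data", or "eos"
    time: float  # presentation time in seconds (assumed unit)


def check_page_order(pages: List[Page]) -> List[str]:
    """Return a list of problems; an empty list means the order is valid."""
    problems = []
    phase = {"header": 0, "data": 1, "eos": 2}
    if not pages or pages[0].kind != "header":
        problems.append("stream must start with a header page")
    if not pages or pages[-1].kind != "eos":
        problems.append("stream must end with an EOS page")
    last_phase = 0
    last_time = float("-inf")
    for i, page in enumerate(pages):
        p = phase[page.kind]
        if last_phase == 2:
            # Nothing may follow the EOS page.
            problems.append(f"page {i}: content after the EOS page")
        elif p < last_phase:
            # For example, a header page after data pages have started.
            problems.append(f"page {i}: {page.kind} page out of sequence")
        if page.time < last_time:
            problems.append(f"page {i}: time goes backwards")
        last_phase = max(last_phase, p)
        last_time = max(last_time, page.time)
    return problems
```

A stream such as header, data at 1.0 s, data at 4.5 s, EOS at 4.5 s passes, since equal times are allowed by the non-decreasing rule.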


Header pages

Header packets are a sequence of:

  • one ident header, which identifies the codec
  • one (optional) vorbis-comment header
  • one or more secondary header packets that are codec specific
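As a rough sketch, the header sequence listed above can be checked as follows. The labels "ident", "comment", and "secondary" are hypothetical names chosen for this example, and the check assumes the packets appear in the order in which the list gives them.

```python
from typing import List


def check_header_sequence(kinds: List[str]) -> bool:
    """Check header packet kinds, given in stream order.

    Uses the hypothetical labels "ident", "comment", and "secondary".
    """
    # Exactly one ident header, and it comes first.
    if not kinds or kinds[0] != "ident":
        return False
    rest = kinds[1:]
    # At most one vorbis-comment header, directly after the ident header.
    if rest and rest[0] == "comment":
        rest = rest[1:]
    # One or more codec-specific secondary headers must follow.
    return len(rest) >= 1 and all(k == "secondary" for k in rest)
```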


Data pages

A data packet generally holds the text data that is inserted into the Ogg stream at a specific time. Each data packet is mapped, with all its content, onto a single Ogg data page. This is possible because text codec packets are generally rather small. The insertion time is encoded in the granule_pos of the Ogg page.
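To make the one-packet-per-page mapping concrete, here is a minimal sketch that wraps one text packet in one Ogg page, following the page header layout and CRC of the Ogg encapsulation format (RFC 3533). The function names are made up for this example. How a time maps to a granule position is codec specific, so the example simply passes in an integer and treats it as milliseconds, which is an assumption and not something this page defines.

```python
import struct


def ogg_crc(data: bytes) -> int:
    """CRC-32 as used by Ogg: polynomial 0x04C11DB7, initial value 0, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def lacing_values(packet_len: int) -> bytes:
    """Segment table for one complete packet: 255s, then a final value below 255."""
    full, rest = divmod(packet_len, 255)
    return bytes([255] * full + [rest])


def build_page(packet: bytes, granule_pos: int, serial: int, seq: int,
               flags: int = 0) -> bytes:
    """Wrap a single packet in a single Ogg page (hypothetical helper)."""
    table = lacing_values(len(packet))
    if len(table) > 255:
        # A page holds at most 255 segments, so at most 65024 packet bytes here.
        raise ValueError("packet does not fit in a single Ogg page")
    header = struct.pack(
        "<4sBBqIIIB",
        b"OggS",      # capture pattern
        0,            # stream structure version
        flags,        # header type flags
        granule_pos,  # insertion time, in units assumed by the caller
        serial,       # bitstream serial number
        seq,          # page sequence number
        0,            # CRC placeholder, filled in below
        len(table),   # number of segments
    )
    page = header + table + packet
    crc = ogg_crc(page)  # computed with the CRC field set to zero
    return page[:22] + struct.pack("<I", crc) + page[26:]
```

For instance, `build_page(b"Hello", 1500, serial=1, seq=3)` yields a 33-byte page: 27 header bytes, one lacing value, and the 5-byte packet.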

Text codecs are discontinuous codecs, so there may be a long time between pages of a text stream in a multiplexed file. Inserting keep-alive pages into the data stream at regular intervals is therefore optional but encouraged. This helps a decoder's seeking code to find the currently active text packet more easily.
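One possible way to schedule keep-alive pages is sketched below. The function name, the millisecond units, and the policy of restarting the interval after every data page are all assumptions for illustration; what a keep-alive page actually contains is codec specific and is not modelled here.

```python
from typing import List, Tuple


def add_keepalives(data_times_ms: List[int], interval_ms: int,
                   end_ms: int) -> List[Tuple[int, str]]:
    """Interleave data page times with keep-alive times (hypothetical helper).

    A keep-alive is scheduled whenever interval_ms passes without a data page.
    """
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    schedule = []
    last = 0
    for t in sorted(data_times_ms):
        k = last + interval_ms
        while k < t:
            schedule.append((k, "keepalive"))
            k += interval_ms
        schedule.append((t, "data"))
        last = t
    # Fill the gap between the last data page and the end of the stream.
    k = last + interval_ms
    while k < end_ms:
        schedule.append((k, "keepalive"))
        k += interval_ms
    return schedule
```

With data pages at 1000 ms and 7000 ms, a 2000 ms interval, and a stream ending at 10000 ms, keep-alives land at 3000, 5000, and 9000 ms.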


EOS page

The EOS page ends a text codec stream. It carries an empty packet because all the information of the codec is encapsulated in the earlier data pages.
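As an illustration, the following sketch tests whether a raw page is such an EOS page. It relies on the page header layout of the Ogg encapsulation format (RFC 3533), where the end-of-stream flag is 0x04 and an empty packet appears as a single lacing value of 0. The function name is made up for this example, the check assumes the empty packet is the only packet on the page, and the CRC is not verified.

```python
import struct

# Ogg page header up to and including the segment count (27 bytes).
HEADER = struct.Struct("<4sBBqIIIB")
EOS_FLAG = 0x04


def is_empty_eos_page(page: bytes) -> bool:
    """True if the page has the EOS flag and holds exactly one empty packet."""
    if len(page) < HEADER.size:
        return False
    magic, version, flags, _granule, _serial, _seq, _crc, n_segments = \
        HEADER.unpack_from(page)
    if magic != b"OggS" or version != 0:
        return False
    table = page[HEADER.size:HEADER.size + n_segments]
    body = page[HEADER.size + n_segments:]
    # One lacing value of 0 means one zero-length packet and no body bytes.
    return bool(flags & EOS_FLAG) and table == b"\x00" and body == b""


# Hand-built sample page; its CRC field is left at 0 and is not checked here.
sample = HEADER.pack(b"OggS", 0, EOS_FLAG, 5000, 1, 9, 0, 1) + b"\x00"
```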