<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.xiph.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Cpearce</id>
	<title>XiphWiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.xiph.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Cpearce"/>
	<link rel="alternate" type="text/html" href="https://wiki.xiph.org/Special:Contributions/Cpearce"/>
	<updated>2026-04-24T12:38:39Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.45.1</generator>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12848</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12848"/>
		<updated>2011-05-11T21:57:25Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. The latest version of Skeleton, version 4.0, also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg. It is, however, not possible to generically parse such a file, seek on it, understand what codecs it contains, and dynamically handle and play back such content.&lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search for the seek target timestamp. However, when seeking over a high-latency connection, such as the internet, such searches can be slow. Some bitstreams, notably Theora, have keyframes, and so in order to seek to a given temporal offset in a Theora stream, you must first perform a bisection search to find the target Theora frame, determine its keyframe, and then perform another bisection search to locate that keyframe and decode forwards to the temporal offset. This can be very slow. Ogg Skeleton 4.0 provides an index of keyframes, and indexes periodic samples on streams without the concept of a keyframe, so that seeking over high-latency connections can be performed with &amp;quot;one hop&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
=== Previous version ===&lt;br /&gt;
&lt;br /&gt;
The previous version of Ogg Skeleton was version 3, and its specification can be found on the wiki page [[Ogg Skeleton 3]], or at [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt].&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour, and adding this field allows producers to retain this mapping when digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
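&lt;br /&gt;
As a rough illustration of the granule rate and granuleshift fields, the sketch below maps a granulepos to seconds. It assumes the Theora-style split described in [[GranulePosAndSeeking]] (upper bits plus the lower granuleshift bits) and ignores any codec-specific corrections; the function and parameter names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def granulepos_to_seconds(granulepos, granuleshift, rate_num, rate_den):
    """Map a granulepos to seconds (sketch; names are illustrative only)."""
    # Split the granulepos into its upper part and its lower
    # granuleshift bits. With granuleshift == 0 the lower part is 0.
    upper, lower = divmod(granulepos, 2 ** granuleshift)
    granules = upper + lower
    # time = granules / granulerate, where granulerate = rate_num / rate_den
    return Fraction(granules * rate_den, rate_num)
```
&lt;br /&gt;
For a stream with a granuleshift of 0 and a granule rate of 44100/1, a granulepos of 44100 maps to exactly one second.&lt;br /&gt;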
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. Bisection works fine for seeking in local files, but for files served over the Internet via HTTP, each bisection or non-sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets.&lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Skeleton provides seek algorithms with an index, or ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream, so if you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set which contains every active stream&#039;s last keypoint which has time less than or equal to the seek target time. This tells you a known point on every stream which lies before the seek target. Then from that set of key points, select the key point with the smallest byte offset. You then verify that there&#039;s a page from the keypoint&#039;s stream found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
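&lt;br /&gt;
The key point selection described above could be sketched as follows. This is only an illustration: it assumes the indexes have already been decoded into lists of (offset, time) pairs for the active streams, and the function name and the choice to return None as a signal to fall back to bisection are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from bisect import bisect_right
from fractions import Fraction


def choose_seek_keypoint(indexes, target):
    """Pick the key point to start decoding from (sketch).

    indexes maps a stream serialno to a list of (offset, time) key points
    sorted by time. Returns (serialno, offset, time), or None when some
    stream has no key point at or before the target.
    """
    candidates = []
    for serialno, keypoints in indexes.items():
        times = [t for _, t in keypoints]
        # Index of the last key point whose time is not after the target.
        i = bisect_right(times, target) - 1
        if i == -1:
            return None
        offset, time = keypoints[i]
        candidates.append((offset, serialno, time))
    if not candidates:
        return None
    # Of the per-stream candidates, start from the smallest byte offset.
    offset, serialno, time = min(candidates)
    return serialno, offset, time
```
&lt;br /&gt;
The caller still has to verify that a page of the returned stream starts at the returned offset before trusting the result.&lt;br /&gt;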
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 index packets also store metadata about the segment in which they reside. Each index packet stores the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the end time of the last active stream minus the start time of the first active stream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify that the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset, and start decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
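&lt;br /&gt;
The three checks above might be sketched like this, assuming the whole physical bitstream is available as a bytes object. The function names are hypothetical; the only facts relied on from the Ogg page format are the 27-byte minimum page header, the capture pattern OggS, and the little-endian serial number at bytes 14 to 17.&lt;br /&gt;
&lt;br /&gt;
```python
def page_serialno_at(data, offset):
    """Return the serialno of the page starting at offset, else None."""
    header = data[offset:offset + 27]
    # A page header is at least 27 bytes and starts with "OggS".
    if len(header) != 27 or header[:4] != b"OggS":
        return None
    # Bytes 14..17 of the page header hold the stream serial number.
    return int.from_bytes(header[14:18], "little")


def index_looks_valid(data, segment_length, keypoint_offset, keypoint_serialno):
    """Apply the validity checks to one key point (sketch)."""
    # The segment must end at segment_length, or a page (the start of a
    # new link in a chain) must begin exactly there. The BOS flag of that
    # page is not inspected in this sketch.
    ends_ok = (len(data) == segment_length
               or page_serialno_at(data, segment_length) is not None)
    if not ends_ok:
        return False
    # The key point must land exactly on a page of its own stream.
    return page_serialno_at(data, keypoint_offset) == keypoint_serialno
```
&lt;br /&gt;
When either function reports a mismatch, the safe response is to discard the index and fall back to a bisection search.&lt;br /&gt;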
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may only index periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.0 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its EOS page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0. Note the EOS packet appears by itself on its own page (the &amp;quot;EOS page&amp;quot;).&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length in bytes                                       | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content byte offset                                           | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.0 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
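&lt;br /&gt;
A minimal sketch of reading the fishead fields at the byte positions shown in the diagram above. It assumes that all integers are stored in little-endian byte order and that the time numerators and denominators are signed; neither assumption is stated in this section, and the function and key names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_fishead(packet):
    """Parse the first 80 bytes of a fishead packet into a dict (sketch)."""
    head = packet[:80]
    if len(head) != 80 or head[:8] != b"fishead\x00":
        raise ValueError("not a fishead packet of at least 80 bytes")

    def uint(start, size):
        # Assumption: unsigned little-endian integer.
        return int.from_bytes(head[start:start + size], "little")

    def sint(start, size):
        # Assumption: signed little-endian integer.
        return int.from_bytes(head[start:start + size], "little", signed=True)

    return {
        "version": (uint(8, 2), uint(10, 2)),
        # Rational numbers kept as (numerator, denominator) pairs.
        "presentation_time": (sint(12, 8), sint(20, 8)),
        "basetime": (sint(28, 8), sint(36, 8)),
        "utc": head[44:64],
        "segment_length": uint(64, 8),
        "content_offset": uint(72, 8),
    }
```
&lt;br /&gt;
A reader would normally check the version pair before interpreting the remaining fields.&lt;br /&gt;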
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | Padding/future use                            | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 52-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given, e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;. Message header fields are terminated/delimited by &amp;quot;\r\n&amp;quot;. Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility - to allow further fields to be included in the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.0:&lt;br /&gt;
* Content-type: mime type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each keyframe in the index is stored in a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte-encoded. The offset and timestamp deltas store the difference between the keypoint&#039;s offset and timestamp and the previous keypoint&#039;s offset and timestamp. So to calculate the page offset of a keypoint you must sum the offset deltas of all keypoints in the index up to and including that keypoint.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as the two bytes 0x23 0xBD in stream order, or 0010 0011 1011 1101 in binary.&lt;br /&gt;
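&lt;br /&gt;
The encoding can be sketched as below, under the reading that the low-order 7-bit group is written first and the high bit marks the final byte of each integer. The function names are hypothetical, and no bound is placed on the number of bytes read.&lt;br /&gt;
&lt;br /&gt;
```python
def encode_varint(value):
    """Encode a non-negative integer as variable-byte data (sketch)."""
    out = bytearray()
    while True:
        value, low = divmod(value, 128)  # peel off the low 7 bits
        if value == 0:
            out.append(low + 128)  # high bit set: this is the last byte
            return bytes(out)
        out.append(low)


def decode_varint(data, pos=0):
    """Decode one integer starting at pos; return (value, next_pos)."""
    value = 0
    scale = 1
    while True:
        byte = data[pos]  # raises IndexError on truncated input
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte // 128:  # high bit set, so the integer ends here
            return value, pos
```
&lt;br /&gt;
Under this reading, the integer 7843 round-trips through two bytes, and 0 is encoded as the single byte 0x80.&lt;br /&gt;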
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;index\0&#039;                                          | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Serial number                  | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Number of keypoints            | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Timestamp denominator         | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | First sample time numerator   | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Last sample end time numerator| 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Keypoints...                   | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The fields of the index packet are as follows:&lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;. Bytes [0...5].&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9]&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0. Bytes [10...17].&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track. Bytes [34...41]&lt;br /&gt;
# &#039;n&#039; key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including this one.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
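&lt;br /&gt;
Putting the field list together, an index packet could be decoded roughly as follows. This sketch assumes little-endian fixed-width fields and low-order-group-first variable byte integers; the function names and the returned dictionary keys are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def read_varint(data, pos):
    """Read one variable byte encoded integer; return (value, next_pos)."""
    value, scale = 0, 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte // 128:  # high bit marks the last byte
            return value, pos


def parse_index_packet(packet):
    """Decode a Skeleton 4.0 index packet into a dict (sketch)."""
    if packet[:6] != b"index\x00":
        raise ValueError("not a Skeleton index packet")
    serialno = int.from_bytes(packet[6:10], "little")
    count = int.from_bytes(packet[10:18], "little")
    denom = int.from_bytes(packet[18:26], "little", signed=True)
    first = int.from_bytes(packet[26:34], "little", signed=True)
    last = int.from_bytes(packet[34:42], "little", signed=True)
    if denom == 0:
        raise ValueError("timestamp denominator must not be 0")
    keypoints = []
    pos, offset, numer = 42, 0, 0
    for _ in range(count):
        delta, pos = read_varint(packet, pos)
        offset += delta  # running sum of byte offset deltas
        delta, pos = read_varint(packet, pos)
        numer += delta  # running sum of timestamp numerator deltas
        keypoints.append((offset, Fraction(numer, denom)))
    return {
        "serialno": serialno,
        "first_sample_time": Fraction(first, denom),
        "last_sample_time": Fraction(last, denom),
        "keypoints": keypoints,
    }
```
&lt;br /&gt;
Each returned key point is an (offset, time in seconds) pair, with the offset relative to the start of the Ogg segment.&lt;br /&gt;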
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value was unable to be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per every 64KB of data, or every 1000ms, whichever is least frequent.&lt;br /&gt;
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in a Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bistreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets (the fisbone and index packets)&lt;br /&gt;
* the Skeleton EOS packet appears by itself on the last page of the Skeleton stream (the &amp;quot;EOS page&amp;quot;).&lt;br /&gt;
* the Skeleton EOS page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear.&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton 4 is being supported by the following projects:&lt;br /&gt;
* ffmpeg2theora (version 0.27 and above) &lt;br /&gt;
* OggIndex: [http://firefogg.org/nightly/ binaries], [http://git.xiph.org/?p=OggIndex.git;a=summary source]&lt;br /&gt;
* Mozilla Firefox 4&lt;br /&gt;
&lt;br /&gt;
The following projects currently support Ogg Skeleton 3; support for Ogg Skeleton 4 is planned:&lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12680</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12680"/>
		<updated>2010-11-23T02:21:40Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* Development */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.0 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file can generally be created with Ogg; it is, however, not possible to generically parse such a file, seek on it, understand what codecs it contains, and dynamically handle and play back its content. &lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search for the seek target timestamp. However, when seeking over a high-latency connection, such as the internet, such searches can be slow. Some bitstreams, notably Theora, have keyframes, and so in order to seek to a given temporal offset in a Theora stream, you must first perform a bisection search to find the target Theora frame, determine its keyframe, and then perform another bisection search to locate that keyframe and decode forwards to the temporal offset. This can be very slow. Ogg Skeleton 4.0 provides an index of keyframes, and indexes periodic samples on streams without the concept of a keyframe, so that seeking over high-latency connections can be performed with &amp;quot;one hop&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
=== Previous version ===&lt;br /&gt;
&lt;br /&gt;
The previous version of Ogg Skeleton was version 3, and its specification can be found on the wiki page [[Ogg Skeleton]], or at [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt].&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour, and this additional field allows producers to retain that mapping when digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
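&lt;br /&gt;
The granule rate and granuleshift fields above combine into a time mapping. The Python sketch below is one illustration, assuming the Theora-style layout in which the upper bits of the granulepos hold the granule of the last keyframe and the lowest granuleshift bits hold the count since that keyframe; other codecs may map differently, and [[GranulePosAndSeeking]] remains the reference. The function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def granule_from_granulepos(granulepos, granuleshift):
    """Collapse a granulepos into a single granule count.

    Assumes the Theora-style layout: the upper bits hold the granule of
    the last keyframe and the lowest granuleshift bits hold the number
    of granules since that keyframe. With granuleshift 0 the result is
    the granulepos itself.
    """
    keyframe_part = granulepos // (2 ** granuleshift)
    delta_part = granulepos % (2 ** granuleshift)
    return keyframe_part + delta_part


def granule_time(granulepos, granuleshift, rate_num, rate_den):
    """Map a granulepos to seconds as granules / granulerate."""
    if rate_num == 0 or rate_den == 0:
        raise ValueError("granule rate is undefined")
    granules = granule_from_granulepos(granulepos, granuleshift)
    # granulerate is rate_num / rate_den Hz, so divide by that ratio.
    return Fraction(granules * rate_den, rate_num)


# Audio at 44100 Hz with no granuleshift: 88200 granules.
print(granule_time(88200, 0, 44100, 1))
# Video at 25 Hz with granuleshift 6: keyframe 100 plus 5 frames.
print(granule_time(100 * 64 + 5, 6, 25, 1))
```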
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. Bisection works fine for seeking in local files, but for seeking in files served over the Internet via HTTP, each bisection or non-sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets.&lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Skeleton provides seek algorithms with an index, or ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream, so if you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set containing, for every active stream, the last keypoint whose time is less than or equal to the seek target time. This gives you a known point on every stream which lies at or before the seek target. Then from that set of key points, select the key point with the smallest byte offset. You then verify that there&#039;s a page from the keypoint&#039;s stream found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
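&lt;br /&gt;
The keypoint selection described above can be sketched in Python as follows. This is only an illustration: the in-memory layout (a dict from serial number to a sorted list of offset and time pairs) and the function names are assumptions, and verifying the page found at the chosen offset is left to the caller.&lt;br /&gt;
&lt;br /&gt;
```python
from bisect import bisect_right


def last_keypoint_at_or_before(keypoints, target):
    """Return the last (offset, time) keypoint with time not after target.

    keypoints is the index of one stream: a list of (offset, time) pairs
    sorted by offset (and therefore by time). Returns None if no
    keypoint qualifies.
    """
    times = [time for _, time in keypoints]
    position = bisect_right(times, target)
    if position == 0:
        return None
    return keypoints[position - 1]


def choose_seek_keypoint(indexes, target):
    """Pick the keypoint to start decoding from for a seek to target.

    indexes maps the serial number of each active stream to its keypoint
    list. Returns (serialno, offset, time), or None when some stream has
    no keypoint at or before target, in which case the caller should
    fall back to a bisection search.
    """
    candidates = []
    for serialno, keypoints in indexes.items():
        keypoint = last_keypoint_at_or_before(keypoints, target)
        if keypoint is None:
            return None
        offset, time = keypoint
        candidates.append((offset, serialno, time))
    if not candidates:
        return None
    offset, serialno, time = min(candidates)  # smallest byte offset wins
    return serialno, offset, time


# Made-up indexes for two streams; times are in seconds.
example_indexes = {
    1: [(0, 0.0), (50000, 2.0), (120000, 4.0)],
    2: [(4000, 0.0), (60000, 2.5), (110000, 3.5)],
}
print(choose_seek_keypoint(example_indexes, 3.0))
```
&lt;br /&gt;
Starting at the smallest offset means decoding passes the chosen keypoint of every other stream on the way to the target.&lt;br /&gt;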
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
Each Skeleton 4.0 index packet also stores metadata about the segment in which it resides: the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the end time of the last active stream minus the start time of the first active stream.&lt;br /&gt;
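&lt;br /&gt;
As a small illustration of the duration calculation, the Python sketch below takes the first and last sample time numerators and the denominator of each index packet and returns the difference between the latest end time and the earliest start time. The tuple layout and function name are assumptions made for this example.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def segment_duration(tracks):
    """Duration in seconds from per-track index header fields.

    tracks is a list of (first_num, last_num, denominator) tuples, one
    per indexed track. Returns None when there are no tracks or any
    denominator is 0, since the times are then unknown.
    """
    if not tracks or any(den == 0 for _, _, den in tracks):
        return None
    start = min(Fraction(first, den) for first, _, den in tracks)
    end = max(Fraction(last, den) for _, last, den in tracks)
    return end - start


# Made-up values in milliseconds: video 0..10000, audio 23..10046.
print(segment_duration([(0, 10000, 1000), (23, 10046, 1000)]))
```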
&lt;br /&gt;
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify whether the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset, and start decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
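&lt;br /&gt;
The three checks above could be combined along the following lines. This Python sketch relies on the Ogg page header layout from RFC 3533 (capture pattern OggS at bytes 0 to 3, little-endian serial number at bytes 14 to 17); the function names and the way the measured segment length is obtained are assumptions for illustration.&lt;br /&gt;
&lt;br /&gt;
```python
OGG_CAPTURE_PATTERN = b"OggS"


def page_serialno(page_bytes):
    """Return the serial number of the Ogg page starting at page_bytes[0].

    Returns None if the bytes do not begin with the Ogg capture pattern
    or are too short to contain the serial number field.
    """
    if len(page_bytes) >= 18 and page_bytes[:4] == OGG_CAPTURE_PATTERN:
        return int.from_bytes(page_bytes[14:18], "little")
    return None


def index_looks_valid(actual_length, stored_length, page_bytes,
                      keypoint_serialno):
    """Apply the three checks to the result of one index-based seek.

    actual_length is the measured length of the segment, stored_length
    the value from the Skeleton BOS packet, and page_bytes the data read
    at the offset of the keypoint.
    """
    if actual_length != stored_length:
        return False  # segment changed, or chained since indexing
    serialno = page_serialno(page_bytes)
    if serialno is None:
        return False  # did not land on a page boundary
    return serialno == keypoint_serialno  # page must be from that stream
```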
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment, it may only index periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.0 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length in bytes                                       | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content byte offset                                           | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.0 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
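&lt;br /&gt;
Reading the fishead fields at the byte positions shown in the diagram might look like the Python sketch below. It assumes that all integer fields are little-endian and that the time numerators and denominators are signed; neither is stated in the diagram itself, so treat both as assumptions. The UTC field is returned as raw bytes, and the function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction

FISHEAD_SIZE = 80  # bytes 0..79 in the diagram above


def le_int(data, start, size, signed=False):
    """Read a little-endian integer of size bytes at data[start]."""
    return int.from_bytes(data[start:start + size], "little", signed=signed)


def rational(numerator, denominator):
    """Return numerator/denominator, or None when the denominator is 0."""
    return Fraction(numerator, denominator) if denominator else None


def parse_fishead(packet):
    """Parse the fixed fields of a Skeleton 4.0 fishead packet."""
    if FISHEAD_SIZE > len(packet) or packet[:8] != b"fishead\x00":
        raise ValueError("not a Skeleton 4.0 fishead packet")
    return {
        "version": (le_int(packet, 8, 2), le_int(packet, 10, 2)),
        "presentation_time": rational(le_int(packet, 12, 8, True),
                                      le_int(packet, 20, 8, True)),
        "basetime": rational(le_int(packet, 28, 8, True),
                             le_int(packet, 36, 8, True)),
        "utc": packet[44:64],  # 20 bytes, left uninterpreted here
        "segment_length": le_int(packet, 64, 8),
        "content_offset": le_int(packet, 72, 8),
    }
```
&lt;br /&gt;
A caller would typically compare the returned version against (4, 0) before trusting the segment length and content offset fields.&lt;br /&gt;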
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | Padding/future use                            | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 52-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given, e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;. Message header fields are terminated/delimited by &amp;quot;\r\n&amp;quot;. Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility: it allows further fields to be added to the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.0:&lt;br /&gt;
* Content-type: mime type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
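&lt;br /&gt;
A minimal Python sketch of parsing the message header fields and checking for the compulsory ones is given below. It takes the bytes that follow the fixed fisbone fields as input; decoding them as UTF-8 and treating field names as case-insensitive (as in HTTP) are assumptions, and the function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
REQUIRED_HEADERS = ("content-type", "role", "name")


def parse_message_headers(raw):
    """Parse fisbone message header fields into a dict.

    raw is the tail of a fisbone packet, starting at the message header
    fields. Field names are lower-cased so lookups ignore case.
    """
    headers = {}
    for line in raw.decode("utf-8", errors="replace").split("\r\n"):
        if not line:
            continue
        name, separator, value = line.partition(":")
        if not separator:
            continue  # skip malformed lines rather than fail
        headers[name.strip().lower()] = value.strip()
    return headers


def missing_required_headers(headers):
    """List the compulsory Skeleton 4.0 headers absent from headers."""
    return [name for name in REQUIRED_HEADERS if name not in headers]


example = b"Content-Type: audio/vorbis\r\nRole: audio/main\r\nName: audio_main\r\n"
print(parse_message_headers(example))
```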
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each keyframe in the index is stored in a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte-encoded. The offset and timestamp deltas store the difference between the keypoint&#039;s offset and timestamp and the previous keypoint&#039;s offset and timestamp. So to calculate the page offset of a keypoint you must sum the offset deltas of all keypoints in the index up to and including that keypoint.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes: 0xBD 0x23, or 1011 1101 0010 0011 in binary.&lt;br /&gt;
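&lt;br /&gt;
The encoding can be sketched in Python as below. Applying the two rules literally (least significant 7 bits first, high bit set on the last byte), the integer 7843 is written to the stream as 0x23 followed by 0xBD; the example above names the same two byte values with the most significant one first. The stream order used in this sketch is an assumption based on that reading, and the function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def encode_varint(value):
    """Encode a non-negative integer, 7 bits per byte, low bits first.

    The high bit is set only on the final byte.
    """
    if not value >= 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while value >= 128:
        out.append(value % 128)  # 7 payload bits, high bit clear
        value //= 128
    out.append(value + 128)      # high bit marks the last byte
    return bytes(out)


def decode_varint(data, position=0):
    """Decode one integer starting at data[position].

    Returns (value, next_position).
    """
    value = 0
    scale = 1
    while True:
        if position >= len(data):
            raise ValueError("truncated variable byte integer")
        byte = data[position]
        position += 1
        value += (byte % 128) * scale
        if byte >= 128:          # high bit set: this was the last byte
            return value, position
        scale *= 128


# encode_varint(7843) gives the two bytes 0x23 0xBD, in that order.
print(encode_varint(7843).hex())
print(decode_varint(bytes([0x23, 0xBD])))
```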
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;index\0&#039;                                          | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Serial number                  | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Number of keypoints            | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Timestamp denominator         | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | First sample time numerator   | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Last sample end time numerator| 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Keypoints...                   | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The fields of the index packet are as follows:&lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;. Bytes [0...5].&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9]&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039; as an 8 byte unsigned integer. This can be 0. Bytes [10...17].&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track. Bytes [34...41]&lt;br /&gt;
# &#039;n&#039; key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including it.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
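&lt;br /&gt;
Putting the field list together, an index packet could be decoded roughly as in the following Python sketch. The fixed-width fields are assumed to be little-endian, and the variable byte integers are read least significant 7 bits first, ending at the byte with the high bit set; both are assumptions of this sketch, and the function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction

INDEX_HEADER_SIZE = 42  # keypoints start at byte 42


def read_varint(data, position):
    """Read one variable byte integer; return (value, next_position)."""
    value = 0
    scale = 1
    while True:
        if position >= len(data):
            raise ValueError("truncated index packet")
        byte = data[position]
        position += 1
        value += (byte % 128) * scale
        if byte >= 128:  # high bit marks the last byte
            return value, position
        scale *= 128


def parse_index_packet(packet):
    """Decode a Skeleton 4.0 index packet into absolute keypoints."""
    if INDEX_HEADER_SIZE > len(packet) or packet[:6] != b"index\x00":
        raise ValueError("not a Skeleton 4.0 index packet")
    serialno = int.from_bytes(packet[6:10], "little")
    count = int.from_bytes(packet[10:18], "little")
    denominator = int.from_bytes(packet[18:26], "little", signed=True)
    first_num = int.from_bytes(packet[26:34], "little", signed=True)
    last_num = int.from_bytes(packet[34:42], "little", signed=True)
    if denominator == 0:
        raise ValueError("timestamp denominator must not be 0")
    keypoints = []
    position = INDEX_HEADER_SIZE
    offset = 0     # running sum of byte offset deltas
    time_num = 0   # running sum of timestamp numerator deltas
    for _ in range(count):
        delta, position = read_varint(packet, position)
        offset += delta
        delta, position = read_varint(packet, position)
        time_num += delta
        keypoints.append((offset, Fraction(time_num, denominator)))
    return {
        "serialno": serialno,
        "first_sample_time": Fraction(first_num, denominator),
        "last_sample_time": Fraction(last_num, denominator),
        "keypoints": keypoints,
    }
```
&lt;br /&gt;
The returned offsets are relative to the start of the segment, and the times are in seconds.&lt;br /&gt;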
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value could not be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per 64KB of data or per 1000ms, whichever is less frequent.&lt;br /&gt;
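&lt;br /&gt;
One possible reading of this recommendation is sketched below: a keyframe only becomes a key point once it is at least 64KB and at least 1000ms past the previously chosen one. This is an interpretation, not the behaviour of any particular indexer, and the function and constant names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
MIN_BYTE_GAP = 64 * 1024   # 64KB of data between key points
MIN_TIME_GAP_MS = 1000     # 1000ms between key points


def thin_keyframes(keyframes):
    """Select key points from (page_offset, time_ms) pairs sorted by offset.

    Requiring both gaps keeps the result no denser than the less frequent
    of the two limits.
    """
    kept = []
    for offset, time_ms in keyframes:
        if kept:
            last_offset, last_time = kept[-1]
            far_in_bytes = offset - last_offset >= MIN_BYTE_GAP
            far_in_time = time_ms - last_time >= MIN_TIME_GAP_MS
            if not (far_in_bytes and far_in_time):
                continue
        kept.append((offset, time_ms))
    return kept


# Hypothetical keyframe positions for one track.
print(thin_keyframes([(0, 0), (30000, 500), (70000, 900), (140000, 2000)]))
```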
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets (the fisbone and index packets)&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear.&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton 4 is supported by the following projects:&lt;br /&gt;
* ffmpeg2theora (version 0.27 and above) &lt;br /&gt;
* OggIndex: [http://firefogg.org/nightly/ binaries], [http://git.xiph.org/?p=OggIndex.git;a=summary source]&lt;br /&gt;
* Mozilla Firefox 4&lt;br /&gt;
&lt;br /&gt;
The following projects currently support Ogg Skeleton 3; support for Ogg Skeleton 4 is planned:&lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12679</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12679"/>
		<updated>2010-11-23T02:20:30Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.0 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file can generally be created with Ogg; it is however not possible to generically parse such a file, seek on it, understand what codecs it contains, and dynamically handle and play back such content.&lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search for the seek target timestamp. However, when seeking over a high latency connection, such as the internet, such searches can be slow. Some bitstreams, notably Theora, have keyframes, and so in order to seek to a given temporal offset in a Theora stream, you must first perform a bisection search to find the target Theora frame, determine its keyframe, and then perform another bisection search to locate that keyframe and decode forwards to the temporal offset. This can be very slow. Ogg Skeleton 4.0 provides an index of keyframes, and indexes periodic samples on streams without the concept of a keyframe, so that seeking over high-latency connections can simply be performed optimally with &amp;quot;one hop&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
=== Previous version ===&lt;br /&gt;
&lt;br /&gt;
The previous version of Ogg Skeleton was version 3, and its specification can be found on the wiki page [[Ogg Skeleton]], or at [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt].&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (Vorbis generally has 2, Speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in Theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time, making it possible to record e.g. the recording or broadcast time of some content&lt;br /&gt;
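&lt;br /&gt;
The granule rate mapping listed above can be sketched as follows. This assumes the granule value has already been extracted from the granulepos as described in [[GranulePosAndSeeking]]; the function name is illustrative, and basetime handling is deliberately left out.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def granule_to_seconds(granule, rate_numerator, rate_denominator):
    """Map a granule value to seconds as granules / granulerate."""
    if rate_numerator == 0:
        raise ValueError("granule rate numerator must not be 0")
    granule_rate = Fraction(rate_numerator, rate_denominator)
    return Fraction(granule) / granule_rate


# Hypothetical tracks: 44100 Hz audio and 30000/1001 fps video.
print(granule_to_seconds(88200, 44100, 1))
print(granule_to_seconds(30, 30000, 1001))
```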
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. This works fine for seeking in local files, but for seeking in files served over the Internet via HTTP, each bisection or non-sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets.&lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Skeleton provides seek algorithms with an index, or ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream, so if you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set which contains each active stream&#039;s last keypoint which has time less than or equal to the seek target time. This tells you a known point on every stream which lies before the seek target. Then from that set of key points, select the key point with the smallest byte offset. You then verify that there&#039;s a page from the keypoint&#039;s stream found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
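&lt;br /&gt;
The selection step of this procedure can be sketched as below. This is an illustration rather than a reference implementation: it assumes every index has already been decoded into absolute (page offset, time in seconds) pairs, and the function name and data layout are hypothetical. Verifying the page found at the returned offset is left to the caller.&lt;br /&gt;
&lt;br /&gt;
```python
from bisect import bisect_right


def choose_seek_keypoint(indexes, target):
    """Pick the key point to start decoding from for a seek to `target`.

    `indexes` maps a stream serial number to a list of
    (page_offset, time_in_seconds) pairs sorted by offset.
    Returns (serial, page_offset, time), or None when some stream has no
    key point at or before the target, in which case the caller should
    fall back to a bisection search.
    """
    candidates = []
    for serial, keypoints in indexes.items():
        times = [t for _, t in keypoints]
        # Position of the last key point whose time is at or before target.
        i = bisect_right(times, target) - 1
        if i == -1:
            return None
        offset, time = keypoints[i]
        candidates.append((offset, serial, time))
    if not candidates:
        return None
    # The smallest byte offset lies before the chosen point on every stream.
    offset, serial, time = min(candidates)
    return serial, offset, time


# Hypothetical indexes for a video stream (1) and an audio stream (2).
example = {
    1: [(100, 0.0), (5000, 2.0), (9000, 4.0)],
    2: [(200, 0.0), (4000, 1.5), (8000, 3.5)],
}
print(choose_seek_keypoint(example, 3.0))
```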
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 index packets also store metadata about the segment in which they reside. Each index packet stores the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the end time of the last active stream minus the start time of the first active stream.&lt;br /&gt;
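&lt;br /&gt;
A small sketch of the duration rule, assuming the first and last sample times of each track have already been read from its index packet as a (first numerator, last numerator, denominator) triple. The function name and the tuple layout are illustrative.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def indexed_duration(tracks):
    """Duration in seconds from (first_num, last_num, denominator) triples.

    Raises ValueError when `tracks` is empty.
    """
    starts = [Fraction(first, den) for first, _, den in tracks]
    ends = [Fraction(last, den) for _, last, den in tracks]
    # Latest end time over all tracks minus the earliest start time.
    return max(ends) - min(starts)


# Hypothetical tracks: one timed in milliseconds, one in units of 1/30 second.
print(indexed_duration([(500, 10000, 1000), (0, 299, 30)]))
```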
&lt;br /&gt;
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify whether the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset, and start decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
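&lt;br /&gt;
The last two checks can be sketched as below. The page header layout used here (the capture pattern OggS at bytes 0 to 3 and the little-endian bitstream serial number at bytes 14 to 17) comes from the Ogg page format in RFC 3533, not from this page; the function name is hypothetical. A stricter check would also validate the page CRC, since the capture pattern can occur by chance inside packet data.&lt;br /&gt;
&lt;br /&gt;
```python
OGG_CAPTURE_PATTERN = b"OggS"
SERIAL_END = 18  # the serial number occupies bytes 14..17 of the page header


def keypoint_lands_on_stream_page(segment, offset, serial):
    """Check that segment[offset:] starts with a page of stream `serial`.

    `segment` holds the bytes of one Ogg segment, so `offset` is relative
    to the start of that segment, as keypoint offsets are.
    """
    header = segment[offset:offset + SERIAL_END]
    if len(header) != SERIAL_END or header[:4] != OGG_CAPTURE_PATTERN:
        return False  # not on a page boundary
    return int.from_bytes(header[14:18], "little") == serial


# Hypothetical segment: 10 filler bytes, then a page header for serial 1234.
page = b"OggS" + bytes(10) + (1234).to_bytes(4, "little") + bytes(9)
print(keypoint_lands_on_stream_page(bytes(10) + page, 10, 1234))
```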
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may only index periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.0 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length in bytes                                       | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content byte offset                                           | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.0 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
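&lt;br /&gt;
The layout above can be read with a few slices, as in the following sketch. Two things are assumptions rather than statements from this page: that the fixed-width integers are little-endian, and that the time numerators and denominators are signed. The function and key names are illustrative, and the caller is still expected to check the returned version.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction

FISHEAD_MAGIC = b"fishead\x00"
FISHEAD_LENGTH = 80


def _int_at(packet, start, length, signed=False):
    # Assumption: integer fields are stored little-endian.
    return int.from_bytes(packet[start:start + length], "little", signed=signed)


def _rational(packet, start):
    numerator = _int_at(packet, start, 8, signed=True)
    denominator = _int_at(packet, start + 8, 8, signed=True)
    # A zero denominator is reported as None instead of being divided by.
    return Fraction(numerator, denominator) if denominator else None


def parse_fishead(packet):
    """Decode the fixed fields of a Skeleton 4.0 fishead packet into a dict."""
    if FISHEAD_LENGTH > len(packet) or packet[:8] != FISHEAD_MAGIC:
        raise ValueError("not a Skeleton 4.0 fishead packet")
    return {
        "version": (_int_at(packet, 8, 2), _int_at(packet, 10, 2)),
        "presentation_time": _rational(packet, 12),
        "basetime": _rational(packet, 28),
        "utc": bytes(packet[44:64]),
        "segment_length": _int_at(packet, 64, 8),
        "content_offset": _int_at(packet, 72, 8),
    }
```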
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | Padding/future use                            | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 52-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given, e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;. Message header fields are terminated/delimited by &amp;quot;\r\n&amp;quot;. Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility - to allow further fields to be included into the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
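&lt;br /&gt;
Parsing the message header fields can be sketched as below, given the bytes that follow the fixed part of a fisbone packet. Decoding as UTF-8 and lower-casing the field names are choices made for this illustration, and the function name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_message_headers(raw):
    """Parse fisbone message header fields from bytes into a dict.

    Field names are lower-cased so that differently capitalised spellings
    of the same field map to one key.
    """
    fields = {}
    for line in raw.decode("utf-8").split("\r\n"):
        if not line:
            continue  # empty piece after the final terminator
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed message header field: " + repr(line))
        fields[name.strip().lower()] = value.strip()
    return fields


# Hypothetical header fields for a Vorbis audio track.
sample = b"Content-Type: audio/vorbis\r\nRole: audio/main\r\nName: main_audio\r\n"
print(parse_message_headers(sample))
```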
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.0:&lt;br /&gt;
* Content-type: mime type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each keyframe in the index is stored in a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte-encoded. The offset and timestamp deltas store the difference between the keypoint&#039;s offset and timestamp and the previous keypoint&#039;s offset and timestamp. So to calculate the page offset of a keypoint you must sum the offset deltas of all keypoints in the index up to and including that keypoint.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order, so the least significant 7 bits are stored first. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes: 0x23 0xBD, or 0010 0011 1011 1101 in binary.&lt;br /&gt;
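&lt;br /&gt;
A minimal sketch of this encoding, following the stated rule literally: 7 payload bits per byte, least significant group first, and the high bit set only on the final byte. The function names are hypothetical and this is not taken from an existing implementation.&lt;br /&gt;
&lt;br /&gt;
```python
def encode_varint(value):
    """Encode a non-negative integer, low-order 7 bits first."""
    if 0 > value:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while value >= 128:
        out.append(value % 128)   # 7 payload bits, high bit clear
        value //= 128
    out.append(value | 0x80)      # high bit marks the final byte
    return bytes(out)


def decode_varint(data, pos=0):
    """Decode one integer starting at data[pos]; return (value, next_pos)."""
    value = 0
    scale = 1
    while True:
        byte = data[pos]          # raises IndexError on truncated input
        pos += 1
        value += (byte % 128) * scale
        if byte >= 0x80:
            return value, pos
        scale *= 128


encoded = encode_varint(7843)
print(encoded.hex(), decode_varint(encoded))
```
&lt;br /&gt;
Returning the next read position lets a caller decode the offset delta and the timestamp delta of each keypoint back to back.&lt;br /&gt;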
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;index\0&#039;                                          | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Serial number                  | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Number of keypoints            | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Timestamp denominator         | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | First sample time numerator   | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Last sample end time numerator| 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Keypoints...                   | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The fields of the index packet are as follows:&lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;. Bytes [0...5].&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9]&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0. Bytes [10...17].&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track. Bytes [34...41]&lt;br /&gt;
# &#039;n&#039; key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including this one.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
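&lt;br /&gt;
The fixed-size fields listed above can be read as in the following sketch, which stops where the keypoints begin. It assumes the fixed-width integers are little-endian, and it rejects a zero denominator as required by the field description; the function and key names are illustrative.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction

INDEX_MAGIC = b"index\x00"
KEYPOINTS_START = 42  # the first keypoint begins at byte 42


def parse_index_header(packet):
    """Decode the fixed-size fields that precede the keypoints."""
    if KEYPOINTS_START > len(packet) or packet[:6] != INDEX_MAGIC:
        raise ValueError("not a Skeleton 4.0 index packet")

    def field(start, length, signed):
        # Assumption: fixed-width integers are stored little-endian.
        return int.from_bytes(packet[start:start + length], "little", signed=signed)

    denominator = field(18, 8, True)
    if denominator == 0:
        raise ValueError("timestamp denominator must not be 0")
    return {
        "serialno": field(6, 4, False),
        "keypoint_count": field(10, 8, False),
        "denominator": denominator,
        "first_sample_time": Fraction(field(26, 8, True), denominator),
        "last_sample_end_time": Fraction(field(34, 8, True), denominator),
        "keypoints_start": KEYPOINTS_START,
    }
```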
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
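&lt;br /&gt;
The field layout above can be illustrated with a short, non-normative Python sketch. It is not taken from any existing implementation and rests on several assumptions: fixed-size fields are little endian, the serialno is read as an unsigned integer, and variable byte encoded integers use 7 payload bits per byte, low-order group first, with the high bit set on the last byte. The names read_varint and parse_index_packet are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def read_varint(buf, pos):
    """Decode one variable byte integer at buf[pos]; return (value, next_pos)."""
    value = 0
    scale = 1
    while True:
        byte = buf[pos]
        pos += 1
        value += (byte % 128) * scale  # low 7 bits carry the payload
        scale *= 128
        if byte // 128 == 1:           # high bit set marks the last byte
            return value, pos


def parse_index_packet(packet):
    """Parse the fields of an index packet into a dict (assumed layout)."""
    if packet[0:6] != b"index\x00":
        raise ValueError("not an index packet")
    serialno = int.from_bytes(packet[6:10], "little")
    count = int.from_bytes(packet[10:18], "little")
    denom = int.from_bytes(packet[18:26], "little", signed=True)
    if denom == 0:
        raise ValueError("presentation time denominator must not be 0")
    first_num = int.from_bytes(packet[26:34], "little", signed=True)
    last_num = int.from_bytes(packet[34:42], "little", signed=True)

    keypoints = []
    pos, offset, time_num = 42, 0, 0
    for _ in range(count):
        delta, pos = read_varint(packet, pos)
        offset += delta                # running sum of byte offset deltas
        delta, pos = read_varint(packet, pos)
        time_num += delta              # running sum of time numerator deltas
        keypoints.append((offset, Fraction(time_num, denom)))

    return {
        "serialno": serialno,
        "first_sample_time": Fraction(first_num, denom),
        "last_sample_time": Fraction(last_num, denom),
        "keypoints": keypoints,        # (page offset, time in seconds)
    }
```
&lt;br /&gt;
A truncated packet surfaces as an IndexError in this sketch; a real demuxer would probably want to bound-check every read instead.&lt;br /&gt;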
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value was unable to be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per 64KB of data, or per 1000ms, whichever is less frequent.&lt;br /&gt;
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets (the fisbone and index packets)&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear.&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton 4 is being supported by the following projects:&lt;br /&gt;
* ffmpeg2theora (version 0.27 and above) &lt;br /&gt;
* OggIndex: [http://firefogg.org/nightly/ binaries], [http://git.xiph.org/?p=OggIndex.git;a=summary source]&lt;br /&gt;
* Mozilla Firefox 4&lt;br /&gt;
&lt;br /&gt;
The following projects currently support Ogg Skeleton 3.1; support for Ogg Skeleton 4 is planned:&lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=SkeletonHeaders&amp;diff=12678</id>
		<title>SkeletonHeaders</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=SkeletonHeaders&amp;diff=12678"/>
		<updated>2010-11-23T02:09:28Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Ogg Skeleton 4 Message Headers =&lt;br /&gt;
&lt;br /&gt;
== Adding New Message Headers to Skeleton ==&lt;br /&gt;
&lt;br /&gt;
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn&#039;t had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.&lt;br /&gt;
&lt;br /&gt;
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).&lt;br /&gt;
&lt;br /&gt;
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.&lt;br /&gt;
&lt;br /&gt;
On this wiki page, we are collecting such new information fields.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Content-type ===&lt;br /&gt;
&lt;br /&gt;
Right now, there is one mandatory message header field for all of the logical bitstreams: the &amp;quot;Content-type&amp;quot; header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Language ===&lt;br /&gt;
&lt;br /&gt;
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.&lt;br /&gt;
&lt;br /&gt;
For audio tracks with speech, the Language would be the language that dominates.&lt;br /&gt;
&lt;br /&gt;
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.&lt;br /&gt;
&lt;br /&gt;
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.&lt;br /&gt;
&lt;br /&gt;
Examples are: en-US, de-DE, sgn-ase, en-cockney&lt;br /&gt;
&lt;br /&gt;
The Language field will have the dominating language specified as the first language. It is possible to specify further, non-dominating languages as a list after the main language.&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
 Language: en-US, fr&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Role ===&lt;br /&gt;
&lt;br /&gt;
Role describes what semantic type of content is contained in a track. Every track can only have a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.&lt;br /&gt;
&lt;br /&gt;
The following lists some commonly used roles. Other roles are possible, too, but should only be used/introduced if there is really a need for them.&lt;br /&gt;
&lt;br /&gt;
Text tracks:&lt;br /&gt;
* &amp;quot;text/caption&amp;quot; - transcription of all sounds, including speech, for purposes of the hard-of-hearing&lt;br /&gt;
* &amp;quot;text/subtitle&amp;quot; - translation of all speech, typically into a different language&lt;br /&gt;
* &amp;quot;text/textaudiodesc&amp;quot; - description/transcription of everything that happens in a video as text to be used for the vision-impaired through screen readers or braille&lt;br /&gt;
* &amp;quot;text/karaoke&amp;quot; - music lyrics delivered in chunks for singing along&lt;br /&gt;
* &amp;quot;text/chapters&amp;quot; - titles for sections of the media that provide a kind of chapter segmentation (similar to DVD chapters)&lt;br /&gt;
* &amp;quot;text/tickertext&amp;quot; - text to run as informative text at the bottom of the media display&lt;br /&gt;
* &amp;quot;text/lyrics&amp;quot; - transcript of the text used in music media&lt;br /&gt;
* &amp;quot;text/metadata&amp;quot; - name-value pairs that are associated with certain sections of the media&lt;br /&gt;
* &amp;quot;text/annotation&amp;quot; - free text associated with certain sections of the media&lt;br /&gt;
* &amp;quot;text/linguistic&amp;quot; - linguistic markup of the spoken words&lt;br /&gt;
&lt;br /&gt;
Video tracks:&lt;br /&gt;
* &amp;quot;video/main&amp;quot; - the main video track&lt;br /&gt;
* &amp;quot;video/alternate&amp;quot; - an alternative video track, e.g. different camera angle&lt;br /&gt;
* &amp;quot;video/sign&amp;quot; - a sign language video track&lt;br /&gt;
&lt;br /&gt;
Audio tracks:&lt;br /&gt;
* &amp;quot;audio/main&amp;quot; - the main audio track&lt;br /&gt;
* &amp;quot;audio/alternate&amp;quot; - an alternative audio track, probably linked to an alternate video track&lt;br /&gt;
* &amp;quot;audio/dub&amp;quot; - the audio track but with speech in a different language to the original&lt;br /&gt;
* &amp;quot;audio/audiodesc&amp;quot; - an audio description recording for the vision-impaired &lt;br /&gt;
* &amp;quot;audio/music&amp;quot; - a music track, e.g. when music, speech and sound effects are delivered in different tracks&lt;br /&gt;
* &amp;quot;audio/speech&amp;quot; - a speech track, e.g. when music, speech and sound effects are delivered in different tracks&lt;br /&gt;
* &amp;quot;audio/sfx&amp;quot; - a sound effects track, e.g. when music, speech and sound effects are delivered in different tracks&lt;br /&gt;
&lt;br /&gt;
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don&#039;t always provide the right main content type (e.g. application/kate is semantically a text format).&lt;br /&gt;
&lt;br /&gt;
There may also be parameters to describe the roles better, such as &amp;quot;video/alternate;angle=nw&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Name ===&lt;br /&gt;
&lt;br /&gt;
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.&lt;br /&gt;
&lt;br /&gt;
Characters allowed are basically all the characters that are also allowed for XML id fields:&lt;br /&gt;
&lt;br /&gt;
 the first character has to be one of:&lt;br /&gt;
 [A-Z] | &amp;quot;_&amp;quot; | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |&lt;br /&gt;
 [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]&lt;br /&gt;
&lt;br /&gt;
 any following characters can be one of:&lt;br /&gt;
 [A-Z] | &amp;quot;_&amp;quot; | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | &lt;br /&gt;
 [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | &lt;br /&gt;
 &amp;quot;-&amp;quot; | &amp;quot;.&amp;quot; | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]&lt;br /&gt;
&lt;br /&gt;
The name needs to be unique between all the track names, otherwise it is undefined which of the tracks is retrieved when addressing by name.&lt;br /&gt;
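&lt;br /&gt;
As an illustration only, the two character lists above can be turned into a regular expression. The following Python sketch copies the ranges exactly as listed on this page; the function name is_valid_track_name is hypothetical and nothing here is part of the proposal itself.&lt;br /&gt;
&lt;br /&gt;
```python
import re

# Characters allowed in the first position, copied from the list above.
NAME_START = (
    "A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    "\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    "\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# Following characters additionally allow "-", ".", digits and a few marks.
NAME_REST = NAME_START + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

NAME_RE = re.compile("[" + NAME_START + "][" + NAME_REST + "]*")


def is_valid_track_name(name):
    """Return True if name uses only the characters listed above."""
    return NAME_RE.fullmatch(name) is not None
```
&lt;br /&gt;
With this sketch, &amp;quot;Madonna_singing&amp;quot; is accepted, while a name starting with a digit or containing a space is rejected. Uniqueness across tracks still has to be checked separately.&lt;br /&gt;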
&lt;br /&gt;
An example means of addressing the track by name is: track[name=&amp;quot;Madonna_singing&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Title ===&lt;br /&gt;
&lt;br /&gt;
A free text field to provide a description of the track content.&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
 Title: &amp;quot;the French audio track for the movie&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Display-hint ===&lt;br /&gt;
&lt;br /&gt;
Media players that do not get informed about how a content author intends a media file to be displayed have no chance to display the content &amp;quot;correctly&amp;quot;. This is why the Display-hint message header field allows providing hints on how a certain track should be displayed. A media player can of course decide to ignore these hints.&lt;br /&gt;
&lt;br /&gt;
Currently proposed hints are:&lt;br /&gt;
&lt;br /&gt;
* pip(x,y,w,h) on a video track - picture-in-picture display in relation to the zero coordinates of the display area of the video with x,y providing the origin of the top left corner of the PIP video and w,h the width and height in pixels which are optional. x, y, w, and h can be specified in percentage, thus allowing persistent placement independent of the scaling of the video display.&lt;br /&gt;
&lt;br /&gt;
Examples:&lt;br /&gt;
 Display-hint: pip(20%,20%)&lt;br /&gt;
 Display-hint: pip(40,40,690,60)&lt;br /&gt;
&lt;br /&gt;
* mask(img,x,y,w,h) on a video track - use the image given at img url (?) as a video mask to allow the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width and height w and h. x,y,w, and h can be provided in pixels or in percent. Pixels under the white background are made transparent and only pixels under the black shape are retained.&lt;br /&gt;
&lt;br /&gt;
Examples:&lt;br /&gt;
 Display-hint: mask(http://www.example.com/image.png)&lt;br /&gt;
 Display-hint: mask(http://www.example.com/image.png,30%,25%)&lt;br /&gt;
 Display-hint: mask(http://www.example.com/image.png,20,20,400,320)&lt;br /&gt;
&lt;br /&gt;
* transparent(transparency) on a video track - put a transparency of x% (int value between 0 and 100) on the complete video track as it will be rendered on top of other content. This transparency is applied to all pixels in the same way.&lt;br /&gt;
&lt;br /&gt;
Examples:&lt;br /&gt;
 Display-hint: transparent(25%)&lt;br /&gt;
 Display-hint: transparent(7%)&lt;br /&gt;
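&lt;br /&gt;
Since the hints above all share the shape name(arg,arg,...), a player could split them with a few lines of code. The following Python sketch is only one possible reading of the proposed syntax: the function name parse_display_hint is hypothetical, arguments are returned as raw strings, and an image URL that itself contains a comma would be split incorrectly.&lt;br /&gt;
&lt;br /&gt;
```python
import re

# A hint name, then a parenthesised, comma-separated argument list.
HINT_RE = re.compile(r"\s*(\w+)\((.*)\)\s*")


def parse_display_hint(value):
    """Split a Display-hint value into (name, [args]); None if malformed."""
    match = HINT_RE.fullmatch(value)
    if match is None:
        return None
    raw_args = match.group(2).strip()
    args = [arg.strip() for arg in raw_args.split(",")] if raw_args else []
    return match.group(1), args
```
&lt;br /&gt;
For example, the value pip(20%,20%) yields the name pip and the arguments 20% and 20%; interpreting percentages versus pixels is left to the caller.&lt;br /&gt;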
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Track order ===&lt;br /&gt;
&lt;br /&gt;
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.&lt;br /&gt;
&lt;br /&gt;
In Ogg, the means to number through the tracks is by the order in which the bos pages of the tracks appear in the Ogg stream. If a file is re-encoded, the order may change, so you can only rely on this for addressing if the file doesn&#039;t change.&lt;br /&gt;
&lt;br /&gt;
For example, a video file with the following composition would have the following indexes:&lt;br /&gt;
* track[0]: Skeleton BOS&lt;br /&gt;
* track[1]: Theora BOS for main video&lt;br /&gt;
* track[2]: Vorbis BOS for main audio&lt;br /&gt;
* track[3]: Kate BOS for English captions&lt;br /&gt;
* track[4]: Kate BOS for German subtitles&lt;br /&gt;
* track[5]: Vorbis BOS for audio descriptions&lt;br /&gt;
* track[6]: Theora BOS for sign language&lt;br /&gt;
&lt;br /&gt;
This track order is simply to have a means to address tracks through an index in a consistent manner across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on what should be displayed on top of which other track.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Altitude ===&lt;br /&gt;
&lt;br /&gt;
The Altitude (better name?) message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a &amp;quot;main&amp;quot; track is always displayed bottom-most unless otherwise defined. &lt;br /&gt;
&lt;br /&gt;
The Altitude field takes the same numerical values as the z-index in CSS, unlimited negative and positive numbers.&lt;br /&gt;
An element with greater stack order is always in front of an element with a lower stack order.&lt;br /&gt;
&lt;br /&gt;
Example: Altitude: -150&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Track dependencies ===&lt;br /&gt;
&lt;br /&gt;
It is tempting to introduce dependencies between tracks - to specify things such as:&lt;br /&gt;
&lt;br /&gt;
* track b depends on track a being available (e.g. main audio depending on main video), so always display them together and if you remove a track, remove all depending tracks, too&lt;br /&gt;
&lt;br /&gt;
* track c and d are alternative tracks to track b (e.g. dubs in other languages for main audio), so don&#039;t display them together and if you activate one, disable the others&lt;br /&gt;
&lt;br /&gt;
* track a, one of b,c,d, and one of e,f,g, where e depends on b, f depends on c, and g depends on d, make up a presentation profile and should be displayed together (e.g. main video, one of the audio dubs, and their respective captions).&lt;br /&gt;
&lt;br /&gt;
It is not clear yet whether there is an actual need to maintain this information as author-provided hints or whether a media player can itself determine a lot from the other fields, such as role and language.&lt;br /&gt;
&lt;br /&gt;
MPEG has a &amp;quot;groupID&amp;quot; element which allows for tracks to be put into groups of alternative tracks. This feature is, however, not used very often and decisions are being left to the media player.&lt;br /&gt;
&lt;br /&gt;
At this stage, it&#039;s probably too early to make a specification for how to encode this in Ogg. The need has not been totally clarified yet.&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=12677</id>
		<title>Ogg Index</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=12677"/>
		<updated>2010-11-22T23:22:15Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
= Ogg Skeleton 3.3 with Keyframe Index =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;This specification is obsolete and has been superseded by [[Ogg Skeleton 4]]. Use that instead!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
 &lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search &lt;br /&gt;
over the pages in the file. The Ogg physical bitstream is bisected and &lt;br /&gt;
the next Ogg page&#039;s end-time is extracted. The bisection continues until &lt;br /&gt;
it reaches an Ogg page with an end-time close enough to the seek target &lt;br /&gt;
time. However in media containing streams which have keyframes and &lt;br /&gt;
interframes, such as Theora streams, your bisection search won&#039;t &lt;br /&gt;
necessarily terminate at a keyframe. Thus if you begin decoding after your&lt;br /&gt;
first bisection terminates, you&#039;re likely to only get partial incomplete&lt;br /&gt;
frames, with &amp;quot;visual artifacts&amp;quot;, until you decode up to the next keyframe.&lt;br /&gt;
So to eliminate these visual artifacts, after the first bisection&lt;br /&gt;
terminates, you must extract the keyframe&#039;s timestamp from the last Theora&lt;br /&gt;
page&#039;s granulepos, and seek again back to the start of the keyframe and&lt;br /&gt;
decode forward until you reach the frame at the seek target. &lt;br /&gt;
&lt;br /&gt;
This is further complicated by the fact that packets often span multiple &lt;br /&gt;
Ogg pages, and that Ogg pages from different streams can be interleaved &lt;br /&gt;
between spanning packets. &lt;br /&gt;
&lt;br /&gt;
The bisection method above works fine for seeking in local files, but &lt;br /&gt;
for seeking in files served over the Internet via HTTP, each bisection &lt;br /&gt;
or non sequential read can trigger a new HTTP request, which can have &lt;br /&gt;
very high latency, making seeking very slow. &lt;br /&gt;
&lt;br /&gt;
== Seeking with an index ==&lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 bitstream attempts to alleviate this problem, by &lt;br /&gt;
providing an index of periodic keyframes for every content stream in an &lt;br /&gt;
Ogg segment. Note that the Skeleton 3.3 track only holds data for the &lt;br /&gt;
segment or &amp;quot;link&amp;quot; in which it resides. So if two Ogg files are concatenated&lt;br /&gt;
together (&amp;quot;chained&amp;quot;), the Skeleton 3.3&#039;s keyframe indexes in the first Ogg&lt;br /&gt;
segment (the first &amp;quot;link&amp;quot; in the &amp;quot;chain&amp;quot;) do not contain information&lt;br /&gt;
about the keyframes in the second Ogg segment (the second link in the chain).&lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own &lt;br /&gt;
packet in the Skeleton 3.3 track. The index for streams without the &lt;br /&gt;
concept of a keyframe, such as Vorbis streams, can instead record the &lt;br /&gt;
time position at periodic intervals, which achieves the same result. &lt;br /&gt;
When this document refers to keyframes, it also implicitly refers to these&lt;br /&gt;
independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
All the Skeleton 3.3 track&#039;s pages appear in the header pages of the Ogg &lt;br /&gt;
segment. This means that all the keyframe indexes are immediately &lt;br /&gt;
available once the header packets have been read when playing the media&lt;br /&gt;
over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Ogg index bitstream &lt;br /&gt;
provides seek algorithms with an ordered table of &amp;quot;key points&amp;quot;. A key &lt;br /&gt;
point is intrinsically associated with exactly one stream, and stores the&lt;br /&gt;
offset of the page on which it starts, o, as well as the presentation time&lt;br /&gt;
of the keyframe t, as a fraction of seconds. This specifies that in order&lt;br /&gt;
to render the stream at presentation time t, the last page which lies before&lt;br /&gt;
all information required to render the keyframe at presentation time t begins&lt;br /&gt;
exactly at byte offset o, as offset from the beginning of the Ogg segment.&lt;br /&gt;
The offset is exactly the first byte of the page, so if you seek to a&lt;br /&gt;
keypoint&#039;s offset and don&#039;t find the beginning of a page there, you can&lt;br /&gt;
assume that the Ogg segment has been modified since the index was constructed,&lt;br /&gt;
and that the index is now invalid and should not be used. The time t is the&lt;br /&gt;
keyframe&#039;s presentation time corresponding to the granulepos, and is&lt;br /&gt;
represented as a fraction in seconds. Note that if a stream requires any&lt;br /&gt;
preroll, this will be accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 track contains one index for each content stream in the &lt;br /&gt;
file. To seek in an Ogg file which contains keyframe indexes, first&lt;br /&gt;
construct the set which contains every active stream&#039;s last keypoint which&lt;br /&gt;
has time less than or equal to the seek target time. Then from that set&lt;br /&gt;
of key points, select the key point with the smallest byte offset. You then&lt;br /&gt;
verify that there&#039;s a page found at exactly that offset, and if so, you can&lt;br /&gt;
begin decoding. If the first keyframe you encounter has a time equal to&lt;br /&gt;
that stored in the keypoint, you have made the optimal seek, and can safely&lt;br /&gt;
continue to decode up to the seek target time. You are guaranteed to pass&lt;br /&gt;
keyframes on all streams with time less than or equal to your seek target&lt;br /&gt;
time while decoding up to the seek target. However if the first keyframe&lt;br /&gt;
you encounter after decoding does not have the same presentation time as&lt;br /&gt;
is stored in the keypoint, then the index is invalid (possibly the file&lt;br /&gt;
has been changed without updating the index) and you must either fall back&lt;br /&gt;
to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot;&lt;br /&gt;
to the seek target.&lt;br /&gt;
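&lt;br /&gt;
The keypoint selection described above can be sketched as follows. This is a&lt;br /&gt;
minimal Python illustration, not a reference implementation: the function name&lt;br /&gt;
choose_seek_offset and the data layout (a dict from serialno to a time-sorted&lt;br /&gt;
list of (time, offset) pairs) are assumptions, as is returning None when some&lt;br /&gt;
stream has no keypoint at or before the target, so the caller can fall back to&lt;br /&gt;
a bisection search.&lt;br /&gt;
&lt;br /&gt;
```python
from bisect import bisect_right


def choose_seek_offset(indexes, target):
    """Pick the byte offset to seek to for presentation time `target`.

    indexes maps serialno to a list of (time, offset) sorted by time.
    """
    candidates = []
    for keypoints in indexes.values():
        times = [time for time, _ in keypoints]
        # Count of keypoints whose time is at or before the target.
        count = bisect_right(times, target)
        if count == 0:
            return None  # assumption: let the caller bisect instead
        candidates.append(keypoints[count - 1][1])
    # The smallest offset guarantees a keyframe is passed on every stream.
    return min(candidates) if candidates else None
```
&lt;br /&gt;
After seeking to the returned offset, the page check and the keyframe time&lt;br /&gt;
check described above still have to be performed before trusting the index.&lt;br /&gt;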
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain &lt;br /&gt;
keyframe indexes, so when implementing Ogg seeking, you must gracefully&lt;br /&gt;
fall back to a bisection search or other seek algorithm when the index&lt;br /&gt;
is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 BOS packet also stores meta data about the segment in &lt;br /&gt;
which it resides. It stores the timestamps of the first and last samples&lt;br /&gt;
in the segment. This also allows you to determine the duration of the&lt;br /&gt;
indexed Ogg media without having to decode the start and end of the&lt;br /&gt;
Ogg segment to calculate the difference (which is the duration).&lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 BOS packet also contains the length of the indexed segment&lt;br /&gt;
in bytes. This is so that if the seek target is outside of the indexed range,&lt;br /&gt;
you can immediately move to the next/previous segment and either seek using&lt;br /&gt;
that segment&#039;s index, or narrow the bisection window if that segment has no&lt;br /&gt;
index. You can also use the segment length to verify if the index is valid.&lt;br /&gt;
If the contents of the segment have changed, it&#039;s highly likely that the&lt;br /&gt;
length of the segment has changed as well. When you load the segment&#039;s&lt;br /&gt;
header pages, you should check the length of the physical segment, and if it&lt;br /&gt;
doesn&#039;t match that stored in the Skeleton header packet, you know the index&lt;br /&gt;
is out of date and not safe to use.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 BOS packet also contains the offset of the first non header&lt;br /&gt;
page in the Ogg segment. This means that if you wish to delay loading of an&lt;br /&gt;
index for whatever reason, you can skip forward to that offset, and start&lt;br /&gt;
decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still &lt;br /&gt;
correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
# The segment length stored in the Skeleton BOS packet doesn&#039;t match the length of the physical segment, or&lt;br /&gt;
# after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
# the first keyframe decoded after seeking to a keypoint&#039;s offset doesn&#039;t have the same presentation time as stored in the index.&lt;br /&gt;
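&lt;br /&gt;
The three conditions above can be combined into a single check. The Python&lt;br /&gt;
sketch below is illustrative only: the function name and its parameters are&lt;br /&gt;
hypothetical, a stored segment length of 0 is treated as unknown and therefore&lt;br /&gt;
not compared, and landing on a page boundary is approximated by looking for the&lt;br /&gt;
Ogg page capture pattern &amp;quot;OggS&amp;quot; at the keypoint&#039;s offset.&lt;br /&gt;
&lt;br /&gt;
```python
def index_looks_valid(stored_length, actual_length, bytes_at_offset,
                      keypoint_time, first_keyframe_time):
    """Return False if any of the three invalidation conditions holds."""
    # 1. Segment length mismatch (a stored length of 0 means unknown).
    if stored_length != 0 and stored_length != actual_length:
        return False
    # 2. The keypoint offset must land exactly on the start of a page.
    if bytes_at_offset[:4] != b"OggS":
        return False
    # 3. The first decoded keyframe must match the indexed time.
    return keypoint_time == first_keyframe_time
```
&lt;br /&gt;
A stricter reader could additionally verify the page checksum at that offset.&lt;br /&gt;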
&lt;br /&gt;
You should also always check the Skeleton version header field&lt;br /&gt;
to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment,&lt;br /&gt;
it may only index periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
== Format Specification ==&lt;br /&gt;
 &lt;br /&gt;
Unless otherwise specified, all integers and fields in the bitstream are &lt;br /&gt;
encoded with the least significant bit coming first in each byte. &lt;br /&gt;
Integers and fields comprising more than one byte are encoded least &lt;br /&gt;
significant byte first (i.e. little endian byte order). &lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 track is intended to be backwards compatible with the &lt;br /&gt;
Skeleton 3.0 specification, available at &lt;br /&gt;
http://www.xiph.org/ogg/doc/skeleton.html . Unless specified &lt;br /&gt;
differently here, it is safe to assume that anything specified for a &lt;br /&gt;
Skeleton 3.0 track holds for a Skeleton 3.3 track. &lt;br /&gt;
&lt;br /&gt;
As per the Skeleton 3.0 track, a segment containing a Skeleton 3.3 track &lt;br /&gt;
must begin with a &#039;&#039;&#039;Skeleton 3.3 fishead BOS packet&#039;&#039;&#039; on a page by itself, with the &lt;br /&gt;
following format: &lt;br /&gt;
&lt;br /&gt;
# Identifier: 8 bytes, &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
# Version major: 2 Byte unsigned integer denoting the major version (3)&lt;br /&gt;
# Version minor: 2 Byte unsigned integer denoting the minor version (1)&lt;br /&gt;
# Presentationtime numerator: 8 Byte signed integer&lt;br /&gt;
# Presentationtime denominator: 8 Byte signed integer&lt;br /&gt;
# Basetime numerator: 8 Byte signed integer&lt;br /&gt;
# Basetime denominator: 8 Byte signed integer&lt;br /&gt;
# UTC [ISO8601]: a 20 Byte string containing a UTC time&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the media. Note that samples between the first-sample-time and the Presentationtime are supposed to be skipped during playback.&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; First-sample-time denominator: 8 byte signed integer, with value 0 if the timestamp is unknown. Decoders should always ensure that the denominator is not 0 before using it as a divisor!&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the segment.&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; Last-sample-time denominator: 8 byte signed integer, with value 0 if the timestamp is unknown. Decoders should always ensure that the denominator is not 0 before using it as a divisor!&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; The length of the segment, in bytes: 8 byte unsigned integer, 0 if unknown.&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; The offset of the first non-header page, in bytes: 8 byte unsigned integer, 0 if unknown.&lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units&lt;br /&gt;
of seconds. If the denominator is 0 for the first-sample-time or the&lt;br /&gt;
last-sample-time, then that value was unable to be determined at indexing&lt;br /&gt;
time, and is unknown. The duration of the Ogg segment can be calculated by&lt;br /&gt;
subtracting the first-sample-time from the last-sample-time.&lt;br /&gt;
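&lt;br /&gt;
As a rough illustration, the packet above could be read as follows. This Python&lt;br /&gt;
sketch assumes that the fields are packed in the listed order without padding,&lt;br /&gt;
which puts the new fields at bytes 64 to 111 of a 112 byte packet; these byte&lt;br /&gt;
positions are derived here, not stated by the specification, and the names&lt;br /&gt;
read_int, read_rational and parse_fishead_33 are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def read_int(buf, start, size, signed=False):
    """Read a little-endian integer of `size` bytes at buf[start]."""
    return int.from_bytes(buf[start:start + size], "little", signed=signed)


def read_rational(buf, start):
    """Read numerator then denominator (8 bytes each); None if unknown."""
    num = read_int(buf, start, 8, signed=True)
    den = read_int(buf, start + 8, 8, signed=True)
    return Fraction(num, den) if den != 0 else None


def parse_fishead_33(packet):
    """Extract the version and the new Skeleton 3.3 fields from a fishead."""
    if packet[:8] != b"fishead\x00" or 112 > len(packet):
        raise ValueError("not a Skeleton 3.3 fishead packet")
    first = read_rational(packet, 64)
    last = read_rational(packet, 80)
    known = first is not None and last is not None
    return {
        "version": (read_int(packet, 8, 2), read_int(packet, 10, 2)),
        "first_sample_time": first,
        "last_sample_time": last,
        "duration": last - first if known else None,  # in seconds
        "segment_length": read_int(packet, 96, 8),    # 0 if unknown
        "content_offset": read_int(packet, 104, 8),   # 0 if unknown
    }
```
&lt;br /&gt;
Using exact fractions avoids rounding when the two times have different&lt;br /&gt;
denominators; convert to floating point only for display.&lt;br /&gt;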
&lt;br /&gt;
In &#039;&#039;&#039;Skeleton 3.3 the &amp;quot;fisbone&amp;quot; packets remain unchanged from Skeleton &lt;br /&gt;
3.0&#039;&#039;&#039;, and will still follow after the other streams&#039; BOS pages and &lt;br /&gt;
secondary header pages. &lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the &lt;br /&gt;
Skeleton 3.3 keyframe index packets. There should be one index packet for&lt;br /&gt;
each content stream in the Ogg segment, but index packets are not required&lt;br /&gt;
for a Skeleton 3.3 track to be considered valid. Each keyframe in the index&lt;br /&gt;
is stored in a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a&lt;br /&gt;
timestamp. In order to save space, the offsets and timestamps are stored as&lt;br /&gt;
deltas, and then variable byte-encoded. The offset and timestamp deltas&lt;br /&gt;
store the difference between the keypoint&#039;s offset and timestamp and the&lt;br /&gt;
previous keypoint&#039;s offset and timestamp. So to calculate the page offset&lt;br /&gt;
of a keypoint you must sum the offset deltas of all keypoints up to and&lt;br /&gt;
including that keypoint in the index.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to&lt;br /&gt;
store the integer&#039;s bits, and the high bit is set in the last byte used&lt;br /&gt;
to encode the integer. The bits and bytes are in little endian byte order.&lt;br /&gt;
For example, the integer 7843, or &amp;lt;tt&amp;gt;0001 1110 1010 0011&amp;lt;/tt&amp;gt; in binary, would be&lt;br /&gt;
stored as two bytes, in stream order &amp;lt;tt&amp;gt;0x23 0xBD&amp;lt;/tt&amp;gt;, or &amp;lt;tt&amp;gt;0010 0011 1011 1101&amp;lt;/tt&amp;gt; in binary.&lt;br /&gt;
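&lt;br /&gt;
The encoding can be sketched in a few lines of Python. This is an illustration&lt;br /&gt;
under the reading that the low-order 7-bit group is written first and that the&lt;br /&gt;
high bit marks the final byte; the function names are hypothetical. Under this&lt;br /&gt;
reading 7843 is written as the byte 0x23 followed by the byte 0xBD, whose high&lt;br /&gt;
bit terminates the integer.&lt;br /&gt;
&lt;br /&gt;
```python
def encode_varint(n):
    """Encode a non-negative integer, low-order 7-bit group first."""
    out = bytearray()
    while True:
        group = n % 128        # next 7 payload bits
        n //= 128
        if n == 0:
            out.append(group + 128)  # set the high bit on the last byte
            return bytes(out)
        out.append(group)


def decode_varint(buf, pos=0):
    """Decode one integer at buf[pos]; return (value, next_pos)."""
    value, scale = 0, 1
    while True:
        byte = buf[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte // 128 == 1:   # high bit set: this was the last byte
            return value, pos


print(encode_varint(7843).hex())  # prints "23bd"
```
&lt;br /&gt;
Values below 128 occupy a single byte, so small deltas between neighbouring&lt;br /&gt;
keypoints stay compact.&lt;br /&gt;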
&lt;br /&gt;
Each &#039;&#039;&#039;Skeleton 3.3 keyframe index packet&#039;&#039;&#039; contains the following: &lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field.&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0.&lt;br /&gt;
# The keypoint presentation time denominator, as an 8 byte signed integer.&lt;br /&gt;
# &#039;n&#039; key points, each of which contain, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including it.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
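The packet layout listed above can be walked with a short Python sketch like the one below. It is only an illustration: the function names are hypothetical, the fixed-width fields are assumed to be little endian, and each keypoint is assumed to hold exactly the two variable byte encoded fields in the list. The running sums implement the rule that a keypoint&#039;s offset and timestamp numerator are the sums of all deltas up to and including that keypoint.&lt;br /&gt;
&lt;br /&gt;
```python
def read_varint(data, pos):
    """Read one variable byte encoded integer; return (value, next_pos)."""
    value, scale = 0, 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte // 128 == 1:  # high bit set on the final byte
            return value, pos


def parse_index_packet(packet):
    """Return (serialno, [(page_offset, time_in_seconds), ...])."""
    if packet[0:6] != b"index\x00":
        raise ValueError("not a keyframe index packet")
    # Assumption: fixed-width fields are stored little endian.
    serialno = int.from_bytes(packet[6:10], "little")
    count = int.from_bytes(packet[10:18], "little")
    denominator = int.from_bytes(packet[18:26], "little", signed=True)
    pos = 26
    offset = 0      # running sum of byte offset deltas
    numerator = 0   # running sum of timestamp numerator deltas
    keypoints = []
    for _ in range(count):
        delta, pos = read_varint(packet, pos)
        offset += delta
        delta, pos = read_varint(packet, pos)
        numerator += delta
        keypoints.append((offset, numerator / denominator))
    return serialno, keypoints
```
&lt;br /&gt;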
Note that a keypoint always represents the first key frame on a page. If an&lt;br /&gt;
Ogg page contains two or more keyframes, the index&#039;s key point *must* refer&lt;br /&gt;
to the first keyframe on that page, not any subsequent keyframes on that page.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by &lt;br /&gt;
presentation time as well).&lt;br /&gt;
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg&lt;br /&gt;
bitstream segment. So if you have a physical Ogg bitstream made up of two&lt;br /&gt;
chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index&lt;br /&gt;
are relative to the beginning of the second Ogg in the chain, not the first.&lt;br /&gt;
Also note that if a physical Ogg bitstream is made up of chained Oggs, the&lt;br /&gt;
presence of an index in one segment does not imply that there will be an&lt;br /&gt;
index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index &lt;br /&gt;
is up to the indexer, but to limit the index size, we recommend &lt;br /&gt;
including at most one key point per 64KB of data, or per 2000ms, &lt;br /&gt;
whichever is less frequent. &lt;br /&gt;
&lt;br /&gt;
As per the Skeleton 3.0 track, &#039;&#039;&#039;the last packet in the Skeleton 3.3 track &lt;br /&gt;
is an empty EOS packet&#039;&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
== Software Prototype ==&lt;br /&gt;
&lt;br /&gt;
For a prototype indexer, see [http://github.com/cpearce/OggIndex OggIndex]. Also included there is a program OggIndexValid, which can verify that Theora and Vorbis indexes are valid. If you&#039;re implementing your own indexer, or going to be modifying existing indexes, always verify that your modified indexes are valid as per OggIndexValid!&lt;br /&gt;
&lt;br /&gt;
Recent [http://firefogg.org/nightly/ ffmpeg2theora nightlies] will also include a keyframe index in the Skeleton&lt;br /&gt;
3.3 track if you specify the command line option &amp;lt;tt&amp;gt;--seek-index&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
To see how indexes improve network seeking performance, you can download a development&lt;br /&gt;
version of Firefox which can take advantage of indexes here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-linux.tar.bz2&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-macosx.dmg&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-win32.zip&lt;br /&gt;
&lt;br /&gt;
If you already have a Firefox instance running, you&#039;ll need to either close your running&lt;br /&gt;
Firefox instance before starting the index-capable Firefox, or start the index-capable&lt;br /&gt;
Firefox with the &amp;lt;tt&amp;gt;--no-remote&amp;lt;/tt&amp;gt; command line parameter.&lt;br /&gt;
&lt;br /&gt;
To compare the network performance of indexed versus non-indexed seeking, point the&lt;br /&gt;
index-capable Firefox here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/indexed-seek-demo.html&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Metadata&amp;diff=12675</id>
		<title>Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Metadata&amp;diff=12675"/>
		<updated>2010-11-22T23:11:18Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* Ogg Skeleton */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page aims to give an overview of the current state of metadata in Ogg and the ongoing projects towards improving it. The different components work in concert; for example [[Ogg Skeleton 4]] provides important infrastructure for [[CMML]], [[VorbisComment]] is simple to use and program, while the draft [[M3F|Multimedia Metadata Format (M3F)]] provides more sophisticated information.&lt;br /&gt;
&lt;br /&gt;
== [[VorbisComment]]s ==&lt;br /&gt;
&lt;br /&gt;
All the Xiph.org codecs have some internal mechanism for including metadata about the current stream.&lt;br /&gt;
Generally, this is one of the codec headers, and in the words of the [http://www.xiph.org/vorbis/doc/v-comment.html vorbis spec], &lt;br /&gt;
&amp;quot;It is meant for short, text comments ... much like someone jotting a quick note on the bottom of a CDR.&amp;quot; A single VorbisComment can store up to 2^64 bytes (16 exabytes).&lt;br /&gt;
&lt;br /&gt;
VorbisComments store metadata describing the stream in key=value pairs, such as &amp;quot;ARTIST=Elvis&amp;quot;, &amp;quot;TITLE=Blue Suede Shoes&amp;quot;. Multiple copies of any given key are allowed (for example you can specify ARTIST several times for multiple performers). The specification has several suggested keys: TITLE, VERSION, ALBUM, TRACKNUMBER, ARTIST, PERFORMER, COPYRIGHT, LICENSE, ORGANIZATION, DESCRIPTION, DATE, LOCATION, CONTACT, ISRC. See the [http://www.xiph.org/vorbis/doc/v-comment.html specification] for the intent of each one.&lt;br /&gt;
&lt;br /&gt;
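Because a key may appear several times, a reader of these comments typically collects a list of values per key. The Python sketch below shows one way to do that; the function name is hypothetical, and keys are folded to upper case on the assumption that field names are compared case-insensitively, as the Vorbis comment specification describes.&lt;br /&gt;
&lt;br /&gt;
```python
def group_comments(comments):
    """Group KEY=value strings by key, keeping every value in order."""
    grouped = {}
    for comment in comments:
        key, separator, value = comment.partition("=")
        if separator == "":
            continue  # no "=" present, so this is not a key=value pair
        grouped.setdefault(key.upper(), []).append(value)
    return grouped
```
&lt;br /&gt;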
The [[VorbisComment]] page contains improvements to the suggested comment set.&lt;br /&gt;
&lt;br /&gt;
== [[FLAC]] metadata blocks ==&lt;br /&gt;
&lt;br /&gt;
Metadata is included in the FLAC codec as METADATA_BLOCK_DATA. Seven types of metadata block are defined:  &lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_STREAMINFO&#039;&#039;: Sample rate, number of channels, etc.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_PADDING&#039;&#039;: Nul padding.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_APPLICATION&#039;&#039;: Third-party applications can register an ID. Metadata is typically 32-bit integers, but any datatypes can be specified.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_SEEKTABLE&#039;&#039;: For one or more seek points.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_VORBIS_COMMENT&#039;&#039;: Also known as FLAC tags, the contents of a VorbisComment packet. Note that the 32-bit field lengths are little-endian coded according to the Vorbis spec, as opposed to the usual big-endian coding of fixed-length integers in the rest of FLAC. FLAC metadata blocks are limited to 2^24 bytes (16 megabytes) and a VorbisComment packet in FLAC must fit within that limit.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_CUESHEET&#039;&#039;: Typically, but not necessarily, for CD-DA (Red Book) cuesheets.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_PICTURE&#039;&#039;: For binary picture data.&lt;br /&gt;
&lt;br /&gt;
== [[Ogg Skeleton 4]] ==&lt;br /&gt;
&lt;br /&gt;
[[Ogg Skeleton 4]] provides metadata useful for handling Ogg streams. This includes information like mime-types and mapping for granulepos which allows seeking streams without the need for the demuxer to understand them. It also provides a keyframe index to enable faster seeking over high latency networks.&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton also allows for attachment of message header fields, given as name-value pairs, that contain some sort of protocol messages about the logical bitstream. This is intended for decode related stuff, such as the screen size for a video bitstream or the number of channels for an audio bitstream.&lt;br /&gt;
&lt;br /&gt;
== [[CMML]] ==&lt;br /&gt;
&lt;br /&gt;
The [[CMML|Continuous Media Markup Language]] allows time-based marking up of media streams. At its simplest, this allows you to divide media files into clips and provide information about each clip.&lt;br /&gt;
&lt;br /&gt;
== [[M3F]] ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;[[M3F|Multimedia Metadata Format]]&#039;&#039;&#039; for the Ogg container aims to provide metadata for media streams. The exact aims of this project are still under development, but they include being able to describe artist relationships to a piece more accurately as well as providing the structure to encourage more reliable metadata.&lt;br /&gt;
&lt;br /&gt;
The format is intended to replace VorbisComments for the use of &#039;&#039;structured&#039;&#039; metadata, allowing VorbisComments to revert to their originally intended use of &amp;quot;short, text comments ... much like someone jotting a quick note on the bottom of a CDR.&amp;quot;&lt;br /&gt;
&lt;br /&gt;
== [[XMLEmbedding]] ==&lt;br /&gt;
&lt;br /&gt;
To implement XML metadata in Ogg (as for [[M3F]]), a mapping to Ogg streams is needed. The use of XML metadata will also open the way for the inclusion of technologies such as:&lt;br /&gt;
* RDF + Dublin Core&lt;br /&gt;
* [http://www.adobe.com/products/xmp/ XMP]&lt;br /&gt;
* [http://wiki.musicbrainz.org/MusicBrainzXMLMetaData MusicBrainz]&lt;br /&gt;
* [http://www.w3.org/Graphics/SVG/ SVG]&lt;br /&gt;
&lt;br /&gt;
== Aims of advanced metadata ==&lt;br /&gt;
&lt;br /&gt;
VorbisComments work well enough for most things, and can be overloaded/abused (depending on your point of view) for most other things. But there are three major requirements that point to the design of an external metadata format; one that can be interleaved with the other streams in a container.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Machinability:&#039;&#039;&#039; There are a number of items of metadata that a player will want to parse and take action on. While there are usually &#039;convention&#039; schemes for doing this with the embedded comment headers, this is much easier if there is a separate metadata stream designed for such use, instead of having to do best-effort parsing of natural language comments. For example, a video file with multiple audio tracks can specify the language of each one; a player that can parse these reliably can match them against a language preference list configured by the user to automatically select and begin playback of the best option.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Kitchen Sink:&#039;&#039;&#039; There are a minority of people who care passionately about having every detail about a track available. In the sense of conserving such information, and providing an equivalent to liner notes for online distribution, this is a goal worth supporting. However, the simple unstructured key-value pairs offered by the inline metadata are unwieldy for this level of detail. How do you tell the 2nd unit Assistant Director from the USA unit Assistant Director? How do you indicate which artist played tenor sax in the solo?&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Addressability:&#039;&#039;&#039; The internal comment metadata headers are by necessity attached to a single content stream. This is useful for some applications, but a limitation in others. In a multiplexed stream, which set of comments refers to the collection as a whole? (By convention, in Ogg, it&#039;s the first logical bitstream occurring, but we can do better.) A separate metadata stream type must address this issue of collective metadata while still allowing description of individual streams. It should also allow temporal addressability, so that changes can be described. Because the in-stream comment metadata are part of the codec headers, they cannot change over the course of the stream, and allowing additional comment packets elsewhere in the stream presents seeking challenges. In the Ogg container this can be resolved by inserting a chain boundary, but this is a poor option for very-low-bitrate streams and unreliable transports such as RTP.&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Metadata&amp;diff=12674</id>
		<title>Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Metadata&amp;diff=12674"/>
		<updated>2010-11-22T23:10:00Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page aims to give an overview of the current state of metadata in Ogg and the ongoing projects towards improving it. The different components work in concert; for example [[Ogg Skeleton 4]] provides important infrastructure for [[CMML]], [[VorbisComment]] is simple to use and program, while the draft [[M3F|Multimedia Metadata Format (M3F)]] provides more sophisticated information.&lt;br /&gt;
&lt;br /&gt;
== [[VorbisComment]]s ==&lt;br /&gt;
&lt;br /&gt;
All the Xiph.org codecs have some internal mechanism for including metadata about the current stream.&lt;br /&gt;
Generally, this is one of the codec headers, and in the words of the [http://www.xiph.org/vorbis/doc/v-comment.html vorbis spec], &lt;br /&gt;
&amp;quot;It is meant for short, text comments ... much like someone jotting a quick note on the bottom of a CDR.&amp;quot; A single VorbisComment can store up to 2^64 bytes (16 exabytes).&lt;br /&gt;
&lt;br /&gt;
VorbisComments store metadata describing the stream in key=value pairs, such as &amp;quot;ARTIST=Elvis&amp;quot;, &amp;quot;TITLE=Blue Suede Shoes&amp;quot;. Multiple copies of any given key are allowed (for example you can specify ARTIST several times for multiple performers). The specification has several suggested keys: TITLE, VERSION, ALBUM, TRACKNUMBER, ARTIST, PERFORMER, COPYRIGHT, LICENSE, ORGANIZATION, DESCRIPTION, DATE, LOCATION, CONTACT, ISRC. See the [http://www.xiph.org/vorbis/doc/v-comment.html specification] for the intent of each one.&lt;br /&gt;
&lt;br /&gt;
The [[VorbisComment]] page contains improvements to the suggested comment set.&lt;br /&gt;
&lt;br /&gt;
== [[FLAC]] metadata blocks ==&lt;br /&gt;
&lt;br /&gt;
Metadata is included in the FLAC codec as METADATA_BLOCK_DATA. Seven types of metadata block are defined:  &lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_STREAMINFO&#039;&#039;: Sample rate, number of channels, etc.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_PADDING&#039;&#039;: Nul padding.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_APPLICATION&#039;&#039;: Third-party applications can register an ID. Metadata is typically 32-bit integers, but any datatypes can be specified.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_SEEKTABLE&#039;&#039;: For one or more seek points.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_VORBIS_COMMENT&#039;&#039;: Also known as FLAC tags, the contents of a VorbisComment packet. Note that the 32-bit field lengths are little-endian coded according to the Vorbis spec, as opposed to the usual big-endian coding of fixed-length integers in the rest of FLAC. FLAC metadata blocks are limited to 2^24 bytes (16 megabytes) and a VorbisComment packet in FLAC must fit within that limit.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_CUESHEET&#039;&#039;: Typically, but not necessarily, for CD-DA (Red Book) cuesheets.&lt;br /&gt;
#&#039;&#039;METADATA_BLOCK_PICTURE&#039;&#039;: For binary picture data.&lt;br /&gt;
&lt;br /&gt;
== [[Ogg Skeleton]] ==&lt;br /&gt;
&lt;br /&gt;
[[Ogg Skeleton]] provides metadata useful for handling Ogg streams. This includes information like mime-types and mapping for granulepos which allows seeking streams without the need for the demuxer to understand them.&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton also allows for attachment of message header fields, given as name-value pairs, that contain some sort of protocol messages about the logical bitstream. This is intended for decode related stuff, such as the screen size for a video bitstream or the number of channels for an audio bitstream.&lt;br /&gt;
&lt;br /&gt;
== [[CMML]] ==&lt;br /&gt;
&lt;br /&gt;
The [[CMML|Continuous Media Markup Language]] allows time-based marking up of media streams. At its simplest, this allows you to divide media files into clips and provide information about each clip.&lt;br /&gt;
&lt;br /&gt;
== [[M3F]] ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;[[M3F|Multimedia Metadata Format]]&#039;&#039;&#039; for the Ogg container aims to provide metadata for media streams. The exact aims of this project are still under development, but they include being able to describe artist relationships to a piece more accurately as well as providing the structure to encourage more reliable metadata.&lt;br /&gt;
&lt;br /&gt;
The format is intended to replace VorbisComments for the use of &#039;&#039;structured&#039;&#039; metadata, allowing VorbisComments to revert to their originally intended use of &amp;quot;short, text comments ... much like someone jotting a quick note on the bottom of a CDR.&amp;quot;&lt;br /&gt;
&lt;br /&gt;
== [[XMLEmbedding]] ==&lt;br /&gt;
&lt;br /&gt;
To implement XML metadata in Ogg (as for [[M3F]]), a mapping to Ogg streams is needed. The use of XML metadata will also open the way for the inclusion of technologies such as:&lt;br /&gt;
* RDF + Dublin Core&lt;br /&gt;
* [http://www.adobe.com/products/xmp/ XMP]&lt;br /&gt;
* [http://wiki.musicbrainz.org/MusicBrainzXMLMetaData MusicBrainz]&lt;br /&gt;
* [http://www.w3.org/Graphics/SVG/ SVG]&lt;br /&gt;
&lt;br /&gt;
== Aims of advanced metadata ==&lt;br /&gt;
&lt;br /&gt;
VorbisComments work well enough for most things, and can be overloaded/abused (depending on your point of view) for most other things. But there are three major requirements that point to the design of an external metadata format; one that can be interleaved with the other streams in a container.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Machinability:&#039;&#039;&#039; There are a number of items of metadata that a player will want to parse and take action on. While there are usually &#039;convention&#039; schemes for doing this with the embedded comment headers, this is much easier if there is a separate metadata stream designed for such use, instead of having to do best-effort parsing of natural language comments. For example, a video file with multiple audio tracks can specify the language of each one; a player that can parse these reliably can match them against a language preference list configured by the user to automatically select and begin playback of the best option.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Kitchen Sink:&#039;&#039;&#039; There are a minority of people who care passionately about having every detail about a track available. In the sense of conserving such information, and providing an equivalent to liner notes for online distribution, this is a goal worth supporting. However, the simple unstructured key-value pairs offered by the inline metadata are unwieldy for this level of detail. How do you tell the 2nd unit Assistant Director from the USA unit Assistant Director? How do you indicate which artist played tenor sax in the solo?&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Addressability:&#039;&#039;&#039; The internal comment metadata headers are by necessity attached to a single content stream. This is useful for some applications, but a limitation in others. In a multiplexed stream, which set of comments refers to the collection as a whole? (By convention, in Ogg, it&#039;s the first logical bitstream occurring, but we can do better.) A separate metadata stream type must address this issue of collective metadata while still allowing description of individual streams. It should also allow temporal addressability, so that changes can be described. Because the in-stream comment metadata are part of the codec headers, they cannot change over the course of the stream, and allowing additional comment packets elsewhere in the stream presents seeking challenges. In the Ogg container this can be resolved by inserting a chain boundary, but this is a poor option for very-low-bitrate streams and unreliable transports such as RTP.&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_3&amp;diff=12672</id>
		<title>Ogg Skeleton 3</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_3&amp;diff=12672"/>
		<updated>2010-11-22T00:01:23Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Ogg Skeleton 3.0&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;NOTE:&#039;&#039;&#039; &#039;&#039;The Ogg Skeleton format has been updated to [[Ogg Skeleton 4]], which includes a keyframe index to enable faster seeking. Encoding tools are recommended to use [[Ogg Skeleton 4]] in preference to version 3.0 where possible.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format for time-continuous data streams, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg; it is, however, not possible to generically parse such a file, seek on it, understand what codecs it contains, or dynamically handle and play back such content.&lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
&lt;br /&gt;
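The granule rate and granuleshift fields listed above can be combined as in the following Python sketch. It assumes the Theora-style convention in which the upper bits of the granulepos hold the granule of the last keyframe and the lower granuleshift bits count the granules since that keyframe; other codecs may define the split differently (see [[GranulePosAndSeeking]]). The function name and parameter names are illustrative only, and the sketch ignores basetime and basegranule.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def granulepos_to_seconds(granulepos, granuleshift, rate_num, rate_den):
    """Map a granulepos to seconds using granules / granulerate."""
    split = 2 ** granuleshift
    keyframe_granule = granulepos // split   # upper bits
    granules_since = granulepos % split      # lower granuleshift bits
    granules = keyframe_granule + granules_since
    # granulerate is rate_num / rate_den, so divide by it exactly.
    return Fraction(granules * rate_den, rate_num)
```
&lt;br /&gt;
With a granuleshift of 0 the whole granulepos is treated as the granule count, which matches codecs that have no sub-seekable units.&lt;br /&gt;
&lt;br /&gt;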
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 3.0 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 3.0 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | Padding/future use                            | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 52-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given (e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;). Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility - to allow further fields to be included into the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
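&lt;br /&gt;
As a rough illustration of the layout above, the fixed-size part of a fisbone packet could be read as in the following Python sketch. It assumes that all integer fields are little-endian and that the message header fields start at byte 52 as drawn in the diagram; the function name and the dictionary keys are illustrative only and do not come from any library.&lt;br /&gt;
&lt;br /&gt;
```python
FISBONE_FIXED_SIZE = 52  # bytes before the message header fields in the diagram


def parse_fisbone(packet):
    """Read the fixed-size fisbone fields; integers are assumed little-endian."""
    fixed = packet[:FISBONE_FIXED_SIZE]
    if len(fixed) != FISBONE_FIXED_SIZE or fixed[:8] != b"fisbone\x00":
        raise ValueError("not a fisbone packet")

    def uint(start, end):
        # Unsigned integer stored in bytes [start, end) of the packet.
        return int.from_bytes(fixed[start:end], "little")

    return {
        "message_header_offset": uint(8, 12),
        "serial_number": uint(12, 16),
        "header_packets": uint(16, 20),
        "granulerate_numerator": uint(20, 28),
        "granulerate_denominator": uint(28, 36),
        "basegranule": uint(36, 44),
        "preroll": uint(44, 48),
        "granuleshift": fixed[48],
        # Raw message header bytes, taken from the fixed position in the diagram.
        "message_headers": packet[FISBONE_FIXED_SIZE:],
    }
```
&lt;br /&gt;
A parser meant to be forward compatible would locate the message header fields through the offset field rather than through the fixed position used in this sketch.&lt;br /&gt;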
&lt;br /&gt;
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being supported by the following projects:&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* ffmpeg2theora (with --skeleton) &lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12671</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12671"/>
		<updated>2010-11-21T23:43:14Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.0 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg, it is however not possible to generically parse such a file, seek on it, understand what codecs are contained in such a file, and dynamically handle and play back such content. &lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search for the seek target timestamp. However, when seeking over a high-latency connection, such as the internet, such searches can be slow. Some bitstreams, notably Theora, have keyframes, and so in order to seek to a given temporal offset in a Theora stream, you must first perform a bisection search to find the target Theora frame, determine its keyframe, and then perform another bisection search to locate that keyframe and decode forwards to the temporal offset. This can be very slow. Ogg Skeleton 4.0 provides an index of keyframes, and indexes periodic samples on streams without the concept of a keyframe, so that seeking over high-latency connections can be performed optimally with &amp;quot;one hop&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;For the current specification for the keyframe index packets see&lt;br /&gt;
http://github.com/cpearce/OggIndex/blob/master/Skeleton-4.0-Index-Specification.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
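&lt;br /&gt;
The granule rate and granuleshift items above can be illustrated with a small Python sketch. It assumes the granule value has already been extracted from the granulepos as described in [[GranulePosAndSeeking]]; the helper that splits a granulepos at the granuleshift only separates the upper and lower bit fields and leaves their codec-specific interpretation open. Both function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def split_granulepos(granulepos, granuleshift):
    """Split a granulepos into (upper bits, lower granuleshift bits)."""
    return divmod(granulepos, 2 ** granuleshift)


def granules_to_seconds(granules, rate_numerator, rate_denominator):
    """Map a granule count to seconds by computing granules / granulerate."""
    granulerate = Fraction(rate_numerator, rate_denominator)
    if granulerate == 0:
        raise ValueError("granulerate must be non-zero")
    return Fraction(granules) / granulerate
```
&lt;br /&gt;
For a track with a granule rate of 48000/1, granules_to_seconds(96000, 48000, 1) gives 2 seconds; with a granule rate of 30000/1001, 30 granules map to 1001/1000 seconds.&lt;br /&gt;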
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. This bisection method works fine for seeking in local files, but for seeking in files served over the Internet via HTTP, each bisection or non-sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets.&lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Skeleton provides seek algorithms with an index, or ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative from the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream, so if you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set which contains, for every active stream, the last keypoint with a time less than or equal to the seek target time. This tells you a known point on every stream which lies before the seek target. Then from that set of key points, select the key point with the smallest byte offset. You then verify that there&#039;s a page from the keypoint&#039;s stream found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
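&lt;br /&gt;
The keypoint selection described above can be sketched in Python as follows. The data shape is an assumption made for the example: each index is a list of (offset, time) pairs with absolute values, sorted by time, keyed by the serial number of its stream. The function name is hypothetical, and returning None stands for falling back to another seek method.&lt;br /&gt;
&lt;br /&gt;
```python
from bisect import bisect_right


def choose_seek_keypoint(indexes, target_time):
    """Pick the keypoint to start decoding from, or None if the index cannot help.

    indexes maps a stream serial number to a list of (offset, time) pairs
    sorted by time. The result is an (offset, time, serial) tuple.
    """
    candidates = []
    for serial, keypoints in indexes.items():
        times = [time for _offset, time in keypoints]
        # Number of keypoints whose time is at or before the target.
        count = bisect_right(times, target_time)
        if count == 0:
            return None  # this stream has no keypoint at or before the target
        offset, time = keypoints[count - 1]
        candidates.append((offset, time, serial))
    if not candidates:
        return None
    # Tuples compare by their first element first, so this is the smallest offset.
    return min(candidates)
```
&lt;br /&gt;
The caller still has to verify that a page of the selected stream starts at the returned offset before decoding from it.&lt;br /&gt;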
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 index packets also store metadata about the segment in which they reside. Each index packet stores the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the end time of the last active stream minus the start time of the first active stream.&lt;br /&gt;
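&lt;br /&gt;
The duration calculation above could look like the following Python sketch. It reads the end time of the last active stream as the latest end time and the start time of the first active stream as the earliest start time, which is an interpretation made for this example. Each track is given as a hypothetical (first numerator, last numerator, denominator) tuple taken from its index packet.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def segment_duration(tracks):
    """Return the duration in seconds as an exact fraction.

    tracks is a non-empty list of (first_numerator, last_numerator, denominator)
    tuples, one per indexed track.
    """
    starts = [Fraction(first, denominator) for first, _last, denominator in tracks]
    ends = [Fraction(last, denominator) for _first, last, denominator in tracks]
    return max(ends) - min(starts)
```
&lt;br /&gt;
Using exact fractions avoids rounding differences between tracks that use different timestamp denominators.&lt;br /&gt;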
&lt;br /&gt;
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify whether the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset, and start decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
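&lt;br /&gt;
The three checks above can be sketched in Python over an in-memory copy of the file. The sketch relies on the Ogg page header layout, in which a page begins with the capture pattern OggS and carries its stream serial number in bytes 14 to 17 as a little-endian integer. The function names are hypothetical, and finding the capture pattern is only a necessary condition, since page checksums are not verified here.&lt;br /&gt;
&lt;br /&gt;
```python
OGG_CAPTURE_PATTERN = b"OggS"
SERIAL_START, SERIAL_END = 14, 18  # position of the serial number in a page header


def segment_end_is_plausible(data, segment_length):
    """True if the data ends at segment_length or a new page (chain link) starts there."""
    if len(data) == segment_length:
        return True
    return data[segment_length:segment_length + 4] == OGG_CAPTURE_PATTERN


def keypoint_lands_on_stream_page(data, offset, serial):
    """True if a page header of the given stream starts exactly at offset."""
    header = data[offset:offset + SERIAL_END]
    if len(header) != SERIAL_END or header[:4] != OGG_CAPTURE_PATTERN:
        return False
    return int.from_bytes(header[SERIAL_START:SERIAL_END], "little") == serial
```
&lt;br /&gt;
If either function returns False, the index should be treated as invalid and the seek should fall back to a bisection search.&lt;br /&gt;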
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may only index periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.0 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length in bytes                                       | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content byte offset                                           | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.0 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
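&lt;br /&gt;
As an illustration of the diagram above, the 80-byte Skeleton 4.0 fishead packet could be read as in the following Python sketch. It assumes that all integer fields are little-endian and, as a further assumption, treats the time numerators and denominators as signed; the 20 UTC bytes are returned without interpretation. The function name and dictionary keys are illustrative only.&lt;br /&gt;
&lt;br /&gt;
```python
FISHEAD_SIZE = 80  # size of the Skeleton 4.0 fishead layout in the diagram


def parse_fishead(packet):
    """Read the fishead fields; integers are assumed little-endian."""
    head = packet[:FISHEAD_SIZE]
    if len(head) != FISHEAD_SIZE or head[:8] != b"fishead\x00":
        raise ValueError("not a Skeleton 4.0 fishead packet")

    def uint(start, end):
        return int.from_bytes(head[start:end], "little")

    def sint(start, end):
        return int.from_bytes(head[start:end], "little", signed=True)

    return {
        "version": (uint(8, 10), uint(10, 12)),  # (major, minor)
        "presentationtime": (sint(12, 20), sint(20, 28)),  # (numerator, denominator)
        "basetime": (sint(28, 36), sint(36, 44)),  # (numerator, denominator)
        "utc": head[44:64],  # 20 raw bytes, left uninterpreted here
        "segment_length": uint(64, 72),
        "content_offset": uint(72, 80),
    }
```
&lt;br /&gt;
A caller would typically compare the version tuple against the versions it knows how to parse before relying on the remaining fields.&lt;br /&gt;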
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | Padding/future use                            | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 52-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given, e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;. Message header fields are terminated/delimited by &amp;quot;\r\n&amp;quot;. Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility - to allow further fields to be included into the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.0:&lt;br /&gt;
* Content-type: mime type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
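&lt;br /&gt;
The message header fields and the compulsory headers above can be handled with a sketch like the following. It assumes the header bytes are UTF-8 text and that header names may be compared case-insensitively, as in HTTP; neither assumption is taken from the specification, and both function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
REQUIRED_HEADERS = ("content-type", "role", "name")


def parse_message_headers(raw):
    """Parse CRLF-delimited 'Name: value' fields into a dict with lower-case names."""
    headers = {}
    for line in raw.split(b"\r\n"):
        if not line:
            continue  # empty piece after the final delimiter
        name, separator, value = line.partition(b":")
        if not separator:
            continue  # not a 'Name: value' line; skipped in this sketch
        headers[name.strip().decode("utf-8").lower()] = value.strip().decode("utf-8")
    return headers


def missing_required_headers(headers):
    """List the compulsory Skeleton 4.0 headers that are absent."""
    return [name for name in REQUIRED_HEADERS if name not in headers]
```
&lt;br /&gt;
For example, a fisbone carrying only a Content-Type and a Role field would be reported as missing the name header.&lt;br /&gt;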
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each keyframe in the index is stored in a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte encoded. The offset and timestamp deltas store the difference between the keypoint&#039;s offset and timestamp and the previous keypoint&#039;s offset and timestamp. So to calculate the page offset of a keypoint you must sum the offset deltas up to and including the keypoint in the index.&lt;br /&gt;
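&lt;br /&gt;
The delta scheme described above amounts to a running sum. A minimal Python sketch, assuming the deltas have already been decoded into (offset delta, timestamp delta) pairs and using a hypothetical function name:&lt;br /&gt;
&lt;br /&gt;
```python
from itertools import accumulate


def absolute_keypoints(deltas):
    """Turn (offset_delta, time_delta) pairs into absolute (offset, time) pairs."""
    offsets = accumulate(offset_delta for offset_delta, _time_delta in deltas)
    times = accumulate(time_delta for _offset_delta, time_delta in deltas)
    return list(zip(offsets, times))
```
&lt;br /&gt;
With the deltas (4096, 0), (10000, 2000) and (5000, 1500), the absolute keypoints are (4096, 0), (14096, 2000) and (19096, 3500). The timestamps are still numerators over the timestamp denominator of the index.&lt;br /&gt;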
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes: 0xBD 0x23, or 1011 1101 0010 0011 in binary.&lt;br /&gt;
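&lt;br /&gt;
A Python sketch of the variable byte encoding follows. It assumes that the least significant 7-bit group is written first and that the byte with the high bit set is written last. The worked example above lists the same two byte values in the opposite order, so the byte order should be confirmed against the index specification before relying on this sketch. Both function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def encode_varint(value):
    """Encode a non-negative integer, 7 bits per byte, low group first."""
    if min(value, 0) != 0:
        raise ValueError("only non-negative integers can be encoded")
    out = bytearray()
    while True:
        value, low7 = divmod(value, 128)
        if value == 0:
            out.append(low7 + 128)  # high bit marks the last byte
            return bytes(out)
        out.append(low7)


def decode_varint(data, position=0):
    """Decode one integer starting at position; return (value, next position)."""
    result = 0
    scale = 1
    while True:
        byte = data[position]  # raises IndexError if the data is truncated
        position += 1
        result += (byte % 128) * scale
        if byte // 128 == 1:  # high bit set: this was the last byte
            return result, position
        scale *= 128
```
&lt;br /&gt;
Under this assumption the integer 7843 encodes to the bytes 0x23 0xBD, and decoding those two bytes gives back 7843.&lt;br /&gt;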
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;index\0&#039;                                          | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Serial number                  | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Number of keypoints            | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Timestamp denominator         | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | First sample time numerator   | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Last sample end time numerator| 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Keypoints...                   | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The fields of the index packet are as follows:&lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;. Bytes [0...5].&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9]&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0. Bytes [10...17].&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the numerator for the end time of the last sample in the track. Bytes [34...41]&lt;br /&gt;
# &#039;n&#039; key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including it.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
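&lt;br /&gt;
The field list above can be turned into a small parser. The following Python sketch makes several assumptions: the fixed-width fields are little endian, the variable byte integers put the low 7-bit group first with the high bit on the final byte, and the first timestamp delta is measured from 0. The names parse_index_packet and read_varint are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction

HEADER_SIZE = 42  # 6 + 4 + 8 + 8 + 8 + 8 bytes, per the field list


def read_varint(data, pos):
    """Read one variable byte encoded integer; return (value, next_pos)."""
    value, scale = 0, 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte >= 128:  # high bit marks the final byte
            return value, pos


def parse_index_packet(packet):
    """Decode one index packet into absolute keypoint offsets and times."""
    if packet[:6] != b"index\x00":
        raise ValueError("not a Skeleton index packet")
    if HEADER_SIZE > len(packet):
        raise ValueError("index packet is truncated")

    def field(start, end, signed):
        # Assumption: fixed-width fields are stored little endian.
        return int.from_bytes(packet[start:end], "little", signed=signed)

    serialno = field(6, 10, False)
    count = field(10, 18, False)
    denominator = field(18, 26, True)
    if denominator == 0:
        raise ValueError("timestamp denominator must not be 0")
    first_num = field(26, 34, True)
    last_num = field(34, 42, True)

    keypoints = []
    pos, offset, time_num = HEADER_SIZE, 0, 0
    for _ in range(count):
        delta, pos = read_varint(packet, pos)
        offset += delta    # running sum of byte offset deltas
        delta, pos = read_varint(packet, pos)
        time_num += delta  # running sum of timestamp numerator deltas
        keypoints.append((offset, Fraction(time_num, denominator)))

    return {
        "serialno": serialno,
        "first_sample_time": Fraction(first_num, denominator),
        "last_sample_time": Fraction(last_num, denominator),
        "keypoints": keypoints,
    }
```
&lt;br /&gt;
Times are kept as exact fractions of seconds so that no rounding is introduced when the deltas are summed.&lt;br /&gt;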
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value could not be determined at indexing time and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per 64 KB of data or per 1000 ms, whichever is less frequent.&lt;br /&gt;
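&lt;br /&gt;
One way an indexer might apply this recommendation is sketched below in Python. It reads the rule as requiring both spacings at once: a keyframe is indexed only if it lies at least 64 KB and at least 1000 ms after the previously indexed one. That reading, the function name thin_keypoints, and the (byte offset, time in ms) tuple layout are all assumptions made for illustration.&lt;br /&gt;
&lt;br /&gt;
```python
MIN_BYTES = 64 * 1024  # recommended minimum spacing in bytes
MIN_MS = 1000          # recommended minimum spacing in milliseconds


def thin_keypoints(keyframes):
    """Pick which keyframes to index.

    keyframes: list of (byte_offset, time_ms) tuples sorted by offset.
    """
    kept = []
    for offset, time_ms in keyframes:
        if not kept:
            kept.append((offset, time_ms))  # always index the first keyframe
            continue
        last_offset, last_ms = kept[-1]
        far_in_bytes = offset - last_offset >= MIN_BYTES
        far_in_time = time_ms - last_ms >= MIN_MS
        if far_in_bytes and far_in_time:
            kept.append((offset, time_ms))
    return kept
```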
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets (the fisbone and index packets)&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear.&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton 4 is being supported by the following projects:&lt;br /&gt;
* ffmpeg2theora (version 0.27 and above) &lt;br /&gt;
* OggIndex: [http://firefogg.org/nightly/ binaries], [http://git.xiph.org/?p=OggIndex.git;a=summary source]&lt;br /&gt;
* Mozilla Firefox 4&lt;br /&gt;
&lt;br /&gt;
The following projects currently support Ogg Skeleton 3.1; support for Ogg Skeleton 4 is planned:&lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Todo&amp;diff=12670</id>
		<title>Todo</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Todo&amp;diff=12670"/>
		<updated>2010-11-20T03:39:36Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* Todos */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Todo list for xiph.org.&lt;br /&gt;
&lt;br /&gt;
If you&#039;re interested in helping out, this is a good place to start. Also, asking on irc (&#039;&#039;#vorbis&#039;&#039;, &#039;&#039;#theora&#039;&#039;, &#039;&#039;#annodex&#039;&#039; or &#039;&#039;#xiph&#039;&#039; on &#039;&#039;irc.freenode.net&#039;&#039;) is a good way to get oriented.&lt;br /&gt;
&lt;br /&gt;
== Todos ==&lt;br /&gt;
&lt;br /&gt;
* Add [[Ogg_Skeleton_4]]/index support to Cortado.&lt;br /&gt;
* Add [[Ogg_Skeleton_4]]/index support to VLC.&lt;br /&gt;
* Add [[Ogg_Skeleton_4]]/index support to liboggz.&lt;br /&gt;
* Add [[Ogg_Skeleton_4]]/index support to ffmpeg.&lt;br /&gt;
* See [[OggIndex-Migration]] for other projects which need OggIndex support.&lt;br /&gt;
* [http://github.com/cpearce/OggIndex OggIndex] needs Speex support.&lt;br /&gt;
* [[Ices]] needs Speex support&lt;br /&gt;
* Icecast toolchain needs support for WebM.&lt;br /&gt;
* Icecast toolchain needs support for CELT. NOTE: CELT Ogg encapsulation may change.&lt;br /&gt;
* Oggenc and Ogg123 need OggPCM support (encoding and playback respectively)&lt;br /&gt;
* Test and fix &#039;downstream&#039; applications&lt;br /&gt;
* Create Xiph.Org conference swag: brochures, posters, toys, demo disks, etc. (don&#039;t do this without coordinating with folks)&lt;br /&gt;
* Update the todos&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Several projects have their own todo lists in the wiki.&lt;br /&gt;
&lt;br /&gt;
* [[TheoraTodo]]&lt;br /&gt;
* [[XSPF_Todo_list]]&lt;br /&gt;
&lt;br /&gt;
We always need people to help with the [http://xiph.org/ websites] as well.&lt;br /&gt;
&lt;br /&gt;
== Website todos ==&lt;br /&gt;
* Find and fix bugs, bad links, outdated, and incorrect information.&lt;br /&gt;
* More HTML5 multimedia content for our websites, both useful things like presentation videos and the primer as well as (tasteful) dancing baloney.&lt;br /&gt;
* Better integrate our web resources:&lt;br /&gt;
** Mailing lists archives and Wiki could be more extensively integrated into the websites, e.g. sections automatically fed with recent posts/edits to relevant pages/lists.&lt;br /&gt;
** Make planet.xiph.org more publicly visible? (need to reduce the offtopic posts that leak through)&lt;br /&gt;
* Local blogging platform so JM / Xiphmont don&#039;t need to use livejournal— in particular having nice media support would be nice (ugh, more software to maintain) &lt;br /&gt;
* Make Media.xiph.org more reasonably structured and prettier (Apng previews? WebM renders?)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* See also the [[Bounties]] page&lt;br /&gt;
&lt;br /&gt;
[[Category:Developers stuff]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Todo&amp;diff=12669</id>
		<title>Todo</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Todo&amp;diff=12669"/>
		<updated>2010-11-20T03:37:44Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* Todos */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Todo list for xiph.org.&lt;br /&gt;
&lt;br /&gt;
If you&#039;re interested in helping out, this is a good place to start. Also, asking on irc (&#039;&#039;#vorbis&#039;&#039;, &#039;&#039;#theora&#039;&#039;, &#039;&#039;#annodex&#039;&#039; or &#039;&#039;#xiph&#039;&#039; on &#039;&#039;irc.freenode.net&#039;&#039;) is a good way to get oriented.&lt;br /&gt;
&lt;br /&gt;
== Todos ==&lt;br /&gt;
&lt;br /&gt;
* Add [[Ogg_Skeleton_4]]/index support to Cortado.&lt;br /&gt;
* Add [[Ogg_Skeleton_4]]/index support to VLC.&lt;br /&gt;
* Add [[Ogg_Skeleton_4]]/index support to liboggz.&lt;br /&gt;
* Add [[Ogg_Skeleton_4]]/index support to ffmpeg.&lt;br /&gt;
* See [[OggIndex-Migration]] for other projects which need OggIndex support.&lt;br /&gt;
* [[OggIndex]] needs Speex support.&lt;br /&gt;
* [[Ices]] needs Speex support&lt;br /&gt;
* Icecast toolchain needs support for WebM.&lt;br /&gt;
* Icecast toolchain needs support for CELT. NOTE: CELT Ogg encapsulation may change.&lt;br /&gt;
* Oggenc and Ogg123 need OggPCM support (encoding and playback respectively)&lt;br /&gt;
* Test and fix &#039;downstream&#039; applications&lt;br /&gt;
* Create Xiph.Org conference swag: brochures, posters, toys, demo disks, etc. (don&#039;t do this without coordinating with folks)&lt;br /&gt;
* Update the todos&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Several projects have their own todo lists in the wiki.&lt;br /&gt;
&lt;br /&gt;
* [[TheoraTodo]]&lt;br /&gt;
* [[XSPF_Todo_list]]&lt;br /&gt;
&lt;br /&gt;
We always need people to help with the [http://xiph.org/ websites] as well.&lt;br /&gt;
&lt;br /&gt;
== Website todos ==&lt;br /&gt;
* Find and fix bugs, bad links, outdated, and incorrect information.&lt;br /&gt;
* More HTML5 multimedia content for our websites, both useful things like presentation videos and the primer as well as (tasteful) dancing baloney.&lt;br /&gt;
* Better integrate our web resources:&lt;br /&gt;
** Mailing lists archives and Wiki could be more extensively integrated into the websites, e.g. sections automatically fed with recent posts/edits to relevant pages/lists.&lt;br /&gt;
** Make planet.xiph.org more publicly visible? (need to reduce the offtopic posts that leak through)&lt;br /&gt;
* Local blogging platform so JM / Xiphmont don&#039;t need to use livejournal— in particular having nice media support would be nice (ugh, more software to maintain) &lt;br /&gt;
* Make Media.xiph.org more reasonably structured and prettier (Apng previews? WebM renders?)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* See also the [[Bounties]] page&lt;br /&gt;
&lt;br /&gt;
[[Category:Developers stuff]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Todo&amp;diff=12668</id>
		<title>Todo</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Todo&amp;diff=12668"/>
		<updated>2010-11-20T03:37:01Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* Todos */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Todo list for xiph.org.&lt;br /&gt;
&lt;br /&gt;
If you&#039;re interested in helping out, this is a good place to start. Also, asking on irc (&#039;&#039;#vorbis&#039;&#039;, &#039;&#039;#theora&#039;&#039;, &#039;&#039;#annodex&#039;&#039; or &#039;&#039;#xiph&#039;&#039; on &#039;&#039;irc.freenode.net&#039;&#039;) is a good way to get oriented.&lt;br /&gt;
&lt;br /&gt;
== Todos ==&lt;br /&gt;
&lt;br /&gt;
* Add [[OggIndex]]/[[Ogg_Skeleton_4]] support to Cortado.&lt;br /&gt;
* Add [[OggIndex]]/[[Ogg_Skeleton_4]] support to VLC.&lt;br /&gt;
* Add [[OggIndex]]/[[Ogg_Skeleton_4]] support to liboggz.&lt;br /&gt;
* Add [[OggIndex]]/[[Ogg_Skeleton_4]] support to ffmpeg.&lt;br /&gt;
* See [[OggIndex-Migration]] for other projects which need OggIndex support.&lt;br /&gt;
* [[OggIndex]] needs Speex support.&lt;br /&gt;
* [[Ices]] needs Speex support&lt;br /&gt;
* Icecast toolchain needs support for WebM.&lt;br /&gt;
* Icecast toolchain needs support for CELT. NOTE: CELT Ogg encapsulation may change.&lt;br /&gt;
* Oggenc and Ogg123 need OggPCM support (encoding and playback respectively)&lt;br /&gt;
* Test and fix &#039;downstream&#039; applications&lt;br /&gt;
* Create Xiph.Org conference swag: brochures, posters, toys, demo disks, etc. (don&#039;t do this without coordinating with folks)&lt;br /&gt;
* Update the todos&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Several projects have their own todo lists in the wiki.&lt;br /&gt;
&lt;br /&gt;
* [[TheoraTodo]]&lt;br /&gt;
* [[XSPF_Todo_list]]&lt;br /&gt;
&lt;br /&gt;
We always need people to help with the [http://xiph.org/ websites] as well.&lt;br /&gt;
&lt;br /&gt;
== Website todos ==&lt;br /&gt;
* Find and fix bugs, bad links, outdated, and incorrect information.&lt;br /&gt;
* More HTML5 multimedia content for our websites, both useful things like presentation videos and the primer as well as (tasteful) dancing baloney.&lt;br /&gt;
* Better integrate our web resources:&lt;br /&gt;
** Mailing lists archives and Wiki could be more extensively integrated into the websites, e.g. sections automatically fed with recent posts/edits to relevant pages/lists.&lt;br /&gt;
** Make planet.xiph.org more publicly visible? (need to reduce the offtopic posts that leak through)&lt;br /&gt;
* Local blogging platform so JM / Xiphmont don&#039;t need to use livejournal— in particular having nice media support would be nice (ugh, more software to maintain) &lt;br /&gt;
* Make Media.xiph.org more reasonably structured and prettier (Apng previews? WebM renders?)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* See also the [[Bounties]] page&lt;br /&gt;
&lt;br /&gt;
[[Category:Developers stuff]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Todo&amp;diff=12667</id>
		<title>Todo</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Todo&amp;diff=12667"/>
		<updated>2010-11-20T03:36:02Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* Todos */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Todo list for xiph.org.&lt;br /&gt;
&lt;br /&gt;
If you&#039;re interested in helping out, this is a good place to start. Also, asking on irc (&#039;&#039;#vorbis&#039;&#039;, &#039;&#039;#theora&#039;&#039;, &#039;&#039;#annodex&#039;&#039; or &#039;&#039;#xiph&#039;&#039; on &#039;&#039;irc.freenode.net&#039;&#039;) is a good way to get oriented.&lt;br /&gt;
&lt;br /&gt;
== Todos ==&lt;br /&gt;
&lt;br /&gt;
* Add [[OggIndex]]/[[Ogg_Skeleton4]] support to Cortado.&lt;br /&gt;
* Add [[OggIndex]]/[[Ogg_Skeleton4]] support to VLC.&lt;br /&gt;
* Add [[OggIndex]]/[[Ogg_Skeleton4]] support to liboggz.&lt;br /&gt;
* Add [[OggIndex]]/[[Ogg_Skeleton4]] support to ffmpeg.&lt;br /&gt;
* See [[OggIndex-Migration]] for other projects which need OggIndex support.&lt;br /&gt;
* [[OggIndex]] needs Speex support.&lt;br /&gt;
* [[Ices]] needs Speex support&lt;br /&gt;
* Icecast toolchain needs support for WebM.&lt;br /&gt;
* Icecast toolchain needs support for CELT. NOTE: CELT Ogg encapsulation may change.&lt;br /&gt;
* Oggenc and Ogg123 need OggPCM support (encoding and playback respectively)&lt;br /&gt;
* Test and fix &#039;downstream&#039; applications&lt;br /&gt;
* Create Xiph.Org conference swag: brochures, posters, toys, demo disks, etc. (don&#039;t do this without coordinating with folks)&lt;br /&gt;
* Update the todos&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Several projects have their own todo lists in the wiki.&lt;br /&gt;
&lt;br /&gt;
* [[TheoraTodo]]&lt;br /&gt;
* [[XSPF_Todo_list]]&lt;br /&gt;
&lt;br /&gt;
We always need people to help with the [http://xiph.org/ websites] as well.&lt;br /&gt;
&lt;br /&gt;
== Website todos ==&lt;br /&gt;
* Find and fix bugs, bad links, outdated, and incorrect information.&lt;br /&gt;
* More HTML5 multimedia content for our websites, both useful things like presentation videos and the primer as well as (tasteful) dancing baloney.&lt;br /&gt;
* Better integrate our web resources:&lt;br /&gt;
** Mailing lists archives and Wiki could be more extensively integrated into the websites, e.g. sections automatically fed with recent posts/edits to relevant pages/lists.&lt;br /&gt;
** Make planet.xiph.org more publicly visible? (need to reduce the offtopic posts that leak through)&lt;br /&gt;
* Local blogging platform so JM / Xiphmont don&#039;t need to use livejournal— in particular having nice media support would be nice (ugh, more software to maintain) &lt;br /&gt;
* Make Media.xiph.org more reasonably structured and prettier (Apng previews? WebM renders?)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* See also the [[Bounties]] page&lt;br /&gt;
&lt;br /&gt;
[[Category:Developers stuff]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12230</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12230"/>
		<updated>2010-06-09T10:31:25Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.0 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg; it is, however, not possible to generically parse such a file, seek on it, understand what codecs are contained in such a file, and dynamically handle and play back such content.&lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search for the seek target timestamp. However, when seeking over a high-latency connection, such as the internet, such searches can be slow. Some bitstreams, notably Theora, have keyframes, and so in order to seek to a given temporal offset in a Theora stream, you must first perform a bisection search to find the target Theora frame, determine its keyframe, and then perform another bisection search to locate that keyframe and decode forwards to the temporal offset. This can be very slow. Ogg Skeleton 4.0 provides an index of keyframes, and indexes periodic samples on streams without the concept of a keyframe, so that seeking over high-latency connections can be performed with &amp;quot;one hop&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;For the current specification for the keyframe index packets see http://github.com/cpearce/OggIndex/blob/master/Skeleton-4.0-Index-Specification.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
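&lt;br /&gt;
The &amp;quot;granules / granulerate&amp;quot; mapping from the list above can be illustrated with a rough Python sketch. It assumes a Theora-style granulepos, in which the upper bits hold the granule of the last keyframe and the lower granuleshift bits count granules since then; the authoritative extraction method is the one described in [[GranulePosAndSeeking]]. The function name and the numerator/denominator form of the granule rate are illustrative choices, and the basetime is not applied.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def granulepos_to_seconds(granulepos, granuleshift, rate_num, rate_den):
    """Map a granulepos to seconds as granules / granulerate."""
    split = 2 ** granuleshift
    keyframe_granule = granulepos // split  # upper bits
    since_keyframe = granulepos % split     # lower granuleshift bits
    granules = keyframe_granule + since_keyframe
    return granules / Fraction(rate_num, rate_den)


# Hypothetical 25 Hz video stream with granuleshift 6: keyframe at
# granule 100, current frame 5 granules later.
print(granulepos_to_seconds(100 * 64 + 5, 6, 25, 1))
```
&lt;br /&gt;
With a granuleshift of 0 the function reduces to dividing the granulepos by the granule rate, which is the case for streams without sub-seekable units.&lt;br /&gt;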
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. Bisection works fine for seeking in local files, but for seeking in files served over the Internet via HTTP, each bisection or non-sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets.&lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Skeleton provides seek algorithms with an index, or ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream, so if you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set which contains, for every active stream, the last keypoint whose time is less than or equal to the seek target time. This tells you a known point on every stream which lies before the seek target. Then from that set of key points, select the key point with the smallest byte offset. You then verify that there&#039;s a page from the keypoint&#039;s stream found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
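&lt;br /&gt;
The keypoint selection described above can be sketched in Python as follows. The sketch assumes each index has already been decoded into a list of (byte offset, time in seconds) pairs sorted by time, keyed by stream serial number; the function name pick_seek_point and this data layout are hypothetical. It returns nothing when some stream has no keypoint at or before the target, in which case a caller would fall back to another seek method.&lt;br /&gt;
&lt;br /&gt;
```python
from bisect import bisect_right


def pick_seek_point(indexes, target):
    """Choose where to start decoding for a seek to target seconds.

    indexes: dict mapping serialno to a time-sorted list of
             (byte_offset, time_seconds) keypoints.
    Returns (serialno, byte_offset, time_seconds), or None if any
    stream has no keypoint at or before the target.
    """
    candidates = []
    for serialno, keypoints in indexes.items():
        times = [time for _, time in keypoints]
        i = bisect_right(times, target)  # keypoints[:i] have time not after target
        if i == 0:
            return None
        offset, time = keypoints[i - 1]  # last keypoint at or before target
        candidates.append((offset, serialno, time))
    if not candidates:
        return None
    offset, serialno, time = min(candidates)  # smallest byte offset wins
    return serialno, offset, time
```
&lt;br /&gt;
After choosing the keypoint, the caller still has to verify that a page of the returned stream starts at the returned offset before decoding from there.&lt;br /&gt;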
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
Each Skeleton 4.0 index packet also stores metadata about the segment in which it resides: the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the end time of the last active stream minus the start time of the first active stream.&lt;br /&gt;
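&lt;br /&gt;
As a small illustration of the duration calculation, the Python sketch below takes the latest end time minus the earliest start time across all tracks. It assumes each track is given as a (first sample numerator, last sample end numerator, denominator) triple with a non-zero denominator; the function name segment_duration is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def segment_duration(tracks):
    """Duration in seconds from per-track start and end times.

    tracks: list of (first_numerator, last_numerator, denominator).
    """
    starts = [Fraction(first, den) for first, _, den in tracks]
    ends = [Fraction(last, den) for _, last, den in tracks]
    return max(ends) - min(starts)


# Hypothetical video track in milliseconds and audio track in samples at 48 kHz.
print(float(segment_duration([(0, 10000, 1000), (500, 480500, 48000)])))
```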
&lt;br /&gt;
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify whether the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset, and start decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
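&lt;br /&gt;
The last two checks can be sketched as below. This is only an illustration: the function name page_matches_keypoint is hypothetical and the segment is assumed to be held in memory as bytes. The page header layout used here (the capture pattern &amp;quot;OggS&amp;quot; at the start of every page, and the little-endian stream serial number in bytes 14 to 17) comes from the Ogg framing format rather than from this page.&lt;br /&gt;

```python
def page_matches_keypoint(segment: bytes, offset: int, serialno: int) -> bool:
    """Return True if a page of stream serialno starts exactly at offset."""
    if 0 > offset:
        return False
    header = segment[offset:offset + 18]
    if len(header) != 18:
        return False  # not enough data left for a page header
    # Every Ogg page begins with the capture pattern "OggS".
    if header[0:4] != b"OggS":
        return False
    # The stream serial number occupies bytes 14..17, little-endian.
    return int.from_bytes(header[14:18], "little") == serialno
```

If this returns False for the chosen keypoint, treat the index as invalid and fall back to a bisection search.&lt;br /&gt;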
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may only index periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.0 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length in bytes                                       | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content byte offset                                           | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.0 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
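&lt;br /&gt;
The fishead layout in the diagram above can be read with a few fixed-offset slices. The sketch below is illustrative only: the function name parse_fishead and the dictionary keys are hypothetical, and it assumes that all integer fields are little-endian, that the version fields are unsigned 16-bit values, and that the time numerators and denominators are signed 64-bit values. The 20-byte UTC field is returned undecoded.&lt;br /&gt;

```python
from fractions import Fraction


def parse_fishead(packet: bytes) -> dict:
    """Decode the fixed 80-byte fishead layout shown in the diagram."""
    if 80 > len(packet) or packet[0:8] != b"fishead\x00":
        raise ValueError("not a Skeleton 4.0 fishead packet")

    def read_int(start, size, signed=False):
        # Assumption: all integer fields are stored little-endian.
        return int.from_bytes(packet[start:start + size], "little", signed=signed)

    def read_rational(start):
        numerator = read_int(start, 8, signed=True)
        denominator = read_int(start + 8, 8, signed=True)
        # A zero denominator cannot be interpreted as a time value.
        return Fraction(numerator, denominator) if denominator else None

    return {
        "version": (read_int(8, 2), read_int(10, 2)),
        "presentation_time": read_rational(12),
        "basetime": read_rational(28),
        "utc": packet[44:64],            # 20 raw bytes, left undecoded here
        "segment_length": read_int(64, 8),
        "content_offset": read_int(72, 8),
    }
```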
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | Padding/future use                            | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 52-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given, e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;. Message header fields are terminated/delimited by &amp;quot;\r\n&amp;quot;. Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility: it allows further fields to be added to the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
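&lt;br /&gt;
Splitting the message header fields is straightforward. The sketch below is a hypothetical helper (parse_message_headers is not a name from any Skeleton library); it assumes the header bytes have already been sliced out of the fisbone packet, that they are UTF-8 text, and that header names can be compared case-insensitively as in HTTP.&lt;br /&gt;

```python
def parse_message_headers(raw: bytes) -> dict:
    """Split fisbone message header fields into a name/value mapping."""
    headers = {}
    for line in raw.decode("utf-8").split("\r\n"):
        if not line:
            continue  # empty piece after the final terminator
        name, separator, value = line.partition(":")
        if not separator:
            continue  # not a "Name: value" field, so skip it
        # Lower-case the name so lookups are case-insensitive.
        headers[name.strip().lower()] = value.strip()
    return headers
```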
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.0:&lt;br /&gt;
* Content-type: mime type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each keyframe in the index is stored in a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte-encoded. The offset and timestamp deltas store the difference between the keypoint&#039;s offset and timestamp and the previous keypoint&#039;s offset and timestamp. So to calculate the page offset of a keypoint you must sum the offset deltas of all keypoints in the index up to and including that keypoint.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little-endian order. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes: 0x23 0xBD, or 0010 0011 1011 1101 in binary. The first byte holds the low seven bits (010 0011), and the last byte holds the remaining bits (011 1101) with the high bit set.&lt;br /&gt;
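&lt;br /&gt;
The encoding rule can be sketched in Python as follows. This is an illustrative sketch, not reference code: the names encode_varint and decode_varint are hypothetical, and the sketch assumes the rule exactly as stated (seven payload bits per byte, least-significant group first, high bit set only on the final byte). Multiplication and division by 128 stand in for bit shifts and masks; the two are equivalent for non-negative integers.&lt;br /&gt;

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, low-order 7-bit groups first."""
    if 0 > value:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while value >= 128:
        out.append(value % 128)   # 7 payload bits, high bit clear
        value //= 128
    out.append(value + 128)       # final byte: high bit set
    return bytes(out)


def decode_varint(data: bytes, pos: int = 0):
    """Return (value, next_pos) for the integer starting at data[pos]."""
    value = 0
    scale = 1
    while True:
        byte = data[pos]          # raises IndexError on truncated input
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte >= 128:           # high bit marks the last byte
            return value, pos
```

A decoder reading untrusted files would also want to cap the number of bytes it accepts for one integer.&lt;br /&gt;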
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;index\0&#039;                                          | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Serial number                  | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Number of keypoints            | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Timestamp denominator         | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | First sample time numerator   | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Last sample end time numerator| 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Keypoints...                   | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The fields of the index packet are as follows:&lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;. Bytes [0...5].&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9]&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0. Bytes [10...17].&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track. Bytes [34...41]&lt;br /&gt;
# &#039;n&#039; key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including it.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
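&lt;br /&gt;
Putting the field list together, an index packet could be decoded roughly as follows. The sketch is hypothetical throughout (parse_index_packet, read_varint and the dictionary keys are illustrative names) and assumes that the fixed-size integers are little-endian and that variable byte integers store their least-significant seven bits first, with the high bit set on the final byte.&lt;br /&gt;

```python
from fractions import Fraction


def read_varint(data, pos):
    """Return (value, next_pos) for a variable byte encoded integer."""
    value, scale = 0, 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte >= 128:  # high bit marks the last byte
            return value, pos


def parse_index_packet(packet: bytes) -> dict:
    """Decode one index packet into absolute keypoint offsets and times."""
    if 42 > len(packet) or packet[0:6] != b"index\x00":
        raise ValueError("not a Skeleton index packet")

    def field(start, size, signed=False):
        # Assumption: fixed-size integers are stored little-endian.
        return int.from_bytes(packet[start:start + size], "little", signed=signed)

    denominator = field(18, 8, signed=True)
    if denominator == 0:
        raise ValueError("timestamp denominator must not be 0")

    keypoints = []
    pos, offset, time_numerator = 42, 0, 0
    for _ in range(field(10, 8)):
        delta, pos = read_varint(packet, pos)
        offset += delta              # running sum of offset deltas
        delta, pos = read_varint(packet, pos)
        time_numerator += delta      # running sum of timestamp deltas
        keypoints.append((offset, Fraction(time_numerator, denominator)))

    return {
        "serialno": field(6, 4),
        "first_sample_time": Fraction(field(26, 8, signed=True), denominator),
        "last_sample_time": Fraction(field(34, 8, signed=True), denominator),
        "keypoints": keypoints,
    }
```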
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value could not be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per 64 KB of data or per 1000 ms, whichever is less frequent.&lt;br /&gt;
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
The following further restrictions on how to encapsulate Skeleton into Ogg are proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets (the fisbone and index packets)&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear.&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being supported by the following projects:&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* ffmpeg2theora (with --skeleton) &lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* OggIndex: [http://firefogg.org/nightly/ binaries], [http://github.com/cpearce/OggIndex source]&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12229</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12229"/>
		<updated>2010-06-09T09:58:30Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.1 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg; it is, however, not possible to generically parse such a file, seek on it, understand what codecs are contained in such a file, and dynamically handle and play back such content. &lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search for the seek target timestamp. However, when seeking over a high-latency connection, such as the internet, such searches can be slow. Some bitstreams, notably Theora, have keyframes, and so in order to seek to a given temporal offset in a Theora stream, you must first perform a bisection search to find the target Theora frame, determine its keyframe, and then perform another bisection search to locate that keyframe and decode forwards to the temporal offset. This can be very slow. Ogg Skeleton 4.1 provides an index of keyframes, and indexes periodic samples on streams without the concept of a keyframe, so that seeking over high-latency connections can be performed optimally with &amp;quot;one hop&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;For the current specification for the keyframe index packets see&lt;br /&gt;
http://github.com/cpearce/OggIndex/blob/master/Skeleton-4.1-Index-Specification.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
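&lt;br /&gt;
As an illustration of how the granule rate and granuleshift combine, the sketch below maps a granulepos to seconds. It is a hypothetical helper (granulepos_to_seconds is not a name from any library) and assumes the common convention in which the granule value is the upper part of the granulepos plus the value held in its lower granuleshift bits; consult [[GranulePosAndSeeking]] for the authoritative method for each codec. Basetime and basegranule are ignored here.&lt;br /&gt;

```python
from fractions import Fraction


def granulepos_to_seconds(granulepos, granuleshift, rate_numerator, rate_denominator):
    """Map a granulepos to time as granules / granulerate."""
    # Split the granulepos into its upper part and its lower
    # granuleshift bits (assumed convention), then add the two.
    upper, lower = divmod(granulepos, 2 ** granuleshift)
    granules = upper + lower
    # granulerate = rate_numerator / rate_denominator granules per second.
    return Fraction(granules * rate_denominator, rate_numerator)
```

With a granuleshift of 0 the granulepos is used directly, which is the usual case for audio streams.&lt;br /&gt;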
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. The bisection method above works fine for seeking in local files, but for seeking in files served over the Internet via HTTP, each bisection or non-sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets.&lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.1 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Skeleton provides seek algorithms with an index, or ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream, so if you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.1 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set which contains, for every active stream, the last keypoint whose time is less than or equal to the seek target time. This gives you a known point on every stream which lies before the seek target. Then, from that set of key points, select the key point with the smallest byte offset. Verify that a page from the keypoint&#039;s stream is found at exactly that offset, and if so, begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.1 index packets also store metadata about the segment in which they reside. Each index packet stores the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the end time of the last active stream minus the start time of the first active stream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.1 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify whether the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.1 BOS packet also contains the offset of the first non-header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset and start decoding from there.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
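&lt;br /&gt;
The three checks above could be combined roughly as in the following sketch. It assumes, purely for illustration, that the whole segment is available in memory as a bytes object; the function names are hypothetical. It relies only on the standard Ogg page header layout: the capture pattern OggS in bytes 0 to 3 and the little endian bitstream serial number in bytes 14 to 17.&lt;br /&gt;
&lt;br /&gt;
```python
OGG_PAGE_HEADER_MIN = 27  # fixed part of an Ogg page header, in bytes


def page_serial_at(segment: bytes, offset: int):
    """Return the serial number of the page starting exactly at offset,
    or None if no Ogg page starts there."""
    header = segment[offset:offset + OGG_PAGE_HEADER_MIN]
    if len(header) != OGG_PAGE_HEADER_MIN or header[0:4] != b"OggS":
        return None
    # The bitstream serial number occupies bytes 14..17, little endian.
    return int.from_bytes(header[14:18], "little")


def index_is_valid(segment: bytes, stored_length: int,
                   keypoint_offset: int, keypoint_serial: int) -> bool:
    """Apply the segment length check and both keypoint landing checks."""
    if len(segment) != stored_length:
        return False  # segment changed, or the file was chained
    return page_serial_at(segment, keypoint_offset) == keypoint_serial
```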
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may index only periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.1 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length in bytes                                       | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content byte offset                                           | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.1 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
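&lt;br /&gt;
The fishead layout above can be read with plain byte slicing, as in the sketch below. Two points are assumptions of this example rather than statements from the text above: all fixed-width fields are taken to be little endian, and the time numerators and denominators are treated as signed. The class and function names are hypothetical, and the 20 UTC bytes are returned uninterpreted.&lt;br /&gt;
&lt;br /&gt;
```python
from typing import NamedTuple, Tuple


class Fishead(NamedTuple):
    version: Tuple[int, int]            # (major, minor)
    presentation_time: Tuple[int, int]  # (numerator, denominator)
    basetime: Tuple[int, int]           # (numerator, denominator)
    utc: bytes                          # 20 raw bytes, left uninterpreted
    segment_length: int
    content_offset: int


def _uint(data: bytes, start: int, size: int) -> int:
    # Assumption: fields are little endian.
    return int.from_bytes(data[start:start + size], "little")


def _sint(data: bytes, start: int, size: int) -> int:
    # Assumption: time fields are signed.
    return int.from_bytes(data[start:start + size], "little", signed=True)


def parse_fishead(packet: bytes) -> Fishead:
    if not len(packet) >= 80 or packet[0:8] != b"fishead\0":
        raise ValueError("not a fishead packet with the 80 byte layout")
    return Fishead(
        version=(_uint(packet, 8, 2), _uint(packet, 10, 2)),
        presentation_time=(_sint(packet, 12, 8), _sint(packet, 20, 8)),
        basetime=(_sint(packet, 28, 8), _sint(packet, 36, 8)),
        utc=packet[44:64],
        segment_length=_uint(packet, 64, 8),
        content_offset=_uint(packet, 72, 8),
    )
```
&lt;br /&gt;
A decoder would typically inspect the version field first and refuse to interpret the remaining fields for a version it does not know.&lt;br /&gt;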
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | Padding/future use                            | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 52-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given, e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;. Message header fields are terminated/delimited by &amp;quot;\r\n&amp;quot;. Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility - to allow further fields to be included into the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.1:&lt;br /&gt;
* Content-type: mime type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
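&lt;br /&gt;
Parsing the message header fields and checking for the compulsory ones might look like the sketch below. It operates on the raw bytes of the message header region only; locating that region inside the fisbone packet is left to the caller. Treating field names as case-insensitive and decoding the text as UTF-8 are assumptions of this example, and the function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from typing import Dict, List

# Compulsory fields in Skeleton 4.1, compared in lower case (assumption:
# field names are matched case-insensitively, as with HTTP headers).
REQUIRED = ("content-type", "role", "name")


def parse_message_headers(raw: bytes) -> Dict[str, str]:
    """Split the message header region into a name to value mapping."""
    fields = {}
    for line in raw.split(b"\r\n"):
        if not line:
            continue  # skip the empty piece after the final terminator
        name, sep, value = line.partition(b":")
        if not sep:
            raise ValueError("malformed header line: %r" % line)
        # Assumption: header text is UTF-8.
        key = name.decode("utf-8").strip().lower()
        fields[key] = value.decode("utf-8").strip()
    return fields


def missing_required(fields: Dict[str, str]) -> List[str]:
    """List the compulsory fields that are absent."""
    return [name for name in REQUIRED if name not in fields]
```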
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.1 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.1 track to be considered valid. Each keyframe in the index is stored in a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte-encoded. The offset and timestamp deltas store the difference between the keypoint&#039;s offset and timestamp and the previous keypoint&#039;s offset and timestamp. So to calculate the page offset of a keypoint you must sum the offset deltas of all keypoints in the index up to and including that keypoint.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order, so the least significant 7 bits are stored first. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes: 0x23 0xBD, or 0010 0011 1011 1101 in binary.&lt;br /&gt;
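&lt;br /&gt;
A minimal sketch of this encoding follows. It assumes that the least significant 7-bit group is written first (the little endian reading of the rule) and that the high bit marks the final byte of each integer. The function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, low-order 7-bit group first."""
    if not value >= 0:
        raise ValueError("only non-negative integers can be encoded")
    out = bytearray()
    while value >= 128:
        out.append(value % 128)   # 7 payload bits, high bit clear
        value //= 128
    out.append(value + 128)       # high bit set marks the last byte
    return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one integer starting at pos; return (value, next_pos).
    Raises IndexError if the data ends before a terminating byte."""
    value = 0
    scale = 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte >= 128:           # high bit set: this was the last byte
            return value, pos
```
&lt;br /&gt;
Returning the next read position lets a caller decode a run of consecutive integers, which is how keypoints are laid out in an index packet.&lt;br /&gt;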
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;index\0&#039;                                          | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Serial number                  | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Number of keypoints            | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Timestamp denominator         | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | First sample time numerator   | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Last sample end time numerator| 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Keypoints...                   | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The fields of the index packet are as follows:&lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;. Bytes [0...5].&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9]&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0. Bytes [10...17].&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track. Bytes [34...41]&lt;br /&gt;
# &#039;n&#039; key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including it.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
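&lt;br /&gt;
Putting the field list together, an index packet could be decoded roughly as follows. The sketch assumes little endian fixed-width fields and the variable byte encoding with the low-order group first; the function names and the returned tuple shape are hypothetical. Deltas are accumulated so that each keypoint carries an absolute offset and an absolute presentation time in seconds.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction
from typing import List, Tuple


def read_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode one variable byte encoded integer; return (value, next_pos)."""
    value = 0
    scale = 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte >= 128:  # high bit set marks the last byte
            return value, pos


def parse_index_packet(packet: bytes):
    """Return (serial, first_time, last_time, keypoints), where keypoints
    is a list of (absolute_offset, presentation_time_in_seconds)."""
    if packet[0:6] != b"index\0":
        raise ValueError("not an index packet")
    serial = int.from_bytes(packet[6:10], "little")
    count = int.from_bytes(packet[10:18], "little")
    denom = int.from_bytes(packet[18:26], "little", signed=True)
    if denom == 0:
        raise ValueError("timestamp denominator must not be 0")
    first = int.from_bytes(packet[26:34], "little", signed=True)
    last = int.from_bytes(packet[34:42], "little", signed=True)
    pos = 42
    offset = 0
    numer = 0
    keypoints: List[Tuple[int, Fraction]] = []
    for _ in range(count):
        delta, pos = read_varint(packet, pos)
        offset += delta            # running sum of byte offset deltas
        delta, pos = read_varint(packet, pos)
        numer += delta             # running sum of timestamp deltas
        keypoints.append((offset, Fraction(numer, denom)))
    return serial, Fraction(first, denom), Fraction(last, denom), keypoints
```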
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value was unable to be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per 64KB of data, or per 1000ms, whichever is less frequent.&lt;br /&gt;
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets (the fisbone and index packets)&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear.&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being supported by the following projects:&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* ffmpeg2theora (with --skeleton) &lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* OggIndex: [http://firefogg.org/nightly/ binaries], [http://github.com/cpearce/OggIndex source]&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12228</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=12228"/>
		<updated>2010-06-09T09:41:50Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.0 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg, it is however not possible to generically parse such a file, seek on it, understand what codecs are contained in such a file, and dynamically handle and play back such content. &lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search for the seek target timestamp. However when seeking over a high-latency connection, such as the internet, such searches can be slow. Some bitstreams, notably Theora, have keyframes, and so in order to seek to a given temporal offset in a Theora stream, you must first perform a bisection search to find the target Theora frame, determine its keyframe, and then perform another bisection search to locate that keyframe and decode forwards to the temporal offset. This can be very slow. Ogg Skeleton 4.0 provides an index of keyframes, and indexes periodic samples on streams without the concept of a keyframe, so that seeking over high-latency connections can simply be performed optimally with &amp;quot;one hop&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;For the current specification for the keyframe index packets see&lt;br /&gt;
http://github.com/cpearce/OggIndex/blob/master/Skeleton-4.0-Index-Specification.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows producers to retain this mapping when digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
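&lt;br /&gt;
The mapping from granulepos to time described in the list above can be illustrated with the following sketch. It assumes a codec whose granulepos packs a keyframe count in the upper bits and an offset from that keyframe in the lower granuleshift bits, so that the granule value is their sum; codecs with a granuleshift of 0 use the granulepos directly. That split is an assumption of this example (the authoritative method is the one in [[GranulePosAndSeeking]]), and the function name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def granulepos_to_seconds(granulepos: int, granuleshift: int,
                          rate_numerator: int,
                          rate_denominator: int) -> Fraction:
    """Map a page granulepos to a time in seconds.

    Assumption: the upper bits count up to the last keyframe and the
    lower granuleshift bits count units since that keyframe."""
    unit = 2 ** granuleshift
    granule = (granulepos // unit) + (granulepos % unit)
    granulerate = Fraction(rate_numerator, rate_denominator)
    # time = granules / granulerate
    return Fraction(granule) / granulerate
```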
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. The bisection method above works fine for seeking in local files, but for seeking in files served over the Internet via HTTP, each bisection or non sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets.&lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Skeleton provides seek algorithms with an index, or ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative from the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream, so if you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set which contains, for every active stream, the last keypoint which has time less than or equal to the seek target time. This gives you a known point on every stream which lies at or before the seek target. Then from that set of key points, select the key point with the smallest byte offset. You then verify that there&#039;s a page from the keypoint&#039;s stream found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 index packets also store metadata about the segment in which they reside. Each index packet stores the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the latest end time of any active stream minus the earliest start time of any active stream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify whether the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset, and start decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still  correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
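&lt;br /&gt;
The last two checks can be sketched in Python as below. The sketch relies on the standard Ogg page header layout (a 27-byte fixed header that starts with the capture pattern OggS and holds the little-endian stream serial number in bytes 14 to 17). The function name is hypothetical, and holding the whole segment in memory is an assumption made to keep the example short.&lt;br /&gt;
&lt;br /&gt;
```python
def page_matches_keypoint(segment, offset, serialno):
    """Return True if an Ogg page of stream `serialno` starts at `offset`.

    segment: bytes of the Ogg segment (assumed to be fully in memory here).
    """
    header = segment[offset:offset + 27]
    if len(header) != 27 or header[0:4] != b"OggS":
        return False  # not exactly on a page boundary
    page_serialno = int.from_bytes(header[14:18], "little")
    return page_serialno == serialno  # False means a page from another stream


# A fake page header for stream 1234 placed at byte 100 of a dummy segment.
fake_page = b"OggS" + bytes(10) + (1234).to_bytes(4, "little") + bytes(9)
dummy_segment = bytes(100) + fake_page
print(page_matches_keypoint(dummy_segment, 100, 1234))
```
&lt;br /&gt;
A False result for any keypoint means the index should be treated as invalid and the seek should fall back to a bisection search.&lt;br /&gt;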
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may index only periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.0 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length in bytes                                       | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content byte offset                                           | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.0 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
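&lt;br /&gt;
The fishead layout above can be read with a few slices. The following Python sketch follows the byte ranges in the diagram. It assumes little-endian byte order for all multi-byte fields and signed integers for the time fields, neither of which is stated in the diagram itself; the function name and the dictionary keys are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_fishead(packet):
    """Split an 80-byte fishead ident header into its fields."""
    if 80 > len(packet) or packet[0:8] != b"fishead\x00":
        raise ValueError("not a Skeleton fishead packet")

    def field(start, size, signed=False):
        # Assumption: multi-byte fields are stored little endian.
        return int.from_bytes(packet[start:start + size], "little", signed=signed)

    return {
        "version": (field(8, 2), field(10, 2)),                         # bytes 8-11
        "presentation_time": (field(12, 8, True), field(20, 8, True)),  # bytes 12-27
        "basetime": (field(28, 8, True), field(36, 8, True)),           # bytes 28-43
        "utc": packet[44:64],                                           # bytes 44-63
        "segment_length": field(64, 8),                                 # bytes 64-71
        "content_offset": field(72, 8),                                 # bytes 72-79
    }


# Build a sample header: version 4.0, presentation time 7000/1000 s.
sample_fishead = (
    b"fishead\x00"
    + (4).to_bytes(2, "little") + (0).to_bytes(2, "little")
    + (7000).to_bytes(8, "little") + (1000).to_bytes(8, "little")
    + (0).to_bytes(8, "little") + (1000).to_bytes(8, "little")
    + bytes(20)
    + (123456).to_bytes(8, "little") + (4096).to_bytes(8, "little")
)
print(parse_fishead(sample_fishead)["version"])
```
&lt;br /&gt;
A decoder would check the version tuple first and only interpret the segment length and content offset fields when it reports a 4.0 track.&lt;br /&gt;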
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | Padding/future use                            | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 52-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given, e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;. Message header fields are terminated/delimited by &amp;quot;\r\n&amp;quot;. Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility - to allow further fields to be included into the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.0:&lt;br /&gt;
* Content-type: mime type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each keyframe in the index is stored in a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte encoded. Each delta is the difference between the keypoint&#039;s offset or timestamp and that of the previous keypoint. So to calculate the page offset of a keypoint you must sum the offset deltas of all keypoints up to and including that keypoint in the index.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes: 0x23 0xBD, or 0010 0011 1011 1101 in binary.&lt;br /&gt;
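&lt;br /&gt;
The encoding rules can be illustrated with a short Python sketch: 7 payload bits per byte, least significant group first, and the high bit set only on the final byte. The function names are hypothetical, and the arithmetic uses division and remainder by 128 in place of shifts and masks.&lt;br /&gt;
&lt;br /&gt;
```python
def encode_varint(value):
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    if 0 > value:
        raise ValueError("only non-negative integers can be encoded")
    out = bytearray()
    while value >= 128:
        out.append(value % 128)  # low 7 bits, high bit clear: more bytes follow
        value //= 128
    out.append(value + 128)      # remaining bits with the high bit set: last byte
    return bytes(out)


def decode_varint(data, pos=0):
    """Decode one integer starting at data[pos]; return (value, next_pos)."""
    value = 0
    scale = 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte >= 128:          # high bit set marks the final byte
            return value, pos


encoded = encode_varint(7843)
print(encoded.hex(), decode_varint(encoded))
```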
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;index\0&#039;                                          | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Serial number                  | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Number of keypoints            | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Timestamp denominator         | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | First sample time numerator   | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Last sample end time numerator| 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Keypoints...                   | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The fields of the index packet are as follows:&lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;. Bytes [0...5].&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9]&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039; as a 8 byte unsigned integer. This can be 0. Bytes [10...17].&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track. Bytes [34...41]&lt;br /&gt;
# &#039;n&#039; key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including this one.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
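&lt;br /&gt;
Putting the field list together, an index packet can be decoded roughly as in the Python sketch below. It assumes little-endian fixed-width fields, and the function names and the returned dictionary are hypothetical. The running sums turn the stored deltas back into absolute offsets and timestamps.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def read_varint(data, pos):
    """Read one variable byte encoded integer; return (value, next_pos)."""
    value = 0
    scale = 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte >= 128:  # high bit set marks the final byte
            return value, pos


def parse_index_packet(packet):
    """Decode a Skeleton 4.0 index packet into absolute keypoints."""
    if packet[0:6] != b"index\x00":
        raise ValueError("not a Skeleton index packet")
    # Assumption: fixed-width fields are little endian.
    serialno = int.from_bytes(packet[6:10], "little")
    count = int.from_bytes(packet[10:18], "little")
    denominator = int.from_bytes(packet[18:26], "little", signed=True)
    first = int.from_bytes(packet[26:34], "little", signed=True)
    last = int.from_bytes(packet[34:42], "little", signed=True)
    if denominator == 0:
        raise ValueError("timestamp denominator must not be 0")

    keypoints = []
    pos = 42
    offset = 0
    numerator = 0
    for _ in range(count):
        delta, pos = read_varint(packet, pos)
        offset += delta      # running sum gives the absolute page offset
        delta, pos = read_varint(packet, pos)
        numerator += delta   # running sum gives the timestamp numerator
        keypoints.append((offset, Fraction(numerator, denominator)))

    return {
        "serialno": serialno,
        "first_sample_time": Fraction(first, denominator),
        "last_sample_time": Fraction(last, denominator),
        "keypoints": keypoints,
    }
```
&lt;br /&gt;
The resulting keypoint list is sorted by offset, so it can be searched with a binary search when seeking.&lt;br /&gt;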
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value was unable to be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per every 64KB of data, or every 1000ms, whichever is least frequent.&lt;br /&gt;
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets (the fisbone and index packets)&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear.&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being supported by the following projects:&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* ffmpeg2theora (with --skeleton) &lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* OggIndex: [http://firefogg.org/nightly/ binaries], [http://github.com/cpearce/OggIndex source]&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11049</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11049"/>
		<updated>2010-05-09T23:11:34Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* How to describe the logical bitstreams within an Ogg container? */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.0 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg, it is however not possible to generically parse such a file, seek on it, understand what codecs are contained in such a file, and dynamically handle and play back such content. &lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search for the seek target timestamp. However when seeking over a high latency connection, such as the internet, such searches can be slow. Some bitstreams, notably Theora, have keyframes, and so in order to seek to a given temporal offset in a Theora stream, you must first perform a bisection search to find the target Theora frame, determine its keyframe, and then perform another bisection search to locate that keyframe and decode forwards to the temporal offset. This can be very slow. The Ogg Skeleton 4.0 provides an index of keyframes, and indexes periodic samples on streams without the concept of a keyframe, so that seeking over high-latency connections can simply be performed optimally with &amp;quot;one hop&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;For the current specification for the keyframe index packets see&lt;br /&gt;
http://github.com/cpearce/OggIndex/blob/master/Skeleton-4.0-Index-Specification.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
* granulepos radix: used during complex granulepos-to-time conversions, particularly in streams such as Dirac.&lt;br /&gt;
* predelay: the delay of the presentation time behind the decode time. Used in discontinuous streams such as Dirac.&lt;br /&gt;
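&lt;br /&gt;
As a rough illustration of how the granule rate and granuleshift combine, the Python sketch below converts a granulepos to seconds. It assumes the common split where the upper bits count granules up to the last keyframe and the lower granuleshift bits count granules since then; the authoritative extraction method is the one described in [[GranulePosAndSeeking]], and the function name and parameters here are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def granulepos_to_seconds(granulepos, granuleshift, rate_numerator, rate_denominator):
    """Map a granulepos to seconds as granules / granulerate.

    Assumption: upper bits hold the keyframe granule and the lower
    `granuleshift` bits hold the distance from that keyframe.
    """
    unit = 2 ** granuleshift
    granules = granulepos // unit + granulepos % unit
    # granulerate = rate_numerator / rate_denominator granules per second
    return Fraction(granules * rate_denominator, rate_numerator)


# Audio-like stream: no granuleshift, 44100 granules per second.
print(granulepos_to_seconds(88200, 0, 44100, 1))
# Video-like stream: shift of 6, keyframe at frame 100, 5 frames later, 25 fps.
print(granulepos_to_seconds(100 * 64 + 5, 6, 25, 1))
```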
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. The bisection method above works fine for seeking in local files, but for seeking in files served over the Internet via HTTP, each bisection or non sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets. &lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Skeleton provides seek algorithms with an index, or ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream, so if you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set which contains every active stream&#039;s last keypoint which has time less than or equal to the seek target time. This tells you a known point on every stream which lies before the seek target. Then from that set of key points, select the key point with the smallest byte offset. You then verify that there&#039;s a page from the keypoint&#039;s stream found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
Each Skeleton 4.0 index packet also stores metadata about the segment in which it resides: the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the latest end time of any active stream minus the earliest start time of any active stream.&lt;br /&gt;
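&lt;br /&gt;
As a small worked sketch of that duration calculation, assuming each track&#039;s first-sample numerator, last-sample numerator and timestamp denominator have already been read from its index packet (the segment_duration name and the tuple layout are hypothetical):&lt;br /&gt;

```python
from fractions import Fraction
from typing import Iterable, Tuple


def segment_duration(tracks: Iterable[Tuple[int, int, int]]) -> Fraction:
    """Each track is (first_numerator, last_numerator, denominator)."""
    starts = []
    ends = []
    for first_num, last_num, denom in tracks:
        if denom == 0:
            raise ValueError("timestamp denominator must not be 0")
        starts.append(Fraction(first_num, denom))
        ends.append(Fraction(last_num, denom))
    # Latest end time minus earliest start time, in seconds.
    return max(ends) - min(starts)
```

Using exact fractions avoids rounding when tracks use different denominators.&lt;br /&gt;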
&lt;br /&gt;
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to check whether the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non-header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset and start decoding from there.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
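&lt;br /&gt;
The three checks above could look roughly like this for a segment held in memory. The function names are hypothetical. The sketch relies only on the standard Ogg page header layout (the capture pattern &amp;quot;OggS&amp;quot; at the start of a page and the little endian stream serial number in bytes 14 to 17); a stricter check might also verify the page checksum, which is not shown here.&lt;br /&gt;

```python
def keypoint_lands_on_stream(segment: bytes, offset: int, serialno: int) -> bool:
    """Check that a page of the expected stream starts at offset."""
    if 0 > offset:
        return False
    header = segment[offset:offset + 18]
    if len(header) != 18 or header[0:4] != b"OggS":
        return False  # not on a page boundary
    # The stream serial number occupies bytes 14-17 of the page header.
    return int.from_bytes(header[14:18], "little") == serialno


def index_looks_valid(segment: bytes, stored_length: int,
                      offset: int, serialno: int) -> bool:
    """Apply the length check, then the two landing checks."""
    if len(segment) != stored_length:
        return False  # segment length differs from the Skeleton BOS value
    return keypoint_lands_on_stream(segment, offset, serialno)
```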
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may index only periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.0 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length in bytes                                       | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content byte offset                                           | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.0 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
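&lt;br /&gt;
The byte ranges in the fishead diagram above translate into a short parser. This is a sketch under stated assumptions: the fields are read as little endian integers, the time numerators and denominators are treated as signed, and the parse_fishead name and the returned dict keys are made up for the example. The 20 byte UTC field is returned undecoded.&lt;br /&gt;

```python
def parse_fishead(packet: bytes) -> dict:
    """Decode the fixed fields of a Skeleton 4.0 fishead packet."""
    if 80 > len(packet) or packet[0:8] != b"fishead\x00":
        raise ValueError("not a Skeleton 4.0 fishead packet")

    def le(start: int, end: int, signed: bool = False) -> int:
        # Little endian integer from packet[start:end] (assumed byte order).
        return int.from_bytes(packet[start:end], "little", signed=signed)

    return {
        "version": (le(8, 10), le(10, 12)),
        # Rational times are kept as (numerator, denominator) pairs.
        "presentation_time": (le(12, 20, True), le(20, 28, True)),
        "base_time": (le(28, 36, True), le(36, 44, True)),
        "utc": packet[44:64],
        "segment_length": le(64, 72),
        "content_offset": le(72, 80),
    }
```

A caller would typically check the version pair before trusting the remaining fields.&lt;br /&gt;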
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | PTS/DTS predelay              |Padding/unused | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulepos Radix                                              | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 56-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given, e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;. Message header fields are terminated/delimited by &amp;quot;\r\n&amp;quot;. Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility, to allow further fields to be included into the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.0:&lt;br /&gt;
* Content-Type: mime type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
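&lt;br /&gt;
Splitting the message header fields and checking for the three compulsory ones can be sketched as below. The function names are hypothetical, and two choices are assumptions of this example rather than statements of the specification: field names are compared case-insensitively (as HTTP does), and the bytes are decoded as UTF-8.&lt;br /&gt;

```python
REQUIRED = ("content-type", "role", "name")


def parse_message_headers(raw: bytes) -> dict:
    """Split CRLF-delimited header fields into a dict keyed by lowercase name."""
    headers = {}
    for line in raw.decode("utf-8").split("\r\n"):
        if not line:
            continue  # skip the empty piece after the final delimiter
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed header field: " + repr(line))
        headers[name.strip().lower()] = value.strip()
    return headers


def missing_required(headers: dict) -> list:
    """List the compulsory Skeleton 4.0 fields that are absent."""
    return [name for name in REQUIRED if name not in headers]
```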
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each keyframe in the index is stored in a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte encoded. The offset and timestamp deltas store the difference between the keypoint&#039;s offset and timestamp and the previous keypoint&#039;s offset and timestamp. So to calculate the page offset of a keypoint you must sum the offset deltas of all keypoints in the index up to and including that keypoint.&lt;br /&gt;
&lt;br /&gt;
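The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as the two bytes 0xBD and 0x23, or 1011 1101 0010 0011 in binary when written most significant byte first. Because the byte order is little endian, the byte 0x23 holding the low seven bits comes first in the packet, and the byte 0xBD, which has its high bit set, comes last.&lt;br /&gt;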
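&lt;br /&gt;
A minimal encoder and decoder for this scheme might look as follows. The function names are hypothetical. The sketch follows the two rules stated above literally (low-order seven bits first, high bit set on the final byte), so it writes 7843 as 0x23 followed by 0xBD in stream order; treat that stream order as this example&#039;s reading of the rules.&lt;br /&gt;

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, seven bits per byte, low bits first."""
    if 0 > value:
        raise ValueError("only non-negative integers can be encoded")
    out = bytearray()
    while True:
        low7 = value % 128          # next seven low-order bits
        value //= 128
        if value == 0:
            out.append(low7 + 128)  # high bit marks the final byte
            return bytes(out)
        out.append(low7)


def decode_varint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, position of the byte after the integer)."""
    value = 0
    scale = 1
    while True:
        byte = data[pos]  # IndexError here means the terminator is missing
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte >= 128:
            return value, pos
```

Returning the next position lets a caller decode consecutive integers, such as the offset and timestamp deltas of successive keypoints, without slicing.&lt;br /&gt;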
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;index\0&#039;                                          | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Serial number                  | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Number of keypoints            | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Timestamp denominator         | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | First sample time numerator   | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Last sample end time numerator| 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Keypoints...                   | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The fields of the index packet are as follows:&lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;. Bytes [0...5].&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9].&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0. Bytes [10...17].&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track. Bytes [34...41]&lt;br /&gt;
# &#039;n&#039; key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including this one.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
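&lt;br /&gt;
Putting the field list together, an index packet could be decoded roughly as follows. This is a sketch, not a reference implementation: the parse_index_packet name and the returned dict are hypothetical, the fixed-width fields are assumed to be little endian, and the variable byte integers are read low-order bits first with the high bit marking the final byte.&lt;br /&gt;

```python
from fractions import Fraction


def parse_index_packet(packet: bytes) -> dict:
    """Decode one Skeleton 4.0 index packet into absolute key points."""
    if 42 > len(packet) or packet[0:6] != b"index\x00":
        raise ValueError("not a Skeleton 4.0 index packet")
    serialno = int.from_bytes(packet[6:10], "little")
    count = int.from_bytes(packet[10:18], "little")
    denom = int.from_bytes(packet[18:26], "little", signed=True)
    if denom == 0:
        raise ValueError("timestamp denominator must not be 0")
    first = int.from_bytes(packet[26:34], "little", signed=True)
    last = int.from_bytes(packet[34:42], "little", signed=True)

    pos = 42
    offset = 0
    time_num = 0
    keypoints = []
    for _ in range(count):
        deltas = []
        for _field in range(2):      # offset delta, then timestamp delta
            value, scale = 0, 1
            while True:
                byte = packet[pos]   # IndexError means a truncated packet
                pos += 1
                value += (byte % 128) * scale
                scale *= 128
                if byte >= 128:      # high bit marks the final byte
                    break
            deltas.append(value)
        # Deltas accumulate into absolute offsets and timestamp numerators.
        offset += deltas[0]
        time_num += deltas[1]
        keypoints.append((offset, Fraction(time_num, denom)))
    return {
        "serialno": serialno,
        "first_time": Fraction(first, denom),
        "last_time": Fraction(last, denom),
        "keypoints": keypoints,
    }
```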
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value was unable to be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per 64KB of data, or per 2000ms, whichever is less frequent.&lt;br /&gt;
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
Further restrictions on how to encapsulate Skeleton into Ogg are proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets (the fisbone and index packets)&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear.&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being supported by the following projects:&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* ffmpeg2theora (with --skeleton) &lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* OggIndex: [http://firefogg.org/nightly/ binaries], [http://github.com/cpearce/OggIndex source]&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11048</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11048"/>
		<updated>2010-05-09T23:11:03Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.0 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg; it is, however, not possible to generically parse such a file, seek on it, understand what codecs are contained in such a file, and dynamically handle and play back such content. &lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search for the seek target timestamp. However when seeking over a high latency connection, such as the internet, such searches can be slow. Some bitstreams, notably Theora, have keyframes, and so in order to seek to a given temporal offset in a Theora stream, you must first perform a bisection search to find the target Theora frame, determine its keyframe, and then perform another bisection search to locate that keyframe and decode forwards to the temporal offset. This can be very slow. The Ogg Skeleton 4.0 provides an index of keyframes, and indexes periodic samples on streams without the concept of a keyframe, so that seeking over high-latency connections can simply be performed optimally with &amp;quot;one hop&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;For the current specification for the keyframe index packets see&lt;br /&gt;
http://github.com/cpearce/OggIndex/blob/master/Skeleton-4.0-Index-Specification.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
* the granulepos radix, used during complex granulepos-to-time conversions, particularly in streams such as Dirac.&lt;br /&gt;
* predelay: the delay of the presentation time behind the decode time. Used in discontinuous streams such as Dirac.&lt;br /&gt;
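&lt;br /&gt;
To make the granule rate and granuleshift entries concrete, the sketch below maps a granulepos to seconds as &amp;quot;granules / granulerate&amp;quot;. It is only a rough illustration: the granulepos_to_seconds name is hypothetical, and the way the granule value is extracted (upper bits plus lower bits, in the style of a keyframe shift) is an assumption that does not hold for every codec; the authoritative per-codec method is the one described in [[GranulePosAndSeeking]].&lt;br /&gt;

```python
from fractions import Fraction


def granulepos_to_seconds(granulepos: int, rate_num: int, rate_den: int,
                          granuleshift: int = 0) -> Fraction:
    """Map a granulepos to a time in seconds (simplified sketch)."""
    split = 2 ** granuleshift
    # Assumed layout: the upper bits count units up to the last keyframe,
    # the lower granuleshift bits count units since that keyframe.
    granules = granulepos // split + granulepos % split
    # granules / granulerate, with granulerate = rate_num / rate_den.
    return Fraction(granules * rate_den, rate_num)
```

With a granuleshift of 0 the granulepos is used directly as the granule value, which suits streams that simply count samples.&lt;br /&gt;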
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. The bisection method above works fine for seeking in local files, but for seeking in files served over the Internet via HTTP, each bisection or non sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets. &lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Skeleton provides seek algorithms with an index, or ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream. If you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set which contains, for every active stream, the last keypoint whose time is less than or equal to the seek target time. This gives you a known point on every stream which lies at or before the seek target. Then, from that set of key points, select the key point with the smallest byte offset. You then verify that there&#039;s a page from the keypoint&#039;s stream found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
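&lt;br /&gt;
The keypoint selection described above can be sketched in Python as follows. This is a minimal illustration, not part of the specification: the function name choose_seek_keypoint, the dictionary layout and the sample serial numbers, offsets and times are all assumptions made for the example, and each index is assumed to be already decoded into absolute (offset, time) pairs sorted in increasing order.&lt;br /&gt;
&lt;br /&gt;
```python
from bisect import bisect_right


def choose_seek_keypoint(indexes, target):
    """Pick the keypoint to start decoding from for a seek to `target`.

    `indexes` (hypothetical layout) maps a stream serial number to a list of
    (offset, time) tuples sorted in increasing order; `target` is in seconds.
    Returns (offset, time, serialno), or None if the indexes cannot be used.
    """
    candidates = []
    for serialno, keypoints in indexes.items():
        times = [time for (_offset, time) in keypoints]
        # Number of keypoints whose time is less than or equal to the target.
        count = bisect_right(times, target)
        if count == 0:
            # No keypoint at or before the target on this stream, so the
            # caller should fall back to another seek method.
            return None
        offset, time = keypoints[count - 1]
        candidates.append((offset, time, serialno))
    if not candidates:
        return None
    # The smallest byte offset lies before the chosen keyframe of every stream.
    return min(candidates)


# Hypothetical decoded indexes for one video and one audio stream.
indexes = {
    1001: [(0, 0.0), (40000, 2.0), (90000, 4.0)],
    1002: [(500, 0.0), (38000, 1.9), (88000, 3.9)],
}
print(choose_seek_keypoint(indexes, 3.0))
```
&lt;br /&gt;
A caller would still need to confirm that a page of the chosen stream starts at exactly the returned offset before it begins decoding.&lt;br /&gt;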
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 index packets also store metadata about the segment in which they reside. Each index packet stores the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the end time of the last active stream minus the start time of the first active stream.&lt;br /&gt;
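&lt;br /&gt;
As a small worked example of the duration calculation, the sketch below uses exact fractions. The function name segment_duration and the tuple layout are assumptions for illustration. It also takes the last active stream to be the one with the greatest end time, and the first active stream to be the one with the smallest start time.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def segment_duration(tracks):
    """Return the duration in seconds as an exact Fraction.

    `tracks` (hypothetical layout) holds one tuple per index packet:
    (first_sample_numerator, last_sample_numerator, denominator).
    """
    starts = [Fraction(first, denom) for (first, _last, denom) in tracks]
    ends = [Fraction(last, denom) for (_first, last, denom) in tracks]
    # Latest end time minus earliest start time across all indexed tracks.
    return max(ends) - min(starts)


# Two hypothetical tracks with times in milliseconds (denominator 1000).
print(segment_duration([(0, 10000, 1000), (23, 9980, 1000)]))
```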
&lt;br /&gt;
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify that the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset, and start decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may index only periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.0 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length in bytes                                       | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content byte offset                                           | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.0 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | PTS/DTS predelay              |Padding/unused | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulepos Radix                                              | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 56-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given, e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;. Message header fields are terminated/delimited by &amp;quot;\r\n&amp;quot;. Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility, so that further fields can be added to the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
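&lt;br /&gt;
The message header fields described above could be split apart along the following lines. This is only a sketch: the function name parse_message_headers and the sample header values are made up for the example, and decoding as UTF-8 and lower-casing the field names (as is customary for HTTP headers) are assumptions rather than statements from this page.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_message_headers(data):
    """Split the message header area of a fisbone packet into a dict.

    `data` is the byte string that starts at the offset to the message
    header fields. Field names are lower-cased (an assumption of this sketch).
    """
    headers = {}
    for line in data.split(b"\r\n"):
        if not line:
            continue  # empty piece after the final terminator
        name, separator, value = line.partition(b":")
        if not separator:
            continue  # not a "Name: value" line; ignored in this sketch
        key = name.decode("utf-8").strip().lower()
        headers[key] = value.decode("utf-8").strip()
    return headers


# Hypothetical header area for an audio track.
sample = b"Content-Type: audio/vorbis\r\nRole: audio/main\r\nName: audio_1\r\n"
print(parse_message_headers(sample))
```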
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.0:&lt;br /&gt;
* Content-type: mime type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each keyframe in the index is stored in a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte encoded. The offset and timestamp deltas store the difference between the keypoint&#039;s offset and timestamp and the previous keypoint&#039;s offset and timestamp. So to calculate the page offset of a keypoint you must sum the offset deltas of all keypoints in the index up to and including that keypoint.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes: 0xBD 0x23, or 1011 1101 0010 0011 in binary.&lt;br /&gt;
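&lt;br /&gt;
A decoder and encoder for this variable byte encoding might look like the sketch below. The names read_varint and write_varint are hypothetical. The sketch assumes that the lowest-order 7-bit group comes first in the packet, so the byte with the high bit set is read last. Under that assumption, the two byte values from the example above (0x23 and 0xBD) appear in the packet as 0x23 followed by 0xBD. Arithmetic on multiples of 128 is used in place of bit masks and shifts, which is equivalent.&lt;br /&gt;
&lt;br /&gt;
```python
def read_varint(data, pos=0):
    """Decode one variable byte encoded integer from `data` at `pos`.

    Returns (value, position of the next unread byte). Raises IndexError
    if the data ends before a byte with the high bit set is found.
    """
    value = 0
    scale = 1  # 128 raised to the number of groups read so far
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale  # the low 7 bits carry the payload
        scale *= 128
        if byte // 128 == 1:  # high bit set: this was the last byte
            return value, pos


def write_varint(value):
    """Encode a non-negative integer, setting the high bit on the last byte."""
    out = []
    while True:
        group = value % 128
        value //= 128
        if value == 0:
            out.append(group + 128)
            return bytes(out)
        out.append(group)


print(read_varint(bytes([0x23, 0xBD])))
print(write_varint(7843).hex())
```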
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;index\0&#039;                                          | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Serial number                  | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Number of keypoints            | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Timestamp denominator         | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | First sample time numerator   | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           | Last sample end time numerator| 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                                                           | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...                           |Keypoints...                   | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The fields of the index packet are as follows:&lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;. Bytes [0...5].&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9].&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0. Bytes [10...17].&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track. Bytes [34...41]&lt;br /&gt;
# &#039;n&#039; key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including this one.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
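&lt;br /&gt;
Putting the field layout and the delta encoding together, an index packet could be parsed roughly as follows. This is a sketch under stated assumptions: the function name parse_index_packet, the returned dictionary and the sample packet are invented for illustration, the fixed-width fields are assumed to be little endian, and the variable byte integers are assumed to store their lowest-order 7-bit group first.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_index_packet(packet):
    """Parse one Skeleton 4.0 index packet into a dict (illustrative only)."""
    if packet[0:6] != b"index\x00":
        raise ValueError("not an index packet")

    def field(start, size, signed):
        # Fixed-width integer, assumed little endian.
        return int.from_bytes(packet[start:start + size], "little", signed=signed)

    serialno = field(6, 4, False)
    count = field(10, 8, False)
    denominator = field(18, 8, True)
    first_sample = field(26, 8, True)
    last_sample = field(34, 8, True)

    pos = 42  # the first keypoint starts at byte 42

    def varint():
        # Variable byte integer: 7 payload bits per byte, high bit on the last.
        nonlocal pos
        value, scale = 0, 1
        while True:
            byte = packet[pos]
            pos += 1
            value += (byte % 128) * scale
            scale *= 128
            if byte // 128 == 1:
                return value

    keypoints = []
    offset = 0
    numerator = 0
    for _ in range(count):
        offset += varint()     # running sum of byte offset deltas
        numerator += varint()  # running sum of timestamp numerator deltas
        keypoints.append((offset, numerator))

    return {
        "serialno": serialno,
        "denominator": denominator,
        "first_sample": first_sample,
        "last_sample": last_sample,
        "keypoints": keypoints,
    }


# Hypothetical packet: serialno 1234, two keypoints, times in milliseconds.
example = (
    b"index\x00"
    + (1234).to_bytes(4, "little")
    + (2).to_bytes(8, "little")
    + (1000).to_bytes(8, "little", signed=True)
    + (0).to_bytes(8, "little", signed=True)
    + (5000).to_bytes(8, "little", signed=True)
    + bytes([0x23, 0xBD, 0x80, 0xE4, 0x50, 0x8F])
)
print(parse_index_packet(example)["keypoints"])
```
&lt;br /&gt;
Dividing each summed numerator by the packet&#039;s denominator gives the keyframe&#039;s presentation time in seconds.&lt;br /&gt;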
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value was unable to be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per 64KB of data, or per 2000ms, whichever is less frequent.&lt;br /&gt;
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
Further restrictions on how to encapsulate Skeleton into Ogg are proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets (the fisbone and index packets)&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear.&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being supported by the following projects:&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* ffmpeg2theora (with --skeleton) &lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* OggIndex: [http://firefogg.org/nightly/ binaries], [http://github.com/cpearce/OggIndex source]&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11044</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11044"/>
		<updated>2010-05-07T05:47:16Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* Ogg Skeleton version 4.1 Format Specification */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.0 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format for time-continuous data streams, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg, it is however not possible to generically parse such a file, seek on it, understand what codecs are contained in such a file, and dynamically handle and play back such content. &lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;For the current specification for the keyframe index packets see&lt;br /&gt;
http://github.com/cpearce/OggIndex/blob/master/Skeleton-4.0-Index-Specification.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
* the granulepos radix, used during complex granulepos-to-time conversions, particularly in streams such as Dirac.&lt;br /&gt;
* predelay: the delay of the presentation time behind the decode time. Used in discontinuous streams such as Dirac.&lt;br /&gt;
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. This bisection method works fine for seeking in local files, but for files served over the Internet via HTTP, each bisection step or non-sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between the pages of a spanning packet. &lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Ogg index bitstream provides seek algorithms with an ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream, so if you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction of seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set containing, for each active stream, the last keypoint whose time is less than or equal to the seek target time. This gives you a known point on every stream which lies at or before the seek target. Then, from that set of key points, select the key point with the smallest byte offset. You then verify that a page from the keypoint&#039;s stream is found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
&lt;br /&gt;
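The selection step described above can be sketched in Python as follows. This is only an illustration: the function name choose_seek_offset and the dictionary layout are assumptions made for this example and are not part of the Skeleton format. Each index is assumed to have been decoded already into absolute (offset, time) pairs sorted by time.&lt;br /&gt;

```python
from bisect import bisect_right
from fractions import Fraction


def choose_seek_offset(indexes, target):
    """Return the byte offset to start decoding from, or None.

    indexes: dict mapping serialno to a list of (offset, time) key points,
             sorted by time (hypothetical in-memory layout).
    target:  seek target time in seconds.
    """
    candidates = []
    for serialno, keypoints in indexes.items():
        times = [t for _, t in keypoints]
        # Index of the last key point whose time does not exceed the target.
        i = bisect_right(times, target) - 1
        if i == -1:
            # No key point at or before the target on this stream:
            # the index cannot be used, so the caller should bisect.
            return None
        candidates.append(keypoints[i])
    if not candidates:
        return None
    # Start from the earliest of those key points in the file.
    return min(offset for offset, _ in candidates)


indexes = {
    1: [(4000, Fraction(0)), (90000, Fraction(5)), (200000, Fraction(10))],
    2: [(4500, Fraction(0)), (60000, Fraction(4)), (150000, Fraction(8))],
}
print(choose_seek_offset(indexes, Fraction(9)))
```

With the sample data, stream 1 contributes the key point at 5 seconds and stream 2 the one at 8 seconds, so decoding would start from offset 90000. A caller would still need to verify the page found at that offset before trusting the result.&lt;br /&gt;
&lt;br /&gt;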
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 index packets also store metadata about the segment in which they reside. Each index packet stores the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the end time of the last active stream minus the start time of the first active stream.&lt;br /&gt;
&lt;br /&gt;
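As a small worked example of the duration calculation, the sketch below takes the first-sample and last-sample numerators and the denominator of each track and subtracts the earliest start from the latest end. The function name and the tuple layout are assumptions made for illustration only.&lt;br /&gt;

```python
from fractions import Fraction


def segment_duration(tracks):
    """tracks: iterable of (first_num, last_num, denom) per indexed track
    (hypothetical tuple layout mirroring the index packet fields)."""
    tracks = list(tracks)
    starts = [Fraction(first, denom) for first, _, denom in tracks]
    ends = [Fraction(last, denom) for _, last, denom in tracks]
    # Latest end time minus earliest start time, in seconds.
    return max(ends) - min(starts)


tracks = [(0, 60000, 1000), (23, 59980, 1000)]
print(float(segment_duration(tracks)))
```

Exact fractions are used so that tracks with different denominators can be compared without rounding.&lt;br /&gt;
&lt;br /&gt;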
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify whether the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non-header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset and start decoding from there.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still  correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
&lt;br /&gt;
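The checks listed above could be sketched as follows. The sketch relies on the Ogg page header layout, in which every page begins with the four-byte capture pattern OggS and carries its bitstream serial number as a little-endian integer in bytes 14 to 17. The function names are hypothetical, and the whole segment is assumed to be available in memory as a single unchained segment, which a real player reading over the network would not do.&lt;br /&gt;

```python
def page_matches_keypoint(data, offset, serialno):
    """Check that an Ogg page of stream `serialno` starts at `offset`.

    data: bytes-like view of the segment (hypothetical; a real player
          would read just the page header from disk or the network).
    """
    header = bytes(data[offset:offset + 18])
    if len(header) != 18:
        return False
    # Every Ogg page begins with the capture pattern "OggS".
    if header[0:4] != b"OggS":
        return False
    # The bitstream serial number is bytes 14-17, little-endian.
    return int.from_bytes(header[14:18], "little") == serialno


def index_is_valid(data, stored_length, offset, serialno):
    # stored_length is the segment length from the Skeleton BOS packet.
    if len(data) != stored_length:
        return False
    return page_matches_keypoint(data, offset, serialno)


# A fake 27-byte page header for stream 1234, placed at offset 5.
page = b"OggS" + bytes(10) + (1234).to_bytes(4, "little") + bytes(9)
data = bytes(5) + page
print(index_is_valid(data, len(data), 5, 1234))
```

If any check fails, the index should be discarded and the seek should fall back to bisection.&lt;br /&gt;
&lt;br /&gt;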
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may index only periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.0 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length in bytes                                       | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content byte offset                                           | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.0 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
&lt;br /&gt;
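The byte layout in the fishead diagram can be read with a few slices, as in the sketch below. It assumes that all integer fields are little-endian and that the time numerators and denominators are signed; the function name and the returned dictionary keys are made up for this example. The UTC field is returned as raw bytes because its internal format is not described here.&lt;br /&gt;

```python
def parse_fishead(packet):
    """Parse the 80-byte fishead ident header shown in the diagram.

    Integer fields are assumed to be little-endian.
    """
    if len(packet) != 80 or packet[0:8] != b"fishead\x00":
        raise ValueError("not an 80-byte fishead packet")

    def u(start, end):
        return int.from_bytes(packet[start:end], "little")

    def s(start, end):
        return int.from_bytes(packet[start:end], "little", signed=True)

    return {
        "version": (u(8, 10), u(10, 12)),
        # (numerator, denominator) pairs, kept raw in case a denominator is 0.
        "presentation_time": (s(12, 20), s(20, 28)),
        "basetime": (s(28, 36), s(36, 44)),
        "utc": bytes(packet[44:64]),
        "segment_length": u(64, 72),
        "content_offset": u(72, 80),
    }


def le(value, size):
    return value.to_bytes(size, "little")


packet = (b"fishead\x00" + le(4, 2) + le(0, 2)
          + le(0, 8) + le(1000, 8) + le(0, 8) + le(1000, 8)
          + bytes(20) + le(123456, 8) + le(4321, 8))
print(parse_fishead(packet)["segment_length"])
```

A decoder would normally inspect the version pair first and stop parsing if it does not recognise the major version.&lt;br /&gt;
&lt;br /&gt;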
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | PTS/DTS predelay              |Padding/unused | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulepos Radix                                              | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 56-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given (e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;, terminated/delimited by &amp;quot;\r\n&amp;quot;). Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility, to allow further fields to be included in the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.0:&lt;br /&gt;
* Content-type: mime-type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each key point in the index is stored as a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte-encoded. Each delta is the difference between the keypoint&#039;s offset or timestamp and that of the previous keypoint. So to calculate the page offset of a keypoint you must sum the offset deltas up to and including that keypoint in the index.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes: 0xBD 0x23, or 1011 1101 0010 0011 in binary.&lt;br /&gt;
&lt;br /&gt;
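A minimal sketch of this encoding is given below. It assumes that little endian means the least significant 7-bit group is written first, so that the two bytes of the 7843 example appear in the stream as 0x23 followed by 0xBD, with the high bit marking 0xBD as the final byte. If a given implementation orders the bytes differently, the loops would need to be adapted. The function names are hypothetical.&lt;br /&gt;

```python
def encode_varint(n):
    """Encode a non-negative integer, 7 bits per byte, low-order group
    first, with the high bit set only on the final byte."""
    if abs(n) != n:
        raise ValueError("n must be non-negative")
    out = bytearray()
    while True:
        n, low = divmod(n, 128)
        if n == 0:
            out.append(low + 128)  # terminator bit on the last byte
            return bytes(out)
        out.append(low)


def decode_varint(data, pos=0):
    """Return (value, next_pos) for the integer starting at data[pos]."""
    value = 0
    scale = 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte >= 128:  # high bit set: this was the last byte
            return value, pos


print(encode_varint(7843).hex())
print(decode_varint(bytes([0x23, 0xBD])))
```

The decoder returns the position of the next unread byte, so consecutive deltas in a keypoint list can be read by calling it repeatedly.&lt;br /&gt;
&lt;br /&gt;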
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;index\0&#039;                                          | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  |Serial number                  | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  |Number of keypoints            | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                                                  | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  | Timestamp denominator         | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  | First sample time numerator   | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                                                  | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  | Last sample end time numerator| 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                                                  | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  |Keypoints...                   | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The fields of the index packet are as follows:&lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;. Bytes [0...5].&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9].&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0. Bytes [10...17].&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track. Bytes [34...41]&lt;br /&gt;
# &#039;n&#039; key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all keypoints up to and including this one.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
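The field list above can be sketched in two steps: reading the fixed-size fields in bytes 0 to 41, and turning decoded deltas into absolute key points with running sums. The sketch assumes little-endian integers and takes the keypoint deltas as already decoded pairs. All function and key names are hypothetical.&lt;br /&gt;

```python
from fractions import Fraction
from itertools import accumulate


def parse_index_header(packet):
    """Read the fixed-size fields of an index packet (bytes 0-41).

    Integer fields are assumed to be little-endian.
    """
    if packet[0:6] != b"index\x00":
        raise ValueError("not a Skeleton index packet")

    def field(start, end, signed=False):
        return int.from_bytes(packet[start:end], "little", signed=signed)

    return {
        "serialno": field(6, 10),
        "num_keypoints": field(10, 18),
        "denominator": field(18, 26, signed=True),
        "first_sample_num": field(26, 34, signed=True),
        "last_sample_num": field(34, 42, signed=True),
    }


def resolve_keypoints(deltas, denominator):
    """Turn decoded (offset_delta, time_delta) pairs into absolute
    (byte_offset, presentation_time) key points using running sums."""
    offsets = accumulate(d[0] for d in deltas)
    numerators = accumulate(d[1] for d in deltas)
    return [(o, Fraction(t, denominator)) for o, t in zip(offsets, numerators)]


deltas = [(4000, 0), (86000, 5000), (110000, 5000)]
print(resolve_keypoints(deltas, 1000))
```

With a denominator of 1000, the three sample deltas resolve to key points at offsets 4000, 90000 and 200000, with presentation times of 0, 5 and 10 seconds.&lt;br /&gt;
&lt;br /&gt;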
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value was unable to be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per every 64KB of data, or every 2000ms, whichever is least frequent.&lt;br /&gt;
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
Further restrictions on how to encapsulate Skeleton into Ogg are proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being supported by the following projects:&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* ffmpeg2theora (with --skeleton) &lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11042</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11042"/>
		<updated>2010-05-07T05:38:23Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.0 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format for time-continuous data streams, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg; it is, however, not possible to generically parse such a file, seek on it, understand what codecs are contained in such a file, and dynamically handle and play back such content. &lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;For the current specification for the keyframe index packets see&lt;br /&gt;
http://github.com/cpearce/OggIndex/blob/master/Skeleton-4.0-Index-Specification.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
* the granulepos radix, used during complex granulepos-to-time conversions, particularly in streams such as Dirac.&lt;br /&gt;
* predelay: the delay of the presentation time behind the decode time. Used in discontinuous streams such as Dirac.&lt;br /&gt;
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. Bisection works fine for seeking in local files, but for files served over the Internet via HTTP, each bisection step or non-sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets. &lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Ogg index bitstream provides seek algorithms with an ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream. If you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set which contains, for every active stream, the last keypoint whose time is less than or equal to the seek target time. This gives you a known point on every stream which lies before the seek target. Then, from that set of key points, select the key point with the smallest byte offset. You then verify that there&#039;s a page from the keypoint&#039;s stream found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
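&lt;br /&gt;
As a rough illustration, the keypoint selection described above can be sketched in Python. This is only a sketch under stated assumptions, not a reference implementation: the function name choose_seek_offset and the in-memory layout (a dict mapping each stream serialno to a list of (offset, time) pairs sorted by offset) are hypothetical. The caller still has to verify the page found at the returned offset, and fall back to bisection if that check fails or if the function returns None.&lt;br /&gt;
&lt;br /&gt;
```python
from bisect import bisect_right


def choose_seek_offset(indexes, target):
    """Pick the byte offset to start decoding from for a seek to `target`.

    `indexes` (hypothetical layout) maps a stream serialno to a list of
    (offset, time) keypoints, sorted by offset and therefore by time.
    Returns (serialno, offset), or None if the index cannot serve the seek.
    """
    candidates = []
    for serialno, keypoints in indexes.items():
        times = [time for _, time in keypoints]
        # Number of keypoints whose time is not after the target.
        count = bisect_right(times, target)
        if count == 0:
            return None  # this stream has no keypoint at or before target
        offset, _ = keypoints[count - 1]
        candidates.append((offset, serialno))
    if not candidates:
        return None
    # The smallest offset lies at or before the chosen keypoint of every stream.
    offset, serialno = min(candidates)
    return serialno, offset


indexes = {
    1: [(100, 0.0), (5000, 2.0), (9000, 4.0)],
    2: [(200, 0.0), (4000, 1.5), (9500, 3.5)],
}
print(choose_seek_offset(indexes, 3.0))
```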
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 index packets also store metadata about the segment in which they reside. Each index packet stores the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the latest end time of any active stream minus the earliest start time of any active stream.&lt;br /&gt;
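&lt;br /&gt;
The duration calculation can be sketched as follows. The function name segment_duration and the tuple layout are assumptions made for this example; the three values per track stand for the first-sample-time numerator, the last-sample-time numerator and the timestamp denominator read from that track&#039;s index packet.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def segment_duration(tracks):
    """Duration in seconds from per-track index packet fields.

    `tracks` is a list of (first_numerator, last_numerator, denominator)
    tuples, one per active stream (a hypothetical layout for this sketch).
    """
    starts = [Fraction(first, den) for first, _, den in tracks]
    ends = [Fraction(last, den) for _, last, den in tracks]
    # Latest end time minus earliest start time, as exact rationals.
    return max(ends) - min(starts)


# Two tracks with different denominators: 1000/1000..10000/1000 and 24/25..250/25.
example_tracks = [(1000, 10000, 1000), (24, 250, 25)]
print(segment_duration(example_tracks))
```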
&lt;br /&gt;
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify whether the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non-header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset, and start decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
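&lt;br /&gt;
The three checks in this list could be sketched like this. The sketch assumes the whole segment is available in memory as a bytes object, which a network player would not normally have, and the function names are hypothetical. The &amp;quot;OggS&amp;quot; capture pattern and the position of the serial number (bytes 14 to 17 of the page header, little endian) come from the Ogg framing format rather than from this page. The page checksum is not verified here.&lt;br /&gt;
&lt;br /&gt;
```python
def page_of_stream_starts_at(segment, offset, serialno):
    """True if an Ogg page belonging to `serialno` starts at `offset`."""
    if 0 > offset:
        return False
    header = segment[offset:offset + 18]
    # An Ogg page starts with the capture pattern "OggS"; the little-endian
    # bitstream serial number occupies bytes 14..17 of the page header.
    if len(header) != 18 or header[0:4] != b"OggS":
        return False
    return int.from_bytes(header[14:18], "little") == serialno


def index_looks_valid(segment, segment_length, keypoint_offset, serialno):
    """Apply the three checks from the list to one keypoint."""
    if len(segment) != segment_length:
        return False  # segment does not end where the BOS packet says
    return page_of_stream_starts_at(segment, keypoint_offset, serialno)


# A fake, empty page of stream 77 placed after four bytes of other data.
page = b"OggS" + bytes(10) + (77).to_bytes(4, "little") + bytes(9)
segment = b"\x00\x00\x00\x00" + page
print(index_looks_valid(segment, len(segment), 4, 77))
```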
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may index only periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.1 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length                                                | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content offset                                                | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.1 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
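&lt;br /&gt;
To make the layout concrete, here is a hedged Python sketch that decodes the 80 bytes shown in the diagram. The byte offsets are taken from the diagram; the function and key names are hypothetical, and the sketch assumes that fixed-width fields are little endian, that numerators and denominators are signed, and that the two lengths are unsigned. A denominator of 0 is reported as None rather than guessed at.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def parse_fishead(packet):
    """Decode the 80-byte fishead layout from the diagram."""
    if 80 > len(packet) or packet[0:8] != b"fishead\x00":
        raise ValueError("not a Skeleton fishead packet")

    def unsigned(start, size):
        # Assumption: fixed-width fields are little endian.
        return int.from_bytes(packet[start:start + size], "little")

    def rational(start):
        # Assumption: numerator and denominator are signed 64-bit values.
        num = int.from_bytes(packet[start:start + 8], "little", signed=True)
        den = int.from_bytes(packet[start + 8:start + 16], "little", signed=True)
        return Fraction(num, den) if den != 0 else None

    return {
        "version": (unsigned(8, 2), unsigned(10, 2)),
        "presentation_time": rational(12),
        "basetime": rational(28),
        "utc": bytes(packet[44:64]),
        "segment_length": unsigned(64, 8),
        "content_offset": unsigned(72, 8),
    }


def le(value, size):
    """Helper for the example: encode an integer as little-endian bytes."""
    return value.to_bytes(size, "little")


example = (b"fishead\x00" + le(4, 2) + le(0, 2)
           + le(7000, 8) + le(1000, 8)      # presentation time 7000/1000
           + le(0, 8) + le(1000, 8)         # basetime 0/1000
           + bytes(20)                      # UTC left empty
           + le(123456, 8) + le(4096, 8))   # segment length, content offset
header = parse_fishead(example)
print(header["version"], header["presentation_time"], header["segment_length"])
```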
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | PTS/DTS predelay              |Padding/unused | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulepos Radix                                              | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 56-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given (e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;, terminated/delimited by &amp;quot;\r\n&amp;quot;). Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility, to allow further fields to be added to the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.1:&lt;br /&gt;
* Content-type: mime-type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
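&lt;br /&gt;
A minimal sketch of reading the message header fields and checking for the compulsory ones might look like this. The function names are hypothetical, and two things are assumptions rather than statements from this page: that the fields are UTF-8 text, and that header names can be compared case-insensitively (the page writes both &amp;quot;Content-Type&amp;quot; and &amp;quot;Content-type&amp;quot;).&lt;br /&gt;
&lt;br /&gt;
```python
COMPULSORY = ("content-type", "role", "name")


def parse_message_headers(raw):
    """Parse HTTP-style header fields into a dict with lower-cased names."""
    headers = {}
    # Assumption: the fields are UTF-8 text.
    for line in raw.decode("utf-8").split("\r\n"):
        if not line:
            continue  # empty piece after the final "\r\n"
        name, separator, value = line.partition(":")
        if not separator:
            raise ValueError("malformed header line: " + repr(line))
        headers[name.strip().lower()] = value.strip()
    return headers


def missing_compulsory(headers):
    """Return the compulsory header names that are absent."""
    return [name for name in COMPULSORY if name not in headers]


raw = b"Content-Type: audio/vorbis\r\nRole: audio/main\r\nName: audio_1\r\n"
headers = parse_message_headers(raw)
print(headers["content-type"], missing_compulsory(headers))
```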
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each keyframe in the index is stored as a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte-encoded. The offset and timestamp deltas store the difference between the keypoint&#039;s offset and timestamp and the previous keypoint&#039;s offset and timestamp. So to calculate the page offset of a keypoint you must sum the offset deltas of all keypoints up to and including that keypoint in the index.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes, 0x23 followed by 0xBD (0010 0011 1011 1101 in binary): the first byte holds the low seven bits 010 0011, and the last byte holds the next seven bits 011 1101 with its high bit set.&lt;br /&gt;
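&lt;br /&gt;
A small Python sketch may make the encoding rule more concrete. It reads &amp;quot;little endian&amp;quot; as meaning that the least significant 7-bit group is written first, and it follows the stated rule that the high bit marks the last byte; the function names are hypothetical. Under that reading, 7843 is written as the byte 0x23 followed by the byte 0xBD.&lt;br /&gt;
&lt;br /&gt;
```python
def encode_varint(value):
    """Encode a non-negative integer, least significant 7-bit group first."""
    if 0 > value:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        group = value % 128           # low 7 bits of what is left
        value //= 128
        if value == 0:
            out.append(group + 128)   # high bit marks the last byte
            return bytes(out)
        out.append(group)


def decode_varint(data, pos=0):
    """Return (value, next_pos) for the integer starting at data[pos]."""
    value, scale = 0, 1
    while True:
        byte = data[pos]              # IndexError if the data is truncated
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte >= 128:               # high bit set: this was the last byte
            return value, pos


encoded = encode_varint(7843)
print(encoded.hex(), decode_varint(encoded))
```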
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;index\0&#039;                                          | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  |Serial number                  | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  |Number of keypoints            | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                                                  | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  | Timestamp denominator         | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  | First sample time numerator   | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                                                  | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  | Last sample end time numerator| 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                                                  | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | ...continued                  |Keypoints...                   | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The fields of the index packet are as follows:&lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;. Bytes [0...5].&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9]&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0. Bytes [10...17].&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track. Bytes [34...41]&lt;br /&gt;
# &#039;n&#039; key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including it.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
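&lt;br /&gt;
Putting the field list together, an index packet could be decoded roughly as follows. This is a sketch, not the reference parser: the function names are hypothetical, the fixed-width fields are assumed to be little endian, and the variable byte encoded integers are read least significant group first, with the high bit marking the last byte. The running sums turn the stored deltas back into absolute offsets and timestamps.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def read_varint(data, pos):
    """Read one variable byte encoded integer; return (value, next_pos)."""
    value, scale = 0, 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale  # low 7 bits carry the payload
        scale *= 128
        if byte >= 128:                # high bit set marks the last byte
            return value, pos


def parse_index_packet(packet):
    """Decode an index packet into absolute keypoint offsets and times."""
    if packet[0:6] != b"index\x00":
        raise ValueError("not a Skeleton index packet")

    def fixed(start, size, signed):
        # Assumption: fixed-width fields are little endian.
        return int.from_bytes(packet[start:start + size], "little", signed=signed)

    serialno = fixed(6, 4, False)
    count = fixed(10, 8, False)
    denominator = fixed(18, 8, True)
    if denominator == 0:
        raise ValueError("timestamp denominator must not be 0")
    first_time = Fraction(fixed(26, 8, True), denominator)
    last_time = Fraction(fixed(34, 8, True), denominator)

    keypoints = []
    pos, offset, numerator = 42, 0, 0
    for _ in range(count):
        delta, pos = read_varint(packet, pos)
        offset += delta                      # running sum of offset deltas
        delta, pos = read_varint(packet, pos)
        numerator += delta                   # running sum of time deltas
        keypoints.append((offset, Fraction(numerator, denominator)))
    return serialno, first_time, last_time, keypoints


example = (b"index\x00" + (1234).to_bytes(4, "little")
           + (2).to_bytes(8, "little") + (1000).to_bytes(8, "little")
           + (0).to_bytes(8, "little") + (5000).to_bytes(8, "little")
           + bytes([44, 130, 128])           # offset delta 300, time delta 0
           + bytes([0x23, 0xBD, 80, 143]))   # offset delta 7843, time delta 2000
print(parse_index_packet(example))
```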
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value was unable to be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per every 64KB of data, or every 2000ms, whichever is least frequent. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
Further restrictions on how to encapsulate Skeleton into Ogg are proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being supported by the following projects:&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* ffmpeg2theora (with --skeleton) &lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11041</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11041"/>
		<updated>2010-05-07T05:13:11Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.0 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format for time-continuous data streams, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file can generally be created with Ogg; it is, however, not possible to generically parse such a file, seek on it, understand what codecs it contains, or dynamically handle and play back such content. &lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;For the current specification for the keyframe index packets see&lt;br /&gt;
http://github.com/cpearce/OggIndex/blob/master/Skeleton-4.0-Index-Specification.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
* the granulepos radix, used during complex granulepos-to-time conversions, particularly in streams such as Dirac.&lt;br /&gt;
* predelay: the delay of the presentation time behind the decode time. Used in discontinuous streams such as Dirac.&lt;br /&gt;
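&lt;br /&gt;
As an illustration of how the granule rate and granuleshift fields from this list fit together, here is a hedged Python sketch. The &amp;quot;granules / granulerate&amp;quot; step is taken from the list; the way the granulepos is split (upper bits plus lower bits, as is commonly described for Theora-style streams) is an assumption of this example, and the authoritative method is the one described in [[GranulePosAndSeeking]]. The function name is hypothetical, and basetime, basegranule and preroll are ignored.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction


def granulepos_to_seconds(granulepos, granuleshift, rate_num, rate_den):
    """Map a granulepos to seconds using the fisbone fields.

    Assumption: the granule count is the upper bits of granulepos plus the
    lower `granuleshift` bits (with granuleshift 0 it is granulepos itself).
    """
    upper, lower = divmod(granulepos, 2 ** granuleshift)
    granules = upper + lower
    # granules / granulerate, with granulerate = rate_num / rate_den
    return Fraction(granules * rate_den, rate_num)


# An audio-like stream: no granuleshift, 48000 granules per second.
print(granulepos_to_seconds(96000, 0, 48000, 1))
# A video-like stream: shift of 6, 25 granules per second,
# keyframe at granule 100 plus 5 granules since that keyframe.
print(granulepos_to_seconds(100 * 64 + 5, 6, 25, 1))
```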
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. This bisection method works fine for seeking in local files, but for files served over the Internet via HTTP, each bisection step or non-sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets. &lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Ogg index bitstream provides seek algorithms with an ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream, so if you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set which contains every active stream&#039;s last keypoint which has time less than or equal to the seek target time. This gives you a known point on every stream which lies before the seek target. Then from that set of key points, select the key point with the smallest byte offset. You then verify that there&#039;s a page from the keypoint&#039;s stream found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
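&lt;br /&gt;
The selection step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a reference implementation: the KeyPoint tuple, the select_seek_keypoint function and the dictionary layout (serial number mapped to a sorted list of key points) are names introduced here, and times are plain seconds rather than fractions.&lt;br /&gt;

```python
from bisect import bisect_right
from collections import namedtuple

# Hypothetical in-memory form of one decoded key point:
# byte offset from the segment start and presentation time in seconds.
KeyPoint = namedtuple("KeyPoint", ["offset", "time"])


def select_seek_keypoint(indexes, target):
    """Return (serialno, KeyPoint) to start decoding from, or None.

    indexes maps each active stream serial number to its key points,
    sorted by increasing offset and time. None means the caller should
    fall back to a bisection search, because some stream has no key
    point at or before the target (or there are no indexes at all).
    """
    candidates = []
    for serialno, keypoints in indexes.items():
        times = [kp.time for kp in keypoints]
        # Number of key points whose time is not after the target.
        count = bisect_right(times, target)
        if count == 0:
            return None
        candidates.append((serialno, keypoints[count - 1]))
    if not candidates:
        return None
    # Start at the smallest byte offset so every stream passes its keyframe.
    return min(candidates, key=lambda item: item[1].offset)
```

The caller would still need to verify that a page of the returned stream starts at the returned offset before trusting the result.&lt;br /&gt;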
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 index packets also store metadata about the segment in which they reside. Each index packet stores the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the end time of the last active stream minus the start time of the first active stream.&lt;br /&gt;
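&lt;br /&gt;
As a small illustration of the duration calculation, the sketch below takes one (first sample time, last sample time) pair per active track, in seconds. The function name segment_duration and the input shape are assumptions made for this example, and it reads &amp;quot;last active stream&amp;quot; as the stream that ends latest and &amp;quot;first active stream&amp;quot; as the stream that starts earliest.&lt;br /&gt;

```python
def segment_duration(tracks):
    """Return the duration in seconds, or None if there are no tracks.

    tracks is an iterable of (first_sample_time, last_sample_time)
    pairs, one per active track, as read from the index packets.
    """
    tracks = list(tracks)
    if not tracks:
        return None
    latest_end = max(last for _, last in tracks)
    earliest_start = min(first for first, _ in tracks)
    return latest_end - earliest_start
```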
&lt;br /&gt;
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify whether the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non-header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset, and start decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
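&lt;br /&gt;
The last two checks can be sketched as below. The function name page_matches_keypoint is an assumption introduced here, and the page header layout (the &amp;quot;OggS&amp;quot; capture pattern at bytes 0-3 and the little-endian stream serial number at bytes 14-17) comes from the Ogg framing specification rather than from this page.&lt;br /&gt;

```python
def page_matches_keypoint(data, offset, serialno):
    """Check that an Ogg page of the given stream starts exactly at offset.

    data is the segment as a bytes object and offset is the byte
    offset stored in the key point.
    """
    header = data[offset:offset + 18]
    # A page must begin with the capture pattern, and enough of the
    # header must be present to read the serial number.
    if len(header) != 18 or header[:4] != b"OggS":
        return False
    return int.from_bytes(header[14:18], "little") == serialno
```

A stricter check could also validate the page checksum, since the four capture bytes can occur by chance inside packet data.&lt;br /&gt;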
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may index only periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.1 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length                                                | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content offset                                                | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.1 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
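&lt;br /&gt;
Reading the fixed fields of the fishead packet at the byte positions shown in the diagram might look like the sketch below. It assumes that all integer fields are little-endian and that the time numerators and denominators are signed, neither of which is stated in the diagram itself; the function name parse_fishead and the dictionary keys are illustrative only, and the UTC field is returned as raw bytes.&lt;br /&gt;

```python
from fractions import Fraction


def parse_fishead(packet):
    """Decode the fixed-size fields of an 80-byte fishead packet."""
    if 80 > len(packet) or packet[:8] != b"fishead\0":
        raise ValueError("not a Skeleton 4 fishead packet")

    def field(start, size, signed=False):
        # Assumption: every integer field is stored little-endian.
        return int.from_bytes(packet[start:start + size], "little", signed=signed)

    def rational(start):
        numerator = field(start, 8, signed=True)
        denominator = field(start + 8, 8, signed=True)
        # A zero denominator is treated here as "no time given".
        return Fraction(numerator, denominator) if denominator != 0 else None

    return {
        "version": (field(8, 2), field(10, 2)),
        "presentation_time": rational(12),
        "basetime": rational(28),
        "utc": packet[44:64],
        "segment_length": field(64, 8),
        "content_offset": field(72, 8),
    }
```

With a denominator of 1000, a presentation time numerator of 7000 would decode to exactly 7 seconds.&lt;br /&gt;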
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | PTS/DTS predelay              |Padding/unused | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulepos Radix                                              | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 56-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given (e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;, terminated/delimited by &amp;quot;\r\n&amp;quot;). Further meta information (such as language and screen size) are also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility - to allow further fields to be included into the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.1:&lt;br /&gt;
* Content-type: mime-type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
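&lt;br /&gt;
A parser for the message header fields could be sketched as follows. The function names and the choice to lower-case field names are assumptions of this example (this page writes both &amp;quot;Content-Type&amp;quot; and &amp;quot;Content-type&amp;quot;), and the input is taken to be just the message header bytes of one fisbone packet.&lt;br /&gt;

```python
REQUIRED_HEADERS = ("content-type", "role", "name")


def parse_message_headers(raw):
    """Split CRLF-delimited "Name: value" fields into a dictionary."""
    headers = {}
    for line in raw.decode("utf-8").split("\r\n"):
        name, separator, value = line.partition(":")
        if separator:
            # Field names are lower-cased so lookups ignore case.
            headers[name.strip().lower()] = value.strip()
    return headers


def missing_required_headers(headers):
    """List the compulsory fields that are absent from headers."""
    return [name for name in REQUIRED_HEADERS if name not in headers]
```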
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each entry in the index is stored as a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte-encoded. Each delta is the difference between the keypoint&#039;s offset or timestamp and that of the previous keypoint. So to calculate the page offset of a keypoint, you must sum the offset deltas of all keypoints up to and including that keypoint in the index.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order, so the least significant 7 bits are stored first. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes: 0x23 0xBD, or 0010 0011 1011 1101 in binary.&lt;br /&gt;
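&lt;br /&gt;
The encoding rule can be illustrated with the sketch below, which stores the low-order 7 bits first and sets the high bit only on the final byte. The function names are assumptions of this example. Read this way, 7843 encodes as the bytes 0x23 0xBD.&lt;br /&gt;

```python
def encode_varint(value):
    """Encode a non-negative integer, low 7 bits first."""
    if 0 > value:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while value >= 128:
        out.append(value % 128)  # 7 payload bits, high bit clear
        value //= 128
    out.append(value + 128)  # high bit marks the final byte
    return bytes(out)


def decode_varint(data, pos=0):
    """Return (value, next_pos) for the integer starting at data[pos]."""
    value = 0
    scale = 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        if byte >= 128:  # high bit set: this was the last byte
            return value, pos
        scale *= 128
```

Truncated input makes decode_varint raise IndexError, which a caller can treat as a corrupt index.&lt;br /&gt;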
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field.&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0.&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0.&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track.&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track.&lt;br /&gt;
# &#039;n&#039; key points, each of which contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including this one.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
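&lt;br /&gt;
Putting the field list together, an index packet could be decoded roughly as in the sketch below. It assumes the fixed-size fields are little-endian and packed back to back in the order listed (so the key points begin at byte 42), and it rejects a zero denominator as the field list requires. The function names and the returned tuple are illustrative only.&lt;br /&gt;

```python
from fractions import Fraction


def read_varint(data, pos):
    """Read one variable byte encoded integer; return (value, next_pos)."""
    value, scale = 0, 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        if byte >= 128:  # high bit set on the final byte
            return value, pos
        scale *= 128


def parse_index_packet(packet):
    """Return (serialno, first_time, last_time, keypoints).

    keypoints is a list of (absolute byte offset, time in seconds).
    """
    if packet[:6] != b"index\0":
        raise ValueError("not a Skeleton index packet")

    def field(start, size, signed=False):
        return int.from_bytes(packet[start:start + size], "little", signed=signed)

    serialno = field(6, 4)
    count = field(10, 8)
    denominator = field(18, 8, signed=True)
    if denominator == 0:
        raise ValueError("timestamp denominator must not be 0")
    first_time = Fraction(field(26, 8, signed=True), denominator)
    last_time = Fraction(field(34, 8, signed=True), denominator)

    keypoints = []
    offset = 0
    numerator = 0
    pos = 42
    for _ in range(count):
        delta, pos = read_varint(packet, pos)
        offset += delta  # running sum gives the absolute page offset
        delta, pos = read_varint(packet, pos)
        numerator += delta  # running sum gives the timestamp numerator
        keypoints.append((offset, Fraction(numerator, denominator)))
    return serialno, first_time, last_time, keypoints
```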
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value was unable to be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per every 64KB of data, or every 2000ms, whichever is least frequent. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being supported by the following projects:&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* ffmpeg2theora (with --skeleton) &lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11040</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11040"/>
		<updated>2010-05-07T05:10:52Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. Skeleton version 4.0 also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the Internet.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format for time-continuous data streams, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg, it is however not possible to generically parse such a file, seek on it, understand what codecs are contained in such a file, and dynamically handle and play back such content. &lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;For the current specification for the keyframe index packets see&lt;br /&gt;
http://github.com/cpearce/OggIndex/blob/master/Skeleton-4.0-Index-Specification.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (vorbis has generally 2, speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
* the granulepos radix, used during complex granulepos-to-time conversions, particularly in streams such as Dirac.&lt;br /&gt;
* predelay: the delay of the presentation time behind the decode time. Used in discontinuous streams such as Dirac.&lt;br /&gt;
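&lt;br /&gt;
As a rough illustration of how the granule rate and granuleshift combine, the sketch below converts a granulepos to seconds. It assumes a Theora-style mapping in which the upper bits hold the keyframe number and the lower granuleshift bits hold the count of frames since that keyframe; other codecs may map granulepos differently (see [[GranulePosAndSeeking]]), and basetime and basegranule are ignored. The function name is hypothetical.&lt;br /&gt;

```python
from fractions import Fraction


def granulepos_to_seconds(granulepos, granuleshift, rate_num, rate_den):
    """Map a granulepos to seconds as granules / granulerate."""
    if rate_num == 0:
        raise ValueError("granule rate numerator must not be 0")
    low_range = 2 ** granuleshift
    # Upper bits: granule of the last keyframe; lower bits: units since it.
    granules = granulepos // low_range + granulepos % low_range
    # The granule rate is rate_num / rate_den granules per second.
    return Fraction(granules * rate_den, rate_num)
```

With a granuleshift of 0 the granulepos is used directly, so 48000 granules at a rate of 48000/1 map to one second.&lt;br /&gt;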
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 4.1 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Segment length                                                | 64-67&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 68-71&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Content offset                                                | 72-75&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 76-79&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 4.1 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
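&lt;br /&gt;
As a rough illustration, the 80-byte fishead layout drawn above could be decoded as follows. This is only a sketch: it assumes all integers are little-endian, that the time fields are signed and the length fields unsigned, and that the UTC field is an opaque 20-byte string. The function name and dictionary keys are invented for this example.&lt;br /&gt;

```python
from fractions import Fraction


def parse_fishead(packet):
    """Decode a Skeleton 4 fishead packet into a dict (illustrative only)."""
    if packet[:8] != b"fishead\0" or len(packet[:80]) != 80:
        raise ValueError("not an 80-byte fishead packet")

    def le(start, size, signed=False):
        # Assumption: every integer field is stored little-endian.
        return int.from_bytes(packet[start:start + size], "little", signed=signed)

    def rational(start):
        # Numerator and denominator are consecutive 8-byte fields.
        num, den = le(start, 8, signed=True), le(start + 8, 8, signed=True)
        return Fraction(num, den) if den != 0 else None

    return {
        "version": (le(8, 2), le(10, 2)),
        "presentation_time": rational(12),  # seconds
        "base_time": rational(28),          # seconds
        "utc": packet[44:64],
        "segment_length": le(64, 8),        # bytes
        "content_offset": le(72, 8),        # bytes
    }
```

A zero denominator is reported as None here rather than raising, so that a caller can decide how to treat an unspecified time.&lt;br /&gt;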
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | PTS/DTS predelay              |Padding/unused | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulepos Radix                                              | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 56-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The mime type is provided as a message header field specified in the same way that HTTP header fields are given (e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;, terminated/delimited by &amp;quot;\r\n&amp;quot;). Further meta information (such as language and screen size) are also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility - to allow further fields to be included into the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
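&lt;br /&gt;
Because the message header fields follow the HTTP convention, they can be split with ordinary string handling. The sketch below is not taken from any existing library; the function name is hypothetical, and treating field names as case-insensitive (as HTTP does) is an assumption.&lt;br /&gt;

```python
def parse_message_headers(raw):
    """Split fisbone message header bytes into a dict of name/value strings."""
    headers = {}
    for line in raw.decode("utf-8", errors="replace").split("\r\n"):
        if ":" not in line:
            continue  # skip the empty trailing piece and malformed lines
        name, value = line.split(":", 1)
        # Assumption: names compare case-insensitively, as in HTTP.
        headers[name.strip().lower()] = value.strip()
    return headers
```

The input is expected to be the bytes of the packet starting at the message header fields, with the fixed binary fields already stripped.&lt;br /&gt;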
&lt;br /&gt;
The following message headers are compulsory in Skeleton 4.1:&lt;br /&gt;
* Content-type: mime-type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.&lt;br /&gt;
* Role: describes the function of this track. Common examples are &amp;quot;video/main&amp;quot;, &amp;quot;audio/main&amp;quot;, &amp;quot;text/caption&amp;quot;. For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.&lt;br /&gt;
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.&lt;br /&gt;
&lt;br /&gt;
For more message headers, see [[SkeletonHeaders]].&lt;br /&gt;
&lt;br /&gt;
=== Keyframe indexes for faster seeking ===&lt;br /&gt;
&lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. Bisection works fine for seeking in local files, but when a file is served over the Internet via HTTP, each bisection step or non-sequential read can trigger a new HTTP request with very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets.&lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
Because all the Skeleton track&#039;s index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Ogg index bitstream provides seek algorithms with an ordered table of &amp;quot;key points&amp;quot;. A key point is intrinsically associated with exactly one stream, and stores the offset, o, of the last page which lies before all data required to decode the keyframe, as well as the presentation time of the keyframe t, as a fraction of seconds.&lt;br /&gt;
&lt;br /&gt;
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream. If you seek to a keypoint&#039;s offset and don&#039;t find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe&#039;s presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set which contains, for every active stream, the last keypoint whose time is less than or equal to the seek target time. This gives you a known point on every stream which lies before the seek target. Then from that set of key points, select the key point with the smallest byte offset. You then verify that there&#039;s a page from the keypoint&#039;s stream found at exactly that offset, and if so, you can begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don&#039;t encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot; to the seek target.&lt;br /&gt;
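&lt;br /&gt;
The keypoint selection described above can be sketched as follows. The data layout (a dict from serialno to a time-sorted list of offset/time pairs) and the function name are assumptions made for this example, and the page verification step is left to the caller.&lt;br /&gt;

```python
from bisect import bisect_right


def choose_seek_keypoint(indexes, target):
    """Pick where to start decoding for a seek to `target` seconds.

    `indexes` maps serialno to a list of (offset, time) keypoints sorted by time.
    Returns (serialno, offset, time), or None when some stream has no keypoint
    at or before the target, in which case the caller should fall back to bisection.
    """
    candidates = []
    for serialno, keypoints in indexes.items():
        times = [time for _, time in keypoints]
        count = bisect_right(times, target)  # keypoints at or before the target
        if count == 0:
            return None
        offset, time = keypoints[count - 1]
        candidates.append((offset, serialno, time))
    if not candidates:
        return None
    # Start at the smallest offset so that every stream passes its keyframe.
    offset, serialno, time = min(candidates)
    return serialno, offset, time
```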
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or other seek algorithm when the index is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 index packets also store metadata about the segment in which they reside: each index packet stores the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference (which is the duration). With the index packets storing the start and end times of every track, you can calculate the duration as the end time of the last active stream minus the start time of the first active stream.&lt;br /&gt;
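&lt;br /&gt;
A minimal sketch of that duration calculation, assuming the first and last sample times of every active track are known and are supplied as numerator, numerator, denominator triples (the function name and input shape are hypothetical):&lt;br /&gt;

```python
from fractions import Fraction


def segment_duration(tracks):
    """Return the duration in seconds as an exact fraction.

    `tracks` is a sequence of (first_num, last_num, denominator) triples,
    one per active stream, as read from the index packets.
    """
    tracks = list(tracks)
    starts = [Fraction(first, den) for first, _, den in tracks]
    ends = [Fraction(last, den) for _, last, den in tracks]
    # Latest end time minus earliest start time across all streams.
    return max(ends) - min(starts)
```

Using exact fractions avoids rounding differences between tracks that use different denominators.&lt;br /&gt;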
&lt;br /&gt;
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment&#039;s index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify whether the index is valid. If the contents of the segment have changed, it&#039;s highly likely that the length of the segment has changed as well. When you load the segment&#039;s header pages, you should check the length of the physical segment, and if it doesn&#039;t match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 4.0 BOS packet also contains the offset of the first non header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset, and start decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
* The segment doesn&#039;t end at the segment length offset stored in the Skeleton BOS packet (note that a new &amp;quot;link&amp;quot; in a &amp;quot;chain&amp;quot; can start at the end of the segment), or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
* after a seek to a keypoint&#039;s offset, you don&#039;t land on a page which belongs to that keypoint&#039;s stream.&lt;br /&gt;
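&lt;br /&gt;
The last two checks can be sketched against the standard Ogg page header, where a page begins with the capture pattern &amp;quot;OggS&amp;quot; and carries its little-endian serial number in bytes 14 to 17 of the 27-byte fixed header. The function name is hypothetical, the segment is assumed to be held in memory as bytes, and the segment length check is left out.&lt;br /&gt;

```python
def keypoint_looks_valid(segment, offset, serialno):
    """Check that a page of stream `serialno` starts exactly at `offset`."""
    header = segment[offset:offset + 27]  # fixed part of an Ogg page header
    if len(header) != 27 or header[:4] != b"OggS":
        return False  # the offset is not on a page boundary
    # The page must belong to the stream that the keypoint indexes.
    return int.from_bytes(header[14:18], "little") == serialno
```

A False result means the index should be treated as invalid and a bisection search used instead.&lt;br /&gt;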
&lt;br /&gt;
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may only index periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each entry in the index is stored as a &amp;quot;keypoint&amp;quot;, which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte-encoded. The offset and timestamp deltas store the difference between the keypoint&#039;s offset and timestamp and the previous keypoint&#039;s offset and timestamp. So to calculate the page offset of a keypoint you must sum the offset deltas of all keypoints up to and including that keypoint in the index.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to store the integer&#039;s bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian byte order. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes: 0x23 0xBD, or 0010 0011 1011 1101 in binary. The low seven bits (010 0011) come first, followed by the remaining bits (011 1101) with the high bit set to mark the last byte.&lt;br /&gt;
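&lt;br /&gt;
Taking the rule literally (seven payload bits per byte, least significant group first, high bit set only on the final byte), an encoder and decoder might look like the sketch below. The function names are invented for this example, and arithmetic with 128 is used in place of bit masks and shifts.&lt;br /&gt;

```python
def encode_varint(value):
    """Encode a non-negative integer, low-order 7-bit groups first."""
    if value != abs(value):
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        value, low = divmod(value, 128)  # peel off the low seven bits
        if value == 0:
            out.append(low + 128)  # high bit marks the final byte
            return bytes(out)
        out.append(low)


def decode_varint(data, pos=0):
    """Return (value, next_pos) for the integer starting at data[pos]."""
    value, scale = 0, 1
    while True:
        byte = data[pos]  # raises IndexError on truncated input
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte // 128 == 1:  # high bit set: this was the last byte
            return value, pos
```

For instance, decode_varint(encode_varint(7843)) returns (7843, 2), the second element being the position of the next unread byte.&lt;br /&gt;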
&lt;br /&gt;
Each index packet contains the following: &lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field.&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039;, as an 8 byte unsigned integer. This can be 0.&lt;br /&gt;
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0.&lt;br /&gt;
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track.&lt;br /&gt;
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track.&lt;br /&gt;
# &#039;n&#039; key points, each of which contains, in the following order:&lt;br /&gt;
** the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including the keypoint itself.&lt;br /&gt;
** the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by presentation time as well).&lt;br /&gt;
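&lt;br /&gt;
Putting the field list together, an index packet could be read roughly as follows. This is a sketch under stated assumptions rather than a reference parser: the fixed-width integers are assumed to be little-endian, the variable byte integers are read low-order group first with the high bit marking the final byte, and the function name and return shape are hypothetical.&lt;br /&gt;

```python
from fractions import Fraction


def parse_index_packet(packet):
    """Return (serialno, first_time, last_time, keypoints) from an index packet."""
    if packet[:6] != b"index\0":
        raise ValueError("not an index packet")

    def le(start, size, signed=False):
        # Assumption: fixed-width fields are little-endian.
        return int.from_bytes(packet[start:start + size], "little", signed=signed)

    serialno = le(6, 4)
    count = le(10, 8)
    denominator = le(18, 8, signed=True)
    if denominator == 0:
        raise ValueError("timestamp denominator must not be 0")
    first_num = le(26, 8, signed=True)
    last_num = le(34, 8, signed=True)

    pos = 42  # first byte after the fixed-width fields

    def read_varint():
        nonlocal pos
        value, scale = 0, 1
        while True:
            byte = packet[pos]
            pos += 1
            value += (byte % 128) * scale
            scale *= 128
            if byte // 128 == 1:  # high bit marks the last byte
                return value

    offset, time_num, keypoints = 0, 0, []
    for _ in range(count):
        offset += read_varint()    # byte offset delta
        time_num += read_varint()  # timestamp numerator delta
        keypoints.append((offset, Fraction(time_num, denominator)))
    return (serialno, Fraction(first_num, denominator),
            Fraction(last_num, denominator), keypoints)
```

The running sums turn the stored deltas back into absolute offsets (relative to the start of the segment) and presentation times in seconds.&lt;br /&gt;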
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value was unable to be determined at indexing time, and is unknown.&lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per 64KB of data or per 2000ms, whichever is less frequent.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further restrictions === &lt;br /&gt;
&lt;br /&gt;
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being supported by the following projects:&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* ffmpeg2theora (with --skeleton) &lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11038</id>
		<title>Ogg Skeleton 4</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&amp;diff=11038"/>
		<updated>2010-04-27T01:01:45Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: Created page with &amp;#039;{{draft}}  &amp;#039;&amp;#039;&amp;#039;Ogg Skeleton&amp;#039;&amp;#039;&amp;#039; provides structuring information for multitrack Ogg files. It is compatible with Ogg Theora and provides extra clues for synchronization and…&amp;#039;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Ogg Skeleton&#039;&#039;&#039; provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection.&lt;br /&gt;
&lt;br /&gt;
Ogg is a generic container format for time-continuous data streams, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file is generally possible to create with Ogg, it is however not possible to generically parse such a file, seek on it, understand what codecs are contained in such a file, and dynamically handle and play back such content. &lt;br /&gt;
&lt;br /&gt;
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don&#039;t have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the &amp;quot;file&amp;quot; command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. &lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice to continue to start this cut out file with a playback time of 7 seconds and not of 0. This is of particular interest if you&#039;re streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .&lt;br /&gt;
&lt;br /&gt;
== Specification ==&lt;br /&gt;
&lt;br /&gt;
This is a motivation and design sketch.&lt;br /&gt;
&#039;&#039;&#039;For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== How to describe the logical bitstreams within an Ogg container? ===&lt;br /&gt;
&lt;br /&gt;
The following information about a logical bitstream is of interest to contain as meta information in the Skeleton:&lt;br /&gt;
* the serial number: it identifies a content track&lt;br /&gt;
* the mime type: it identifies the content type&lt;br /&gt;
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width&lt;br /&gt;
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream&lt;br /&gt;
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating &amp;quot;granules / granulerate&amp;quot;.&lt;br /&gt;
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (Vorbis generally has 2, Speex 3)&lt;br /&gt;
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in Theora)&lt;br /&gt;
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour and thus adding this additional field allows them to retain this mapping on digitizing their content&lt;br /&gt;
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time allowing to remember e.g. the recording or broadcast time of some content&lt;br /&gt;
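&lt;br /&gt;
To make the granule rate and granuleshift items concrete, the sketch below maps a granulepos to seconds. It assumes the Theora-style convention in which the upper bits hold the keyframe granule and the lower granuleshift bits hold an offset that is added to it; other codecs may extract granules differently (see [[GranulePosAndSeeking]]). The function name is hypothetical, and basetime is not applied.&lt;br /&gt;

```python
from fractions import Fraction


def granulepos_to_seconds(granulepos, granuleshift, rate_num, rate_den):
    """Map a granulepos to seconds using the granule rate rate_num/rate_den."""
    # Split into the upper (keyframe) part and the lower granuleshift bits.
    keyframe, offset = divmod(granulepos, 2 ** granuleshift)
    granules = keyframe + offset  # assumption: Theora-style sum of both parts
    # time = granules / granulerate, with granulerate = rate_num / rate_den
    return Fraction(granules * rate_den, rate_num)
```

With a granuleshift of 0 the granulepos is used as the granule count directly, which matches codecs such as Vorbis where the granulepos is a sample count.&lt;br /&gt;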
&lt;br /&gt;
=== How to allow the creation of substreams from an Ogg physical bitstream? ===&lt;br /&gt;
&lt;br /&gt;
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate for the given cut-in time and thus there may be some surplus information necessary to be remuxed into the new bitstream.&lt;br /&gt;
&lt;br /&gt;
The following information is necessary to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:&lt;br /&gt;
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)&lt;br /&gt;
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream&lt;br /&gt;
&lt;br /&gt;
=== Ogg Skeleton version 3.0 Format Specification ===&lt;br /&gt;
&lt;br /&gt;
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.&lt;br /&gt;
&lt;br /&gt;
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.&lt;br /&gt;
The first 8 bytes provide the magic identifier &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of &amp;quot;fisbone\0&amp;quot;. The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.&lt;br /&gt;
&lt;br /&gt;
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fishead\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Version major                 | Version minor                 | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime numerator                                    | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Presentationtime denominator                                  | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime numerator                                            | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basetime denominator                                          | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | UTC                                                           | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 52-55&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 56-59&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 60-63&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The version fields provide version information for the Skeleton track, currently being 3.0 (the number having evolved within the Annodex project).&lt;br /&gt;
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The fisbone secondary header packet looks as follows:&lt;br /&gt;
&lt;br /&gt;
  0                   1                   2                   3&lt;br /&gt;
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Identifier &#039;fisbone\0&#039;                                        | 0-3&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 4-7&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Offset to message header fields                               | 8-11&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Serial number                                                 | 12-15&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Number of header packets                                      | 16-19&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate numerator                                         | 20-23&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 24-27&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granulerate denominator                                       | 28-31&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 32-35&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Basegranule                                                   | 36-39&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 |                                                               | 40-43&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Preroll                                                       | 44-47&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Granuleshift  | Padding/future use                            | 48-51&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
 | Message header fields ...                                     | 52-&lt;br /&gt;
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+&lt;br /&gt;
&lt;br /&gt;
The MIME type is provided as a message header field specified in the same way that HTTP header fields are given (e.g. &amp;quot;Content-Type: audio/vorbis&amp;quot;). Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility: it allows further fields to be added to the packet without disrupting the message header field parsing.&lt;br /&gt;
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.&lt;br /&gt;
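&lt;br /&gt;
As a rough illustration of the layout in the diagram, the following Python sketch reads the fixed fields of a fisbone packet and splits the message header fields into name/value pairs. It is a sketch under stated assumptions, not a reference parser: all integers are assumed to be little endian and are read as unsigned, the message header fields are taken to start at byte 52 as drawn in the diagram (a forward-compatible parser would locate them through the offset field, whose reference point is not stated in this section), and the names read_uint and parse_fisbone are made up for this example.&lt;br /&gt;

```python
def read_uint(data, start, size):
    """Read an unsigned little-endian integer (byte order is an assumption)."""
    return int.from_bytes(data[start:start + size], "little")


def parse_fisbone(packet):
    """Parse the fixed part of a fisbone packet plus its message header fields."""
    # packet[51:52] is empty when fewer than 52 bytes are present.
    if not packet[51:52]:
        raise ValueError("fisbone packet shorter than the 52 byte fixed part")
    if packet[0:8] != b"fisbone\x00":
        raise ValueError("missing fisbone identifier")
    fields = {
        "message_header_offset": read_uint(packet, 8, 4),
        "serialno": read_uint(packet, 12, 4),
        "num_header_packets": read_uint(packet, 16, 4),
        "granulerate_numerator": read_uint(packet, 20, 8),
        "granulerate_denominator": read_uint(packet, 28, 8),
        "basegranule": read_uint(packet, 36, 8),
        "preroll": read_uint(packet, 44, 4),
        "granuleshift": packet[48],
    }
    # Message header fields look like HTTP headers, e.g. "Content-Type: audio/vorbis".
    # Assumption: they begin at byte 52, as in the diagram.
    headers = {}
    text = packet[52:].decode("utf-8", errors="replace")
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip()] = value.strip()
    fields["message_headers"] = headers
    return fields
```

The result is a plain dictionary, so a caller can check the granule rate denominator for zero before dividing by it.&lt;br /&gt;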
&lt;br /&gt;
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:&lt;br /&gt;
* there can only be one Skeleton logical bitstream in an Ogg bitstream.&lt;br /&gt;
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don&#039;t get confused about it being e.g. Ogg Vorbis without this meta information&lt;br /&gt;
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)&lt;br /&gt;
* the secondary header pages of all logical bitstreams come next, including Skeleton&#039;s secondary header packets&lt;br /&gt;
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
&lt;br /&gt;
Ogg Skeleton is being supported by the following projects:&lt;br /&gt;
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]&lt;br /&gt;
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]&lt;br /&gt;
* the Annodex technology: [http://www.annodex.net/ annodex.net]&lt;br /&gt;
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)&lt;br /&gt;
* ffmpeg2theora (with --skeleton) &lt;br /&gt;
* speexenc (with --skeleton) &amp;amp; speexdec&lt;br /&gt;
* many more ...&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]&lt;br /&gt;
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Ogg]]&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Summer_of_Code_2010&amp;diff=10856</id>
		<title>Summer of Code 2010</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Summer_of_Code_2010&amp;diff=10856"/>
		<updated>2010-03-14T20:26:49Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* OggIndex */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This is our ideas page for [http://code.google.com/soc/ Google Summer of Code] projects with [http://xiph.org Xiph.org] and [http://annodex.org/ Annodex]. The two projects participate jointly this year under Xiph&#039;s name.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Students&#039;&#039;&#039; please use the template at [[Summer of Code Applications]] when applying for a GSoC position.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentors&#039;&#039;&#039; please visit [[Summer of Code Mentoring]] and help us prepare our application as a mentoring organization.&lt;br /&gt;
&lt;br /&gt;
== Current Ideas ==&lt;br /&gt;
&lt;br /&gt;
=== OggIndex ===&lt;br /&gt;
OggIndex has recently been introduced and adds a keyframe index to the Ogg Skeleton track. Support needs to be added to many existing open source applications, such as MPlayer, VLC, etc, so that they can take advantage of the keyframe index when seeking. For more info see [[OggIndex-Migration]], [[Ogg_Index]], and [http://blog.pearce.org.nz/2010/01/indexing-keyframes-in-ogg-videos-for.html Indexing keyframes in Ogg videos for fast seeking]. Mentor: Chris Pearce&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10795</id>
		<title>Ogg Index</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10795"/>
		<updated>2010-01-27T01:45:02Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
= Ogg Skeleton 3.3 with Keyframe Index =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;DRAFT, last updated 27 January 2010&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;This specification is still a work in progress, and does not yet constitute an official Ogg track format.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
 &lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search &lt;br /&gt;
over the pages in the file. The Ogg physical bitstream is bisected and &lt;br /&gt;
the next Ogg page&#039;s end-time is extracted. The bisection continues until &lt;br /&gt;
it reaches an Ogg page with an end-time close enough to the seek target &lt;br /&gt;
time. However in media containing streams which have keyframes and &lt;br /&gt;
interframes, such as Theora streams, your bisection search won&#039;t &lt;br /&gt;
necessarily terminate at a keyframe. Thus if you begin decoding after your&lt;br /&gt;
first bisection terminates, you&#039;re likely to only get partial incomplete&lt;br /&gt;
frames, with &amp;quot;visual artifacts&amp;quot;, until you decode up to the next keyframe.&lt;br /&gt;
So to eliminate these visual artifacts, after the first bisection&lt;br /&gt;
terminates, you must extract the keyframe&#039;s timestamp from the last Theora&lt;br /&gt;
page&#039;s granulepos, and seek again back to the start of the keyframe and&lt;br /&gt;
decode forward until you reach the frame at the seek target. &lt;br /&gt;
&lt;br /&gt;
This is further complicated by the fact that packets often span multiple &lt;br /&gt;
Ogg pages, and that Ogg pages from different streams can be interleaved &lt;br /&gt;
between spanning packets. &lt;br /&gt;
&lt;br /&gt;
The bisection method above works fine for seeking in local files, but &lt;br /&gt;
for seeking in files served over the Internet via HTTP, each bisection &lt;br /&gt;
or non sequential read can trigger a new HTTP request, which can have &lt;br /&gt;
very high latency, making seeking very slow. &lt;br /&gt;
&lt;br /&gt;
== Seeking with an index ==&lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 bitstream attempts to alleviate this problem, by &lt;br /&gt;
providing an index of periodic keyframes for every content stream in an &lt;br /&gt;
Ogg segment. Note that the Skeleton 3.3 track only holds data for the &lt;br /&gt;
segment or &amp;quot;link&amp;quot; in which it resides. So if two Ogg files are concatenated&lt;br /&gt;
together (&amp;quot;chained&amp;quot;), the Skeleton 3.3&#039;s keyframe indexes in the first Ogg&lt;br /&gt;
segment (the first &amp;quot;link&amp;quot; in the &amp;quot;chain&amp;quot;) do not contain information&lt;br /&gt;
about the keyframes in the second Ogg segment (the second link in the chain).&lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own &lt;br /&gt;
packet in the Skeleton 3.3 track. The index for streams without the &lt;br /&gt;
concept of a keyframe, such as Vorbis streams, can instead record the &lt;br /&gt;
time position at periodic intervals, which achieves the same result. &lt;br /&gt;
When this document refers to keyframes, it also implicitly refers to these&lt;br /&gt;
independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
All the Skeleton 3.3 track&#039;s pages appear in the header pages of the Ogg &lt;br /&gt;
segment. This means that all the keyframe indexes are immediately &lt;br /&gt;
available once the header packets have been read when playing the media&lt;br /&gt;
over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Ogg index bitstream &lt;br /&gt;
provides seek algorithms with an ordered table of &amp;quot;key points&amp;quot;. A key &lt;br /&gt;
point is intrinsically associated with exactly one stream, and stores the&lt;br /&gt;
offset of the page on which it starts, o, as well as the presentation time&lt;br /&gt;
of the keyframe t, as a fraction of seconds. This specifies that in order&lt;br /&gt;
to render the stream at presentation time t, the last page which lies before&lt;br /&gt;
all information required to render the keyframe at presentation time t begins&lt;br /&gt;
exactly at byte offset o, as offset from the beginning of the Ogg segment.&lt;br /&gt;
The offset is exactly the first byte of the page, so if you seek to a&lt;br /&gt;
keypoint&#039;s offset and don&#039;t find the beginning of a page there, you can&lt;br /&gt;
assume that the Ogg segment has been modified since the index was constructed,&lt;br /&gt;
and that the index is now invalid and should not be used. The time t is the&lt;br /&gt;
keyframe&#039;s presentation time corresponding to the granulepos, and is&lt;br /&gt;
represented as a fraction in seconds. Note that if a stream requires any&lt;br /&gt;
preroll, this will be accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 track contains one index for each content stream in the &lt;br /&gt;
file. To seek in an Ogg file which contains keyframe indexes, first&lt;br /&gt;
construct the set which contains every active stream&#039;s last keypoint which&lt;br /&gt;
has time less than or equal to the seek target time. Then from that set&lt;br /&gt;
of key points, select the key point with the smallest byte offset. You then&lt;br /&gt;
verify that there&#039;s a page found at exactly that offset, and if so, you can&lt;br /&gt;
begin decoding. If the first keyframe you encounter has a time equal to&lt;br /&gt;
that stored in the keypoint, you have made the optimal seek, and can safely&lt;br /&gt;
continue to decode up to the seek target time. You are guaranteed to pass&lt;br /&gt;
keyframes on all streams with time less than or equal to your seek target&lt;br /&gt;
time while decoding up to the seek target. However if the first keyframe&lt;br /&gt;
you encounter after decoding does not have the same presentation time as&lt;br /&gt;
is stored in the keypoint, then the index is invalid (possibly the file&lt;br /&gt;
has been changed without updating the index) and you must either fall back&lt;br /&gt;
to a bisection search, or keep decoding if you&#039;ve landed &amp;quot;close enough&amp;quot;&lt;br /&gt;
to the seek target.&lt;br /&gt;
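&lt;br /&gt;
The key point selection described above can be sketched in Python as follows. This is only an illustration under assumptions: each index is modelled as a list of (offset, time) pairs sorted by time, the indexes are held in a dictionary keyed by serialno, and the function names are invented for this example. Returning None when some stream has no key point at or before the target is a choice made here so that the caller can fall back to a bisection search.&lt;br /&gt;

```python
from bisect import bisect_right
from fractions import Fraction


def last_keypoint_at_or_before(keypoints, target):
    """Return the last (offset, time) pair with time at or before target, else None.

    keypoints must be sorted by time (and therefore by offset).
    """
    times = [time for _, time in keypoints]
    count = bisect_right(times, target)
    if count == 0:
        return None
    return keypoints[count - 1]


def choose_seek_keypoint(indexes, target):
    """Pick the key point to seek to for a target time.

    indexes maps each active stream serialno to its sorted key point list.
    The result is the candidate with the smallest byte offset, so decoding
    from there passes a keyframe on every stream before the target.
    """
    candidates = []
    for keypoints in indexes.values():
        keypoint = last_keypoint_at_or_before(keypoints, target)
        if keypoint is None:
            return None  # no usable key point: caller should bisect instead
        candidates.append(keypoint)
    if not candidates:
        return None
    return min(candidates, key=lambda keypoint: keypoint[0])
```

After seeking to the chosen offset, the checks described above still apply: a page must start exactly at that offset, and the first keyframe decoded must carry the time stored in the key point.&lt;br /&gt;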
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain &lt;br /&gt;
keyframe indexes, so when implementing Ogg seeking, you must gracefully&lt;br /&gt;
fall back to a bisection search or other seek algorithm when the index&lt;br /&gt;
is not present, or when it is invalid.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 BOS packet also stores meta data about the segment in &lt;br /&gt;
which it resides. It stores the timestamps of the first and last samples&lt;br /&gt;
in the segment. This also allows you to determine the duration of the&lt;br /&gt;
indexed Ogg media without having to decode the start and end of the&lt;br /&gt;
Ogg segment to calculate the difference (which is the duration).&lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 BOS packet also contains the length of the indexed segment&lt;br /&gt;
in bytes. This is so that if the seek target is outside of the indexed range,&lt;br /&gt;
you can immediately move to the next/previous segment and either seek using&lt;br /&gt;
that segment&#039;s index, or narrow the bisection window if that segment has no&lt;br /&gt;
index. You can also use the segment length to verify if the index is valid.&lt;br /&gt;
If the contents of the segment have changed, it&#039;s highly likely that the&lt;br /&gt;
length of the segment has changed as well. When you load the segment&#039;s&lt;br /&gt;
header pages, you should check the length of the physical segment, and if it&lt;br /&gt;
doesn&#039;t match that stored in the Skeleton header packet, you know the index&lt;br /&gt;
is out of date and not safe to use.&lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 BOS packet also contains the offset of the first non header&lt;br /&gt;
page in the Ogg segment. This means that if you wish to delay loading of an&lt;br /&gt;
index for whatever reason, you can skip forward to that offset, and start&lt;br /&gt;
decoding from that offset forwards.&lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still &lt;br /&gt;
correct. You can consider the index invalid if any of the following are true:&lt;br /&gt;
&lt;br /&gt;
# The segment length stored in the Skeleton BOS packet doesn&#039;t match the length of the physical segment, or&lt;br /&gt;
# after a seek to a keypoint&#039;s offset, you don&#039;t land exactly on a page boundary, or&lt;br /&gt;
# the first keyframe decoded after seeking to a keypoint&#039;s offset doesn&#039;t have the same presentation time as stored in the index.&lt;br /&gt;
&lt;br /&gt;
You should also always check the Skeleton version header field&lt;br /&gt;
to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
Be aware that a keyframe index may not index all keyframes in the Ogg segment,&lt;br /&gt;
it may only index periodic keyframes instead.&lt;br /&gt;
&lt;br /&gt;
== Format Specification ==&lt;br /&gt;
 &lt;br /&gt;
Unless otherwise specified, all integers and fields in the bitstream are &lt;br /&gt;
encoded with the least significant bit coming first in each byte. &lt;br /&gt;
Integers and fields comprising more than one byte are encoded least &lt;br /&gt;
significant byte first (i.e. little endian byte order). &lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.3 track is intended to be backwards compatible with the &lt;br /&gt;
Skeleton 3.0 specification, available at &lt;br /&gt;
http://www.xiph.org/ogg/doc/skeleton.html . Unless specified &lt;br /&gt;
differently here, it is safe to assume that anything specified for a &lt;br /&gt;
Skeleton 3.0 track holds for a Skeleton 3.3 track. &lt;br /&gt;
&lt;br /&gt;
As per the Skeleton 3.0 track, a segment containing a Skeleton 3.3 track &lt;br /&gt;
must begin with a &#039;&#039;&#039;Skeleton 3.3 fishead BOS packet&#039;&#039;&#039; on a page by itself, with the &lt;br /&gt;
following format: &lt;br /&gt;
&lt;br /&gt;
# Identifier: 8 bytes, &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
# Version major: 2 Byte unsigned integer denoting the major version (3)&lt;br /&gt;
# Version minor: 2 Byte unsigned integer denoting the minor version (3)&lt;br /&gt;
# Presentationtime numerator: 8 Byte signed integer&lt;br /&gt;
# Presentationtime denominator: 8 Byte signed integer&lt;br /&gt;
# Basetime numerator: 8 Byte signed integer&lt;br /&gt;
# Basetime denominator: 8 Byte signed integer&lt;br /&gt;
# UTC [ISO8601]: a 20 Byte string containing a UTC time&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the media. Note that samples between the first-sample-time and the Presentationtime are supposed to be skipped during playback.&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; First-sample-time denominator: 8 byte signed integer, with value 0 if the timestamp is unknown. Decoders should always ensure that the denominator is not 0 before using it as a divisor!&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the segment.&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; Last-sample-time denominator: 8 byte signed integer, with value 0 if the timestamp is unknown. Decoders should always ensure that the denominator is not 0 before using it as a divisor!&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; The length of the segment, in bytes: 8 byte unsigned integer, 0 if unknown.&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; The offset of the first non-header page, in bytes: 8 byte unsigned integer, 0 if unknown.&lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units&lt;br /&gt;
of seconds. If the denominator is 0 for the first-sample-time or the&lt;br /&gt;
last-sample-time, then that value was unable to be determined at indexing&lt;br /&gt;
time, and is unknown. The duration of the Ogg segment can be calculated by&lt;br /&gt;
subtracting the first-sample-time from the last-sample-time.&lt;br /&gt;
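&lt;br /&gt;
A minimal Python sketch of that duration calculation, including the zero-denominator check, might look like this. The function name and argument order are assumptions made for the example; the four arguments stand for the numerator and denominator fields read from the fishead packet.&lt;br /&gt;

```python
from fractions import Fraction


def segment_duration(first_num, first_den, last_num, last_den):
    """Return the segment duration in seconds as a Fraction, or None if unknown.

    A denominator of 0 marks a timestamp that could not be determined at
    indexing time, so it must never be used as a divisor.
    """
    if first_den == 0 or last_den == 0:
        return None
    return Fraction(last_num, last_den) - Fraction(first_num, first_den)
```

Using exact rationals avoids rounding error; a player can convert the result to floating point only when it needs to display it.&lt;br /&gt;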
&lt;br /&gt;
In &#039;&#039;&#039;Skeleton 3.3 the &amp;quot;fisbone&amp;quot; packets remain unchanged from Skeleton &lt;br /&gt;
3.0&#039;&#039;&#039;, and will still follow after the other streams&#039; BOS pages and &lt;br /&gt;
secondary header pages. &lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the &lt;br /&gt;
Skeleton 3.3 keyframe index packets. There should be one index packet for&lt;br /&gt;
each content stream in the Ogg segment, but index packets are not required&lt;br /&gt;
for a Skeleton 3.3 track to be considered valid. Each entry in the index&lt;br /&gt;
is a &amp;quot;keypoint&amp;quot;, which stores an offset and a timestamp.&lt;br /&gt;
In order to save space, the offsets and timestamps are stored as&lt;br /&gt;
deltas, and then variable byte-encoded. The offset and timestamp deltas&lt;br /&gt;
store the difference between the keypoint&#039;s offset and timestamp and the&lt;br /&gt;
previous keypoint&#039;s offset and timestamp. So to calculate the page offset&lt;br /&gt;
of a keypoint you must sum the offset deltas of all keypoints up to and&lt;br /&gt;
including that keypoint in the index.&lt;br /&gt;
&lt;br /&gt;
The variable byte encoded integers are encoded using 7 bits per byte to&lt;br /&gt;
store the integer&#039;s bits, and the high bit is set in the last byte used&lt;br /&gt;
to encode the integer. The bits and bytes are in little endian byte order.&lt;br /&gt;
For example, the integer 7843, or &amp;lt;tt&amp;gt;0001 1110 1010 0011&amp;lt;/tt&amp;gt; in binary, would be&lt;br /&gt;
stored as two bytes: &amp;lt;tt&amp;gt;0xBD 0x23&amp;lt;/tt&amp;gt;, or &amp;lt;tt&amp;gt;1011 1101 0010 0011&amp;lt;/tt&amp;gt; in binary.&lt;br /&gt;
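&lt;br /&gt;
The encoding can be sketched in Python as below. This sketch assumes that the low-order 7-bit group is written to the stream first and that only the final byte has its high bit set; the function names are invented for the example. Under that assumption 7843 is written as the byte 0x23 followed by the byte 0xBD, which are the same two byte values as in the example above, listed in the opposite order.&lt;br /&gt;

```python
def encode_varint(value):
    """Encode a non-negative integer using 7 bits per byte, low group first.

    The high bit (0x80) is set only on the final byte.
    """
    if value != abs(value):
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        group = value % 128          # low 7 bits of what remains
        value = value // 128
        if value == 0:
            out.append(group | 0x80)  # final byte: set the terminator bit
            return bytes(out)
        out.append(group)


def decode_varint(data, pos=0):
    """Decode one integer starting at data[pos]; return (value, next_pos).

    Raises IndexError if the data ends before a terminating byte is found.
    """
    value = 0
    scale = 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte // 128 == 1:         # high bit set: this was the last byte
            return value, pos
```

Because decode_varint returns the position after the integer, it can be called repeatedly to walk through consecutive values in an index packet.&lt;br /&gt;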
&lt;br /&gt;
Each &#039;&#039;&#039;Skeleton 3.3 keyframe index packet&#039;&#039;&#039; contains the following: &lt;br /&gt;
&lt;br /&gt;
# Identifier: 6 bytes, &amp;quot;index\0&amp;quot;.&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field.&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039; as an 8 byte unsigned integer. This can be 0.&lt;br /&gt;
# The keypoint presentation time denominator, as an 8 byte signed integer.&lt;br /&gt;
# &#039;n&#039; key points, each of which contains, in the following order:&lt;br /&gt;
## the keyframe&#039;s page&#039;s byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint&#039;s offset, or from the start of the segment if this is the first keypoint. The keypoint&#039;s page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including it.&lt;br /&gt;
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint&#039;s offset, as a variable byte encoded integer. This is the difference from the previous keypoint&#039;s timestamp numerator. The keypoint&#039;s timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint&#039;s. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
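&lt;br /&gt;
The running sums described in the list above can be sketched in Python as follows. The sketch assumes the deltas have already been decoded from their variable byte encoding into pairs of integers; the function name and the pair layout are assumptions made for this example.&lt;br /&gt;

```python
from fractions import Fraction


def resolve_keypoints(deltas, time_denominator):
    """Turn (offset_delta, time_numerator_delta) pairs into absolute key points.

    Returns a list of (byte_offset, presentation_time) pairs, where the
    offset is relative to the start of the segment and the time is an
    exact Fraction in seconds.
    """
    if time_denominator == 0:
        raise ValueError("time denominator must not be 0")
    keypoints = []
    offset = 0
    numerator = 0
    for offset_delta, numerator_delta in deltas:
        offset += offset_delta        # sum of deltas up to and including this one
        numerator += numerator_delta
        keypoints.append((offset, Fraction(numerator, time_denominator)))
    return keypoints
```

For instance, with a denominator of 1000 the deltas (4096, 0), (65536, 2000) and (70000, 2500) resolve to key points at byte offsets 4096, 69632 and 139632 with times 0, 2 and 4.5 seconds.&lt;br /&gt;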
&lt;br /&gt;
Note that a keypoint always represents the first key frame on a page. If an&lt;br /&gt;
Ogg page contains two or more keyframes, the index&#039;s key point *must* refer&lt;br /&gt;
to the first keyframe on that page, not any subsequent keyframes on that page.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by &lt;br /&gt;
presentation time as well).&lt;br /&gt;
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg&lt;br /&gt;
bitstream segment. So if you have a physical Ogg bitstream made up of two&lt;br /&gt;
chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index&lt;br /&gt;
are relative to the beginning of the second Ogg in the chain, not the first.&lt;br /&gt;
Also note that if a physical Ogg bitstream is made up of chained Oggs, the&lt;br /&gt;
presence of an index in one segment does not imply that there will be an&lt;br /&gt;
index in any other segment. &lt;br /&gt;
&lt;br /&gt;
The exact number of keyframes used to construct key points in the index &lt;br /&gt;
is up to the indexer, but to limit the index size, we recommend &lt;br /&gt;
including at most one key point per 64KB of data, or per 2000ms, &lt;br /&gt;
whichever is less frequent. &lt;br /&gt;
&lt;br /&gt;
As per the Skeleton 3.0 track, &#039;&#039;&#039;the last packet in the Skeleton 3.3 track &lt;br /&gt;
is an empty EOS packet&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
== Software Prototype ==&lt;br /&gt;
&lt;br /&gt;
For a prototype indexer, see [http://github.com/cpearce/OggIndex OggIndex]. Also included there is a program OggIndexValid, which can verify that Theora and Vorbis indexes are valid. If you&#039;re implementing your own indexer, or going to be modifying existing indexes, always verify that your modified indexes are valid as per OggIndexValid!&lt;br /&gt;
&lt;br /&gt;
Recent [http://firefogg.org/nightly/ ffmpeg2theora nightlies] will also include a keyframe index in the Skeleton&lt;br /&gt;
3.3 track if you specify the command line option &amp;lt;tt&amp;gt;--seek-index&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
To see how indexes improve network seeking performance, you can download a development&lt;br /&gt;
version of Firefox which can take advantage of indexes here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-linux.tar.bz2&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-macosx.dmg&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-win32.zip&lt;br /&gt;
&lt;br /&gt;
If you already have a Firefox instance running, you&#039;ll need to either close your running&lt;br /&gt;
Firefox instance before starting the index-capable Firefox, or start the index-capable&lt;br /&gt;
Firefox with the &amp;lt;tt&amp;gt;--no-remote&amp;lt;/tt&amp;gt; command line parameter.&lt;br /&gt;
&lt;br /&gt;
To compare the network performance of indexed versus non-indexed seeking, point the&lt;br /&gt;
index-capable Firefox here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/indexed-seek-demo.html&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggIndex-Migration&amp;diff=10780</id>
		<title>OggIndex-Migration</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggIndex-Migration&amp;diff=10780"/>
		<updated>2010-01-15T02:26:54Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* GStreamer */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is for collecting patches related to the [[Ogg Index]] introduction.&lt;br /&gt;
&lt;br /&gt;
Please add links and information about your favorite applications to this page!&lt;br /&gt;
&lt;br /&gt;
Applications which read (decode) Ogg files should be extended to additionally recognize the OggIndex.&lt;br /&gt;
&lt;br /&gt;
=== Encoders ===&lt;br /&gt;
&lt;br /&gt;
* ffmpeg2theora supports creating indexes&lt;br /&gt;
&lt;br /&gt;
* what about other encoders? VLC, GStreamer&lt;br /&gt;
&lt;br /&gt;
* oggz-chop support on output&lt;br /&gt;
&lt;br /&gt;
=== Decoders ===&lt;br /&gt;
&lt;br /&gt;
==== oggz-chop ====&lt;br /&gt;
&lt;br /&gt;
* untested&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== GStreamer ====&lt;br /&gt;
&lt;br /&gt;
* support missing, gives error&lt;br /&gt;
* Needs indexing added to the Ogg mux?&lt;br /&gt;
&lt;br /&gt;
==== MPlayer ====&lt;br /&gt;
&lt;br /&gt;
* support missing, no error&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== VLC ====&lt;br /&gt;
&lt;br /&gt;
* support missing, no error&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== XiphQT ====&lt;br /&gt;
&lt;br /&gt;
* support missing, not tested&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== FFMpeg ====&lt;br /&gt;
&lt;br /&gt;
* support missing, not tested&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10779</id>
		<title>Ogg Index</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10779"/>
		<updated>2010-01-14T04:37:10Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
= Ogg Skeleton 3.2 with Keyframe Index =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;DRAFT, last updated 14 January 2010&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;This specification is still a work in progress, and does not yet constitute an official Ogg track format.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
 &lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search &lt;br /&gt;
over the pages in the file. The Ogg physical bitstream is bisected and &lt;br /&gt;
the next Ogg page&#039;s end-time is extracted. The bisection continues until &lt;br /&gt;
it reaches an Ogg page with an end-time close enough to the seek target &lt;br /&gt;
time. However in media containing streams which have keyframes and &lt;br /&gt;
interframes, such as Theora streams, your bisection search won&#039;t &lt;br /&gt;
necessarily terminate at a keyframe. Thus if you begin decoding after your&lt;br /&gt;
first bisection terminates, you&#039;re likely to only get partial incomplete&lt;br /&gt;
frames, with &amp;quot;visual artifacts&amp;quot;, until you decode up to the next keyframe.&lt;br /&gt;
So to eliminate these visual artifacts, after the first bisection&lt;br /&gt;
terminates, you must extract the keyframe&#039;s timestamp from the last Theora&lt;br /&gt;
page&#039;s granulepos, and seek again back to the start of the keyframe and&lt;br /&gt;
decode forward until you reach the frame at the seek target. &lt;br /&gt;
&lt;br /&gt;
This is further complicated by the fact that packets often span multiple &lt;br /&gt;
Ogg pages, and that Ogg pages from different streams can be interleaved &lt;br /&gt;
between spanning packets. &lt;br /&gt;
&lt;br /&gt;
The bisection method above works fine for seeking in local files, but &lt;br /&gt;
for seeking in files served over the Internet via HTTP, each bisection &lt;br /&gt;
or non sequential read can trigger a new HTTP request, which can have &lt;br /&gt;
very high latency, making seeking very slow. &lt;br /&gt;
&lt;br /&gt;
== Seeking with an index ==&lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.2 bitstream attempts to alleviate this problem, by &lt;br /&gt;
providing an index of periodic keyframes for every content stream in an &lt;br /&gt;
Ogg segment. Note that the Skeleton 3.2 track only holds data for the &lt;br /&gt;
segment in which it resides. So if two Ogg files are concatenated together&lt;br /&gt;
(&amp;quot;chained&amp;quot;), the Skeleton 3.2&#039;s keyframe indexes in the first Ogg segment&lt;br /&gt;
(the first Ogg in the &amp;quot;chain&amp;quot;) do not contain information about the&lt;br /&gt;
keyframes in the second Ogg segment (the second Ogg in the &amp;quot;chain&amp;quot;). &lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own &lt;br /&gt;
packet in the Skeleton 3.2 track. The index for streams without the &lt;br /&gt;
concept of a keyframe, such as Vorbis streams, can instead record the &lt;br /&gt;
time position at periodic intervals, which achieves the same result. &lt;br /&gt;
When this document refers to keyframes, it also implicitly refers to these&lt;br /&gt;
independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
All the Skeleton 3.2 track&#039;s pages appear in the header pages of the Ogg &lt;br /&gt;
segment. This means that all the keyframe indexes are immediately &lt;br /&gt;
available once the header packets have been read when playing the media&lt;br /&gt;
over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Ogg index bitstream &lt;br /&gt;
provides seek algorithms with an ordered table of &amp;quot;key points&amp;quot;. A key &lt;br /&gt;
point is intrinsically associated with exactly one stream, and refers to &lt;br /&gt;
a page in that stream. A key point k is defined as follows. Each key &lt;br /&gt;
point has an 8 byte offset o, a presentation time t as a fraction with an&lt;br /&gt;
8 byte numerator and an 8 byte denominator, and a 4 byte checksum c. &lt;br /&gt;
This specifies that in order to render the stream at presentation time t,&lt;br /&gt;
the last page which lies before all information required to render the &lt;br /&gt;
keyframe at presentation time t begins at byte offset o, as offset from&lt;br /&gt;
the beginning of the Ogg segment. The checksum c is the checksum of the&lt;br /&gt;
page which begins at offset o. This enables you to verify that you&#039;re&lt;br /&gt;
seeking to the intended page, and that the segment has not been modified&lt;br /&gt;
since the index was constructed. The time t is the keyframe&#039;s presentation&lt;br /&gt;
time corresponding to the granulepos, and is represented as a fraction in&lt;br /&gt;
seconds. Note that if a stream requires any preroll, this will be &lt;br /&gt;
accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.2 track contains one index for each content stream in the &lt;br /&gt;
file. To seek in an Ogg file which contains keyframe indexes, first&lt;br /&gt;
construct the set which contains every active stream&#039;s last keypoint which&lt;br /&gt;
has time less than or equal to the seek target time. Then from that set&lt;br /&gt;
of key points, select the key point with the smallest byte offset. You then&lt;br /&gt;
verify that the page found at the selected key point&#039;s byte offset has the&lt;br /&gt;
same checksum as the selected keypoint&#039;s checksum, and if so, you can begin&lt;br /&gt;
decoding up to the seek target time. You are guaranteed to pass keyframes&lt;br /&gt;
on all streams with time less than or equal to your seek target time while&lt;br /&gt;
decoding up to the seek target. &lt;br /&gt;
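&lt;br /&gt;
The selection step above can be sketched in Python as follows. This is only an illustration, not part of the specification: the function names, the (offset, checksum, time) tuple layout and the dictionary keyed by serialno are assumptions made for the example. Returning None when some stream has no key point at or before the target is one possible policy, after which a caller would fall back to bisection.&lt;br /&gt;
&lt;br /&gt;
```python
from bisect import bisect_right
from fractions import Fraction


def last_keypoint_at_or_before(keypoints, target):
    """Return the last (offset, checksum, time) tuple whose time is no later
    than target, or None if every key point lies after target.
    The list is assumed to be sorted by time, as the index requires."""
    times = [kp[2] for kp in keypoints]
    i = bisect_right(times, target)
    if i == 0:
        return None
    return keypoints[i - 1]


def choose_seek_keypoint(indexes, target):
    """indexes maps a stream serialno to that stream's key point list.
    Returns the candidate with the smallest byte offset, or None."""
    candidates = []
    for keypoints in indexes.values():
        kp = last_keypoint_at_or_before(keypoints, target)
        if kp is None:
            return None  # assumed policy: let the caller fall back to bisection
        candidates.append(kp)
    if not candidates:
        return None
    return min(candidates, key=lambda kp: kp[0])
```
&lt;br /&gt;
Starting from the smallest of the candidate offsets is what ensures that a keyframe from every stream is passed before the target is reached. The checksum stored in the returned tuple still has to be compared with the page found at that offset before decoding begins.&lt;br /&gt;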
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain &lt;br /&gt;
keyframe indexes, and so when implementing Ogg seeking, you must &lt;br /&gt;
gracefully fall back to a bisection search or other seek algorithm when &lt;br /&gt;
the index is not present. &lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still &lt;br /&gt;
correct: always check that the key point&#039;s checksum matches the checksum of &lt;br /&gt;
the page found at exactly the key point&#039;s offset. If it does not match, &lt;br /&gt;
the file has changed since it was indexed, and you can no longer rely on the &lt;br /&gt;
index. You should then fall back to seeking using a bisection&lt;br /&gt;
search. You should also always check the Skeleton version header field&lt;br /&gt;
to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.2 header packet also stores meta data about the segment in &lt;br /&gt;
which it resides. It stores the timestamps of the first and last samples&lt;br /&gt;
in the segment. This also allows you to determine the duration of the&lt;br /&gt;
indexed Ogg media without having to decode the start and end of the&lt;br /&gt;
Ogg segment to calculate the difference (which is the duration). The index&lt;br /&gt;
header also contains the length of the indexed segment in bytes. This is so&lt;br /&gt;
that if the seek target is outside of the indexed range, you can&lt;br /&gt;
immediately move to the next/previous segment and either seek using that&lt;br /&gt;
segment&#039;s index, or narrow the bisection window if that segment has no index.&lt;br /&gt;
&lt;br /&gt;
== Format Specification ==&lt;br /&gt;
 &lt;br /&gt;
Unless otherwise specified, all integers and fields in the bitstream are &lt;br /&gt;
encoded with the least significant bit coming first in each byte. &lt;br /&gt;
Integers and fields comprising more than one byte are encoded least &lt;br /&gt;
significant byte first (i.e. little endian byte order). &lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.2 track is intended to be backwards compatible with the &lt;br /&gt;
Skeleton 3.0 specification, available at &lt;br /&gt;
http://www.xiph.org/ogg/doc/skeleton.html . Unless specified &lt;br /&gt;
differently here, it is safe to assume that anything specified for a &lt;br /&gt;
Skeleton 3.0 track holds for a Skeleton 3.2 track. &lt;br /&gt;
&lt;br /&gt;
As per the Skeleton 3.0 track, a segment containing a Skeleton 3.2 track &lt;br /&gt;
must begin with a &#039;&#039;&#039;Skeleton 3.2 fishead BOS packet&#039;&#039;&#039; on a page by itself, with the &lt;br /&gt;
following format: &lt;br /&gt;
&lt;br /&gt;
# Identifier: 8 bytes, &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
# Version major: 2 Byte unsigned integer denoting the major version (3)&lt;br /&gt;
# Version minor: 2 Byte unsigned integer denoting the minor version (2)&lt;br /&gt;
# Presentationtime numerator: 8 Byte signed integer&lt;br /&gt;
# Presentationtime denominator: 8 Byte signed integer&lt;br /&gt;
# Basetime numerator: 8 Byte signed integer&lt;br /&gt;
# Basetime denominator: 8 Byte signed integer&lt;br /&gt;
# UTC [ISO8601]: a 20 Byte string containing a UTC time&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the media. Note that samples between the first-sample-time and the Presentationtime are supposed to be skipped during playback.&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; First-sample-time denominator: 8 byte signed integer, with value 0 if the timestamp is unknown. Decoders should always ensure that the denominator is not 0 before using it as a divisor!&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the segment.&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; Last-sample-time denominator: 8 byte signed integer, with value 0 if the timestamp is unknown. Decoders should always ensure that the denominator is not 0 before using it as a divisor!&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; The length of the segment, in bytes: 8 byte unsigned integer, 0 if unknown.&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; The offset of the first non-header page, in bytes: 8 byte unsigned integer.&lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units&lt;br /&gt;
of seconds. If the denominator is 0 for the first-sample-time or the&lt;br /&gt;
last-sample-time, then that value was unable to be determined at indexing&lt;br /&gt;
time, and is unknown. The duration of the Ogg segment can be calculated by&lt;br /&gt;
subtracting the first-sample-time from the last-sample-time.&lt;br /&gt;
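&lt;br /&gt;
As a rough sketch of how the fishead fields listed above might be read, the following Python walks the packet in the listed order and derives the duration. The dictionary keys and function names are invented for this example, and treating the packet as at least 112 bytes long (the sum of the listed field sizes) is an assumption drawn from that list rather than a stated requirement.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction

# (name, size in bytes, signed); signed is None for the raw UTC string.
FISHEAD_FIELDS = [
    ("version_major", 2, False), ("version_minor", 2, False),
    ("presentation_num", 8, True), ("presentation_den", 8, True),
    ("base_num", 8, True), ("base_den", 8, True),
    ("utc", 20, None),
    ("first_num", 8, True), ("first_den", 8, True),
    ("last_num", 8, True), ("last_den", 8, True),
    ("segment_length", 8, False), ("content_offset", 8, False),
]
FISHEAD_SIZE = 8 + sum(size for _name, size, _signed in FISHEAD_FIELDS)  # 112


def parse_fishead(packet):
    """Decode the little endian fishead fields into a dictionary."""
    if packet[:8] != b"fishead\0" or FISHEAD_SIZE > len(packet):
        raise ValueError("not a Skeleton 3.2 fishead packet")
    header = {}
    pos = 8
    for name, size, signed in FISHEAD_FIELDS:
        raw = packet[pos:pos + size]
        if signed is None:
            header[name] = raw
        else:
            header[name] = int.from_bytes(raw, "little", signed=signed)
        pos += size
    return header


def segment_duration(header):
    """Return last-sample-time minus first-sample-time in seconds,
    or None when either denominator is 0 (time unknown)."""
    if header["first_den"] == 0 or header["last_den"] == 0:
        return None
    first = Fraction(header["first_num"], header["first_den"])
    last = Fraction(header["last_num"], header["last_den"])
    return last - first
```
&lt;br /&gt;
Using exact fractions keeps the result free of rounding error; a player would convert to floating point only for display.&lt;br /&gt;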
&lt;br /&gt;
In Skeleton 3.2 the &amp;quot;fisbone&amp;quot; packets remain unchanged from Skeleton &lt;br /&gt;
3.0, and will still follow after the other streams&#039; BOS pages and &lt;br /&gt;
secondary header pages. &lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the &lt;br /&gt;
&#039;&#039;&#039;Skeleton 3.2 keyframe index packets&#039;&#039;&#039;. There is one index packet for each &lt;br /&gt;
content stream in the Ogg segment. Each index packet contains the &lt;br /&gt;
following: &lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field.&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039; as a 8 byte unsigned integer. This can be 0.&lt;br /&gt;
# The keypoint presentation time denominator, as an 8 byte signed integer.&lt;br /&gt;
# &#039;n&#039; key points, each of which contain, in the following order:&lt;br /&gt;
## a page start&#039;s byte offset as an 8 byte unsigned integer, followed by&lt;br /&gt;
## the checksum of the page found at the offset, as a 4 byte field, followed by&lt;br /&gt;
## the presentation time numerator of the first key frame which starts on the page at the keypoint&#039;s offset, as an 8 byte integer. Divide this by the timestamp denominator to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
Note that a keypoint always represents the first key frame on a page. If an&lt;br /&gt;
Ogg page contains two or more keyframes, the index&#039;s key point *must* refer&lt;br /&gt;
to the first keyframe on that page, not the second.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by &lt;br /&gt;
presentation time as well). Note that an index packet may be larger than&lt;br /&gt;
(6 + 4 + 8 + 8 + (n * (8 + 4 + 8))) bytes, as it may have been &lt;br /&gt;
preallocated during encoding, but not completely filled. Do not make &lt;br /&gt;
assumptions about an index packet&#039;s size; always check the packet&#039;s&lt;br /&gt;
&#039;bytes&#039; field to determine its size, and always use its &#039;n&#039; field to &lt;br /&gt;
determine the number of keypoints contained in the index packet. &lt;br /&gt;
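&lt;br /&gt;
The index packet layout listed above could be decoded along the following lines. This is a hedged sketch: the function name and the returned tuple shape are made up for illustration, the key point count is read as an 8 byte integer as the field list states, and the time numerator is assumed to be signed because the list does not say. Bytes beyond the last key point are ignored, since the packet may have been preallocated.&lt;br /&gt;
&lt;br /&gt;
```python
from fractions import Fraction

INDEX_HEADER_SIZE = 6 + 4 + 8 + 8   # identifier, serialno, n, time denominator
KEYPOINT_SIZE = 8 + 4 + 8           # offset, checksum, time numerator


def parse_index_packet(packet):
    """Return (serialno, keypoints), where keypoints is a list of
    (offset, checksum, time in seconds as a Fraction) tuples."""
    if packet[:6] != b"index\0" or INDEX_HEADER_SIZE > len(packet):
        raise ValueError("not a Skeleton 3.2 index packet")
    serialno = int.from_bytes(packet[6:10], "little")
    n = int.from_bytes(packet[10:18], "little")
    denom = int.from_bytes(packet[18:26], "little", signed=True)
    if INDEX_HEADER_SIZE + n * KEYPOINT_SIZE > len(packet):
        raise ValueError("index packet is too short for its key point count")
    if n != 0 and denom == 0:
        raise ValueError("time denominator must not be 0")
    keypoints = []
    for i in range(n):
        pos = INDEX_HEADER_SIZE + i * KEYPOINT_SIZE
        offset = int.from_bytes(packet[pos:pos + 8], "little")
        checksum = int.from_bytes(packet[pos + 8:pos + 12], "little")
        numer = int.from_bytes(packet[pos + 12:pos + 20], "little", signed=True)
        keypoints.append((offset, checksum, Fraction(numer, denom)))
    return serialno, keypoints
```
&lt;br /&gt;
Only the first n key points are read, so unused preallocated space at the end of the packet does not affect the result.&lt;br /&gt;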
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg&lt;br /&gt;
bitstream segment. So if you have a physical Ogg bitstream made up of two&lt;br /&gt;
chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index&lt;br /&gt;
are relative to the beginning of the second Ogg in the chain, not the first.&lt;br /&gt;
Also note that if a physical Ogg bitstream is made up of chained Oggs, the&lt;br /&gt;
presence of an index in one segment does not imply that there will be an&lt;br /&gt;
index in any other segment.&lt;br /&gt;
&lt;br /&gt;
== Software Prototype ==&lt;br /&gt;
&lt;br /&gt;
For a prototype indexer, see [http://github.com/cpearce/OggIndex OggIndex]. Also included there is a program OggIndexValid, which can verify that Theora and Vorbis indexes are valid. If you&#039;re implementing your own indexer, or going to be modifying existing indexes, always verify that your modified indexes are valid as per OggIndexValid!&lt;br /&gt;
&lt;br /&gt;
Recent [http://firefogg.org/nightly/ ffmpeg2theora nightlies] will also include a keyframe index in the Skeleton&lt;br /&gt;
3.2 track if you specify the command line option &amp;lt;tt&amp;gt;--seek-index&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
To see how indexes improve network seeking performance, you can download a development&lt;br /&gt;
version of Firefox which can take advantage of indexes here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-linux.tar.bz2&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-macosx.dmg&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-win32.zip&lt;br /&gt;
&lt;br /&gt;
If you already have a Firefox instance running, you&#039;ll need to either close your running&lt;br /&gt;
Firefox instance before starting the index-capable Firefox, or start the index-capable&lt;br /&gt;
Firefox with the &amp;lt;tt&amp;gt;--no-remote&amp;lt;/tt&amp;gt; command line parameter.&lt;br /&gt;
&lt;br /&gt;
To compare the network performance of indexed versus non-indexed seeking, point the&lt;br /&gt;
index-capable Firefox here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/indexed-seek-demo.html&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10773</id>
		<title>Ogg Index</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10773"/>
		<updated>2010-01-10T22:28:45Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{draft}}&lt;br /&gt;
&lt;br /&gt;
= Ogg Skeleton 3.1 with Keyframe Index =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;DRAFT, last updated 11 January 2010&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;This specification is still a work in progress, and does not yet constitute an official Ogg track format.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
 &lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search &lt;br /&gt;
over the pages in the file. The Ogg physical bitstream is bisected and &lt;br /&gt;
the next Ogg page&#039;s end-time is extracted. The bisection continues until &lt;br /&gt;
it reaches an Ogg page with an end-time close enough to the seek target &lt;br /&gt;
time. However in media containing streams which have keyframes and &lt;br /&gt;
interframes, such as Theora streams, your bisection search won&#039;t &lt;br /&gt;
necessarily terminate at a keyframe. Thus if you begin decoding after your&lt;br /&gt;
first bisection terminates, you&#039;re likely to get only incomplete&lt;br /&gt;
frames, with &amp;quot;visual artifacts&amp;quot;, until you decode up to the next keyframe.&lt;br /&gt;
So to eliminate these visual artifacts, after the first bisection&lt;br /&gt;
terminates, you must extract the keyframe&#039;s timestamp from the last Theora&lt;br /&gt;
page&#039;s granulepos, and seek again back to the start of the keyframe and&lt;br /&gt;
decode forward until you reach the frame at the seek target. &lt;br /&gt;
&lt;br /&gt;
This is further complicated by the fact that packets often span multiple &lt;br /&gt;
Ogg pages, and that Ogg pages from different streams can be interleaved &lt;br /&gt;
between spanning packets. &lt;br /&gt;
&lt;br /&gt;
The bisection method above works fine for seeking in local files, but &lt;br /&gt;
for seeking in files served over the Internet via HTTP, each bisection &lt;br /&gt;
or non sequential read can trigger a new HTTP request, which can have &lt;br /&gt;
very high latency, making seeking very slow. &lt;br /&gt;
&lt;br /&gt;
== Seeking with an index ==&lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.1 bitstream attempts to alleviate this problem, by &lt;br /&gt;
providing an index of periodic keyframes for every content stream in an &lt;br /&gt;
Ogg segment. Note that the Skeleton 3.1 track only holds data for the &lt;br /&gt;
segment in which it resides. So if two Ogg files are concatenated together&lt;br /&gt;
(&amp;quot;chained&amp;quot;), the Skeleton 3.1&#039;s keyframe indexes in the first Ogg segment&lt;br /&gt;
(the first Ogg in the &amp;quot;chain&amp;quot;) do not contain information about the&lt;br /&gt;
keyframes in the second Ogg segment (the second Ogg in the &amp;quot;chain&amp;quot;). &lt;br /&gt;
&lt;br /&gt;
Each content track has a separate index, which is stored in its own &lt;br /&gt;
packet in the Skeleton 3.1 track. The index for streams without the &lt;br /&gt;
concept of a keyframe, such as Vorbis streams, can instead record the &lt;br /&gt;
time position at periodic intervals, which achieves the same result. &lt;br /&gt;
When this document refers to keyframes, it also implicitly refers to these&lt;br /&gt;
independent periodic samples from keyframe-less streams. &lt;br /&gt;
&lt;br /&gt;
All the Skeleton 3.1 track&#039;s pages appear in the header pages of the Ogg &lt;br /&gt;
segment. This means that all the keyframe indexes are immediately &lt;br /&gt;
available once the header packets have been read when playing the media&lt;br /&gt;
over a network connection. &lt;br /&gt;
&lt;br /&gt;
For every content stream in an Ogg segment, the Ogg index bitstream &lt;br /&gt;
provides seek algorithms with an ordered table of &amp;quot;key points&amp;quot;. A key &lt;br /&gt;
point is intrinsically associated with exactly one stream, and refers to &lt;br /&gt;
a page in that stream. A key point k is defined as follows. Each key &lt;br /&gt;
point has an 8 byte offset o, a presentation time t as a fraction with an&lt;br /&gt;
8 byte numerator and an 8 byte denominator, and a 4 byte checksum c. &lt;br /&gt;
This specifies that in order to render the stream at presentation time t,&lt;br /&gt;
the last page which lies before all information required to render the &lt;br /&gt;
keyframe at presentation time t begins at byte offset o, as offset from&lt;br /&gt;
the beginning of the Ogg segment. The checksum c is the checksum of the&lt;br /&gt;
page which begins at offset o. This enables you to verify that you&#039;re&lt;br /&gt;
seeking to the intended page, and that the segment has not been modified&lt;br /&gt;
since the index was constructed. The time t is the keyframe&#039;s presentation&lt;br /&gt;
time corresponding to the granulepos, and is represented as a fraction in&lt;br /&gt;
seconds. Note that if a stream requires any preroll, this will be &lt;br /&gt;
accounted for in the time stored in the keypoint. &lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.1 track contains one index for each content stream in the &lt;br /&gt;
file. To seek in an Ogg file which contains keyframe indexes, first&lt;br /&gt;
construct the set which contains each active stream&#039;s last keypoint which&lt;br /&gt;
has time less than or equal to the seek target time. Then from that set&lt;br /&gt;
of key points, select the key point with the smallest byte offset. You then&lt;br /&gt;
verify that the page found at the selected key point&#039;s byte offset has the&lt;br /&gt;
same checksum as the selected keypoint&#039;s checksum, and if so, you can begin&lt;br /&gt;
decoding up to the seek target time. You are guaranteed to pass keyframes&lt;br /&gt;
on all streams with time less than or equal to your seek target time while&lt;br /&gt;
decoding up to the seek target. &lt;br /&gt;
&lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain &lt;br /&gt;
keyframe indexes, and so when implementing Ogg seeking, you must &lt;br /&gt;
gracefully fall back to a bisection search or other seek algorithm when &lt;br /&gt;
the index is not present. &lt;br /&gt;
&lt;br /&gt;
When using the index to seek, you must verify that the index is still &lt;br /&gt;
correct: always check that the key point&#039;s checksum matches the checksum of &lt;br /&gt;
the page found at exactly the key point&#039;s offset. If it does not match, &lt;br /&gt;
the file has changed since it was indexed, and you can no longer rely on the &lt;br /&gt;
index. You should then fall back to seeking using a bisection&lt;br /&gt;
search. You should also always check the Skeleton version header field&lt;br /&gt;
to ensure your decoder correctly knows how to parse the Skeleton track. &lt;br /&gt;
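&lt;br /&gt;
The verify-then-fall-back rule described above amounts to a small piece of control flow, sketched here in Python. Everything named in it is hypothetical: read_page_checksum and bisection_seek stand in for whatever page reader and bisection routine a player already has, and the returned tag only makes the chosen path visible in the example.&lt;br /&gt;
&lt;br /&gt;
```python
def seek_with_index(keypoint, read_page_checksum, bisection_seek, target):
    """keypoint is an (offset, checksum, time) tuple taken from the index,
    or None when no usable index entry exists.
    read_page_checksum(offset) returns the checksum of the page found at
    that offset; bisection_seek(target) performs an ordinary bisection seek
    and returns the offset it settled on. Both are caller-supplied hooks."""
    if keypoint is not None:
        offset, expected_checksum, _time = keypoint
        if read_page_checksum(offset) == expected_checksum:
            # The index entry still matches the file; decode forward from here.
            return ("index", offset)
    # No index, or the file changed since it was indexed.
    return ("bisection", bisection_seek(target))
```
&lt;br /&gt;
Keeping the checksum comparison in one place means a stale index costs a single extra read before the bisection search takes over.&lt;br /&gt;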
&lt;br /&gt;
The Skeleton 3.1 header packet also stores meta data about the segment in &lt;br /&gt;
which it resides. It stores the timestamps of the first and last samples&lt;br /&gt;
in the segment. This also allows you to determine the duration of the&lt;br /&gt;
indexed Ogg media without having to decode the start and end of the&lt;br /&gt;
Ogg segment to calculate the difference (which is the duration). The index&lt;br /&gt;
header also contains the length of the indexed segment in bytes. This is so&lt;br /&gt;
that if the seek target is outside of the indexed range, you can&lt;br /&gt;
immediately move to the next/previous segment and either seek using that&lt;br /&gt;
segment&#039;s index, or narrow the bisection window if that segment has no index.&lt;br /&gt;
&lt;br /&gt;
== Format Specification ==&lt;br /&gt;
 &lt;br /&gt;
Unless otherwise specified, all integers and fields in the bitstream are &lt;br /&gt;
encoded with the least significant bit coming first in each byte. &lt;br /&gt;
Integers and fields comprising more than one byte are encoded least &lt;br /&gt;
significant byte first (i.e. little endian byte order). &lt;br /&gt;
&lt;br /&gt;
The Skeleton 3.1 track is intended to be backwards compatible with the &lt;br /&gt;
Skeleton 3.0 specification, available at &lt;br /&gt;
http://www.xiph.org/ogg/doc/skeleton.html . Unless specified &lt;br /&gt;
differently here, it is safe to assume that anything specified for a &lt;br /&gt;
Skeleton 3.0 track holds for a Skeleton 3.1 track. &lt;br /&gt;
&lt;br /&gt;
As per the Skeleton 3.0 track, a segment containing a Skeleton 3.1 track &lt;br /&gt;
must begin with a &#039;&#039;&#039;Skeleton 3.1 fishead BOS packet&#039;&#039;&#039; on a page by itself, with the &lt;br /&gt;
following format: &lt;br /&gt;
&lt;br /&gt;
# Identifier: 8 bytes, &amp;quot;fishead\0&amp;quot;.&lt;br /&gt;
# Version major: 2 Byte unsigned integer denoting the major version (3)&lt;br /&gt;
# Version minor: 2 Byte unsigned integer denoting the minor version (1)&lt;br /&gt;
# Presentationtime numerator: 8 Byte signed integer&lt;br /&gt;
# Presentationtime denominator: 8 Byte signed integer&lt;br /&gt;
# Basetime numerator: 8 Byte signed integer&lt;br /&gt;
# Basetime denominator: 8 Byte signed integer&lt;br /&gt;
# UTC [ISO8601]: a 20 Byte string containing a UTC time&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the media. Note that samples between the first-sample-time and the Presentationtime are supposed to be skipped during playback.&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; First-sample-time denominator: 8 byte signed integer, with value 0 if the timestamp is unknown. Decoders should always ensure that the denominator is not 0 before using it as a divisor!&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the segment.&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; Last-sample-time denominator: 8 byte signed integer, with value 0 if the timestamp is unknown. Decoders should always ensure that the denominator is not 0 before using it as a divisor!&lt;br /&gt;
# &#039;&#039;&#039;[NEW]&#039;&#039;&#039; The length of the segment, in bytes: 8 byte signed integer, -1 if unknown.&lt;br /&gt;
&lt;br /&gt;
The first-sample-time and last-sample-time are rational numbers, in units&lt;br /&gt;
of seconds. If the denominator is 0 for the first-sample-time or the&lt;br /&gt;
last-sample-time, then that value was unable to be determined at indexing&lt;br /&gt;
time, and is unknown. The duration of the Ogg segment can be calculated by&lt;br /&gt;
subtracting the first-sample-time from the last-sample-time.&lt;br /&gt;
&lt;br /&gt;
In Skeleton 3.1 the &amp;quot;fisbone&amp;quot; packets remain unchanged from Skeleton &lt;br /&gt;
3.0, and will still follow after the other streams&#039; BOS pages and &lt;br /&gt;
secondary header pages. &lt;br /&gt;
&lt;br /&gt;
Before the Skeleton EOS page in the segment header pages come the &lt;br /&gt;
&#039;&#039;&#039;Skeleton 3.1 keyframe index packets&#039;&#039;&#039;. There is one index packet for each &lt;br /&gt;
content stream in the Ogg segment. Each index packet contains the &lt;br /&gt;
following: &lt;br /&gt;
&lt;br /&gt;
# Identifier 6 bytes: &amp;quot;index\0&amp;quot;&lt;br /&gt;
# The serialno of the stream this index applies to, as a 4 byte field.&lt;br /&gt;
# The number of keypoints in this index packet, &#039;n&#039; as a 4 byte unsigned integer. This can be 0.&lt;br /&gt;
# The keypoint presentation time denominator, as an 8 byte signed integer.&lt;br /&gt;
# &#039;n&#039; key points, each of which contain, in the following order:&lt;br /&gt;
## a page start&#039;s byte offset as an 8 byte unsigned integer, followed by&lt;br /&gt;
## the checksum of the page found at the offset, as a 4 byte field, followed by&lt;br /&gt;
## the presentation time numerator of the first key frame which starts on the page at the keypoint&#039;s offset, as an 8 byte integer. Divide this by the timestamp denominator to determine the presentation time of the keyframe in seconds.&lt;br /&gt;
&lt;br /&gt;
Note that a keypoint always represents the first key frame on a page. If an&lt;br /&gt;
Ogg page contains two or more keyframes, the index&#039;s key point *must* refer&lt;br /&gt;
to the first keyframe on that page, not the second.&lt;br /&gt;
&lt;br /&gt;
The key points are stored in increasing order by offset (and thus by &lt;br /&gt;
presentation time as well). Note that an index packet may be larger than&lt;br /&gt;
(6 + 4 + 4 + 8 + (n * (8 + 4 + 8))) bytes, as it may have been &lt;br /&gt;
preallocated during encoding, but not completely filled. Do not make &lt;br /&gt;
assumptions about an index packet&#039;s size; always check the packet&#039;s&lt;br /&gt;
&#039;bytes&#039; field to determine its size, and always use its &#039;n&#039; field to &lt;br /&gt;
determine the number of keypoints contained in the index packet. &lt;br /&gt;
&lt;br /&gt;
The byte offsets stored in keypoints are relative to the start of the Ogg&lt;br /&gt;
bitstream segment. So if you have a physical Ogg bitstream made up of two&lt;br /&gt;
chained Oggs, the offsets in the second Ogg segment&#039;s bitstream&#039;s index&lt;br /&gt;
are relative to the beginning of the second Ogg in the chain, not the first.&lt;br /&gt;
Also note that if a physical Ogg bitstream is made up of chained Oggs, the&lt;br /&gt;
presence of an index in one segment does not imply that there will be an&lt;br /&gt;
index in any other segment.&lt;br /&gt;
&lt;br /&gt;
== Software Prototype ==&lt;br /&gt;
&lt;br /&gt;
For a prototype indexer, see [http://github.com/cpearce/OggIndex OggIndex]. Also included there is a program OggIndexValid, which can verify that Theora and Vorbis indexes are valid. If you&#039;re implementing your own indexer, or going to be modifying existing indexes, always verify that your modified indexes are valid as per OggIndexValid!&lt;br /&gt;
&lt;br /&gt;
Recent [http://firefogg.org/nightly/ ffmpeg2theora nightlies] will also include a keyframe index in the Skeleton&lt;br /&gt;
3.1 track if you specify the command line option &amp;lt;tt&amp;gt;--seek-index&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
To see how indexes improve network seeking performance, you can download a development&lt;br /&gt;
version of Firefox which can take advantage of indexes here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-linux.tar.bz2&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-macosx.dmg&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-seek-win32.zip&lt;br /&gt;
&lt;br /&gt;
If you already have a Firefox instance running, you&#039;ll need to either close your running&lt;br /&gt;
Firefox instance before starting the index-capable Firefox, or start the index-capable&lt;br /&gt;
Firefox with the &amp;lt;tt&amp;gt;--no-remote&amp;lt;/tt&amp;gt; command line parameter.&lt;br /&gt;
&lt;br /&gt;
To compare the network performance of indexed versus non-indexed seeking, point the&lt;br /&gt;
index-capable Firefox here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/indexed-seek-demo.html&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10554</id>
		<title>Ogg Index</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10554"/>
		<updated>2009-09-24T05:04:44Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* Ogg Index Track Specification Version 1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Ogg Index Track Specification Version 1 =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;DRAFT, last updated 24 September 2009&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;This specification is still a work in progress, and does not yet constitute an official Ogg track format.&#039;&#039;&#039;&lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
== Overview ==&lt;br /&gt;
 &lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search&lt;br /&gt;
over the pages in the file. The Ogg physical bitstream is bisected and&lt;br /&gt;
the next Ogg page&#039;s end-time is extracted. The bisection continues until&lt;br /&gt;
it reaches an Ogg page with an end-time close enough to the seek target&lt;br /&gt;
time. However in media containing streams which have key frames and&lt;br /&gt;
interframes, such as Theora streams, your bisection search won&#039;t&lt;br /&gt;
necessarily stop at a keyframe, thus you can&#039;t simply resume playback&lt;br /&gt;
from that point. First you need to construct the keyframe&#039;s timestamp&lt;br /&gt;
from the last page&#039;s granulepos, and seek again to the start of the&lt;br /&gt;
keyframe and decode forward until you reach the frame at the seek&lt;br /&gt;
target.&lt;br /&gt;
 &lt;br /&gt;
This is further complicated by the fact that packets often span multiple&lt;br /&gt;
Ogg pages, and that Ogg pages from different streams can be interleaved&lt;br /&gt;
between spanning packets.&lt;br /&gt;
 &lt;br /&gt;
The bisection method above works fine for seeking in local files, but&lt;br /&gt;
for seeking in files served over the Internet via HTTP, each bisection&lt;br /&gt;
or non sequential read can trigger a new HTTP request, which can have&lt;br /&gt;
very high latency, making seeking very slow.&lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
== Seeking with an index ==&lt;br /&gt;
 &lt;br /&gt;
The Ogg index bitstream attempts to alleviate this problem, by providing&lt;br /&gt;
an index of periodic keyframes in an Ogg file. The index is contained in&lt;br /&gt;
a separate track which is embedded in the Ogg file, so that players&lt;br /&gt;
which don&#039;t understand the index track can just ignore it. In streams&lt;br /&gt;
without the concept of a keyframe, such as Vorbis streams where each&lt;br /&gt;
sample is independent, the index can instead record the time position at&lt;br /&gt;
periodic intervals, which achieves the same result. When this document&lt;br /&gt;
refers to keyframes, it also refers to these independent periodic&lt;br /&gt;
samples from keyframe-less streams.&lt;br /&gt;
 &lt;br /&gt;
The Ogg index bitstream provides seek algorithms with an ordered table&lt;br /&gt;
of the Ogg page start-offsets and end-times of key points in the indexed&lt;br /&gt;
streams in an Ogg segment.&lt;br /&gt;
 &lt;br /&gt;
A key point k is defined as follows. Each key point has an 8 byte offset&lt;br /&gt;
o, an 8 byte time t, and a 4 byte checksum c. This specifies that in&lt;br /&gt;
order to render the media at presentation time t milliseconds, the last&lt;br /&gt;
page which lies before the start of all the packets containing all the&lt;br /&gt;
key frames required to render at time t begins at offset o. The checksum&lt;br /&gt;
c is the checksum of the page which begins at offset o, which enables&lt;br /&gt;
you to verify that you&#039;re seeking to the intended page.&lt;br /&gt;
 &lt;br /&gt;
To seek in an Ogg bitstream which contains an index, you find the last&lt;br /&gt;
key point in the index with time less than or equal to the target time.&lt;br /&gt;
You then seek to the key point&#039;s offset, check that the page found there&lt;br /&gt;
has checksum c, and then decode forward until you encounter the sample&lt;br /&gt;
which corresponds to your seek target time. You are guaranteed to pass&lt;br /&gt;
keyframes on all indexed streams with time less than or equal to your&lt;br /&gt;
seek target time while decoding up to the seek target.&lt;br /&gt;
 &lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain&lt;br /&gt;
an index, and so when implementing Ogg seeking, you must gracefully&lt;br /&gt;
fall back to a bisection search or other seek algorithm when the index&lt;br /&gt;
is not present.&lt;br /&gt;
 &lt;br /&gt;
The index also only holds data for the segment in which it resides, i.e.&lt;br /&gt;
if two Ogg files are concatenated together (&amp;quot;chained&amp;quot;), the index track&lt;br /&gt;
in one Ogg segment does not contain information about the keyframes in&lt;br /&gt;
the other Ogg segment.&lt;br /&gt;
 &lt;br /&gt;
The index also stores meta data about the segment in which it resides.&lt;br /&gt;
It stores the start time and the end time, and also the length of the&lt;br /&gt;
segment in bytes. This is so that if the seek target is outside of the&lt;br /&gt;
indexed range, you can immediately move to the next/previous segment and&lt;br /&gt;
either seek using that segment&#039;s index, or narrow the bisection window&lt;br /&gt;
if that segment has no index.&lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
== Format Specification ==&lt;br /&gt;
 &lt;br /&gt;
Unless otherwise specified, all integers and fields in the bitstream are&lt;br /&gt;
encoded with the least significant bit coming first in each byte.&lt;br /&gt;
Integers and fields comprising more than one byte are encoded least&lt;br /&gt;
significant byte first (i.e. little endian byte order).&lt;br /&gt;
 &lt;br /&gt;
An Ogg index track starts with an identifier header packet which&lt;br /&gt;
contains the following data, in the following order:&lt;br /&gt;
 &lt;br /&gt;
* The identifier &amp;quot;index\0&amp;quot;.&lt;br /&gt;
* The index version format number, as a 1 byte unsigned integer. This specification describes version 1, so this field should have the value 0x01.&lt;br /&gt;
* The playback start time, in milliseconds, as an 8 byte unsigned integer, this is the presentation time of the first frame.&lt;br /&gt;
* The playback end time, in milliseconds, as an 8 byte unsigned integer, this is the end time of the last frame.&lt;br /&gt;
* The length of the indexed segment, in bytes, as an 8 byte unsigned integer.&lt;br /&gt;
* The number of key points in the index, &#039;n&#039;, as a 4 byte unsigned integer.&lt;br /&gt;
&lt;br /&gt;
 &lt;br /&gt;
The track then contains one secondary header packet, which contains the&lt;br /&gt;
actual index. This is the &amp;quot;index packet&amp;quot;, and it must begin on a new&lt;br /&gt;
page, but it may span multiple pages. The index packet contains the&lt;br /&gt;
following:&lt;br /&gt;
 &lt;br /&gt;
* &#039;n&#039; key points, each of which contains, in the following order:&lt;br /&gt;
** the page offset as an 8 byte unsigned integer, followed by&lt;br /&gt;
** the checksum of the page found at the offset, as a 4 byte field, followed by&lt;br /&gt;
** the presentation time in milliseconds of the key point, as an 8 byte unsigned integer.&lt;br /&gt;
 &lt;br /&gt;
The size of the data in the index packet is (n * (8 + 4 + 8)) bytes. The&lt;br /&gt;
key points are stored in increasing order by offset.&lt;br /&gt;
 &lt;br /&gt;
The track then contains one empty EOS packet, which must start on a new&lt;br /&gt;
page. The track therefore contains exactly three packets, on three or&lt;br /&gt;
more pages.&lt;br /&gt;
 &lt;br /&gt;
The offsets stored in the key points are relative to the start of the Ogg&lt;br /&gt;
bitstream segment. So if you have a physical Ogg bitstream made up of&lt;br /&gt;
two chained Oggs, the offsets in the second segment&#039;s index are&lt;br /&gt;
relative to the beginning of the second Ogg in the chain, not&lt;br /&gt;
the first. Also note that if a physical Ogg bitstream is made up of&lt;br /&gt;
chained Oggs, the presence of an index in one segment does not imply&lt;br /&gt;
that there will be an index in any other segment.&lt;br /&gt;
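 &lt;br /&gt;
To illustrate the relative offsets, here is a minimal sketch that converts an offset from an index into an absolute position in the physical bitstream. The helper name absolute_offset and the example sizes are made up for illustration.&lt;br /&gt;

```python
def absolute_offset(segment_start, key_point_offset):
    """Convert a segment-relative offset into a position in the whole file."""
    return segment_start + key_point_offset


# Two chained segments: the first is 1_000_000 bytes long,
# so the second begins at absolute byte 1_000_000.
first_segment_start = 0
first_segment_length = 1_000_000
second_segment_start = first_segment_start + first_segment_length

# An offset of 4096 taken from the index of the second segment.
position = absolute_offset(second_segment_start, 4096)
print(position)  # 1004096
```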
 &lt;br /&gt;
The exact number of keyframes used to construct key points in the index&lt;br /&gt;
is up to the indexer, but to limit the index size, we recommend&lt;br /&gt;
including at most one key point per 64 KB of data, or per 500 ms.&lt;br /&gt;
 &lt;br /&gt;
There can be only one index track per Ogg bitstream segment. The index&lt;br /&gt;
packet must occur before all non-metadata streams&#039; content packets. In&lt;br /&gt;
practice this means that the index packet will occur along with other&lt;br /&gt;
secondary header pages, before the skeleton EOS page.&lt;br /&gt;
 &lt;br /&gt;
All pages in the index bitstream have their granulepos set as 0.&lt;br /&gt;
&lt;br /&gt;
== Software Prototype ==&lt;br /&gt;
&lt;br /&gt;
For a prototype indexer, see:&lt;br /&gt;
&lt;br /&gt;
http://github.com/cpearce/OggIndex&lt;br /&gt;
&lt;br /&gt;
To see how indexes improve network seeking performance, you can download a development&lt;br /&gt;
version of Firefox which can take advantage of indexes here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-ogg-seek.linux.tar.bz2&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-ogg-seek.macosx.dmg&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-ogg-seek.win32.zip&lt;br /&gt;
&lt;br /&gt;
Then point that browser here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/indexed-seek-demo.html&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10553</id>
		<title>Ogg Index</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10553"/>
		<updated>2009-09-24T04:56:29Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: /* Ogg Index Track Format Version 1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Ogg Index Track Specification Version 1 =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;DRAFT, last updated 24 September 2009&#039;&#039;&#039;&lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
== Overview ==&lt;br /&gt;
 &lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search&lt;br /&gt;
over the pages in the file. The Ogg physical bitstream is bisected and&lt;br /&gt;
the next Ogg page&#039;s end-time is extracted. The bisection continues until&lt;br /&gt;
it reaches an Ogg page with an end-time close enough to the seek target&lt;br /&gt;
time. However, in media containing streams which have keyframes and&lt;br /&gt;
interframes, such as Theora streams, the bisection search won&#039;t&lt;br /&gt;
necessarily stop at a keyframe, so you can&#039;t simply resume playback&lt;br /&gt;
from that point. Instead, you must derive the keyframe&#039;s timestamp&lt;br /&gt;
from the last page&#039;s granulepos, seek again to the start of that&lt;br /&gt;
keyframe, and decode forward until you reach the frame at the seek&lt;br /&gt;
target.&lt;br /&gt;
 &lt;br /&gt;
This is further complicated by the fact that packets often span multiple&lt;br /&gt;
Ogg pages, and that Ogg pages from different streams can be interleaved&lt;br /&gt;
between spanning packets.&lt;br /&gt;
 &lt;br /&gt;
The bisection method above works fine for seeking in local files, but&lt;br /&gt;
for seeking in files served over the Internet via HTTP, each bisection&lt;br /&gt;
or non-sequential read can trigger a new HTTP request, which can have&lt;br /&gt;
very high latency, making seeking very slow.&lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
== Seeking with an index ==&lt;br /&gt;
 &lt;br /&gt;
The Ogg index bitstream attempts to alleviate this problem, by providing&lt;br /&gt;
an index of periodic keyframes in an Ogg file. The index is contained in&lt;br /&gt;
a separate track which is embedded in the Ogg file, so that players&lt;br /&gt;
which don&#039;t understand the index track can just ignore it. In streams&lt;br /&gt;
without the concept of a keyframe, such as Vorbis streams where each&lt;br /&gt;
sample is independent, the index can instead record the time position at&lt;br /&gt;
periodic intervals, which achieves the same result. When this document&lt;br /&gt;
refers to keyframes, it also refers to these independent periodic&lt;br /&gt;
samples from keyframe-less streams.&lt;br /&gt;
 &lt;br /&gt;
The Ogg index bitstream provides seek algorithms with an ordered table&lt;br /&gt;
of the Ogg page start-offsets and end-times of key points in the indexed&lt;br /&gt;
streams in an Ogg segment.&lt;br /&gt;
 &lt;br /&gt;
A key point k is defined as follows. Each key point has an 8 byte offset&lt;br /&gt;
o, an 8 byte time t, and a 4 byte checksum c. This specifies that in&lt;br /&gt;
order to render the media at presentation time t milliseconds, the last&lt;br /&gt;
page which lies before the start of all the packets containing all the&lt;br /&gt;
key frames required to render at time t begins at offset o. The checksum&lt;br /&gt;
c is the checksum of the page which begins at offset o, which enables&lt;br /&gt;
you to verify that you&#039;re seeking to the intended page.&lt;br /&gt;
 &lt;br /&gt;
To seek in an Ogg bitstream which contains an index, you find the last&lt;br /&gt;
key point in the index with time less than or equal to the target time.&lt;br /&gt;
You then seek to the key point&#039;s offset, check that the page found there&lt;br /&gt;
has checksum c, and then decode forward until you encounter the sample&lt;br /&gt;
which corresponds to your seek target time. You are guaranteed to pass&lt;br /&gt;
keyframes on all indexed streams with time less than or equal to your&lt;br /&gt;
seek target time while decoding up to the seek target.&lt;br /&gt;
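 &lt;br /&gt;
The lookup step described above can be sketched as follows. This is only an illustration: the function name find_key_point is hypothetical, key points are modelled as (offset, checksum, time_ms) tuples, and the binary search assumes that key point times never decrease as offsets increase, which this document does not state explicitly.&lt;br /&gt;

```python
import bisect


def find_key_point(key_points, target_ms):
    """Return the last key point with time not after target_ms, or None.

    key_points is a list of (offset, checksum, time_ms) tuples whose
    times are assumed to be in non-decreasing order.
    """
    times = [time_ms for (_offset, _checksum, time_ms) in key_points]
    # bisect_right gives the position just past every time that is
    # less than or equal to the target, so step back by one.
    position = bisect.bisect_right(times, target_ms) - 1
    if position == -1:
        return None  # the target lies before the first key point
    return key_points[position]


index = [(0, 0xAAAA, 100), (4096, 0xBBBB, 600), (9000, 0xCCCC, 1100)]
print(find_key_point(index, 750))  # (4096, 48059, 600)
```

A player would then seek to the returned offset, confirm the page checksum, and decode forward to the target time. A result of None means the caller has to fall back to another seek method.&lt;br /&gt;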
 &lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain&lt;br /&gt;
an index, and so when implementing Ogg seeking, you must gracefully&lt;br /&gt;
fall back to a bisection search or other seek algorithm when the index&lt;br /&gt;
is not present.&lt;br /&gt;
 &lt;br /&gt;
The index also only holds data for the segment in which it resides, i.e.&lt;br /&gt;
if two Ogg files are concatenated together (&amp;quot;chained&amp;quot;), the index track&lt;br /&gt;
in one Ogg segment does not contain information about the keyframes in&lt;br /&gt;
the other Ogg segment.&lt;br /&gt;
 &lt;br /&gt;
The index also stores metadata about the segment in which it resides.&lt;br /&gt;
It stores the start time and the end time, and also the length of the&lt;br /&gt;
segment in bytes. This is so that if the seek target is outside of the&lt;br /&gt;
indexed range, you can immediately move to the next/previous segment and&lt;br /&gt;
either seek using that segment&#039;s index, or narrow the bisection window&lt;br /&gt;
if that segment has no index.&lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
== Format Specification ==&lt;br /&gt;
 &lt;br /&gt;
Unless otherwise specified, all integers and fields in the bitstream are&lt;br /&gt;
encoded with the least significant bit coming first in each byte.&lt;br /&gt;
Integers and fields comprising more than one byte are encoded least&lt;br /&gt;
significant byte first (i.e. little endian byte order).&lt;br /&gt;
 &lt;br /&gt;
An Ogg index track starts with an identifier header packet which&lt;br /&gt;
contains the following data, in the following order:&lt;br /&gt;
 &lt;br /&gt;
* The identifier &amp;quot;index\0&amp;quot;.&lt;br /&gt;
* The index version format number, as a 1 byte unsigned integer. This specification describes version 1, so this field should have the value 0x01.&lt;br /&gt;
* The playback start time, in milliseconds, as an 8 byte unsigned integer. This is the presentation time of the first frame.&lt;br /&gt;
* The playback end time, in milliseconds, as an 8 byte unsigned integer. This is the end time of the last frame.&lt;br /&gt;
* The length of the indexed segment, in bytes, as an 8 byte unsigned integer.&lt;br /&gt;
* The number of key points in the index, &#039;n&#039;, as a 4 byte unsigned integer.&lt;br /&gt;
&lt;br /&gt;
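The header layout listed above adds up to 6 + 1 + 8 + 8 + 8 + 4 = 35 bytes. A minimal parsing sketch, assuming the packet body has already been extracted from its Ogg page; the names IndexHeader, read_uint and parse_index_header are illustrative and not taken from any existing library:&lt;br /&gt;

```python
from dataclasses import dataclass

MAGIC = b"index\x00"                   # 6 byte identifier
HEADER_SIZE = 6 + 1 + 8 + 8 + 8 + 4    # 35 bytes in total


@dataclass
class IndexHeader:
    version: int
    start_time_ms: int
    end_time_ms: int
    segment_length: int
    num_key_points: int


def read_uint(data, pos, size):
    """Read a little-endian unsigned integer of size bytes at pos."""
    return int.from_bytes(data[pos:pos + size], "little")


def parse_index_header(packet):
    """Parse the identifier header packet of an index track."""
    if len(packet[:HEADER_SIZE]) != HEADER_SIZE:
        raise ValueError("packet too short for an index header")
    if not packet.startswith(MAGIC):
        raise ValueError("missing index identifier")
    return IndexHeader(
        version=packet[6],
        start_time_ms=read_uint(packet, 7, 8),
        end_time_ms=read_uint(packet, 15, 8),
        segment_length=read_uint(packet, 23, 8),
        num_key_points=read_uint(packet, 31, 4),
    )
```

A reader would normally check that the version field is 1 before trusting the remaining fields.&lt;br /&gt;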
 &lt;br /&gt;
The track then contains one secondary header packet, which contains the&lt;br /&gt;
actual index. This is the &amp;quot;index packet&amp;quot;, and it must begin on a new&lt;br /&gt;
page, but it may span multiple pages. The index packet contains the&lt;br /&gt;
following:&lt;br /&gt;
 &lt;br /&gt;
* &#039;n&#039; key points, each of which contains, in the following order:&lt;br /&gt;
** the page offset as an 8 byte unsigned integer, followed by&lt;br /&gt;
** the checksum of the page found at the offset, as a 4 byte field, followed by&lt;br /&gt;
** the presentation time in milliseconds of the key point, as an 8 byte unsigned integer.&lt;br /&gt;
 &lt;br /&gt;
The size of the data in the index packet is (n * (8 + 4 + 8)) bytes. The&lt;br /&gt;
key points are stored in increasing order by offset.&lt;br /&gt;
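 &lt;br /&gt;
Reading the fixed-size 20 byte records might look like the sketch below. The names KeyPoint and parse_key_points are hypothetical, and reading the 4 byte checksum field as a little-endian unsigned integer is an assumption that follows the general byte order rule given earlier.&lt;br /&gt;

```python
from collections import namedtuple

KeyPoint = namedtuple("KeyPoint", ["offset", "checksum", "time_ms"])
KEY_POINT_SIZE = 8 + 4 + 8   # offset + checksum + time = 20 bytes


def parse_key_points(packet, n):
    """Split the index packet body into n KeyPoint records."""
    if len(packet) != n * KEY_POINT_SIZE:
        raise ValueError("index packet size does not match key point count")
    points = []
    for i in range(n):
        base = i * KEY_POINT_SIZE
        offset = int.from_bytes(packet[base:base + 8], "little")
        # Assumption: the checksum field is little-endian like the others.
        checksum = int.from_bytes(packet[base + 8:base + 12], "little")
        time_ms = int.from_bytes(packet[base + 12:base + 20], "little")
        points.append(KeyPoint(offset, checksum, time_ms))
    return points
```

Here n is the key point count read from the identifier header packet.&lt;br /&gt;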
 &lt;br /&gt;
The track then contains one empty EOS packet, which must start on a new&lt;br /&gt;
page. The track therefore contains exactly three packets, on three or&lt;br /&gt;
more pages.&lt;br /&gt;
 &lt;br /&gt;
The offsets stored in the key points are relative to the start of the Ogg&lt;br /&gt;
bitstream segment. So if you have a physical Ogg bitstream made up of&lt;br /&gt;
two chained Oggs, the offsets in the second segment&#039;s index are&lt;br /&gt;
relative to the beginning of the second Ogg in the chain, not&lt;br /&gt;
the first. Also note that if a physical Ogg bitstream is made up of&lt;br /&gt;
chained Oggs, the presence of an index in one segment does not imply&lt;br /&gt;
that there will be an index in any other segment.&lt;br /&gt;
 &lt;br /&gt;
The exact number of keyframes used to construct key points in the index&lt;br /&gt;
is up to the indexer, but to limit the index size, we recommend&lt;br /&gt;
including at most one key point per 64 KB of data, or per 500 ms.&lt;br /&gt;
 &lt;br /&gt;
There can be only one index track per Ogg bitstream segment. The index&lt;br /&gt;
packet must occur before all non-metadata streams&#039; content packets. In&lt;br /&gt;
practice this means that the index packet will occur along with other&lt;br /&gt;
secondary header pages, before the skeleton EOS page.&lt;br /&gt;
 &lt;br /&gt;
All pages in the index bitstream have their granulepos set as 0.&lt;br /&gt;
&lt;br /&gt;
== Software Prototype ==&lt;br /&gt;
&lt;br /&gt;
For a prototype indexer, see:&lt;br /&gt;
&lt;br /&gt;
http://github.com/cpearce/OggIndex&lt;br /&gt;
&lt;br /&gt;
To see how indexes improve network seeking performance, you can download a development&lt;br /&gt;
version of Firefox which can take advantage of indexes here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-ogg-seek.linux.tar.bz2&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-ogg-seek.macosx.dmg&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-ogg-seek.win32.zip&lt;br /&gt;
&lt;br /&gt;
Then point that browser here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/indexed-seek-demo.html&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10552</id>
		<title>Ogg Index</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Ogg_Index&amp;diff=10552"/>
		<updated>2009-09-24T04:55:27Z</updated>

		<summary type="html">&lt;p&gt;Cpearce: Draft of the Ogg Index track specification&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Ogg Index Track Format Version 1 =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;DRAFT SPECIFICATION&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Updated 24 September 2009&#039;&#039;&#039;&lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
== Overview ==&lt;br /&gt;
 &lt;br /&gt;
Seeking in an Ogg file is typically implemented as a bisection search&lt;br /&gt;
over the pages in the file. The Ogg physical bitstream is bisected and&lt;br /&gt;
the next Ogg page&#039;s end-time is extracted. The bisection continues until&lt;br /&gt;
it reaches an Ogg page with an end-time close enough to the seek target&lt;br /&gt;
time. However, in media containing streams which have keyframes and&lt;br /&gt;
interframes, such as Theora streams, the bisection search won&#039;t&lt;br /&gt;
necessarily stop at a keyframe, so you can&#039;t simply resume playback&lt;br /&gt;
from that point. Instead, you must derive the keyframe&#039;s timestamp&lt;br /&gt;
from the last page&#039;s granulepos, seek again to the start of that&lt;br /&gt;
keyframe, and decode forward until you reach the frame at the seek&lt;br /&gt;
target.&lt;br /&gt;
 &lt;br /&gt;
This is further complicated by the fact that packets often span multiple&lt;br /&gt;
Ogg pages, and that Ogg pages from different streams can be interleaved&lt;br /&gt;
between spanning packets.&lt;br /&gt;
 &lt;br /&gt;
The bisection method above works fine for seeking in local files, but&lt;br /&gt;
for seeking in files served over the Internet via HTTP, each bisection&lt;br /&gt;
or non-sequential read can trigger a new HTTP request, which can have&lt;br /&gt;
very high latency, making seeking very slow.&lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
== Seeking with an index ==&lt;br /&gt;
 &lt;br /&gt;
The Ogg index bitstream attempts to alleviate this problem, by providing&lt;br /&gt;
an index of periodic keyframes in an Ogg file. The index is contained in&lt;br /&gt;
a separate track which is embedded in the Ogg file, so that players&lt;br /&gt;
which don&#039;t understand the index track can just ignore it. In streams&lt;br /&gt;
without the concept of a keyframe, such as Vorbis streams where each&lt;br /&gt;
sample is independent, the index can instead record the time position at&lt;br /&gt;
periodic intervals, which achieves the same result. When this document&lt;br /&gt;
refers to keyframes, it also refers to these independent periodic&lt;br /&gt;
samples from keyframe-less streams.&lt;br /&gt;
 &lt;br /&gt;
The Ogg index bitstream provides seek algorithms with an ordered table&lt;br /&gt;
of the Ogg page start-offsets and end-times of key points in the indexed&lt;br /&gt;
streams in an Ogg segment.&lt;br /&gt;
 &lt;br /&gt;
A key point k is defined as follows. Each key point has an 8 byte offset&lt;br /&gt;
o, an 8 byte time t, and a 4 byte checksum c. This specifies that in&lt;br /&gt;
order to render the media at presentation time t milliseconds, the last&lt;br /&gt;
page which lies before the start of all the packets containing all the&lt;br /&gt;
key frames required to render at time t begins at offset o. The checksum&lt;br /&gt;
c is the checksum of the page which begins at offset o, which enables&lt;br /&gt;
you to verify that you&#039;re seeking to the intended page.&lt;br /&gt;
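 &lt;br /&gt;
One way to perform that verification is sketched below. It assumes that the checksum c is the value of the CRC field in the Ogg page header (bytes 22 to 25 of the page, stored least significant byte first); this document does not spell that out, so treat it as an assumption. The function name page_matches_key_point is hypothetical.&lt;br /&gt;

```python
import io

OGG_CAPTURE = b"OggS"   # capture pattern that starts every Ogg page
CRC_FIELD_START = 22    # position of the CRC field within the page header
CRC_FIELD_END = 26
PAGE_HEADER_MIN = 27    # size of the fixed part of an Ogg page header


def page_matches_key_point(stream, segment_start, offset, checksum):
    """Return True if a page with the expected checksum starts at offset.

    stream is a seekable binary file object; offset is relative to the
    start of the segment, which begins at byte segment_start.
    """
    stream.seek(segment_start + offset)
    header = stream.read(PAGE_HEADER_MIN)
    if len(header) != PAGE_HEADER_MIN or not header.startswith(OGG_CAPTURE):
        return False
    # Assumption: compare against the CRC field stored in the page header.
    stored = int.from_bytes(header[CRC_FIELD_START:CRC_FIELD_END], "little")
    return stored == checksum
```

If the check fails, the index is probably stale or damaged, and falling back to another seek method is the safe choice.&lt;br /&gt;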
 &lt;br /&gt;
To seek in an Ogg bitstream which contains an index, you find the last&lt;br /&gt;
key point in the index with time less than or equal to the target time.&lt;br /&gt;
You then seek to the key point&#039;s offset, check that the page found there&lt;br /&gt;
has checksum c, and then decode forward until you encounter the sample&lt;br /&gt;
which corresponds to your seek target time. You are guaranteed to pass&lt;br /&gt;
keyframes on all indexed streams with time less than or equal to your&lt;br /&gt;
seek target time while decoding up to the seek target.&lt;br /&gt;
 &lt;br /&gt;
Be aware that you cannot assume that any or all Ogg files will contain&lt;br /&gt;
an index, and so when implementing Ogg seeking, you must gracefully&lt;br /&gt;
fall back to a bisection search or other seek algorithm when the index&lt;br /&gt;
is not present.&lt;br /&gt;
 &lt;br /&gt;
The index also only holds data for the segment in which it resides, i.e.&lt;br /&gt;
if two Ogg files are concatenated together (&amp;quot;chained&amp;quot;), the index track&lt;br /&gt;
in one Ogg segment does not contain information about the keyframes in&lt;br /&gt;
the other Ogg segment.&lt;br /&gt;
 &lt;br /&gt;
The index also stores metadata about the segment in which it resides.&lt;br /&gt;
It stores the start time and the end time, and also the length of the&lt;br /&gt;
segment in bytes. This is so that if the seek target is outside of the&lt;br /&gt;
indexed range, you can immediately move to the next/previous segment and&lt;br /&gt;
either seek using that segment&#039;s index, or narrow the bisection window&lt;br /&gt;
if that segment has no index.&lt;br /&gt;
 &lt;br /&gt;
 &lt;br /&gt;
== Format Specification ==&lt;br /&gt;
 &lt;br /&gt;
Unless otherwise specified, all integers and fields in the bitstream are&lt;br /&gt;
encoded with the least significant bit coming first in each byte.&lt;br /&gt;
Integers and fields comprising more than one byte are encoded least&lt;br /&gt;
significant byte first (i.e. little endian byte order).&lt;br /&gt;
 &lt;br /&gt;
An Ogg index track starts with an identifier header packet which&lt;br /&gt;
contains the following data, in the following order:&lt;br /&gt;
 &lt;br /&gt;
* The identifier &amp;quot;index\0&amp;quot;.&lt;br /&gt;
* The index version format number, as a 1 byte unsigned integer. This specification describes version 1, so this field should have the value 0x01.&lt;br /&gt;
* The playback start time, in milliseconds, as an 8 byte unsigned integer. This is the presentation time of the first frame.&lt;br /&gt;
* The playback end time, in milliseconds, as an 8 byte unsigned integer. This is the end time of the last frame.&lt;br /&gt;
* The length of the indexed segment, in bytes, as an 8 byte unsigned integer.&lt;br /&gt;
* The number of key points in the index, &#039;n&#039;, as a 4 byte unsigned integer.&lt;br /&gt;
&lt;br /&gt;
 &lt;br /&gt;
The track then contains one secondary header packet, which contains the&lt;br /&gt;
actual index. This is the &amp;quot;index packet&amp;quot;, and it must begin on a new&lt;br /&gt;
page, but it may span multiple pages. The index packet contains the&lt;br /&gt;
following:&lt;br /&gt;
 &lt;br /&gt;
* &#039;n&#039; key points, each of which contains, in the following order:&lt;br /&gt;
** the page offset as an 8 byte unsigned integer, followed by&lt;br /&gt;
** the checksum of the page found at the offset, as a 4 byte field, followed by&lt;br /&gt;
** the presentation time in milliseconds of the key point, as an 8 byte unsigned integer.&lt;br /&gt;
 &lt;br /&gt;
The size of the data in the index packet is (n * (8 + 4 + 8)) bytes. The&lt;br /&gt;
key points are stored in increasing order by offset.&lt;br /&gt;
 &lt;br /&gt;
The track then contains one empty EOS packet, which must start on a new&lt;br /&gt;
page. The track therefore contains exactly three packets, on three or&lt;br /&gt;
more pages.&lt;br /&gt;
 &lt;br /&gt;
The offsets stored in the key points are relative to the start of the Ogg&lt;br /&gt;
bitstream segment. So if you have a physical Ogg bitstream made up of&lt;br /&gt;
two chained Oggs, the offsets in the second segment&#039;s index are&lt;br /&gt;
relative to the beginning of the second Ogg in the chain, not&lt;br /&gt;
the first. Also note that if a physical Ogg bitstream is made up of&lt;br /&gt;
chained Oggs, the presence of an index in one segment does not imply&lt;br /&gt;
that there will be an index in any other segment.&lt;br /&gt;
 &lt;br /&gt;
The exact number of keyframes used to construct key points in the index&lt;br /&gt;
is up to the indexer, but to limit the index size, we recommend&lt;br /&gt;
including at most one key point per 64 KB of data, or per 500 ms.&lt;br /&gt;
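 &lt;br /&gt;
One possible reading of this recommendation, shown purely as a sketch: an indexer keeps a candidate key point only once it lies at least 64 KB or at least 500 ms beyond the last key point it kept. Both the function name thin_key_points and this interpretation of the word &quot;or&quot; are assumptions.&lt;br /&gt;

```python
MIN_BYTE_GAP = 64 * 1024    # 64 KB
MIN_TIME_GAP_MS = 500


def thin_key_points(candidates):
    """Drop key points that sit too close to the previously kept one.

    candidates is a list of (offset, checksum, time_ms) tuples sorted by
    offset. The first candidate is always kept.
    """
    kept = []
    for point in candidates:
        if kept:
            last_offset, _checksum, last_time = kept[-1]
            far_in_bytes = point[0] - last_offset >= MIN_BYTE_GAP
            far_in_time = point[2] - last_time >= MIN_TIME_GAP_MS
            if not (far_in_bytes or far_in_time):
                continue  # too close to the last kept key point
        kept.append(point)
    return kept
```

A stricter indexer could require both gaps instead of either one, which would produce a smaller index at the cost of coarser seeking.&lt;br /&gt;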
 &lt;br /&gt;
There can be only one index track per Ogg bitstream segment. The index&lt;br /&gt;
packet must occur before all non-metadata streams&#039; content packets. In&lt;br /&gt;
practice this means that the index packet will occur along with other&lt;br /&gt;
secondary header pages, before the skeleton EOS page.&lt;br /&gt;
 &lt;br /&gt;
All pages in the index bitstream have their granulepos set as 0.&lt;br /&gt;
&lt;br /&gt;
== Software Prototype ==&lt;br /&gt;
&lt;br /&gt;
For a prototype indexer, see:&lt;br /&gt;
&lt;br /&gt;
http://github.com/cpearce/OggIndex&lt;br /&gt;
&lt;br /&gt;
To see how indexes improve network seeking performance, you can download a development&lt;br /&gt;
version of Firefox which can take advantage of indexes here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-ogg-seek.linux.tar.bz2&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-ogg-seek.macosx.dmg&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/firefox-indexed-ogg-seek.win32.zip&lt;br /&gt;
&lt;br /&gt;
Then point that browser here:&lt;br /&gt;
&lt;br /&gt;
http://pearce.org.nz/video/indexed-seek-demo.html&lt;/div&gt;</summary>
		<author><name>Cpearce</name></author>
	</entry>
</feed>