https://wiki.xiph.org/api.php?action=feedcontributions&user=Silvia&feedformat=atom
XiphWiki - User contributions [en]
2024-03-28T17:54:00Z
User contributions
MediaWiki 1.40.1
https://wiki.xiph.org/index.php?title=Ogg_Skeleton_3&diff=16370
Ogg Skeleton 3
2016-05-22T07:53:21Z
<p>Silvia: clean up annodex.net links</p>
<hr />
<div>'''Ogg Skeleton 3.0''' provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection.<br />
<br />
'''NOTE:''' ''The Ogg Skeleton format has been updated to [[Ogg Skeleton 4]], which includes a keyframe index to enable faster seeking. Encoding tools are recommended to use [[Ogg Skeleton 4]] in preference to version 3.0 where possible.''<br />
<br />
Ogg is a generic container format for time-continuous data streams, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). While such a file can generally be created with Ogg, it is not possible to generically parse such a file, seek within it, determine which codecs it contains, or dynamically handle and play back such content. <br />
<br />
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don't have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the "file" command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).<br />
<br />
Ogg Skeleton is being designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. <br />
<br />
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be desirable for the cut-out file to start with a playback time of 7 seconds rather than 0. This is of particular interest if you're streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .<br />
<br />
== Specification ==<br />
<br />
This is a motivation and design sketch.<br />
'''For the current specification see http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt'''<br />
<br />
=== How to describe the logical bitstreams within an Ogg container? ===<br />
<br />
The Skeleton should contain the following meta information about each logical bitstream:<br />
* the serial number: it identifies a content track<br />
* the mime type: it identifies the content type<br />
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width<br />
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream<br />
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating "granules / granulerate".<br />
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (Vorbis generally requires 2, Speex 3)<br />
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)<br />
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour, and this additional field allows them to retain this mapping when digitizing their content<br />
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time, allowing one to record e.g. the recording or broadcast time of some content<br />
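<br />
The granule-rate and granuleshift fields above combine to map a granulepos to time. As an illustrative sketch (the function name is hypothetical; the keyframe/offset split follows the Theora-style scheme described in [[GranulePosAndSeeking]]):<br />
<br />
```python
def granulepos_to_time(granulepos, rate_num, rate_den, granuleshift):
    """Map a granulepos to seconds, given the granule rate and granuleshift."""
    # The lower `granuleshift` bits hold the count of sub-seekable units
    # (e.g. frames since the last keyframe in Theora); the upper bits hold
    # the granule number of the keyframe itself.
    keyframe = granulepos >> granuleshift
    offset = granulepos - (keyframe << granuleshift)
    granules = keyframe + offset
    # time = granules / granulerate, where granulerate = rate_num / rate_den
    return granules * rate_den / rate_num

granulepos_to_time(88200, 44100, 1, 0)        # Vorbis-style: 88200 samples at 44100 Hz = 2.0 s
granulepos_to_time((100 << 6) + 4, 25, 1, 6)  # 25 fps Theora-style stream, keyframe 100 + 4 frames
```
<br />
For a granuleshift of 0 (e.g. Vorbis) the granulepos is used directly; each codec's media mapping defines the exact semantics.<br />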
<br />
=== How to allow the creation of substreams from an Ogg physical bitstream? ===<br />
<br />
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate the given cut-in time, and thus some surplus data may need to be remuxed into the new bitstream.<br />
<br />
The following information must be added to the Skeleton to allow correct presentation of a subpart of an Ogg bitstream:<br />
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)<br />
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream<br />
<br />
=== Ogg Skeleton version 3.0 Format Specification ===<br />
<br />
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.<br />
<br />
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.<br />
The first 8 bytes provide the magic identifier "fishead\0".<br />
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of "fisbone\0". The Skeleton logical bitstream has no actual content packets. Its eos page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.<br />
<br />
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Identifier 'fishead\0' | 0-3<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 4-7<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Version major | Version minor | 8-11<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Presentationtime numerator | 12-15<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 16-19<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Presentationtime denominator | 20-23<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 24-27<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Basetime numerator | 28-31<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 32-35<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Basetime denominator | 36-39<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 40-43<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| UTC | 44-47<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 48-51<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 52-55<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 56-59<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 60-63<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
<br />
The version fields provide version information for the Skeleton track, currently 3.0 (the version number evolved within the Annodex project).<br />
Presentation time and basetime are specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).<br />
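<br />
As an illustration of the layout above, a minimal parser for the 64-byte fishead packet might look as follows (a sketch assuming little-endian fields as in the diagram; the function name is hypothetical):<br />
<br />
```python
import struct

FISHEAD = struct.Struct('<8sHHqqqq20s')  # 64 bytes, little-endian

def parse_fishead(packet):
    """Parse a Skeleton 3.0 fishead ident header packet."""
    (magic, ver_major, ver_minor, ptime_num, ptime_den,
     btime_num, btime_den, utc) = FISHEAD.unpack_from(packet)
    assert magic == b'fishead\x00'
    return {
        'version': (ver_major, ver_minor),
        # rationals: numerator / denominator gives the time in seconds
        'presentation_time': ptime_num / ptime_den if ptime_den else 0.0,
        'basetime': btime_num / btime_den if btime_den else 0.0,
        'utc': utc,
    }
```
<br />
For example, a presentation time of 7 seconds at millisecond resolution would be stored as numerator 7000 and denominator 1000.<br />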
<br />
<br />
The fisbone secondary header packet looks as follows:<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Identifier 'fisbone\0' | 0-3<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 4-7<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Offset to message header fields | 8-11<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Serial number | 12-15<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Number of header packets | 16-19<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granulerate numerator | 20-23<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 24-27<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granulerate denominator | 28-31<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 32-35<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Basegranule | 36-39<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 40-43<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Preroll | 44-47<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granuleshift | Padding/future use | 48-51<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Message header fields ... | 52-<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
<br />
The mime type is provided as a message header field specified in the same way that HTTP header fields are given (e.g. "Content-Type: audio/vorbis"). Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility - to allow further fields to be added to the packet without disrupting the message header field parsing.<br />
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.<br />
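<br />
Continuing the parsing sketch, a fisbone packet could be decoded like this (illustrative names; it assumes the message header offset is counted from the end of the 8-byte identifier, which places the fields at byte 52 for an offset value of 44):<br />
<br />
```python
import struct

FISBONE_HEAD = struct.Struct('<8sIIIqqqIB3s')  # the 52 fixed bytes before the message fields

def parse_fisbone(packet):
    """Parse a Skeleton 3.0 fisbone secondary header packet."""
    (magic, msg_offset, serial, num_headers,
     rate_num, rate_den, basegranule,
     preroll, granuleshift, _pad) = FISBONE_HEAD.unpack_from(packet)
    assert magic == b'fisbone\x00'
    # HTTP-style "Name: value" fields, one per CRLF-terminated line
    fields = {}
    for line in packet[8 + msg_offset:].decode('utf-8').split('\r\n'):
        if ': ' in line:
            name, value = line.split(': ', 1)
            fields[name] = value
    return {'serial': serial, 'header_packets': num_headers,
            'granulerate': (rate_num, rate_den), 'basegranule': basegranule,
            'preroll': preroll, 'granuleshift': granuleshift, 'fields': fields}
```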
<br />
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:<br />
* there can only be one Skeleton logical bitstream in an Ogg bitstream<br />
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don't get confused about it being e.g. Ogg Vorbis without this meta information<br />
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)<br />
* the secondary header pages of all logical bitstreams come next, including Skeleton's secondary header packets<br />
* the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear<br />
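<br />
These encapsulation rules make Skeleton detection cheap: the first packet of the very first page must begin with "fishead\0". A minimal check (illustrative; it relies on the rules above guaranteeing that the fishead packet starts in the first page's body):<br />
<br />
```python
def is_skeleton_first(data):
    """Return True if the first Ogg page carries a Skeleton fishead packet."""
    if data[:4] != b'OggS':                # Ogg capture pattern
        return False
    n_segs = data[26]                      # page_segments field
    body = 27 + n_segs                     # packet data starts after the segment table
    return data[body:body + 8] == b'fishead\x00'
```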
<br />
== Development ==<br />
<br />
Ogg Skeleton is being supported by the following projects:<br />
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]<br />
* liboggz: [https://git.xiph.org/liboggz.git git]<br />
* the Annodex technology: (not available any more)<br />
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)<br />
* ffmpeg2theora (with --skeleton) <br />
* speexenc (with --skeleton) & speexdec<br />
* many more ...<br />
<br />
== External links ==<br />
<br />
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]<br />
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]<br />
<br />
<br />
[[Category:Ogg]]</div>
Silvia
https://wiki.xiph.org/index.php?title=Ogg&diff=16369
Ogg
2016-05-22T07:52:17Z
<p>Silvia: remove annodex.net links</p>
<hr />
<div>The '''Ogg''' transport bitstream is designed to provide framing, error protection and seeking structure for higher-level codec streams that consist of raw, unencapsulated data packets, such as the [[Opus]], [[Vorbis]] and [[FLAC]] audio codecs or the [[Theora]] and [[Dirac]] video codecs.<br />
<br />
== Name ==<br />
<br />
Ogg derives from "ogging", jargon from the computer game Netrek. Ogg is not an acronym and should not be written as "OGG".<br />
<br />
== Design constraints for Ogg bitstreams ==<br />
<br />
* True streaming; we must not need to seek to build a 100% complete bitstream.<br />
* Use no more than approximately 1-2% of bitstream bandwidth for packet boundary marking, high-level framing, sync and seeking.<br />
* Specification of absolute position within the original sample stream.<br />
* Simple mechanism to ease limited editing, such as a simplified concatenation mechanism.<br />
* Detection of corruption, recapture after error and direct, random access to data at arbitrary positions in the bitstream.<br />
<br />
== Specification / standard==<br />
<br />
The Ogg transport bitstream and file format is defined in RFC 3533, approved in May 2003. As RFC documents are immutable once approved, there will never be a newer version of RFC 3533; instead, corrections are collected at [[RFC_3533_Errata]]. Existing flaws are discussed at [[OggIssues]], ideas for the future at [[TransOgg]].<br />
<br />
== Detecting Ogg files and extracting information ==<br />
<br />
Ogg files begin with a signature "OggS". This signature also repeats many times inside the file, at the beginning of every page. There are several tools to get information about Ogg files:<br />
* Ogginfo - part of Vorbis-Tools, supports the Vorbis codec only (a historical Ogg-vs-Vorbis issue); other codecs cause it to report garbage<br />
* Opusinfo - part of Opus-Tools, supports only the Opus codec well, with minimal Vorbis support<br />
* Oggz ???<br />
* MediaInfo [http://sourceforge.net/projects/mediainfo/ sf.net/projects/mediainfo] - provides information about media (and some other) files, supports many types, also Ogg with various codecs, generic audio and video information only, no Ogg-specific details<br />
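<br />
As a sketch of the magic-number approach, the codec of the first logical bitstream can often be identified from the bos packet of the first page (the magic strings below are those defined by the respective Ogg mappings; the function name is illustrative):<br />
<br />
```python
# Magic bytes at the start of well-known bos packets (per each codec's Ogg mapping)
CODEC_MAGIC = {
    b'\x01vorbis': 'Vorbis',
    b'\x80theora': 'Theora',
    b'OpusHead': 'Opus',
    b'Speex   ': 'Speex',
    b'\x7fFLAC': 'FLAC',
    b'fishead\x00': 'Skeleton',
}

def identify_first_stream(data):
    """Identify the codec of the first logical bitstream from its bos packet."""
    if data[:4] != b'OggS':
        return None                        # not an Ogg file at all
    body = 27 + data[26]                   # skip the page header and segment table
    for magic, name in CODEC_MAGIC.items():
        if data[body:body + len(magic)] == magic:
            return name
    return 'unknown'
```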
<br />
== Projects using Ogg ==<br />
<br />
=== Codecs ===<br />
<br />
* [[Opus]]<br />
* [[CMML]]<br />
* [[FLAC]] ([http://xiph.org/flac/ogg_mapping.html Ogg mapping])<br />
* [[OggKate|Kate]]<br />
* [http://opus-codec.org/ Opus] ([[OggOpus|Ogg mapping]])<br />
* [[OggPCM|PCM]]<br />
* [[Ogg Skeleton|Skeleton]]<br />
* [[Speex]] ([[OggSpeex|Ogg mapping]])<br />
* [[Theora]] ([[OggTheora|Ogg mapping]])<br />
* [[Vorbis]] ([[OggVorbis|Ogg mapping]])<br />
* [[OggWrit|Writ]]<br />
<br />
=== Servers ===<br />
<br />
* [[Icecast]]<br />
* [http://www.metavid.org/ Metavid]<br />
<br />
== Developer info ==<br />
<br />
* [[GranulePosAndSeeking]] -- a discussion of the interpretation of granulepos, and the algorithm for seeking on Ogg files<br />
<br />
=== Ogg page format ===<br />
<br />
The least significant bit (LSb) comes first within each byte. Fields longer than one byte are encoded least significant byte (LSB) first.<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| capture_pattern: Magic number for page start "OggS" | 0-3<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| version | header_type | granule_position | 4-7<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 8-11<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | bitstream_serial_number | 12-15<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | page_sequence_number | 16-19<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | CRC_checksum | 20-23<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| |page_segments | segment_table | 24-27<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... | 28-<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
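<br />
A minimal parser for this header layout (a sketch; it omits the CRC check and assumes the packet data is contained within this one page):<br />
<br />
```python
import struct

OGG_PAGE_HEADER = struct.Struct('<4sBBqIIIB')  # the fixed 27-byte header

def parse_page_header(data):
    """Parse one Ogg page starting at data[0]."""
    (capture, version, header_type, granulepos,
     serial, sequence, crc, n_segs) = OGG_PAGE_HEADER.unpack_from(data)
    assert capture == b'OggS' and version == 0
    lacing = data[27:27 + n_segs]          # segment table: one lacing value per segment
    body_len = sum(lacing)
    return {
        'header_type': header_type,        # bit flags: continued / bos / eos
        'granulepos': granulepos,
        'serial': serial,
        'sequence': sequence,
        'body': data[27 + n_segs:27 + n_segs + body_len],
    }
```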
<br />
<br />
== Implementations ==<br />
<br />
The Ogg encapsulation format can be handled with the following libraries:<br />
<br />
* libogg: [http://svn.xiph.org/trunk/ogg/ libogg svn] (C, cross-platform) Low-level Ogg parsing and writing.<br />
* liboggz: [http://git.xiph.org/?p=liboggz.git liboggz git] (C, cross-platform) liboggz wraps libogg and provides features such as seeking.<br />
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable] (C++, Win32)<br />
* [http://www.kfish.org/software/hogg HOgg] (pure Haskell)<br />
* [http://www.jcraft.com/jorbis/ JOrbis] (pure Java) contains com.jcraft.jogg<br />
* [http://www.sacredchao.net/quodlibet/wiki/Development/Mutagen Mutagen] (pure Python)<br />
<br />
== See also ==<br />
<br />
* [[Flash]]<br />
* [[Oggless]]<br />
* [[MIME Types and File Extensions]]<br />
* [[RFC_3533_Errata]] - errors and flaws in the specification<br />
* [[Nut_Container]]<br />
<br />
== External links ==<br />
<br />
* [http://www.xiph.org/ogg/doc/ Ogg documentation]<br />
* [http://www.ietf.org/rfc/rfc3533.txt Ogg RFC]<br />
* [http://en.wikipedia.org/wiki/Ogg Ogg at Wikipedia]<br />
* [http://wiki.multimedia.cx/index.php?title=Ogg Ogg at Multimedia Wiki]<br />
<br />
[[Category:Ogg]]</div>
Silvia
https://wiki.xiph.org/index.php?title=Ogg_Skeleton_4&diff=16368
Ogg Skeleton 4
2016-05-22T07:50:15Z
<p>Silvia: updated liboggz link</p>
<hr />
<div>'''Ogg Skeleton''' provides structuring information for multitrack [[Ogg]] files. It is compatible with Ogg [[Theora]] and provides extra clues for synchronization and content negotiation such as language selection. The latest version of Skeleton, version 4.0, also provides keyframe indexes to enable optimal seeking over high-latency connections, such as the internet.<br />
<br />
Ogg is a generic container format, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex or Vorbis or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). While such a file can generally be created with Ogg, it is not possible to generically parse such a file, seek within it, determine which codecs it contains, or dynamically handle and play back such content. <br />
<br />
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream. This is particularly a problem if you don't have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (such as the "file" command under *nix to determine what file it is through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).<br />
<br />
Ogg Skeleton is designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time. <br />
<br />
Seeking in an Ogg file is typically implemented as a bisection search for the seek target timestamp. However, when seeking over a high-latency connection such as the internet, such searches can be slow. Some bitstreams, notably Theora, have keyframes: to seek to a given temporal offset in a Theora stream, you must first perform a bisection search to find the target Theora frame, determine its keyframe, and then perform another bisection search to locate that keyframe and decode forwards to the temporal offset. This can be very slow. Ogg Skeleton 4.0 provides an index of keyframes (and indexes periodic samples on streams without the concept of a keyframe) so that seeking over high-latency connections can be performed optimally in "one hop".<br />
<br />
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be desirable for the cut-out file to start with a playback time of 7 seconds rather than 0. This is of particular interest if you're streaming this file from a Web server after a query for a temporal subpart such as in http://example.com/video.ogv?t=7-59 .<br />
<br />
=== Previous version ===<br />
<br />
The previous version of Ogg Skeleton was version 3, and its specification can be found on the wiki page [[Ogg Skeleton 3]], or at [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt].<br />
<br />
=== How to describe the logical bitstreams within an Ogg container? ===<br />
<br />
The Skeleton should contain the following meta information about each logical bitstream:<br />
* the serial number: it identifies a content track<br />
* the mime type: it identifies the content type<br />
* other generic name-value fields that can provide meta information such as the language of a track or the video height and width<br />
* the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream<br />
* the granule rate: the granule rate represents the data rate in Hz at which content is sampled for the particular logical bitstream. Note that when using this to interpret timestamps, the granulepos of a data page must first be parsed to extract a granule value using the method described in [[GranulePosAndSeeking]]. This value can then be mapped to time by calculating "granules / granulerate".<br />
* the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (Vorbis generally requires 2, Speex 3)<br />
* the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in theora)<br />
* a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour, and this additional field allows them to retain this mapping when digitizing their content<br />
* a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time, allowing one to record e.g. the recording or broadcast time of some content<br />
<br />
=== How to allow the creation of substreams from an Ogg physical bitstream? ===<br />
<br />
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate the given cut-in time, and thus some surplus data may need to be remuxed into the new bitstream.<br />
<br />
The following information must be added to the Skeleton to allow correct presentation of a subpart of an Ogg bitstream:<br />
* the presentation time: this is the actual cut-in time and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)<br />
* the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream<br />
<br />
=== Keyframe indexes for faster seeking ===<br />
<br />
Seeking in an Ogg file is typically implemented as a bisection search over the pages in the file. This works well for seeking in local files, but for seeking in files served over the Internet via HTTP, each bisection or non-sequential read can trigger a new HTTP request, which can have very high latency, making seeking very slow. Seeking is further complicated by the fact that packets often span multiple Ogg pages, and that Ogg pages from different streams can be interleaved between spanning packets. <br />
<br />
Each content track has a separate index, which is stored in its own packet in the Skeleton 4.0 track. The index for streams without the concept of a keyframe, such as Vorbis streams, can instead record the time position at periodic intervals, which achieves the same result. When this document refers to keyframes, it also implicitly refers to these independent periodic samples from keyframe-less streams. <br />
<br />
Because all the Skeleton track's index packets appear in the header pages of the Ogg segment, all the keyframe indexes are immediately available once the header packets have been read when playing the media over a network connection. <br />
<br />
For every content stream in an Ogg segment, the Skeleton provides seek algorithms with an index, or ordered table, of "key points". A key point is intrinsically associated with exactly one stream, and stores the byte offset o of the last page which lies before all data required to decode the keyframe, as well as the keyframe's presentation time t, expressed as a fraction of seconds.<br />
<br />
The offset is relative to the beginning of the Ogg segment, and is exactly the first byte of a page in the indexed stream, so if you seek to a keypoint's offset and don't find the beginning of a page there, or you find a page from another stream, you can assume that the Ogg segment has been modified since the index was constructed, and the index can be considered invalid. The time t is the keyframe's presentation time corresponding to the granulepos, and is represented as a fraction in seconds. Note that if a stream requires any preroll, this will be accounted for in the time stored in the keypoint. <br />
<br />
The Skeleton 4.0 track contains one index for each content stream in the file. To seek in an Ogg file which contains keyframe indexes, first construct the set containing each active stream's last keypoint whose time is less than or equal to the seek target time. This gives you a known point on every stream which lies before the seek target. Then, from that set of keypoints, select the one with the smallest byte offset. Verify that a page from the keypoint's stream is found at exactly that offset, and if so, begin decoding. You are guaranteed to pass keyframes on all streams with time less than or equal to your seek target time while decoding up to the seek target. However, if you don't encounter a keyframe with the same presentation time as is stored in the keypoint, then the index is invalid (possibly the file has been changed without updating the index) and you must either fall back to a bisection search, or keep decoding if you've landed "close enough" to the seek target.<br />
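<br />
The keypoint-selection step described above can be sketched as follows (data structures are illustrative; a real implementation would read the keypoints from the Skeleton index packets):<br />
<br />
```python
import bisect

def choose_seek_offset(indexes, target_time):
    """Pick the byte offset to seek to, given per-stream keypoint indexes.

    indexes maps a stream serial number to a sorted list of
    (time_in_seconds, byte_offset) keypoints.
    """
    candidates = []
    for keypoints in indexes.values():
        times = [t for t, _ in keypoints]
        # last keypoint at or before the seek target for this stream
        i = bisect.bisect_right(times, target_time) - 1
        if i < 0:
            return 0          # no keypoint before the target: start of segment
        candidates.append(keypoints[i][1])
    # the smallest offset guarantees we pass every stream's relevant keypoint
    return min(candidates)
```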
<br />
Be aware that you cannot assume that any or all Ogg files will contain keyframe indexes, so when implementing Ogg seeking, you must gracefully fall back to a bisection search or another seek algorithm when the index is not present or invalid.<br />
<br />
The Skeleton 4.0 index packets also store metadata about the segment in which they reside: each stores the timestamps of the first and last samples in its track. This allows you to determine the duration of the indexed Ogg media without having to decode the start and end of the Ogg segment to calculate the difference. With the index packets storing the start and end times of every track, the duration is the end time of the last active stream minus the start time of the first active stream.<br />
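<br />
The duration calculation is then straightforward (an illustrative sketch; the per-track start and end times come from the index packets):<br />
<br />
```python
def segment_duration(track_times):
    """track_times maps a stream serial to (first_sample_time, last_sample_time)."""
    start = min(first for first, _ in track_times.values())
    end = max(last for _, last in track_times.values())
    return end - start
```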
<br />
The Skeleton 4.0 BOS packet contains the length of the indexed segment in bytes. This is so that if the seek target is outside of the indexed range, you can immediately move to the next/previous segment and either seek using that segment's index, or narrow the bisection window if that segment has no index. You can also use the segment length to verify that the index is valid. If the contents of the segment have changed, it is highly likely that the length of the segment has changed as well. When you load the segment's header pages, you should check the length of the physical segment; if it doesn't match the length stored in the Skeleton header packet, you know that either the index is out of date, or the file has been chained since indexing.<br />
<br />
The Skeleton 4.0 BOS packet also contains the offset of the first non-header page in the Ogg segment. This means that if you wish to delay loading of an index for whatever reason, you can skip forward to that offset and start decoding from there.<br />
<br />
When using the index to seek, you must verify that the index is still correct. You can consider the index invalid if any of the following are true:<br />
<br />
* The segment doesn't end at the segment length offset stored in the Skeleton BOS packet (note that a new "link" in a "chain" can start at the end of the segment), or<br />
* after a seek to a keypoint's offset, you don't land exactly on a page boundary, or<br />
* after a seek to a keypoint's offset, you don't land on a page which belongs to that keypoint's stream.<br />
<br />
While loading the Skeleton BOS header, you should always check the Skeleton version field to ensure your decoder correctly knows how to parse the Skeleton track. <br />
<br />
Be aware that a keyframe index may not index all keyframes in the Ogg segment; it may only index periodic keyframes instead.<br />
<br />
<br />
=== Ogg Skeleton version 4.0 Format Specification ===<br />
<br />
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding such that existing players can still continue to play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream to provide meta information about structure and content of the Ogg bitstream.<br />
<br />
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page.<br />
The first 8 bytes provide the magic identifier "fishead\0".<br />
After the fishead follows a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8 byte code of "fisbone\0". The Skeleton logical bitstream has no actual content packets. Its EOS page is included into the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0. Note the EOS packet appears by itself on its own page (the "EOS page").<br />
<br />
The fishead ident header looks as follows ([http://annodex.org/w/images/3/39/FishHeads.JPG inspiration]):<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Identifier 'fishead\0' | 0-3<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 4-7<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Version major | Version minor | 8-11<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Presentationtime numerator | 12-15<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 16-19<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Presentationtime denominator | 20-23<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 24-27<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Basetime numerator | 28-31<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 32-35<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Basetime denominator | 36-39<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 40-43<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| UTC | 44-47<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 48-51<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 52-55<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 56-59<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 60-63<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Segment length in bytes | 64-67<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 68-71<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Content byte offset | 72-75<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 76-79<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
<br />
The version fields provide version information for the Skeleton track, currently being 4.0 (the number having evolved within the Annodex project).<br />
Presentation time and basetime are each specified as a rational number, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).<br />
<br />
<br />
The fisbone secondary header packet looks as follows:<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Identifier 'fisbone\0' | 0-3<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 4-7<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Offset to message header fields | 8-11<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Serial number | 12-15<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Number of header packets | 16-19<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granulerate numerator | 20-23<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 24-27<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granulerate denominator | 28-31<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 32-35<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Basegranule | 36-39<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 40-43<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Preroll | 44-47<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granuleshift | Padding/future use | 48-51<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Message header fields ... | 52-<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
<br />
The mime type is provided as a message header field specified in the same way that HTTP header fields are given, e.g. "Content-Type: audio/vorbis". Message header fields are terminated/delimited by "\r\n". Further meta information (such as language and screen size) are also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility - to allow further fields to be included into the packet without disrupting the message header field parsing.<br />
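Parsing these HTTP-style fields can be sketched as follows (an illustration only; it assumes the message header region has already been sliced out of the fisbone packet using the stored offset, and the function name is hypothetical):<br />

```python
def parse_message_headers(raw):
    """Parse "Name: value" message header fields delimited by \r\n.

    raw: bytes of the message header region of a fisbone packet.
    Returns a dict of field name -> value.
    """
    fields = {}
    for line in raw.split(b"\r\n"):
        if not line:
            continue  # skip the empty fragment after the final \r\n
        name, _, value = line.partition(b":")
        fields[name.strip().decode("ascii")] = value.strip().decode("utf-8")
    return fields
```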
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.<br />
<br />
The following message headers are compulsory in Skeleton 4.0:<br />
* Content-type: mime type of the content encoded in this stream, e.g. audio/vorbis, video/theora, etc. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
* Role: describes the function of this track. Common examples are "video/main", "audio/main", "text/caption". For a complete list of possibilities, see http://wiki.xiph.org/SkeletonHeaders#Role.<br />
* Name: a unique free text string which can be used to directly address the track in scripting applications, such as an HTML5 viewer.<br />
<br />
For more message headers, see [[SkeletonHeaders]].<br />
<br />
Before the Skeleton EOS page in the segment header pages come the Skeleton 4.0 keyframe index packets. There should be one index packet for each content track in the Ogg segment, but index packets are not required for a Skeleton 4.0 track to be considered valid. Each keyframe in the index is stored in a "keypoint", which in turn stores an offset and a timestamp. In order to save space, the offsets and timestamps are stored as deltas, and then variable byte-encoded. The offset and timestamp deltas store the difference between the keypoint's offset and timestamp and the previous keypoint's offset and timestamp. So to calculate the page offset of a keypoint you must sum the offset deltas up to and including that keypoint in the index.<br />
<br />
The variable byte encoded integers are encoded using 7 bits per byte to store the integer's bits, and the high bit is set in the last byte used to encode the integer. The bits and bytes are in little endian order. For example, the integer 7843, or 0001 1110 1010 0011 in binary, would be stored as two bytes: 0x23 0xBD, or 0010 0011 1011 1101 in binary.<br />
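The scheme just described can be sketched as an encoder/decoder pair (an illustration of the stated rules: 7 payload bits per byte, least-significant bits first, high bit marking the final byte; function names are hypothetical):<br />

```python
def vb_encode(n):
    """Variable-byte-encode a non-negative integer."""
    out = bytearray()
    while True:
        b = n & 0x7F   # next 7 least-significant bits
        n >>= 7
        if n == 0:
            out.append(b | 0x80)  # high bit marks the final byte
            return bytes(out)
        out.append(b)

def vb_decode(data):
    """Decode one variable-byte-encoded integer from the start of data."""
    value = 0
    for shift, b in enumerate(data):
        value |= (b & 0x7F) << (7 * shift)
        if b & 0x80:
            return value
    raise ValueError("truncated variable byte integer")
```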
<br />
Each index packet contains the following: <br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Identifier 'index\0' | 0-3<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... |Serial number | 4-7<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... |Number of keypoints | 8-11<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... | 12-15<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... | Timestamp denominator | 16-19<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... | 20-23<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... | First sample time numerator | 24-27<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... | 28-31<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... | Last sample end time numerator| 32-35<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... | 36-39<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... |Keypoints... | 40-43<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
<br />
The fields of the index packet are as follows:<br />
<br />
# Identifier 6 bytes: "index\0". Bytes [0...5].<br />
# The serialno of the stream this index applies to, as a 4 byte field. Bytes [6...9]<br />
# The number of keypoints in this index packet, 'n', as an 8 byte unsigned integer. This can be 0. Bytes [10...17].<br />
# The presentation time denominator for this stream, as an 8 byte signed integer. All timestamps, including keypoint timestamps, first and last sample timestamps are fractions of seconds over this denominator. This must not be 0. Bytes [18...25].<br />
# First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the track. Bytes [26...33]<br />
# Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the track. Bytes [34...41]<br />
# 'n' key points, starting with the first keypoint at byte 42. Each keypoint contains, in the following order:<br />
## the keyframe's page's byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint's offset, or from the start of the segment if this is the first keypoint. The keypoint's page start is therefore the sum of the byte-offset-deltas of all the keypoints up to and including it.<br />
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint's offset, as a variable byte encoded integer. This is the difference from the previous keypoint's timestamp numerator. The keypoint's timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint's. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.<br />
<br />
The key points are stored in increasing order by offset (and thus by presentation time as well).<br />
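Undoing the delta coding can be sketched as follows (an illustration only; it bundles a variable-byte reader with the accumulation of offset and timestamp deltas, and all names are hypothetical):<br />

```python
def read_varint(buf, pos):
    """Read one variable-byte-encoded integer starting at buf[pos].

    7 payload bits per byte, little endian; the high bit marks the
    final byte. Returns (value, next_position).
    """
    value, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        shift += 7
        if b & 0x80:
            return value, pos

def decode_keypoints(buf, n, time_denom):
    """Return [(absolute_offset, absolute_time_seconds)] for n keypoints."""
    pos, offset, time_num, out = 0, 0, 0, []
    for _ in range(n):
        offset_delta, pos = read_varint(buf, pos)
        time_delta, pos = read_varint(buf, pos)
        offset += offset_delta    # offsets accumulate from the segment start
        time_num += time_delta    # timestamp numerators accumulate likewise
        out.append((offset, time_num / time_denom))
    return out
```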
<br />
The byte offsets stored in keypoints are relative to the start of the Ogg bitstream segment. So if you have a physical Ogg bitstream made up of two chained Oggs, the offsets in the second Ogg segment's bitstream's index are relative to the beginning of the second Ogg in the chain, not the first. Also note that if a physical Ogg bitstream is made up of chained Oggs, the presence of an index in one segment does not imply that there will be an index in any other segment. <br />
<br />
The first-sample-time and last-sample-time are rational numbers, in units of seconds. If the denominator is 0 for the first-sample-time or the last-sample-time, then that value was unable to be determined at indexing time, and is unknown.<br />
<br />
The exact number of keyframes used to construct key points in the index is up to the indexer, but to limit the index size, we recommend including at most one key point per every 64KB of data, or every 1000ms, whichever is least frequent.<br />
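An indexer following that recommendation can be sketched as follows (a hypothetical helper deciding whether a candidate keyframe should become a keypoint; requiring both spacings yields the less frequent of the two limits):<br />

```python
MIN_KEYPOINT_BYTES = 64 * 1024  # at most one keypoint per 64KB of data ...
MIN_KEYPOINT_MS = 1000          # ... or per 1000ms, whichever is less frequent

def should_emit_keypoint(last_offset, last_time_ms, offset, time_ms):
    """True if this keyframe is far enough from the previous keypoint.

    Both the byte gap and the time gap must exceed their thresholds,
    so keypoint frequency is bounded by the more restrictive limit.
    """
    return (offset - last_offset >= MIN_KEYPOINT_BYTES
            and time_ms - last_time_ms >= MIN_KEYPOINT_MS)
```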
<br />
=== Further restrictions === <br />
<br />
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:<br />
* there can only be one Skeleton logical bitstream in an Ogg bitstream.<br />
* the Skeleton bos page is the very first bos page in the Ogg stream such that it can be identified straight away and decoders don't get confused about it being e.g. Ogg Vorbis without this meta information<br />
* the bos pages of all the other logical bitstreams come next (a requirement of Ogg)<br />
* the secondary header pages of all logical bitstreams come next, including Skeleton's secondary header packets (the fisbone and index packets)<br />
* the Skeleton EOS packet appears by itself on the last page of the Skeleton stream (the "EOS page").<br />
* the Skeleton EOS page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear.<br />
<br />
== Development ==<br />
<br />
Ogg Skeleton 4 is being supported by the following projects:<br />
* ffmpeg2theora (version 0.27 and above) <br />
* OggIndex: [http://firefogg.org/nightly/ binaries], [http://git.xiph.org/?p=OggIndex.git;a=summary source]<br />
* Mozilla Firefox 4<br />
<br />
The following projects currently support Ogg Skeleton 3; support for Ogg Skeleton 4 is planned:<br />
* speexenc (with --skeleton) & speexdec<br />
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]<br />
* liboggz: [https://git.xiph.org/liboggz.git git]<br />
* the Annodex technology: (not available any more)<br />
* [http://www.kfish.org/software/hogg/ HOgg] (Haskell)<br />
* many more ...<br />
<br />
== External links ==<br />
<br />
* Ogg Skeleton is described in more detail in the [http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt Skeleton I-D in svn]<br />
* Ogg Skeleton was originally specified in Annodex v3: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]<br />
<br />
<br />
[[Category:Ogg]]</div>
Silvia
https://wiki.xiph.org/index.php?title=MIME_Types_and_File_Extensions&diff=15297
MIME Types and File Extensions
2015-01-15T13:52:58Z
<p>Silvia: updated STATUS</p>
<hr />
<div>STATUS: [http://www.ietf.org/rfc/rfc5334.txt RFC 5334] encapsulates the below listed policies. More details are [http://wiki.xiph.org/index.php/MIMETypesCodecs here], which also include a specification of the codecs parameter of the MIME types. Use the correct file extensions straight away.<br />
<br />
<br />
IMPLEMENTATION recommendations and patches: see [[MIME-Migration]].<br />
<br />
== .ogg - audio/ogg ==<br />
<br />
* Ogg Vorbis I Profile<br />
* .ogg applies now for Vorbis I files only<br />
* .ogg has more recently also been used for Ogg FLAC and for Theora, too &mdash; these uses are deprecated now in favor of .oga and .ogv respectively<br />
* has been defined in RFC 3534 for application/ogg, so rfc 3534 will be re-defined<br />
<br />
RATIONALE: .ogg has traditionally been used for Vorbis I files, in particular in HW players, hence it is kept for backwards-compatibility<br />
<br />
== .ogv - video/ogg ==<br />
<br />
* Ogg Video Profile (a/v in Ogg container)<br />
* apps supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg<br />
* This list is not exhaustive (for example, [[Dirac]] + FLAC is acceptable too)<br />
* SHOULD contain a Skeleton track and/or MAY contain a CMML logical bitstream.<br />
<br />
== .opus - audio/ogg ==<br />
<br />
* Ogg Opus profile<br />
* Defined by https://tools.ietf.org/html/draft-ietf-codec-oggopus<br />
<br />
== .oga - audio/ogg ==<br />
<br />
* Ogg Audio Profile (audio in Ogg container)<br />
* Applications supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* Covers Ogg [[FLAC]], [[Ghost]], and [[OggPCM]] <br />
* Although they share the same MIME type, Vorbis, Opus and Speex use different file extensions.<br />
* SHOULD contain a Skeleton logical bitstream.<br />
* Vorbis and Speex may use .oga, but it is not the preferred method of distributing these files because of backwards-compatibility issues.<br />
<br />
== .ogx - application/ogg ==<br />
<br />
* Ogg Multiplex Profile (anything in [[Ogg]])<br />
* can contain any logical bitstreams multiplexed together in an ogg container<br />
* will replace the .ogg extension from RFC 3534<br />
* random multitrack files MUST contain a [[Skeleton]] track to identify all containing logical bitstreams<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* thus, e.g. an annodex file can gracefully degrade to .ogx if an app cannot decode [[CMML]] and/or [[Skeleton]]<br />
* USE: application/ogg has been registered, so can be used immediately<br />
<br />
== .spx - audio/ogg ==<br />
<br />
* Ogg Speex Profile<br />
* .spx has traditionally been used for Speex files within Ogg and should be considered for backwards-compatibility<br />
<br />
== .flac - audio/flac ==<br />
<br />
* FLAC in native encapsulation format<br />
<br />
== .anx - application/annodex ==<br />
<br />
* THIS FILE FORMAT IS DEPRECATED.<br />
* Profile for multiplexed Ogg that includes a skeleton track and at least one CMML logical bitstream<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* apps that come across an annodex file and cannot decode CMML and/or Skeleton, but can deal with the others SHOULD gracefully degrade by ignoring these<br />
<br />
== .axa - audio/annodex ==<br />
<br />
* THIS FILE FORMAT IS DEPRECATED.<br />
* Profile for audio in Annodex <br />
* covers e.g. [[Vorbis]], [[Speex]], [[FLAC]], [[Opus]], [[Ghost]], [[OggPCM]] inside Ogg with Skeleton and CMML<br />
<br />
== .axv - video/annodex ==<br />
<br />
* THIS FILE FORMAT IS DEPRECATED.<br />
* Profile for video in Annodex <br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg with Skeleton and CMML<br />
<br />
== .xspf - application/xspf+xml ==<br />
<br />
* Profile for XSPF<br />
* Covers [[XSPF]], while being used through XML<br />
* Does not cover [[JSPF]], which is XSPF but on JSON<br />
<br />
== Ogg Kate files - application/kate ==<br />
<br />
* Binary representation of Kate encapsulated in Ogg<br />
* may have a skeleton<br />
* can be used to identify the mime type of the track itself (e.g. in skeleton)<br />
* uses .ogx extension when in a file by itself<br />
* is subsumed under the dominant mime type if in an audio or video file, becoming audio/ogg or video/ogg<br />
<br />
== Codec MIME types ==<br />
<br />
Codecs need their own MIME types for streaming in RTP and to be used in multitrack ogg files using skeleton:<br />
<br />
* audio/vorbis for Vorbis without container<br />
* video/theora for Theora without container<br />
* audio/speex for Speex without container<br />
* audio/flac for FLAC without and in native container<br />
* audio/opus for Opus without container<br />
* text/cmml for CMML without container<br />
* application/kate for the textual representation of Kate (.kate files)</div>
Silvia
https://wiki.xiph.org/index.php?title=MIME_Types_and_File_Extensions&diff=15294
MIME Types and File Extensions
2015-01-15T13:46:58Z
<p>Silvia: deprecating Annodex.</p>
<hr />
<div>STATUS: Work on RFCs and tools is in process to reflect these policies. More details are [http://wiki.xiph.org/index.php/MIMETypesCodecs here], which also include a specification of the codecs parameter of the MIME types. Use the correct file extensions straight away.<br />
<br />
DISCLAIMER: currently, application/ogg, video/ogg, audio/ogg and audio/vorbis are registered MIME types. Registration for the others will be undertaken. During this process, the "x-" versions of these unregistered MIME types may be used.<br />
<br />
IMPLEMENTATION recommendations and patches: see [[MIME-Migration]].<br />
<br />
== .ogg - audio/ogg ==<br />
<br />
* Ogg Vorbis I Profile<br />
* .ogg applies now for Vorbis I files only<br />
* .ogg has more recently also been used for Ogg FLAC and for Theora, too &mdash; these uses are deprecated now in favor of .oga and .ogv respectively<br />
* has been defined in RFC 3534 for application/ogg, so rfc 3534 will be re-defined<br />
<br />
RATIONALE: .ogg has traditionally been used for Vorbis I files, in particular in HW players, hence it is kept for backwards-compatibility<br />
<br />
== .ogv - video/ogg ==<br />
<br />
* Ogg Video Profile (a/v in Ogg container)<br />
* apps supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg<br />
* This list is not exhaustive (for example, [[Dirac]] + FLAC is acceptable too)<br />
* SHOULD contain a Skeleton track and/or MAY contain a CMML logical bitstream.<br />
<br />
== .opus - audio/ogg ==<br />
<br />
* Ogg Opus profile<br />
* Defined by https://tools.ietf.org/html/draft-ietf-codec-oggopus<br />
<br />
== .oga - audio/ogg ==<br />
<br />
* Ogg Audio Profile (audio in Ogg container)<br />
* Applications supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* Covers Ogg [[FLAC]], [[Ghost]], and [[OggPCM]] <br />
* Although they share the same MIME type, Vorbis, Opus and Speex use different file extensions.<br />
* SHOULD contain a Skeleton logical bitstream.<br />
* Vorbis and Speex may use .oga, but it is not the preferred method of distributing these files because of backwards-compatibility issues.<br />
<br />
== .ogx - application/ogg ==<br />
<br />
* Ogg Multiplex Profile (anything in [[Ogg]])<br />
* can contain any logical bitstreams multiplexed together in an ogg container<br />
* will replace the .ogg extension from RFC 3534<br />
* random multitrack files MUST contain a [[Skeleton]] track to identify all containing logical bitstreams<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* thus, e.g. an annodex file can gracefully degrade to .ogx if an app cannot decode [[CMML]] and/or [[Skeleton]]<br />
* USE: application/ogg has been registered, so can be used immediately<br />
<br />
== .spx - audio/ogg ==<br />
<br />
* Ogg Speex Profile<br />
* .spx has traditionally been used for Speex files within Ogg and should be considered for backwards-compatibility<br />
<br />
== .flac - audio/flac ==<br />
<br />
* FLAC in native encapsulation format<br />
<br />
== .anx - application/annodex ==<br />
<br />
* THIS FILE FORMAT IS DEPRECATED.<br />
* Profile for multiplexed Ogg that includes a skeleton track and at least one CMML logical bitstream<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* apps that come across an annodex file and cannot decode CMML and/or Skeleton, but can deal with the others SHOULD gracefully degrade by ignoring these<br />
<br />
== .axa - audio/annodex ==<br />
<br />
* THIS FILE FORMAT IS DEPRECATED.<br />
* Profile for audio in Annodex <br />
* covers e.g. [[Vorbis]], [[Speex]], [[FLAC]], [[Opus]], [[Ghost]], [[OggPCM]] inside Ogg with Skeleton and CMML<br />
<br />
== .axv - video/annodex ==<br />
<br />
* THIS FILE FORMAT IS DEPRECATED.<br />
* Profile for video in Annodex <br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg with Skeleton and CMML<br />
<br />
== .xspf - application/xspf+xml ==<br />
<br />
* Profile for XSPF<br />
* Covers [[XSPF]], while being used through XML<br />
* Does not cover [[JSPF]], which is XSPF but on JSON<br />
<br />
== Ogg Kate files - application/kate ==<br />
<br />
* Binary representation of Kate encapsulated in Ogg<br />
* may have a skeleton<br />
* can be used to identify the mime type of the track itself (e.g. in skeleton)<br />
* uses .ogx extension when in a file by itself<br />
* is subsumed under the dominant mime type if in an audio or video file, becoming audio/ogg or video/ogg<br />
<br />
== Codec MIME types ==<br />
<br />
Codecs need their own MIME types for streaming in RTP and to be used in multitrack ogg files using skeleton:<br />
<br />
* audio/vorbis for Vorbis without container<br />
* video/theora for Theora without container<br />
* audio/speex for Speex without container<br />
* audio/flac for FLAC without and in native container<br />
* audio/opus for Opus without container<br />
* text/cmml for CMML without container<br />
* application/kate for the textual representation of Kate (.kate files)</div>
Silvia
https://wiki.xiph.org/index.php?title=Talk:SkeletonHeaders&diff=14997
Talk:SkeletonHeaders
2014-09-22T07:15:51Z
<p>Silvia: discussion on "altitude"</p>
<hr />
<div>"Altitude" is a bit weird. We could either just use zIndex or maybe Priority?</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=14528
SkeletonHeaders
2014-03-17T07:07:48Z
<p>Silvia: /* Role */ adding a few more roles to match with the HTML spec</p>
<hr />
<div>This page describes the Message Headers in [[Ogg Skeleton 4]].<br />
<br />
== Adding New Message Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn't had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
The Language field will have the dominating language specified as the first language. It is possible to specify additional, non-dominating languages as a list after the main language.<br />
<br />
Example:<br />
Language: en-US, fr<br />
<br />
=== Role ===<br />
<br />
Role describes what semantic type of content is contained in a track. Every track can have only a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible, too, but should only be used/introduced if there is a real need for them.<br />
<br />
Text tracks:<br />
* "text/caption" - transcription of all sounds, including speech, for purposes of the hard-of-hearing<br />
* "text/subtitle" - translation of all speech, typically into a different language<br />
* "text/textaudiodesc" - description/transcription of everything that happens in a video as text to be used for the vision-impaired through screen readers or braille<br />
* "text/karaoke" - music lyrics delivered in chunks for singing along<br />
* "text/chapters" - titles for sections of the media that provide a kind of chapter segmentation (similar to DVD chapters)<br />
* "text/tickertext" - text to run as informative text at the bottom of the media display<br />
* "text/lyrics" - transcript of the text used in music media<br />
* "text/metadata" - name-value pairs that are associated with certain sections of the media<br />
* "text/annotation" - free text associated with certain sections of the media<br />
* "text/linguistic" - linguistic markup of the spoken words<br />
<br />
Video tracks:<br />
* "video/main" - the main video track<br />
* "video/alternate" - an alternative video track, e.g. different camera angle<br />
* "video/sign" - a sign language video track<br />
* "video/captioned" - the main video track with burnt-in captions<br />
* "video/subtitled" - the main video track with burnt-in subtitles<br />
<br />
Audio tracks:<br />
* "audio/main" - the main audio track<br />
* "audio/alternate" - an alternative audio track, probably linked to an alternate video track<br />
* "audio/dub" - the audio track but with speech in a different language to the original<br />
* "audio/audiodesc" - an audio description recording for the vision-impaired<br />
* "audio/described" - the main audio track mixed with audio descriptions<br />
* "audio/music" - a music track, e.g. when music, speech and sound effects are delivered in different tracks<br />
* "audio/speech" - a speech track, e.g. when music, speech and sound effects are delivered in different tracks<br />
* "audio/sfx" - a sound effects track, e.g. when music, speech and sound effects are delivered in different tracks<br />
* "audio/commentary" - commentary on the main audio or video track<br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters that describe the roles further, such as "video/alternate;angle=nw".<br />
<br />
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
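The character rules above can be sketched as a regular expression. This is an illustrative aid only, not part of the Skeleton specification; the function name is our own:<br />

```python
import re

# Character classes transcribed from the rules above (roughly the XML
# "Name" production without ":"). Illustrative sketch only.
NAME_START = (
    "A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    "\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    "\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
NAME_REST = NAME_START + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"
TRACK_NAME_RE = re.compile("[%s][%s]*$" % (NAME_START, NAME_REST))

def is_valid_track_name(name):
    """Return True if `name` satisfies the character rules above."""
    return bool(TRACK_NAME_RE.match(name))
```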
<br />
The name needs to be unique among all track names; otherwise it is undefined which track is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
<br />
=== Title ===<br />
<br />
A free text field to provide a description of the track content.<br />
<br />
Example:<br />
Title: "the French audio track for the movie"<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not informed about how a content author intends a media file to be displayed have no chance of displaying the content "correctly". This is why the Display-hint message header field allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Currently proposed hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display relative to the zero coordinates of the display area of the video, with x,y giving the origin of the top left corner of the PIP video and w,h its width and height in pixels (both optional). x, y, w, and h can also be specified as percentages, allowing persistent placement independent of the scaling of the video display.<br />
<br />
Examples:<br />
Display-hint: pip(20%,20%)<br />
Display-hint: pip(40,40,690,60)<br />
<br />
* mask(img,x,y,w,h) on a video track - use the image at the img url (?) as a video mask, allowing the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width w and height h. x, y, w, and h can be given in pixels or in percent. Pixels under the white background are made transparent and only pixels under the black shape are retained.<br />
<br />
Examples:<br />
Display-hint: mask(http://www.example.com/image.png)<br />
Display-hint: mask(http://www.example.com/image.png,30%,25%)<br />
Display-hint: mask(http://www.example.com/image.png,20,20,400,320)<br />
<br />
* transparent(transparency) on a video track - apply a transparency of the given percentage (integer value between 0 and 100) to the complete video track as it is rendered on top of other content. The transparency is applied to all pixels in the same way.<br />
<br />
Examples:<br />
Display-hint: transparent(25%)<br />
Display-hint: transparent(7%)<br />
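A player reading such hints might first split them into a kind and an argument list before interpreting them. The following sketch is illustrative only; the function name and return shape are our own, not part of this proposal:<br />

```python
import re

# Matches the three proposed hint kinds: pip(...), mask(...), transparent(...)
HINT_RE = re.compile(r"^(pip|mask|transparent)\((.*)\)$")

def parse_display_hint(value):
    """Split a Display-hint value into (kind, [args])."""
    m = HINT_RE.match(value.strip())
    if not m:
        raise ValueError("unrecognised Display-hint: %r" % value)
    kind, raw_args = m.groups()
    args = [a.strip() for a in raw_args.split(",")] if raw_args else []
    return kind, args
```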
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their bos pages appear in the Ogg stream. If a file is re-encoded, that order may change, so this addressing can only be relied upon as long as the file does not change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
This track order is simply to have a means to address tracks through an index in a consistent manner across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on what should be displayed on top of which other track.<br />
<br />
=== Altitude ===<br />
<br />
The Altitude (better name?) message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a "main" track is always displayed bottom-most unless otherwise defined. <br />
<br />
The Altitude field takes the same numerical values as the z-index in CSS, unlimited negative and positive numbers.<br />
An element with greater stack order is always in front of an element with a lower stack order.<br />
<br />
Example: Altitude: -150<br />
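Deriving the rendering order then amounts to a sort on the Altitude values. The sketch below is illustrative only; the track representation is our own, and defaulting a missing Altitude to 0 (analogous to CSS z-index: auto) is an assumption, not part of this proposal:<br />

```python
def stacking_order(tracks):
    """Sort tracks bottom-most first; a greater Altitude renders on top.
    Tracks without an Altitude default to 0 (illustrative choice)."""
    return sorted(tracks, key=lambda t: t.get("altitude", 0))

tracks = [
    {"name": "captions", "altitude": 10},
    {"name": "logo", "altitude": -150},
    {"name": "main_video"},  # no Altitude: treated as 0
]
order = [t["name"] for t in stacking_order(tracks)]
```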
<br />
=== Track dependencies ===<br />
<br />
It is tempting to introduce dependencies between tracks - to specify things such as:<br />
<br />
* track b depends on track a being available (e.g. main audio depending on main video), so always display them together and if you remove a track, remove all depending tracks, too<br />
<br />
* track c and d are alternative tracks to track b (e.g. dubs in other languages for main audio), so don't display them together and if you activate one, disable the others<br />
<br />
* track a, one of b,c,d, and one of e,f,g (where e depends on b, f depends on c, and g depends on d) make up a presentation profile and should be displayed together (e.g. main video, one of the audio dubs, and their respective captions).<br />
<br />
It is not clear yet whether there is an actual need to maintain this information as author-provided hints or whether a media player can itself determine a lot from the other fields, such as role and language.<br />
<br />
MPEG has a "groupID" element which allows for tracks to be put into groups of alternative tracks. This feature is, however, not used very often and decisions are being left to the media player.<br />
<br />
At this stage, it's probably too early to make a specification for how to encode this in Ogg. The need has not been totally clarified yet.</div>
Silvia
https://wiki.xiph.org/index.php?title=Metadata&diff=14219
Metadata
2013-07-24T05:32:37Z
<p>Silvia: updated info on M3F & added better info on CMML and XML formats</p>
<hr />
<div>This page aims to give an overview of the current state of metadata in Ogg and the ongoing projects towards improving it. The different components work in concert; for example [[Ogg Skeleton]] provides important infrastructure for [[CMML]], [[VorbisComment]] is simple to use and program, while the draft [[M3F|Multimedia Metadata Format (M3F)]] provides more sophisticated information.<br />
<br />
== [[VorbisComment]]s ==<br />
<br />
All the Xiph.org codecs have some internal mechanism for including metadata about the current stream.<br />
Generally, this is one of the codec headers, and in the words of the [http://www.xiph.org/vorbis/doc/v-comment.html vorbis spec], <br />
"It is meant for short, text comments ... much like someone jotting a quick note on the bottom of a CDR." A single VorbisComment can store up to 2^64 bytes (16 exabytes).<br />
<br />
VorbisComments store metadata describing the stream in key=value pairs, such as "ARTIST=Elvis", "TITLE=Blue Suede Shoes". Multiple copies of any given key are allowed (for example you can specify ARTIST several times for multiple performers). The specification has several suggested keys: TITLE, VERSION, ALBUM, TRACKNUMBER, ARTIST, PERFORMER, COPYRIGHT, LICENSE, ORGANIZATION, DESCRIPTION, GENRE, DATE, LOCATION, CONTACT, ISRC. See the [http://www.xiph.org/vorbis/doc/v-comment.html specification] for the intent of each one.<br />
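The on-the-wire layout of such a comment block is simple: a little-endian 32-bit length followed by UTF-8 bytes, for the vendor string and then for each "KEY=value" comment. A minimal encoder sketch in Python (illustrative only; codec-specific framing, such as Vorbis's trailing framing bit, is omitted):<br />

```python
import struct

def encode_vorbis_comment(vendor, comments):
    """Serialise a VorbisComment block: all lengths are little-endian
    32-bit integers, all strings UTF-8, one 'KEY=value' per comment."""
    out = bytearray()
    v = vendor.encode("utf-8")
    out += struct.pack("<I", len(v)) + v          # vendor length + vendor
    out += struct.pack("<I", len(comments))       # number of comments
    for key, value in comments:
        c = ("%s=%s" % (key, value)).encode("utf-8")
        out += struct.pack("<I", len(c)) + c      # comment length + comment
    return bytes(out)
```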
<br />
The [[VorbisComment]] page contains improvements to the suggested comment set.<br />
<br />
== [[FLAC]] metadata blocks ==<br />
<br />
Metadata is included in the FLAC codec as METADATA_BLOCK_DATA. Seven types of metadata block are defined: <br />
#''METADATA_BLOCK_STREAMINFO'': Sample rate, number of channels, etc.<br />
#''METADATA_BLOCK_PADDING'': Nul padding.<br />
#''METADATA_BLOCK_APPLICATION'': Third-party applications can register an ID. Metadata is typically 32-bit integers, but any datatypes can be specified.<br />
#''METADATA_BLOCK_SEEKTABLE'': For one or more seek points.<br />
#''METADATA_BLOCK_VORBIS_COMMENT'': Also known as FLAC tags, the contents of a VorbisComment packet. Note that the 32-bit field lengths are little-endian coded according to the Vorbis spec, as opposed to the usual big-endian coding of fixed-length integers in the rest of FLAC. FLAC metadata blocks are limited to 2^24 bytes (16 megabytes) and a VorbisComment packet in FLAC must fit within that limit.<br />
#''METADATA_BLOCK_CUESHEET'': Typically, but not necessarily, for CD-DA (Red Book) cuesheets.<br />
#''METADATA_BLOCK_PICTURE'': For binary picture data.<br />
<br />
== [[Ogg Skeleton]] ==<br />
<br />
[[Ogg Skeleton]] provides metadata useful for handling Ogg streams. This includes information like mime-types and mapping for granulepos which allows seeking streams without the need for the demuxer to understand them. The latest version, [[Ogg Skeleton 4]], also provides a keyframe index to enable faster seeking over high latency networks.<br />
<br />
Ogg Skeleton allows for attachment of message header fields, given as name-value pairs, that contain some sort of protocol messages about the logical bitstream. This is intended for decode-related information, such as the screen size for a video bitstream or the number of channels for an audio bitstream.<br />
<br />
== [[OggKate]] ==<br />
<br />
[[OggKate]] was originally designed for karaoke and text. The stream can carry text and images, and these can be animated.<br />
<br />
<br />
== [[CMML]] (deprecated) ==<br />
<br />
CMML is not used anymore; use [[OggKate]] instead. The [[CMML|Continuous Media Markup Language]] allowed time-based marking up of media streams, at its simplest this allowed you to divide media files into clips and provide information about each clip.<br />
<br />
<br />
== [[M3F]] (unused draft) ==<br />
<br />
M3F is not developed any more; use [[VorbisComment]] instead.<br />
<br />
The format was intended to replace VorbisComments for the use of ''structured'' metadata, allowing VorbisComments to revert to its originally intended use of "short, text comments ... much like someone jotting a quick note on the bottom of a CDR."<br />
<br />
'''[[M3F|Multimedia Metadata Format]]''' for the Ogg container is a draft specification which aims to provide metadata for media streams. The exact aims of this project are still under development, but they include being able to describe artist relationships to a piece more accurately as well as providing the structure to encourage more reliable metadata.<br />
<br />
<br />
== [[XMLEmbedding]] (unused draft) ==<br />
<br />
To implement XML metadata in Ogg (as for [[M3F]]), a mapping to Ogg streams is needed. The use of XML metadata will also open the way for the inclusion of technologies such as:<br />
* RDF + dublin core<br />
* [http://www.adobe.com/products/xmp/ XMP]<br />
* [http://wiki.musicbrainz.org/MusicBrainzXMLMetaData MusicBrainz]<br />
* [http://www.w3.org/Graphics/SVG/ SVG]<br />
At the moment, this specification is still not past the discussion stage.<br />
<br />
<br />
== Aims of advanced metadata ==<br />
<br />
VorbisComments work well enough for most things, and can be overloaded/abused (depending on your point of view) for most other things. But there are three major requirements that point to the design of an external metadata format; one that can be interleaved with the other streams in a container.<br />
<br />
* '''Machinability:''' There are a number of items of metadata that a player will want to parse and take action on. While there are usually 'convention' schemes for doing this with the embedded comment headers, this is much easier if there is a separate metadata stream designed for such use, instead of having to do best-effort parsing of natural language comments. For example, a video file with multiple audio tracks can specify the language of each one; a player that can parse these reliably can match them against a language preference list configured by the user to automatically select and begin playback of the best option.<br />
<br />
* '''Kitchen Sink:''' There are a minority of people who care passionately about having every detail about a track available. In the sense of conserving such information, and providing an equivalent to liner notes for online distribution, this is a goal worth supporting. However, the simple unstructured key-value pairs offered by the inline metadata are unwieldy for this level of detail. How do you tell the 2nd unit Assistant Director from the USA unit Assistant Director? How do you indicate which artist played tenor sax in the solo?<br />
<br />
* '''Addressability:''' The internal comment metadata headers are by necessity attached to a single content stream. This is useful for some applications, but a limitation in others. In a multiplexed stream, which set of comments refers to the collection as a whole? (By convention, in Ogg, it's the first occurring logical bitstream, but we can do better.) A separate metadata stream type must address this issue of collective metadata while still allowing description of individual streams. It should also allow temporal addressability, so that changes can be described. Because the in-stream comment metadata are part of the codec headers, they cannot change over the course of the stream, and allowing additional comment packets elsewhere in the stream presents seeking challenges. In the Ogg container this can be resolved by inserting a chain boundary, but this is a poor option for very-low-bitrate streams and unreliable transports such as RTP.</div>
Silvia
https://wiki.xiph.org/index.php?title=Chapter_Extension&diff=13721
Chapter Extension
2012-10-12T02:51:44Z
<p>Silvia: make chapter spec simpler</p>
<hr />
<div>Chapters are a means of providing direct and semantically relevant navigation points for a media file. One particular use case is so-called [http://en.wikipedia.org/wiki/Enhanced_podcast "enhanced podcasts"], i.e. audio files with additional chapter markers.<br />
<br />
Chapters are typically provided for a recorded file, not for a live resource.<br />
<br />
Since chapters are used for navigation - in particular to avoid having to listen to large amounts of undesired content in order to get to desired content - it is important that the chapter information is available at the start of a media file, so that it can be displayed without having to decode the media file. Therefore, we regard chapters as header-style metadata.<br />
<br />
Header-style metadata has traditionally been transported in [[VorbisComment]] headers inside Ogg.<br />
<br />
== Format ==<br />
<br />
We therefore propose an extension to VorbisComment for transporting chapters. We introduce VorbisComment fields called CHAPTERxxx and CHAPTERxxxNAME, with xxx being a number between 000 and 999. (1000 chapters are assumed to be sufficient.)<br />
<br />
The value for the CHAPTERxxx field name is the start time of the chapter (Hour:Min:DecimalSeconds). See the [[#Examples|examples]] below.<br />
<br />
The value for the CHAPTERxxxNAME field name is just a text string (8 bit clean UTF-8 encoded values, as is required for VorbisComments).<br />
<br />
This basically supports the same input format that [http://savvyadmin.com/adding-chapters-to-videos-using-mkv-containers/ Matroska chapters] support, too.<br />
<br />
== Examples ==<br />
<br />
An example chapter file with two sequential chapters: <br />
<br />
CHAPTER001=00:00:00.000<br />
CHAPTER001NAME=Chapter 1<br />
CHAPTER002=00:05:00.000<br />
CHAPTER002NAME=Chapter 2<br />
<br />
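Reading these fields back out of a decoded comment list can be sketched as follows. This is an illustrative aid only; the function name and return shape are our own, not part of the proposal:<br />

```python
import re

# CHAPTERxxx holds the start time, CHAPTERxxxNAME the chapter title.
CHAPTER_RE = re.compile(r"^CHAPTER(\d{3})(NAME)?$")

def extract_chapters(comments):
    """Return a list of (start_time, name) sorted by chapter number."""
    chapters = {}
    for key, value in comments:
        m = CHAPTER_RE.match(key.upper())
        if not m:
            continue
        num, is_name = m.groups()
        entry = chapters.setdefault(int(num), {"time": None, "name": None})
        entry["name" if is_name else "time"] = value
    return [(chapters[n]["time"], chapters[n]["name"])
            for n in sorted(chapters)]
```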
<br />
== Further Fields ==<br />
<br />
Other fields can also be used to support enhanced podcasts:<br />
<br />
* a field to extend chapters with a url to navigate to while listening to a chapter of a podcast:<br />
<br />
CHAPTERxxxURL=http&#058;//...<!-- &#058; for colon used to suppress automatic link --><br />
<br />
== WebVTT ==<br />
<br />
We expect people may also want support for hierarchically structured chapters with subchapters etc. This specification does not support this need. We recommend making use of WebVTT as a chapter format specification to satisfy this need. At a future time we expect a mapping of WebVTT into Ogg to allow for embedding of such files into Ogg, too.<br />
<br />
WebVTT is a more modern means of specifying chapters [http://dev.w3.org/html5/webvtt/#webvtt-file-using-chapter-title-text that use cues with chapter titles] and can deal with time overlapping cues.</div>
Silvia
https://wiki.xiph.org/index.php?title=Chapter_Extension&diff=13315
Chapter Extension
2012-04-11T12:32:53Z
<p>Silvia: added URLs to chapters</p>
<hr />
<div>Chapters are a means of providing direct and semantically relevant navigation points for a media file. One particular use case is so-called [http://en.wikipedia.org/wiki/Enhanced_podcast "enhanced podcasts"], i.e. audio files with additional chapter markers.<br />
<br />
Chapters are typically provided for a recorded file, not for a live resource.<br />
<br />
Since chapters are used for navigation - in particular to avoid having to listen to large amounts of undesired content in order to get to desired content - it is important that the chapter information is available at the start of a media file, so that it can be displayed without having to decode the media file. Therefore, we regard chapters as header-style metadata.<br />
<br />
Header-style metadata has traditionally been transported in [[VorbisComment]] headers inside Ogg.<br />
<br />
== Format ==<br />
<br />
We therefore propose an extension to VorbisComment for transporting chapters. We introduce VorbisComment fields called CHAPTERxxx and CHAPTERxxxNAME, with xxx being a number between 000 and 999. (1000 chapters are assumed to be sufficient.)<br />
<br />
The value for the CHAPTERxxx field name is the start time of the chapter (Hour:Min:DecimalSeconds), "-->" (U+002D, U+002D, U+003E), and the end time (Hour:Min:DecimalSeconds). See the [[#Examples|examples]] below.<br />
<br />
The value for the CHAPTERxxxNAME field name is just a text string (8 bit clean UTF-8 encoded values, as is required for VorbisComments).<br />
<br />
This basically supports the same input format that [http://savvyadmin.com/adding-chapters-to-videos-using-mkv-containers/ Matroska chapters] support, too.<br />
<br />
== Examples ==<br />
<br />
An example chapter file with two sequential chapters: <br />
<br />
CHAPTER001=00:00:00.000-->00:05:00.000<br />
CHAPTER001NAME=Chapter 1<br />
CHAPTER002=00:05:00.000-->00:10:00.000<br />
CHAPTER002NAME=Chapter 2<br />
<br />
Another example chapter file with a chapter and three hierarchically structured subchapters: <br />
<br />
CHAPTER001=00:00:00.000-->00:06:00.000<br />
CHAPTER001NAME=Chapter 1<br />
CHAPTER002=00:00:00.000-->00:02:00.000<br />
CHAPTER002NAME=Chapter 1.1<br />
CHAPTER003=00:02:00.000-->00:04:00.000<br />
CHAPTER003NAME=Chapter 1.2<br />
CHAPTER004=00:04:00.000-->00:06:00.000<br />
CHAPTER004NAME=Chapter 1.3<br />
<br />
== Further Fields ==<br />
<br />
Other fields can also be used to support enhanced podcasts:<br />
<br />
* a field to extend chapters with a url to navigate to while listening to a chapter of a podcast:<br />
<br />
CHAPTERxxxURL=http://...<br />
<br />
<br />
== WebVTT ==<br />
<br />
A more modern means of specifying chapters is through [http://dev.w3.org/html5/webvtt/#webvtt-file-using-chapter-title-text WebVTT files that use cues with chapter titles].</div>
Silvia
https://wiki.xiph.org/index.php?title=VorbisComment&diff=13249
VorbisComment
2012-03-05T09:49:53Z
<p>Silvia: added chapter extension</p>
<hr />
<div>VorbisComment is a base-level [[Metadata]] format initially created for use with Ogg [[Vorbis]]. It has since been adopted in the specifications of <br />
[[Ogg]] encapsulations for other Xiph.Org codecs including [[Theora]], [[Speex]] and [[FLAC]].<br />
<br />
The use case for VorbisComment is given as:<br />
<blockquote><br />
... much like someone jotting a quick note on the bottom of a CDR. It should be a little information to remember the disc by and explain it to others; a short, to-the-point text note that need not only be a couple words, but isn't going to be more than a short paragraph.[http://xiph.org/vorbis/doc/v-comment.html]<br />
</blockquote><br />
<br />
VorbisComments are typically used to provide basic information like the title and copyright holder of a work.<br />
As such the scope is similar to that of ID3 tags used with MP3 files.<br />
VorbisComment is widely supported on [[VorbisHardware|portable Ogg Vorbis players]] as well as streaming, editing and playback software.<br />
<br />
Although the syntax of VorbisComment is well-specified, various conventions exist for the field names in use.<br />
The goal for this page is to codify best practices and collect proposals for standardization of VorbisComment field names.<br />
<br />
VorbisComments are typically encoded as the second packet in a codec stream. When VorbisComments are included in the first (i.e. Theora) stream of an Ogg Theora file, they are assumed to cover all streams in the multiplexed group. [http://lists.xiph.org/pipermail/vorbis-dev/2008-December/019676.html]<br />
<br />
VorbisComment is the simplest and most widely-supported mechanism for storing metadata with Xiph.Org codecs. For other existing and proposed mechanisms, see [[Metadata]].<br />
<br />
== Recommended field names ==<br />
<br />
The current [http://xiph.org/vorbis/doc/v-comment.html VorbisComment recommendation] contains a recommended set<br />
of field names for comments.<br />
<br />
== Proposed field names ==<br />
<br />
Some proposals for extra field names:<br />
<br />
* [http://age.hobba.nl/audio/mirroredpages/ogg-tagging.html Ogg Vorbis Comment Field Recommendations]<br />
* [http://gophernet.org/articles/vorbiscomment/ Proposals for extending Ogg Vorbis comments]<br />
* [[Field names]]<br />
* [[Chapter Extension]]<br />
<br />
Comments are intended to be free-form, but for the purposes of interoperability, it is helpful to define tag sets for particular applications, and provide some guidelines for machine parsing. Note that some field names have to be non-free-form to achieve machine parsing.<br />
<br />
=== Cover art ===<br />
<br />
==== METADATA_BLOCK_PICTURE ====<br />
The [http://flac.sourceforge.net/format.html#metadata_block_picture binary FLAC picture structure] is base64 encoded and placed within a VorbisComment with the tag name "METADATA_BLOCK_PICTURE". This is the preferred and recommended way of embedding cover art within VorbisComments. It has the following benefits:<br />
<br />
* Easy to use for developers since the identical (or similar) structure is also used by FLAC and MP3.<br />
* The cover art can either be linked or embedded within the stream.<br />
* Common picture file formats are supported (jpg and png).<br />
* A description may be included and the picture type (front cover, back cover...) and image mime type are provided.<br />
* Base64 encoded data is invariant under UTF-8 and a valid UTF-8 string, so obeys the rules for comment data.<br />
<br />
Implementations interpreting or writing picture blocks should note the following details: <br />
<br />
===== General encoding/decoding =====<br />
* Failure to decode a picture block should not prevent playback of the file (failure to deal with the particularly large packet required by the comment header is a separate problem with the player implementation).<br />
* Base64 encoding is used as in section 4 of [http://www.faqs.org/rfcs/rfc4648.html RFC4648]. We note that line feeds are not allowed and padding characters ('=') are required.<br />
* Applications adding picture blocks should inform users that some applications or hardware may not support them and should provide a method to remove the blocks (this is expected to be trivial for applications capable of adding them).<br />
<br />
===== Block handling =====<br />
* The unencoded format is that of the [http://flac.sourceforge.net/format.html#metadata_block_picture FLAC picture block]. The fields are stored in big endian order as in FLAC, picture data is stored according to the relevant standard.<br />
* Picture data should be stored in PNG or JPEG formats or linked separately. It is recommended that readers support both PNG and JPEG.<br />
* Allowed values for the MIME string are "image/", "image/png", "image/jpeg", "-->" (the link indicator) and "" (length 0). An empty MIME string indicates type "image/".<br />
* Fields present in the ID3V2.4.0 [http://www.id3.org/id3v2.4.0-frames#line-1085 Attached Picture Frame] (APIC Frame) take the same interpretation as in the ID3V2.4.0 format with the following exceptions (following the FLAC format):<br />
** The description field is UTF-8 (encoded without ID3V2's initial 'encoding byte')<br />
** String fields are not null terminated: their preceding length fields are used instead.<br />
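Putting these rules together, building a METADATA_BLOCK_PICTURE value can be sketched as follows. This is illustrative only, not a reference implementation; the function name is our own:<br />

```python
import base64
import struct

def make_picture_block(picture_type, mime, description, width, height,
                       depth, colors, data):
    """Build the base64 payload for a METADATA_BLOCK_PICTURE comment.
    All integer fields are big-endian 32-bit, per the FLAC picture
    block; the description is length-prefixed UTF-8 (no NUL terminator,
    no ID3v2 encoding byte)."""
    mime_b = mime.encode("ascii")
    desc_b = description.encode("utf-8")
    block = struct.pack(">I", picture_type)
    block += struct.pack(">I", len(mime_b)) + mime_b
    block += struct.pack(">I", len(desc_b)) + desc_b
    block += struct.pack(">4I", width, height, depth, colors)
    block += struct.pack(">I", len(data)) + data
    return base64.b64encode(block).decode("ascii")
```

The returned string would be stored as the value of a "METADATA_BLOCK_PICTURE" comment, e.g. with picture type 3 (front cover) and MIME type "image/png". Note that base64.b64encode emits no line feeds and includes padding, matching the RFC4648 requirement above.<br />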
<br />
===== Linked images =====<br />
Support for linked images is optional for applications handling picture blocks. When a linked picture is indicated the following rules are observed:<br />
* The picture data is a complete URL indicating the picture to be used, relative URLs are allowed (note relative URLs do not start with a protocol specifier and are retrieved with the same protocol as the file being processed).<br />
* Links are ISO-8859-1 encoded<br />
* Applications MAY retrieve linked images via the file:// protocol.<br />
* Applications MUST obtain user approval if they wish to retrieve images via remote protocols.<br />
* Link targets may become unavailable: applications supporting linked images SHOULD recover gracefully from this and MAY report the absence to the user.<br />
* The type of the linked file is not restricted to JPEG and JFIF and applications MAY support other formats<br />
* If the application does not support linked images, or the target is unavailable, not permitted, or of an unknown format, the picture block should be skipped.<br />
* Applications may make links available to users; this is of particular use when links are unsupported or of an unsupported type.<br />
<br />
===== Image dimension fields =====<br />
* The height, width, colour depth and 'number of colours' fields are for purely informational purposes. Applications MUST NOT use them for decoding purposes, but MAY display them to the user and MAY use them to make a decision whether to skip the block (for example if selecting the most appropriate among multiple blocks).<br />
* Applications writing picture blocks MUST set these fields correctly OR set them all to zero.<br />
<br />
===== Multiple blocks =====<br />
* Multiple image blocks MAY be included as separate METADATA_BLOCK_PICTURE comments.<br />
* There may only be one each of picture type (APIC type) 1 and 2 in a Vorbis stream.<br />
* Block order is significant for some types and applications should preserve the comment order when reading or writing VorbisComment headers. The block order may be used to determine the order pictures are presented to the user.<br />
<br />
===== Playback tests =====<br />
Embedding a picture into the file might break playback on existing players (especially hardware players; software players can be updated easily). A workaround is to link the picture within the tag. Furthermore, users should be informed in some way that embedding a picture COULD cause problems (as stated above).<br />
<br />
In order to test if there are playback problems, there are test files available [http://www.audioranger.com/coverart_mk.ogg here] and [http://www.audioranger.com/coverart_im.ogg here]. You're invited to download one of these test files (or both), test playback on your software and hardware players, and report the results here on the wiki.<br />
<br />
'''Tested software players'''<br />
* Audacious 1.5.1: no problem<br />
* foobar2000: no problems<br />
* Gnome: built-in preview playback: no problem<br />
* MediaMonkey: no problems<br />
* Media Player Classic (unicode build) 6.4.9.1: no problem<br />
* RoarAudio: no problems (server and client side)<br />
* Rhythmbox 0.11.6: no problem<br />
* Totem 2.24.3: no problem<br />
* VLC 0.9.4/0.9.6: doesn't play<br />
** Patch sent to VLC to fix this - should get into 1.0.0<br />
* WinAmp: no problems<br />
* Windows Media Player 11: no problem<br />
* XMPlay 3.4.2: no problem<br />
* Nero ShowTime: no problem<br />
* Songbird 1.8.0: no problem, able to show and edit embedded pictures<br />
<br />
'''Tested hardware players'''<br />
* Logitech Squeezebox: Supported as of January 2009 (server version 7.3.3)<br />
* Sandisk Sansa Fuze (Firmware 01.01.22): Hangs up when trying to playback the demo file - had to reset the player<br />
** Note: The "Fuze" can play ogg vorbis files which have embedded pictures from "Easytag"<br />
* Cowon iAudio U3 (Firmware 1.29, 4 GB): works<br />
* Cowon D2: no problem (latest Firmware: 2.59, 8GB Version)<br />
* iRiver E100: no problem (latest Firmware: 1.16 G_U, 8GB Version)<br />
* Samsung YP-R1: no problem (latest Firmware: 3.07, 16GB Version)<br />
<br />
'''Tested tag editors'''<br />
* Easytag 2.1.6: can open the file to edit the normal tag fields<br />
* MP3Tag 2.42e: can open the file to edit the normal tag fields<br />
* MP3Tag 2.47b: is able to show and edit embedded pictures<br />
<br />
'''Tested other software'''<br />
* Total Recorder: capable of working with artwork according to the specification.<br />
<br />
==== Unofficial COVERART field (deprecated) ====<br />
There also exists an unofficial, not well supported comment field named "COVERART". It includes a base64-encoded string of the binary picture data (usually a JPEG file, but this could be a different file format too). The disadvantages are that<br />
* no additional information like a description about the cover art or its type (front cover, back cover etc.) is provided,<br />
* the cover art can't be linked<br />
* the base64 string is displayed as plain text in many tag editors because they lack support for this "COVERART" field,<br />
* it may break playback on hardware players because of the large VorbisComment header.<br />
The unofficial "COVERART" field is supported, for example, by AudioShell (http://www.softpointer.com/AudioShell.htm) - read/write - and Total Recorder (http://www.totalrecorder.com/) - read only.<br />
<br />
===== Conversion to METADATA_BLOCK_PICTURE =====<br />
Old "COVERART" tags should be converted to the new METADATA_BLOCK_PICTURE tag (see above for its specification). This conversion is straightforward and is suggested to be done the following way:<br />
<br />
* Decode the COVERART tag. A program MAY check the signature of the embedded picture in order to determine whether it is an allowed type. Lossless conversion from disallowed types to allowed types MAY be carried out.<br />
* Fill out the FLAC block with the binary picture data. If the MIME type of the picture is unknown or can't be determined, the MIME type "image/" MAY be used instead. Supplying image dimensions, color depth etc. is optional (see specification above).<br />
* In the absence of other information the picture type 'Other' should be used. Applications may want to allow users to select a default type or specify the type to use.<br />
* Encode the new picture block, remove the COVERART tag from the comments and add the METADATA_BLOCK_PICTURE entry.<br />
* If multiple tags are being converted the order of the METADATA_BLOCK_PICTURE tags should be the same as that of the COVERART tags they are replacing.<br />
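The steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name is hypothetical, image dimensions are left as zero (unknown), and the picture type defaults to 0 ('Other'), as suggested above.<br />

```python
import base64
import struct

def coverart_to_metadata_block_picture(coverart_b64: str) -> str:
    """Convert a legacy COVERART value into a METADATA_BLOCK_PICTURE value."""
    data = base64.b64decode(coverart_b64)

    # Check the picture signature to determine the MIME type; if it
    # cannot be determined, fall back to the generic "image/".
    if data.startswith(b"\xff\xd8\xff"):
        mime = b"image/jpeg"
    elif data.startswith(b"\x89PNG\r\n\x1a\n"):
        mime = b"image/png"
    else:
        mime = b"image/"

    desc = b""  # no description is recoverable from a COVERART tag
    block = struct.pack(">I", 0)                  # picture type: 0 = 'Other'
    block += struct.pack(">I", len(mime)) + mime  # MIME type string
    block += struct.pack(">I", len(desc)) + desc  # UTF-8 description
    block += struct.pack(">4I", 0, 0, 0, 0)       # width, height, depth, colours: unknown
    block += struct.pack(">I", len(data)) + data  # binary picture data
    return base64.b64encode(block).decode("ascii")
```

The resulting string replaces the removed COVERART entry as the value of a METADATA_BLOCK_PICTURE comment.<br />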
<br />
=== Date and time ===<br />
<br />
The goal is to specify '''one''' standard format for describing date and/or time.<br />
<br />
==== ISO proposal ====<br />
The date format for any field describing a date must follow the ISO scheme: YYYY-MM-DD, which may be shortened to just YYYY-MM or simply YYYY.<br />
<br />
We have been recommending this usage with the DATE tag for some time. It is proposed that the spec be amended to include this information for machine readability.<br />
<br />
The time format for any field '''except''' track duration must be specified with a leading T and end with a time zone. Schemes with and without dates: <br />
<pre><br />
YYYY-MM-DDTHH:MM:SS+TZ<br />
</pre><br />
<pre><br />
THH:MM+TZ<br />
</pre><br />
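A sketch of how software might validate values against these schemes (the regular expressions and function name below are illustrative, not part of the proposal, and assume a +HH:MM time-zone offset):<br />

```python
import re

# YYYY, YYYY-MM or YYYY-MM-DD, optionally followed by THH:MM[:SS] and a time zone.
DATE_RE = re.compile(
    r"^\d{4}(-\d{2}(-\d{2})?)?"  # YYYY[-MM[-DD]]
    r"(T\d{2}:\d{2}(:\d{2})?"    # THH:MM[:SS] ...
    r"[+-]\d{2}:\d{2})?$"        # ... ending with a mandatory time zone
)

# Time-only form: leading T, HH:MM[:SS], plus a time zone.
TIME_RE = re.compile(r"^T\d{2}:\d{2}(:\d{2})?[+-]\d{2}:\d{2}$")

def is_valid_date_value(value: str) -> bool:
    """Check a DATE-style value against the proposed ISO schemes."""
    return bool(DATE_RE.match(value) or TIME_RE.match(value))
```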
<br />
=== ENCODER ===<br />
<br />
The goal is to attribute the encoder software. This value can be used in the future to determine which files can be improved by being re-encoded with a newer version.<br />
<br />
:'''Comment''': What is lacking from the vendor string present in the spec from the start? All libvorbis and encoder tunings I'm aware of have recorded the encoder version here.<br />
<br />
Rationale for not using the vendor string:<br />
* The vendor string is usually used to store the name and version of the underlying codec library<br />
* The idea of ENCODER is to store the name of the user-visible application, for example <tt>ffmpeg2theora</tt>.<br />
* It can be useful for debugging to store the name and version of the calling application.<br />
* The libvorbis API does not let applications override the vendor string.<br />
<br />
==== Proposal: Inclusion of URL in ENCODER value ====<br />
The ENCODER field value must be a unique URL providing both the encoder software's name and version. If no unique URL providing both name and version is available, the version number can be appended, separated by a space character. For example:<br />
<br />
<nowiki>ENCODER=http://flac.sourceforge.net/ 1.2.1</nowiki><br />
<br />
* Note that ffmpeg2theora uses ENCODER, but does not include a URL. ''Added by Rillian on September 17, 2007''<br />
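Under this proposal, a consumer could split the value at the first space (a hedged sketch; the helper name is made up, and plain application names such as ffmpeg2theora's come back as the first component with no version):<br />

```python
def parse_encoder(value: str):
    """Split an ENCODER value of the form '<url-or-name> [version]'."""
    name, _, version = value.partition(" ")
    return name, version or None
```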
<br />
==== Proposal: ENCODED_BY ====<br />
<br />
I've also seen ENCODED_BY. ''Added by Rillian on September 17, 2007''<br />
: ENCODED_BY is usually the person who did the encoding. This should not be part of the recommendation due to legal problems around deliberate and accidental distribution to third parties. Basically the name of the encoder should not be included to protect encoders from their own egos and possible legal prosecution. ''Added by Aleksandersen on September 20, 2007''<br />
<br />
=== Improving license data ===<br />
<br />
The goal is to provide a method for proclaiming license and copyright information (basically clarifying ‘distribution rights (if any) and ownership’).<br />
<br />
The [http://xiph.org/vorbis/doc/v-comment.html specification document] describes LICENSE and COPYRIGHT fields, but it is not clear enough about whether these should be machine-readable.<br />
<br />
We should consider working together with Creative Commons to have complementary and interlinked information on the Creative Commons and Xiph wikis. Refer to the [http://wiki.creativecommons.org/Ogg Ogg page] in the Creative Commons wiki.<br />
<br />
==== New RIGHTS field name proposal ====<br />
One proposal is to replace the COPYRIGHT and LICENSE field names with RIGHTS. RIGHTS must be a human-readable copyright statement. Basic example:<br />
<br />
<nowiki>RIGHTS=Copyright © Recording Company Inc. All distribution rights reserved.</nowiki><br />
<br />
But this is not machine-readable. Adding two complementary field names should do the trick: RIGHTS-DATE, describing the date of copyright; and RIGHTS-URI, providing a method for linking to a license. Software agents can assume that multiple songs use the same URIs, as in the case of Creative Commons. Full example:<br />
<br />
<nowiki>RIGHTS=Copyright © 2019 Recording Company Inc. All distribution rights reserved.</nowiki><br /><br />
<nowiki>RIGHTS-DATE=2019-04</nowiki><br /><br />
<nowiki>RIGHTS-URI=http://somewhere.com/license.xhtml</nowiki><br />
<br />
Software for multimedia management and playback is encouraged to display the RIGHTS statement as a phrase linked via RIGHTS-URI.<br />
<br />
RIGHTS-DATE does not need to be displayed, as the date is already required in the human-readable statement by international copyright agreements. RIGHTS-DATE can be used to determine when a copyrighted work falls into the public domain, and for related matters. (''The Beatles''' copyright on their original studio recordings (not the remixes) is soon expiring. So mechanisms such as RIGHTS-DATE are indeed required in music management and file-sharing software!)<br />
<br />
To remain machine-readable it would be required to have at most one instance of each RIGHTS field name. All fields would of course remain optional.<br />
<br />
The ''Dublin Core Metadata Initiative'' recommends the use of ‘rights’ to describe license and copyright matters. The web feed format Atom 1.0 has implemented a rights element in their specification.<br />
<br />
:'''Comment''': The triplet RIGHTS, RIGHTS-DATE, RIGHTS-URI is an example of structured metadata. VorbisComments are inherently unstructured, and this should be respected. Structured metadata belongs in a different stream, such as XML (using [[Metadata#XMLEmbedding|XMLEmbedding]]).<br />
<br />
==== Improving existing fields proposal ====<br />
Similar to the DATE tag above, we have generally recommended that a URL uniquely identifying the license be included in the LICENSE field to allow machine identification of the license. This is in agreement with the proposal in the Creative Commons wiki. Since the COPYRIGHT field is a human-readable statement of the copyright, like the proposed RIGHTS tag above, some people include a license URL there. Therefore, if a URL can't be found in the LICENSE tag, applications should use one from the COPYRIGHT tag, if any. Contact information for verification, attribution, relicensing, etc. can be obtained from the COPYRIGHT field, but Creative Commons also recommends a separate CONTACT tag for this information. This is reasonable, so we propose it be included.<br />
<br />
=== Geo Location fields ===<br />
<br />
The existing LOCATION field is meant to carry a human readable location for the recording/creation of the media file. <br />
<br />
Having geographical coordinates according to [http://en.wikipedia.org/wiki/World_Geodetic_System WGS84] can be useful as well, especially in a form that can be machine parsed. The agreed format is similar to this [http://en.wikipedia.org/wiki/Geo_(microformat) geo microformat]:<br />
<br />
GEO_LOCATION= ''latitude'' ; ''longitude'' [; ''elevation'' ] <br />
<br />
where each value is a fixed point decimal number formatted in the C locale with a period (.) for the radix. Values are separated with a ';' and white space is not significant. The elevation is optional.<br />
<br />
''latitude'' is the geo latitude location of where the media has been recorded or produced in decimal degrees according to WGS84 (zero at the equator, negative values for southern latitudes) (C double).<br />
<br />
''longitude'' is the geo longitude location of where the media has been recorded or produced in decimal degrees according to WGS84 (zero at the prime meridian in Greenwich/UK, negative values for western longitudes). (C double).<br />
<br />
''elevation'' is the geo elevation of where the media has been recorded or produced in meters according to WGS84 (zero is average sea level) (C double).<br />
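The format above can be parsed in a few lines of code; this sketch (the function name is illustrative) applies the WGS84 range checks and tolerates the insignificant white space:<br />

```python
def parse_geo_location(value: str):
    """Parse a GEO_LOCATION value into (latitude, longitude, elevation).

    Elevation is None when the optional third component is absent.
    """
    parts = [p.strip() for p in value.split(";")]
    if len(parts) not in (2, 3):
        raise ValueError("expected 'latitude;longitude[;elevation]'")
    # float() always uses a period for the radix, matching the C-locale rule.
    lat, lon = float(parts[0]), float(parts[1])
    elev = float(parts[2]) if len(parts) == 3 else None
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude out of WGS84 range")
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude out of WGS84 range")
    return lat, lon, elev
```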
<br />
=== Replay Gain ===<br />
<br />
The REPLAYGAIN_* fields implement the Replay Gain proposal for storing a track relative volume adjustment, which can be used to "fix" quiet or loud sounding Vorbis or FLAC streams. The set of tags is intended to be machine parsed, and has the following form: <br />
<pre><br />
REPLAYGAIN_TRACK_GAIN=-7.03 dB<br />
REPLAYGAIN_TRACK_PEAK=1.21822226<br />
REPLAYGAIN_ALBUM_GAIN=-6.37 dB<br />
REPLAYGAIN_ALBUM_PEAK=1.21822226<br />
</pre><br />
<br />
See http://www.replaygain.org/ for detailed information about Replay Gain and how the different values are calculated.<br />
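As an illustration of how a player might apply these values (the arithmetic is standard Replay Gain practice, but the function below is only a sketch with a made-up name): a gain in dB maps to a linear scale factor of 10^(gain/20), and the peak value can be used to keep the scaled signal from clipping.<br />

```python
def replaygain_scale(gain_db: float, peak: float, prevent_clipping: bool = True) -> float:
    """Turn a REPLAYGAIN_*_GAIN value into a linear amplitude scale factor."""
    scale = 10.0 ** (gain_db / 20.0)
    if prevent_clipping and peak > 0.0:
        # Cap the factor so that peak * scale never exceeds full scale (1.0).
        scale = min(scale, 1.0 / peak)
    return scale
```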
<br />
=== Tantalos resource ID ===<br />
Tantalos is a protocol for automatically locating and accessing content within a network scope (normally a LAN, but it can also be bigger, such as a company network). It uses UUIDs to identify resources. There are two groups of these UUIDs: IDs generated from the metadata of the track, and IDs generated in some other way (for example random IDs; see the UUID specs). The latter group may need to be passed along with the track. The VCLT playlist format (see below) uses 'HASH={UUID}id...' in hex-dash format for this (HASH={UUID}e278173d-4d6d-4c66-95ec-4ec85eedc7d1).<br />
--[[User:Ph3-der-loewe|Ph3-der-loewe]] 02:05, 13 December 2011 (PST)<br />
<br />
== Other (non-proposed field names) ==<br />
=== VCLT playlist format ===<br />
The VCLT playlist format uses some ''keys'' which look like VorbisComments, but they are not, nor are they proposed to become such (except for HASH). This includes the keys STREAMURL, FILENAME, FILEURL, LENGTH, HASH (see above for this one), SIGNALINFO, AUDIOINFO and OFFSET. You can find more information about these at [https://bts.keep-cool.org/wiki/Specs/VCLT https://bts.keep-cool.org/wiki/Specs/VCLT].<br />
<br />
== Implementations ==<br />
<br />
* [http://sbooth.org/importers/ OggImporter] &ndash; imports Ogg Vorbis and Ogg FLAC files to Spotlight.<br />
* vorbiscomment &ndash; a commandline tool, part of [[VorbisTools]].<br />
* [http://www.xiph.org/oggz/ oggz-comment] &ndash; a commandline tool, part of the [[Oggz]] tools. It can add comments to multitrack and video files.</div>
Silvia
https://wiki.xiph.org/index.php?title=Chapter_Extension&diff=13248
Chapter Extension
2012-03-05T09:48:48Z
<p>Silvia: started specification</p>
<hr />
<div>Chapters are a means of providing direct and semantically relevant navigation points for a media file. One particular use case is so-called [http://en.wikipedia.org/wiki/Enhanced_podcast "enhanced podcasts"], i.e. audio files with additional chapter markers.<br />
<br />
Chapters are typically provided for a recorded file, not for a live resource.<br />
<br />
Since chapters are used for navigation - in particular, to avoid having to listen to large amounts of undesired content in order to get to desired content - it is important that the information about chapters is available at the start of a media file, so that it can be displayed without having to decode the media data. Therefore, we regard chapters as header-style metadata.<br />
<br />
Header-style metadata has traditionally been transported in VorbisComment headers inside Ogg.<br />
<br />
We therefore propose an extension to VorbisComment for transporting chapters. We introduce VorbisComment fields called CHAPTERxxx and CHAPTERxxxNAME, with xxx being a number between 000 and 999. We assume 1000 chapters to be sufficient.<br />
<br />
An example chapter file with two sequential chapters: <br />
<br />
CHAPTER001=00:00:00.000-00:05:00.000<br />
CHAPTER001NAME=Chapter 1<br />
CHAPTER002=00:05:00.000-00:10:00.000<br />
CHAPTER002NAME=Chapter 2<br />
<br />
<br />
Another example chapter file with a chapter and three hierarchically structured subchapters: <br />
<br />
CHAPTER001=00:00:00.000-00:06:00.000<br />
CHAPTER001NAME=Chapter 1<br />
CHAPTER002=00:00:00.000-00:02:00.000<br />
CHAPTER002NAME=Chapter 1.1<br />
CHAPTER003=00:02:00.000-00:04:00.000<br />
CHAPTER003NAME=Chapter 1.2<br />
CHAPTER004=00:04:00.000-00:06:00.000<br />
CHAPTER004NAME=Chapter 1.3<br />
<br />
<br />
This should basically support the same input format that [http://savvyadmin.com/adding-chapters-to-videos-using-mkv-containers/ Matroska chapters] support, too.<br />
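A sketch of how a player could collect these fields from a comment list (the names and the regular expression are illustrative; hierarchical nesting, as in the second example, would still need to be reconstructed from overlapping time ranges):<br />

```python
import re

CHAPTER_RE = re.compile(r"^CHAPTER(\d{3})(NAME)?=(.*)$")

def parse_chapters(comments):
    """Collect (start, end, name) tuples from CHAPTERxxx/CHAPTERxxxNAME fields."""
    times, names = {}, {}
    for comment in comments:
        m = CHAPTER_RE.match(comment)
        if not m:
            continue
        index, is_name, value = m.groups()
        if is_name:
            names[index] = value
        else:
            start, _, end = value.partition("-")
            times[index] = (start, end)
    # Order by the three-digit chapter number; name defaults to "".
    return [times[i] + (names.get(i, ""),) for i in sorted(times)]
```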
<br />
A more modern means of specifying chapters is through [http://dev.w3.org/html5/webvtt/#webvtt-file-using-chapter-title-text WebVTT files that use cues with chapter titles].</div>
Silvia
https://wiki.xiph.org/index.php?title=CMML&diff=12991
CMML
2011-08-26T01:18:01Z
<p>Silvia: make Kate a link</p>
<hr />
<div>= NOTE: CMML is deprecated; Xiph recommends you use [[Kate]] instead =<br />
<br />
<br />
'''CMML''' stands for <b>Continuous Media Markup Language</b> and is to audio or video what HTML is to text. CMML is essentially a timed text codec. It makes it possible to structure a time-continuously sampled data file by dividing it into temporal sections (so-called <i>clips</i>) and provides these clips with some additional information. This information is HTML-like and is essentially a textual representation of the audio or video file. CMML enables textual searches on these otherwise binary files.<br />
<br />
CMML is appropriate for use with all [[Ogg]] media formats, to provide subtitles and timed metadata. This description gives a quick introduction only and explains how to map CMML into Ogg. For full specifications, see [http://www.annodex.net/specifications.html http://www.annodex.net/specifications.html].<br />
<br />
<br />
== CMML specification ==<br />
<br />
Before describing the actual data that goes into a logical Ogg bitstream, we need to understand what the stand-alone "codec" packets contain.<br />
<br />
CMML basically consists of:<br />
<br />
* a <b>head</b> tag, which contains information for the complete audio/video file<br />
* a set of <b>clip</b> tags, each of which contains information on a temporal section of the file<br />
* for authoring purposes, CMML also allows a <b>stream</b> tag, which specifies the file it describes<br />
<br />
An example CMML file looks like this:<br />
<br />
<pre><br />
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><br />
<!DOCTYPE cmml SYSTEM "cmml.dtd"><br />
<br />
<cmml lang="en" id="simple" granulerate="1000/1"><br />
<br />
<stream id="fish" basetime="0"><br />
<import id="videosrc" lang="en" title="Video fish" <br />
granulerate="25/1" contenttype="video/ogg" <br />
src="fish.ogv" start="0" end="360"><br />
<param id="vheight" name="video.height" value="250"/><br />
<param id="vwidth" name="video.width" value="180"/><br />
</import><br />
</stream><br />
<br />
<head><br />
<title>Types of fish</title><br />
<meta name="Producer" content="Joe Ordinary"/><br />
<meta name="DC.Author" content="Joe's friend"/><br />
</head><br />
<br />
<clip id="intro" start="0"><br />
<a href="http://www.example.com/fish.html">Read more about fish</a><br />
<desc>This is the introduction to the film Joe made about fish.</desc><br />
</clip><br />
<br />
<clip id="dolphin" start="npt:3.5" end="npt:5:5.9"><br />
<img src="dolphin.jpg"/><br />
<desc>Here, Joe caught sight of a dolphin in the ocean.</desc><br />
<meta name="Subject" content="dolphin"/><br />
</clip><br />
<br />
<clip id="goldfish" start="npt:5:5.9"><br />
<a href="http://www.example.com/morefish.anx?id=goldfish">More video clips on goldfish.</a><br />
<img src="http://www.example.com/goldfish.jpg"/><br />
<desc>Joe has a fishtank at home with many colourful fish. The common goldfish is one of them and Joe's favourite.<br />
Here are some fabulous pictures he has taken of them.</desc><br />
<meta name="Location" content="Joe's fishtank"/><br />
<meta name="Subject" content="goldfish"/><br />
</clip><br />
<br />
</cmml><br />
</pre><br />
<br />
<br />
The head element is a standard head element as in HTML.<br />
<br />
Clips contain (amongst others) the following information:<br />
<br />
* a name in the <b>id</b> attribute so addressing of the clips is possible, as in http://www.example.com/morefish.anx?id=goldfish (Web server needs to [http://annodex.net/software/mod_annodex/ support] this)<br />
* a <b>start</b> and possibly an <b>end</b> attribute, to tell the clip where it is temporally located<br />
* a <b>title</b> attribute to give it a short description<br />
* <b>meta</b> elements to provide it with structured metadata as name-value pairs<br />
* an <b>img</b> element, which links to a picture that represents the content of the clip visually<br />
* an <b>a</b> element, which puts a hyperlink to another Web resource into the clip<br />
* a <b>desc</b> element giving a long, free-text description/annotation/transcription for the clip<br />
<br />
Most of this information is optional.<br />
<br />
== CMML mapping into Ogg ==<br />
<br />
When CMML is mapped into an Ogg logical bitstream, it needs to be serialised first. XML is a hierarchical file format, so it is not generally serialisable. However, CMML has been designed to be serialised easily.<br />
<br />
CMML is serialised by having some initial header packets that set up the CMML decoding environment and contain header-type information. The content packets of a CMML logical bitstream then consist of <b>clip</b> tags only. The <b>stream</b> tag is not copied into the CMML bitstream, as it controls the authoring only.<br />
<br />
All of the CMML bitstream information is text. As it gets encoded into a binary bitstream, an encoding format has to be specified. To simplify things, UTF-8 is defined as the mandatory encoding format for all data in a CMML binary bitstream. Also, the encoding process MUST ensure that newline characters are represented as LF (or "\n" in C) only and replace any new line representations that come as CR LF combinations (or "\r\n" in C) with LF only.<br />
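In code, the encoding rule amounts to something like this (a trivial sketch; the function name is not from any spec):<br />

```python
def encode_cmml_text(text: str) -> bytes:
    """Encode CMML text for the binary bitstream: CR LF becomes LF, output is UTF-8."""
    return text.replace("\r\n", "\n").encode("utf-8")
```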
<br />
The media mapping for CMML into Ogg is as follows:<br />
* The bos page contains a CMML ident packet.<br />
* The first secondary header packet of CMML contains the xml preamble.<br />
* The second secondary header packet contains the CMML "head" tag.<br />
* The content or data packets for CMML contain the CMML "clip" tags each encoded in their own packet and inserted at the accurate time.<br />
* The eos page contains a packet with an empty clip tag.<br />
<br />
<br />
=== The CMML ident header packet ===<br />
<br />
The CMML logical bitstream starts with an ident header which is encapsulated into the CMML bos page. The ident header contains all information required to identify the CMML bitstream and to set up a CMML decoder. It has the following format:<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Identifier 'CMML\0\0\0\0' | 0-3<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 4-7<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Version major | Version minor | 8-11<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granulerate numerator | 12-15<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 16-19<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granulerate denominator | 20-23<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 24-27<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granuleshift | 28<br />
+-+-+-+-+-+-+-+-+<br />
| ...<br />
<br />
The CMML <i>version</i> as described here is major=2 minor=1.<br />
<br />
The <i>granulerate</i> represents the temporal resolution of the logical bitstream in Hz given as a rational number in the same way as the [[OggSkeleton]] fisbone secondary header specifies granulerate. It enables a mapping of granule position of the data pages to time by calculating "granulepos / granulerate".<br />
<br />
The default granule rate for CMML is 1000/1, i.e. a granule resolution of one millisecond.<br />
<br />
The <i>granuleshift</i> is a 1-byte integer describing whether to partition the granule_position into two parts for the CMML logical bitstream, and how many of the lower bits to use for the partitioning. The upper bits then still signify a time-continuous granule position for a directly decodable and presentable data granule. The lower bits allow for specification of the granule position of a previous CMML data packet (i.e. "clip" element), which helps to identify how much backwards seeking is necessary to get to the last still-active "clip" element (of the given track). The granuleshift is therefore the log of the maximum possible clip spacing.<br />
<br />
The default granule shift used is 32, which halves the granule position to allow for the backwards pointer.<br />
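The partitioning described above can be expressed as simple bit arithmetic (a sketch under the stated defaults; the function names are illustrative, and the default rate of 1000/1 matches the granulerate in the example cmml tag):<br />

```python
def split_granulepos(granulepos: int, granuleshift: int = 32):
    """Split a CMML granule position into (current, previous) granules.

    The upper bits hold the granule of this clip; the lower granuleshift
    bits hold the granule of the previous clip (the backwards pointer).
    """
    mask = (1 << granuleshift) - 1
    return granulepos >> granuleshift, granulepos & mask

def granule_to_seconds(granule: int, rate_num: int = 1000, rate_den: int = 1) -> float:
    """Map a granule value to seconds via granule / granulerate."""
    return granule * rate_den / rate_num
```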
<br />
=== The CMML secondary header packets ===<br />
<br />
The CMML secondary headers are a sequence of two packets that contain the CMML and XML "setup" information:<br />
* one packet with the CMML xml preamble and <b>cmml</b> tag.<br />
* one packet with the CMML <b>head</b> tag.<br />
<br />
These packets contain textual, not binary information.<br />
<br />
The CMML preamble tags are all single-line tags, such as the xml processing instruction (<?xml...>) and the document type declaration (<!DOCTYPE...>).<br />
<br />
The only CMML tag that is not already serialized from a CMML file is the <b>cmml</b> tag, as it encloses all the other content tags. To serialise it, the <b>cmml</b> start tag is transformed into a processing instruction, retaining all its attributes (<?cmml ...>), and the <b>cmml</b> end tag is deleted.<br />
<br />
The first CMML secondary header packet has the following format:<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| <?xml ... | 0-<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| <!DOCTYPE ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| <?cmml ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
<br />
<br />
The second CMML secondary header packet contains the CMML <b>head</b> element with all its attributes and other containing elements and has the following format.<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| <head ... | 0-<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| </head> |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
<br />
<br />
=== The CMML data packets ===<br />
<br />
The data packets of the CMML bitstream contain the CMML <b>clip</b> elements. Their <b>start</b> and <b>end</b> attributes, however, only exist for authoring purposes and are not copied into the bitstream (to avoid contradictory information); they are instead represented through the time mapping of the encapsulation format, which interleaves CMML data with data from other time-continuous bitstreams. Generally the time mapping is done through some timestamp representation and through the position in the stream.<br />
<br />
A <b>clip</b> tag is encoded with all its attributes (except for <b>start</b> and <b>end</b>) as a string printed into a clip packet. The <b>clip</b> tag's <b>start</b> attribute tells the encapsulator at what time to insert the clip packet into the bitstream. If an <b>end</b> attribute is present, it leads to the creation of another clip packet, unless another clip packet starts on the same track beforehand. This clip packet contains an "empty" <b>clip</b> tag, i.e. a <b>clip</b> tag without <b>meta</b>, <b>a</b>, <b>img</b> or <b>desc</b> elements and no attribute values except for a copy of the <b>track</b> attribute from the original <b>clip</b> tag.<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| <clip ... | 0-<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| </clip> |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
<br />
<br />
== Development ==<br />
<br />
Ogg CMML is supported by the following projects:<br />
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]<br />
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]<br />
* libcmml: [http://svn.annodex.net/libcmml/ libcmml svn] or [http://annodex.net/software/libcmml/ libcmml]<br />
* libannodex: [http://svn.annodex.net/libannodex/ libannodex svn] or [http://annodex.net/software/libannodex/ libannodex]<br />
* the Annodex technology: [http://www.annodex.net/ annodex.net] including perl, python, php bindings, a firefox plugin, authoring software etc.<br />
<br />
<br />
== External links ==<br />
<br />
* CMML is described in more detail in the CMML v2.1 specification: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]<br />
<br />
[[Category:Ogg Mappings]]</div>
Silvia
https://wiki.xiph.org/index.php?title=CMML&diff=12990
CMML
2011-08-26T01:17:09Z
<p>Silvia: deprecated CMML</p>
<hr />
<div>= NOTE: CMML is deprecated; Xiph recommends you use Kate instead =<br />
<br />
<br />
'''CMML''' stands for <b>Continuous Media Markup Language</b> and is to audio or video what HTML is to text. CMML is essentially a timed text codec. It makes it possible to structure a time-continuously sampled data file by dividing it into temporal sections (so-called <i>clips</i>) and provides these clips with some additional information. This information is HTML-like and is essentially a textual representation of the audio or video file. CMML enables textual searches on these otherwise binary files.<br />
<br />
CMML is appropriate for use with all [[Ogg]] media formats, to provide subtitles and timed metadata. This description gives a quick introduction only and explains how to map CMML into Ogg. For full specifications, see [http://www.annodex.net/specifications.html http://www.annodex.net/specifications.html].<br />
<br />
<br />
== CMML specification ==<br />
<br />
Before describing the actual data that goes into a logical Ogg bitstream, we need to understand what the stand-alone "codec" packets contain.<br />
<br />
CMML basically consists of:<br />
<br />
* a <b>head</b> tag, which contains information for the complete audio/video file<br />
* a set of <b>clip</b> tags, each of which contains information on a temporal section of the file<br />
* for authoring purposes, CMML also allows a <b>stream</b> tag, which specifies the file it describes<br />
<br />
An example CMML file looks like this:<br />
<br />
<pre><br />
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><br />
<!DOCTYPE cmml SYSTEM "cmml.dtd"><br />
<br />
<cmml lang="en" id="simple" granulerate="1000/1"><br />
<br />
<stream id="fish" basetime="0"><br />
<import id="videosrc" lang="en" title="Video fish" <br />
granulerate="25/1" contenttype="video/ogg" <br />
src="fish.ogv" start="0" end="360"><br />
<param id="vheight" name="video.height" value="250"/><br />
<param id="vwidth" name="video.width" value="180"/><br />
</import><br />
</stream><br />
<br />
<head><br />
<title>Types of fish</title><br />
<meta name="Producer" content="Joe Ordinary"/><br />
<meta name="DC.Author" content="Joe's friend"/><br />
</head><br />
<br />
<clip id="intro" start="0"><br />
<a href="http://www.example.com/fish.html">Read more about fish</a><br />
<desc>This is the introduction to the film Joe made about fish.</desc><br />
</clip><br />
<br />
<clip id="dolphin" start="npt:3.5" end="npt:5:5.9"><br />
<img src="dolphin.jpg"/><br />
<desc>Here, Joe caught sight of a dolphin in the ocean.</desc><br />
<meta name="Subject" content="dolphin"/><br />
</clip><br />
<br />
<clip id="goldfish" start="npt:5:5.9"><br />
<a href="http://www.example.com/morefish.anx?id=goldfish">More video clips on goldfish.</a><br />
<img src="http://www.example.com/goldfish.jpg"/><br />
<desc>Joe has a fishtank at home with many colourful fish. The common goldfish is one of them and Joe's favourite.<br />
Here are some fabulous pictures he has taken of them.</desc><br />
<meta name="Location" content="Joe's fishtank"/><br />
<meta name="Subject" content="goldfish"/><br />
</clip><br />
<br />
</cmml><br />
</pre><br />
<br />
<br />
The head element is a standard head element as in HTML.<br />
<br />
Clips contain (amongst others) the following information:<br />
<br />
* a name in the <b>id</b> attribute so addressing of the clips is possible, as in http://www.example.com/morefish.anx?id=goldfish (Web server needs to [http://annodex.net/software/mod_annodex/ support] this)<br />
* a <b>start</b> and possibly an <b>end</b> attribute, to tell the clip where it is temporally located<br />
* a <b>title</b> attribute to give it a short description<br />
* <b>meta</b> elements to provide it with structured metadata as name-value pairs<br />
* an <b>img</b> element, which links to a picture that represents the content of the clip visually<br />
* an <b>a</b> element, which puts a hyperlink to another Web resource into the clip<br />
* a <b>desc</b> element giving a long, free-text description/annotation/transcription for the clip<br />
<br />
Most of this information is optional.<br />
<br />
== CMML mapping into Ogg ==<br />
<br />
When CMML is mapped into an Ogg logical bitstream, it needs to be serialised first. XML is a hierarchical file format, so it is not generally serialisable. However, CMML has been designed to be serialised easily.<br />
<br />
CMML is serialised by having some initial header packets that set up the CMML decoding environment and contain header-type information. The content packets of a CMML logical bitstream then consist of <b>clip</b> tags only. The <b>stream</b> tag is not copied into the CMML bitstream, as it controls the authoring only.<br />
<br />
All of the CMML bitstream information is text. As it gets encoded into a binary bitstream, an encoding format has to be specified. To simplify things, UTF-8 is defined as the mandatory encoding format for all data in a CMML binary bitstream. Also, the encoding process MUST ensure that newline characters are represented as LF (or "\n" in C) only and replace any new line representations that come as CR LF combinations (or "\r\n" in C) with LF only.<br />
<br />
The media mapping for CMML into Ogg is as follows:<br />
* The bos page contains a CMML ident packet.<br />
* The first secondary header packet of CMML contains the xml preamble.<br />
* The second secondary header packet contains the CMML "head" tag.<br />
* The content or data packets for CMML contain the CMML "clip" tags each encoded in their own packet and inserted at the accurate time.<br />
* The eos page contains a packet with an empty clip tag.<br />
<br />
<br />
=== The CMML ident header packet ===<br />
<br />
The CMML logical bitstream starts with an ident header which is encapsulated into the CMML bos page. The ident header contains all information required to identify the CMML bitstream and to set up a CMML decoder. It has the following format:<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Identifier 'CMML\0\0\0\0' | 0-3<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 4-7<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Version major | Version minor | 8-11<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granulerate numerator | 12-15<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 16-19<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granulerate denominator | 20-23<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| | 24-27<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| Granuleshift | 28<br />
+-+-+-+-+-+-+-+-+<br />
| ...<br />
<br />
The CMML <i>version</i> as described here is major=2 minor=1.<br />
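For illustration, the ident header laid out above can be decoded with the following sketch. Little-endian byte order is assumed here (as in the Ogg Skeleton fisbone headers); the field widths follow the byte offsets in the diagram, and the function name is illustrative.<br />

```python
import struct

def parse_cmml_ident(packet: bytes) -> dict:
    """Parse a CMML ident header per the byte layout above.

    Offsets: 0-7 magic 'CMML\\0\\0\\0\\0', 8-9 version major,
    10-11 version minor, 12-19 granulerate numerator,
    20-27 granulerate denominator, 28 granuleshift.
    """
    if len(packet) < 29 or packet[:8] != b"CMML\x00\x00\x00\x00":
        raise ValueError("not a CMML ident header packet")
    major, minor = struct.unpack_from("<HH", packet, 8)
    num, den = struct.unpack_from("<qq", packet, 12)
    return {"version": (major, minor),
            "granulerate": (num, den),
            "granuleshift": packet[28]}
```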
<br />
The <i>granulerate</i> represents the temporal resolution of the logical bitstream in Hz given as a rational number in the same way as the [[OggSkeleton]] fisbone secondary header specifies granulerate. It enables a mapping of granule position of the data pages to time by calculating "granulepos / granulerate".<br />
<br />
The default granule rate for CMML is: 1/1000.<br />
<br />
The <i>granuleshift</i> is a 1-byte integer describing whether the granule position is partitioned into two fields for the CMML logical bitstream, and how many of the lower bits are used for the partitioning. The upper bits still signify a time-continuous granule position for a directly decodable and presentable data granule. The lower bits specify the granule position of the previous CMML data packet (i.e. "clip" element), which identifies how far backwards a decoder must seek to reach the last, still active "clip" element (of the given track). The granuleshift is therefore the base-2 logarithm of the maximum possible clip spacing in granules.<br />
<br />
The default granule shift used is 32, which halves the granule position to make room for the backwards pointer.<br />
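The partitioning can be sketched as follows (function names are illustrative):<br />

```python
def split_granulepos(granulepos: int, granuleshift: int = 32):
    """Split a CMML granule position into its two parts.

    Returns (current, previous): `current` is the time-continuous
    granule of this clip packet; `previous` is the granule of the
    preceding clip, kept in the lower `granuleshift` bits as a
    backwards pointer for seeking.
    """
    mask = (1 << granuleshift) - 1
    return granulepos >> granuleshift, granulepos & mask

def join_granulepos(current: int, previous: int, granuleshift: int = 32) -> int:
    """Inverse of split_granulepos: pack both parts into one value."""
    return (current << granuleshift) | previous
```

A decoder that lands on a clip packet at granule `current` can thus seek back to `previous` to recover the still-active clip on that track.<br />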
<br />
=== The CMML secondary header packets ===<br />
<br />
The CMML secondary headers are a sequence of two packets that contain the CMML and XML "setup" information:<br />
* one packet with the CMML xml preamble and <b>cmml</b> tag.<br />
* one packet with the CMML <b>head</b> tag.<br />
<br />
These packets contain textual, not binary information.<br />
<br />
The CMML preamble tags are all single-line tags, such as the xml processing instruction (<?xml...>) and the document type declaration (<!DOCTYPE...>).<br />
<br />
The only CMML tag that is not already serialized from a CMML file is the <b>cmml</b> tag, as it encloses all the other content tags. To serialise it, the <b>cmml</b> start tag is transformed into a processing instruction, retaining all its attributes (<?cmml ...>), and the <b>cmml</b> end tag is deleted.<br />
<br />
The first CMML secondary header packet has the following format:<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| <?xml ... | 0-<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| <!DOCTYPE ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| <?cmml ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
<br />
<br />
The second CMML secondary header packet contains the CMML <b>head</b> element with all its attributes and other containing elements and has the following format.<br />
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| <head ... | 0-<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| </head> |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
<br />
<br />
=== The CMML data packets ===<br />
<br />
The data packets of the CMML bitstream contain the CMML <b>clip</b> elements. Their <b>start</b> and <b>end</b> attributes however only exist for authoring purposes and are not copied into the bitstream (to avoid contradictory information), but are rather represented through the time mapping of the encapsulation format that interleaves CMML data with data from other time-continuous bitstreams. Generally the time mapping is done through some timestamp representation and through the position in the stream.<br />
<br />
A <b>clip</b> tag is encoded as a string printed into a clip packet, with all of its attributes and contained elements except for the <b>start</b> and <b>end</b> attributes. The <b>clip</b> tag's <b>start</b> attribute tells the encapsulator at what time to insert the clip packet into the bitstream. If an <b>end</b> attribute is present, it leads to the creation of another clip packet, unless another clip packet starts on the same track beforehand. This clip packet contains an "empty" <b>clip</b> tag, i.e. a <b>clip</b> tag without <b>meta</b>, <b>a</b>, <b>img</b> or <b>desc</b> elements and no attribute values except for a copy of the <b>track</b> attribute from the original <b>clip</b> tag.<br />
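As an illustration of the end-attribute rule, a toy serialiser might look like this (names and attribute handling are simplified; the check that no other clip starts on the same track beforehand is left to the multiplexer):<br />

```python
def clip_packets(clip_attrs: dict, body: str = "") -> list:
    """Produce the packet payloads for one authored clip.

    `clip_attrs` holds the clip's attributes from the authored CMML;
    `start` and `end` are dropped from the serialised tag (timing is
    carried by the Ogg granule positions instead). If `end` is present,
    a second, empty clip tag is emitted, carrying only the `track`
    attribute, to terminate the clip.
    """
    keep = {k: v for k, v in clip_attrs.items() if k not in ("start", "end")}
    attrs = "".join(f' {k}="{v}"' for k, v in sorted(keep.items()))
    packets = [f"<clip{attrs}>{body}</clip>"]
    if "end" in clip_attrs:
        track = clip_attrs.get("track")
        t = f' track="{track}"' if track else ""
        packets.append(f"<clip{t}/>")
    return packets
```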
<br />
0 1 2 3<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| <clip ... | 0-<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| ... |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
| </clip> |<br />
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<br />
<br />
<br />
== Development ==<br />
<br />
Ogg CMML is supported by the following projects:<br />
* the Ogg Directshow filters: see [http://www.illiminable.com/ogg/ illiminable]<br />
* liboggz: [http://svn.annodex.net/liboggz/ liboggz svn] or [http://annodex.net/software/liboggz/ liboggz]<br />
* libcmml: [http://svn.annodex.net/libcmml/ libcmml svn] or [http://annodex.net/software/libcmml/ libcmml]<br />
* libannodex: [http://svn.annodex.net/libannodex/ libannodex svn] or [http://annodex.net/software/libannodex/ libannodex]<br />
* the Annodex technology: [http://www.annodex.net/ annodex.net] including perl, python, php bindings, a firefox plugin, authoring software etc.<br />
<br />
<br />
== External links ==<br />
<br />
* CMML is described in more detail in the CMML v2.1 specification: [http://svn.annodex.net/standards/ I-D in svn] or [http://annodex.net/specifications.html I-D]<br />
<br />
[[Category:Ogg Mappings]]</div>
Silvia
https://wiki.xiph.org/index.php?title=MediaWiki:Sidebar&diff=12676
MediaWiki:Sidebar
2010-11-22T23:20:27Z
<p>Silvia: updated skeleton link</p>
<hr />
<div>__NOTOC__<br />
__NOEDITSECTION__<br />
<div class="portlet"><br />
<font size="+1">'''[[Main Page]]'''</font><br />
===<font color="black">Xiph.Org Projects</font>===<br />
===Audio—===<br />
*[[Vorbis]]<br />
*[[FLAC]]<br />
*[[Speex]]<br />
*[[CELT]]<br />
===Video—===<br />
*[[Theora]]<br />
*[[Dirac]]<br />
===Text—===<br />
*[[XSPF]] <br />
*[[CMML]] <br />
*[[Kate]] <br />
===Container—===<br />
*[[Ogg]] <br />
*[[Ogg Skeleton 4|Skeleton]]<br />
===Streaming—===<br />
[[Icecast]]<br />
</div></div>
Silvia
https://wiki.xiph.org/index.php?title=Main_Page&diff=12673
Main Page
2010-11-22T02:09:04Z
<p>Silvia: moved to skeleton 4</p>
<hr />
<div>In an effort to bring open-source ideals to the world of multimedia the [[Xiph.Org Foundation]] develops a multitude of amazing products. This wiki describes our free and open protocols and software.<br />
<br />
----<br />
<br />
<br />
= Demonstrations of Xiph technologies =<br />
<br />
Want to hear or see Xiph in action? These projects are using our codecs, formats, or libraries.<br />
<br />
* [[VorbisStreams|Vorbis Streams]]: Stations streaming with the [[Vorbis]] codec<br />
* [[Games that use Vorbis]]: Games using the Vorbis codec for music or sound effects<br />
* [[VorbisHardware|Vorbis Hardware]]: Hardware players using the Vorbis codec<br />
* [[VorbisSoftwarePlayers|Vorbis Software Players]]: list of media players with out-of-box support for Vorbis<br />
* [[TheoraHardware|Theora Hardware]]: Hardware using the Theora video codec<br />
* [[TheoraSoftwarePlayers|Theora Software Players]]: list of media players with Theora support<br />
* [[List of Theora videos]]: Sources for video encoded with [[Theora]]<br />
<br />
= Projects/Formats =<br />
<br />
== Container Formats ==<br />
<br />
* [[Ogg]]: Media container. This is our native format and the recommended container for Xiph codecs.<br />
** [[Ogg Skeleton 4]]: Skeleton information on all logical content bitstreams in Ogg.<br />
** [[MIMETypesCodecs|Specification of MIME types and respective codecs parameter]]<br />
* [[SpeexRTP]]: RTP payload format for voice<br />
* [[VorbisRTP]]: RTP payload format for general audio<br />
* [[TheoraRTP]]: RTP payload format for video<br />
* [[XSPF]]: XML Sharable Playlist Format<br />
<br />
== Codecs ==<br />
<br />
* '''Compressed Audio/Video Codecs:'''<br />
** [[Vorbis]]: Audio codec with a [[Tremor|fixed point decoder]]<br />
** [[Theora]]: Video codec<br />
** [[FLAC]]: Free Lossless Audio Codec<br />
** [[Speex]]: Speech codec<br />
* '''Uncompressed Audio/Video Codecs:'''<br />
** [[OggPCM]]: Audio codec<br />
* '''Timed Text/Metadata Codecs:'''<br />
** [[CMML]]: Continuous Media Markup Language, used for [http://www.annodex.net/ Annodex] and subtitles (xine, vlc, gstreamer, and DirectShow support)<br />
**[[OggKate|Kate]]: new format for lyrics and subtitles<br />
<br />
== Software ==<br />
<br />
* '''Software for distributing media'''<br />
** [[Icecast]]: Streaming server<br />
** [[Ices]]: Source client for Icecast servers<br />
<br />
* '''Libraries'''<br />
** [[OggPlay]]: library for synchronised Xiph media playback<br />
**[[XiphQT]]: Quicktime component to play the main Xiph formats<br />
** [[VorbisCommentEdit]]: Macintosh Framework making it easy to incorporate the editing of [[VorbisComment|Vorbis Comments]]<br />
<br />
* '''Other software'''<br />
** [[OggComponent/VorbisComponent]]: Wrappers to integrate Vorbis into Mac OS X (does not yet support encoding)<br />
** [http://xiph.org/paranoia/ cdparanoia]: CDDA extractor/ripper<br />
<br />
== Community ==<br />
<br />
*[[How to help]]<br />
*[[Spread Open Media]]: project to promote Xiph formats.<br />
**[[MailOgging]]: provides templates for anyone willing to contact a company requesting them to add support for Xiph formats.<br />
*[[People]]: Who's who in Xiph.<br />
<br />
== Work in Progress ==<br />
* [[Work In Progress]]: codecs and software still in the research and development stages.<br />
* [[Todo]]: To-do list for various Xiph projects.<br />
<br />
= Project management =<br />
<br />
* [[AdminProcesses]]: who's in charge of what project<br />
* [[MonthlyMeeting]]: page with information on Xiph's MonthlyMeeting<br />
* [[MailingLists]]: list of Xiph's mailing lists<br />
* [[Bounties]]: list of bounties that you can take to improve Xiph's projects<br />
<br />
= Resources for Video and Audio programmers =<br />
<br />
* [[Ambisonics]]: page with technical information on Ambisonics<br />
* [[Resources and papers on Audio, Music and Speech|Courses and papers on Audio, Music and Speech]]: page with links to MIT and other universities' content<br />
* [[Oggless]]: for ideas on how to use the different Xiph codecs outside Ogg<br />
<br />
= Wiki internal =<br />
<br />
* [[Translations]]: What about some translation work<br />
* [[Sandbox]]: Testbed for testing editing skills<br />
* [[XiphWiki:Copyrights]]: License used for all content on the XiphWiki</div>
Silvia
https://wiki.xiph.org/index.php?title=TheoraSoftwareEncoders&diff=11027
TheoraSoftwareEncoders
2010-04-22T12:32:36Z
<p>Silvia: added directshow filters - not sure why they were missing</p>
<hr />
<div>== Multi-platform ==<br />
*[http://www.v2v.cc/~j/ffmpeg2theora/ ffmpeg2theora] a commandline encoder from any format read by ffmpeg to Theora/Vorbis: [http://svn.xiph.org/trunk/ffmpeg2theora/ svn], [http://www.v2v.cc/~j/ffmpeg2theora/download.html major releases], [http://firefogg.org/nightly/ very latest versions of ffmpeg2theora and some more stuff]<br />
* [http://diracvideo.org/download/ffmpeg2dirac/ ffmpeg2dirac] - fork of ffmpeg2theora, can encode into Ogg Dirac but also Theora<br />
*[http://www.videolan.org/ VLC Media Player] Can transcode from any source it supports into Ogg/Theora. WARNING: Apparently creates broken Ogg streams.<br />
*[http://gstreamer.freedesktop.org/ GStreamer] GStreamer is a library that allows the construction of graphs of media-handling components, ranging from simple Vorbis playback to complex audio (mixing) and video (non-linear editing) processing.<br />
*[http://sarava.org/theorur/ Theorur] is a GUI for Ogg/Theora streaming (icecast2 system), written using gtk2.<br />
*[http://handbrake.fr Handbrake] is a GUI/CLI free software for ripping/encoding DVD/Files into various containers and formats including theora & vorbis since September 2008. <br />
*[http://firefogg.org/ firefogg] is a Firefox extension that encodes locally and uploads in chunks or when encoding finishes<br />
<br />
== Windows ==<br />
*[http://dir.visonair.tv/streamer.php Visonair.tv Ogg Streamer] A Windows application to stream directly from a webcam to an Icecast server.<br />
*[http://www.visonair.tv/ Visonair.tv Virtual Stage] Includes an application to encode to Theora, forces fixed size and encoding parameters though. Registration required.<br />
*[http://www.freewarefiles.com/program_6_227_33306.html GFrontend] GUI Frontend for ffmpeg2theora. An unsupported / discontinued open source project.<br />
*[http://sourceforge.net/projects/theoraconverter/ Theora Converter .NET] A GUI frontend for ffmpeg2theora based on GFrontend. Supports 2 pass encoding with ffmpeg2theora 0.26.<br />
*[http://mediacoder.sourceforge.net/ MediaCoder] Application to encode media files into many target formats, including Theora.<br />
*[http://www.erightsoft.com/ SUPER] General purpose converter application, also serves as a frontend to ffmpeg2theora.<br />
*[http://teejee2008.wordpress.com/ffcoder/ FFCoder], general purpose converter<br />
*[http://en.wikipedia.org/wiki/Theora#Encoding The Wikipedia Theora Page] provides an up to date list of software that can encode Theora.<br />
*[http://www.xiph.org/dshow/Ogg DSF DirectShow Filters for Windows] allows playback and encoding of Ogg Theora/Vorbis in all Windows apps that use the DirectShow framework.<br />
<br />
== Linux/BSD ==<br />
*[http://thoggen.net/ Thoggen] is a DVD backup utility ('DVD ripper') for Linux, based on GStreamer and Gtk+.<br />
*[http://freej.org/ Freej] Freej is a realtime video mixer. It can stream Theora and Vorbis live to [http://icecast.org icecast]. Check [http://lab.dyne.org/FreejStreaming here] for more info.<br />
* [http://oggconvert.tristanb.net/ OggConvert] is a small Gnome utility which uses GStreamer to convert (almost) any media file to Vorbis, Theora and Dirac.<br />
<br />
== Mac OS X ==<br />
*[http://xiph.org/quicktime XiphQT] allows you to export from Quicktime, or any application supporting Quicktime (e.g. Final Cut Pro), to Theora/Vorbis.<br />
*[http://v2v.cc/~j/SimpleTheoraEncoder/ Simple Theora Encoder] an ffmpeg2theora frontend<br />
<br />
== See also== <br />
{{Template:Theora}}<br />
<br />
[[Category:Theora]]</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10943
SkeletonHeaders
2010-03-24T01:18:47Z
<p>Silvia: removed transparentcolor</p>
<hr />
<div>{{draft}}<br />
<br />
= Ogg Skeleton 3.4 with new Message Headers =<br />
<br />
'''DRAFT, last updated 23 March 2010'''<br />
<br />
'''This specification is still a work in progress, and does not yet constitute an official Ogg specification.'''<br />
<br />
<br />
== Adding New Message Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to use cases it hasn't had to solve before but was built to allow; see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as videos with multiple view angles encoded in one file, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several Theora, several Vorbis and several Kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
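This field, and the fields proposed below, share the simple "Field: value" text convention of HTTP headers. A minimal parsing sketch (the function name is illustrative, and header folding/continuation lines are not handled):<br />

```python
def parse_message_headers(block: bytes) -> dict:
    """Parse Skeleton message header fields from a fisbone packet body.

    Assumes HTTP-style 'Field: value' lines with CRLF or LF endings
    and UTF-8 text; this sketch keeps only the last value per field.
    """
    headers = {}
    for line in block.decode("utf-8").splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers
```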
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
The Language field has the dominating language specified first. Additional, non-dominating languages can be listed after the main language.<br />
<br />
Example:<br />
Language: en-US, fr<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes the semantic type of content contained in a track. Each track can have only a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible too, but new roles should only be introduced if there is a real need for them.<br />
<br />
Text tracks:<br />
* "text/caption" - transcription of all sounds, including speech, for purposes of the hard-of-hearing<br />
* "text/subtitle" - translation of all speech, typically into a different language<br />
* "text/textaudiodesc" - description/transcription of everything that happens in a video as text to be used for the vision-impaired through screen readers or braille<br />
* "text/karaoke" - music lyrics delivered in chunks for singing along<br />
* "text/chapters" - titles for sections of the media that provide a kind of chapter segmentation (similar to DVD chapters)<br />
* "text/tickertext" - text to run as informative text at the bottom of the media display<br />
* "text/lyrics" - transcript of the text used in music media<br />
* "text/metadata" - name-value pairs that are associated with certain sections of the media<br />
* "text/annotation" - free text associated with certain sections of the media<br />
* "text/linguistic" - linguistic markup of the spoken words<br />
<br />
Video tracks:<br />
* "video/main" - the main video track<br />
* "video/alternate" - an alternative video track, e.g. different camera angle<br />
* "video/sign" - a sign language video track<br />
<br />
Audio tracks:<br />
* "audio/main" - the main audio track<br />
* "audio/alternate" - an alternative audio track, probably linked to an alternate video track<br />
* "audio/dub" - the audio track but with speech in a different language to the original<br />
* "audio/audiodesc" - an audio description recording for the vision-impaired <br />
* "audio/music" - a music track, e.g. when music, speech and sound effects are delivered in different tracks<br />
* "audio/speech" - a speech track, e.g. when music, speech and sound effects are delivered in different tracks<br />
* "audio/sfx" - a sound effects track, e.g. when music, speech and sound effects are delivered in different tracks<br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters to describe the roles better, such as "video/alternate;angle=nw"<br />
<br />
<br />
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
The name needs to be unique among all track names; otherwise it is undefined which of the tracks is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
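These character productions correspond to the XML Name character classes without ":". Transcribed directly into a regular expression (a sketch; the ranges are copied from the lists above):<br />

```python
import re

# First-character and following-character classes, taken verbatim
# from the productions above.
_FIRST = ("A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
          "\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
          "\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF")
_REST = _FIRST + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"
_NAME_RE = re.compile(f"[{_FIRST}][{_REST}]*\\Z")

def is_valid_track_name(name: str) -> bool:
    """Check a Skeleton Name value against the productions above."""
    return bool(name) and _NAME_RE.match(name) is not None
```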
<br />
<br />
=== Title ===<br />
<br />
A free text field to provide a description of the track content.<br />
<br />
Example:<br />
Title: "the French audio track for the movie"<br />
<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not informed about how a content author intends a media file to be displayed have no chance of displaying the content "correctly". The Display-hint message header field therefore allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Currently proposed hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display relative to the zero coordinates of the display area of the video, with x,y giving the top left corner of the PIP video and w,h its (optional) width and height in pixels. x, y, w, and h can also be specified as percentages, allowing placement that persists independent of the scaling of the video display.<br />
<br />
Examples:<br />
Display-hint: pip(20%,20%)<br />
Display-hint: pip(40,40,690,60)<br />
<br />
* mask(img,x,y,w,h) on a video track - use the image given at img url (?) as a video mask to allow the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width and height w and h. x,y,w, and h can be provided in pixels or in percent. Pixels under the white background are made transparent and only pixels under the black shape are retained.<br />
<br />
Examples:<br />
Display-hint: mask(http://www.example.com/image.png)<br />
Display-hint: mask(http://www.example.com/image.png,30%,25%)<br />
Display-hint: mask(http://www.example.com/image.png,20,20,400,320)<br />
<br />
* transparent(transparency) on a video track - put a transparency of the given percentage (integer value between 0 and 100) on the complete video track, as it will be rendered on top of other content. This transparency is applied to all pixels in the same way.<br />
<br />
Examples:<br />
Display-hint: transparent(25%)<br />
Display-hint: transparent(7%)<br />
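A possible parser for such Display-hint values (illustrative only; it splits on commas and does not validate argument counts or units):<br />

```python
import re

def parse_display_hint(value: str):
    """Parse a Display-hint value such as 'pip(20%,20%)' or
    'mask(http://www.example.com/image.png,30%,25%)' into
    a (hint, args) pair.
    """
    m = re.match(r"\s*([a-z]+)\(([^)]*)\)\s*\Z", value)
    if not m:
        raise ValueError(f"unrecognised Display-hint: {value!r}")
    hint, arglist = m.group(1), m.group(2)
    args = [a.strip() for a in arglist.split(",")] if arglist else []
    return hint, args
```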
<br />
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, the means to number through the tracks is by the order in which the bos pages of the tracks appear in the Ogg stream. If a file is re-encoded, the order may change, so you can only rely on this for addressing if the file doesn't change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
This track order is simply to have a means to address tracks through an index in a consistent manner across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on what should be displayed on top of which other track.<br />
<br />
<br />
=== Altitude ===<br />
<br />
The Altitude (better name?) message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a "main" track is always displayed bottom-most unless otherwise defined. <br />
<br />
The Altitude field takes the same numerical values as the z-index in CSS, unlimited negative and positive numbers.<br />
An element with greater stack order is always in front of an element with a lower stack order.<br />
<br />
Example: Altitude: -150<br />
<br />
<br />
=== Track dependencies ===<br />
<br />
It is tempting to introduce dependencies between tracks - to specify things such as:<br />
<br />
* track b depends on track a being available (e.g. main audio depending on main video), so always display them together and if you remove a track, remove all depending tracks, too<br />
<br />
* track c and d are alternative tracks to track b (e.g. dubs in other languages for main audio), so don't display them together and if you activate one, disable the others<br />
<br />
* track a and one of b,c,d one of e,f,g where e depends on b, f depends on c, and g depends on d, make up a presentation profile and should be displayed together (e.g. main video, one of the audio dubs, and their respective captions).<br />
<br />
It is not clear yet whether there is an actual need to maintain this information as author-provided hints or whether a media player can itself determine a lot from the other fields, such as role and language.<br />
<br />
MPEG has a "groupID" element which allows for tracks to be put into groups of alternative tracks. This feature is, however, not used very often and decisions are being left to the media player.<br />
<br />
At this stage, it's probably too early to make a specification for how to encode this in Ogg. The need has not been totally clarified yet.</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10942
SkeletonHeaders
2010-03-24T01:02:07Z
<p>Silvia: added descriptions to the roles</p>
<hr />
<div>{{draft}}<br />
<br />
= Ogg Skeleton 3.4 with new Message Headers =<br />
<br />
'''DRAFT, last updated 23 March 2010'''<br />
<br />
'''This specification is still a work in progress, and does not yet constitute an official Ogg specification.'''<br />
<br />
<br />
== Adding New Message Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to use cases it hasn't had to solve before but was built to allow; see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as videos with multiple view angles encoded in one file, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several Theora, several Vorbis and several Kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
The Language field has the dominating language specified first. Additional, non-dominating languages can be listed after the main language.<br />
<br />
Example:<br />
Language: en-US, fr<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes the semantic type of content contained in a track. Each track can have only a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible too, but new roles should only be introduced if there is a real need for them.<br />
<br />
Text tracks:<br />
* "text/caption" - transcription of all sounds, including speech, for purposes of the hard-of-hearing<br />
* "text/subtitle" - translation of all speech, typically into a different language<br />
* "text/textaudiodesc" - description/transcription of everything that happens in a video as text to be used for the vision-impaired through screen readers or braille<br />
* "text/karaoke" - music lyrics delivered in chunks for singing along<br />
* "text/chapters" - titles for sections of the media that provide a kind of chapter segmentation (similar to DVD chapters)<br />
* "text/tickertext" - text to run as informative text at the bottom of the media display<br />
* "text/lyrics" - transcript of the text used in music media<br />
* "text/metadata" - name-value pairs that are associated with certain sections of the media<br />
* "text/annotation" - free text associated with certain sections of the media<br />
* "text/linguistic" - linguistic markup of the spoken words<br />
<br />
Video tracks:<br />
* "video/main" - the main video track<br />
* "video/alternate" - an alternative video track, e.g. different camera angle<br />
* "video/sign" - a sign language video track<br />
<br />
Audio tracks:<br />
* "audio/main" - the main audio track<br />
* "audio/alternate" - an alternative audio track, probably linked to an alternate video track<br />
* "audio/dub" - the audio track but with speech in a different language to the original<br />
* "audio/audiodesc" - an audio description recording for the vision-impaired <br />
* "audio/music" - a music track, e.g. when music, speech and sound effects are delivered in different tracks<br />
* "audio/speech" - a speech track, e.g. when music, speech and sound effects are delivered in different tracks<br />
* "audio/sfx" - a sound effects track, e.g. when music, speech and sound effects are delivered in different tracks<br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters that describe the roles further, such as "video/alternate;angle=nw".<br />
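Since Role values follow the Content-type convention, they can be parsed in the same way; a minimal sketch (function and variable names are invented for illustration):<br />

```python
def parse_role(value):
    """Split a Role value such as "video/alternate;angle=nw"
    into the role proper and a dict of optional parameters,
    following the Content-type-style syntax described above."""
    parts = [p.strip() for p in value.split(";")]
    role = parts[0]
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, val = part.split("=", 1)
            params[key.strip()] = val.strip()
    return role, params

# parse_role('video/alternate;angle=nw')
# -> ('video/alternate', {'angle': 'nw'})
```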
<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not informed about how a content author intends a media file to be displayed have no chance of displaying the content "correctly". This is why the Display-hint message header field allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Currently proposed hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display relative to the origin of the display area of the video, with x,y giving the top left corner of the PIP video and w,h its width and height in pixels (w and h are optional). x, y, w, and h can also be specified as percentages, which allows persistent placement independent of the scaling of the video display.<br />
<br />
Examples:<br />
Display-hint: pip(20%,20%)<br />
Display-hint: pip(40,40,690,60)<br />
<br />
* mask(img,x,y,w,h) on a video track - use the image at the given img URL as a video mask, allowing the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width w and height h. x, y, w, and h can be provided in pixels or in percent. Pixels under the white background are made transparent; only pixels under the black shape are retained.<br />
<br />
Examples:<br />
Display-hint: mask(http://www.example.com/image.png)<br />
Display-hint: mask(http://www.example.com/image.png,30%,25%)<br />
Display-hint: mask(http://www.example.com/image.png,20,20,400,320)<br />
<br />
* transparent(transparency) on a video track - apply a transparency of x% (an integer value between 0 and 100) to the complete video track as it is rendered on top of other content. The transparency is applied to all pixels equally.<br />
<br />
Examples:<br />
Display-hint: transparent(25%)<br />
Display-hint: transparent(7%)<br />
<br />
* transparentcolor(colorcode) on a video track - turn all pixels of the color identified by the colorcode into transparent pixels.<br />
<br />
Examples:<br />
Display-hint: transparentcolor(#454545)<br />
Display-hint: transparentcolor(#777777)<br />
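All of the hints above share a name(arg,arg,...) shape, so a player can tokenise them generically before interpreting each individual hint. A sketch, in which representing a percentage as a ("%", value) tuple is an invented convention, not part of this draft:<br />

```python
import re

def parse_display_hint(value):
    """Split a Display-hint value such as "pip(20%,20%)" into its
    function name and argument list. Pixel arguments become ints,
    percentages become ("%", value) tuples, and anything else
    (e.g. the mask image URL) stays a string."""
    m = re.fullmatch(r"(\w+)\((.*)\)", value.strip())
    if m is None:
        raise ValueError("not a display hint: %r" % value)
    name, raw = m.group(1), m.group(2)
    args = []
    for arg in (a.strip() for a in raw.split(",") if a.strip()):
        if arg.endswith("%"):
            args.append(("%", int(arg[:-1])))
        elif arg.isdigit():
            args.append(int(arg))
        else:
            args.append(arg)
    return name, args
```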
<br />
=== Name ===<br />
<br />
This field allows a free-text string to be associated with the track, so that the track can be addressed directly through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
The name needs to be unique among all the track names; otherwise it is undefined which of the tracks is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
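A validator for these name rules might look as follows; this sketch only covers the ASCII portion of the character ranges quoted above, while the full rule additionally permits the listed non-ASCII ranges:<br />

```python
import re

# ASCII subset of the XML-id character rules above; the many
# non-ASCII ranges (#xC0-#xD6, #x370-#x37D, ...) are omitted
# here for brevity.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-\xB7]*\Z")

def is_valid_track_name(name):
    """Return True if name satisfies the (ASCII subset of the)
    track-name character rules: letter or underscore first, then
    letters, digits, "_", "-", ".", or middle dot."""
    return bool(_NAME_RE.match(name))
```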
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their bos pages appear in the Ogg stream. If a file is re-encoded, this order may change, so you can only rely on it for addressing if the file doesn't change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
This track order simply provides a means to address tracks through an index in a consistent manner across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on which track should be displayed on top of which other track.<br />
<br />
<br />
=== Altitude ===<br />
<br />
The Altitude (better name?) message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a "main" track is always displayed bottom-most unless otherwise defined. <br />
<br />
The Altitude field takes the same numerical values as the z-index in CSS: unbounded negative and positive integers.<br />
An element with a greater stack order is always in front of an element with a lower stack order.<br />
<br />
Example: Altitude: -150<br />
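Given Altitude values, a player can derive the rendering order with an ordinary stable sort. In this sketch, tracks are plain dicts, and the default Altitude of 0 for tracks without the field is an assumption (the draft does not specify a default):<br />

```python
def render_order(tracks):
    """Sort tracks bottom-most first, following the CSS z-index
    semantics described above: a greater Altitude renders further
    towards the front. Tracks without an Altitude default to 0
    (an assumption); ties keep their original track order because
    sorted() is stable."""
    return sorted(tracks, key=lambda track: track.get("Altitude", 0))

# render_order([{"id": "captions", "Altitude": 10},
#               {"id": "main", "Altitude": -150}])
# -> main first (bottom-most), captions last (top-most)
```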
<br />
<br />
=== Track dependencies ===<br />
<br />
It is tempting to introduce dependencies between tracks - to specify things such as:<br />
<br />
* track b depends on track a being available (e.g. main audio depending on main video), so always display them together and if you remove a track, remove all depending tracks, too<br />
<br />
* track c and d are alternative tracks to track b (e.g. dubs in other languages for main audio), so don't display them together and if you activate one, disable the others<br />
<br />
* track a, one of b,c,d, and one of e,f,g (where e depends on b, f depends on c, and g depends on d) make up a presentation profile and should be displayed together (e.g. main video, one of the audio dubs, and their respective captions).<br />
<br />
It is not clear yet whether there is an actual need to maintain this information as author-provided hints or whether a media player can itself determine a lot from the other fields, such as role and language.<br />
<br />
MPEG has a "groupID" element which allows for tracks to be put into groups of alternative tracks. This feature is, however, not used very often and decisions are being left to the media player.<br />
<br />
At this stage, it is probably too early to specify how to encode this in Ogg; the need has not been fully clarified yet.</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10941
SkeletonHeaders
2010-03-24T00:46:47Z
<p>Silvia: added multi-value to language</p>
<hr />
<div>{{draft}}<br />
<br />
= Ogg Skeleton 3.4 with new Message Headers =<br />
<br />
'''DRAFT, last updated 23 March 2010'''<br />
<br />
'''This specification is still a work in progress, and does not yet constitute an official Ogg specification.'''<br />
<br />
<br />
== Adding New Message Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn't had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
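BCP 47 tags like these are built from hyphen-separated subtags with conventional casing; a minimal case-normalisation sketch (full BCP 47 processing, e.g. script subtags and the subtag registry, is out of scope here):<br />

```python
def normalize_tag(tag):
    """Case-normalise a language tag along common BCP 47
    conventions: primary language subtag lowercase, two-letter
    region subtags uppercase. A sketch only; full BCP 47 handling
    also covers script subtags, extensions, etc."""
    parts = tag.split("-")
    out = [parts[0].lower()]
    for part in parts[1:]:
        out.append(part.upper() if len(part) == 2 else part.lower())
    return "-".join(out)

# normalize_tag("en-us") -> "en-US"
```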
<br />
The Language field specifies the dominating language first. Additional, non-dominating languages can be listed after the main language.<br />
<br />
Example:<br />
Language: en-US, fr<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes the semantic type of content contained in a track. Every track can only have a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible, too, but new roles should only be introduced if there is a genuine need for them.<br />
<br />
Text tracks:<br />
* "text/caption"<br />
* "text/subtitle"<br />
* "text/textaudiodesc"<br />
* "text/karaoke"<br />
* "text/chapters"<br />
* "text/tickertext"<br />
* "text/lyrics"<br />
* "text/activeregion"<br />
* "text/metadata"<br />
* "text/annotation"<br />
* "text/transcript"<br />
* "text/linguistic"<br />
<br />
Video tracks:<br />
* "video/main"<br />
* "video/alternate" (e.g. different camera angle)<br />
* "video/sign" (for sign language)<br />
* "video/alpha" (a track to alpha blend)<br />
<br />
Audio tracks:<br />
* "audio/main"<br />
* "audio/alternate" (probably linked to an alternate video track)<br />
* "audio/dub"<br />
* "audio/audiodesc"<br />
* "audio/music"<br />
* "audio/speech"<br />
* "audio/sfx" (sound effects) <br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters to describe the roles better, such as "video/alternate;angle=nw"<br />
<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not informed about how a content author intends a media file to be displayed have no chance of displaying the content "correctly". This is why the Display-hint message header field allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Currently proposed hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display relative to the origin of the display area of the video, with x,y giving the top left corner of the PIP video and w,h its width and height in pixels (w and h are optional). x, y, w, and h can also be specified as percentages, which allows persistent placement independent of the scaling of the video display.<br />
<br />
Examples:<br />
Display-hint: pip(20%,20%)<br />
Display-hint: pip(40,40,690,60)<br />
<br />
* mask(img,x,y,w,h) on a video track - use the image at the given img URL as a video mask, allowing the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width w and height h. x, y, w, and h can be provided in pixels or in percent. Pixels under the white background are made transparent; only pixels under the black shape are retained.<br />
<br />
Examples:<br />
Display-hint: mask(http://www.example.com/image.png)<br />
Display-hint: mask(http://www.example.com/image.png,30%,25%)<br />
Display-hint: mask(http://www.example.com/image.png,20,20,400,320)<br />
<br />
* transparent(transparency) on a video track - apply a transparency of x% (an integer value between 0 and 100) to the complete video track as it is rendered on top of other content. The transparency is applied to all pixels equally.<br />
<br />
Examples:<br />
Display-hint: transparent(25%)<br />
Display-hint: transparent(7%)<br />
<br />
* transparentcolor(colorcode) on a video track - turn all pixels of the color identified by the colorcode into transparent pixels.<br />
<br />
Examples:<br />
Display-hint: transparentcolor(#454545)<br />
Display-hint: transparentcolor(#777777)<br />
<br />
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
The name needs to be unique among all the track names; otherwise it is undefined which of the tracks is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, the means to number through the tracks is by the order in which the bos pages of the tracks appear in the Ogg stream. If a file is re-encoded, the order may change, so you can only rely on this for addressing if the file doesn't change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
This track order is simply to have a means to address tracks through an index in a consistent manner across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on what should be displayed on top of which other track.<br />
<br />
<br />
=== Altitude ===<br />
<br />
The Altitude (better name?) message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a "main" track is always displayed bottom-most unless otherwise defined. <br />
<br />
The Altitude field takes the same numerical values as the z-index in CSS, unlimited negative and positive numbers.<br />
An element with greater stack order is always in front of an element with a lower stack order.<br />
<br />
Example: Altitude: -150<br />
<br />
<br />
=== Track dependencies ===<br />
<br />
It is tempting to introduce dependencies between tracks - to specify things such as:<br />
<br />
* track b depends on track a being available (e.g. main audio depending on main video), so always display them together and if you remove a track, remove all depending tracks, too<br />
<br />
* track c and d are alternative tracks to track b (e.g. dubs in other languages for main audio), so don't display them together and if you activate one, disable the others<br />
<br />
* track a, one of b,c,d, and one of e,f,g (where e depends on b, f depends on c, and g depends on d) make up a presentation profile and should be displayed together (e.g. main video, one of the audio dubs, and their respective captions).<br />
<br />
It is not clear yet whether there is an actual need to maintain this information as author-provided hints or whether a media player can itself determine a lot from the other fields, such as role and language.<br />
<br />
MPEG has a "groupID" element which allows for tracks to be put into groups of alternative tracks. This feature is, however, not used very often and decisions are being left to the media player.<br />
<br />
At this stage, it's probably too early to make a specification for how to encode this in Ogg. The need has not been totally clarified yet.</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10939
SkeletonHeaders
2010-03-23T05:45:22Z
<p>Silvia: added intro</p>
<hr />
<div>{{draft}}<br />
<br />
= Ogg Skeleton 3.4 with new Message Headers =<br />
<br />
'''DRAFT, last updated 23 March 2010'''<br />
<br />
'''This specification is still a work in progress, and does not yet constitute an official Ogg specification.'''<br />
<br />
<br />
== Adding New Message Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn't had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes the semantic type of content contained in a track. Every track can only have a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible, too, but new roles should only be introduced if there is a genuine need for them.<br />
<br />
Text tracks:<br />
* "text/caption"<br />
* "text/subtitle"<br />
* "text/textaudiodesc"<br />
* "text/karaoke"<br />
* "text/chapters"<br />
* "text/tickertext"<br />
* "text/lyrics"<br />
* "text/activeregion"<br />
* "text/metadata"<br />
* "text/annotation"<br />
* "text/transcript"<br />
* "text/linguistic"<br />
<br />
Video tracks:<br />
* "video/main"<br />
* "video/alternate" (e.g. different camera angle)<br />
* "video/sign" (for sign language)<br />
* "video/alpha" (a track to alpha blend)<br />
<br />
Audio tracks:<br />
* "audio/main"<br />
* "audio/alternate" (probably linked to an alternate video track)<br />
* "audio/dub"<br />
* "audio/audiodesc"<br />
* "audio/music"<br />
* "audio/speech"<br />
* "audio/sfx" (sound effects) <br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters to describe the roles better, such as "video/alternate;angle=nw"<br />
<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not informed about how a content author intends a media file to be displayed have no chance of displaying the content "correctly". This is why the Display-hint message header field allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Currently proposed hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display relative to the origin of the display area of the video, with x,y giving the top left corner of the PIP video and w,h its width and height in pixels (w and h are optional). x, y, w, and h can also be specified as percentages, which allows persistent placement independent of the scaling of the video display.<br />
<br />
Examples:<br />
Display-hint: pip(20%,20%)<br />
Display-hint: pip(40,40,690,60)<br />
<br />
* mask(img,x,y,w,h) on a video track - use the image at the given img URL as a video mask, allowing the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width w and height h. x, y, w, and h can be provided in pixels or in percent. Pixels under the white background are made transparent; only pixels under the black shape are retained.<br />
<br />
Examples:<br />
Display-hint: mask(http://www.example.com/image.png)<br />
Display-hint: mask(http://www.example.com/image.png,30%,25%)<br />
Display-hint: mask(http://www.example.com/image.png,20,20,400,320)<br />
<br />
* transparent(transparency) on a video track - apply a transparency of x% (an integer value between 0 and 100) to the complete video track as it is rendered on top of other content. The transparency is applied to all pixels equally.<br />
<br />
Examples:<br />
Display-hint: transparent(25%)<br />
Display-hint: transparent(7%)<br />
<br />
* transparentcolor(colorcode) on a video track - turn all pixels of the color identified by the colorcode into transparent pixels.<br />
<br />
Examples:<br />
Display-hint: transparentcolor(#454545)<br />
Display-hint: transparentcolor(#777777)<br />
<br />
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
The name needs to be unique among all the track names; otherwise it is undefined which of the tracks is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, the means to number through the tracks is by the order in which the bos pages of the tracks appear in the Ogg stream. If a file is re-encoded, the order may change, so you can only rely on this for addressing if the file doesn't change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
This track order is simply to have a means to address tracks through an index in a consistent manner across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on what should be displayed on top of which other track.<br />
<br />
<br />
=== Altitude ===<br />
<br />
The Altitude (better name?) message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a "main" track is always displayed bottom-most unless otherwise defined. <br />
<br />
The Altitude field takes the same numerical values as the z-index in CSS, unlimited negative and positive numbers.<br />
An element with greater stack order is always in front of an element with a lower stack order.<br />
<br />
Example: Altitude: -150<br />
<br />
<br />
=== Track dependencies ===<br />
<br />
It is tempting to introduce dependencies between tracks - to specify things such as:<br />
<br />
* track b depends on track a being available (e.g. main audio depending on main video), so always display them together and if you remove a track, remove all depending tracks, too<br />
<br />
* track c and d are alternative tracks to track b (e.g. dubs in other languages for main audio), so don't display them together and if you activate one, disable the others<br />
<br />
* track a, one of b,c,d, and one of e,f,g (where e depends on b, f depends on c, and g depends on d) make up a presentation profile and should be displayed together (e.g. main video, one of the audio dubs, and their respective captions).<br />
<br />
It is not clear yet whether there is an actual need to maintain this information as author-provided hints or whether a media player can itself determine a lot from the other fields, such as role and language.<br />
<br />
MPEG has a "groupID" element which allows for tracks to be put into groups of alternative tracks. This feature is, however, not used very often and decisions are being left to the media player.<br />
<br />
At this stage, it's probably too early to make a specification for how to encode this in Ogg. The need has not been totally clarified yet.</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10938
SkeletonHeaders
2010-03-23T05:22:13Z
<p>Silvia: /* Display-hint */</p>
<hr />
<div>== Adding New Required Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn't had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes the semantic type of content contained in a track. Every track can only have a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible, too, but new roles should only be introduced if there is a genuine need for them.<br />
<br />
Text tracks:<br />
* "text/caption"<br />
* "text/subtitle"<br />
* "text/textaudiodesc"<br />
* "text/karaoke"<br />
* "text/chapters"<br />
* "text/tickertext"<br />
* "text/lyrics"<br />
* "text/activeregion"<br />
* "text/metadata"<br />
* "text/annotation"<br />
* "text/transcript"<br />
* "text/linguistic"<br />
<br />
Video tracks:<br />
* "video/main"<br />
* "video/alternate" (e.g. different camera angle)<br />
* "video/sign" (for sign language)<br />
* "video/alpha" (a track to alpha blend)<br />
<br />
Audio tracks:<br />
* "audio/main"<br />
* "audio/alternate" (probably linked to an alternate video track)<br />
* "audio/dub"<br />
* "audio/audiodesc"<br />
* "audio/music"<br />
* "audio/speech"<br />
* "audio/sfx" (sound effects) <br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters to describe the roles better, such as "video/alternate;angle=nw"<br />
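As a sketch of how such a value might be parsed (a hypothetical helper, not part of any Skeleton implementation):<br />
<br />
```python
# Hypothetical helper: split a Role value into its base role and
# optional ";key=value" parameters, e.g. "video/alternate;angle=nw".
def parse_role(value):
    parts = value.split(";")
    params = {}
    for p in parts[1:]:
        if "=" in p:
            key, val = p.split("=", 1)
            params[key.strip()] = val.strip()
    return parts[0].strip(), params

print(parse_role("video/alternate;angle=nw"))
# ('video/alternate', {'angle': 'nw'})
```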
<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not informed about how a content author intends a media file to be displayed have no chance of displaying the content "correctly". This is why the Display-hint message header field allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Currently proposed hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display relative to the origin of the display area of the video, with x,y giving the position of the top left corner of the PIP video and w,h its width and height in pixels (w and h are optional). x, y, w, and h can also be specified as percentages, allowing placement that stays consistent independently of the scaling of the video display.<br />
<br />
Examples:<br />
Display-hint: pip(20%,20%)<br />
Display-hint: pip(40,40,690,60)<br />
<br />
* mask(img,x,y,w,h) on a video track - use the image given at img url (?) as a video mask to allow the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width and height w and h. x,y,w, and h can be provided in pixels or in percent. Pixels under the white background are made transparent and only pixels under the black shape are retained.<br />
<br />
Examples:<br />
Display-hint: mask(http://www.example.com/image.png)<br />
Display-hint: mask(http://www.example.com/image.png,30%,25%)<br />
Display-hint: mask(http://www.example.com/image.png,20,20,400,320)<br />
<br />
* transparent(transparency) on a video track - apply a transparency of the given percentage (integer value between 0 and 100) to the complete video track as it is rendered on top of other content. This transparency is applied to all pixels in the same way.<br />
<br />
Examples:<br />
Display-hint: transparent(25%)<br />
Display-hint: transparent(7%)<br />
<br />
* transparentcolor(colorcode) on a video track - turn all pixels of the color identified by the colorcode into transparent pixels.<br />
<br />
Examples:<br />
Display-hint: transparentcolor(#454545)<br />
Display-hint: transparentcolor(#777777)<br />
<br />
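A sketch of how a player might break such hint strings into a name and argument list (illustrative Python; a real parser would also validate argument counts and units per hint, and commas inside a mask URL would need smarter handling):<br />
<br />
```python
import re

# Illustrative: split "pip(20%,20%)" into ("pip", ["20%", "20%"]).
# Per-hint argument validation is left out; commas inside a mask URL
# would defeat the naive comma split used here.
def parse_display_hint(value):
    m = re.match(r"\s*([a-z]+)\((.*)\)\s*$", value)
    if not m:
        raise ValueError("not a Display-hint: %r" % value)
    name, args = m.group(1), m.group(2)
    arglist = [a.strip() for a in args.split(",")] if args else []
    return name, arglist

print(parse_display_hint("pip(20%,20%)"))     # ('pip', ['20%', '20%'])
print(parse_display_hint("transparent(25%)")) # ('transparent', ['25%'])
```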
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
The name needs to be unique among all the track names; otherwise it is undefined which of the tracks is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
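The character ranges above can be transcribed into a validation routine; this sketch renders them as a Python regular expression (assumes Python 3, where \u and \U escapes denote code points):<br />
<br />
```python
import re

# The XML NameStartChar / NameChar ranges listed above, transcribed
# into regular-expression character classes.
_START = ("A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
          "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D\u2070-\u218F"
          "\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD"
          "\U00010000-\U000EFFFF")
_REST = _START + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

NAME_RE = re.compile("[%s][%s]*$" % (_START, _REST))

def is_valid_track_name(name):
    return bool(NAME_RE.match(name))

print(is_valid_track_name("Madonna_singing"))  # True
print(is_valid_track_name("1track"))           # False: starts with a digit
```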
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their bos pages appear in the Ogg stream. If a file is re-encoded, this order may change, so you can only rely on it for addressing if the file doesn't change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
This track order is simply to have a means to address tracks through an index in a consistent manner across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on what should be displayed on top of which other track.<br />
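The indexing above can be sketched as follows (the track list is hand-written for illustration; a real demuxer would collect it while scanning the stream's bos pages):<br />
<br />
```python
# Illustrative: index tracks by bos-page order, as in the example above.
bos_order = [
    "Skeleton",
    "Theora (main video)",
    "Vorbis (main audio)",
    "Kate (English captions)",
    "Kate (German subtitles)",
    "Vorbis (audio descriptions)",
    "Theora (sign language)",
]
track = {i: name for i, name in enumerate(bos_order)}

print(track[1])  # Theora (main video)
```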
<br />
<br />
=== Altitude ===<br />
<br />
The Altitude (better name?) message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a "main" track is always displayed bottom-most unless otherwise defined. <br />
<br />
The Altitude field takes the same numerical values as the z-index in CSS, unlimited negative and positive numbers.<br />
An element with greater stack order is always in front of an element with a lower stack order.<br />
<br />
Example: Altitude: -150<br />
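A sketch of how a renderer might order tracks by this field (hypothetical data structure; treating a "main" track without an Altitude as bottom-most is an assumption taken from the text above):<br />
<br />
```python
# Illustrative: sort tracks back-to-front by Altitude, CSS z-index style.
tracks = [
    {"name": "main_video", "role": "video/main"},  # no Altitude given
    {"name": "captions", "altitude": 100},
    {"name": "logo", "altitude": -150},
]

def altitude(t):
    # Assumption: a "main" track without an Altitude sorts bottom-most.
    default = float("-inf") if t.get("role", "").endswith("/main") else 0
    return t.get("altitude", default)

render_order = [t["name"] for t in sorted(tracks, key=altitude)]
print(render_order)  # ['main_video', 'logo', 'captions']
```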
<br />
<br />
=== Track dependencies ===<br />
<br />
It is tempting to introduce dependencies between tracks - to specify things such as:<br />
<br />
* track b depends on track a being available (e.g. main audio depending on main video), so always display them together and if you remove a track, remove all depending tracks, too<br />
<br />
* track c and d are alternative tracks to track b (e.g. dubs in other languages for main audio), so don't display them together and if you activate one, disable the others<br />
<br />
* track a, one of b,c,d, and one of e,f,g, where e depends on b, f depends on c, and g depends on d, make up a presentation profile and should be displayed together (e.g. main video, one of the audio dubs, and their respective captions).<br />
<br />
It is not clear yet whether there is an actual need to maintain this information as author-provided hints or whether a media player can itself determine a lot from the other fields, such as role and language.<br />
<br />
MPEG has a "groupID" element which allows for tracks to be put into groups of alternative tracks. This feature is, however, not used very often and decisions are being left to the media player.<br />
<br />
At this stage, it's probably too early to make a specification for how to encode this in Ogg. The need has not been totally clarified yet.</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10888
SkeletonHeaders
2010-03-21T11:25:58Z
<p>Silvia: </p>
<hr />
<div>== Adding New Required Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn't had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes what semantic type of content is contained in a track. Each track can have only a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible, too, but should only be introduced if there is a real need for them.<br />
<br />
Text tracks:<br />
* "text/caption"<br />
* "text/subtitle"<br />
* "text/textaudiodesc"<br />
* "text/karaoke"<br />
* "text/chapters"<br />
* "text/tickertext"<br />
* "text/lyrics"<br />
* "text/activeregion"<br />
* "text/metadata"<br />
* "text/annotation"<br />
* "text/transcript"<br />
* "text/linguistic"<br />
<br />
Video tracks:<br />
* "video/main"<br />
* "video/alternate" (e.g. different camera angle)<br />
* "video/sign" (for sign language)<br />
* "video/alpha" (a track to alpha blend)<br />
<br />
Audio tracks:<br />
* "audio/main"<br />
* "audio/alternate" (probably linked to an alternate video track)<br />
* "audio/dub"<br />
* "audio/audiodesc"<br />
* "audio/music"<br />
* "audio/speech"<br />
* "audio/sfx" (sound effects) <br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters to describe the roles better, such as "video/alternate;angle=nw"<br />
<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not informed about how a content author intends a media file to be displayed have no chance of displaying the content "correctly". This is why the Display-hint message header field allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Currently available hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display relative to the origin of the display area of the video, with x,y giving the position of the top left corner of the PIP video and w,h its width and height in pixels (w and h are optional). x, y, w, and h can also be specified as percentages, allowing placement that stays consistent independently of the scaling of the video display.<br />
<br />
Examples:<br />
Display-hint: pip(20%,20%)<br />
Display-hint: pip(40,40,690,60)<br />
<br />
* mask(img,x,y,w,h) on a video track - use the image given at img url (?) as a video mask to allow the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width and height w and h. x,y,w, and h can be provided in pixels or in percent. Pixels under the white background are made transparent and only pixels under the black shape are retained.<br />
<br />
Examples:<br />
Display-hint: mask(http://www.example.com/image.png)<br />
Display-hint: mask(http://www.example.com/image.png,30%,25%)<br />
Display-hint: mask(http://www.example.com/image.png,20,20,400,320)<br />
<br />
* transparent(transparency) on a video track - apply a transparency of the given percentage (integer value between 0 and 100) to the complete video track as it is rendered on top of other content. This transparency is applied to all pixels in the same way.<br />
<br />
Examples:<br />
Display-hint: transparent(25%)<br />
Display-hint: transparent(7%)<br />
<br />
* transparentcolor(colorcode) on a video track - turn all pixels of the color identified by the colorcode into transparent pixels.<br />
<br />
Examples:<br />
Display-hint: transparentcolor(#454545)<br />
Display-hint: transparentcolor(#777777)<br />
<br />
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
The name needs to be unique among all the track names; otherwise it is undefined which of the tracks is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their bos pages appear in the Ogg stream. If a file is re-encoded, this order may change, so you can only rely on it for addressing if the file doesn't change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
This track order is simply to have a means to address tracks through an index in a consistent manner across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on what should be displayed on top of which other track.<br />
<br />
<br />
=== Altitude ===<br />
<br />
The Altitude (better name?) message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a "main" track is always displayed bottom-most unless otherwise defined. <br />
<br />
The Altitude field takes the same numerical values as the z-index in CSS, unlimited negative and positive numbers.<br />
An element with greater stack order is always in front of an element with a lower stack order.<br />
<br />
Example: Altitude: -150<br />
<br />
<br />
=== Track dependencies ===<br />
<br />
It is tempting to introduce dependencies between tracks - to specify things such as:<br />
<br />
* track b depends on track a being available (e.g. main audio depending on main video), so always display them together and if you remove a track, remove all depending tracks, too<br />
<br />
* track c and d are alternative tracks to track b (e.g. dubs in other languages for main audio), so don't display them together and if you activate one, disable the others<br />
<br />
* track a, one of b,c,d, and one of e,f,g, where e depends on b, f depends on c, and g depends on d, make up a presentation profile and should be displayed together (e.g. main video, one of the audio dubs, and their respective captions).<br />
<br />
It is not clear yet whether there is an actual need to maintain this information as author-provided hints or whether a media player can itself determine a lot from the other fields, such as role and language.<br />
<br />
MPEG has a "groupID" element which allows for tracks to be put into groups of alternative tracks. This feature is, however, not used very often and decisions are being left to the media player.<br />
<br />
At this stage, it's probably too early to make a specification for how to encode this in Ogg. The need has not been totally clarified yet.</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10884
SkeletonHeaders
2010-03-21T00:51:40Z
<p>Silvia: clarifications</p>
<hr />
<div>== Adding Required Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn't had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes what semantic type of content is contained in a track. Each track can have only a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible, too, but should only be introduced if there is a real need for them.<br />
<br />
Text tracks:<br />
* "text/caption"<br />
* "text/subtitle"<br />
* "text/textaudiodesc"<br />
* "text/karaoke"<br />
* "text/chapters"<br />
* "text/tickertext"<br />
* "text/lyrics"<br />
* "text/activeregion"<br />
* "text/metadata"<br />
* "text/annotation"<br />
* "text/transcript"<br />
* "text/linguistic"<br />
<br />
Video tracks:<br />
* "video/main"<br />
* "video/alternate" (e.g. different camera angle)<br />
* "video/sign" (for sign language)<br />
* "video/alpha" (a track to alpha blend)<br />
<br />
Audio tracks:<br />
* "audio/main"<br />
* "audio/alternate" (probably linked to an alternate video track)<br />
* "audio/dub"<br />
* "audio/audiodesc"<br />
* "audio/music"<br />
* "audio/speech"<br />
* "audio/sfx" (sound effects) <br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters to describe the roles better, such as "video/alternate;angle=nw"<br />
<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not informed about how a content author intends a media file to be displayed have no chance of displaying the content "correctly". This is why the Display-hint message header field allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Currently available hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display relative to the origin of the display area of the video, with x,y giving the position of the top left corner of the PIP video and w,h its width and height in pixels (w and h are optional). x, y, w, and h can also be specified as percentages, allowing placement that stays consistent independently of the scaling of the video display.<br />
<br />
Examples:<br />
Display-hint: pip(20%,20%)<br />
Display-hint: pip(40,40,690,60)<br />
<br />
* mask(img,x,y,w,h) on a video track - use the image given at img url (?) as a video mask to allow the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width and height w and h. x,y,w, and h can be provided in pixels or in percent. Pixels under the white background are made transparent and only pixels under the black shape are retained.<br />
<br />
Examples:<br />
Display-hint: mask(http://www.example.com/image.png)<br />
Display-hint: mask(http://www.example.com/image.png,30%,25%)<br />
Display-hint: mask(http://www.example.com/image.png,20,20,400,320)<br />
<br />
* transparent(transparency) on a video track - apply a transparency of the given percentage (integer value between 0 and 100) to the complete video track as it is rendered on top of other content. This transparency is applied to all pixels in the same way.<br />
<br />
Examples:<br />
Display-hint: transparent(25%)<br />
Display-hint: transparent(7%)<br />
<br />
* transparentcolor(colorcode) on a video track - turn all pixels of the color identified by the colorcode into transparent pixels.<br />
<br />
Examples:<br />
Display-hint: transparentcolor(#454545)<br />
Display-hint: transparentcolor(#777777)<br />
<br />
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
The name needs to be unique among all the track names; otherwise it is undefined which of the tracks is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their bos pages appear in the Ogg stream. If a file is re-encoded, this order may change, so you can only rely on it for addressing if the file doesn't change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
This track order is simply to have a means to address tracks through an index in a consistent manner across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on what should be displayed on top of which other track.<br />
<br />
<br />
=== Altitude ===<br />
<br />
The Altitude (better name?) message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a "main" track is always displayed bottom-most unless otherwise defined. <br />
<br />
The Altitude field takes the same numerical values as the z-index in CSS, unlimited negative and positive numbers.<br />
An element with greater stack order is always in front of an element with a lower stack order.<br />
<br />
Example: Altitude: -150<br />
<br />
<br />
=== Track dependencies ===<br />
<br />
It is tempting to introduce dependencies between tracks - to specify things such as:<br />
<br />
* track b depends on track a being available (e.g. main audio depending on main video), so always display them together and if you remove a track, remove all depending tracks, too<br />
<br />
* track c and d are alternative tracks to track b (e.g. dubs in other languages for main audio), so don't display them together and if you activate one, disable the others<br />
<br />
* track a, one of b,c,d, and one of e,f,g, where e depends on b, f depends on c, and g depends on d, make up a presentation profile and should be displayed together (e.g. main video, one of the audio dubs, and their respective captions).<br />
<br />
It is not clear yet whether there is an actual need to maintain this information as author-provided hints or whether a media player can itself determine a lot from the other fields, such as role and language.<br />
<br />
MPEG has a "groupID" element which allows for tracks to be put into groups of alternative tracks. This feature is, however, not used very often and decisions are being left to the media player.<br />
<br />
At this stage, it's probably too early to make a specification for how to encode this in Ogg. The need has not been totally clarified yet.</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10883
SkeletonHeaders
2010-03-21T00:39:01Z
<p>Silvia: turn it into actual percentages</p>
<hr />
<div>== Adding Required Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn't had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes what semantic type of content is contained in a track. Each track can have only a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible, too, but should only be introduced if there is a real need for them.<br />
<br />
Text tracks:<br />
* "text/caption"<br />
* "text/subtitle"<br />
* "text/textaudiodesc"<br />
* "text/karaoke"<br />
* "text/chapters"<br />
* "text/tickertext"<br />
* "text/lyrics"<br />
* "text/activeregion"<br />
* "text/metadata"<br />
* "text/annotation"<br />
* "text/transcript"<br />
* "text/linguistic"<br />
<br />
Video tracks:<br />
* "video/main"<br />
* "video/alternate" (e.g. different camera angle)<br />
* "video/sign" (for sign language)<br />
* "video/alpha" (a track to alpha blend)<br />
<br />
Audio tracks:<br />
* "audio/main"<br />
* "audio/alternate" (probably linked to an alternate video track)<br />
* "audio/dub"<br />
* "audio/audiodesc"<br />
* "audio/music"<br />
* "audio/speech"<br />
* "audio/sfx" (sound effects) <br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters to describe the roles better, such as "video/alternate;angle=nw"<br />
<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not informed about how a content author intends a media file to be displayed have no chance of displaying the content "correctly". This is why the Display-hint message header field allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Currently available hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display in relation to the "main" video track, with x,y giving the position of the top left corner of the PIP video and w,h its optional width and height<br />
<br />
Examples:<br />
Display-hint: pip(20,20)<br />
Display-hint: pip(40,40,690,60)<br />
<br />
* mask(img,x,y,w,h) on a video track - use the image at the URL given in img as a video mask, allowing the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width w and height h. Pixels under the white background are made transparent; only pixels under the black shape are retained.<br />
<br />
Examples:<br />
Display-hint: mask(http://www.example.com/image.png)<br />
Display-hint: mask(http://www.example.com/image.png,20,20,400,320)<br />
<br />
* transparent(transparency) on a video track - apply a transparency of the given percentage (an integer value between 0 and 100) to the complete video track, as it will be rendered on top of other content.<br />
<br />
Examples:<br />
Display-hint: transparent(25)<br />
Display-hint: transparent(7)<br />
<br />
* transparentcolor(colorcode) on a video track - make all pixels of the color identified by colorcode transparent.<br />
<br />
Examples:<br />
Display-hint: transparentcolor(#454545)<br />
Display-hint: transparentcolor(#777777)<br />
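All of the hints above share a function-like name(arg,...) syntax, so a player can split a Display-hint value generically before interpreting the individual hint. A rough sketch (argument values are kept as strings, since mask() mixes a URL with numbers; the function name is ours):<br />

```python
import re

def parse_display_hint(value):
    """Split a Display-hint value such as 'pip(40,40,690,60)' into
    the hint name and its list of raw string arguments."""
    m = re.match(r"^(\w+)\((.*)\)$", value.strip())
    if m is None:
        raise ValueError("not a valid Display-hint: %r" % value)
    name, argstr = m.group(1), m.group(2)
    args = [a.strip() for a in argstr.split(",")] if argstr else []
    return name, args
```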
<br />
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
The name needs to be unique among all track names; otherwise it is undefined which of the tracks is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
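The character rules above can be checked mechanically. A sketch of a validator built directly from the listed ranges (the function name is ours):<br />

```python
import re

# Character ranges transcribed from the XML-id rules quoted above.
_NAME_START = ("A-Za-z_\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
               "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D\u2070-\u218F"
               "\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD"
               "\U00010000-\U000EFFFF")
_NAME_REST = _NAME_START + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"
_NAME_RE = re.compile("^[%s][%s]*$" % (_NAME_START, _NAME_REST))

def is_valid_track_name(name):
    """True if the first character is a valid start character and
    every following character is in the extended set."""
    return _NAME_RE.match(name) is not None
```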
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their bos pages appear in the Ogg stream. If a file is re-encoded, this order may change, so it can only be relied upon for addressing as long as the file doesn't change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
The track order simply provides a consistent means to address tracks through an index across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on which track is displayed on top of which.<br />
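Since the numbering is derived purely from bos page order, it can be recovered without any codec libraries by walking the Ogg page headers (standard Ogg page layout: the flag byte at offset 5 carries the 0x02 bos bit, the serial number sits at offsets 14-17). A hedged sketch:<br />

```python
import struct

def bos_serials(path):
    """Return the serial numbers of the logical bitstreams in the
    order their bos pages appear -- i.e. the track order above."""
    serials = []
    with open(path, "rb") as f:
        while True:
            header = f.read(27)              # fixed part of an Ogg page header
            if len(header) < 27 or header[:4] != b"OggS":
                break
            if not header[5] & 0x02:         # 0x02 marks a bos page;
                break                        # bos pages all come first
            serials.append(struct.unpack("<I", header[14:18])[0])
            nsegs = header[26]               # number of segment-table entries
            f.seek(sum(f.read(nsegs)), 1)    # skip the page body
    return serials
```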
<br />
<br />
=== Altitude ===<br />
<br />
The Altitude (better name?) message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a "main" track is always displayed bottom-most unless otherwise defined. <br />
<br />
The Altitude field takes the same numerical values as the z-index in CSS, unlimited negative and positive numbers.<br />
An element with greater stack order is always in front of an element with a lower stack order.<br />
<br />
Example: Altitude: -150<br />
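The resulting stacking behaviour amounts to a sort on the Altitude value. A sketch, assuming (by analogy with CSS z-index:auto) that tracks without an Altitude default to 0:<br />

```python
def render_order(tracks):
    """Return tracks bottom-most first: lower Altitude values are
    drawn earlier, so tracks with a greater Altitude end up in front."""
    return sorted(tracks, key=lambda t: t.get("altitude", 0))
```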
<br />
<br />
=== Track dependencies ===<br />
<br />
It is tempting to introduce dependencies between tracks - to specify things such as:<br />
<br />
* track b depends on track a being available (e.g. main audio depending on main video), so always display them together, and if you remove a track, remove all dependent tracks, too<br />
<br />
* track c and d are alternative tracks to track b (e.g. dubs in other languages for main audio), so don't display them together and if you activate one, disable the others<br />
<br />
* track a, one of b,c,d, and one of e,f,g (where e depends on b, f depends on c, and g depends on d) make up a presentation profile and should be displayed together (e.g. main video, one of the audio dubs, and their respective captions).<br />
<br />
It is not clear yet whether there is an actual need to maintain this information as author-provided hints or whether a media player can itself determine a lot from the other fields, such as role and language.<br />
<br />
MPEG has a "groupID" element which allows for tracks to be put into groups of alternative tracks. This feature is, however, not used very often and decisions are being left to the media player.<br />
<br />
At this stage, it's probably too early to make a specification for how to encode this in Ogg. The need has not been totally clarified yet.</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10882
SkeletonHeaders
2010-03-21T00:29:27Z
<p>Silvia: added dependencies section</p>
<hr />
<div>== Adding Required Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn't had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes the semantic type of content contained in a track. Every track can have only a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible, too, but new roles should only be introduced where there is a genuine need.<br />
<br />
Text tracks:<br />
* "text/caption"<br />
* "text/subtitle"<br />
* "text/textaudiodesc"<br />
* "text/karaoke"<br />
* "text/chapters"<br />
* "text/tickertext"<br />
* "text/lyrics"<br />
* "text/activeregion"<br />
* "text/metadata"<br />
* "text/annotation"<br />
* "text/transcript"<br />
* "text/linguistic"<br />
<br />
Video tracks:<br />
* "video/main"<br />
* "video/alternate" (e.g. different camera angle)<br />
* "video/sign" (for sign language)<br />
* "video/alpha" (a track to alpha blend)<br />
<br />
Audio tracks:<br />
* "audio/main"<br />
* "audio/alternate" (probably linked to an alternate video track)<br />
* "audio/dub"<br />
* "audio/audiodesc"<br />
* "audio/music"<br />
* "audio/speech"<br />
* "audio/sfx" (sound effects) <br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters that describe a role more precisely, such as "video/alternate;angle=nw".<br />
<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not told how a content author intends a media file to be displayed have no chance of displaying the content "correctly". This is why the Display-hint message header field allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Currently available hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display in relation to the "main" video track, where x,y give the origin of the top left corner of the PIP video and the optional w,h give its width and height<br />
<br />
Examples:<br />
Display-hint: pip(20,20)<br />
Display-hint: pip(40,40,690,60)<br />
<br />
* mask(img,x,y,w,h) on a video track - use the image at the URL given in img as a video mask, allowing the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width w and height h. Pixels under the white background are made transparent; only pixels under the black shape are retained.<br />
<br />
Examples:<br />
Display-hint: mask(http://www.example.com/image.png)<br />
Display-hint: mask(http://www.example.com/image.png,20,20,400,320)<br />
<br />
* transparent(transparency) on a video track - apply the given transparency (a fraction between 0 and 1) to the complete video track, as it will be rendered on top of other content.<br />
<br />
Examples:<br />
Display-hint: transparent(0.25)<br />
Display-hint: transparent(0.7)<br />
<br />
* transparentcolor(colorcode) on a video track - turn the color identified by the colorcode into transparent pixels.<br />
<br />
Examples:<br />
Display-hint: transparentcolor(#454545)<br />
Display-hint: transparentcolor(#777777)<br />
<br />
<br />
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
The name needs to be unique among all track names; otherwise it is undefined which of the tracks is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their bos pages appear in the Ogg stream. If a file is re-encoded, this order may change, so it can only be relied upon for addressing as long as the file doesn't change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
The track order simply provides a consistent means to address tracks through an index across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on which track is displayed on top of which.<br />
<br />
<br />
=== Altitude ===<br />
<br />
The Altitude (better name?) message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a "main" track is always displayed bottom-most unless otherwise defined. <br />
<br />
The Altitude field takes the same numerical values as the z-index in CSS, unlimited negative and positive numbers.<br />
An element with greater stack order is always in front of an element with a lower stack order.<br />
<br />
Example: Altitude: -150<br />
<br />
<br />
=== Track dependencies ===<br />
<br />
It is tempting to introduce dependencies between tracks - to specify things such as:<br />
<br />
* track b depends on track a being available (e.g. main audio depending on main video), so always display them together, and if you remove a track, remove all dependent tracks, too<br />
<br />
* track c and d are alternative tracks to track b (e.g. dubs in other languages for main audio), so don't display them together and if you activate one, disable the others<br />
<br />
* track a, one of b,c,d, and one of e,f,g (where e depends on b, f depends on c, and g depends on d) make up a presentation profile and should be displayed together (e.g. main video, one of the audio dubs, and their respective captions).<br />
<br />
It is not clear yet whether there is an actual need to maintain this information as author-provided hints or whether a media player can itself determine a lot from the other fields, such as role and language.<br />
<br />
MPEG has a "groupID" element which allows for tracks to be put into groups of alternative tracks. This feature is, however, not used very often and decisions are being left to the media player.<br />
<br />
At this stage, it's probably too early to make a specification for how to encode this in Ogg. The need has not been totally clarified yet.</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10881
SkeletonHeaders
2010-03-20T13:47:10Z
<p>Silvia: added Altitude</p>
<hr />
<div>== Adding Required Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn't had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes the semantic type of content contained in a track. Every track can have only a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible, too, but new roles should only be introduced where there is a genuine need.<br />
<br />
Text tracks:<br />
* "text/caption"<br />
* "text/subtitle"<br />
* "text/textaudiodesc"<br />
* "text/karaoke"<br />
* "text/chapters"<br />
* "text/tickertext"<br />
* "text/lyrics"<br />
* "text/activeregion"<br />
* "text/metadata"<br />
* "text/annotation"<br />
* "text/transcript"<br />
* "text/linguistic"<br />
<br />
Video tracks:<br />
* "video/main"<br />
* "video/alternate" (e.g. different camera angle)<br />
* "video/sign" (for sign language)<br />
* "video/alpha" (a track to alpha blend)<br />
<br />
Audio tracks:<br />
* "audio/main"<br />
* "audio/alternate" (probably linked to an alternate video track)<br />
* "audio/dub"<br />
* "audio/audiodesc"<br />
* "audio/music"<br />
* "audio/speech"<br />
* "audio/sfx" (sound effects) <br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters that describe a role more precisely, such as "video/alternate;angle=nw".<br />
<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not told how a content author intends a media file to be displayed have no chance of displaying the content "correctly". This is why the Display-hint message header field allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Currently available hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display in relation to the "main" video track, where x,y give the origin of the top left corner of the PIP video and the optional w,h give its width and height<br />
<br />
Examples:<br />
Display-hint: pip(20,20)<br />
Display-hint: pip(40,40,690,60)<br />
<br />
* mask(img,x,y,w,h) on a video track - use the image at the URL given in img as a video mask, allowing the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width w and height h. Pixels under the white background are made transparent; only pixels under the black shape are retained.<br />
<br />
Examples:<br />
Display-hint: mask(http://www.example.com/image.png)<br />
Display-hint: mask(http://www.example.com/image.png,20,20,400,320)<br />
<br />
* transparent(transparency) on a video track - apply the given transparency (a fraction between 0 and 1) to the complete video track, as it will be rendered on top of other content.<br />
<br />
Examples:<br />
Display-hint: transparent(0.25)<br />
Display-hint: transparent(0.7)<br />
<br />
* transparentcolor(colorcode) on a video track - turn the color identified by the colorcode into transparent pixels.<br />
<br />
Examples:<br />
Display-hint: transparentcolor(#454545)<br />
Display-hint: transparentcolor(#777777)<br />
<br />
<br />
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
The name needs to be unique among all track names; otherwise it is undefined which of the tracks is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their bos pages appear in the Ogg stream. If a file is re-encoded, this order may change, so it can only be relied upon for addressing as long as the file doesn't change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
The track order simply provides a consistent means to address tracks through an index across different media players, such that e.g. JavaScript can always link to the same track reliably across browsers. It has no influence on which track is displayed on top of which.<br />
<br />
<br />
=== Altitude ===<br />
<br />
The Altitude message header field defines the stack order of the tracks, i.e. which track is displayed further towards the top of the stack and which further down. By default, a "main" track is always displayed bottom-most unless otherwise defined. <br />
<br />
The Altitude field takes the same numerical values as the z-index in CSS, unlimited negative and positive numbers.<br />
An element with greater stack order is always in front of an element with a lower stack order.<br />
<br />
Example: Altitude: -150<br />
<br />
<br />
=== Track dependencies ===</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10880
SkeletonHeaders
2010-03-20T11:32:47Z
<p>Silvia: started on display hints</p>
<hr />
<div>== Adding Required Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn't had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the mime type of the track. The mime types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language. This language can be specified in a Language message header field. The code is created according to http://www.w3.org/TR/ltli/ and http://www.rfc-editor.org/rfc/bcp/bcp47.txt.<br />
<br />
For audio tracks with speech, the Language would be the language that dominates.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language that is most often represented in scene text.<br />
<br />
For text tracks, it is the dominating language in the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes the semantic type of content contained in a track. Every track can have only a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible, too, but new roles should only be introduced where there is a genuine need.<br />
<br />
Text tracks:<br />
* "text/caption"<br />
* "text/subtitle"<br />
* "text/textaudiodesc"<br />
* "text/karaoke"<br />
* "text/chapters"<br />
* "text/tickertext"<br />
* "text/lyrics"<br />
* "text/activeregion"<br />
* "text/metadata"<br />
* "text/annotation"<br />
* "text/transcript"<br />
* "text/linguistic"<br />
<br />
Video tracks:<br />
* "video/main"<br />
* "video/alternate" (e.g. different camera angle)<br />
* "video/sign" (for sign language)<br />
* "video/alpha" (a track to alpha blend)<br />
<br />
Audio tracks:<br />
* "audio/main"<br />
* "audio/alternate" (probably linked to an alternate video track)<br />
* "audio/dub"<br />
* "audio/audiodesc"<br />
* "audio/music"<br />
* "audio/speech"<br />
* "audio/sfx" (sound effects) <br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since mime types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
<br />
There may also be parameters that describe a role more precisely, such as "video/alternate;angle=nw".<br />
<br />
<br />
=== Display-hint ===<br />
<br />
Media players that are not told how a content author intends a media file to be displayed have no chance of displaying the content "correctly". This is why the Display-hint message header field allows hints to be provided on how a certain track should be displayed. A media player can of course decide to ignore these hints.<br />
<br />
Example hints are:<br />
<br />
* pip(x,y,w,h) on a video track - picture-in-picture display in relation to the "main" video track, where x,y give the origin of the top left corner of the PIP video and w,h give its width and height<br />
<br />
* mask(x,y,w,h,img) on a video track - use the image at the URL given in img as a video mask, allowing the video to appear in shapes other than rectangular. The masking image should be a black shape on a white background. The image is placed at offset x,y and scaled to width w and height h. Pixels under the white background are made transparent; only pixels under the black shape are retained.<br />
<br />
* overlay(transparency) on a video track - <br />
<br />
* alpha(trackref) on a video track - <br />
<br />
<br />
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
Characters allowed are basically all the characters that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
<br />
The name needs to be unique among all track names; otherwise it is undefined which of the tracks is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their bos pages appear in the Ogg stream. If a file is re-encoded, this order may change, so it can only be relied upon for addressing as long as the file doesn't change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
The track order simply provides a means to address tracks through an index. It has no influence on which track is displayed on top of which.<br />
<br />
<br />
=== Track dependencies ===</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10879
SkeletonHeaders
2010-03-20T10:33:54Z
<p>Silvia: added roles section</p>
<hr />
<div>== Adding Required Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to solve use cases it hasn't had to solve before, but was built to allow, see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as in videos with multiple view angles encoded in one, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several theora, several vorbis and several kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton has been built in a way such that it is extensible with message header fields for this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the MIME type of the track. The MIME types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Language ===<br />
<br />
Content in a track usually originates from a specific language, which can be specified in a Language message header field. The language code is formed according to http://www.w3.org/TR/ltli/ and BCP 47 (http://www.rfc-editor.org/rfc/bcp/bcp47.txt).<br />
<br />
For audio tracks containing speech, the Language is the dominant spoken language.<br />
<br />
For video tracks, it might be the language that is signed (if it is a sign language video), or the language most often represented in scene text.<br />
<br />
For text tracks, it is the dominant language of the text, e.g. English or German subtitles.<br />
<br />
Examples are: en-US, de-DE, sgn-ase, en-cockney<br />
<br />
<br />
=== Role ===<br />
<br />
Role describes what semantic type of content is contained in a track. Every track can have only a single role value, so the most appropriate role has to be chosen. The same role can be used across multiple tracks.<br />
<br />
The following lists some commonly used roles. Other roles are possible too, but new roles should only be introduced if there is a real need for them.<br />
<br />
Text tracks:<br />
* "text/caption"<br />
* "text/subtitle"<br />
* "text/textaudiodesc"<br />
* "text/karaoke"<br />
* "text/chapters"<br />
* "text/tickertext"<br />
* "text/lyrics"<br />
<br />
Video tracks:<br />
* "video/main"<br />
* "video/alternate" (e.g. different camera angle)<br />
* "video/sign" (for sign language)<br />
* "video/alpha" (a track to alpha blend)<br />
<br />
Audio tracks:<br />
* "audio/main"<br />
* "audio/alternate" (probably linked to an alternate video track)<br />
* "audio/dub"<br />
* "audio/audiodesc"<br />
* "audio/music"<br />
* "audio/speech"<br />
* "audio/sfx" (sound effects) <br />
<br />
Notice how we are re-using the Content-type approach of specifying the main semantic type of the track first. This is necessary, since MIME types don't always provide the right main content type (e.g. application/kate is semantically a text format).<br />
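Putting these fields together, the message headers of a fisbone for a caption track might look as follows (illustrative values; field names as discussed on this page):<br />

```
Content-Type: application/kate
Language: en-US
Role: text/caption
Name: english_captions
```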
<br />
<br />
=== Name ===<br />
<br />
This field provides the opportunity to associate a free text string with the track to allow direct addressing of the track through its name.<br />
<br />
The allowed characters are essentially those that are also allowed for XML id fields:<br />
<br />
the first character has to be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |<br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]<br />
<br />
any following characters can be one of:<br />
[A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | <br />
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] | <br />
"-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]<br />
<br />
<br />
The name needs to be unique among all track names; otherwise it is undefined which track is retrieved when addressing by name.<br />
<br />
An example means of addressing the track by name is: track[name="Madonna_singing"]<br />
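As a sketch, the naming rule above can be checked with a regular expression built directly from the two character classes (Python; the class strings below transcribe the ranges listed above):<br />

```python
import re

# First-character class, transcribed from the ranges above.
_FIRST = (
    "A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    "\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# Following characters additionally allow "-", ".", digits, etc.
_REST = _FIRST + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"
_NAME_RE = re.compile("[%s][%s]*\\Z" % (_FIRST, _REST))

def is_valid_track_name(name: str) -> bool:
    """True if `name` matches the first-char / following-chars rule."""
    return bool(_NAME_RE.match(name))
```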
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their BOS pages appear in the Ogg stream. If a file is re-encoded, this order may change, so you can rely on it for addressing only if the file does not change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
This track order simply provides a means to address tracks through an index. It has no influence on which track should be displayed on top of which other track.</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10878
SkeletonHeaders
2010-03-20T08:37:36Z
<p>Silvia: added more details</p>
<hr />
<div>== Adding Required Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to use cases it has not had to solve before but was built to allow; see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as videos with multiple view angles encoded in one file, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several Theora, several Vorbis and several Kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton was built to be extensible with message header fields for exactly this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the MIME type of the track. The MIME types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Content-role ===<br />
<br />
<br />
=== Content-language ===<br />
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media file and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their BOS pages appear in the Ogg stream. If a file is re-encoded, this order may change, so you can rely on it for addressing only if the file does not change.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language<br />
<br />
This track order simply provides a means to address tracks through an index. It has no influence on which track should be displayed on top of which other track.</div>
Silvia
https://wiki.xiph.org/index.php?title=SkeletonHeaders&diff=10877
SkeletonHeaders
2010-03-20T07:48:54Z
<p>Silvia: started page</p>
<hr />
<div>== Adding Required Headers to Skeleton ==<br />
<br />
With the HTML5 video element, Ogg is now a major format on the Web and is being applied to use cases it has not had to solve before but was built to allow; see http://www.xiph.org/ogg/doc/oggstream.html.<br />
<br />
One particular such use case is dealing with multitrack audio and video, such as videos with multiple view angles encoded in one file, or ones with a sign language video track, an audio description audio track, a caption track and several subtitle tracks in different languages (i.e. several Theora, several Vorbis and several Kate tracks).<br />
<br />
While encoding of multitrack files is already possible, it is unclear how such files would be rendered, how tracks would be differentiated and addressed (e.g. from a JavaScript API), etc. Skeleton was built to be extensible with message header fields for exactly this purpose.<br />
<br />
On this wiki page, we are collecting such new information fields.<br />
<br />
<br />
=== Content-type ===<br />
<br />
Right now, there is one mandatory message header field for all of the logical bitstreams: the "Content-type" header field, which contains the MIME type of the track. The MIME types in use here are listed at http://wiki.xiph.org/MIME_Types_and_File_Extensions#Codec_MIME_types.<br />
<br />
<br />
=== Role ===<br />
<br />
<br />
<br />
=== Track order ===<br />
<br />
In many applications it is necessary to walk through all the tracks in a media resource and address tracks by an index.<br />
<br />
In Ogg, tracks are numbered by the order in which their BOS pages appear in the Ogg stream.<br />
<br />
For example, a video file with the following composition would have the following indexes:<br />
* track[0]: Skeleton BOS<br />
* track[1]: Theora BOS for main video<br />
* track[2]: Vorbis BOS for main audio<br />
* track[3]: Kate BOS for English captions<br />
* track[4]: Kate BOS for German subtitles<br />
* track[5]: Vorbis BOS for audio descriptions<br />
* track[6]: Theora BOS for sign language</div>
Silvia
https://wiki.xiph.org/index.php?title=MIME_Types_and_File_Extensions&diff=10876
MIME Types and File Extensions
2010-03-20T07:06:15Z
<p>Silvia: /* Codec MIME types */</p>
<hr />
<div>STATUS: Work on RFCs and tools is in process to reflect these policies. More details are [http://wiki.xiph.org/index.php/MIMETypesCodecs here], which also include a specification of the codecs parameter of the MIME tyes. Use the correct file extensions straight away.<br />
<br />
DISCLAIMER: currently, application/ogg, video/ogg, audio/ogg and audio/vorbis are registered MIME types. Registration for the others will be undertaken. During this process, the "x-" versions of these unregistered MIME types may be used.<br />
<br />
IMPLEMENTATION recommendations and patches: see [[MIME-Migration]].<br />
<br />
== .ogx - application/ogg ==<br />
<br />
* Ogg Multiplex Profile (anything in [[Ogg]])<br />
* can contain any logical bitstreams multiplexed together in an Ogg container<br />
* will replace the .ogg extension from RFC 3534 http://www.ietf.org/rfc/rfc3534.txt<br />
* random multitrack files MUST contain a [[Skeleton]] track to identify all contained logical bitstreams<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* thus, e.g. an annodex file can gracefully degrade to .ogx if an app cannot decode [[CMML]] and/or [[Skeleton]]<br />
* USE: application/ogg has been registered, so can be used immediately<br />
<br />
== .ogv - video/ogg ==<br />
<br />
* Ogg Video Profile (a/v in Ogg container)<br />
* apps supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg<br />
* This list is not exhaustive (for example, [[Dirac]] + FLAC is acceptable too)<br />
* SHOULD contain a Skeleton track and/or MAY contain a CMML logical bitstream.<br />
<br />
== .oga - audio/ogg ==<br />
<br />
* Ogg Audio Profile (audio in Ogg container)<br />
* Applications supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* Covers Ogg [[FLAC]], [[Ghost]], and [[OggPCM]] <br />
* Although they share the same MIME type, Vorbis and Speex use different file extensions.<br />
* SHOULD contain a Skeleton logical bitstream.<br />
* Vorbis and Speex may use .oga, but it is not the preferred method of distributing these files because of backwards-compatibility issues.<br />
<br />
== .ogg - audio/ogg ==<br />
<br />
* Ogg Vorbis I Profile<br />
* .ogg applies now for Vorbis I files only<br />
* .ogg has more recently also been used for Ogg FLAC and for Theora, too &mdash; these uses are deprecated now in favor of .oga and .ogv respectively<br />
* has been defined in RFC 3534 http://www.ietf.org/rfc/rfc3534.txt for application/ogg, so RFC 3534 will be re-defined<br />
<br />
RATIONALE: .ogg has traditionally been used for Vorbis I files, in particular in HW players, hence it is kept for backwards-compatibility<br />
<br />
== .spx - audio/ogg ==<br />
<br />
* Ogg Speex Profile<br />
* .spx has traditionally been used for Speex files within Ogg and should be considered for backwards-compatibility<br />
<br />
== .flac - audio/flac ==<br />
<br />
* FLAC in native encapsulation format<br />
<br />
== .anx - application/annodex ==<br />
<br />
* Profile for multiplexed Ogg that includes a skeleton track and at least one CMML logical bitstream<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* apps that come across an annodex file and cannot decode CMML and/or Skeleton, but can deal with the others SHOULD gracefully degrade by ignoring these<br />
<br />
== .axa - audio/annodex ==<br />
<br />
* Profile for audio in Annodex <br />
* covers e.g. [[Vorbis]], [[Speex]], [[FLAC]], [[Ghost]], [[OggPCM]] inside Ogg with Skeleton and CMML<br />
<br />
== .axv - video/annodex ==<br />
<br />
* Profile for video in Annodex <br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg with Skeleton and CMML<br />
<br />
== .xspf - application/xspf+xml ==<br />
<br />
* Profile for XSPF<br />
* Covers [[XSPF]], while being used through XML<br />
* Does not cover [[JSPF]], which is XSPF but on JSON<br />
<br />
== Ogg Kate files - application/kate ==<br />
<br />
* Binary representation of Kate encapsulated in Ogg<br />
* may have a skeleton<br />
* can be used to identify the mime type of the track itself (e.g. in skeleton)<br />
* uses .ogx extension when in a file by itself<br />
* is subsumed by the dominant MIME type if in an audio or video file, becoming audio/ogg or video/ogg<br />
<br />
== Codec MIME types ==<br />
<br />
Codecs need their own MIME types for streaming in RTP and for use in multitrack Ogg files using Skeleton:<br />
<br />
* audio/vorbis for Vorbis without container<br />
* video/theora for Theora without container<br />
* audio/speex for Speex without container<br />
* audio/flac for FLAC both without a container and in its native container<br />
* text/cmml for CMML without container<br />
* application/kate for the textual representation of Kate (.kate files)</div>
Silvia
https://wiki.xiph.org/index.php?title=Ogg_Index&diff=10875
Ogg Index
2010-03-20T06:37:26Z
<p>Silvia: typo - duplicate field</p>
<hr />
<div>{{draft}}<br />
<br />
= Ogg Skeleton 3.3 with Keyframe Index =<br />
<br />
'''DRAFT, last updated 27 January 2010'''<br />
<br />
'''This specification is still a work in progress, and does not yet constitute an official Ogg track format.'''<br />
<br />
== Overview ==<br />
<br />
Seeking in an Ogg file is typically implemented as a bisection search <br />
over the pages in the file. The Ogg physical bitstream is bisected and <br />
the next Ogg page's end-time is extracted. The bisection continues until <br />
it reaches an Ogg page with an end-time close enough to the seek target <br />
time. However in media containing streams which have keyframes and <br />
interframes, such as Theora streams, your bisection search won't <br />
necessarily terminate at a keyframe. Thus if you begin decoding after your<br />
first bisection terminates, you're likely to get only partial, incomplete<br />
frames, with "visual artifacts", until you decode up to the next keyframe.<br />
So to eliminate these visual artifacts, after the first bisection<br />
terminates, you must extract the keyframe's timestamp from the last Theora<br />
page's granulepos, and seek again back to the start of the keyframe and<br />
decode forward until you reach the frame at the seek target. <br />
<br />
This is further complicated by the fact that packets often span multiple <br />
Ogg pages, and that Ogg pages from different streams can be interleaved <br />
between spanning packets. <br />
<br />
The bisection method above works fine for seeking in local files, but <br />
for seeking in files served over the Internet via HTTP, each bisection <br />
or non-sequential read can trigger a new HTTP request, which can have <br />
very high latency, making seeking very slow. <br />
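The bisection loop can be sketched as follows; <tt>page_end_time(offset)</tt> is a hypothetical helper standing in for the real work of scanning forward from <tt>offset</tt> to the next page and extracting its end time:<br />

```python
def bisect_seek(target, file_length, page_end_time, tolerance=0.5):
    """Bisect over byte offsets until the next page's end time is within
    `tolerance` seconds of `target`. Returns the byte offset to decode from.
    Note: a real seek must then also step back to the preceding keyframe."""
    lo, hi = 0, file_length
    while hi - lo > 1:
        mid = (lo + hi) // 2
        t = page_end_time(mid)          # one probe = one non-sequential read
        if abs(t - target) <= tolerance:
            return mid
        if t < target:
            lo = mid
        else:
            hi = mid
    return lo
```

Each call to <tt>page_end_time</tt> is one non-sequential read, which over HTTP means one potentially high-latency request; that is the cost the index removes.<br />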
<br />
== Seeking with an index ==<br />
<br />
The Skeleton 3.3 bitstream attempts to alleviate this problem by <br />
providing an index of periodic keyframes for every content stream in an <br />
Ogg segment. Note that the Skeleton 3.3 track only holds data for the <br />
segment or "link" in which it resides. So if two Ogg files are concatenated<br />
together ("chained"), the Skeleton 3.3's keyframe indexes in the first Ogg<br />
segment (the first "link" in the "chain") do not contain information<br />
about the keyframes in the second Ogg segment (the second link in the chain).<br />
<br />
Each content track has a separate index, which is stored in its own <br />
packet in the Skeleton 3.3 track. The index for streams without the <br />
concept of a keyframe, such as Vorbis streams, can instead record the <br />
time position at periodic intervals, which achieves the same result. <br />
When this document refers to keyframes, it also implicitly refers to these<br />
independent periodic samples from keyframe-less streams. <br />
<br />
All the Skeleton 3.3 track's pages appear in the header pages of the Ogg <br />
segment. This means that all the keyframe indexes are immediately <br />
available once the header packets have been read, e.g. when playing the media<br />
over a network connection. <br />
<br />
For every content stream in an Ogg segment, the Ogg index bitstream <br />
provides seek algorithms with an ordered table of "key points". A key <br />
point is intrinsically associated with exactly one stream, and stores the<br />
offset of the page on which it starts, o, as well as the presentation time<br />
of the keyframe t, as a fraction of seconds. This specifies that in order<br />
to render the stream at presentation time t, the last page which lies before<br />
all information required to render the keyframe at presentation time t begins<br />
exactly at byte offset o, as offset from the beginning of the Ogg segment.<br />
The offset is exactly the first byte of the page, so if you seek to a<br />
keypoint's offset and don't find the beginning of a page there, you can<br />
assume that the Ogg segment has been modified since the index was constructed,<br />
and that the index is now invalid and should not be used. The time t is the<br />
keyframe's presentation time corresponding to the granulepos, and is<br />
represented as a fraction in seconds. Note that if a stream requires any<br />
preroll, this will be accounted for in the time stored in the keypoint. <br />
<br />
The Skeleton 3.3 track contains one index for each content stream in the <br />
file. To seek in an Ogg file which contains keyframe indexes, first<br />
construct the set containing each active stream's last keypoint which<br />
has time less than or equal to the seek target time. Then from that set<br />
of key points, select the key point with the smallest byte offset. You then<br />
verify that there's a page found at exactly that offset, and if so, you can<br />
begin decoding. If the first keyframe you encounter has a time equal to<br />
that stored in the keypoint, you have made the optimal seek, and can safely<br />
continue to decode up to the seek target time. You are guaranteed to pass<br />
keyframes on all streams with time less than or equal to your seek target<br />
time while decoding up to the seek target. However, if the first keyframe<br />
you encounter after decoding does not have the same presentation time as<br />
is stored in the keypoint, then the index is invalid (possibly the file<br />
has been changed without updating the index) and you must either fall back<br />
to a bisection search, or keep decoding if you've landed "close enough"<br />
to the seek target.<br />
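The keypoint-selection step can be sketched as follows, assuming each stream's index has already been decoded into absolute <tt>(time, offset)</tt> pairs sorted by time:<br />

```python
import bisect

def seek_offset(indexes, target_time):
    """indexes: dict mapping serialno -> sorted list of (time, offset)
    keypoints. Returns the byte offset to seek to, or None if no stream
    has a keypoint at or before target_time."""
    candidates = []
    for keypoints in indexes.values():
        # last keypoint with time <= target_time
        i = bisect.bisect_right(keypoints, (target_time, float("inf")))
        if i > 0:
            candidates.append(keypoints[i - 1][1])   # its byte offset
    # seek to the smallest offset so that every stream's required
    # keyframe still lies ahead of the decode position
    return min(candidates) if candidates else None
```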
<br />
Be aware that you cannot assume that any or all Ogg files will contain <br />
keyframe indexes, so when implementing Ogg seeking, you must gracefully<br />
fall-back to a bisection search or other seek algorithm when the index<br />
is not present, or when it is invalid.<br />
<br />
The Skeleton 3.3 BOS packet also stores metadata about the segment in <br />
which it resides: the timestamps of the first and last samples<br />
in the segment. This allows you to determine the duration of the<br />
indexed Ogg media without having to decode the start and end of the<br />
Ogg segment and calculate the difference.<br />
<br />
The Skeleton 3.3 BOS packet also contains the length of the indexed segment<br />
in bytes. This is so that if the seek target is outside of the indexed range,<br />
you can immediately move to the next/previous segment and either seek using<br />
that segment's index, or narrow the bisection window if that segment has no<br />
index. You can also use the segment length to verify if the index is valid.<br />
If the contents of the segment have changed, it's highly likely that the<br />
length of the segment has changed as well. When you load the segment's<br />
header pages, you should check the length of the physical segment, and if it<br />
doesn't match that stored in the Skeleton header packet, you know the index<br />
is out of date and not safe to use.<br />
<br />
The Skeleton 3.3 BOS packet also contains the offset of the first non-header<br />
page in the Ogg segment. This means that if you wish to delay loading of an<br />
index for whatever reason, you can skip forward to that offset, and start<br />
decoding from that offset forwards.<br />
<br />
When using the index to seek, you must verify that the index is still <br />
correct. You can consider the index invalid if any of the following are true:<br />
<br />
# The segment length stored in the Skeleton BOS packet doesn't match the length of the physical segment, or<br />
# after a seek to a keypoint's offset, you don't land exactly on a page boundary, or<br />
# the first keyframe decoded after seeking to a keypoint's offset doesn't have the same presentation time as stored in the index.<br />
<br />
You should also always check the Skeleton version header field<br />
to ensure your decoder correctly knows how to parse the Skeleton track. <br />
<br />
Be aware that a keyframe index may not index all keyframes in the Ogg segment,<br />
it may only index periodic keyframes instead.<br />
<br />
== Format Specification ==<br />
<br />
Unless otherwise specified, all integers and fields in the bitstream are <br />
encoded with the least significant bit coming first in each byte. <br />
Integers and fields comprising more than one byte are encoded least <br />
significant byte first (i.e. little endian byte order). <br />
<br />
The Skeleton 3.3 track is intended to be backwards compatible with the <br />
Skeleton 3.0 specification, available at <br />
http://www.xiph.org/ogg/doc/skeleton.html . Unless specified <br />
differently here, it is safe to assume that anything specified for a <br />
Skeleton 3.0 track holds for a Skeleton 3.3 track. <br />
<br />
As per the Skeleton 3.0 track, a segment containing a Skeleton 3.3 track <br />
must begin with a '''Skeleton 3.3 fishead BOS packet''' on a page by itself, with the <br />
following format: <br />
<br />
# Identifier: 8 bytes, "fishead\0".<br />
# Version major: 2 Byte unsigned integer denoting the major version (3)<br />
# Version minor: 2 Byte unsigned integer denoting the minor version (3)<br />
# Presentationtime numerator: 8 Byte signed integer<br />
# Presentationtime denominator: 8 Byte signed integer<br />
# Basetime numerator: 8 Byte signed integer<br />
# Basetime denominator: 8 Byte signed integer<br />
# UTC [ISO8601]: a 20 Byte string containing a UTC time<br />
# '''[NEW]''' First-sample-time numerator: 8 byte signed integer representing the numerator for the presentation time of the first sample in the media. Note that samples between the first-sample-time and the Presentationtime are supposed to be skipped during playback.<br />
# '''[NEW]''' First-sample-time denominator: 8 byte signed integer, with value 0 if the timestamp is unknown. Decoders should always ensure that the denominator is not 0 before using it as a divisor!<br />
# '''[NEW]''' Last-sample-time numerator: 8 byte signed integer representing the end time of the last sample in the segment.<br />
# '''[NEW]''' Last-sample-time denominator: 8 byte signed integer, with value 0 if the timestamp is unknown. Decoders should always ensure that the denominator is not 0 before using it as a divisor!<br />
# '''[NEW]''' The length of the segment, in bytes: 8 byte unsigned integer, 0 if unknown.<br />
# '''[NEW]''' The offset of the first non-header page, in bytes: 8 byte unsigned integer, 0 if unknown.<br />
<br />
The first-sample-time and last-sample-time are rational numbers, in units<br />
of seconds. If the denominator is 0 for the first-sample-time or the<br />
last-sample-time, then that value was unable to be determined at indexing<br />
time, and is unknown. The duration of the Ogg segment can be calculated by<br />
subtracting the first-sample-time from the last-sample-time.<br />
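A sketch of reading these fields with Python's struct module, assuming the little-endian layout listed above (the field names are ours, not part of the specification):<br />

```python
import struct
from fractions import Fraction

FISHEAD_FMT = "<8sHHqqqq20sqqqqQQ"   # 112 bytes, little endian

def parse_fishead(packet: bytes):
    (ident, ver_major, ver_minor,
     pres_num, pres_den, base_num, base_den, utc,
     first_num, first_den, last_num, last_den,
     segment_length, first_data_offset) = struct.unpack_from(FISHEAD_FMT, packet)
    assert ident == b"fishead\x00"
    duration = None
    # A denominator of 0 means "unknown" -- never divide by it.
    if first_den != 0 and last_den != 0:
        duration = Fraction(last_num, last_den) - Fraction(first_num, first_den)
    return {"version": (ver_major, ver_minor),
            "segment_length": segment_length,
            "first_data_offset": first_data_offset,
            "duration": duration}
```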
<br />
In '''Skeleton 3.3 the "fisbone" packets remain unchanged from Skeleton <br />
3.0''', and will still follow after the other streams' BOS pages and <br />
secondary header pages. <br />
<br />
Before the Skeleton EOS page in the segment header pages come the <br />
Skeleton 3.3 keyframe index packets. There should be one index packet for<br />
each content stream in the Ogg segment, but index packets are not required<br />
for a Skeleton 3.3 track to be considered valid. Each entry in the index<br />
is a "keypoint", which stores an offset and a<br />
timestamp. In order to save space, the offsets and timestamps are stored as<br />
deltas, and then variable byte-encoded. The offset and timestamp deltas<br />
store the difference of the keypoint's offset and timestamp from the<br />
previous keypoint's offset and timestamp. So to calculate the page offset<br />
of a keypoint you must sum the offset deltas up to and including the<br />
keypoint in the index.<br />
<br />
The variable byte encoded integers are encoded using 7 bits per byte to<br />
store the integer's bits, and the high bit is set in the last byte used<br />
to encode the integer. The bits and bytes are in little endian byte order.<br />
For example, the integer 7843, or <tt>0001 1110 1010 0011</tt> in binary, would be<br />
stored as two bytes: <tt>0x23 0xBD</tt>, or <tt>0010 0011 1011 1101</tt> in binary.<br />
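A sketch of this scheme in Python, under the reading that the least significant 7-bit group is stored first and the high bit marks the final byte:<br />

```python
def write_varint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, least significant
    group first; the high bit is set on the last byte."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n == 0:
            out.append(b | 0x80)   # final byte: high bit set
            return bytes(out)
        out.append(b)

def read_varint(data: bytes, pos: int = 0):
    """Decode one integer starting at pos; returns (value, next_pos)."""
    value, shift = 0, 0
    while True:
        b = data[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        shift += 7
        if b & 0x80:               # high bit set: this was the last byte
            return value, pos
```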
<br />
Each '''Skeleton 3.3 keyframe index packet''' contains the following: <br />
<br />
# Identifier 6 bytes: "index\0"<br />
# The serialno of the stream this index applies to, as a 4 byte field.<br />
# The number of keypoints in this index packet, 'n' as a 8 byte unsigned integer. This can be 0.<br />
# The keypoint presentation time denominator, as an 8 byte signed integer.<br />
# 'n' key points, each of which contain, in the following order:<br />
## the keyframe's page's byte offset delta, as a variable byte encoded integer. This is the number of bytes that this keypoint is after the preceding keypoint's offset, or from the start of the segment if this is the first keypoint. The keypoint's page start is therefore the sum of the byte-offset-deltas of all the keypoints which come before it.<br />
## the presentation time numerator delta, of the first key frame which starts on the page at the keypoint's offset, as a variable byte encoded integer. This is the difference from the previous keypoint's timestamp numerator. The keypoint's timestamp numerator is therefore the sum of all the timestamp numerator deltas up to and including the keypoint's. Divide the timestamp numerator sum by the timestamp denominator stored earlier in the index packet to determine the presentation time of the keyframe in seconds.<br />
<br />
Note that a keypoint always represents the first key frame on a page. If an<br />
Ogg page contains two or more keyframes, the index's key point *must* refer<br />
to the first keyframe on that page, not any subsequent keyframes on that page.<br />
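Reconstructing absolute keypoints from the stored deltas can then be sketched as follows (the varint reader matches the encoding described above; <tt>time_denom</tt> is the denominator from the index packet header):<br />

```python
def read_varint(data, pos):
    """Decode one variable byte encoded integer; high bit marks the last byte."""
    value, shift = 0, 0
    while True:
        b = data[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        shift += 7
        if b & 0x80:
            return value, pos

def decode_keypoints(payload, n, time_denom, pos=0):
    """Decode n (offset_delta, time_num_delta) varint pairs into absolute
    (byte_offset, seconds) keypoints by summing the deltas."""
    points, offset, time_num = [], 0, 0
    for _ in range(n):
        d_off, pos = read_varint(payload, pos)
        d_time, pos = read_varint(payload, pos)
        offset += d_off          # running sum of byte-offset deltas
        time_num += d_time       # running sum of timestamp numerator deltas
        points.append((offset, time_num / time_denom))
    return points
```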
<br />
The key points are stored in increasing order by offset (and thus by <br />
presentation time as well).<br />
<br />
The byte offsets stored in keypoints are relative to the start of the Ogg<br />
bitstream segment. So if you have a physical Ogg bitstream made up of two<br />
chained Oggs, the offsets in the second Ogg segment's bitstream's index<br />
are relative to the beginning of the second Ogg in the chain, not the first.<br />
Also note that if a physical Ogg bitstream is made up of chained Oggs, the<br />
presence of an index in one segment does not imply that there will be an<br />
index in any other segment. <br />
<br />
The exact number of keyframes used to construct key points in the index <br />
is up to the indexer, but to limit the index size, we recommend <br />
including at most one key point per every 64KB of data, or every 2000ms, <br />
whichever is least frequent. <br />
<br />
As per the Skeleton 3.0 track, '''the last packet in the Skeleton 3.3 track <br />
is an empty EOS packet'''.<br />
<br />
== Software Prototype ==<br />
<br />
For a prototype indexer, see [http://github.com/cpearce/OggIndex OggIndex]. Also included there is a program OggIndexValid, which can verify that Theora and Vorbis indexes are valid. If you're implementing your own indexer or modifying existing indexes, always verify the result with OggIndexValid!<br />
<br />
Recent [http://firefogg.org/nightly/ ffmpeg2theora nightlies] will also include a keyframe index in the Skeleton<br />
3.3 track if you specify the command line option <tt>--seek-index</tt>.<br />
<br />
To see how indexes improve network seeking performance, you can download a development<br />
version of Firefox which can take advantage of indexes here:<br />
<br />
http://pearce.org.nz/video/firefox-indexed-seek-linux.tar.bz2<br />
<br />
http://pearce.org.nz/video/firefox-indexed-seek-macosx.dmg<br />
<br />
http://pearce.org.nz/video/firefox-indexed-seek-win32.zip<br />
<br />
If you already have a Firefox instance running, you'll need to either close your running<br />
Firefox instance before starting the index-capable Firefox, or start the index-capable<br />
Firefox with the <tt>--no-remote</tt> command line parameter.<br />
<br />
To compare the network performance of indexed versus non-indexed seeking, point the<br />
index-capable Firefox here:<br />
<br />
http://pearce.org.nz/video/indexed-seek-demo.html</div>
Silvia
https://wiki.xiph.org/index.php?title=IDABC_Questionnaire_2009&diff=10694
IDABC Questionnaire 2009
2009-11-15T23:46:59Z
<p>Silvia: clarified consensus seeking process</p>
<hr />
<div><strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong><br />
<br />
= Context =<br />
We received [http://lists.xiph.org/pipermail/theora/2009-November/002996.html an e-mail] from a consultant studying the suitability of Theora for use in "eGovernment", on behalf of the [http://ec.europa.eu/idabc/ IDABC], an EU governmental agency responsible for "Interoperability" with an emphasis on open source. The investigation is in the context of [http://ec.europa.eu/idabc/en/document/7728 European Interoperability Framework], about which there has been [http://www.computerworlduk.com/community/blogs/index.cfm?entryid=2620&blogid=14&pn=1 some real controversy].<br />
<br />
The method of assessment is the Common Assessment Method for Standards and Specifications, including the questions below.<br />
<br />
= CAMSS Questions =<br />
== Part 4: Market Criteria ==<br />
<br />
This group of Market criteria analyses the formal specification in the scope of its market environment, and more precisely it examines the implementations of the formal specification and the market players. This implies identifying to what extent the formal specification benefits from market support and wide adoption, its level of maturity, and its capacity for reuse.<br />
<br />
Market support is evaluated through an analysis of how many products implementing the formal specification exist, what their market share is and who their end-users are. The quality and the completeness (in case of partitioning) of the implementations of the formal specification can also be analysed. Availability of existing or planned mechanisms to assess conformity of implementations to the standard or to the specification could also be identified. The existence of at least one reference implementation (i.e.: mentioning a recognized certification process) - and of which one is an open source implementation - can also be relevant to the assessment. Wide adoption can also be assessed across domains (i.e.: public and private sectors), in an open environment, and/or in a similar field (i.e.: best practices).<br />
<br />
A formal specification is mature if it has been in use and development for long enough that most of its initial problems have been overcome and its underlying technology is well understood and well defined. Maturity is also assessed by identifying whether all aspects of the formal specification are considered validated by usage (i.e., if the formal specification is partitioned), and whether the reported issues have been solved and documented.<br />
<br />
Reusability of a formal specification is enabled if it includes guidelines for its implementation in a given context. The identification of successful implementations of the standard or specification should focus on good practices in a similar field. Its incompatibility with related standards or specifications should also be taken into account.<br />
<br />
The ideas behind the Market Criteria can also be expressed in the form of the following questions:<br />
<br />
=== Market support ===<br />
* Does the standard have strong support in the marketplace? <br />
: Yes. For example, among web browsers, support for Xiph's Ogg, Theora, and Vorbis standards is now included by default in Mozilla Firefox, Google Chrome, and the latest versions of Opera, representing hundreds of millions of installed users just in this market alone. Further, a QuickTime component exists which enables use of Xiph's Ogg, Theora, and Vorbis standards in all Mac OS X applications that make use of the QuickTime framework - which includes Safari/Webkit, iMovie, QuickTime, and many others. On Windows, DirectShow filters exist which also enable all Windows applications that use the DirectShow framework to use Xiph's Ogg, Theora, and Vorbis standards.<br />
* What products exist for this formal specification ? <br />
: Theora is a video codec, and as such the required products are encoders, decoders, and transmission systems. All three types of products are widely available for Theora.<br />
* How many implementations of the formal specification are there? <br />
: Xiph does not require implementors to acquire any license before implementing the specification. Therefore, we do not have a definitive count of the number of implementations. In addition to the reference implementation, which has been ported to most modern platforms and highly optimized for x86 and ARM CPUs and TI C64x+ DSPs, we are aware of a number of independent, conformant or mostly-conformant implementations. These include two C decoders (ffmpeg and QTheora), a Java decoder (Jheora), a C# decoder, an FPGA decoder, and an FPGA encoder.<br />
* Are there products from different suppliers in the market that implement this formal specification ? <br />
: Yes. Corporations such as Atari, Canonical, DailyMotion, Elphel, Fluendo, Google, Mozilla, Novell, Opera, Red Hat, Sun Microsystems, Ubisoft, and countless others have supplied products with an implementation of the Theora standard.<br />
* Are there many products readily available from a variety of suppliers? <br />
: Yes. Theora has been deployed in embedded devices, security cameras, video games, video conferencing systems, web browsers, home theater systems, and many other products. A complete, legal, open-source reference implementation can also be downloaded free of charge, including components for all major media frameworks (DirectShow, gstreamer, and Quicktime), giving the plethora of applications which use these frameworks the ability to use the codec.<br />
* What is the market share of the products implementing the formal specification, versus other implementations of competing formal specifications ? <br />
: Theora playback is extremely widely available, covering virtually the entire market of personal computers. Theora is also increasingly available in mobile and embedded devices. Since we do not require licensing for products that implement the specification, we do not have market share numbers that can be compared with competing formal specifications. Because implementations are readily available and free, Theora is included in many products that support multiple codecs, and is sometimes the only video codec included in free software products.<br />
* Who are the end-users of these products implementing the formal specification?<br />
: The end users are television viewers, video gamers, web surfers, movie makers, business people, video distribution services, and anyone else who interacts with moving pictures.<br />
<br />
=== Maturity ===<br />
* Are there any existing or planned mechanisms to assess conformity of the implementations of the formal specification? <br />
: Yes. In addition to a continuous peer review process, we maintain a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that allow implementors to assess decoder conformity. We also provide free online developer support and testing for those attempting to make a conforming implementation. An [http://validator.xiph.org/ online validation service] is available.<br />
* Is there a reference implementation (i.e.: mentioning a recognized certification process)? <br />
: Yes. Xiph maintains a reference implementation called [http://downloads.xiph.org/releases/theora/ libtheora]. In addition to serving as a reference, libtheora is also highly optimized to achieve the maximum possible speed, accuracy, reliability, efficiency, and video quality. As a result, many implementors of Theora adopt the reference implementation.<br />
* Is there an open source implementation? <br />
: Yes. libtheora is made available under a completely permissive BSD-like license. Its open-source nature also contributes to its quality as a reference implementation, as implementors are welcome to contribute their improvements to the reference. There are also several other open source implementations.<br />
* Does the formal specification show wide adoption? <br />
** across different domains? (I.e.: public and private) <br />
: Yes. In addition to the private companies mentioned in the previous section, Theora has also been specified as the sole format supported by non-profit institutions such as Wikipedia, currently the 6th largest website in the world, or as one of a small number of preferred formats supported by other public organizations, such as the Norwegian government.<br />
** in an open environment? <br />
: Yes. On open/free operating systems such as those distributed by Novell/SuSE, Canonical, and Red Hat, Theora is the primary default video codec.<br />
** in a similar field? (i.e.: can best practices be identified?) <br />
* Has the formal specification been in use and development long enough that most of its initial problems have been overcome? <br />
: Yes. Theora was derived from VP3, which was originally released in May 2000. The Theora specification was completed in 2004. Theora has now been used in a wide variety of applications, on the full spectrum of computing devices.<br />
* Is the underlying technology of the standard well-understood? (e.g., a reference model is well defined, appropriate concepts of the technology are in widespread use, the technology may have been in use for many years, a formal mathematical model is defined, etc.) <br />
: Yes. The underlying technology has been in use for nearly a decade, and most of the concepts have been in widespread use for even longer.<br />
* Is the formal specification based upon technology that has not been well-defined and may be relatively new? <br />
: No. The formal specification is based on technology from the On2 VP3 codec, which is substantially similar to simple block-transform codecs like H.261. This class of codecs is extremely well understood, and has been actively in use for over 20 years.<br />
* Has the formal specification been revised? (Yes/No, Nof) <br />
: Yes. The specification of the encoder is continuously revised based on user feedback to improve clarity and accuracy. The specification of the decoding part has been stable for years.<br />
* Is the formal specification under the auspices of an architectural board? (Yes/No) <br />
: No. Although officially maintained by the Xiph.Org Foundation, anyone is free to join this organization, and one need not even be a member to make contributions. However, the core developers will review contributions and make sure they do not contradict the general architecture and they work well with the existing code and the test cases.<br />
* Is the formal specification partitioned in its functionality? (Yes/No) <br />
: No. Theora is very deliberately not partitioned, to avoid the confusion created by a "standard" composed of many incompatible "profiles". The Theora standard does not have any optional components. A compliant Theora decoder can correctly process any Theora stream.<br />
** To what extent does each partition contribute to the overall functionality? (NN%) <br />
: N/A.<br />
** To what extent is each partition implemented? (NN%) (cf market adoption)<br />
: N/A.<br />
<br />
=== Re-usability === <br />
* Does the formal specification provide guidelines for its implementation in a given organisation? <br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] provides "non-normative" advice and explanation for implementors of Theora decoders and encoders, including example algorithms for implementing required mathematical transforms. Xiph also maintains [http://wiki.xiph.org/Main_Page a documentation base] for implementors who desire more guidelines beyond the specification itself.<br />
* Can other cases where similar systems implement the formal specification be considered as successful implementations and good practices? <br />
: Xiph's standards have successfully been implemented by many organisations in a wide variety of environments. We maintain (non-exhaustive) [http://wiki.xiph.org/TheoraSoftwarePlayers lists] of products which implement Theora support, many of them open source, so that others may use them as a reference when preparing their own products. A particularly well known, independent, but interoperable implementation is provided by the FFmpeg open source project.<br />
* Is its compatibility with related formal specifications documented?<br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] also documents the use of Theora within the [http://www.ietf.org/rfc/rfc3533.txt standard Ogg encapsulation format], and the [http://svn.xiph.org/trunk/theora/doc/draft-ietf-avt-rtp-theora-00.txt TheoraRTP draft specification] explains how to transmit Theora using the [http://tools.ietf.org/html/rfc3550 RTP standard]. In addition, the specification documents Theora's compatibility with ITU-R B.470, ITU-R B.601, ITU-R B.709, SMPTE-170M, [http://tools.ietf.org/html/rfc2044 UTF-8], ISO 10646, and [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.pdf Ogg Vorbis].<br />
<br />
== Part 5: Standardisation Criteria == <br />
From Idabc-camss<br />
<br />
Note: Throughout this section, “Organisation” refers to the standardisation/fora/consortia body in charge of the formal specification.<br />
<br />
Significant characteristics of the way the organisation operates are for example the way it gives the possibility to stakeholders to influence the evolution of the formal specification, or which conditions it attaches to the use of the formal specification or its implementation. Moreover, it is important to know how the formal specification is defined, supported, and made available, as well as how interaction with stakeholders is managed by the organisation during these steps. Governance of interoperability testing with other formal specifications is also indicative.<br />
<br />
The standardisation criteria therefore analyse the following elements:<br />
<br />
=== Availability of Documentation ===<br />
The availability of documentation criterion is linked to cost and online availability. Access to all preliminary results and documentation can be online, online for members only, offline, offline for members only, or not available. Access can be free or for a fee (and if so, what fee?).<br />
: Every Xiph standard is permanently available online to everyone at no cost. For example, we invite everyone to download [http://theora.org/doc/Theora.pdf the most up-to-date copy of the Theora specification], and [http://xiph.org/vorbis/doc/Vorbis_I_spec.html the latest revision of Vorbis]. All previous revisions are available from Xiph's [http://svn.xiph.org/ revision control system].<br />
<br />
=== Intellectual Property Right ===<br />
The Intellectual Property Rights evaluation criteria relates to the ability for implementers to use the formal specification in products without legal or financial implications. The IPR policy of the organisation is therefore evaluated according to: <br />
* the availability of the IPR or copyright policies of the organisation (available on-line or off-line, or not available);<br />
: The reference implementations of each codec include all necessary IPR and copyright licenses for that codec, including all documentation, and are freely available to everyone.<br />
* the organisation’s governance to disclose any IPR from any contributor (ex-ante, online, offline, for free for all, for a fee for all, for members only, not available);<br />
: Xiph does not require the identification of specific patents that may be required to implement a standard; however, it does require an open-source compatible, royalty-free license from a contributor for any such patents they may own before the corresponding technology can be included in a standard. These licenses are made available online, for free, to all parties.<br />
* the level of IPR set "mandatory" by the organisation (no patent, royalty free patent, patent and RAND with limited liability , patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none); <br />
: All standards, specifications, and software published by the Xiph.Org Foundation are required to have "open-source compatible" IPR. This means that a contribution must either be entirely clear of any known patents, or any patents that read upon the contribution must be available under a transferable, irrevocable public nonassertion agreement to all people everywhere. For example, see [http://svn.xiph.org/trunk/theora/LICENSE our On2 patent nonassertion warrant]. Other common "royalty free" patent licenses are either not transferable, revocable under certain conditions (such as patent infringement litigation against the originating party), or otherwise impose restrictions that would prevent distribution under common [http://www.opensource.org/ OSI]-approved licenses. These would not be acceptable.<br />
* the level of IPR "recommended" by the organisation (no patent, royalty free patent, patent and RAND with limited liability, patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none). [Note: RAND (Reasonable and Non Discriminatory License) is based on a "fairness" concept. Companies agree that if they receive any patents on technologies that become essential to the standard then they agree to allow other groups attempting to implement the standard to use these patents and they agree that the charges for the patents shall be reasonable. "RAND with limited availability" is a version of RAND where the "reasonable charges" have an upper limit.]<br />
: Xiph's recommended IPR requirements are the same as our mandatory requirements.<br />
<br />
=== Accessibility ===<br />
<br />
The accessibility evaluation criteria describe the importance of equal and safe access by the users of implementations of formal specifications. This aspect can be related to safety (physical safety and conformance safety) and accessibility for physically impaired people (design for all).<br />
<br />
Focus is made particularly on accessibility and conformance safety. Conformance testing is testing to determine whether a system meets a specified formal specification; the results can come from a test suite. Conformance validation is when the conformance test uniquely qualifies a given implementation as conformant or not. Conformance certification is a process that provides a public and easily visible "stamp of approval" that an implementation of a standard validates as conformant.<br />
<br />
The following questions allow an assessment of accessibility and conformance safety: <br />
* Does a mechanism that ensures disability support by a formal specification exist? (Y/N) <br />
: Yes. Xiph ensures support for users with disabilities by providing specifications for accessible technologies independent of the codec itself. Notably, the Xiph [http://wiki.xiph.org/OggKate OggKate] codec for time-aligned text and image content provides support for subtitles for internationalisation, captions for the hearing-impaired, and textual audio descriptions for the visually impaired. Further, Ogg supports multiple tracks of audio and video content in one container, such that sign language tracks and audio descriptions can be included in one file. For this to work, Xiph has defined [http://wiki.xiph.org/Ogg_Skeleton Skeleton], which holds metadata about each track encapsulated within the one Ogg file. When Theora is transmitted or stored in an Ogg container, it is automatically compatible with these accessibility measures.<br />
* Is conformance governance always part of a standard? (Y/N) <br />
: No. Xiph does not normally provide a formal conformance testing process as part of a standard.<br />
* Is a conformance test offered to implementers? (Y/N) <br />
: Yes. Xiph maintains a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that can be used by implementors to confirm basic conformance. Also, Xiph's [http://validator.xiph.org online validation service] is a freely available service that can be used by anyone to check conformance.<br />
* Is conformance validation available to implementers? (Y/N) <br />
: Yes. Informal conformance testing is available to implementors upon request, and Xiph has provided such testing for a number of implementations in the past. The oggz tools include a validation program, oggz-validate, which implementers have used extensively.<br />
* Is conformance certification available? (Y/N) <br />
: Yes. Xiph does not require certification, but maintains the right to withhold the use of our trademarks from implementors that act in bad faith. Implementors may, however, request explicit permission to use our trademarks with a conforming implementation.<br />
* Is localisation of a formal specification possible? (Y/N)<br />
: Yes. We welcome anyone who wishes to translate Xiph specifications into other languages. We have no policy requiring that the normative specification be written in English.<br />
<br />
=== Interoperability governance === <br />
The interoperability governance evaluation criteria relates to how interoperability is identified and maintained between interoperable formal specifications. In order to do this, the organisation may provide governance for: <br />
* open identification in formal specifications, <br />
* open negotiation in formal specifications, <br />
* open selection in formal specifications. <br />
<br />
=== Meeting and consultation ===<br />
The meeting and consultation evaluation criteria relates to the process of defining a formal specification. As formal specifications are usually defined by committees, and these committees normally consist of members of the organisation, this criteria studies how to become a member and which are the financial barriers for this, as well as how are non-members able to have an influence on the process of defining the formal specification. It analyses: <br />
* if the organisation is open to all types of companies and organisations and to individuals; <br />
: Yes. Xiph welcomes representatives from all companies and organizations.<br />
* if the standardisation process may specifically allow participation of members with limited abilities when relevant; <br />
: Yes. Standardization occurs almost entirely in internet communications channels, allowing participants with disabilities to engage fully in the standards development process. We also encourage nonexperts and students to assist us as they can, and to learn about Xiph technologies by participating in the standards development process.<br />
* if meetings are open to all members;<br />
: Xiph meetings are open to everyone. We charge no fee for and place no restrictions on attendance or participation. For example, anyone interested in contributing to the Theora specification may join [http://lists.xiph.org/pipermail/theora-dev/ the Theora development mailing list].<br />
* if all can participate in the formal specification creation process; <br />
: Yes. All people are welcome to participate in the specification creation process. No dues or fees are required to participate.<br />
* if non-members can participate in the formal specification creation process.<br />
: Yes. Xiph does not maintain an explicit list of members, and no one is excluded from contributing to specifications as they are developed.<br />
<br />
=== Consensus ===<br />
Consensus is decision making primarily with regard to the approval of formal specifications and review with interest groups (non-members). The consensus evaluation criterion is evaluated with the following questions:<br />
* Does the organisation have a stated objective of reaching consensus when making decisions on standards? <br />
: There is no explicitly stated objective of reaching consensus. However, when new contributions are made, the key specification developers may veto the introduction of a new feature. Generally, differences are discussed openly and new features are adapted until they fit the overall architecture of the standards, at which stage they are introduced into the specification, standards, and software.<br />
* If consensus is not reached, can the standard be approved? (answers are: cannot be approved but referred back to working group/committee, approved with 75% majority, approved with 66% majority, approved with 51% majority, can be decided by a "director" or similar in the organisation).<br />
: The standard can be approved without consensus via the decision of a "director" or similar.<br />
* Is there a formal process for external review of standard proposals by interest groups (nonmembers)?<br />
: Since anyone may participate in the development process and make proposals, there is no need for a separate formal process to include proposals by nonmembers.<br />
<br />
=== Due Process ===<br />
The due process evaluation criteria relates to the level of respect of each member of the organisation with regard to its rights. More specifically, it must be assured that if a member believes an error has been made in the process of defining a formal specification, it must be possible to appeal this to an independent, higher instance. The question is therefore: can a member formally appeal or raise objections to a procedure or to a technical specification to an independent, higher instance?<br />
<br />
: Yes. Even if a member fails an appeal within the organization, because all of the technology Xiph standardizes is open and freely implementable, they are always free to develop their own, competing version. Such competing versions may even still be eligible for standardization under the Xiph umbrella.<br />
<br />
=== Changes to the formal specification ===<br />
The suggested changes made to a formal specification need to be presented, evaluated and approved in the same way as the formal specification was first defined. This criterion therefore applies the above criteria to the changes made to the formal specification (availability of documentation, intellectual property rights, accessibility, interoperability governance, meeting and consultation, consensus, due process).<br />
<br />
: The exact same process is used for revisions to the standard as was used for the original development of the standard, and thus the answers to all of the above questions remain the same.<br />
<br />
=== Support ===<br />
It is critical that the organisation takes responsibility for the formal specification throughout its life span. This can be done in several ways such as for example a regular periodic review of the formal specification. The support criteria relates to the level of commitment the organisation has taken to support the formal specification throughout its life: <br />
* does the organisation provide support until removal of the published formal specification from the public domain (including this process)? <br />
: Xiph.Org standards are never removed from the public domain. Xiph endeavors to provide support for as long as the standard remains in use.<br />
* does the organisation make the formal specification still available even when in non-maintenance mode?<br />
: Yes. All Xiph.Org standards are freely licensed and will always be available.<br />
* does the organisation add new features and keep the formal specification up-to-date?<br />
: Yes. Xiph maintains its ecosystem of standards on a continuous basis.<br />
* does the organisation rectify problems identified in initial implementations?<br />
: Yes. Xiph maintains [https://trac.xiph.org/report a problem reporting system] that is open to the public, and invites everyone to submit suggestions for improvements. Improvements are made both to the standards documents and to the reference implementations.<br />
* does the organisation only create the formal specification?<br />
: No. Xiph also produces high-quality reusable reference implementations of its standards, released under an open license.<br />
<br />
<br />
<strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong></div>
Silvia
https://wiki.xiph.org/index.php?title=IDABC_Questionnaire_2009&diff=10693
IDABC Questionnaire 2009
2009-11-15T23:42:22Z
<p>Silvia: clarification on a11y</p>
<hr />
<div><strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong><br />
<br />
= Context =<br />
We received [http://lists.xiph.org/pipermail/theora/2009-November/002996.html an e-mail] from a consultant studying the suitability of Theora for use in "eGovernment", on behalf of the [http://ec.europa.eu/idabc/ IDABC], an EU governmental agency responsible for "Interoperability" with an emphasis on open source. The investigation is in the context of [http://ec.europa.eu/idabc/en/document/7728 European Interoperability Framework], about which there has been [http://www.computerworlduk.com/community/blogs/index.cfm?entryid=2620&blogid=14&pn=1 some real controversy].<br />
<br />
The method of assessment is the Common Assessment Method for Standards and Specifications, including the questions below.<br />
<br />
= CAMSS Questions =<br />
== Part 4: Market Criteria ==<br />
<br />
This group of Market criteria analyses the formal specification in the scope of its market environment, and more precisely it examines the implementations of the formal specification and the market players. This implies identifying to what extent the formal specification benefits from market support and wide adoption, its level of maturity, and its capacity for reuse.<br />
<br />
Market support is evaluated through an analysis of how many products implementing the formal specification exist, what their market share is and who their end-users are. The quality and the completeness (in case of partitioning) of the implementations of the formal specification can also be analysed. Availability of existing or planned mechanisms to assess conformity of implementations to the standard or to the specification could also be identified. The existence of at least one reference implementation (i.e.: mentioning a recognized certification process) - and of which one is an open source implementation - can also be relevant to the assessment. Wide adoption can also be assessed across domains (i.e.: public and private sectors), in an open environment, and/or in a similar field (i.e.: best practices).<br />
<br />
A formal specification is mature if it has been in use and development for long enough that most of its initial problems have been overcome and its underlying technology is well understood and well defined. Maturity is also assessed by identifying whether all aspects of the formal specification are considered validated by usage (i.e., if the formal specification is partitioned), and whether the reported issues have been solved and documented.<br />
<br />
Reusability of a formal specification is enabled if it includes guidelines for its implementation in a given context. The identification of successful implementations of the standard or specification should focus on good practices in a similar field. Its incompatibility with related standards or specifications should also be taken into account.<br />
<br />
The ideas behind the Market Criteria can also be expressed in the form of the following questions:<br />
<br />
=== Market support ===<br />
* Does the standard have strong support in the marketplace? <br />
: Yes. For example, among web browsers, support for Xiph's Ogg, Theora, and Vorbis standards is now included by default in Mozilla Firefox, Google Chrome, and the latest versions of Opera, representing hundreds of millions of installed users just in this market alone. Further, a QuickTime component exists which enables use of Xiph's Ogg, Theora, and Vorbis standards in all Mac OS X applications that make use of the QuickTime framework - which includes Safari/Webkit, iMovie, QuickTime, and many others. On Windows, DirectShow filters exist which also enable all Windows applications that use the DirectShow framework to use Xiph's Ogg, Theora, and Vorbis standards.<br />
* What products exist for this formal specification ? <br />
: Theora is a video codec, and as such the required products are encoders, decoders, and transmission systems. All three types of products are widely available for Theora.<br />
* How many implementations of the formal specification are there? <br />
: Xiph does not require implementors to acquire any license before implementing the specification. Therefore, we do not have a definitive count of the number of implementations. In addition to the reference implementation, which has been ported to most modern platforms and highly optimized for x86 and ARM CPUs and TI C64x+ DSPs, we are aware of a number of independent, conformant or mostly-conformant implementations. These include two C decoders (ffmpeg and QTheora), a Java decoder (Jheora), a C# decoder, an FPGA decoder, and an FPGA encoder.<br />
* Are there products from different suppliers in the market that implement this formal specification? <br />
: Yes. Corporations such as Atari, Canonical, DailyMotion, Elphel, Fluendo, Google, Mozilla, Novell, Opera, Red Hat, Sun Microsystems, Ubisoft, and countless others have supplied products with an implementation of the Theora standard.<br />
* Are there many products readily available from a variety of suppliers? <br />
: Yes. Theora has been deployed in embedded devices, security cameras, video games, video conferencing systems, web browsers, home theater systems, and many other products. A complete, legal, open-source reference implementation can also be downloaded free of charge, including components for all major media frameworks (DirectShow, gstreamer, and Quicktime), giving the plethora of applications which use these frameworks the ability to use the codec.<br />
* What is the market share of the products implementing the formal specification, versus other implementations of competing formal specifications? <br />
: Theora playback is extremely widely available, covering virtually the entire market of personal computers. Theora is also increasingly available in mobile and embedded devices. Since we do not require licensing for products that implement the specification, we do not have market share numbers that can be compared with competing formal specifications. Because implementations are readily available and free, Theora is included in many products that support multiple codecs, and is sometimes the only video codec included in free software products.<br />
* Who are the end-users of these products implementing the formal specification?<br />
: The end users are television viewers, video gamers, web surfers, movie makers, business people, video distribution services, and anyone else who interacts with moving pictures.<br />
<br />
=== Maturity ===<br />
* Are there any existing or planned mechanisms to assess conformity of the implementations of the formal specification? <br />
: Yes. In addition to a continuous peer review process, we maintain a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that allow implementors to assess decoder conformity. We also provide free online developer support and testing for those attempting to make a conforming implementation. An [http://validator.xiph.org/ online validation service] is available.<br />
* Is there a reference implementation (i.e.: mentioning a recognized certification process)? <br />
: Yes. Xiph maintains a reference implementation called [http://downloads.xiph.org/releases/theora/ libtheora]. In addition to serving as a reference, libtheora is also highly optimized to achieve the maximum possible speed, accuracy, reliability, efficiency, and video quality. As a result, many implementors of Theora adopt the reference implementation.<br />
* Is there an open source implementation? <br />
: Yes. libtheora is made available under a completely permissive BSD-like license. Its open-source nature also contributes to its quality as a reference implementation, as implementors are welcome to contribute their improvements to the reference. There are also several other open source implementations.<br />
* Does the formal specification show wide adoption? <br />
** across different domains? (I.e.: public and private) <br />
: Yes. In addition to the private companies mentioned in the previous section, Theora has also been specified as the sole format supported by non-profit institutions such as Wikipedia, currently the 6th largest website in the world, or as one of a small number of preferred formats supported by other public organizations, such as the Norwegian government.<br />
** in an open environment? <br />
: Yes. On open/free operating systems such as those distributed by Novell/SuSE, Canonical, and Red Hat, Theora is the primary default video codec.<br />
** in a similar field? (i.e.: can best practices be identified?) <br />
* Has the formal specification been in use and development long enough that most of its initial problems have been overcome? <br />
: Yes. Theora was derived from VP3, which was originally released in May 2000. The Theora specification was completed in 2004. Theora has now been used in a wide variety of applications, on the full spectrum of computing devices.<br />
* Is the underlying technology of the standard well-understood? (e.g., a reference model is well defined, appropriate concepts of the technology are in widespread use, the technology may have been in use for many years, a formal mathematical model is defined, etc.) <br />
: Yes. The underlying technology has been in use for nearly a decade, and most of the concepts have been in widespread use for even longer.<br />
* Is the formal specification based upon technology that has not been well-defined and may be relatively new? <br />
: No. The formal specification is based on technology from the On2 VP3 codec, which is substantially similar to simple block-transform codecs like H.261. This class of codecs is extremely well understood, and has been actively in use for over 20 years.<br />
* Has the formal specification been revised? (Yes/No, No. of revisions) <br />
: Yes. The specification of the encoder is continuously revised based on user feedback to improve clarity and accuracy. The specification of the decoding part has been stable for years.<br />
* Is the formal specification under the auspices of an architectural board? (Yes/No) <br />
: No. Although officially maintained by the Xiph.Org Foundation, anyone is free to join this organization, and one need not even be a member to make contributions. However, the core developers will review contributions and make sure that they do not contradict the general architecture and that they work well with the existing code and the test cases.<br />
* Is the formal specification partitioned in its functionality? (Yes/No) <br />
: No. Theora is very deliberately not partitioned, to avoid the confusion created by a "standard" composed of many incompatible "profiles". The Theora standard does not have any optional components. A compliant Theora decoder can correctly process any Theora stream.<br />
** To what extent does each partition participate to its overall functionality? (NN%) <br />
: N/A.<br />
** To what extent is each partition implemented? (NN%) (cf market adoption)<br />
: N/A.<br />
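The decoder-conformity assessment via test vectors mentioned in this section amounts to comparing a decoder's raw output against known-good reference output. A minimal sketch of such a check is below; the manifest format and file names are illustrative assumptions, not part of any actual Xiph tool:

```python
import hashlib
from pathlib import Path

def md5_of(path, chunk=1 << 16):
    """MD5 digest of a file, streamed so large raw YUV dumps do not fill memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def check_vectors(manifest, decoded_dir):
    """Compare decoded test-vector output against expected digests.

    `manifest` maps an output file name to the expected MD5 of the
    decoder's raw (e.g. YUV) output for that vector; returns the list
    of vectors whose output does not match the reference.
    """
    return [name for name, digest in manifest.items()
            if md5_of(Path(decoded_dir) / name) != digest]
```

A harness like this only demonstrates the principle; a real conformance run would first invoke the decoder under test on each vector to produce the files being hashed.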
<br />
=== Re-usability === <br />
* Does the formal specification provide guidelines for its implementation in a given organisation? <br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] provides "non-normative" advice and explanation for implementors of Theora decoders and encoders, including example algorithms for implementing required mathematical transforms. Xiph also maintains [http://wiki.xiph.org/Main_Page a documentation base] for implementors who desire more guidelines beyond the specification itself.<br />
* Can other cases where similar systems implement the formal specification be considered as successful implementations and good practices? <br />
: Xiph's standards have successfully been implemented by many organisations in a wide variety of environments. We maintain (non-exhaustive) [http://wiki.xiph.org/TheoraSoftwarePlayers lists] of products which implement Theora support, many of them open source, so that others may use them as a reference when preparing their own products. A particularly well known, independent, but interoperable implementation is provided by the FFmpeg open source project.<br />
* Is its compatibility with related formal specification documented?<br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] also documents the use of Theora within the [http://www.ietf.org/rfc/rfc3533.txt standard Ogg encapsulation format], and the [http://svn.xiph.org/trunk/theora/doc/draft-ietf-avt-rtp-theora-00.txt TheoraRTP draft specification] explains how to transmit Theora using the [http://tools.ietf.org/html/rfc3550 RTP standard]. In addition, the specification documents Theora's compatibility with ITU-R BT.470, ITU-R BT.601, ITU-R BT.709, SMPTE 170M, [http://tools.ietf.org/html/rfc2044 UTF-8], ISO 10646, and [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.pdf Ogg Vorbis].<br />
<br />
== Part 5: Standardisation Criteria == <br />
From Idabc-camss<br />
<br />
Note: Throughout this section, “Organisation” refers to the standardisation/fora/consortia body in charge of the formal specification.<br />
<br />
Significant characteristics of the way the organisation operates are for example the way it gives the possibility to stakeholders to influence the evolution of the formal specification, or which conditions it attaches to the use of the formal specification or its implementation. Moreover, it is important to know how the formal specification is defined, supported, and made available, as well as how interaction with stakeholders is managed by the organisation during these steps. Governance of interoperability testing with other formal specifications is also indicative.<br />
<br />
The standardisation criteria analyses therefore the following elements:<br />
<br />
=== Availability of Documentation ===<br />
The availability of documentation criteria is linked to cost and online availability. Access to all preliminary results documentation can be online, online for members only, offline, offline for members only, or not available. Access can be free or for a fee (and if so, what fee?).<br />
: Every Xiph standard is permanently available online to everyone at no cost. For example, we invite everyone to download [http://theora.org/doc/Theora.pdf the most up-to-date copy of the Theora specification], and [http://xiph.org/vorbis/doc/Vorbis_I_spec.html the latest revision of Vorbis]. All previous revisions are available from Xiph's [http://svn.xiph.org/ revision control system].<br />
<br />
=== Intellectual Property Right ===<br />
The Intellectual Property Rights evaluation criteria relates to the ability for implementers to use the formal specification in products without legal or financial implications. The IPR policy of the organisation is therefore evaluated according to: <br />
* the availability of the IPR or copyright policies of the organisation (available on-line or off-line, or not available);<br />
: The reference implementations of each codec include all necessary IPR and copyright licenses for that codec, including all documentation, and are freely available to everyone.<br />
* the organisation’s governance to disclose any IPR from any contributor (ex-ante, online, offline, for free for all, for a fee for all, for members only, not available);<br />
: Xiph does not require the identification of specific patents that may be required to implement a standard; however, it does require an open-source-compatible, royalty-free license from a contributor for any such patents they may own before the corresponding technology can be included in a standard. These licenses are made available online, for free, to all parties.<br />
* the level of IPR set "mandatory" by the organisation (no patent, royalty free patent, patent and RAND with limited liability, patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none); <br />
: All standards, specifications, and software published by the Xiph.Org Foundation are required to have "open-source compatible" IPR. This means that a contribution must either be entirely clear of any known patents, or any patents that read upon the contribution must be available under a transferable, irrevocable public nonassertion agreement to all people everywhere. For example, see [http://svn.xiph.org/trunk/theora/LICENSE our On2 patent nonassertion warrant]. Other common "royalty free" patent licenses are either not transferable, revocable under certain conditions (such as patent infringement litigation against the originating party), or otherwise impose restrictions that would prevent distribution under common [http://www.opensource.org/ OSI]-approved licenses. These would not be acceptable.<br />
* the level of IPR "recommended" by the organisation (no patent, royalty free patent, patent and RAND with limited liability, patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none). [Note: RAND (Reasonable and Non Discriminatory License) is based on a "fairness" concept. Companies agree that if they receive any patents on technologies that become essential to the standard then they agree to allow other groups attempting to implement the standard to use these patents and they agree that the charges for the patents shall be reasonable. "RAND with limited availability" is a version of RAND where the "reasonable charges" have an upper limit.]<br />
: Xiph's recommended IPR requirements are the same as our mandatory requirements.<br />
<br />
=== Accessibility ===<br />
<br />
The accessibility evaluation criteria describe the importance of equal and safe accessibility by the users of implementations of formal specifications. This aspect can be related to safety (physical safety and conformance safety) and accessibility of physical impaired people (design for all).<br />
<br />
Focus is made particularly on accessibility and conformance safety. Conformance testing is testing to determine whether a system meets a specified formal specification. The result can be, for example, the output of a test suite. Conformance validation is when the conformance test uniquely qualifies a given implementation as conformant or not. Conformance certification is a process that provides a public and easily visible "stamp of approval" that an implementation of a standard validates as conformant.<br />
<br />
The following questions allow an assessment of accessibility and conformance safety: <br />
* Does a mechanism that ensures disability support by a formal specification exist? (Y/N) <br />
: Yes. Xiph ensures support for users with disabilities by providing specifications for accessible technologies independent of the codec itself. Notably, the Xiph [http://wiki.xiph.org/OggKate OggKate] codec for time-aligned text and image content provides support for subtitles for internationalisation, captions for the hearing-impaired, and textual audio descriptions for the visually impaired. Further, Ogg supports multiple tracks of audio and video content in one container, such that sign language tracks and audio descriptions can be included in one file. For this to work, Xiph has defined [http://wiki.xiph.org/Ogg_Skeleton Skeleton], which holds metadata about each track encapsulated within the one Ogg file. When Theora is transmitted or stored in an Ogg container, it is automatically compatible with these accessibility measures.<br />
* Is conformance governance always part of a standard? (Y/N) <br />
: No. Xiph does not normally provide a formal conformance testing process as part of a standard.<br />
* Is a conformance test offered to implementers? (Y/N) <br />
: Yes. Xiph maintains a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that can be used by implementors to confirm basic conformance. Also, Xiph's [http://validator.xiph.org online validation service] is a freely available service that can be used by anyone to check conformance.<br />
* Is conformance validation available to implementers? (Y/N) <br />
: Yes. Informal conformance testing is available to implementors upon request, and Xiph has provided such testing for a number of implementations in the past. The oggz tools also contain a validation program, oggz-validate, which implementers have used extensively.<br />
* Is conformance certification available? (Y/N) <br />
: Yes. Xiph does not require certification, but maintains the right to withhold the use of our trademarks from implementors that act in bad faith. Implementors may, however, request explicit permission to use our trademarks with a conforming implementation.<br />
* Is localisation of a formal specification possible? (Y/N)<br />
: Yes. We welcome anyone who wishes to translate Xiph specifications into other languages. We have no policy requiring that the normative specification be written in English.<br />
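The multitrack arrangement described above, where Skeleton and codec headers identify each track, means an Ogg file's contents can be enumerated without any decoder libraries, simply by reading the magic bytes in the beginning-of-stream (BOS) pages at the front of the file. The following is a minimal illustrative sketch, assuming a well-formed file and covering only a few codec magics:

```python
import struct

# Magic bytes at the start of each codec's beginning-of-stream packet.
BOS_MAGIC = {
    b"\x80theora": "Theora video",
    b"\x01vorbis": "Vorbis audio",
    b"fishead\x00": "Skeleton metadata",
    b"\x7fFLAC": "FLAC audio",
    b"Speex   ": "Speex audio",
}

def identify_tracks(path):
    """Map each logical stream's serial number to a codec name by reading
    only the BOS pages at the front of an Ogg file (CRC checks skipped)."""
    with open(path, "rb") as f:
        data = f.read()
    tracks = {}
    pos = 0
    while pos + 27 <= len(data) and data[pos:pos + 4] == b"OggS":
        header_type = data[pos + 5]                      # 0x02 flag marks a BOS page
        serial = struct.unpack_from("<I", data, pos + 14)[0]
        nsegs = data[pos + 26]
        seg_table = data[pos + 27:pos + 27 + nsegs]
        body = pos + 27 + nsegs
        if header_type & 0x02:
            payload = data[body:body + 16]
            tracks[serial] = next(
                (name for magic, name in BOS_MAGIC.items()
                 if payload.startswith(magic)), "unknown")
        else:
            break  # BOS pages must precede all other pages; stop once they end
        pos = body + sum(seg_table)  # advance past this page's payload
    return tracks
```

A real inspector would additionally parse the Skeleton fisbone packets to recover per-track metadata such as language tags, but the principle, identification without decoding, is the same.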
<br />
=== Interoperability governance === <br />
The interoperability governance evaluation criteria relates to how interoperability is identified and maintained between interoperable formal specifications. In order to do this, the organisation may provide governance for: <br />
* open identification in formal specifications, <br />
* open negotiation in formal specifications, <br />
* open selection in formal specifications. <br />
<br />
=== Meeting and consultation ===<br />
The meeting and consultation evaluation criteria relates to the process of defining a formal specification. As formal specifications are usually defined by committees, and these committees normally consist of members of the organisation, this criteria studies how to become a member and which are the financial barriers for this, as well as how are non-members able to have an influence on the process of defining the formal specification. It analyses: <br />
* if the organisation is open to all types of companies and organisations and to individuals; <br />
: Yes. Xiph welcomes representatives from all companies and organizations.<br />
* if the standardisation process may specifically allow participation of members with limited abilities when relevant; <br />
: Yes. Standardization occurs almost entirely in internet communications channels, allowing participants with disabilities to engage fully in the standards development process. We also encourage nonexperts and students to assist us as they can, and to learn about Xiph technologies by participating in the standards development process.<br />
* if meetings are open to all members;<br />
: Xiph meetings are open to everyone. We charge no fee for and place no restrictions on attendance or participation. For example, anyone interested in contributing to the Theora specification may join [http://lists.xiph.org/pipermail/theora-dev/ the Theora development mailing list].<br />
* if all can participate in the formal specification creation process; <br />
: Yes. All people are welcome to participate in the specification creation process. No dues or fees are required to participate.<br />
* if non-members can participate in the formal specification creation process.<br />
: Yes. Xiph does not maintain an explicit list of members, and no one is excluded from contributing to specifications as they are developed.<br />
<br />
=== Consensus ===<br />
Consensus is decision making primarily with regard to the approval of formal specifications and review with interest groups (non-members). The consensus evaluation criterion is evaluated with the following questions:<br />
* Does the organisation have a stated objective of reaching consensus when making decisions on standards? <br />
: There is no explicitly stated objective of reaching consensus.<br />
* If consensus is not reached, can the standard be approved? (answers are: cannot be approved but referred back to working group/committee, approved with 75% majority, approved with 66% majority, approved with 51% majority, can be decided by a "director" or similar in the organisation).<br />
: The standard can be approved without consensus via the decision of a "director" or similar.<br />
* Is there a formal process for external review of standard proposals by interest groups (nonmembers)?<br />
: Since anyone may participate in the development process and make proposals, there is no need for a separate formal process to include proposals by nonmembers.<br />
<br />
=== Due Process ===<br />
The due process evaluation criteria relates to the level of respect of each member of the organisation with regard to its rights. More specifically, it must be assured that if a member believes an error has been made in the process of defining a formal specification, it must be possible to appeal this to an independent, higher instance. The question is therefore: can a member formally appeal or raise objections to a procedure or to a technical specification to an independent, higher instance?<br />
<br />
: Yes. Even if a member fails an appeal within the organization, because all of the technology Xiph standardizes is open and freely implementable, they are always free to develop their own, competing version. Such competing versions may even still be eligible for standardization under the Xiph umbrella.<br />
<br />
=== Changes to the formal specification ===<br />
The suggested changes made to a formal specification need to be presented, evaluated and approved in the same way as the formal specification was first defined. This criteria therefore applies the above criteria to the changes made to the formal specification (availability of documentation, Intellectual Property Right, accessibility, interoperability governance, meeting and consultation, consensus, due process).<br />
<br />
: The exact same process is used for revisions to the standard as was used for the original development of the standard, and thus the answers to all of the above questions remain the same.<br />
<br />
=== Support ===<br />
It is critical that the organisation takes responsibility for the formal specification throughout its life span. This can be done in several ways such as for example a regular periodic review of the formal specification. The support criteria relates to the level of commitment the organisation has taken to support the formal specification throughout its life: <br />
* does the organisation provide support until removal of the published formal specification from the public domain (including this process)? <br />
: Xiph.Org standards are never removed from the public domain. Xiph endeavors to provide support for as long as the standard remains in use.<br />
* does the organisation make the formal specification still available even when in non-maintenance mode?<br />
: Yes. All Xiph.Org standards are freely licensed and will always be available.<br />
* does the organisation add new features and keep the formal specification up-to-date?<br />
: Yes. Xiph maintains its ecosystem of standards on a continuous basis.<br />
* does the organisation rectify problems identified in initial implementations?<br />
: Yes. Xiph maintains [https://trac.xiph.org/report a problem reporting system] that is open to the public, and invites everyone to submit suggestions for improvements. Improvements are made both to the standards documents and to the reference implementations.<br />
* does the organisation only create the formal specification?<br />
: No. Xiph also produces high-quality reusable reference implementations of its standards, released under an open license.<br />
<br />
<br />
<strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong></div>
Silvia
https://wiki.xiph.org/index.php?title=IDABC_Questionnaire_2009&diff=10692
IDABC Questionnaire 2009
2009-11-15T23:32:10Z
<p>Silvia: some word smithing to clarify "standards" of Xiph</p>
<hr />
<div><strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong><br />
<br />
= Context =<br />
We received [http://lists.xiph.org/pipermail/theora/2009-November/002996.html an e-mail] from a consultant studying the suitability of Theora for use in "eGovernment", on behalf of the [http://ec.europa.eu/idabc/ IDABC], an EU governmental agency responsible for "Interoperability" with an emphasis on open source. The investigation is in the context of [http://ec.europa.eu/idabc/en/document/7728 European Interoperability Framework], about which there has been [http://www.computerworlduk.com/community/blogs/index.cfm?entryid=2620&blogid=14&pn=1 some real controversy].<br />
<br />
The method of assessment is the Common Assessment Method for Standards and Specifications, including the questions below.<br />
<br />
= CAMSS Questions =<br />
== Part 4: Market Criteria ==<br />
<br />
This group of Market criteria analyses the formal specification in the scope of its market environment; more precisely, it examines the implementations of the formal specification and the market players. This implies identifying to which extent the formal specification benefits from market support and wide adoption, and what its level of maturity and capacity for reusability are.<br />
<br />
Market support is evaluated through an analysis of how many products implementing the formal specification exist, what their market share is and who their end-users are. The quality and the completeness (in case of partitioning) of the implementations of the formal specification can also be analysed. Availability of existing or planned mechanisms to assess conformity of implementations to the standard or to the specification could also be identified. The existence of at least one reference implementation (i.e.: mentioning a recognized certification process) - and whether one of them is an open source implementation - can also be relevant to the assessment. Wide adoption can also be assessed across domains (i.e.: public and private sectors), in an open environment, and/or in a similar field (i.e.: best practices).<br />
<br />
A formal specification is mature if it has been in use and development for long enough that most of its initial problems have been overcome and its underlying technology is well understood and well defined. Maturity is also assessed by identifying if all aspects of the formal specification are considered as validated by usage (i.e.: if the formal specification is partitioned), and if the reported issues have been solved and documented.<br />
<br />
Reusability of a formal specification is enabled if it includes guidelines for its implementation in a given context. The identification of successful implementations of the standard or specification should focus on good practices in a similar field. Its incompatibility with related standards or specifications should also be taken into account.<br />
<br />
The ideas behind the Market Criteria can also be expressed in the form of the following questions:<br />
<br />
=== Market support ===<br />
* Does the standard have strong support in the marketplace? <br />
: Yes. For example, among web browsers, support for Xiph's Ogg, Theora, and Vorbis standards is now included by default in Mozilla Firefox, Google Chrome, and the latest versions of Opera, representing hundreds of millions of installed users just in this market alone. Further, a QuickTime component exists which enables use of Xiph's Ogg, Theora, and Vorbis standards in all Mac OS X applications that make use of the QuickTime framework - which includes Safari/Webkit, iMovie, QuickTime, and many others. On Windows, DirectShow filters exist which also enable all Windows applications that use the DirectShow framework to use Xiph's Ogg, Theora, and Vorbis standards.<br />
* What products exist for this formal specification ? <br />
: Theora is a video codec, and as such the required products are encoders, decoders, and transmission systems. All three types of products are widely available for Theora.<br />
* How many implementations of the formal specification are there? <br />
: Xiph does not require implementors to acquire any license before implementing the specification. Therefore, we do not have a definitive count of the number of implementations. In addition to the reference implementation, which has been ported to most modern platforms and highly optimized for x86 and ARM CPUs and TI C64x+ DSPs, we are aware of a number of independent, conformant or mostly-conformant implementations. These include two C decoders (ffmpeg and QTheora), a Java decoder (Jheora), a C# decoder, an FPGA decoder, and an FPGA encoder.<br />
* Are there products from different suppliers in the market that implement this formal specification ? <br />
: Yes. Corporations such as Atari, Canonical, DailyMotion, Elphel, Fluendo, Google, Mozilla, Novell, Opera, Red Hat, Sun Microsystems, Ubisoft, and countless others have supplied products with an implementation of the Theora standard.<br />
* Are there many products readily available from a variety of suppliers? <br />
: Yes. Theora has been deployed in embedded devices, security cameras, video games, video conferencing systems, web browsers, home theater systems, and many other products. A complete, legal, open-source reference implementation can also be downloaded free of charge, including components for all major media frameworks (DirectShow, gstreamer, and Quicktime), giving the plethora of applications which use these frameworks the ability to use the codec.<br />
* What is the market share of the products implementing the formal specification, versus other implementations of competing formal specifications ? <br />
: Theora playback is extremely widely available, covering virtually the entire market of personal computers. Theora is also increasingly available in mobile and embedded devices. Since we do not require licensing for products that implement the specification, we do not have market share numbers that can be compared with competing formal specifications. Because implementations are readily available and free, Theora is included in many products that support multiple codecs, and is sometimes the only video codec included in free software products.<br />
* Who are the end-users of these products implementing the formal specification?<br />
: The end users are television viewers, video gamers, web surfers, movie makers, business people, video distribution services, and anyone else who interacts with moving pictures.<br />
<br />
=== Maturity ===<br />
* Are there any existing or planned mechanisms to assess conformity of the implementations of the formal specification? <br />
: Yes. In addition to a continuous peer review process, we maintain a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that allow implementors to assess decoder conformity. We also provide free online developer support and testing for those attempting to make a conforming implementation. An [http://validator.xiph.org/ online validation service] is available.<br />
* Is there a reference implementation (i.e.: mentioning a recognized certification process)? <br />
: Yes. Xiph maintains a reference implementation called [http://downloads.xiph.org/releases/theora/ libtheora]. In addition to serving as a reference, libtheora is also highly optimized to achieve the maximum possible speed, accuracy, reliability, efficiency, and video quality. As a result, many implementors of Theora adopt the reference implementation.<br />
* Is there an open source implementation? <br />
: Yes. libtheora is made available under a completely permissive BSD-like license. Its open-source nature also contributes to its quality as a reference implementation, as implementors are welcome to contribute their improvements to the reference. There are also several other open source implementations.<br />
* Does the formal specification show wide adoption? <br />
** across different domains? (I.e.: public and private) <br />
: Yes. In addition to the private companies mentioned in the previous section, Theora has also been specified as the sole format supported by non-profit institutions such as Wikipedia, currently the 6th largest website in the world, or as one of a small number of preferred formats supported by other public organizations, such as the Norwegian government.<br />
** in an open environment? <br />
: Yes. On open/free operating systems such as those distributed by Novell/SuSE, Canonical, and Red Hat, Theora is the primary default video codec.<br />
** in a similar field? (i.e.: can best practices be identified?) <br />
* Has the formal specification been in use and development long enough that most of its initial problems have been overcome? <br />
: Yes. Theora was derived from VP3, which was originally released in May 2000. The Theora specification was completed in 2004. Theora has now been used in a wide variety of applications, on the full spectrum of computing devices.<br />
* Is the underlying technology of the standard well-understood? (e.g., a reference model is well defined, appropriate concepts of the technology are in widespread use, the technology may have been in use for many years, a formal mathematical model is defined, etc.) <br />
: Yes. The underlying technology has been in use for nearly a decade, and most of the concepts have been in widespread use for even longer.<br />
* Is the formal specification based upon technology that has not been well-defined and may be relatively new? <br />
: No. The formal specification is based on technology from the On2 VP3 codec, which is substantially similar to simple block-transform codecs like H.261. This class of codecs is extremely well understood, and has been actively in use for over 20 years.<br />
* Has the formal specification been revised? (Yes/No, Nof) <br />
: Yes. The specification of the encoder is continuously revised based on user feedback to improve clarity and accuracy. The specification of the decoding part has been stable for years.<br />
* Is the formal specification under the auspices of an architectural board? (Yes/No) <br />
: No. Although the specification is officially maintained by the Xiph.Org Foundation, anyone is free to join the organization, and one need not even be a member to make contributions. However, the core developers review contributions to ensure that they do not contradict the general architecture and that they work well with the existing code and the test cases.<br />
* Is the formal specification partitioned in its functionality? (Yes/No) <br />
: No. Theora is very deliberately not partitioned, to avoid the confusion created by a "standard" composed of many incompatible "profiles". The Theora standard does not have any optional components. A compliant Theora decoder can correctly process any Theora stream.<br />
** To what extent does each partition participate to its overall functionality? (NN%) <br />
: N/A.<br />
** To what extent is each partition implemented? (NN%) (cf market adoption)<br />
: N/A.<br />
<br />
=== Re-usability === <br />
* Does the formal specification provide guidelines for its implementation in a given organisation? <br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] provides "non-normative" advice and explanation for implementors of Theora decoders and encoders, including example algorithms for implementing required mathematical transforms. Xiph also maintains [http://wiki.xiph.org/Main_Page a documentation base] for implementors who desire more guidelines beyond the specification itself.<br />
* Can other cases where similar systems implement the formal specification be considered as successful implementations and good practices? <br />
: Xiph's standards have successfully been implemented by many organisations in a wide variety of environments. We maintain (non-exhaustive) [http://wiki.xiph.org/TheoraSoftwarePlayers lists] of products which implement Theora support, many of them open source, so that others may use them as a reference when preparing their own products. A particularly well known, independent, but interoperable implementation is provided by the FFmpeg open source project.<br />
* Is its compatibility with related formal specification documented?<br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] also documents the use of Theora within the [http://www.ietf.org/rfc/rfc3533.txt standard Ogg encapsulation format], and the [http://svn.xiph.org/trunk/theora/doc/draft-ietf-avt-rtp-theora-00.txt TheoraRTP draft specification] explains how to transmit Theora using the [http://tools.ietf.org/html/rfc3550 RTP standard]. In addition, the specification documents Theora's compatibility with ITU-R B.470, ITU-R B.601, ITU-R B.709, SMPTE-170M, [http://tools.ietf.org/html/rfc2044 UTF-8], ISO 10646, and [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.pdf Ogg Vorbis].<br />
<br />
== Part 5: Standardisation Criteria == <br />
From Idabc-camss<br />
<br />
Note: Throughout this section, “Organisation” refers to the standardisation/fora/consortia body in charge of the formal specification.<br />
<br />
Significant characteristics of the way the organisation operates are for example the way it gives the possibility to stakeholders to influence the evolution of the formal specification, or which conditions it attaches to the use of the formal specification or its implementation. Moreover, it is important to know how the formal specification is defined, supported, and made available, as well as how interaction with stakeholders is managed by the organisation during these steps. Governance of interoperability testing with other formal specifications is also indicative.<br />
<br />
The standardisation criteria analyses therefore the following elements:<br />
<br />
=== Availability of Documentation ===<br />
The availability of documentation criteria is linked to cost and online availability. Access to all preliminary results documentation can be online, online for members only, offline, offline for members only, or not available. Access can be free or for a fee (which fee?).<br />
: Every Xiph standard is permanently available online to everyone at no cost. For example, we invite everyone to download [http://theora.org/doc/Theora.pdf the most up-to-date copy of the Theora specification], and [http://xiph.org/vorbis/doc/Vorbis_I_spec.html the latest revision of Vorbis]. All previous revisions are available from Xiph's [http://svn.xiph.org/ revision control system].<br />
<br />
=== Intellectual Property Right ===<br />
The Intellectual Property Rights evaluation criteria relates to the ability for implementers to use the formal specification in products without legal or financial implications. The IPR policy of the organisation is therefore evaluated according to: <br />
* the availability of the IPR or copyright policies of the organisation (available on-line or off-line, or not available);<br />
: The reference implementations of each codec include all necessary IPR and copyright licenses for that codec, including all documentation, and are freely available to everyone.<br />
* the organisation’s governance to disclose any IPR from any contributor (ex-ante, online, offline, for free for all, for a fee for all, for members only, not available);<br />
: Xiph does not require the identification of specific patents that may be required to implement a standard; however, it does require an open-source compatible, royalty-free license from a contributor for any such patents they may own before the corresponding technology can be included in a standard. These licenses are made available online, for free, to all parties.<br />
* the level of IPR set "mandatory" by the organisation (no patent, royalty free patent, patent and RAND with limited liability, patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none); <br />
: All standards, specifications, and software published by the Xiph.Org Foundation are required to have "open-source compatible" IPR. This means that a contribution must either be entirely clear of any known patents, or any patents that read upon the contribution must be available under a transferable, irrevocable public nonassertion agreement to all people everywhere. For example, see [http://svn.xiph.org/trunk/theora/LICENSE our On2 patent nonassertion warrant]. Other common "royalty free" patent licenses are either not transferable, revocable under certain conditions (such as patent infringement litigation against the originating party), or otherwise impose restrictions that would prevent distribution under common [http://www.opensource.org/ OSI]-approved licenses. These would not be acceptable.<br />
* the level of IPR "recommended" by the organisation (no patent, royalty free patent, patent and RAND with limited liability, patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none). [Note: RAND (Reasonable and Non Discriminatory License) is based on a "fairness" concept. Companies agree that if they receive any patents on technologies that become essential to the standard then they agree to allow other groups attempting to implement the standard to use these patents and they agree that the charges for the patents shall be reasonable. "RAND with limited availability" is a version of RAND where the "reasonable charges" have an upper limit.]<br />
: Xiph's recommended IPR requirements are the same as our mandatory requirements.<br />
<br />
=== Accessibility ===<br />
<br />
The accessibility evaluation criteria describe the importance of equal and safe accessibility by the users of implementations of formal specifications. This aspect can be related to safety (physical safety and conformance safety) and accessibility of physical impaired people (design for all).<br />
<br />
Focus is made particularly on accessibility and conformance safety. Conformance testing is testing to determine whether a system meets some specified formal specification. The result can be results from a test suite. Conformance validation is when the conformance test uniquely qualifies a given implementation as conformant or not. Conformance certification is a process that provides a public and easily visible "stamp of approval" that an implementation of a standard validates as conformant.<br />
<br />
The following questions allow an assessment of accessibility and conformance safety: <br />
* Does a mechanism that ensures disability support by a formal specification exist? (Y/N) <br />
: Yes. Xiph ensures support for users with disabilities by providing specifications for accessible technologies independent of the codec itself. Notable Xiph specifications include [http://wiki.xiph.org/OggKate OggKate] and [http://wiki.xiph.org/index.php/CMML CMML], which provide subtitles for the hearing-impaired, as well as [http://wiki.xiph.org/Ogg_Skeleton Skeleton], which can specify scene description audio tracks for the visually impaired. When Theora is transmitted or stored in an Ogg container, it is automatically compatible with these accessibility measures.<br />
* Is conformance governance always part of a standard? (Y/N) <br />
: No. Xiph does not normally provide a formal conformance testing process as part of a standard.<br />
* Is a conformance test offered to implementers? (Y/N) <br />
: Yes. Xiph maintains a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that can be used by implementors to confirm basic conformance.<br />
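: Such vector-based checks are straightforward to automate. The sketch below is illustrative only (the helper name and the digest-based comparison are assumptions for illustration, not part of Xiph's actual test harness, which ships Ogg files together with reference output): a decoder's raw output frames are hashed and compared against a digest recorded from the reference decoder.<br />

```python
import hashlib

def frames_match_reference(decoded_frames, reference_digest):
    """Hash a decoder's concatenated raw output frames and compare the
    result against a digest recorded from the reference decoder.

    Hypothetical harness: Xiph's test vectors are distributed as Ogg
    files plus reference output, not as bare digests.
    """
    h = hashlib.md5()
    for frame in decoded_frames:
        h.update(frame)
    return h.hexdigest() == reference_digest

# A bit-exact decoder reproduces the reference digest for every vector.
reference = hashlib.md5(b"\x10\x80" * 4).hexdigest()
print(frames_match_reference([b"\x10\x80" * 2, b"\x10\x80" * 2], reference))
```

: Any deviation in a single pixel of a single frame changes the digest, which is why a bit-exact decode specification makes this kind of automated comparison possible.<br />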
* Is conformance validation available to implementers? (Y/N) <br />
: Yes. Informal conformance testing is available to implementors upon request, and Xiph has provided such testing for a number of implementations in the past.<br />
* Is conformance certification available? (Y/N) <br />
: Yes. Xiph does not require certification, but maintains the right to withhold the use of our trademarks from implementors that act in bad faith. Implementors may, however, request explicit permission to use our trademarks with a conforming implementation.<br />
* Is localisation of a formal specification possible? (Y/N)<br />
: Yes. We welcome anyone who wishes to translate Xiph specifications into other languages. We have no policy requiring that the normative specification be written in English.<br />
<br />
=== Interoperability governance === <br />
The interoperability governance evaluation criteria relates to how interoperability is identified and maintained between interoperable formal specifications. In order to do this, the organisation may provide governance for: <br />
* open identification in formal specifications, <br />
* open negotiation in formal specifications, <br />
* open selection in formal specifications. <br />
<br />
=== Meeting and consultation ===<br />
The meeting and consultation evaluation criteria relates to the process of defining a formal specification. As formal specifications are usually defined by committees, and these committees normally consist of members of the organisation, this criteria studies how to become a member and which are the financial barriers for this, as well as how are non-members able to have an influence on the process of defining the formal specification. It analyses: <br />
* if the organisation is open to all types of companies and organisations and to individuals; <br />
: Yes. Xiph welcomes representatives from all companies and organizations.<br />
* if the standardisation process may specifically allow participation of members with limited abilities when relevant; <br />
: Yes. Standardization occurs almost entirely in internet communications channels, allowing participants with disabilities to engage fully in the standards development process. We also encourage nonexperts and students to assist us as they can, and to learn about Xiph technologies by participating in the standards development process.<br />
* if meetings are open to all members;<br />
: Xiph meetings are open to everyone. We charge no fee for and place no restrictions on attendance or participation. For example, anyone interested in contributing to the Theora specification may join [http://lists.xiph.org/pipermail/theora-dev/ the Theora development mailing list].<br />
* if all can participate in the formal specification creation process; <br />
: Yes. All people are welcome to participate in the specification creation process. No dues or fees are required to participate.<br />
* if non-members can participate in the formal specification creation process.<br />
: Yes. Xiph does not maintain an explicit list of members, and no one is excluded from contributing to specifications as they are developed.<br />
<br />
=== Consensus ===<br />
Consensus is decision making primarily with regard to the approval of formal specifications and review with interest groups (non-members). The consensus evaluation criterion is evaluated with the following questions:<br />
* Does the organisation have a stated objective of reaching consensus when making decisions on standards? <br />
: There is no explicitly stated objective of reaching consensus.<br />
* If consensus is not reached, can the standard be approved? (answers are: cannot be approved but referred back to working group/committee, approved with 75% majority, approved with 66% majority, approved with 51% majority, can be decided by a "director" or similar in the organisation).<br />
: The standard can be approved without consensus via the decision of a "director" or similar.<br />
* Is there a formal process for external review of standard proposals by interest groups (nonmembers)?<br />
: Since anyone may participate in the development process and make proposals, there is no need for a separate formal process to include proposals by nonmembers.<br />
<br />
=== Due Process ===<br />
The due process evaluation criteria relates to the level of respect of each member of the organisation with regard to its rights. More specifically, it must be assured that if a member believes an error has been made in the process of defining a formal specification, it must be possible to appeal this to an independent, higher instance. The question is therefore: can a member formally appeal or raise objections to a procedure or to a technical specification to an independent, higher instance?<br />
<br />
: Yes. Even if a member fails an appeal within the organization, because all of the technology Xiph standardizes is open and freely implementable, they are always free to develop their own, competing version. Such competing versions may even still be eligible for standardization under the Xiph umbrella.<br />
<br />
=== Changes to the formal specification ===<br />
The suggested changes made to a formal specification need to be presented, evaluated and approved in the same way as the formal specification was first defined. This criteria therefore applies the above criteria to the changes made to the formal specification (availability of documentation, Intellectual Property Right, accessibility, interoperability governance, meeting and consultation, consensus, due process).<br />
<br />
: The exact same process is used for revisions to the standard as was used for the original development of the standard, and thus the answers to all of the above questions remain the same.<br />
<br />
=== Support ===<br />
It is critical that the organisation takes responsibility for the formal specification throughout its life span. This can be done in several ways such as for example a regular periodic review of the formal specification. The support criteria relates to the level of commitment the organisation has taken to support the formal specification throughout its life: <br />
* does the organisation provide support until removal of the published formal specification from the public domain (including this process)? <br />
: Xiph.Org standards are never removed from the public domain. Xiph endeavors to provide support for as long as the standard remains in use.<br />
* does the organisation make the formal specification still available even when in non-maintenance mode?<br />
: Yes. All Xiph.Org standards are freely licensed and will always be available.<br />
* does the organisation add new features and keep the formal specification up-to-date?<br />
: Yes. Xiph maintains its ecosystem of standards on a continuous basis.<br />
* does the organisation rectify problems identified in initial implementations?<br />
: Yes. Xiph maintains [https://trac.xiph.org/report a problem reporting system] that is open to the public, and invites everyone to submit suggestions for improvements. Improvements are made both to the standards documents and to the reference implementations.<br />
* does the organisation only create the formal specification?<br />
: No. Xiph also produces high-quality reusable reference implementations of its standards, released under an open license.<br />
<br />
<br />
<strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong></div>
Silvia
https://wiki.xiph.org/index.php?title=IDABC_Questionnaire_2009&diff=10691
IDABC Questionnaire 2009
2009-11-15T23:20:53Z
<p>Silvia: ups, type</p>
<hr />
<div><strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong><br />
<br />
= Context =<br />
We received [http://lists.xiph.org/pipermail/theora/2009-November/002996.html an e-mail] from a consultant studying the suitability of Theora for use in "eGovernment", on behalf of the [http://ec.europa.eu/idabc/ IDABC], an EU governmental agency responsible for "Interoperability" with an emphasis on open source. The investigation is in the context of [http://ec.europa.eu/idabc/en/document/7728 European Interoperability Framework], about which there has been [http://www.computerworlduk.com/community/blogs/index.cfm?entryid=2620&blogid=14&pn=1 some real controversy].<br />
<br />
The method of assessment is the Common Assessment Method for Standards and Specifications, including the questions below.<br />
<br />
= CAMSS Questions =<br />
== Part 4: Market Criteria ==<br />
<br />
This group of Market criteria analyses the formal specification in the scope of its market environment, and more precisely it examines the implementations of the formal specification and the market players. This implies identifying to which extent the formal specification benefits from market support and wide adoption, what are its level of maturity and its capacity of reusability.<br />
<br />
Market support is evaluated through an analysis of how many products implementing the formal specification exist, what their market share is and who their end-users are. The quality and the completeness (in case of partitioning) of the implementations of the formal specification can also be analysed. Availability of existing or planned mechanisms to assess conformity of implementations to the standard or to the specification could also be identified. The existence of at least one reference implementation (i.e.: mentioning a recognized certification process) - and of which one is an open source implementation - can also be relevant to the assessment. Wide adoption can also be assessed across domains (i.e.: public and private sectors), in an open environment, and/or in a similar field (i.e.: best practices).<br />
<br />
A formal specification is mature if it has been in use and development for long enough that most of its initial problems have been overcome and its underlying technology is well understood and well defined. Maturity is also assessed by identifying if all aspects of the formal specification are considered as validated by usage (i.e.: if the formal specification is partitioned), and if the reported issues have been solved and documented.<br />
<br />
Reusability of a formal specification is enabled if it includes guidelines for its implementation in a given context. The identification of successful implementations of the standard or specification should focus on good practices in a similar field. Its incompatibility with related standards or specifications should also be taken into account.<br />
<br />
The ideas behind the Market Criteria can also be expressed in the form of the following questions:<br />
<br />
=== Market support ===<br />
* Does the standard have strong support in the marketplace? <br />
: Yes. For example, among web browsers, support for Xiph's Ogg, Theora, and Vorbis standards is now included by default in Mozilla Firefox, Google Chrome, and the latest versions of Opera, representing hundreds of millions of users in this market alone. Further, a QuickTime component exists that enables the use of Xiph's Ogg, Theora, and Vorbis standards in all Mac OS X applications built on the QuickTime framework, including Safari/WebKit, iMovie, QuickTime, and many others. On Windows, DirectShow filters likewise enable all applications that use the DirectShow framework to use Xiph's Ogg, Theora, and Vorbis standards.<br />
* What products exist for this formal specification? <br />
: Theora is a video codec, and as such the required products are encoders, decoders, and transmission systems. All three types of products are widely available for Theora.<br />
* How many implementations of the formal specification are there? <br />
: Xiph does not require implementors to acquire any license before implementing the specification. Therefore, we do not have a definitive count of the number of implementations. In addition to the reference implementation, which has been ported to most modern platforms and highly optimized for x86 and ARM CPUs and TI C64x+ DSPs, we are aware of a number of independent, conformant or mostly-conformant implementations. These include two C decoders (ffmpeg and QTheora), a Java decoder (Jheora), a C# decoder, an FPGA decoder, and an FPGA encoder.<br />
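: Because the bitstream formats are openly documented, even lightweight tools can identify the codecs in an Ogg file from magic numbers alone, without linking any decoder. A minimal sketch (an illustrative helper, assuming the header-packet magic values from the Ogg, Theora, Vorbis, and Skeleton specifications appear in the first few kilobytes of the stream):<br />

```python
def identify_ogg_codecs(data):
    """Guess which codecs an Ogg byte stream carries by scanning its
    first kilobytes for header-packet magic numbers (illustrative
    only; a real tool walks the Ogg pages defined in RFC 3533)."""
    if not data.startswith(b"OggS"):  # Ogg page capture pattern
        return []
    head = data[:8192]
    magics = [
        (b"\x80theora", "theora"),     # Theora identification header
        (b"\x01vorbis", "vorbis"),     # Vorbis identification header
        (b"fishead\x00", "skeleton"),  # Ogg Skeleton fishead packet
    ]
    return [name for magic, name in magics if magic in head]

print(identify_ogg_codecs(b"OggS" + b"\x00" * 60 + b"\x80theora"))
```

: This is the same magic-number approach used by tools like the *nix "file" command, and it is possible precisely because the header packets are fully specified.<br />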
* Are there products from different suppliers in the market that implement this formal specification? <br />
: Yes. Corporations such as Atari, Canonical, DailyMotion, Elphel, Fluendo, Google, Mozilla, Novell, Opera, Red Hat, Sun Microsystems, Ubisoft, and countless others have supplied products with an implementation of the Theora standard.<br />
* Are there many products readily available from a variety of suppliers? <br />
: Yes. Theora has been deployed in embedded devices, security cameras, video games, video conferencing systems, web browsers, home theater systems, and many other products. A complete, legal, open-source reference implementation can also be downloaded free of charge, including components for all major media frameworks (DirectShow, GStreamer, and QuickTime), giving the many applications built on these frameworks the ability to use the codec.<br />
* What is the market share of the products implementing the formal specification, versus other implementations of competing formal specifications? <br />
: Theora playback is extremely widely available, covering virtually the entire market of personal computers. Theora is also increasingly available in mobile and embedded devices. Since we do not require licensing for products that implement the specification, we do not have market share numbers that can be compared with competing formal specifications. Because implementations are readily available and free, Theora is included in many products that support multiple codecs, and is sometimes the only video codec included in free software products.<br />
* Who are the end-users of these products implementing the formal specification?<br />
: The end users are television viewers, video gamers, web surfers, movie makers, business people, video distribution services, and anyone else who interacts with moving pictures.<br />
<br />
=== Maturity ===<br />
* Are there any existing or planned mechanisms to assess conformity of the implementations of the formal specification? <br />
: Yes. In addition to a continuous peer review process, we maintain a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that allow implementors to assess decoder conformity. We also provide free online developer support and testing for those attempting to make a conforming implementation. An [http://validator.xiph.org/ online validation service] is available.<br />
* Is there a reference implementation (i.e.: mentioning a recognized certification process)? <br />
: Yes. Xiph maintains a reference implementation called [http://downloads.xiph.org/releases/theora/ libtheora]. In addition to serving as a reference, libtheora is also highly optimized to achieve the maximum possible speed, accuracy, reliability, efficiency, and video quality. As a result, many implementors of Theora adopt the reference implementation.<br />
* Is there an open source implementation? <br />
: Yes. libtheora is made available under a completely permissive BSD-like license. Its open-source nature also contributes to its quality as a reference implementation, as implementors are welcome to contribute their improvements to the reference. There are also several other open source implementations.<br />
* Does the formal specification show wide adoption? <br />
** across different domains? (I.e.: public and private) <br />
: Yes. In addition to the private companies mentioned in the previous section, Theora has also been specified as the sole format supported by non-profit institutions such as Wikipedia, currently the 6th largest website in the world, or as one of a small number of preferred formats supported by other public organizations, such as the Norwegian government.<br />
** in an open environment? <br />
: Yes. On open/free operating systems such as those distributed by Novell/SuSE, Canonical, and Red Hat, Theora is the primary default video codec.<br />
** in a similar field? (i.e.: can best practices be identified?) <br />
* Has the formal specification been in use and development long enough that most of its initial problems have been overcome? <br />
: Yes. Theora was derived from VP3, which was originally released in May 2000. The Theora specification was completed in 2004. Theora has now been used in a wide variety of applications, on the full spectrum of computing devices.<br />
* Is the underlying technology of the standard well-understood? (e.g., a reference model is well defined, appropriate concepts of the technology are in widespread use, the technology may have been in use for many years, a formal mathematical model is defined, etc.) <br />
: Yes. The underlying technology has been in use for nearly a decade, and most of the concepts have been in widespread use for even longer.<br />
* Is the formal specification based upon technology that has not been well-defined and may be relatively new? <br />
: No. The formal specification is based on technology from the On2 VP3 codec, which is substantially similar to simple block-transform codecs like H.261. This class of codecs is extremely well understood, and has been actively in use for over 20 years.<br />
* Has the formal specification been revised? (Yes/No, Nof) <br />
: Yes. The specification of the encoder is continuously revised based on user feedback to improve clarity and accuracy. The specification of the decoding part has been stable for years.<br />
* Is the formal specification under the auspices of an architectural board? (Yes/No) <br />
: No. Although officially maintained by the Xiph.Org Foundation, anyone is free to join this organization, and one need not even be a member to make contributions. However, the core developers will review contributions and make sure they do not contradict the general architecture and they work well with the existing code and the test cases.<br />
* Is the formal specification partitioned in its functionality? (Yes/No) <br />
: No. Theora is very deliberately not partitioned, to avoid the confusion created by a "standard" composed of many incompatible "profiles". The Theora standard does not have any optional components. A compliant Theora decoder can correctly process any Theora stream.<br />
** To what extent does each partition participate to its overall functionality? (NN%) <br />
: N/A.<br />
** To what extent is each partition implemented? (NN%) (cf market adoption)<br />
: N/A.<br />
<br />
=== Re-usability === <br />
* Does the formal specification provide guidelines for its implementation in a given organisation? <br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] provides "non-normative" advice and explanation for implementors of Theora decoders and encoders, including example algorithms for implementing required mathematical transforms. Xiph also maintains [http://wiki.xiph.org/Main_Page a documentation base] for implementors who desire more guidelines beyond the specification itself.<br />
* Can other cases where similar systems implement the formal specification be considered as successful implementations and good practices? <br />
: Xiph's standards have successfully been implemented by many organisations in a wide variety of environments. We maintain (non-exhaustive) [http://wiki.xiph.org/TheoraSoftwarePlayers lists] of products which implement Theora support, many of them open source, so that others may use them as a reference when preparing their own products. A particularly well known, independent, but interoperable implementation is provided by the FFmpeg open source project.<br />
* Is its compatibility with related formal specification documented?<br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] also documents the use of Theora within the [http://www.ietf.org/rfc/rfc3533.txt standard Ogg encapsulation format], and the [http://svn.xiph.org/trunk/theora/doc/draft-ietf-avt-rtp-theora-00.txt TheoraRTP draft specification] explains how to transmit Theora using the [http://tools.ietf.org/html/rfc3550 RTP standard]. In addition, the specification documents Theora's compatibility with ITU-R B.470, ITU-R B.601, ITU-R B.709, SMPTE-170M, [http://tools.ietf.org/html/rfc2044 UTF-8], ISO 10646, and [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.pdf Ogg Vorbis].<br />
<br />
== Part 5: Standardisation Criteria == <br />
From Idabc-camss<br />
<br />
Note: Throughout this section, “Organisation” refers to the standardisation/fora/consortia body in charge of the formal specification.<br />
<br />
Significant characteristics of the way the organisation operates are for example the way it gives the possibility to stakeholders to influence the evolution of the formal specification, or which conditions it attaches to the use of the formal specification or its implementation. Moreover, it is important to know how the formal specification is defined, supported, and made available, as well as how interaction with stakeholders is managed by the organisation during these steps. Governance of interoperability testing with other formal specifications is also indicative.<br />
<br />
The standardisation criteria therefore analyse the following elements:<br />
<br />
=== Availability of Documentation ===<br />
The availability of documentation criterion is linked to cost and online availability. Access to all preliminary results documentation can be online, online for members only, offline, offline for members only, or not available. Access can be free or for a fee (which fee?).<br />
: Every Xiph standard is permanently available online to everyone at no cost. For example, we invite everyone to download [http://theora.org/doc/Theora.pdf the most up-to-date copy of the Theora specification], and [http://xiph.org/vorbis/doc/Vorbis_I_spec.html the latest revision of Vorbis]. All previous revisions are available from Xiph's [http://svn.xiph.org/ revision control system].<br />
<br />
=== Intellectual Property Right ===<br />
The Intellectual Property Rights evaluation criteria relates to the ability for implementers to use the formal specification in products without legal or financial implications. The IPR policy of the organisation is therefore evaluated according to: <br />
* the availability of the IPR or copyright policies of the organisation (available on-line or off-line, or not available);<br />
: The reference implementations of each codec include all necessary IPR and copyright licenses for that codec, including all documentation, and are freely available to everyone.<br />
* the organisation’s governance to disclose any IPR from any contributor (ex-ante, online, offline, for free for all, for a fee for all, for members only, not available);<br />
: Xiph does not require the identification of specific patents that may be required to implement a standard; however, it does require an open-source-compatible, royalty-free license from a contributor for any such patents they may own before the corresponding technology can be included in a standard. These licenses are made available online, for free, to all parties.<br />
* the level of IPR set "mandatory" by the organisation (no patent, royalty free patent, patent and RAND with limited liability , patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none); <br />
: All standards published by the Xiph.Org Foundation are required to have "open-source compatible" IPR. This means that a standard must either be entirely clear of any known patents, or any patents that read upon the standard must be available under a transferable, irrevocable public nonassertion agreement to all people everywhere. For example, see [http://svn.xiph.org/trunk/theora/LICENSE our On2 patent nonassertion warrant]. Other common "royalty free" patent licenses are either not transferable, revocable under certain conditions (such as patent infringement litigation against the originating party), or otherwise impose restrictions that would prevent distribution under common [http://www.opensource.org/ OSI]-approved licenses. These would not be acceptable.<br />
* the level of IPR "recommended" by the organisation (no patent, royalty free patent, patent and RAND with limited liability, patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none). [Note: RAND (Reasonable and Non Discriminatory License) is based on a "fairness" concept. Companies agree that if they receive any patents on technologies that become essential to the standard then they agree to allow other groups attempting to implement the standard to use these patents and they agree that the charges for the patents shall be reasonable. "RAND with limited availability" is a version of RAND where the "reasonable charges" have an upper limit.]<br />
: Xiph's recommended IPR requirements are the same as our mandatory requirements.<br />
<br />
=== Accessibility ===<br />
<br />
The accessibility evaluation criteria describe the importance of equal and safe accessibility for the users of implementations of formal specifications. This aspect can be related to safety (physical safety and conformance safety) and accessibility for physically impaired people (design for all).<br />
<br />
Focus is placed particularly on accessibility and conformance safety. Conformance testing is testing to determine whether a system meets a specified formal specification. The result can be the output of a test suite. Conformance validation is when the conformance test uniquely qualifies a given implementation as conformant or not. Conformance certification is a process that provides a public and easily visible "stamp of approval" that an implementation of a standard validates as conformant.<br />
<br />
The following questions allow an assessment of accessibility and conformance safety: <br />
* Does a mechanism that ensures disability support by a formal specification exist? (Y/N) <br />
: Yes. Xiph ensures support for users with disabilities by providing specifications for accessible technologies independent of the codec itself. Notable Xiph specifications include [http://wiki.xiph.org/OggKate OggKate] and [http://wiki.xiph.org/index.php/CMML CMML], which provide subtitles for the hearing-impaired, as well as [http://wiki.xiph.org/Ogg_Skeleton Skeleton], which can specify scene description audio tracks for the visually impaired. When Theora is transmitted or stored in an Ogg container, it is automatically compatible with these accessibility measures.<br />
* Is conformance governance always part of a standard? (Y/N) <br />
: No. Xiph does not normally provide a formal conformance testing process as part of a standard.<br />
* Is a conformance test offered to implementers? (Y/N) <br />
: Yes. Xiph maintains a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that can be used by implementors to confirm basic conformance.<br />
* Is conformance validation available to implementers? (Y/N) <br />
: Yes. Informal conformance testing is available to implementors upon request, and Xiph has provided such testing for a number of implementations in the past.<br />
* Is conformance certification available? (Y/N) <br />
: Yes. Xiph does not require certification, but maintains the right to withhold the use of our trademarks from implementors that act in bad faith. Implementors may, however, request explicit permission to use our trademarks with a conforming implementation.<br />
* Is localisation of a formal specification possible? (Y/N)<br />
: Yes. We welcome anyone who wishes to translate Xiph specifications into other languages. We have no policy requiring that the normative specification be written in English.<br />
<br />
=== Interoperability governance === <br />
The interoperability governance evaluation criteria relates to how interoperability is identified and maintained between interoperable formal specifications. In order to do this, the organisation may provide governance for: <br />
* open identification in formal specifications, <br />
* open negotiation in formal specifications, <br />
* open selection in formal specifications. <br />
<br />
=== Meeting and consultation ===<br />
The meeting and consultation evaluation criteria relate to the process of defining a formal specification. As formal specifications are usually defined by committees, and these committees normally consist of members of the organisation, this criterion studies how one becomes a member and what the financial barriers to membership are, as well as how non-members are able to influence the process of defining the formal specification. It analyses: <br />
* if the organisation is open to all types of companies and organisations and to individuals; <br />
: Yes. Xiph welcomes representatives from all companies and organizations.<br />
* if the standardisation process may specifically allow participation of members with limited abilities when relevant; <br />
: Yes. Standardization occurs almost entirely in internet communications channels, allowing participants with disabilities to engage fully in the standards development process. We also encourage nonexperts and students to assist us as they can, and to learn about Xiph technologies by participating in the standards development process.<br />
* if meetings are open to all members;<br />
: Xiph meetings are open to everyone. We charge no fee for and place no restrictions on attendance or participation. For example, anyone interested in contributing to the Theora specification may join [http://lists.xiph.org/pipermail/theora-dev/ the Theora development mailing list].<br />
* if all can participate in the formal specification creation process; <br />
: Yes. All people are welcome to participate in the specification creation process. No dues or fees are required to participate.<br />
* if non-members can participate in the formal specification creation process.<br />
: Yes. Xiph does not maintain an explicit list of members, and no one is excluded from contributing to specifications as they are developed.<br />
<br />
=== Consensus ===<br />
Consensus is decision making primarily with regard to the approval of formal specifications and review with interest groups (non-members). The consensus evaluation criterion is evaluated with the following questions:<br />
* Does the organisation have a stated objective of reaching consensus when making decisions on standards? <br />
: There is no explicitly stated objective of reaching consensus.<br />
* If consensus is not reached, can the standard be approved? (answers are: cannot be approved but referred back to working group/committee, approved with 75% majority, approved with 66% majority, approved with 51% majority, can be decided by a "director" or similar in the organisation).<br />
: The standard can be approved without consensus via the decision of a "director" or similar.<br />
* Is there a formal process for external review of standard proposals by interest groups (nonmembers)?<br />
: Since anyone may participate in the development process and make proposals, there is no need for a separate formal process to include proposals by nonmembers.<br />
<br />
=== Due Process ===<br />
The due process evaluation criteria relate to the respect of each member's rights within the organisation. More specifically, it must be assured that if a member believes an error has been made in the process of defining a formal specification, it is possible to appeal to an independent, higher instance. The question is therefore: can a member formally appeal or raise objections to a procedure or to a technical specification to an independent, higher instance?<br />
<br />
: Yes. Even if a member fails an appeal within the organization, because all of the technology Xiph standardizes is open and freely implementable, they are always free to develop their own, competing version. Such competing versions may even still be eligible for standardization under the Xiph umbrella.<br />
<br />
=== Changes to the formal specification ===<br />
The suggested changes made to a formal specification need to be presented, evaluated and approved in the same way as the formal specification was first defined. This criterion therefore applies the above criteria to the changes made to the formal specification (availability of documentation, Intellectual Property Rights, accessibility, interoperability governance, meeting and consultation, consensus, due process).<br />
<br />
: The exact same process is used for revisions to the standard as was used for the original development of the standard, and thus the answers to all of the above questions remain the same.<br />
<br />
=== Support ===<br />
It is critical that the organisation takes responsibility for the formal specification throughout its life span. This can be done in several ways, such as a regular periodic review of the formal specification. The support criteria relates to the level of commitment the organisation has taken to support the formal specification throughout its life: <br />
* does the organisation provide support until removal of the published formal specification from the public domain (including this process)? <br />
: Xiph.Org standards are never removed from the public domain. Xiph endeavors to provide support for as long as the standard remains in use.<br />
* does the organisation make the formal specification still available even when in non-maintenance mode?<br />
: Yes. All Xiph.Org standards are freely licensed and will always be available.<br />
* does the organisation add new features and keep the formal specification up-to-date?<br />
: Yes. Xiph maintains its ecosystem of standards on a continuous basis.<br />
* does the organisation rectify problems identified in initial implementations?<br />
: Yes. Xiph maintains [https://trac.xiph.org/report a problem reporting system] that is open to the public, and invites everyone to submit suggestions for improvements. Improvements are made both to the standards documents and to the reference implementations.<br />
* does the organisation only create the formal specification?<br />
: No. Xiph also produces high-quality reusable reference implementations of its standards, released under an open license.<br />
<br />
<br />
<strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong></div>
Silvia
https://wiki.xiph.org/index.php?title=IDABC_Questionnaire_2009&diff=10690
IDABC Questionnaire 2009
2009-11-15T23:20:15Z
<p>Silvia: added ffmpeg independent implementation</p>
<hr />
<div><strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong><br />
<br />
= Context =<br />
We received [http://lists.xiph.org/pipermail/theora/2009-November/002996.html an e-mail] from a consultant studying the suitability of Theora for use in "eGovernment", on behalf of the [http://ec.europa.eu/idabc/ IDABC], an EU governmental agency responsible for "Interoperability" with an emphasis on open source. The investigation is in the context of [http://ec.europa.eu/idabc/en/document/7728 European Interoperability Framework], about which there has been [http://www.computerworlduk.com/community/blogs/index.cfm?entryid=2620&blogid=14&pn=1 some real controversy].<br />
<br />
The method of assessment is the Common Assessment Method for Standards and Specifications, including the questions below.<br />
<br />
= CAMSS Questions =<br />
== Part 4: Market Criteria ==<br />
<br />
This group of Market criteria analyses the formal specification in the scope of its market environment, and more precisely it examines the implementations of the formal specification and the market players. This implies identifying to what extent the formal specification benefits from market support and wide adoption, and assessing its level of maturity and its capacity for reusability.<br />
<br />
Market support is evaluated through an analysis of how many products implementing the formal specification exist, what their market share is and who their end-users are. The quality and the completeness (in case of partitioning) of the implementations of the formal specification can also be analysed. Availability of existing or planned mechanisms to assess conformity of implementations to the standard or to the specification could also be identified. The existence of at least one reference implementation (i.e.: mentioning a recognized certification process), and whether one of the implementations is open source, can also be relevant to the assessment. Wide adoption can also be assessed across domains (i.e.: public and private sectors), in an open environment, and/or in a similar field (i.e.: best practices).<br />
<br />
A formal specification is mature if it has been in use and development for long enough that most of its initial problems have been overcome and its underlying technology is well understood and well defined. Maturity is also assessed by identifying if all aspects of the formal specification are considered as validated by usage (i.e.: if the formal specification is partitioned), and if the reported issues have been solved and documented.<br />
<br />
Reusability of a formal specification is enabled if it includes guidelines for its implementation in a given context. The identification of successful implementations of the standard or specification should focus on good practices in a similar field. Its incompatibility with related standards or specifications should also be taken into account.<br />
<br />
The ideas behind the Market Criteria can also be expressed in the form of the following questions:<br />
<br />
=== Market support ===<br />
* Does the standard have strong support in the marketplace? <br />
: Yes. For example, among web browsers, support for Xiph's Ogg, Theora, and Vorbis standards is now included by default in Mozilla Firefox, Google Chrome, and the latest versions of Opera, representing hundreds of millions of installed users just in this market alone. Further, a QuickTime component exists which enables use of Xiph's Ogg, Theora, and Vorbis standards in all Mac OS X applications that make use of the QuickTime framework - which includes Safari/Webkit, iMovie, QuickTime, and many others. On Windows, DirectShow filters exist which also enable all Windows applications that use the DirectShow framework to use Xiph's Ogg, Theora, and Vorbis standards.<br />
* What products exist for this formal specification? <br />
: Theora is a video codec, and as such the required products are encoders, decoders, and transmission systems. All three types of products are widely available for Theora.<br />
* How many implementations of the formal specification are there? <br />
: Xiph does not require implementors to acquire any license before implementing the specification. Therefore, we do not have a definitive count of the number of implementations. In addition to the reference implementation, which has been ported to most modern platforms and highly optimized for x86 and ARM CPUs and TI C64x+ DSPs, we are aware of a number of independent, conformant or mostly-conformant implementations. These include two C decoders (FFmpeg and QTheora), a Java decoder (Jheora), a C# decoder, an FPGA decoder, and an FPGA encoder.<br />
* Are there products from different suppliers in the market that implement this formal specification? <br />
: Yes. Corporations such as Atari, Canonical, DailyMotion, Elphel, Fluendo, Google, Mozilla, Novell, Opera, Red Hat, Sun Microsystems, Ubisoft, and countless others have supplied products with an implementation of the Theora standard.<br />
* Are there many products readily available from a variety of suppliers? <br />
: Yes. Theora has been deployed in embedded devices, security cameras, video games, video conferencing systems, web browsers, home theater systems, and many other products. A complete, legal, open-source reference implementation can also be downloaded free of charge, including components for all major media frameworks (DirectShow, gstreamer, and Quicktime), giving the plethora of applications which use these frameworks the ability to use the codec.<br />
* What is the market share of the products implementing the formal specification, versus other implementations of competing formal specifications? <br />
: Theora playback is extremely widely available, covering virtually the entire market of personal computers. Theora is also increasingly available in mobile and embedded devices. Since we do not require licensing for products that implement the specification, we do not have market share numbers that can be compared with competing formal specifications. Because implementations are readily available and free, Theora is included in many products that support multiple codecs, and is sometimes the only video codec included in free software products.<br />
* Who are the end-users of these products implementing the formal specification?<br />
: The end users are television viewers, video gamers, web surfers, movie makers, business people, video distribution services, and anyone else who interacts with moving pictures.<br />
<br />
=== Maturity ===<br />
* Are there any existing or planned mechanisms to assess conformity of the implementations of the formal specification? <br />
: Yes. In addition to a continuous peer review process, we maintain a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that allow implementors to assess decoder conformity. We also provide free online developer support and testing for those attempting to make a conforming implementation. An [http://validator.xiph.org/ online validation service] is available.<br />
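To illustrate the kind of automated first-pass check such validation tools perform, here is a minimal sketch (in Python; an illustration written for this document, not one of Xiph's actual tools) that walks the pages of an Ogg stream and identifies codecs by the magic numbers of their beginning-of-stream packets. It deliberately skips CRC verification and the many deeper checks a real validator performs.

```python
# Minimal Ogg stream sniffer: identifies codecs by BOS-packet magic numbers.
# Illustrative only -- a real validator also checks CRCs, page sequencing, etc.

CODEC_MAGIC = {
    b"\x80theora": "Theora video",      # Theora identification header
    b"\x01vorbis": "Vorbis audio",      # Vorbis identification header
    b"fishead\x00": "Skeleton metadata",  # Ogg Skeleton fishead packet
}

def ogg_pages(data):
    """Yield (header_type, payload) for each Ogg page in `data`."""
    pos = 0
    while pos + 27 <= len(data):
        if data[pos:pos + 4] != b"OggS":
            raise ValueError("capture pattern lost at offset %d" % pos)
        header_type = data[pos + 5]          # bit 0x02 marks a BOS page
        nsegs = data[pos + 26]
        lacing = data[pos + 27:pos + 27 + nsegs]
        body_start = pos + 27 + nsegs
        body_len = sum(lacing)
        yield header_type, data[body_start:body_start + body_len]
        pos = body_start + body_len

def identify_streams(data):
    """Return a codec name for every logical stream's BOS page."""
    names = []
    for header_type, payload in ogg_pages(data):
        if header_type & 0x02:               # beginning-of-stream page
            for magic, name in CODEC_MAGIC.items():
                if payload.startswith(magic):
                    names.append(name)
                    break
            else:
                names.append("unknown codec")
    return names
```

The magic strings above are the ones defined by the Theora, Vorbis, and Skeleton specifications; a real tool would recognise many more codecs and validate the remainder of each page.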
* Is there a reference implementation (i.e.: mentioning a recognized certification process)? <br />
: Yes. Xiph maintains a reference implementation called [http://downloads.xiph.org/releases/theora/ libtheora]. In addition to serving as a reference, libtheora is also highly optimized to achieve the maximum possible speed, accuracy, reliability, efficiency, and video quality. As a result, many implementors of Theora adopt the reference implementation.<br />
* Is there an open source implementation? <br />
: Yes. libtheora is made available under a completely permissive BSD-like license. Its open-source nature also contributes to its quality as a reference implementation, as implementors are welcome to contribute their improvements to the reference. There are also several other open source implementations.<br />
* Does the formal specification show wide adoption? <br />
** across different domains? (I.e.: public and private) <br />
: Yes. In addition to the private companies mentioned in the previous section, Theora has also been specified as the sole format supported by non-profit institutions such as Wikipedia, currently the 6th largest website in the world, or as one of a small number of preferred formats supported by other public organizations, such as the Norwegian government.<br />
** in an open environment? <br />
: Yes. On open/free operating systems such as those distributed by Novell/SuSE, Canonical, and Red Hat, Theora is the primary default video codec.<br />
** in a similar field? (i.e.: can best practices be identified?) <br />
* Has the formal specification been in use and development long enough that most of its initial problems have been overcome? <br />
: Yes. Theora was derived from VP3, which was originally released in May 2000. The Theora specification was completed in 2004. Theora has now been used in a wide variety of applications, on the full spectrum of computing devices.<br />
* Is the underlying technology of the standard well-understood? (e.g., a reference model is well defined, appropriate concepts of the technology are in widespread use, the technology may have been in use for many years, a formal mathematical model is defined, etc.) <br />
: Yes. The underlying technology has been in use for nearly a decade, and most of the concepts have been in widespread use for even longer.<br />
* Is the formal specification based upon technology that has not been well-defined and may be relatively new? <br />
: No. The formal specification is based on technology from the On2 VP3 codec, which is substantially similar to simple block-transform codecs like H.261. This class of codecs is extremely well understood, and has been actively in use for over 20 years.<br />
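The core idea of this class of block-transform codecs can be shown in a few lines. The sketch below (Python; a deliberately simplified toy, not Theora's actual transform, which uses an integer approximation and codec-specific quantization matrices) transforms an 8x8 block with a separable DCT, coarsely quantizes the coefficients (the lossy step), and inverse-transforms.

```python
import math

N = 8  # block size used throughout this codec family

def dct_1d(v):
    """Orthonormal DCT-II of a length-8 sequence."""
    out = []
    for k in range(N):
        s = sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for n in range(N))
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(scale * s)
    return out

def idct_1d(c):
    """Inverse (DCT-III) of dct_1d."""
    out = []
    for n in range(N):
        s = math.sqrt(1.0 / N) * c[0]
        s += sum(math.sqrt(2.0 / N) * c[k] *
                 math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                 for k in range(1, N))
        out.append(s)
    return out

def transform_block(block, f):
    """Apply a 1-D transform separably: first to rows, then to columns."""
    rows = [f(row) for row in block]
    cols = [f([rows[i][j] for i in range(N)]) for j in range(N)]
    return [[cols[j][i] for j in range(N)] for i in range(N)]

def roundtrip(block, q=16):
    """Forward DCT, uniform quantization (the only lossy step), inverse DCT."""
    coeffs = transform_block(block, dct_1d)
    quantized = [[round(c / q) * q for c in row] for row in coeffs]
    return transform_block(quantized, idct_1d)
```

Without quantization the transform pair is an exact inverse; increasing `q` discards progressively more high-frequency detail, which (together with entropy coding of the quantized coefficients) is where essentially all of the compression in this codec family comes from.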
* Has the formal specification been revised? (Yes/No, number of revisions) <br />
: Yes. The specification of the encoder is continuously revised based on user feedback to improve clarity and accuracy. The specification of the decoding part has been stable for years.<br />
* Is the formal specification under the auspices of an architectural board? (Yes/No) <br />
: No. Although the specification is officially maintained by the Xiph.Org Foundation, anyone is free to join the organization, and one need not even be a member to make contributions. However, the core developers review contributions to make sure they do not contradict the general architecture and that they work well with the existing code and the test cases.<br />
* Is the formal specification partitioned in its functionality? (Yes/No) <br />
: No. Theora is very deliberately not partitioned, to avoid the confusion created by a "standard" composed of many incompatible "profiles". The Theora standard does not have any optional components. A compliant Theora decoder can correctly process any Theora stream.<br />
** To what extent does each partition participate to its overall functionality? (NN%) <br />
: N/A.<br />
** To what extent is each partition implemented? (NN%) (cf market adoption)<br />
: N/A.<br />
<br />
=== Re-usability === <br />
* Does the formal specification provide guidelines for its implementation in a given organisation? <br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] provides "non-normative" advice and explanation for implementors of Theora decoders and encoders, including example algorithms for implementing required mathematical transforms. Xiph also maintains [http://wiki.xiph.org/Main_Page a documentation base] for implementors who desire more guidelines beyond the specification itself.<br />
* Can other cases where similar systems implement the formal specification be considered as successful implementations and good practices? <br />
: Xiph's standards have successfully been implemented by many organisations in a wide variety of environments. We maintain (non-exhaustive) [http://wiki.xiph.org/TheoraSoftwarePlayers lists] of products which implement Theora support, many of them open source, so that others may use them as a reference when preparing their own products. A particularly well known, independent, but interoperable implementation is provided by the FFmpeg open source project.<br />
* Is its compatibility with related formal specification documented?<br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] also documents the use of Theora within the [http://www.ietf.org/rfc/rfc3533.txt standard Ogg encapsulation format], and the [http://svn.xiph.org/trunk/theora/doc/draft-ietf-avt-rtp-theora-00.txt TheoraRTP draft specification] explains how to transmit Theora using the [http://tools.ietf.org/html/rfc3550 RTP standard]. In addition, the specification documents Theora's compatibility with ITU-R B.470, ITU-R B.601, ITU-R B.709, SMPTE-170M, [http://tools.ietf.org/html/rfc2044 UTF-8], ISO 10646, and [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.pdf Ogg Vorbis].<br />
<br />
== Part 5: Standardisation Criteria == <br />
From Idabc-camss<br />
<br />
Note: Throughout this section, “Organisation” refers to the standardisation/fora/consortia body in charge of the formal specification.<br />
<br />
Significant characteristics of the way the organisation operates include, for example, how it enables stakeholders to influence the evolution of the formal specification, and which conditions it attaches to the use of the formal specification or its implementation. Moreover, it is important to know how the formal specification is defined, supported, and made available, as well as how interaction with stakeholders is managed by the organisation during these steps. Governance of interoperability testing with other formal specifications is also indicative.<br />
<br />
The standardisation criteria therefore analyse the following elements:<br />
<br />
=== Availability of Documentation ===<br />
The availability of documentation criterion is linked to cost and online availability. Access to all preliminary results documentation can be online, online for members only, offline, offline for members only, or not available. Access can be free or for a fee (which fee?).<br />
: Every Xiph standard is permanently available online to everyone at no cost. For example, we invite everyone to download [http://theora.org/doc/Theora.pdf the most up-to-date copy of the Theora specification], and [http://xiph.org/vorbis/doc/Vorbis_I_spec.html the latest revision of Vorbis]. All previous revisions are available from Xiph's [http://svn.xiph.org/ revision control system].<br />
<br />
=== Intellectual Property Right ===<br />
The Intellectual Property Rights evaluation criteria relates to the ability for implementers to use the formal specification in products without legal or financial implications. The IPR policy of the organisation is therefore evaluated according to: <br />
* the availability of the IPR or copyright policies of the organisation (available on-line or off-line, or not available);<br />
: The reference implementations of each codec include all necessary IPR and copyright licenses for that codec, including all documentation, and are freely available to everyone.<br />
* the organisation’s governance to disclose any IPR from any contributor (ex-ante, online, offline, for free for all, for a fee for all, for members only, not available);<br />
: Xiph does not require the identification of specific patents that may be required to implement a standard; however, it does require an open-source-compatible, royalty-free license from a contributor for any such patents they may own before the corresponding technology can be included in a standard. These licenses are made available online, for free, to all parties.<br />
* the level of IPR set "mandatory" by the organisation (no patent, royalty free patent, patent and RAND with limited liability , patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none); <br />
: All standards published by the Xiph.Org Foundation are required to have "open-source compatible" IPR. This means that a standard must either be entirely clear of any known patents, or any patents that read upon the standard must be available under a transferable, irrevocable public nonassertion agreement to all people everywhere. For example, see [http://svn.xiph.org/trunk/theora/LICENSE our On2 patent nonassertion warrant]. Other common "royalty free" patent licenses are either not transferable, revocable under certain conditions (such as patent infringement litigation against the originating party), or otherwise impose restrictions that would prevent distribution under common [http://www.opensource.org/ OSI]-approved licenses. These would not be acceptable.<br />
* the level of IPR "recommended" by the organisation (no patent, royalty free patent, patent and RAND with limited liability, patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none). [Note: RAND (Reasonable and Non Discriminatory License) is based on a "fairness" concept. Companies agree that if they receive any patents on technologies that become essential to the standard then they agree to allow other groups attempting to implement the standard to use these patents and they agree that the charges for the patents shall be reasonable. "RAND with limited availability" is a version of RAND where the "reasonable charges" have an upper limit.]<br />
: Xiph's recommended IPR requirements are the same as our mandatory requirements.<br />
<br />
=== Accessibility ===<br />
<br />
The accessibility evaluation criteria describe the importance of equal and safe accessibility for the users of implementations of formal specifications. This aspect can be related to safety (physical safety and conformance safety) and accessibility for physically impaired people (design for all).<br />
<br />
Focus is placed particularly on accessibility and conformance safety. Conformance testing is testing to determine whether a system meets a specified formal specification. The result can be the output of a test suite. Conformance validation is when the conformance test uniquely qualifies a given implementation as conformant or not. Conformance certification is a process that provides a public and easily visible "stamp of approval" that an implementation of a standard validates as conformant.<br />
<br />
The following questions allow an assessment of accessibility and conformance safety: <br />
* Does a mechanism that ensures disability support by a formal specification exist? (Y/N) <br />
: Yes. Xiph ensures support for users with disabilities by providing specifications for accessible technologies independent of the codec itself. Notable Xiph specifications include [http://wiki.xiph.org/OggKate OggKate] and [http://wiki.xiph.org/index.php/CMML CMML], which provide subtitles for the hearing-impaired, as well as [http://wiki.xiph.org/Ogg_Skeleton Skeleton], which can specify scene description audio tracks for the visually impaired. When Theora is transmitted or stored in an Ogg container, it is automatically compatible with these accessibility measures.<br />
* Is conformance governance always part of a standard? (Y/N) <br />
: No. Xiph does not normally provide a formal conformance testing process as part of a standard.<br />
* Is a conformance test offered to implementers? (Y/N) <br />
: Yes. Xiph maintains a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that can be used by implementors to confirm basic conformance.<br />
* Is conformance validation available to implementers? (Y/N) <br />
: Yes. Informal conformance testing is available to implementors upon request, and Xiph has provided such testing for a number of implementations in the past.<br />
* Is conformance certification available? (Y/N) <br />
: Yes. Xiph does not require certification, but maintains the right to withhold the use of our trademarks from implementors that act in bad faith. Implementors may, however, request explicit permission to use our trademarks with a conforming implementation.<br />
* Is localisation of a formal specification possible? (Y/N)<br />
: Yes. We welcome anyone who wishes to translate Xiph specifications into other languages. We have no policy requiring that the normative specification be written in English.<br />
<br />
=== Interoperability governance === <br />
The interoperability governance evaluation criteria relate to how interoperability is identified and maintained between interoperable formal specifications. To do this, the organisation may provide governance for: <br />
* open identification in formal specifications, <br />
* open negotiation in formal specifications, <br />
* open selection in formal specifications. <br />
<br />
=== Meeting and consultation ===<br />
The meeting and consultation evaluation criteria relate to the process of defining a formal specification. As formal specifications are usually defined by committees, and these committees normally consist of members of the organisation, these criteria examine how to become a member and what the financial barriers to membership are, as well as how non-members are able to influence the process of defining the formal specification. They analyse: <br />
* if the organisation is open to all types of companies and organisations and to individuals; <br />
: Yes. Xiph welcomes representatives from all companies and organizations.<br />
* if the standardisation process may specifically allow participation of members with limited abilities when relevant; <br />
: Yes. Standardization occurs almost entirely in internet communications channels, allowing participants with disabilities to engage fully in the standards development process. We also encourage nonexperts and students to assist us as they can, and to learn about Xiph technologies by participating in the standards development process.<br />
* if meetings are open to all members;<br />
: Xiph meetings are open to everyone. We charge no fee for and place no restrictions on attendance or participation. For example, anyone interested in contributing to the Theora specification may join [http://lists.xiph.org/pipermail/theora-dev/ the Theora development mailing list].<br />
* if all can participate in the formal specification creation process; <br />
: Yes. All people are welcome to participate in the specification creation process. No dues or fees are required to participate.<br />
* if non-members can participate in the formal specification creation process.<br />
: Yes. Xiph does not maintain an explicit list of members, and no one is excluded from contributing to specifications as they are developed.<br />
<br />
=== Consensus ===<br />
Consensus is decision making primarily with regard to the approval of formal specifications and review with interest groups (non-members). The consensus evaluation criterion is evaluated with the following questions:<br />
* Does the organisation have a stated objective of reaching consensus when making decisions on standards? <br />
: There is no explicitly stated objective of reaching consensus.<br />
* If consensus is not reached, can the standard be approved? (answers are: cannot be approved but referred back to working group/committee, approved with 75% majority, approved with 66% majority, approved with 51% majority, can be decided by a "director" or similar in the organisation).<br />
: The standard can be approved without consensus via the decision of a "director" or similar.<br />
* Is there a formal process for external review of standard proposals by interest groups (nonmembers)?<br />
: Since anyone may participate in the development process and make proposals, there is no need for a separate formal process to include proposals by nonmembers.<br />
<br />
=== Due Process ===<br />
The due process evaluation criteria relate to the respect of each member's rights within the organisation. More specifically, if a member believes an error has been made in the process of defining a formal specification, it must be possible to appeal this to an independent, higher instance. The question is therefore: can a member formally appeal or raise objections to a procedure or to a technical specification before an independent, higher instance?<br />
<br />
: Yes. Even if a member fails an appeal within the organization, because all of the technology Xiph standardizes is open and freely implementable, they are always free to develop their own, competing version. Such competing versions may even still be eligible for standardization under the Xiph umbrella.<br />
<br />
=== Changes to the formal specification ===<br />
The suggested changes made to a formal specification need to be presented, evaluated and approved in the same way as the formal specification was first defined. This criterion therefore applies the above criteria to the changes made to the formal specification (availability of documentation, Intellectual Property Rights, accessibility, interoperability governance, meeting and consultation, consensus, due process).<br />
<br />
: The exact same process is used for revisions to the standard as was used for the original development of the standard, and thus the answers to all of the above questions remain the same.<br />
<br />
=== Support ===<br />
It is critical that the organisation takes responsibility for the formal specification throughout its life span. This can be done in several ways such as for example a regular periodic review of the formal specification. The support criteria relates to the level of commitment the organisation has taken to support the formal specification throughout its life: <br />
* does the organisation provide support until removal of the published formal specification from the public domain (including this removal process)? <br />
: Xiph.Org standards are never removed from the public domain. Xiph endeavors to provide support for as long as the standard remains in use.<br />
* does the organisation make the formal specification still available even when in non-maintenance mode?<br />
: Yes. All Xiph.Org standards are freely licensed and will always be available.<br />
* does the organisation add new features and keep the formal specification up-to-date?<br />
: Yes. Xiph maintains its ecosystem of standards on a continuous basis.<br />
* does the organisation rectify problems identified in initial implementations?<br />
: Yes. Xiph maintains [https://trac.xiph.org/report a problem reporting system] that is open to the public, and invites everyone to submit suggestions for improvements. Improvements are made both to the standards documents and to the reference implementations.<br />
* does the organisation only create the formal specification?<br />
: No. Xiph also produces high-quality reusable reference implementations of its standards, released under an open license.<br />
<br />
<br />
<strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong></div>
Silvia
https://wiki.xiph.org/index.php?title=IDABC_Questionnaire_2009&diff=10689
IDABC Questionnaire 2009
2009-11-15T23:17:41Z
<p>Silvia: added validation service, architecture consistency, and decoder stability</p>
<hr />
<div><strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong><br />
<br />
= Context =<br />
We received [http://lists.xiph.org/pipermail/theora/2009-November/002996.html an e-mail] from a consultant studying the suitability of Theora for use in "eGovernment", on behalf of the [http://ec.europa.eu/idabc/ IDABC], an EU governmental agency responsible for "Interoperability" with an emphasis on open source. The investigation is in the context of [http://ec.europa.eu/idabc/en/document/7728 European Interoperability Framework], about which there has been [http://www.computerworlduk.com/community/blogs/index.cfm?entryid=2620&blogid=14&pn=1 some real controversy].<br />
<br />
The method of assessment is the Common Assessment Method for Standards and Specifications, including the questions below.<br />
<br />
= CAMSS Questions =<br />
== Part 4: Market Criteria ==<br />
<br />
This group of Market criteria analyses the formal specification in the scope of its market environment, and more precisely it examines the implementations of the formal specification and the market players. This implies identifying to what extent the formal specification benefits from market support and wide adoption, what its level of maturity is, and what its capacity for reuse is.<br />
<br />
Market support is evaluated through an analysis of how many products implementing the formal specification exist, what their market share is and who their end-users are. The quality and the completeness (in case of partitioning) of the implementations of the formal specification can also be analysed. Availability of existing or planned mechanisms to assess conformity of implementations to the standard or to the specification could also be identified. The existence of at least one reference implementation (i.e.: mentioning a recognized certification process) - and of which one is an open source implementation - can also be relevant to the assessment. Wide adoption can also be assessed across domains (i.e.: public and private sectors), in an open environment, and/or in a similar field (i.e.: best practices).<br />
<br />
A formal specification is mature if it has been in use and development for long enough that most of its initial problems have been overcome and its underlying technology is well understood and well defined. Maturity is also assessed by identifying if all aspects of the formal specification are considered as validated by usage (i.e.: if the formal specification is partitioned), and if the reported issues have been solved and documented.<br />
<br />
Reusability of a formal specification is enabled if it includes guidelines for its implementation in a given context. The identification of successful implementations of the standard or specification should focus on good practices in a similar field. Its incompatibility with related standards or specifications should also be taken into account.<br />
<br />
The ideas behind the Market Criteria can also be expressed in the form of the following questions:<br />
<br />
=== Market support ===<br />
* Does the standard have strong support in the marketplace? <br />
: Yes. For example, among web browsers, support for Xiph's Ogg, Theora, and Vorbis standards is now included by default in Mozilla Firefox, Google Chrome, and the latest versions of Opera, representing hundreds of millions of users in this market alone. Further, a QuickTime component exists which enables use of Xiph's Ogg, Theora, and Vorbis standards in all Mac OS X applications that make use of the QuickTime framework - which includes Safari/Webkit, iMovie, QuickTime, and many others. On Windows, DirectShow filters exist which also enable all Windows applications that use the DirectShow framework to use Xiph's Ogg, Theora, and Vorbis standards.<br />
* What products exist for this formal specification ? <br />
: Theora is a video codec, and as such the required products are encoders, decoders, and transmission systems. All three types of products are widely available for Theora.<br />
* How many implementations of the formal specification are there? <br />
: Xiph does not require implementors to acquire any license before implementing the specification. Therefore, we do not have a definitive count of the number of implementations. In addition to the reference implementation, which has been ported to most modern platforms and highly optimized for x86 and ARM CPUs and TI C64x+ DSPs, we are aware of a number of independent, conformant or mostly-conformant implementations. These include two C decoders (ffmpeg and QTheora), a Java decoder (Jheora), a C# decoder, an FPGA decoder, and an FPGA encoder.<br />
* Are there products from different suppliers in the market that implement this formal specification ? <br />
: Yes. Corporations such as Atari, Canonical, DailyMotion, Elphel, Fluendo, Google, Mozilla, Novell, Opera, Red Hat, Sun Microsystems, Ubisoft, and countless others have supplied products with an implementation of the Theora standard.<br />
* Are there many products readily available from a variety of suppliers? <br />
: Yes. Theora has been deployed in embedded devices, security cameras, video games, video conferencing systems, web browsers, home theater systems, and many other products. A complete, legal, open-source reference implementation can also be downloaded free of charge, including components for all major media frameworks (DirectShow, gstreamer, and Quicktime), giving the plethora of applications which use these frameworks the ability to use the codec.<br />
* What is the market share of the products implementing the formal specification, versus other implementations of competing formal specifications ? <br />
: Theora playback is extremely widely available, covering virtually the entire market of personal computers. Theora is also increasingly available in mobile and embedded devices. Since we do not require licensing for products that implement the specification, we do not have market share numbers that can be compared with competing formal specifications. Because implementations are readily available and free, Theora is included in many products that support multiple codecs, and is sometimes the only video codec included in free software products.<br />
* Who are the end-users of these products implementing the formal specification?<br />
: The end users are television viewers, video gamers, web surfers, movie makers, business people, video distribution services, and anyone else who interacts with moving pictures.<br />
<br />
=== Maturity ===<br />
* Are there any existing or planned mechanisms to assess conformity of the implementations of the formal specification? <br />
: Yes. In addition to a continuous peer review process, we maintain a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that allow implementors to assess decoder conformity. We also provide free online developer support and testing for those attempting to make a conforming implementation. An [http://validator.xiph.org/ online validation service] is available.<br />
* Is there a reference implementation (i.e.: mentioning a recognized certification process)? <br />
: Yes. Xiph maintains a reference implementation called [http://downloads.xiph.org/releases/theora/ libtheora]. In addition to serving as a reference, libtheora is also highly optimized to achieve the maximum possible speed, accuracy, reliability, efficiency, and video quality. As a result, many implementors of Theora adopt the reference implementation.<br />
* Is there an open source implementation? <br />
: Yes. libtheora is made available under a completely permissive BSD-like license. Its open-source nature also contributes to its quality as a reference implementation, as implementors are welcome to contribute their improvements to the reference. There are also several other open source implementations.<br />
* Does the formal specification show wide adoption? <br />
** across different domains? (I.e.: public and private) <br />
: Yes. In addition to the private companies mentioned in the previous section, Theora has also been specified as the sole format supported by non-profit institutions such as Wikipedia, currently the 6th largest website in the world, or as one of a small number of preferred formats supported by other public organizations, such as the Norwegian government.<br />
** in an open environment? <br />
: Yes. On open/free operating systems such as those distributed by Novell/SuSE, Canonical, and Red Hat, Theora is the primary default video codec.<br />
** in a similar field? (i.e.: can best practices be identified?) <br />
* Has the formal specification been in use and development long enough that most of its initial problems have been overcome? <br />
: Yes. Theora was derived from VP3, which was originally released in May 2000. The Theora specification was completed in 2004. Theora has now been used in a wide variety of applications, on the full spectrum of computing devices.<br />
* Is the underlying technology of the standard well-understood? (e.g., a reference model is well defined, appropriate concepts of the technology are in widespread use, the technology may have been in use for many years, a formal mathematical model is defined, etc.) <br />
: Yes. The underlying technology has been in use for nearly a decade, and most of the concepts have been in widespread use for even longer.<br />
* Is the formal specification based upon technology that has not been well-defined and may be relatively new? <br />
: No. The formal specification is based on technology from the On2 VP3 codec, which is substantially similar to simple block-transform codecs like H.261. This class of codecs is extremely well understood, and has been actively in use for over 20 years.<br />
* Has the formal specification been revised? (Yes/No, Nof) <br />
: Yes. The specification of the encoder is continuously revised based on user feedback to improve clarity and accuracy. The specification of the decoding part has been stable for years.<br />
* Is the formal specification under the auspices of an architectural board? (Yes/No) <br />
: No. Although officially maintained by the Xiph.Org Foundation, anyone is free to join this organization, and one need not even be a member to make contributions. However, the core developers review contributions to make sure they do not contradict the general architecture and that they work well with the existing code and the test cases.<br />
* Is the formal specification partitioned in its functionality? (Yes/No) <br />
: No. Theora is very deliberately not partitioned, to avoid the confusion created by a "standard" composed of many incompatible "profiles". The Theora standard does not have any optional components. A compliant Theora decoder can correctly process any Theora stream.<br />
** To what extent does each partition participate to its overall functionality? (NN%) <br />
: N/A.<br />
** To what extent is each partition implemented? (NN%) (cf market adoption)<br />
: N/A.<br />
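The conformance assessment described above, decoding each test vector and checking the result against a known-good reference, can be sketched in a few lines of Python. The manifest layout and the use of MD5 digests here are illustrative assumptions for this sketch, not the actual format of the Xiph test suite:

```python
import hashlib

def check_vectors(manifest, decode):
    """Compare a decoder's raw output against reference digests.

    manifest: maps vector name -> expected hex MD5 of the decoded output
    decode:   callable taking a vector name and returning the decoder's
              raw output bytes (e.g. concatenated YUV frames)
    Returns a dict mapping each vector name to True (match) or False.
    """
    results = {}
    for name, expected in manifest.items():
        actual = hashlib.md5(decode(name)).hexdigest()
        results[name] = (actual == expected)
    return results
```

A per-vector verdict like this is more useful than a single pass/fail result, since each mismatching vector isolates the specific stream feature an implementation handles incorrectly.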
<br />
=== Re-usability === <br />
* Does the formal specification provide guidelines for its implementation in a given organisation? <br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] provides "non-normative" advice and explanation for implementors of Theora decoders and encoders, including example algorithms for implementing required mathematical transforms. Xiph also maintains [http://wiki.xiph.org/Main_Page a documentation base] for implementors who desire more guidelines beyond the specification itself.<br />
* Can other cases where similar systems implement the formal specification be considered as successful implementations and good practices? <br />
: Xiph's standards have successfully been implemented by many organisations in a wide variety of environments. We maintain (non-exhaustive) [http://wiki.xiph.org/TheoraSoftwarePlayers lists] of products which implement Theora support, many of them open source, so that others may use them as a reference when preparing their own products.<br />
* Is its compatibility with related formal specification documented?<br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] also documents the use of Theora within the [http://www.ietf.org/rfc/rfc3533.txt standard Ogg encapsulation format], and the [http://svn.xiph.org/trunk/theora/doc/draft-ietf-avt-rtp-theora-00.txt TheoraRTP draft specification] explains how to transmit Theora using the [http://tools.ietf.org/html/rfc3550 RTP standard]. In addition, the specification documents Theora's compatibility with ITU-R B.470, ITU-R B.601, ITU-R B.709, SMPTE-170M, [http://tools.ietf.org/html/rfc2044 UTF-8], ISO 10646, and [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.pdf Ogg Vorbis].<br />
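One practical consequence of the documented Ogg compatibility is that a Theora track can be identified without any codec library: the first page of each logical stream (the beginning-of-stream, or BOS, page) carries a codec-specific magic signature in its first packet. The following Python sketch illustrates the idea; it scans for page capture patterns and skips CRC validation and continued-page handling, so it is illustrative rather than a robust Ogg demuxer:

```python
# Identify codecs in an Ogg byte stream by the magic bytes of each
# beginning-of-stream (BOS) page's first packet.
MAGIC = {
    b"\x80theora":  "theora",
    b"\x01vorbis":  "vorbis",
    b"\x7fFLAC":    "flac",
    b"fishead\x00": "skeleton",
}

def identify_bos_codecs(data):
    codecs = []
    pos = 0
    while True:
        pos = data.find(b"OggS", pos)   # page capture pattern
        if pos < 0:
            break
        header_type = data[pos + 5]     # flags byte of the page header
        nsegs = data[pos + 26]          # number of segment-table entries
        body = pos + 27 + nsegs         # packet data starts after segment table
        if header_type & 0x02:          # BOS flag: first page of a logical stream
            prefix = data[body:body + 8]
            codecs.append(next((name for magic, name in MAGIC.items()
                                if prefix.startswith(magic)), "unknown"))
        pos += 27 + nsegs               # continue scanning past this header
    return codecs
```

This is essentially what tools like the *nix "file" command do with magic numbers, and it is what allows servers and players to negotiate content without shipping every decoder.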
<br />
== Part 5: Standardisation Criteria == <br />
From Idabc-camss<br />
<br />
Note: Throughout this section, “Organisation” refers to the standardisation/fora/consortia body in charge of the formal specification.<br />
<br />
Significant characteristics of the way the organisation operates are for example the way it gives the possibility to stakeholders to influence the evolution of the formal specification, or which conditions it attaches to the use of the formal specification or its implementation. Moreover, it is important to know how the formal specification is defined, supported, and made available, as well as how interaction with stakeholders is managed by the organisation during these steps. Governance of interoperability testing with other formal specifications is also indicative.<br />
<br />
The standardisation criteria therefore analyse the following elements:<br />
<br />
=== Availability of Documentation ===<br />
The availability of documentation criteria are linked to cost and online availability. Access to all preliminary results documentation can be online, online for members only, offline, offline for members only, or not available. Access can be free or for a fee (which fee?).<br />
: Every Xiph standard is permanently available online to everyone at no cost. For example, we invite everyone to download [http://theora.org/doc/Theora.pdf the most up-to-date copy of the Theora specification], and [http://xiph.org/vorbis/doc/Vorbis_I_spec.html the latest revision of Vorbis]. All previous revisions are available from Xiph's [http://svn.xiph.org/ revision control system].<br />
<br />
=== Intellectual Property Right ===<br />
The Intellectual Property Rights evaluation criteria relates to the ability for implementers to use the formal specification in products without legal or financial implications. The IPR policy of the organisation is therefore evaluated according to: <br />
* the availability of the IPR or copyright policies of the organisation (available on-line or off-line, or not available);<br />
: The reference implementations of each codec include all necessary IPR and copyright licenses for that codec, including all documentation, and are freely available to everyone.<br />
* the organisation’s governance to disclose any IPR from any contributor (ex-ante, online, offline, for free for all, for a fee for all, for members only, not available);<br />
: Xiph does not require the identification of specific patents that may be required to implement a standard; however, it does require an open-source compatible, royalty-free license from a contributor for any such patents they may own before the corresponding technology can be included in a standard. These licenses are made available online, for free, to all parties.<br />
* the level of IPR set "mandatory" by the organisation (no patent, royalty free patent, patent and RAND with limited liability, patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none); <br />
: All standards published by the Xiph.Org Foundation are required to have "open-source compatible" IPR. This means that a standard must either be entirely clear of any known patents, or any patents that read upon the standard must be available under a transferable, irrevocable public nonassertion agreement to all people everywhere. For example, see [http://svn.xiph.org/trunk/theora/LICENSE our On2 patent nonassertion warrant]. Other common "royalty free" patent licenses are either not transferable, revocable under certain conditions (such as patent infringement litigation against the originating party), or otherwise impose restrictions that would prevent distribution under common [http://www.opensource.org/ OSI]-approved licenses. These would not be acceptable.<br />
* the level of IPR "recommended" by the organisation (no patent, royalty free patent, patent and RAND with limited liability, patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none). [Note: RAND (Reasonable and Non Discriminatory License) is based on a "fairness" concept. Companies agree that if they receive any patents on technologies that become essential to the standard then they agree to allow other groups attempting to implement the standard to use these patents and they agree that the charges for the patents shall be reasonable. "RAND with limited availability" is a version of RAND where the "reasonable charges" have an upper limit.]<br />
: Xiph's recommended IPR requirements are the same as our mandatory requirements.<br />
<br />
=== Accessibility ===<br />
<br />
The accessibility evaluation criteria describe the importance of equal and safe accessibility for the users of implementations of formal specifications. This aspect can relate to safety (physical safety and conformance safety) and to accessibility for physically impaired people (design for all).<br />
<br />
The focus is particularly on accessibility and conformance safety. Conformance testing determines whether a system meets a specified formal specification, for example by running a test suite. Conformance validation is when the conformance test uniquely qualifies a given implementation as conformant or not. Conformance certification is a process that provides a public and easily visible "stamp of approval" that an implementation of a standard has been validated as conformant.<br />
<br />
The following questions allow an assessment of accessibility and conformance safety: <br />
* Does a mechanism that ensures disability support by a formal specification exist? (Y/N) <br />
: Yes. Xiph ensures support for users with disabilities by providing specifications for accessible technologies independent of the codec itself. Notable Xiph specifications include [http://wiki.xiph.org/OggKate OggKate] and [http://wiki.xiph.org/index.php/CMML CMML], which provide subtitles for the hearing-impaired, as well as [http://wiki.xiph.org/Ogg_Skeleton Skeleton], which can specify scene description audio tracks for the visually impaired. When Theora is transmitted or stored in an Ogg container, it is automatically compatible with these accessibility measures.<br />
* Is conformance governance always part of a standard? (Y/N) <br />
: No. Xiph does not normally provide a formal conformance testing process as part of a standard.<br />
* Is a conformance test offered to implementers? (Y/N) <br />
: Yes. Xiph maintains a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that can be used by implementors to confirm basic conformance.<br />
* Is conformance validation available to implementers? (Y/N) <br />
: Yes. Informal conformance testing is available to implementors upon request, and Xiph has provided such testing for a number of implementations in the past.<br />
* Is conformance certification available? (Y/N) <br />
: Yes. Xiph does not require certification, but maintains the right to withhold the use of our trademarks from implementors that act in bad faith. Implementors may, however, request explicit permission to use our trademarks with a conforming implementation.<br />
* Is localisation of a formal specification possible? (Y/N)<br />
: Yes. We welcome anyone who wishes to translate Xiph specifications into other languages. We have no policy requiring that the normative specification be written in English.<br />
<br />
=== Interoperability governance === <br />
The interoperability governance evaluation criteria relate to how interoperability is identified and maintained between interoperable formal specifications. To do this, the organisation may provide governance for: <br />
* open identification in formal specifications, <br />
* open negotiation in formal specifications, <br />
* open selection in formal specifications. <br />
<br />
=== Meeting and consultation ===<br />
The meeting and consultation evaluation criteria relate to the process of defining a formal specification. As formal specifications are usually defined by committees, and these committees normally consist of members of the organisation, these criteria examine how to become a member and what the financial barriers to membership are, as well as how non-members are able to influence the process of defining the formal specification. They analyse: <br />
* if the organisation is open to all types of companies and organisations and to individuals; <br />
: Yes. Xiph welcomes representatives from all companies and organizations.<br />
* if the standardisation process may specifically allow participation of members with limited abilities when relevant; <br />
: Yes. Standardization occurs almost entirely in internet communications channels, allowing participants with disabilities to engage fully in the standards development process. We also encourage nonexperts and students to assist us as they can, and to learn about Xiph technologies by participating in the standards development process.<br />
* if meetings are open to all members;<br />
: Xiph meetings are open to everyone. We charge no fee for and place no restrictions on attendance or participation. For example, anyone interested in contributing to the Theora specification may join [http://lists.xiph.org/pipermail/theora-dev/ the Theora development mailing list].<br />
* if all can participate in the formal specification creation process; <br />
: Yes. All people are welcome to participate in the specification creation process. No dues or fees are required to participate.<br />
* if non-members can participate in the formal specification creation process.<br />
: Yes. Xiph does not maintain an explicit list of members, and no one is excluded from contributing to specifications as they are developed.<br />
<br />
=== Consensus ===<br />
Consensus is decision making primarily with regard to the approval of formal specifications and review with interest groups (non-members). The consensus evaluation criterion is evaluated with the following questions:<br />
* Does the organisation have a stated objective of reaching consensus when making decisions on standards? <br />
: There is no explicitly stated objective of reaching consensus.<br />
* If consensus is not reached, can the standard be approved? (answers are: cannot be approved but referred back to working group/committee, approved with 75% majority, approved with 66% majority, approved with 51% majority, can be decided by a "director" or similar in the organisation).<br />
: The standard can be approved without consensus via the decision of a "director" or similar.<br />
* Is there a formal process for external review of standard proposals by interest groups (nonmembers)?<br />
: Since anyone may participate in the development process and make proposals, there is no need for a separate formal process to include proposals by nonmembers.<br />
<br />
=== Due Process ===<br />
The due process evaluation criterion relates to the level of respect for the rights of each member of the organisation. More specifically, if a member believes an error has been made in the process of defining a formal specification, it must be possible to appeal to an independent, higher instance. The question is therefore: can a member formally appeal or raise objections to a procedure or to a technical specification before an independent, higher instance?<br />
<br />
: Yes. Even if a member's appeal within the organization fails, because all of the technology Xiph standardizes is open and freely implementable, the member is always free to develop a competing version. Such competing versions may even remain eligible for standardization under the Xiph umbrella.<br />
<br />
=== Changes to the formal specification ===<br />
Suggested changes to a formal specification need to be presented, evaluated, and approved in the same way as the formal specification was first defined. This criterion therefore applies the above criteria to changes made to the formal specification (availability of documentation, Intellectual Property Rights, accessibility, interoperability governance, meeting and consultation, consensus, due process).<br />
<br />
: The exact same process is used for revisions to the standard as was used for the original development of the standard, and thus the answers to all of the above questions remain the same.<br />
<br />
=== Support ===<br />
It is critical that the organisation takes responsibility for the formal specification throughout its life span. This can be done in several ways, for example through regular periodic review of the formal specification. The support criterion relates to the level of commitment the organisation has made to support the formal specification throughout its life: <br />
* does the organisation provide support until removal of the published formal specification from the public domain (including this process)? <br />
: Xiph.Org standards are never removed from the public domain. Xiph endeavors to provide support for as long as the standard remains in use.<br />
* does the organisation make the formal specification still available even when in non-maintenance mode?<br />
: Yes. All Xiph.Org standards are freely licensed and will always be available.<br />
* does the organisation add new features and keep the formal specification up-to-date?<br />
: Yes. Xiph maintains its ecosystem of standards on a continuous basis.<br />
* does the organisation rectify problems identified in initial implementations?<br />
: Yes. Xiph maintains [https://trac.xiph.org/report a problem reporting system] that is open to the public, and invites everyone to submit suggestions for improvements. Improvements are made both to the standards documents and to the reference implementations.<br />
* does the organisation only create the formal specification?<br />
: No. Xiph also produces high-quality reusable reference implementations of its standards, released under an open license.<br />
<br />
<br />
<strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong></div>
Silvia
https://wiki.xiph.org/index.php?title=IDABC_Questionnaire_2009&diff=10688
IDABC Questionnaire 2009
2009-11-15T23:07:46Z
<p>Silvia: /* Market support */</p>
<hr />
<div><strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong><br />
<br />
= Context =<br />
We received [http://lists.xiph.org/pipermail/theora/2009-November/002996.html an e-mail] from a consultant studying the suitability of Theora for use in "eGovernment", on behalf of the [http://ec.europa.eu/idabc/ IDABC], an EU governmental agency responsible for "Interoperability" with an emphasis on open source. The investigation is in the context of [http://ec.europa.eu/idabc/en/document/7728 European Interoperability Framework], about which there has been [http://www.computerworlduk.com/community/blogs/index.cfm?entryid=2620&blogid=14&pn=1 some real controversy].<br />
<br />
The method of assessment is the Common Assessment Method for Standards and Specifications, including the questions below.<br />
<br />
= CAMSS Questions =<br />
== Part 4: Market Criteria ==<br />
<br />
This group of Market criteria analyses the formal specification in the scope of its market environment; more precisely, it examines the implementations of the formal specification and the market players. This implies identifying to what extent the formal specification benefits from market support and wide adoption, what its level of maturity is, and what its capacity for reuse is.<br />
<br />
Market support is evaluated through an analysis of how many products implementing the formal specification exist, what their market share is, and who their end-users are. The quality and the completeness (in case of partitioning) of the implementations of the formal specification can also be analysed. The availability of existing or planned mechanisms to assess the conformity of implementations to the standard or specification can also be identified. The existence of at least one reference implementation (i.e. mentioning a recognized certification process), and whether one of the implementations is open source, can also be relevant to the assessment. Wide adoption can also be assessed across domains (i.e. public and private sectors), in an open environment, and/or in a similar field (i.e. best practices).<br />
<br />
A formal specification is mature if it has been in use and development for long enough that most of its initial problems have been overcome and its underlying technology is well understood and well defined. Maturity is also assessed by identifying whether all aspects of the formal specification are considered validated by usage (i.e. if the formal specification is partitioned), and whether reported issues have been solved and documented.<br />
<br />
Reusability of a formal specification is enabled if it includes guidelines for its implementation in a given context. The identification of successful implementations of the standard or specification should focus on good practices in a similar field. Its incompatibility with related standards or specifications should also be taken into account.<br />
<br />
The ideas behind the Market Criteria can also be expressed in the form of the following questions:<br />
<br />
=== Market support ===<br />
* Does the standard have strong support in the marketplace? <br />
: Yes. For example, among web browsers, support for Xiph's Ogg, Theora, and Vorbis standards is now included by default in Mozilla Firefox, Google Chrome, and the latest versions of Opera, representing hundreds of millions of users in this market alone. Further, a QuickTime component exists which enables use of Xiph's Ogg, Theora, and Vorbis standards in all Mac OS X applications that use the QuickTime framework, including Safari/WebKit, iMovie, QuickTime, and many others. On Windows, DirectShow filters likewise enable all Windows applications that use the DirectShow framework to use Xiph's Ogg, Theora, and Vorbis standards.<br />
* What products exist for this formal specification? <br />
: Theora is a video codec, and as such the required products are encoders, decoders, and transmission systems. All three types of products are widely available for Theora.<br />
* How many implementations of the formal specification are there? <br />
: Xiph does not require implementors to acquire any license before implementing the specification. Therefore, we do not have a definitive count of the number of implementations. In addition to the reference implementation, which has been ported to most modern platforms and highly optimized for x86 and ARM CPUs and TI C64x+ DSPs, we are aware of a number of independent, conformant or mostly-conformant implementations. These include two C decoders (ffmpeg and QTheora), a Java decoder (Jheora), a C# decoder, an FPGA decoder, and an FPGA encoder.<br />
* Are there products from different suppliers in the market that implement this formal specification? <br />
: Yes. Corporations such as Atari, Canonical, DailyMotion, Elphel, Fluendo, Google, Mozilla, Novell, Opera, Red Hat, Sun Microsystems, Ubisoft, and countless others have supplied products with an implementation of the Theora standard.<br />
* Are there many products readily available from a variety of suppliers? <br />
: Yes. Theora has been deployed in embedded devices, security cameras, video games, video conferencing systems, web browsers, home theater systems, and many other products. A complete, legal, open-source reference implementation can also be downloaded free of charge, including components for all major media frameworks (DirectShow, gstreamer, and Quicktime), giving the plethora of applications which use these frameworks the ability to use the codec.<br />
* What is the market share of the products implementing the formal specification, versus other implementations of competing formal specifications? <br />
: Theora playback is extremely widely available, covering virtually the entire market of personal computers. Theora is also increasingly available in mobile and embedded devices. Since we do not require licensing for products that implement the specification, we do not have market share numbers that can be compared with competing formal specifications. Because implementations are readily available and free, Theora is included in many products that support multiple codecs, and is sometimes the only video codec included in free software products.<br />
* Who are the end-users of these products implementing the formal specification?<br />
: The end users are television viewers, video gamers, web surfers, movie makers, business people, video distribution services, and anyone else who interacts with moving pictures.<br />
<br />
=== Maturity ===<br />
* Are there any existing or planned mechanisms to assess conformity of the implementations of the formal specification? <br />
: Yes. In addition to a continuous peer review process, we maintain a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that allow implementors to assess decoder conformity. We also provide free online developer support and testing for those attempting to make a conforming implementation.<br />
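Conformance checking against test vectors generally reduces to decoding each vector and comparing per-frame output against reference data. The sketch below illustrates the idea in generic form; it is not part of the Xiph test suite, and the function names and the synthetic byte strings standing in for decoded frames are ours:

```python
import hashlib

def frame_digests(frames):
    """Hash each decoded frame's raw bytes (e.g. concatenated YUV planes)."""
    return [hashlib.md5(f).hexdigest() for f in frames]

def check_conformance(decoded_frames, reference_digests):
    """Compare a decoder's per-frame digests against reference digests.

    Returns (ok, first_mismatch_index); first_mismatch_index is None
    only when the frame counts agree and every digest matches in order."""
    digests = frame_digests(decoded_frames)
    if len(digests) != len(reference_digests):
        return False, min(len(digests), len(reference_digests))
    for i, (got, want) in enumerate(zip(digests, reference_digests)):
        if got != want:
            return False, i
    return True, None

# Synthetic byte strings stand in for decoded frame data:
reference = frame_digests([b"frame0", b"frame1"])
ok, where = check_conformance([b"frame0", b"frame1"], reference)      # (True, None)
bad, where_bad = check_conformance([b"frame0", b"FRAME1"], reference)  # (False, 1)
```

A real harness would obtain `decoded_frames` from the decoder under test and `reference_digests` from the published test vectors.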
* Is there a reference implementation (i.e.: mentioning a recognized certification process)? <br />
: Yes. Xiph maintains a reference implementation called [http://downloads.xiph.org/releases/theora/ libtheora]. In addition to serving as a reference, libtheora is also highly optimized to achieve the maximum possible speed, accuracy, reliability, efficiency, and video quality. As a result, many implementors of Theora adopt the reference implementation.<br />
* Is there an open source implementation? <br />
: Yes. libtheora is made available under a completely permissive BSD-like license. Its open-source nature also contributes to its quality as a reference implementation, as implementors are welcome to contribute their improvements to the reference. There are also several other open source implementations.<br />
* Does the formal specification show wide adoption? <br />
** across different domains? (i.e. public and private) <br />
: Yes. In addition to the private companies mentioned in the previous section, Theora has also been specified as the sole format supported by non-profit institutions such as Wikipedia, currently the 6th largest website in the world, or as one of a small number of preferred formats supported by other public organizations, such as the Norwegian government.<br />
** in an open environment? <br />
: Yes. On open/free operating systems such as those distributed by Novell/SuSE, Canonical, and Red Hat, Theora is the primary default video codec.<br />
** in a similar field? (i.e.: can best practices be identified?) <br />
* Has the formal specification been in use and development long enough that most of its initial problems have been overcome? <br />
: Yes. Theora was derived from VP3, which was originally released in May 2000. The Theora specification was completed in 2004. Theora has now been used in a wide variety of applications, on the full spectrum of computing devices.<br />
* Is the underlying technology of the standard well-understood? (e.g., a reference model is well-defined, appropriate concepts of the technology are in widespread use, the technology may have been in use for many years, a formal mathematical model is defined, etc.) <br />
: Yes. The underlying technology has been in use for nearly a decade, and most of the concepts have been in widespread use for even longer.<br />
* Is the formal specification based upon technology that has not been well-defined and may be relatively new? <br />
: No. The formal specification is based on technology from the On2 VP3 codec, which is substantially similar to simple block-transform codecs like H.261. This class of codecs is extremely well understood, and has been actively in use for over 20 years.<br />
* Has the formal specification been revised? (Yes/No, number of revisions) <br />
: Yes. The specification is continuously revised based on user feedback to improve clarity and accuracy.<br />
* Is the formal specification under the auspices of an architectural board? (Yes/No) <br />
: No. The specification is officially maintained by the Xiph.Org Foundation, but anyone is free to join this organization, and one need not even be a member to make contributions.<br />
* Is the formal specification partitioned in its functionality? (Yes/No) <br />
: No. Theora is very deliberately not partitioned, to avoid the confusion created by a "standard" composed of many incompatible "profiles". The Theora standard does not have any optional components. A compliant Theora decoder can correctly process any Theora stream.<br />
** To what extent does each partition participate to its overall functionality? (NN%) <br />
: N/A.<br />
** To what extent is each partition implemented? (NN%) (cf market adoption)<br />
: N/A.<br />
<br />
=== Re-usability === <br />
* Does the formal specification provide guidelines for its implementation in a given organisation? <br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] provides "non-normative" advice and explanation for implementors of Theora decoders and encoders, including example algorithms for implementing required mathematical transforms. Xiph also maintains [http://wiki.xiph.org/Main_Page a documentation base] for implementors who desire more guidelines beyond the specification itself.<br />
* Can other cases where similar systems implement the formal specification be considered as successful implementations and good practices? <br />
: Xiph's standards have successfully been implemented by many organisations in a wide variety of environments. We maintain (non-exhaustive) [http://wiki.xiph.org/TheoraSoftwarePlayers lists] of products which implement Theora support, many of them open source, so that others may use them as a reference when preparing their own products.<br />
* Is its compatibility with related formal specifications documented?<br />
: Yes. For example, [http://theora.org/doc/Theora.pdf the Theora specification] also documents the use of Theora within the [http://www.ietf.org/rfc/rfc3533.txt standard Ogg encapsulation format], and the [http://svn.xiph.org/trunk/theora/doc/draft-ietf-avt-rtp-theora-00.txt TheoraRTP draft specification] explains how to transmit Theora using the [http://tools.ietf.org/html/rfc3550 RTP standard]. In addition, the specification documents Theora's compatibility with ITU-R B.470, ITU-R B.601, ITU-R B.709, SMPTE-170M, [http://tools.ietf.org/html/rfc2044 UTF-8], ISO 10646, and [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.pdf Ogg Vorbis].<br />
<br />
== Part 5: Standardisation Criteria == <br />
From Idabc-camss<br />
<br />
Note: Throughout this section, “Organisation” refers to the standardisation/fora/consortia body in charge of the formal specification.<br />
<br />
Significant characteristics of the way the organisation operates are for example the way it gives the possibility to stakeholders to influence the evolution of the formal specification, or which conditions it attaches to the use of the formal specification or its implementation. Moreover, it is important to know how the formal specification is defined, supported, and made available, as well as how interaction with stakeholders is managed by the organisation during these steps. Governance of interoperability testing with other formal specifications is also indicative.<br />
<br />
The standardisation criteria therefore analyse the following elements:<br />
<br />
=== Availability of Documentation ===<br />
The availability of documentation criterion is linked to cost and online availability. Access to all preliminary results documentation can be online, online for members only, offline, offline for members only, or not available. Access can be free or for a fee (which fee?).<br />
: Every Xiph standard is permanently available online to everyone at no cost. For example, we invite everyone to download [http://theora.org/doc/Theora.pdf the most up-to-date copy of the Theora specification], and [http://xiph.org/vorbis/doc/Vorbis_I_spec.html the latest revision of Vorbis]. All previous revisions are available from Xiph's [http://svn.xiph.org/ revision control system].<br />
<br />
=== Intellectual Property Right ===<br />
The Intellectual Property Rights evaluation criterion relates to the ability of implementers to use the formal specification in products without legal or financial implications. The IPR policy of the organisation is therefore evaluated according to: <br />
* the availability of the IPR or copyright policies of the organisation (available on-line or off-line, or not available);<br />
: The reference implementations of each codec include all necessary IPR and copyright licenses for that codec, including all documentation, and are freely available to everyone.<br />
* the organisation’s governance to disclose any IPR from any contributor (ex-ante, online, offline, for free for all, for a fee for all, for members only, not available);<br />
: Xiph does not require the identification of specific patents that may be required to implement a standard; however, it does require an open-source-compatible, royalty-free license from a contributor for any such patents they may own before the corresponding technology can be included in a standard. These licenses are made available online, for free, to all parties.<br />
* the level of IPR set "mandatory" by the organisation (no patent, royalty free patent, patent and RAND with limited liability, patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none); <br />
: All standards published by the Xiph.Org Foundation are required to have "open-source compatible" IPR. This means that a standard must either be entirely clear of any known patents, or any patents that read upon the standard must be available under a transferable, irrevocable public nonassertion agreement to all people everywhere. For example, see [http://svn.xiph.org/trunk/theora/LICENSE our On2 patent nonassertion warrant]. Other common "royalty free" patent licenses are either not transferable, revocable under certain conditions (such as patent infringement litigation against the originating party), or otherwise impose restrictions that would prevent distribution under common [http://www.opensource.org/ OSI]-approved licenses. These would not be acceptable.<br />
* the level of IPR "recommended" by the organisation (no patent, royalty free patent, patent and RAND with limited liability, patent and classic RAND, patent with explicit licensing, patent with defensive licensing, or none). [Note: RAND (Reasonable and Non Discriminatory License) is based on a "fairness" concept. Companies agree that if they receive any patents on technologies that become essential to the standard then they agree to allow other groups attempting to implement the standard to use these patents and they agree that the charges for the patents shall be reasonable. "RAND with limited availability" is a version of RAND where the "reasonable charges" have an upper limit.]<br />
: Xiph's recommended IPR requirements are the same as our mandatory requirements.<br />
<br />
=== Accessibility ===<br />
<br />
The accessibility evaluation criterion describes the importance of equal and safe accessibility for the users of implementations of formal specifications. This aspect can relate to safety (physical safety and conformance safety) and accessibility for physically impaired people (design for all).<br />
<br />
Focus is placed particularly on accessibility and conformance safety. Conformance testing is testing to determine whether a system meets a specified formal specification; the result may take the form of test suite results. Conformance validation is when the conformance test uniquely qualifies a given implementation as conformant or not. Conformance certification is a process that provides a public and easily visible "stamp of approval" that an implementation of a standard validates as conformant.<br />
<br />
The following questions allow an assessment of accessibility and conformance safety: <br />
* Does a mechanism that ensures disability support by a formal specification exist? (Y/N) <br />
: Yes. Xiph ensures support for users with disabilities by providing specifications for accessible technologies independent of the codec itself. Notable Xiph specifications include [http://wiki.xiph.org/OggKate OggKate] and [http://wiki.xiph.org/index.php/CMML CMML], which provide subtitles for the hearing-impaired, as well as [http://wiki.xiph.org/Ogg_Skeleton Skeleton], which can specify scene description audio tracks for the visually impaired. When Theora is transmitted or stored in an Ogg container, it is automatically compatible with these accessibility measures.<br />
* Is conformance governance always part of a standard? (Y/N) <br />
: No. Xiph does not normally provide a formal conformance testing process as part of a standard.<br />
* Is a conformance test offered to implementers? (Y/N) <br />
: Yes. Xiph maintains a suite of [http://v2v.cc/~j/theora_testsuite/ test vectors] that can be used by implementors to confirm basic conformance.<br />
* Is conformance validation available to implementers? (Y/N) <br />
: Yes. Informal conformance testing is available to implementors upon request, and Xiph has provided such testing for a number of implementations in the past.<br />
* Is conformance certification available? (Y/N) <br />
: Yes. Xiph does not require certification, but maintains the right to withhold the use of our trademarks from implementors that act in bad faith. Implementors may, however, request explicit permission to use our trademarks with a conforming implementation.<br />
* Is localisation of a formal specification possible? (Y/N)<br />
: Yes. We welcome anyone who wishes to translate Xiph specifications into other languages. We have no policy requiring that the normative specification be written in English.<br />
<br />
=== Interoperability governance === <br />
The interoperability governance evaluation criterion relates to how interoperability is identified and maintained between interoperable formal specifications. In order to do this, the organisation may provide governance for: <br />
* open identification in formal specifications, <br />
* open negotiation in formal specifications, <br />
* open selection in formal specifications. <br />
<br />
=== Meeting and consultation ===<br />
The meeting and consultation evaluation criterion relates to the process of defining a formal specification. As formal specifications are usually defined by committees, and these committees normally consist of members of the organisation, this criterion studies how to become a member, what the financial barriers to membership are, and how non-members are able to influence the process of defining the formal specification. It analyses: <br />
* if the organisation is open to all types of companies and organisations and to individuals; <br />
: Yes. Xiph welcomes representatives from all companies and organizations.<br />
* if the standardisation process may specifically allow participation of members with limited abilities when relevant; <br />
: Yes. Standardization occurs almost entirely in internet communications channels, allowing participants with disabilities to engage fully in the standards development process. We also encourage nonexperts and students to assist us as they can, and to learn about Xiph technologies by participating in the standards development process.<br />
* if meetings are open to all members;<br />
: Xiph meetings are open to everyone. We charge no fee for and place no restrictions on attendance or participation. For example, anyone interested in contributing to the Theora specification may join [http://lists.xiph.org/pipermail/theora-dev/ the Theora development mailing list].<br />
* if all can participate in the formal specification creation process; <br />
: Yes. All people are welcome to participate in the specification creation process. No dues or fees are required to participate.<br />
* if non-members can participate in the formal specification creation process.<br />
: Yes. Xiph does not maintain an explicit list of members, and no one is excluded from contributing to specifications as they are developed.<br />
<br />
=== Consensus ===<br />
Consensus concerns decision making, primarily with regard to the approval of formal specifications and their review with interest groups (non-members). The consensus criterion is assessed with the following questions:<br />
* Does the organisation have a stated objective of reaching consensus when making decisions on standards? <br />
: There is no explicitly stated objective of reaching consensus.<br />
* If consensus is not reached, can the standard be approved? (answers are: cannot be approved but referred back to working group/committee, approved with 75% majority, approved with 66% majority, approved with 51% majority, can be decided by a "director" or similar in the organisation).<br />
: The standard can be approved without consensus via the decision of a "director" or similar.<br />
* Is there a formal process for external review of standard proposals by interest groups (nonmembers)?<br />
: Since anyone may participate in the development process and make proposals, there is no need for a separate formal process to include proposals by nonmembers.<br />
<br />
=== Due Process ===<br />
The due process evaluation criterion relates to the level of respect for the rights of each member of the organisation. More specifically, if a member believes an error has been made in the process of defining a formal specification, it must be possible to appeal to an independent, higher instance. The question is therefore: can a member formally appeal or raise objections to a procedure or to a technical specification before an independent, higher instance?<br />
<br />
: Yes. Even if a member's appeal within the organization fails, because all of the technology Xiph standardizes is open and freely implementable, the member is always free to develop a competing version. Such competing versions may even remain eligible for standardization under the Xiph umbrella.<br />
<br />
=== Changes to the formal specification ===<br />
Suggested changes to a formal specification need to be presented, evaluated, and approved in the same way as the formal specification was first defined. This criterion therefore applies the above criteria to changes made to the formal specification (availability of documentation, Intellectual Property Rights, accessibility, interoperability governance, meeting and consultation, consensus, due process).<br />
<br />
: The exact same process is used for revisions to the standard as was used for the original development of the standard, and thus the answers to all of the above questions remain the same.<br />
<br />
=== Support ===<br />
It is critical that the organisation takes responsibility for the formal specification throughout its life span. This can be done in several ways, for example through regular periodic review of the formal specification. The support criterion relates to the level of commitment the organisation has made to support the formal specification throughout its life: <br />
* does the organisation provide support until removal of the published formal specification from the public domain (including this process)? <br />
: Xiph.Org standards are never removed from the public domain. Xiph endeavors to provide support for as long as the standard remains in use.<br />
* does the organisation make the formal specification still available even when in non-maintenance mode?<br />
: Yes. All Xiph.Org standards are freely licensed and will always be available.<br />
* does the organisation add new features and keep the formal specification up-to-date?<br />
: Yes. Xiph maintains its ecosystem of standards on a continuous basis.<br />
* does the organisation rectify problems identified in initial implementations?<br />
: Yes. Xiph maintains [https://trac.xiph.org/report a problem reporting system] that is open to the public, and invites everyone to submit suggestions for improvements. Improvements are made both to the standards documents and to the reference implementations.<br />
* does the organisation only create the formal specification?<br />
: No. Xiph also produces high-quality reusable reference implementations of its standards, released under an open license.<br />
<br />
<br />
<strong>This is a draft document. A work in progress. A scratchpad for ideas. It should not be widely circulated in this form.</strong></div>
Silvia
https://wiki.xiph.org/index.php?title=Work_In_Progress&diff=10587
Work In Progress
2009-10-11T08:58:33Z
<p>Silvia: added Ogg Index</p>
<hr />
<div>* '''General Usage:'''<br />
** [[Ogg_Index]]: Introducing index headers into Ogg<br />
** [[Metadata]]: Various types of Ogg metadata including the [[M3F]] (Multimedia Metadata Format) and [[XMLEmbedding]]<br />
** [[MIME_Types_and_File_Extensions]]: MIME Types and file extensions for Ogg multimedia files<br />
** [[Subtle]]: Subtitling tool for professional use that intends to support most subtitle formats including CMML and OggKate<br />
** [[OggText]]: A generic media mapping for (discontinuous) text codecs into Ogg<br />
** [[ROE]]: A description format for describing the tracks and languages etc. of an Ogg multitrack composition<br />
<br />
* '''Compressed Codecs:'''<br />
** [[Theora]] 1.1 "thusnelda": improving compression efficiency while keeping compatibility, see [[Theora11Todo]]<br />
** [[OggDirac]]: The "next-generation" wavelet based video codec, lossy or lossless <br />
** [[OggCELT]]: A low-latency audio codec<br />
** [[OggMNG]]: A mapping for encapsulating the MNG animation format in Ogg<br />
<br />
* '''Uncompressed Codecs:'''<br />
** [[OggKate]]: A codec for karaoke and text encapsulation in Ogg<br />
** [[OggPCM]]: Uncompressed PCM audio, currently being implemented<br />
** [[OggSpots]]: A mapping for encapsulating timed images in Ogg<br />
** [[OggUVS]]: Uncompressed RGB and YUV video<br />
<br />
* '''Abandonware''' (nobody working on those as far as we know)<br />
** [[Ghost]]: A "next-generation" audio codec (vapourware so far -- don't hold your breath)<br />
** [[Oggless]]: Embedding Xiph codecs like Vorbis in containers other than Ogg<br />
** [[IceShare]]: P2P content distribution<br />
** [[OggPCM_Draft1]]: Original uncompressed PCM audio proposal<br />
** [[OggRGB]]: Original uncompressed RGB video proposal<br />
** [[OggWrit]]: Text phrase codec (e.g. subtitles)<br />
** [[OggYUV]]: Original uncompressed YUV video proposal<br />
<br />
[[Category:Developers stuff]]</div>
Silvia
https://wiki.xiph.org/index.php?title=MIME_Types_and_File_Extensions&diff=10580
MIME Types and File Extensions
2009-10-04T10:50:29Z
<p>Silvia: /* Ogg Kate files - application/kate */</p>
<hr />
<div>STATUS: Work on RFCs and tools is in process to reflect these policies. More details are [http://wiki.xiph.org/index.php/MIMETypesCodecs here], which also include a specification of the codecs parameter of the MIME tyes. Use the correct file extensions straight away.<br />
<br />
DISCLAIMER: currently, application/ogg, video/ogg, audio/ogg and audio/vorbis are registered MIME types. Registration for the others will be undertaken. During this process, the "x-" versions of these unregistered MIME types may be used.<br />
<br />
IMPLEMENTATION recommendations and patches: see [[MIME-Migration]].<br />
<br />
== .ogx - application/ogg ==<br />
<br />
* Ogg Multiplex Profile (anything in [[Ogg]])<br />
* can contain any logical bitstreams multiplexed together in an ogg container<br />
* will replace the .ogg extension from RFC 3534 http://www.ietf.org/rfc/rfc3534.txt<br />
* random multitrack files MUST contain a [[Skeleton]] track to identify all the logical bitstreams they contain<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* thus, e.g. an annodex file can gracefully degrade to .ogx if an app cannot decode [[CMML]] and/or [[Skeleton]]<br />
* USE: application/ogg has been registered, so can be used immediately<br />
<br />
== .ogv - video/ogg ==<br />
<br />
* Ogg Video Profile (a/v in Ogg container)<br />
* apps supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg<br />
* This list is not exhaustive (for example, [[Dirac]] + FLAC is acceptable too)<br />
* SHOULD contain a Skeleton track and/or MAY contain a CMML logical bitstream.<br />
<br />
== .oga - audio/ogg ==<br />
<br />
* Ogg Audio Profile (audio in Ogg container)<br />
* Applications supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* Covers Ogg [[FLAC]], [[Ghost]], and [[OggPCM]] <br />
* Although they share the same MIME type, Vorbis and Speex use different file extensions.<br />
* SHOULD contain a Skeleton logical bitstream.<br />
* Vorbis and Speex may use .oga, but it is not the preferred method of distributing these files because of backwards-compatibility issues.<br />
<br />
== .ogg - audio/ogg ==<br />
<br />
* Ogg Vorbis I Profile<br />
* .ogg applies now for Vorbis I files only<br />
* .ogg has more recently also been used for Ogg FLAC and for Theora, too &mdash; these uses are deprecated now in favor of .oga and .ogv respectively<br />
* has been defined in RFC 3534 http://www.ietf.org/rfc/rfc3534.txt for application/ogg, so RFC 3534 will be revised<br />
<br />
RATIONALE: .ogg has traditionally been used for Vorbis I files, in particular in HW players, hence it is kept for backwards-compatibility<br />
<br />
== .spx - audio/ogg ==<br />
<br />
* Ogg Speex Profile<br />
* .spx has traditionally been used for Speex files within Ogg and should be retained for backwards-compatibility<br />
<br />
== .flac - audio/flac ==<br />
<br />
* FLAC in native encapsulation format<br />
<br />
== .anx - application/annodex ==<br />
<br />
* Profile for multiplexed Ogg that includes a skeleton track and at least one CMML logical bitstream<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* apps that come across an annodex file and cannot decode CMML and/or Skeleton, but can deal with the others SHOULD gracefully degrade by ignoring these<br />
<br />
== .axa - audio/annodex ==<br />
<br />
* Profile for audio in Annodex <br />
* covers e.g. [[Vorbis]], [[Speex]], [[FLAC]], [[Ghost]], [[OggPCM]] inside Ogg with Skeleton and CMML<br />
<br />
== .axv - video/annodex ==<br />
<br />
* Profile for video in Annodex <br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg with Skeleton and CMML<br />
<br />
== .xspf - application/xspf+xml ==<br />
<br />
* Profile for XSPF<br />
* Covers [[XSPF]], which is expressed in XML<br />
* Does not cover [[JSPF]], which is XSPF expressed in JSON<br />
<br />
== Ogg Kate files - application/kate ==<br />
<br />
* Binary representation of Kate encapsulated in Ogg<br />
* may have a skeleton<br />
* can be used to identify the mime type of the track itself (e.g. in skeleton)<br />
* uses .ogx extension when in a file by itself<br />
* is subsumed by the dominant MIME type when in an audio or video file, becoming audio/ogg or video/ogg<br />
<br />
== Codec MIME types ==<br />
<br />
Codecs need their own MIME types for streaming in RTP and to be used in multitrack ogg files using skeleton:<br />
<br />
* audio/vorbis for Vorbis without container<br />
* video/theora for Theora without container<br />
* audio/speex for Speex without container<br />
* audio/flac for FLAC without and in native container<br />
* text/cmml for CMML without container<br />
* text/kate for the textual representation of Kate (.kate files)</div>
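As a quick summary of the extension-to-MIME-type mapping the profiles above define, here is a minimal sketch; the helper name is illustrative and not part of any specification (note that .ogg and .spx both map to audio/ogg per their profiles):

```javascript
// Illustrative lookup table mirroring the profiles listed on this page.
const EXTENSION_MIME = {
  ".ogx": "application/ogg",
  ".ogv": "video/ogg",
  ".oga": "audio/ogg",
  ".ogg": "audio/ogg", // Ogg Vorbis I profile
  ".spx": "audio/ogg", // Ogg Speex profile
  ".flac": "audio/flac",
  ".anx": "application/annodex",
  ".axa": "audio/annodex",
  ".axv": "video/annodex",
  ".xspf": "application/xspf+xml",
};

// Returns the MIME type for a filename, or null if the extension is unknown.
function mimeForFilename(filename) {
  const dot = filename.lastIndexOf(".");
  if (dot === -1) return null;
  const ext = filename.slice(dot).toLowerCase();
  return EXTENSION_MIME[ext] || null;
}
```

For example, mimeForFilename("movie.ogv") yields "video/ogg", while an extensionless name yields null.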
Silvia
https://wiki.xiph.org/index.php?title=MIME_Types_and_File_Extensions&diff=10579
MIME Types and File Extensions
2009-10-04T10:49:55Z
<p>Silvia: /* Ogg Kate files - application/kate */</p>
<hr />
<div>STATUS: Work on RFCs and tools is in process to reflect these policies. More details are [http://wiki.xiph.org/index.php/MIMETypesCodecs here], which also include a specification of the codecs parameter of the MIME tyes. Use the correct file extensions straight away.<br />
<br />
DISCLAIMER: currently, application/ogg, video/ogg, audio/ogg and audio/vorbis are registered MIME types. Registration for the others will be undertaken. During this process, the "x-" versions of these unregistered MIME types may be used.<br />
<br />
IMPLEMENTATION recommendations and patches: see [[MIME-Migration]].<br />
<br />
== .ogx - application/ogg ==<br />
<br />
* Ogg Multiplex Profile (anything in [[Ogg]])<br />
* can contain any logical bitstreams multiplexed together in an ogg container<br />
* will replace the .ogg extension from RFC 3534 http://www.ietf.org/rfc/rfc3534.txt<br />
* random multitrack files MUST contain a [[Skeleton]] track to identify all the logical bitstreams they contain<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* thus, e.g. an annodex file can gracefully degrade to .ogx if an app cannot decode [[CMML]] and/or [[Skeleton]]<br />
* USE: application/ogg has been registered, so can be used immediately<br />
<br />
== .ogv - video/ogg ==<br />
<br />
* Ogg Video Profile (a/v in Ogg container)<br />
* apps supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg<br />
* This list is not exhaustive (for example, [[Dirac]] + FLAC is acceptable too)<br />
* SHOULD contain a Skeleton track and/or MAY contain a CMML logical bitstream.<br />
<br />
== .oga - audio/ogg ==<br />
<br />
* Ogg Audio Profile (audio in Ogg container)<br />
* Applications supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* Covers Ogg [[FLAC]], [[Ghost]], and [[OggPCM]] <br />
* Although they share the same MIME type, Vorbis and Speex use different file extensions.<br />
* SHOULD contain a Skeleton logical bitstream.<br />
* Vorbis and Speex may use .oga, but it is not the preferred method of distributing these files because of backwards-compatibility issues.<br />
<br />
== .ogg - audio/ogg ==<br />
<br />
* Ogg Vorbis I Profile<br />
* .ogg applies now for Vorbis I files only<br />
* .ogg has more recently also been used for Ogg FLAC and for Theora, too &mdash; these uses are deprecated now in favor of .oga and .ogv respectively<br />
* has been defined in RFC 3534 http://www.ietf.org/rfc/rfc3534.txt for application/ogg, so RFC 3534 will be revised<br />
<br />
RATIONALE: .ogg has traditionally been used for Vorbis I files, in particular in HW players, hence it is kept for backwards-compatibility<br />
<br />
== .spx - audio/ogg ==<br />
<br />
* Ogg Speex Profile<br />
* .spx has traditionally been used for Speex files within Ogg and should be retained for backwards-compatibility<br />
<br />
== .flac - audio/flac ==<br />
<br />
* FLAC in native encapsulation format<br />
<br />
== .anx - application/annodex ==<br />
<br />
* Profile for multiplexed Ogg that includes a skeleton track and at least one CMML logical bitstream<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* apps that come across an annodex file and cannot decode CMML and/or Skeleton, but can deal with the others SHOULD gracefully degrade by ignoring these<br />
<br />
== .axa - audio/annodex ==<br />
<br />
* Profile for audio in Annodex <br />
* covers e.g. [[Vorbis]], [[Speex]], [[FLAC]], [[Ghost]], [[OggPCM]] inside Ogg with Skeleton and CMML<br />
<br />
== .axv - video/annodex ==<br />
<br />
* Profile for video in Annodex <br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg with Skeleton and CMML<br />
<br />
== .xspf - application/xspf+xml ==<br />
<br />
* Profile for XSPF<br />
* Covers [[XSPF]], which is expressed in XML<br />
* Does not cover [[JSPF]], which is XSPF expressed in JSON<br />
<br />
== Ogg Kate files - application/kate ==<br />
<br />
* Binary representation of Kate encapsulated in Ogg<br />
* may have a skeleton<br />
* can be used to identify the mime type of the track itself (e.g. in skeleton)<br />
* uses .ogx extension when in a file by itself<br />
* is subsumed by the dominant MIME type when in an audio or video file, becoming audio/vorbis or video/theora<br />
<br />
== Codec MIME types ==<br />
<br />
Codecs need their own MIME types for streaming in RTP and to be used in multitrack ogg files using skeleton:<br />
<br />
* audio/vorbis for Vorbis without container<br />
* video/theora for Theora without container<br />
* audio/speex for Speex without container<br />
* audio/flac for FLAC without and in native container<br />
* text/cmml for CMML without container<br />
* text/kate for the textual representation of Kate (.kate files)</div>
Silvia
https://wiki.xiph.org/index.php?title=MIME_Types_and_File_Extensions&diff=10578
MIME Types and File Extensions
2009-10-04T10:47:18Z
<p>Silvia: added kate</p>
<hr />
<div>STATUS: Work on RFCs and tools is in process to reflect these policies. More details are [http://wiki.xiph.org/index.php/MIMETypesCodecs here], which also include a specification of the codecs parameter of the MIME tyes. Use the correct file extensions straight away.<br />
<br />
DISCLAIMER: currently, application/ogg, video/ogg, audio/ogg and audio/vorbis are registered MIME types. Registration for the others will be undertaken. During this process, the "x-" versions of these unregistered MIME types may be used.<br />
<br />
IMPLEMENTATION recommendations and patches: see [[MIME-Migration]].<br />
<br />
== .ogx - application/ogg ==<br />
<br />
* Ogg Multiplex Profile (anything in [[Ogg]])<br />
* can contain any logical bitstreams multiplexed together in an ogg container<br />
* will replace the .ogg extension from RFC 3534 http://www.ietf.org/rfc/rfc3534.txt<br />
* random multitrack files MUST contain a [[Skeleton]] track to identify all the logical bitstreams they contain<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* thus, e.g. an annodex file can gracefully degrade to .ogx if an app cannot decode [[CMML]] and/or [[Skeleton]]<br />
* USE: application/ogg has been registered, so can be used immediately<br />
<br />
== .ogv - video/ogg ==<br />
<br />
* Ogg Video Profile (a/v in Ogg container)<br />
* apps supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg<br />
* This list is not exhaustive (for example, [[Dirac]] + FLAC is acceptable too)<br />
* SHOULD contain a Skeleton track and/or MAY contain a CMML logical bitstream.<br />
<br />
== .oga - audio/ogg ==<br />
<br />
* Ogg Audio Profile (audio in Ogg container)<br />
* Applications supporting .oga, .ogv SHOULD support decoding from muxed Ogg streams<br />
* Covers Ogg [[FLAC]], [[Ghost]], and [[OggPCM]] <br />
* Although they share the same MIME type, Vorbis and Speex use different file extensions.<br />
* SHOULD contain a Skeleton logical bitstream.<br />
* Vorbis and Speex may use .oga, but it is not the preferred method of distributing these files because of backwards-compatibility issues.<br />
<br />
== .ogg - audio/ogg ==<br />
<br />
* Ogg Vorbis I Profile<br />
* .ogg applies now for Vorbis I files only<br />
* .ogg has more recently also been used for Ogg FLAC and for Theora, too &mdash; these uses are deprecated now in favor of .oga and .ogv respectively<br />
* has been defined in RFC 3534 http://www.ietf.org/rfc/rfc3534.txt for application/ogg, so RFC 3534 will be revised<br />
<br />
RATIONALE: .ogg has traditionally been used for Vorbis I files, in particular in HW players, hence it is kept for backwards-compatibility<br />
<br />
== .spx - audio/ogg ==<br />
<br />
* Ogg Speex Profile<br />
* .spx has traditionally been used for Speex files within Ogg and should be retained for backwards-compatibility<br />
<br />
== .flac - audio/flac ==<br />
<br />
* FLAC in native encapsulation format<br />
<br />
== .anx - application/annodex ==<br />
<br />
* Profile for multiplexed Ogg that includes a skeleton track and at least one CMML logical bitstream<br />
* apps that identify a logical bitstream which they cannot decode SHOULD ignore it but MAY still decode the ones they can<br />
* apps that come across an annodex file and cannot decode CMML and/or Skeleton, but can deal with the others SHOULD gracefully degrade by ignoring these<br />
<br />
== .axa - audio/annodex ==<br />
<br />
* Profile for audio in Annodex <br />
* covers e.g. [[Vorbis]], [[Speex]], [[FLAC]], [[Ghost]], [[OggPCM]] inside Ogg with Skeleton and CMML<br />
<br />
== .axv - video/annodex ==<br />
<br />
* Profile for video in Annodex <br />
* covers e.g. [[Theora]], Theora + Vorbis, Theora + Speex, Theora + FLAC, [[Dirac]] + Vorbis, [[OggMNG|MNG]] + FLAC, [[OggUVS]] inside Ogg with Skeleton and CMML<br />
<br />
== .xspf - application/xspf+xml ==<br />
<br />
* Profile for XSPF<br />
* Covers [[XSPF]], which is expressed in XML<br />
* Does not cover [[JSPF]], which is XSPF expressed in JSON<br />
<br />
== Ogg Kate files - application/kate ==<br />
<br />
* Binary representation of Kate encapsulated in Ogg<br />
* may have a skeleton<br />
* can be used to identify the mime type of the track itself (e.g. in skeleton)<br />
* uses .ogx extension when in a file by itself<br />
* is subsumed by the dominant MIME type when in an audio or video file<br />
<br />
== Codec MIME types ==<br />
<br />
Codecs need their own MIME types for streaming in RTP and to be used in multitrack ogg files using skeleton:<br />
<br />
* audio/vorbis for Vorbis without container<br />
* video/theora for Theora without container<br />
* audio/speex for Speex without container<br />
* audio/flac for FLAC without and in native container<br />
* text/cmml for CMML without container<br />
* text/kate for the textual representation of Kate (.kate files)</div>
Silvia
https://wiki.xiph.org/index.php?title=Work_In_Progress&diff=10522
Work In Progress
2009-08-25T14:05:57Z
<p>Silvia: added ROE</p>
<hr />
<div>* '''General Usage:'''<br />
** [[Metadata]]: Various types of Ogg metadata including the [[M3F]] (Multimedia Metadata Format) and [[XMLEmbedding]]<br />
** [[MIME_Types_and_File_Extensions]]: MIME Types and file extensions for Ogg multimedia files<br />
** [[Subtle]]: Subtitling tool for professional use that intends to support most subtitle formats including CMML and OggKate<br />
** [[OggText]]: A generic media mapping for (discontinuous) text codecs into Ogg<br />
** [[ROE]]: A description format for describing the tracks and languages etc. of an Ogg multitrack composition<br />
<br />
* '''Compressed Codecs:'''<br />
** [[Theora]] 1.1 "thusnelda": improving compression efficiency while keeping compatibility, see [[Theora11Todo]]<br />
** [[OggDirac]]: The "next-generation" wavelet based video codec, lossy or lossless <br />
** [[OggCELT]]: A low-latency audio codec<br />
** [[OggMNG]]: A mapping for encapsulating the MNG animation format in Ogg<br />
<br />
* '''Uncompressed Codecs:'''<br />
** [[OggKate]]: A codec for karaoke and text encapsulation in Ogg<br />
** [[OggPCM]]: New Uncompressed PCM audio, currently being implemented (formerly Draft2)<br />
** [[OggSpots]]: A mapping for encapsulating timed images in Ogg<br />
** [[OggUVS]]: Uncompressed RGB and YUV video, under active development (preferred to OggRGB and OggYUV).<br />
<br />
* '''Abandonware''' (nobody working on those as far as we know)<br />
** [[Ghost]]: A "next-generation" audio codec (vapourware so far -- don't hold your breath)<br />
** [[Oggless]]: Embedding Xiph codecs like Vorbis in containers other than Ogg<br />
** [[IceShare]]: P2P content distribution<br />
** [[OggPCM_Draft1]]: Original uncompressed PCM audio proposal<br />
** [[OggRGB]]: Original uncompressed RGB video proposal<br />
** [[OggWrit]]: Text phrase codec (e.g. subtitles)<br />
** [[OggYUV]]: Original uncompressed YUV video proposal<br />
<br />
[[Category:Developers stuff]]</div>
Silvia
https://wiki.xiph.org/index.php?title=Timed_Divs_HTML&diff=10263
Timed Divs HTML
2009-06-21T22:59:46Z
<p>Silvia: /* Direct linking on a HTML5 page */</p>
<hr />
<div>{{draft}}<br />
<br />
= Introduction =<br />
<br />
This page specifies a subclass of HTML documents that is a time-aligned text format for audio-visual content. We call the format "timed divs within HTML" or TDHT. It is intended to be used only in a World Wide Web context, i.e. everywhere that Web browser functionality is available. Use cases for the format are subtitles, captions, annotations and other time-aligned text, as listed at http://wiki.xiph.org/index.php/OggText#Categories_of_Text_Codecs .<br />
<br />
TDHT may be similar to W3C TimedText DFXP in many respects, but unlike DFXP it does not re-invent HTML, CSS and effects; rather, it uses existing HTML, CSS and JavaScript for these. The purpose of DFXP is to create a web-independent exchange format for timed text, which is why it cannot directly be specified as a subpart of HTML.<br />
<br />
TDHT, in contrast, is HTML with a minimal number of changes. TDHT is parsable by any HTML parser. It works with CSS and JavaScript. No new functionality has to be defined for TDHT.<br />
<br />
<br />
= File Extension =<br />
<br />
Files in this format are to be served with the text/html MIME type, since they are valid HTML files apart from some extra attributes.<br />
<br />
Files in this format should use the file extension .tdht to distinguish them from plain HTML files.<br />
<br />
= The TDHT format changes from HTML =<br />
<br />
TDHT files are time-aligned text. This means there is a time association with blocks of text and there is time-based seeking functionality on those blocks of text.<br />
<br />
Here is an example TDHT file for subtitles:<br />
<br />
<pre><br />
<html><br />
<head><br />
<title>Desperate Housewives - Season 5, Episode 6</title><br />
</head><br />
<body><br />
<div start="00:00:00.070" end="00:00:02.270"><br />
<p>Previously on...</p><br />
</div><br />
<div start="00:00:02.280" end="00:00:04.270"><br />
<p>We had an agreement to keep things casual.</p><br />
</div><br />
<div start="00:00:04.280" end="00:00:06.660"><br />
<p>Susan made her feelings clear.</p><br />
</div><br />
<div start="00:00:06.800" end="00:00:10.100"><br />
<p>So if I was with another woman, that wouldn't bother you? No, it wouldn't.</p><br />
</div><br />
</body><br />
</html><br />
</pre><br />
<br />
The differences between TDHT and HTML are described using [http://www.w3.org/TR/html401/ HTML 4.01], but the changes apply equally to [http://www.whatwg.org/specs/web-apps/current-work/ HTML5], which doesn't have a normative schema.<br />
<br />
The following changes to HTML are made for TDHT:<br />
<br />
<br />
== 1. The body element ==<br />
<br />
In HTML4.01, the [http://www.w3.org/TR/html401/struct/global.html#h-7.5 body element] is defined as follows:<br />
<br />
<pre><br />
<!ELEMENT BODY O O (%block;|SCRIPT)+ +(INS|DEL) -- document body --><br />
<!ATTLIST BODY<br />
%attrs; -- %coreattrs, %i18n, %events --<br />
onload %Script; #IMPLIED -- the document has been loaded --<br />
onunload %Script; #IMPLIED -- the document has been removed --<br />
><br />
</pre><br />
<br />
In TDHT1.0 we restrict body to just contain a sequence of div tags:<br />
<br />
<pre><br />
<!ELEMENT BODY O O (DIV)+ -- document body --><br />
<!ATTLIST BODY<br />
%attrs; -- %coreattrs, %i18n, %events --<br />
onload %Script; #IMPLIED -- the document has been loaded --<br />
onunload %Script; #IMPLIED -- the document has been removed --<br />
><br />
</pre><br />
<br />
Any tags inside the body tag that are non-conformant to this specification (such as regular html tags that are allowed inside body) must be ignored for TDHT.<br />
<br />
The div tags, however, can contain anything that HTML div tags can contain, thus enabling a very flexible, but time-aligned text model.<br />
<br />
== 2. The div element ==<br />
<br />
In HTML, the [http://www.w3.org/TR/html401/struct/global.html#h-7.5.4 div element] is defined as follows:<br />
<br />
<pre><br />
<!ELEMENT DIV - - (%flow;)* -- generic language/style container --><br />
<!ATTLIST DIV<br />
%attrs; -- %coreattrs, %i18n, %events --<br />
><br />
</pre><br />
<br />
In TDHT1.0 we extend it with start and end time attributes:<br />
<br />
<pre><br />
<!ELEMENT DIV - - (%flow;)* -- generic language/style container --><br />
<!ATTLIST DIV<br />
%attrs; -- %coreattrs, %i18n, %events --<br />
start %Time; #IMPLIED -- start time<br />
end %Time; #IMPLIED -- end time<br />
><br />
</pre><br />
<br />
The Time entity represents a valid time string according to HTML5: http://www.whatwg.org/specs/web-apps/current-work/#valid-time-string . The end time must be later than the start time; otherwise the div element does not exist for any duration and can never become active.<br />
<br />
&lt;div> elements in a TDHT file should be ordered by start time to simplify parsing. Inside Ogg or when rendered, they will be ordered by start time.<br />
<br />
= Rendering in a Web Browser =<br />
<br />
A TDHT file is meant to be associated with an audio or video file and rendered in a Web browser in sync with that audio or video file.<br />
<br />
The TDHT file's div elements are not rendered into an existing HTML page; rather, a TDHT file creates its own [http://www.whatwg.org/specs/web-apps/current-work/#the-iframe-element iframe-like] new nested browsing context. It is linked to the parent HTML page through an itext element that is inserted as a child of the video element. Creating a nested browsing context is important because a TDHT file can come from a different URI than the Web page; for security reasons and for general base URI computations, it is therefore better to keep the DOM nodes of the hosting page and the DOM nodes of the TDHT document in different owner documents. That way, the hosting document has the security origin of its own URL and the TDHT document has the security origin of its URL. <br />
<br />
The rendering and CSS view port default to the rectangle occupied by the given <video> or <audio> tag, or to an area provided by the hosting HTML page through the itext element's properties. The zoom factor of the iframe must be set such that the width of the view port established by the itext frame is as wide in CSS px as the video frame is wide in codec pixels. (Example: if the video encodes a frame that is 240 pixels wide but is displayed at 480 CSS px wide, the zoom factor of the itext frame should be 200%, so that the box that measures 480 px on the outside appears as a box of 240 px from within the itext frame.)<br />
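The zoom-factor rule above amounts to a simple ratio; the function name below is illustrative, not part of any specification:

```javascript
// Sketch of the zoom-factor rule described above: scale the itext frame so
// that its view port width in CSS px matches the video's width in codec
// pixels. Example from the text: a 240 px wide frame displayed at
// 480 CSS px needs a zoom factor of 200%.
function itextZoomPercent(codecWidthPx, displayedWidthCssPx) {
  return (displayedWidthCssPx / codecWidthPx) * 100;
}
```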
<br />
The itext frame is by default transparent.<br />
<br />
A TDHT file can get to a browser either as an external resource or as part of an audio or video resource (in particular inside Ogg, see below). Parsing in these two cases is slightly different for the browser.<br />
<br />
For the external TDHT file case:<br />
The TDHT file is parsed using the HTML5 parsing algorithm in its normal mode into a non-rendered DOM. To render a div, the children of the div would be cloned into the body of the rendering shell document (replacing possible previous children of body).<br />
<br />
For the Ogg-internal TDHT case:<br />
To multiplex an external TDHT file into Ogg, each div with its innerHTML would be placed into a data packet and the head data into an Ogg header. For decoding, the rendering shell document is set up and the head tag is included from the Ogg headers. To render a packet, the div and its innerHTML are added to the innerHTML of the body element of the rendering shell document as they arrive. This uses the HTML fragment parser.<br />
<br />
As the browser plays the video, it must render the TDHT &lt;div> tags in sync. When the start time of a &lt;div> tag is reached, the &lt;div> tag is made active, and it is made inactive when the &lt;div> tag's end time is reached. If no start time is given, the start is assumed to be 0, and if no end time is given, the end is assumed to be infinity.<br />
<br />
An "active" &lt;div> tag may be a &lt;div> tag that is being displayed ("display: block") in contrast to an "inactive" &lt;div> tag, which may not be displayed ("display: none"). For some text formats however the difference between "active" and "inactive" may be a background colour or the display location on screen or some other mechanism. The default should be between "block" and "none", but changeable through CSS.<br />
<br />
Once the browser has parsed the TDHT file or its constituent &lt;div> tags, it is expected to keep the structure in memory. When seeking happens on the video, it can then decide which &lt;div> tags are supposed to be active at the seek time and display these. [There is a discussion to be had here about the effect this has on the DOM. Different selectors may apply to a caption depending on whether the video was played back all the way there or seeking skipped over data to get there. It was suggested that inactive captions should be removed from the DOM, so there's always a well-defined small unambiguous DOM to match selectors against. However, this may for example not be desirable on some text display formats.]<br />
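A minimal sketch of the activation model described above, assuming the browser has already parsed the &lt;div> tags into an in-memory cue list (the cue objects and their start/end fields are illustrative, not part of any specification):

```javascript
// Each parsed <div> is assumed to be kept as a cue object {start, end, ...}.
// A missing start defaults to 0 and a missing end to infinity, as described
// above. A browser would call this from a timeupdate handler, setting
// "display: block" on the returned cues' elements and "display: none" on
// the rest.
function activeCues(cues, currentTime) {
  return cues.filter((cue) => {
    const start = cue.start == null ? 0 : cue.start;
    const end = cue.end == null ? Infinity : cue.end;
    return currentTime >= start && currentTime < end;
  });
}
```

Seeking is handled the same way: after a seek, recomputing the active set at the seek target yields exactly the &lt;div> tags that should be displayed, regardless of playback history.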
<br />
= Encapsulation into Ogg =<br />
<br />
The [http://wiki.xiph.org/index.php/OggText OggText] specification is used to encapsulate a TDHT file into Ogg.<br />
<br />
The codec-specific header data for the OggText ident header is the <head>..</head> part of the TDHT file. The complete <head> tag including all its subtags is encoded into the ident header unchanged.<br />
<br />
The &lt;div> elements with all their inner HTML are the data packets of the TDHT text codec and are thus encapsulated into the data packets as text codec data. A complete &lt;div> including all its subtags is encoded into one data packet each.<br />
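The header/packet mapping described above can be sketched as follows. A real muxer would use a proper HTML parser; this illustrative version uses regular expressions and assumes a well-formed TDHT file with non-nested top-level divs, as in the example earlier on this page:

```javascript
// Toy mapping from a TDHT document string to OggText packets: the <head>
// element (with all its subtags) becomes the ident header, and each
// top-level <div> (with all its inner HTML) becomes one data packet.
// Assumes well-formed input and non-nested divs; regexes cannot handle
// nested <div> elements.
function tdhtToPackets(tdht) {
  const head = (tdht.match(/<head>[\s\S]*?<\/head>/i) || [""])[0];
  const divs = tdht.match(/<div[^>]*>[\s\S]*?<\/div>/gi) || [];
  return { identHeader: head, dataPackets: divs };
}
```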
<br />
= Direct linking on a HTML5 page =<br />
<br />
Often, subtitles and other time-aligned text files are not actually provided inside a video stream (e.g. inside Ogg), but are referenced as a separate partner resource to a video.<br />
<br />
To allow association of such files with a <video> or <audio> element, we propose the following approach:<br />
<br />
<pre><br />
<video id="video" src="http://example.com/video.ogv" controls><br />
<itext id="caption1" category="CC" lang="en/us" src="caption.srt" style=""></itext><br />
<itext id="caption2" category="CC" lang="de/de" src="caption.tdht" style=""></itext><br />
<itext id="subtitle1" category="SUB" lang="de/de" src="german.dfxp" style=""></itext><br />
<itext id="subtitle2" category="SUB" lang="jp" src="japanese.smil" style=""></itext><br />
<itext id="subtitle3" category="SUB" lang="fr" src="translation_webservice/fr/caption.srt" style=""></itext><br />
</video><br />
</pre><br />
<br />
Notice the second set of closed captions being a TDHT file.<br />
<br />
The id attribute is simply a unique identifier for the element.<br />
The category attribute is from [http://wiki.xiph.org/index.php/OggText#Categories_of_Text_Codecs Ogg text categories].<br />
The lang attribute contains a natural language according to [http://en.wikipedia.org/wiki/Language_code language codes].<br />
The src attribute contains the actual file URI that we are after.<br />
The style attribute allows styling to be attached to marked-up import files.<br />
<br />
The <itext> element would act like an <iframe> element and create the nested browsing context described earlier. It has been renamed from <text>, as used in earlier mentions of this approach, to <itext> to avoid name clashes, e.g. with SVG.<br />
<br />
The user agent would then provide an interface such as:<br />
<br />
<pre><br />
interface MediaItextElement : HTMLElement {<br />
  attribute DOMString src;<br />
  attribute DOMString category;<br />
  attribute DOMString lang;<br />
  attribute DOMString id;<br />
  attribute DOMString style;<br />
};<br />
</pre><br />
<br />
In JavaScript, additional functions will be needed, such as:<br />
<br />
<pre><br />
getItext(): returns an array of time-aligned text elements<br />
addItext({src,category,lang,style,name}): adds a time-aligned text element to a <video> or <audio> element<br />
enable(itextElement): activates display of an itext file<br />
disable(itextElement): deactivates display of an itext file<br />
delay(itextElement, seconds): delays the itext file in relation to the video file by a positive or negative number of seconds<br />
</pre></div>
Silvia
https://wiki.xiph.org/index.php?title=Timed_Divs_HTML&diff=10262
Timed Divs HTML
2009-06-21T22:27:50Z
<p>Silvia: /* Direct linking on a HTML5 page */</p>
<hr />
<div>{{draft}}<br />
<br />
= Introduction =<br />
<br />
This page specifies a subclass of HTML documents that is a time-aligned text format for audio-visual content. We call the format "timed divs within HTML" or TDHT. It is intended to be used only in a World Wide Web context, i.e. everywhere that Web browser functionality is available. Use cases for the format are subtitles, captions, annotations and other time-aligned text, as listed at http://wiki.xiph.org/index.php/OggText#Categories_of_Text_Codecs .<br />
<br />
TDHT may be similar to W3C TimedText DFXP in many respects, but unlike DFXP it does not re-invent HTML, CSS and effects; it uses existing HTML, CSS and JavaScript instead. The purpose of DFXP is to create a web-independent exchange format for timed text, which is why it cannot directly be specified as a subpart of HTML.<br />
<br />
TDHT, in contrast, is HTML with a minimal number of changes. TDHT is parsable by any HTML parser, and it works with CSS and JavaScript. No new functionality has to be defined for TDHT.<br />
<br />
<br />
= File Extension =<br />
<br />
Files in this format are to be served with the text/html MIME type, since they are valid HTML files apart from some extra attributes.<br />
<br />
Files in this format should have a file extension of .tdht to separate them from plain html files.<br />
<br />
= The TDHT format changes from HTML =<br />
<br />
TDHT files are time-aligned text. This means there is a time association with blocks of text and there is time-based seeking functionality on those blocks of text.<br />
<br />
Here is an example TDHT file for subtitles:<br />
<br />
<pre><br />
<html><br />
<head><br />
<title>Desperate Housewives - Season 5, Episode 6</title><br />
</head><br />
<body><br />
<div start="00:00:00.070" end="00:00:02.270"><br />
<p>Previously on...</p><br />
</div><br />
<div start="00:00:02.280" end="00:00:04.270"><br />
<p>We had an agreement to keep things casual.</p><br />
</div><br />
<div start="00:00:04.280" end="00:00:06.660"><br />
<p>Susan made her feelings clear.</p><br />
</div><br />
<div start="00:00:06.800" end="00:00:10.100"><br />
<p>So if I was with another woman, that wouldn't bother you? No, it wouldn't.</p><br />
</div><br />
</body><br />
</html><br />
</pre><br />
<br />
The differences between TDHT and HTML are described using [http://www.w3.org/TR/html401/ HTML 4.01], but the changes apply equally to [http://www.whatwg.org/specs/web-apps/current-work/ HTML5], which does not have a normative schema.<br />
<br />
The following changes to HTML are made for TDHT:<br />
<br />
<br />
== 1. The body element ==<br />
<br />
In HTML4.01, the [http://www.w3.org/TR/html401/struct/global.html#h-7.5 body element] is defined as follows:<br />
<br />
<pre><br />
<!ELEMENT BODY O O (%block;|SCRIPT)+ +(INS|DEL) -- document body --><br />
<!ATTLIST BODY<br />
%attrs; -- %coreattrs, %i18n, %events --<br />
onload %Script; #IMPLIED -- the document has been loaded --<br />
onunload %Script; #IMPLIED -- the document has been removed --<br />
><br />
</pre><br />
<br />
In TDHT1.0 we restrict body to just contain a sequence of div tags:<br />
<br />
<pre><br />
<!ELEMENT BODY O O (DIV)+ -- document body --><br />
<!ATTLIST BODY<br />
%attrs; -- %coreattrs, %i18n, %events --<br />
onload %Script; #IMPLIED -- the document has been loaded --<br />
onunload %Script; #IMPLIED -- the document has been removed --<br />
><br />
</pre><br />
<br />
Any tags inside the body tag that are non-conformant to this specification (such as regular HTML tags that are allowed inside body) must be ignored for TDHT.<br />
<br />
The div tags, however, can contain anything that HTML div tags can contain, thus enabling a very flexible, but time-aligned text model.<br />
<br />
== 2. The div element ==<br />
<br />
In HTML, the [http://www.w3.org/TR/html401/struct/global.html#h-7.5.4 div element] is defined as follows:<br />
<br />
<pre><br />
<!ELEMENT DIV - - (%flow;)* -- generic language/style container --><br />
<!ATTLIST DIV<br />
%attrs; -- %coreattrs, %i18n, %events --<br />
><br />
</pre><br />
<br />
In TDHT1.0 we extend it with start and end time attributes:<br />
<br />
<pre><br />
<!ELEMENT DIV - - (%flow;)* -- generic language/style container --><br />
<!ATTLIST DIV<br />
%attrs; -- %coreattrs, %i18n, %events --<br />
start %Time; #IMPLIED -- start time<br />
end %Time; #IMPLIED -- end time<br />
><br />
</pre><br />
<br />
The Time entity represents a valid time string according to HTML5: http://www.whatwg.org/specs/web-apps/current-work/#valid-time-string . The end time must be later than the start time; otherwise the div element does not exist for any duration and can never become active.<br />
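The time string handling above can be sketched in JavaScript (an illustrative sketch only; the function names are not part of this specification, and the pattern covers only the common [HH:]MM:SS[.mmm] form of valid time strings):<br />

```javascript
// Illustrative sketch (not part of the spec): parse a time string of the
// form [HH:]MM:SS[.mmm] into seconds, and check that a div's end time is
// later than its start time.
function parseTimeString(s) {
  var m = /^(?:(\d+):)?([0-5]?\d):([0-5]?\d)(?:\.(\d{1,3}))?$/.exec(s);
  if (!m) return null;                        // not a valid time string
  var hours   = m[1] ? parseInt(m[1], 10) : 0;
  var minutes = parseInt(m[2], 10);
  var seconds = parseInt(m[3], 10);
  var millis  = m[4] ? parseInt((m[4] + "00").slice(0, 3), 10) : 0;
  return hours * 3600 + minutes * 60 + seconds + millis / 1000;
}

// A div only exists for some duration if end is later than start.
function divHasDuration(start, end) {
  var s = parseTimeString(start), e = parseTimeString(end);
  return s !== null && e !== null && e > s;
}
```

A div such as the first one in the example above, with start="00:00:00.070" and end="00:00:02.270", would thus have a duration and be renderable.<br />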
<br />
&lt;div> elements in a TDHT file should be ordered by start time to simplify parsing; inside Ogg and when rendered, they are ordered by start time in any case.<br />
<br />
= Rendering in a Web Browser =<br />
<br />
A TDHT file is meant to be associated with an audio or video file and rendered in a Web browser in sync with that audio or video file.<br />
<br />
The TDHT file's div elements are not rendered into an existing HTML page; rather, a TDHT file creates its own [http://www.whatwg.org/specs/web-apps/current-work/#the-iframe-element iframe-like] nested browsing context. It is linked to the parent HTML page through an itext element that is inserted as a child of the video element. Creating a nested browsing context is important because a TDHT file can come from a different URI than the Web page. For security reasons and for general base URI computations, a nested browsing context is therefore the better approach, with the DOM nodes of the hosting page and the DOM nodes of the TDHT document in different owner documents. That way, the hosting document has the security origin of its own URL and the TDHT document has the security origin of its URL. <br />
<br />
The rendering and CSS view port is by default the rectangle occupied by the given <video> or <audio> tag, or an area provided by the hosting HTML page through the itext element's properties. The zoom factor of the iframe must be set to such a value that the width of the view port established by the itext frame is equally wide in CSS px as the video frame is wide in codec pixels. (Example: If the video encodes a frame that is 240 pixels wide but is displayed at 480 CSS px wide, the zoom factor of the itext frame should be 200% so that the box that on the outside measures 480 px appears as a box of 240 px from within the itext frame.)<br />
<br />
The itext frame is by default transparent.<br />
<br />
A TDHT file can get to a browser either as an external resource or as part of an audio or video resource (in particular inside Ogg, see below). Parsing in these two cases is slightly different for the browser.<br />
<br />
For the external TDHT file case:<br />
The TDHT file is parsed using the HTML5 parsing algorithm in its normal mode into a non-rendered DOM. To render a div, the children of the div would be cloned into the body of the rendering shell document (replacing possible previous children of body).<br />
<br />
For the Ogg-internal TDHT case:<br />
To multiplex an external TDHT file into Ogg, each div with its innerHTML would be placed into a data packet and the head data into an Ogg header. For decoding, the rendering shell document is set up and the head tag is included from the Ogg headers. To render a packet, the div and its innerHTML would be added to the innerHTML of the body element of the rendering shell document as they come. This will use the HTML fragment parser.<br />
<br />
As the browser plays the video, it must render the TDHT &lt;div> tags in sync. As the start time of a &lt;div> tag is reached, the &lt;div> tag is made active, and it is made inactive as the &lt;div> tag's end time is reached. If no start time is given, the start is assumed to be 0, and if no end time is given, the end is assumed to be infinity.<br />
<br />
An "active" &lt;div> tag may be a &lt;div> tag that is being displayed ("display: block") in contrast to an "inactive" &lt;div> tag, which may not be displayed ("display: none"). For some text formats however the difference between "active" and "inactive" may be a background colour or the display location on screen or some other mechanism. The default should be between "block" and "none", but changeable through CSS.<br />
<br />
Once the browser has parsed the TDHT file or its constituent &lt;div> tags, it is expected to keep the structure in memory. When seeking happens on the video, it can then decide which &lt;div> tags are supposed to be active at the seek time and display these. [There is a discussion to be had here about the effect this has on the DOM. Different selectors may apply to a caption depending on whether the video was played back all the way there or seeking skipped over data to get there. It was suggested that inactive captions should be removed from the DOM, so there's always a well-defined small unambiguous DOM to match selectors against. However, this may for example not be desirable for some text display formats.]<br />
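The seeking behaviour described above can be sketched as follows (an illustrative sketch, not normative; it assumes the &lt;div> tags have already been parsed into objects with start and end times in seconds, with the defaults of 0 and infinity applied for missing attributes):<br />

```javascript
// Sketch: given the parsed divs of a TDHT file as {start, end} objects
// (times in seconds), compute which divs should be active after a seek
// to time t. A missing start defaults to 0, a missing end to infinity.
function activeDivsAt(divs, t) {
  return divs.filter(function (d) {
    var start = d.start !== undefined ? d.start : 0;
    var end   = d.end   !== undefined ? d.end   : Infinity;
    return start <= t && t < end;
  });
}
```

After a seek, the browser would display exactly the divs returned by such a computation and hide all others.<br />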
<br />
= Encapsulation into Ogg =<br />
<br />
The [http://wiki.xiph.org/index.php/OggText OggText] specification is used to encapsulate a TDHT file into Ogg.<br />
<br />
The codec-specific header data for the OggText ident header is the <head>..</head> part of the TDHT file. The complete <head> tag including all its subtags is encoded into the ident header unchanged.<br />
<br />
The &lt;div> elements with all their inner HTML are the data packets of the TDHT text codec and are thus encapsulated into the data packets as text codec data. A complete &lt;div> including all its subtags is encoded into one data packet each.<br />
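The packetisation described above can be sketched as follows (illustrative only; it assumes the TDHT source is available as a string, uses regular expressions instead of a real HTML parser, and assumes no nested &lt;div> elements):<br />

```javascript
// Sketch of the OggText packetisation: the <head>...</head> part becomes
// the ident header, and each top-level <div>...</div> becomes one data
// packet. A real multiplexer would use a proper HTML parser; the regexes
// here are only illustrative and do not handle nested <div> elements.
function tdhtToPackets(tdht) {
  var head = (/<head>[\s\S]*?<\/head>/i.exec(tdht) || [""])[0];
  var divs = tdht.match(/<div[\s\S]*?<\/div>/gi) || [];
  return { identHeader: head, dataPackets: divs };
}
```

For the subtitle example above this would yield one ident header carrying the title and four data packets, one per subtitle div.<br />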
<br />
= Direct linking on a HTML5 page =<br />
<br />
Often, subtitles and other time-aligned text files are not actually provided inside a video stream (e.g. inside Ogg), but are referenced as a separate partner resource to a video.<br />
<br />
To allow association of such files with a <video> or <audio> element, we propose the following approach:<br />
<br />
<pre><br />
<video id="video" src="http://example.com/video.ogv" controls><br />
<itext id="caption1" category="CC" lang="en/us" src="caption.srt" style=""></itext><br />
<itext id="caption2" category="CC" lang="de/de" src="caption.tdht" style=""></itext><br />
<itext id="subtitle1" category="SUB" lang="de/de" src="german.dfxp" style=""></itext><br />
<itext id="subtitle2" category="SUB" lang="jp" src="japanese.smil" style=""></itext><br />
<itext id="subtitle3" category="SUB" lang="fr" src="translation_webservice/fr/caption.srt" style=""></itext><br />
</video><br />
</pre><br />
<br />
Note that the second closed caption track is a TDHT file.<br />
<br />
The id attribute is simply a unique identifier for the element.<br />
The category is from [http://wiki.xiph.org/index.php/OggText#Categories_of_Text_Codecs Ogg text categories].<br />
The lang attribute contains a natural language identifier according to [http://en.wikipedia.org/wiki/Language_code language codes].<br />
The src attribute contains the URI of the actual time-aligned text file.<br />
The style attribute allows styling to be attached to marked-up import files.<br />
<br />
The <itext> element would act like an <iframe> element and create the nested browsing context described earlier. It has been renamed from earlier mentions of this approach from <text> to <itext> to avoid name clashes with e.g. SVG.<br />
<br />
The user agent would then provide an interface such as:<br />
<br />
interface MediaItextElement : HTMLElement {<br />
attribute DOMString src;<br />
attribute DOMString category;<br />
attribute DOMString lang;<br />
attribute DOMString id;<br />
attribute DOMString style;<br />
};<br />
<br />
In JavaScript, additional functions will be needed, such as:<br />
<br />
getItext (): returns an array of time-aligned text elements<br />
addItext({src,category,lang,style,name}): adds a time-aligned text element to a <video> or <audio> element<br />
enable(itextElement): activates display of an itext file<br />
disable(itextElement) : deactivates display of an itext file<br />
delay(itextElement, seconds) : delays the itext file in relation to the video file by a positive or negative number of seconds</div>
Silvia
https://wiki.xiph.org/index.php?title=MIMETypesCodecs&diff=10231
MIMETypesCodecs
2009-06-04T11:35:12Z
<p>Silvia: added file extension sentence</p>
<hr />
<div>== Specification of MIME types and respective codecs parameter ==<br />
<br />
Also includes a specification of the recommended file extensions to use with Ogg.<br />
<br />
=== MIME Types ===<br />
<br />
The following MIME types are now officially registered with IANA and specified with the IETF as [http://www.ietf.org/rfc/rfc5334.txt RFC 5334]:<br />
<br />
* video/ogg - for video (with audio) encapsulated in Ogg<br />
** recommends a Skeleton logical bitstream<br />
** .ogv file extension<br />
** Macintosh File Type Code: OggV<br />
<br />
* audio/ogg - for audio encapsulated in Ogg<br />
** recommends a Skeleton logical bitstream<br />
** .oga file extension, .ogg for Vorbis I, .spx for Speex<br />
** Macintosh File Type Code: OggA<br />
<br />
* application/ogg - for complex, multitrack, multiplexed files encapsulated in Ogg<br />
** requires a Skeleton logical bitstream<br />
** .ogx file extension<br />
** Macintosh File Type Code: OggX<br />
<br />
<br />
[[MIME_Types_and_File_Extensions|Other MIME types]] are still in the process of being registered.<br />
<br />
=== Codecs Parameter ===<br />
<br />
[http://www.rfc-editor.org/rfc/rfc4281.txt Typically], MIME types of media encapsulation formats use the optional "codecs" parameter to specify which codecs are being used in a particular file.<br />
<br />
Codecs encapsulated in Ogg require a text identifier at the beginning of the first header page to identify the encapsulated codecs. The following table contains the identifiers for existing Xiph codecs and the codecs parameter names used for */ogg MIME types (in alphabetical order):<br />
<br />
{| class="codecstable" border="1"<br />
|-<br />
! Codecs Parameter Name<br />
! Codec Type<br />
! Codec Identifier<br />
(char, hex, octal)<br />
! Version Field (if available)<br />
|-<br />
| [http://svn.annodex.net/liboggz/trunk/src/liboggz/oggz_auto.h celt]<br />
| audio<br />
| char[0,8]: <tt>'CELT\ \ \ \ '</tt><br />
hex: <tt>'0x43 0x45 0x4c 0x54 0x20 0x20 0x20 0x20'</tt><br />
<br />
oct: <tt>'0103 0105 0114 0124 0040 0040 0040 0040'</tt><br />
| char[28,4]: version id<br />
|-<br />
| [http://svn.annodex.net/liboggz/trunk/src/liboggz/oggz_auto.h cmml]<br />
| text<br />
| char[0,8]: <tt>'CMML\0\0\0\0'</tt><br />
hex: <tt>'0x43 0x4d 0x4d 0x4c 0x00 0x00 0x00 0x00'</tt><br />
<br />
oct: <tt>'0103 0115 0115 0114 0000 0000 0000 0000'</tt><br />
| char[8,2]: major version number,<br />
char[10,2]: minor version number<br />
|-<br />
| [http://wiki.xiph.org/index.php/OggDirac dirac]<br />
| video<br />
| char[0,5]: <tt>'BBCD\0'</tt><br />
hex: <tt>'0x42 0x42 0x43 0x44 0x00'</tt><br />
<br />
oct: <tt>'0102 0102 0103 0104 0000'</tt><br />
| ??<br />
|-<br />
| [http://flac.sourceforge.net/ogg_mapping.html flac]<br />
| audio<br />
| char[0,5]: <tt>'\177FLAC'</tt><br />
hex: <tt>'0x7F 0x46 0x4C 0x41 0x43'</tt><br />
<br />
oct: <tt>'0177 0106 0114 0101 0103'</tt><br />
| char[5,1]: binary major version number, <br />
char[6,1]: binary minor version number of mapping<br />
|-<br />
| [[OggMNG|jng]]<br />
| video<br />
| char[0,8]: <tt>'\213JNG\r\n\032\n'</tt><br />
hex: <tt>'0x8b 0x4a 0x4e 0x47 0x0D 0x0A 0x1A 0x0A'</tt><br />
<br />
oct: <tt>'0213 0112 0116 0107 0015 0012 0032 0012'</tt><br />
| ??<br />
|-<br />
| [[OggKate|kate]]<br />
| text<br />
| char[0,8]: <tt>'\x80kate\0\0\0'</tt><br />
hex: <tt>'0x80 0x6b 0x61 0x74 0x65 0x00 0x00 0x00'</tt><br />
<br />
oct: <tt>'0200 0153 0141 0164 0145 0000 0000 0000'</tt><br />
| char[9,1]: major version number,<br />
char[10,1]: minor version number<br />
|-<br />
| [http://lists.xiph.org/pipermail/vorbis-dev/2001-August/004501.html midi]<br />
| text<br />
| char[0,8]: <tt>'OggMIDI\0'</tt><br />
hex: <tt>'0x4f 0x67 0x67 0x4d 0x49 0x44 0x49 0x00'</tt><br />
<br />
oct: <tt>'0117 0147 0147 0115 0111 0104 0111 0000'</tt><br />
| char[8,1]: version field<br />
|-<br />
| [[OggMNG|mng]]<br />
| video<br />
| char[0,8]: <tt>'\212MNG\r\n\032\n'</tt><br />
hex: <tt>'0x8a 0x4d 0x4e 0x47 0x0D 0x0A 0x1A 0x0A'</tt><br />
<br />
oct: <tt>'0212 0115 0116 0107 0015 0012 0032 0012'</tt><br />
| ??<br />
|-<br />
| [[OggPCM|pcm]]<br />
| audio<br />
| char[0,8]: <tt>'PCM\ \ \ \ \ '</tt><br />
hex: <tt>'0x50 0x43 0x4d 0x20 0x20 0x20 0x20 0x20'</tt><br />
<br />
oct: <tt>'0120 0103 0115 0040 0040 0040 0040 0040'</tt><br />
| char[8,2]: version major field,<br />
char[10,2]: version minor field<br />
|-<br />
| [[OggMNG|png]]<br />
| video<br />
| char[0,8]: <tt>'\211PNG\r\n\032\n'</tt><br />
hex: <tt>'0x89 0x50 0x4e 0x47 0x0D 0x0A 0x1A 0x0A'</tt><br />
<br />
oct: <tt>'0211 0120 0116 0107 0015 0012 0032 0012'</tt><br />
| ??<br />
|-<br />
| [http://svn.annodex.net/liboggz/trunk/src/liboggz/oggz_auto.h speex]<br />
| audio<br />
| char[0,8]: <tt>'Speex\ \ \ '</tt><br />
hex: <tt>'0x53 0x70 0x65 0x65 0x78 0x20 0x20 0x20'</tt><br />
<br />
oct: <tt>'0123 0160 0145 0145 0170 0040 0040 0040'</tt><br />
| char[28,4]: version id<br />
|-<br />
| [http://svn.annodex.net/liboggz/trunk/src/liboggz/oggz_auto.h theora]<br />
| video<br />
| char[0,7]: <tt>'\x80theora'</tt><br />
hex: <tt>'0x80 0x74 0x68 0x65 0x6f 0x72 0x61'</tt><br />
<br />
oct: <tt>'0200 0164 0150 0145 0157 0162 0141'</tt><br />
| char[7,1]: major version number,<br />
char[8,1]: minor version number,<br />
<br />
char[9,1]: version revision number<br />
|-<br />
| [http://svn.annodex.net/liboggz/trunk/src/liboggz/oggz_auto.h vorbis]<br />
| audio<br />
| char[0,7]: <tt>'\x01vorbis'</tt><br />
hex: <tt>'0x01 0x76 0x6f 0x72 0x62 0x69 0x73'</tt><br />
<br />
oct: <tt>'0001 0166 0157 0162 0142 0151 0163'</tt><br />
| char[7,4]: version field<br />
|-<br />
| [[OggYUV4MPEG|yuv4mpeg]]<br />
| video<br />
| char[0,8]: <tt>'YUV4MPEG'</tt><br />
hex: <tt>'0x59 0x55 0x56 0x34 0x4d 0x50 0x45 0x47'</tt><br />
<br />
oct: <tt>'0131 0125 0126 0064 0115 0120 0105 0107'</tt><br />
| char[8,1] = '2' (0x32) for yuv4mpeg format version 2<br />
|}<br />
<br />
The "char[x,y]" fields mean here: start at byte number x (counting from 0) for a length of y bytes.<br />
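As an illustration, a parser could identify the codec of a logical bitstream by matching these identifiers against the first bytes of its first header packet (only a few table entries are shown here; the byte values are taken from the hex column above):<br />

```javascript
// Illustrative sketch: match known codec identifiers against the first
// bytes of the first header packet of an Ogg logical bitstream.
var CODEC_MAGIC = [
  { name: "theora", bytes: [0x80, 0x74, 0x68, 0x65, 0x6f, 0x72, 0x61] },
  { name: "vorbis", bytes: [0x01, 0x76, 0x6f, 0x72, 0x62, 0x69, 0x73] },
  { name: "flac",   bytes: [0x7f, 0x46, 0x4c, 0x41, 0x43] },
  { name: "speex",  bytes: [0x53, 0x70, 0x65, 0x65, 0x78, 0x20, 0x20, 0x20] }
];

function identifyCodec(packet) {   // packet: array of byte values
  for (var i = 0; i < CODEC_MAGIC.length; i++) {
    var m = CODEC_MAGIC[i].bytes;
    if (m.every(function (b, j) { return packet[j] === b; }))
      return CODEC_MAGIC[i].name;
  }
  return null;                     // unknown codec
}
```

This is essentially what tools like the *nix "file" command or liboggz's oggz_auto do with the magic numbers listed in this table.<br />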
<br />
[[Category:Ogg]]</div>
Silvia
https://wiki.xiph.org/index.php?title=MIMETypesCodecs&diff=10230
MIMETypesCodecs
2009-06-04T11:33:25Z
<p>Silvia: added audio; moved the ogx section down</p>
<hr />
<div>== Specification of MIME types and respective codecs parameter ==<br />
<br />
=== MIME Types ===<br />
<br />
The following MIME types are now officially registered with IANA and specified with the IETF as [http://www.ietf.org/rfc/rfc5334.txt RFC 5334]:<br />
<br />
* video/ogg - for video (with audio) encapsulated in Ogg<br />
** recommends a Skeleton logical bitstream<br />
** .ogv file extension<br />
** Macintosh File Type Code: OggV<br />
<br />
* audio/ogg - for audio encapsulated in Ogg<br />
** recommends a Skeleton logical bitstream<br />
** .oga file extension, .ogg for Vorbis I, .spx for Speex<br />
** Macintosh File Type Code: OggA<br />
<br />
* application/ogg - for complex, multitrack, multiplexed files encapsulated in Ogg<br />
** requires a Skeleton logical bitstream<br />
** .ogx file extension<br />
** Macintosh File Type Code: OggX<br />
<br />
<br />
[[MIME_Types_and_File_Extensions|Other MIME types]] are still in the process of being registered.<br />
<br />
=== Codecs Parameter ===<br />
<br />
[http://www.rfc-editor.org/rfc/rfc4281.txt Typically], MIME types of media encapsulation formats use the optional "codecs" parameter to specify which codecs are being used in a particular file.<br />
<br />
Codecs encapsulated in Ogg require a text identifier at the beginning of the first header page to identify the encapsulated codecs. The following table contains the identifiers for existing Xiph codecs and the codecs parameter names used for */ogg MIME types (in alphabetical order):<br />
<br />
{| class="codecstable" border="1"<br />
|-<br />
! Codecs Parameter Name<br />
! Codec Type<br />
! Codec Identifier<br />
(char, hex, octal)<br />
! Version Field (if available)<br />
|-<br />
| [http://svn.annodex.net/liboggz/trunk/src/liboggz/oggz_auto.h celt]<br />
| audio<br />
| char[0,8]: <tt>'CELT\ \ \ \ '</tt><br />
hex: <tt>'0x43 0x45 0x4c 0x54 0x20 0x20 0x20 0x20'</tt><br />
<br />
oct: <tt>'0103 0105 0114 0124 0040 0040 0040 0040'</tt><br />
| char[28,4]: version id<br />
|-<br />
| [http://svn.annodex.net/liboggz/trunk/src/liboggz/oggz_auto.h cmml]<br />
| text<br />
| char[0,8]: <tt>'CMML\0\0\0\0'</tt><br />
hex: <tt>'0x43 0x4d 0x4d 0x4c 0x00 0x00 0x00 0x00'</tt><br />
<br />
oct: <tt>'0103 0115 0115 0114 0000 0000 0000 0000'</tt><br />
| char[8,2]: major version number,<br />
char[10,2]: minor version number<br />
|-<br />
| [http://wiki.xiph.org/index.php/OggDirac dirac]<br />
| video<br />
| char[0,5]: <tt>'BBCD\0'</tt><br />
hex: <tt>'0x42 0x42 0x43 0x44 0x00'</tt><br />
<br />
oct: <tt>'0102 0102 0103 0104 0000'</tt><br />
| ??<br />
|-<br />
| [http://flac.sourceforge.net/ogg_mapping.html flac]<br />
| audio<br />
| char[0,5]: <tt>'\177FLAC'</tt><br />
hex: <tt>'0x7F 0x46 0x4C 0x41 0x43'</tt><br />
<br />
oct: <tt>'0177 0106 0114 0101 0103'</tt><br />
| char[5,1]: binary major version number, <br />
char[6,1]: binary minor version number of mapping<br />
|-<br />
| [[OggMNG|jng]]<br />
| video<br />
| char[0,8]: <tt>'\213JNG\r\n\032\n'</tt><br />
hex: <tt>'0x8b 0x4a 0x4e 0x47 0x0D 0x0A 0x1A 0x0A'</tt><br />
<br />
oct: <tt>'0213 0112 0116 0107 0015 0012 0032 0012'</tt><br />
| ??<br />
|-<br />
| [[OggKate|kate]]<br />
| text<br />
| char[0,8]: <tt>'\x80kate\0\0\0'</tt><br />
hex: <tt>'0x80 0x6b 0x61 0x74 0x65 0x00 0x00 0x00'</tt><br />
<br />
oct: <tt>'0200 0153 0141 0164 0145 0000 0000 0000'</tt><br />
| char[9,1]: major version number,<br />
char[10,1]: minor version number<br />
|-<br />
| [http://lists.xiph.org/pipermail/vorbis-dev/2001-August/004501.html midi]<br />
| text<br />
| char[0,8]: <tt>'OggMIDI\0'</tt><br />
hex: <tt>'0x4f 0x67 0x67 0x4d 0x49 0x44 0x49 0x00'</tt><br />
<br />
oct: <tt>'0117 0147 0147 0115 0111 0104 0111 0000'</tt><br />
| char[8,1]: version field<br />
|-<br />
| [[OggMNG|mng]]<br />
| video<br />
| char[0,8]: <tt>'\212MNG\r\n\032\n'</tt><br />
hex: <tt>'0x8a 0x4d 0x4e 0x47 0x0D 0x0A 0x1A 0x0A'</tt><br />
<br />
oct: <tt>'0212 0115 0116 0107 0015 0012 0032 0012'</tt><br />
| ??<br />
|-<br />
| [[OggPCM|pcm]]<br />
| audio<br />
| char[0,8]: <tt>'PCM\ \ \ \ \ '</tt><br />
hex: <tt>'0x50 0x43 0x4d 0x20 0x20 0x20 0x20 0x20'</tt><br />
<br />
oct: <tt>'0120 0103 0115 0040 0040 0040 0040 0040'</tt><br />
| char[8,2]: version major field,<br />
char[10,2]: version minor field<br />
|-<br />
| [[OggMNG|png]]<br />
| video<br />
| char[0,8]: <tt>'\211PNG\r\n\032\n'</tt><br />
hex: <tt>'0x89 0x50 0x4e 0x47 0x0D 0x0A 0x1A 0x0A'</tt><br />
<br />
oct: <tt>'0211 0120 0116 0107 0015 0012 0032 0012'</tt><br />
| ??<br />
|-<br />
| [http://svn.annodex.net/liboggz/trunk/src/liboggz/oggz_auto.h speex]<br />
| audio<br />
| char[0,8]: <tt>'Speex\ \ \ '</tt><br />
hex: <tt>'0x53 0x70 0x65 0x65 0x78 0x20 0x20 0x20'</tt><br />
<br />
oct: <tt>'0123 0160 0145 0145 0170 0040 0040 0040'</tt><br />
| char[28,4]: version id<br />
|-<br />
| [http://svn.annodex.net/liboggz/trunk/src/liboggz/oggz_auto.h theora]<br />
| video<br />
| char[0,7]: <tt>'\x80theora'</tt><br />
hex: <tt>'0x80 0x74 0x68 0x65 0x6f 0x72 0x61'</tt><br />
<br />
oct: <tt>'0200 0164 0150 0145 0157 0162 0141'</tt><br />
| char[7,1]: major version number,<br />
char[8,1]: minor version number,<br />
<br />
char[9,1]: version revision number<br />
|-<br />
| [http://svn.annodex.net/liboggz/trunk/src/liboggz/oggz_auto.h vorbis]<br />
| audio<br />
| char[0,7]: <tt>'\x01vorbis'</tt><br />
hex: <tt>'0x01 0x76 0x6f 0x72 0x62 0x69 0x73'</tt><br />
<br />
oct: <tt>'0001 0166 0157 0162 0142 0151 0163'</tt><br />
| char[7,4]: version field<br />
|-<br />
| [[OggYUV4MPEG|yuv4mpeg]]<br />
| video<br />
| char[0,8]: <tt>'YUV4MPEG'</tt><br />
hex: <tt>'0x59 0x55 0x56 0x34 0x4d 0x50 0x45 0x47'</tt><br />
<br />
oct: <tt>'0131 0125 0126 0064 0115 0120 0105 0107'</tt><br />
| char[8,1] = '2' (0x32) for yuv4mpeg format version 2<br />
|}<br />
<br />
The "char[x,y]" fields mean here: start at byte number x (counting from 0) for a length of y bytes.<br />
<br />
[[Category:Ogg]]</div>
Silvia
https://wiki.xiph.org/index.php?title=ROE&diff=10172
ROE
2009-04-13T06:28:44Z
<p>Silvia: </p>
<hr />
<div>Rich Open multitrack media Exposition (ROE)<br />
<br />
<br />
= Overview =<br />
<br />
ROE (Rich Open multitrack media Exposition) is a way of describing the relationships between tracks of media in a stream. It is used to group tracks which have similar purpose and to identify alternatives.<br />
<br />
= Usage =<br />
<br />
== Authoring ==<br />
One use of ROE is to author a multi-track audio-visual stream from multiple input files. In this document, we present a description of how to use ROE to author multi-track Ogg files.<br />
<br />
== Dynamic Web Requests ==<br />
Another use of ROE is in a Web client-server scenario. The Web server uses ROE as a means of representing the different tracks that are available for a multi-track Web resource. A Web client may not require all available tracks to present the resource to the user. It may decide to request the ROE representation first and then request only a subset of tracks from the server, e.g. only the English soundtrack. Or it may directly request particular tracks only. The server will use the request from the client to dynamically compose a multi-track stream with the requested tracks and mandatory tracks and serve this to satisfy the resource request.<br />
<br />
== ROE in Use ==<br />
A draft version of the spec is in use in the MediaWiki extension [http://metavid.org/w/index.php/MetaVidWiki MetaVidWiki]. This runs on the site [http://metavid.org metavid.org] and is used for remote embedding. In [http://metavid-mike.blogspot.com/ this blog], for example, all the clips reference a single ROE file to expose multiple video tracks and text transcripts. [http://metavid.org/w/index.php?title=Special:MvExportStream&feed_format=roe&stream_name=House_proceeding_06-09-08_01&t=0%3A01%3A38%2F0%3A10%3A00 Sample ROE output] from Metavid.<br />
<br />
= The ROE model =<br />
<br />
Here we describe two representations of ROE: that of ROE XML, and that of ROE in Ogg Skeleton. Each representation is capable of entirely encoding the relationships of the ROE model, such that it is possible to losslessly convert between them.<br />
<br />
= ROE XML =<br />
<br />
ROE XML is an XML markup language that describes a hierarchical serialization of the ROE model.<br />
<br />
A ROE XML file is an instance document of the [http://svn.annodex.net/standards/roe/roe_1_0.xsd ROE XML schema].<br />
<br />
It is composed of a <head> tag followed by a <body> tag. <br />
<br />
<br />
== Head Element ==<br />
<br />
=== Head Tags ===<br />
The <head> tag is optional and may contain:<br />
<br />
* a <title> tag to provide a textual description for the multi-track stream,<br />
* a set of <link> tags that provide an alternative representation of the multi-track stream, e.g. as a html document,<br />
* an <img> tag to provide a representative thumbnail for the multi-track stream,<br />
* a set of <meta> tags that provide structured name-value annotations of the multi-track stream,<br />
* a <base> tag to provide a base URI for resources referred to in the ROE file, and<br />
* a set of <profile> tags that allows description of so-called track profiles.<br />
<br />
The <title>, <link>, <meta>, and <base> tags are taken out of [http://www.w3.org/TR/xhtml1-schema/ XHTML] and serve the same purpose as they serve there.<br />
<br />
=== Track Profiles ===<br />
A track profile is a combination of tracks that is pre-defined within the ROE file and can be accessed by Web clients or authoring applications directly. Examples of such profiles are the Director's cut, or the Australian version.<br />
<br />
A profile defines a list of references to the tracks of a media resource, and possibly a selection among a track's alternative media sources, to be used for a particular pre-defined view of the resource.<br />
<br />
To that end, the profile element has a subelement called "partial" which contains the ID of a selected track and potentially the ID of a selected alternate media source for the track.<br />
<br />
An example profile is:<br />
<br />
<profile name="director's cut"><br />
<partial track="v" select="v1" /><br />
<partial track="a" /><br />
</profile><br />
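How an application might resolve such a profile can be sketched as follows (a hypothetical sketch; the function name and data shapes are illustrative and not part of ROE):<br />

```javascript
// Hypothetical sketch: each <partial> names a track to include and may
// select one of the track's alternate media sources; without a selection
// we fall back to the track's first (default) source.
function resolveProfile(profile, tracks) {
  return profile.partials.map(function (p) {
    var track = tracks[p.track];
    var source = p.select
      ? track.sources.filter(function (s) { return s.id === p.select; })[0]
      : track.sources[0];
    return { track: p.track, source: source };
  });
}
```

For the "director's cut" profile above, this would select source v1 of track v and the default source of track a.<br />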
<br />
The <head> tag essentially separates the profiles from the core document structure being provided in the <body> element.<br />
<br />
<br />
== Body Element ==<br />
<br />
The <body> tag consists of a sequence of <track> elements that each describe a logical media track.<br />
<br />
=== The Track Tag ===<br />
A media track may consist of one of:<br />
<br />
* a media source, such as an audio, video, or text stream described in a <mediaSource> tag,<br />
* a sequence of media sources described in a <seq> tag with start and end times, or<br />
* a set of alternate media sources described in a <switch> tag, only one of which can be selected.<br />
<br />
The <track> element contains a mandatory "provides" attribute, which introduces a virtual label such as "commentary", "video", "audio", "textoverlay", "closedcaption", "logo", or "scoreboard". The track provides that kind of content.<br />
<br />
=== The Switch Tag ===<br />
The <switch> tag provides a choice between alternates, distinguished for a specific reason. The reason is given in the "distinction" attribute of the <switch> tag.<br />
<br />
Inside a <switch> tag, the choices can be specified through the following means:<br />
<br />
* directly as a <mediaSource>,<br />
* as a sequence of media sources in a <seq> element, or<br />
* as the outcome of another <switch> tag.<br />
<br />
Example <switch> element:<br />
<br />
<switch distinction="language" default="a3"><br />
<switch id="a1" distinction="bitrate" default="a1b1"><br />
<mediaSource id="a1b1" lang="en" content-type="audio/vorbis" src="http://example.com/lang1b1.oga" /><br />
<mediaSource id="a1b2" lang="en" content-type="audio/vorbis" src="http://example.com/lang1b2.oga" /><br />
</switch><br />
<mediaSource id="a2" lang="de" content-type="audio/vorbis" src="http://example.com/lang2.oga" /><br />
<seq id="a3"><br />
<mediaSource id="a3a" lang="fr" content-type="audio/vorbis" src="http://example.com/lang3a.oga" /><br />
<mediaSource id="a3b" lang="fr" content-type="audio/vorbis" src="http://example.com/lang3b.oga" /><br />
</seq><br />
</switch><br />
<br />
In this example, we have a choice between three languages: en, de and fr.<br />
The English language track also comes in two different bitrates.<br />
The French language track comes in two different files that should be played in sequence.<br />
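Resolution of such a <switch> element can be sketched as follows (a hypothetical sketch; it assumes each choice object carries a property named after the distinction, e.g. lang, and the data shapes are illustrative):<br />

```javascript
// Hypothetical sketch: resolve a <switch> against a set of preferences
// keyed by distinction, falling back to the declared default. Choices
// may themselves be nested switches, so resolution recurses.
function resolveSwitch(sw, prefs) {
  var want = prefs[sw.distinction];
  var choice = want !== undefined
    ? sw.choices.filter(function (c) { return c[sw.distinction] === want; })[0]
    : undefined;
  if (!choice)
    choice = sw.choices.filter(function (c) { return c.id === sw["default"]; })[0];
  return choice && choice.choices ? resolveSwitch(choice, prefs) : choice;
}
```

With a preference of lang "en" and no bitrate preference, the example switch above would resolve to the inner switch's default a1b1; with no preferences at all, to the default a3.<br />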
<br />
=== Inline XML files ===<br />
Some media source elements are XML documents themselves. These can be represented inline in a ROE file. The purpose of this is to contain all or some of the annotation information of a media resource inside one XML file. To this end, the "inline" attribute can have the values "false", "partial" or "full".<br />
<br />
An example inline XML file is the use of CMML inside a ROE track:<br />
<br />
<track id="t1" provides="caption"><br />
<mediaSource id="c" src="http://example.com/cmml1.cmml" inline="partial" content-type="text/cmml" ><br />
<cmml role="caption" xmlns:cmml="http://www.annodex.org/spec/cmml/cmml40"><br />
<cmml:head><br />
<cmml:title>random 1</cmml:title><br />
</cmml:head><br />
<cmml:clip start="t1" end="t2"><br />
<cmml:body><br />
<html:p><html:span>rillian:</html:span>FOMS rocks</html:p><br />
</cmml:body><br />
</cmml:clip><br />
</cmml><br />
</mediaSource><br />
</track><br />
<br />
== An example ROE XML file ==<br />
<br />
Putting it all together, here is an example of a ROE XML file:<br />
<br />
<?xml version="1.0"?><br />
<ROE xmlns="http://www.xiph.org/roe1.0"<br />
     xmlns:html="http://www.w3.org/1999/xhtml"><br />
<head><br />
<link id="html_linkback" rel="alternate" type="text/html" href="http://example.com/full_video.html"/><br />
<img id="stream_thumb" src="http://example.com/full_video.jpg"/><br />
<title>Example video</title><br />
<profile name="director's cut"><br />
<partial track="v" select="v1" /><br />
<partial track="a" /> <br />
</profile><br />
</head><br />
<body><br />
<track id="v" provides="video"><br />
<switch distinction="angle" default="v1"><br />
<mediaSource id="v1" content-type="video/theora" src="http://example.com/angle1.ogv?track=v1&amp;t=t1/t2" /><br />
<mediaSource id="v2" content-type="video/theora" src="http://example.com/angle2.ogv" /><br />
</switch><br />
</track><br />
<track id="a" provides="audio"><br />
<switch distinction="Content-Language" default="a3"><br />
<switch id="a1" distinction="bitrate" default="a1b1"><br />
<mediaSource id="a1b1" lang="en" content-type="audio/vorbis" src="http://example.com/lang1b1.oga" /><br />
<mediaSource id="a1b2" lang="en" content-type="audio/vorbis" src="http://example.com/lang1b2.oga" /><br />
</switch><br />
<mediaSource id="a2" lang="de" content-type="audio/vorbis" src="http://example.com/lang2.oga" /><br />
<seq id="a3"><br />
<mediaSource id="a3a" lang="fr" content-type="audio/vorbis" src="http://example.com/lang3a.oga" /><br />
<mediaSource id="a3b" lang="fr" content-type="audio/vorbis" src="http://example.com/lang3b.oga" /><br />
</seq><br />
</switch><br />
</track><br />
<track id="t" provides="text overlay"><br />
<switch distinction="Content-Language" default="t1"><br />
<mediaSource id="t1" lang="en" content-type="text/cmml" src="http://example.com/transcript1.cmml" /><br />
<mediaSource id="t2" lang="de" content-type="text/cmml" src="http://example.com/transcript2.cmml" /><br />
<mediaSource id="t3" lang="fr" content-type="text/cmml" src="http://example.com/transcript3.cmml" /><br />
</switch><br />
</track><br />
<track id="l" provides="logo" default="O1"><br />
<seq><br />
<mediaSource id="O1" content-type="application/ogg" src="http://example.com/mng.ogx?track=1" /><br />
<mediaSource id="O2" content-type="application/ogg" src="http://example.com/mng.ogx?track=2" /><br />
</seq><br />
</track><br />
</body><br />
</ROE><br />
<br />
= Representation in Skeleton =<br />
<br />
When the relationships described by ROE are written into an Ogg stream, they are encoded using the message header fields of Ogg Skeleton fisbones for each track. One of the primary design goals for fisbone headers is to minimize the need for global information to be stored in a stream. Each track's fisbone contains headers describing only itself and its relationship to other tracks in the stream. This allows tracks to be inserted or removed at the Ogg level without needing to modify any data in individual headers.<br />
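As a rough illustration, the HTTP-style "Name: value" message-header block that a fisbone might carry for one track can be sketched like this. The helper is illustrative only, not a real Skeleton encoder, and the example field values are invented:<br />

```python
# Sketch: serialising per-track relationship info as the HTTP-style
# "Name: value" message-header fields carried in a Skeleton fisbone packet.
# Field names mirror the headers described below; values are invented.
def fisbone_message_headers(fields):
    """fields: dict of header name -> value; returns a CRLF-delimited block."""
    return "".join(f"{name}: {value}\r\n" for name, value in fields.items())

caption_track = {
    "Content-Type": "text/cmml",
    "Provides": "caption",
    "Depends": "audio",   # only meaningful alongside an audio track
}
print(fisbone_message_headers(caption_track))
```

Because each fisbone describes only its own track, adding or dropping a track means adding or dropping one such header block, leaving all others untouched.<br />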
<br />
== Relationships ==<br />
<br />
Relationships between tracks are given by the following headers:<br />
<br />
=== Provides ===<br />
<br />
''Provides'' introduces a virtual label, such as "commentary", which this track provides. Multiple tracks may provide the same label; as long as at least one of them is present, a dependency on that label can be satisfied.<br />
<br />
=== Depends ===<br />
<br />
''Depends'' refers to either a virtual label provided by another track, or an explicit track ID. It declares that it is not valid to include this track in a stream unless the track it depends on is present. An example use of this might be the captioning of sound effects for the deaf, which may not make sense unless the captioning of speech (in an appropriate language) is also rendered.<br />
<br />
When removing a track from a file, any other tracks dependent on it must also be removed.<br />
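This cascading rule can be sketched as a fixpoint computation. The dictionary layout (one provides label and at most one depends reference per track) is a simplification assumed for illustration:<br />

```python
# Sketch of the cascading-removal rule: when a track is removed, every
# track whose Depends can no longer be satisfied must be removed too.
# Depends may name either a track id or a virtual label from Provides.
def tracks_to_remove(tracks, removed_id):
    """tracks: dict id -> {"provides": str, "depends": str or None}.
    Returns the full set of ids that must go along with removed_id."""
    doomed = {removed_id}
    changed = True
    while changed:
        changed = False
        alive = {tid: t for tid, t in tracks.items() if tid not in doomed}
        # a dependency is satisfiable by a surviving id or a surviving label
        satisfiable = set(alive) | {t["provides"] for t in alive.values()}
        for tid, t in alive.items():
            if t["depends"] and t["depends"] not in satisfiable:
                doomed.add(tid)
                changed = True
    return doomed

tracks = {
    "a":  {"provides": "audio",   "depends": None},
    "c":  {"provides": "caption", "depends": "audio"},
    "fx": {"provides": "caption", "depends": "c"},
}
print(tracks_to_remove(tracks, "a"))   # removing the audio cascades to c and fx
```

Removing a leaf track such as "fx" removes only that track, while removing "a" takes the whole dependency chain with it.<br />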
<br />
=== Recommends ===<br />
<br />
''Recommends'' refers to either a virtual label provided by another track, or an explicit track ID.<br />
<br />
=== Suggests ===<br />
<br />
''Suggests'' refers to either a virtual label provided by another track, or an explicit track ID.<br />
<br />
=== Conflicts ===<br />
<br />
''Conflicts'' refers to either a virtual label provided by another track, or an explicit track ID.<br />
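Assuming ''Conflicts'' means the named label or track must not be present alongside this one (the text above does not spell this out), and ''Depends'' means it must be, a validity check over a chosen set of tracks might look like:<br />

```python
# Hypothetical validity check for a chosen track set. The semantics of
# Conflicts (mutual exclusion) is an assumption; Depends follows the
# definition above. One label/reference per track is a simplification.
def selection_valid(catalog, chosen):
    """catalog: dict id -> {"provides", "depends", "conflicts"}."""
    labels = {catalog[t]["provides"] for t in chosen} | set(chosen)
    for tid in chosen:
        t = catalog[tid]
        if t["depends"] and t["depends"] not in labels:
            return False          # unsatisfied dependency
        if t["conflicts"] and t["conflicts"] in labels:
            return False          # mutually exclusive tracks both present
    return True

catalog = {
    "spoken": {"provides": "caption", "depends": "audio",  "conflicts": None},
    "audio1": {"provides": "audio",   "depends": None,     "conflicts": None},
    "audio2": {"provides": "audio",   "depends": None,     "conflicts": "audio1"},
}
print(selection_valid(catalog, {"audio1", "spoken"}))   # dependency satisfied
print(selection_valid(catalog, {"audio1", "audio2"}))   # conflict detected
```
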
<br />
== Serving Suggestions ==<br />
<br />
=== Disposition ===<br />
<br />
= HTTP-style message headers for client-server negotiation =</div>