XMLEmbedding: Difference between revisions

From XiphWiki
(typos and refinement)
(Made clear spec is for both XML files and XML streams)
 
(12 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{{draft}}
{{draft}}
Schemes such as [[MDMF]] require an embedding in physical Ogg streams. This page is for development of a specification for embedding metadata streams in Ogg. The final version will probably look like the CMML mapping.
Schemes such as [[M3F]] require an embedding in physical Ogg streams. This page is for development of a specification for embedding XML files and XML streams in Ogg. The final version will probably look like the CMML mapping.


==Current plans==
==Current plans==
Line 6: Line 6:


===Magic number?===
===Magic number?===
Should we have a magic number for this? By convention the beginning of stream packet for codecs in Ogg identifies the packet, Strictly speaking we will have a magic number regardless, but should it be:
Should we have a magic number for this? By convention the beginning of stream packet for codecs in Ogg identifies the packet, Strictly speaking we will have a magic number regardless, but should it be:


* '<?xml' the opening of the XML declaration. In this instance the demuxer would pass this upwards to an XML parser which would derive from the rest of the bos packet what to do with it.
* '<?xml' the opening of the XML declaration. In this instance the demuxer would pass this upwards to an XML parser which would derive from the rest of the bos packet what to do with it.


* Some other sequence before the XML starts, identifying the particular stream type, probably with a version number which will imply the contents to be some form of XML. CMML does this. This may avoid the almost inevitable problem of some implementor assuming '<?xml' is always metadata, it may also reduce the difficulty of writing a demuxer. Demuxers could still peak ahead and try to look of an xml namespace it recognizes.
* Some other sequence before the XML starts, identifying the particular stream type, probably with a version number which will imply the contents to be some form of XML. CMML does this. This may avoid the almost inevitable problem of some implementor assuming '<?xml' is always metadata, it may also reduce the difficulty of writing a demuxer. Demuxers could still peek ahead and try to look for an xml namespace it recognizes.
 
===Division into packets===
 
The raw XML should ideally be broken into packets in a way that the loss of some packets, while destroying information, does not result in an invalid stream. Generally this means:
 
* The bos packet should consist of any initial processing directives, namespace declarations, and the root tag if there is one.
* Subsequent packets should be valid xml stanzas, similar to the [http://xmpp.org/ XMPP] definition, which concatenated are also valid xml.
* The eos packet can be empty. If there was a root tag in the bos packet, it should be closed here.
 
Parsers should close all open tags on encountering eos to handle truncated stream conditions. Encountering eos here means either having processed a packet marked with the eos flag, having finished on an Ogg page with the eos flag set, or a virtual eos, implied by encountering bos flags for a new chain segment.


===Granulepos mapping===
===Granulepos mapping===
c.f. Silvia Pfeiffer on ogg-dev:
 
c.f. Silvia Pfeiffer on [http://lists.xiph.org/pipermail/ogg-dev/2007-September/000570.html ogg-dev]:
<blockquote>I suggest using the solution that CMML has come to use.
<blockquote>I suggest using the solution that CMML has come to use.


Line 25: Line 37:
Also, use the granulepos scheme that we defined for CMML pages- you're
Also, use the granulepos scheme that we defined for CMML pages- you're
going to make your lives easier.</blockquote>
going to make your lives easier.</blockquote>
That is, use a split granulepos scheme like keyframe codecs to indicate the offset to the previous packet.
Whether this is appropriate for a given XML stream will depend on its application. Metadata that applies
to the whole stream should just be included at the beginning, like OggSkeleton, and the granulepos can just
all be zero. Time-based data, like a slideshow or 'currently playing' for radio streams, should be muxed
throughout the stream.
Given that the CMML mapping would be sensible should we simply hijack
CMML for this use (i.e. just put the metadata XML in a CMML stream)?
Arguments against this: it means stream parsing is needed to find out
whether the CMML stream contains metadata too and possibly complicates
CMML handling.  If not hijacking CMML it might be worth having a flag
indicating whether the stream is continuous or secondary header only.
===ID===
It seems reasonable to use the fishbone message fields in [[Ogg Skeleton]] to supply an ID to be associated
with each logical stream (via an "id:" message header field).  The other side of this problem is how
these should be addressed.  The physical bitstream itself shouldn't need one, but do we worry about
chained/concatenated streams?
The Skeleton section of the [http://annodex.net/TR/draft-pfeiffer-annodex-02.html#anchor8 Annodex bitstream format]
specifies that mandatory header fields MUST be US-ASCII encoded, but allows UTF-8 for other message fields. This
does not appear to be a problem for an ID field. [http://www.ietf.org/rfc/rfc2822.txt RFC2822] limits message header
fields to 998 bytes (excluding CRLF) and spaces are not normally permitted in IDs, so IDs would be limited to 994
bytes long.
===Mime type===
For use in Skeleton.  [[MIME Types and File Extensions]] gives 'text/cmml' for 'CMML without container', if that
can be used by Skeleton to describe packetized CMML in Ogg then there's no issue here; 'text/xml' or whatever
is appropriate could be used.


==Test Files==
==Test Files==
It is a barrier to the widespread introduction of any metadata format that the [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html Vorbis I spec] only requires players to support an unaccompanied [[Vorbis]] stream; many [[Ogg]] [[Vorbis]] players will refuse to play augmented streams, especially if the content is not recognised (although many recent players do succeed). As a prelude to development of an [[Ogg]] metadata format it will be necessary to encourage developers to introduce more flexible [[Ogg]] filters.
It is a barrier to the widespread introduction of any metadata format that the [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html Vorbis I spec] only requires players to support an unaccompanied [[Vorbis]] stream; many [[Ogg]] [[Vorbis]] players will refuse to play augmented streams, especially if the content is not recognised (although many recent players do succeed). As a prelude to development of an [[Ogg]] metadata format it will be necessary to encourage developers to introduce more flexible [[Ogg]] filters.
One of the intentions of the current work on [[MIME Types and File Extensions]] is to supersede the Vorbis I spec, allowing all types of metadata to be included and encouraging program writers to ignore unrecognised content in these files. The metadata embedded Ogg files below are therefore '.oga' served as audio/x-ogg. '''You may want to use your browser's''' ''save as'' '''function if you have trouble opening the links.'''


To help with testing the following files are available, based on a speculative (and very basic) metadata format. In each case the derivative files are under the same license as the original. Two sets are provided to allow chained stream testing. On some players the seek tests produce an annoying clicking&mdash;if you like the music get the originals. Please notice that filenames are mixed case
To help with testing the following files are available, based on a speculative (and very basic) metadata format. In each case the derivative files are under the same license as the original. Two sets are provided to allow chained stream testing. On some players the seek tests produce an annoying clicking&mdash;if you like the music get the originals. Please notice that filenames are mixed case
Line 33: Line 77:


Original (Vorbis I): [http://music.ibiblio.org/pub/multimedia/pandora/vorbis/contrib/Debbie_Hu/Bach-Busoni_Nun_freut_euch_3.ogg Bach - Nun freut euch lieben Christen] performed by Debbie Hu, from [http://music.ibiblio.org/pub/multimedia/pandora/mp3/Read.html Pandora Records] and available under the [http://www.eff.org/IP/Open_licenses/eff_oal.php EFF OAL].
Original (Vorbis I): [http://music.ibiblio.org/pub/multimedia/pandora/vorbis/contrib/Debbie_Hu/Bach-Busoni_Nun_freut_euch_3.ogg Bach - Nun freut euch lieben Christen] performed by Debbie Hu, from [http://music.ibiblio.org/pub/multimedia/pandora/mp3/Read.html Pandora Records] and available under the [http://www.eff.org/IP/Open_licenses/eff_oal.php EFF OAL].
* The [http://www.srcf.ucam.org/~ibm21/Bach-Busoni_Nun_freut_euch_3.rdf.ogg Ogg-Vorbis-XML version]
* The [http://www.srcf.ucam.org/~ibm21/Bach-Busoni_Nun_freut_euch_3.rdf.oga Ogg-Vorbis-XML version]
* The XML/RDF description as a [http://www.srcf.ucam.org/~ibm21/Bach-Busoni_Nun_freut_euch_3.xml separate document]
* The XML/RDF description as a [http://www.srcf.ucam.org/~ibm21/Bach-Busoni_Nun_freut_euch_3.xml separate document]
* With the XML page [http://www.srcf.ucam.org/~ibm21/Bach-Busoni_Nun_freut_euch_3.multi.ogg repeated after every fifth Vorbis page]. (This is not a suggested way to add meta data, just a way of testing how players handle seeking in the presence of an unknown stream.)
* With the XML page [http://www.srcf.ucam.org/~ibm21/Bach-Busoni_Nun_freut_euch_3.multi.oga repeated after every fifth Vorbis page]. (This is not a suggested way to add meta data, just a way of testing how players handle seeking in the presence of an unknown stream.)
* With the XML page repeated after every fifth Vorbis page and the [http://www.srcf.ucam.org/~ibm21/Bach-Busoni_Nun_freut_euch_3.end.ogg stream ending on a meta data page] (breaks simpler track-length strategies, again not a suggested format for metadata)
* With the XML page repeated after every fifth Vorbis page and the [http://www.srcf.ucam.org/~ibm21/Bach-Busoni_Nun_freut_euch_3.end.oga stream ending on a meta data page] (breaks simpler track-length strategies, again not a suggested format for metadata)


Original (Vorbis I): [http://ccmixter.org/media/files/disharmonic/2958/ On The Moon (Trip Hop mix)] by [http://ccmixter.org/media/people/disharmonic/ Disharmonic], from [http://ccmixter.org/ ccMixter] and available under the Creative Commons [http://creativecommons.org/licenses/by/2.5/ Attribution 2.5] license.
Original (Vorbis I): [http://ccmixter.org/media/files/disharmonic/2958/ On The Moon (Trip Hop mix)] by [http://ccmixter.org/media/people/disharmonic/ Disharmonic], from [http://ccmixter.org/ ccMixter] and available under the Creative Commons [http://creativecommons.org/licenses/by/2.5/ Attribution 2.5] license.
* The XML/RDF description as a [http://www.srcf.ucam.org/~ibm21/disharmonic_-_On_The_Moon_(Trip_Hop_mix).xml separate document]
* The XML/RDF description as a [http://www.srcf.ucam.org/~ibm21/disharmonic_-_On_The_Moon_(Trip_Hop_mix).xml separate document]
* With the XML page [http://www.srcf.ucam.org/~ibm21/disharmonic_-_On_The_Moon_(Trip_Hop_mix).multi.ogg repeated after every fifth Vorbis page].
* With the XML page [http://www.srcf.ucam.org/~ibm21/disharmonic_-_On_The_Moon_(Trip_Hop_mix).multi.oga repeated after every fifth Vorbis page].
* With the XML page repeated after every fifth Vorbis page and the [http://www.srcf.ucam.org/~ibm21/disharmonic_-_On_The_Moon_(Trip_Hop_mix).end.ogg stream ending on a meta data page]
* With the XML page repeated after every fifth Vorbis page and the [http://www.srcf.ucam.org/~ibm21/disharmonic_-_On_The_Moon_(Trip_Hop_mix).end.oga stream ending on a meta data page]

Latest revision as of 10:40, 12 June 2012

Schemes such as M3F require an embedding in physical Ogg streams. This page is for development of a specification for embedding XML files and XML streams in Ogg. The final version will probably look like the CMML mapping.

Current plans

(taken from mailing list discussions)

Magic number?

Should we have a magic number for this? By convention, the beginning-of-stream (bos) packet for a codec in Ogg identifies the codec. Strictly speaking we will have a magic number regardless, but should it be:

  • '<?xml', the opening of the XML declaration. In this instance the demuxer would pass this upwards to an XML parser, which would derive from the rest of the bos packet what to do with it.
  • Some other sequence before the XML starts, identifying the particular stream type, probably with a version number, which will imply the contents to be some form of XML. CMML does this. This may avoid the almost inevitable problem of some implementor assuming '<?xml' is always metadata, and it may also reduce the difficulty of writing a demuxer. Demuxers could still peek ahead and look for an XML namespace they recognize.
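
As a rough illustration of the two options, a demuxer could classify the bos packet along the following lines. This is only a sketch: the `EXAMPLE_MAGIC` bytes and the function names are made up for this example, since no identifier has been agreed on this page, and the namespace peek is a plain regular-expression scan rather than real XML parsing.

```python
import re

XML_DECL = b"<?xml"
# Hypothetical stream identifier; nothing on this page defines one yet.
EXAMPLE_MAGIC = b"XMLEMB\x00\x00"

# Naive scan for namespace declarations such as xmlns="..." or xmlns:rdf="...".
NS_PATTERN = re.compile(rb"""xmlns(?::[\w.\-]+)?\s*=\s*["']([^"']+)["']""")


def classify_bos_packet(packet: bytes) -> str:
    """Return a coarse label for a bos packet based on its first bytes."""
    if packet.startswith(EXAMPLE_MAGIC):
        return "typed-xml"   # option 2: dedicated identifier before the XML
    if packet.startswith(XML_DECL):
        return "bare-xml"    # option 1: the XML declaration is the magic
    return "unknown"


def peek_namespaces(packet: bytes) -> list:
    """List namespace URIs declared in the packet, in order of appearance."""
    return [uri.decode("utf-8", "replace") for uri in NS_PATTERN.findall(packet)]
```

With the first option the label alone says nothing about what the XML is for, so a demuxer would have to rely on something like `peek_namespaces` to decide whether it understands the stream.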

Division into packets

The raw XML should ideally be broken into packets in a way that the loss of some packets, while destroying information, does not result in an invalid stream. Generally this means:

  • The bos packet should consist of any initial processing directives, namespace declarations, and the root tag if there is one.
  • Subsequent packets should be valid xml stanzas, similar to the XMPP definition, which concatenated are also valid xml.
  • The eos packet can be empty. If there was a root tag in the bos packet, it should be closed here.

Parsers should close all open tags on encountering eos to handle truncated stream conditions. Encountering eos here means either having processed a packet marked with the eos flag, having finished on an Ogg page with the eos flag set, or a virtual eos, implied by encountering bos flags for a new chain segment.
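
The packet division and the close-on-eos rule might look roughly like this. It is a minimal sketch, not a proposed implementation: `packetize`, `open_tags` and `recover` are names invented here, and the tag scanner is a deliberately naive regular expression that ignores comments, CDATA sections and '>' characters inside attribute values.

```python
import re
import xml.etree.ElementTree as ET

# Matches start, end and self-closing tags; skips '<?...?>' and '<!--...-->'.
TAG = re.compile(r"<(/?)([A-Za-z_][\w:.\-]*)[^<>]*?(/?)>")


def packetize(header: str, stanzas: list, footer: str) -> list:
    """bos packet = declaration + root open tag, one packet per stanza, eos = root close."""
    parts = [header, *stanzas, footer]
    return [part.encode("utf-8") for part in parts]


def open_tags(xml_text: str) -> list:
    """Return the names of tags that are still open at the end of xml_text."""
    stack = []
    for closing, name, self_closing in TAG.findall(xml_text):
        if closing:
            if stack and stack[-1] == name:
                stack.pop()
        elif not self_closing:
            stack.append(name)
    return stack


def recover(packets: list) -> str:
    """Concatenate the packets received and close whatever is still open at eos."""
    text = b"".join(packets).decode("utf-8")
    closers = "".join(f"</{name}>" for name in reversed(open_tags(text)))
    return text + closers


packets = packetize(
    '<?xml version="1.0"?><meta xmlns="urn:example">',
    ['<item id="1">a</item>', '<item id="2"/>', '<item id="3">c</item>'],
    "</meta>",
)
# Simulate losing one stanza and the eos packet.
received = [packets[0], packets[1], packets[3]]
root = ET.fromstring(recover(received))
```

Because every packet after the bos packet is a complete stanza, the only tag left open after the loss is the root, and the recovered document still parses with two of the three items intact.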

Granulepos mapping

cf. Silvia Pfeiffer on ogg-dev:

I suggest using the solution that CMML has come to use.

The XML file is essentially the same as an unencapsulated physical bitstream.

Then there is a mapping into a logical bitstream, where some of the default information - in particular the XML header - are split off and put into the bos packet - nothing really needs to go into the eos packet. There's also a magic number and a version number.

Also, use the granulepos scheme that we defined for CMML pages- you're going to make your lives easier.

That is, use a split granulepos scheme like keyframe codecs to indicate the offset to the previous packet. Whether this is appropriate for a given XML stream will depend on its application. Metadata that applies to the whole stream should just be included at the beginning, like OggSkeleton, and the granulepos can just all be zero. Time-based data, like a slideshow or 'currently playing' for radio streams, should be muxed throughout the stream.
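
A split granulepos of this kind can be sketched as follows. The shift of 32 bits and the function names are assumptions made for illustration; the actual field widths and time units would have to be taken from the CMML mapping, which this page does not reproduce.

```python
# Assumed number of low bits holding the offset; not taken from any spec.
GRANULE_SHIFT = 32


def pack_granulepos(prev_units: int, cur_units: int, shift: int = GRANULE_SHIFT) -> int:
    """High bits: time of the previous packet. Low bits: offset from it to this packet."""
    offset = cur_units - prev_units
    if prev_units < 0 or offset < 0:
        raise ValueError("times must be non-negative and non-decreasing")
    # Granulepos is a signed 64-bit value, so 63 bits are usable in total.
    if offset >= (1 << shift) or prev_units >= (1 << (63 - shift)):
        raise ValueError("value does not fit in the split granulepos")
    return (prev_units << shift) | offset


def unpack_granulepos(granulepos: int, shift: int = GRANULE_SHIFT) -> tuple:
    """Return (previous packet time, current packet time)."""
    prev_units = granulepos >> shift
    offset = granulepos & ((1 << shift) - 1)
    return prev_units, prev_units + offset
```

Whole-stream metadata placed at the beginning simply gets `pack_granulepos(0, 0)`, which is zero, matching the "granulepos can just all be zero" case above.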

Given that the CMML mapping would be sensible, should we simply hijack CMML for this use (i.e. just put the metadata XML in a CMML stream)? Arguments against this: stream parsing would be needed to find out whether the CMML stream contains metadata too, and it possibly complicates CMML handling. If not hijacking CMML, it might be worth having a flag indicating whether the stream is continuous or secondary header only.

ID

It seems reasonable to use the fishbone message fields in Ogg Skeleton to supply an ID to be associated with each logical stream (via an "id:" message header field). The other side of this problem is how these should be addressed. The physical bitstream itself shouldn't need one, but do we worry about chained/concatenated streams?

The Skeleton section of the Annodex bitstream format specifies that mandatory header fields MUST be US-ASCII encoded, but allows UTF-8 for other message fields. This does not appear to be a problem for an ID field. RFC2822 limits message header fields to 998 bytes (excluding CRLF) and spaces are not normally permitted in IDs, so IDs would be limited to 994 bytes long.
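
The 994-byte figure follows from subtracting the four bytes of "id: " from the 998-byte line limit. A small check along these lines could enforce it; the function names are invented for this sketch, and counting UTF-8 bytes rather than characters is an assumption based on the encoding note above.

```python
MAX_FIELD_BYTES = 998  # RFC 2822 line limit, excluding the trailing CRLF


def max_id_length(field_name: str = "id") -> int:
    """Bytes left for the value after 'id: ' (name, colon, space)."""
    return MAX_FIELD_BYTES - len(f"{field_name}: ".encode("ascii"))


def format_id_field(stream_id: str) -> bytes:
    """Build an 'id:' message header field, rejecting IDs that cannot fit."""
    if not stream_id or any(ch.isspace() for ch in stream_id):
        raise ValueError("ID must be non-empty and contain no whitespace")
    encoded = stream_id.encode("utf-8")
    if len(encoded) > max_id_length():
        raise ValueError("ID is longer than the field allows")
    return b"id: " + encoded + b"\r\n"
```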

Mime type

For use in Skeleton. MIME Types and File Extensions gives 'text/cmml' for 'CMML without container'. If that can be used by Skeleton to describe packetized CMML in Ogg then there's no issue here; 'text/xml' or whatever is appropriate could be used.

Test Files

It is a barrier to the widespread introduction of any metadata format that the Vorbis I spec only requires players to support an unaccompanied Vorbis stream; many Ogg Vorbis players will refuse to play augmented streams, especially if the content is not recognised (although many recent players do succeed). As a prelude to development of an Ogg metadata format it will be necessary to encourage developers to introduce more flexible Ogg filters.

One of the intentions of the current work on MIME Types and File Extensions is to supersede the Vorbis I spec, allowing all types of metadata to be included and encouraging program writers to ignore unrecognised content in these files. The metadata embedded Ogg files below are therefore '.oga' served as audio/x-ogg. You may want to use your browser's save as function if you have trouble opening the links.

To help with testing the following files are available, based on a speculative (and very basic) metadata format. In each case the derivative files are under the same license as the original. Two sets are provided to allow chained stream testing. On some players the seek tests produce an annoying clicking—if you like the music get the originals. Please notice that filenames are mixed case and add a note in discussion if you find a broken link.

Original (Vorbis I): Bach - Nun freut euch lieben Christen performed by Debbie Hu, from Pandora Records and available under the EFF OAL.

Original (Vorbis I): On The Moon (Trip Hop mix) by Disharmonic, from ccMixter and available under the Creative Commons Attribution 2.5 license.