M3F

From XiphWiki


See other suggested metadata methods.

This document describes the proposed Media Description and Metadata for the Ogg Container Format (MDMF). The format is built on the Extensible Markup Language (XML). It is intended to describe any kind of multimedia (audio, video, text, images, …) that can reside in an Ogg container.

HELP IS NEEDED!! See the ogg-dev email list or contribute directly to the wiki.

Format description

The basics

No element except the ‘metadata’ element is required.

‘xml:lang’ and ‘xml:base’ attributes may be used on any element.
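As a rough illustration of the rule above, the sketch below sets a document-wide language and base URI on the ‘metadata’ element and repeats ‘xml:lang’ with a narrower value on a single ‘title’. The ‘oggserial’ value and the ‘en-US’ tag are placeholders chosen for the example. How software should resolve relative ‘uri’ values against ‘xml:base’ is not spelled out in this proposal.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- xml:lang and xml:base on the root element apply to the whole document -->
<metadata xmlns="http://xmlns.xiph.org/metadata/0.1/" xml:lang="en" xml:base="./">
	<audio type="audio/flac" oggserial="12345">
		<!-- The same attributes may be repeated or overridden on any element -->
		<title xml:lang="en-US">Sink To the Bottom</title>
	</audio>
</metadata>
```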

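All date attributes and elements must contain only dates and times written in the ‘ISO 8601:2000 International Date and Time Format’. The general form is YYYY-MM-DDTHH:MM followed by a time zone designator, either Z for UTC or an offset such as +01:00; for example, 2009-05-15T15:00+01:00. Several examples in this document also use date-only values such as 2007-01-08.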
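To make the pattern concrete, the sketch below writes the same instant twice: once with a local offset and once in UTC. The two ‘audio’ elements and their ‘oggserial’ values are placeholders that only serve to keep the example well-formed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://xmlns.xiph.org/metadata/0.1/">
	<audio type="audio/flac" oggserial="12345">
		<!-- 15:00 local time, one hour ahead of UTC -->
		<date>2009-05-15T15:00+01:00</date>
	</audio>
	<audio type="audio/flac" oggserial="12346">
		<!-- The same instant written in UTC with the Z designator -->
		<date>2009-05-15T14:00Z</date>
	</audio>
</metadata>
```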

XML declaration and namespaces

A metadata document must have a standard XML declaration on the very first line. The XML declaration must contain the ‘version’ and ‘encoding’ attributes. Because an XML declaration can only carry ‘version’, ‘encoding’, and ‘standalone’, the optional document-wide language and base URI are set with the ‘xml:lang’ and ‘xml:base’ attributes on the ‘metadata’ element, as in the example below.

The ‘metadata’ element must declare at least one XML namespace defining the format via the ‘xmlns’ attribute.

<?xml version="1.0" encoding="UTF-8" ?>
<metadata xmlns="http://xmlns.xiph.org/metadata/0.1/" xml:lang="en" xml:base="./">
Extensions

The format can be extended by including multiple XML namespaces. Software must not add, modify, or expect elements and attributes that are not defined by an XML namespace. Any XML namespace used must be declared in the document (preferably on the ‘metadata’ element, for easier software processing).

(Tip for extension creators: note how the fictional URL in the example above carries both the format name, in a human-readable way, and a version number. The version number allows further modifications of the format while letting software remain backward compatible.)
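A minimal sketch of how an extension might be declared alongside the core format. The extension namespace ‘http://example.org/ns/lyrics/0.1/’, the ‘lyr’ prefix, and the ‘lyr:lyricist’ element are all hypothetical and are not defined by this proposal. They only show where the declaration goes and how a versioned URL can be reused for an extension.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The default namespace is the core format; the lyr prefix is a
     hypothetical extension declared on the root element. -->
<metadata xmlns="http://xmlns.xiph.org/metadata/0.1/"
          xmlns:lyr="http://example.org/ns/lyrics/0.1/">
	<audio type="audio/flac" oggserial="12345">
		<title>Sink To the Bottom</title>
		<!-- Hypothetical extension element, identified by its namespace -->
		<lyr:lyricist>Example Lyricist</lyr:lyricist>
	</audio>
</metadata>
```

Declaring the prefix on ‘metadata’ keeps every namespace in one place, which is what the recommendation above aims for.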

Media elements

Media is described as children of the ‘metadata’ element. The possible child elements are as follows:

  • audio – any audio item in the stream, such as Vorbis- and FLAC-encoded music or speeches.
  • image – any image in the stream, such as JPEG, PNG, and SVG.
  • text – any text in the stream, such as subtitles, clear text, and CMML.
  • video – any video in the stream, such as Theora and MPEG.
Comment: CMML is text, but I think describing it as such depends on what properties you want to ascribe to it.-Imalone 10:29, 17 September 2007 (PDT)

Each child must have an ‘oggserial’ attribute. This attribute links the media element to the matching logical stream in the Ogg container.

Each child must have a ‘type’ attribute. This attribute describes the MIME type of the media. [Comment SP: MIME type should not be in here, but rather in skeleton. Skeleton should be prescribed for this format.]

Each child can have an ‘id’ attribute. The attribute is used to connect media elements to one another and is required when the element is to be linked to, for example when an ‘image’ is referenced by the ‘artwork’ element in an ‘audio’ element's ‘collection’ child.

Comment: does 'id' label the resource?-Imalone 10:29, 17 September 2007 (PDT)

The simplified example below shows how an ‘image’ is used as the artwork of a ‘collection’ inside an ‘audio’ element.

	<image type="application/svg+xml" oggserial="54321" id="front-cover" />
	<audio type="audio/x-flac+lossless" oggserial="12345">
		<collection>
			<artwork uri="#front-cover" /></collection></audio>
The audio element

The possible children of the ‘audio’ element are listed below with examples, together with their own attributes and children. The list is in roughly alphabetical order, but the elements may occur in any order in the file.

  • collection – <collection track="2" tracks="12" date="1996-10-01" uri="urn:x-isrc:0123456789"><title>Fountains of Wayne</title><artwork uri="#embedded-image" /></collection>

Describes the music albums, collections, and sets that the parent is a part of.

The ‘track’ attribute describes the resource's place in the collection. The ‘tracks’ (plural) attribute describes how many resources the collection contains in total.

The ‘date’ attribute describes the date and time the resource was made available to the public, such as the date of a speech or the release date of a music album.

The ‘uri’ attribute uniquely identifies the collection. In the example above, the collection is identified with a URN built from the ISRC number of Fountains of Wayne's self-titled album.

The ‘title’ child describes the parent collection's title.

The ‘artwork’ child links the collection with an ‘image’ media element in the stream.

Comment: This is very single-cd centric: two CD albums are not uncommon, you may want to describe something as part of a live set or an orchestral work rather than a CD. I wondered a while back about recruiting XSPF for this purpose.-Imalone 10:33, 17 September 2007 (PDT)
  • encoding – <encoding><date>2019-02-17T15:00+01:00</date><quality compression="best" /><source media="cd" uri="urn:x-isrc:0123456789" />
    <software title="flac" version="2.2" uri="http://xiph.org/flac/" /></encoding>

The ‘encoding’ element contains information about the resource's digitization or encoding.

The ‘date’ child element describes the date and time of encoding.

The ‘quality’ child element describes the resource's quality. Possible attributes are ‘bitrate’ and ‘compression’.

The ‘source’ child element describes the resource that the current resource came from. In the full example at the end of this document, the ‘media’ and ‘uri’ attributes describe a CD of Fountains of Wayne's self-titled album. Again, the URI is a URN using the album's ISRC number.

The ‘software’ child element describes the software used to encode the resource. Its ‘title’, ‘version’, and ‘uri’ attributes are self-explanatory.

  • entities – <entities><person role="vocals guitar">Chris Collingwood</person><organisation role="label" uri="http://recording-people.com/">Recording Company</organisation></entities>

The ‘entities’ element describes every ‘person’ and ‘organisation’ involved in the creation and distribution of the resource. A single ‘entities’ element should contain all persons and organisations, but the element may occur several times.

Any person or organisation may be described with ‘role’, ‘title’, and ‘uri’ attributes. The ‘role’ attribute describes the entity's role in the process. Roles are space-separated values. Possible role values are: vocal, instrument, choir, ensemble, producer, publisher, label, and ... The ‘title’ should describe the role in only a few words. Example: role="instrument vocal" title="guitar and lead vocals".

Please note that individual encoders are discouraged from including their name in any resource to avoid potential legal problems.

Comment: We're not really about giving advice to people creating illegitimate streams. For legitimate uses this may be worthwhile. Anyway, I believe in natural selection. Imalone 12:45, 11 September 2007 (PDT)
Comment: I'd like a clarification of how roles will work, I believe progressive refinement would be a useful feature. This could allow a defined vocabulary which could be machine-interpreted to a required level while retaining free-form refinement (e.g. 'doric flute').-Imalone 10:42, 17 September 2007 (PDT)
  • date – <date>2007-01-08</date>

Describes the date the resource was made publicly available.

  • duration – <duration>01:04:54</duration>

Describes the resource's playing time. The value must be in accordance with ‘ISO 8601:2000’, but without the leading ‘T’ and without a time zone.
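The fragment below places the two values side by side to show how they differ. Reading the duration as hours, minutes, and seconds (hh:mm:ss) is an assumption based on the example value above, and the ‘oggserial’ value is a placeholder.

```xml
<audio type="audio/flac" oggserial="12345">
	<!-- Publication date: a calendar date -->
	<date>2007-01-08</date>
	<!-- Playing time, assumed to be hh:mm:ss: 1 hour, 4 minutes, 54 seconds.
	     There is no leading T and no time zone. -->
	<duration>01:04:54</duration>
</audio>
```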

  • location – <location>China, Earth</location>

Place of recording. (Standard needed for this generic element!)

Comment: Give it more properties, could supply URI (Google Earth has a scheme for this, though we don't want to be tied to Google), lat+long-itude, address. Dublin Core does have definitions that might be of use.-Imalone 10:42, 17 September 2007 (PDT)
  • rights – <rights date="2018">℗ 2018 Recording Company. All distribution rights reserved.</rights>

Describes the resource's distribution rights. The ‘date’ may be set as an attribute for easier rights management.

  • title – <title>Sink To the Bottom</title>

The title or name given to a resource. The element has no attributes.

Full example

<?xml version="1.0" encoding="UTF-8" ?>
<metadata xmlns="http://xmlns.xiph.org/media-metadata/0.1/" xml:lang="en" xml:base="./">
	<video type="video/theora" />
	<image type="application/svg+xml" id="embedded-image" />
	<text type="text/plain" />
	<audio type="audio/flac" oggserial="audio.flac">
		<title>Sink To the Bottom</title>
		<collection track="2" tracks="12" date="1996-10-01" uri="urn:x-isrc:0123456789">
			<title>Fountains of Wayne</title>
			<artwork uri="#embedded-image" /></collection>
		<collection track="1" tracks="1" date="1997">
			<title>Sink To the Bottom</title></collection>
		<entities>
			<person role="vocal">Chris Collingwood</person>
			<person role="instrument vocal" title="bass">Adam Schlesinger</person>
			<person role="vocal instrument" title="guitar and vocals">Jody Porter</person>
			<person role="instrument" title="drums">Brian Young</person>
			<organisation role="ensemble">Some People in the Background</organisation>
			<person role="producer">Person behind the Glass Wall</person>
			<organisation role="label" uri="http://recording-people.com/">Recording Company</organisation></entities>
		<rights date="2018">℗ 2018 Recording Company. All distribution rights reserved.</rights>
		<duration>23:04:01</duration>
		<date>2007-01-08</date>
		<location>China, Earth</location>
		<encoding>
			<date>2009-02-17</date>
			<quality compression="8" />
			<source media="cd" uri="urn:x-isrc:0123456789" />
			<software title="flac" version="2.2" uri="http://xiph.org/flac/" /></encoding></audio></metadata>

History

  • 2007-09-08 – Wiki page created based on original format and suggestions from the email list.
  • 2007-09-06 – Format suggested on Xiph's ogg-dev email list by Daniel Aleksandersen.