See other suggested metadata methods.
This document describes the proposed Multimedia Metadata Format (M3F) for the Ogg container. The format is built on the Extensible Markup Language (XML). It is intended to describe any kind of multimedia (audio, video, text, images, etc.) that can reside in the Ogg container.
Multimedia Metadata Format documents describe media resources in Ogg containers and streams. The format can link resources with one another for media players that support rendering multiple kinds of media (such as audio tracks and album art, or video and commentary audio overlays).
No element except ‘metadata’ is required. But some elements have required attributes.
All dates must be formatted as ISO 8601:2000 – International Date and Time Format.
XML declaration and namespaces
A metadata document must have a standard XML declaration on the very first line. The XML declaration must contain the ‘version’ and ‘encoding’ attributes, with ‘version’ first.
<?xml version="1.0" encoding="UTF-8" ?>
Encoding should default to Unicode/UTF-8. The XML version should always be the oldest version in which the desired features are available, as per the XML specifications. A newer version should be used only when features not present in older XML versions are required.
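As a rough illustration of the declaration requirement, a tool could check the first line of a document before handing it to a full XML parser. The sketch below is only one possible approach; the function names are hypothetical and not part of the format, and it checks only that both attributes are present.

```python
import re

# Matches pseudo-attributes such as version="1.0" inside the XML declaration.
_PSEUDO_ATTR = re.compile(r'(\w+)\s*=\s*(["\'])(.*?)\2')


def declaration_attributes(document: str) -> dict:
    """Return the pseudo-attributes of an XML declaration found on the first line.

    Returns an empty dict when the first line is not an XML declaration.
    """
    first_line = document.split("\n", 1)[0].strip()
    if not (re.match(r"<\?xml\s", first_line) and first_line.endswith("?>")):
        return {}
    return {name: value for name, _quote, value in _PSEUDO_ATTR.findall(first_line)}


def has_required_declaration(document: str) -> bool:
    """True when both 'version' and 'encoding' are declared on the first line."""
    attrs = declaration_attributes(document)
    return "version" in attrs and "encoding" in attrs
```

A document that fails this check would most likely be rejected or repaired by the writing software before it is embedded in a stream.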
In addition, the XML attributes ‘base’ and ‘lang’ may be used. Refer to them with an ‘xml:’ prefix if they are used in any element other than the XML declaration. The optional ‘base’ attribute defines the base URI for completing relative URIs. This value must default to the current Ogg container. (Basically, all other URIs in the format are relative. The attribute is mainly useful for formats other than Ogg.) The attribute is inherited by all children. The optional ‘lang’ attribute describes the language of a resource using three-letter ISO 639-3 codes. Note that this attribute should only be used on the ‘resource’ element or in the XML declaration, for easier software parsing. The attribute is inherited by all children and has no default value.
The ‘metadata’ element is required as the top-level container. It must contain at least one XML namespace defining the format via the ‘xmlns’ attribute. (The URL used in the example is not the final address, as no namespace has been created yet.)
<metadata xmlns="http://xmlns.xiph.org/metadata/0.1/"> […]</metadata>
The Multimedia Metadata Format can be extended by declaring additional XML namespaces on the ‘metadata’ element. As with any other XML format, software may not add, modify, or expect elements and attributes that are not defined by an XML namespace.
The ‘xml:lang’ attribute can be used on any element, and it is inherited from parent elements. In the example below, the first title element inherits ‘eng’ as its language from the ‘resource’ parent element. A player may present any title, but should prefer the global language (the language of the ‘metadata’ element) or the user’s language, as specified by the player. If the player’s interface language (or possibly a dedicated metadata language option) is German (ISO 639-3: deu), then it should assume the user prefers the German title.
<resource […] xml:lang="eng"> <title>The Science of Sleep</title> <title xml:lang="deu">Anleitung zum Träumen</title> <title xml:lang="fra">La science des rêves</title></resource>
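The language inheritance and title preference described above could be sketched as follows. This is a minimal illustration using Python's standard ElementTree module; the helper names and the fallback to the first title are assumptions, not something the format prescribes.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def local_name(element):
    """Strip any '{namespace}' prefix from an element tag."""
    return element.tag.rsplit("}", 1)[-1]


def titles_by_language(resource):
    """Map each effective language code to the first title in that language."""
    inherited = resource.get(XML_LANG)  # language inherited from the parent
    titles = {}
    for child in resource:
        if local_name(child) == "title":
            titles.setdefault(child.get(XML_LANG, inherited), child.text)
    return titles


def pick_title(resource, preferred_languages):
    """Return the title in the first preferred language that is available.

    Falls back to the first title in document order (an assumption).
    """
    titles = titles_by_language(resource)
    for lang in preferred_languages:
        if lang in titles:
            return titles[lang]
    return next(iter(titles.values()), None)


resource = ET.fromstring(
    '<resource xml:lang="eng">'
    "<title>The Science of Sleep</title>"
    '<title xml:lang="deu">Anleitung zum Träumen</title>'
    "</resource>"
)
german_title = pick_title(resource, ["deu"])
```

A player with a German interface would call `pick_title` with ‘deu’ first; a player whose language has no matching title ends up with the first title in the document.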
Addressing the media resource
Media resources in the stream are described as ‘resource’ children of the ‘metadata’ element. Each resource element must have an ‘oggserial’ attribute linking it to the correct logical bitstream in the stream. It must also have a ‘type’ attribute with the native MIME type of the resource.
<resource oggserial="0xEXAMPLE" type="audio/vorbis"> […]</resource>
The ‘uri’ attribute may be used instead of ‘oggserial’. For Ogg serials the URI would be ‘urn:oggserial:#0xEXAMPLE’. (Other container and native file formats may specify any URI that works with that format.)
Resource elements can also have an optional unique ‘xml:id’ attribute. The ‘xml:id’ attribute is used as a label when the resource needs to be addressed by another resource element.
<resource xml:id="unique-resource-id" […]> […]</resource>
Note: It is good practice to include every resource in a stream as a resource element. This makes it easier to link and describe relationships with other resources (such as films and subtitles, or music and album art).
Describing the media resource
The ‘resource’ element has many child elements. All are optional, and any of them can be used with any resource, though media-type-specific children are grouped together. These children do not make much sense with all media types.
The ‘audience’ element is a self-regulated filtering mechanism intended for parental control. Each of its attributes is a space-separated list with one or more of the values listed below.
- ‘nudity’ (optional): ‘breasts’, ‘buttocks’, and ‘genitals’.
- ‘sexual’ (optional): ‘kissing’, ‘sexact’, ‘touching’, ‘sexlanguage’, ‘erections’, and ‘erotica’.
- ‘violence’ (optional): ‘rape’, ‘human-injury’, ‘animal-injury’, ‘anime-injury’, ‘human-blood’, ‘animal-blood’, ‘anime-blood’, ‘human-torture’, ‘animal-torture’, and ‘anime-torture’.
- ‘language’ (optional): ‘vulgar’, ‘swear’, and ‘mild’.
- ‘harmful’ (optional): ‘tobacco’, ‘alcohol’, ‘drug’, ‘weapons’, ‘example-dangerous’, ‘horror’, and ‘discrimination’.
- ‘context’ (required): ‘artistic’, ‘educational’, ‘medical’, ‘sports’, and ‘news context’.
<audience context="educational" language="mild" sexual="sexact kissing" […] />
Note: The filters are based on the Internet Content Rating Association's (ICRA) work. See their label generator for the full meaning of the values.
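Because each ‘audience’ attribute is a space-separated list drawn from a fixed vocabulary, a parser can split on whitespace and flag unknown values. The sketch below is only an illustration: the function name is hypothetical, and ‘violence’ and ‘context’ are left out for brevity (the ‘news context’ value contains a space, which a plain whitespace split cannot represent).

```python
import xml.etree.ElementTree as ET

# Allowed values copied from the attribute descriptions above (subset of attributes).
ALLOWED = {
    "nudity": {"breasts", "buttocks", "genitals"},
    "sexual": {"kissing", "sexact", "touching", "sexlanguage", "erections", "erotica"},
    "language": {"vulgar", "swear", "mild"},
    "harmful": {
        "tobacco", "alcohol", "drug", "weapons",
        "example-dangerous", "horror", "discrimination",
    },
}


def invalid_audience_values(audience):
    """Return {attribute: [unknown values]} for an 'audience' element."""
    problems = {}
    for attribute, allowed in ALLOWED.items():
        values = audience.get(attribute, "").split()
        unknown = [value for value in values if value not in allowed]
        if unknown:
            problems[attribute] = unknown
    return problems


well_formed = ET.fromstring(
    '<audience context="educational" language="mild" sexual="sexact kissing" />'
)
misplaced = ET.fromstring('<audience context="educational" nudity="sexact kissing" />')
```

Here `invalid_audience_values(misplaced)` reports both values, since ‘sexact’ and ‘kissing’ belong to the ‘sexual’ attribute rather than ‘nudity’.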
The ‘category’ element describes the listing genre of the resource. The required ‘sort’ attribute describes the preferred genre for listing.
<category sort="metal"> […]</category>
The optional ‘genre’ child element describes more in-depth sorting for the resource.
<category sort="metal"> <genre>symphonic metal</genre> <genre>goth metal</genre> </category>
Media resources may appear in collections (DVD box sets, CD albums, etc.). The ‘collection’ element describes the resource's relation to, and order/place in, a collection. The optional ‘date’ attribute describes the date the collection was made publicly available (its ‘release date’). The optional ‘track’ attribute describes the resource's order/place in the collection. The optional ‘tracks’ attribute describes the total number of resources in the collection. The optional ‘uri’ attribute should uniquely identify the collection as a whole.
<collection date="2019-01-15" track="2" tracks="12" uri="urn:x-isrc:0123456789"> […]</collection>
The optional ‘artwork’ child element links the collection to an image resource. The required ‘uri’ attribute should be either a resource's ‘xml:id’ attribute value with a ‘#’ prefix (as below) or a web URL resource. The optional ‘type’ attribute should be the MIME type of the image resource. The attribute should not be used when linking to other internal resources, but is encouraged when linking to external resources (such as web URLs).
<collection> <artwork uri="#embedded-image" /> </collection>
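Resolving a ‘#’-prefixed reference to the resource that carries the matching ‘xml:id’ could look roughly like this. It is a sketch under assumptions: the function name, the ‘oggserial’ values, and the ‘image/png’ type in the sample document are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def resolve_reference(metadata, uri):
    """Return the 'resource' element that a '#id' reference points to, else None."""
    if not uri.startswith("#"):
        return None  # external reference (e.g. a web URL); not resolved here
    target = uri[1:]
    for element in metadata.iter():
        is_resource = element.tag.rsplit("}", 1)[-1] == "resource"
        if is_resource and element.get(XML_ID) == target:
            return element
    return None


# Hypothetical document: serial numbers and MIME types are placeholders.
metadata = ET.fromstring(
    "<metadata>"
    '<resource xml:id="embedded-image" oggserial="0x0001" type="image/png" />'
    '<resource oggserial="0x0002" type="audio/vorbis">'
    '<collection><artwork uri="#embedded-image" /></collection>'
    "</resource>"
    "</metadata>"
)
artwork = metadata.find("./resource/collection/artwork")
image = resolve_reference(metadata, artwork.get("uri"))
```

External references are deliberately returned as `None` here; a player would fetch those through its normal URL handling instead.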
The optional ‘title’, ‘subtitle’, and ‘tagline’ child elements function in the same way as the corresponding child elements of ‘resource’.
<collection> <title>Great Audio VI</title> <tagline>Music that rocks you!</tagline> </collection>
Note that CD singles are indeed collections too.
The ‘encoding’ element describes the encoding or digitisation of the resource.
The optional ‘date’ child element describes when the file was encoded. When the file is re-encoded, the original date of encoding should be preserved, and another ‘date’ element should be added with the date of re-encoding.
<encoding> <date>2019-01-21</date> </encoding>
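Since re-encoding adds a further ‘date’ element rather than replacing the first one, a reader can recover both the original and the most recent encoding date. A minimal sketch, assuming plain calendar dates (YYYY-MM-DD); the function name and the second date in the sample are hypothetical:

```python
from datetime import date
import xml.etree.ElementTree as ET


def encoding_dates(encoding):
    """Return all dates listed in an 'encoding' element, oldest first."""
    found = [
        date.fromisoformat(child.text.strip())
        for child in encoding
        if child.tag.rsplit("}", 1)[-1] == "date" and child.text
    ]
    return sorted(found)


# The second date stands in for a later re-encoding.
encoding = ET.fromstring(
    "<encoding><date>2019-01-21</date><date>2019-03-02</date></encoding>"
)
dates = encoding_dates(encoding)
original, latest = dates[0], dates[-1]
```

Sorting makes the result independent of the order in which the ‘date’ elements were written.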
The optional ‘source’ child element describes the original media source for the encoding. The required ‘media’ attribute must be either ‘cd’, ‘dvd’, ‘tape’, ‘web-stream’, ‘tv-stream’, ‘radio-stream’, ‘file’, or ‘unknown’. The optional ‘uri’ attribute should uniquely identify the media.
<encoding> <source media="cd" uri="urn:x-isrc:0123456789" /> </encoding>
The optional ‘software’ child element describes the software used for the encoding. The optional ‘title’ attribute describes the software name. The optional ‘version’ attribute describes the software version. The required ‘uri’ attribute should uniquely identify the software (and version).
<encoding> <software title="flac" version="2.2" uri="http://xiph.org/flac/" /> </encoding>
Note: The software version attribute is important for one reason: it makes it much easier to find out which files need to be re-encoded (from a huge collection) if a bug is ever found in a software release.
The ‘performers’ element describes by whom the resource was performed. The optional ‘sort’ attribute describes the preferred performer for listing. (This sorting attribute was included for backwards compatibility with music library managers/players that list only one artist's name.)
<performers sort="White Stripes, The"> […]</performers>
The optional ‘person’ child element describes a performer. The ‘name’ child element describes the performer’s name.
<performers> <person> <name>Jack White</name></person></performers>
The required ‘musician’ child element has no value. The optional ‘instrument’ child element describes the instrument used by the performer, with one of the following values: ‘wind’, ‘lamellophone’, ‘percussion’, ‘string’, ‘voice’, ‘electronic’, and ‘keyboard’. Two additional values for this child element are ‘vocal’ and ‘lead-vocal’.
<performers> <person> <name>Jack White</name> <musician /> <instrument>string</instrument> <instrument>keyboard</instrument> <instrument>lead-vocal</instrument></person></performers>
Note: When searching for ‘Jack White’ as a guitarist, the above example should suffice, as a guitar is grouped under string instruments. This should be considered when implementing the above elements in search engines.
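One way a search engine might account for this grouping is to map a specific instrument in the query to its instrument group before comparing. The sketch below is purely illustrative: only the guitar-to-string mapping comes from the note above, and the other table entries and the function name are assumptions.

```python
# Only 'guitar' -> 'string' is taken from the note above;
# the remaining entries are illustrative guesses.
INSTRUMENT_GROUPS = {
    "guitar": "string",
    "violin": "string",
    "piano": "keyboard",
    "drums": "percussion",
    "flute": "wind",
}


def matches_instrument(person_instruments, query):
    """True when the queried instrument, or its group, is listed for the person."""
    wanted = query.lower()
    group = INSTRUMENT_GROUPS.get(wanted, wanted)
    return group in person_instruments


jack_white = ["string", "keyboard", "lead-vocal"]
is_guitarist = matches_instrument(jack_white, "guitar")
```

A query for a group name itself (such as ‘keyboard’) passes through the table unchanged and is matched directly.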
The required ‘actor’ child element’s optional ‘portrait’ attribute gives the name of the fictional character the actor portrays in a film (the role).
<performers> <person> <name>Gael García Bernal</name> <actor portrait="Stéphane Miroux" /></person></performers>
Other roles and organisations
Person elements can optionally contain ‘director’, ‘producer’, and ‘writer’ child elements.
<performers> <person> <name>Michel Gondry</name> <director /></person></performers>
The ‘organisation’ child element describes organisations involved in the production of the resource. The optional ‘uri’ attribute identifies the organisation using a (preferably unique) URI.
<performers> <organisation uri="http://partizan.com/">Partizan</organisation></performers>
The ‘texts’ element links media resources with CMML text resources in the stream, such as song lyrics and film subtitles. The required ‘uri’ attribute must point to another resource's ‘xml:id’ attribute or to an external web URL resource. The optional ‘type’ attribute specifies the MIME type of an external resource. (Using the ‘type’ attribute or an external URL is discouraged. Keep things in the stream, so to speak.)
<texts uri="#example-text-resource" />
The ‘recording’ element describes recording conditions.
The optional ‘date’ child element describes when the recording was made.
<recording> <date>2018-10-17</date> </recording>
The optional ‘duration’ child element describes how long the recording lasts. This value must be specified as a colon-separated value containing days:hours:minutes:seconds:milliseconds. When a field is not needed, it should be left blank or have the value zero (‘0’). The example below says zero days, zero hours, seven minutes, four seconds, and 54 milliseconds.
<recording> <duration>::07:04:54</duration> </recording>
Note: When displaying durations in players, it might be better to convert days to hours, or vice versa. This recommendation does not specify in which cases this should be done. Durations and times are presented in different ways according to local conventions (and user preference). Displaying a shortened full time (leaving out the null values) according to local conventions will probably be the best way. Example: ‘2 days, 23 hours, and 18 seconds’, or, converted to days, ‘2.96 days’. Displaying ‘2:23::18:’ (the same value as in the previous examples) may be confusing to users who are not used to this format, even though it is the shortest way to display the full duration.
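The colon-separated duration value maps naturally onto a time span object. The following is a minimal sketch, assuming exactly five fields and treating blank fields as zero; the function name is hypothetical.

```python
from datetime import timedelta

# Field order as given in the format: days:hours:minutes:seconds:milliseconds.
FIELDS = ("days", "hours", "minutes", "seconds", "milliseconds")


def parse_duration(text):
    """Parse 'days:hours:minutes:seconds:milliseconds'; blank fields count as zero."""
    parts = text.strip().split(":")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} colon-separated fields: {text!r}")
    values = {
        name: int(part) if part.strip() else 0
        for name, part in zip(FIELDS, parts)
    }
    return timedelta(**values)


short = parse_duration("::07:04:54")
long_ = parse_duration("2:23::18:")
days_as_decimal = round(long_.total_seconds() / 86400, 2)
```

Converting to a total number of seconds first makes it straightforward to present the duration in whichever unit suits the user's locale.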
The optional ‘location’ child element describes where the recording was made in a human-readable way. The optional ‘lat’ and ‘long’ attributes are the machine-readable latitude and longitude of the recording position.
<recording> <location lat="22.20N" long="114.11E">Hong Kong, China, Earth</location> </recording>
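The ‘lat’ and ‘long’ values could be converted to signed numbers for use with mapping software. The sketch below assumes the values are decimal degrees followed by a hemisphere letter, which the format does not state explicitly; the function name is hypothetical.

```python
def parse_coordinate(text):
    """Convert a value such as '22.20N' into signed decimal degrees.

    Assumption: the number is in decimal degrees; south and west are negative.
    """
    text = text.strip()
    if len(text) < 2:
        raise ValueError(f"not a coordinate: {text!r}")
    hemisphere = text[-1].upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere in {text!r}")
    value = float(text[:-1])
    return -value if hemisphere in ("S", "W") else value


latitude = parse_coordinate("22.20N")
longitude = parse_coordinate("114.11E")
```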
The ‘rights’ element describes the copyright and license status of the resource.
The optional ‘date’ child element describes when the copyright was put in place. This is especially useful when determining when a work's copyright expires.
<rights> <date>2018-10-20</date> </rights>
The optional ‘license’ child element is a short and human-readable version of the full license.
<rights> <license>© 2018 Recording Company. All distribution rights reserved.</license> </rights>
The optional ‘link’ child element can point, via its ‘uri’ attribute, to any URI where a full version of the license is available. This means it can also point to a ‘resource’ element via that element's ‘xml:id’ attribute.
<rights> <link type="text/html" uri="http://licenses.record-company.com/artist.html" /> </rights>
Describing titles, subtitles, and taglines
The ‘title’ element describes the resource's title.
<title>Awesome Audio Track</title>
The ‘subtitle’ element describes a secondary title.
<subtitle>The Sound of Music</subtitle>
The ‘tagline’ element describes promotional taglines and slogans.
<tagline>Get to the real sound!</tagline>
- 2007-11-25 – Began work on simplifying the format.
- 2007-09-08 – Wiki page created based on original format and suggestions from the mailing list.
- 2007-09-06 – Format proposed on the ogg-dev mailing list by Daniel Aleksandersen.