VorbisComment: Difference between revisions

From XiphWiki

Revision as of 07:17, 1 April 2009

VorbisComment is a base-level metadata format initially created for use with Ogg Vorbis. It has since been adopted in the specifications of Ogg encapsulations for other Xiph.Org codecs, including Theora, Speex and FLAC.

The use case for VorbisComment is given as:

... much like someone jotting a quick note on the bottom of a CDR. It should be a little information to remember the disc by and explain it to others; a short, to-the-point text note that need not only be a couple words, but isn't going to be more than a short paragraph.[1]

VorbisComments are typically used to provide basic information like the title and copyright holder of a work. As such the scope is similar to that of ID3 tags used with MP3 files. VorbisComment is widely supported on portable Ogg Vorbis players as well as streaming, editing and playback software.

Although the syntax of VorbisComment is well-specified, various conventions exist for the field names in use. The goal for this page is to codify best practices and collect proposals for standardization of VorbisComment field names.

VorbisComments are typically encoded as the second packet in a codec stream. When VorbisComments are included in the first (i.e. Theora) stream of an Ogg Theora file, they are assumed to cover all streams in the multiplexed group.[2]
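
As a rough illustration of the syntax mentioned above, the following Python sketch reads and writes the body of a comment block: a length-prefixed vendor string, a comment count, and then length-prefixed UTF-8 strings of the form NAME=value, with all lengths stored as 32-bit little-endian integers. It assumes that any codec-specific packet header and framing bit have already been stripped; the function names are made up for this example.

```python
import struct


def build_vorbis_comment(vendor, comments):
    """Serialize a vendor string and (name, value) pairs into a comment block."""
    def pack(text):
        raw = text.encode("utf-8")
        return struct.pack("<I", len(raw)) + raw

    out = pack(vendor) + struct.pack("<I", len(comments))
    for name, value in comments:
        out += pack(f"{name}={value}")
    return out


def parse_vorbis_comment(data):
    """Parse a bare comment block into (vendor, [(NAME, value), ...])."""
    pos = 0

    def read_string():
        nonlocal pos
        (length,) = struct.unpack_from("<I", data, pos)
        pos += 4
        raw = data[pos:pos + length]
        if len(raw) != length:
            raise ValueError("truncated comment block")
        pos += length
        return raw.decode("utf-8")

    vendor = read_string()
    (count,) = struct.unpack_from("<I", data, pos)
    pos += 4
    comments = []
    for _ in range(count):
        name, sep, value = read_string().partition("=")
        if not sep:
            raise ValueError("comment without '=' separator")
        # Field names are case-insensitive, so normalise them to upper case.
        comments.append((name.upper(), value))
    return vendor, comments
```

Multiple comments with the same field name are allowed by the format, which is why the sketch returns a list of pairs rather than a dictionary.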

VorbisComment is the simplest and most widely-supported mechanism for storing metadata with Xiph.Org codecs. For other existing and proposed mechanisms, see Metadata.

Recommended field names

The current VorbisComment recommendation contains a recommended set of field names for comments.

Proposed field names

Some proposals for extra field names:

Comments are intended to be free-form, but for the purposes of interoperability, it is helpful to define tag sets for particular applications, and provide some guidelines for machine parsing.

Some field names, such as ENCODER, DATE, RIGHTS-DATE, and RIGHTS-URI, may have to be non-free-form to allow machine parsing. See the reasoning below.

Cover art

VorbisComments don't officially support album cover art yet. Since this is a frequently requested feature, the goal is to find a consensus and an official standard on how to embed (or link) album cover art pictures within Ogg Vorbis files.

Unofficial "COVERART" field

There is an unofficial and poorly supported comment field named "COVERART". It contains the binary picture data as a base64-encoded string (usually a JPEG file, but other file formats are possible). The disadvantages are that

  • no additional information, such as a description of the cover art, is provided,
  • many tag editors display the base64 string as plain text because they do not support the "COVERART" field,
  • it may break playback on hardware players because of the large Vorbis comment header,
  • the cover art can't be linked.
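
The following Python sketch shows what the unofficial field amounts to and why the header grows: base64 turns every 3 bytes of picture data into 4 characters. It is only an illustration; the helper names are invented here, and real taggers may differ in details.

```python
import base64


def make_coverart_comment(image_bytes):
    """Build a 'COVERART=<base64>' comment string from raw picture data."""
    return "COVERART=" + base64.b64encode(image_bytes).decode("ascii")


def read_coverart_comment(comment):
    """Recover the raw picture data from a COVERART comment string."""
    name, _, value = comment.partition("=")
    if name.upper() != "COVERART":
        raise ValueError("not a COVERART comment")
    return base64.b64decode(value)


# Stand-in for picture data: 768 bytes become 1024 base64 characters.
picture = bytes(range(256)) * 3
comment = make_coverart_comment(picture)
```

Note that the comment carries nothing but the encoded data, so neither the picture format nor a description can be recovered without inspecting the decoded bytes.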

The unofficial "COVERART" field is supported for example by such software as AudioShell (http://www.softpointer.com/AudioShell.htm) and Total Recorder (http://www.totalrecorder.com/).

Proposal

Placing the binary FLAC coverart structure within a vorbis comment named "BINARY_COVERART" would have the following benefits:

  • Easy to use for developers since the identical (or similar) structure is also used by FLAC and MP3, which means that chances are good that people and software programmers are willing to support this.
  • Old C / C++ based implementations don't display the binary data as a string, since it always starts with a zero byte, which is an empty string when interpreted as a NUL-terminated string.
  • The cover art can either be linked or embedded within the stream.
  • All common picture file formats are supported (jpg, gif, whatever).
  • Additional information like a description or the picture type (front cover, back cover...) is supported.

Possible disadvantages are:

  • As with the base64 "COVERART" field, it might break playback of existing players (especially hardware players, software players could be updated easily). A workaround would be to link the picture within the tag, or to notify a user of a software tagger that his hardware player might not support playback of the file if he embeds a picture.
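
For reference, a sketch of how the FLAC picture structure could be serialized in Python is given below. The field order and the 32-bit big-endian integers follow the FLAC picture block; how exactly the result would be placed in a "BINARY_COVERART" comment is an assumption of this example, and the function name is made up.

```python
import struct


def pack_flac_picture(picture_type, mime, description,
                      width, height, depth, colors, data):
    """Serialize a FLAC-style picture structure (all integers big-endian)."""
    mime_raw = mime.encode("ascii")
    desc_raw = description.encode("utf-8")
    return b"".join([
        struct.pack(">II", picture_type, len(mime_raw)), mime_raw,
        struct.pack(">I", len(desc_raw)), desc_raw,
        struct.pack(">IIIII", width, height, depth, colors, len(data)),
        data,
    ])


# Embedded front cover (picture type 3); the three data bytes are a stand-in.
embedded = pack_flac_picture(3, "image/jpeg", "Front", 500, 500, 24, 0,
                             b"\xff\xd8\xff")

# Linked cover: the FLAC structure uses the MIME type "-->" to signal that
# the data part is a URL. The URL here is purely hypothetical.
linked = pack_flac_picture(3, "-->", "Front", 0, 0, 0, 0,
                           b"http://example.org/cover.jpg")
```

Because the picture type is a small number stored in four big-endian bytes, the structure begins with a zero byte, which is the property the second benefit above relies on.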


In order to test if there are playback problems with this proposal, there is a test file available here. You're invited to download this file, test playback on your software and hardware players, and report the results here on the wiki.

Tested software players

  • Audacious 1.5.1: no problem
  • foobar2000: no problems
  • Gnome: built-in preview playback: no problem
  • MediaMonkey: no problems
  • Media Player Classic (unicode build) 6.4.9.1: no problem
  • RoarAudio: no problems (server and client side)
  • Rhythmbox 0.11.6: no problem
  • Totem 2.24.3: no problem
  • VLC 0.9.4/0.9.6: doesn't play
    • Patch sent to VLC to fix this; it should get into 1.0.0
  • WinAmp: no problems
  • Windows Media Player 11: no problem
  • XMPlay 3.4.2: no problem
  • Nero ShowTime: no problem

Tested hardware players

  • Logitech Squeezebox: doesn't play this file (or any other Ogg file with an embedded picture)
    • Workaround: The required server software (SqueezeCenter) can convert Ogg to MP3 on the fly, and has no problem converting Ogg files with embedded pictures
  • Sandisk Sansa Fuze (Firmware 01.01.22): hangs when trying to play back the demo file; the player had to be reset
    • Note: The "Fuze" can play Ogg Vorbis files which have embedded pictures from "Easytag"
  • Cowon iAudio U3 (Firmware 1.29, 4 GB): works
  • Cowon D2: no problem (latest Firmware: 2.59, 8GB Version)
  • iRiver E100: no problem (latest Firmware: 1.16 G_U, 8GB Version)

Tested tag editors

  • Easytag 2.1.6: can open the file to edit the normal tag fields
  • MP3Tag 2.42e: can open the file to edit the normal tag fields

Tested other software

  • Total Recorder: can open the file without a problem. After re-saving, the content of the "BINARY_COVERART" field is lost.

Dates and time

The goal is to specify one standard format for describing dates and time.

ISO proposal

The date format for any field describing a date must follow the ISO scheme: YYYY-MM-DD or shortened to just YYYY-MM or simply YYYY.

We have been recommending this usage with the DATE tag for some time. It is proposed that the spec be amended to include this information for machinability.

The time format for any field except track duration must be specified with a leading T and end with a time zone. Schemas with and without dates: YYYY-MM-DDTHH:MM:SS+TZ and THH:MM+TZ
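
A validator for these schemas might look like the Python sketch below. The proposal does not spell out the time zone syntax, so the sketch assumes the ISO 8601 forms "Z" and ±HH:MM; it also assumes that a time may only follow a complete date. The names are invented for this example.

```python
import re

YEAR = r"\d{4}"
MONTH = r"(?:0[1-9]|1[0-2])"
DAY = r"(?:0[1-9]|[12]\d|3[01])"
# Assumed time zone syntax: "Z" or a signed HH:MM offset.
ZONE = r"(?:Z|[+-](?:[01]\d|2[0-3]):[0-5]\d)"
TIME = rf"T(?:[01]\d|2[0-3]):[0-5]\d(?::[0-5]\d)?{ZONE}"
STAMP = re.compile(rf"(?:{YEAR}(?:-{MONTH}(?:-{DAY}(?:{TIME})?)?)?|{TIME})")


def is_valid_timestamp(value):
    """Check the shape of a date, date-time or time-only value.

    Only the syntax is checked; a day such as 02-31 is not rejected.
    """
    return STAMP.fullmatch(value) is not None
```

With this pattern, values such as 2009, 2009-04, 2009-04-01T07:17:00+00:00 and T07:17+02:00 are accepted, while 2009-4-1 or a time without a zone are rejected.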

New ENCODER field name proposal

The goal is to attribute encoder software. This value can be used in the future to determine which files can be improved by being re-encoded with a newer version.

Comment: What is lacking from the vendor string present in the spec from the start? All libvorbis and encoder tunings I'm aware of have recorded the encoder version here.
Note that ffmpeg2theora uses ENCODER, but does not include a url.
A URI or URL, especially one with version numbering, will be more unique. See the above goal for this comment.
I've also seen ENCODED_BY.
ENCODED_BY is usually the person who did the encoding. This should not be part of the recommendation due to legal problems around deliberate and accidental distribution to third parties. Basically the name of the encoder should not be included to protect encoders from their own egos and possible legal prosecution.
I am trying to get the specification to state that this field must contain a unique URL and version number, for the reason listed above. Whether to include the field at all would of course be optional.

Proposal

The value of the ENCODER field must be a unique URL providing both the encoder software name and version. If no unique URL is available that provides both name and version, then the version number can be specified after the URL, separated by a space character. For example:

ENCODER=http://flac.sourceforge.net/ 1.2.1
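
A sketch of how software could use such a value to find files worth re-encoding is shown below. It assumes dotted numeric version strings and an exact URL match, neither of which the proposal specifies; the function names and the "newest" version 1.3.0 are hypothetical.

```python
def parse_encoder(value):
    """Split an ENCODER value into (url, version or None)."""
    url, _, version = value.strip().partition(" ")
    return url, (version.strip() or None)


def version_key(version):
    """Turn '1.2.1' into (1, 2, 1); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def is_outdated(value, encoder_url, newest_version):
    """True if the file was made by an older version of the given encoder."""
    url, version = parse_encoder(value)
    if url != encoder_url or version is None:
        return False
    return version_key(version) < version_key(newest_version)


url, version = parse_encoder("http://flac.sourceforge.net/ 1.2.1")
```

When the version is already part of the URL, no separate version can be extracted and the sketch simply reports the file as not outdated.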

Improving license data

The goal is to provide a method for proclaiming license and copyright information (basically clarifying ‘distribution rights (if any) and ownership’).

The specification document describes LICENSE and COPYRIGHT fields, but it is not clear enough about whether these should be machine-readable.

We should consider working together with Creative Commons to have complementary and interlinked information on the CC and Xiph wikis. Refer to the Ogg page in the CC wiki.

New RIGHTS field name proposal

One proposal is to replace the COPYRIGHT and LICENSE field names with RIGHTS. RIGHTS must be a human-readable copyright statement. Basic example:

RIGHTS=Copyright © Recording Company Inc. All distribution rights reserved.

But this is not machine-readable. Adding two complementary field names should do the trick: RIGHTS-DATE, describing the date of copyright; and RIGHTS-URI, providing a method for linking to a license. Software agents can assume that multiple songs use the same URIs, as is the case for Creative Commons. Full example:

RIGHTS=Copyright © 2019 Recording Company Inc. All distribution rights reserved.
RIGHTS-DATE=2019-04
RIGHTS-URI=http://somewhere.com/license.xhtml

Software for multimedia management and playback is encouraged to display the RIGHTS statement as a linked phrase using RIGHTS-URI.

RIGHTS-DATE does not need to be displayed, as the date is already required in the human-readable version by international copyright agreements. RIGHTS-DATE can be used to determine when a copyrighted work falls into the public domain and related matters. (The Beatles' copyright on their original studio recordings (not the remixes) is soon expiring, so mechanisms such as RIGHTS-DATE are indeed required in music management and filesharing software!)

To remain machine-readable it would be required to have at most one instance of each RIGHTS field name. All fields would of course remain optional.
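
The proposal could be handled by software roughly as in the Python sketch below, which enforces the at-most-one rule and renders the RIGHTS statement as a linked phrase. Rendering to HTML is an assumption made for the example, as are the function names; the input is a list of (name, value) pairs.

```python
import html

RIGHTS_FIELDS = ("RIGHTS", "RIGHTS-DATE", "RIGHTS-URI")


def collect_rights(comments):
    """Gather the RIGHTS fields, rejecting repeated instances."""
    found = {}
    for name, value in comments:
        key = name.upper()
        if key in RIGHTS_FIELDS:
            if key in found:
                raise ValueError(f"more than one {key} field")
            found[key] = value
    return found


def render_rights(found):
    """Render RIGHTS as HTML, linked to RIGHTS-URI when one is present."""
    text = html.escape(found.get("RIGHTS", ""))
    uri = found.get("RIGHTS-URI")
    if text and uri:
        return f'<a href="{html.escape(uri, quote=True)}">{text}</a>'
    return text


rights = collect_rights([
    ("RIGHTS", "Copyright © 2019 Recording Company Inc. "
               "All distribution rights reserved."),
    ("RIGHTS-DATE", "2019-04"),
    ("RIGHTS-URI", "http://somewhere.com/license.xhtml"),
])
```

RIGHTS-DATE is collected but not rendered, in line with the note above that it does not need to be displayed.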

The Dublin Core Metadata Initiative recommends the use of ‘rights’ to describe license and copyright matters. The web feed format Atom 1.0 has implemented a rights element in their specification.

Improving existing fields proposal

Similar to the DATE tag above, we have generally recommended that a URL uniquely identifying the license be included in the LICENSE field to allow machine identification of the license. This is in agreement with the proposal in the CC wiki. Since the COPYRIGHT field is a human-readable statement of the copyright, like the proposed RIGHTS tag above, some people include a license URL there. Therefore, if no URL can be found in a LICENSE tag (if one exists), applications should use one from the COPYRIGHT tag (if one exists). Contact information for verification, attribution, relicensing, etc. can be obtained from the COPYRIGHT field, but CC also recommends a separate CONTACT tag for this information. This is reasonable, so we propose it be included.
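
The lookup order described above can be sketched in Python as follows. The URL pattern is a deliberately simple assumption (http or https, ending at whitespace), and the function name is invented; the input is a list of (name, value) pairs.

```python
import re

# Simplistic URL matcher, good enough for a sketch.
URL_RE = re.compile(r"https?://[^\s<>\"]+")


def find_license_url(comments):
    """Return a license URL from LICENSE, falling back to COPYRIGHT."""
    for field in ("LICENSE", "COPYRIGHT"):
        for name, value in comments:
            if name.upper() != field:
                continue
            match = URL_RE.search(value)
            if match:
                # Drop punctuation that belongs to the surrounding sentence.
                return match.group(0).rstrip(".,;)")
    return None
```

A file carrying only COPYRIGHT=2009 Example Band, see http://creativecommons.org/licenses/by/3.0/. would thus still yield the Creative Commons URL.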

Attributing involved parties

The goal is to attribute more persons and organisations involved in audio and music productions to make room for more advanced search and sorting.

NO PROPOSALS! Needs much extending beyond just ARTIST field name. See work at proposed XML replacement for Vorbis Comments, M3F.

Geo Location fields

The LOCATION field is meant to carry a human readable location for the recording/creation of the media file.

Having geographical coordinates according to WGS84 can be useful as well, especially in a form that can be machine parsed. The agreed format is similar to this geo microformat:

GEO_LOCATION= latitude ; longitude [; elevation ]

where each value is a fixed point decimal number formatted in the C locale with a period (.) for the radix. Values are separated with a ';' and white space is not significant. The elevation is optional.

latitude is the geo latitude location of where the media has been recorded or produced in decimal degrees according to WGS84 (zero at the equator, negative values for southern latitudes) (C double).

longitude is the geo longitude location of where the media has been recorded or produced in decimal degrees according to WGS84 (zero at the prime meridian in Greenwich/UK, negative values for western longitudes). (C double).

elevation is the geo elevation of where the media has been recorded or produced in meters according to WGS84 (zero is average sea level) (C double).
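
A parser for this format might look like the following Python sketch. It accepts only plain fixed point numbers (no exponents), ignores white space around the values, and adds range checks for latitude and longitude that follow from WGS84 but are not stated explicitly above. The function name is made up.

```python
import re

# Fixed point decimal with a period as the radix, e.g. "-33.8568" or "34".
FIXED_POINT = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)")


def parse_geo_location(value):
    """Parse 'latitude ; longitude [; elevation]' into a tuple of floats."""
    parts = [part.strip() for part in value.split(";")]
    if len(parts) not in (2, 3):
        raise ValueError("expected two or three ';'-separated values")
    for part in parts:
        if not FIXED_POINT.fullmatch(part):
            raise ValueError(f"not a fixed point number: {part!r}")
    latitude, longitude = float(parts[0]), float(parts[1])
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude out of range")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude out of range")
    elevation = float(parts[2]) if len(parts) == 3 else None
    return latitude, longitude, elevation
```

Python's float() always uses a period as the radix regardless of the process locale, which matches the "C locale" requirement.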

Character encoding

The goal is to offer better support for more languages and make machine processing faster.

The specification should be a little more strict to achieve this.

Proposals

Field names may be UTF-8 and all UPPERCASE for easier machine processing.

Allowing tag names to be UTF-8 instead of ASCII is a backwards-incompatible spec change. If we did this, requiring that the case mapping happen in the tagging application rather than in decoders is reasonable, since case mapping in Unicode is non-trivial.

The original argument for ASCII was that we need standardized tag names for interoperability, so there's no point in being able to localize them, and we might as well go with our native prejudice. Localizing the values should be done by appending a language code to the tag, since this is both machinable and there may be collisions between translated tag names.

UTF-8 is a bad idea in field names. The field names are for machine interpretation, localisation should be done on the software side. UTF-8 introduces matching problems (canonical form) and encoding/decoding problems (difficulty in finding length of a string). Please sign comments; I think the above is a cumulative set of comments, but no idea.--Imalone 01:11, 17 September 2007 (PDT)
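
The matching problem can be illustrated with a short Python sketch. The field name ANNÉE is a made-up example; it can be stored either with a precomposed É or as E followed by a combining accent, and the two spellings differ byte for byte even though they look identical. The comparison helpers are likewise only illustrations.

```python
import unicodedata

composed = "ANN\u00c9E"      # É as a single code point
decomposed = "ANNE\u0301E"   # E followed by a combining acute accent


def ascii_names_equal(a, b):
    """Case-insensitive comparison of ASCII field names given as bytes."""
    return a.upper() == b.upper()


def unicode_names_equal(a, b):
    """Approximate caseless comparison of Unicode field names."""
    def key(text):
        folded = unicodedata.normalize("NFC", text).casefold()
        return unicodedata.normalize("NFC", folded)
    return key(a) == key(b)
```

The ASCII comparison is a plain byte operation, whereas the Unicode comparison needs normalization tables, which is the extra work for decoders that the comments above are concerned about.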

Implementations