MatroskaOpus

From XiphWiki

(Difference between revisions)
Jump to: navigation, search
(Document some more of the outstanding issues with the Matroska mapping)
(DRAFT: empty is the same as not present)
(9 intermediate revisions not shown)
Line 2: Line 2:
This is an encapsulation spec for the [[Opus]] codec in [[http://matroska.org/ Matroska]]. There are a number of outstanding functional issues with muxing Opus in Matroska, and until those are resolved, use of this spec is NOT RECOMMENDED.
This is an encapsulation spec for the [[Opus]] codec in [[http://matroska.org/ Matroska]]. There are a number of outstanding functional issues with muxing Opus in Matroska, and until those are resolved, use of this spec is NOT RECOMMENDED.
-
 
-
Opus has few signaling requirements, so a simple mapping:
 
-
 
-
- CodecID is A_OPUS
 
-
- SampleFrequency is always 48000
 
-
- Channels is 1 or 2 based on what the muxer knows about the input
 
-
- CodecPrivate is void.
 
-
 
-
However, this doesn't work for multistream. Supporting multistream requires signalling the number of Opus streams packed in each frame and the mapping from those to output channels through the container.
 
-
 
-
To support multistream, we place the complete 'OpusHead' header packet from [[OggOpus]], as defined there, in the CodecPrivate element. This provides the number of streams and the channel mapping table, as well as related features like pre-skip and gain which improve the chances of lossless remuxing between the two encapsulations.
 
  - CodecID is A_OPUS
  - CodecID is A_OPUS
Line 19: Line 8:
  - CodecPrivate is the 'OpusHead' packet, identical to the Ogg mapping
  - CodecPrivate is the 'OpusHead' packet, identical to the Ogg mapping
-
The second 'OpusTags' header packet from OggOpus is not used in the Matroska encapsulation. Matroska has its own system for tag metadata, and this avoids duplicating it and the need for sub-framing to index multiple packets within the CodecPrivate element.
+
The 'OpusHead' format is defined by the [[http://tools.ietf.org/html/draft-terriberry-oggopus Ogg Opus]] mapping. In particular it includes pre-skip, gain, and the channel mapping table required for correct surround output.
 +
 
 +
The second 'OpusTags' header packet from Ogg Opus is not used in the Matroska encapsulation. Matroska has its own system for tag metadata, and this avoids duplicating it and the need for sub-framing to index multiple packets within the CodecPrivate element.
 +
 
 +
If the CodecPrivate is empty or not present and Channels is 1 or 2, players MAY treat it as a sane set of defaults, I guess. e.g. channel mapping family 0, no pre-skip or gain. For Channels > 2 the track MUST be rejected, since there's no way to map the encoded substreams to channels.
-
If the CodecPrivate is empty, players should treat it as the simpler mapping, I guess.
+
== Open Questions ==
-
== Questions ==
+
Seeking in Opus files requires a pre-roll (recommended to be at least 80 ms). However, currently Matroska requires its index entries to point directly to the data whose timestamp matches the corresponding seek point, not some place arbitrarily before that timestamp. These two requirements are incompatible, and mean that seeking in Opus will be broken in all existing Matroska software. In particularly unlucky cases (e.g., around a transient), playing back audio decoded without any pre-roll can produce extremely loud (possibly equipment-damaging) results. We need a new element to signal this, e.g. Track::TrackEntry::PreRoll.
Should we say muxers MAY or SHOULD NOT produce simple streams without filling in CodecPrivate?
Should we say muxers MAY or SHOULD NOT produce simple streams without filling in CodecPrivate?
-
How does the OpusHead pre-skip field interact with the timestamps? The SimpleBlock timestamp is signed 16 bits, so the format can signal about half of the pre-skip if playback timestamps are to start at zero.
+
How does the OpusHead pre-skip field interact with the timestamps? The SimpleBlock timestamp is signed 16 bits, so the format can signal about half of the pre-skip if playback timestamps are to start at zero. Moritz suggests this won't work because the resolution of the timestamps is controlled by the muxer, so the SimpleBlock timestamp offset isn't sample accurate anyway.[[http://lists.matroska.org/pipermail/matroska-devel/2012-September/004254.html ref]]
One could set an incorrect timestamp on the skipped blocks, and rely on the decoder to drop them based on the OpusHead preskip value. As long as the initial blocks are timestamped <= start of output this shouldn't affect seeking.
One could set an incorrect timestamp on the skipped blocks, and rely on the decoder to drop them based on the OpusHead preskip value. As long as the initial blocks are timestamped <= start of output this shouldn't affect seeking.
Line 33: Line 26:
How important is it that timestamps start at zero in a Matroska file?
How important is it that timestamps start at zero in a Matroska file?
-
The SimpleBlock structure also has an 'invisible' bit, which tell the player to decode, but not display, the contained frames. This lets the muxer signal the pre-skip semantics with frame accuracy, but not sample accuracy. If players implement this it will at least help with sync. Libav does not appear to support the invisible bit.
+
The SimpleBlock structure also has an 'invisible' bit, which tells the player to decode, but not display, the contained frames. This lets the muxer signal the pre-skip semantics with frame accuracy, but not sample accuracy. If players implement this it will at least help with sync. Libav does not appear to support the invisible bit.
-
Seeking in Opus files requires a pre-roll (recommended to be at least 80 ms). However, currently Matroska requires its index entries to point directly to the data whose timestamp matches the corresponding seek point, not some place arbitrarily before that timestamp. These two requirements are incompatible, and mean that seeking in Opus will be broken in all existing Matroska software. In particularly unlucky cases (e.g., around a transient), playing back audio decoded without any pre-roll can produce extremely loud (possibly equipment-damaging) results.
+
How can sample-accurate end-time trimming work in Matroska? Currently all software encapsulating Vorbis in Matroska is broken in this regard, and muxing a Vorbis file in Matroska causes it to get longer (i.e., produce more audio output than the original Ogg file). It would be unfortunate to repeat this disaster for Opus. This needs a new element specifying the number of samples to trim, perhaps a new BlockGroup child.
-
How can sample-accurate end-time trimming work in Matroska? Currently all software encapsulating Vorbis in Matroska is broken in this regard, and muxing a Vorbis file in Matroska causes it to get longer (i.e., produce more audio output than the original Ogg file). It would be unfortunate to repeat this disaster for Opus.
+
If new elements are required, can they be defined so as to enable correct seeking in rolling intra (a.k.a intra refresh) video as well?

Revision as of 22:49, 14 September 2012

DRAFT

This is an encapsulation spec for the Opus codec in [Matroska]. There are a number of outstanding functional issues with muxing Opus in Matroska, and until those are resolved, use of this spec is NOT RECOMMENDED.

- CodecID is A_OPUS
- SampleFrequecy is 48000
- Channels is number of output PCM channels
- CodecPrivate is the 'OpusHead' packet, identical to the Ogg mapping

The 'OpusHead' format is defined by the [Ogg Opus] mapping. In particular it includes pre-skip, gain, and the channel mapping table required for correct surround output.

The second 'OpusTags' header packet from Ogg Opus is not used in the Matroska encapsulation. Matroska has its own system for tag metadata, and this avoids duplicating it and the need for sub-framing to index multiple packets within the CodecPrivate element.

If the CodecPrivate is empty or not present and Channels is 1 or 2, players MAY treat it as a sane set of defaults, I guess. e.g. channel mapping family 0, no pre-skip or gain. For Channels > 2 the track MUST be rejected, since there's no way to map the encoded substreams to channels.

Open Questions

Seeking in Opus files requires a pre-roll (recommended to be at least 80 ms). However, currently Matroska requires its index entries to point directly to the data whose timestamp matches the corresponding seek point, not some place arbitrarily before that timestamp. These two requirements are incompatible, and mean that seeking in Opus will be broken in all existing Matroska software. In particularly unlucky cases (e.g., around a transient), playing back audio decoded without any pre-roll can produce extremely loud (possibly equipment-damaging) results. We need a new element to signal this, e.g. Track::TrackEntry::PreRoll.

Should we say muxers MAY or SHOULD NOT produce simple streams without filling in CodecPrivate?

How does the OpusHead pre-skip field interact with the timestamps? The SimpleBlock timestamp is signed 16 bits, so the format can signal about half of the pre-skip if playback timestamps are to start at zero. Moritz suggests this won't work because the resolution of the timestamps is controlled by the muxer, so the SimpleBlock timestamp offset isn't sample accurate anyway.[ref]

One could set an incorrect timestamp on the skipped blocks, and rely on the decoder to drop them based on the OpusHead preskip value. As long as the initial blocks are timestamped <= start of output this shouldn't affect seeking.

How important is it that timestamps start at zero in a Matroska file?

The SimpleBlock structure also has an 'invisible' bit, which tells the player to decode, but not display, the contained frames. This lets the muxer signal the pre-skip semantics with frame accuracy, but not sample accuracy. If players implement this it will at least help with sync. Libav does not appear to support the invisible bit.

How can sample-accurate end-time trimming work in Matroska? Currently all software encapsulating Vorbis in Matroska is broken in this regard, and muxing a Vorbis file in Matroska causes it to get longer (i.e., produce more audio output than the original Ogg file). It would be unfortunate to repeat this disaster for Opus. This needs a new element specifying the number of samples to trim, perhaps a new BlockGroup child.

If new elements are required, can they be defined so as to enable correct seeking in rolling intra (a.k.a intra refresh) video as well?

Personal tools


Main Page

Xiph.Org Projects

Audio—

Video—

Text—

Container—

Streaming—