MatroskaOpus

5,323 bytes added, 13:32, 12 April 2013
Handling Pre-skip data
- SamplingFrequency is 48000
- Channels is number of output PCM channels
- SeekPreRoll is set to 80000000
- CodecPrivate starts with the 'OpusHead' packet, identical to the Ogg mapping, followed by the pre-skip data.
The 'OpusHead' format is defined by the [[http://tools.ietf.org/html/draft-terriberry-oggopus Ogg Opus]] mapping. In particular it includes pre-skip, gain, and the channel mapping table required for correct surround output.
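A minimal sketch of reading those fields from the start of CodecPrivate, assuming the little-endian 'OpusHead' layout given in the Ogg Opus mapping (8-byte magic, version, channel count, pre-skip, input sample rate, output gain, mapping family, then an optional mapping table). The names <code>OpusHead</code> and <code>parse_opus_head</code> are illustrative only, and anything that follows the header in CodecPrivate is ignored because its layout is not defined here.

```python
import struct
from typing import NamedTuple


class OpusHead(NamedTuple):
    version: int
    channels: int
    pre_skip: int            # 48 kHz samples to drop at the start of decoding
    input_sample_rate: int   # informational; decoding always runs at 48 kHz
    output_gain: int         # signed Q7.8 value in dB
    mapping_family: int
    mapping_table: bytes     # stream count, coupled count, channel map; empty for family 0


def parse_opus_head(codec_private: bytes) -> OpusHead:
    """Parse the OpusHead packet found at the start of CodecPrivate."""
    if len(codec_private) < 19 or codec_private[:8] != b"OpusHead":
        raise ValueError("CodecPrivate does not start with an OpusHead packet")
    # Fixed part after the magic: u8, u8, u16, u32, s16, u8 (11 bytes, little-endian).
    version, channels, pre_skip, rate, gain, family = struct.unpack_from(
        "<BBHIhB", codec_private, 8)
    table = b""
    if family != 0:
        # Two count bytes followed by one mapping byte per output channel.
        table_len = 2 + channels
        table = codec_private[19:19 + table_len]
        if len(table) != table_len:
            raise ValueError("truncated channel mapping table")
    return OpusHead(version, channels, pre_skip, rate, gain, family, table)
```

For a stereo track with mapping family 0 the table is empty; for surround layouts the table is what maps the encoded substreams to output channels.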
The second 'OpusTags' header packet from Ogg Opus is not used in the Matroska encapsulation. Matroska has its own system for tag metadata, and this avoids duplicating it and the need for sub-framing to index multiple packets within the CodecPrivate element.
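SeekPreRoll is a new unsigned integer element added to the TrackEntry element. The value is the number of nanoseconds that must be discarded, for that stream, after a seek until the decoded data is valid to render. Opus needs this because seeking requires a pre-roll (recommended to be at least 80 ms); in particularly unlucky cases (e.g., around a transient), playing back audio decoded without any pre-roll can produce extremely loud (possibly equipment-damaging) results.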
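As a rough illustration of how a player might apply SeekPreRoll, the sketch below converts the nanosecond value to 48 kHz samples and works out how much decoder output to drop after a seek. The function names and the choice to round up are assumptions made for this example, not part of the proposal.

```python
OPUS_RATE_HZ = 48_000
NS_PER_SECOND = 1_000_000_000


def ns_to_samples(duration_ns: int, rate_hz: int = OPUS_RATE_HZ) -> int:
    """Convert nanoseconds to samples, rounding up so no invalid sample is rendered."""
    return (duration_ns * rate_hz + NS_PER_SECOND - 1) // NS_PER_SECOND


def samples_to_drop_after_seek(decode_start_ns: int, target_ns: int,
                               seek_pre_roll_ns: int) -> int:
    """Samples of decoder output to discard when decoding starts at decode_start_ns.

    Output is only valid SeekPreRoll after decoding starts, and the player does
    not want to render anything before the seek target either.
    """
    render_start_ns = max(target_ns, decode_start_ns + seek_pre_roll_ns)
    return ns_to_samples(render_start_ns - decode_start_ns)


# SeekPreRoll of 80000000 ns at 48 kHz is 3840 samples.
print(ns_to_samples(80_000_000))
```

If decoding can only start at the seek target itself, the same 3840 samples are still dropped and rendering begins 80 ms late, which is the situation the muxing recommendations try to avoid.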
Block timestamps will match how all other codecs are handled, i.e. the Block timestamp is the starting time of the first PCM sample position in nanoseconds.
(TODO) Define layout of CodecPrivate.
== Muxing Recommendations ==
To prevent extraneous parsing of muxed content by players that want to start playback at exactly time T, we will recommend that muxers start an additional Cluster at T-SeekPreRoll, within the span of Cluster N-1, where T is the start time of Cluster N. Muxers then add CuePoints for all the new T-SeekPreRoll Clusters with a CueTrack of the audio stream. The CuePoints for the video stream will not change.
For example, a file is a muxed MKV with the following characteristics:
- 5 second interval between video keyframes
- Each video keyframe begins a new Cluster
- Cues will contain video keyframe CuePoints
- For each video keyframe at time T there will be a new Cluster at T-SeekPreRoll
- Cues will contain audio CuePoints for T-SeekPreRoll Clusters
- Audio and video are interleaved in monotonically increasing order
Assume SeekPreRoll is 80 milliseconds. The first Cluster starts at 0 milliseconds with a video keyframe Block and has a duration of 4920 milliseconds. The second Cluster starts at 4920 milliseconds with an audio Block and has a duration of 80 milliseconds. Just to be clear, the second Cluster can contain Blocks from all streams. The third Cluster starts at 5000 milliseconds with a video keyframe Block and has a duration of 4920 milliseconds. The fourth Cluster starts at 9920 milliseconds with an audio Block and has a duration of 80 milliseconds.
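The Cluster layout in this example can be reproduced with a small sketch. It assumes millisecond timestamps and a hypothetical helper, <code>cluster_layout_ms</code>, that takes the video keyframe times and the end of the stream; a real muxer would of course decide Cluster boundaries while writing Blocks.

```python
from typing import List, Tuple


def cluster_layout_ms(keyframes_ms: List[int], end_ms: int,
                      seek_pre_roll_ms: int) -> List[Tuple[int, int]]:
    """Return (start, duration) pairs for the Clusters of the recommended layout.

    A Cluster starts at every video keyframe time T and at every T-SeekPreRoll
    that does not fall before the start of the file.
    """
    starts = set(keyframes_ms)
    for t in keyframes_ms:
        if t - seek_pre_roll_ms >= 0:
            starts.add(t - seek_pre_roll_ms)
    ordered = sorted(starts)
    # Each Cluster lasts until the next Cluster starts (or the stream ends).
    ends = ordered[1:] + [end_ms]
    return [(start, end - start) for start, end in zip(ordered, ends)]


for start, duration in cluster_layout_ms([0, 5000, 10000], 15000, 80):
    print(start, duration)
```

With keyframes every 5 seconds and an 80 ms SeekPreRoll, the first four Clusters come out as described in the text: 0 to 4920, 4920 to 5000, 5000 to 9920 and 9920 to 10000.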
With this recommendation, players that want audio and video to start playback at time T can seek to the Cluster at T-SeekPreRoll and start decoding the audio stream. This will work the same for both local and HTTP playback.
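On the player side, choosing where to start decoding could look roughly like the following. It assumes the audio CuePoint times have already been read into a sorted list of milliseconds; <code>audio_seek_cue_ms</code> is a made-up name, and falling back to the start of the file when no CuePoint is early enough is a choice made for this sketch.

```python
import bisect
from typing import List


def audio_seek_cue_ms(audio_cues_ms: List[int], target_ms: int,
                      seek_pre_roll_ms: int) -> int:
    """Pick the latest audio CuePoint at or before target - SeekPreRoll.

    audio_cues_ms must be sorted in increasing order. Returns 0 (start of
    file) when no CuePoint is early enough.
    """
    wanted_ms = max(target_ms - seek_pre_roll_ms, 0)
    index = bisect.bisect_right(audio_cues_ms, wanted_ms) - 1
    return audio_cues_ms[index] if index >= 0 else 0


# Audio CuePoints from the example layout: Clusters at 4920 ms and 9920 ms.
print(audio_seek_cue_ms([4920, 9920], 5000, 80))
```

For a seek to a video keyframe at T the chosen Cluster starts exactly at T-SeekPreRoll, so the player decodes only the pre-roll it has to discard.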
== Open Questions ==
<ul>
<li>Should we say muxers MAY or SHOULD NOT produce simple streams without filling in CodecPrivate?</li>
<ul>
<li>If the CodecPrivate is empty or not present and Channels is 1 or 2, players MAY treat it as a sane set of defaults, I guess, e.g. channel mapping family 0, no pre-skip or gain. For Channels > 2 the track MUST be rejected, since there's no way to map the encoded substreams to channels.</li>
</ul>
<li>How does the OpusHead pre-skip field interact with the timestamps?</li>
<ul>
<li>The SimpleBlock timestamp is signed 16 bits, so the format can signal about half of the pre-skip if playback timestamps are to start at zero. Moritz suggests this won't work because the resolution of the timestamps is controlled by the muxer, so the SimpleBlock timestamp offset isn't sample accurate anyway.[[http://lists.matroska.org/pipermail/matroska-devel/2012-September/004254.html ref]]</li>
<li>One could set an incorrect timestamp on the skipped blocks, and rely on the decoder to drop them based on the OpusHead pre-skip value. As long as the initial blocks are timestamped at or before the start of output, this shouldn't affect seeking.</li>
<li>The SimpleBlock structure also has an 'invisible' bit, which tells the player to decode, but not display, the contained frames. This lets the muxer signal the pre-skip semantics with frame accuracy, but not sample accuracy. If players implement this it will at least help with sync. Libav does not appear to support the invisible bit.</li>
</ul>
<li>How important is it that timestamps start at zero in a Matroska file?</li>
<li>How can sample-accurate end-time trimming work in Matroska?</li>
<ul>
<li>Currently all software encapsulating Vorbis in Matroska is broken in this regard, and muxing a Vorbis file in Matroska causes it to get longer (i.e., produce more audio output than the original Ogg file).
It would be unfortunate to repeat this disaster for Opus. This needs a new element specifying the number of samples to trim, perhaps a new BlockGroup child.</li>
</ul>
<li>If new elements are required, can they be defined so as to enable correct seeking in rolling intra (a.k.a. intra refresh) video as well?</li>
<ul>
<li>SeekPreRoll should work for rolling intra video.</li>
</ul>
</ul>
== Handling Pre-skip data ==
<ul>
<li>Use Cases:</li>
<ul>
<li>UC1: Playback starts from the beginning of the stream. Source stream time starts at 0.</li>
<li>UC2: Playback starts from the beginning of the stream. Pre-skip data ends in the middle of a compressed packet.</li>
<li>UC3: Playback starts from the middle of the stream, at a time greater than SeekPreRoll.</li>
<li>UC4: Playback starts from the middle of the stream, at a time less than SeekPreRoll.</li>
</ul>
</ul>
<ul>
<li>one: Timeshift the timestamps by pre-skip data</li>
<ul>
<li>The pre-skip data of the Opus audio stream starts at time 0, and the pre-skip time is added to the normal audio time, as when Opus files are muxed into Ogg files. We would add a new element to the TrackEntry element, PreSkip, and the player would adjust the timestamps of the decoded samples by subtracting PreSkip. All use cases should be covered.</li>
<li>Cons:</li>
<ul>
<li>The timestamp of the Block does not match the timestamp of the playback position.</li>
<li>Does not generalize known "decode, but not render" data.</li>
<li>Forces the player to handle the pre-skip samples, i.e. not the decoder.</li>
</ul>
</ul>
<li>two: Add pre-skip data to CodecPrivate.</li>
<ul>
<li>On every discontinuity the decoder would need to decode and throw away the pre-skip data.</li>
<li>Cons:</li>
<ul>
<li>UC2 will throw away valid data and the AV sync will be off.</li>
<li>UC3 will redundantly decode the pre-skip data.</li>
</ul>
</ul>
<li>three: Add TimeToDiscard to Block.</li>
<ul>
<li>Add an element to the Block element, TimeToDiscard in nanoseconds.
A value of -1 would not render the whole Block, which would have the same effect as setting the invisible bit. How would this affect the Block timestamp? Maybe the new element should be SamplesToDiscard or DataToDiscard?</li>
<li>Cons:</li>
</ul>
<li>four: Blocks that contain pre-skip data will set invisible flag.</li>
<ul>
<li>Blocks that contain pre-skip data have timestamps from the beginning of the stream. Blocks that only contain normal data have timestamps from the playback position.</li>
<li>Cons:</li>
<ul>
<li>Forces the player to handle the pre-skip samples, i.e. not the decoder.</li>
<li>UC2 will throw away valid data and the AV sync will be off. Other use cases should be fine.</li>
</ul>
</ul>
<li>five: Force pre-skip packets to be prepended to the first normal packet in the first Block.</li>
<ul>
<li>The first Block's timestamp will be set to the start time of the source playback position. We would add a new element to the TrackEntry element, PreSkip. All use cases should be covered.</li>
<li>Cons:</li>
<ul>
<li>Does not generalize known "decode, but not render" data.</li>
<li>Forces the player to handle the pre-skip samples, i.e. not the decoder.</li>
</ul>
</ul>
<li>six: Create a new codec, OPUS_MKV.</li>
<ul>
<li>Basically the codec will wrap Opus packets with data telling the decoder what type of Opus packet it contains. Essentially we would be creating a new codec to handle pre-skip data within the decoder.</li>
<li>Cons:</li>
<ul>
<li>There will be two types of Opus data streams!</li>
<li>Does not generalize known "decode, but not render" data.</li>
</ul>
</ul>
</ul>
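To make option one more concrete, here is a rough sketch of the player-side adjustment. It assumes the proposed PreSkip value and the Block start times are both expressed in 48 kHz samples for simplicity (the proposal itself does not fix units), and <code>trim_pre_skip</code> is a hypothetical helper, not an existing API.

```python
from typing import Optional, Tuple


def trim_pre_skip(block_start: int, block_len: int,
                  pre_skip: int) -> Tuple[int, Optional[int]]:
    """Apply option one to a single decoded Block.

    block_start is the Block's stream position (pre-skip included), block_len
    its decoded length, pre_skip the value of the proposed PreSkip element.
    Returns (samples to drop from the front of the Block, playback position of
    the first rendered sample). The position is None when the whole Block is
    pre-skip data.
    """
    playback_start = block_start - pre_skip
    if playback_start >= 0:
        return 0, playback_start          # normal data, only timeshifted
    drop = min(-playback_start, block_len)
    if drop == block_len:
        return drop, None                 # Block lies entirely in the pre-skip
    return drop, 0                        # pre-skip ends inside this Block (UC2)


# 20 ms packets (960 samples) with a pre-skip of 312 samples.
print(trim_pre_skip(0, 960, 312))
print(trim_pre_skip(960, 960, 312))
```

Because the trim is computed per sample rather than per Block, UC2 keeps the valid tail of the packet in which the pre-skip ends, at the cost of the Block timestamp no longer matching the playback position.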