Mp4Opus
Latest revision as of 07:36, 18 August 2015
Paranoialmaniac's draft is the current draft encapsulation guide for Opus audio in the mp4 (ISO Base) media container.
Below are some of the motivations behind the draft.
MP4 already has support for declaring encoder delay and pre-roll.
For pre-roll, I believe we can use 'AudioRollRecoveryEntry'.
For delay, Daemon404 suggested "whatever l-smash maps encoder delay to." yusuke says, "ISO and Apple recommend the use of edit list for removing priming samples from the presentation." libavformat's demuxer supports *one* edit list entry.
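A minimal sketch of the edit list approach quoted above, assuming a single edit list entry and a media timescale of 48000. The function name `edit_list_entry` and its parameters are hypothetical and are not taken from the draft, l-smash or libavformat.

```python
from fractions import Fraction

OPUS_RATE = 48000  # assumed media timescale for the Opus track


def edit_list_entry(pre_skip, total_samples, movie_timescale):
    """Return (segment_duration, media_time) for one edit list entry.

    media_time is in media timescale units and points past the priming
    samples; segment_duration is in movie timescale units.
    """
    if not 0 <= pre_skip <= total_samples:
        raise ValueError("pre_skip must lie within the decoded sample count")
    presented = total_samples - pre_skip
    # Convert the presented duration from the media to the movie timescale.
    segment_duration = round(Fraction(presented * movie_timescale, OPUS_RATE))
    return segment_duration, pre_skip
```

For example, with 312 priming samples, 48312 decoded samples and a movie timescale of 1000, the entry starts at media time 312 and lasts 1000 movie units (one second).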
There's some work on codec-independent channel mapping, downmix and dynamic range control as part of ISO 14496-12 Amd4. We might be able to use some of that, but it doesn't support the Opus case of needing to indicate which streams are coupled pairs. We'll still need to define our own extension for this.
Question: Better to reuse the channel mapping header entirely, or just report the coupled stream count and use the downmix table to do the mapping?
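For reference, this is how the Ogg Opus channel mapping header resolves an output channel to a stream, which is the information the coupled stream count carries. It is a sketch only: `resolve_channel` is a hypothetical helper name, and the mapping table in the usage note is illustrative.

```python
def resolve_channel(index, stream_count, coupled_count):
    """Resolve one channel mapping index as in the Ogg Opus mapping table.

    Returns (stream, channel_within_stream), or None when the index is 255,
    which marks a silent output channel.
    """
    if index == 255:
        return None
    if not 0 <= index < stream_count + coupled_count:
        raise ValueError("mapping index out of range")
    if index < 2 * coupled_count:
        # Coupled (stereo) streams come first: even index = left, odd = right.
        return index // 2, index % 2
    # Remaining indices address the mono streams that follow the coupled ones.
    return index - coupled_count, 0
```

With 4 streams of which 2 are coupled and a mapping table of `[0, 4, 1, 2, 3, 5]`, output channels 0 and 2 are the left and right of stream 0, channels 3 and 4 are the left and right of stream 1, and channels 1 and 5 come from mono streams 2 and 3.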
Possible registration process: send an email to http://mp4ra.org/request.html. Possible registrar is 'Opus'.
Internally everything is resampled to 48000 Hz, and this is always what the decoder outputs, as floating point numbers. But the original sample rate is stored so that the decoder can act upon it. The AudioSampleEntry atom will carry 48 kHz, and the original rate will be stored in the codec's own atom.
Global Header
AudioRollRecoveryEntry - shall have a value of 3840 (80ms * 48k)
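The arithmetic behind the 3840 figure, as a small sketch. The constant and function names are hypothetical; only the 80 ms and 48 kHz values come from the note above.

```python
import math

OPUS_RATE = 48000   # decoder output rate in Hz
PRE_ROLL_MS = 80    # pre-roll duration from the note above


def pre_roll_samples(rate_hz=OPUS_RATE, duration_ms=PRE_ROLL_MS):
    """Number of samples covered by the pre-roll duration."""
    return rate_hz * duration_ms // 1000


def pre_roll_packets(frame_ms, duration_ms=PRE_ROLL_MS):
    """Number of whole packets needed to cover the pre-roll duration."""
    return math.ceil(duration_ms / frame_ms)
```

With the defaults, `pre_roll_samples()` gives 3840, and `pre_roll_packets(20)` gives 4 packets for 20 ms frames.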
AudioSampleEntry - hardcoded at 16fp
descriptor - same as ogg rather than as ts, to keep things simple
channel count - already included
pre skip - already included
Gain - volume atom? Unused in practice. Present in the ogg header, not in ts (TODO?).
When you decode samples you're supposed to multiply them by this value, so that the decoder can apply post-decode volume.
Reusing the one in ogg.
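A sketch of applying the gain, assuming the field is reused unchanged from the Ogg header, where it is a signed 16-bit value in Q7.8 dB (1/256 dB steps). The function names are hypothetical.

```python
def gain_factor(output_gain_q8):
    """Convert a Q7.8 dB output gain into a linear scale factor."""
    return 10.0 ** (output_gain_q8 / (20.0 * 256.0))


def apply_output_gain(samples, output_gain_q8):
    """Scale decoded floating point samples by the header's output gain."""
    factor = gain_factor(output_gain_q8)
    return [s * factor for s in samples]
```

A stored value of 0 leaves samples untouched, while 5120 (20 dB) scales them by a factor of 10.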
mapping family (with vorbis mapping)
- mono/stereo: no channel config
- specify # of channels
- map to the # of outputs
audio channel layout https://developer.apple.com/library/mac/documentation/musicaudio/reference/CoreAudioDataTypesRef/Reference/reference.html - too complex
plug it from ogg and put it in our custom atom
Things to put in custom atom
- input sr
- output gain
- channel mapping
- channel count (for backup)
Opus is the name of the atom, like in TS
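To make the list above concrete, here is one way such an atom could be serialized. This is purely an illustration: the field order, field widths and the function name `pack_opus_atom` are assumptions and do not come from the draft. Only the usual MP4 box framing (32-bit big-endian size followed by a four-character type) and the 'Opus' name are taken as given.

```python
import struct


def pack_opus_atom(input_sample_rate, output_gain_q8, channel_count,
                   mapping_family, stream_count=0, coupled_count=0,
                   channel_mapping=b""):
    """Serialize a hypothetical 'Opus' atom (assumed layout, big-endian)."""
    payload = struct.pack(">IhBB", input_sample_rate, output_gain_q8,
                          channel_count, mapping_family)
    if mapping_family != 0:
        # A mapping table is only needed beyond plain mono/stereo.
        if len(channel_mapping) != channel_count:
            raise ValueError("channel mapping needs one byte per channel")
        payload += struct.pack(">BB", stream_count, coupled_count)
        payload += bytes(channel_mapping)
    # Box header: total size including the 8 header bytes, then the type.
    return struct.pack(">I4s", 8 + len(payload), b"Opus") + payload
```

A stereo track with mapping family 0 yields a 16-byte atom under this assumed layout; a real layout would have to follow whatever the draft finally specifies.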
what about album art? quicktime/mp4/mp3