OpusTodo: Difference between revisions

From XiphWiki
(update for 1.1.3)
 
(80 intermediate revisions by 10 users not shown)
== Code ==
== For 1.2 ==
* Low bitrate quality improvements
* AVX optimizations
* Fix compilation as a single module for gecko


=== For IETF draft ===
== Spec ==
* Code cleanup (any left?)
* Matroska mapping. See [[MatroskaOpus]] and the Firefox/FFmpeg implementations.
* Multi-channel signalling (done, needs more testing)
* RTP payload format. The mono/stereo mapping is complete ([https://tools.ietf.org/html/rfc7587 RFC 7587]); there is no multichannel mapping yet.
* Make opus-compare fail for single seriously trashed frames
* MP4 mapping. See the [https://opus-codec.org/docs/opus_in_isobmff.html ISO Base Media File Format draft].
 
== Website ==
* De-uglify webpage - some suggestions:
** write about codecs obsoleted by Opus (Speex, CELT, Vorbis(?) and the proprietary ones)
** write about implementations (libopus encoder/decoder, libavcodec decoder, any others?)
** [https://en.wikipedia.org/wiki/Comparison_of_audio_coding_formats audio codec comparison table] (Opus, Vorbis, Speex, ..., MP5) of features: channels, freq, bits per sample, license, language (C89), integer impl. (Vorbis decoder only, Opus YES, ...)
** future use in video files (Theora? Dirac? WebM? other future codecs...)
** audio files for storage (like Vorbis, no raw Opus defined, only inside Ogg), ...
* Promotional material (some nice free/public-domain sounds/radio stations in Opus format)
 
== Other ==
 
* Oggz-validate (should also validate opus toc)


== SILK issues ==
== Opus-tools ==
* PLC buffer not fully initialized (fix needs verifying)
* Port opusdec to libopusfile/libopusurl.
* Mid and side using different sampling rates (fix needs verifying)
* A simple real time streaming example tool
* <s>LLBR stereo issue (has a proposed fix)</s>
** Start with opusrtp.c in [https://git.xiph.org/?p=opus-tools.git opus-tools]
** <s>Introduces prefill bug (fixed on greg's tree)</s>
** Make <code>opusrtp rtp://example.com:5431/</code> listen to that host and port and mux packets from there. Generalize the pcap-based --sniff implementation
* <s>Artefacts on SILK mono<->stereo switching</s>
** Make sending similarly generic. Maybe just <code>opusrtp source.opus -o rtp://example.com:5431/</code> to send source.opus out to the destination?
* <s>Artefacts on SILK frame size switching (e.g. 960,480 glitches)</s>
** Make --sniff save one file per
* Reduce SILK bandpass switching artefacts
** Implement DTLS-SRTP. See webrtc.
* <s>Use of signed overflow (undefined in C), intentionally and otherwise. </s>
** audio capture/encode, decode/playback?
* Encoder triggers DTX randomly (even if not enabled) for 40/60 ms stereo frames
** Parse and act on sdp for convenience and testing.
* <s>CLANG ARITHMETIC UNDEFINED at <silk/silk_NSQ_del_dec.c, (652:33)> : Op: *, Reason : Signed Multiplication Overflow, BINARY OPERATION: left (int32): 90005 right (int32): -25578</s>
* CLANG ARITHMETIC UNDEFINED at <silk/decode_core.c, (108:40)> : Op: *, Reason : Signed Multiplication Overflow, BINARY OPERATION: left (int32): 916258817 right (int32): -3
* CLANG ARITHMETIC UNDEFINED at <silk/decode_core.c, (108:40)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): -418535217 right (int32): -1832517634
* CLANG ARITHMETIC UNDEFINED at <./silk/Inlines.h, (120:13)> : Op: -=, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): 2147454976 right (int32): -2147467848
<!-- ./test_opus voip 48000 2 32000 -bandwidth WB -framesize 10 /home/gmaxwell/big-fb.sw /dev/null -->
* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (68:25)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): -2053682997 right (int32): -96356645
* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (67:25)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): 2144927654 right (int32): 9275188
* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (63:21)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): -1503335005 right (int32): -978520921
* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (64:21)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): 1813111370 right (int32): 545673470
* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (65:21)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): 2121005902 right (int32): 274731600
* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (72:21)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -79495168 right (int32): 2131398803
<!-- voip 16000 2 16000 -framesize 20 /home/gmaxwell/big-fb.sw /dev/null -->
* Review Tim's "LSB with no pulses" fix


=== Later ===
* EBU R128/Replaygain (half done; needs a gain tool)
* <strike>Exposed CELT constrained VBR</strike>
 
* <strike>Fixed-point build</strike>
== Surround work ==
* <strike>Fix build system (right now it fails to build shared libraries, drops .o files all over)</strike>
 
* <strike>Floating point API</strike>
* Apply spreading to energy masking
* Usable command-line tools (<s>opus is a major regression from libcelt right now</s>)
* More conservative energy masking (not just mean difference) and dynalloc
* Smart automatic mode decision
* Allow SILK/hybrid on center channel for voice?
* psymodel based VBR
 
* Remove copy in inverse MDCT
== Psychoacoustic stuff ==
* Save some float<->int conversions
 
* Adaptive width narrowing and forced intensity stereo bands


== Spec ==
== Optimisations ==


* Finish codec draft
* Vectorising comb_filter()
* Ogg mapping (including multi-channel). See: [[OggOpus]]
* Use 16-bit mul plus shift in denormalise_bands()
* RTP payload format
* Optimise MDCT somehow


== Other ==
== Third-Party tool enhancements ==
* mutagen: [https://bitbucket.org/lazka/mutagen/issue/202/oggopus-support-in-place-rewrites-for support padding in comments header], [https://bitbucket.org/lazka/mutagen/issue/203/oggopus-allow-updating-the-output_gain allow updating output gain in ID header]


* Logo
== Future work ==
* Test vectors
* psymodel based VBR
* Listening tests
* Remove copy in inverse MDCT
* Documentation (at a minimum every exported symbol should have complete and accurate documentation)
* Save some float<->int conversions
* Add content to opus-codec.org
* Improvements to LP mode CBR (greg has some code)
* Oggz-validate (should also validate opus toc)
* Unconstrained SILK VBR
** The above documentation
* Better handling for the case where FEC has a different bandwidth than the current mode
** Presentations
* PLC transitions on unprotected SILK-SILK bandwidth changes?
** Examples and test results (hyperlink to Monty's demo, gmaxwell's HA results page, etc.)
* Figure out how to use speech/music detection optimally
* Useful example software (e.g. streaming software that works correctly) (opus-tools in xiph git)
** find optimal switching time (low energy/tonality)
** <s>Support for resampling in tools</s>
* Improve variable frame size


== Third party software ==
[[Category:Opus]]
* Support in ekiga
* Support in mumble
* Support in asterisk
* Support in firefox (rtcweb and in ogg)
* Support in VLC
* Support in ogg123
* Support in ffmpeg
* Support in rockbox
* Support in foobar2000
* Support in gstreamer
* Support in mplayer
* Support in xmms
* Support in oggdsf
* Support in xiphqt

Latest revision as of 19:59, 19 July 2016

For 1.2

  • Low bitrate quality improvements
  • AVX optimizations
  • Fix compilation as a single module for gecko

Spec

  • Matroska mapping. See MatroskaOpus and the Firefox/FFmpeg implementations.
  • RTP payload format. The mono/stereo mapping is complete (RFC 7587); there is no multichannel mapping yet.
  • MP4 mapping. See the ISO Base Media File Format draft.

Website

  • De-uglify webpage - some suggestions:
    • write about codecs obsoleted by Opus (Speex, CELT, Vorbis(?) and the proprietary ones)
    • write about implementations (libopus encoder/decoder, libavcodec decoder, any others?)
    • audio codec comparison table (Opus, Vorbis, Speex, ..., MP5) of features: channels, freq, bits per sample, license, language (C89), integer impl. (Vorbis decoder only, Opus YES, ...)
    • future use in video files (Theora? Dirac? WebM? other future codecs...)
    • audio files for storage (like Vorbis, no raw Opus defined, only inside Ogg), ...
  • Promotional material (some nice free/public-domain sounds/radio stations in Opus format)

Other

  • Oggz-validate (should also validate opus toc)
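
As a rough illustration of what a TOC check could look at, the sketch below decodes the first byte of an Opus packet using the layout from RFC 6716, section 3.1 (5-bit config, 1-bit stereo flag, 2-bit frame count code). The names toc_info and parse_opus_toc are made up for this example; they are not part of oggz-validate or libopus.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for this sketch only. */
typedef enum { TOC_MODE_SILK, TOC_MODE_HYBRID, TOC_MODE_CELT } toc_mode;

typedef struct {
    toc_mode mode;   /* coding mode implied by the config number */
    int frame_us;    /* frame duration in microseconds */
    int stereo;      /* 0 = mono, 1 = stereo */
    int frame_code;  /* 0..3, frame count code */
} toc_info;

/* Decode the TOC byte of one Opus packet.
 * Returns 0 on success, -1 if the packet is too short to be valid. */
int parse_opus_toc(const uint8_t *pkt, size_t len, toc_info *out)
{
    static const int silk_us[4] = {10000, 20000, 40000, 60000};
    static const int celt_us[4] = {2500, 5000, 10000, 20000};
    unsigned config;

    /* Every Opus packet carries at least the TOC byte. */
    if (pkt == NULL || out == NULL || len < 1)
        return -1;

    config = pkt[0] >> 3;               /* top five bits */
    out->stereo = (pkt[0] >> 2) & 1;    /* "s" bit */
    out->frame_code = pkt[0] & 3;       /* "c" bits */

    if (config < 12) {                  /* configs 0..11: SILK-only */
        out->mode = TOC_MODE_SILK;
        out->frame_us = silk_us[config & 3];
    } else if (config < 16) {           /* configs 12..15: hybrid */
        out->mode = TOC_MODE_HYBRID;
        out->frame_us = (config & 1) ? 20000 : 10000;
    } else {                            /* configs 16..31: CELT-only */
        out->mode = TOC_MODE_CELT;
        out->frame_us = celt_us[config & 3];
    }

    /* Code 3 packets need a second byte holding the frame count. */
    if (out->frame_code == 3 && len < 2)
        return -1;

    return 0;
}
```

A validator would presumably go further (frame length coding, padding, total duration limits); this only covers the first byte.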

Opus-tools

  • Port opusdec to libopusfile/libopusurl.
  • A simple real time streaming example tool
    • Start with opusrtp.c in opus-tools
    • Make opusrtp rtp://example.com:5431/ listen to that host and port and mux packets from there. Generalize the pcap-based --sniff implementation
    • Make sending similarly generic. Maybe just opusrtp source.opus -o rtp://example.com:5431/ to send source.opus out to the destination?
    • Make --sniff save one file per
    • Implement DTLS-SRTP. See webrtc.
    • audio capture/encode, decode/playback?
    • Parse and act on sdp for convenience and testing.
  • EBU R128/Replaygain (half done; needs a gain tool)
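
The rtp:// arguments proposed for the streaming tool would need to be split into a host and a port. A minimal sketch of that step follows, assuming the port is mandatory and ignoring IPv6 literals; parse_rtp_url is a hypothetical helper, not existing opusrtp code.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split "rtp://host:port/" into host and port.
 * Returns 0 on success, -1 on any malformed input. */
int parse_rtp_url(const char *url, char *host, size_t host_size, int *port)
{
    static const char scheme[] = "rtp://";
    const char *p;
    char *stop;
    size_t host_len;
    long value;

    if (url == NULL || host == NULL || port == NULL || host_size == 0)
        return -1;
    if (strncmp(url, scheme, sizeof(scheme) - 1) != 0)
        return -1;
    p = url + sizeof(scheme) - 1;

    /* The host runs up to the first ':'; a '/' before that means no port. */
    host_len = strcspn(p, ":/");
    if (host_len == 0 || host_len >= host_size || p[host_len] != ':')
        return -1;

    /* The port must start with a digit and end at '/' or end of string. */
    if (!isdigit((unsigned char)p[host_len + 1]))
        return -1;
    value = strtol(p + host_len + 1, &stop, 10);
    if (*stop != '\0' && *stop != '/')
        return -1;
    if (value < 1 || value > 65535)
        return -1;

    memcpy(host, p, host_len);
    host[host_len] = '\0';
    *port = (int)value;
    return 0;
}
```

The same helper could serve both the listening form and the proposed -o destination form, since both use the same URL shape.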

Surround work

  • Apply spreading to energy masking
  • More conservative energy masking (not just mean difference) and dynalloc
  • Allow SILK/hybrid on center channel for voice?

Psychoacoustic stuff

  • Adaptive width narrowing and forced intensity stereo bands

Optimisations

  • Vectorising comb_filter()
  • Use 16-bit mul plus shift in denormalise_bands()
  • Optimise MDCT somehow
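
For readers unfamiliar with the "16-bit mul plus shift" idea, here is a generic fixed-point sketch: each 16-bit sample is multiplied by a 16-bit gain into a 32-bit product, which is then shifted back down. This is not the libopus denormalise_bands() code; scale_band_q, its parameters, and the Q format are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: y[i] = (x[i] * gain) >> shift.
 * A 16x16 product always fits in 32 bits, so the multiply cannot overflow.
 * Assumes the compiler implements >> on negative values as an arithmetic
 * shift, which is implementation-defined in C.
 * Returns 0 on success, -1 on invalid arguments. */
int scale_band_q(const int16_t *x, int32_t *y, size_t n,
                 int16_t gain, int shift)
{
    size_t i;

    if (x == NULL || y == NULL || shift < 0 || shift > 30)
        return -1;

    for (i = 0; i < n; i++) {
        int32_t prod = (int32_t)x[i] * (int32_t)gain;  /* 16x16 -> 32 */
        y[i] = prod >> shift;                          /* drop fraction bits */
    }
    return 0;
}
```

Keeping both operands at 16 bits matters on targets where a 16x16 multiply is cheaper than a 32x32 one, which is presumably the motivation for the item above.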

Third-Party tool enhancements

  • mutagen: support padding in comments header, allow updating output gain in ID header

Future work

  • psymodel based VBR
  • Remove copy in inverse MDCT
  • Save some float<->int conversions
  • Improvements to LP mode CBR (greg has some code)
  • Unconstrained SILK VBR
  • Better handling for the case where FEC has a different bandwidth than the current mode
  • PLC transitions on unprotected SILK-SILK bandwidth changes?
  • Figure out how to use speech/music detection optimally
    • find optimal switching time (low energy/tonality)
  • Improve variable frame size