OpusTodo: Difference between revisions

Latest revision as of 02:59, 20 July 2016

For 1.2

Low bitrate quality improvements
AVX optimizations
Fix compilation as a single module for gecko

Spec

Matroska mapping. See MatroskaOpus and the Firefox/FFmpeg implementations.
RTP payload format. Mono/stereo mapping is complete [RFC 7587], no multichannel mapping yet.
MP4 mapping. See [ISO Base Media File Format draft]

Website

De-uglify webpage - some suggestions:
- write about codecs obsoleted by Opus (Speex, CELT, Vorbis(?) and the proprietary ones)
- write about implementations (libopus encoder/decoder, libavcodec decoder, any others?)
- audio codec comparison table (Opus, Vorbis, Speex, ..., MP5) of features (channels, freq, bits per sample, license, language (C89), integer impl. (Vorbis decoder only, Opus YES, ...))
- future use in video files (Theora? Dirac? WebM? other future codecs...)
- audio files for storage (like Vorbis, no raw Opus defined, only inside Ogg), ...
Promotional material (some nice free/public-domain sounds/radio stations in Opus format)

Other

Oggz-validate (should also validate the Opus TOC)
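
As a rough illustration of what TOC validation could involve, here is a minimal sketch in C that decodes the first byte of an Opus packet following the layout in RFC 6716 (five config bits, one stereo bit, two frame-count-code bits). The names `toc_info`, `toc_mode` and `parse_toc` are hypothetical and are not part of oggz-tools or libopus; a real validator would go on to check frame lengths and total duration as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper types for illustration; not an oggz or libopus API. */
typedef enum { TOC_MODE_SILK, TOC_MODE_HYBRID, TOC_MODE_CELT } toc_mode;

typedef struct {
    int      config;      /* 0..31, the top five bits of the TOC byte */
    toc_mode mode;        /* coding mode implied by config */
    int      stereo;      /* 1 if the stereo bit is set */
    int      frame_code;  /* 0..3, the frame count code */
} toc_info;

/* Decode the TOC byte of one Opus packet.
 * Returns 0 on success, -1 if the packet is too short to be valid. */
static int parse_toc(const uint8_t *packet, size_t len, toc_info *out)
{
    assert(out != NULL);
    if (packet == NULL || len < 1)
        return -1;                    /* every packet needs a TOC byte */

    uint8_t toc = packet[0];
    out->config     = toc >> 3;
    out->stereo     = (toc >> 2) & 1;
    out->frame_code = toc & 3;

    if (out->config < 12)
        out->mode = TOC_MODE_SILK;    /* configs 0..11 */
    else if (out->config < 16)
        out->mode = TOC_MODE_HYBRID;  /* configs 12..15 */
    else
        out->mode = TOC_MODE_CELT;    /* configs 16..31 */

    /* A code 3 packet carries a frame count byte after the TOC byte. */
    if (out->frame_code == 3 && len < 2)
        return -1;

    return 0;
}
```

For example, a packet starting with 0xFC decodes as config 31 (CELT-only), stereo, frame count code 0.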

Opus-tools

Port opusdec to libopusfile/libopusurl.
A simple real time streaming example tool
- Start with opusrtp.c in opus-tools
- Make opusrtp rtp://example.com:5431/ listen to that host and port and mux packets from there. Generalize the pcap-based --sniff implementation
- Make sending similarly generic. Maybe just opusrtp source.opus -o rtp://example.com:5431/ to send source.opus out to the destination?
- Make --sniff save one file per
- Implement DTLS-SRTP. See WebRTC.
- audio capture/encode, decode/playback?
- Parse and act on SDP for convenience and testing.

EBU R128/Replaygain (half done; needs a gain tool)
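
For the streaming tool items above, any receive path has to find the Opus payload inside each RTP packet before muxing it. The following is a minimal sketch of that step in C, based on the fixed header layout in RFC 3550. The names `rtp_packet` and `rtp_parse` are hypothetical and this is not the code in opusrtp.c; it only shows the kind of parsing a generalized receiver would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one parsed RTP packet; not taken from opusrtp.c. */
typedef struct {
    int            payload_type;
    int            marker;
    uint16_t       seq;
    uint32_t       timestamp;
    uint32_t       ssrc;
    const uint8_t *payload;      /* points into the caller's buffer */
    size_t         payload_len;
} rtp_packet;

/* Read a 32-bit big-endian value. */
static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse an RTP packet and locate its payload.
 * Returns 0 on success, -1 on a malformed packet. */
static int rtp_parse(const uint8_t *buf, size_t len, rtp_packet *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 12 || (buf[0] >> 6) != 2)
        return -1;                         /* too short, or not RTP version 2 */

    int    padding   = (buf[0] >> 5) & 1;
    int    extension = (buf[0] >> 4) & 1;
    size_t csrc      = buf[0] & 0x0F;

    out->marker       = buf[1] >> 7;
    out->payload_type = buf[1] & 0x7F;
    out->seq          = (uint16_t)((buf[2] << 8) | buf[3]);
    out->timestamp    = rd32(buf + 4);
    out->ssrc         = rd32(buf + 8);

    size_t off = 12 + 4 * csrc;            /* skip the CSRC list */
    if (off > len)
        return -1;
    if (extension) {                       /* skip the header extension */
        if (off + 4 > len)
            return -1;
        size_t words = ((size_t)buf[off + 2] << 8) | buf[off + 3];
        off += 4 + 4 * words;
        if (off > len)
            return -1;
    }

    size_t end = len;
    if (padding) {                         /* last byte holds the pad count */
        if (off >= len)
            return -1;
        size_t pad = buf[len - 1];
        if (pad == 0 || pad > len - off)
            return -1;
        end = len - pad;
    }

    out->payload     = buf + off;
    out->payload_len = end - off;
    return 0;
}
```

The payload returned here would be a single Opus packet under RFC 7587, so its first byte is the Opus TOC byte.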

Surround work

Apply spreading to energy masking
More conservative energy masking (not just mean difference) and dynalloc
Allow SILK/hybrid on center channel for voice?

Psychoacoustic stuff

Adaptive width narrowing and forced intensity stereo bands

Optimisations

Vectorising comb_filter()
Use 16-bit mul plus shift in denormalise_bands()
Optimise MDCT somehow
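
To show why comb_filter() is a candidate for vectorising, here is a simplified float sketch in C of a 5-tap comb filter with constant gains and a constant pitch period. It is an assumption-laden illustration, not libopus's code: the function name `comb_filter_sketch` is hypothetical, and the real comb_filter() handles more cases (fixed-point arithmetic and changing gains and periods) which this sketch ignores. The point is that each output sample depends only on input samples, never on earlier outputs, so iterations are independent and can be computed several at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified constant-gain comb filter (hypothetical sketch):
 *
 *   y[i] = x[i] + g0 * x[i-T]
 *               + g1 * (x[i-T-1] + x[i-T+1])
 *               + g2 * (x[i-T-2] + x[i-T+2])
 *
 * The caller must provide at least T+2 valid samples of history
 * before x[0], and y must not overlap x. */
static void comb_filter_sketch(float *y, const float *x, int T, int N,
                               float g0, float g1, float g2)
{
    assert(y != NULL && x != NULL);
    assert(T >= 2);  /* keeps x[i-T+2] at or before x[i] */

    /* No loop-carried dependency: every iteration reads only x. */
    for (int i = 0; i < N; i++) {
        y[i] = x[i]
             + g0 * x[i - T]
             + g1 * (x[i - T - 1] + x[i - T + 1])
             + g2 * (x[i - T - 2] + x[i - T + 2]);
    }
}
```

A compiler may auto-vectorise a loop of this shape when it can prove y and x do not alias; hand-written SIMD would process several values of i per step using unaligned loads at the five tap offsets.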

Third-Party tool enhancements

mutagen: support padding in comments header, allow updating output gain in ID header
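
For reference, updating the output gain amounts to rewriting two bytes of the ID header. The sketch below, in C, assumes the OpusHead layout from RFC 7845, where the output gain is a signed 16-bit little-endian Q7.8 value in dB at byte offset 16. The function names are hypothetical and this is not mutagen's code (mutagen is written in Python); the byte layout is the same in any language.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offset of the output gain field inside the OpusHead packet. */
#define OPUSHEAD_GAIN_OFFSET 16
/* Smallest valid OpusHead packet (channel mapping family 0). */
#define OPUSHEAD_MIN_LEN     19

/* Hypothetical helper: check the magic signature and minimum length. */
static int opushead_ok(const uint8_t *head, size_t len)
{
    return head != NULL && len >= OPUSHEAD_MIN_LEN &&
           memcmp(head, "OpusHead", 8) == 0;
}

/* Read the output gain in Q7.8 dB (256 units per dB).
 * Returns 0 on success, -1 if the header is not a valid OpusHead. */
static int opushead_get_gain(const uint8_t *head, size_t len, int *gain_q8)
{
    assert(gain_q8 != NULL);
    if (!opushead_ok(head, len))
        return -1;
    int v = head[OPUSHEAD_GAIN_OFFSET] | (head[OPUSHEAD_GAIN_OFFSET + 1] << 8);
    if (v >= 0x8000)
        v -= 0x10000;                 /* sign-extend the 16-bit value */
    *gain_q8 = v;
    return 0;
}

/* Overwrite the output gain in place. The header size does not change.
 * Returns 0 on success, -1 on a bad header or out-of-range gain. */
static int opushead_set_gain(uint8_t *head, size_t len, int gain_q8)
{
    if (!opushead_ok(head, len) || gain_q8 < -32768 || gain_q8 > 32767)
        return -1;
    uint16_t raw = (uint16_t)gain_q8; /* well-defined modulo 2^16 */
    head[OPUSHEAD_GAIN_OFFSET]     = (uint8_t)(raw & 0xFF);
    head[OPUSHEAD_GAIN_OFFSET + 1] = (uint8_t)(raw >> 8);
    return 0;
}
```

A gain of -6 dB is stored as -1536 (that is, -6 × 256). Because the header lives inside an Ogg page, a tool that edits it in a file also has to recompute that page's checksum.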

Future work

psymodel based VBR
Remove copy in inverse MDCT
Save some float<->int conversions
Improvements to LP mode CBR (greg has some code)
Unconstrained SILK VBR
Better handling for the case where FEC has a different bandwidth than the current mode
PLC transitions on unprotected SILK-SILK bandwidth changes?
Figure out how to use speech/music detection optimally
- find optimal switching time (low energy/tonality)
Improve variable frame size

@@ Line 1: / Line 1: @@
-== Code ==
+== For 1.2 ==
+* Low bitrate quality improvements
+* AVX optimizations
+* Fix compilation as a single module for gecko
-=== For IETF draft ===
+== Spec ==
-* Code cleanup (any left?)
+* Matroska mapping. See [[MatroskaOpus]] and the Firefox/FFmpeg implementations.
-* Multi-channel signalling (done, needs more testing)
+* RTP payload format. Mono/stereo mapping is complete [[https://tools.ietf.org/html/rfc7587 RFC 7587]], no multichannel mapping yet.
-* Make opus-compare fail for single seriously trashed frames
+* MP4 mapping. See [[https://opus-codec.org/docs/opus_in_isobmff.html ISO Base Media File Format draft]]
+== Website ==
+* De-uglify webpage - some suggestions:
+** write about codecs obsoleted by Opus (Speex, CELT, Vorbis(?) and the proprietary ones)
+** write about implementations (libopus encoder/decoder, libavcodec decoder, any others?)
+** [https://en.wikipedia.org/wiki/Comparison_of_audio_coding_formats audio codec comparison table] (Opus, Vorbis, Speex, ..., MP5) of features (channels, freq, bits per sample, license, language (C89), integer impl. (Vorbis decoder only, Opus YES, ...))
+** future use in video files (Theora? Dirac? WebM? other future codecs...)
+** audio files for storage (like Vorbis, no raw Opus defined, only inside Ogg), ...
+* Promotional material (some nice free/public-domain sounds/radio stations in Opus format)
+== Other ==
+* Oggz-validate (should also validate the Opus TOC)
-== SILK issues ==
+== Opus-tools ==
-* PLC buffer not fully initialized (fix needs verifying)
+* Port opusdec to libopusfile/libopusurl.
-* Mid and side using different sampling rates (fix needs verifying)
+* A simple real time streaming example tool
-* <s>LLBR stereo issue (has a proposed fix)</s>
+** Start with opusrtp.c in [https://git.xiph.org/?p=opus-tools.git opus-tools]
-** <s>Introduces prefill bug (fixed on greg's tree)</s>
+** Make <code>opusrtp rtp://example.com:5431/</code> listen to that host and port and mux packets from there. Generalize the pcap-based --sniff implementation
-*<s>Artefacts on SILK mono<->stereo switching</s>
+** Make sending similarly generic. Maybe just <code>opusrtp source.opus -o rtp://example.com:5431/</code> to send source.opus out to the destination?
-* <s>Artefacts on SILK frame size switching (e.g. 960,480 glitches)</s>
+** Make --sniff save one file per
-* Reduce SILK bandpass switching artefacts
+** Implement DTLS-SRTP. See WebRTC.
-* <s>Use of signed overflow (undefined in C), intentionally and otherwise. </s>
+** audio capture/encode, decode/playback?
-* Encoder triggers DTX randomly (even if not enabled) for 40/60 ms stereo frames
+** Parse and act on SDP for convenience and testing.
-* <s>CLANG ARITHMETIC UNDEFINED at <silk/silk_NSQ_del_dec.c, (652:33)> : Op: *, Reason : Signed Multiplication Overflow, BINARY OPERATION: left (int32): 90005 right (int32): -25578</s>
-* CLANG ARITHMETIC UNDEFINED at <silk/decode_core.c, (108:40)> : Op: *, Reason : Signed Multiplication Overflow, BINARY OPERATION: left (int32): 916258817 right (int32): -3
-* CLANG ARITHMETIC UNDEFINED at <silk/decode_core.c, (108:40)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): -418535217 right (int32): -1832517634
-* CLANG ARITHMETIC UNDEFINED at <./silk/Inlines.h, (120:13)> : Op: -=, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): 2147454976 right (int32): -2147467848
-<!-- ./test_opus voip 48000 2 32000 -bandwidth WB -framesize 10 /home/gmaxwell/big-fb.sw /dev/null -->
-* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (68:25)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): -2053682997 right (int32): -96356645
-* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (67:25)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): 2144927654 right (int32): 9275188
-* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (63:21)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): -1503335005 right (int32): -978520921
-* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (64:21)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): 1813111370 right (int32): 545673470
-* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (65:21)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): 2121005902 right (int32): 274731600
-* CLANG ARITHMETIC UNDEFINED at <silk/LPC_analysis_filter.c, (72:21)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -79495168 right (int32): 2131398803
-<!-- voip 16000 2 16000 -framesize 20 /home/gmaxwell/big-fb.sw /dev/null -->
-* Review Tim's "LSB with no pulses" fix
-* Fix decoder-side resampling delay for rates other than 48 kHz and document behavior.
-* Find_poly improvements
-* Rename PLC.c
-* silk warning: comparison is always true due to limited range of data type with assert
-=== Later ===
+* EBU R128/Replaygain (half done; needs a gain tool)
-* <strike>Exposed CELT constrained VBR</strike>
-* <strike>Fixed-point build</strike>
+== Surround work ==
-* <strike>Fix build system (right now it fails to build shared libraries, drops .o files all over)</strike>
-* <strike>Floating point API</strike>
+* Apply spreading to energy masking
-* Usable command-line tools (<s>opus is a major regression from libcelt right now</s>)
+* More conservative energy masking (not just mean difference) and dynalloc
-* Smart automatic mode decision
+* Allow SILK/hybrid on center channel for voice?
-* psymodel based VBR
-* Remove copy in inverse MDCT
+== Psychoacoustic stuff ==
-* Save some float<->int conversions
+* Adaptive width narrowing and forced intensity stereo bands
-== Spec ==
+== Optimisations ==
-* Finish codec draft
+* Vectorising comb_filter()
-* Ogg mapping (including multi-channel). See: [[OggOpus]]
+* Use 16-bit mul plus shift in denormalise_bands()
-* RTP payload format
+* Optimise MDCT somehow
-== Other ==
+== Third-Party tool enhancements ==
+* mutagen: [https://bitbucket.org/lazka/mutagen/issue/202/oggopus-support-in-place-rewrites-for support padding in comments header], [https://bitbucket.org/lazka/mutagen/issue/203/oggopus-allow-updating-the-output_gain allow updating output gain in ID header]
-* Logo
+== Future work ==
-* Test vectors
+* psymodel based VBR
-* Listening tests
+* Remove copy in inverse MDCT
-* Documentation (at a minimum every exported symbol should have complete and accurate documentation)
+* Save some float<->int conversions
-* Add content to opus-codec.org
+* Improvements to LP mode CBR (greg has some code)
-* Oggz-validate (should also validate opus toc)
+* Unconstrained SILK VBR
-** The above documentation
+* Better handling for the case where FEC has a different bandwidth than the current mode
-** Presentations
+* PLC transitions on unprotected SILK-SILK bandwidth changes?
-** Examples and test results  (hyperlink to Monty's demo, gmaxwell's HA results page, etc)
+* Figure out how to use speech/music detection optimally
-* Useful example software (e.g. streaming software that works correctly) (opus-tools in xiph git)
+** find optimal switching time (low energy/tonality)
-** <s>Support for resampling in tools</s>
+* Improve variable frame size
-== Third party software ==
+[[Category:Opus]]
-* Support in ekiga
-* Support in mumble
-* Support in asterisk
-* Support in firefox (rtcweb and in ogg)
-* Support in VLC
-* Support in ogg123
-* Support in ffmpeg
-* Support in rockbox
-* Support in foobar2000
-* Support in gstreamer
-* Support in mplayer
-* Support in xmms
-* Support in oggdsf
-* Support in xiphqt