OpusTodo: Difference between revisions

Latest revision as of 02:59, 20 July 2016

For 1.2

Low bitrate quality improvements
AVX optimizations
Fix compilation as a single module for gecko

Spec

Matroska mapping. See: MatroskaOpus and the Firefox/FFmpeg implementations
RTP payload format. Mono/stereo mapping is complete [RFC 7587]; no multichannel mapping yet.
MP4 mapping. See [ISO Base Media File Format draft]

Website

De-uglify webpage - some suggestions:
- write about codecs obsoleted by Opus (Speex, CELT, Vorbis(?) and the proprietary ones)
- write about implementations (libopus encoder/decoder, libavcodec decoder, any others?)
- audio codec comparison table (Opus, Vorbis, Speex, ..., MP5) of features: channels, freq, bits per sample, license, language (C89), integer impl. (Vorbis decoder only, Opus YES, ...)
- future use in video files (Theora? Dirac? WebM? other future codecs...)
- audio files for storage (like Vorbis, no raw Opus defined, only inside Ogg), ...
Promotional material (some nice free/public-domain sounds/radio stations in Opus format)

Other

Oggz-validate (should also validate the Opus TOC byte)
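
As a starting point for TOC validation, here is a minimal sketch of decoding the TOC byte as laid out in RFC 6716, section 3.1 (5-bit config, 1-bit stereo flag, 2-bit frame count code). The names `OpusToc` and `parse_toc` are hypothetical and not part of oggz-validate or libopus. A real validator would also need to check the rest of the packet framing, which is not shown here.

```python
from typing import NamedTuple


class OpusToc(NamedTuple):
    mode: str              # "SILK", "Hybrid" or "CELT"
    bandwidth: str         # "NB", "MB", "WB", "SWB" or "FB"
    frame_ms: float        # duration of one frame in milliseconds
    stereo: bool           # the "s" bit
    frame_count_code: int  # the "c" field, 0..3


def parse_toc(toc: int) -> OpusToc:
    """Decode the first byte of an Opus packet (hypothetical helper)."""
    if not 0 <= toc <= 0xFF:
        raise ValueError("TOC must be a single byte")
    config = toc >> 3          # top five bits
    stereo = bool(toc & 0x04)  # bit 2
    code = toc & 0x03          # low two bits
    if config < 12:
        # configs 0..11: SILK-only, four frame sizes per bandwidth
        mode = "SILK"
        bandwidth = ("NB", "MB", "WB")[config // 4]
        frame_ms = (10, 20, 40, 60)[config % 4]
    elif config < 16:
        # configs 12..15: Hybrid, two frame sizes per bandwidth
        mode = "Hybrid"
        bandwidth = ("SWB", "FB")[(config - 12) // 2]
        frame_ms = (10, 20)[config % 2]
    else:
        # configs 16..31: CELT-only, four frame sizes per bandwidth
        mode = "CELT"
        bandwidth = ("NB", "WB", "SWB", "FB")[(config - 16) // 4]
        frame_ms = (2.5, 5, 10, 20)[config % 4]
    return OpusToc(mode, bandwidth, float(frame_ms), stereo, code)
```

For example, `parse_toc(0xFC)` describes a stereo, fullband, CELT-only packet holding a single 20 ms frame.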

Opus-tools

Port opusdec to libopusfile/libopusurl.
A simple real time streaming example tool
- Start with opusrtp.c in opus-tools
- Make opusrtp rtp://example.com:5431/ listen to that host and port and mux packets from there. Generalize the pcap-based --sniff implementation
- Make sending similarly generic. Maybe just opusrtp source.opus -o rtp://example.com:5431/ to send source.opus out to the destination?
- Make --sniff save one file per
- Implement DTLS-SRTP. See WebRTC.
- audio capture/encode, decode/playback?
- Parse and act on SDP for convenience and testing.

EBU R128/ReplayGain (half done; needs a gain tool)
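
For the streaming tool items above, a rough sketch of the two parsing steps a generic receiver would need: splitting an `rtp://host:port/` argument into host and port, and stripping the fixed RTP header (RFC 3550) to reach the Opus payload. This is not how opusrtp is written (it is a C program). The names `parse_rtp_url`, `parse_rtp`, `RtpEndpoint` and `RtpPacket` are hypothetical, and the `rtp://` URL form is only the one suggested in the list, not an established scheme.

```python
import struct
from typing import NamedTuple
from urllib.parse import urlsplit


class RtpEndpoint(NamedTuple):
    host: str
    port: int


def parse_rtp_url(url: str) -> RtpEndpoint:
    """Split an rtp://host:port/ argument (hypothetical CLI syntax)."""
    parts = urlsplit(url)
    if parts.scheme != "rtp":
        raise ValueError(f"not an rtp:// URL: {url!r}")
    if not parts.hostname or parts.port is None:
        raise ValueError(f"rtp:// URL needs a host and a port: {url!r}")
    return RtpEndpoint(parts.hostname, parts.port)


class RtpPacket(NamedTuple):
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    payload: bytes


def parse_rtp(datagram: bytes) -> RtpPacket:
    """Parse the fixed RTP header and return the payload bytes."""
    if len(datagram) < 12:
        raise ValueError("shorter than the fixed RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", datagram[:12])
    if b0 >> 6 != 2:
        raise ValueError("unsupported RTP version")
    offset = 12 + 4 * (b0 & 0x0F)  # skip the CSRC list
    if b0 & 0x10:                  # header extension present
        if len(datagram) < offset + 4:
            raise ValueError("truncated header extension")
        (ext_words,) = struct.unpack("!H", datagram[offset + 2:offset + 4])
        offset += 4 + 4 * ext_words
    end = len(datagram)
    if b0 & 0x20:                  # padding: last byte holds the pad length
        end -= datagram[-1]
    if offset > end:
        raise ValueError("truncated RTP packet")
    return RtpPacket(b1 & 0x7F, seq, ts, ssrc, datagram[offset:end])
```

A receiver could bind a UDP socket to the parsed endpoint, pass each datagram through `parse_rtp`, and hand the payloads to an Ogg muxer; that part is left out of the sketch.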

Surround work

Apply spreading to energy masking
More conservative energy masking (not just mean difference) and dynalloc
Allow SILK/hybrid on center channel for voice?

Psychoacoustic stuff

Adaptive width narrowing and forced intensity stereo bands

Optimisations

Vectorising comb_filter()
Use 16-bit mul plus shift in denormalise_bands()
Optimise MDCT somehow
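
To illustrate what "16-bit mul plus shift" means in general, here is a toy fixed-point sketch: each coefficient and the gain mantissa fit in 16 bits, their product fits in 32 bits, and a right shift restores the scale. This is not the libopus code path. `mult16_16` mirrors the idea of a 16x16 multiply, while `scale_band`, the mantissa/shift gain representation and the values used are assumptions made only for this example.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mult16_16(a: int, b: int) -> int:
    """16-bit by 16-bit multiply; the product fits in 32 bits."""
    for v in (a, b):
        if not INT16_MIN <= v <= INT16_MAX:
            raise ValueError(f"{v} does not fit in 16 bits")
    return a * b


def scale_band(coeffs, gain_mant: int, shift: int):
    """Scale coefficients by gain_mant / 2**shift (hypothetical helper).

    Python's >> on negative integers is an arithmetic (flooring) shift,
    which matches what a C arithmetic right shift would do.
    """
    return [mult16_16(c, gain_mant) >> shift for c in coeffs]
```

With `gain_mant=24576` and `shift=15` the gain is 0.75, so `scale_band([16384, -16384], 24576, 15)` scales both coefficients by three quarters.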

Third-Party tool enhancements

mutagen: support padding in comments header, allow updating output gain in ID header
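
For reference, updating the output gain amounts to rewriting a signed 16-bit little-endian Q7.8 field at byte offset 16 of the "OpusHead" identification packet (RFC 7845, section 5.1). The sketch below works on the raw header packet only; it is not mutagen's API, the function names are hypothetical, and a real tool would also have to recompute the checksum of the Ogg page that carries the packet.

```python
import struct

# 8 bytes magic + 1 version + 1 channel count + 2 pre-skip + 4 sample rate
OUTPUT_GAIN_OFFSET = 16


def _check_head(head: bytes) -> None:
    if len(head) < 19 or head[:8] != b"OpusHead":
        raise ValueError("not an OpusHead identification packet")


def get_output_gain_db(head: bytes) -> float:
    """Return the output gain in dB (stored as signed Q7.8)."""
    _check_head(head)
    (q8,) = struct.unpack_from("<h", head, OUTPUT_GAIN_OFFSET)
    return q8 / 256.0


def set_output_gain_db(head: bytes, gain_db: float) -> bytes:
    """Return a copy of the packet with a new output gain."""
    _check_head(head)
    q8 = round(gain_db * 256)
    if not -32768 <= q8 <= 32767:
        raise ValueError("gain does not fit in Q7.8")
    out = bytearray(head)
    struct.pack_into("<h", out, OUTPUT_GAIN_OFFSET, q8)
    return bytes(out)
```

Because the field has a fixed size, the packet length does not change, which is what makes an in-place update possible.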

Future work

psymodel based VBR
Remove copy in inverse MDCT
Save some float<->int conversions
Improvements to LP mode CBR (greg has some code)
Unconstrained SILK VBR
Better handling for the case where FEC has a different bandwidth than the current mode
PLC transitions on unprotected SILK-SILK bandwidth changes?
Figure out how to use speech/music detection optimally
- find optimal switching time (low energy/tonality)
Improve variable frame size

@@ Line 1: / Line 1: @@
-== 1.0.2 ==
+== For 1.2 ==
+* Low bitrate quality improvements
-* multi-frame FEC/PLC fix
+* AVX optimizations
-* opus_packet_get_duration()
+* Fix compilation as a single module for gecko
-* OPUS_GET_FRAME_SIZE() for decoder??
-== 1.1-beta ==
-* multi-frame FEC/PLC fix
-* tune transient detector
-* variable frame size?
-* LOTS of testing
-* re-tune hybrid rate allocation
-* re-tune mode switching decisions
-* figure out how to use speech/music detection optimally
-== Lower priority ==
-* Handle packets with PLC frames followed by FEC
-* Better handling for the case where FEC has a different bandwidth than the current mode
-* PLC transitions on unprotected SILK-SILK bandwidth changes?
 == Spec ==
-* Ogg mapping
+* Matroska mapping. See: [[MatroskaOpus]] and the Firefox/FFmpeg implementations
-* Matroska mapping. See: [[MatroskaOpus]]
+* RTP payload format. Mono/stereo mapping is complete [[https://tools.ietf.org/html/rfc7587 RFC 7587]]; no multichannel mapping yet.
-* RTP payload format
+* MP4 mapping. See [[https://opus-codec.org/docs/opus_in_isobmff.html ISO Base Media File Format draft]]
 == Website ==
-* De-uglify webpage
+* De-uglify webpage - some suggestions:
-* Promotional material
+** write about codecs obsoleted by Opus (Speex, CELT, Vorbis(?) and the proprietary ones)
+** write about implementations (libopus encoder/decoder, libavcodec decoder, any others?)
+** [https://en.wikipedia.org/wiki/Comparison_of_audio_coding_formats audio codec comparison table] (Opus, Vorbis, Speex, ..., MP5) of features: channels, freq, bits per sample, license, language (C89), integer impl. (Vorbis decoder only, Opus YES, ...)
+** future use in video files (Theora? Dirac? WebM? other future codecs...)
+** audio files for storage (like Vorbis, no raw Opus defined, only inside Ogg), ...
+* Promotional material (some nice free/public-domain sounds/radio stations in Opus format)
 == Other ==
@@ Line 35: / Line 23: @@
 == Opus-tools ==
+* Port opusdec to libopusfile/libopusurl.
 * A simple real time streaming example tool
-* Replaygain (half done— needs a gain tool)
+** Start with opusrtp.c in [https://git.xiph.org/?p=opus-tools.git opus-tools]
+** Make <code>opusrtp rtp://example.com:5431/</code> listen to that host and port and mux packets from there. Generalize the pcap-based --sniff implementation
+** Make sending similarly generic. Maybe just <code>opusrtp source.opus -o rtp://example.com:5431/</code> to send source.opus out to the destination?
+** Make --sniff save one file per
+** Implement DTLS-SRTP. See WebRTC.
+** audio capture/encode, decode/playback?
+** Parse and act on SDP for convenience and testing.
-== Experiments ==
+* EBU R128/ReplayGain (half done; needs a gain tool)
-* Test exp_analysis and void_my_warranty.patch
+== Surround work ==
+* Apply spreading to energy masking
+* More conservative energy masking (not just mean difference) and dynalloc
+* Allow SILK/hybrid on center channel for voice?
+== Psychoacoustic stuff ==
+* Adaptive width narrowing and forced intensity stereo bands
+== Optimisations ==
+* Vectorising comb_filter()
+* Use 16-bit mul plus shift in denormalise_bands()
+* Optimise MDCT somehow
+== Third-Party tool enhancements ==
+* mutagen: [https://bitbucket.org/lazka/mutagen/issue/202/oggopus-support-in-place-rewrites-for support padding in comments header], [https://bitbucket.org/lazka/mutagen/issue/203/oggopus-allow-updating-the-output_gain allow updating output gain in ID header]
 == Future work ==
-* Smart automatic mode decision
 * psymodel based VBR
 * Remove copy in inverse MDCT
 * Save some float<->int conversions
 * Improvements to LP mode CBR (greg has some code)
+* Unconstrained SILK VBR
+* Better handling for the case where FEC has a different bandwidth than the current mode
+* PLC transitions on unprotected SILK-SILK bandwidth changes?
+* Figure out how to use speech/music detection optimally
+** find optimal switching time (low energy/tonality)
+* Improve variable frame size
+[[Category:Opus]]