Difference between revisions of "OpusTodo"

From XiphWiki
(update for 1.1.3)
 
(36 intermediate revisions by 7 users not shown)
Line 1: Line 1:
== 1.0.2 ==
+
== For 1.2 ==
 
+
* Low bitrate quality improvements
* multi-frame FEC/PLC fix
+
* AVX optimizations
* opus_packet_get_duration()
+
* Fix compilation as a single module for gecko
* OPUS_GET_FRAME_SIZE() for decoder??
+
 
+
== 1.1-beta ==
+
 
+
* multi-frame FEC/PLC fix
+
* tune transient detector
+
* variable frame size?
+
* LOTS of testing
+
* re-tune hybrid rate allocation
+
* re-tune mode switching decisions
+
* figure out how to use speech/music detection optimally
+
 
+
== Lower priority ==
+
 
+
* Handle packets with PLC frames followed by FEC
+
* Better handling for the case where FEC has a different bandwidth than the current mode
+
* PLC transitions on unprotected SILK-SILK bandwidth changes?
+
  
 
== Spec ==
 
== Spec ==
* Ogg mapping
+
* Matroska mapping. See [[MatroskaOpus]] and the Firefox/FFmpeg implementations
* Matroska mapping. See: [[MatroskaOpus]]
+
* RTP payload format. The mono/stereo mapping is complete ([https://tools.ietf.org/html/rfc7587 RFC 7587]); there is no multichannel mapping yet.
* RTP payload format
+
* MP4 mapping. See the [https://opus-codec.org/docs/opus_in_isobmff.html ISO Base Media File Format draft]
  
 
== Website ==
 
== Website ==
* De-uglify webpage
+
* De-uglify webpage - some suggestions:
* Promotional material
+
** write about codecs obsoleted by Opus (Speex, CELT, Vorbis(?) and the proprietary ones)
 +
** write about implementations (libopus encoder/decoder, libavcodec decoder, any others?)
 +
** [https://en.wikipedia.org/wiki/Comparison_of_audio_coding_formats audio codec comparison table] (Opus, Vorbis, Speex, ..., MP5) of features (channels, freq, bits per sample, license, language (C89), integer impl. (Vorbis decoder only, Opus YES, ...))
 +
** future use in video files (Theora? Dirac? WebM? other future codecs...)
 +
** audio files for storage (like Vorbis, no raw Opus defined, only inside Ogg), ...
 +
* Promotional material (some nice free/public-domain sounds/radio stations in Opus format)
  
 
== Other ==
 
== Other ==
Line 35: Line 23:
  
 
== Opus-tools ==
 
== Opus-tools ==
 +
* Port opusdec to libopusfile/libopusurl.
 
* A simple real time streaming example tool
 
* A simple real time streaming example tool
* Replaygain (half done; needs a gain tool)
+
** Start with opusrtp.c in [https://git.xiph.org/?p=opus-tools.git opus-tools]
 +
** Make <code>opusrtp rtp://example.com:5431/</code> listen to that host and port and mux packets from there. Generalize the pcap-based --sniff implementation
 +
** Make sending similarly generic. Maybe just <code>opusrtp source.opus -o rtp://example.com:5431/</code> to send source.opus out to the destination?
 +
** Make --sniff save one file per
 +
** Implement DTLS-SRTP. See WebRTC.
 +
** audio capture/encode, decode/playback?
 +
** Parse and act on SDP for convenience and testing.
  
== Experiments ==
+
* EBU R128/Replaygain (half done; needs a gain tool)
  
* Test exp_analysis and void_my_warranty.patch
+
== Surround work ==
  
 +
* Apply spreading to energy masking
 +
* More conservative energy masking (not just mean difference) and dynalloc
 +
* Allow SILK/hybrid on center channel for voice?
 +
 +
== Psychoacoustic stuff ==
 +
 +
* Adaptive width narrowing and forced intensity stereo bands
 +
 +
== Optimisations ==
 +
 +
* Vectorising comb_filter()
 +
* Use 16-bit mul plus shift in denormalise_bands()
 +
* Optimise MDCT somehow
 +
 +
== Third-Party tool enhancements ==
 +
* mutagen: [https://bitbucket.org/lazka/mutagen/issue/202/oggopus-support-in-place-rewrites-for support padding in comments header], [https://bitbucket.org/lazka/mutagen/issue/203/oggopus-allow-updating-the-output_gain allow updating output gain in ID header]
  
 
== Future work ==
 
== Future work ==
* Smart automatic mode decision
 
 
* psymodel based VBR
 
* psymodel based VBR
 
* Remove copy in inverse MDCT
 
* Remove copy in inverse MDCT
 
* Save some float<->int conversions
 
* Save some float<->int conversions
 
* Improvements to LP mode CBR (greg has some code)
 
* Improvements to LP mode CBR (greg has some code)
 +
* Unconstrained SILK VBR
 +
* Better handling for the case where FEC has a different bandwidth than the current mode
 +
* PLC transitions on unprotected SILK-SILK bandwidth changes?
 +
* Figure out how to use speech/music detection optimally
 +
** find optimal switching time (low energy/tonality)
 +
* Improve variable frame size
 +
 +
[[Category:Opus]]

Latest revision as of 19:59, 19 July 2016

For 1.2

  • Low bitrate quality improvements
  • AVX optimizations
  • Fix compilation as a single module for gecko

Spec

Website

  • De-uglify webpage - some suggestions:
    • write about codecs obsoleted by Opus (Speex, CELT, Vorbis(?) and the proprietary ones)
    • write about implementations (libopus encoder/decoder, libavcodec decoder, any others?)
    • audio codec comparison table (Opus, Vorbis, Speex, ..., MP5) of features (channels, freq, bits per sample, license, language (C89), integer impl. (Vorbis decoder only, Opus YES, ...))
    • future use in video files (Theora? Dirac? WebM? other future codecs...)
    • audio files for storage (like Vorbis, no raw Opus defined, only inside Ogg), ...
  • Promotional material (some nice free/public-domain sounds/radio stations in Opus format)

Other

  • Oggz-validate (should also validate opus toc)
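
As a rough illustration of what validating the Opus TOC could involve, the Python sketch below decodes the TOC byte as laid out in RFC 6716 (section 3.1) and applies a few of the packet rules from that RFC. The function names are hypothetical and are not part of oggz-validate or libopus, and a real validator would need to check more than this.

```python
# Sketch of an Opus TOC sanity check following RFC 6716, section 3.1.
# All names here are illustrative; none come from oggz-validate or libopus.


def _config_info(config):
    """Map the 5-bit config number to (mode, bandwidth, frame duration in ms)."""
    if config < 12:
        mode = "SILK"
        bandwidth = ("NB", "MB", "WB")[config // 4]
        frame_ms = (10, 20, 40, 60)[config % 4]
    elif config < 16:
        mode = "Hybrid"
        bandwidth = ("SWB", "FB")[(config - 12) // 2]
        frame_ms = (10, 20)[config % 2]
    else:
        mode = "CELT"
        bandwidth = ("NB", "WB", "SWB", "FB")[(config - 16) // 4]
        frame_ms = (2.5, 5, 10, 20)[config % 4]
    return mode, bandwidth, frame_ms


def parse_opus_toc(packet):
    """Decode the TOC byte of one Opus packet and reject obviously bad packets."""
    if len(packet) < 1:
        raise ValueError("empty packet: no TOC byte")

    toc = packet[0]
    config = toc >> 3          # top five bits
    stereo = bool(toc & 0x04)  # 's' bit
    code = toc & 0x03          # frame count code 'c'
    mode, bandwidth, frame_ms = _config_info(config)

    if code == 0:
        frames = 1
    elif code in (1, 2):
        frames = 2
    else:
        # Code 3: the second byte carries the frame count in its low 6 bits.
        if len(packet) < 2:
            raise ValueError("code 3 packet is missing the frame count byte")
        frames = packet[1] & 0x3F
        if frames == 0:
            raise ValueError("code 3 packet declares zero frames")

    # Code 1 holds two equal-sized frames, so the payload length must be even.
    if code == 1 and (len(packet) - 1) % 2 != 0:
        raise ValueError("code 1 packet has an odd payload length")

    duration_ms = frames * frame_ms
    if duration_ms > 120:
        raise ValueError("packet duration exceeds 120 ms")

    return {
        "mode": mode,
        "bandwidth": bandwidth,
        "frame_ms": frame_ms,
        "stereo": stereo,
        "code": code,
        "frames": frames,
        "duration_ms": duration_ms,
    }
```

Calling `parse_opus_toc(bytes([0xFC]))` describes a single 20 ms stereo CELT fullband frame, while a code 3 packet that declares three 60 ms SILK frames is rejected for exceeding 120 ms.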

Opus-tools

  • Port opusdec to libopusfile/libopusurl.
  • A simple real time streaming example tool
    • Start with opusrtp.c in opus-tools
    • Make opusrtp rtp://example.com:5431/ listen to that host and port and mux packets from there. Generalize the pcap-based --sniff implementation
    • Make sending similarly generic. Maybe just opusrtp source.opus -o rtp://example.com:5431/ to send source.opus out to the destination?
    • Make --sniff save one file per
    • Implement DTLS-SRTP. See WebRTC.
    • audio capture/encode, decode/playback?
    • Parse and act on SDP for convenience and testing.
  • EBU R128/Replaygain (half done; needs a gain tool)
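
One way a gain tool could work is to rewrite the output gain field of the OpusHead identification header, which RFC 7845 defines as a signed 16-bit little-endian Q7.8 value in dB at byte offset 16. The Python sketch below is an assumption about the approach, not code from opus-tools, and the helper names are hypothetical. It operates on a bare OpusHead packet, so a tool editing a real Ogg file would also have to recompute the page checksum.

```python
import struct

# Field layout per RFC 7845, section 5.1:
# magic (8) + version (1) + channel count (1) + pre-skip (2) + input rate (4)
OUTPUT_GAIN_OFFSET = 16
MIN_HEAD_LENGTH = 19


def _check_head(head):
    """Raise ValueError unless the bytes look like an OpusHead packet."""
    if len(head) < MIN_HEAD_LENGTH or head[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")


def get_output_gain_db(head):
    """Return the output gain in dB stored in an OpusHead packet."""
    _check_head(head)
    (raw,) = struct.unpack_from("<h", head, OUTPUT_GAIN_OFFSET)
    return raw / 256.0  # Q7.8 fixed point to dB


def set_output_gain_db(head, gain_db):
    """Return a copy of the OpusHead packet with a new output gain in dB."""
    _check_head(head)
    raw = round(gain_db * 256)  # dB to Q7.8 fixed point
    if not -32768 <= raw <= 32767:
        raise ValueError("gain does not fit in a signed 16-bit Q7.8 field")
    out = bytearray(head)
    struct.pack_into("<h", out, OUTPUT_GAIN_OFFSET, raw)
    return bytes(out)
```

For example, `set_output_gain_db(head, -3.5)` returns a header that is identical except for bytes 16 and 17, and `get_output_gain_db` on the result reads the value back.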

Surround work

  • Apply spreading to energy masking
  • More conservative energy masking (not just mean difference) and dynalloc
  • Allow SILK/hybrid on center channel for voice?

Psychoacoustic stuff

  • Adaptive width narrowing and forced intensity stereo bands

Optimisations

  • Vectorising comb_filter()
  • Use 16-bit mul plus shift in denormalise_bands()
  • Optimise MDCT somehow

Third-Party tool enhancements

Future work

  • psymodel based VBR
  • Remove copy in inverse MDCT
  • Save some float<->int conversions
  • Improvements to LP mode CBR (greg has some code)
  • Unconstrained SILK VBR
  • Better handling for the case where FEC has a different bandwidth than the current mode
  • PLC transitions on unprotected SILK-SILK bandwidth changes?
  • Figure out how to use speech/music detection optimally
    • find optimal switching time (low energy/tonality)
  • Improve variable frame size