OpusTodo: Difference between revisions

From XiphWiki
(→‎Later: I think the tool issue is covered by opus-tools)
(47 intermediate revisions by 8 users not shown)
== Code ==
== For 1.1.1 ==
 
* Unconstrained SILK VBR
=== For IETF draft ===
* <s>New comparison tool</s> done in draft-11
* <s>Update test vectors</s> done in draft-11
 
== SILK issues ==
 
 
=== Later ===
* Smart automatic mode decision
* psymodel based VBR
* Remove copy in inverse MDCT
* Save some float<->int conversions


== Spec ==
* Ogg mapping. See the [http://tools.ietf.org/html/draft-ietf-codec-oggopus IETF draft]
* Matroska mapping. See [[MatroskaOpus]] and the Firefox/FFmpeg implementation
* RTP payload format. See the [http://tools.ietf.org/html/draft-spittka-payload-rtp-opus IETF draft]


* Finish codec draft
== Website ==
* Ogg mapping (including multi-channel). See: [[OggOpus]]
* De-uglify webpage - some suggestions:
* Matroska mapping. See: [[MatroskaOpus]]
** write about codecs obsoleted by Opus (Speex, CELT, Vorbis(?), and the proprietary ones)
* RTP payload format
** write about implementations (is there only one so far?)
** comparison table (Opus, Vorbis, Speex, ..., MP5) of features (channels, freq, bits per sample, license, language (C89), integer impl. (Vorbis decoder only, Opus YES, ...))
** future use in video files (Theora? Dirac? WebM? other future codecs...)
** audio files for storage (like Vorbis, no raw Opus defined, only inside Ogg), ...
* Promotional material (some nice free/public-domain sounds/radio stations in Opus format)


== Other ==


* Logo See: [https://bugzilla.mozilla.org/show_bug.cgi?id=689261 Mozilla bug 689261] for some discussion
* Test vectors
* Listening tests
* Documentation (at a minimum every exported symbol should have complete and accurate documentation)
* Add content to opus-codec.org
** The above documentation
** Presentations
** Examples and test results  (hyperlink to Monty's demo, gmaxwell's HA results page, etc)
* Oggz-validate (should also validate opus toc)


== Opus-tools ==
* Build infrastructure (e.g. autotools)
* Port opusdec to libopusfile/libopusurl.
* A simple real time streaming example tool
* <s>Multichannel support</s> doneish.
** Start with opusrtp.c in [https://git.xiph.org/?p=opus-tools.git opus-tools]
** Make <code>opusrtp rtp://example.com:5431/</code> listen to that host and port and mux packets from there. Generalize the pcap-based --sniff implementation
** Make sending similarly generic. Maybe just <code>opusrtp source.opus -o rtp://example.com:5431/</code> to send source.opus out to the destination?
** Make --sniff save one file per
** Implement DTLS-SRTP. See webrtc.
** audio capture/encode, decode/playback?
** Parse and act on sdp for convenience and testing.
 
* Replaygain (half done; needs a gain tool)
* <s>Testing (incl. jenkins automation)</s> doneish


== Third party software ==
== Surround work ==
* Support in ekiga
 
* Support in mumble
* Apply spreading to energy masking
* Support in asterisk
* More conservative energy masking (not just mean difference) and dynalloc
* Support in icecast
* Allow SILK/hybrid on center channel for voice?
* Support in firefox (rtcweb and in ogg)
 
* Support in VLC
== Psychoacoustic stuff ==
* Support in ogg123
 
* Support in ffmpeg
* Adaptive width narrowing and forced intensity stereo bands
* Support in rockbox
 
* Support in foobar2000
== Optimisations ==
* Support in gstreamer
 
* Support in mplayer
* Vectorising comb_filter()
* Support in xmms
* Use 16-bit mul plus shift in denormalise_bands()
* Support in oggdsf
* Optimise MDCT somehow
* Support in xiphqt
* Merge MIPS optimisations
 
== Third-Party tool enhancements ==
* mutagen: [https://bitbucket.org/lazka/mutagen/issue/202/oggopus-support-in-place-rewrites-for support padding in comments header], [https://bitbucket.org/lazka/mutagen/issue/203/oggopus-allow-updating-the-output_gain allow updating output gain in ID header]
 
== Future work ==
* psymodel based VBR
* Remove copy in inverse MDCT
* Save some float<->int conversions
* Improvements to LP mode CBR (greg has some code)
* Better handling for the case where FEC has a different bandwidth than the current mode
* PLC transitions on unprotected SILK-SILK bandwidth changes?
* Figure out how to use speech/music detection optimally
** find optimal switching time (low energy/tonality)
* Improve variable frame size

Revision as of 19:20, 23 December 2014

For 1.1.1

  • Unconstrained SILK VBR

Spec

Website

  • De-uglify webpage - some suggestions:
    • write about codecs obsoleted by Opus (Speex, CELT, Vorbis(?), and the proprietary ones)
    • write about implementations (is there only one so far?)
    • comparison table (Opus, Vorbis, Speex, ..., MP5) of features (channels, freq, bits per sample, license, language (C89), integer impl. (Vorbis decoder only, Opus YES, ...))
    • future use in video files (Theora? Dirac? WebM? other future codecs...)
    • audio files for storage (like Vorbis, no raw Opus defined, only inside Ogg), ...
  • Promotional material (some nice free/public-domain sounds/radio stations in Opus format)

Other

  • Oggz-validate (should also validate opus toc)
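
As a rough illustration of what validating the Opus TOC could involve, the sketch below decodes the first byte of an Opus packet following the layout in RFC 6716, section 3.1 (5-bit config, 1-bit stereo flag, 2-bit frame-count code). The names OpusToc and parse_opus_toc are hypothetical and are not part of oggz-validate or libopus; a real validator would also have to check the rest of the packet framing.

```python
from typing import NamedTuple


class OpusToc(NamedTuple):
    """Decoded fields of an Opus TOC byte (hypothetical helper type)."""
    mode: str         # "SILK", "Hybrid" or "CELT"
    bandwidth: str    # "NB", "MB", "WB", "SWB" or "FB"
    frame_ms: float   # duration of each frame in milliseconds
    stereo: bool      # True if the stereo flag is set
    code: int         # frame-count code, 0..3


def parse_opus_toc(packet: bytes) -> OpusToc:
    """Decode the TOC byte at the start of an Opus packet (RFC 6716, 3.1)."""
    if len(packet) < 1:
        raise ValueError("an Opus packet must contain at least the TOC byte")
    toc = packet[0]
    config = toc >> 3            # top five bits select mode/bandwidth/frame size
    stereo = bool(toc & 0x04)    # bit 2 is the stereo flag
    code = toc & 0x03            # low two bits give the frame-count code

    if config < 12:              # configs 0..11: SILK-only
        mode = "SILK"
        bandwidth = ("NB", "MB", "WB")[config // 4]
        frame_ms = float((10, 20, 40, 60)[config % 4])
    elif config < 16:            # configs 12..15: Hybrid
        mode = "Hybrid"
        bandwidth = ("SWB", "FB")[(config - 12) // 2]
        frame_ms = float((10, 20)[config % 2])
    else:                        # configs 16..31: CELT-only
        mode = "CELT"
        bandwidth = ("NB", "WB", "SWB", "FB")[(config - 16) // 4]
        frame_ms = (2.5, 5.0, 10.0, 20.0)[config % 4]
    return OpusToc(mode, bandwidth, frame_ms, stereo, code)
```

A validator could call parse_opus_toc on the first packet of each audio page and, for instance, flag packets whose decoded duration is inconsistent with the page's granule position.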

Opus-tools

  • Port opusdec to libopusfile/libopusurl.
  • A simple real time streaming example tool
    • Start with opusrtp.c in opus-tools
    • Make opusrtp rtp://example.com:5431/ listen to that host and port and mux packets from there. Generalize the pcap-based --sniff implementation
    • Make sending similarly generic. Maybe just opusrtp source.opus -o rtp://example.com:5431/ to send source.opus out to the destination?
    • Make --sniff save one file per
    • Implement DTLS-SRTP. See webrtc.
    • audio capture/encode, decode/playback?
    • Parse and act on sdp for convenience and testing.
  • Replaygain (half done; needs a gain tool)
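
The rtp:// arguments proposed in the streaming-tool items above would need to be split into a host and a port before a socket can be opened. A minimal sketch of that step, using Python's standard urllib.parse, might look like the following. The function name parse_rtp_url is hypothetical, and nothing here reflects how opusrtp actually handles its arguments.

```python
from typing import Tuple
from urllib.parse import urlsplit


def parse_rtp_url(url: str) -> Tuple[str, int]:
    """Split an rtp://host:port/ URL into (host, port).

    Hypothetical helper: the rtp:// form is only the one proposed in the
    TODO item, not an existing opusrtp option.
    """
    parts = urlsplit(url)
    if parts.scheme != "rtp":
        raise ValueError("expected an rtp:// URL, got %r" % url)
    if parts.hostname is None or parts.port is None:
        raise ValueError("rtp:// URL needs both a host and a port: %r" % url)
    return parts.hostname, parts.port


host, port = parse_rtp_url("rtp://example.com:5431/")
```

The returned pair could then be passed to a UDP socket for either listening or sending, which keeps the two directions symmetric as the list suggests.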

Surround work

  • Apply spreading to energy masking
  • More conservative energy masking (not just mean difference) and dynalloc
  • Allow SILK/hybrid on center channel for voice?

Psychoacoustic stuff

  • Adaptive width narrowing and forced intensity stereo bands

Optimisations

  • Vectorising comb_filter()
  • Use 16-bit mul plus shift in denormalise_bands()
  • Optimise MDCT somehow
  • Merge MIPS optimisations

Third-Party tool enhancements

Future work

  • psymodel based VBR
  • Remove copy in inverse MDCT
  • Save some float<->int conversions
  • Improvements to LP mode CBR (greg has some code)
  • Better handling for the case where FEC has a different bandwidth than the current mode
  • PLC transitions on unprotected SILK-SILK bandwidth changes?
  • Figure out how to use speech/music detection optimally
    • find optimal switching time (low energy/tonality)
  • Improve variable frame size