From XiphWiki
Revision as of 03:39, 8 August 2008 by Jmspeex (talk | contribs) (bio)


This talk introduces CELT, a new open source audio codec from Xiph.Org designed for high-quality communication with very low delay. CELT bridges the gap between Speex and Vorbis by providing very high quality speech and music with less than 10 ms delay. This allows new applications, such as "CD-quality" video conferencing, and even makes it possible for musicians to play together remotely over a cable or DSL connection. We will explain why latency is such an important issue for audio compression and how CELT achieves good quality despite its low delay. The CELT API will be presented, along with guidelines for writing low-delay audio applications.

The effort is led by the developer of Speex, with the help of both Vorbis and Theora developers. Although the codec is still in development, we will also show how CELT can already provide both higher quality at a given bitrate and lower delay than the current generation of proprietary low-delay audio codecs. A live demonstration of CELT will be included.

  • We want very high quality VoIP/videoconference/...
  • Speex doesn't reach a high enough quality level and Vorbis has too much latency.
  • For once, open-source is ahead of the proprietary codecs.

Possible contents:

  • Explaining why latency is an issue
  • How CELT works
  • API and writing for low latency
  • Comparison, samples
  • Low-delay demo (e.g. vs Skype or Ekiga)

Need to address: why relevant to the Linux community


Ever felt like jamming across the Internet using your DSL connection only to find there's no application that can do it? That's mainly because there was no audio codec that could handle that task, until now. Once again, Xiph.Org is back to save the day with a new codec.


Ever wondered why your telephone doesn't sound as good as your music player? Historically, low-latency speech codecs such as Speex have had poor performance on music or other general audio, due to limits in the sampling rate. High quality general purpose codecs such as Vorbis have too high a latency to be used for telephony. We present a new codec, CELT, which has both extremely low latency and high sampling rates. This makes it possible to play live music together with someone over a DSL connection or enjoy "CD-quality" video conferencing, and enables or enhances a host of other interactive applications. Unlike mainstream audio and video, there is no entrenched proprietary codec in this domain, and our open source alternative is already better than the competition. This talk will explain why latency is an issue in interactive applications, briefly describe how we were able to extend the techniques used in speech codecs beyond the sampling rates where they are traditionally effective, and talk about the libcelt API and how to write low-latency audio applications for Linux.


Jean-Marc Valin has a B.S., M.S., and Ph.D. in Electrical Engineering from the University of Sherbrooke. He is the primary author of the Speex codec, which provides a free alternative to patented, proprietary speech codecs. He joined the Xiph.Org Foundation in 2002, just after Speex was created. From 2005 to 2008, he was a post-doc at CSIRO, where he started working on a next-generation audio codec named CELT. His expertise includes speech and audio coding, speech recognition, echo cancellation, and other audio-related topics.

Dr. Terriberry received dual B.S. and M.S. degrees in both Mathematics and Computer Science from Virginia Tech in 1999 and 2001, respectively, and a Ph.D. in Computer Science from the University of North Carolina at Chapel Hill in 2006. Since 2002 he has been a volunteer for the Xiph.Org Foundation, a non-profit organization that develops free, open multimedia protocols and software. He is the primary author of the Theora specification, and contributed several of the mathematical components of the CELT codec that enable optimal encoding of quantized data. His research interests include audio and video compression, motion tracking, target recognition, medical image analysis, computer vision, optical character recognition, and general purpose computation on GPUs.