Vorbis over RTP
This page documents current consensus on the RTP payload format for Vorbis audio. This encapsulation is useful for interactive and multicast streaming.
In 2008, the Vorbis RTP specification was accepted by the IETF's Network Working Group as RFC 5215. It may be found at
Basically, we pack Vorbis packets into RTP packets; that part of the draft has been settled for some time. The hard part is getting the headers to the client reliably, because RTP usually runs over a transport that does not guarantee delivery. Equally important, in multicast the headers can't be prepended when a client connects, the way Icecast does with HTTP streams.
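As a rough illustration of the packing step, here is a minimal sketch. It assumes a layout loosely modeled on RFC 5215: a 4-byte payload header (24-bit configuration ident, 2-bit fragment type, 2-bit data type, 4-bit packet count) followed by each Vorbis packet prefixed with a 16-bit length. The function names are hypothetical, fragmentation is not handled, and the RFC remains the authority on the exact field definitions.

```python
import struct


def pack_vorbis_payload(ident, packets):
    """Build one RTP payload carrying whole Vorbis packets (no fragmentation)."""
    if not 0 <= ident < (1 << 24):
        raise ValueError("ident must fit in 24 bits")
    if not 1 <= len(packets) <= 15:
        raise ValueError("packet count must fit in 4 bits")
    # Assumed header layout: ident (24) | fragment type (2) | data type (2) | count (4).
    # Fragment type 0 = not fragmented, data type 0 = raw Vorbis audio.
    header = (ident << 8) | (0 << 6) | (0 << 4) | len(packets)
    out = bytearray(struct.pack("!I", header))
    for pkt in packets:
        if len(pkt) > 0xFFFF:
            raise ValueError("packet too large for a 16-bit length field")
        out += struct.pack("!H", len(pkt)) + pkt
    return bytes(out)


def unpack_vorbis_payload(payload):
    """Inverse of pack_vorbis_payload; returns (ident, list_of_packets)."""
    (header,) = struct.unpack_from("!I", payload, 0)
    ident = header >> 8
    count = header & 0x0F
    offset = 4
    packets = []
    for _ in range(count):
        (length,) = struct.unpack_from("!H", payload, offset)
        offset += 2
        if offset + length > len(payload):
            raise ValueError("truncated payload")
        packets.append(payload[offset:offset + length])
        offset += length
    return ident, packets
```

The ident field is what lets a receiver tell which set of headers a packet was encoded with, which matters for the header-delivery problem discussed here.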
The current suggestion (due to Jack) is that we dispense with the chaining feature of Ogg streams and allow only a single set of Vorbis headers per RTP stream. Since RTP is most often used to stream live or at least individual events, this is not a serious limitation. Even in the "simulated live" case, encoded files often share the same set of codebooks, and those that do not can be re-encoded on the fly.
The only real drawback is the loss of a metadata update mechanism. This can be resolved either by using a completely separate metadata stream (which we've always wanted to do anyway) or by altering the spec to allow comment header packets to occur outside the headers. In the latter case, a recording application would need to insert chaining boundaries and duplicate headers at each metadata change. The former offers the client more control over the metadata stream bandwidth.
- Vorbis packets get packed into RTP packets
- Header parameters are fixed per stream, so they can be passed in the SDP
  * inline in the SDP?
  * HTTP reference or other protocol?
  * some kind of hash so clients can cache codebooks/parameters
  * inline in the RTP stream should also be allowed
- Granulepos becomes timestamp (still in samples)
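Two of the points above can be sketched in a few lines. This is only an illustration under stated assumptions: the RTP timestamp is taken to be a 32-bit counter in sample units that starts at an arbitrary offset and wraps, and the cache key is simply a SHA-256 digest of the setup header bytes. Both function names are hypothetical, and neither the choice of hash nor which sample of a packet the timestamp refers to is specified by the notes above.

```python
import hashlib

RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32-bit and wrap around


def granulepos_to_rtp_timestamp(granulepos, initial_offset=0):
    """Map a sample-count granulepos onto a wrapping 32-bit RTP timestamp.

    initial_offset stands in for the stream's starting timestamp (assumed
    to be chosen by the sender); the unit stays "samples", as in the list above.
    """
    if granulepos < 0:
        raise ValueError("granulepos must be non-negative")
    return (initial_offset + granulepos) % RTP_TS_MODULUS


def codebook_cache_key(setup_header):
    """Return a hex digest a client could use to cache codebooks/parameters.

    SHA-256 is an assumption made for this sketch, not part of the proposal.
    """
    return hashlib.sha256(setup_header).hexdigest()
```

A client holding a cached entry under the same key could skip fetching the headers again; a sender only needs the modular arithmetic to keep timestamps valid on long-running streams.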
Tor-Einar Jarnbjo has an example implementation that extends the ideas in the kerr 03 draft a bit.