VorbisRTP

Vorbis over RTP

This page documents the current consensus on the RTP payload format for Vorbis audio. This encapsulation is useful for interactive and multicast streaming.

Drafts

draft-kerr-avt-vorbis-rtp-03 (now expired) was the last spec attempt and serves as the baseline for current work.

Issues

Basically, we pack Vorbis packets into RTP packets; that's been part of the draft for some time. The hard part is transmitting the headers reliably, since RTP usually runs over a transport that does not guarantee delivery. Equally important, in multicast the headers can't be prepended on connection the way icecast does with HTTP streams.

The current suggestion (due to Jack) is that we dispense with the chaining feature of Ogg streams and allow only a single set of Vorbis headers per RTP stream. Since RTP is most often used to stream live or at least individual events, this is not a serious limitation. Even in the "simulated live" case, encoded files often share the same set of codebooks, and those that do not can be re-encoded on the fly.

The only real drawback is the loss of a metadata update mechanism. This can be resolved either by using a completely separate metadata stream (which we've always wanted to do anyway) or by altering the spec to allow comment header packets to occur outside the headers. In the latter case, a recording application would need to insert chaining boundaries and duplicate headers at each metadata change. The former gives the client more control over the bandwidth used by the metadata stream.

Design

Vorbis packets get packed into RTP packets
Header parameters are fixed per stream, so they can be passed in the SDP

 * inline in the SDP?
 * HTTP reference or other protocol?
 * some kind of hash so clients can cache codebooks/parameters
 * inline in the RTP stream should also be allowed
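The hash idea in the list above could look roughly like the following. This is only a sketch: the choice of SHA-1, the length-prefixed framing, and the names header_ident and CodebookCache are all assumptions made for illustration; none of them come from the draft.

```python
import hashlib


def header_ident(id_header: bytes, setup_header: bytes) -> str:
    """Return a hex digest identifying one set of Vorbis headers.

    Hypothetical helper: SHA-1 and the framing below are assumptions,
    not something the draft specifies.
    """
    h = hashlib.sha1()
    for blob in (id_header, setup_header):
        # Length-prefix each header so that different splits of the
        # same byte string cannot produce the same digest.
        h.update(len(blob).to_bytes(4, "big"))
        h.update(blob)
    return h.hexdigest()


class CodebookCache:
    """Hypothetical client-side cache of header sets, keyed by digest."""

    def __init__(self):
        self._store = {}

    def put(self, id_header: bytes, setup_header: bytes) -> str:
        ident = header_ident(id_header, setup_header)
        self._store[ident] = (id_header, setup_header)
        return ident

    def get(self, ident: str):
        # Returns None when the headers still have to be fetched.
        return self._store.get(ident)
```

A client that finds the advertised digest in its cache can skip fetching the headers; otherwise it falls back to whichever delivery mechanism the session offers.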

Granulepos becomes timestamp (still in samples)
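As a rough illustration of the timestamp mapping, the sketch below builds the 12-byte fixed RTP header and derives the 32-bit timestamp from a granule position. The function names, the dynamic payload type 96, and the ts_offset parameter (standing in for the random initial timestamp RTP recommends) are assumptions for the example. Any payload header the draft defines is omitted, and which granule position a given packet should carry is left to the caller.

```python
import struct

RTP_VERSION = 2
TS_MODULUS = 1 << 32  # RTP timestamps are 32 bits wide


def rtp_timestamp(granulepos: int, ts_offset: int = 0) -> int:
    """Map a granule position (in samples) onto the 32-bit RTP clock."""
    return (ts_offset + granulepos) % TS_MODULUS


def rtp_packet(payload: bytes, seq: int, granulepos: int, ssrc: int,
               ts_offset: int = 0, payload_type: int = 96,
               marker: bool = False) -> bytes:
    """Prefix a payload with a fixed RTP header (no CSRCs, no extension)."""
    byte0 = RTP_VERSION << 6  # padding = 0, extension = 0, CSRC count = 0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",
        byte0,
        byte1,
        seq & 0xFFFF,
        rtp_timestamp(granulepos, ts_offset),
        ssrc & 0xFFFFFFFF,
    )
    return header + payload
```

Because the timestamp stays in samples, the RTP clock rate in this sketch is simply the stream's sample rate, and a receiver can recover relative sample positions by unwrapping the 32-bit counter.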

Implementation

Tor-Einar Jarnbjo has an example implementation that extends the ideas in the 03 draft a bit.