Ambisonics
Ambisonics is a surround sound system first developed in the 1970s. Its main difference from other surround techniques is that it separates transmission channels from speaker feeds, the speaker feeds being derived using a decoder situated in the living room. Decoders can be implemented in either hardware or software. Typically more speakers are used than transmission channels, and the more speakers used, the more stable the resulting soundfield. Speakers can be arranged in a number of configurations, regular polygons being the most popular.
Ambisonic files can come in a number of different formats. The main one is called B-Format, the other formats being derived from this. UHJ Format is mono and stereo compatible. G-Format is a set of speaker feeds, so it can be enjoyed in surround sound without the need for a decoder in the living room.
Resources on Ambisonics
- There is now a set of Wikipedia articles on Ambisonics.
- Of particular relevance is the ".amb" specification for downloadable B-Format files. However, the ".amb" spec has some limitations that it would be useful to overcome.
- This website has many pages on Ambisonics (including at the bottom links to other Ambisonic websites).
B-Format
B-Format is a single coherent soundfield composed of a set of related channels. The number of channels used depends on whether the soundfield is horizontal-only or full-sphere, and on the order. These B-Format channels are transmission channels, not speaker feeds. Some channel counts are tabulated below.
There is a file specification in use for downloadable B-Format files called the ".amb" specification.
Limitations of the ".amb" specification
The ".amb" specification for downloadable B-Format files is based on the WAVE-EX format. There are currently over 75 pieces available in this format for free download. Most of these are first-order full-sphere soundfields. Some of the limitations of the specification are:
- It is limited to 4 GByte files (2 GBytes with software that treats the 32-bit size field as signed).
- It is limited to third-order soundfields and below. While third-order looks like a lot (16 channels), there already exists a prototype mic that can record up to fourth-order.
- No compression (particularly lossless).
- No flag to indicate whether the W channel has been attenuated by 3 dB or not. With a flag the attenuation becomes optional (in ".amb" it is mandatory).
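To get a feel for the file-size limit, the following rough sketch estimates how long an uncompressed recording can run before it reaches the limit. It assumes the limit is exactly 2^32 bytes (or 2^31 bytes for the signed case) and ignores header overhead; the function name and parameters are illustrative and not part of the ".amb" specification.

```python
# Rough estimate of the longest uncompressed recording that fits under
# a 32-bit file-size limit. Header overhead is ignored (assumption).

UNSIGNED_LIMIT = 2**32  # 4 GBytes, assuming an unsigned 32-bit size field
SIGNED_LIMIT = 2**31    # 2 GBytes, if the size field is treated as signed


def max_duration_seconds(channels, sample_rate_hz, bytes_per_sample,
                         size_limit=UNSIGNED_LIMIT):
    """Return the approximate maximum duration in seconds."""
    bytes_per_second = channels * sample_rate_hz * bytes_per_sample
    return size_limit / bytes_per_second


if __name__ == "__main__":
    # Third-order full-sphere (16 channels), 48 kHz, 24-bit samples.
    third = max_duration_seconds(16, 48000, 3)
    print(f"third-order, 48 kHz, 24-bit: about {third / 60:.0f} minutes")

    # First-order full-sphere (4 channels), 44.1 kHz, 16-bit samples.
    first = max_duration_seconds(4, 44100, 2)
    print(f"first-order, 44.1 kHz, 16-bit: about {first / 3600:.1f} hours")
```

Under these assumptions a third-order recording at 48 kHz and 24 bits reaches the limit after roughly half an hour, which is why the size limit matters more as the order goes up.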
Malham notation
The ".amb" file specification is limited to third-order and below because it uses the number of channels to uniquely define the soundfield order. Unfortunately this simple and elegant scheme does not work above third-order, as ambiguities creep in. (One ambiguity is illustrated in the table below: a fourth-order horizontal soundfield and a second-order full-sphere soundfield both have 9 channels.) A more general file format will have to use something else, such as Malham notation.
Malham notation specifies the order of a B-Format soundfield using a string of characters, each character being either f (for full-sphere) or h (for horizontal). The first character in the string specifies the type of the first-order components, the second character the type of the second-order components, etc.
Horizontal order | Height order | Soundfield type | Malham notation | Number of channels | Channels |
---|---|---|---|---|---|
1 | 0 | horizontal | h | 3 | WXY |
1 | 1 | full-sphere | f | 4 | WXYZ |
2 | 0 | horizontal | hh | 5 | WXYRS |
2 | 1 | mixed-order | fh | 6 | WXYZRS |
2 | 2 | full-sphere | ff | 9 | WXYZRSTUV |
3 | 0 | horizontal | hhh | 7 | WXYRSPQ |
3 | 1 | mixed-order | fhh | 8 | WXYZRSPQ |
3 | 2 | mixed-order | ffh | 11 | WXYZRSTUVPQ |
3 | 3 | full-sphere | fff | 16 | WXYZRSTUVKLMNOPQ |
4 | 0 | horizontal | hhhh | 9 | extra channels unlabeled |
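The channel counts in the table follow a simple pattern: W contributes one channel, each full-sphere order m adds 2m + 1 channels, and each horizontal order adds 2. The sketch below computes the count from a Malham string and shows the 9-channel ambiguity. The function name is hypothetical, and the rule that every f must come before any h is an assumption inferred from the table rows, not something the notation is stated to require.

```python
def channel_count(malham):
    """Return the number of B-Format channels for a Malham string.

    'f' marks a full-sphere order and 'h' a horizontal-only order;
    the first character describes first order, the second character
    second order, and so on.
    """
    if not malham or any(c not in "fh" for c in malham):
        raise ValueError("expected a non-empty string of 'f' and 'h'")
    if "hf" in malham:
        # Assumption: height order never exceeds horizontal order,
        # so all 'f' characters come before any 'h'.
        raise ValueError("'f' may not follow 'h'")

    count = 1  # the zeroth-order W channel
    for order, kind in enumerate(malham, start=1):
        # Full-sphere order m adds 2m + 1 channels; horizontal adds 2.
        count += (2 * order + 1) if kind == "f" else 2
    return count


if __name__ == "__main__":
    for notation in ["h", "f", "hh", "fh", "ff", "hhh",
                     "fhh", "ffh", "fff", "hhhh"]:
        print(notation, channel_count(notation))
```

Both "ff" and "hhhh" give 9 channels, so a channel count alone cannot tell the two apart, whereas the Malham string can.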
Default channel conversions from B-Format
Converting a B-Format file to a mono file is straightforward. Use Mono = W*sqrt(2), which undoes the -3 dB attenuation applied to the W channel.
Converting a B-Format file to a stereo file is more difficult. The "proper" way to do this is to convert the W,X,Y channels to two-channel UHJ. Unfortunately this requires the use of 90-degree wide-band phase shifters. In the digital domain these are usually implemented as convolution filters.
If 90-degree phase shifters are unavailable, the problem becomes one of choice. Starting from B-Format, it is possible to synthesize any mic response pointing in any direction. Hence, it is possible to synthesize all coincident stereo mic techniques. Here are two popular stereo techniques.
Blumlein Mid-Side
    Mid = (W*sqrt(2)) + X    /* This is a cardioid response pointing forward */
    Left = Mid + Y
    Right = Mid - Y
Blumlein Crossed Pairs
    Left = (X + Y)/sqrt(2)
    Right = (X - Y)/sqrt(2)
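The mono rule and the two stereo techniques above can be sketched sample by sample as follows. This is a minimal illustration, not a reference decoder: the function names are hypothetical, channels are assumed to be equal-length lists of floats, W is assumed to carry the -3 dB attenuation, and no output level normalization is applied.

```python
import math

SQRT2 = math.sqrt(2.0)


def to_mono(w):
    """Mono = W*sqrt(2): undo the -3 dB attenuation on W."""
    return [ws * SQRT2 for ws in w]


def mid_side(w, x, y):
    """Mid-Side: forward cardioid Mid, with Y as the Side signal."""
    left, right = [], []
    for ws, xs, ys in zip(w, x, y):
        mid = ws * SQRT2 + xs  # cardioid response pointing forward
        left.append(mid + ys)
        right.append(mid - ys)
    return left, right


def crossed_pairs(x, y):
    """Crossed Pairs: two figure-of-eight responses derived from X and Y."""
    left = [(xs + ys) / SQRT2 for xs, ys in zip(x, y)]
    right = [(xs - ys) / SQRT2 for xs, ys in zip(x, y)]
    return left, right


if __name__ == "__main__":
    # One sample of a unit source straight ahead, assuming the usual
    # first-order encoding W = 1/sqrt(2), X = cos(angle), Y = sin(angle).
    w, x, y = [1.0 / SQRT2], [1.0], [0.0]
    print("mono:", to_mono(w))
    print("mid-side:", mid_side(w, x, y))
    print("crossed pairs:", crossed_pairs(x, y))
```

For a source straight ahead, both stereo techniques give equal left and right signals, as expected for a centered image. The Mid-Side output is louder than the Crossed Pairs output for the same input, so a practical converter would probably apply a gain before writing the file.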
Which conversion to stereo is better depends on the material and how it was recorded. A good suggestion is to not specify a particular default channel conversion; instead, simply specify that there must be one. If one has to be specified then Blumlein Crossed Pairs is the simpler.