Ambisonics: Difference between revisions

From XiphWiki
(Deleted {{delete}}. See Talk.)
(Created page)
'''Ambisonics''' is a surround sound system first developed in the 1970s.  Its
main difference from other surround techniques is that it separates transmission
channels from speaker feeds: the speaker feeds are derived by a decoder
situated at the listening end (for example, in the living room).  Decoders can
be implemented in either hardware or software.  Typically more speakers are used
than transmission channels, and the more speakers used, the more stable the
resulting soundfield.  Speakers can be arranged in a number of configurations,
regular polygons being the most popular.


== Resources on Ambisonics ==
*There is now a set of [http://en.wikipedia.org/wiki/Ambisonic Wikipedia articles on Ambisonics].
*Of particular relevance is the [http://www.ambisonicbootlegs.net/Members/mleese/file-format-for-b-format/ ".amb" specification] for downloadable B-Format files.  However, the ".amb" spec has some limitations that it would be useful to overcome.
*[http://members.tripod.com/martin_leese/Ambisonic/ This website] has many pages on Ambisonics (including at the bottom links to other Ambisonic websites).
== Limitations of the ".amb" specification ==
The [http://www.ambisonicbootlegs.net/Members/mleese/file-format-for-b-format/ ".amb" specification]
for downloadable B-Format files is based on the WAVE-EX format.  There are
currently about 75 pieces available in this format for free download. Most
of these are full-sphere soundfields.  Some of the specification's
limitations are:
#It is limited to 4 GBytes (2 GBytes if the software handling the file treats the 32-bit size field as signed).
#It is limited to third-order soundfields and below.  While third-order looks like a lot (16 channels), there already exists a prototype mic that can record up to fourth-order.
#It offers no compression (in particular, no lossless compression).
#There is no flag to indicate whether the W channel has been attenuated by 3 dB.  With such a flag, the attenuation would become optional instead of mandatory.
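
To give a feel for the size limit, here is a rough sketch of how much audio fits in a file of a given size.  The sample rate (48 kHz) and sample width (24 bits) are illustrative assumptions, not values taken from the ".amb" specification, and header overhead is ignored.

```python
# Rough playing time that fits under a file size limit.
# Sample rate and bytes per sample are illustrative assumptions.

def max_seconds(channels, sample_rate=48000, bytes_per_sample=3,
                limit_bytes=2**32):
    """Approximate audio duration that fits in limit_bytes (headers ignored)."""
    bytes_per_second = channels * sample_rate * bytes_per_sample
    return limit_bytes / bytes_per_second

if __name__ == "__main__":
    # 4 channels = first-order, 9 = second-order, 16 = third-order full-sphere
    for channels in (4, 9, 16):
        minutes = max_seconds(channels) / 60
        print(f"{channels:2d} channels: about {minutes:.0f} minutes")
```

Under these assumptions a third-order full-sphere recording reaches the 4 GByte limit after roughly half an hour, and after half that time if the limit is 2 GBytes.
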
=== Malham notation ===
The order of a B-Format soundfield can be specified using
''Malham notation''.  This uses a string of characters, each character
being either '''f''' (for full-sphere) or '''h''' (for horizontal).  The
first character in the string specifies the type of the first-order
components, the second character the type of the second-order components,
etc.
Malham notation is not used in the ".amb" specification.  Instead, the
number of channels uniquely defines the soundfield order.  Unfortunately
this simple and elegant scheme does not work above third-order, as
ambiguities creep in: nine channels, for example, could be either '''ff'''
or '''hhhh'''.  A more general file format will have to use
something else, such as Malham notation.
Here are some examples of Malham notation: 
#'''h''' - first-order horizontal (3 channels)
#'''f''' - first-order full-sphere (4 channels)
#'''hh''' - second-order horizontal (5 channels)
#'''fh''' - second-order horizontal + first-order height (6 channels)
#'''fff''' - third-order full-sphere (16 channels)
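
The channel counts above can be reproduced with a small sketch.  It assumes that every soundfield includes the W channel, that a full-sphere order ''m'' contributes 2''m''+1 channels, and that a horizontal order contributes 2 channels; these rules are inferred from the examples listed above, and the function name <code>malham_channels</code> is made up for illustration.

```python
def malham_channels(notation):
    """Count B-Format channels for a Malham string such as 'fh' or 'fff'."""
    channels = 1  # W, the zeroth-order (omnidirectional) component
    for order, kind in enumerate(notation, start=1):
        if kind == "f":
            channels += 2 * order + 1  # every component of this order
        elif kind == "h":
            channels += 2  # only the two horizontal components
        else:
            raise ValueError(f"unexpected character {kind!r} in {notation!r}")
    return channels

if __name__ == "__main__":
    for s in ("h", "f", "hh", "fh", "fff"):
        print(s, malham_channels(s))
    # Once fourth order is allowed, the channel count alone is ambiguous:
    print(malham_channels("hhhh"), malham_channels("ff"))
```

The last line shows why the channel count stops being a unique identifier once fourth-order is allowed: '''hhhh''' and '''ff''' both come out at the same number of channels.
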
== Channel conversions ==
Converting a B-Format file to a mono file is straightforward.  Use Mono =
W*sqrt(2).
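
As a minimal sketch, assuming the W channel is available as a plain list of float samples and carries the -3 dB attenuation mentioned above (the function name is hypothetical):

```python
import math

def b_format_to_mono(w):
    """Scale each W sample by sqrt(2), which undoes a -3 dB attenuation."""
    gain = math.sqrt(2)
    return [sample * gain for sample in w]
```

The X, Y and Z channels are simply ignored, since W alone carries the omnidirectional signal.
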
Converting a B-Format file to a stereo file is more difficult.  The "proper"
way to do this is to convert the W,X,Y channels to two-channel UHJ. 
Unfortunately this requires the use of 90-degree wide-band phase shifters. 
In the digital domain these are usually implemented as convolution filters.
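
To illustrate what such a convolution filter can look like, here is a sketch of a windowed FIR approximation of a 90-degree phase shifter (a Hilbert transformer).  The filter length and the Hamming window are arbitrary choices made for this example, and a real UHJ encoder may well use a different design.

```python
import math

def hilbert_taps(half_length=31):
    """Hamming-windowed FIR approximation of a 90-degree phase shifter."""
    taps = []
    for k in range(-half_length, half_length + 1):
        # Ideal response: 2/(pi*k) for odd k, zero for even k
        ideal = 0.0 if k % 2 == 0 else 2.0 / (math.pi * k)
        window = 0.54 + 0.46 * math.cos(math.pi * k / half_length)
        taps.append(ideal * window)
    return taps

def convolve_same(signal, taps):
    """Direct convolution, shifted so the output lines up with the input."""
    half = len(taps) // 2
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for j, tap in enumerate(taps):
            i = n - (j - half)  # input index for tap offset k = j - half
            if 0 <= i < len(signal):
                acc += tap * signal[i]
        out.append(acc)
    return out
```

Away from the ends of the signal and from the band edges, feeding a cosine through this filter gives approximately a sine of the same frequency, that is, the same tone delayed by 90 degrees of phase.
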
If 90-degree phase shifters are unavailable, the problem becomes one of
choice.  Starting from B-Format, it is possible to synthesize ''any'' mic
response pointing in ''any'' direction.  Hence, it is possible to synthesize
any ''coincident'' stereo mic technique.  Here are two popular stereo
techniques.
 
=== Blumlein Mid-Side ===
Mid = (W*sqrt(2)) + X ''this is a cardioid response pointing forward''<br>
Left = Mid + Y<br>
Right = Mid - Y
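
A minimal sketch of these formulas, assuming W, X and Y are equal-length lists of float samples and that Y is positive for sounds arriving from the left (the function name is hypothetical):

```python
import math

def mid_side_stereo(w, x, y):
    """Blumlein Mid-Side: forward cardioid Mid, plus or minus the Y signal."""
    left, right = [], []
    for ws, xs, ys in zip(w, x, y):
        mid = ws * math.sqrt(2) + xs  # cardioid response pointing forward
        left.append(mid + ys)
        right.append(mid - ys)
    return left, right
```

With these conventions a source directly to the left ends up only in the Left channel, and a source directly behind is cancelled by the cardioid.
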
=== Blumlein Crossed Pairs ===
Left = (X + Y)/sqrt(2)<br>
Right = (X - Y)/sqrt(2)
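
The same formulas as a sketch, again assuming X and Y are equal-length lists of float samples with Y positive to the left (the function name is hypothetical):

```python
import math

def crossed_pairs_stereo(x, y):
    """Blumlein Crossed Pairs: two figure-of-eight responses at +/-45 degrees."""
    scale = 1 / math.sqrt(2)
    left = [(xs + ys) * scale for xs, ys in zip(x, y)]
    right = [(xs - ys) * scale for xs, ys in zip(x, y)]
    return left, right
```

A source 45 degrees to the left (X and Y equal) lands entirely in the Left channel.  Note that W is not used at all.
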
Which conversion to stereo is better depends on the material and how it was
recorded.  A good suggestion is to not specify a particular default channel
conversion; instead, simply specify that there must be one.  If one has to
be specified then Blumlein Crossed Pairs is the simpler.

Revision as of 00:10, 28 January 2007
