<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.xiph.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Bemasc</id>
	<title>XiphWiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.xiph.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Bemasc"/>
	<link rel="alternate" type="text/html" href="https://wiki.xiph.org/Special:Contributions/Bemasc"/>
	<updated>2026-06-09T03:06:09Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.45.1</generator>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Daala_Quickstart&amp;diff=15539</id>
		<title>Daala Quickstart</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Daala_Quickstart&amp;diff=15539"/>
		<updated>2015-03-21T16:27:44Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Note that SDL 2 won&amp;#039;t do&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This is a guide to getting a copy of the latest code and encoding a video.&lt;br /&gt;
&lt;br /&gt;
There is also a &#039;&#039;&#039;[[Daala Quickstart Windows]]&#039;&#039;&#039; page.&lt;br /&gt;
&lt;br /&gt;
== Installation ==&lt;br /&gt;
&lt;br /&gt;
=== Linux ===&lt;br /&gt;
You&#039;ll need:&lt;br /&gt;
&lt;br /&gt;
* Standard build tools (autoconf, automake v1.11 or later, libtool, pkg-config, and a C compiler)&lt;br /&gt;
* git&lt;br /&gt;
* libogg (v1.3 or later)&lt;br /&gt;
* libpng&lt;br /&gt;
* libjpeg&lt;br /&gt;
* libcheck (v0.9.8 or later, can be skipped if you pass --disable-unit-tests to ./configure)&lt;br /&gt;
* libsdl (1.2, not 2!) (can be skipped if you pass --disable-player to ./configure)&lt;br /&gt;
&lt;br /&gt;
Do not use linuxbrew.&lt;br /&gt;
&lt;br /&gt;
Instructions for installing these packages are OS-specific (feel free to contribute some here, especially if you tried installing these somewhere and ran into difficulties; you will likely save other people some pain). If you have a package manager that has separate -dev versions with the public headers, make sure you install those in addition to the actual libraries.&lt;br /&gt;
&lt;br /&gt;
==== Mac OS X ====&lt;br /&gt;
Install Apple&#039;s command line developer tools. E.g. install [https://developer.apple.com/xcode/ Xcode] from the App Store and select &#039;Command Line Tools&#039; from the Preferences::Downloads panel, or download and install the pkg directly from [https://developer.apple.com/downloads/ developer.apple.com].&lt;br /&gt;
&lt;br /&gt;
Install [http://brew.sh/ Homebrew]&lt;br /&gt;
&lt;br /&gt;
Run the following command to install dependencies:&lt;br /&gt;
  brew install autoconf automake libtool libogg libpng libjpeg check sdl&lt;br /&gt;
&lt;br /&gt;
=== Installation Procedure ===&lt;br /&gt;
&lt;br /&gt;
Just run these commands:&lt;br /&gt;
&lt;br /&gt;
    git clone https://git.xiph.org/daala.git&lt;br /&gt;
    cd daala&lt;br /&gt;
    ./autogen.sh&lt;br /&gt;
    ./configure&lt;br /&gt;
    make&lt;br /&gt;
&lt;br /&gt;
Note that the git clone can take several minutes to complete.&lt;br /&gt;
&lt;br /&gt;
And optionally&lt;br /&gt;
&lt;br /&gt;
    make tools&lt;br /&gt;
&lt;br /&gt;
Make sure you run the git clone operation on the same machine where you intend to use the code. Checking out a copy on Windows and then trying to use it on Linux will not work, as executable permissions and line-endings will not be set properly.&lt;br /&gt;
&lt;br /&gt;
== Encoding a Video ==&lt;br /&gt;
&lt;br /&gt;
If you do not have one, get a sample video or two in .y4m format from [https://media.xiph.org/video/derf/ media.xiph.org]. These videos are relatively large and will take a long time to encode. There are also subsets of 1 second long videos for faster encoding:&lt;br /&gt;
* [https://people.xiph.org/~tdaede/video-1-short/ video-1-short]&lt;br /&gt;
&lt;br /&gt;
We also maintain a set of still-image collections in .y4m format:&lt;br /&gt;
* [https://people.xiph.org/~tterribe/daala/subset1-y4m.tar.gz Subset 1] (50 images, small training set)&lt;br /&gt;
* [https://people.xiph.org/~tterribe/daala/subset2-y4m.tar.gz Subset 2] (50 images, small testing set)&lt;br /&gt;
* [https://people.xiph.org/~tterribe/daala/subset3-y4m.tar.gz Subset 3] (1000 images, large training set)&lt;br /&gt;
* [https://people.xiph.org/~tterribe/daala/subset4-y4m.tar.gz Subset 4] (1000 images, large testing set)&lt;br /&gt;
&lt;br /&gt;
Encode the video:&lt;br /&gt;
&lt;br /&gt;
    ./examples/encoder_example -v 30 video.y4m -o video.ogv&lt;br /&gt;
&lt;br /&gt;
where&lt;br /&gt;
* video.y4m is the input video you want to encode,&lt;br /&gt;
* video.ogv is the name of the encoded video file to output,&lt;br /&gt;
* -v specifies the quality (currently from 0 to 511, where 0 is lossless)&lt;br /&gt;
&lt;br /&gt;
== Decoding/Playing a Video ==&lt;br /&gt;
&lt;br /&gt;
Play the video in a window:&lt;br /&gt;
&lt;br /&gt;
    ./examples/player_example video.ogv&lt;br /&gt;
&lt;br /&gt;
For information on the controls available while playing, run&lt;br /&gt;
&lt;br /&gt;
    ./examples/player_example --help&lt;br /&gt;
&lt;br /&gt;
If you want to use a different player, you can decode the video back to .y4m with&lt;br /&gt;
&lt;br /&gt;
    ./examples/dump_video video.ogv -o decoded_video.y4m&lt;br /&gt;
&lt;br /&gt;
Many other players can play back these .y4m files, and other tools can convert them to various other formats.&lt;br /&gt;
&lt;br /&gt;
== Using PNG Images ==&lt;br /&gt;
&lt;br /&gt;
To encode a series of images:&lt;br /&gt;
&lt;br /&gt;
    make tools&lt;br /&gt;
    ./tools/png2y4m video%05d.png -o video.y4m&lt;br /&gt;
&lt;br /&gt;
where %05d means your input images are named video00000.png, video00001.png, etc. You can leave out the %05d tag if you only want to convert a single image (which does not need to be numbered).&lt;br /&gt;
&lt;br /&gt;
To convert a y4m back to PNGs:&lt;br /&gt;
&lt;br /&gt;
    ./tools/y4m2png video.y4m -o video%05d.png&lt;br /&gt;
&lt;br /&gt;
If you are converting a .y4m file that only contains a single frame (e.g., from one of the still-image subsets linked above), you can leave out the %05d tag. Conversion from PNG to Y4M uses the Rec 709 matrix with video levels, a box filter for chroma subsampling, and a triangular dither. Conversion back from Y4M to PNG uses the same matrix, levels, and box filter, but does not dither.&lt;br /&gt;
&lt;br /&gt;
== Creating y4m from other formats ==&lt;br /&gt;
&lt;br /&gt;
You can use the ffmpeg tool to generate y4m from any of its supported video formats:&lt;br /&gt;
&lt;br /&gt;
    ffmpeg -i video.webm -pix_fmt yuv420p video.y4m&lt;br /&gt;
&lt;br /&gt;
Note that ffmpeg is optimized for speed. You may not get repeatable results across machines.&lt;br /&gt;
&lt;br /&gt;
[[Category:Daala]]&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13944</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13944"/>
		<updated>2013-02-26T05:59:56Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: /* Epilogue */ typo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;small&amp;gt;&#039;&#039;Wiki edition&#039;&#039;&amp;lt;/small&amp;gt;&lt;br /&gt;
[[Image:dsat_001.jpg|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Continuing in the &amp;quot;firehose&amp;quot; tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org&#039;s second video on digital media explores multiple facets of digital audio signals and how they &#039;&#039;really&#039;&#039; behave in the real world.&lt;br /&gt;
&lt;br /&gt;
Demonstrations of sampling, quantization, bit-depth, and dither put digital audio through its paces on consumer-grade audio equipment using both modern digital analysis and vintage analog equipment (Just in case we can&#039;t trust those newfangled digital gizmos). You can download the demo application source code and try it all for yourself!&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;center&amp;gt;&amp;lt;font size=&amp;quot;+2&amp;quot;&amp;gt;[http://www.xiph.org/video/vid2.shtml Download or Watch online]&amp;lt;/font&amp;gt;&amp;lt;/center&amp;gt;&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
Players supporting WEBM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox ], [http://www.chromium.org/Home Chrome ], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]&lt;br /&gt;
&lt;br /&gt;
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]&lt;br /&gt;
&lt;br /&gt;
If you&#039;re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.&lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;hr/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
[[Image:Xiph_ep02_test.png|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest, let&#039;s&lt;br /&gt;
take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.&lt;br /&gt;
&lt;br /&gt;
==Veritas ex machina==&lt;br /&gt;
[[Image:Dsat_002.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_003.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_004.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_005.jpg|200px|right]]&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
First up, we need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],&lt;br /&gt;
like this Tektronix 2246 from the mid-90s, one of the last and very best analog scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an&lt;br /&gt;
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1 [[WikiPedia:Hertz#SI_multiples|kHz]]&lt;br /&gt;
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].&lt;br /&gt;
We see the sine wave on the oscilloscope, can verify that it is indeed&lt;br /&gt;
1 kHz at 1 Volt RMS, which is 2.8 Volts&lt;br /&gt;
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
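&lt;br /&gt;
The RMS and peak-to-peak readings above are related by a fixed factor for a pure sine wave. A minimal Python sketch of that arithmetic follows; the function name is made up for illustration and the relation holds only for an undistorted sine.&lt;br /&gt;
&lt;br /&gt;
```python
import math


def rms_to_peak_to_peak(v_rms):
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    # A sine of peak amplitude A has an RMS value of A / sqrt(2),
    # and its peak-to-peak swing is 2 * A.
    return 2.0 * math.sqrt(2.0) * v_rms


print(round(rms_to_peak_to_peak(1.0), 2))  # prints 2.83
```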
&lt;br /&gt;
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]&lt;br /&gt;
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],&lt;br /&gt;
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below&lt;br /&gt;
[[WikiPedia:Fundamental_frequency|the fundamental]].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],&lt;br /&gt;
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],&lt;br /&gt;
[[WikiPedia:Jitter#Sampling_jitter|jitter]],&lt;br /&gt;
[[WikiPedia:Noise_floor|noise behavior]],&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Stairsteps==&lt;br /&gt;
[[Image:Dsat 006.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat 007.png|360px|right]]&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
We digitize our signal to&lt;br /&gt;
[[Videos/A_Digital_Media_Primer_For_Geeks#Raw_.28digital_audio.29_meat|16 bit PCM at 44.1kHz]],&lt;br /&gt;
same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
[[WikiPedia:High_impedance|high-impedance input]] being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
OK, 1 kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to [[WikiPedia:Nyquist_frequency|Nyquist]], say 15kHz.&lt;br /&gt;
&lt;br /&gt;
Now the sine wave is represented by less than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point]]. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
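&lt;br /&gt;
One way to see this numerically is Whittaker-Shannon (sinc) interpolation, the textbook reconstruction formula for bandlimited signals. The sketch below is an illustration under stated assumptions, not the demo application code: it uses only a finite run of 1001 samples, so the result is approximate, and all names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import math

SAMPLE_RATE = 44100.0  # Hz, the rate used in the demo
TONE = 1000.0          # Hz, the test sine wave


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, t):
    """Sinc interpolation at time t, where t is measured in sample periods."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))


def tone(t):
    """The original continuous sine wave, with t in sample periods."""
    return math.sin(2.0 * math.pi * TONE * t / SAMPLE_RATE)


samples = [tone(n) for n in range(1001)]
t_mid = 500.5  # halfway between two sample points
error = abs(reconstruct(samples, t_mid) - tone(t_mid))
print(error)  # small; it shrinks further as more samples are included
```
&lt;br /&gt;
The residual error here comes from truncating the sum to a finite window, not from the sampling itself.&lt;br /&gt;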
&lt;br /&gt;
[[Image:Dsat 008.png|360px|right]]&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
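&lt;br /&gt;
As a rough illustration, a zero-order hold can be sketched in a few lines of Python. The function name and the integer hold factor are assumptions made for this example; a real converter holds a voltage in continuous time and then filters it.&lt;br /&gt;
&lt;br /&gt;
```python
def zero_order_hold(samples, factor):
    """Repeat each sample `factor` times, holding its value until the next one."""
    held = []
    for value in samples:
        held.extend([value] * factor)
    return held


print(zero_order_hold([0.0, 0.7, 1.0, 0.7], 3))
# prints [0.0, 0.0, 0.0, 0.7, 0.7, 0.7, 1.0, 1.0, 1.0, 0.7, 0.7, 0.7]
```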
&lt;br /&gt;
So, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion]] is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].&lt;br /&gt;
&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==Bit-depth==&lt;br /&gt;
[[Image:Dsat_009.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat_010.jpg|260px|right]]&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
&#039;&#039;also&#039;&#039; smooth regardless of the [[WikiPedia:Audio_bit_depth|bit depth]].  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
[[WikiPedia:Dither|dither]] down to 8 bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the 8-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we [[WikiPedia:Quantization_(sound_processing)|quantize]] it,&lt;br /&gt;
and [[WikiPedia:Quantization_error|quantization adds noise]].&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor.&lt;br /&gt;
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our 8-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a [[WikiPedia:Dither#Different_types|gaussian dither]] then it&#039;s&lt;br /&gt;
[[WikiPedia:Central_limit_theorem|mathematically equivalent]] in every way. It &#039;&#039;is&#039;&#039; tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of [[WikiPedia:Magnetic_tape_sound_recording|magnetic audio tape]]&lt;br /&gt;
in [[WikiPedia:Shannon–Hartley_theorem#Examples|bits instead of decibels]], in order to put things in a&lt;br /&gt;
digital perspective.  [[WikiPedia:Compact cassettes|Compact cassettes]] (for those of you who are old enough to remember them) could reach as&lt;br /&gt;
deep as 9 bits in perfect conditions, though 5 to 6 bits was&lt;br /&gt;
more typical, especially if it was a recording made on a&lt;br /&gt;
[[WikiPedia:Cassette_deck|tape deck]]. That&#039;s right... your mix tapes were only about 6 bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
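&lt;br /&gt;
To move between the two units, the usual rule of thumb is about 6 decibels per bit, since each extra bit doubles the number of amplitude steps. The Python sketch below shows only that conversion; the function names are hypothetical, and it ignores the small constant offsets that depend on the test signal and the dither, so treat the numbers as approximate.&lt;br /&gt;
&lt;br /&gt;
```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)  # about 6.02 dB


def bits_to_db(bits):
    """Approximate dynamic range in decibels spanned by the given bit depth."""
    return DB_PER_BIT * bits


def db_to_bits(db):
    """Approximate bit depth equivalent to a dynamic range in decibels."""
    return db / DB_PER_BIT


for bits in (6, 8, 13, 16):
    print(bits, round(bits_to_db(bits), 1))
```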
&lt;br /&gt;
The very best professional [[WikiPedia:Reel-to-reel_audio_tape_recording|open reel tape]] used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits &#039;&#039;with&#039;&#039; [[WikiPedia:Reel-to-reel_audio_tape_recording#Noise_reduction|advanced noise reduction]].  And&lt;br /&gt;
that&#039;s why seeing &#039;[[WikiPedia:SPARS_code|D D D]]&#039; on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==Dither==&lt;br /&gt;
[[Image:Dsat_011.png|360px|right]]&lt;br /&gt;
[[Image:Dsat_012.gif|360px|right]]&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with [[Wikipedia:dither|dither]], so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  [[WikiPedia:Rounding|Obvious]],&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[show/attribute the dither paper]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t [[WikiPedia:Sound_masking|drown out or mask]]&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s &#039;&#039;watch&#039;&#039; what dither does.  The signal generator has too much noise for this test so we&#039;ll produce a mathematically perfect sine wave with the ThinkPad and quantize it to 8 bits with dithering.&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display and output scope and, once the analog spectrum analyzer catches up...&lt;br /&gt;
a clean frequency peak with a uniform noise floor on both spectral displays&lt;br /&gt;
just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off.&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At 8 bits this effect is exaggerated. At 16 bits,&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and ... there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
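&lt;br /&gt;
The same effect can be reproduced in a short Python sketch. This is an illustration, not the demo application code: it assumes triangular (TPDF) dither spanning two quantization steps peak to peak, works in units of one step, and uses made-up names throughout. A sine with a peak of 1/4 step vanishes without dither and reappears, buried in noise, with it.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random


def quantize(x, dither=False, rng=random):
    """Round x, given in units of one quantization step, to the nearest step."""
    if dither:
        # Triangular (TPDF) dither: the sum of two uniform values,
        # each spanning one quantization step.
        x += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return math.floor(x + 0.5)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
reference = [math.sin(2.0 * math.pi * 50.0 * i / n) for i in range(n)]
signal = [0.25 * r for r in reference]  # peak amplitude of 1/4 step

plain = [quantize(s) for s in signal]
dithered = [quantize(s, dither=True, rng=rng) for s in signal]


def amplitude_estimate(output):
    """Least-squares estimate of how much of the reference sine is in output."""
    return sum(o * r for o, r in zip(output, reference)) / sum(r * r for r in reference)


print(amplitude_estimate(plain))     # 0.0, every sample rounded to zero
print(amplitude_estimate(dithered))  # close to 0.25, the original amplitude
```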
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as [http://www.acoustics.salford.ac.uk/res/cox/sound_quality/?content=subjective inoffensive] and&lt;br /&gt;
[[WikiPedia:Absolute_threshold_of_hearing|difficult to notice]]&lt;br /&gt;
as possible.&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
We can [[WikiPedia:Noise_shaping|shape dithering noise]] away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
Lastly, dithered quantization noise &#039;&#039;is&#039;&#039; higher [[WikiPedia:Sound_power|power]] overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a [[WikiPedia:VU_meter|VU meter]] during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll also [[WikiPedia:Amplitude_modulation|modulate the input signal]] like this to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise.&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below [[WikiPedia:Full_scale|full scale]].  Maybe&lt;br /&gt;
if the CD had been&lt;br /&gt;
[http://www.research.philips.com/technologies/projects/cd/index.html 14 bits as originally designed],&lt;br /&gt;
dither &#039;&#039;might&#039;&#039; be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==Bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using [[WikiPedia:Sine_wave|sine waves]]. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a [[WikiPedia:Square_wave|square wave]]...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows..&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also [[WikiPedia:Fourier_series|the sum of discrete frequencies]],&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of [[WikiPedia:Even_and_odd_functions#Harmonics|odd harmonics]].  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp [[WikiPedia:Low-pass_filter|anti-aliasing filter]] that cuts off right&lt;br /&gt;
above 20kHz, so our signal is [[WikiPedia:Bandlimiting|bandlimited]], which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
..and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the [[WikiPedia:Gibbs_phenomenon|Gibbs effect]]. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of [[WikiPedia:Sonic_artifact|artifact]] that&#039;s added by anti-aliasing and [[WikiPedia:Reconstruction_filter|anti-imaging]]&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal &#039;&#039;is&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
[[WikiPedia:Interpolation|connect-the-dots]].&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples.. implying that [[WikiPedia:Dirac_delta_function|impulses]] or&lt;br /&gt;
[[WikiPedia:Synthesizer#ADSR_envelope|fast attacks]] have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
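&lt;br /&gt;
As a rough sketch of why timing is not tied to the sample grid, the Python below delays a 1kHz sine by a fraction of a sample and then recovers that delay from the sample values alone.&lt;br /&gt;
It uses a steady sine rather than the square wave edge from the demo, and the 0.3 sample delay and 441 sample buffer are arbitrary choices made for this illustration.&lt;br /&gt;
&lt;br /&gt;
```python
import math

FS = 44100.0         # sample rate in Hz, same as a CD
F0 = 1000.0          # tone frequency in Hz
N = 441              # exactly 10 cycles of 1 kHz at 44.1 kHz
TRUE_DELAY = 0.3     # delay in samples, deliberately not a whole number

omega = 2.0 * math.pi * F0 / FS      # radians advanced per sample
samples = [math.sin(omega * (n - TRUE_DELAY)) for n in range(N)]

# Project the samples onto a reference sine and cosine to recover the phase.
in_phase = sum(x * math.sin(omega * n) for n, x in enumerate(samples))
quadrature = sum(x * math.cos(omega * n) for n, x in enumerate(samples))

# The phase lag divided by radians-per-sample is the delay in samples.
estimated_delay = math.atan2(-quadrature, in_phase) / omega

print(round(estimated_delay, 6))  # prints 0.3
```
&lt;br /&gt;
The delay comes back even though no sample lands on the shifted zero crossing; the position between samples is encoded in the sample values themselves.&lt;br /&gt;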
&lt;br /&gt;
==Epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in [[Videos/A_Digital_Media_Primer_For_Geeks|the previous episode]], we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the&lt;br /&gt;
[http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-003-signals-and-systems-spring-2010/index.htm 6.003]&lt;br /&gt;
and&lt;br /&gt;
[http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-007-electromagnetic-energy-from-motors-to-lasers-spring-2011 6.007]&lt;br /&gt;
Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13943</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13943"/>
		<updated>2013-02-26T05:58:31Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: /* Dither */ Link for &amp;quot;inoffensive&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;small&amp;gt;&#039;&#039;Wiki edition&#039;&#039;&amp;lt;/small&amp;gt;&lt;br /&gt;
[[Image:dsat_001.jpg|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Continuing in the &amp;quot;firehose&amp;quot; tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org&#039;s second video on digital media explores multiple facets of digital audio signals and how they &#039;&#039;really&#039;&#039; behave in the real world.&lt;br /&gt;
&lt;br /&gt;
Demonstrations of sampling, quantization, bit-depth, and dither put digital audio through its paces on consumer-grade audio equipment using both modern digital analysis and vintage analog equipment (just in case we can&#039;t trust those newfangled digital gizmos). You can download the demo application source code and try it all for yourself!&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;center&amp;gt;&amp;lt;font size=&amp;quot;+2&amp;quot;&amp;gt;[http://www.xiph.org/video/vid2.shtml Download or Watch online]&amp;lt;/font&amp;gt;&amp;lt;/center&amp;gt;&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
Players supporting WebM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox], [http://www.chromium.org/Home Chrome], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]&lt;br /&gt;
&lt;br /&gt;
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]&lt;br /&gt;
&lt;br /&gt;
If you&#039;re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.&lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;hr/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
[[Image:Xiph_ep02_test.png|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest, let&#039;s&lt;br /&gt;
take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.&lt;br /&gt;
&lt;br /&gt;
==Veritas ex machina==&lt;br /&gt;
[[Image:Dsat_002.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_003.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_004.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_005.jpg|200px|right]]&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
First up, we need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],&lt;br /&gt;
like this Tektronix 2246 from the mid-90s, one of the last and very best analog scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an&lt;br /&gt;
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1 [[WikiPedia:Hertz#SI_multiples|kHz]]&lt;br /&gt;
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].&lt;br /&gt;
We see the sine wave on the oscilloscope and can verify that it is indeed&lt;br /&gt;
1 kHz at 1 Volt RMS, which is 2.8 Volts&lt;br /&gt;
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
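&lt;br /&gt;
The 2.8 Volt figure follows directly from the definition of RMS for a sine wave. A quick check in Python (the function name is made up for this example):&lt;br /&gt;
&lt;br /&gt;
```python
import math

def rms_to_peak_to_peak(v_rms):
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    # A sine of peak amplitude A has an RMS value of A / sqrt(2),
    # and its peak-to-peak swing is 2 * A.
    return 2.0 * math.sqrt(2.0) * v_rms

v_pp = rms_to_peak_to_peak(1.0)
print(round(v_pp, 2))  # prints 2.83
```
&lt;br /&gt;
This conversion only holds for a sine wave; other waveforms have a different ratio between RMS and peak values.&lt;br /&gt;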
&lt;br /&gt;
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]&lt;br /&gt;
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],&lt;br /&gt;
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below&lt;br /&gt;
[[WikiPedia:Fundamental_frequency|the fundamental]].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],&lt;br /&gt;
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],&lt;br /&gt;
[[WikiPedia:Jitter#Sampling_jitter|jitter]],&lt;br /&gt;
[[WikiPedia:Noise_floor|noise behavior]],&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Stairsteps==&lt;br /&gt;
[[Image:Dsat 006.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat 007.png|360px|right]]&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
We digitize our signal to&lt;br /&gt;
[[Videos/A_Digital_Media_Primer_For_Geeks#Raw_.28digital_audio.29_meat|16 bit PCM at 44.1kHz]],&lt;br /&gt;
same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
[[WikiPedia:High_impedance|high-impedance input]] being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
OK, 1 kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to [[WikiPedia:Nyquist_frequency|Nyquist]], say 15kHz.&lt;br /&gt;
&lt;br /&gt;
Now the sine wave is represented by less than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point]]. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
&lt;br /&gt;
[[Image:Dsat 008.png|360px|right]]&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
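&lt;br /&gt;
One way to see the unique bandlimited solution numerically is sinc interpolation, the textbook reconstruction formula behind the sampling theorem.&lt;br /&gt;
The Python below is a sketch only: it truncates the infinite sum to a window of samples chosen for this example, so it agrees with the original to a few decimal places rather than exactly, and a real converter uses an analog or oversampled filter instead of this direct sum.&lt;br /&gt;
&lt;br /&gt;
```python
import math

FS = 44100.0   # sample rate in Hz (CD rate, as in the text)
F0 = 15000.0   # tone frequency in Hz, below the 22050 Hz Nyquist limit

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def original(t):
    """The continuous-time input signal at time t (seconds)."""
    return math.sin(2.0 * math.pi * F0 * t)

def reconstruct(t, half_width=5000):
    """Approximate the bandlimited signal through the samples at time t.

    The exact formula sums over every sample; this sketch keeps only
    2 * half_width + 1 samples centred on t = 0, so it is meant for
    times near zero and carries a small truncation error.
    """
    total = 0.0
    for n in range(-half_width, half_width + 1):
        sample = original(n / FS)      # sample values are the only input data
        total += sample * sinc(t * FS - n)
    return total

t_between = 0.5 / FS                   # halfway between two sample points
residual = abs(reconstruct(t_between) - original(t_between))
print(residual)                        # small, limited by the truncated sum
```
&lt;br /&gt;
Widening the window shrinks the residual further, which is the numerical face of the claim that only one bandlimited signal fits the samples.&lt;br /&gt;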
&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
&lt;br /&gt;
So, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion]] is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
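&lt;br /&gt;
A zero-order hold is simple enough to write in a few lines. This Python sketch (the function name and the three example samples are made up here) produces the stairstep that such diagrams show, which is the intermediate signal before any reconstruction filtering:&lt;br /&gt;
&lt;br /&gt;
```python
def zero_order_hold(samples, points_per_sample):
    """Stretch each sample value across one sample period (a stairstep)."""
    held = []
    for value in samples:
        # Repeat the same value until the next sample arrives.
        held.extend([value] * points_per_sample)
    return held

print(zero_order_hold([0.0, 0.7, 1.0], 3))
# prints [0.0, 0.0, 0.0, 0.7, 0.7, 0.7, 1.0, 1.0, 1.0]
```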
&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].&lt;br /&gt;
&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==Bit-depth==&lt;br /&gt;
[[Image:Dsat_009.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat_010.jpg|260px|right]]&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
&#039;&#039;also&#039;&#039; smooth regardless of the [[WikiPedia:Audio_bit_depth|bit depth]].  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
[[WikiPedia:Dither|dither]] down to 8 bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the 8-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we [[WikiPedia:Quantization_(sound_processing)|quantize]] it,&lt;br /&gt;
and [[WikiPedia:Quantization_error|quantization adds noise]].&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor.&lt;br /&gt;
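&lt;br /&gt;
To put rough numbers on that, the Python below computes the textbook signal-to-noise ratio of a full-scale sine wave against ideal uniform quantization noise.&lt;br /&gt;
It assumes plain rounding with a step-squared-over-twelve noise power and no dither; dither raises the floor by a few decibels, so treat these as ballpark figures rather than measurements of the demo hardware.&lt;br /&gt;
&lt;br /&gt;
```python
import math

def ideal_sine_snr_db(bits):
    """SNR in dB of a full-scale sine against ideal quantization noise."""
    step = 2.0 / (2 ** bits)             # step size for a range of -1 to +1
    noise_power = step * step / 12.0     # uniform rounding error power
    signal_power = 0.5                   # power of a sine with peak amplitude 1
    return 10.0 * math.log10(signal_power / noise_power)

for bits in (8, 16, 24):
    print(bits, round(ideal_sine_snr_db(bits), 1))
# prints:
# 8 49.9
# 16 98.1
# 24 146.3
```
&lt;br /&gt;
Each extra bit lowers the noise floor by about 6 decibels, which is why 8 bits is audibly hissy and 16 bits is not.&lt;br /&gt;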
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our 8-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a [[WikiPedia:Dither#Different_types|gaussian dither]] then it&#039;s&lt;br /&gt;
[[WikiPedia:Central_limit_theorem|mathematically equivalent]] in every way. It &#039;&#039;is&#039;&#039; tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of [[WikiPedia:Magnetic_tape_sound_recording|magnetic audio tape]]&lt;br /&gt;
in [[WikiPedia:Shannon–Hartley_theorem#Examples|bits instead of decibels]], in order to put things in a&lt;br /&gt;
digital perspective.  [[WikiPedia:Compact cassettes|Compact cassettes]] (for those of you who are old enough to remember them) could reach as&lt;br /&gt;
deep as 9 bits in perfect conditions, though 5 to 6 bits was&lt;br /&gt;
more typical, especially if it was a recording made on a&lt;br /&gt;
[[WikiPedia:Cassette_deck|tape deck]]. That&#039;s right... your mix tapes were only about 6 bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional [[WikiPedia:Reel-to-reel_audio_tape_recording|open reel tape]] used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits &#039;&#039;with&#039;&#039; [[WikiPedia:Reel-to-reel_audio_tape_recording#Noise_reduction|advanced noise reduction]].  And&lt;br /&gt;
that&#039;s why seeing &#039;[[WikiPedia:SPARS_code|D D D]]&#039; on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==Dither==&lt;br /&gt;
[[Image:Dsat_011.png|360px|right]]&lt;br /&gt;
[[Image:Dsat_012.gif|360px|right]]&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with [[WikiPedia:Dither|dither]], so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  [[WikiPedia:Rounding|Obvious]],&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[show/attribute the dither paper]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t [[WikiPedia:Sound_masking|drown out or mask]]&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s &#039;&#039;watch&#039;&#039; what dither does.  The signal generator has too much noise for this test so we&#039;ll produce a mathematically perfect sine wave with the ThinkPad and quantize it to 8 bits with dithering.&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display and output scope and, once the analog spectrum analyzer catches up...&lt;br /&gt;
a clean frequency peak with a uniform noise floor on both spectral displays&lt;br /&gt;
just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off.&lt;br /&gt;
&lt;br /&gt;
The quantization noise, which dither had spread out into a nice, flat noise&lt;br /&gt;
floor, piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At 8 bits this effect is exaggerated. At 16 bits,&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and ... there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
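&lt;br /&gt;
The quarter-bit case is easy to reproduce in software. The Python sketch below quantizes a sine wave whose amplitude is one quarter of a quantization step, first with plain rounding and then with triangular (TPDF) dither added before rounding.&lt;br /&gt;
The buffer length, seed, and dither shape are choices made for this example and are not taken from the demo application.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

N = 20000          # number of samples (arbitrary)
CYCLES = 200       # whole cycles in the buffer, so the tone is periodic
AMPLITUDE = 0.25   # one quarter of a quantization step

def quantize(x):
    """Round to the nearest integer step (plain quantization)."""
    return math.floor(x + 0.5)

def tpdf_dither(rng):
    """Triangular dither spanning plus or minus one step."""
    return rng.random() - rng.random()

rng = random.Random(0)   # fixed seed so the run is repeatable
tone = [math.sin(2.0 * math.pi * CYCLES * n / N) for n in range(N)]
signal = [AMPLITUDE * s for s in tone]

plain = [quantize(x) for x in signal]
dithered = [quantize(x + tpdf_dither(rng)) for x in signal]

def tone_amplitude(samples):
    """Estimate how much of the original tone is present in the samples."""
    return 2.0 * sum(y * s for y, s in zip(samples, tone)) / N

print(tone_amplitude(plain))     # prints 0.0, every sample rounded to zero
print(tone_amplitude(dithered))  # close to 0.25, the tone survives in the noise
```
&lt;br /&gt;
Without dither the tone is simply gone; with dither it is still present at its original level, buried in a noise floor that no longer depends on the input.&lt;br /&gt;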
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as [http://www.acoustics.salford.ac.uk/res/cox/sound_quality/?content=subjective inoffensive] and&lt;br /&gt;
[[WikiPedia:Absolute_threshold_of_hearing|difficult to notice]]&lt;br /&gt;
as possible.&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
We can [[WikiPedia:Noise_shaping|shape dithering noise]] away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
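&lt;br /&gt;
A minimal way to shape noise is first-order error feedback: subtract the error made on the previous sample before quantizing the next one. The Python below is a sketch of that idea only.&lt;br /&gt;
Practical shapers use higher-order filters tuned to the ear&#039;s sensitivity curve, and the signal, seed, and function name here are invented for the example.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

def shaped_quantize(signal, rng):
    """Quantize to integer steps with TPDF dither and first-order error feedback."""
    output = []
    errors = []
    previous_error = 0.0
    for x in signal:
        target = x - previous_error             # cancel the last error here
        dither = rng.random() - rng.random()    # triangular dither
        y = math.floor(target + dither + 0.5)   # round to the nearest step
        previous_error = y - target             # total error on this sample
        output.append(y)
        errors.append(previous_error)
    return output, errors

rng = random.Random(1)   # fixed seed so the run is repeatable
signal = [40.0 * math.sin(2.0 * math.pi * n / 100.0) for n in range(10000)]
output, errors = shaped_quantize(signal, rng)

# The output error is e[n] - e[n-1], a first difference, so its running sum
# never grows: the noise has almost no energy at the lowest frequencies.
running_error = sum(y - x for y, x in zip(output, signal))
print(abs(running_error))   # stays below 1.5 steps however long the signal is
```
&lt;br /&gt;
The total noise power is not reduced, it is only moved upward in frequency, which matches the tradeoff described above.&lt;br /&gt;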
&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
Lastly, dithered quantization noise &#039;&#039;is&#039;&#039; higher [[WikiPedia:Sound_power|power]] overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a [[WikiPedia:VU_meter|VU meter]] during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll also [[WikiPedia:Amplitude_modulation|modulate the input signal]] like this to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise.&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below [[WikiPedia:Full_scale|full scale]].  Maybe&lt;br /&gt;
if the CD had been&lt;br /&gt;
[http://www.research.philips.com/technologies/projects/cd/index.html 14 bits as originally designed],&lt;br /&gt;
dither &#039;&#039;might&#039;&#039; be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==Bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using [[WikiPedia:Sine_wave|sine waves]]. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a [[WikiPedia:Square_wave|square wave]]...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows..&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also [[WikiPedia:Fourier_series|the sum of discrete frequencies]],&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of [[WikiPedia:Even_and_odd_functions#Harmonics|odd harmonics]].  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp [[WikiPedia:Low-pass_filter|anti-aliasing filter]] that cuts off right&lt;br /&gt;
above 20kHz, so our signal is [[WikiPedia:Bandlimiting|bandlimited]], which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
...and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the [[WikiPedia:Gibbs_phenomenon|Gibbs effect]]. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
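&lt;br /&gt;
As a rough sketch of that idea, the Python below sums only the odd harmonics of a 1kHz square wave that fit under a 20kHz limit.&lt;br /&gt;
The ideal brick-wall cutoff and the names (bandlimited_square, FUNDAMENTAL_HZ, CUTOFF_HZ) are assumptions made for illustration, not the demo application code.&lt;br /&gt;
&lt;br /&gt;
```python
import math

FUNDAMENTAL_HZ = 1000.0   # square wave frequency used in the demo
CUTOFF_HZ = 20000.0       # assumption: an ideal brick-wall cutoff at 20kHz


def bandlimited_square(t):
    """Sum the odd harmonics of a unit square wave that fit below the cutoff."""
    max_harmonic = int(CUTOFF_HZ // FUNDAMENTAL_HZ)
    total = 0.0
    for k in range(1, max_harmonic + 1, 2):   # harmonics 1, 3, 5, ... 19
        total += math.sin(2 * math.pi * k * FUNDAMENTAL_HZ * t) / k
    return 4 / math.pi * total


# Scan one full period on a fine grid and record the highest point reached.
POINTS = 20000
period = 1.0 / FUNDAMENTAL_HZ
peak = max(bandlimited_square(i * period / POINTS) for i in range(POINTS))

# The ideal square wave tops out at 1.0, so anything above that is ripple.
print(f"peak of bandlimited square wave: {peak:.3f}")
```
&lt;br /&gt;
With ten harmonics the peak should land near 1.18 on a square wave of amplitude 1: ripple that comes from the truncated sum itself, not from any extra processing.&lt;br /&gt;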
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of [[WikiPedia:Sonic_artifact|artifact]] that&#039;s added by anti-aliasing and [[WikiPedia:Reconstruction_filter|anti-imaging]]&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal &#039;&#039;is&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
[[WikiPedia:Interpolation|connect-the-dots]].&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples... implying that [[WikiPedia:Dirac_delta_function|impulses]] or&lt;br /&gt;
[[WikiPedia:Synthesizer#ADSR_envelope|fast attacks]] have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
&lt;br /&gt;
==Epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in [[Videos/A_Digital_Media_Primer_For_Geeks|the previous episode]], we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the&lt;br /&gt;
[http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-003-signals-and-systems-spring-2010/index.htm 6.003]&lt;br /&gt;
and&lt;br /&gt;
[http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-007-electromagnetic-energy-from-motors-to-lasers-spring-2011 6.007]&lt;br /&gt;
Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13942</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13942"/>
		<updated>2013-02-26T05:46:58Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: OCW links&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;small&amp;gt;&#039;&#039;Wiki edition&#039;&#039;&amp;lt;/small&amp;gt;&lt;br /&gt;
[[Image:dsat_001.jpg|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Continuing in the &amp;quot;firehose&amp;quot; tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org&#039;s second video on digital media explores multiple facets of digital audio signals and how they &#039;&#039;really&#039;&#039; behave in the real world.&lt;br /&gt;
&lt;br /&gt;
Demonstrations of sampling, quantization, bit-depth, and dither put digital audio through its paces on consumer-grade audio equipment using both modern digital analysis and vintage analog equipment (just in case we can&#039;t trust those newfangled digital gizmos). You can download the demo application source code and try it all for yourself!&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;center&amp;gt;&amp;lt;font size=&amp;quot;+2&amp;quot;&amp;gt;[http://www.xiph.org/video/vid2.shtml Download or Watch online]&amp;lt;/font&amp;gt;&amp;lt;/center&amp;gt;&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
Players supporting WEBM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox ], [http://www.chromium.org/Home Chrome ], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]&lt;br /&gt;
&lt;br /&gt;
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]&lt;br /&gt;
&lt;br /&gt;
If you&#039;re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.&lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;hr/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
[[Image:Xiph_ep02_test.png|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest, let&#039;s&lt;br /&gt;
take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.&lt;br /&gt;
&lt;br /&gt;
==Veritas ex machina==&lt;br /&gt;
[[Image:Dsat_002.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_003.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_004.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_005.jpg|200px|right]]&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
First up, we need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],&lt;br /&gt;
like this Tektronix 2246 from the mid-90s, one of the last and very best analog scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an&lt;br /&gt;
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1 [[WikiPedia:Hertz#SI_multiples|kHz]]&lt;br /&gt;
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].&lt;br /&gt;
We see the sine wave on the oscilloscope, can verify that it is indeed&lt;br /&gt;
1 kHz at 1 Volt RMS, which is 2.8 Volts&lt;br /&gt;
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
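&lt;br /&gt;
The peak-to-peak figure follows from the RMS value of a sine wave. A quick check, with variable names chosen just for this example:&lt;br /&gt;
&lt;br /&gt;
```python
import math

v_rms = 1.0                       # generator setting from the demo, in volts
v_peak = v_rms * math.sqrt(2)     # a sine wave peaks at sqrt(2) times its RMS value
v_peak_to_peak = 2 * v_peak       # positive peak down to negative peak

print(round(v_peak_to_peak, 2))   # 2.83
```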
&lt;br /&gt;
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]&lt;br /&gt;
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],&lt;br /&gt;
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below&lt;br /&gt;
[[WikiPedia:Fundamental_frequency|the fundamental]].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],&lt;br /&gt;
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],&lt;br /&gt;
[[WikiPedia:Jitter#Sampling_jitter|jitter]],&lt;br /&gt;
[[WikiPedia:Noise_floor|noise behavior]],&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Stairsteps==&lt;br /&gt;
[[Image:Dsat 006.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat 007.png|360px|right]]&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
We digitize our signal to&lt;br /&gt;
[[Videos/A_Digital_Media_Primer_For_Geeks#Raw_.28digital_audio.29_meat|16 bit PCM at 44.1kHz]],&lt;br /&gt;
same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
[[WikiPedia:High_impedance|high-impedance input]] being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
OK, 1 kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to [[WikiPedia:Nyquist_frequency|Nyquist]], say 15kHz.&lt;br /&gt;
&lt;br /&gt;
Now the sine wave is represented by fewer than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point]]. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
&lt;br /&gt;
[[Image:Dsat 008.png|360px|right]]&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
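&lt;br /&gt;
To see the unique-solution claim in action, here is a minimal sketch of Whittaker-Shannon (sinc) interpolation, the textbook way to write down that one bandlimited signal.&lt;br /&gt;
The finite window of samples, the evaluation time, and the function names are assumptions for illustration; a real converter uses a practical filter rather than this sum.&lt;br /&gt;
&lt;br /&gt;
```python
import math

FS = 44100.0      # sample rate, same as the demo
FREQ = 15000.0    # the fewer-than-three-samples-per-cycle case
COUNT = 2001      # assumption: a finite window of samples stands in for an endless stream

# The lollipops: the sine wave observed only at the sample instants.
samples = [math.sin(2 * math.pi * FREQ * n / FS) for n in range(COUNT)]


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with the removable gap at 0 filled in."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def reconstruct(values, t):
    """Value of the bandlimited signal at time t, where t is measured in sample periods."""
    return sum(v * sinc(t - n) for n, v in enumerate(values))


# Pick a moment that falls between two samples, far from the ends of the window.
t = 1000.37
estimate = reconstruct(samples, t)
exact = math.sin(2 * math.pi * FREQ * t / FS)
error = abs(estimate - exact)

print(f"reconstructed {estimate:.4f}, original {exact:.4f}")
```
&lt;br /&gt;
The two values should agree closely. The small leftover difference comes from cutting the sum off at a finite number of samples, not from anything missing between the samples.&lt;br /&gt;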
&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
&lt;br /&gt;
So, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion]] is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].&lt;br /&gt;
&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==Bit-depth==&lt;br /&gt;
[[Image:Dsat_009.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat_010.jpg|260px|right]]&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
&#039;&#039;also&#039;&#039; smooth regardless of the [[WikiPedia:Audio_bit_depth|bit depth]].  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
[[WikiPedia:Dither|dither]] down to 8 bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the 8-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we [[WikiPedia:Quantization_(sound_processing)|quantize]] it,&lt;br /&gt;
and [[WikiPedia:Quantization_error|quantization adds noise]].&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor.&lt;br /&gt;
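&lt;br /&gt;
A small numerical sketch of that relationship follows. It uses plain rounding without dither, and a 997Hz test tone picked for this example rather than taken from the video; the function name snr_db is likewise just illustrative.&lt;br /&gt;
&lt;br /&gt;
```python
import math

FS = 44100       # one second of samples at the CD rate
FREQ = 997.0     # assumption: a test tone that does not repeat within the window


def snr_db(bits):
    """Signal-to-noise ratio, in dB, of a full-scale sine rounded to the given bit depth."""
    scale = 2 ** (bits - 1) - 1          # largest positive integer sample value
    signal_power = 0.0
    noise_power = 0.0
    for n in range(FS):
        x = math.sin(2 * math.pi * FREQ * n / FS)
        q = round(x * scale) / scale     # quantize, then return to the original scale
        signal_power += x * x
        noise_power += (q - x) ** 2      # whatever rounding changed counts as noise
    return 10 * math.log10(signal_power / noise_power)


snr_8 = snr_db(8)
snr_16 = snr_db(16)
print(f"8 bits: {snr_8:.1f} dB, 16 bits: {snr_16:.1f} dB")
```
&lt;br /&gt;
Each extra bit should buy roughly 6dB, so the gap between the two figures should come out close to 48dB.&lt;br /&gt;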
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our 8-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a [[WikiPedia:Dither#Different_types|gaussian dither]] then it&#039;s&lt;br /&gt;
[[WikiPedia:Central_limit_theorem|mathematically equivalent]] in every way. It &#039;&#039;is&#039;&#039; tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of [[WikiPedia:Magnetic_tape_sound_recording|magnetic audio tape]]&lt;br /&gt;
in [[WikiPedia:Shannon–Hartley_theorem#Examples|bits instead of decibels]], in order to put things in a&lt;br /&gt;
digital perspective.  [[WikiPedia:Compact cassettes|Compact cassettes]] (for those of you who are old enough to remember them) could reach as&lt;br /&gt;
deep as 9 bits in perfect conditions, though 5 to 6 bits was&lt;br /&gt;
more typical, especially if it was a recording made on a&lt;br /&gt;
[[WikiPedia:Cassette_deck|tape deck]]. That&#039;s right... your mix tapes were only about 6 bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional [[WikiPedia:Reel-to-reel_audio_tape_recording|open reel tape]] used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits &#039;&#039;with&#039;&#039; [[WikiPedia:Reel-to-reel_audio_tape_recording#Noise_reduction|advanced noise reduction]].  And&lt;br /&gt;
that&#039;s why seeing &#039;[[WikiPedia:SPARS_code|D D D]]&#039; on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==Dither==&lt;br /&gt;
[[Image:Dsat_011.png|360px|right]]&lt;br /&gt;
[[Image:Dsat_012.gif|360px|right]]&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with [[Wikipedia:dither|dither]], so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  [[WikiPedia:Rounding|Obvious]],&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[show/attribute the dither paper]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t [[WikiPedia:Sound_masking|drown out or mask]]&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s &#039;&#039;watch&#039;&#039; what dither does.  The signal generator has too much noise for this test so we&#039;ll produce a mathematically perfect sine wave with the ThinkPad and quantize it to 8 bits with dithering.&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display and output scope and, once the analog spectrum analyzer catches up...&lt;br /&gt;
a clean frequency peak with a uniform noise floor on both spectral displays&lt;br /&gt;
just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off.&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At 8 bits this effect is exaggerated. At 16 bits,&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and ... there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
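&lt;br /&gt;
The 1/4 bit case can be sketched numerically. The triangular (TPDF) dither, the 997Hz tone, and the helper names below are assumptions made for this example; the video does not say which dither the demo uses at this point.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

random.seed(1)          # fixed seed so the run is repeatable
FS = 44100.0
COUNT = 44100           # one second of samples
FREQ = 997.0            # assumption: test tone chosen for this sketch
AMPLITUDE_LSB = 0.25    # signal peak is a quarter of one quantization step


def tone(n):
    """Input sample n, in units of one quantization step."""
    return AMPLITUDE_LSB * math.sin(2 * math.pi * FREQ * n / FS)


def tpdf():
    """Triangular dither spanning plus or minus one step (difference of two uniform values)."""
    return random.random() - random.random()


def recovered_amplitude(values):
    """Project the quantized samples onto the test tone to estimate how much of it survived."""
    total = sum(v * math.sin(2 * math.pi * FREQ * n / FS) for n, v in enumerate(values))
    return 2 * total / len(values)


plain = [round(tone(n)) for n in range(COUNT)]               # no dither
dithered = [round(tone(n) + tpdf()) for n in range(COUNT)]   # dither added before rounding

plain_amp = recovered_amplitude(plain)
dithered_amp = recovered_amplitude(dithered)
print(f"without dither: {plain_amp:.3f}, with dither: {dithered_amp:.3f}")
```
&lt;br /&gt;
Without dither every sample rounds to zero and the tone is gone. With dither the estimate should land near 0.25 steps, so the tone is still in there underneath the noise.&lt;br /&gt;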
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and&lt;br /&gt;
[[WikiPedia:Absolute_threshold_of_hearing|difficult to notice]]&lt;br /&gt;
as possible.&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
We can [[WikiPedia:Noise_shaping|shape dithering noise]] away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
Lastly, dithered quantization noise &#039;&#039;is&#039;&#039; higher [[WikiPedia:Sound_power|power]] overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a [[WikiPedia:VU_meter|VU meter]] during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll also [[WikiPedia:Amplitude_modulation|modulate the input signal]] like this to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise.&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below [[WikiPedia:Full_scale|full scale]].  Maybe&lt;br /&gt;
if the CD had been&lt;br /&gt;
[http://www.research.philips.com/technologies/projects/cd/index.html 14 bits as originally designed],&lt;br /&gt;
dither &#039;&#039;might&#039;&#039; be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==Bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using [[WikiPedia:Sine_wave|sine waves]]. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a [[WikiPedia:Square_wave|square wave]]...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also [[WikiPedia:Fourier_series|the sum of discrete frequencies]],&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of [[WikiPedia:Even_and_odd_functions#Harmonics|odd harmonics]].  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp [[WikiPedia:Low-pass_filter|anti-aliasing filter]] that cuts off right&lt;br /&gt;
above 20kHz, so our signal is [[WikiPedia:Bandlimiting|bandlimited]], which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
..and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the [[WikiPedia:Gibbs_phenomenon|Gibbs effect]]. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
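&lt;br /&gt;
Here&#039;s a rough sketch of that rippling, separate from the demo application. It assumes an ideal brick-wall cutoff at 20kHz rather than the real filter, and the function name is made up for illustration. It sums the odd harmonics of a 1kHz square wave that fit below the cutoff and reports how far the ripple overshoots the flat top.&lt;br /&gt;

```python
import math

def bandlimited_square(phase, fundamental_hz=1000.0, cutoff_hz=20000.0):
    """Sum the odd harmonics of a unit square wave that fit below cutoff_hz.

    phase is the position within one cycle, from 0.0 to 1.0.
    The brick-wall cutoff is an idealization, not the demo hardware's filter.
    """
    highest = int(cutoff_hz // fundamental_hz)
    total = 0.0
    for k in range(1, highest + 1, 2):  # 1, 3, 5, ... 19 for the defaults
        total += math.sin(2.0 * math.pi * k * phase) / k
    return 4.0 / math.pi * total

points = 2000
cycle = [bandlimited_square(i / points) for i in range(points)]
peak = max(cycle)
print("peak of bandlimited square wave:", round(peak, 3))
```

The peak should land a little under 1.2 for a square wave that nominally swings between -1 and +1. That overshoot is the Gibbs ripple, and it comes from the missing harmonics, not from any extra processing.&lt;br /&gt;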
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of [[WikiPedia:Sonic_artifact|artifact]] that&#039;s added by anti-aliasing and [[WikiPedia:Reconstruction_filter|anti-imaging]]&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal &#039;&#039;is&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
[[WikiPedia:Interpolation|connect-the-dots]].&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples.. implying that [[WikiPedia:Dirac_delta_function|impulses]] or&lt;br /&gt;
[[WikiPedia:Synthesizer#ADSR_envelope|fast attacks]] have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
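&lt;br /&gt;
A small numeric sketch of the same idea, under the assumption of an ideal 20kHz bandlimit (the helper names are hypothetical, not taken from the demo source): sample a bandlimited 1kHz square wave whose edge is delayed by 0.3 of a sample, then recover that delay from the samples alone.&lt;br /&gt;

```python
import math

SAMPLE_RATE = 44100.0
FUNDAMENTAL = 1000.0

def sample_square(delay_samples, count=441):
    """Sample a 1 kHz square wave, bandlimited to 20 kHz, whose edges are
    delayed by a possibly fractional number of samples."""
    out = []
    for n in range(count):
        t = (n - delay_samples) / SAMPLE_RATE
        s = 0.0
        for k in range(1, 20, 2):  # odd harmonics from 1 kHz to 19 kHz
            s += math.sin(2.0 * math.pi * k * FUNDAMENTAL * t) / k
        out.append(4.0 / math.pi * s)
    return out

def estimate_delay(samples):
    """Recover the delay, in samples, from the phase of the fundamental."""
    w = 2.0 * math.pi * FUNDAMENTAL / SAMPLE_RATE
    a = sum(x * math.sin(w * n) for n, x in enumerate(samples))
    b = sum(x * math.cos(w * n) for n, x in enumerate(samples))
    return math.atan2(-b, a) / w

shifted = sample_square(0.3)
print("estimated delay in samples:", round(estimate_delay(shifted), 6))
```

The 441 samples cover exactly ten cycles, so the fundamental can be isolated cleanly. The estimate comes back as 0.3 of a sample: the sub-sample timing is carried by the sample values themselves.&lt;br /&gt;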
&lt;br /&gt;
==Epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in [[Videos/A_Digital_Media_Primer_For_Geeks|the previous episode]], we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the&lt;br /&gt;
[http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-003-signals-and-systems-spring-2010/index.htm 6.003]&lt;br /&gt;
and&lt;br /&gt;
[http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-007-electromagnetic-energy-from-motors-to-lasers-spring-2011 6.007]&lt;br /&gt;
Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13941</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13941"/>
		<updated>2013-02-26T05:43:14Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Wiki links&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;small&amp;gt;&#039;&#039;Wiki edition&#039;&#039;&amp;lt;/small&amp;gt;&lt;br /&gt;
[[Image:dsat_001.jpg|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Continuing in the &amp;quot;firehose&amp;quot; tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org&#039;s second video on digital media explores multiple facets of digital audio signals and how they &#039;&#039;really&#039;&#039; behave in the real world.&lt;br /&gt;
&lt;br /&gt;
Demonstrations of sampling, quantization, bit-depth, and dither put digital audio through its paces on consumer-grade audio equipment using both modern digital analysis and vintage analog equipment (just in case we can&#039;t trust those newfangled digital gizmos). You can download the demo application source code and try it all for yourself!&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;center&amp;gt;&amp;lt;font size=&amp;quot;+2&amp;quot;&amp;gt;[http://www.xiph.org/video/vid2.shtml Download or Watch online]&amp;lt;/font&amp;gt;&amp;lt;/center&amp;gt;&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
Players supporting WEBM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox], [http://www.chromium.org/Home Chrome], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]&lt;br /&gt;
&lt;br /&gt;
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]&lt;br /&gt;
&lt;br /&gt;
If you&#039;re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.&lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;hr/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
[[Image:Xiph_ep02_test.png|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest, let&#039;s&lt;br /&gt;
take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.&lt;br /&gt;
&lt;br /&gt;
==Veritas ex machina==&lt;br /&gt;
[[Image:Dsat_002.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_003.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_004.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_005.jpg|200px|right]]&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
First up, we need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],&lt;br /&gt;
like this Tektronix 2246 from the mid-90s, one of the last and very best analog scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an&lt;br /&gt;
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1 [[WikiPedia:Hertz#SI_multiples|kHz]]&lt;br /&gt;
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].&lt;br /&gt;
We see the sine wave on the oscilloscope, can verify that it is indeed&lt;br /&gt;
1 kHz at 1 Volt RMS, which is 2.8 Volts&lt;br /&gt;
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
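&lt;br /&gt;
The RMS to peak-to-peak conversion is quick to check. This is a minimal sketch that assumes a pure sine wave (the function name is made up for illustration):&lt;br /&gt;

```python
import math

def sine_peak_to_peak(rms_volts):
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    return 2.0 * math.sqrt(2.0) * rms_volts

print(round(sine_peak_to_peak(1.0), 2))  # prints 2.83
```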
&lt;br /&gt;
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]&lt;br /&gt;
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],&lt;br /&gt;
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below&lt;br /&gt;
[[WikiPedia:Fundamental_frequency|the fundamental]].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],&lt;br /&gt;
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],&lt;br /&gt;
[[WikiPedia:Jitter#Sampling_jitter|jitter]],&lt;br /&gt;
[[WikiPedia:Noise_floor|noise behavior]],&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Stairsteps==&lt;br /&gt;
[[Image:Dsat 006.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat 007.png|360px|right]]&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
We digitize our signal to&lt;br /&gt;
[[Videos/A_Digital_Media_Primer_For_Geeks#Raw_.28digital_audio.29_meat|16 bit PCM at 44.1kHz]],&lt;br /&gt;
same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
[[WikiPedia:High_impedance|high-impedance input]] being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
OK, 1 kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to [[WikiPedia:Nyquist_frequency|Nyquist]], say 15kHz.&lt;br /&gt;
&lt;br /&gt;
Now the sine wave is represented by fewer than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point]]. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
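&lt;br /&gt;
As a sketch of what that unique solution looks like numerically, the snippet below rebuilds a 15kHz sine wave halfway between two of its 44.1kHz samples by summing sinc pulses. It is only an approximation of the ideal reconstruction: the sum is truncated to a finite window, and the helper names are assumptions for illustration, not part of the demo code.&lt;br /&gt;

```python
import math

SAMPLE_RATE = 44100.0
TONE_HZ = 15000.0

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), defined as 1 at x = 0."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def tone_sample(n):
    """Sample number n of a 15 kHz sine wave sampled at 44.1 kHz."""
    return math.sin(2.0 * math.pi * TONE_HZ * n / SAMPLE_RATE)

def reconstruct(t, half_width=1000):
    """Approximate the bandlimited signal at fractional sample time t by
    summing sinc pulses centered on the 2 * half_width + 1 nearest samples."""
    center = int(round(t))
    total = 0.0
    for n in range(center - half_width, center + half_width + 1):
        total += tone_sample(n) * sinc(t - n)
    return total

t = 10.5  # halfway between samples 10 and 11
ideal = math.sin(2.0 * math.pi * TONE_HZ * t / SAMPLE_RATE)
print("reconstructed:", round(reconstruct(t), 3), "ideal:", round(ideal, 3))
```

Even with fewer than three samples per cycle, the value between the samples agrees closely with the original sine wave. The small leftover error comes from cutting the sum short.&lt;br /&gt;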
&lt;br /&gt;
[[Image:Dsat 008.png|360px|right]]&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
&lt;br /&gt;
So, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion]] is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].&lt;br /&gt;
&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==Bit-depth==&lt;br /&gt;
[[Image:Dsat_009.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat_010.jpg|260px|right]]&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
&#039;&#039;also&#039;&#039; smooth regardless of the [[WikiPedia:Audio_bit_depth|bit depth]].  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
[[WikiPedia:Dither|dither]] down to 8 bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the 8-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we [[WikiPedia:Quantization_(sound_processing)|quantize]] it,&lt;br /&gt;
and [[WikiPedia:Quantization_error|quantization adds noise]].&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor.&lt;br /&gt;
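&lt;br /&gt;
To put rough numbers on that, here is a sketch using the textbook signal-to-noise figure for a full-scale sine wave after ideal, undithered uniform quantization. That idealization is an assumption: dither raises the floor by a few decibels, and real converters add noise of their own.&lt;br /&gt;

```python
import math

def ideal_snr_db(bits):
    """Theoretical SNR of a full-scale sine wave after ideal, undithered
    uniform quantization to the given number of bits."""
    return 20.0 * math.log10(2.0 ** bits) + 10.0 * math.log10(1.5)

for bits in (8, 16, 24):
    print(bits, "bits:", round(ideal_snr_db(bits), 1), "dB")
```

Each extra bit lowers the noise floor by about 6 dB, which puts 8 bits near 50 dB and 16 bits near 98 dB below a full-scale sine wave.&lt;br /&gt;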
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our 8-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a [[WikiPedia:Dither#Different_types|Gaussian dither]] then it&#039;s&lt;br /&gt;
[[WikiPedia:Central_limit_theorem|mathematically equivalent]] in every way. It &#039;&#039;is&#039;&#039; tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of [[WikiPedia:Magnetic_tape_sound_recording|magnetic audio tape]]&lt;br /&gt;
in [[WikiPedia:Shannon–Hartley_theorem#Examples|bits instead of decibels]], in order to put things in a&lt;br /&gt;
digital perspective.  [[WikiPedia:Compact cassettes|Compact cassettes]] (for those of you who are old enough to remember them) could reach as&lt;br /&gt;
deep as 9 bits in perfect conditions, though 5 to 6 bits was&lt;br /&gt;
more typical, especially if it was a recording made on a&lt;br /&gt;
[[WikiPedia:Cassette_deck|tape deck]]. That&#039;s right... your mix tapes were only about 6 bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional [[WikiPedia:Reel-to-reel_audio_tape_recording|open reel tape]] used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits &#039;&#039;with&#039;&#039; [[WikiPedia:Reel-to-reel_audio_tape_recording#Noise_reduction|advanced noise reduction]].  And&lt;br /&gt;
that&#039;s why seeing &#039;[[WikiPedia:SPARS_code|D D D]]&#039; on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==Dither==&lt;br /&gt;
[[Image:Dsat_011.png|360px|right]]&lt;br /&gt;
[[Image:Dsat_012.gif|360px|right]]&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with [[WikiPedia:Dither|dither]], so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  [[WikiPedia:Rounding|Obvious]],&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[show/attribute the dither paper]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t [[WikiPedia:Sound_masking|drown out or mask]]&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s &#039;&#039;watch&#039;&#039; what dither does.  The signal generator has too much noise for this test so we&#039;ll produce a mathematically perfect sine wave with the ThinkPad and quantize it to 8 bits with dithering.&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display and output scope and, once the analog spectrum analyzer catches up...&lt;br /&gt;
a clean frequency peak with a uniform noise floor on both spectral displays&lt;br /&gt;
just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off.&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor, piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At 8 bits this effect is exaggerated. At 16 bits,&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and ... there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
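&lt;br /&gt;
The quarter-bit case is easy to try in a few lines. This is a sketch under stated assumptions, not the demo application&#039;s code: it uses triangular (TPDF) dither spanning two quantization steps, a fixed random seed, and made-up helper names.&lt;br /&gt;

```python
import math
import random

def quantize(value, rng=None):
    """Round to the nearest integer step; with rng, add TPDF dither first."""
    if rng is not None:
        # Triangular dither spanning two steps peak to peak:
        # the sum of two independent uniform values.
        value += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return math.floor(value + 0.5)

def tone_amplitude(samples, w):
    """Estimate the amplitude of the tone at angular frequency w."""
    return 2.0 / len(samples) * sum(
        s * math.sin(w * n) for n, s in enumerate(samples))

count = 44100
w = 2.0 * math.pi * 1000.0 / 44100.0
signal = [0.25 * math.sin(w * n) for n in range(count)]  # quarter-step tone

rng = random.Random(1)
plain = [quantize(x) for x in signal]
dithered = [quantize(x, rng) for x in signal]

print("undithered tone amplitude:", tone_amplitude(plain, w))
print("dithered tone amplitude:  ", round(tone_amplitude(dithered, w), 2))
```

Without dither every sample rounds to zero and the tone is gone. With dither the individual samples are noisy, but the tone is still in there at roughly a quarter of a step.&lt;br /&gt;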
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and&lt;br /&gt;
[[WikiPedia:Absolute_threshold_of_hearing|difficult to notice]]&lt;br /&gt;
as possible.&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
We can [[WikiPedia:Noise_shaping|shape dithering noise]] away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
Lastly, dithered quantization noise &#039;&#039;is&#039;&#039; higher [[WikiPedia:Sound_power|power]] overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a [[WikiPedia:VU_meter|VU meter]] during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll also [[WikiPedia:Amplitude_modulation|modulate the input signal]] like this to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise.&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below [[WikiPedia:Full_scale|full scale]].  Maybe&lt;br /&gt;
if the CD had been&lt;br /&gt;
[http://www.research.philips.com/technologies/projects/cd/index.html 14 bits as originally designed],&lt;br /&gt;
dither &#039;&#039;might&#039;&#039; be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==Bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using [[WikiPedia:Sine_wave|sine waves]]. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a [[WikiPedia:Square_wave|square wave]]...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also [[WikiPedia:Fourier_series|the sum of discrete frequencies]],&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of [[WikiPedia:Even_and_odd_functions#Harmonics|odd harmonics]].  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp [[WikiPedia:Low-pass_filter|anti-aliasing filter]] that cuts off right&lt;br /&gt;
above 20kHz, so our signal is [[WikiPedia:Bandlimiting|bandlimited]], which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
...and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the [[WikiPedia:Gibbs_phenomenon|Gibbs effect]]. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
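&lt;br /&gt;
The ripple can be reproduced numerically by summing only the odd harmonics that survive the cutoff. The following is a minimal Python sketch, not the demo application code: it assumes an idealized brick-wall cutoff at exactly 20kHz, and the names FUNDAMENTAL, CUTOFF and bandlimited_square are made up for illustration.&lt;br /&gt;

```python
import math

FUNDAMENTAL = 1000.0   # Hz, the square wave used in the demo
CUTOFF = 20000.0       # Hz, idealized brick-wall cutoff (an assumption)

def bandlimited_square(phase):
    """Sum the odd harmonics of a unit square wave that lie at or below CUTOFF.

    phase is the position within one cycle, from 0.0 to 1.0.
    """
    top = int(CUTOFF // FUNDAMENTAL)
    total = 0.0
    for k in range(1, top + 1, 2):
        # Fourier series of a square wave: odd harmonics with amplitude 1/k
        total += math.sin(2.0 * math.pi * k * phase) / k
    return 4.0 / math.pi * total

POINTS = 1000
cycle = [bandlimited_square(i / POINTS) for i in range(POINTS)]
peak = max(cycle)
print(peak)
```

With ten harmonics (1kHz through 19kHz) the peak lands near 1.18 rather than 1.0. That overshoot next to each edge is the Gibbs ripple, and it comes from the missing harmonics alone; no filter appears anywhere in the sketch.&lt;br /&gt;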
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of [[WikiPedia:Sonic_artifact|artifact]] that&#039;s added by anti-aliasing and [[WikiPedia:Reconstruction_filter|anti-imaging]]&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal &#039;&#039;is&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
[[WikiPedia:Interpolation|connect-the-dots]].&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples.. implying that [[WikiPedia:Dirac_delta_function|impulses]] or&lt;br /&gt;
[[WikiPedia:Synthesizer#ADSR_envelope|fast attacks]] have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
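&lt;br /&gt;
A small numerical check of the same idea, using a sine wave as a simpler stand-in for the bandlimited edge: delay the tone by a fraction of a sample period, sample it, and recover the delay from the samples alone. This is a minimal Python sketch under those assumptions; the variable names and the 0.3 sample delay are illustrative choices, not taken from the demo.&lt;br /&gt;

```python
import math

RATE = 44100.0
FREQ = 1000.0
COUNT = 441          # exactly ten cycles of 1 kHz at 44.1 kHz
TRUE_DELAY = 0.3     # in sample periods, deliberately between samples

omega = 2.0 * math.pi * FREQ / RATE   # radians per sample
samples = [math.sin(omega * (n - TRUE_DELAY)) for n in range(COUNT)]

# Project the samples onto sine and cosine at the known frequency
in_phase = sum(x * math.sin(omega * n) for n, x in enumerate(samples))
quadrature = sum(x * math.cos(omega * n) for n, x in enumerate(samples))

# The phase lag of the tone, converted back to a delay in sample periods
estimated_delay = math.atan2(-quadrature, in_phase) / omega
print(estimated_delay)
```

The estimate matches the true delay to within floating-point rounding, even though no sample falls on the shifted zero crossing. The timing information lives in the sample values, not in the sample positions.&lt;br /&gt;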
&lt;br /&gt;
==Epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13940</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13940"/>
		<updated>2013-02-26T05:31:26Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: typo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;small&amp;gt;&#039;&#039;Wiki edition&#039;&#039;&amp;lt;/small&amp;gt;&lt;br /&gt;
[[Image:dsat_001.jpg|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Continuing in the &amp;quot;firehose&amp;quot; tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org&#039;s second video on digital media explores multiple facets of digital audio signals and how they &#039;&#039;really&#039;&#039; behave in the real world.&lt;br /&gt;
&lt;br /&gt;
Demonstrations of sampling, quantization, bit-depth, and dither put digital audio through its paces on consumer-grade audio equipment using both modern digital analysis and vintage analog equipment (just in case we can&#039;t trust those newfangled digital gizmos). You can download the demo application source code and try it all for yourself!&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;center&amp;gt;&amp;lt;font size=&amp;quot;+2&amp;quot;&amp;gt;[http://www.xiph.org/video/vid2.shtml Download or Watch online]&amp;lt;/font&amp;gt;&amp;lt;/center&amp;gt;&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
Players supporting WEBM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox ], [http://www.chromium.org/Home Chrome ], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]&lt;br /&gt;
&lt;br /&gt;
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]&lt;br /&gt;
&lt;br /&gt;
If you&#039;re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.&lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;hr/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
[[Image:Xiph_ep02_test.png|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest, let&#039;s&lt;br /&gt;
take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.&lt;br /&gt;
&lt;br /&gt;
==Veritas ex machina==&lt;br /&gt;
[[Image:Dsat_002.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_003.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_004.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_005.jpg|200px|right]]&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
First up, we need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],&lt;br /&gt;
like this Tektronix 2246 from the mid-90s, one of the last and very best analog scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an&lt;br /&gt;
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1 [[WikiPedia:Hertz#SI_multiples|kHz]]&lt;br /&gt;
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].&lt;br /&gt;
We see the sine wave on the oscilloscope and can verify that it is indeed&lt;br /&gt;
1 kHz at 1 Volt RMS, which is 2.8 Volts&lt;br /&gt;
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
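&lt;br /&gt;
For a pure sine wave, the peak-to-peak figure follows directly from the RMS value: multiply by twice the square root of two. A minimal Python sketch of that arithmetic (the function name is made up for illustration):&lt;br /&gt;

```python
import math

def rms_to_peak_to_peak(v_rms):
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    # Peak is sqrt(2) times RMS; peak-to-peak is twice the peak
    return 2.0 * math.sqrt(2.0) * v_rms

v_pp = rms_to_peak_to_peak(1.0)
print(round(v_pp, 3))
```

For 1 Volt RMS this gives roughly 2.83 Volts, quoted above as 2.8. The relation holds only for a sine wave; other waveforms have different ratios.&lt;br /&gt;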
&lt;br /&gt;
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]&lt;br /&gt;
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],&lt;br /&gt;
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below&lt;br /&gt;
[[WikiPedia:Fundamental_frequency|the fundamental]].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],&lt;br /&gt;
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],&lt;br /&gt;
[[WikiPedia:Jitter#Sampling_jitter|jitter]],&lt;br /&gt;
[[WikiPedia:Noise_floor|noise behavior]],&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Stairsteps==&lt;br /&gt;
[[Image:Dsat 006.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat 007.png|360px|right]]&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
We digitize our signal to&lt;br /&gt;
[[Videos/A_Digital_Media_Primer_For_Geeks#Raw_.28digital_audio.29_meat|16 bit PCM at 44.1kHz]],&lt;br /&gt;
same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
[[WikiPedia:High_impedance|high-impedance input]] being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
OK, 1 kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to [[WikiPedia:Nyquist_frequency|Nyquist]], say 15kHz.&lt;br /&gt;
&lt;br /&gt;
Now the sine wave is represented by fewer than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point]]. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
&lt;br /&gt;
[[Image:Dsat 008.png|360px|right]]&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
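&lt;br /&gt;
One way to watch the unique solution emerge from the samples is Whittaker-Shannon (sinc) interpolation. The following minimal Python sketch assumes a 15kHz sine sampled at 44.1kHz and a finite window of 2001 samples, so a small truncation error remains; the function names are illustrative, and this is not a description of how a real converter is built.&lt;br /&gt;

```python
import math

RATE = 44100.0
FREQ = 15000.0

def original(t):
    """The continuous-time input: a 15 kHz sine, t in seconds."""
    return math.sin(2.0 * math.pi * FREQ * t)

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, first_index, t):
    """Evaluate the bandlimited interpolation of the samples at time t (seconds)."""
    pos = t * RATE
    return sum(s * sinc(pos - (first_index + i)) for i, s in enumerate(samples))

HALF_SPAN = 1000
samples = [original(n / RATE) for n in range(-HALF_SPAN, HALF_SPAN + 1)]

# Evaluate halfway between two samples, in the middle of the window
t_between = 0.5 / RATE
error = abs(reconstruct(samples, -HALF_SPAN, t_between) - original(t_between))
print(error)
```

Even at fewer than three samples per cycle, the value recovered between the samples agrees with the original sine up to the truncation error, which shrinks as the window grows.&lt;br /&gt;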
&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
&lt;br /&gt;
So, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion]] is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].&lt;br /&gt;
&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==Bit-depth==&lt;br /&gt;
[[Image:Dsat_009.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat_010.jpg|260px|right]]&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
&#039;&#039;also&#039;&#039; smooth regardless of the [[WikiPedia:Audio_bit_depth|bit depth]].  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
[[WikiPedia:Dither|dither]] down to 8 bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the 8-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we [[WikiPedia:Quantization_(sound_processing)|quantize]] it,&lt;br /&gt;
and [[WikiPedia:Quantization_error|quantization adds noise]].&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor.&lt;br /&gt;
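&lt;br /&gt;
The relationship between bit depth and noise floor can be checked numerically. This minimal Python sketch makes several assumptions: it uses plain rounding without dither, a full-scale 997Hz test tone, and the textbook estimate of 6.02 dB per bit plus 1.76 dB for that case. The function names are made up for illustration.&lt;br /&gt;

```python
import math

def quantize(x, bits):
    """Round x (full scale is -1.0 to 1.0) to a grid with step 2 / 2**bits, no dither."""
    step = 2.0 / (2 ** bits)
    return math.floor(x / step + 0.5) * step

def measured_snr_db(bits, freq=997.0, rate=44100.0, count=44100):
    """Quantize a full-scale sine and compare signal power to error power, in dB."""
    signal_power = 0.0
    noise_power = 0.0
    for n in range(count):
        x = math.sin(2.0 * math.pi * freq * n / rate)
        err = quantize(x, bits) - x
        signal_power += x * x
        noise_power += err * err
    return 10.0 * math.log10(signal_power / noise_power)

def rule_of_thumb_snr_db(bits):
    """Textbook estimate for an undithered full-scale sine."""
    return 6.02 * bits + 1.76

snr_8 = measured_snr_db(8)
snr_16 = measured_snr_db(16)
print(snr_8, rule_of_thumb_snr_db(8))
print(snr_16, rule_of_thumb_snr_db(16))
```

Each extra bit buys about 6 dB, so going from 8 to 16 bits lowers the noise by roughly 48 dB. Adding dither raises the total noise power somewhat, a trade discussed in the Dither section.&lt;br /&gt;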
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our 8-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a [[WikiPedia:Dither#Different_types|gaussian dither]] then it&#039;s&lt;br /&gt;
[[WikiPedia:Central_limit_theorem|mathematically equivalent]] in every way. It &#039;&#039;is&#039;&#039; tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of [[WikiPedia:Magnetic_tape_sound_recording|magnetic audio tape]]&lt;br /&gt;
in [[WikiPedia:Shannon–Hartley_theorem#Examples|bits instead of decibels]], in order to put things in a&lt;br /&gt;
digital perspective.  [[WikiPedia:Compact cassettes|Compact cassettes]] (for those of you who are old enough to remember them) could reach as&lt;br /&gt;
deep as 9 bits in perfect conditions, though 5 to 6 bits was&lt;br /&gt;
more typical, especially if it was a recording made on a&lt;br /&gt;
[[WikiPedia:Cassette_deck|tape deck]]. That&#039;s right... your mix tapes were only about 6 bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional [[WikiPedia:Reel-to-reel_audio_tape_recording|open reel tape]] used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits &#039;&#039;with&#039;&#039; [[WikiPedia:Reel-to-reel_audio_tape_recording#Noise_reduction|advanced noise reduction]].  And&lt;br /&gt;
that&#039;s why seeing &#039;[[WikiPedia:SPARS_code|D D D]]&#039; on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==Dither==&lt;br /&gt;
[[Image:Dsat_011.png|360px|right]]&lt;br /&gt;
[[Image:Dsat_012.gif|360px|right]]&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with [[WikiPedia:Dither|dither]], so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  [[WikiPedia:Rounding|Obvious]],&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[show/attribute the dither paper]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t [[WikiPedia:Sound_masking|drown out or mask]]&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s &#039;&#039;watch&#039;&#039; what dither does.  The signal generator has too much noise for this test so we&#039;ll produce a mathematically perfect sine wave with the ThinkPad and quantize it to 8 bits with dithering.&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display and output scope and, once the analog spectrum analyzer catches up...&lt;br /&gt;
a clean frequency peak with a uniform noise floor on both spectral displays&lt;br /&gt;
just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off.&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor, piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At 8 bits this effect is exaggerated. At 16 bits,&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and ... there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
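&lt;br /&gt;
The quarter-bit case is easy to reproduce. This minimal Python sketch assumes TPDF dither built from two summed uniform random values, a common choice; this transcript does not say which dither the demo application uses, and all names here are illustrative.&lt;br /&gt;

```python
import math
import random

def quantize(x, rng=None):
    """Round x, measured in quantization steps, to the nearest integer level.

    If rng is given, add TPDF dither (two uniform values summed) before rounding.
    """
    if rng is not None:
        x += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return math.floor(x + 0.5)

RATE = 44100
FREQ = 1000
AMPLITUDE = 0.25  # a quarter of one quantization step

rng = random.Random(0)  # fixed seed so the run is repeatable
tone = [math.sin(2.0 * math.pi * FREQ * n / RATE) for n in range(RATE)]
plain = [quantize(AMPLITUDE * s) for s in tone]
dithered = [quantize(AMPLITUDE * s, rng) for s in tone]

def tone_amplitude(samples):
    """Estimate how much of the 1 kHz tone is present by projecting onto it."""
    return 2.0 * sum(q * s for q, s in zip(samples, tone)) / len(samples)

plain_level = tone_amplitude(plain)
dithered_level = tone_amplitude(dithered)
print(plain_level, dithered_level)
```

Without dither every sample rounds to zero and the tone is gone. With dither the individual samples are noisy, but the tone is still present at close to its original quarter-step amplitude.&lt;br /&gt;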
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and&lt;br /&gt;
[[WikiPedia:Absolute_threshold_of_hearing|difficult to notice]]&lt;br /&gt;
as possible.&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
We can [[WikiPedia:Noise_shaping|shape dithering noise]] away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
Lastly, dithered quantization noise &#039;&#039;is&#039;&#039; higher [[WikiPedia:Sound_power|power]] overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a [[WikiPedia:VU_meter|VU meter]] during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll also [[WikiPedia:Amplitude_modulation|modulate the input signal]] like this to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise.&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below [[WikiPedia:Full_scale|full scale]].  Maybe&lt;br /&gt;
if the CD had been&lt;br /&gt;
[http://www.research.philips.com/technologies/projects/cd/index.html 14 bits as originally designed],&lt;br /&gt;
dither &#039;&#039;might&#039;&#039; be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==Bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows..&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
..and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
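&lt;br /&gt;
The ripple can be reproduced by summing only the harmonics that survive the cutoff. This is a minimal Python sketch, assuming an ideal brick-wall cutoff at exactly 20kHz (the real filter is only &quot;quite sharp&quot;) and using illustrative names:&lt;br /&gt;

```python
import math

F0 = 1000.0        # square wave fundamental in Hz
CUTOFF = 20000.0   # assumed brick-wall cutoff in Hz

def bandlimited_square(t):
    """Sum the odd harmonics of a unit square wave that lie below CUTOFF."""
    total = 0.0
    k = 1
    while CUTOFF > k * F0:          # keep harmonics 1, 3, 5, ... under the cutoff
        total += math.sin(2.0 * math.pi * k * F0 * t) / k
        k += 2
    return 4.0 / math.pi * total

# Scan the first half cycle for the highest point of the waveform.
steps = 2000
peak = max(bandlimited_square(0.0005 * i / steps) for i in range(steps + 1))
print(peak)   # above 1.0: the ripple overshoots the flat top
```

Only ten harmonics (1kHz through 19kHz) fit under the assumed cutoff, and the overshoot near each edge is the Gibbs effect described above.&lt;br /&gt;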
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal &#039;&#039;is&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples.. implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
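&lt;br /&gt;
The same point can be checked numerically. This Python sketch uses a sine wave rather than a square wave edge to keep the arithmetic short; the 0.3 sample delay and all names are assumptions for illustration. It samples a tone delayed by a fraction of a sample and then recovers that delay from the samples alone:&lt;br /&gt;

```python
import math

FS = 44100
F = 1000.0          # 441 samples hold exactly 10 cycles of this tone
N = 441
DELAY = 0.3         # true delay in samples, deliberately not a whole number

omega = 2.0 * math.pi * F / FS      # phase advance per sample, in radians
samples = [math.sin(omega * (n - DELAY)) for n in range(N)]

# Correlate against quadrature references to measure the phase of the tone.
i_sum = sum(s * math.cos(omega * n) for n, s in enumerate(samples))
q_sum = sum(s * math.sin(omega * n) for n, s in enumerate(samples))
phase = math.atan2(-i_sum, q_sum)   # phase lag of the sampled tone
estimated_delay = phase / omega
print(estimated_delay)              # very close to 0.3
```

The delay comes back with far finer resolution than one sample period, because the samples describe a bandlimited signal and not a set of dots to connect.&lt;br /&gt;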
&lt;br /&gt;
==Epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13939</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13939"/>
		<updated>2013-02-26T05:30:55Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Links, mostly wikipedia&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;small&amp;gt;&#039;&#039;Wiki edition&#039;&#039;&amp;lt;/small&amp;gt;&lt;br /&gt;
[[Image:dsat_001.jpg|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Continuing in the &amp;quot;firehose&amp;quot; tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org&#039;s second video on digital media explores multiple facets of digital audio signals and how they &#039;&#039;really&#039;&#039; behave in the real world.&lt;br /&gt;
&lt;br /&gt;
Demonstrations of sampling, quantization, bit-depth, and dither put digital audio through its paces on consumer-grade audio equipment using both modern digital analysis and vintage analog equipment (just in case we can&#039;t trust those newfangled digital gizmos). You can download the demo application source code and try it all for yourself!&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;center&amp;gt;&amp;lt;font size=&amp;quot;+2&amp;quot;&amp;gt;[http://www.xiph.org/video/vid2.shtml Download or Watch online]&amp;lt;/font&amp;gt;&amp;lt;/center&amp;gt;&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
Players supporting WebM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox], [http://www.chromium.org/Home Chrome], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]&lt;br /&gt;
&lt;br /&gt;
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]&lt;br /&gt;
&lt;br /&gt;
If you&#039;re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.&lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;hr/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
[[Image:Xiph_ep02_test.png|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest, let&#039;s&lt;br /&gt;
take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.&lt;br /&gt;
&lt;br /&gt;
==Veritas ex machina==&lt;br /&gt;
[[Image:Dsat_002.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_003.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_004.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_005.jpg|200px|right]]&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
First up, we need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],&lt;br /&gt;
like this Tektronix 2246 from the mid-90s, one of the last and very best analog scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an&lt;br /&gt;
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1 [[WikiPedia:Hertz#SI_multiples|kHz]]&lt;br /&gt;
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].&lt;br /&gt;
We see the sine wave on the oscilloscope and can verify that it is indeed&lt;br /&gt;
1 kHz at 1 Volt RMS, which is 2.8 Volts&lt;br /&gt;
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
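&lt;br /&gt;
The 2.8 Volt figure follows from the relationship between RMS and peak amplitude for a pure sine wave. A quick sketch of the arithmetic in Python (the function name is only for illustration):&lt;br /&gt;

```python
import math

def sine_rms_to_peak_to_peak(v_rms):
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    peak = v_rms * math.sqrt(2.0)   # sine peak is sqrt(2) times its RMS value
    return 2.0 * peak               # peak-to-peak spans +peak to -peak

v_pp = sine_rms_to_peak_to_peak(1.0)
print(round(v_pp, 2))  # 2.83
```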
&lt;br /&gt;
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]&lt;br /&gt;
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],&lt;br /&gt;
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below&lt;br /&gt;
[[WikiPedia:Fundamental_frequency|the fundamental]].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],&lt;br /&gt;
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],&lt;br /&gt;
[[WikiPedia:Jitter#Sampling_jitter|jitter]],&lt;br /&gt;
[[WikiPedia:Noise_floor|noise behavior]],&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Stairsteps==&lt;br /&gt;
[[Image:Dsat 006.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat 007.png|360px|right]]&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
We digitize our signal to&lt;br /&gt;
[[Videos/A_Digital_Media_Primer_For_Geeks#Raw_.28digital_audio.29_meat|16 bit PCM at 44.1kHz]],&lt;br /&gt;
same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
[[WikiPedia:High_impedance|high-impedance input]] being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
OK, 1 kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to [[WikiPedia:Nyquist_frequency|Nyquist]], say 15kHz.&lt;br /&gt;
&lt;br /&gt;
Now the sine wave is represented by less than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point]]. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
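&lt;br /&gt;
To make the uniqueness claim concrete, here is a minimal sketch of Whittaker-Shannon (sinc) interpolation using only the Python standard library. The window length, test frequency, and function names are assumptions for illustration, and a finite window only approximates the ideal infinite sum, so the match is close but not exact:&lt;br /&gt;

```python
import math

FS = 44100.0      # sample rate in Hz (same as a CD)
F = 15000.0       # test tone in Hz, below the Nyquist frequency of FS / 2
N = 4001          # number of samples in the finite window (assumed)

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

# Sample the "analog" input at instants n / FS.
samples = [math.sin(2.0 * math.pi * F * n / FS) for n in range(N)]

def reconstruct(t):
    """Evaluate the sinc interpolation of the samples at time t (seconds)."""
    return sum(s * sinc(t * FS - n) for n, s in enumerate(samples))

# Pick a time halfway between two samples near the middle of the window.
t_mid = (N // 2 + 0.5) / FS
original = math.sin(2.0 * math.pi * F * t_mid)
rebuilt = reconstruct(t_mid)
error = abs(original - rebuilt)
print(error)      # small; what remains comes from truncating the sum
```

Even with fewer than three samples per cycle, the interpolated value between samples lands on the original sine wave to within the truncation error.&lt;br /&gt;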
&lt;br /&gt;
[[Image:Dsat 008.png|360px|right]]&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
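&lt;br /&gt;
As a rough illustration, a zero-order hold takes only a few lines of Python. The function name and the hold factor are assumptions for the example, not a description of any particular converter:&lt;br /&gt;

```python
def zero_order_hold(samples, factor):
    """Repeat each sample value `factor` times, producing a stairstep."""
    held = []
    for value in samples:
        held.extend([value] * factor)   # hold the value until the next sample
    return held

stairstep = zero_order_hold([0.0, 0.7, 1.0, 0.7], 3)
print(stairstep)
```

The held stairstep is an intermediate signal; it still has to be lowpass filtered before it resembles the analog output.&lt;br /&gt;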
&lt;br /&gt;
So, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion]] is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].&lt;br /&gt;
&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==Bit-depth==&lt;br /&gt;
[[Image:Dsat_009.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat_010.jpg|260px|right]]&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
&#039;&#039;also&#039;&#039; smooth regardless of the [[WikiPedia:Audio_bit_depth|bit depth]].  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
[[WikiPedia:Dither|dither]] down to 8 bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the 8-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we [[WikiPedia:Quantization_(sound_processing)|quantize]] it,&lt;br /&gt;
and [[WikiPedia:Quantization_error|quantization adds noise]].&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor.&lt;br /&gt;
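&lt;br /&gt;
The link between bit depth and noise floor can be checked numerically. This Python sketch rounds a full-scale sine to a given number of bits and measures the signal-to-noise ratio. It uses plain rounding without dither, ignores clipping at the top code, and the test frequency and names are assumptions; under those assumptions each extra bit should lower the noise by roughly 6 dB:&lt;br /&gt;

```python
import math

FS = 44100        # sample rate in Hz
F = 997.0         # test tone in Hz (assumed; chosen so the phases do not repeat)

def quantization_snr_db(bits):
    """SNR in dB of a full-scale sine rounded to the given bit depth."""
    step = 2.0 / (2 ** bits)            # quantizer step for a -1..+1 range
    signal_power = 0.0
    noise_power = 0.0
    for n in range(FS):                 # one second of audio
        x = math.sin(2.0 * math.pi * F * n / FS)
        q = round(x / step) * step      # plain rounding, no dither
        signal_power += x * x
        noise_power += (q - x) ** 2
    return 10.0 * math.log10(signal_power / noise_power)

snr_8 = quantization_snr_db(8)
snr_16 = quantization_snr_db(16)
print(round(snr_8, 1), round(snr_16, 1))
```

The gap between the two printed values is the difference eight extra bits make to the noise floor.&lt;br /&gt;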
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our 8-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a [[WikiPedia:Dither#Different_types|Gaussian dither]] then it&#039;s&lt;br /&gt;
[[WikiPedia:Central_limit_theorem|mathematically equivalent]] in every way. It &#039;&#039;is&#039;&#039; tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of [[WikiPedia:Magnetic_tape_sound_recording|magnetic audio tape]]&lt;br /&gt;
in [[WikiPedia:Shannon–Hartley_theorem#Examples|bits instead of decibels]], in order to put things in a&lt;br /&gt;
digital perspective.  [[WikiPedia:Compact cassettes|Compact cassettes]] (for those of you who are old enough to remember them) could reach as&lt;br /&gt;
deep as 9 bits in perfect conditions, though 5 to 6 bits was&lt;br /&gt;
more typical, especially if it was a recording made on a&lt;br /&gt;
[[WikiPedia:Cassette_deck|tape deck]]. That&#039;s right... your mix tapes were only about 6 bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional [[WikiPedia:Reel-to-reel_audio_tape_recording|open reel tape]] used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits &#039;&#039;with&#039;&#039; [[WikiPedia:Reel-to-reel_audio_tape_recording#Noise_reduction|advanced noise reduction]].  And&lt;br /&gt;
that&#039;s why seeing &#039;[[WikiPedia:SPARS_code|D D D]]&#039; on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==Dither==&lt;br /&gt;
[[Image:Dsat_011.png|360px|right]]&lt;br /&gt;
[[Image:Dsat_012.gif|360px|right]]&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with [[WikiPedia:Dither|dither]], so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  [[WikiPedia:Rounding|Obvious]],&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[show/attribute the dither paper]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t [[WikiPedia:Sound_masking|drown out or mask]]&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s &#039;&#039;watch&#039;&#039; what dither does.  The signal generator has too much noise for this test so we&#039;ll produce a mathematically perfect sine wave with the ThinkPad and quantize it to 8 bits with dithering.&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display and output scope and, once the analog spectrum analyzer catches up...&lt;br /&gt;
a clean frequency peak with a uniform noise floor on both spectral displays&lt;br /&gt;
just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off.&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor, piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At 8 bits this effect is exaggerated. At 16 bits,&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and ... there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
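&lt;br /&gt;
Here is a minimal Python sketch of the quarter-bit experiment. It assumes triangular (TPDF) dither two steps wide, a common textbook choice and not necessarily what the demo application uses; the names and the seed are illustrative:&lt;br /&gt;

```python
import math
import random

rng = random.Random(1)    # fixed seed so the run is repeatable
FS = 44100
F = 1000.0                # 44100 samples hold exactly 1000 cycles
AMPLITUDE = 0.25          # signal peak, in units of one quantizer step (LSB)

def tpdf():
    """Triangular dither spanning -1..+1 LSB (difference of two uniforms)."""
    return rng.random() - rng.random()

def correlate(output):
    """Amplitude of the 1 kHz component in the output, in LSB."""
    acc = sum(y * math.sin(2.0 * math.pi * F * n / FS)
              for n, y in enumerate(output))
    return 2.0 * acc / len(output)

signal = [AMPLITUDE * math.sin(2.0 * math.pi * F * n / FS) for n in range(FS)]
plain = [round(x) for x in signal]              # undithered: nearest whole step
dithered = [round(x + tpdf()) for x in signal]  # dither added before rounding

print(max(abs(y) for y in plain))   # 0: every sample quantized to zero
print(correlate(dithered))          # close to 0.25: the tone survives
```

Without dither the output is pure silence; with dither the tone is still present in the output at its original level, buried in noise.&lt;br /&gt;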
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and&lt;br /&gt;
[[WikiPedia:Absolute_threshold_of_hearing|difficult to notice]]&lt;br /&gt;
as possible.&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
We can [[WikiPedia:Noise_shaping|shape dithering noise]] away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
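&lt;br /&gt;
The simplest form of noise shaping is first-order error feedback, sketched below in Python. This is an assumption-laden toy: it only pushes noise toward high frequencies in general, whereas practical shapers typically use higher-order filters matched to the ear&#039;s sensitivity. The names, seed, and constant test input are illustrative:&lt;br /&gt;

```python
import random

rng = random.Random(2)     # fixed seed for repeatability

def tpdf():
    """Triangular dither spanning -1..+1 LSB."""
    return rng.random() - rng.random()

def quantize_flat(xs):
    """Dithered rounding to whole steps, with no feedback."""
    return [round(x + tpdf()) for x in xs]

def quantize_shaped(xs):
    """First-order error feedback: subtract the previous step's error."""
    out = []
    prev_error = 0.0
    for x in xs:
        v = x - prev_error           # pre-compensate with the last error
        y = round(v + tpdf())
        prev_error = y - v           # total error made on this step
        out.append(y)
    return out

xs = [0.3] * 20000                   # constant input, in LSB
flat_sum = sum(y - x for x, y in zip(xs, quantize_flat(xs)))
shaped_sum = sum(y - x for x, y in zip(xs, quantize_shaped(xs)))
print(abs(flat_sum), abs(shaped_sum))
```

With feedback, the running sum of the error never exceeds about 1.5 steps, which means very little error energy is left at low frequencies; without feedback, the sum wanders like a random walk and is typically much larger.&lt;br /&gt;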
&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
Lastly, dithered quantization noise &#039;&#039;is&#039;&#039; higher [[WikiPedia:Sound_power|power]] overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a [[WikiPedia:VU_meter|VU meter]] during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll also [[WikiPedia:Amplitude_modulation|modulate the input signal]] like this to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise.&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below [[WikiPedia:Full_scale|full scale]].  Maybe&lt;br /&gt;
if the CD had been&lt;br /&gt;
[http://www.research.philips.com/technologies/projects/cd/index.html 14 bits as originally designed],&lt;br /&gt;
dither &#039;&#039;might&#039;&#039; be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==Bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows..&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
...and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal &#039;&#039;is&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples.. implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
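&lt;br /&gt;
Here is a rough way to check the timing claim numerically. This Python sketch builds a bandlimited 1kHz square wave, delays its edges by a fraction of one sample period, samples it at 44.1kHz, and then reads the delay back from the sampled data alone. The function names, the 10 ms block length, and the idea of measuring the delay from the phase of the fundamental are illustration choices of this sketch, not part of the demo software.&lt;br /&gt;
&lt;br /&gt;
```python
import cmath
import math

FS = 44100.0   # sample rate in Hz
F0 = 1000.0    # square wave fundamental in Hz
N = 441        # 10 ms of samples: exactly 10 cycles of F0

def bandlimited_square(t):
    # Odd harmonics of F0 up to 19 kHz, all below the Nyquist frequency.
    return (4.0 / math.pi) * sum(
        math.sin(2.0 * math.pi * h * F0 * t) / h for h in range(1, 20, 2))

def measured_shift(shift_in_samples):
    # Sample a copy of the wave whose edges are delayed by a fraction of a
    # sample period, then read the delay back from the phase of the
    # fundamental in the sampled data alone.
    x = [bandlimited_square((n - shift_in_samples) / FS) for n in range(N)]
    bin_f0 = sum(v * cmath.exp(-2j * math.pi * F0 * n / FS) for n, v in enumerate(x))
    delay_seconds = -(cmath.phase(bin_f0) + math.pi / 2.0) / (2.0 * math.pi * F0)
    return delay_seconds * FS

for shift in (0.0, 0.25, 0.5):
    print(shift, round(measured_shift(shift), 6))
```
The measured delay tracks the requested one to far better than a sample period, even though none of the shifted edges lands on a sample instant.&lt;br /&gt;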
&lt;br /&gt;
==Epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13938</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13938"/>
		<updated>2013-02-26T05:07:58Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: typo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;small&amp;gt;&#039;&#039;Wiki edition&#039;&#039;&amp;lt;/small&amp;gt;&lt;br /&gt;
[[Image:dsat_001.jpg|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Continuing in the &amp;quot;firehose&amp;quot; tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org&#039;s second video on digital media explores multiple facets of digital audio signals and how they &#039;&#039;really&#039;&#039; behave in the real world.&lt;br /&gt;
&lt;br /&gt;
Demonstrations of sampling, quantization, bit-depth, and dither put digital audio through its paces on consumer-grade audio equipment using both modern digital analysis and vintage analog equipment (just in case we can&#039;t trust those newfangled digital gizmos). You can download the demo application source code and try it all for yourself!&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;center&amp;gt;&amp;lt;font size=&amp;quot;+2&amp;quot;&amp;gt;[http://www.xiph.org/video/vid2.shtml Download or Watch online]&amp;lt;/font&amp;gt;&amp;lt;/center&amp;gt;&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
Players supporting WEBM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox ], [http://www.chromium.org/Home Chrome ], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]&lt;br /&gt;
&lt;br /&gt;
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]&lt;br /&gt;
&lt;br /&gt;
If you&#039;re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.&lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;hr/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
[[Image:Xiph_ep02_test.png|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest, let&#039;s&lt;br /&gt;
take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.&lt;br /&gt;
&lt;br /&gt;
==Veritas ex machina==&lt;br /&gt;
[[Image:Dsat_002.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_003.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_004.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_005.jpg|200px|right]]&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
First up, we need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],&lt;br /&gt;
like this Tektronix 2246 from the mid-90s, one of the last and very best analog scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an&lt;br /&gt;
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1 [[WikiPedia:Hertz#SI_multiples|kHz]]&lt;br /&gt;
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].&lt;br /&gt;
We see the sine wave on the oscilloscope and can verify that it is indeed&lt;br /&gt;
1 kHz at 1 Volt RMS, which is 2.8 Volts&lt;br /&gt;
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
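&lt;br /&gt;
The RMS to peak-to-peak figure is easy to check by hand. A minimal Python sketch, assuming a pure sine wave (the function name is only for illustration):&lt;br /&gt;
&lt;br /&gt;
```python
import math

def sine_rms_to_peak_to_peak(v_rms):
    # For a pure sine wave, peak = RMS * sqrt(2), and peak-to-peak = 2 * peak.
    return 2.0 * math.sqrt(2.0) * v_rms

v_pp = sine_rms_to_peak_to_peak(1.0)
print(round(v_pp, 2))  # 2.83
```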
&lt;br /&gt;
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]&lt;br /&gt;
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],&lt;br /&gt;
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below&lt;br /&gt;
[[WikiPedia:Fundamental_frequency|the fundamental]].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],&lt;br /&gt;
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],&lt;br /&gt;
[[WikiPedia:Jitter#Sampling_jitter|jitter]],&lt;br /&gt;
[[WikiPedia:Noise_floor|noise behavior]],&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Stairsteps==&lt;br /&gt;
[[Image:Dsat 006.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat 007.png|360px|right]]&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
We digitize our signal to&lt;br /&gt;
[[Videos/A_Digital_Media_Primer_For_Geeks#Raw_.28digital_audio.29_meat|16 bit PCM at 44.1kHz]],&lt;br /&gt;
same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
[[WikiPedia:High_impedance|high-impedance input]] being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
OK, 1 kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to [[WikiPedia:Nyquist_frequency|Nyquist]], say 15kHz.&lt;br /&gt;
&lt;br /&gt;
Now the sine wave is represented by less than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point]]. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
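&lt;br /&gt;
The uniqueness claim can be tried out numerically. The Python sketch below samples a 15kHz sine at 44.1kHz and rebuilds the values halfway between samples with sinc interpolation (the Whittaker-Shannon formula). It is an idealized model rather than what the eMagic hardware does, and it assumes a finite window of samples, so the result is only approximately exact; the window size and helper names are arbitrary choices for illustration.&lt;br /&gt;
&lt;br /&gt;
```python
import math

FS = 44100.0      # sample rate in Hz (CD rate, as in the demo)
F0 = 15000.0      # test tone in Hz, below the 22050 Hz Nyquist frequency

def tone(t):
    # The original continuous-time, bandlimited input.
    return math.sin(2.0 * math.pi * F0 * t)

def sinc(x):
    # Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1.
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, first_index, t):
    # Whittaker-Shannon interpolation: a sum of sincs, one per sample.
    pos = t * FS
    return sum(s * sinc(pos - (first_index + k)) for k, s in enumerate(samples))

N = 2000  # samples on each side of t = 0; an arbitrary truncation
samples = [tone(n / FS) for n in range(-N, N + 1)]

# Evaluate halfway between sample instants, near the middle of the window.
worst = max(abs(reconstruct(samples, -N, (m + 0.5) / FS) - tone((m + 0.5) / FS))
            for m in range(-3, 3))
print(worst)  # small; the leftover error comes from truncating the sum
```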
&lt;br /&gt;
[[Image:Dsat 008.png|360px|right]]&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
&lt;br /&gt;
So, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion]] is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].&lt;br /&gt;
&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==Bit-depth==&lt;br /&gt;
[[Image:Dsat_009.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat_010.jpg|260px|right]]&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
&#039;&#039;also&#039;&#039; smooth regardless of the [[WikiPedia:Audio_bit_depth|bit depth]].  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
[[WikiPedia:Dither|dither]] down to 8 bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the 8-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we [[WikiPedia:Quantization_(sound_processing)|quantize]] it,&lt;br /&gt;
and [[WikiPedia:Quantization_error|quantization adds noise]].&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor.&lt;br /&gt;
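&lt;br /&gt;
To put rough numbers on that, here is a small Python sketch that quantizes a full-scale sine by plain rounding (no dither, unlike the demo) and measures the signal-to-noise ratio. It prints the measurement next to the textbook estimate of 6.02 dB per bit plus 1.76 dB for a full-scale sine. The test frequency and sample count are arbitrary choices, and clipping at the very top code is ignored.&lt;br /&gt;
&lt;br /&gt;
```python
import math

def quantize(x, bits):
    # Round to the nearest step, with 2**(bits - 1) steps per unit amplitude.
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale

def snr_db(bits, n=20000):
    # Full-scale sine at a frequency that does not repeat within n samples.
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = math.sin(2.0 * math.pi * 0.1234567 * i)
        err = quantize(x, bits) - x
        signal_power += x * x
        noise_power += err * err
    return 10.0 * math.log10(signal_power / noise_power)

for bits in (8, 16):
    # Columns: bit depth, measured SNR in dB, textbook estimate in dB.
    print(bits, round(snr_db(bits), 1), round(6.02 * bits + 1.76, 1))
```
Each extra bit lowers the noise floor by about 6 dB, so the 16-bit floor sits roughly 48 dB below the 8-bit one.&lt;br /&gt;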
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our 8-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a [[WikiPedia:Dither#Different_types|Gaussian dither]] then it&#039;s&lt;br /&gt;
[[WikiPedia:Central_limit_theorem|mathematically equivalent]] in every way. It &#039;&#039;is&#039;&#039; tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of [[WikiPedia:Magnetic_tape_sound_recording|magnetic audio tape]]&lt;br /&gt;
in [[WikiPedia:Shannon–Hartley_theorem#Examples|bits instead of decibels]], in order to put things in a&lt;br /&gt;
digital perspective.  [[WikiPedia:Compact cassettes|Compact cassettes]] (for those of you who are old enough to remember them) could reach as&lt;br /&gt;
deep as 9 bits in perfect conditions, though 5 to 6 bits was&lt;br /&gt;
more typical, especially if it was a recording made on a&lt;br /&gt;
[[WikiPedia:Cassette_deck|tape deck]]. That&#039;s right... your mix tapes were only about 6 bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional [[WikiPedia:Reel-to-reel_audio_tape_recording|open reel tape]] used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits &#039;&#039;with&#039;&#039; [[WikiPedia:Reel-to-reel_audio_tape_recording#Noise_reduction|advanced noise reduction]].  And&lt;br /&gt;
that&#039;s why seeing &#039;[[WikiPedia:SPARS_code|D D D]]&#039; on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==Dither==&lt;br /&gt;
[[Image:Dsat_011.png|360px|right]]&lt;br /&gt;
[[Image:Dsat_012.gif|360px|right]]&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with [[Wikipedia:dither|dither]], so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  Obvious,&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[show/attribute the dither paper]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t drown out or mask&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s &#039;&#039;watch&#039;&#039; what dither does.  The signal generator has too much noise for this test, so we&#039;ll produce a mathematically perfect sine wave with the ThinkPad and quantize it to eight bits with dithering.&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display and output scope and, once the analog spectrum analyzer catches up...&lt;br /&gt;
a clean frequency peak with a uniform noise floor on both spectral displays&lt;br /&gt;
&lt;br /&gt;
just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off.&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor, piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
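&lt;br /&gt;
The same experiment can be approximated in a few lines of Python. This sketch quantizes a sine to 8 bits with and without dither and reports how much of the error energy lands exactly on multiples of the input frequency, which is where harmonic distortion lives. It assumes TPDF dither (two uniform random values summed), which may not be the dither used in the demo, and the block length, amplitude, and function names are arbitrary illustration choices.&lt;br /&gt;
&lt;br /&gt;
```python
import cmath
import math
import random

N = 1024          # number of samples analyzed
PERIOD = 32       # samples per cycle of the test sine
SCALE = 2 ** 7    # 8-bit quantization: 128 steps per unit amplitude

def quantize(x, dither, rng):
    # Optional TPDF dither: two uniform values summed, spanning +/- 1 step.
    d = (rng.random() - 0.5) + (rng.random() - 0.5) if dither else 0.0
    return round(x * SCALE + d) / SCALE

def harmonic_fraction(dither, seed=1):
    # Fraction of the error energy that sits exactly on multiples of the
    # input frequency (DC and the fundamental included).
    rng = random.Random(seed)
    err = []
    for i in range(N):
        x = 0.9 * math.sin(2.0 * math.pi * (i % PERIOD) / PERIOD)
        err.append(quantize(x, dither, rng) - x)
    total = sum(e * e for e in err)
    on_harmonics = 0.0
    for k in range(0, N, N // PERIOD):
        bin_k = sum(e * cmath.exp(-2j * math.pi * k * i / N) for i, e in enumerate(err))
        on_harmonics += abs(bin_k) ** 2 / N
    return on_harmonics / total

print(harmonic_fraction(dither=False))  # essentially all of the error energy
print(harmonic_fraction(dither=True))   # only a small share
```
Without dither the error repeats with the input, so all of its energy piles up on harmonics; with dither it spreads across the whole spectrum.&lt;br /&gt;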
&lt;br /&gt;
At eight bits this effect is exaggerated. At sixteen bits,&lt;br /&gt;
&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and...&lt;br /&gt;
&lt;br /&gt;
there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
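&lt;br /&gt;
A quick Python sketch of the quarter-bit case: a sine whose peak is one quarter of a quantization step is rounded to integers, and the output is then correlated against the input tone to see how much of it survived. TPDF dither is assumed here (the demo may use a different dither), and the tone period, sample count, and function name are illustration choices.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

def recovered_amplitude(dither, n=20000, seed=1):
    # Input: a sine whose peak is only a quarter of one quantization step.
    # Values are expressed in units of one step, so quantizing is round().
    rng = random.Random(seed)
    acc = 0.0
    for i in range(n):
        ref = math.sin(2.0 * math.pi * i / 100.0)
        d = (rng.random() - 0.5) + (rng.random() - 0.5) if dither else 0.0
        y = round(0.25 * ref + d)
        acc += y * ref          # correlate the output against the input tone
    return 2.0 * acc / n

print(recovered_amplitude(dither=False))  # 0.0: every sample rounded to zero
print(recovered_amplitude(dither=True))   # close to 0.25
```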
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and difficult to&lt;br /&gt;
notice as possible.&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
&lt;br /&gt;
We can shape dithering noise away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
Lastly, dithered quantization noise &#039;&#039;is&#039;&#039; higher power overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a VU meter during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll also modulate the input signal like this:&lt;br /&gt;
&lt;br /&gt;
...to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise:&lt;br /&gt;
&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below full scale.  Maybe&lt;br /&gt;
if the CD had been 14 bits as originally designed, dither &#039;&#039;might&#039;&#039; be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==Bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
..and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal _is_.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples.. implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
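&lt;br /&gt;
A minimal sketch of the representation half of that claim, assuming a 1kHz square wave bandlimited to 20kHz and sampled at 44.1kHz. The function names and the 0.3-sample delay are made up for illustration and are not taken from the demo application.&lt;br /&gt;

```python
import math

FS = 44100.0  # sample rate in Hz

def bandlimited_square(t, fundamental=1000.0, cutoff=20000.0):
    """Unit square wave built only from odd harmonics at or below the cutoff."""
    total = 0.0
    for k in range(1, int(cutoff // fundamental) + 1, 2):
        total += math.sin(2.0 * math.pi * k * fundamental * t) / k
    return 4.0 / math.pi * total

def sample_values(edge_delay, count=6):
    """Sample the wave after delaying its rising edge by edge_delay samples."""
    return [round(bandlimited_square((n - edge_delay) / FS), 3) for n in range(count)]

print(sample_values(0.0))  # rising edge exactly on sample 0
print(sample_values(0.3))  # rising edge 0.3 of a sample later
```

The two lists differ, so the position of the edge between sample instants is carried by the sample values themselves.&lt;br /&gt;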
&lt;br /&gt;
==Epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13937</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13937"/>
		<updated>2013-02-26T05:07:28Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Wikipedia links&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;small&amp;gt;&#039;&#039;Wiki edition&#039;&#039;&amp;lt;/small&amp;gt;&lt;br /&gt;
[[Image:dsat_001.jpg|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Continuing in the &amp;quot;firehose&amp;quot; tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org&#039;s second video on digital media explores multiple facets of digital audio signals and how they &#039;&#039;really&#039;&#039; behave in the real world.&lt;br /&gt;
&lt;br /&gt;
Demonstrations of sampling, quantization, bit-depth, and dither put digital audio through its paces on consumer-grade audio equipment using both modern digital analysis and vintage analog equipment (just in case we can&#039;t trust those newfangled digital gizmos). You can download the demo application source code and try it all for yourself!&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;center&amp;gt;&amp;lt;font size=&amp;quot;+2&amp;quot;&amp;gt;[http://www.xiph.org/video/vid2.shtml Download or Watch online]&amp;lt;/font&amp;gt;&amp;lt;/center&amp;gt;&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
Players supporting WebM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox], [http://www.chromium.org/Home Chrome], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]&lt;br /&gt;
&lt;br /&gt;
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]&lt;br /&gt;
&lt;br /&gt;
If you&#039;re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.&lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;hr/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
[[Image:Xiph_ep02_test.png|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest, let&#039;s&lt;br /&gt;
take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.&lt;br /&gt;
&lt;br /&gt;
==Veritas ex machina==&lt;br /&gt;
[[Image:Dsat_002.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_003.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_004.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_005.jpg|200px|right]]&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
First up, we need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],&lt;br /&gt;
like this Tektronix 2246 from the mid-90s, one of the last and very best analog scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an&lt;br /&gt;
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1 [[WikiPedia:Hertz#SI_multiples|kHz]]&lt;br /&gt;
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].&lt;br /&gt;
We see the sine wave on the oscilloscope and can verify that it is indeed&lt;br /&gt;
1 kHz at 1 Volt RMS, which is 2.8 Volts&lt;br /&gt;
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
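&lt;br /&gt;
As a quick check of that figure: for a pure sine wave, the peak-to-peak amplitude is 2 times the square root of 2 times the RMS value. A minimal sketch (the function name is only for illustration):&lt;br /&gt;

```python
import math

def sine_rms_to_peak_to_peak(v_rms):
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    return 2.0 * math.sqrt(2.0) * v_rms

print(round(sine_rms_to_peak_to_peak(1.0), 2))  # 2.83
```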
&lt;br /&gt;
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]&lt;br /&gt;
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],&lt;br /&gt;
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below&lt;br /&gt;
[[WikiPedia:Fundamental_frequency|the fundamental]].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],&lt;br /&gt;
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],&lt;br /&gt;
[[WikiPedia:Jitter#Sampling_jitter|jitter]],&lt;br /&gt;
[[WikiPedia:Noise_floor|noise behavior]],&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Stairsteps==&lt;br /&gt;
[[Image:Dsat 006.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat 007.png|360px|right]]&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
We digitize our signal to&lt;br /&gt;
[[Videos/A_Digital_Media_Primer_For_Geeks#Raw_.28digital_audio.29_meat|16 bit PCM at 44.1kHz]],&lt;br /&gt;
same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
[[WikiPedia:High_impedance|high-impedance input]] being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
OK, 1 kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to [[WikiPedia:Nyquist_frequency|Nyquist]], say 15kHz.&lt;br /&gt;
&lt;br /&gt;
Now the sine wave is represented by fewer than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point]]. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
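&lt;br /&gt;
A rough numerical sketch of that uniqueness, assuming ideal sinc interpolation (the Whittaker-Shannon formula) in place of a real converter&#039;s filters. The function names and the truncation to a few thousand samples are choices made for this example only.&lt;br /&gt;

```python
import math

FS = 44100.0      # sample rate in Hz, same as a CD
F_SINE = 15000.0  # test tone in Hz, under three samples per cycle

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def original(t):
    """The continuous-time input signal at time t (seconds)."""
    return math.sin(2.0 * math.pi * F_SINE * t)

def reconstruct(t, half_width=2000):
    """Approximate the bandlimited signal at time t from nearby samples.

    The ideal sum runs over every sample; truncating it to
    2 * half_width + 1 terms leaves a small residual error.
    """
    center = round(t * FS)
    total = 0.0
    for n in range(center - half_width, center + half_width + 1):
        total += original(n / FS) * sinc(t * FS - n)
    return total

# Evaluate halfway between two sample instants, where a stairstep
# drawing would be furthest from the truth.
t_mid = 100.5 / FS
print(original(t_mid), reconstruct(t_mid))
```

The two printed values should agree apart from the truncation error, even though no sample was taken at that instant.&lt;br /&gt;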
&lt;br /&gt;
[[Image:Dsat 008.png|360px|right]]&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
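&lt;br /&gt;
A zero-order hold is simple enough to sketch in a few lines. This is only an illustration on a list of numbers, not a model of any particular converter:&lt;br /&gt;

```python
def zero_order_hold(samples, steps_per_sample):
    """Repeat each sample value to make a stairstep on a finer time grid."""
    held = []
    for value in samples:
        held.extend([value] * steps_per_sample)
    return held

print(zero_order_hold([0.0, 0.7, 1.0, 0.7], 3))
# [0.0, 0.0, 0.0, 0.7, 0.7, 0.7, 1.0, 1.0, 1.0, 0.7, 0.7, 0.7]
```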
&lt;br /&gt;
So, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion]] is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].&lt;br /&gt;
&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==Bit-depth==&lt;br /&gt;
[[Image:Dsat_009.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat_010.jpg|260px|right]]&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
&#039;&#039;also&#039;&#039; smooth regardless of the [[WikiPedia:Audio_bit_depth|bit depth]].  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
[[WikiPedia:Dither|dither]] down to 8 bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the 8-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we [[WikiPedia:Quantization_(sound_processing)|quantize]] it,&lt;br /&gt;
and [[WikiPedia:Quantization_error|quantization adds noise]].&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor.&lt;br /&gt;
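&lt;br /&gt;
A small sketch of that relationship, assuming plain rounding with no dither and a full-scale 997Hz sine (both choices are assumptions made for this example). It compares the measured signal-to-noise ratio with the usual textbook estimate of about 6.02 dB per bit plus 1.76 dB.&lt;br /&gt;

```python
import math

def quantize(x, bits):
    """Round x (nominally in -1..1) to the nearest level of a `bits`-bit grid."""
    step = 2.0 / (2 ** bits)
    return step * round(x / step)

def measured_snr_db(bits, count=44100):
    """Signal-to-noise ratio, in dB, of a full-scale sine after plain rounding."""
    signal_power = 0.0
    noise_power = 0.0
    for n in range(count):
        x = math.sin(2.0 * math.pi * 997.0 * n / 44100.0)
        error = quantize(x, bits) - x
        signal_power += x * x
        noise_power += error * error
    return 10.0 * math.log10(signal_power / noise_power)

def rule_of_thumb_snr_db(bits):
    """Textbook estimate for a full-scale sine: 6.02 dB per bit plus 1.76 dB."""
    return 6.02 * bits + 1.76

for bits in (8, 16):
    print(bits, round(measured_snr_db(bits), 1), round(rule_of_thumb_snr_db(bits), 1))
```

Each extra bit lowers the noise floor by roughly 6 dB. Dithered quantization, as used in the demos, sits a few dB above these undithered figures.&lt;br /&gt;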
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our 8-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a [[WikiPedia:Dither#Different_types|Gaussian dither]] then it&#039;s&lt;br /&gt;
[[WikiPedia:Central_limit_theorem|mathematically equivalent]] in every way. It &#039;&#039;is&#039;&#039; tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of [[WikiPedia:Magnetic_tape_sound_recording|magnetic audio tape]]&lt;br /&gt;
in [[WikiPedia:Shannon–Hartley_theorem#Examples|bits instead of decibels]], in order to put things in a&lt;br /&gt;
digital perspective.  [[WikiPedia:Compact cassettes|Compact cassettes]] (for those of you who are old enough to remember them) could reach as&lt;br /&gt;
deep as 9 bits in perfect conditions, though 5 to 6 bits was&lt;br /&gt;
more typical, especially if it was a recording made on a&lt;br /&gt;
[[WikiPedia:Cassette_deck|tape deck]]. That&#039;s right... your mix tapes were only about 6 bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional [[WikiPedia:Reel-to-reel_audio_tape_recording|open reel tape]] used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits &#039;&#039;with&#039;&#039; [[WikiPedia:Reel-to-reel_audio_tape_recording#Noise_reduction|advanced noise reduction]].  And&lt;br /&gt;
that&#039;s why seeing &#039;[[WikiPedia:SPARS_code|D D D]]&#039; on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==Dither==&lt;br /&gt;
[[Image:Dsat_011.png|360px|right]]&lt;br /&gt;
[[Image:Dsat_012.gif|360px|right]]&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with [[Wikipedia:dither|dither]], so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  Obvious,&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t drown out or mask&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s &#039;&#039;watch&#039;&#039; what dither does.  The signal generator has too much noise for this test so we&#039;ll produce a mathematically perfect sine wave with the ThinkPad and quantize it to eight bits with dithering.&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display and output scope and, once the analog spectrum analyzer catches up...&lt;br /&gt;
a clean frequency peak with a uniform noise floor on both spectral displays&lt;br /&gt;
&lt;br /&gt;
just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off.&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At eight bits this effect is exaggerated. At sixteen bits,&lt;br /&gt;
&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and...&lt;br /&gt;
&lt;br /&gt;
there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
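&lt;br /&gt;
The same experiment can be sketched numerically. The triangular dither used here (the sum of two uniform random values, each spanning one quantizer step) is an assumption of this example; the text above does not say which dither the demo uses. The names and the fixed random seed are likewise illustrative.&lt;br /&gt;

```python
import math
import random

AMPLITUDE_LSB = 0.25  # sine amplitude, in units of one quantizer step
COUNT = 100000

rng = random.Random(0)  # fixed seed so the run is repeatable

def tpdf_dither():
    """Triangular dither spanning two steps: the sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def tone(n):
    """Unit-amplitude 997Hz sine at sample index n, 44.1kHz sample rate."""
    return math.sin(2.0 * math.pi * 997.0 * n / 44100.0)

# Quantize by rounding to the nearest whole step, without and with dither.
plain = [round(AMPLITUDE_LSB * tone(n)) for n in range(COUNT)]
dithered = [round(AMPLITUDE_LSB * tone(n) + tpdf_dither()) for n in range(COUNT)]

def recovered_amplitude(quantized):
    """Correlate the quantized samples against the tone to estimate its level."""
    return 2.0 * sum(q * tone(n) for n, q in enumerate(quantized)) / len(quantized)

print(recovered_amplitude(plain))     # 0.0: every sample rounded to zero
print(recovered_amplitude(dithered))  # close to 0.25, recovered from the noise
```

Without dither the quarter-bit tone vanishes entirely; with dither its amplitude can still be measured, at the cost of a higher noise floor.&lt;br /&gt;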
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and difficult to&lt;br /&gt;
notice as possible.&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
&lt;br /&gt;
We can shape dithering noise away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
Lastly, dithered quantization noise &#039;&#039;is&#039;&#039; higher power overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a VU meter during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll also modulate the input signal like this:&lt;br /&gt;
&lt;br /&gt;
...to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise:&lt;br /&gt;
&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below full scale.  Maybe&lt;br /&gt;
if the CD had been 14 bits as originally designed, dither &#039;&#039;might&#039;&#039; be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==Bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows..&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
..and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal _is_.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples.. implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
&lt;br /&gt;
==Epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13936</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13936"/>
		<updated>2013-02-26T04:49:43Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Wikipedia links&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;small&amp;gt;&#039;&#039;Wiki edition&#039;&#039;&amp;lt;/small&amp;gt;&lt;br /&gt;
[[Image:dsat_001.jpg|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Continuing in the &amp;quot;firehose&amp;quot; tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org&#039;s second video on digital media explores multiple facets of digital audio signals and how they &#039;&#039;really&#039;&#039; behave in the real world.&lt;br /&gt;
&lt;br /&gt;
Demonstrations of sampling, quantization, bit-depth, and dither put digital audio through its paces on consumer-grade audio equipment using both modern digital analysis and vintage analog equipment (Just in case we can&#039;t trust those newfangled digital gizmos). You can download the demo application source code and try it all for yourself!&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;center&amp;gt;&amp;lt;font size=&amp;quot;+2&amp;quot;&amp;gt;[http://www.xiph.org/video/vid2.shtml Download or Watch online]&amp;lt;/font&amp;gt;&amp;lt;/center&amp;gt;&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
Players supporting WEBM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox ], [http://www.chromium.org/Home Chrome ], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]&lt;br /&gt;
&lt;br /&gt;
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]&lt;br /&gt;
&lt;br /&gt;
If you&#039;re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.&lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;hr/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
[[Image:Xiph_ep02_test.png|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest, let&#039;s&lt;br /&gt;
take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.&lt;br /&gt;
&lt;br /&gt;
==Veritas ex machina==&lt;br /&gt;
[[Image:Dsat_002.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_003.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_004.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_005.jpg|200px|right]]&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
First up, we need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],&lt;br /&gt;
like this Tektronix 2246 from the mid-90s, one of the last and very best analog scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an&lt;br /&gt;
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1 [[WikiPedia:Hertz#SI_multiples|kHz]]&lt;br /&gt;
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].&lt;br /&gt;
We see the sine wave on the oscilloscope, can verify that it is indeed&lt;br /&gt;
1 kHz at 1 Volt RMS, which is 2.8 Volts&lt;br /&gt;
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
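&lt;br /&gt;
The RMS and peak-to-peak figures are related by a fixed factor for a pure sine wave.  A quick check of the arithmetic, as a sketch (the function name is hypothetical):&lt;br /&gt;

```python
import math

def sine_peak_to_peak(v_rms):
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    return 2.0 * math.sqrt(2.0) * v_rms

print(round(sine_peak_to_peak(1.0), 2))   # prints 2.83
```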
&lt;br /&gt;
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]&lt;br /&gt;
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],&lt;br /&gt;
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below&lt;br /&gt;
[[WikiPedia:Fundamental_frequency|the fundamental]].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],&lt;br /&gt;
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],&lt;br /&gt;
[[WikiPedia:Jitter#Sampling_jitter|jitter]],&lt;br /&gt;
[[WikiPedia:Noise_floor|noise behavior]],&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Stairsteps==&lt;br /&gt;
[[Image:Dsat 006.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat 007.png|360px|right]]&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
We digitize our signal to&lt;br /&gt;
[[Videos/A_Digital_Media_Primer_For_Geeks#Raw_.28digital_audio.29_meat|16 bit PCM at 44.1kHz]],&lt;br /&gt;
same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
[[WikiPedia:High_impedance|high-impedance input]] being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
OK, 1 kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to [[WikiPedia:Nyquist_frequency|Nyquist]], say 15kHz.&lt;br /&gt;
&lt;br /&gt;
Now the sine wave is represented by less than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point]]. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
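&lt;br /&gt;
One way to try this numerically, as a minimal sketch in plain Python (not from the video; all names are illustrative).  It assumes a block of samples that holds a whole number of cycles, so that DFT-based periodic interpolation is the bandlimited reconstruction; a real converter uses an analog or oversampled filter instead.  The reconstructed signal is compared with the original 15kHz sine halfway between the samples:&lt;br /&gt;

```python
import cmath
import math

FS = 44100.0   # sample rate in Hz (CD rate, as in the text)
F0 = 15000.0   # test tone in Hz
N = 147        # 147 samples at 44.1 kHz hold exactly 50 cycles of 15 kHz

def tone(t):
    return math.sin(2.0 * math.pi * F0 * t)

samples = [tone(n / FS) for n in range(N)]

# Plain DFT of the sample block (O(N^2), fine for N = 147).
spectrum = [
    sum(samples[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
    for k in range(N)
]

def reconstruct(t):
    """Evaluate the periodic bandlimited signal that passes through the samples."""
    total = 0j
    for k in range(N):
        signed_k = (k + N // 2) % N - N // 2   # map bin index to -73..73
        total += spectrum[k] * cmath.exp(2j * math.pi * signed_k * t * FS / N)
    return (total / N).real

# Compare halfway between samples, where a stairstep would be most wrong.
worst = max(
    abs(reconstruct((n + 0.5) / FS) - tone((n + 0.5) / FS)) for n in range(N)
)
print("worst error between samples:", worst)
```

The error should be down at the level of floating-point rounding, with fewer than three samples per cycle.&lt;br /&gt;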
&lt;br /&gt;
[[Image:Dsat 008.png|360px|right]]&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
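&lt;br /&gt;
As a sketch, a zero-order hold is only a couple of lines (the function name and the four sample values are made up for illustration):&lt;br /&gt;

```python
import math

def zero_order_hold(samples, fs, t):
    """Stairstep value at time t (seconds, t at or after 0): hold the latest sample."""
    index = min(int(math.floor(t * fs)), len(samples) - 1)
    return samples[index]

steps = [0.0, 0.7, 1.0, 0.7]
print(zero_order_hold(steps, 4.0, 0.30))   # 0.30 s at 4 Hz falls in step 1
```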
&lt;br /&gt;
So, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion]] is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].&lt;br /&gt;
&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==Bit-depth==&lt;br /&gt;
[[Image:Dsat_009.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat_010.jpg|260px|right]]&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
_also_ smooth regardless of the bit depth.  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
dither down to eight bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the eight-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we quantize it,&lt;br /&gt;
and quantization adds noise.&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor.&lt;br /&gt;
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our eight-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a Gaussian dither then it&#039;s mathematically&lt;br /&gt;
equivalent in every way. It _is_ tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of magnetic audio tape in bits instead of decibels, in order to put things in a&lt;br /&gt;
digital perspective.  [[WikiPedia:Compact cassettes|Compact cassettes]] (for those of you who are old enough to remember them) could reach as&lt;br /&gt;
deep as nine bits in perfect conditions, though five to six bits was&lt;br /&gt;
more typical, especially if it was a recording made on a tape&lt;br /&gt;
deck. That&#039;s right... your mix tapes were only about six bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
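&lt;br /&gt;
To go back and forth between bits and decibels, the usual rule of thumb is about 6 dB per bit, since each extra bit doubles the number of amplitude steps.  A small sketch (it ignores the small constant offset that depends on the signal and the kind of dither; the labels are only for illustration):&lt;br /&gt;

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)   # about 6.02 dB

def bits_to_db(bits):
    """Approximate dynamic range in dB for a given depth in bits."""
    return bits * DB_PER_BIT

for label, bits in [("cassette, typical upper figure", 6),
                    ("cassette, best case", 9),
                    ("studio open reel", 13),
                    ("CD", 16)]:
    print(label, round(bits_to_db(bits), 1), "dB")
```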
&lt;br /&gt;
The very best professional open reel tape used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits _with_ advanced noise reduction.  And&lt;br /&gt;
that&#039;s why seeing &#039;D D D&#039; on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==Dither==&lt;br /&gt;
[[Image:Dsat_011.png|360px|right]]&lt;br /&gt;
[[Image:Dsat_012.gif|360px|right]]&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with [[Wikipedia:dither|dither]], so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  Obvious,&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[show/attribute the dither paper]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t drown out or mask&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s _watch_ what dither does.  The signal generator has too much noise for this test so we&#039;ll produce a mathematically perfect sine wave with the ThinkPad and quantize it to eight bits with dithering.&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display and output scope and, once the analog spectrum analyzer catches up...&lt;br /&gt;
a clean frequency peak with a uniform noise floor on both spectral displays&lt;br /&gt;
&lt;br /&gt;
just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off.&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At eight bits this effect is exaggerated. At sixteen bits,&lt;br /&gt;
&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
&lt;br /&gt;
But when the signal level drops below a half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and...&lt;br /&gt;
&lt;br /&gt;
there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
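&lt;br /&gt;
A rough sketch of this quarter-bit experiment in plain Python.  It is not the demo application; it assumes triangular (TPDF) dither spanning two quantization steps, which is a common choice but an assumption here, and the names are made up.  It quantizes a sine of 1/4 step amplitude and then measures how much of the tone survives by correlating the output against the input:&lt;br /&gt;

```python
import math
import random

def quantize(x, dither, rng):
    """Round x (in units of one step) to an integer, optionally with TPDF dither."""
    if dither:
        x += rng.random() - rng.random()   # triangular PDF spanning -1..+1 step
    return math.floor(x + 0.5)

def recovered_amplitude(dither, count=44100, amplitude=0.25, cycles=1000):
    """Correlate the quantized output against the input sine to estimate its level."""
    rng = random.Random(1)   # fixed seed so the run is repeatable
    acc = 0.0
    for n in range(count):
        ref = math.sin(2.0 * math.pi * cycles * n / count)
        acc += quantize(amplitude * ref, dither, rng) * ref
    return 2.0 * acc / count

print("without dither:", recovered_amplitude(False))
print("with dither:   ", recovered_amplitude(True))
```

Without dither every sample rounds to zero and the tone is gone.  With dither the estimate should land near 0.25, buried in noise but intact.&lt;br /&gt;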
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and difficult to&lt;br /&gt;
notice as possible.&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
&lt;br /&gt;
We can shape dithering noise away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
Lastly, dithered quantization noise _is_ higher power overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a VU meter during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll also modulate the input signal like this:&lt;br /&gt;
&lt;br /&gt;
...to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise:&lt;br /&gt;
&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below full scale.  Maybe&lt;br /&gt;
if the CD had been 14 bits as originally designed, dither _might_ be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==Bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows..&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
..and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
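&lt;br /&gt;
The ripples fall straight out of the truncated sum.  A minimal sketch (not from the video; names are illustrative, and it assumes an ideal cutoff that keeps the odd harmonics of 1kHz up through 19kHz and nothing above):&lt;br /&gt;

```python
import math

def bandlimited_square(phase, top_harmonic=19):
    """Odd harmonics 1, 3, ... top_harmonic of a unit square wave at the given phase."""
    return sum(
        (4.0 / math.pi) * math.sin(m * phase) / m
        for m in range(1, top_harmonic + 1, 2)
    )

POINTS = 2000   # evaluation points across one cycle
wave = [bandlimited_square(2.0 * math.pi * i / POINTS) for i in range(POINTS)]
peak = max(wave)
print("peak of the rippled top:", round(peak, 3))
```

The ideal square wave would sit flat at 1.0; the bandlimited one overshoots to roughly 1.18 just after each edge and then rings around the flat level.&lt;br /&gt;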
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal _is_.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples... implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
&lt;br /&gt;
==Epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13935</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13935"/>
		<updated>2013-02-26T04:43:11Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: More wikipedia links&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;small&amp;gt;&#039;&#039;Wiki edition&#039;&#039;&amp;lt;/small&amp;gt;&lt;br /&gt;
[[Image:dsat_001.jpg|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Continuing in the &amp;quot;firehose&amp;quot; tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org&#039;s second video on digital media explores multiple facets of digital audio signals and how they &#039;&#039;really&#039;&#039; behave in the real world.&lt;br /&gt;
&lt;br /&gt;
Demonstrations of sampling, quantization, bit-depth, and dither put digital audio through its paces on consumer-grade audio equipment using both modern digital analysis and vintage analog equipment (just in case we can&#039;t trust those newfangled digital gizmos). You can download the demo application source code and try it all for yourself!&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;center&amp;gt;&amp;lt;font size=&amp;quot;+2&amp;quot;&amp;gt;[http://www.xiph.org/video/vid2.shtml Download or Watch online]&amp;lt;/font&amp;gt;&amp;lt;/center&amp;gt;&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
Players supporting WebM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox], [http://www.chromium.org/Home Chrome], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]&lt;br /&gt;
&lt;br /&gt;
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]&lt;br /&gt;
&lt;br /&gt;
If you&#039;re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.&lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;hr/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
[[Image:Xiph_ep02_test.png|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest, let&#039;s&lt;br /&gt;
take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.&lt;br /&gt;
&lt;br /&gt;
==Veritas ex machina==&lt;br /&gt;
[[Image:Dsat_002.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_003.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_004.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_005.jpg|200px|right]]&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
First up, we need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],&lt;br /&gt;
like this Tektronix 2246 from the mid-90s, one of the last and very best analog scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an&lt;br /&gt;
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1 [[WikiPedia:Hertz#SI_multiples|kHz]]&lt;br /&gt;
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].&lt;br /&gt;
We see the sine wave on the oscilloscope and can verify that it is indeed&lt;br /&gt;
1 kHz at 1 Volt RMS, which is 2.8 Volts&lt;br /&gt;
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
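&lt;br /&gt;
As a quick check of the 2.8 Volt figure, here is a minimal sketch in Python (a language picked for this illustration, not something used in the video). The function name is made up for the example, and the conversion only holds for a pure sine wave.&lt;br /&gt;
&lt;br /&gt;
```python
import math

def sine_rms_to_peak_to_peak(v_rms):
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    # A sine wave peaks at sqrt(2) times its RMS value;
    # peak-to-peak is twice the peak.
    return 2.0 * math.sqrt(2.0) * v_rms

print(round(sine_rms_to_peak_to_peak(1.0), 2))  # 2.83
```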
&lt;br /&gt;
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]&lt;br /&gt;
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],&lt;br /&gt;
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below&lt;br /&gt;
[[WikiPedia:Fundamental_frequency|the fundamental]].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],&lt;br /&gt;
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],&lt;br /&gt;
[[WikiPedia:Jitter#Sampling_jitter|jitter]],&lt;br /&gt;
[[WikiPedia:Noise_floor|noise behavior]],&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Stairsteps==&lt;br /&gt;
[[Image:Dsat 006.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat 007.png|360px|right]]&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
We digitize our signal to 16 bit PCM at 44.1kHz, same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
high-impedance input being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
OK, 1kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to Nyquist, say 15kHz.&lt;br /&gt;
&lt;br /&gt;
Now the sine wave is represented by fewer than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point]]. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
&lt;br /&gt;
[[Image:Dsat 008.png|360px|right]]&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
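&lt;br /&gt;
The uniqueness claim can be tried numerically. The sketch below is an illustration in Python, not code from the demo application: it samples a 15kHz sine at 44.1kHz and rebuilds the value at an instant between two samples using textbook Whittaker-Shannon (sinc) interpolation. All names are hypothetical, and because the ideal sum is infinite, the truncated sum here only approximates the original.&lt;br /&gt;
&lt;br /&gt;
```python
import math

FS = 44100.0   # sample rate in Hz (CD rate, as in the demo)
F = 15000.0    # test tone in Hz, below the Nyquist frequency FS / 2

def original(t):
    """The continuous sine wave, with t measured in sample periods."""
    return math.sin(2.0 * math.pi * F * t / FS)

def sinc(u):
    """Normalized sinc, the ideal reconstruction kernel."""
    if u == 0.0:
        return 1.0
    return math.sin(math.pi * u) / (math.pi * u)

def reconstruct(t, half_width=2000):
    """Sinc interpolation using only the samples taken at integer times.

    The ideal sum is infinite; truncating it to 2 * half_width + 1 samples
    leaves a small residual error.
    """
    total = 0.0
    for n in range(-half_width, half_width + 1):
        total += original(float(n)) * sinc(t - n)
    return total

t = 0.37  # an arbitrary instant between two samples
print(original(t), reconstruct(t))  # the two values agree closely
```
&lt;br /&gt;
Widening half_width shrinks the remaining difference, which comes from truncating the sum rather than from sampling.&lt;br /&gt;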
&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
&lt;br /&gt;
So, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion]] is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].&lt;br /&gt;
&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==Bit-depth==&lt;br /&gt;
[[Image:Dsat_009.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat_010.jpg|260px|right]]&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
_also_ smooth regardless of the bit depth.  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
dither down to eight bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the eight-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we quantize it,&lt;br /&gt;
and quantization adds noise.&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor.&lt;br /&gt;
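&lt;br /&gt;
For a rough sense of scale, the sketch below (Python, added for illustration) evaluates the usual textbook idealization: a full-scale sine wave against undithered, uniformly distributed quantization error. That model is an assumption rather than a measurement from the demo, and dither raises the floor by a few decibels. The function name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import math

def ideal_snr_db(bits):
    """SNR in dB of a full-scale sine over ideal uniform quantization noise."""
    # Noise power of a uniform quantizer is step**2 / 12. A full-scale sine
    # spanning 2**bits steps has power (2**bits * step)**2 / 8.
    # Their ratio is 1.5 * 4**bits, or roughly 6 dB per bit plus 1.76 dB.
    return 10.0 * math.log10(1.5 * 4.0 ** bits)

for bits in (8, 16, 24):
    print(bits, round(ideal_snr_db(bits), 1))
```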
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our eight-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a Gaussian dither then it&#039;s mathematically&lt;br /&gt;
equivalent in every way. It _is_ tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of magnetic audio tape in bits instead of decibels, in order to put things in a&lt;br /&gt;
digital perspective.  [[WikiPedia:Compact cassettes|Compact cassettes]] (for those of you who are old enough to remember them) could reach as&lt;br /&gt;
deep as nine bits in perfect conditions, though five to six bits was&lt;br /&gt;
more typical, especially if it was a recording made on a tape&lt;br /&gt;
deck. That&#039;s right... your mix tapes were only about six bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional open reel tape used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits _with_ advanced noise reduction.  And&lt;br /&gt;
that&#039;s why seeing &#039;D D D&#039; on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==Dither==&lt;br /&gt;
[[Image:Dsat_011.png|360px|right]]&lt;br /&gt;
[[Image:Dsat_012.gif|360px|right]]&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with [[Wikipedia:dither|dither]], so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  Obvious,&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[show/attribute the dither paper]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t drown out or mask&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s _watch_ what dither does.  The signal generator has too much noise for this test so we&#039;ll produce a mathematically perfect sine wave with the ThinkPad and quantize it to eight bits with dithering.&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display and output scope and, once the analog spectrum analyzer catches up...&lt;br /&gt;
a clean frequency peak with a uniform noise floor on both spectral displays&lt;br /&gt;
&lt;br /&gt;
just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off.&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At eight bits this effect is exaggerated. At sixteen bits,&lt;br /&gt;
&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and...&lt;br /&gt;
&lt;br /&gt;
there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
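&lt;br /&gt;
The quarter-bit experiment can be imitated in a few lines. This is a sketch in Python under stated assumptions, not the demo application itself: it uses triangular (TPDF) dither spanning plus or minus one step, which is one common choice and may differ from the dither used in the video, and every name in it is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

def quantize(x):
    """Plain quantization: round to the nearest integer step."""
    return math.floor(x + 0.5)

def quantize_tpdf(x, rng):
    """Quantize after adding triangular (TPDF) dither spanning +/- 1 step."""
    dither = rng.random() - rng.random()
    return math.floor(x + dither + 0.5)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 44100
# One second of a 1 kHz sine at 44.1 kHz, peaking at a quarter of one step.
signal = [0.25 * math.sin(2.0 * math.pi * 1000.0 * i / 44100.0)
          for i in range(n)]

plain = [quantize(s) for s in signal]
dithered = [quantize_tpdf(s, rng) for s in signal]

def tone_level(out):
    """Correlate the output with the input: 1.0 means the tone fully survived."""
    return sum(o * s for o, s in zip(out, signal)) / sum(s * s for s in signal)

print(set(plain))            # only zero: the tone is gone
print(tone_level(dithered))  # close to 1: the tone is still there, under noise
```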
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and difficult to&lt;br /&gt;
notice as possible.&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
&lt;br /&gt;
We can shape dithering noise away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
Lastly, dithered quantization noise _is_ higher power overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a VU meter during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll also modulate the input signal like this:&lt;br /&gt;
&lt;br /&gt;
...to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise:&lt;br /&gt;
&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below full scale.  Maybe&lt;br /&gt;
if the CD had been 14 bits as originally designed, dither _might_ be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==Bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
...and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
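&lt;br /&gt;
The bandlimited square wave described above can be sketched by summing only the odd harmonics that fit under the cutoff. The Python below is an illustration, not the demo code: it assumes an ideal brickwall cutoff at exactly 20kHz, which a real anti-aliasing filter only approximates, and the function name is made up.&lt;br /&gt;
&lt;br /&gt;
```python
import math

def bandlimited_square(t, f0=1000.0, cutoff=20000.0):
    """Sum the odd harmonics of a unit square wave at or below the cutoff."""
    total = 0.0
    for k in range(1, int(cutoff // f0) + 1, 2):
        total += (4.0 / math.pi) * math.sin(2.0 * math.pi * k * f0 * t) / k
    return total

# Scan one 1 kHz cycle finely and find the highest point of the waveform.
steps = 20000
peak = max(bandlimited_square(i / (steps * 1000.0)) for i in range(steps))
print(peak)  # above 1.0: the overshoot is the Gibbs ripple
```
&lt;br /&gt;
With a 1kHz fundamental, only the ten odd harmonics from 1kHz through 19kHz survive, and the overshoot next to each edge is the rippling seen on the scope.&lt;br /&gt;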
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal _is_.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples... implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
&lt;br /&gt;
==Epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13934</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13934"/>
		<updated>2013-02-26T04:39:46Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: syntax typo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;small&amp;gt;&#039;&#039;Wiki edition&#039;&#039;&amp;lt;/small&amp;gt;&lt;br /&gt;
[[Image:dsat_001.jpg|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Continuing in the &amp;quot;firehose&amp;quot; tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org&#039;s second video on digital media explores multiple facets of digital audio signals and how they &#039;&#039;really&#039;&#039; behave in the real world.&lt;br /&gt;
&lt;br /&gt;
Demonstrations of sampling, quantization, bit-depth, and dither put digital audio through its paces on consumer-grade audio equipment using both modern digital analysis and vintage analog equipment (just in case we can&#039;t trust those newfangled digital gizmos). You can download the demo application source code and try it all for yourself!&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;center&amp;gt;&amp;lt;font size=&amp;quot;+2&amp;quot;&amp;gt;[http://www.xiph.org/video/vid2.shtml Download or Watch online]&amp;lt;/font&amp;gt;&amp;lt;/center&amp;gt;&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
Players supporting WebM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox], [http://www.chromium.org/Home Chrome], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]&lt;br /&gt;
&lt;br /&gt;
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]&lt;br /&gt;
&lt;br /&gt;
If you&#039;re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.&lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;hr/&amp;gt;&lt;br /&gt;
&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
[[Image:Xiph_ep02_test.png|400px|right]]&lt;br /&gt;
&lt;br /&gt;
Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest, let&#039;s&lt;br /&gt;
take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.&lt;br /&gt;
&lt;br /&gt;
==Veritas ex machina==&lt;br /&gt;
[[Image:Dsat_002.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_003.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_004.jpg|200px|right]]&lt;br /&gt;
[[Image:Dsat_005.jpg|200px|right]]&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
First up, we need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],&lt;br /&gt;
like this Tektronix 2246 from the mid-90s, one of the last and very best analog scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an&lt;br /&gt;
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1kHz&lt;br /&gt;
sine wave at one Volt [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].&lt;br /&gt;
We see the sine wave on the oscilloscope and can verify that it is indeed&lt;br /&gt;
1kHz at one volt RMS, which is 2.8V&lt;br /&gt;
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
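&lt;br /&gt;
As a quick sanity check on that figure, here is a minimal Python sketch of the RMS to peak-to-peak conversion. It is not part of the video or the demo source, the function name is illustrative, and the relationship holds for a pure sine wave only:&lt;br /&gt;
&lt;br /&gt;
```python
import math

def sine_rms_to_peak_to_peak(v_rms):
    # For a pure sine, peak = RMS * sqrt(2), and peak-to-peak = 2 * peak.
    return 2.0 * math.sqrt(2.0) * v_rms

v_pp = sine_rms_to_peak_to_peak(1.0)  # about 2.83 V for 1 V RMS
```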
&lt;br /&gt;
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]&lt;br /&gt;
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],&lt;br /&gt;
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below&lt;br /&gt;
[[WikiPedia:Fundamental_frequency|the fundamental]].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],&lt;br /&gt;
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],&lt;br /&gt;
[[WikiPedia:Jitter#Sampling_jitter|jitter]],&lt;br /&gt;
[[WikiPedia:Noise_floor|noise behavior]],&lt;br /&gt;
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear:both;&amp;quot;/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Stairsteps==&lt;br /&gt;
[[Image:Dsat 006.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat 007.png|360px|right]]&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
We digitize our signal to 16-bit PCM at 44.1kHz, same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
high-impedance input being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
OK, 1kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to Nyquist, say 15kHz.&lt;br /&gt;
&lt;br /&gt;
Now the sine wave is represented by fewer than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point]]. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
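&lt;br /&gt;
One way to check this numerically is Whittaker-Shannon (sinc) interpolation. The Python sketch below is an illustration only, not how the eMagic or any real converter works internally; the function names and the 500-sample window are arbitrary assumptions. It samples a 15kHz sine at 44.1kHz and rebuilds the value halfway between two samples from the samples alone:&lt;br /&gt;
&lt;br /&gt;
```python
import math

FS = 44100.0   # sample rate in Hz
F0 = 15000.0   # test tone in Hz, below Nyquist (FS / 2)

def sinc(x):
    # Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1.
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def sample(n):
    # Value of the tone at integer sample index n.
    return math.sin(2.0 * math.pi * F0 * n / FS)

def reconstruct(position, half_width=500):
    # Truncated Whittaker-Shannon sum around a fractional sample position.
    center = int(math.floor(position))
    total = 0.0
    for n in range(center - half_width, center + half_width + 1):
        total += sample(n) * sinc(position - n)
    return total

position = 1000.5  # halfway between samples 1000 and 1001
rebuilt = reconstruct(position)
actual = math.sin(2.0 * math.pi * F0 * position / FS)
error = abs(rebuilt - actual)  # small residue from truncating the sum
```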
&lt;br /&gt;
[[Image:Dsat 008.png|360px|right]]&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
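&lt;br /&gt;
A zero-order hold is simple enough to sketch in a few lines of Python. This is an illustration with made-up names that operates on a list of sample values; the oversample factor just stands in for continuous time:&lt;br /&gt;
&lt;br /&gt;
```python
def zero_order_hold(samples, oversample):
    # Repeat each sample value until the next sample period begins.
    held = []
    for value in samples:
        held.extend([value] * oversample)
    return held

steps = zero_order_hold([0.0, 0.7, 1.0, 0.7], 3)
# steps == [0.0, 0.0, 0.0, 0.7, 0.7, 0.7, 1.0, 1.0, 1.0, 0.7, 0.7, 0.7]
```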
&lt;br /&gt;
So, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion]] is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].&lt;br /&gt;
&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==Bit-depth==&lt;br /&gt;
[[Image:Dsat_009.jpg|360px|right]]&lt;br /&gt;
[[Image:Dsat_010.jpg|260px|right]]&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
&#039;&#039;also&#039;&#039; smooth regardless of the bit depth.  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
dither down to eight bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the eight-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we quantize it,&lt;br /&gt;
and quantization adds noise.&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor.&lt;br /&gt;
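&lt;br /&gt;
The usual textbook rule of thumb for this relationship is about 6dB per bit. A minimal Python sketch, assuming an ideal quantizer, a full-scale sine wave, and uniformly distributed quantization error (dither raises the floor by a few dB on top of this); the function name is illustrative:&lt;br /&gt;
&lt;br /&gt;
```python
import math

def ideal_sine_snr_db(bits):
    # Full-scale sine power over uniform quantization noise power:
    # 20*log10(2**bits) + 10*log10(1.5), roughly 6.02*bits + 1.76 dB.
    return 20.0 * math.log10(2.0 ** bits) + 10.0 * math.log10(1.5)

snr_8 = ideal_sine_snr_db(8)    # just under 50 dB
snr_16 = ideal_sine_snr_db(16)  # about 98 dB
```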
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our eight-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a Gaussian dither then it&#039;s mathematically&lt;br /&gt;
equivalent in every way. It &#039;&#039;is&#039;&#039; tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of magnetic audio tape in bits instead of decibels, in order to put things in a&lt;br /&gt;
digital perspective.  [[WikiPedia:Compact cassettes|Compact cassettes]] (for those of you who are old enough to remember them) could reach as&lt;br /&gt;
deep as nine bits in perfect conditions, though five to six bits was&lt;br /&gt;
more typical, especially if it was a recording made on a tape&lt;br /&gt;
deck. That&#039;s right... your mix tapes were only about six bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional open reel tape used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits &#039;&#039;with&#039;&#039; advanced noise reduction.  And&lt;br /&gt;
that&#039;s why seeing &#039;D D D&#039; on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
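&lt;br /&gt;
To put those bit counts back into the decibels that tape decks are usually specified in, the same rough 6dB-per-bit conversion can be applied. This is a back-of-the-envelope Python sketch; the helper name is made up, and the conversion ignores the small constant offset that depends on the test signal:&lt;br /&gt;
&lt;br /&gt;
```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)  # about 6.02 dB

def bits_to_db(bits):
    # Approximate dynamic range corresponding to a given bit depth.
    return bits * DB_PER_BIT

cassette_typical = bits_to_db(6)   # about 36 dB
cassette_best = bits_to_db(9)      # about 54 dB
open_reel_best = bits_to_db(13)    # about 78 dB
cd = bits_to_db(16)                # about 96 dB
```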
&lt;br /&gt;
==Dither==&lt;br /&gt;
[[Image:Dsat_011.png|360px|right]]&lt;br /&gt;
[[Image:Dsat_012.gif|360px|right]]&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with [[Wikipedia:dither|dither]], so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  Obvious,&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[show/attribute the dither paper]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t drown out or mask&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
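&lt;br /&gt;
In code, the difference is a single line. The Python sketch below is an illustration under stated assumptions, not the demo application&#039;s implementation: it uses triangular (TPDF) dither spanning plus or minus one step, a common textbook choice, and works in units of one quantization step. All names are illustrative:&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

def quantize_plain(x):
    # Simple quantization: pick the nearest integer step.
    return round(x)

def quantize_dithered(x, rng):
    # Add triangular-PDF noise (sum of two uniform values, each within
    # half a step of zero) before rounding to the nearest step.
    dither = (rng.random() - 0.5) + (rng.random() - 0.5)
    return round(x + dither)

rng = random.Random(2013)
signal = [100.0 * math.sin(2.0 * math.pi * 1000.0 * n / 44100.0)
          for n in range(441)]
plain = [quantize_plain(x) for x in signal]
dithered = [quantize_dithered(x, rng) for x in signal]

# The plain error is a fixed function of the input. The dithered error is
# random and bounded by 1.5 steps, and with this dither its mean and
# variance do not depend on the input.
plain_error = [q - x for q, x in zip(plain, signal)]
dithered_error = [q - x for q, x in zip(dithered, signal)]
```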
&lt;br /&gt;
Let&#039;s &#039;&#039;watch&#039;&#039; what dither does.  The signal generator has too much noise for this test, so we&#039;ll produce a mathematically perfect sine wave with the ThinkPad and quantize it to eight bits with dithering.&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display and output scope and, once the analog spectrum analyzer catches up...&lt;br /&gt;
a clean frequency peak with a uniform noise floor on both spectral displays&lt;br /&gt;
&lt;br /&gt;
just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off.&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At eight bits this effect is exaggerated. At sixteen bits,&lt;br /&gt;
&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and...&lt;br /&gt;
&lt;br /&gt;
there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
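&lt;br /&gt;
The same effect is easy to reproduce numerically. In this Python sketch (illustrative names, with TPDF dither assumed as a stand-in for whatever the demo application uses), a sine wave a quarter of a step tall vanishes under plain rounding but survives dithered rounding, buried in noise:&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

FS = 44100
F0 = 1000
AMPLITUDE = 0.25  # a quarter of one quantization step

rng = random.Random(7)
tone = [math.sin(2.0 * math.pi * F0 * n / FS) for n in range(FS)]
signal = [AMPLITUDE * s for s in tone]

plain = [round(x) for x in signal]
dithered = [round(x + rng.random() + rng.random() - 1.0) for x in signal]

def tone_amplitude(samples):
    # Correlate against the known tone to estimate how much of it is present.
    return 2.0 * sum(q * s for q, s in zip(samples, tone)) / len(samples)

plain_level = tone_amplitude(plain)        # 0.0: the signal is gone
dithered_level = tone_amplitude(dithered)  # close to 0.25: still there
```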
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and difficult to&lt;br /&gt;
notice as possible.&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
&lt;br /&gt;
We can shape dithering noise away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
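&lt;br /&gt;
The simplest form of noise shaping is first-order error feedback, sketched below in Python. This is a textbook illustration under simplifying assumptions, not the shaped dither used in the demo: real audio noise shapers use higher-order filters matched to the ear&#039;s sensitivity, and the dither itself is left out here for brevity. Each sample&#039;s quantization error is subtracted from the next input, so the error in the output becomes e[n] - e[n-1], which is weak at low frequencies and strong at high ones:&lt;br /&gt;
&lt;br /&gt;
```python
import math

def shape_noise(signal):
    # First-order error feedback: output error is e[n] - e[n-1].
    output = []
    previous_error = 0.0
    for x in signal:
        target = x - previous_error    # compensate for the last error
        q = round(target)              # quantize to the nearest step
        previous_error = q - target    # error made on this sample
        output.append(q)
    return output

signal = [100.0 * math.sin(2.0 * math.pi * 1000.0 * n / 44100.0)
          for n in range(4410)]
shaped = shape_noise(signal)

# Because the errors telescope, the accumulated (low-frequency) error never
# grows beyond half a step, no matter how long the signal is.
running = 0.0
worst = 0.0
for q, x in zip(shaped, signal):
    running += q - x
    worst = max(worst, abs(running))
```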
&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
Lastly, dithered quantization noise &#039;&#039;is&#039;&#039; higher power overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a VU meter during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll also modulate the input signal like this:&lt;br /&gt;
&lt;br /&gt;
...to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise:&lt;br /&gt;
&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below full scale.  Maybe&lt;br /&gt;
if the CD had been 14 bits as originally designed, dither &#039;&#039;might&#039;&#039; be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==Bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows..&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
..and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
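&lt;br /&gt;
The bandlimited square wave is easy to build directly from its harmonics. The Python sketch below is illustrative: the names are made up, and it assumes an ideal cutoff at exactly 20kHz, so a 1kHz square wave keeps its odd harmonics up to 19kHz. It shows the overshoot at the edges and the ripple on the flat top:&lt;br /&gt;
&lt;br /&gt;
```python
import math

def bandlimited_square(phase, f0=1000.0, cutoff=20000.0):
    # Sum the odd harmonics of a unit square wave that fit under the cutoff.
    # phase is the position within one cycle, from 0.0 to 1.0.
    total = 0.0
    harmonic = 1
    while cutoff >= harmonic * f0:
        total += math.sin(2.0 * math.pi * harmonic * phase) / harmonic
        harmonic += 2
    return 4.0 / math.pi * total

cycle = [bandlimited_square(n / 2000.0) for n in range(2000)]
peak = max(cycle)                   # about 1.18: Gibbs overshoot
plateau = bandlimited_square(0.25)  # about 0.97: ripple around 1.0
```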
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
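&lt;br /&gt;
An idealized version of this experiment fits in a few lines. The Python sketch below uses a naive DFT and a perfect brickwall lowpass, which is an assumption: a real anti-aliasing filter is not perfectly sharp and also adds the delay mentioned above. Names are illustrative:&lt;br /&gt;
&lt;br /&gt;
```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def inverse_dft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * cmath.pi * m * k / n)
                for m in range(n)).real / n for k in range(n)]

def brickwall(x, highest_bin):
    # Zero every frequency bin above highest_bin (and its mirror image).
    spectrum = dft(x)
    n = len(spectrum)
    kept = [spectrum[m] if highest_bin >= min(m, n - m) else 0.0
            for m in range(n)]
    return inverse_dft(kept)

square = [1.0] * 16 + [-1.0] * 16
once = brickwall(square, 5)
twice = brickwall(once, 5)

ripple = max(once)  # above 1.0: the ripples appear on the first pass
change = max(abs(a - b) for a, b in zip(once, twice))  # essentially zero
```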
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal &#039;&#039;is&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples... implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
&lt;br /&gt;
==Epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13886</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13886"/>
		<updated>2013-02-25T06:23:31Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Wiki link to MacPaint for fat bits&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Intro==&lt;br /&gt;
&lt;br /&gt;
{{rounded|content=Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[http://en.wikipedia.org/wiki/Digital-to-analog_converter convert from digital back to analog].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest,&lt;br /&gt;
let&#039;s take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.}}&lt;br /&gt;
&lt;br /&gt;
==veritas ex machina==&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3325 ]]&lt;br /&gt;
&lt;br /&gt;
First up, we need a [http://en.wikipedia.org/wiki/Function_generator signal generator]&lt;br /&gt;
 to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
[[ close on 2246 ]]&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on&lt;br /&gt;
[http://en.wikipedia.org/wiki/Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29 analog oscilloscopes],&lt;br /&gt;
like this&lt;br /&gt;
Tektronix 2246 from the mid-90s, one of the last and very best analog&lt;br /&gt;
scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3585]]&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the&lt;br /&gt;
[http://en.wikipedia.org/wiki/Spectral_density#Electrical_engineering frequency spectrum] of our signals using an&lt;br /&gt;
[http://en.wikipedia.org/wiki/Spectrum_analyzer#Swept-tuned analog spectrum analyzer], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the&lt;br /&gt;
specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1kHz&lt;br /&gt;
sine wave at one volt [http://en.wikipedia.org/wiki/Amplitude#Root_mean_square_amplitude RMS].&lt;br /&gt;
We see the sine wave on the oscilloscope and can verify that it is indeed&lt;br /&gt;
1kHz at one volt RMS, which is about 2.8V&lt;br /&gt;
[http://en.wikipedia.org/wiki/Amplitude#Peak-to-peak_amplitude peak-to-peak],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
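&lt;br /&gt;
As a quick check of that arithmetic (a side sketch, not part of the demo source;&lt;br /&gt;
the function name is made up for illustration), a pure sine&#039;s peak-to-peak&lt;br /&gt;
voltage is its RMS voltage times twice the square root of two:&lt;br /&gt;

```python
import math

def rms_to_peak_to_peak(v_rms):
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    # For a sine, peak = RMS * sqrt(2), and peak-to-peak is twice the peak.
    return 2.0 * math.sqrt(2.0) * v_rms

v_pp = rms_to_peak_to_peak(1.0)
print(round(v_pp, 2))  # 2.83
```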
&lt;br /&gt;
The analyzer also shows some low-level&lt;br /&gt;
[http://en.wikipedia.org/wiki/White_noise white noise]&lt;br /&gt;
and just a bit of&lt;br /&gt;
[http://en.wikipedia.org/wiki/Harmonic_distortion#Harmonic_distortion harmonic distortion],&lt;br /&gt;
with the highest peak about 70[http://en.wikipedia.org/wiki/Decibel dB] or so below&lt;br /&gt;
[http://en.wikipedia.org/wiki/Fundamental_frequency the fundamental].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
[[ cut to complete setup ]]&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[http://en.wikipedia.org/wiki/Reconstruction_filter#Sampled_data_reconstruction_filters Flatness],&lt;br /&gt;
[http://en.wikipedia.org/wiki/Analog-to-digital_converter#Non-linearity linearity],&lt;br /&gt;
[http://en.wikipedia.org/wiki/Jitter#Sampling_jitter jitter],&lt;br /&gt;
[http://en.wikipedia.org/wiki/Noise_floor noise behavior],&lt;br /&gt;
[http://en.wikipedia.org/wiki/Digital-to-analog_converter#DAC_performance everything]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
[[out to see emagic initialize and digital waveform appear on TP ]]&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&lt;br /&gt;
==stairsteps==&lt;br /&gt;
&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
[[close to 3325]] &lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
[[close to TP: spectrum]]&lt;br /&gt;
We digitize our signal to 16 bit PCM at 44.1kHz, same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
[[close to SA]]&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
high-impedance input being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
[[close to TP ; overview/waveform ]]&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope: press CH1 button to show waveform]]&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
OK, 1kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to Nyquist, say 15kHz.&lt;br /&gt;
&lt;br /&gt;
[[set 3325 to 15kHz ]]&lt;br /&gt;
Now the sine wave is represented by less than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
[[close to TP]]&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
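&lt;br /&gt;
One way to poke at this claim yourself is Whittaker-Shannon (sinc) interpolation.&lt;br /&gt;
The sketch below is an illustration under stated assumptions, not the demo&#039;s code:&lt;br /&gt;
it samples a 15kHz sine at 44.1kHz, then rebuilds the value halfway between two&lt;br /&gt;
samples. The helper names and the 2000-sample half span are arbitrary choices, and&lt;br /&gt;
because the sum is finite the match is only approximate.&lt;br /&gt;

```python
import math

SAMPLE_RATE = 44100.0   # Hz, as in the demo
FREQ = 15000.0          # Hz, the case that looks awful as a stairstep

def original(t):
    """The bandlimited input: a pure sine, evaluated at time t in seconds."""
    return math.sin(2.0 * math.pi * FREQ * t)

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, first_index, t):
    """Sinc interpolation at time t, where samples[k] was taken at sample
    index first_index + k. A finite sum, so the result is approximate."""
    pos = t * SAMPLE_RATE
    return sum(s * sinc(pos - (first_index + k)) for k, s in enumerate(samples))

HALF_SPAN = 2000
samples = [original(n / SAMPLE_RATE) for n in range(-HALF_SPAN, HALF_SPAN + 1)]

# Evaluate halfway between sample 0 and sample 1, where no sample exists.
t_mid = 0.5 / SAMPLE_RATE
error = abs(reconstruct(samples, -HALF_SPAN, t_mid) - original(t_mid))
print(error)  # small; it shrinks further as HALF_SPAN grows
```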
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
[[close to TP; freeze display; draw in zero-order]]&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a zero-order hold, and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
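&lt;br /&gt;
A zero-order hold is simple enough to write down directly. This is a minimal&lt;br /&gt;
sketch with a made-up function name, where each sample is repeated a fixed&lt;br /&gt;
number of times to stand in for holding its value until the next sample:&lt;br /&gt;

```python
def zero_order_hold(samples, steps_per_sample):
    """Stretch each sample into a flat run, giving the stairstep drawing."""
    held = []
    for value in samples:
        held.extend([value] * steps_per_sample)
    return held

stairs = zero_order_hold([0.0, 0.7, 1.0, 0.7], 3)
print(stairs)
# [0.0, 0.0, 0.0, 0.7, 0.7, 0.7, 1.0, 1.0, 1.0, 0.7, 0.7, 0.7]
```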
&lt;br /&gt;
[[ Wikipedia DAC lookup + scroll down to hold image]]&lt;br /&gt;
So, anyone who looks up digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
[http://en.wikipedia.org/wiki/MacPaint#Development fat bits in an image editor].&lt;br /&gt;
&lt;br /&gt;
[[gimp RMD animation]]&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==bit-depth==&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
_also_ smooth regardless of the bit depth.  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
dither down to eight bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the eight-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we quantize it,&lt;br /&gt;
and quantization adds noise.&lt;br /&gt;
&lt;br /&gt;
[[panel 2; demonstrate changing bit depth on tablet ]]&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor. [[demonstrate changing bit depth on tablet]].&lt;br /&gt;
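&lt;br /&gt;
To put rough numbers on that, here is a sketch that measures the noise left after&lt;br /&gt;
quantizing a 1kHz sine at a few bit depths. It is an illustration, not the demo&lt;br /&gt;
application: the triangular (TPDF) dither, the half-scale amplitude, and the function&lt;br /&gt;
name are all assumptions. Each extra bit halves the step size, so the floor should&lt;br /&gt;
drop by about 6dB per bit.&lt;br /&gt;

```python
import math
import random

def noise_floor_db(bits, num_samples=20000, seed=1):
    """Measured noise power in dB, relative to a signal power of 1.0, after
    quantizing a 1kHz sine to the given bit depth with TPDF dither."""
    rng = random.Random(seed)
    step = 2.0 / (2 ** bits)   # step size when the range -1..1 has 2**bits levels
    noise_power = 0.0
    for n in range(num_samples):
        x = 0.5 * math.sin(2.0 * math.pi * 1000.0 * n / 44100.0)
        dither = (rng.random() - rng.random()) * step
        quantized = round((x + dither) / step) * step
        noise_power += (quantized - x) ** 2
    return 10.0 * math.log10(noise_power / num_samples)

for bits in (8, 16, 24):
    print(bits, round(noise_floor_db(bits), 1))
```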
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our eight-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
[[audio: eight bit sine]]&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
[[audio: hit notch + gain button]]&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a gaussian dither then it&#039;s mathematically&lt;br /&gt;
equivalent in every way. It _is_ tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of magnetic audio tape in bits instead of decibels, in order to put things in a&lt;br /&gt;
digital perspective.  Compact cassettes...&lt;br /&gt;
&lt;br /&gt;
[[ reveal cassette ]]&lt;br /&gt;
&lt;br /&gt;
for those of you who are old enough to remember them, could reach as&lt;br /&gt;
deep as nine bits in perfect conditions, though five to six bits was&lt;br /&gt;
more typical, especially if it was a recording made on a tape&lt;br /&gt;
deck. That&#039;s right... your mix tapes were only about six bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional open reel tape used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits _with_ advanced noise reduction.  And&lt;br /&gt;
that&#039;s why seeing &#039;D D D&#039; on a Compact Disc used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==dither==&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with dither, so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
[[ Illustration: quantization ]]&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  Obvious,&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
[[Illustration: correlated quantization noise ]]&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[[show/attribute the dither paper]]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t drown out or mask&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
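&lt;br /&gt;
A small experiment makes the difference concrete. In this sketch (an illustration&lt;br /&gt;
with made-up helper names, assuming triangular dither spanning plus or minus one&lt;br /&gt;
step, and working in units of one quantizer step), plain rounding gives an average&lt;br /&gt;
error that tracks the input level, while the dithered quantizer&#039;s average error&lt;br /&gt;
stays near zero whatever the input is:&lt;br /&gt;

```python
import random

def quantize_plain(x):
    """Round to the nearest integer step (values are in units of one step)."""
    return round(x)

def quantize_dithered(x, rng):
    """Add triangular (TPDF) dither spanning +/- 1 step, then round."""
    return round(x + rng.random() - rng.random())

def mean_error(quantizer, x, trials=20000):
    """Average of (quantized - input) over repeated quantizations of x."""
    return sum(quantizer(x) - x for _ in range(trials)) / trials

rng = random.Random(0)
for level in (0.1, 0.3, 0.45):
    plain = mean_error(quantize_plain, level)
    dithered = mean_error(lambda v: quantize_dithered(v, rng), level)
    print(level, round(plain, 3), round(dithered, 3))
```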
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Let&#039;s _watch_ what dither does.  The signal generator has too much&lt;br /&gt;
noise for this test so&lt;br /&gt;
&lt;br /&gt;
[[close: panel 3]]&lt;br /&gt;
&lt;br /&gt;
we&#039;ll produce a mathematically perfect sine wave with the ThinkPad&lt;br /&gt;
[[press]]&lt;br /&gt;
&lt;br /&gt;
and quantize it to eight bits [[press]]&lt;br /&gt;
&lt;br /&gt;
with dithering. [[press]]&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display&lt;br /&gt;
&lt;br /&gt;
[[ show outscope ]]  and output scope&lt;br /&gt;
&lt;br /&gt;
[[ show analyzer]]  and, once the analog spectrum analyzer catches up...&lt;br /&gt;
&lt;br /&gt;
[[time accel sweep]] a clean frequency peak with a uniform noise floor&lt;br /&gt;
on both spectral displays&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]  just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off. [[ deactivate dither ]]&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At eight bits this effect is exaggerated. At sixteen bits, [[click 16]]&lt;br /&gt;
&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
[[draw line across -100]]&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
&lt;br /&gt;
[[ overview: waveform ]]&lt;br /&gt;
&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and...&lt;br /&gt;
&lt;br /&gt;
[[dither on]]&lt;br /&gt;
&lt;br /&gt;
there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
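&lt;br /&gt;
The quarter-bit case can be reproduced in a few lines. This sketch is an assumption&lt;br /&gt;
about how one might test it, not the demo source: it quantizes a sine with an&lt;br /&gt;
amplitude of a quarter step, with and without triangular dither, and then correlates&lt;br /&gt;
the result against the original tone to estimate how much of it survived.&lt;br /&gt;

```python
import math
import random

AMPLITUDE = 0.25     # a quarter of one quantizer step
N = 44100            # one second at 44.1kHz, exactly 1000 cycles of the tone

def tone(n):
    """The input: a 1kHz sine, in units of one quantizer step."""
    return AMPLITUDE * math.sin(2.0 * math.pi * 1000.0 * n / 44100.0)

def recovered_amplitude(quantized):
    """Correlate against the 1kHz sine to estimate how much tone survived."""
    ref = [math.sin(2.0 * math.pi * 1000.0 * n / 44100.0) for n in range(N)]
    return 2.0 * sum(q * r for q, r in zip(quantized, ref)) / N

rng = random.Random(0)
plain = [round(tone(n)) for n in range(N)]
dithered = [round(tone(n) + rng.random() - rng.random()) for n in range(N)]

print(recovered_amplitude(plain))     # 0.0: every sample rounded to zero
print(recovered_amplitude(dithered))  # close to 0.25, plus a little noise
```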
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and difficult to&lt;br /&gt;
notice as possible.&lt;br /&gt;
&lt;br /&gt;
[[panel 5]]&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
&lt;br /&gt;
[[annotate: underline 2-4kHz]]&lt;br /&gt;
&lt;br /&gt;
[[click shaped]]&lt;br /&gt;
&lt;br /&gt;
We can shape dithering noise away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
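&lt;br /&gt;
The simplest form of noise shaping is first-order error feedback: subtract the&lt;br /&gt;
previous sample&#039;s quantization error before quantizing the next one. The sketch&lt;br /&gt;
below assumes that scheme purely for brevity; the script doesn&#039;t say which shaping&lt;br /&gt;
filter the demo uses, and practical shapers follow a hearing-sensitivity curve.&lt;br /&gt;
The function names and the block-average test are made up for illustration.&lt;br /&gt;

```python
import math
import random

def quantize(signal, shaped, seed=0):
    """Dithered rounding to integer steps; optionally feed back the previous
    sample's error so the total error is high-passed (first-order shaping)."""
    rng = random.Random(seed)
    out, prev_error = [], 0.0
    for x in signal:
        target = x - prev_error if shaped else x
        y = round(target + rng.random() - rng.random())
        prev_error = y - target
        out.append(y)
    return out

def low_frequency_noise(signal, quantized, block=32):
    """Mean square of the error after averaging over blocks (a crude low-pass)."""
    errors = [q - x for q, x in zip(quantized, signal)]
    means = [sum(errors[i:i + block]) / block
             for i in range(0, len(errors) - block + 1, block)]
    return sum(m * m for m in means) / len(means)

signal = [20.0 * math.sin(2.0 * math.pi * 1000.0 * n / 44100.0)
          for n in range(32000)]
flat_lf = low_frequency_noise(signal, quantize(signal, shaped=False))
shaped_lf = low_frequency_noise(signal, quantize(signal, shaped=True))
print(flat_lf, shaped_lf)  # the shaped figure should be much smaller
```

The total noise power goes up with shaping; only its low-frequency share goes down.&lt;br /&gt;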
&lt;br /&gt;
[[annotate: arrow to HF]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
[[close]]&lt;br /&gt;
[[unshaped white to shaped]]&lt;br /&gt;
 &lt;br /&gt;
[[out]] Lastly, dithered quantization noise _is_ higher power overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a VU meter during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
  [[ panel 6 audio :: flat, unmodulated ]]&lt;br /&gt;
We&#039;ll also modulate the input signal like this:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated ]]&lt;br /&gt;
...to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
  [[ panel 6 audio :: shaped, modulated, notch ]]&lt;br /&gt;
  [[ reset panel :: shaped, unmodulated, no notch ]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below full scale.  Maybe&lt;br /&gt;
if the CD had been 14 bits as originally designed, dither _might_ be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows..&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
..and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
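&lt;br /&gt;
The rippled waveform is easy to compute. This sketch assumes an idealized brick-wall&lt;br /&gt;
cutoff at exactly 20kHz (the real filter is sharp but not ideal) and a unit-amplitude&lt;br /&gt;
1kHz square wave, so only the odd harmonics from 1kHz through 19kHz are summed. The&lt;br /&gt;
names are made up for illustration.&lt;br /&gt;

```python
import math

FUNDAMENTAL = 1000.0   # Hz
CUTOFF = 20000.0       # Hz, idealized brick-wall filter

def bandlimited_square(t):
    """Sum the odd harmonics of a unit square wave that lie below CUTOFF."""
    total, k = 0.0, 1
    while CUTOFF > k * FUNDAMENTAL:
        total += math.sin(2.0 * math.pi * k * FUNDAMENTAL * t) / k
        k += 2
    return 4.0 / math.pi * total

# Scan the first half cycle for the highest point of the rippled top.
peak = max(bandlimited_square(i * 1e-7) for i in range(5000))
print(round(peak, 3))  # roughly 1.18, versus 1.0 for the ideal square wave
```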
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal _is_.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples.. implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
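&lt;br /&gt;
Here is a sketch of the same idea using a plain sine instead of a square-wave edge,&lt;br /&gt;
which is a simplification of what the demo shows. It delays a 1kHz tone by three&lt;br /&gt;
tenths of one sample period, samples it at 44.1kHz, and recovers the delay from the&lt;br /&gt;
samples alone. The function name and the phase-correlation method are assumptions&lt;br /&gt;
chosen for illustration.&lt;br /&gt;

```python
import math

SAMPLE_RATE = 44100.0
FREQ = 1000.0
N = 441   # exactly 10 cycles of 1kHz at 44.1kHz

def measured_delay(delay_seconds):
    """Sample a delayed 1kHz sine, then recover the delay from its phase."""
    samples = [math.sin(2.0 * math.pi * FREQ * (n / SAMPLE_RATE - delay_seconds))
               for n in range(N)]
    in_phase = sum(s * math.sin(2.0 * math.pi * FREQ * n / SAMPLE_RATE)
                   for n, s in enumerate(samples))
    quadrature = sum(s * math.cos(2.0 * math.pi * FREQ * n / SAMPLE_RATE)
                     for n, s in enumerate(samples))
    # sin(wt - p) = sin(wt)cos(p) - cos(wt)sin(p), so p = atan2(-quad, in_phase).
    phase = math.atan2(-quadrature, in_phase)
    return phase / (2.0 * math.pi * FREQ)

delay_in_samples = measured_delay(0.3 / SAMPLE_RATE) * SAMPLE_RATE
print(round(delay_in_samples, 6))  # about 0.3 of one sample period
```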
&lt;br /&gt;
==epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13885</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13885"/>
		<updated>2013-02-25T06:17:35Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: More wiki links&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Intro==&lt;br /&gt;
&lt;br /&gt;
{{rounded|content=Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[http://en.wikipedia.org/wiki/Digital-to-analog_converter convert from digital back to analog].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest,&lt;br /&gt;
let&#039;s take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.}}&lt;br /&gt;
&lt;br /&gt;
==veritas ex machina==&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3325 ]]&lt;br /&gt;
&lt;br /&gt;
First up, we need a [http://en.wikipedia.org/wiki/Function_generator signal generator]&lt;br /&gt;
to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
[[ close on 2246 ]]&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on&lt;br /&gt;
[http://en.wikipedia.org/wiki/Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29 analog oscilloscopes],&lt;br /&gt;
like this&lt;br /&gt;
Tektronix 2246 from the mid-90s, one of the last and very best analog&lt;br /&gt;
scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3585]]&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the&lt;br /&gt;
[http://en.wikipedia.org/wiki/Spectral_density#Electrical_engineering frequency spectrum] of our signals using an&lt;br /&gt;
[http://en.wikipedia.org/wiki/Spectrum_analyzer#Swept-tuned analog spectrum analyzer], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the&lt;br /&gt;
specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1kHz&lt;br /&gt;
sine wave at one volt [http://en.wikipedia.org/wiki/Amplitude#Root_mean_square_amplitude RMS].&lt;br /&gt;
We see the sine wave on the oscilloscope, can verify that it is indeed&lt;br /&gt;
1kHz at one volt RMS, which is 2.8V&lt;br /&gt;
[http://en.wikipedia.org/wiki/Amplitude#Peak-to-peak_amplitude peak-to-peak],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
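&lt;br /&gt;
For a pure sine wave, the RMS and peak-to-peak figures are tied together by a&lt;br /&gt;
fixed factor of two times the square root of two. A minimal Python sketch of that&lt;br /&gt;
arithmetic follows; the function name is an illustrative assumption, not something&lt;br /&gt;
taken from the demo source code.&lt;br /&gt;
&lt;br /&gt;
```python
import math

def sine_rms_to_peak_to_peak(v_rms):
    # For a pure sine wave, peak = rms * sqrt(2), and peak-to-peak = 2 * peak.
    return 2.0 * math.sqrt(2.0) * v_rms

print(round(sine_rms_to_peak_to_peak(1.0), 2))  # 2.83
```
&lt;br /&gt;
The factor only holds for a sine; other waveforms have a different ratio.&lt;br /&gt;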
&lt;br /&gt;
The analyzer also shows some low-level&lt;br /&gt;
[http://en.wikipedia.org/wiki/White_noise white noise]&lt;br /&gt;
and just a bit of&lt;br /&gt;
[http://en.wikipedia.org/wiki/Harmonic_distortion#Harmonic_distortion harmonic distortion],&lt;br /&gt;
with the highest peak about 70[http://en.wikipedia.org/wiki/Decibel dB] or so below&lt;br /&gt;
[http://en.wikipedia.org/wiki/Fundamental_frequency the fundamental].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
[[ cut to complete setup ]]&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
[http://en.wikipedia.org/wiki/Reconstruction_filter#Sampled_data_reconstruction_filters Flatness],&lt;br /&gt;
[http://en.wikipedia.org/wiki/Analog-to-digital_converter#Non-linearity linearity],&lt;br /&gt;
[http://en.wikipedia.org/wiki/Jitter#Sampling_jitter jitter],&lt;br /&gt;
[http://en.wikipedia.org/wiki/Noise_floor noise behavior],&lt;br /&gt;
[http://en.wikipedia.org/wiki/Digital-to-analog_converter#DAC_performance everything]...&lt;br /&gt;
you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
[[out to see emagic initialize and digital waveform appear on TP ]]&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&lt;br /&gt;
==stairsteps==&lt;br /&gt;
&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
[[close to 3325]] &lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
[[close to TP: spectrum]]&lt;br /&gt;
We digitize our signal to 16 bit PCM at 44.1kHz, same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
[[close to SA]]&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
high-impedance input being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
[[close to TP ; overview/waveform ]]&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope: press CH1 button to show waveform]]&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
OK, 1kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to Nyquist, say 15kHz.&lt;br /&gt;
&lt;br /&gt;
[[set 3325 to 15kHz ]]&lt;br /&gt;
Now the sine wave is represented by fewer than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
[[close to TP]]&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
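&lt;br /&gt;
One way to check the uniqueness claim numerically is the Whittaker-Shannon&lt;br /&gt;
interpolation formula, which rebuilds the bandlimited signal as a sum of one sinc&lt;br /&gt;
per sample. The Python sketch below is an idealized illustration under stated&lt;br /&gt;
assumptions, not how the converter in the demo works: it uses a finite window of&lt;br /&gt;
2001 samples, and every name in it is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import math

SAMPLE_RATE = 44100.0   # Hz, same rate as the demo
TONE = 15000.0          # Hz, about 2.94 samples per cycle

def sinc(u):
    # Normalized sinc: sin(pi u) / (pi u), with the limit value 1 at u == 0.
    if u == 0.0:
        return 1.0
    return math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, t_in_samples):
    # Whittaker-Shannon interpolation: one sinc per sample, all summed.
    return sum(x * sinc(t_in_samples - n) for n, x in enumerate(samples))

samples = [math.sin(2.0 * math.pi * TONE * n / SAMPLE_RATE) for n in range(2001)]
t = 1000.5  # halfway between two samples, in the middle of the window
ideal = math.sin(2.0 * math.pi * TONE * t / SAMPLE_RATE)
error = abs(reconstruct(samples, t) - ideal)
print(error)  # small; what remains comes from truncating the infinite sum
```
&lt;br /&gt;
The value between the samples comes out as the original sine, even though a&lt;br /&gt;
connect-the-dots drawing of the same samples looks nothing like one.&lt;br /&gt;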
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
[[close to TP; freeze display; draw in zero-order]]&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a zero-order hold, and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
&lt;br /&gt;
[[ Wikipedia DAC lookup + scroll down to hold image]]&lt;br /&gt;
So, anyone who looks up digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
fat bits in an image editor.&lt;br /&gt;
&lt;br /&gt;
[[gimp RMD animation]]&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==bit-depth==&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
_also_ smooth regardless of the bit depth.  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
dither down to eight bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the eight-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we quantize it,&lt;br /&gt;
and quantization adds noise.&lt;br /&gt;
&lt;br /&gt;
[[panel 2; demonstrate changing bit depth on tablet ]]&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor. [[demonstrate changing bit depth on tablet]].&lt;br /&gt;
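&lt;br /&gt;
To put rough numbers on that, the Python sketch below quantizes a full-scale sine&lt;br /&gt;
by plain rounding and measures the signal-to-noise ratio at several bit depths.&lt;br /&gt;
It is an illustration under assumptions rather than the demo code: there is no&lt;br /&gt;
dither, the 997Hz tone is chosen only so that sample values do not repeat, and the&lt;br /&gt;
function name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import math

def quantization_snr_db(bits, count=44100):
    step = 2.0 / (2 ** bits)          # full scale spans -1.0 .. +1.0
    signal_power = 0.0
    noise_power = 0.0
    for n in range(count):
        x = math.sin(2.0 * math.pi * 997.0 * n / 44100.0)
        q = round(x / step) * step    # nearest quantization level
        signal_power += x * x
        noise_power += (q - x) ** 2
    return 10.0 * math.log10(signal_power / noise_power)

for bits in (8, 16, 24):
    print(bits, round(quantization_snr_db(bits), 1))
```
&lt;br /&gt;
Each extra bit should lower the noise floor by roughly 6dB, so the figures&lt;br /&gt;
printed for 8, 16, and 24 bits land about 48dB apart.&lt;br /&gt;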
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our eight-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
[[audio: eight bit sine]]&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
[[audio: hit notch + gain button]]&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a gaussian dither then it&#039;s mathematically&lt;br /&gt;
equivalent in every way. It _is_ tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of magnetic audio tape in bits instead of decibels, in order to put things in a&lt;br /&gt;
digital perspective.  Compact cassettes...&lt;br /&gt;
&lt;br /&gt;
[[ reveal cassette ]]&lt;br /&gt;
&lt;br /&gt;
for those of you who are old enough to remember them, could reach as&lt;br /&gt;
deep as nine bits in perfect conditions, though five to six bits was&lt;br /&gt;
more typical, especially if it was a recording made on a tape&lt;br /&gt;
deck. That&#039;s right... your mix tapes were only about six bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional open reel tape used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits _with_ advanced noise reduction.  And&lt;br /&gt;
that&#039;s why seeing &#039;D D D&#039; on a Compact Disc used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==dither==&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with dither, so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
[[ Illustration: quantization ]]&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  Obvious,&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
[[Illustration: correlated quantization noise ]]&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[[show/attribute the dither paper]]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t drown out or mask&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Let&#039;s _watch_ what dither does.  The signal generator has too much&lt;br /&gt;
noise for this test so&lt;br /&gt;
&lt;br /&gt;
[[close: panel 3]]&lt;br /&gt;
&lt;br /&gt;
we&#039;ll produce a mathematically perfect sine wave with the ThinkPad&lt;br /&gt;
[[press]]&lt;br /&gt;
&lt;br /&gt;
and quantize it to eight bits [[press]]&lt;br /&gt;
&lt;br /&gt;
with dithering. [[press]]&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display&lt;br /&gt;
&lt;br /&gt;
[[ show outscope ]]  and output scope&lt;br /&gt;
&lt;br /&gt;
[[ show analyzer]]  and, once the analog spectrum analyzer catches up...&lt;br /&gt;
&lt;br /&gt;
[[time accel sweep]] a clean frequency peak with a uniform noise floor&lt;br /&gt;
on both spectral displays&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]  just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off. [[ deactivate dither ]]&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At eight bits this effect is exaggerated. At sixteen bits, [[click 16]]&lt;br /&gt;
&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
[[draw line across -100]]&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
&lt;br /&gt;
[[ overview: waveform ]]&lt;br /&gt;
&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and...&lt;br /&gt;
&lt;br /&gt;
[[dither on]]&lt;br /&gt;
&lt;br /&gt;
there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
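&lt;br /&gt;
The same experiment can be sketched in a few lines of Python. This is a minimal&lt;br /&gt;
illustration under assumptions, not the demo code: it assumes triangular dither&lt;br /&gt;
two steps wide (the type of dither used in the demo is not stated here), a 997Hz&lt;br /&gt;
tone, and a fixed random seed, and all names are hypothetical. It quantizes a&lt;br /&gt;
quarter-step sine with and without dither, then correlates the dithered output&lt;br /&gt;
against the input to estimate how much of the tone survived.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

def quantize(x):
    # Round to the nearest integer step; one step is one least significant bit.
    return math.floor(x + 0.5)

rng = random.Random(0)
count = 200000
amplitude = 0.25  # a quarter of one step
plain_energy = 0
correlation = 0.0
for n in range(count):
    s = math.sin(2.0 * math.pi * 997.0 * n / 44100.0)
    x = amplitude * s
    plain_energy += quantize(x) ** 2
    # Triangular dither spanning two steps: the sum of two uniform values.
    d = rng.random() + rng.random() - 1.0
    correlation += quantize(x + d) * s
recovered = 2.0 * correlation / count
print(plain_energy)          # 0: every undithered sample rounded to zero
print(round(recovered, 2))   # estimated amplitude of the tone after dithering
```
&lt;br /&gt;
Without dither nothing is left of the tone; with dither the estimate comes out&lt;br /&gt;
close to the original quarter step, buried in noise but intact.&lt;br /&gt;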
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and difficult to&lt;br /&gt;
notice as possible.&lt;br /&gt;
&lt;br /&gt;
[[panel 5]]&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
&lt;br /&gt;
[[annotate: underline 2-4kHz]]&lt;br /&gt;
&lt;br /&gt;
[[click shaped]]&lt;br /&gt;
&lt;br /&gt;
We can shape dithering noise away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
&lt;br /&gt;
[[annotate: arrow to HF]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
[[close]]&lt;br /&gt;
[[unshaped white to shaped]]&lt;br /&gt;
 &lt;br /&gt;
[[out]] Lastly, dithered quantization noise _is_ higher power overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a VU meter during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
  [[ panel 6 audio :: flat, unmodulated ]]&lt;br /&gt;
We&#039;ll also modulate the input signal like this:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated ]]&lt;br /&gt;
...to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
  [[ panel 6 audio :: shaped, modulated, notch ]]&lt;br /&gt;
  [[ reset panel :: shaped, unmodulated, no notch ]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below full scale.  Maybe&lt;br /&gt;
if the CD had been 14 bits as originally designed, dither _might_ be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
...and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
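&lt;br /&gt;
The rippling is easy to reproduce by summing the series directly. The Python&lt;br /&gt;
sketch below is an idealized illustration with a hypothetical function name: it&lt;br /&gt;
assumes a perfect brick-wall cutoff, so a 1kHz square wave keeps only its odd&lt;br /&gt;
harmonics up to 19kHz, and it scans one cycle for the highest point.&lt;br /&gt;
&lt;br /&gt;
```python
import math

def bandlimited_square(phase, top_harmonic=19):
    # phase is the position within one cycle, from 0.0 to 1.0.
    total = 0.0
    for k in range(1, top_harmonic + 1, 2):   # odd harmonics only
        total += math.sin(2.0 * math.pi * k * phase) / k
    return 4.0 / math.pi * total

peak = max(bandlimited_square(i / 10000.0) for i in range(10000))
print(round(peak, 3))  # above 1.0: the ripple overshoots the flat top
```
&lt;br /&gt;
Raising top_harmonic squeezes the ripple closer to the edge, but the height of&lt;br /&gt;
the overshoot does not shrink toward zero.&lt;br /&gt;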
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal _is_.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples... implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
&lt;br /&gt;
==epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13884</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13884"/>
		<updated>2013-02-25T05:55:12Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: /* veritas ex machina */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Intro==&lt;br /&gt;
&lt;br /&gt;
{{rounded|content=Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[http://en.wikipedia.org/wiki/Digital-to-analog_converter convert from digital back to analog].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest,&lt;br /&gt;
let&#039;s take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.}}&lt;br /&gt;
&lt;br /&gt;
==veritas ex machina==&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3325 ]]&lt;br /&gt;
&lt;br /&gt;
First up, we need a [http://en.wikipedia.org/wiki/Function_generator signal generator]&lt;br /&gt;
 to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
[[ close on 2246 ]]&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on&lt;br /&gt;
[http://en.wikipedia.org/wiki/Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29 analog oscilloscopes],&lt;br /&gt;
like this&lt;br /&gt;
Tektronix 2246 from the mid-90s, one of the last and very best analog&lt;br /&gt;
scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3585]]&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the&lt;br /&gt;
[http://en.wikipedia.org/wiki/Spectral_density#Electrical_engineering frequency spectrum] of our signals using an&lt;br /&gt;
[http://en.wikipedia.org/wiki/Spectrum_analyzer#Swept-tuned analog spectrum analyzer], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the&lt;br /&gt;
specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1kHz&lt;br /&gt;
sine wave at one Volt [http://en.wikipedia.org/wiki/Amplitude#Root_mean_square_amplitude RMS].&lt;br /&gt;
We see the sine wave on the oscilloscope and can verify that it is indeed&lt;br /&gt;
1kHz at one volt RMS, which is about 2.8V&lt;br /&gt;
[http://en.wikipedia.org/wiki/Amplitude#Peak-to-peak_amplitude peak-to-peak],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
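&lt;br /&gt;
As a quick sanity check of that figure, here is a minimal Python sketch of the RMS to peak-to-peak conversion for a pure sine wave.&lt;br /&gt;
The function name is made up for illustration and is not part of the demo software.&lt;br /&gt;

```python
import math

def sine_rms_to_peak_to_peak(v_rms):
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    peak = v_rms * math.sqrt(2.0)   # a sine peaks at sqrt(2) times its RMS value
    return 2.0 * peak               # peak-to-peak spans +peak down to -peak

print(round(sine_rms_to_peak_to_peak(1.0), 2))  # prints 2.83
```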
&lt;br /&gt;
The analyzer also shows some low-level&lt;br /&gt;
[http://en.wikipedia.org/wiki/White_noise white noise]&lt;br /&gt;
and just a bit of&lt;br /&gt;
[http://en.wikipedia.org/wiki/Harmonic_distortion#Harmonic_distortion harmonic distortion],&lt;br /&gt;
with the highest peak about 70[http://en.wikipedia.org/wiki/Decibel dB] or so below&lt;br /&gt;
[http://en.wikipedia.org/wiki/Fundamental_frequency the fundamental].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
[[ cut to complete setup ]]&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
Flatness, linearity, jitter, noise behavior, everything... you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
[[out to see emagic initialize and digital waveform appear on TP ]]&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&lt;br /&gt;
==stairsteps==&lt;br /&gt;
&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
[[close to 3325]] &lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
[[close to TP: spectrum]]&lt;br /&gt;
We digitize our signal to 16 bit PCM at 44.1kHz, same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
[[close to SA]]&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
high-impedance input being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
[[close to TP ; overview/waveform ]]&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope: press CH1 button to show waveform]]&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
OK, 1kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to Nyquist, say 15kHz.&lt;br /&gt;
&lt;br /&gt;
[[set 3325 to 15kHz ]]&lt;br /&gt;
Now the sine wave is represented by fewer than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
[[close to TP]]&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
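&lt;br /&gt;
One way to poke at this claim numerically is sinc interpolation, the textbook form of bandlimited reconstruction.&lt;br /&gt;
The sketch below samples a 15kHz sine at 44.1kHz and then estimates its value at a point between two samples.&lt;br /&gt;
It is only an approximation under stated assumptions: the ideal sum runs over every sample, while this one is truncated, and the names and the chosen position are illustrative.&lt;br /&gt;

```python
import math

SAMPLE_RATE = 44100.0   # Hz, as in the demo
TONE_HZ = 15000.0       # the case with fewer than three samples per cycle

def tone(t):
    """The original continuous-time signal: a bandlimited 15 kHz sine."""
    return math.sin(2.0 * math.pi * TONE_HZ * t)

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with the value at zero filled in."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(position, half_width=2000):
    """Estimate the signal at a fractional sample position from nearby samples.

    Ideal reconstruction sums over every sample; this sketch truncates the
    sum to half_width samples on each side, so it is only approximate.
    """
    center = int(round(position))
    total = 0.0
    for n in range(center - half_width, center + half_width + 1):
        total += tone(n / SAMPLE_RATE) * sinc(position - n)
    return total

position = 1000.37                      # a point that falls between two samples
estimate = reconstruct(position)
actual = tone(position / SAMPLE_RATE)
print(estimate, actual)                 # the two values should nearly agree
```

Widening half_width should shrink the remaining difference, which comes from truncating the sum rather than from the sampling itself.&lt;br /&gt;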
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
[[close to TP; freeze display; draw in zero-order]]&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a zero-order hold, and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
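&lt;br /&gt;
A zero-order hold is simple enough to sketch in a few lines of Python.&lt;br /&gt;
This is an illustration on a finer time grid, not a model of any particular converter; the function name and the hold factor are assumptions.&lt;br /&gt;

```python
def zero_order_hold(samples, hold_factor):
    """Repeat each sample hold_factor times to mimic a stairstep on a finer time grid."""
    held = []
    for value in samples:
        held.extend([value] * hold_factor)
    return held

print(zero_order_hold([0.0, 0.7, 1.0, 0.7], 3))
# prints [0.0, 0.0, 0.0, 0.7, 0.7, 0.7, 1.0, 1.0, 1.0, 0.7, 0.7, 0.7]
```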
&lt;br /&gt;
[[ Wikipedia DAC lookup + scroll down to hold image]]&lt;br /&gt;
So, anyone who looks up digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
fat bits in an image editor.&lt;br /&gt;
&lt;br /&gt;
[[gimp RMD animation]]&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==bit-depth==&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
_also_ smooth regardless of the bit depth.  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
dither down to eight bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the eight-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we quantize it,&lt;br /&gt;
and quantization adds noise.&lt;br /&gt;
&lt;br /&gt;
[[panel 2; demonstrate changing bit depth on tablet ]]&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise is added, and so the level of the&lt;br /&gt;
noise floor. [[demonstrate changing bit depth on tablet]]&lt;br /&gt;
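&lt;br /&gt;
As a rough numeric check, the sketch below quantizes a full-scale sine without dither and compares the measured signal-to-noise ratio with the common rule of thumb of 6.02 dB per bit plus 1.76 dB.&lt;br /&gt;
The function names, test frequency, and sample count are illustrative assumptions, not taken from the demo software, and adding dither would raise the floor by a few decibels.&lt;br /&gt;

```python
import math

def quantize(value, bits):
    """Round a value in [-1, 1] to the nearest step of a uniform quantizer (no dither)."""
    step = 2.0 / (2 ** bits)
    return round(value / step) * step

def measured_snr_db(bits, num_samples=100000):
    """Signal-to-noise ratio of a full-scale sine after undithered quantization."""
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        x = math.sin(2.0 * math.pi * 0.0123456789 * n)   # arbitrary test frequency
        error = quantize(x, bits) - x
        signal_power += x * x
        noise_power += error * error
    return 10.0 * math.log10(signal_power / noise_power)

def rule_of_thumb_snr_db(bits):
    """Textbook estimate for a full-scale sine and uniformly distributed error."""
    return 6.02 * bits + 1.76

for bits in (8, 16):
    print(bits, round(measured_snr_db(bits), 1), round(rule_of_thumb_snr_db(bits), 2))
```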
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our eight-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
[[audio: eight bit sine]]&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
[[audio: hit notch + gain button]]&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a gaussian dither then it&#039;s mathematically&lt;br /&gt;
equivalent in every way. It _is_ tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of magnetic audio tape in bits instead of decibels, in order to put things in a&lt;br /&gt;
digital perspective.  Compact cassettes...&lt;br /&gt;
&lt;br /&gt;
[[ reveal cassette ]]&lt;br /&gt;
&lt;br /&gt;
for those of you who are old enough to remember them, could reach as&lt;br /&gt;
deep as nine bits in perfect conditions, though five to six bits was&lt;br /&gt;
more typical, especially if it was a recording made on a tape&lt;br /&gt;
deck. That&#039;s right... your mix tapes were only about six bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional open reel tape used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits _with_ advanced noise reduction.  And&lt;br /&gt;
that&#039;s why seeing &#039;D D D&#039; on a Compact Disc used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==dither==&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with dither, so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
[[ Illustration: quantization ]]&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  Obvious,&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
[[Illustration: correlated quantization noise ]]&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[[show/attribute the dither paper]]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t drown out or mask&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Let&#039;s _watch_ what dither does.  The signal generator has too much&lt;br /&gt;
noise for this test so&lt;br /&gt;
&lt;br /&gt;
[[close: panel 3]]&lt;br /&gt;
&lt;br /&gt;
we&#039;ll produce a mathematically perfect sine wave with the ThinkPad&lt;br /&gt;
[[press]]&lt;br /&gt;
&lt;br /&gt;
and quantize it to eight bits [[press]]&lt;br /&gt;
&lt;br /&gt;
with dithering. [[press]]&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display&lt;br /&gt;
&lt;br /&gt;
[[ show outscope ]]  and output scope&lt;br /&gt;
&lt;br /&gt;
[[ show analyzer]]  and, once the analog spectrum analyzer catches up...&lt;br /&gt;
&lt;br /&gt;
[[time accel sweep]] a clean frequency peak with a uniform noise floor&lt;br /&gt;
on both spectral displays&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]  just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off. [[ deactivate dither ]]&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor, piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At eight bits this effect is exaggerated. At sixteen bits, [[click 16]]&lt;br /&gt;
&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
[[draw line across -100]]&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
&lt;br /&gt;
[[ overview: waveform ]]&lt;br /&gt;
&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and...&lt;br /&gt;
&lt;br /&gt;
[[dither on]]&lt;br /&gt;
&lt;br /&gt;
there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
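&lt;br /&gt;
The quarter-bit case can be reproduced with a minimal Python sketch.&lt;br /&gt;
It assumes triangular (TPDF) dither spanning plus or minus one quantizer step, which may not be the dither used in the demo, and every name and constant here is illustrative.&lt;br /&gt;
Amplitudes are measured in quantizer steps, so rounding to the nearest integer is the quantizer.&lt;br /&gt;

```python
import math
import random

random.seed(1)   # fixed seed so the sketch is repeatable

AMPLITUDE_LSB = 0.25        # sine amplitude in units of one quantizer step
CYCLES_PER_SAMPLE = 0.01    # 2000 whole cycles over the run below
NUM_SAMPLES = 200000

def input_sample(n):
    return AMPLITUDE_LSB * math.sin(2.0 * math.pi * CYCLES_PER_SAMPLE * n)

def tpdf_dither():
    """Triangular dither spanning -1 to +1 step: the difference of two uniform values."""
    return random.random() - random.random()

undithered = [round(input_sample(n)) for n in range(NUM_SAMPLES)]
dithered = [round(input_sample(n) + tpdf_dither()) for n in range(NUM_SAMPLES)]

def recovered_amplitude(samples):
    """Correlate against the known sine to estimate how much of it survived."""
    total = 0.0
    for n, value in enumerate(samples):
        total += value * math.sin(2.0 * math.pi * CYCLES_PER_SAMPLE * n)
    return 2.0 * total / len(samples)

print(set(undithered))                 # only zero: the signal is gone
print(recovered_amplitude(dithered))   # close to 0.25
```

Without dither every sample rounds to zero; with dither the individual samples are noisy, but the sine is still in there at its original level.&lt;br /&gt;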
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and difficult to&lt;br /&gt;
notice as possible.&lt;br /&gt;
&lt;br /&gt;
[[panel 5]]&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
&lt;br /&gt;
[[annotate: underline 2-4kHz]]&lt;br /&gt;
&lt;br /&gt;
[[click shaped]]&lt;br /&gt;
&lt;br /&gt;
We can shape dithering noise away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
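&lt;br /&gt;
To give a feel for how noise can be pushed toward high frequencies, here is a sketch of first-order error feedback, one of the simplest noise-shaping schemes.&lt;br /&gt;
It shapes plain rounding error with no dither added, so it is a stand-in for the idea rather than the shaped dither used in the demo, and the names and input values are illustrative.&lt;br /&gt;

```python
import math

def quantize_with_error_feedback(samples):
    """Round to whole steps, feeding each rounding error into the next sample.

    The output equals the input plus the difference of consecutive rounding
    errors, which moves the error energy toward high frequencies.
    """
    output = []
    previous_error = 0.0
    for x in samples:
        target = x - previous_error      # compensate for the last error
        y = round(target)
        previous_error = y - target      # error made on this sample
        output.append(y)
    return output

# A slowly varying input measured in quantizer steps (values are illustrative).
samples = [3.3 * math.sin(2.0 * math.pi * 0.001 * n) for n in range(5000)]
shaped = quantize_with_error_feedback(samples)

# The summed error telescopes to the final rounding error, so it never drifts
# past half a step: the error has almost no low-frequency content.
shaped_drift = sum(shaped) - sum(samples)
print(shaped_drift)
```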
&lt;br /&gt;
[[annotate: arrow to HF]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
[[close]]&lt;br /&gt;
[[unshaped white to shaped]]&lt;br /&gt;
 &lt;br /&gt;
[[out]] Lastly, dithered quantization noise _is_ higher power overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a VU meter during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
  [[ panel 6 audio :: flat, unmodulated ]]&lt;br /&gt;
We&#039;ll also modulate the input signal like this:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated ]]&lt;br /&gt;
...to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
  [[ panel 6 audio :: shaped, modulated, notch ]]&lt;br /&gt;
  [[ reset panel :: shaped, unmodulated, no notch ]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below full scale.  Maybe&lt;br /&gt;
if the CD had been 14 bits as originally designed, dither _might_ be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
...and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
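&lt;br /&gt;
The bandlimited square wave and its ripples can be rebuilt from the harmonic sum directly.&lt;br /&gt;
The sketch below adds up the odd harmonics of a 1kHz square wave that fit under an idealized 20kHz cutoff and reports the height of the tallest ripple.&lt;br /&gt;
The brick-wall cutoff and all names are assumptions made for illustration; a real anti-aliasing filter has a finite slope.&lt;br /&gt;

```python
import math

FUNDAMENTAL_HZ = 1000.0
CUTOFF_HZ = 20000.0

def bandlimited_square(t):
    """Sum the odd harmonics of a 1 kHz square wave that fit below the cutoff."""
    total = 0.0
    harmonic = 1
    while CUTOFF_HZ - harmonic * FUNDAMENTAL_HZ >= 0.0:
        total += math.sin(2.0 * math.pi * harmonic * FUNDAMENTAL_HZ * t) / harmonic
        harmonic += 2
    return 4.0 / math.pi * total

points = 10000
period = 1.0 / FUNDAMENTAL_HZ
peak = max(bandlimited_square(i * period / points) for i in range(points))
print(peak)   # about 1.18, versus 1.0 for the ideal square wave
```

That overshoot of roughly nine percent of the full step is the Gibbs ripple, and it comes from nothing more than leaving out the harmonics above the cutoff.&lt;br /&gt;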
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
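&lt;br /&gt;
The same point can be made with a toy model in Python.&lt;br /&gt;
Here a signal is just a table of frequencies and amplitudes, and the filter is an idealized brick wall with no delay; both are simplifying assumptions, and the names are illustrative.&lt;br /&gt;

```python
CUTOFF_HZ = 20000.0

def ideal_lowpass(spectrum, cutoff_hz=CUTOFF_HZ):
    """Keep only components at or below the cutoff; spectrum maps Hz to amplitude."""
    return {freq: amp for freq, amp in spectrum.items() if cutoff_hz >= freq}

# Odd harmonics of a 1 kHz square wave, with amplitudes proportional to 1/k.
square_wave = {1000.0 * k: 1.0 / k for k in range(1, 100, 2)}

once = ideal_lowpass(square_wave)
twice = ideal_lowpass(once)
print(len(square_wave), len(once), once == twice)   # prints 50 10 True
```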
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal _is_.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples... implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
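&lt;br /&gt;
Sub-sample timing is easy to check numerically.&lt;br /&gt;
This sketch uses a 1kHz sine rather than the square wave edge from the demo, delays it by 0.3 of a sample, and then recovers that delay from nothing but the sample values.&lt;br /&gt;
The names, the delay, and the correlation method are illustrative assumptions.&lt;br /&gt;

```python
import math

SAMPLE_RATE = 44100.0
TONE_HZ = 1000.0
NUM_SAMPLES = 441          # exactly ten cycles of the 1 kHz tone

def sample_delayed_tone(delay_in_samples):
    """Sample a 1 kHz sine that starts delay_in_samples later than sample zero."""
    return [
        math.sin(2.0 * math.pi * TONE_HZ * (n - delay_in_samples) / SAMPLE_RATE)
        for n in range(NUM_SAMPLES)
    ]

def estimate_delay(samples):
    """Recover the delay, in samples, by correlating with sine and cosine references."""
    in_phase = 0.0
    quadrature = 0.0
    for n, value in enumerate(samples):
        angle = 2.0 * math.pi * TONE_HZ * n / SAMPLE_RATE
        in_phase += value * math.sin(angle)
        quadrature += value * math.cos(angle)
    phase_lag = math.atan2(-quadrature, in_phase)
    return phase_lag * SAMPLE_RATE / (2.0 * math.pi * TONE_HZ)

print(estimate_delay(sample_delayed_tone(0.3)))   # very close to 0.3
```

The delay is far smaller than one sample period, yet it is fully encoded in the sample values.&lt;br /&gt;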
&lt;br /&gt;
==epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13883</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13883"/>
		<updated>2013-02-25T05:54:14Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: More wikipedia links and some links to the Agilent device pages&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Intro==&lt;br /&gt;
&lt;br /&gt;
{{rounded|content=Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[http://en.wikipedia.org/wiki/Digital-to-analog_converter convert from digital back to analog].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest,&lt;br /&gt;
let&#039;s take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.}}&lt;br /&gt;
&lt;br /&gt;
==veritas ex machina==&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3325 ]]&lt;br /&gt;
&lt;br /&gt;
First up, we need a [http://en.wikipedia.org/wiki/Function_generator signal generator]&lt;br /&gt;
to provide us with analog input&lt;br /&gt;
signals--in this case, an&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&amp;amp;nid=-536900197.536896863&amp;amp;cc=SE&amp;amp;lc=swe HP3325]&lt;br /&gt;
from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
[[ close on 2246 ]]&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on&lt;br /&gt;
[http://en.wikipedia.org/wiki/Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29 analog oscilloscopes],&lt;br /&gt;
like this&lt;br /&gt;
Tektronix 2246 from the mid-90s, one of the last and very best analog&lt;br /&gt;
scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3585]]&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the&lt;br /&gt;
[http://en.wikipedia.org/wiki/Spectral_density#Electrical_engineering frequency spectrum] of our signals using an&lt;br /&gt;
[http://en.wikipedia.org/wiki/Spectrum_analyzer#Swept-tuned analog spectrum analyzer], this&lt;br /&gt;
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&amp;amp;nid=-536900197.536897319&amp;amp;cc=SE&amp;amp;lc=swe HP3585]&lt;br /&gt;
from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has&lt;br /&gt;
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],&lt;br /&gt;
but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the&lt;br /&gt;
specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1kHz&lt;br /&gt;
sine wave at one volt [http://en.wikipedia.org/wiki/Amplitude#Root_mean_square_amplitude RMS].&lt;br /&gt;
We see the sine wave on the oscilloscope and can verify that it is indeed&lt;br /&gt;
1kHz at one volt RMS, which is 2.8V&lt;br /&gt;
[http://en.wikipedia.org/wiki/Amplitude#Peak-to-peak_amplitude peak-to-peak],&lt;br /&gt;
and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
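&lt;br /&gt;
For a pure sine wave, the RMS and peak-to-peak figures are related by simple arithmetic. A minimal Python sketch of that conversion (the variable names are illustrative and not taken from the demo software):&lt;br /&gt;
&lt;br /&gt;
```python
import math

v_rms = 1.0                          # volts RMS, the generator setting
v_peak = v_rms * math.sqrt(2.0)      # a sine peaks at sqrt(2) times its RMS value
v_peak_to_peak = 2.0 * v_peak        # from the negative peak to the positive peak
print(round(v_peak_to_peak, 2), "V peak-to-peak")
```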
&lt;br /&gt;
The analyzer also shows some low-level&lt;br /&gt;
[http://en.wikipedia.org/wiki/White_noise white noise]&lt;br /&gt;
and just a bit of&lt;br /&gt;
[http://en.wikipedia.org/wiki/Harmonic_distortion#Harmonic_distortion harmonic distortion],&lt;br /&gt;
with the highest peak about 70[http://en.wikipedia.org/wiki/Decibel dB] or so below&lt;br /&gt;
[http://en.wikipedia.org/wiki/Fundamental_frequency the fundamental].&lt;br /&gt;
Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
[[ cut to complete setup ]]&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
Flatness, linearity, jitter, noise behavior, everything... you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
[[out to see emagic initialize and digital waveform appear on TP ]]&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&lt;br /&gt;
==stairsteps==&lt;br /&gt;
&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
[[close to 3325]] &lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
[[close to TP: spectrum]]&lt;br /&gt;
We digitize our signal to 16 bit PCM at 44.1kHz, same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
[[close to SA]]&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
high-impedance input being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
[[close to TP ; overview/waveform ]]&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope: press CH1 button to show waveform]]&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
OK, 1kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to Nyquist, say 15kHz.&lt;br /&gt;
&lt;br /&gt;
[[set 3325 to 15kHz ]]&lt;br /&gt;
Now the sine wave is represented by fewer than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
[[close to TP]]&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
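&lt;br /&gt;
The uniqueness claim can be checked numerically. The following Python sketch is an illustration only, not the code used in the video: it samples a 15kHz sine at 44.1kHz and evaluates the Whittaker-Shannon (sinc) interpolation formula halfway between two samples. The window length COUNT is an assumption; the ideal formula sums over infinitely many samples, so a finite window leaves a small truncation error.&lt;br /&gt;
&lt;br /&gt;
```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t):
    """Whittaker-Shannon interpolation at time t, measured in sample periods."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

RATE = 44100.0   # sample rate in Hz, as in the demo
FREQ = 15000.0   # test tone in Hz, just under three samples per cycle
COUNT = 4001     # finite window length (an assumption of this sketch)

samples = [math.sin(2.0 * math.pi * FREQ * n / RATE) for n in range(COUNT)]

# Evaluate halfway between two samples near the middle of the window,
# where the truncation error from the finite window is smallest.
t_mid = COUNT // 2 + 0.5
exact = math.sin(2.0 * math.pi * FREQ * t_mid / RATE)
estimate = reconstruct(samples, t_mid)
error = abs(estimate - exact)
print("exact", exact, "reconstructed", estimate, "error", error)
```
&lt;br /&gt;
The reconstructed value agrees with the original sine to within the truncation error, even though no sample was taken at that instant.&lt;br /&gt;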
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
[[close to TP; freeze display; draw in zero-order]]&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a zero-order hold, and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
&lt;br /&gt;
[[ Wikipedia DAC lookup + scroll down to hold image]]&lt;br /&gt;
So, anyone who looks up digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
fat bits in an image editor.&lt;br /&gt;
&lt;br /&gt;
[[gimp RMD animation]]&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==bit-depth==&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
&#039;&#039;also&#039;&#039; smooth regardless of the bit depth.  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
dither down to eight bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the eight-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we quantize it,&lt;br /&gt;
and quantization adds noise.&lt;br /&gt;
&lt;br /&gt;
[[panel 2; demonstrate changing bit depth on tablet ]]&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor. [[demonstrate changing bit depth on tablet]].&lt;br /&gt;
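&lt;br /&gt;
How the noise floor scales with bit depth can be estimated with a short experiment. This Python sketch is an illustration under stated assumptions rather than the demo application: it rounds a full-scale 997Hz sine without dither (the demo quantizes with dither, which raises the noise floor somewhat) and compares the measured signal-to-noise ratio with the textbook rule of thumb of about 6.02 dB per bit plus 1.76 dB. The function name and the test frequency are choices made for this example.&lt;br /&gt;
&lt;br /&gt;
```python
import math

def snr_db(bits, count=44100, freq=997.0, rate=44100.0):
    """Measure the SNR of a full-scale sine after rounding to the given bit depth."""
    step = 2.0 / (2 ** bits)          # quantizer step for a -1..+1 range
    signal_power = 0.0
    noise_power = 0.0
    for n in range(count):
        x = math.sin(2.0 * math.pi * freq * n / rate)
        q = round(x / step) * step    # plain rounding, no dither
        signal_power += x * x
        noise_power += (q - x) ** 2
    return 10.0 * math.log10(signal_power / noise_power)

for bits in (8, 16, 24):
    measured = snr_db(bits)
    rule = 6.02 * bits + 1.76         # textbook estimate for a full-scale sine
    print(bits, "bits: measured", round(measured, 1), "dB, rule of thumb", round(rule, 1), "dB")
```
&lt;br /&gt;
Each additional bit lowers the noise floor by about 6 dB.&lt;br /&gt;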
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our eight-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
[[audio: eight bit sine]]&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
[[audio: hit notch + gain button]]&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a Gaussian dither then it&#039;s mathematically&lt;br /&gt;
equivalent in every way. It &#039;&#039;is&#039;&#039; tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of magnetic audio tape in bits instead of decibels, in order to put things in a&lt;br /&gt;
digital perspective.  Compact cassettes...&lt;br /&gt;
&lt;br /&gt;
[[ reveal cassette ]]&lt;br /&gt;
&lt;br /&gt;
for those of you who are old enough to remember them, could reach as&lt;br /&gt;
deep as nine bits in perfect conditions, though five to six bits was&lt;br /&gt;
more typical, especially if it was a recording made on a tape&lt;br /&gt;
deck. That&#039;s right... your mix tapes were only about six bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional open reel tape used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits &#039;&#039;with&#039;&#039; advanced noise reduction.  And&lt;br /&gt;
that&#039;s why seeing &#039;D D D&#039; on a Compact Disc used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==dither==&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with dither, so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
[[ Illustration: quantization ]]&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  Obvious,&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
[[Illustration: correlated quantization noise ]]&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[[show/attribute the dither paper]]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t drown out or mask&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Let&#039;s &#039;&#039;watch&#039;&#039; what dither does.  The signal generator has too much&lt;br /&gt;
noise for this test so&lt;br /&gt;
&lt;br /&gt;
[[close: panel 3]]&lt;br /&gt;
&lt;br /&gt;
we&#039;ll produce a mathematically perfect sine wave with the ThinkPad&lt;br /&gt;
[[press]]&lt;br /&gt;
&lt;br /&gt;
and quantize it to eight bits [[press]]&lt;br /&gt;
&lt;br /&gt;
with dithering. [[press]]&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display&lt;br /&gt;
&lt;br /&gt;
[[ show outscope ]]  and output scope&lt;br /&gt;
&lt;br /&gt;
[[ show analyzer]]  and, once the analog spectrum analyzer catches up...&lt;br /&gt;
&lt;br /&gt;
[[time accel sweep]] a clean frequency peak with a uniform noise floor&lt;br /&gt;
on both spectral displays&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]  just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off. [[ deactivate dither ]]&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At eight bits this effect is exaggerated. At sixteen bits, [[click 16]]&lt;br /&gt;
&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
[[draw line across -100]]&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
&lt;br /&gt;
[[ overview: waveform ]]&lt;br /&gt;
&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and...&lt;br /&gt;
&lt;br /&gt;
[[dither on]]&lt;br /&gt;
&lt;br /&gt;
there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
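&lt;br /&gt;
The quarter-bit case can be reproduced in a few lines. The Python sketch below is a simplified illustration, not the demo code: it assumes triangular (TPDF) dither, which the text above does not specify, and measures how much of the original sine survives quantization by projecting the quantized output back onto the sine. All names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

random.seed(1)                 # fixed seed so runs are repeatable
COUNT = 100000
AMPLITUDE = 0.25               # sine amplitude in units of one quantizer step
OMEGA = 2.0 * math.pi * 1000.0 / 44100.0

def tpdf():
    """Triangular dither spanning one step either side (an assumed choice)."""
    return random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)

def measured_amplitude(use_dither):
    """Quantize the sine, then project the result back onto the sine."""
    total = 0.0
    for n in range(COUNT):
        reference = math.sin(OMEGA * n)
        x = AMPLITUDE * reference
        q = round(x + tpdf()) if use_dither else round(x)
        total += q * reference
    return 2.0 * total / COUNT

undithered_amp = measured_amplitude(False)
dithered_amp = measured_amplitude(True)
print("undithered:", undithered_amp)
print("dithered:  ", dithered_amp)
```
&lt;br /&gt;
Without dither every sample rounds to zero and the measured amplitude is zero. With dither the measured amplitude comes out close to the original quarter step, with the rest of the output being noise.&lt;br /&gt;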
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and difficult to&lt;br /&gt;
notice as possible.&lt;br /&gt;
&lt;br /&gt;
[[panel 5]]&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
&lt;br /&gt;
[[annotate: underline 2-4kHz]]&lt;br /&gt;
&lt;br /&gt;
[[click shaped]]&lt;br /&gt;
&lt;br /&gt;
We can shape dithering noise away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
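&lt;br /&gt;
The idea of moving noise toward high frequencies can be sketched with first-order error feedback, the simplest form of noise shaping. This is a stand-in chosen for brevity and is an assumption of this example; it is not the shaping filter used in the demo, which targets the ear&#039;s sensitive range specifically. The Python sketch uses a running sum of the error as a crude low-pass measurement: a small running sum means little low-frequency noise.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

random.seed(2)                 # fixed seed so runs are repeatable
COUNT = 10000
OMEGA = 2.0 * math.pi * 1000.0 / 44100.0

def tpdf():
    """Triangular dither spanning one step either side (an assumed choice)."""
    return random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)

def worst_running_error(shaped):
    """Largest magnitude reached by the running sum of (output - input)."""
    carried = 0.0      # previous quantization error, fed back when shaping
    running = 0.0
    worst = 0.0
    for n in range(COUNT):
        x = 20.0 * math.sin(OMEGA * n)          # input in quantizer steps
        target = x - carried if shaped else x   # subtract the last error when shaping
        y = round(target + tpdf())
        carried = y - target
        running += y - x
        worst = max(worst, abs(running))
    return worst

flat_worst = worst_running_error(False)
shaped_worst = worst_running_error(True)
print("flat dither:  ", flat_worst)
print("shaped dither:", shaped_worst)
```
&lt;br /&gt;
With error feedback the running sum stays within a couple of quantizer steps, while with flat dither it wanders much further. The total noise has not gone away; it has moved to frequencies where a slow average does not see it.&lt;br /&gt;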
&lt;br /&gt;
[[annotate: arrow to HF]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
[[close]]&lt;br /&gt;
[[unshaped white to shaped]]&lt;br /&gt;
&lt;br /&gt;
[[out]] Lastly, dithered quantization noise &#039;&#039;is&#039;&#039; higher power overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a VU meter during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
  [[ panel 6 audio :: flat, unmodulated ]]&lt;br /&gt;
We&#039;ll also modulate the input signal like this:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated ]]&lt;br /&gt;
...to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
  [[ panel 6 audio :: shaped, modulated, notch ]]&lt;br /&gt;
  [[ reset panel :: shaped, unmodulated, no notch ]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below full scale.  Maybe&lt;br /&gt;
if the CD had been 14 bits as originally designed, dither &#039;&#039;might&#039;&#039; be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows..&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
..and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
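&lt;br /&gt;
The bandlimited square wave can be built directly from its harmonics. This Python sketch is an idealized illustration: it assumes a perfect brick-wall cutoff at 20kHz, whereas the real filter is only &quot;quite sharp&quot;, and it sums the odd harmonics of a 1kHz square wave that fall at or below that cutoff. The names are chosen for this example.&lt;br /&gt;
&lt;br /&gt;
```python
import math

FUNDAMENTAL = 1000.0   # Hz, the square wave in the demo
CUTOFF = 20000.0       # Hz, treated here as an ideal brick-wall limit

# Odd harmonic numbers that survive the cutoff: 1, 3, ..., 19.
HARMONICS = range(1, int(CUTOFF / FUNDAMENTAL) + 1, 2)

def bandlimited_square(t):
    """Fourier series of a unit square wave, truncated at the cutoff."""
    return (4.0 / math.pi) * sum(
        math.sin(2.0 * math.pi * k * FUNDAMENTAL * t) / k for k in HARMONICS)

# Scan one half cycle finely and record the highest point of the ripple.
STEPS = 20000
half_cycle = 0.5 / FUNDAMENTAL
peak = max(bandlimited_square(half_cycle * i / STEPS) for i in range(STEPS + 1))
middle = bandlimited_square(half_cycle / 2.0)
print("peak", peak, "mid-cycle value", middle)
```
&lt;br /&gt;
The peak sits noticeably above the nominal flat-top level of 1.0, next to the edge, while the middle of the half cycle stays close to 1.0. That overshoot is the Gibbs ripple.&lt;br /&gt;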
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal &#039;&#039;is&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples.. implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
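&lt;br /&gt;
Sub-sample timing can be checked with the simplest bandlimited signal, a sine wave. The Python sketch below is an illustration rather than the demo code: it delays a 1kHz sine by a fraction of a sample period, samples it at 44.1kHz, and then recovers the delay from the samples alone by projecting onto sine and cosine. The delay value and names are assumptions of this example.&lt;br /&gt;
&lt;br /&gt;
```python
import math

RATE = 44100.0
FREQ = 1000.0
COUNT = 44100          # one second, a whole number of cycles
DELAY = 0.3            # true delay in sample periods, deliberately fractional

omega = 2.0 * math.pi * FREQ / RATE
samples = [math.sin(omega * (n - DELAY)) for n in range(COUNT)]

# Project onto sine and cosine to recover the phase, then convert to a delay.
in_phase = sum(s * math.sin(omega * n) for n, s in enumerate(samples))
quadrature = sum(s * math.cos(omega * n) for n, s in enumerate(samples))
estimated_delay = -math.atan2(quadrature, in_phase) / omega
print("true delay", DELAY, "estimated delay", estimated_delay)
```
&lt;br /&gt;
The estimate matches the true delay to far better than one sample period, so the timing information between samples is preserved.&lt;br /&gt;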
&lt;br /&gt;
==epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13882</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13882"/>
		<updated>2013-02-25T05:27:11Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Add link to wiki article on DACs&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Intro==&lt;br /&gt;
&lt;br /&gt;
{{rounded|content=Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you&lt;br /&gt;
[http://en.wikipedia.org/wiki/Digital-to-analog_converter convert from digital back to analog].&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest,&lt;br /&gt;
let&#039;s take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.}}&lt;br /&gt;
&lt;br /&gt;
==veritas ex machina==&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3325 ]]&lt;br /&gt;
&lt;br /&gt;
First up, we need a signal generator to provide us with analog input&lt;br /&gt;
signals--in this case, an HP3325 from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
[[ close on 2246 ]]&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on analog oscilloscopes, like this&lt;br /&gt;
Tektronix 2246 from the mid-90s, one of the last and very best analog&lt;br /&gt;
scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3585]]&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the frequency spectrum of our signals using an&lt;br /&gt;
analog spectrum analyzer, this HP3585 from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has a&lt;br /&gt;
rudimentary and hilariously large microcontroller, but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the&lt;br /&gt;
specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1kHz&lt;br /&gt;
sine wave at one volt RMS,&lt;br /&gt;
&lt;br /&gt;
we see the sine wave on the oscilloscope, can verify that it is indeed&lt;br /&gt;
1kHz at one volt RMS, which is 2.8V peak-to-peak, and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
&lt;br /&gt;
The analyzer also shows some low-level white noise and just a bit of&lt;br /&gt;
harmonic distortion, with the highest peak about 70dB or so below the&lt;br /&gt;
fundamental. Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
[[ cut to complete setup ]]&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
Flatness, linearity, jitter, noise behavior, everything... you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
[[out to see emagic initialize and digital waveform appear on TP ]]&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&lt;br /&gt;
==stairsteps==&lt;br /&gt;
&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
[[close to 3325]] &lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
[[close to TP: spectrum]]&lt;br /&gt;
We digitize our signal to 16-bit PCM at 44.1kHz, same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
[[close to SA]]&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
high-impedance input being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
[[close to TP ; overview/waveform ]]&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope: press CH1 button to show waveform]]&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
OK, 1kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to Nyquist, say 15kHz.&lt;br /&gt;
&lt;br /&gt;
[[set 3325 to 15kHz ]]&lt;br /&gt;
Now the sine wave is represented by less than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
[[close to TP]]&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
[[close to TP; freeze display; draw in zero-order]]&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a zero-order hold, and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
&lt;br /&gt;
[[ Wikipedia DAC lookup + scroll down to hold image]]&lt;br /&gt;
So, anyone who looks up digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
fat bits in an image editor.&lt;br /&gt;
&lt;br /&gt;
[[gimp RMD animation]]&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==bit-depth==&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
&#039;&#039;also&#039;&#039; smooth regardless of the bit depth.  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
dither down to eight bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the eight-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we quantize it,&lt;br /&gt;
and quantization adds noise.&lt;br /&gt;
&lt;br /&gt;
[[panel 2; demonstrate changing bit depth on tablet ]]&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise is added, and so the level of the&lt;br /&gt;
noise floor. [[demonstrate changing bit depth on tablet]]&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our eight-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
[[audio: eight bit sine]]&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
[[audio: hit notch + gain button]]&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a Gaussian dither then it&#039;s mathematically&lt;br /&gt;
equivalent in every way. It &#039;&#039;is&#039;&#039; tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of magnetic audio tape in bits instead of decibels, in order to put things in a&lt;br /&gt;
digital perspective.  Compact cassettes...&lt;br /&gt;
&lt;br /&gt;
[[ reveal cassette ]]&lt;br /&gt;
&lt;br /&gt;
for those of you who are old enough to remember them, could reach as&lt;br /&gt;
deep as nine bits in perfect conditions, though five to six bits was&lt;br /&gt;
more typical, especially if it was a recording made on a tape&lt;br /&gt;
deck. That&#039;s right... your mix tapes were only about six bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional open reel tape used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits &#039;&#039;with&#039;&#039; advanced noise reduction.  And&lt;br /&gt;
that&#039;s why seeing &#039;D D D&#039; on a Compact Disc used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==dither==&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with dither, so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
[[ Illustration: quantization ]]&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  Obvious,&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
[[Illustration: correlated quantization noise ]]&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[[show/attribute the dither paper]]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t drown out or mask&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Let&#039;s &#039;&#039;watch&#039;&#039; what dither does.  The signal generator has too much&lt;br /&gt;
noise for this test so&lt;br /&gt;
&lt;br /&gt;
[[close: panel 3]]&lt;br /&gt;
&lt;br /&gt;
we&#039;ll produce a mathematically perfect sine wave with the ThinkPad&lt;br /&gt;
[[press]]&lt;br /&gt;
&lt;br /&gt;
and quantize it to eight bits [[press]]&lt;br /&gt;
&lt;br /&gt;
with dithering. [[press]]&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display&lt;br /&gt;
&lt;br /&gt;
[[ show outscope ]]  and output scope&lt;br /&gt;
&lt;br /&gt;
[[ show analyzer]]  and, once the analog spectrum analyzer catches up...&lt;br /&gt;
&lt;br /&gt;
[[time accel sweep]] a clean frequency peak with a uniform noise floor&lt;br /&gt;
on both spectral displays&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]  just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off. [[ deactivate dither ]]&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At eight bits this effect is exaggerated. At sixteen bits, [[click 16]]&lt;br /&gt;
&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
[[draw line across -100]]&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
&lt;br /&gt;
[[ overview: waveform ]]&lt;br /&gt;
&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and...&lt;br /&gt;
&lt;br /&gt;
[[dither on]]&lt;br /&gt;
&lt;br /&gt;
there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and difficult to&lt;br /&gt;
notice as possible.&lt;br /&gt;
&lt;br /&gt;
[[panel 5]]&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
&lt;br /&gt;
[[annotate: underline 2-4kHz]]&lt;br /&gt;
&lt;br /&gt;
[[click shaped]]&lt;br /&gt;
&lt;br /&gt;
We can shape dithering noise away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
&lt;br /&gt;
[[annotate: arrow to HF]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
[[close]]&lt;br /&gt;
[[unshaped white to shaped]]&lt;br /&gt;
 &lt;br /&gt;
[[out]] Lastly, dithered quantization noise &#039;&#039;is&#039;&#039; higher power overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a VU meter during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
  [[ panel 6 audio :: flat, unmodulated ]]&lt;br /&gt;
We&#039;ll also modulate the input signal like this:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated ]]&lt;br /&gt;
...to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
  [[ panel 6 audio :: shaped, modulated, notch ]]&lt;br /&gt;
  [[ reset panel :: shaped, unmodulated, no notch ]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below full scale.  Maybe&lt;br /&gt;
if the CD had been 14 bits as originally designed, dither &#039;&#039;might&#039;&#039; be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
...and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal &#039;&#039;is&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples... implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
&lt;br /&gt;
==epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13881</id>
		<title>Videos/Digital Show and Tell</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&amp;diff=13881"/>
		<updated>2013-02-25T05:23:56Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Add links&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Intro==&lt;br /&gt;
&lt;br /&gt;
{{rounded|content=Hi, I&#039;m Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].&lt;br /&gt;
&lt;br /&gt;
A few months ago, I wrote&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don&#039;t make sense].&lt;br /&gt;
In the article, I&lt;br /&gt;
mentioned--almost in passing--that a digital waveform is&lt;br /&gt;
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],&lt;br /&gt;
and you certainly don&#039;t get a stairstep when you convert&lt;br /&gt;
from digital back to analog.&lt;br /&gt;
&lt;br /&gt;
Of everything in the entire article, &#039;&#039;&#039;that&#039;&#039;&#039; was the number one thing&lt;br /&gt;
people wrote about. In fact, more than half the mail I got was questions and&lt;br /&gt;
comments about basic digital signal behavior.  Since there&#039;s interest,&lt;br /&gt;
let&#039;s take a little time to play with some &#039;&#039;simple&#039;&#039; digital signals.}}&lt;br /&gt;
&lt;br /&gt;
==veritas ex machina==&lt;br /&gt;
&lt;br /&gt;
Pretend for a moment that we have no idea how digital signals really&lt;br /&gt;
behave. In that case it doesn&#039;t make sense for us to use digital test&lt;br /&gt;
equipment either.  Fortunately for this exercise, there&#039;s still plenty&lt;br /&gt;
of working analog lab equipment out there.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3325 ]]&lt;br /&gt;
&lt;br /&gt;
First up, we need a signal generator to provide us with analog input&lt;br /&gt;
signals--in this case, an HP3325 from 1978.  It&#039;s still a pretty good&lt;br /&gt;
generator, so if you don&#039;t mind the size, the weight, the power&lt;br /&gt;
consumption, and the noisy fan, you can find them on eBay... occasionally&lt;br /&gt;
for only slightly more than you&#039;ll pay for shipping.&lt;br /&gt;
&lt;br /&gt;
[[ close on 2246 ]]&lt;br /&gt;
&lt;br /&gt;
Next, we&#039;ll observe our analog waveforms on analog oscilloscopes, like this&lt;br /&gt;
Tektronix 2246 from the mid-90s, one of the last and very best analog&lt;br /&gt;
scopes ever made. Every home lab should have one.&lt;br /&gt;
&lt;br /&gt;
[[ close on 3585]]&lt;br /&gt;
&lt;br /&gt;
...and finally inspect the frequency spectrum of our signals using an&lt;br /&gt;
analog spectrum analyzer, this HP3585 from the same product line as&lt;br /&gt;
the signal generator.  Like the other equipment here it has a&lt;br /&gt;
rudimentary and hilariously large microcontroller, but the signal path&lt;br /&gt;
from input to what you see on the screen is completely analog.&lt;br /&gt;
&lt;br /&gt;
All of this equipment is vintage, but aside from its raw tonnage, the&lt;br /&gt;
specs are still quite good.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
At the moment, we have our signal generator set to output a nice 1kHz&lt;br /&gt;
sine wave at one volt RMS,&lt;br /&gt;
&lt;br /&gt;
we see the sine wave on the oscilloscope, can verify that it is indeed&lt;br /&gt;
1kHz at one volt RMS, which is 2.8V peak-to-peak, and that matches the&lt;br /&gt;
measurement on the spectrum analyzer as well.&lt;br /&gt;
&lt;br /&gt;
The analyzer also shows some low-level white noise and just a bit of&lt;br /&gt;
harmonic distortion, with the highest peak about 70dB or so below the&lt;br /&gt;
fundamental. Now, this doesn&#039;t matter at all in our demos, but I&lt;br /&gt;
wanted to point it out now just in case you didn&#039;t notice it until&lt;br /&gt;
later.&lt;br /&gt;
&lt;br /&gt;
[[ cut to complete setup ]]&lt;br /&gt;
&lt;br /&gt;
Now, we drop digital sampling in the middle.&lt;br /&gt;
&lt;br /&gt;
For the conversion, we&#039;ll use a boring, consumer-grade, eMagic USB1&lt;br /&gt;
audio device.  It&#039;s also more than ten years old at this point, and it&#039;s&lt;br /&gt;
getting obsolete.&lt;br /&gt;
&lt;br /&gt;
A recent converter can easily have an order of magnitude better specs.&lt;br /&gt;
Flatness, linearity, jitter, noise behavior, everything... you may not&lt;br /&gt;
have noticed.  Just because we can measure an improvement doesn&#039;t&lt;br /&gt;
mean we can hear it, and even these old consumer boxes were already at&lt;br /&gt;
the edge of ideal transparency.&lt;br /&gt;
&lt;br /&gt;
[[out to see emagic initialize and digital waveform appear on TP ]]&lt;br /&gt;
&lt;br /&gt;
The eMagic connects to my ThinkPad, which displays a digital&lt;br /&gt;
waveform and spectrum for comparison, then the ThinkPad&lt;br /&gt;
sends the digital signal right back out to the eMagic for&lt;br /&gt;
re-conversion to analog and observation on the output scopes.&lt;br /&gt;
&lt;br /&gt;
Input to output, left to right.&lt;br /&gt;
&lt;br /&gt;
==stairsteps==&lt;br /&gt;
&lt;br /&gt;
OK, it&#039;s go time. We begin by converting an analog signal to digital and&lt;br /&gt;
then right back to analog again with no other steps.&lt;br /&gt;
&lt;br /&gt;
[[close to 3325]] &lt;br /&gt;
The signal generator is set to produce a 1kHz sine wave just like&lt;br /&gt;
before.&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
We can see our analog sine wave on our input-side oscilloscope.&lt;br /&gt;
&lt;br /&gt;
[[close to TP: spectrum]]&lt;br /&gt;
We digitize our signal to 16-bit PCM at 44.1kHz, same as on a CD.&lt;br /&gt;
The spectrum of the digitized signal matches what we saw earlier&lt;br /&gt;
&lt;br /&gt;
[[close to SA]]&lt;br /&gt;
and what we see now on the analog spectrum analyzer, aside from its &lt;br /&gt;
high-impedance input being just a smidge noisier.&lt;br /&gt;
&lt;br /&gt;
[[close to TP ; overview/waveform ]]&lt;br /&gt;
For now, the waveform display shows our digitized sine wave as a&lt;br /&gt;
stairstep pattern, one step for each sample.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
And when we look at the output signal that&#039;s been converted&lt;br /&gt;
from digital back to analog, we see...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope: press CH1 button to show waveform]]&lt;br /&gt;
It&#039;s exactly like the original sine wave.  No stairsteps.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
OK, 1kHz is still a fairly low frequency, maybe the stairsteps are just&lt;br /&gt;
hard to see or they&#039;re being smoothed away.  Fair enough. Let&#039;s choose&lt;br /&gt;
a higher frequency, something close to Nyquist, say 15kHz.&lt;br /&gt;
&lt;br /&gt;
[[set 3325 to 15kHz ]]&lt;br /&gt;
Now the sine wave is represented by less than three samples per cycle, and...&lt;br /&gt;
&lt;br /&gt;
[[close to TP]]&lt;br /&gt;
the digital waveform looks pretty awful.  Well, looks&lt;br /&gt;
can be deceiving. The analog output...&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
is still a perfect sine wave, exactly like the original.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s keep going up.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s see if I can do this without blocking any cameras.&lt;br /&gt;
&lt;br /&gt;
16kHz.... 17kHz... 18kHz... 19kHz... &lt;br /&gt;
&lt;br /&gt;
20kHz.  Welcome to the upper limits of human hearing. The output&lt;br /&gt;
waveform is still perfect. No jagged edges, no dropoff, no stairsteps.&lt;br /&gt;
&lt;br /&gt;
So where&#039;d the stairsteps go? Don&#039;t answer, it&#039;s a trick question.&lt;br /&gt;
They were never there.&lt;br /&gt;
&lt;br /&gt;
Drawing a digital waveform as a stairstep... was wrong to begin with.&lt;br /&gt;
&lt;br /&gt;
Why? A stairstep is a continuous-time function.  It&#039;s jagged, and it&#039;s&lt;br /&gt;
piecewise, but it has a defined value at every point in time.&lt;br /&gt;
&lt;br /&gt;
A sampled signal is entirely different. It&#039;s discrete-time; it&#039;s only&lt;br /&gt;
got a value right at each instantaneous sample point and it&#039;s&lt;br /&gt;
undefined, there is no value at all, everywhere between.  A&lt;br /&gt;
discrete-time signal is properly drawn as a lollipop graph.&lt;br /&gt;
&lt;br /&gt;
The continuous, analog counterpart of a digital signal passes&lt;br /&gt;
smoothly through each sample point, and that&#039;s just as true for high&lt;br /&gt;
frequencies as it is for low.&lt;br /&gt;
&lt;br /&gt;
Now, the interesting and not at all obvious bit is: there&#039;s only one&lt;br /&gt;
bandlimited signal that passes exactly through each sample point. It&#039;s&lt;br /&gt;
a unique solution. So if you sample a bandlimited signal and then&lt;br /&gt;
convert it back, the original input is also the only possible output.&lt;br /&gt;
&lt;br /&gt;
And before you say, &amp;quot;oh, I can draw a different signal that passes&lt;br /&gt;
through those points&amp;quot;, well, yes you can, but if it differs even&lt;br /&gt;
minutely from the original, it includes frequency content at or beyond&lt;br /&gt;
Nyquist, breaks the bandlimiting requirement and isn&#039;t a valid&lt;br /&gt;
solution.&lt;br /&gt;
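&lt;br /&gt;
A side note for anyone following along in code: the uniqueness claim can be&lt;br /&gt;
checked numerically. The sketch below is not part of the demo software; it&lt;br /&gt;
assumes an ideal sinc interpolator truncated to a finite window, and every&lt;br /&gt;
name in it (tone, reconstruct, worst_error, HALF_WIDTH) is made up for&lt;br /&gt;
illustration. It samples a 15kHz sine at 44.1kHz and rebuilds the value at&lt;br /&gt;
instants that fall between the samples.&lt;br /&gt;
&lt;br /&gt;
```python
import math

FS = 44100.0       # sample rate in Hz (CD rate, as in the demo)
F0 = 15000.0       # test tone in Hz, fewer than three samples per cycle
HALF_WIDTH = 1000  # assumption: samples summed on each side of the point


def tone(t):
    """The original continuous-time signal."""
    return math.sin(2.0 * math.pi * F0 * t)


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x)."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def reconstruct(t):
    """Rebuild the signal at time t using only its values at sample instants."""
    center = round(t * FS)
    total = 0.0
    for n in range(center - HALF_WIDTH, center + HALF_WIDTH + 1):
        total += tone(n / FS) * sinc(t * FS - n)
    return total


def worst_error(points=40):
    """Largest reconstruction error over a few instants between samples."""
    worst = 0.0
    for i in range(points):
        t = (100.0 + i + 0.37) / FS  # deliberately off the sample grid
        worst = max(worst, abs(reconstruct(t) - tone(t)))
    return worst


if __name__ == "__main__":
    print("worst error between samples:", worst_error())
```

The remaining error comes from cutting the sinc sum off after a finite number&lt;br /&gt;
of terms; widening HALF_WIDTH shrinks it.&lt;br /&gt;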
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
So how did everyone get confused and start thinking of digital signals&lt;br /&gt;
as stairsteps? I can think of two good reasons.&lt;br /&gt;
&lt;br /&gt;
[[close to TP; freeze display; draw in zero-order]]&lt;br /&gt;
First: it&#039;s easy enough to convert a sampled signal to a true stairstep. Just&lt;br /&gt;
extend each sample value forward until the next sample period.  This is&lt;br /&gt;
called a zero-order hold, and it&#039;s an important part of how some&lt;br /&gt;
digital-to-analog converters work, especially the simplest ones.&lt;br /&gt;
&lt;br /&gt;
[[ Wikipedia DAC lookup + scroll down to hold image]]&lt;br /&gt;
So, anyone who looks up digital-to-analog converter or&lt;br /&gt;
digital-to-analog conversion is probably going to see a diagram of a&lt;br /&gt;
stairstep waveform somewhere, but that&#039;s not a finished conversion,&lt;br /&gt;
and it&#039;s not the signal that comes out.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
Second, and this is probably the more likely reason, engineers who&lt;br /&gt;
supposedly know better, like me, draw stairsteps even though they&#039;re&lt;br /&gt;
technically wrong. It&#039;s sort of like a one-dimensional version of&lt;br /&gt;
fat bits in an image editor.&lt;br /&gt;
&lt;br /&gt;
[[gimp RMD animation]]&lt;br /&gt;
Pixels aren&#039;t squares either, they&#039;re samples of a 2-dimensional&lt;br /&gt;
function space and so they&#039;re also, conceptually, infinitely small&lt;br /&gt;
points. Practically, it&#039;s a real pain in the ass to see or manipulate&lt;br /&gt;
infinitely small anything, so big squares it is.  Digital stairstep&lt;br /&gt;
drawings are exactly the same thing.&lt;br /&gt;
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
It&#039;s just a convenient drawing. The stairsteps aren&#039;t really there.&lt;br /&gt;
&lt;br /&gt;
==bit-depth==&lt;br /&gt;
&lt;br /&gt;
When we convert a digital signal back to analog, the result is&lt;br /&gt;
_also_ smooth regardless of the bit depth.  24 bits or 16 bits...&lt;br /&gt;
or 8 bits...  it doesn&#039;t matter.&lt;br /&gt;
&lt;br /&gt;
So does that mean that the digital bit depth makes no difference at&lt;br /&gt;
all? Of course not.&lt;br /&gt;
&lt;br /&gt;
Channel 2 here is the same sine wave input, but we quantize with&lt;br /&gt;
dither down to eight bits.&lt;br /&gt;
&lt;br /&gt;
On the scope, we still see a nice&lt;br /&gt;
smooth sine wave on channel 2. Look very close, and you&#039;ll also see a&lt;br /&gt;
bit more noise.  That&#039;s a clue.&lt;br /&gt;
&lt;br /&gt;
If we look at the spectrum of the signal... aha!  Our sine wave is&lt;br /&gt;
still there unaffected, but the noise level of the eight-bit signal on&lt;br /&gt;
the second channel is much higher!&lt;br /&gt;
&lt;br /&gt;
And that&#039;s the difference the number of bits makes.  That&#039;s it!&lt;br /&gt;
&lt;br /&gt;
When we digitize a signal, first we sample it. The&lt;br /&gt;
sampling step is perfect; it loses nothing. But then we quantize it,&lt;br /&gt;
and quantization adds noise.&lt;br /&gt;
&lt;br /&gt;
[[panel 2; demonstrate changing bit depth on tablet ]]&lt;br /&gt;
&lt;br /&gt;
The number of bits determines how much noise and so the level of the&lt;br /&gt;
noise floor. [[demonstrate changing bit depth on tablet]].&lt;br /&gt;
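&lt;br /&gt;
As a rough numerical check of that relationship, here is a minimal sketch&lt;br /&gt;
(not the demo code) that quantizes a full-scale 1kHz sine with triangular&lt;br /&gt;
dither and measures the signal-to-noise ratio. The helper names, the dither&lt;br /&gt;
shape, and the sample count are assumptions chosen for illustration; the&lt;br /&gt;
expected trend is about 6 dB of noise floor per bit.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

FS = 44100.0  # sample rate in Hz
F0 = 1000.0   # test tone in Hz


def quantize(x, bits, rng):
    """Quantize x (nominal range -1..1) to the given bit depth with TPDF dither."""
    step = 2.0 / (2 ** bits)
    # Sum of two uniform values: triangular dither spanning plus or minus one step.
    dither = (rng.random() - 0.5 + rng.random() - 0.5) * step
    return math.floor((x + dither) / step + 0.5) * step


def snr_db(bits, count=20000, seed=1):
    """Signal-to-noise ratio in dB of a dithered, quantized full-scale sine."""
    rng = random.Random(seed)
    signal_power = 0.0
    noise_power = 0.0
    for n in range(count):
        x = math.sin(2.0 * math.pi * F0 * n / FS)
        err = quantize(x, bits, rng) - x
        signal_power += x * x
        noise_power += err * err
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    for bits in (8, 16):
        print(bits, "bits:", round(snr_db(bits), 1), "dB")
```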
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
What does this dithered quantization noise sound like?  Let&#039;s listen&lt;br /&gt;
to our eight-bit sine wave.&lt;br /&gt;
&lt;br /&gt;
[[audio: eight bit sine]]&lt;br /&gt;
&lt;br /&gt;
It may have been hard to hear anything but the tone.  Let&#039;s listen&lt;br /&gt;
to just the noise after we notch out the sine wave and then bring the&lt;br /&gt;
gain up a bit because the noise is quiet.&lt;br /&gt;
&lt;br /&gt;
[[audio: hit notch + gain button]]&lt;br /&gt;
&lt;br /&gt;
Those of you who have used analog recording equipment may have just&lt;br /&gt;
thought to yourselves, &amp;quot;My goodness! That sounds like tape hiss!&amp;quot;&lt;br /&gt;
Well, it doesn&#039;t just sound like tape hiss, it acts like it too, and&lt;br /&gt;
if we use a Gaussian dither then it&#039;s mathematically&lt;br /&gt;
equivalent in every way. It _is_ tape hiss.&lt;br /&gt;
&lt;br /&gt;
Intuitively, that means that we can measure tape hiss and thus the noise floor&lt;br /&gt;
of magnetic audio tape in bits instead of decibels, in order to put things in a&lt;br /&gt;
digital perspective.  Compact cassettes...&lt;br /&gt;
&lt;br /&gt;
[[ reveal cassette ]]&lt;br /&gt;
&lt;br /&gt;
for those of you who are old enough to remember them, could reach as&lt;br /&gt;
deep as nine bits in perfect conditions, though five to six bits was&lt;br /&gt;
more typical, especially if it was a recording made on a tape&lt;br /&gt;
deck. That&#039;s right... your mix tapes were only about six bits&lt;br /&gt;
deep... if you were lucky!&lt;br /&gt;
&lt;br /&gt;
The very best professional open reel tape used in studios could barely&lt;br /&gt;
hit...  any guesses? 13 bits _with_ advanced noise reduction.  And&lt;br /&gt;
that&#039;s why seeing &#039;D D D&#039; on a Compact Disc used to be such a big,&lt;br /&gt;
high-end deal.&lt;br /&gt;
&lt;br /&gt;
==dither==&lt;br /&gt;
&lt;br /&gt;
I keep saying that I&#039;m quantizing with dither, so what is dither&lt;br /&gt;
exactly and, more importantly, what does it do?&lt;br /&gt;
&lt;br /&gt;
[[ Illustration: quantization ]]&lt;br /&gt;
&lt;br /&gt;
The simple way to quantize a signal is to choose the digital&lt;br /&gt;
amplitude value closest to the original analog amplitude.  Obvious,&lt;br /&gt;
right?  Unfortunately, the exact noise you get from this simple&lt;br /&gt;
quantization scheme depends somewhat on the input signal,&lt;br /&gt;
&lt;br /&gt;
[[Illustration: correlated quantization noise ]]&lt;br /&gt;
&lt;br /&gt;
so we may get noise that&#039;s inconsistent, or causes distortion, or is&lt;br /&gt;
undesirable in some other way.&lt;br /&gt;
&lt;br /&gt;
[[show/attribute the dither paper]]&lt;br /&gt;
Dither is specially-constructed noise that substitutes for the noise&lt;br /&gt;
produced by simple quantization. Dither doesn&#039;t drown out or mask&lt;br /&gt;
quantization noise, it actually replaces it with noise characteristics&lt;br /&gt;
of our choosing that aren&#039;t influenced by the input.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Let&#039;s _watch_ what dither does.  The signal generator has too much&lt;br /&gt;
noise for this test so&lt;br /&gt;
&lt;br /&gt;
[[close: panel 3]]&lt;br /&gt;
&lt;br /&gt;
we&#039;ll produce a mathematically perfect sine wave with the ThinkPad&lt;br /&gt;
[[press]]&lt;br /&gt;
&lt;br /&gt;
and quantize it to eight bits [[press]]&lt;br /&gt;
&lt;br /&gt;
with dithering. [[press]]&lt;br /&gt;
&lt;br /&gt;
We see a nice sine wave on the waveform display&lt;br /&gt;
&lt;br /&gt;
[[ show outscope ]]  and output scope&lt;br /&gt;
&lt;br /&gt;
[[ show analyzer]]  and, once the analog spectrum analyzer catches up...&lt;br /&gt;
&lt;br /&gt;
[[time accel sweep]] a clean frequency peak with a uniform noise floor&lt;br /&gt;
on both spectral displays&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]  just like before. Again, this is with dither.&lt;br /&gt;
&lt;br /&gt;
Now I turn dithering off. [[ deactivate dither ]]&lt;br /&gt;
&lt;br /&gt;
The quantization noise that dither had spread out into a nice, flat noise&lt;br /&gt;
floor, piles up into harmonic distortion peaks.  The noise floor is&lt;br /&gt;
lower, but the level of distortion becomes nonzero, and the distortion&lt;br /&gt;
peaks sit higher than the dithering noise did.&lt;br /&gt;
&lt;br /&gt;
At eight bits this effect is exaggerated. At sixteen bits, [[click 16]]&lt;br /&gt;
&lt;br /&gt;
even without dither, harmonic distortion is going to be so low as to&lt;br /&gt;
be completely inaudible.&lt;br /&gt;
&lt;br /&gt;
[[draw line across -100]]&lt;br /&gt;
&lt;br /&gt;
Still, we can use dither to eliminate it completely if we so choose.&lt;br /&gt;
&lt;br /&gt;
Turning the dither off again for a moment, you&#039;ll notice that the&lt;br /&gt;
absolute level of distortion from undithered quantization stays&lt;br /&gt;
approximately constant regardless of the input amplitude.&lt;br /&gt;
&lt;br /&gt;
[[ overview: waveform ]]&lt;br /&gt;
&lt;br /&gt;
But when the signal level drops below half a bit, everything&lt;br /&gt;
quantizes to zero.&lt;br /&gt;
&lt;br /&gt;
[[ overview: spectrum ]]&lt;br /&gt;
&lt;br /&gt;
In a sense, everything quantizing to zero is just 100% distortion!&lt;br /&gt;
Dither eliminates this distortion too. We reenable dither&lt;br /&gt;
and...&lt;br /&gt;
&lt;br /&gt;
[[dither on]]&lt;br /&gt;
&lt;br /&gt;
there&#039;s our signal back at 1/4 bit, with our nice flat noise floor.&lt;br /&gt;
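&lt;br /&gt;
The 1/4 bit case is easy to reproduce offline. The following is a sketch&lt;br /&gt;
under stated assumptions, not the demo application: one quantizer step is&lt;br /&gt;
taken as 1.0, the dither is triangular, and the function names are&lt;br /&gt;
hypothetical. Without dither every sample rounds to zero; with dither,&lt;br /&gt;
correlating the output against the original sine recovers an amplitude near&lt;br /&gt;
one quarter of a step.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import random

FS = 44100      # sample rate in Hz
F0 = 1000.0     # test tone in Hz
COUNT = 2 * FS  # two seconds, a whole number of 1 kHz cycles


def quantize(x, rng=None):
    """Round to the nearest integer step; add TPDF dither when rng is given."""
    if rng is not None:
        x += rng.random() + rng.random() - 1.0
    return math.floor(x + 0.5)


def run(dithered, amplitude=0.25, seed=1):
    """Quantize a sub-step sine; return the samples and the recovered amplitude."""
    rng = random.Random(seed) if dithered else None
    outputs = []
    correlation = 0.0
    for n in range(COUNT):
        ref = math.sin(2.0 * math.pi * F0 * n / FS)
        q = quantize(amplitude * ref, rng)
        outputs.append(q)
        correlation += q * ref
    # Projecting onto the reference sine estimates the tone amplitude.
    return outputs, 2.0 * correlation / COUNT


if __name__ == "__main__":
    print("undithered amplitude:", run(False)[1])
    print("dithered amplitude:", run(True)[1])
```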
&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
The noise floor doesn&#039;t have to be flat.  Dither is noise of our&lt;br /&gt;
choosing, so let&#039;s choose a noise as inoffensive and difficult to&lt;br /&gt;
notice as possible.&lt;br /&gt;
&lt;br /&gt;
[[panel 5]]&lt;br /&gt;
&lt;br /&gt;
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,&lt;br /&gt;
so that&#039;s where background noise is going to be the most obvious.&lt;br /&gt;
&lt;br /&gt;
[[annotate: underline 2-4kHz]]&lt;br /&gt;
&lt;br /&gt;
[[click shaped]]&lt;br /&gt;
&lt;br /&gt;
We can shape dithering noise away from sensitive frequencies to where&lt;br /&gt;
hearing is less sensitive, usually the highest frequencies.&lt;br /&gt;
&lt;br /&gt;
[[annotate: arrow to HF]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
16-bit dithering noise is normally much too quiet to hear at all, but&lt;br /&gt;
let&#039;s listen to our noise shaping example, again with the gain&lt;br /&gt;
brought way up...&lt;br /&gt;
&lt;br /&gt;
[[close]]&lt;br /&gt;
[[unshaped white to shaped]]&lt;br /&gt;
 &lt;br /&gt;
[[out]] Lastly, dithered quantization noise _is_ higher power overall&lt;br /&gt;
than undithered quantization noise even when it sounds quieter, and&lt;br /&gt;
you can see that on a VU meter during passages of near-silence.  But&lt;br /&gt;
dither isn&#039;t only an on or off choice. We can reduce the dither&#039;s&lt;br /&gt;
power to balance less noise against a bit of distortion to minimize&lt;br /&gt;
the overall effect.&lt;br /&gt;
&lt;br /&gt;
  [[ panel 6 audio :: flat, unmodulated ]]&lt;br /&gt;
We&#039;ll also modulate the input signal like this:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated ]]&lt;br /&gt;
...to show how a varying input affects the quantization noise.  At&lt;br /&gt;
full dithering power, the noise is uniform, constant, and featureless&lt;br /&gt;
just like we expect:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
As we reduce the dither&#039;s power, the input increasingly&lt;br /&gt;
affects the amplitude and the character of the quantization noise:&lt;br /&gt;
  [[ panel 6 audio :: flat, modulated, notch ]]&lt;br /&gt;
Shaped dither behaves similarly, but noise shaping lends one more nice&lt;br /&gt;
advantage.  To make a long story short, it can use a somewhat lower&lt;br /&gt;
dither power before the input has as much effect on the output.&lt;br /&gt;
  [[ panel 6 audio :: shaped, modulated, notch ]]&lt;br /&gt;
  [[ reset panel :: shaped, unmodulated, no notch ]]&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
&lt;br /&gt;
Despite all the time I just spent on dither, we&#039;re talking about&lt;br /&gt;
differences that start 100 decibels and more below full scale.  Maybe&lt;br /&gt;
if the CD had been 14 bits as originally designed, dither _might_ be&lt;br /&gt;
more important.  Maybe.  At 16 bits, really, it&#039;s mostly a wash.  You&lt;br /&gt;
can think of dither as an insurance policy that gives several extra&lt;br /&gt;
decibels of dynamic range just in case. The simple fact is, though, no&lt;br /&gt;
one ever ruined a great recording by not dithering the final master.&lt;br /&gt;
&lt;br /&gt;
==bandlimitation and timing==&lt;br /&gt;
&lt;br /&gt;
We&#039;ve been using sine waves. They&#039;re the obvious choice when what we&lt;br /&gt;
want to see is a system&#039;s behavior at a given isolated frequency.  Now&lt;br /&gt;
let&#039;s look at something a bit more complex.  What should we expect to&lt;br /&gt;
happen when I change the input to a square wave...&lt;br /&gt;
&lt;br /&gt;
[[close to sig analyzer-- press the button]]&lt;br /&gt;
&lt;br /&gt;
[[close to input scope]]&lt;br /&gt;
The input scope confirms our 1kHz square wave.  The output scope shows..&lt;br /&gt;
&lt;br /&gt;
[[close to output scope]]&lt;br /&gt;
Exactly what it should.&lt;br /&gt;
 ...&lt;br /&gt;
What is a square wave really?  &lt;br /&gt;
[[illustrate]]&lt;br /&gt;
&lt;br /&gt;
Well, we can say it&#039;s a waveform that&#039;s&lt;br /&gt;
some positive value for half a cycle and then transitions&lt;br /&gt;
instantaneously to a negative value for the other half. But that doesn&#039;t&lt;br /&gt;
really tell us anything useful about how this input [[close/point]]&lt;br /&gt;
becomes this output [[close/point]].&lt;br /&gt;
&lt;br /&gt;
[[animated diagram]]&lt;br /&gt;
Then we remember that any waveform is also the sum of discrete frequencies,&lt;br /&gt;
and a square wave is a particularly simple sum: a fundamental and an&lt;br /&gt;
infinite series of odd harmonics.  Sum them all up, you get a&lt;br /&gt;
square wave.&lt;br /&gt;
&lt;br /&gt;
[[out]]&lt;br /&gt;
At first glance, that doesn&#039;t seem very useful either. You have to sum&lt;br /&gt;
up an infinite number of harmonics to get the answer.  Ah, but we don&#039;t&lt;br /&gt;
have an infinite number of harmonics.&lt;br /&gt;
&lt;br /&gt;
[[close to panel, annotate circling cutoff, and line at 20kHz on spectrum]]&lt;br /&gt;
&lt;br /&gt;
We&#039;re using a quite sharp anti-aliasing filter that cuts off right&lt;br /&gt;
above 20kHz, so our signal is bandlimited, which means we get this:&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
..and that&#039;s exactly what we see on the output scope.&lt;br /&gt;
[[pan/fade to scope display showing they line up perfectly]]&lt;br /&gt;
&lt;br /&gt;
The rippling you see around sharp edges in a bandlimited signal is&lt;br /&gt;
called the Gibbs effect. It happens whenever you slice off part of the&lt;br /&gt;
frequency domain in the middle of nonzero energy.&lt;br /&gt;
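&lt;br /&gt;
The bandlimited square wave can also be built directly from its harmonics.&lt;br /&gt;
This sketch assumes an idealized brick-wall cutoff at exactly 20kHz, which&lt;br /&gt;
a real anti-aliasing filter only approximates, and the function names are&lt;br /&gt;
invented for the example. It sums the odd harmonics of a 1kHz square wave up&lt;br /&gt;
to the cutoff and reports the peak, which overshoots the nominal level of&lt;br /&gt;
1.0 by roughly 9 percent of the edge height, the usual Gibbs figure.&lt;br /&gt;
&lt;br /&gt;
```python
import math

F0 = 1000.0       # square wave fundamental in Hz
CUTOFF = 20000.0  # assumption: ideal brick-wall cutoff in Hz


def bandlimited_square(t):
    """Sum the odd harmonics of a unit square wave that fit under the cutoff."""
    total = 0.0
    for k in range(1, int(CUTOFF // F0) + 1, 2):  # k = 1, 3, ..., 19
        total += math.sin(2.0 * math.pi * k * F0 * t) / k
    return 4.0 / math.pi * total


def peak(points=4000):
    """Largest value of the bandlimited square wave over one cycle."""
    period = 1.0 / F0
    return max(bandlimited_square(i * period / points) for i in range(points))


if __name__ == "__main__":
    print("peak of bandlimited square wave:", peak())
```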
&lt;br /&gt;
[[out]]&lt;br /&gt;
The usual rule of thumb you&#039;ll hear is &amp;quot;the sharper the cutoff, the&lt;br /&gt;
stronger the rippling&amp;quot;, which is approximately true, but we have to be&lt;br /&gt;
careful how we think about it.&lt;br /&gt;
&lt;br /&gt;
For example... what would you expect our quite sharp anti-aliasing filter&lt;br /&gt;
to do if I run our signal through it a second time?&lt;br /&gt;
&lt;br /&gt;
[[ plug plug go]]&lt;br /&gt;
[[outscope]]&lt;br /&gt;
&lt;br /&gt;
Aside from adding a few fractional cycles of delay, the answer is...&lt;br /&gt;
nothing at all.  The signal is already bandlimited. Bandlimiting it&lt;br /&gt;
again doesn&#039;t do anything.  A second pass can&#039;t remove frequencies&lt;br /&gt;
that we already removed.&lt;br /&gt;
&lt;br /&gt;
[[out]] And that&#039;s important.  People tend to think of the ripples as&lt;br /&gt;
a kind of artifact that&#039;s added by anti-aliasing and anti-imaging&lt;br /&gt;
filters, implying that the ripples get worse each time the signal&lt;br /&gt;
passes through.  We can see that in this case that didn&#039;t happen. So&lt;br /&gt;
was it really the filter that added the ripples the first time&lt;br /&gt;
through?  No, not really. It&#039;s a subtle distinction, but Gibbs effect&lt;br /&gt;
ripples aren&#039;t added by filters, they&#039;re just part of what a&lt;br /&gt;
bandlimited signal _is_.&lt;br /&gt;
&lt;br /&gt;
[[close: panel 8]]&lt;br /&gt;
&lt;br /&gt;
Even if we synthetically construct what looks like a perfect digital&lt;br /&gt;
square wave,&lt;br /&gt;
&lt;br /&gt;
[[ turn on digital &#039;square wave&#039; ]]&lt;br /&gt;
&lt;br /&gt;
it&#039;s still limited to the channel bandwidth.  Remember,&lt;br /&gt;
the stairstep representation is misleading.&lt;br /&gt;
&lt;br /&gt;
[[go to lollipop]]&lt;br /&gt;
&lt;br /&gt;
What we really have here are instantaneous sample points,&lt;br /&gt;
&lt;br /&gt;
[[to diagram, trace original ]]&lt;br /&gt;
&lt;br /&gt;
and only one bandlimited signal fits those points.  All we did when we&lt;br /&gt;
drew our apparently perfect square wave was line up the sample points&lt;br /&gt;
just right so it appeared that there were no ripples if we played&lt;br /&gt;
connect-the-dots.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: shift samples forward and back; fade to waveform display&lt;br /&gt;
showing same ]]&lt;br /&gt;
&lt;br /&gt;
But the original bandlimited signal, complete with ripples, was&lt;br /&gt;
still there.&lt;br /&gt;
&lt;br /&gt;
[[ show output scope ]]&lt;br /&gt;
[[ out ]]&lt;br /&gt;
&lt;br /&gt;
And that leads us to one more important point.  You&#039;ve probably heard&lt;br /&gt;
that the timing precision of a digital signal is limited by its sample&lt;br /&gt;
rate; put another way,&lt;br /&gt;
&lt;br /&gt;
[[diagram]]&lt;br /&gt;
&lt;br /&gt;
that digital signals can&#039;t represent anything that falls between the&lt;br /&gt;
samples.. implying that impulses or fast attacks have to align exactly&lt;br /&gt;
with a sample, or the timing gets mangled... or they just disappear.&lt;br /&gt;
&lt;br /&gt;
[[ scribble it out ]]&lt;br /&gt;
&lt;br /&gt;
At this point, we can easily see why that&#039;s wrong.&lt;br /&gt;
&lt;br /&gt;
[[ diagram: both an edge and an impulse ]]&lt;br /&gt;
&lt;br /&gt;
Again, our input signals are bandlimited. And digital signals are&lt;br /&gt;
samples, not stairsteps, not &#039;connect-the-dots&#039;.  We most certainly&lt;br /&gt;
can, for example, put the rising edge of our bandlimited square wave&lt;br /&gt;
anywhere we want between samples.&lt;br /&gt;
&lt;br /&gt;
It&#039;s represented perfectly [[show on the waveform display, move slider]]&lt;br /&gt;
and it&#039;s reconstructed perfectly [[show on output scope with moving slider]].&lt;br /&gt;
&lt;br /&gt;
==epilogue==&lt;br /&gt;
&lt;br /&gt;
[[ back in :20 sign ]]&lt;br /&gt;
&lt;br /&gt;
Just like in the previous episode, we&#039;ve covered a broad range of&lt;br /&gt;
topics, and yet barely scratched the surface of each one.  If anything, my&lt;br /&gt;
sins of omission are greater this time around... but this is a good&lt;br /&gt;
stopping point.&lt;br /&gt;
&lt;br /&gt;
Or maybe, a good starting point.  Dig deeper.  Experiment.  I chose my&lt;br /&gt;
demos very carefully to be simple and give clear results. You can&lt;br /&gt;
reproduce every one of them on your own if you like.  But let&#039;s face&lt;br /&gt;
it, sometimes we learn the most about a spiffy toy by breaking it open&lt;br /&gt;
and studying all the pieces that fall out.  And that&#039;s OK, we&#039;re&lt;br /&gt;
engineers.  Play with the demo parameters, hack up the code, set up&lt;br /&gt;
alternate experiments.  The source code for everything, including the&lt;br /&gt;
little pushbutton demo application, is up at xiph.org.&lt;br /&gt;
&lt;br /&gt;
In the course of experimentation, you&#039;re likely to run into something&lt;br /&gt;
that you didn&#039;t expect and can&#039;t explain.  Don&#039;t worry!  My earlier&lt;br /&gt;
snark aside, Wikipedia is fantastic for exactly this kind of casual&lt;br /&gt;
research. And, if you&#039;re really serious about understanding signals,&lt;br /&gt;
several universities have advanced materials online, such as the 6.003&lt;br /&gt;
and 6.007 Signals and Systems modules at MIT OpenCourseWare. And of&lt;br /&gt;
course, there&#039;s always the community here at Xiph.Org.&lt;br /&gt;
&lt;br /&gt;
Digging deeper or not, I am out of coffee, so, until next time, happy&lt;br /&gt;
hacking!&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=MatroskaOpus&amp;diff=13649</id>
		<title>MatroskaOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=MatroskaOpus&amp;diff=13649"/>
		<updated>2012-09-14T22:23:58Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== &#039;&#039;&#039;DRAFT&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
This is an encapsulation spec for the [[Opus]] codec in [[http://matroska.org/ Matroska]]. There are a number of outstanding functional issues with muxing Opus in Matroska, and until those are resolved, use of this spec is NOT RECOMMENDED.&lt;br /&gt;
&lt;br /&gt;
 - CodecID is A_OPUS&lt;br /&gt;
 - SamplingFrequency is 48000&lt;br /&gt;
 - Channels is number of output PCM channels&lt;br /&gt;
 - CodecPrivate is the &#039;OpusHead&#039; packet, identical to the Ogg mapping&lt;br /&gt;
&lt;br /&gt;
The &#039;OpusHead&#039; format is defined by the [[http://tools.ietf.org/html/draft-terriberry-oggopus Ogg Opus]] mapping. In particular it includes pre-skip, gain, and the channel mapping table required for correct surround output.&lt;br /&gt;
&lt;br /&gt;
The second &#039;OpusTags&#039; header packet from Ogg Opus is not used in the Matroska encapsulation. Matroska has its own system for tag metadata, and this avoids duplicating it and the need for sub-framing to index multiple packets within the CodecPrivate element.&lt;br /&gt;
&lt;br /&gt;
If the CodecPrivate is empty and Channels is 1 or 2, players MAY assume a sane set of defaults, e.g. channel mapping family 0 with no pre-skip or gain. For Channels &amp;gt; 2 the track MUST be rejected, since there&#039;s no way to map the encoded substreams to channels.&lt;br /&gt;
&lt;br /&gt;
== Open Questions ==&lt;br /&gt;
&lt;br /&gt;
Seeking in Opus files requires a pre-roll (recommended to be at least 80 ms). However, currently Matroska requires its index entries to point directly to the data whose timestamp matches the corresponding seek point, not some place arbitrarily before that timestamp. These two requirements are incompatible, and mean that seeking in Opus will be broken in all existing Matroska software. In particularly unlucky cases (e.g., around a transient), playing back audio decoded without any pre-roll can produce extremely loud (possibly equipment-damaging) results. We need a new element to signal this, e.g. Track::TrackEntry::PreRoll.&lt;br /&gt;
&lt;br /&gt;
Should we say muxers MAY or SHOULD NOT produce simple streams without filling in CodecPrivate?&lt;br /&gt;
&lt;br /&gt;
How does the OpusHead pre-skip field interact with the timestamps? The SimpleBlock timestamp is signed 16 bits, so the format can signal about half of the pre-skip if playback timestamps are to start at zero. Moritz suggests this won&#039;t work because the resolution of the timestamps is controlled by the muxer, so the SimpleBlock timestamp offset isn&#039;t sample accurate anyway.[[http://lists.matroska.org/pipermail/matroska-devel/2012-September/004254.html ref]]&lt;br /&gt;
&lt;br /&gt;
One could set an incorrect timestamp on the skipped blocks, and rely on the decoder to drop them based on the OpusHead preskip value. As long as the initial blocks are timestamped &amp;lt;= start of output this shouldn&#039;t affect seeking.&lt;br /&gt;
&lt;br /&gt;
How important is it that timestamps start at zero in a Matroska file?&lt;br /&gt;
&lt;br /&gt;
The SimpleBlock structure also has an &#039;invisible&#039; bit, which tells the player to decode, but not display, the contained frames. This lets the muxer signal the pre-skip semantics with frame accuracy, but not sample accuracy. If players implement this it will at least help with sync. Libav does not appear to support the invisible bit.&lt;br /&gt;
&lt;br /&gt;
How can sample-accurate end-time trimming work in Matroska? Currently all software encapsulating Vorbis in Matroska is broken in this regard, and muxing a Vorbis file in Matroska causes it to get longer (i.e., produce more audio output than the original Ogg file). It would be unfortunate to repeat this disaster for Opus. This needs a new element specifying the number of samples to trim, perhaps a new BlockGroup child.&lt;br /&gt;
&lt;br /&gt;
If new elements are required, can they be defined so as to enable correct seeking in rolling intra (a.k.a. intra refresh) video as well?&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13215</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13215"/>
		<updated>2012-01-23T01:08:44Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Add explanation of the max size packet&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Opus is framed in a Continuous Ogg stream.&lt;br /&gt;
&lt;br /&gt;
(&#039;&#039;&#039;FIXME:&#039;&#039;&#039; add an anchor to the relevant section in http://xiph.org/ogg/doc/oggstream.html and link the above to it -- a link to http://xiph.org/ogg/doc/ogg-multiplex.html might be useful too)&lt;br /&gt;
&lt;br /&gt;
There are two mandatory headers. The granule position of the pages containing these headers is zero.&lt;br /&gt;
&lt;br /&gt;
The first Opus packet (the identification header), which uniquely identifies a stream as Opus audio, is placed alone in the first page of the logical Ogg stream.  This page is marked ’beginning of stream’ in the page flags.  It must begin with the 8 bytes &amp;quot;OpusHead&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The second Opus packet MUST contain the comment header and may span one or more pages beginning on the second page of the logical stream. However many pages it spans, the comment header packet finishes the page on which it ends.  It must begin with the 8 bytes &amp;quot;OpusTags&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The next (first audio) packet of the logical stream MUST begin on a fresh Ogg page.&lt;br /&gt;
&lt;br /&gt;
Packets are placed into Ogg pages in order until the end of stream.&lt;br /&gt;
&lt;br /&gt;
The last page is marked ’end of stream’ in the page flags.&lt;br /&gt;
&lt;br /&gt;
Opus packets may span page boundaries.&lt;br /&gt;
&lt;br /&gt;
The granule position of pages containing Opus audio is in units of PCM audio samples at a fixed rate of 48kHz (per channel; a stereo stream’s granule position does not increment at twice the speed of a mono stream).&lt;br /&gt;
&lt;br /&gt;
The granule position of a page can be used to determine the end PCM sample position of the last packet completed on that page. A page that is entirely spanned by a single packet (that completes on a subsequent page) has no granule position and the granule position is set to ’-1’. &lt;br /&gt;
&lt;br /&gt;
The granule (PCM) position of the first audio page need not indicate that the stream started at position zero; however, it MUST be greater than or equal to the number of samples contained in completed packets, i.e., negative sample positions are not permitted, but a stream may be cropped from its beginning without rewriting the granule position of all remaining pages.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST reject as invalid any stream with an initial sample time of less than zero.  It may defer this action until the last complete packet of the first page has been decoded.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST treat a zero size Ogg packet in the audio stream as if it were an Opus packet with an illegal TOC sequence.&lt;br /&gt;
&lt;br /&gt;
A granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process. &lt;br /&gt;
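&lt;br /&gt;
The end-trimming arithmetic can be sketched as follows. This is only an illustration of the rule above, not text from the spec: the function and parameter names are hypothetical, and treating a final granule position beyond the decoded data as an error is an assumption of this sketch.&lt;br /&gt;
&lt;br /&gt;
```python
def samples_to_trim(previous_granule, packet_sample_counts, final_granule):
    """Number of trailing samples to discard at the end of the stream.

    previous_granule: end sample position before the final page.
    packet_sample_counts: decoded sample count of each packet completed
        on the final page, at the 48 kHz granule rate.
    final_granule: granule position stored on the final page.
    """
    decoded_end = previous_granule + sum(packet_sample_counts)
    trim = decoded_end - final_granule
    if 0 > trim:
        # Assumption of this sketch: the page claims more audio than was decoded.
        raise ValueError("final granule position exceeds decoded data")
    return trim
```

For example, with three 20 ms packets (960 samples each) following position 96000 and a final granule position of 98500, the decoder would drop the last 380 samples.&lt;br /&gt;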
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits unsigned): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned, little endian)&lt;br /&gt;
 - Input sample rate (32 bits unsigned, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits unsigned)&lt;br /&gt;
  --  0 = one stream: mono or L,R stereo&lt;br /&gt;
  --  1 = channels in vorbis spec order: mono or L,R stereo or ... or FL,C,FR,RL,RR,LFE, ...&lt;br /&gt;
  --  2..254 = reserved (treat as 255)&lt;br /&gt;
  --  255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The channel count byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the preskip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time.&lt;br /&gt;
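&lt;br /&gt;
The arithmetic in this example can be written out as a short Python sketch. The function names are illustrative only, and pages with a granule position of -1 (which carry no position) are assumed to be filtered out by the caller.&lt;br /&gt;
```python
SAMPLE_RATE_HZ = 48000  # granule positions always count 48 kHz samples


def page_end_granule(granpos, pre_skip):
    """End granule of a page: its granule position minus the pre-skip."""
    return granpos - pre_skip


def page_end_seconds(granpos, pre_skip):
    """End time of a page in seconds of output audio."""
    return page_end_granule(granpos, pre_skip) / SAMPLE_RATE_HZ


print(page_end_granule(59971, 11971))  # 48000
print(page_end_seconds(59971, 11971))  # 1.0
```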
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
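&lt;br /&gt;
The selection procedure above could be sketched in Python as follows. The function name is hypothetical, and the sketch assumes the hardware rates are supplied as a non-empty collection of integers in Hz; it returns the decode rate together with a flag saying whether resampling is still needed.&lt;br /&gt;
```python
SUPPORTED_RATES = (8000, 12000, 16000, 24000, 48000)


def select_decode_rate(hardware_rates):
    """Return (decode_rate, needs_resampling) for a set of hardware rates in Hz."""
    if 48000 in hardware_rates:
        return 48000, False
    highest = max(hardware_rates)
    if highest in SUPPORTED_RATES:
        return highest, False
    if 48000 > highest:
        # Decode at the next supported rate above the hardware maximum.
        next_rate = min(r for r in SUPPORTED_RATES if r > highest)
        return next_rate, True
    return 48000, True
```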
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Encoders should write either the actual input rate or zero. Implementations that act on this field should behave sanely when given implausible values (e.g. they should not actually upsample the output to 10 MHz).&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
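&lt;br /&gt;
To make the Q7.8 encoding concrete, here is a minimal Python sketch that reads the two-byte field and converts the result to a linear amplitude multiplier. The function names are illustrative, and the conversion assumes the usual amplitude convention of 20 dB per factor of ten.&lt;br /&gt;
```python
def decode_output_gain(field_bytes):
    """Interpret the 2-byte output gain field as decibels (signed Q7.8)."""
    raw = int.from_bytes(field_bytes, "little", signed=True)
    return raw / 256.0


def gain_to_scale(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


gain_db = decode_output_gain(bytes([0x00, 0x06]))  # raw value 0x0600 = 1536
print(gain_db)  # 6.0
```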
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping family value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping family byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the number of total streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Likewise, some decoded channels might not be assigned to any output channel.&lt;br /&gt;
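&lt;br /&gt;
The index rules for the channel mapping can be sketched in Python as below. The function name and return shape are hypothetical, and raising an error for an index that names no decoded channel is an assumption of this sketch rather than something the text above mandates.&lt;br /&gt;
```python
def resolve_channel(index, stereo_streams, total_streams):
    """Map one channel mapping index to (stream, side), or None for silence.

    stereo_streams is M and total_streams is N from the id header.
    side is "left" or "right" for a stereo stream and "mono" otherwise.
    """
    if index == 255:
        return None  # this output channel is pure silence
    if index >= stereo_streams + total_streams:
        # Assumption: treat an out-of-range index as an error.
        raise ValueError("index does not name a decoded channel")
    if 2 * stereo_streams > index:
        side = "left" if index % 2 == 0 else "right"
        return index // 2, side
    return index - stereo_streams, "mono"
```
For example, with M=2 and N=4, index 3 selects the right channel of stream 1, and index 4 selects the mono stream 2.&lt;br /&gt;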
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
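&lt;br /&gt;
A reader of this field might validate and apply it roughly as in the Python sketch below. The function names are illustrative, and accepting an optional leading plus sign is an assumption; the text above only requires an ASCII integer in the 16-bit signed range with no whitespace.&lt;br /&gt;
```python
import re


def parse_track_gain(value):
    """Parse an R128_TRACK_GAIN value, returning the raw Q7.8 integer."""
    # int() alone would accept surrounding whitespace, so check the text first.
    if not re.fullmatch(r"[+-]?[0-9]+", value):
        raise ValueError("not a plain ASCII integer")
    gain = int(value)
    if gain > 32767 or -32768 > gain:
        raise ValueError("outside the 16-bit signed range")
    return gain


def playback_gain_db(output_gain_raw, track_gain_raw):
    """Total gain in dB when the track gain is applied on top of output gain."""
    return (output_gain_raw + track_gain_raw) / 256.0


print(playback_gain_db(0, parse_track_gain("-573")))  # -2.23828125
```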
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
Valid Opus packets can be arbitrarily large due to padding (even larger than 2^64 bytes!).  These packets may be spread over a similarly enormous number of Ogg pages.  Decoders SHOULD avoid attempting to allocate excessive amounts of memory when presented with a very large packet.  The presence of an extremely large packet in the stream could indicate a potential memory exhaustion attack or stream corruption.  Decoders should reject a packet that is too large to process, and print a warning message.  In an Ogg Opus stream, the largest possible valid packet that does not use padding has a size of 15630988 bytes (14.9 MiB) and can span up to 61298 Ogg Pages, all but one of which will have a granulepos of -1.  (This is of course a very extreme packet, consisting of 255 channels, each containing 120ms of audio encoded as 2.5ms frames, each frame using the maximum possible number of bytes and stored in the least efficient manner allowed.)  &#039;&#039;&#039;FIXME&#039;&#039;&#039;: should we make an actual recommendation on behavior for decoders running on modern desktops?&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13214</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13214"/>
		<updated>2012-01-23T01:00:30Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: wording&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
The remaining parameters that need to be signaled are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Opus is framed in a Continuous Ogg stream.&lt;br /&gt;
&lt;br /&gt;
(&#039;&#039;&#039;FIXME:&#039;&#039;&#039; add an anchor to the relevant section in http://xiph.org/ogg/doc/oggstream.html and link the above to it -- a link to http://xiph.org/ogg/doc/ogg-multiplex.html might be useful too)&lt;br /&gt;
&lt;br /&gt;
There are two mandatory headers. The granule position of the pages containing these headers is zero.&lt;br /&gt;
&lt;br /&gt;
The first Opus packet (the identification header), which uniquely identifies a stream as Opus audio, is placed alone in the first page of the logical Ogg stream.  This page is marked ’beginning of stream’ in the page flags.  It must begin with the 8 bytes &amp;quot;OpusHead&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The second Opus packet MUST contain the comment header and may span one or more pages beginning on the second page of the logical stream. However many pages it spans, the comment header packet finishes the page on which it ends.  It must begin with the 8 bytes &amp;quot;OpusTags&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The next (first audio) packet of the logical stream MUST begin on a fresh Ogg page.&lt;br /&gt;
&lt;br /&gt;
Packets are placed into Ogg pages in order until the end of stream.&lt;br /&gt;
&lt;br /&gt;
The last page is marked ’end of stream’ in the page flags.&lt;br /&gt;
&lt;br /&gt;
Opus packets may span page boundaries.&lt;br /&gt;
&lt;br /&gt;
The granule position of pages containing Opus audio is in units of PCM audio samples at a fixed rate of 48 kHz (per channel; a stereo stream’s granule position does not increment at twice the speed of a mono stream).&lt;br /&gt;
&lt;br /&gt;
The granule position of a page can be used to determine the end PCM sample position of the last packet completed on that page. A page that is entirely spanned by a single packet (that completes on a subsequent page) has no granule position of its own, so its granule position field is set to ’-1’.&lt;br /&gt;
&lt;br /&gt;
The granule (PCM) position of the first audio page need not indicate that the stream started at position zero; however, it MUST be greater than or equal to the number of samples contained in completed packets. That is, negative sample positions are not permitted, but a stream may be cropped from its beginning without rewriting the granule position of all remaining pages.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST reject as invalid any stream with an initial sample time of less than zero.  It may defer this action until the last complete packet of the first page has been decoded.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST treat a zero size Ogg packet in the audio stream as if it were an Opus packet with an illegal TOC sequence.&lt;br /&gt;
&lt;br /&gt;
A granule position on the final page of a stream that indicates less audio data than the final packet would normally return is used to end the stream at a point other than a frame boundary. The difference between the amount of data actually decoded and the declared amount indicates how many trailing samples to discard from the decoder output.&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits unsigned): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned, little endian)&lt;br /&gt;
 - Input sample rate (32 bits unsigned, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits unsigned)&lt;br /&gt;
  --  0 = one stream: mono or L,R stereo&lt;br /&gt;
  --  1 = channels in vorbis spec order: mono or L,R stereo or ... or FL,C,FR,RL,RR,LFE, ...&lt;br /&gt;
  --  2..254 = reserved (treat as 255)&lt;br /&gt;
  --  255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
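&lt;br /&gt;
As a hedged sketch of how the layout above might be read, the following Python function parses an OpusHead packet into a dictionary. The function name and dictionary keys are hypothetical, and the sketch checks only the constraints listed in the field table; it does not check the version or the channel counts allowed by each mapping family.&lt;br /&gt;
```python
def parse_opus_head(packet):
    """Parse an OpusHead packet (bytes) into a dict of its fields."""
    if packet[:8] != b"OpusHead":
        raise ValueError("missing OpusHead signature")
    if 19 > len(packet):
        raise ValueError("packet too short")
    head = {
        "version": packet[8],
        "channels": packet[9],
        "pre_skip": int.from_bytes(packet[10:12], "little"),
        "input_sample_rate": int.from_bytes(packet[12:16], "little"),
        "output_gain": int.from_bytes(packet[16:18], "little", signed=True),
        "mapping_family": packet[18],
    }
    c = head["channels"]
    if c == 0:
        raise ValueError("channel count must be greater than zero")
    if head["mapping_family"] == 0:
        # Family 0 codes no further fields; these are the stated defaults.
        head["streams"], head["stereo_streams"] = 1, c - 1
        head["mapping"] = list(range(c))
        return head
    if 21 + c > len(packet):
        raise ValueError("channel mapping table truncated")
    n, m = packet[19], packet[20]
    if n == 0 or m > n or m + n > 255:
        raise ValueError("invalid stream counts")
    head["streams"], head["stereo_streams"] = n, m
    head["mapping"] = list(packet[21:21 + c])
    return head
```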
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The channel count byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the preskip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Encoders should write either the actual input rate or zero. Implementations that act on this field should behave sanely when given implausible values (e.g. they should not actually upsample the output to 10 MHz).&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping family value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping family byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
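&lt;br /&gt;
The family rules listed above, including the handling of reserved values, might be checked with a small Python sketch such as this one. Both function names are hypothetical.&lt;br /&gt;
```python
def effective_family(family):
    """Reserved families (2 to 254) are handled as family 255."""
    return family if family in (0, 1, 255) else 255


def channel_count_allowed(family, channels):
    """Check the channel count against the effective mapping family."""
    # Largest allowed channel count for each defined family.
    limits = {0: 2, 1: 8, 255: 255}
    return channels >= 1 and limits[effective_family(family)] >= channels
```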
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the number of total streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Some decoded channels might not be assigned to any output channel, as well.&lt;br /&gt;
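&lt;br /&gt;
The index rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here: the function name and the returned tuple layout are hypothetical and are not taken from any Opus library.&lt;br /&gt;
&lt;br /&gt;
```python
def resolve_channel(index, coupled):
    """Map one channel mapping index to (stream, role), or None for silence.

    `coupled` is the two-channel stream count M. The function name and the
    (stream, role) tuple are illustrative assumptions.
    """
    if index == 255:
        # Special case: this output channel is pure silence.
        return None
    if index in range(2 * coupled):
        # One of the first M streams, decoded as stereo.
        role = "left" if index % 2 == 0 else "right"
        return index // 2, role
    # One of the remaining N-M streams, decoded as mono.
    return index - coupled, "mono"


if __name__ == "__main__":
    # Hypothetical layout: N = 4 streams, of which M = 2 are stereo.
    for idx in (0, 3, 4, 5, 255):
        print(idx, resolve_channel(idx, coupled=2))
```
&lt;br /&gt;
Because the same index may appear more than once, a player can call this once per output channel and reuse the decoded stream data.&lt;br /&gt;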
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
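&lt;br /&gt;
A reader of the comment header might validate the field along these lines. This is a minimal sketch, assuming the comments have already been split into a list of tag=value strings; the function name is hypothetical, and accepting a leading plus sign is an assumption, since the text above does not say whether one is allowed.&lt;br /&gt;
&lt;br /&gt;
```python
import re

# Optional sign followed by ASCII digits only, so no whitespace can match.
# Allowing "+" is an assumption; the text above does not settle it.
_GAIN_VALUE = re.compile(r"[+-]?[0-9]+")


def parse_r128_track_gain(comments):
    """Return the track gain in dB, or None if the field is absent.

    `comments` is a list of "TAG=value" strings (hypothetical input shape).
    Raises ValueError when the field breaks the rules stated above.
    """
    prefix = "R128_TRACK_GAIN="
    values = [c[len(prefix):] for c in comments
              if c.upper().startswith(prefix)]
    if not values:
        return None
    if len(values) != 1:
        raise ValueError("more than one R128_TRACK_GAIN field")
    text = values[0]
    if not _GAIN_VALUE.fullmatch(text):
        raise ValueError("R128_TRACK_GAIN is not a plain ASCII integer")
    raw = int(text)
    if raw not in range(-32768, 32768):
        raise ValueError("R128_TRACK_GAIN out of the 16-bit range")
    # Q7.8 fixed point: 256 steps per dB.
    return raw / 256.0


if __name__ == "__main__":
    print(parse_r128_track_gain(["ARTIST=Example", "R128_TRACK_GAIN=-573"]))
```
&lt;br /&gt;
The returned value is relative to the OpusHead output gain, so a player that uses it adds both gains together before scaling the samples.&lt;br /&gt;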
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
Valid Opus packets can be arbitrarily large due to padding (even larger than 2^64 bytes!).  These packets may be spread over a similarly enormous number of Ogg pages.  Decoders SHOULD avoid attempting to allocate excessive amounts of memory when presented with a very large packet.  The presence of an extremely large packet in the stream could indicate a potential memory exhaustion attack or stream corruption.  Decoders should reject a packet that is too large to process, and print a warning message.  In an Ogg Opus stream, the largest possible valid packet that does not use padding has a size of 15630988 bytes (14.9 MiB) and can span up to 61298 Ogg Pages, all but one of which will have a granulepos of -1.  &#039;&#039;&#039;FIXME&#039;&#039;&#039;: should we make an actual recommendation on behavior for decoders running on modern desktops?&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13213</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13213"/>
		<updated>2012-01-23T00:57:41Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: wording&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Opus is framed in a Continuous Ogg stream.&lt;br /&gt;
&lt;br /&gt;
(&#039;&#039;&#039;FIXME:&#039;&#039;&#039; add an anchor to the relevant section in http://xiph.org/ogg/doc/oggstream.html and link the above to it -- a link to http://xiph.org/ogg/doc/ogg-multiplex.html might be useful too)&lt;br /&gt;
&lt;br /&gt;
There are two mandatory headers. The granule position of the pages containing these headers is zero.&lt;br /&gt;
&lt;br /&gt;
The first Opus packet (the identification header), which uniquely identifies a stream as Opus audio, is placed alone in the first page of the logical Ogg stream.  This page is marked ’beginning of stream’ in the page flags.  It must begin with the 8 bytes &amp;quot;OpusHead&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The second Opus packet MUST contain the comment header and may span one or more pages beginning on the second page of the logical stream. However many pages it spans, the comment header packet finishes the page on which it ends.  It must begin with the 8 bytes &amp;quot;OpusTags&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The next (first audio) packet of the logical stream MUST begin on a fresh Ogg page.&lt;br /&gt;
&lt;br /&gt;
Packets are placed into Ogg pages in order until the end of stream.&lt;br /&gt;
&lt;br /&gt;
The last page is marked ’end of stream’ in the page flags.&lt;br /&gt;
&lt;br /&gt;
Opus packets may span page boundaries.&lt;br /&gt;
&lt;br /&gt;
The granule position of pages containing Opus audio is in units of PCM audio samples at a fixed rate of 48 kHz (per channel; a stereo stream’s granule position does not increment at twice the speed of a mono stream).&lt;br /&gt;
&lt;br /&gt;
The granule position of a page can be used to determine the end PCM sample position of the last packet completed on that page. A page that is entirely spanned by a single packet (that completes on a subsequent page) has no granule position and the granule position is set to ’-1’. &lt;br /&gt;
&lt;br /&gt;
The granule (PCM) position of the first audio page need not indicate that the stream started at position zero; however, it MUST be greater than or equal to the number of samples contained in completed packets. That is, negative sample positions are not permitted, but a stream may be cropped from its beginning without rewriting the granule position of all remaining pages.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST reject as invalid any stream with an initial sample time of less than zero.  It may defer this action until the last complete packet of the first page has been decoded.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST treat a zero size Ogg packet in the audio stream as if it were an Opus packet with an illegal TOC sequence.&lt;br /&gt;
&lt;br /&gt;
A granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process. &lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits unsigned): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned, little endian)&lt;br /&gt;
 - Input sample rate (32 bits unsigned, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits unsigned)&lt;br /&gt;
  --  0 = one stream: mono or L,R stereo&lt;br /&gt;
  --  1 = channels in vorbis spec order: mono or L,R stereo or ... or FL,C,FR,RL,RR,LFE, ...&lt;br /&gt;
  --  2..254 = reserved (treat as 255)&lt;br /&gt;
  --  255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
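The field layout above can be read with a short parser. The following Python sketch follows the list field by field; the function name and the dictionary keys are hypothetical, and the checks cover only the constraints stated in this draft.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_opus_head(packet):
    """Parse an OpusHead packet (bytes) into a dict, per the list above.

    Function name and dict keys are illustrative assumptions.
    """
    if packet[:8] != b"OpusHead":
        raise ValueError("missing OpusHead magic signature")
    fixed = packet[8:19]
    if len(fixed) != 11:
        raise ValueError("truncated OpusHead packet")
    head = {
        "version": fixed[0],
        "channels": fixed[1],
        "pre_skip": int.from_bytes(fixed[2:4], "little"),
        "input_rate": int.from_bytes(fixed[4:8], "little"),
        # Signed Q7.8 value in dB, kept raw here.
        "output_gain_q8": int.from_bytes(fixed[8:10], "little", signed=True),
        "family": fixed[10],
    }
    channels = head["channels"]
    if channels == 0:
        raise ValueError("channel count must be nonzero")
    if head["family"] == 0:
        # Family 0 codes nothing further: one stream, stereo iff c == 2.
        if channels not in (1, 2):
            raise ValueError("family 0 allows only 1 or 2 channels")
        head["streams"] = 1
        head["coupled"] = channels - 1
        head["mapping"] = list(range(channels))
        return head
    tail = packet[19:21 + channels]
    if len(tail) != 2 + channels:
        raise ValueError("truncated channel mapping table")
    streams, coupled = tail[0], tail[1]
    if streams == 0:
        raise ValueError("stream count must be nonzero")
    if coupled not in range(streams + 1):
        raise ValueError("two-channel stream count exceeds stream count")
    if streams + coupled not in range(256):
        raise ValueError("more than 255 decoded channels")
    head["streams"] = streams
    head["coupled"] = coupled
    head["mapping"] = list(tail[2:])
    return head
```
&lt;br /&gt;
A decoder would typically call this on the first packet of the logical stream and refuse the stream if it raises.&lt;br /&gt;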
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The channel count byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the preskip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time.&lt;br /&gt;
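&lt;br /&gt;
The same arithmetic as a small Python sketch. The function names are hypothetical; the 48 kHz rate and the -1 marker for pages without a granule position come from the text above.&lt;br /&gt;
&lt;br /&gt;
```python
GRANULE_RATE = 48000  # granule positions always count 48 kHz samples


def page_end_granule(granpos, pre_skip):
    """End granule of a page, or None if the page has no granule position."""
    if granpos == -1:
        return None
    return granpos - pre_skip


def page_end_seconds(granpos, pre_skip):
    """End time of a page in seconds, or None if it has no granule position."""
    end = page_end_granule(granpos, pre_skip)
    if end is None:
        return None
    return end / GRANULE_RATE


if __name__ == "__main__":
    # The example from the text: granpos 59971 with a pre-skip of 11971.
    print(page_end_granule(59971, 11971))
    print(page_end_seconds(59971, 11971))
```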
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
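&lt;br /&gt;
The procedure above can be written as a small function. This is a sketch under the assumption that the player can list the sample rates its output hardware accepts; the function name and the returned pair are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
# Rates the reference decoder can produce, as listed in the text above.
OPUS_RATES = (8000, 12000, 16000, 24000, 48000)


def choose_decode_rate(hardware_rates):
    """Return (decode_rate, needs_resample) for a list of hardware rates in Hz."""
    if 48000 in hardware_rates:
        return 48000, False
    highest = max(hardware_rates)
    if highest in OPUS_RATES:
        return highest, False
    # Supported rates above the best hardware rate, in ascending order.
    higher = [rate for rate in OPUS_RATES if rate > highest]
    if higher:
        # Decode at the next higher supported rate, then resample down.
        return higher[0], True
    # Hardware only runs faster than 48 kHz: decode at 48 kHz and resample.
    return 48000, True


if __name__ == "__main__":
    for rates in ([44100, 48000], [8000, 16000], [22050], [44100], [96000]):
        print(rates, choose_decode_rate(rates))
```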
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations that act on this field should behave sanely when given implausible values (e.g. they should not actually upsample the output to 10 MHz), and encoders should write either the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
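&lt;br /&gt;
To make the Q7.8 encoding concrete, the sketch below converts the raw 16-bit field to dB and then to a linear amplitude factor. The function names are hypothetical, and the 10^(dB/20) conversion is the usual amplitude formula rather than something this draft spells out.&lt;br /&gt;
&lt;br /&gt;
```python
def q78_to_db(raw):
    """Convert a signed Q7.8 value (-32768..32767) to decibels."""
    return raw / 256.0


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude scale factor."""
    return 10.0 ** (gain_db / 20.0)


def playback_scale(output_gain_q8, extra_gain_db=0.0):
    """Scale factor for decoded samples.

    Any player-side adjustment (extra_gain_db) is added to the header
    output gain, never substituted for it.
    """
    return db_to_linear(q78_to_db(output_gain_q8) + extra_gain_db)


if __name__ == "__main__":
    print(q78_to_db(-573))
    print(playback_scale(0))
```
&lt;br /&gt;
Working in dB until the last step keeps the header gain and any player adjustment additive, which matches the rule that adjustments apply in addition to the output gain.&lt;br /&gt;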
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the number of total streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Some decoded channels might not be assigned to any output channel, as well.&lt;br /&gt;
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
Valid Opus packets can be arbitrarily large due to padding (even larger than 2^64 bytes!).  These packets may be spread over a similarly enormous number of Ogg pages.  Decoders SHOULD avoid attempting to allocate excessive amounts of memory when presented with a very large packet.  The presence of an extremely large packet in the stream could indicate a potential memory exhaustion attack or stream corruption.  Decoders should reject a packet that is too large to process, and print a warning message.  In Ogg Opus streams that do not use padding, the largest valid packet has a size of 15630988 bytes (14.9 MiB), and can span up to 61298 Ogg Pages, all but one of which will have a granulepos of -1.  &#039;&#039;&#039;FIXME&#039;&#039;&#039;: should we make an actual recommendation on behavior for decoders running on modern desktops?&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13212</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13212"/>
		<updated>2012-01-23T00:44:11Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Increase max size estimate&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Opus is framed in a Continuous Ogg stream.&lt;br /&gt;
&lt;br /&gt;
(&#039;&#039;&#039;FIXME:&#039;&#039;&#039; add an anchor to the relevant section in http://xiph.org/ogg/doc/oggstream.html and link the above to it -- a link to http://xiph.org/ogg/doc/ogg-multiplex.html might be useful too)&lt;br /&gt;
&lt;br /&gt;
There are two mandatory headers. The granule position of the pages containing these headers is zero.&lt;br /&gt;
&lt;br /&gt;
The first Opus packet (the identification header), which uniquely identifies a stream as Opus audio, is placed alone in the first page of the logical Ogg stream.  This page is marked ’beginning of stream’ in the page flags.  It must begin with the 8 bytes &amp;quot;OpusHead&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The second Opus packet MUST contain the comment header and may span one or more pages beginning on the second page of the logical stream. However many pages it spans, the comment header packet finishes the page on which it ends.  It must begin with the 8 bytes &amp;quot;OpusTags&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The next (first audio) packet of the logical stream MUST begin on a fresh Ogg page.&lt;br /&gt;
&lt;br /&gt;
Packets are placed into Ogg pages in order until the end of stream.&lt;br /&gt;
&lt;br /&gt;
The last page is marked ’end of stream’ in the page flags.&lt;br /&gt;
&lt;br /&gt;
Opus packets may span page boundaries.&lt;br /&gt;
&lt;br /&gt;
The granule position of pages containing Opus audio is in units of PCM audio samples at a fixed rate of 48 kHz (per channel; a stereo stream’s granule position does not increment at twice the speed of a mono stream).&lt;br /&gt;
&lt;br /&gt;
The granule position of a page can be used to determine the end PCM sample position of the last packet completed on that page. A page that is entirely spanned by a single packet (that completes on a subsequent page) has no granule position and the granule position is set to ’-1’. &lt;br /&gt;
&lt;br /&gt;
The granule (PCM) position of the first audio page need not indicate that the stream started at position zero; however, it MUST be greater than or equal to the number of samples contained in completed packets. That is, negative sample positions are not permitted, but a stream may be cropped from its beginning without rewriting the granule position of all remaining pages.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST reject as invalid any stream with an initial sample time of less than zero.  It may defer this action until the last complete packet of the first page has been decoded.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST treat a zero size Ogg packet in the audio stream as if it were an Opus packet with an illegal TOC sequence.&lt;br /&gt;
&lt;br /&gt;
A granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process. &lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits unsigned): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned, little endian)&lt;br /&gt;
 - Input sample rate (32 bits unsigned, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits unsigned)&lt;br /&gt;
  --  0 = one stream: mono or L,R stereo&lt;br /&gt;
  --  1 = channels in vorbis spec order: mono or L,R stereo or ... or FL,C,FR,RL,RR,LFE, ...&lt;br /&gt;
  --  2..254 = reserved (treat as 255)&lt;br /&gt;
  --  255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The channel count byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the preskip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
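&lt;br /&gt;
The selection procedure above can be sketched as follows. This is a minimal illustration, not a normative algorithm: the function name and the representation of the hardware capabilities as a set of rates in Hz are assumptions made for the example.&lt;br /&gt;
&lt;br /&gt;
```python
# Rates (in Hz) that the reference decoder can produce, per the text above.
OPUS_RATES = (8000, 12000, 16000, 24000, 48000)


def choose_decode_rate(hardware_rates):
    """Return (decode_rate, needs_resample) for a set of hardware rates."""
    if not hardware_rates:
        raise ValueError("no hardware sample rates given")
    # Step 1: prefer 48 kHz whenever the hardware can play it.
    if 48000 in hardware_rates:
        return 48000, False
    highest = max(hardware_rates)
    # Step 2: the highest hardware rate is itself a supported rate.
    if highest in OPUS_RATES:
        return highest, False
    # Step 3: below 48 kHz, decode at the next higher supported rate.
    if 48000 > highest:
        return min(r for r in OPUS_RATES if r > highest), True
    # Step 4: anything else is decoded at 48 kHz and resampled.
    return 48000, True


print(choose_decode_rate({44100}))
```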
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations that act on this field should behave sanely when given implausible values (e.g. don&#039;t actually upsample the output to 10 MHz), and encoders should write the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels; see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the number of total streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Likewise, some decoded channels might not be assigned to any output channel.&lt;br /&gt;
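&lt;br /&gt;
The index rules above can be sketched as a small lookup. This is an illustration under stated assumptions: the function name is hypothetical, and the six-entry mapping table used in the usage line is an example chosen for the sketch, not a table taken from this page.&lt;br /&gt;
&lt;br /&gt;
```python
SILENCE = 255  # special index: the output channel is pure silence


def resolve_channel(index, streams, coupled):
    """Map one channel-mapping index to (stream, channel_in_stream).

    streams is N (total streams) and coupled is M (two-channel streams).
    Returns None for a silent output channel.
    """
    if coupled > streams or streams + coupled > 255:
        raise ValueError("invalid stream counts")
    if index == SILENCE:
        return None
    if 2 * coupled > index:
        # Stereo stream: even index is the left channel, odd is the right.
        return index // 2, index % 2
    if index >= streams + coupled:
        raise ValueError("index refers to a decoded channel that does not exist")
    # Mono stream: the first M streams are stereo, so skip over them.
    return index - coupled, 0


# Hypothetical six-channel table with N=4 streams, M=2 of them stereo.
print([resolve_channel(i, 4, 2) for i in (0, 4, 1, 2, 3, 5)])
```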
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
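&lt;br /&gt;
A minimal sketch of how a player might validate the tag value and combine it with the OpusHead output gain follows. The function names are hypothetical. The sketch assumes that a leading plus sign is acceptable in the tag value and that the gain in dB is applied to sample amplitude (a factor of 10 per 20 dB); neither point is stated explicitly on this page.&lt;br /&gt;
&lt;br /&gt;
```python
import re


def parse_r128_track_gain(text):
    """Parse the value part of an R128_TRACK_GAIN tag into a Q7.8 integer."""
    # Plain ASCII digits with an optional sign; no whitespace allowed.
    if re.fullmatch(r"[+-]?[0-9]+", text) is None:
        raise ValueError("not a plain ASCII integer")
    value = int(text)
    if value > 32767 or -32768 > value:
        raise ValueError("outside the signed 16-bit Q7.8 range")
    return value


def q78_to_db(value):
    """Convert a Q7.8 fixed-point gain to decibels (8 fractional bits)."""
    return value / 256.0


def playback_scale(output_gain_q78, track_gain_q78=0):
    """Linear amplitude factor: track gain is added to the output gain."""
    total_db = q78_to_db(output_gain_q78) + q78_to_db(track_gain_q78)
    return 10.0 ** (total_db / 20.0)


gain = parse_r128_track_gain("-573")
print(q78_to_db(gain), playback_scale(0, gain))
```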
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
Valid Opus packets can be arbitrarily large due to padding (even larger than 2^64 bytes!).  These packets may be spread over a similarly enormous number of Ogg pages.  Decoders SHOULD avoid attempting to allocate excessive amounts of memory when presented with such a packet, which could indicate a potential memory exhaustion attack or stream corruption.  Decoders should reject a packet that is too large to process, and print a warning message.  In Ogg Opus streams that do not use padding, the largest valid packet has a size of 15630988 bytes (14.9 MiB), and can span up to 61298 Ogg pages, all but one of which will have a granulepos of -1.  &#039;&#039;&#039;FIXME&#039;&#039;&#039;: should we make an actual recommendation on behavior for decoders running on modern desktops?&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13211</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13211"/>
		<updated>2012-01-23T00:27:22Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Discuss the issue of enormous packets.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Opus is framed in a Continuous Ogg stream.&lt;br /&gt;
&lt;br /&gt;
(&#039;&#039;&#039;FIXME:&#039;&#039;&#039; add an anchor to the relevant section in http://xiph.org/ogg/doc/oggstream.html and link the above to it -- a link to http://xiph.org/ogg/doc/ogg-multiplex.html might be useful too)&lt;br /&gt;
&lt;br /&gt;
There are two mandatory headers. The granule position of the pages containing these headers is zero.&lt;br /&gt;
&lt;br /&gt;
The first Opus packet (the identification header), which uniquely identifies a stream as Opus audio, is placed alone in the first page of the logical Ogg stream.  This page is marked ’beginning of stream’ in the page flags.  It must begin with the 8 bytes &amp;quot;OpusHead&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The second Opus packet MUST contain the comment header and may span one or more pages beginning on the second page of the logical stream. However many pages it spans, the comment header packet finishes the page on which it ends.  It must begin with the 8 bytes &amp;quot;OpusTags&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The next (first audio) packet of the logical stream MUST begin on a fresh Ogg page.&lt;br /&gt;
&lt;br /&gt;
Packets are placed into Ogg pages in order until the end of stream.&lt;br /&gt;
&lt;br /&gt;
The last page is marked ’end of stream’ in the page flags.&lt;br /&gt;
&lt;br /&gt;
Opus packets may span page boundaries.&lt;br /&gt;
&lt;br /&gt;
The granule position of pages containing Opus audio is in units of PCM audio samples at a fixed rate of 48 kHz (per channel; a stereo stream’s granule position does not increment at twice the speed of a mono stream).&lt;br /&gt;
&lt;br /&gt;
The granule position of a page can be used to determine the end PCM sample position of the last packet completed on that page. A page that is entirely spanned by a single packet (that completes on a subsequent page) has no granule position and the granule position is set to ’-1’. &lt;br /&gt;
&lt;br /&gt;
The granule (PCM) position of the first audio page need not indicate that the stream started at position zero; however, it MUST be greater than or equal to the number of samples contained in completed packets. That is, negative sample positions are not permitted, but a stream may be cropped from its beginning without rewriting the granule position of all remaining pages.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST reject as invalid any stream with an initial sample time of less than zero.  It may defer this action until the last complete packet of the first page has been decoded.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST treat a zero size Ogg packet in the audio stream as if it were an Opus packet with an illegal TOC sequence.&lt;br /&gt;
&lt;br /&gt;
A granule position on the final page of a stream that indicates less audio data than the final packet would normally return is used to end the stream at a point other than a frame boundary. The difference between the amount of data actually decoded and the declared amount indicates how many trailing samples to discard from the decoder output.&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits unsigned): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned, little endian)&lt;br /&gt;
 - Input sample rate (32 bits unsigned, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits unsigned)&lt;br /&gt;
  --  0 = one stream: mono or L,R stereo&lt;br /&gt;
  --  1 = channels in vorbis spec order: mono or L,R stereo or ... or FL,C,FR,RL,RR,LFE, ...&lt;br /&gt;
  --  2..254 = reserved (treat as 255)&lt;br /&gt;
  --  255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
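&lt;br /&gt;
The layout above can be read with a short parser. This is a sketch only: the function name and the dictionary keys are hypothetical, and the error handling is one possible choice rather than required behaviour.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_opus_head(packet):
    """Parse an OpusHead packet (bytes) into a dict, following the layout above."""
    # Fixed part: 8 magic + 1 version + 1 channels + 2 + 4 + 2 + 1 = 19 bytes.
    if 19 > len(packet) or packet[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    version, channels = packet[8], packet[9]
    if version != 0:
        raise ValueError("unsupported version")
    if channels == 0:
        raise ValueError("channel count must be > 0")
    head = {
        "version": version,
        "channels": channels,
        "pre_skip": int.from_bytes(packet[10:12], "little"),
        "input_sample_rate": int.from_bytes(packet[12:16], "little"),
        "output_gain_q78": int.from_bytes(packet[16:18], "little", signed=True),
        "mapping_family": packet[18],
    }
    if head["mapping_family"] == 0:
        # Family 0: one stream, mono or stereo, nothing further is coded.
        if channels > 2:
            raise ValueError("family 0 allows only 1 or 2 channels")
        head["streams"] = 1
        head["coupled"] = channels - 1
        head["mapping"] = list(range(channels))
        return head
    if 21 + channels > len(packet):
        raise ValueError("truncated channel mapping table")
    streams, coupled = packet[19], packet[20]
    if streams == 0 or coupled > streams or streams + coupled > 255:
        raise ValueError("invalid stream counts")
    head["streams"], head["coupled"] = streams, coupled
    head["mapping"] = list(packet[21:21 + channels])
    return head


# Example stereo header with made-up field values.
example = (b"OpusHead" + bytes([0, 2]) + (312).to_bytes(2, "little")
           + (44100).to_bytes(4, "little")
           + (-573).to_bytes(2, "little", signed=True) + bytes([0]))
print(parse_opus_head(example))
```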
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The number of channels byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971 and the pre-skip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations that act on this field should behave sanely when given implausible values (e.g. don&#039;t actually upsample the output to 10 MHz), and encoders should write the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels; see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the number of total streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Likewise, some decoded channels might not be assigned to any output channel.&lt;br /&gt;
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
Valid Opus packets can be arbitrarily large due to padding (even larger than 2^64 bytes!).  These packets may be spread over a similarly enormous number of Ogg pages.  Decoders SHOULD avoid attempting to allocate excessive amounts of memory when presented with such a packet, which could indicate a potential memory exhaustion attack or stream corruption.  Decoders should reject a packet that is too large to process, and print a warning message.  In Ogg Opus streams that do not use padding, all valid packets will be smaller than 325890 bytes.  &#039;&#039;&#039;FIXME&#039;&#039;&#039;: should we recommend that decoders generally attempt to process packets larger than this size if they have sufficient memory, or not?&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13180</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13180"/>
		<updated>2011-12-19T14:31:43Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Clearly rule out some illegal header configurations&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Opus is framed in a Continuous Ogg stream.&lt;br /&gt;
&lt;br /&gt;
(&#039;&#039;&#039;FIXME:&#039;&#039;&#039; add an anchor to the relevant section in http://xiph.org/ogg/doc/oggstream.html and link the above to it -- a link to http://xiph.org/ogg/doc/ogg-multiplex.html might be useful too)&lt;br /&gt;
&lt;br /&gt;
There are two mandatory headers. The granule position of the pages containing these headers is zero.&lt;br /&gt;
&lt;br /&gt;
The first Opus packet (the identification header), which uniquely identifies a stream as Opus audio, is placed alone in the first page of the logical Ogg stream.  This page is marked ’beginning of stream’ in the page flags.  It must begin with the 8 bytes &amp;quot;OpusHead&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The second Opus packet MUST contain the comment header and may span one or more pages beginning on the second page of the logical stream. However many pages it spans, the comment header packet finishes the page on which it ends.  It must begin with the 8 bytes &amp;quot;OpusTags&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The next (first audio) packet of the logical stream MUST begin on a fresh Ogg page.&lt;br /&gt;
&lt;br /&gt;
Packets are placed into Ogg pages in order until the end of stream.&lt;br /&gt;
&lt;br /&gt;
The last page is marked ’end of stream’ in the page flags.&lt;br /&gt;
&lt;br /&gt;
Opus packets may span page boundaries.&lt;br /&gt;
&lt;br /&gt;
The granule position of pages containing Opus audio is in units of PCM audio samples at a fixed rate of 48kHz (per channel; a stereo stream’s granule position does not increment at twice the speed of a mono stream).&lt;br /&gt;
&lt;br /&gt;
The granule position of a page can be used to determine the end PCM sample position of the last packet completed on that page. A page that is entirely spanned by a single packet (that completes on a subsequent page) has no granule position and the granule position is set to ’-1’. &lt;br /&gt;
&lt;br /&gt;
The granule (PCM) position of the first audio page need not indicate that the stream started at position zero; however, it MUST be greater than or equal to the number of samples contained in completed packets. That is, negative sample positions are not permitted, but a stream may be cropped from its beginning without rewriting the granule position of all remaining pages.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST reject as invalid any stream with an initial sample time of less than zero.  It may defer this action until the last complete packet of the first page has been decoded.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST treat a zero size Ogg packet in the audio stream as if it were an Opus packet with an illegal TOC sequence.&lt;br /&gt;
&lt;br /&gt;
A granule position on the final page of a stream that indicates less audio data than the final packet would normally return is used to end the stream at a point other than a frame boundary. The difference between the amount of data actually decoded and the declared amount indicates how many trailing samples to discard from the decoder output.&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The number of channels byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the preskip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time.&lt;br /&gt;
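&lt;br /&gt;
As a minimal sketch, the subtraction described above can be written in Python. The function name is hypothetical and is not part of any Opus library.&lt;br /&gt;

```python
GRANULE_RATE = 48000  # granule positions count samples at a fixed 48 kHz


def end_granule(granpos, pre_skip):
    # Both arguments are sample counts at 48 kHz; the name is illustrative.
    return granpos - pre_skip


end = end_granule(59971, 11971)
print(end, end / GRANULE_RATE)  # end granule and end time in seconds
```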
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations that do something with this field should take care to behave sanely if given crazy values (e.g. don&#039;t actually upsample the output to 10 MHz), and encoders should write the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the number of total streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
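&lt;br /&gt;
The decoder initialization rule above can be sketched as follows. This is only an illustration: the function name is hypothetical, and the returned list simply records how many channels each of the N decoders would be configured to produce.&lt;br /&gt;

```python
def decoder_layout(streams, coupled):
    # streams is N and coupled is M, both read from OpusHead.
    if coupled not in range(streams + 1):
        raise ValueError("M must not exceed N")
    if streams + coupled not in range(256):
        raise ValueError("M+N must not exceed 255")
    # The first M decoders are stereo, the remaining N-M are mono.
    return [2] * coupled + [1] * (streams - coupled)


layout = decoder_layout(3, 2)
print(layout, sum(layout))  # per-decoder channel counts and their total, M+N
```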
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
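&lt;br /&gt;
The index rule above can be sketched in Python. The function name and the returned tuple shape are assumptions made for this illustration only.&lt;br /&gt;

```python
def resolve_channel(index, coupled):
    # coupled is M, the two-channel stream count from OpusHead.
    if index == 255:
        return None  # this output channel carries pure silence
    if index in range(2 * coupled):
        side = "left" if index % 2 == 0 else "right"
        return (index // 2, side)
    return (index - coupled, "mono")


# With M = 2, streams 0 and 1 are stereo and stream 2 is mono.
for index in (0, 3, 4, 255):
    print(index, resolve_channel(index, 2))
```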
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Some decoded channels might not be assigned to any output channel, as well.&lt;br /&gt;
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
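&lt;br /&gt;
A minimal sketch of reading this field follows. The function names are hypothetical. Converting dB to a linear factor with 10^(dB/20) assumes the gain is applied to sample amplitudes, and int() is more lenient than the no-whitespace rule above, so a strict validator would need an extra check.&lt;br /&gt;

```python
def q78_to_db(raw):
    # Q7.8: signed 16-bit value with 8 fractional bits, so 256 steps per dB.
    if raw not in range(-32768, 32768):
        raise ValueError("gain outside the signed 16-bit range")
    return raw / 256.0


def db_to_linear(db):
    # Assumption: the gain scales sample amplitudes.
    return 10.0 ** (db / 20.0)


def parse_track_gain(comment):
    # comment is one tag=value string taken from the OpusTags packet.
    key, _, value = comment.partition("=")
    if key.upper() != "R128_TRACK_GAIN":
        raise ValueError("not an R128_TRACK_GAIN field")
    return q78_to_db(int(value))


gain_db = parse_track_gain("R128_TRACK_GAIN=-573")
print(gain_db, db_to_linear(gain_db))
```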
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13179</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13179"/>
		<updated>2011-12-19T14:21:03Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Attempt to minimize confusion with preskip&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Opus is framed in a Continuous Ogg stream.&lt;br /&gt;
&lt;br /&gt;
(&#039;&#039;&#039;FIXME:&#039;&#039;&#039; add an anchor to the relevant section in http://xiph.org/ogg/doc/oggstream.html and link the above to it -- a link to http://xiph.org/ogg/doc/ogg-multiplex.html might be useful too)&lt;br /&gt;
&lt;br /&gt;
There are two mandatory headers. The granule position of the pages containing these headers is zero.&lt;br /&gt;
&lt;br /&gt;
The first Opus packet (the identification header), which uniquely identifies a stream as Opus audio, is placed alone in the first page of the logical Ogg stream.  This page is marked ’beginning of stream’ in the page flags.&lt;br /&gt;
&lt;br /&gt;
The second Opus packet contains the comment header and may span one or more pages beginning on the second page of the logical stream. However many pages it spans, the comment header packet finishes the page on which it ends.&lt;br /&gt;
&lt;br /&gt;
The next (first audio) packet of the logical stream MUST begin on a fresh Ogg page.&lt;br /&gt;
&lt;br /&gt;
Packets are placed into Ogg pages in order until the end of stream.&lt;br /&gt;
&lt;br /&gt;
The last page is marked ’end of stream’ in the page flags.&lt;br /&gt;
&lt;br /&gt;
Opus packets may span page boundaries.&lt;br /&gt;
&lt;br /&gt;
The granule position of pages containing Opus audio is in units of PCM audio samples at a fixed rate of 48kHz (per channel; a stereo stream’s granule position does not increment at twice the speed of a mono stream).&lt;br /&gt;
&lt;br /&gt;
The granule position of a page can be used to determine the end PCM sample position of the last packet completed on that page. A page that is entirely spanned by a single packet (that completes on a subsequent page) has no granule position and the granule position is set to ’-1’. &lt;br /&gt;
&lt;br /&gt;
The granule (PCM) position of the first audio page need not indicate that the stream started at position zero; however, it MUST be greater than or equal to the number of samples contained in completed packets. That is, negative sample positions are not permitted, but a stream may be cropped from its beginning without rewriting the granule position of all remaining pages.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST reject as invalid any stream with an initial sample time of less than zero.  It may defer this action until the last complete packet of the first page has been decoded.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST treat a zero size Ogg packet in the audio stream as if it were an Opus packet with an illegal TOC sequence.&lt;br /&gt;
&lt;br /&gt;
A granule position on the final page of a stream that indicates less audio data than the final packet would normally return is used to end the stream at a point other than a frame boundary. The difference between the amount of data actually decoded and the declared amount indicates how many trailing samples to discard from the decoder output.&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
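The field layout above could be read along these lines. This is a sketch under stated assumptions: the function and dictionary key names are hypothetical, the pre-skip is assumed to be little endian like the other multi-byte fields (the listing does not say), and a truncated packet is not handled gracefully.&lt;br /&gt;

```python
def parse_opus_head(packet):
    # packet is the complete first Ogg packet as a bytes object.
    if packet[:8] != b"OpusHead":
        raise ValueError("missing OpusHead signature")
    head = {
        "version": packet[8],
        "channels": packet[9],
        # Assumption: pre-skip is little endian like the fields after it.
        "pre_skip": int.from_bytes(packet[10:12], "little"),
        "input_rate": int.from_bytes(packet[12:16], "little"),
        "gain_q78": int.from_bytes(packet[16:18], "little", signed=True),
        "family": packet[18],
    }
    count = head["channels"]
    if count == 0:
        raise ValueError("channel count MUST be nonzero")
    if head["family"] == 0:
        # Family 0 codes no further fields; these are the implied defaults.
        head["streams"] = 1
        head["coupled"] = count - 1
        head["mapping"] = list(range(count))
    else:
        head["streams"] = packet[19]
        head["coupled"] = packet[20]
        head["mapping"] = list(packet[21:21 + count])
        if head["streams"] == 0:
            raise ValueError("stream count MUST be nonzero")
        if head["coupled"] not in range(head["streams"] + 1):
            raise ValueError("M must not exceed N")
    return head


example = (b"OpusHead" + bytes([0, 2])
           + (312).to_bytes(2, "little")
           + (44100).to_bytes(4, "little")
           + (0).to_bytes(2, "little", signed=True)
           + bytes([0]))
print(parse_opus_head(example))
```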
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The number of channels byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the preskip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
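&lt;br /&gt;
The procedure above can be sketched as a small function. The name and the return shape are illustrative assumptions; the list of hardware rates would come from whatever audio API the player uses.&lt;br /&gt;

```python
SUPPORTED_RATES = (8000, 12000, 16000, 24000, 48000)


def choose_decode_rate(hardware_rates):
    # Returns (decode_rate, needs_resample) for a list of hardware rates in Hz.
    if 48000 in hardware_rates:
        return 48000, False
    top = max(hardware_rates)
    if top in SUPPORTED_RATES:
        return top, False
    # top is not a supported rate here, so it appears exactly once in the
    # merged, sorted list; the entry after it, if any, is the next higher
    # supported rate.
    ordered = sorted(SUPPORTED_RATES + (top,))
    position = ordered.index(top)
    if position + 1 != len(ordered):
        return ordered[position + 1], True
    return 48000, True


for rates in ([44100, 48000], [11025, 16000], [22050], [96000]):
    print(rates, choose_decode_rate(rates))
```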
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations that do something with this field should take care to behave sanely if given crazy values (e.g. don&#039;t actually upsample the output to 10 MHz), and encoders should write the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
This field gives the number of streams whose decoders should be configured to produce two channels. It must be no larger than the total number of streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Likewise, some decoded channels might not be assigned to any output channel.&lt;br /&gt;
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13178</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13178"/>
		<updated>2011-12-19T14:05:34Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Remove end-on-first-page requirement.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Opus is framed in a Continuous Ogg stream.&lt;br /&gt;
&lt;br /&gt;
(&#039;&#039;&#039;FIXME:&#039;&#039;&#039; add an anchor to the relevant section in http://xiph.org/ogg/doc/oggstream.html and link the above to it -- a link to http://xiph.org/ogg/doc/ogg-multiplex.html might be useful too)&lt;br /&gt;
&lt;br /&gt;
There are two mandatory headers. The granule position of the pages containing these headers is zero.&lt;br /&gt;
&lt;br /&gt;
The first Opus packet (the identification header), which uniquely identifies a stream as Opus audio, is placed alone in the first page of the logical Ogg stream.  This page is marked &#039;beginning of stream&#039; in the page flags.&lt;br /&gt;
&lt;br /&gt;
The second Opus packet contains the comment header and may span one or more pages beginning on the second page of the logical stream. However many pages it spans, the comment header packet finishes the page on which it ends.&lt;br /&gt;
&lt;br /&gt;
The next (first audio) packet of the logical stream MUST begin on a fresh Ogg page.&lt;br /&gt;
&lt;br /&gt;
Packets are placed into Ogg pages in order until the end of stream.&lt;br /&gt;
&lt;br /&gt;
The last page is marked &#039;end of stream&#039; in the page flags.&lt;br /&gt;
&lt;br /&gt;
Opus packets may span page boundaries.&lt;br /&gt;
&lt;br /&gt;
The granule position of pages containing Opus audio is in units of PCM audio samples at a fixed rate of 48kHz (per channel; a stereo stream’s granule position does not increment at twice the speed of a mono stream).&lt;br /&gt;
&lt;br /&gt;
The granule position of a page represents the end PCM sample position of the last packet completed on that page. A page that is entirely spanned by a single packet (one that completes on a subsequent page) has no granule position of its own, so its granule position field is set to -1.&lt;br /&gt;
&lt;br /&gt;
The granule (PCM) position of the first audio page need not indicate that the stream started at position zero; however, it MUST be greater than or equal to the number of samples contained in completed packets. That is, negative sample positions are not permitted, but a stream may be cropped from its beginning without rewriting the granule position of all remaining pages.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST reject as invalid any stream with an initial sample time of less than zero.  It may defer this action until the last complete packet of the first page has been decoded.&lt;br /&gt;
&lt;br /&gt;
A decoder MUST treat a zero size Ogg packet in the audio stream as if it were an Opus packet with an illegal TOC sequence.&lt;br /&gt;
&lt;br /&gt;
A granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream at a point other than a frame boundary. The difference between the amount of data actually decoded and the declared amount indicates how many trailing samples to discard from the decoder output.&lt;br /&gt;
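&lt;br /&gt;
The trimming rule above amounts to a single subtraction. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and clamping a negative difference to zero is an assumption, since the text does not say how to treat a final granule position that claims more audio than was decoded.&lt;br /&gt;

```python
def trailing_samples_to_discard(prev_granpos, final_granpos, final_page_samples):
    """Return how many decoded samples to drop from the end of the stream.

    prev_granpos       -- granule position of the page before the final page
    final_granpos      -- granule position declared on the final page
    final_page_samples -- samples (at 48 kHz) decoded from the packets that
                          complete on the final page
    """
    # Position the stream would reach if every decoded sample were kept.
    available_end = prev_granpos + final_page_samples
    # Anything past the declared end is trailing data to discard.
    # Assumption: a declared end beyond the available data discards nothing.
    return max(available_end - final_granpos, 0)


# Two 20 ms packets (960 samples each) on the final page, declared end 97500.
discard = trailing_samples_to_discard(96000, 97500, 2 * 960)
```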
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
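The field list above can be read with a few slice operations. The following Python sketch is illustrative only: the name parse_opus_head and the dictionary keys are hypothetical, and it assumes that pre-skip is stored little endian like the other multi-byte fields, which the list does not state explicitly.&lt;br /&gt;

```python
def parse_opus_head(packet: bytes) -> dict:
    """Parse an OpusHead packet laid out as in the field list above."""
    if packet[:8] != b"OpusHead":
        raise ValueError("missing OpusHead magic signature")
    if 19 > len(packet):
        raise ValueError("truncated OpusHead packet")

    head = {
        "version": packet[8],
        "channels": packet[9],
        # Assumption: pre-skip uses the same little endian order as below.
        "pre_skip": int.from_bytes(packet[10:12], "little"),
        "input_sample_rate": int.from_bytes(packet[12:16], "little"),
        # Signed Q7.8 value; divide by 256 to obtain dB.
        "output_gain_q8": int.from_bytes(packet[16:18], "little", signed=True),
        "mapping_family": packet[18],
    }
    if head["version"] != 0:
        raise ValueError("unsupported version")
    channels = head["channels"]
    if channels == 0:
        raise ValueError("channel count must be greater than zero")

    if head["mapping_family"] == 0:
        # Family 0: the remaining fields are implied, not coded.
        if channels > 2:
            raise ValueError("family 0 allows only 1 or 2 channels")
        head["streams"] = 1
        head["coupled_streams"] = channels - 1
        head["mapping"] = list(range(channels))
    else:
        if 21 + channels > len(packet):
            raise ValueError("truncated channel mapping")
        head["streams"] = packet[19]          # N
        head["coupled_streams"] = packet[20]  # M
        head["mapping"] = list(packet[21:21 + channels])
    return head
```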
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
This byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the pre-skip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
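&lt;br /&gt;
The selection procedure above can be sketched in Python as follows. The function name and the shape of its return value are hypothetical, and the set of hardware rates is assumed to be known to the player; rates are given in Hz.&lt;br /&gt;

```python
# Decoder output rates named in the text, in Hz.
SUPPORTED_RATES = (8000, 12000, 16000, 24000, 48000)


def choose_decode_rate(hardware_rates):
    """Return (decode_rate, needs_resampling) for the given hardware rates."""
    if 48000 in hardware_rates:
        return 48000, False
    highest = max(hardware_rates)
    if highest in SUPPORTED_RATES:
        return highest, False
    if 48000 > highest:
        # Decode at the next supported rate above the hardware maximum,
        # then resample down to what the hardware can play.
        decode_rate = min(r for r in SUPPORTED_RATES if r > highest)
        return decode_rate, True
    # Hardware only offers rates above 48 kHz: decode at 48 kHz and resample.
    return 48000, True
```

For instance, hardware limited to 22050 Hz leads to decoding at 24000 Hz followed by resampling.&lt;br /&gt;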
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations that use this field should take care to behave sanely when given implausible values (e.g. they should not actually upsample the output to 10 MHz), and encoders should write the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
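&lt;br /&gt;
As a rough illustration of the Q7.8 format, the Python sketch below converts a raw gain value to dB and to a linear factor. The function names are hypothetical, and it assumes the gain is applied as an amplitude scale of 10^(dB/20) and that a track gain, when used, is simply added in dB.&lt;br /&gt;

```python
def q7_8_to_db(raw: int) -> float:
    """Convert a signed Q7.8 fixed point value to decibels."""
    return raw / 256.0


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to an amplitude scale factor (assumed 20*log10)."""
    return 10.0 ** (gain_db / 20.0)


def playback_scale(output_gain_raw: int, track_gain_raw: int = 0) -> float:
    """Scale factor for decoded samples.

    The track gain is applied in addition to the output gain, so the two
    values are summed in dB before converting to a linear factor.
    """
    total_db = q7_8_to_db(output_gain_raw) + q7_8_to_db(track_gain_raw)
    return db_to_linear(total_db)


# A raw value of -573 corresponds to -573/256 dB.
gain_db = q7_8_to_db(-573)
```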
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels; see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
This field gives the number of streams whose decoders should be configured to produce two channels. It must be no larger than the total number of streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Likewise, some decoded channels might not be assigned to any output channel.&lt;br /&gt;
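&lt;br /&gt;
The index rules for the channel mapping can be sketched in Python as follows. The function name and the returned tuple are hypothetical, and the six-channel mapping in the usage lines is only an example layout with N=4 and M=2, not a value taken from this page.&lt;br /&gt;

```python
def resolve_channel(index, streams, coupled_streams):
    """Resolve one channel mapping index.

    Returns None for silence, otherwise (stream_number, which), where
    which is "left" or "right" for a stereo stream and "mono" otherwise.
    streams is N and coupled_streams is M from the Id header.
    """
    if index == 255:
        return None  # this output channel is pure silence
    if index >= streams + coupled_streams:
        raise ValueError("index exceeds the number of decoded channels")
    if 2 * coupled_streams > index:
        # The first M streams are decoded as stereo, two channels each.
        which = "left" if index % 2 == 0 else "right"
        return (index // 2, which)
    # The remaining N-M streams are decoded as mono.
    return (index - coupled_streams, "mono")


# Example: six output channels fed by two stereo and two mono streams.
mapping = [0, 4, 1, 2, 3, 5]
routes = [resolve_channel(i, streams=4, coupled_streams=2) for i in mapping]
```

Because the same index may appear more than once, the resolver is applied per output channel rather than per decoded channel.&lt;br /&gt;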
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13171</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13171"/>
		<updated>2011-12-18T16:25:02Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: whoops cleanup&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two mandatory headers: id, comment&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
This byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the pre-skip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time. The last sample decoded from the page is the sample that starts at granule 47999, i.e. the last sample from the first second of absolute time.&lt;br /&gt;
&lt;br /&gt;
Note that time is measured in absolute samples from the zero granule reference.  When multiple logical bitstreams are muxed together in one Ogg file (e.g. audio and video), their timelines are defined such that granule zero is played back at the same instant in all streams.  Decoders SHOULD begin decoding or playback at the time corresponding to the first output of the logical bitstream that starts earliest.  When displaying the time, players MAY indicate either absolute time (relative to granule zero) or relative time (relative to the first output).&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus encoder MUST ensure that the first output sample has non-negative granule, and an Ogg Opus decoder MUST refuse to decode any bitstream that does not meet this requirement.  This condition is equivalent to requiring that the granpos of the first (non-header) page is greater than or equal to the number of samples it produces at 48 kHz, counted before pre-skip is applied.  If no packets end on this page, the requirement applies instead to the first page on which a packet ends.  A decoder may assess this requirement by inspecting the first few bytes of each Opus packet, which indicate the number of output samples that packet produces.  Alternatively, it may accumulate the total number of decoded samples, and stop decoding with an error message if this total is ever greater than the granpos.&lt;br /&gt;
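&lt;br /&gt;
The first-page check described above might look like this in Python. The function name and arguments are hypothetical; the per-packet sample counts are assumed to have been read from the packet headers already, and the page passed in is assumed to be the first one on which a packet ends.&lt;br /&gt;

```python
def check_first_audio_page(granpos, packet_sample_counts):
    """Validate the first audio page and return its starting granule.

    granpos              -- granule position of the first page on which a
                            packet ends
    packet_sample_counts -- samples (at 48 kHz, before pre-skip) produced by
                            each packet that completes on that page
    """
    produced = sum(packet_sample_counts)
    if produced > granpos:
        # The first decoded sample would fall before granule zero.
        raise ValueError("stream starts at a negative sample position")
    # Granule at which the first decoded sample of the stream starts.
    return granpos - produced


# Three 20 ms packets (960 samples each) ending exactly at granule 2880.
start = check_first_audio_page(2880, [960, 960, 960])
```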
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
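&lt;br /&gt;
The selection procedure above can be written as a short Python sketch. The function name choose_decode_rate and its return convention (decode rate in Hz, plus a flag saying whether resampling is needed) are assumptions made for illustration.&lt;br /&gt;
&lt;br /&gt;
```python
# Decoder output rates listed in the text, in Hz.
SUPPORTED_RATES = (8000, 12000, 16000, 24000, 48000)


def choose_decode_rate(hardware_rates):
    """Return (decode_rate_hz, needs_resample) for the given hardware rates."""
    if not hardware_rates:
        raise ValueError("no hardware sample rates given")
    if 48000 in hardware_rates:
        return 48000, False
    highest = max(hardware_rates)
    if highest in SUPPORTED_RATES:
        return highest, False
    if 48000 > highest:
        # Decode at the next supported rate above the hardware maximum,
        # then resample down to what the hardware can play.
        return min(r for r in SUPPORTED_RATES if r > highest), True
    return 48000, True


for rates in ([44100, 48000], [8000, 16000], [22050], [96000]):
    print(rates, choose_decode_rate(rates))
```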
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations which do something with this field should take care to behave sanely if given crazy values (e.g. don&#039;t actually upsample the output to 10 MHz), and encoders should write the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
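&lt;br /&gt;
As a rough illustration of how a decoder might apply this field, the Python sketch below converts the signed Q7.8 value to decibels and then to a linear amplitude factor. The function names are hypothetical, and the samples are assumed to be floating-point PCM values.&lt;br /&gt;
&lt;br /&gt;
```python
def q7_8_to_db(raw: int) -> float:
    """Convert a signed 16-bit Q7.8 value to decibels (1/256 dB steps)."""
    if raw > 32767 or -32768 > raw:
        raise ValueError("value does not fit in a signed 16-bit field")
    return raw / 256.0


def db_to_scale(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_output_gain(samples, raw_gain: int):
    """Scale floating-point PCM samples by the header output gain."""
    scale = db_to_scale(q7_8_to_db(raw_gain))
    return [s * scale for s in samples]


print(q7_8_to_db(-32768), q7_8_to_db(32767))
print(apply_output_gain([0.5, -0.25], 0))
```

Any further adjustment, such as a track gain or a volume control, would be applied on top of the scaled samples.&lt;br /&gt;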
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the total number of streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
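&lt;br /&gt;
The index rule above can be sketched in Python as follows. The function name resolve_channel and the returned labels are hypothetical; n_streams and n_coupled stand for N and M.&lt;br /&gt;
&lt;br /&gt;
```python
def resolve_channel(index: int, n_streams: int, n_coupled: int):
    """Map one channel mapping index to (stream, role), or None for silence.

    role is "left" or "right" for a stream decoded as stereo, and "mono"
    for a stream decoded as mono.
    """
    if index == 255:
        return None  # this output channel is pure silence
    if 0 > index or index >= n_streams + n_coupled:
        raise ValueError("index does not refer to a decoded channel")
    if 2 * n_coupled > index:
        # The first M streams are stereo: even index is left, odd is right.
        return index // 2, ("left" if index % 2 == 0 else "right")
    # The remaining N-M streams are mono.
    return index - n_coupled, "mono"


# Example with N = 2 streams, of which M = 1 is decoded as stereo.
for idx in (0, 1, 2, 255):
    print(idx, resolve_channel(idx, 2, 1))
```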
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Some decoded channels might not be assigned to any output channel, as well.&lt;br /&gt;
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], a granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process.&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
For pages other than the first and last page, the granpos of page N MUST be greater than the granpos on page N-1 by exactly the number of (48 kHz) samples produced by packets on page N, i.e. granpos[N] == granpos[N-1] + duration[N].&lt;br /&gt;
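&lt;br /&gt;
A validator could test this relation directly. The Python sketch below assumes that the granpos and the 48 kHz duration of each page have already been collected into two lists of equal length; the name granpos_errors is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def granpos_errors(granpos, duration):
    """Return the indices of interior pages that break the granpos rule.

    granpos[n] is the granule position of page n and duration[n] is the
    number of 48 kHz samples produced by the packets on page n. The first
    and last pages are exempt, so they are never reported.
    """
    if len(granpos) != len(duration):
        raise ValueError("granpos and duration must have the same length")
    bad = []
    for n in range(1, len(granpos) - 1):
        if granpos[n] != granpos[n - 1] + duration[n]:
            bad.append(n)
    return bad


print(granpos_errors([960, 1920, 2880, 3000], [960, 960, 960, 960]))
print(granpos_errors([960, 1900, 2860, 3000], [960, 960, 960, 960]))
```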
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13170</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13170"/>
		<updated>2011-12-18T16:23:18Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Be more explicit about granpos requirements&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two mandatory headers: id, comment&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order; 1 = channels in Vorbis spec order; 2..254 = reserved (treat as 255); 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
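&lt;br /&gt;
A parser for the layout above might look like the following Python sketch. It assumes that the pre-skip field is little endian like the other multi-byte fields (the list above does not say), and the names OpusHead (as a class) and parse_opus_head are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from typing import NamedTuple, Tuple


class OpusHead(NamedTuple):
    version: int
    channels: int            # c
    pre_skip: int            # samples at 48 kHz
    input_sample_rate: int   # Hz, informational; 0 means unspecified
    output_gain_q8: int      # signed Q7.8, in dB
    mapping_family: int
    stream_count: int        # N
    coupled_count: int       # M
    mapping: Tuple[int, ...]


def parse_opus_head(data: bytes) -> OpusHead:
    """Parse an id header packet laid out as in the list above."""
    if 19 > len(data) or data[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    version, channels = data[8], data[9]
    if version != 0:
        raise ValueError("unsupported version")
    if channels == 0:
        raise ValueError("channel count must be greater than zero")
    # Assumption: pre-skip is little endian, like the fields after it.
    pre_skip = int.from_bytes(data[10:12], "little")
    rate = int.from_bytes(data[12:16], "little")
    gain = int.from_bytes(data[16:18], "little", signed=True)
    family = data[18]
    if family == 0:
        # Nothing further is coded: N = 1, M = c - 1, mapping is 0 (and 1).
        if channels > 2:
            raise ValueError("family 0 allows only 1 or 2 channels")
        return OpusHead(version, channels, pre_skip, rate, gain, family,
                        1, channels - 1, tuple(range(channels)))
    if 21 + channels > len(data):
        raise ValueError("truncated channel mapping")
    n, m = data[19], data[20]
    if n == 0 or m > n or m + n > 255:
        raise ValueError("invalid stream counts")
    mapping = tuple(data[21:21 + channels])
    return OpusHead(version, channels, pre_skip, rate, gain, family,
                    n, m, mapping)


example = (b"OpusHead" + bytes([0, 2])
           + (11971).to_bytes(2, "little")
           + (44100).to_bytes(4, "little")
           + (-573).to_bytes(2, "little", signed=True)
           + bytes([0]))
print(parse_opus_head(example))
```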
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
This byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971 and the pre-skip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time. The last sample decoded from the page is the sample that starts at granule 47999, i.e. the last sample of the first second of absolute time.&lt;br /&gt;
&lt;br /&gt;
Note that time is measured in absolute samples from the zero granule reference.  When multiple logical bitstreams are muxed together in one Ogg file (e.g. audio and video), their timelines are defined such that granule zero is played back at the same instant in all streams.  Decoders SHOULD begin decoding or playback at the time corresponding to the first output of the logical bitstream that starts earliest.  When displaying the time, players MAY indicate either absolute time (relative to granule zero) or relative time (relative to the first output).&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus encoder MUST ensure that the first output sample has a non-negative granule, and an Ogg Opus decoder MUST refuse to decode any bitstream that does not meet this requirement.  This condition is equivalent to requiring that the granpos of the first non-header page is greater than or equal to the number of samples it produces, counted at 48 kHz and before pre-skip is applied.  If no packets end on that page, the requirement applies instead to the first page on which a packet ends.  A decoder may assess this requirement by inspecting the first few bytes of each Opus packet, which indicate the number of output samples that packet produces.  Alternatively, it may accumulate the total number of decoded samples, and stop decoding with an error message if this total is ever greater than the granpos.&lt;br /&gt;
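&lt;br /&gt;
The first of these two checks can be sketched in Python. This is only an illustration: it assumes the TOC byte layout of the published Opus specification (RFC 6716, Section 3.1) and a single-stream file (channel mapping family 0), where each Ogg packet is exactly one Opus packet. The names packet_samples_48k and first_page_granpos_ok are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def packet_samples_48k(packet: bytes) -> int:
    """Return the number of 48 kHz samples one Opus packet decodes to."""
    if not packet:
        raise ValueError("empty packet")
    toc = packet[0]
    config = toc // 8   # top five bits: mode, bandwidth and frame size
    code = toc % 4      # bottom two bits: how the frame count is coded
    if config >= 16:
        # CELT-only: 2.5, 5, 10 or 20 ms frames.
        frame = 120 * 2 ** (config % 4)
    elif config >= 12:
        # Hybrid: 10 or 20 ms frames.
        frame = 480 * 2 ** (config % 2)
    else:
        # SILK-only: 10, 20, 40 or 60 ms frames.
        frame = (480, 960, 1920, 2880)[config % 4]
    if code == 0:
        count = 1
    elif code in (1, 2):
        count = 2
    else:
        if len(packet) == 1:
            raise ValueError("missing frame count byte")
        count = packet[1] % 64  # low six bits hold the frame count
    return frame * count


def first_page_granpos_ok(granpos: int, packets) -> bool:
    """Check that the first audio page does not start before granule zero.

    packets holds the Opus packets that end on that page.
    """
    return granpos >= sum(packet_samples_48k(p) for p in packets)


print(packet_samples_48k(bytes([0x78])))
print(first_page_granpos_ok(959, [bytes([0x78])]))
```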
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations which do something with this field should take care to behave sanely if given crazy values (e.g. don&#039;t actually upsample the output to 10 MHz), and encoders should write the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the total number of streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Some decoded channels might not be assigned to any output channel, as well.&lt;br /&gt;
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], a granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process.&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
For pages other than the first and last page, the granpos of page N MUST be greater than the granpos on page N-1 by exactly the number of (48 kHz) samples produced by packets on page N, i.e. granpos[N] == granpos[N-1] + duration[N].  The last page &lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13169</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13169"/>
		<updated>2011-12-18T15:13:59Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Mention the lazy implementation&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two mandatory headers: id, comment&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
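A minimal sketch of reading these fields, in Python. It assumes that pre-skip is stored little endian like the other multi-byte fields (the list above does not say so), and the function name and dictionary keys are illustrative only.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_opus_head(packet):
    """Parse an OpusHead packet (bytes) into a dict of its fields."""
    if 19 > len(packet) or packet[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    version, channels = packet[8], packet[9]
    if version != 0:
        raise ValueError("unknown version")
    if channels == 0:
        raise ValueError("channel count must be greater than zero")
    head = {
        "channels": channels,
        # Assumption: pre-skip is little endian, like the fields after it.
        "pre_skip": int.from_bytes(packet[10:12], "little"),
        "input_rate": int.from_bytes(packet[12:16], "little"),
        "gain_q8": int.from_bytes(packet[16:18], "little", signed=True),
        "family": packet[18],
    }
    if head["family"] == 0:
        # Family 0 codes no further fields; N, M and the mapping are implied.
        if channels > 2:
            raise ValueError("family 0 allows only 1 or 2 channels")
        head["streams"] = 1
        head["coupled"] = channels - 1
        head["mapping"] = list(range(channels))
    else:
        if 21 + channels > len(packet):
            raise ValueError("truncated OpusHead packet")
        streams, coupled = packet[19], packet[20]
        if streams == 0 or coupled > streams or coupled + streams > 255:
            raise ValueError("invalid stream counts")
        head["streams"] = streams
        head["coupled"] = coupled
        head["mapping"] = list(packet[21:21 + channels])
    return head
```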
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The channel count byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971 and the pre-skip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time. The last sample decoded from the page is the sample that starts at granule 47999, i.e. the last sample from the first second of absolute time.&lt;br /&gt;
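&lt;br /&gt;
The arithmetic in this example can be written out as a short Python sketch. The function name is illustrative and not part of any library.&lt;br /&gt;
&lt;br /&gt;
```python
def page_end_granule(granpos, pre_skip):
    """Granule at which a page's audio ends, in 48 kHz samples from granule zero."""
    return granpos - pre_skip


end = page_end_granule(59971, 11971)
print(end, end / 48000)  # 48000 1.0
```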
&lt;br /&gt;
Note that time is measured in absolute samples from the zero granule reference.  When multiple logical bitstreams are muxed together in one Ogg file (e.g. audio and video), their timelines are defined such that granule zero is played back at the same instant in all streams.  Decoders SHOULD begin decoding or playback at the time corresponding to the first output of the logical bitstream that starts earliest.  When displaying the time, players MAY indicate either absolute time (relative to granule zero) or relative time (relative to the first output).&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus encoder MUST ensure that the first output sample has non-negative granule, and an Ogg Opus decoder MUST refuse to decode any bitstream that does not meet this requirement.  This condition is equivalent to requiring that the granpos of the first page is greater than or equal to the number of samples it produces (at 48 kHz) (before preskip).  A decoder may assess this requirement by inspecting the first few bytes of each Opus packet, which indicate the number of output samples that packet produces.  Alternatively, it may accumulate the total number of decoded samples, and stop decoding with an error message if this total is ever greater than the granpos.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
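&lt;br /&gt;
The selection procedure above might be sketched in Python as follows. The function name and the idea of passing the hardware rates as a non-empty collection of integers in Hz are assumptions made for illustration.&lt;br /&gt;
&lt;br /&gt;
```python
SUPPORTED_RATES = (8000, 12000, 16000, 24000, 48000)


def choose_decode_rate(hardware_rates):
    """Return (decode_rate, needs_resampling) for the given hardware rates in Hz."""
    if 48000 in hardware_rates:
        return 48000, False
    highest = max(hardware_rates)
    if highest in SUPPORTED_RATES:
        return highest, False
    if 48000 > highest:
        # Decode at the next supported rate above the hardware's best rate.
        return min(r for r in SUPPORTED_RATES if r > highest), True
    return 48000, True
```
&lt;br /&gt;
For instance, hardware limited to 22050 Hz would decode at 24000 Hz and resample, while hardware offering only 96000 Hz would decode at 48000 Hz and resample.&lt;br /&gt;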
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations that use this field should behave sanely when given implausible values (e.g. they should not actually upsample the output to 10 MHz), and encoders should write either the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
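&lt;br /&gt;
As a rough illustration of the Q7.8 format, the Python sketch below converts the stored integer to dB and applies it to floating-point samples. The function names are hypothetical, and the conversion to a linear factor uses the usual amplitude relation 10^(dB/20), which this page does not spell out.&lt;br /&gt;
&lt;br /&gt;
```python
def q7_8_to_db(value):
    """Convert a signed Q7.8 integer to decibels (8 fractional bits)."""
    return value / 256.0


def apply_gain(samples, gain_q8):
    """Scale a list of float samples by a Q7.8 gain given in dB."""
    scale = 10.0 ** (q7_8_to_db(gain_q8) / 20.0)
    return [s * scale for s in samples]
```
&lt;br /&gt;
The extreme stored values of -32768 and 32767 correspond to -128 dB and just under +128 dB, which is the range mentioned above.&lt;br /&gt;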
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the number of total streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Some decoded channels might not be assigned to any output channel, as well.&lt;br /&gt;
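&lt;br /&gt;
The index rules above can be sketched in Python as below. The function name and the returned labels are illustrative only, and coupled_streams stands for the two-channel stream count M.&lt;br /&gt;
&lt;br /&gt;
```python
def resolve_channel(index, coupled_streams):
    """Map one channel mapping index to (stream, side), or None for silence."""
    if index == 255:
        return None
    if 2 * coupled_streams > index:
        # Indices below 2*M address the left/right outputs of stereo streams.
        return index // 2, ("left" if index % 2 == 0 else "right")
    # Remaining indices address mono streams, which follow the M stereo ones.
    return index - coupled_streams, "mono"
```
&lt;br /&gt;
With M = 2, index 3 selects the right channel of stream 1, and index 4 selects mono stream 2.&lt;br /&gt;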
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
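&lt;br /&gt;
A hedged Python sketch of how a player might validate this field and combine it with the OpusHead output gain. It assumes the comments are already available as a list of tag=value strings, that tag names compare case-insensitively as in Vorbis comments, and that a leading plus sign is acceptable. All function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import re


def track_gain_q8(comments):
    """Return the R128_TRACK_GAIN value as an int, or None if it is absent."""
    values = [c.split("=", 1)[1] for c in comments
              if c.upper().startswith("R128_TRACK_GAIN=")]
    if not values:
        return None
    if len(values) > 1:
        raise ValueError("more than one R128_TRACK_GAIN field")
    text = values[0]
    # Plain ASCII digits with an optional sign, and no whitespace.
    if not re.fullmatch(r"[+-]?[0-9]+", text):
        raise ValueError("R128_TRACK_GAIN is not a plain ASCII integer")
    gain = int(text)
    if gain > 32767 or -32768 > gain:
        raise ValueError("R128_TRACK_GAIN out of range")
    return gain


def playback_gain_db(output_gain_q8, comments):
    """Total gain in dB: the track gain is applied on top of the output gain."""
    track = track_gain_q8(comments) or 0
    return (output_gain_q8 + track) / 256.0
```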
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], a granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process.&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13168</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13168"/>
		<updated>2011-12-18T15:08:21Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Change preskip logic to ban negative granule&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two mandatory headers: id, comment&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The channel count byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the preskip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time. The last sample decoded from the page is the sample that starts at granule 47999, i.e. the last sample from the first second of absolute time.&lt;br /&gt;
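&lt;br /&gt;
The arithmetic in the example above can be written out as a short sketch. The function name page_end_granule is hypothetical and is used only for illustration.&lt;br /&gt;
&lt;br /&gt;
```python
def page_end_granule(page_granpos, preskip):
    """Return the granule at which the audio of a page ends (48 kHz units)."""
    return page_granpos - preskip


end = page_end_granule(59971, 11971)
last_sample = end - 1  # granule at which the last decoded sample starts
print(end, last_sample)
```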
&lt;br /&gt;
Note that time is measured in absolute samples from the zero granule reference.  When multiple logical bitstreams are muxed together in one Ogg file (e.g. audio and video), their timelines are defined such that granule zero is played back at the same instant in all streams.  Decoders SHOULD begin decoding or playback at the time corresponding to the first output of the logical bitstream that starts earliest.  When displaying the time, players MAY indicate either absolute time (relative to granule zero) or relative time (relative to the first output).&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus encoder MUST ensure that the first output sample has non-negative granule, and an Ogg Opus decoder MUST refuse to decode any bitstream that does not meet this requirement.  This condition is equivalent to requiring that the granpos of the first page is greater than or equal to the number of samples it produces (at 48 kHz) (before preskip).  A decoder may assess this requirement by inspecting the first few bytes of each Opus packet, which indicate the number of output samples that packet produces.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
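&lt;br /&gt;
The selection procedure above might be sketched as follows. This is a minimal illustration, assuming the hardware capabilities are available as a collection of integer rates in Hz; the names SUPPORTED_RATES and choose_decode_rate are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import bisect

# Decoder output rates listed in the text, in Hz, sorted ascending.
SUPPORTED_RATES = [8000, 12000, 16000, 24000, 48000]


def choose_decode_rate(hardware_rates):
    """Return (decode_rate, needs_resample) for a collection of integer hardware rates."""
    if 48000 in hardware_rates:
        return 48000, False
    highest = max(hardware_rates)
    if highest in SUPPORTED_RATES:
        return highest, False
    if highest in range(48000):
        # Hardware tops out below 48 kHz: decode at the next higher supported rate.
        return SUPPORTED_RATES[bisect.bisect_left(SUPPORTED_RATES, highest)], True
    # Hardware only offers unsupported rates above 48 kHz.
    return 48000, True


print(choose_decode_rate([22050]))
print(choose_decode_rate([44100, 96000]))
```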
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations which do something with this field should take care to behave sanely if given crazy values (e.g. don&#039;t actually upsample the output to 10 MHz) and encoders should write the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
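&lt;br /&gt;
As a rough illustration of how a decoder could apply the Q7.8 output gain, the sketch below converts the stored integer to dB and then to a linear scale factor. It assumes the usual amplitude convention of 20*log10 for dB, which the draft does not spell out, and the function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def output_gain_to_linear(gain_q8):
    """Convert a signed Q7.8 gain in dB to a linear amplitude scale factor."""
    gain_db = gain_q8 / 256.0  # 8 fractional bits
    # Assumption: amplitude dB, so a change of 20 dB is a factor of 10.
    return 10.0 ** (gain_db / 20.0)


def apply_output_gain(samples, gain_q8):
    """Scale a list of floating-point PCM samples by the output gain."""
    scale = output_gain_to_linear(gain_q8)
    return [s * scale for s in samples]


print(output_gain_to_linear(-573))
```

Any further adjustment, such as R128_TRACK_GAIN or a volume control, would then be multiplied on top of this factor.&lt;br /&gt;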
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the number of total streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
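&lt;br /&gt;
The index rules above can be sketched as a lookup that returns which stream to decode and which of its channels to take. The function name resolve_channel and the role strings are hypothetical, chosen only for illustration.&lt;br /&gt;
&lt;br /&gt;
```python
def resolve_channel(index, coupled_streams):
    """Map a channel mapping index to (stream, role).

    role is "left", "right" or "mono"; (None, None) means pure silence.
    coupled_streams is M, the two-channel stream count.
    """
    if index == 255:
        return None, None
    if index in range(2 * coupled_streams):
        # Indices below 2*M address the stereo streams, two channels each.
        return index // 2, "left" if index % 2 == 0 else "right"
    # Remaining indices address the mono streams that follow the stereo ones.
    return index - coupled_streams, "mono"


# Example: N=2 streams, M=1 coupled (one stereo stream followed by one mono stream).
for idx in (0, 1, 2, 255):
    print(idx, resolve_channel(idx, 1))
```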
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Some decoded channels might not be assigned to any output channel, as well.&lt;br /&gt;
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
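&lt;br /&gt;
A reader of the comment header could validate the field value along these lines. This is a sketch only: the function name parse_r128_track_gain is hypothetical, and accepting a leading plus sign is an assumption, since the text does not say whether one is permitted.&lt;br /&gt;
&lt;br /&gt;
```python
import re


def parse_r128_track_gain(value):
    """Return the R128_TRACK_GAIN value in dB, or raise ValueError if malformed."""
    # ASCII digits only, optional sign, no whitespace.
    # Assumption: a leading "+" is tolerated.
    if re.fullmatch(r"[+-]?[0-9]+", value) is None:
        raise ValueError("gain must be an ASCII integer with no whitespace")
    gain_q8 = int(value)
    if gain_q8 not in range(-32768, 32768):
        raise ValueError("gain out of 16-bit range")
    return gain_q8 / 256.0  # Q7.8 fixed point to dB


print(parse_r128_track_gain("-573"))
```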
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], the granule position on the final page of a stream may indicate less audio data than the final packet would normally return, which allows the stream to end at a point other than a frame boundary. The difference between the amount of data actually decoded and the declared amount is the number of trailing samples to discard from the decoder output.&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13167</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13167"/>
		<updated>2011-12-17T15:03:21Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Improve wording on preskip&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two mandatory headers: id, comment&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The channel count byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end-granule.  If, after discarding this number of samples, the subsequent samples still have granule less than zero, those samples MUST be discarded as well.  (In Ogg Opus, the granule of an output sample is defined as the PCM sample position relative to time zero, measured at 48 kHz, which is the granulerate.  The granule corresponding to a given sample may be computed by working backwards from the page&#039;s granpos, using the preskip and the duration of all subsequent packets on the page.)&lt;br /&gt;
&lt;br /&gt;
To determine how many samples to discard, the decoder must know the preskip, the firstpage_granpos, and the total duration of packets that end on the first page (firstpage_duration), all of which are times or timespans measured at the granulerate.  After decoding packet contents, the decoder must discard an amount of time given by preskip + max(0, firstpage_duration - firstpage_granpos).&lt;br /&gt;
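&lt;br /&gt;
The discard rule above translates directly into code. The sketch below only restates the formula from the text; the function name samples_to_discard is hypothetical and the input values are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
```python
def samples_to_discard(preskip, firstpage_granpos, firstpage_duration):
    """Return how many decoded samples (at 48 kHz) to drop at the start of the stream."""
    return preskip + max(0, firstpage_duration - firstpage_granpos)


# The first page's granpos covers its whole duration: only the pre-skip is dropped.
print(samples_to_discard(312, 960, 960))
# The first page decodes to more samples than its granpos accounts for.
print(samples_to_discard(312, 500, 960))
```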
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the preskip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time. The last sample decoded from the page is the sample that starts at granule 47999, i.e. the last sample from the first second of absolute time.&lt;br /&gt;
&lt;br /&gt;
Note that time is measured in absolute samples from the zero granule reference.  When multiple logical bitstreams are muxed together in one Ogg file (e.g. audio and video), their timelines are defined such that granule zero is played back at the same instant in all streams.  Decoders SHOULD begin decoding or playback at the time corresponding to the first output of the logical bitstream that starts earliest.  When displaying the time, players MAY indicate either absolute time (relative to granule zero) or relative time (relative to the first output).&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations which do something with this field should take care to behave sanely if given crazy values (e.g. don&#039;t actually upsample the output to 10 MHz) and encoders should write the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
This field gives the number of streams whose decoders should be configured to produce two channels. It must be no larger than the total number of streams (N).&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Likewise, some decoded channels might not be assigned to any output channel.&lt;br /&gt;
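&lt;br /&gt;
The index rules above can be sketched in Python as follows. This is only an illustration: the function name, the tuple return value and the range check against M+N are assumptions made for the example, not part of the mapping.&lt;br /&gt;

```python
SILENCE = 255  # special index: the output channel is pure silence


def resolve_channel(index, n_streams, n_coupled):
    """Map one channel-mapping index to (stream, channel_in_stream).

    Returns None for the silence index. n_streams is N and n_coupled is M.
    """
    if index == SILENCE:
        return None
    if index not in range(n_streams + n_coupled):
        # Assumed sanity check: only M+N decoded channels exist.
        raise ValueError("index exceeds the M+N decoded channels")
    if index in range(2 * n_coupled):
        # Stereo streams come first: even index is left (0), odd is right (1).
        return (index // 2, index % 2)
    # Mono streams follow the M stereo streams.
    return (index - n_coupled, 0)


# Hypothetical layout: N=3 streams, the first M=1 of them stereo.
mapping = [0, 1, 2, 3, 255, 2]
resolved = [resolve_channel(i, 3, 1) for i in mapping]
print(resolved)
```

In this hypothetical layout, decoded channel 2 (the first mono stream) feeds two output channels, and the fifth output channel is silent.&lt;br /&gt;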
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
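&lt;br /&gt;
A minimal sketch of validating the tag value and combining it with the OpusHead output gain is shown below. The function names are hypothetical, and accepting a leading plus sign is an assumption; the text above only fixes the numeric range and the absence of whitespace.&lt;br /&gt;

```python
import re

# Optional sign followed by ASCII digits only, with no whitespace.
# Assumption: a leading "+" is tolerated.
_GAIN_RE = re.compile(r"[+-]?[0-9]+\Z")


def parse_r128_track_gain(value):
    """Return the tag value as a Q7.8 integer, or None if it is invalid."""
    if not _GAIN_RE.match(value):
        return None
    gain = int(value)
    if gain not in range(-32768, 32768):
        return None
    return gain


def total_gain_db(output_gain_q8, track_gain_q8):
    """Track gain is applied in addition to the OpusHead output gain."""
    return (output_gain_q8 + track_gain_q8) / 256.0


track = parse_r128_track_gain("-573")
print(track, total_gain_db(0, track))
```

With the example value from above and an output gain of zero, the player would apply -573/256 dB, a little over 2 dB of attenuation.&lt;br /&gt;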
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], a granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process.&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13158</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13158"/>
		<updated>2011-12-15T18:59:21Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Change preskip logic specification to reflect current consensus&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two mandatory headers: id, comment&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
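&lt;br /&gt;
As a rough illustration, the layout above could be parsed like this in Python. The function name and dictionary keys are hypothetical, and treating pre-skip as little endian is an assumption, since the list above states the byte order only for the sample rate and gain fields. Validation of N and M is omitted.&lt;br /&gt;

```python
def parse_opus_head(packet):
    """Parse an OpusHead packet (bytes) into a dict of its fields."""
    if packet[:8] != b"OpusHead":
        raise ValueError("missing OpusHead magic signature")
    if len(packet) in range(19):
        # The fixed part of the header is 19 bytes long.
        raise ValueError("truncated OpusHead packet")
    head = {
        "version": packet[8],
        "channels": packet[9],
        # Assumption: pre-skip is little endian like the other fields.
        "pre_skip": int.from_bytes(packet[10:12], "little"),
        "input_rate": int.from_bytes(packet[12:16], "little"),
        "output_gain": int.from_bytes(packet[16:18], "little", signed=True),
        "family": packet[18],
    }
    if head["channels"] == 0:
        raise ValueError("channel count must be nonzero")
    if head["family"] == 0:
        # Family 0: one stream, stereo iff c == 2; nothing else is coded.
        if head["channels"] not in (1, 2):
            raise ValueError("family 0 allows only 1 or 2 channels")
        head["streams"] = 1
        head["coupled"] = head["channels"] - 1
        head["mapping"] = list(range(head["channels"]))
    else:
        body = packet[19:21 + head["channels"]]
        if len(body) != 2 + head["channels"]:
            raise ValueError("truncated OpusHead packet")
        head["streams"], head["coupled"] = body[0], body[1]
        head["mapping"] = list(body[2:])
    return head


# Hypothetical stereo header: pre-skip 312, input rate 44100 Hz, gain -573.
example = (b"OpusHead" + bytes([0, 2])
           + (312).to_bytes(2, "little")
           + (44100).to_bytes(4, "little")
           + (-573).to_bytes(2, "little", signed=True)
           + bytes([0]))
print(parse_opus_head(example))
```

For family 0 the stream count, two-channel stream count and mapping are filled in with their defaults, so later code can treat every family the same way.&lt;br /&gt;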
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The channel count byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output when starting playback, and also the number to subtract from a page&#039;s granpos to calculate its end granule.  If, after discarding this number of samples, the subsequent samples still have a granule less than zero, those samples MUST be discarded as well.&lt;br /&gt;
&lt;br /&gt;
To determine how many samples to discard, the decoder must know the preskip, the firstpage_granpos, and the total duration of packets that end on the first page (firstpage_duration).  After decoding packet contents, the decoder must discard a number of samples given by preskip + max(0, firstpage_duration - firstpage_granpos).&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding and without rewriting the Ogg pages except at the beginning and end.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the preskip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time. The last sample decoded from the page is the sample that starts at granule 47999, i.e. the last sample from the first second of absolute time.&lt;br /&gt;
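&lt;br /&gt;
The two calculations described for pre-skip can be sketched as below. The function names are hypothetical; the formulas are the ones given in the text, and the first example reuses its numbers.&lt;br /&gt;

```python
def samples_to_discard(preskip, firstpage_duration, firstpage_granpos):
    """Samples to drop from the start of the decoder output."""
    return preskip + max(0, firstpage_duration - firstpage_granpos)


def page_end_granule(granpos, preskip):
    """Granule (48 kHz sample time) at which the audio of a page ends."""
    return granpos - preskip


end = page_end_granule(59971, 11971)
print(end, end / 48000)

# Hypothetical cropped stream: the first page holds 960 samples of packets
# but declares a granpos of only 500, so 460 extra samples are dropped.
print(samples_to_discard(312, 960, 500))
```

The first call prints 48000 and 1.0, matching the one-second example above; the second prints 772.&lt;br /&gt;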
&lt;br /&gt;
Note that time is measured in absolute samples from the zero granule reference.  When multiple logical bitstreams are muxed together in one Ogg file (e.g. audio and video), their timelines are defined such that granule zero is played back at the same instant in all streams.  Decoders SHOULD begin decoding or playback at the time corresponding to the first output of the logical bitstream that starts earliest.  When displaying the time, players MAY indicate either absolute time (relative to granule zero) or relative time (relative to the first output).&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
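&lt;br /&gt;
The selection procedure above might look like this in Python. This is a sketch under assumptions: the function name is hypothetical, the hardware is described by a collection of supported rates in Hz, and when resampling is needed the target is taken to be the highest hardware rate.&lt;br /&gt;

```python
import bisect

SUPPORTED = [8000, 12000, 16000, 24000, 48000]  # Opus decoder rates in Hz


def choose_rates(hardware_rates):
    """Return (decode_rate, playback_rate) for a set of hardware rates."""
    if 48000 in hardware_rates:
        return 48000, 48000
    highest = max(hardware_rates)
    if highest in SUPPORTED:
        return highest, highest
    # bisect_left finds the first supported rate above `highest`; when there
    # is none (highest exceeds 48 kHz) fall back to decoding at 48 kHz.
    pos = min(bisect.bisect_left(SUPPORTED, highest), len(SUPPORTED) - 1)
    return SUPPORTED[pos], highest


print(choose_rates({22050, 44100}))
print(choose_rates([96000]))
```

When the two returned rates differ, the caller is expected to resample from the decode rate to the playback rate.&lt;br /&gt;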
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations that use this field should behave sanely when given implausible values (e.g. they should not actually upsample the output to 10 MHz), and encoders should write either the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
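&lt;br /&gt;
For illustration, a Q7.8 gain can be turned into a linear amplitude multiplier as sketched below. The function names are hypothetical; the conversion assumes the usual amplitude convention of 20 dB per factor of ten.&lt;br /&gt;

```python
def read_output_gain(two_bytes):
    """Decode the 16-bit little-endian signed field from OpusHead."""
    return int.from_bytes(two_bytes, "little", signed=True)


def q8_to_db(raw):
    """Convert a signed Q7.8 gain value to decibels."""
    return raw / 256.0


def db_to_scale(gain_db):
    """Linear amplitude multiplier for a gain in dB."""
    return 10.0 ** (gain_db / 20.0)


raw = read_output_gain(bytes([0x00, 0xFA]))  # 0xFA00 read as signed is -1536
gain_db = q8_to_db(raw)
print(raw, gain_db, db_to_scale(gain_db))
```

Here the stored value -1536 corresponds to -6 dB, which roughly halves the sample amplitude. Any further adjustment such as R128_TRACK_GAIN would be added to the dB value before converting to a multiplier.&lt;br /&gt;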
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The last Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
This field gives the number of streams whose decoders should be configured to produce two channels. It must be no larger than the total number of streams (N).&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Likewise, some decoded channels might not be assigned to any output channel.&lt;br /&gt;
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], a granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process.&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13156</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13156"/>
		<updated>2011-12-15T16:26:29Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Further attempt to clarify and correct pre-skip granpos logic&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* stream count and coupling for multichannel audio&lt;br /&gt;
* metadata and tags&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two mandatory headers: id, comment&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The channel count byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to subtract from the granpos to calculate the granule.&lt;br /&gt;
&lt;br /&gt;
Like Ogg Vorbis, Ogg Opus streams may encode samples whose implicit playback time (granule) is negative, but such samples must be discarded, and not played back or otherwise provided as output.  The first sample with non-negative granule (often zero) is the first sample of output from the stream.  To determine how many samples to discard, the decoder must know the preskip, the firstpage_granpos, and the total duration of packets that end on the first page (firstpage_duration).  After decoding packet contents, the decoder must discard a number of samples given by max(0, preskip + firstpage_duration - firstpage_granpos).&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59971, and the preskip is 11971, then the page provides output audio up to granule 48000, i.e. it completes the first second of absolute time. The last sample decoded from the page is the sample that starts at granule 47999, i.e. the last sample from the first second of absolute time.&lt;br /&gt;
&lt;br /&gt;
Note that time is measured in absolute samples from the zero granule reference.  When multiple logical bitstreams are muxed together in one Ogg file (e.g. audio and video), their timelines are defined such that granule zero is played back at the same instant in all streams.  Decoders SHOULD begin decoding or playback at the time corresponding to the first output of any muxed stream.  When displaying the time, players MAY indicate either absolute time (relative to granule zero) or relative time (relative to the first output).&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
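&lt;br /&gt;
The selection procedure above might be sketched as follows. The function name and the representation of the hardware capabilities as a set of rates in Hz are assumptions made for illustration.&lt;br /&gt;
&lt;br /&gt;
```python
# Decoder output rates named in the text, in Hz.
OPUS_RATES = (8000, 12000, 16000, 24000, 48000)


def choose_decode_rate(hardware_rates):
    """Return (decode_rate, needs_resample) for a non-empty set of hardware rates."""
    if 48000 in hardware_rates:
        return 48000, False
    highest = max(hardware_rates)
    if highest in OPUS_RATES:
        return highest, False
    if 48000 > highest:
        # Decode at the next supported rate above the hardware rate, then resample down.
        return min(r for r in OPUS_RATES if r > highest), True
    return 48000, True
```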
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations that use this field should behave sanely when given implausible values (e.g. don&#039;t actually upsample the output to 10 MHz), and encoders should write either the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
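&lt;br /&gt;
As a rough illustration of how a decoder could apply this field, the sketch below converts the raw Q7.8 value to dB and then to a linear factor. The function names are hypothetical, and the sketch assumes the usual amplitude convention of 20 dB per factor of ten, which this page does not spell out.&lt;br /&gt;
&lt;br /&gt;
```python
def q78_to_db(raw):
    """Convert a signed Q7.8 fixed-point integer to dB (8 fractional bits)."""
    return raw / 256.0


def gain_scale(output_gain_raw, extra_gain_raw=0):
    """Linear amplitude factor for the header gain plus an optional extra gain.

    extra_gain_raw stands for something like R128_TRACK_GAIN, which is applied
    in addition to the header gain. Assumes 20 dB per factor of ten in amplitude.
    """
    return 10.0 ** (q78_to_db(output_gain_raw + extra_gain_raw) / 20.0)
```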
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the number of total streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Some decoded channels might not be assigned to any output channel, as well.&lt;br /&gt;
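&lt;br /&gt;
The index rules for the channel mapping can be sketched as below. The function name and the tuple return value are illustrative choices, not part of the format.&lt;br /&gt;
&lt;br /&gt;
```python
def resolve_channel(index, m):
    """Map one channel mapping index to (stream, side), or None for silence.

    m is the two-channel stream count M from the Id header.
    """
    if index == 255:
        return None  # the output channel is pure silence
    if 2 * m > index:
        # The first M streams are decoded as stereo: even is left, odd is right.
        return index // 2, "left" if index % 2 == 0 else "right"
    # The remaining streams are decoded as mono.
    return index - m, "mono"


# Example with M = 2 stereo streams followed by mono streams.
for idx in (0, 1, 3, 4, 255):
    print(idx, resolve_channel(idx, 2))
```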
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
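&lt;br /&gt;
A reader of this field might validate it along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, field names are compared case-insensitively as in Vorbis comments, and a leading plus sign is accepted, which the text does not settle.&lt;br /&gt;
&lt;br /&gt;
```python
import re

# ASCII digits with an optional sign and no whitespace.
_R128_VALUE = re.compile(r"[+-]?[0-9]+")


def parse_r128_track_gain(comment):
    """Return the gain in dB from an R128_TRACK_GAIN comment, or None if invalid."""
    name, sep, value = comment.partition("=")
    if not sep or name.upper() != "R128_TRACK_GAIN":
        return None
    if not _R128_VALUE.fullmatch(value):
        return None
    raw = int(value)
    if raw not in range(-32768, 32768):
        return None  # outside the signed 16-bit range required by the text
    return raw / 256.0  # Q7.8 fixed point to dB
```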
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], a granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process.&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13147</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13147"/>
		<updated>2011-12-07T02:47:00Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: /* Id header */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
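&lt;br /&gt;
The layout above could be read with a small parser such as the sketch below. The function name and dictionary keys are hypothetical. The sketch assumes that pre-skip is little endian like the other multi-byte fields (the list does not say), and it fills in the family 0 defaults described in the discussion of the individual fields.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_opus_head(packet):
    """Parse an OpusHead packet (bytes) into a dict of its fields."""
    if packet[:8] != b"OpusHead":
        raise ValueError("missing OpusHead magic signature")
    if 19 > len(packet):
        raise ValueError("truncated OpusHead")
    version, channels, family = packet[8], packet[9], packet[18]
    if version != 0:
        raise ValueError("unsupported version")
    if channels == 0:
        raise ValueError("channel count MUST be > 0")
    head = {
        "channels": channels,
        # Assumption: pre-skip is little endian like the other multi-byte fields.
        "preskip": int.from_bytes(packet[10:12], "little"),
        "input_sample_rate": int.from_bytes(packet[12:16], "little"),
        "output_gain_q78": int.from_bytes(packet[16:18], "little", signed=True),
        "mapping_family": family,
    }
    if family == 0:
        if channels > 2:
            raise ValueError("family 0 allows only 1 or 2 channels")
        # For family 0 these values are implied rather than coded.
        head.update(streams=1, coupled=channels - 1, mapping=list(range(channels)))
    else:
        if 21 + channels > len(packet):
            raise ValueError("truncated channel mapping")
        streams, coupled = packet[19], packet[20]
        if streams == 0 or coupled > streams or streams + coupled > 255:
            raise ValueError("invalid stream counts")
        head.update(streams=streams, coupled=coupled,
                    mapping=list(packet[21:21 + channels]))
    return head
```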
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The number of channels byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
Because more than one page can be needed for re-convergence, the Vorbis scheme for signaling pre-skip is not used for Opus.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granpos value.  For example, if the page&#039;s granpos is 59970, and the preskip is 11971, then the last sample decoded from the page is sample 47999, i.e. the last sample from the first second of absolute time.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations that use this field should behave sanely when given implausible values (e.g. don&#039;t actually upsample the output to 10 MHz), and encoders should write either the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the number of total streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Some decoded channels might not be assigned to any output channel, as well.&lt;br /&gt;
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], a granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process.&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13121</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13121"/>
		<updated>2011-11-21T16:50:18Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Attempt to clarify pre-skip granpos calculation&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
==== Id header ====&lt;br /&gt;
&lt;br /&gt;
 - Magic signature: &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - Version number (8 bits): zero for this spec&lt;br /&gt;
 - Channel count &#039;c&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Pre-skip (16 bits unsigned)&lt;br /&gt;
 - Input sample rate (32 bits, little endian): informational only&lt;br /&gt;
 - Output gain (16 bits, little endian, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - Channel mapping family (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 If channel mapping family &amp;gt; 0&lt;br /&gt;
 - Stream count &#039;N&#039; (8 bits unsigned): MUST be &amp;gt; 0&lt;br /&gt;
 - Two-channel stream count &#039;M&#039; (8 bits unsigned): MUST satisfy M &amp;lt;= N, M+N &amp;lt;= 255&lt;br /&gt;
 - Channel mapping (8*c bits)&lt;br /&gt;
   -- one stream index (8 bits unsigned) per channel (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature &amp;quot;OpusHead&amp;quot; allows codec identification and is human readable. Starting with &#039;Op&#039; helps distinguish it from data packets, as this is an invalid TOC sequence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Version&#039;&#039;&#039;&lt;br /&gt;
The version number must always be zero for this version of the encapsulation spec. We do not plan to revise the spec, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel count&#039;&#039;&#039; &#039;c&#039;&lt;br /&gt;
The number of channels byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
Because more than one page can be needed for re-convergence, the Vorbis scheme for signaling pre-skip is not used for Opus.&lt;br /&gt;
&lt;br /&gt;
The granule corresponding to the end time of an Ogg Opus page can be determined by subtracting the pre-skip from the page&#039;s granulepos value.  For example, if the page&#039;s granulepos is 59970 and the pre-skip is 11971, then the last sample decoded from the page is sample 47999 (59970 - 11971), i.e. the last sample from the first second of absolute time.&lt;br /&gt;
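&lt;br /&gt;
The subtraction can be sketched in Python as follows. The function names are hypothetical, and clamping at zero for pages that end inside the pre-skip region is an assumption of this sketch, not something the draft specifies.&lt;br /&gt;

```python
OPUS_GRANULE_RATE = 48000  # granulepos always counts samples at 48 kHz

def page_end_sample(granulepos, pre_skip):
    """Granule of real output at the end of a page (pre-skip removed)."""
    # Assumption: a page that ends inside the pre-skip region maps to 0.
    return max(granulepos - pre_skip, 0)

def page_end_seconds(granulepos, pre_skip):
    """The same position expressed in seconds of playback time."""
    return page_end_sample(granulepos, pre_skip) / OPUS_GRANULE_RATE
```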
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Input sample rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate to use for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
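&lt;br /&gt;
The selection procedure above could be written roughly as below. This is only a sketch: choose_decode_rate is a hypothetical helper, and it assumes the caller can list the sample rates the hardware accepts.&lt;br /&gt;

```python
OPUS_RATES = (8000, 12000, 16000, 24000, 48000)  # rates the decoder supports

def choose_decode_rate(hardware_rates):
    """Return (decode_rate, output_rate); resample when the two differ."""
    hw_max = max(hardware_rates)
    if 48000 in hardware_rates:
        return 48000, 48000
    if hw_max in OPUS_RATES:
        return hw_max, hw_max
    if 48000 > hw_max:
        # Decode at the next supported rate above the hardware maximum.
        decode = min(r for r in OPUS_RATES if r > hw_max)
        return decode, hw_max
    return 48000, hw_max
```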
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise to the user, who might reasonably expect to get back a file with the same sample rate as the one they fed to the encoder.&lt;br /&gt;
&lt;br /&gt;
A value of zero indicates &#039;unspecified&#039;. Implementations which do something with this field should take care to behave sanely if given crazy values (e.g. don&#039;t actually upsample the output to 10 MHz) and encoders should write the actual input rate or zero.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN (see below) without saturating.&lt;br /&gt;
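&lt;br /&gt;
For illustration, a Q7.8 gain can be turned into a linear scale factor as sketched below. The helper names are hypothetical, and the sketch assumes the usual amplitude convention of 20*log10 for dB values.&lt;br /&gt;

```python
def q8_to_db(gain_q8):
    """Convert a signed Q7.8 integer (8 fractional bits) to dB."""
    return gain_q8 / 256.0

def db_to_linear(gain_db):
    """Convert dB to an amplitude scale factor (assumes 20*log10)."""
    return 10.0 ** (gain_db / 20.0)

def playback_scale(output_gain_q8, track_gain_q8=0):
    """Scale factor when a track gain is applied in addition to output gain."""
    # Gains in dB add, which multiplies the linear scale factors.
    return db_to_linear(q8_to_db(output_gain_q8 + track_gain_q8))
```

With these definitions a stored value of -573 corresponds to -573/256 dB, a little over 2.2 dB of attenuation.&lt;br /&gt;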
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping family&#039;&#039;&#039;&lt;br /&gt;
This byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  &lt;br /&gt;
&lt;br /&gt;
Each possible value of this byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if c==2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping families (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with a channel mapping family of 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stream count&#039;&#039;&#039; &#039;N&#039;&lt;br /&gt;
This field indicates the total number of streams so the decoder can correctly parse the packed Opus packets inside the Ogg packet.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to 1, and is not coded.&lt;br /&gt;
&lt;br /&gt;
A multi-channel Opus file is composed of one or more individual Opus streams, each of which produces one or two channels of decoded data. Each Ogg packet contains one Opus packet from each stream. The first N-1 Opus packets are packed using the self-delimiting framing from Appendix B of the Opus specification. The remaining Opus packet is packed using the regular, undelimited framing from Section 3 of the Opus specification. All the Opus packets in a single Ogg packet are constrained to produce the same number of decoded samples.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Two-channel stream count&#039;&#039;&#039; &#039;M&#039;&lt;br /&gt;
Describes the number of streams whose decoders should be configured to produce two channels. This must be no larger than the number of total streams.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, this value defaults to c-1 (i.e., 0 for mono and 1 for stereo), and is not coded.&lt;br /&gt;
&lt;br /&gt;
Each packet in an Opus stream has an internal channel count of 1 or 2, which can change from packet to packet. This is selected by the encoder depending on the bitrate and the contents being encoded. The original channel count of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
Regardless of the internal channel count, any Opus stream may be decoded as mono (single channel) or stereo (two channels) by appropriate initialization of the decoder.  The &amp;quot;two-channel stream count&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining N-M decoders should be initialized in mono mode. The total number of decoded channels (M+N) must be no larger than 255, as there is no way to index more channels than that in the channel mapping.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Channel mapping&#039;&#039;&#039;&lt;br /&gt;
Contains one index per output channel indicating which decoded channel should be used. If the index is less than 2*M, the output MUST be taken from decoding stream (index/2) as stereo and selecting the left channel if index is even, and the right channel if index is odd. If the index is 2*M or larger, the output MUST be taken from decoding stream (index-M) as mono. As a special case, an index of 255 means that the corresponding output channel MUST contain pure silence.&lt;br /&gt;
&lt;br /&gt;
For channel mapping family 0, the first index defaults to 0, and if c==2, the second index defaults to 1. Neither index is coded.&lt;br /&gt;
&lt;br /&gt;
The number of output channels (c) is not constrained to match the number of decoded channels (M+N). A single index MAY appear multiple times, i.e., the same decoded channel may be mapped to multiple output channels. Some decoded channels might not be assigned to any output channel, as well.&lt;br /&gt;
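&lt;br /&gt;
The index rule for the channel mapping can be sketched as below. The function name and return shape are hypothetical, and the range check against M+N is an addition of this sketch rather than a stated requirement.&lt;br /&gt;

```python
def resolve_channel(index, n_streams, m_coupled):
    """Map one channel-mapping index to (stream, decoder_channel).

    Returns None for index 255 (the output channel is pure silence).
    decoder_channel is 0 for left or mono and 1 for right.
    """
    if index == 255:
        return None
    if index >= n_streams + m_coupled:
        # Assumption of this sketch: reject indices past the M+N channels.
        raise ValueError("index beyond the M+N decoded channels")
    if 2 * m_coupled > index:
        # One of the first M streams, decoded as stereo.
        return index // 2, index % 2
    # One of the remaining N-M streams, decoded as mono.
    return index - m_coupled, 0
```

For example, with N=4 and M=2 the mapping 0, 4, 1, 2, 3, 5 takes output channels 0 and 2 from the first stereo stream, channels 3 and 4 from the second, and channels 1 and 5 from the two mono streams.&lt;br /&gt;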
&lt;br /&gt;
==== Comment header ====&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis (without the &amp;quot;framing-bit&amp;quot;), OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], a granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process.&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13117</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13117"/>
		<updated>2011-11-16T23:29:23Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Note about seeking upstream of target&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - channel mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if channel mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039;(8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c-1]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent through the file)&lt;br /&gt;
&lt;br /&gt;
All two-output streams come first, so if the stream index is &lt; 2*M, decode
the (index/2)th Opus stream as stereo and select its (index%2)th output (left
for even, right for odd). If index &gt;= 2*M, decode the (index - M)th stream
as mono and use that as the output. As a special case, a stream index of 255
means to write silence to that output channel.&lt;br /&gt;
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The signature magic values allow codec identification and are human readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;channel mapping&#039;&#039;&#039; and &#039;&#039;&#039;number of channels&#039;&#039;&#039;&lt;br /&gt;
The channel mapping byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  The number of channels byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
Each channel mapping byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo if and only if the number of channels == 2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping values (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with channel mapping 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
Note: Any Opus stream may be decoded as mono (single output) or stereo (two outputs), regardless of its contents, by appropriate initialization of the decoder.  The &amp;quot;number of two-output streams&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining decoders should be initialized in mono mode.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
It is necessary to describe the number of streams so the decoder can correctly parse the packed frames inside the packet. We store the count-minus-one here, to rule out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], a granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process.&lt;br /&gt;
&lt;br /&gt;
When seeking within an Ogg Opus stream, the decoder should start decoding (and discarding the output) at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples prior to the seek point in order to ensure that the output audio is correct at the seek point.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13115</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13115"/>
		<updated>2011-11-15T21:01:48Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Fix typo in URL&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - channel mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if channel mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039;(8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c-1]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent through the file)&lt;br /&gt;
&lt;br /&gt;
All two-output streams come first, so if the stream index is &lt; 2*M, decode
the (index/2)th Opus stream as stereo and select its (index%2)th output (left
for even, right for odd). If index &gt;= 2*M, decode the (index - M)th stream
as mono and use that as the output. As a special case, a stream index of 255
means to write silence to that output channel.&lt;br /&gt;
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature values allow codec identification and are human readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;channel mapping&#039;&#039;&#039; and &#039;&#039;&#039;number of channels&#039;&#039;&#039;&lt;br /&gt;
The channel mapping byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  The number of channels byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
Each channel mapping byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the contents consist of a single Opus stream that is stereo iff number of channels == 2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping values (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with channel mapping 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
Note: Any Opus stream may be decoded as mono (single output) or stereo (two outputs), regardless of its contents, by appropriate initialization of the decoder.  The &amp;quot;number of two-output streams&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining decoders should be initialized in mono mode.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal samplerates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encode input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
It is necessary to describe the number of streams so the decoder can correctly parse the packed frames inside the packet. We store the count-minus-one here, to rule out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], a granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process. &lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13114</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=13114"/>
		<updated>2011-11-15T21:00:30Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Mention truncation rule&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodeable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - channel mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if channel mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039;(8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c-1]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent through the file)&lt;br /&gt;
&lt;br /&gt;
All two-output streams come first, so if the stream index is &lt; 2*M, decode
the (index/2)th Opus stream as stereo, selecting the (index%2)th output
(left for even, right for odd). If index &gt;= 2*M, decode the (index - M)th stream
as mono and use that as the output. As a special case, a stream index of 255
means to write silence to that output channel. &lt;br /&gt;
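&lt;br /&gt;
The id header layout and the stream index rule can be sketched in Python as follows. This is only an illustration, not part of the draft: the function names are hypothetical, and little-endian byte order for the multi-byte fields is an assumption, since the field list above does not state a byte order.&lt;br /&gt;

```python
def parse_opus_head(packet):
    """Parse the OpusHead fields in the order listed above.

    Assumption: multi-byte fields are little-endian (not stated in the draft text).
    """
    if packet[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    head = {
        "version": packet[8],
        "channels": packet[9],
        "pre_skip": int.from_bytes(packet[10:12], "little"),
        "input_rate": int.from_bytes(packet[12:16], "little"),
        # Signed Q7.8 value in dB, kept here as the raw integer.
        "output_gain_q8": int.from_bytes(packet[16:18], "little", signed=True),
        "mapping": packet[18],
    }
    if head["channels"] == 0:
        raise ValueError("number of channels must be nonzero")
    if head["mapping"] > 0:
        # These fields are only present for nonzero channel mapping values.
        head["streams"] = packet[19]
        head["two_output_streams"] = packet[20]
        head["stream_index"] = list(packet[21:21 + head["channels"]])
    return head


def route_channel(index, m):
    """Map one stream index to (stream number, decoder mode, output), or None for silence."""
    if index == 255:
        return None  # write silence to this output channel
    if 2 * m > index:
        # Two-output streams come first: even index is left (0), odd is right (1).
        return (index // 2, "stereo", index % 2)
    return (index - m, "mono", 0)
```

With M = 1, for example, stream indices 0 and 1 select the left and right outputs of stream 0, and stream index 2 selects stream 1 decoded as mono.&lt;br /&gt;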
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
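&lt;br /&gt;
As a rough sketch of how a player might handle these values: a Q7.8 number is the integer divided by 256, and the track gain is added to the OpusHead output gain before converting to a linear scale factor. The helper names below are hypothetical, accepting a leading &amp;quot;+&amp;quot; sign is an assumption, and the conversion is the usual 10^(dB/20) amplitude rule rather than anything this page specifies.&lt;br /&gt;

```python
def parse_r128_track_gain(value):
    """Validate an R128_TRACK_GAIN value string and return the gain in dB."""
    # Assumption: an optional leading sign is allowed; no whitespace is accepted.
    digits = value[1:] if value[:1] in ("+", "-") else value
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("R128_TRACK_GAIN must be an ASCII integer")
    q8 = int(value)
    if q8 not in range(-32768, 32768):
        raise ValueError("R128_TRACK_GAIN out of range")
    return q8 / 256.0  # Q7.8 fixed point: 8 fractional bits


def total_gain_db(output_gain_q8, track_gain_q8=0):
    """Track gain is applied in addition to the OpusHead output gain."""
    return (output_gain_q8 + track_gain_q8) / 256.0


def db_to_linear(db):
    """Convert a gain in dB to an amplitude scale factor."""
    return 10.0 ** (db / 20.0)
```

For the example field above, &amp;quot;-573&amp;quot; corresponds to -573/256 = -2.23828125 dB.&lt;br /&gt;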
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature values allow codec identification and are human readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;channel mapping&#039;&#039;&#039; and &#039;&#039;&#039;number of channels&#039;&#039;&#039;&lt;br /&gt;
The channel mapping byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  The number of channels byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
Each channel mapping byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the contents consist of a single Opus stream that is stereo iff number of channels == 2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping values (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with channel mapping 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
Note: Any Opus stream may be decoded as mono (single output) or stereo (two outputs), regardless of its contents, by appropriate initialization of the decoder.  The &amp;quot;number of two-output streams&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining decoders should be initialized in mono mode.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal samplerates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encode input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
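&lt;br /&gt;
The selection procedure above could be written as the following Python sketch. The function name and the tuple return value are hypothetical; the list of supported rates is the set of internal sample rates named earlier in this section.&lt;br /&gt;

```python
SUPPORTED_RATES = (8000, 12000, 16000, 24000, 48000)


def choose_decode_rate(hardware_rates):
    """Return (decode_rate, needs_resample) for a non-empty set of hardware rates in Hz."""
    if 48000 in hardware_rates:
        return 48000, False
    highest = max(hardware_rates)
    if highest in SUPPORTED_RATES:
        return highest, False
    if 48000 > highest:
        # Decode at the next higher supported rate, then resample down.
        return min(r for r in SUPPORTED_RATES if r > highest), True
    return 48000, True
```

For example, hardware limited to 22050 Hz would decode at 24 kHz and resample, while hardware offering only 96 kHz would decode at 48 kHz and resample.&lt;br /&gt;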
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
It is necessary to describe the number of streams so the decoder can correctly parse the packed frames inside the packet. We store the count-minus-one here, to rule out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
== Other implementation notes ==&lt;br /&gt;
As [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2 in Ogg Vorbis], a granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process. &lt;br /&gt;
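&lt;br /&gt;
A minimal sketch of the truncation arithmetic, assuming the decoder tracks the granule position of the previous page and the number of 48 kHz samples the packets on the final page would normally produce. The function and parameter names are hypothetical, and interaction with pre-skip is not modelled.&lt;br /&gt;

```python
def trailing_samples_to_discard(prev_granulepos, final_page_samples, final_granulepos):
    """Number of samples to drop from the end of the decoded output.

    prev_granulepos: granule position of the page before the final page
    final_page_samples: samples the final page would normally decode to
    final_granulepos: granule position declared on the final page
    """
    would_end_at = prev_granulepos + final_page_samples
    # A declared position at or beyond the natural end means nothing is trimmed.
    return max(0, would_end_at - final_granulepos)
```

For instance, if the previous page ended at 96000, the final page decodes to 960 samples, and the final granule position is 96500, then 460 trailing samples are discarded.&lt;br /&gt;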
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12986</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12986"/>
		<updated>2011-08-23T18:21:54Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: clarify wording regarding volume adjustment&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [http://tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodeable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - channel mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if channel mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039;(8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c-1]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent through the file)&lt;br /&gt;
&lt;br /&gt;
All two-output streams come first, so if the stream index is &lt; 2*M, decode
the (index/2)th Opus stream as stereo, selecting the (index%2)th output
(left for even, right for odd). If index &gt;= 2*M, decode the (index - M)th stream
as mono and use that as the output. As a special case, a stream index of 255
means to write silence to that output channel. &lt;br /&gt;
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  If present it MUST correctly represent the R128 normalization gain (relative to the OpusHead output gain).  If a player chooses to make use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and write &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature values allow codec identification and are human readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;channel mapping&#039;&#039;&#039; and &#039;&#039;&#039;number of channels&#039;&#039;&#039;&lt;br /&gt;
The channel mapping byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  The number of channels byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
Each channel mapping byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the contents consist of a single Opus stream that is stereo iff number of channels == 2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping values (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with channel mapping 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
Note: Any Opus stream may be decoded as mono (single output) or stereo (two outputs), regardless of its contents, by appropriate initialization of the decoder.  The &amp;quot;number of two-output streams&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining decoders should be initialized in mono mode.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, the adjustment MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and instead apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
The number of streams must be described so that the decoder can correctly parse the packed frames inside each packet. We store the count minus one here, which rules out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12985</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12985"/>
		<updated>2011-08-18T20:56:23Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Fix list syntax&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - channel mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if channel mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c-1]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent through the file)&lt;br /&gt;
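&lt;br /&gt;
As a rough illustration of the field list above, the following Python sketch reads an OpusHead packet into a dictionary. It assumes little-endian byte order for the multi-byte fields, which this draft does not state; the function name and dictionary keys are hypothetical and only serve the example.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_opus_head(data):
    """Parse an OpusHead packet laid out as in the draft field list.

    Assumption: multi-byte fields are little-endian (the draft text
    does not state a byte order).
    """
    if data[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    # The fixed part after the signature is 11 bytes long.
    if len(data[8:19]) != 11:
        raise ValueError("truncated OpusHead packet")
    head = {
        "version": data[8],
        "channels": data[9],
        "pre_skip": int.from_bytes(data[10:12], "little"),
        "input_rate": int.from_bytes(data[12:16], "little"),
        # Signed Q7.8 value in dB, kept as the raw integer here.
        "output_gain_q8": int.from_bytes(data[16:18], "little", signed=True),
        "mapping": data[18],
    }
    if head["mapping"] != 0:
        # Stream counts and one stream index per output channel follow.
        c = head["channels"]
        table = data[21:21 + c]
        if len(data[19:21]) != 2 or len(table) != c:
            raise ValueError("truncated channel mapping table")
        head["streams"] = data[19]
        head["two_output_streams"] = data[20]
        head["stream_index"] = list(table)
    return head
```
&lt;br /&gt;
With channel mapping 0 the packet ends after the mapping byte, so the three extra keys are only present for the other mapping values.&lt;br /&gt;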
&lt;br /&gt;
All two-output streams come first, so if the stream index is &amp;lt; 2*M,&lt;br /&gt;
decode the (index/2)th Opus stream as stereo and select its (index%2)th output&lt;br /&gt;
(left for even, right for odd). If index &amp;gt;= 2*M, decode the (index - M)th stream&lt;br /&gt;
as mono and use that as the output. As a special case, a stream index of 255&lt;br /&gt;
means to write silence to that output channel. &lt;br /&gt;
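&lt;br /&gt;
The routing rule above can be sketched in a few lines of Python. This is only an illustration under the stated rule; the function name and the shape of the return value are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def route_channel(index, m):
    """Map a stream index from the OpusHead table to its source.

    m is the number of two-output (stereo) streams. Returns None for
    silence, else (stream_number, decoder_channels, output_number).
    """
    if index == 255:
        # Special case: this output channel is silent.
        return None
    if index in range(2 * m):
        # Stereo streams come first: even index is left, odd is right.
        stream, output = divmod(index, 2)
        return stream, 2, output
    # Remaining indices address mono streams, one output each.
    return index - m, 1, 0
```
&lt;br /&gt;
For example, with M = 2, index 3 selects the right output of stream 1, and index 4 selects the single output of stream 2, the first mono stream.&lt;br /&gt;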
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  When a player makes use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain, and MUST correctly represent the R128 normalization gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and set &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
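&lt;br /&gt;
A minimal sketch of how a player might read the field and combine it with the OpusHead output gain, assuming both are Q7.8 values in dB as described above. The function names are hypothetical, and the strictness of the parsing is one possible reading of the rules, not a required implementation.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_r128_track_gain(comment):
    """Return the gain in dB from an R128_TRACK_GAIN=... comment string."""
    name, sep, text = comment.partition("=")
    # Vorbis comment field names are case-insensitive.
    if name.upper() != "R128_TRACK_GAIN" or sep != "=":
        raise ValueError("not an R128_TRACK_GAIN comment")
    # Allow one leading sign, then require plain ASCII digits only.
    digits = text[1:] if text[:1] in ("+", "-") else text
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("value must be a plain ASCII integer")
    q8 = int(text)
    if q8 not in range(-32768, 32768):
        raise ValueError("value out of range")
    # Q7.8 fixed point: 256 steps per dB.
    return q8 / 256.0


def total_linear_gain(output_gain_q8, track_gain_q8):
    """Linear amplitude factor for the header gain plus the track gain."""
    total_db = (output_gain_q8 + track_gain_q8) / 256.0
    return 10 ** (total_db / 20)
```
&lt;br /&gt;
The example value -573 corresponds to -573/256 dB, a little more than 2 dB of attenuation. Because the two gains are in dB, applying one in addition to the other is a simple sum before converting to a linear factor.&lt;br /&gt;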
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature values allow codec identification and are human-readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;channel mapping&#039;&#039;&#039; and &#039;&#039;&#039;number of channels&#039;&#039;&#039;&lt;br /&gt;
The channel mapping byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  The number of channels byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
Each channel mapping byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
** Family 0 (RTP mapping)&lt;br /&gt;
*** Allowed numbers of channels: 1 or 2&lt;br /&gt;
*** 1 channel: monophonic (mono)&lt;br /&gt;
*** 2 channels: stereo (left, right)&lt;br /&gt;
*** &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo iff number of channels == 2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
** Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
*** Allowed numbers of channels: 1 ... 8&lt;br /&gt;
*** Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
** Family 255 (no defined channel meaning)&lt;br /&gt;
*** Allowed numbers of channels: 1...255&lt;br /&gt;
*** Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping values (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with channel mapping 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
Note: Any Opus stream may be decoded as mono (single output) or stereo (two outputs), regardless of its contents, by appropriate initialization of the decoder.  The &amp;quot;number of two-output streams&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining decoders should be initialized in mono mode.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  Any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
** If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
** else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
** else decode at 48 kHz and resample.&lt;br /&gt;
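&lt;br /&gt;
The four steps above can be sketched as follows. The list of supported rates comes from the internal sample rates named earlier; the function name and the (rate, needs resampling) return value are hypothetical choices for this example.&lt;br /&gt;
&lt;br /&gt;
```python
import bisect

# Rates the reference decoder can decode to, in Hz, sorted ascending.
OPUS_RATES = [8000, 12000, 16000, 24000, 48000]


def choose_decode_rate(hardware_rates):
    """Return (decode_rate, needs_resample) following the four steps."""
    if 48000 in hardware_rates:
        return 48000, False
    highest = max(hardware_rates)
    if highest in OPUS_RATES:
        return highest, False
    # Index of the first supported rate above the hardware maximum.
    i = bisect.bisect_right(OPUS_RATES, highest)
    if i != len(OPUS_RATES):
        return OPUS_RATES[i], True
    # Hardware only runs faster than 48 kHz: decode at 48 kHz.
    return 48000, True
```
&lt;br /&gt;
For instance, hardware limited to 22050 Hz would decode at 24 kHz and resample down, while hardware offering only 96 kHz would decode at 48 kHz and resample up.&lt;br /&gt;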
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
The number of streams must be described so that the decoder can correctly parse the packed frames inside each packet. We store the count minus one here, which rules out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12984</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12984"/>
		<updated>2011-08-18T20:53:22Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Cleanup and clarify text&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - channel mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if channel mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c-1]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent through the file)&lt;br /&gt;
&lt;br /&gt;
All two-output streams come first, so if the stream index is &amp;lt; 2*M,&lt;br /&gt;
decode the (index/2)th Opus stream as stereo and select its (index%2)th output&lt;br /&gt;
(left for even, right for odd). If index &amp;gt;= 2*M, decode the (index - M)th stream&lt;br /&gt;
as mono and use that as the output. As a special case, a stream index of 255&lt;br /&gt;
means to write silence to that output channel. &lt;br /&gt;
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  When a player makes use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain, and MUST correctly represent the R128 normalization gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and set &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature values allow codec identification and are human-readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;channel mapping&#039;&#039;&#039; and &#039;&#039;&#039;number of channels&#039;&#039;&#039;&lt;br /&gt;
The channel mapping byte indicates the order and semantic meaning of the various channels encoded in each Opus packet.  The number of channels byte specifies the number of output channels (1...255) for this Ogg Opus stream.&lt;br /&gt;
&lt;br /&gt;
Each channel mapping byte indicates a &#039;&#039;mapping family&#039;&#039;, which defines a set of allowed numbers of channels, and the ordered set of channel names for each allowed number of channels.  Currently there are three defined mapping families, although more may be added:&lt;br /&gt;
&lt;br /&gt;
- Family 0 (RTP mapping)&lt;br /&gt;
-- Allowed numbers of channels: 1 or 2&lt;br /&gt;
-- 1 channel: monophonic (mono)&lt;br /&gt;
-- 2 channels: stereo (left, right)&lt;br /&gt;
-- &#039;&#039;&#039;Special mapping&#039;&#039;&#039;: this channel mapping value also indicates that the content consists of a single Opus stream that is stereo iff number of channels == 2, with stream index 0 mapped to channel 0, and (if stereo) stream index 1 mapped to channel 1.  When the channel mapping byte has this value, no further fields are present in OpusHead.&lt;br /&gt;
- Family 1 ([http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9 Vorbis mapping])&lt;br /&gt;
-- Allowed numbers of channels: 1 ... 8&lt;br /&gt;
-- Channel meanings depend on the number of channels, see the Vorbis mapping for details.&lt;br /&gt;
- Family 255 (no defined channel meaning)&lt;br /&gt;
-- Allowed numbers of channels: 1...255&lt;br /&gt;
-- Channels are unidentified.  General-purpose players SHOULD NOT attempt to play these streams, and offline decoders MAY deinterleave the output into separate PCM files, one per channel.  Decoders SHOULD NOT produce output for channels mapped to stream index 255 (pure silence) unless they have no other way to indicate the index of non-silent channels.&lt;br /&gt;
&lt;br /&gt;
The remaining channel mapping values (2...254) are reserved.  A decoder encountering a reserved mapping byte should act as though the mapping byte is 255.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player MUST play any Ogg Opus stream with channel mapping 0 or 1, even if the number of channels does not match the physically connected audio hardware.  Players SHOULD perform channel mixing to increase or reduce the number of channels as needed.&lt;br /&gt;
&lt;br /&gt;
Note: Any Opus stream may be decoded as mono (single output) or stereo (two outputs), regardless of its contents, by appropriate initialization of the decoder.  The &amp;quot;number of two-output streams&amp;quot; field indicates that the first M Opus decoders should be initialized in stereo mode, and the remaining decoders should be initialized in mono mode.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback.&lt;br /&gt;
&lt;br /&gt;
The purpose of pre-skip is to allow a time-segment of an existing Opus stream to be saved as an independent Ogg file, with single-sample time granularity, without re-encoding.  Opus is an asymptotically convergent predictive codec, so the decoded contents of each frame depend on the recent history of decoder inputs.  Pre-skip can be used to provide sufficient history to the decoder so that it has already converged before the stream&#039;s output begins.&lt;br /&gt;
&lt;br /&gt;
When constructing cropped Ogg Opus streams, we recommend a pre-skip of at least &#039;&#039;&#039;FIXME&#039;&#039;&#039; samples to ensure complete convergence.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  Any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, with internal sample rates of 8, 12, 16, 24, and 48 kHz. Each packet in the stream may have a different internal sample rate. Regardless of the internal sample rate, the reference decoder supports decoding any stream to any of these sample rates.  The original sample rate of the encoder input is not preserved by the lossy compression.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:&lt;br /&gt;
- If the hardware supports 48 kHz playback, decode at 48 kHz&lt;br /&gt;
- else if the hardware&#039;s highest available sample rate is a supported rate, decode at this sample rate&lt;br /&gt;
- else if the hardware&#039;s highest available sample rate is less than 48 kHz, decode at the next higher supported rate and resample&lt;br /&gt;
- else decode at 48 kHz and resample.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata.  This may be useful when the user requires the output sample rate to match the input sample rate.  For example, a non-player decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
The number of streams must be described so that the decoder can correctly parse the packed frames inside each packet. We store the count minus one here, which rules out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus/testvectors&amp;diff=12982</id>
		<title>OggOpus/testvectors</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus/testvectors&amp;diff=12982"/>
		<updated>2011-08-17T20:53:43Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page lists test vectors needed for OggOpus that are specific to the Ogg mapping (separate from the Opus bitstream test vectors, though they do some bitstream testing as a side effect).&lt;br /&gt;
&lt;br /&gt;
* All test vectors should be chained files with at least two parts&lt;br /&gt;
** Chained file where the second link has no pregap and starts with inter frames (to ensure that decoder state is reset)&lt;br /&gt;
* Pre-skip (set large pre-skip with a chime &amp;quot;if you just heard a chime, your player is broken&amp;quot;)&lt;br /&gt;
* Multichannel&lt;br /&gt;
** Multichannel stereo (e.g. mono+mono)&lt;br /&gt;
** Multichannel w/pre-skip and random channel maps&lt;br /&gt;
** Multichannel with silent channels&lt;br /&gt;
*** Totally silent multichannel  (Should this one be invalid?)&lt;br /&gt;
** Multichannel with repeated channels (i.e. one stream used for multiple channels)&lt;br /&gt;
** Multichannel with 256 channels&lt;br /&gt;
** Mapping tests for the Vorbis mappings (e.g. name of the speaker spoken by each speaker)&lt;br /&gt;
* Files with crazy input rate.&lt;br /&gt;
* Header-gain set very high with a very quiet input (silent if you don&#039;t implement header gain).&lt;br /&gt;
* Header-gain set very low with an input that will clip a decoder if the header gain is not done internally.&lt;br /&gt;
* Header-gain set very low, and R128_TRACK_GAIN to normalize it&lt;br /&gt;
** matching WAV outputs ... but matching to what?&lt;br /&gt;
* Single packet per page&lt;br /&gt;
* Utterly stuffed pages with constant continued pages&lt;br /&gt;
* Pages whose contents are entirely and partially dropped frames (len=0) (maybe redundant with bitstream tests)&lt;br /&gt;
* Files with chimes after the end (testing end length chopping)&lt;br /&gt;
* File with all opus modes and frame sizes&lt;br /&gt;
* Stereo files using many mono frames at the beginning/end&lt;br /&gt;
* OpusTags comment values containing very large nonsense comments, duplicate comment values etc.&lt;br /&gt;
&lt;br /&gt;
=== Illegal test vectors that MUST fail ===&lt;br /&gt;
* Zero streams (N=0)&lt;br /&gt;
* Too many two-output streams&lt;br /&gt;
** M&amp;gt;N&lt;br /&gt;
** M&amp;lt;=N but M+N&amp;gt;255&lt;br /&gt;
* Channels mapped to nonexistent stream indices (255 &amp;gt; index &amp;gt;= M+N)&lt;br /&gt;
* Illegal OpusTags comments&lt;br /&gt;
** Total length larger or shorter than the packet&lt;br /&gt;
** Illegal field names&lt;br /&gt;
** Illegal field contents&lt;br /&gt;
** Illegal field (no &amp;quot;=&amp;quot;)&lt;br /&gt;
** Multiple R128_TRACK_GAIN comments (should this be required to fail?)&lt;br /&gt;
** R128_TRACK_GAIN comments containing illegal values (should this be required to fail?)&lt;br /&gt;
*** Non-ASCII encodings of correct-looking values&lt;br /&gt;
* All GP==0&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus/testvectors&amp;diff=12981</id>
		<title>OggOpus/testvectors</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus/testvectors&amp;diff=12981"/>
		<updated>2011-08-17T20:51:49Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Update proposed tests to match latest header ideas&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page lists test vectors needed for OggOpus that are specific to the Ogg mapping (separate from the Opus bitstream test vectors, though they do some bitstream testing as a side effect).&lt;br /&gt;
&lt;br /&gt;
* All test vectors should be chained files with at least two parts&lt;br /&gt;
** Chained file where the second link has no pregap and starts with inter frames (to ensure that decoder state is reset)&lt;br /&gt;
* Pre-skip (set large pre-skip with a chime &amp;quot;if you just heard a chime, your player is broken&amp;quot;)&lt;br /&gt;
* Multichannel&lt;br /&gt;
** Multichannel stereo (e.g. mono+mono)&lt;br /&gt;
** Multichannel w/pre-skip and random channel maps&lt;br /&gt;
** Multichannel with silent channels&lt;br /&gt;
** Multichannel with repeated channels (i.e. one stream used for multiple channels)&lt;br /&gt;
*** Totally silent multichannel  (Should this one be invalid?)&lt;br /&gt;
** Multichannel with 256 channels&lt;br /&gt;
** Mapping tests for the Vorbis mappings (e.g. name of the speaker spoken by each speaker)&lt;br /&gt;
* Files with crazy input rate.&lt;br /&gt;
* Header-gain set very high with a very quiet input (silent if you don&#039;t implement header gain).&lt;br /&gt;
* Header-gain set very low with an input that will clip a decoder if the header gain is not done internally.&lt;br /&gt;
* Header-gain set very low, and R128_TRACK_GAIN to normalize it&lt;br /&gt;
** matching WAV outputs ... but matching to what?&lt;br /&gt;
* Single packet per page&lt;br /&gt;
* Utterly stuffed pages with constant continued pages&lt;br /&gt;
* Pages whose contents are entirely and partially dropped frames (len=0) (maybe redundant with bitstream tests)&lt;br /&gt;
* Files with chimes after the end (testing end length chopping)&lt;br /&gt;
* File with all opus modes and frame sizes&lt;br /&gt;
* Stereo files using many mono frames at the beginning/end&lt;br /&gt;
* OpusTags comment values containing very large nonsense comments, duplicate comment values etc.&lt;br /&gt;
&lt;br /&gt;
=== Illegal test vectors that MUST fail ===&lt;br /&gt;
* Zero streams (N=0)&lt;br /&gt;
* Too many two-output streams&lt;br /&gt;
** M&amp;gt;N&lt;br /&gt;
** M&amp;lt;=N but M+N&amp;gt;255&lt;br /&gt;
* Channels mapped to nonexistent stream indices (255 &amp;gt; index &amp;gt;= M+N)&lt;br /&gt;
* Illegal OpusTags comments&lt;br /&gt;
** Total length larger or shorter than the packet&lt;br /&gt;
** Illegal field names&lt;br /&gt;
** Illegal field contents&lt;br /&gt;
** Illegal field (no &amp;quot;=&amp;quot;)&lt;br /&gt;
** Multiple R128_TRACK_GAIN comments (should this be required to fail?)&lt;br /&gt;
** R128_TRACK_GAIN comments containing illegal values (should this be required to fail?)&lt;br /&gt;
*** Non-ASCII encodings of correct-looking values&lt;br /&gt;
*All GP==0&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12979</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12979"/>
		<updated>2011-08-17T20:25:11Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Change to fixed-point R128_TRACK_GAIN&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodeable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - stream mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if stream mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039;(8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent through the file)&lt;br /&gt;
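&lt;br /&gt;
As a rough illustration, the id header layout above could be read with a Python sketch like the following. The draft does not state a byte order, so little-endian is an assumption here, as is reading exactly one stream index per output channel; the function name and dictionary keys are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_opus_head(packet):
    """Parse a draft OpusHead id header (bytes) into a dict.

    Assumption: multi-byte fields are little-endian; the draft text
    does not specify a byte order.
    """
    if packet[0:8] != b"OpusHead":
        raise ValueError("missing OpusHead signature")
    if len(packet) in range(8, 19):
        raise ValueError("truncated id header")
    head = {
        "version": packet[8],
        "channels": packet[9],
        "pre_skip": int.from_bytes(packet[10:12], "little"),
        "input_rate": int.from_bytes(packet[12:16], "little"),
        # Output gain is a signed Q7.8 value in dB.
        "output_gain": int.from_bytes(packet[16:18], "little", signed=True),
        "mapping": packet[18],
    }
    if head["channels"] == 0:
        raise ValueError("channel count must be nonzero")
    if head["mapping"] != 0:
        # The stream table is only present when stream mapping is nonzero.
        table = packet[19:]
        if len(table) in range(2 + head["channels"]):
            raise ValueError("truncated stream table")
        head["streams"] = table[0]         # N
        head["stereo_streams"] = table[1]  # M
        # One stream index per output channel.
        head["stream_index"] = list(table[2:2 + head["channels"]])
    return head
```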
&lt;br /&gt;
All two-output streams come first, so if the stream index is &lt; 2*M,&lt;br /&gt;
decode the (index/2)th Opus stream as stereo, selecting the (index%2)th output&lt;br /&gt;
(left for even, right for odd). If index &amp;gt;= 2*M, decode the (index - M)th stream&lt;br /&gt;
as mono and use that as the output. As a special case, a stream index of 255&lt;br /&gt;
means to write silence to that output channel. &lt;br /&gt;
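&lt;br /&gt;
The index rule in the paragraph above can be sketched in Python as follows. This is only an illustration: the function name and the returned tuple shape are hypothetical, and &quot;m&quot; stands for the number of two-output streams M from the id header.&lt;br /&gt;
&lt;br /&gt;
```python
def resolve_channel(index, m):
    """Map a stream index to (stream number, decode mode, output side).

    m is the number of two-output (stereo) streams M.
    Returns None when the output channel should be silent.
    """
    if index == 255:
        return None  # special case: write silence to this channel
    if index in range(2 * m):
        # Stereo streams come first and contribute two outputs each.
        side = "left" if index % 2 == 0 else "right"
        return (index // 2, "stereo", side)
    # Remaining indices address the mono streams that follow.
    return (index - m, "mono", "mono")
```
&lt;br /&gt;
With M = 2, indices 0 to 3 select the left and right outputs of stereo streams 0 and 1, while index 4 selects mono stream 2.&lt;br /&gt;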
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-573  &lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  The gain is a Q7.8 fixed point number in dB, as in the OpusHead &amp;quot;output gain&amp;quot; field.  This field acts similarly to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.&lt;br /&gt;
&lt;br /&gt;
An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be an integer from -32768 to +32767 inclusive, represented in ASCII with no whitespace.  When a player makes use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain, and MUST correctly represent the R128 normalization gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the R128 gain into the OpusHead output gain and set &amp;quot;R128_TRACK_GAIN=0&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
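&lt;br /&gt;
A minimal Python sketch of the value check and of combining the two gains is shown below. Both function names are hypothetical, and accepting an optional leading sign is an assumption about what &quot;an integer represented in ASCII&quot; permits. The conversion from dB to a linear amplitude factor uses the usual 10^(dB/20) relation.&lt;br /&gt;
&lt;br /&gt;
```python
import re


def parse_track_gain(value):
    """Validate an R128_TRACK_GAIN value and return it as an integer."""
    # Assumption: an optional sign followed by ASCII digits. int() alone
    # would also accept whitespace, underscores and non-ASCII digits.
    if re.fullmatch(r"[+-]?[0-9]+", value) is None:
        raise ValueError("not a plain ASCII integer")
    gain = int(value)
    if gain not in range(-32768, 32768):
        raise ValueError("outside the signed 16-bit range")
    return gain


def playback_scale(output_gain, track_gain=0):
    """Linear amplitude factor for the combined Q7.8 gains (in dB)."""
    # The track gain is applied in addition to the OpusHead output gain.
    total_db = (output_gain + track_gain) / 256.0
    return 10.0 ** (total_db / 20.0)
```
&lt;br /&gt;
For the example value above, -573 corresponds to -573/256, or about -2.24 dB.&lt;br /&gt;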
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The signature magic values allow codec identification and are human readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream mapping&#039;&#039;&#039;&lt;br /&gt;
We want to support multichannel. This defines the order and semantic meaning of the various channels encoded in each Opus packet.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. Or, when routing multitrack audio between mixing boards, it helps to be able to flag which instruments should be treated as mono and which are stereo.&lt;br /&gt;
&lt;br /&gt;
We don&#039;t need 8 bits of separate channel meanings, so if we want to make it easier to parse the number of channels, we can make that part of some of the stream mappings: 0 = mono, 1 = stereo, 2 = 5.1 in vorbis order, 3 = 6.0 in some order, 4 = 7.1, etc.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback. The idea is to mitigate transients, and to allow sample-accurate editing through Ogg chaining.&lt;br /&gt;
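&lt;br /&gt;
One way a player might apply pre-skip is sketched below in Python. It assumes the decoder hands back per-channel-interleaving-free frames as sequences of 48 kHz samples; the function name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def apply_pre_skip(frames, pre_skip):
    """Yield decoded frames with the first pre_skip samples discarded."""
    remaining = pre_skip
    for frame in frames:
        # Drop as much of this frame as is still owed to the pre-skip.
        drop = min(remaining, len(frame))
        remaining -= drop
        if drop != len(frame):
            yield frame[drop:]
```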
&lt;br /&gt;
* &#039;&#039;&#039;output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  Any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
Note that although the output gain has enormous range (+/- 128 dB, enough to amplify inaudible sounds to the threshold of physical pain), most applications can only reasonably use a small portion of this range around zero.  The large range serves in part to ensure that gain can always be losslessly transferred between OpusHead and R128_TRACK_GAIN without saturating.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, supporting 8, 12, 16, 24, and 48 kHz signals. Which mode is chosen can be switched dynamically from packet to packet in the stream, but the reference decoder can generate output at any of those sample rates from the compressed data. Fidelity to the original sample rate of the encode input is not preserved by the lossy compression. Therefore, if the playback system supports one of those modes natively, &#039;&#039;the best option is to not resample&#039;&#039; but to play back directly at 48 kHz for best quality regardless of the value of this field.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata. We felt this could be useful downstream, and as something intended for machine consumption, didn&#039;t belong in the tag header. For example, a decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
It is necessary to describe the number of streams so the decoder can correctly parse the packed frames inside the packet. We store the count-minus-one here, to rule out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream description&#039;&#039;&#039;&lt;br /&gt;
For each Opus stream framed into the ogg packets of this logical bitstream, we define whether to decode it as mono or stereo, and give a channel index for how it should be mapped to playback. The semantic meaning of each channel index is defined by the &#039;&#039;stream mapping&#039;&#039; byte. E.g. it might be LEFT_REAR or CENTER.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. So we want to be able to say:&lt;br /&gt;
&lt;br /&gt;
 (stream mapping: vorbis channel order)&lt;br /&gt;
 stream 0: stereo: LEFT_FRONT, RIGHT_FRONT&lt;br /&gt;
 stream 1: mono: CENTER&lt;br /&gt;
 stream 2: stereo: LEFT_REAR, RIGHT_REAR&lt;br /&gt;
 stream 3: mono: LFE&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12978</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12978"/>
		<updated>2011-08-17T19:51:13Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Minor gain-related clarifications&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodeable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - stream mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if stream mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039;(8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent through the file)&lt;br /&gt;
&lt;br /&gt;
All two-output streams come first, so if the stream index is &lt; 2*M,&lt;br /&gt;
decode the (index/2)th Opus stream as stereo, selecting the (index%2)th output&lt;br /&gt;
(left for even, right for odd). If index &amp;gt;= 2*M, decode the (index - M)th stream&lt;br /&gt;
as mono and use that as the output. As a special case, a stream index of 255&lt;br /&gt;
means to write silence to that output channel. &lt;br /&gt;
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-7.03 dB&lt;br /&gt;
representing the volume shift needed to normalize the track&#039;s volume.  This field is similar to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]], although the normal volume reference is the [http://tech.ebu.ch/loudness EBU-R128] standard.  An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be of the form &amp;quot;[number] dB&amp;quot; in 7-bit ASCII.  When a player makes use of the TRACK_GAIN, it MUST be applied &#039;&#039;in addition&#039;&#039; to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the track gain into the OpusHead output gain and set &amp;quot;R128_TRACK_GAIN=0 dB&amp;quot;.  If a tool modifies the OpusHead &amp;quot;output gain&amp;quot; field, it MUST also update or remove the R128_TRACK_GAIN comment field.&lt;br /&gt;
&lt;br /&gt;
There is no comment field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should instead be stored in the OpusHead &amp;quot;output gain&amp;quot; field.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The signature magic values allow codec identification and are human readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream mapping&#039;&#039;&#039;&lt;br /&gt;
We want to support multichannel. This defines the order and semantic meaning of the various channels encoded in each Opus packet.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. Or, when routing multitrack audio between mixing boards, it helps to be able to flag which instruments should be treated as mono and which are stereo.&lt;br /&gt;
&lt;br /&gt;
We don&#039;t need 8 bits of separate channel meanings, so if we want to make it easier to parse the number of channels, we can make that part of some of the stream mappings: 0 = mono, 1 = stereo, 2 = 5.1 in vorbis order, 3 = 6.0 in some order, 4 = 7.1, etc.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback. The idea is to mitigate transients, and to allow sample-accurate editing through Ogg chaining.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  Any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, MUST be applied &#039;&#039;in addition&#039;&#039; to this output gain in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
An encoder SHOULD set the output gain to zero, and apply any gain prior to encoding, when this is possible and does not conflict with the user&#039;s wishes.  The output gain should only be nonzero when the gain is adjusted after encoding, or when the user wishes to adjust the gain for playback while preserving the ability to recover the original signal amplitude.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, supporting 8, 12, 16, 24, and 48 kHz signals. Which mode is chosen can be switched dynamically from packet to packet in the stream, but the reference decoder can generate output at any of those sample rates from the compressed data. Fidelity to the original sample rate of the encode input is not preserved by the lossy compression. Therefore, if the playback system supports one of those modes natively, &#039;&#039;the best option is to not resample&#039;&#039; but to play back directly at 48 kHz for best quality regardless of the value of this field.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata. We felt this could be useful downstream, and as something intended for machine consumption, didn&#039;t belong in the tag header. For example, a decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
It is necessary to describe the number of streams so the decoder can correctly parse the packed frames inside the packet. We store the count-minus-one here, to rule out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream description&#039;&#039;&#039;&lt;br /&gt;
For each Opus stream framed into the ogg packets of this logical bitstream, we define whether to decode it as mono or stereo, and give a channel index for how it should be mapped to playback. The semantic meaning of each channel index is defined by the &#039;&#039;stream mapping&#039;&#039; byte. E.g. it might be LEFT_REAR or CENTER.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. So we want to be able to say:&lt;br /&gt;
&lt;br /&gt;
 (stream mapping: vorbis channel order)&lt;br /&gt;
 stream 0: stereo: LEFT_FRONT, RIGHT_FRONT&lt;br /&gt;
 stream 1: mono: CENTER&lt;br /&gt;
 stream 2: stereo: LEFT_REAR, RIGHT_REAR&lt;br /&gt;
 stream 3: mono: LFE&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12976</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12976"/>
		<updated>2011-08-17T19:23:10Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: More details regarding output gain&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodeable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - stream mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if stream mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039;(8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent through the file)&lt;br /&gt;
&lt;br /&gt;
All two-output streams come first, so if the stream index is &lt; 2*M,&lt;br /&gt;
decode the (index/2)th Opus stream as stereo, selecting the (index%2)th output&lt;br /&gt;
(left for even, right for odd). If index &amp;gt;= 2*M, decode the (index - M)th stream&lt;br /&gt;
as mono and use that as the output. As a special case, a stream index of 255&lt;br /&gt;
means to write silence to that output channel. &lt;br /&gt;
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-7.03 dB&lt;br /&gt;
similar to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]].  (There is no field corresponding to Replaygain&#039;s ALBUM_GAIN; that information should be stored in the header&#039;s &amp;quot;output gain&amp;quot; field.)  An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be of the form &amp;quot;[number] dB&amp;quot; in 7-bit ASCII.  When a player makes use of the TRACK_GAIN, it MUST be applied in addition to the OpusHead output gain.  If an encoder populates the TRACK_GAIN field, and the output gain is not otherwise constrained or specified, the encoder SHOULD write the track gain into the OpusHead output gain and set &amp;quot;R128_TRACK_GAIN=0 dB&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
To avoid confusion with multiple normalization schemes, an OpusTags packet SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN, REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or REPLAYGAIN_ALBUM_PEAK fields.&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The signature magic values allow codec identification and are human readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream mapping&#039;&#039;&#039;&lt;br /&gt;
We want to support multichannel. This defines the order and semantic meaning of the various channels encoded in each Opus packet.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. Or, when routing multitrack audio between mixing boards, it helps to be able to flag which instruments should be treated as mono and which are stereo.&lt;br /&gt;
&lt;br /&gt;
We don&#039;t need 8 bits of separate channel meanings, so if we want to make it easier to parse the number of channels, we can make that part of some of the stream mappings: 0 = mono, 1 = stereo, 2 = 5.1 in vorbis order, 3 = 6.0 in some order, 4 = 7.1, etc.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback. The idea is to mitigate transients, and to allow sample-accurate editing through Ogg chaining.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  Any volume adjustment or gain modification, such as the R128_TRACK_GAIN or a user-facing volume knob, should be applied in addition to this output gain, in order to achieve playback at the desired volume.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, supporting 8, 12, 16, 24, and 48 kHz signals. Which mode is chosen can be switched dynamically from packet to packet in the stream, but the reference decoder can generate output at any of those sample rates from the compressed data. Fidelity to the original sample rate of the encode input is not preserved by the lossy compression. Therefore, if the playback system supports one of those modes natively, &#039;&#039;the best option is to not resample&#039;&#039; but to play back directly at 48 kHz for best quality regardless of the value of this field.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata. We felt this could be useful downstream, and as something intended for machine consumption, didn&#039;t belong in the tag header. For example, a decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
It is necessary to describe the number of streams so the decoder can correctly parse the packed frames inside the packet. We store the count-minus-one here, to rule out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream description&#039;&#039;&#039;&lt;br /&gt;
For each Opus stream framed into the ogg packets of this logical bitstream, we define whether to decode it as mono or stereo, and give a channel index for how it should be mapped to playback. The semantic meaning of each channel index is defined by the &#039;&#039;stream mapping&#039;&#039; byte. E.g. it might be LEFT_REAR or CENTER.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. So we want to be able to say:&lt;br /&gt;
&lt;br /&gt;
 (stream mapping: vorbis channel order)&lt;br /&gt;
 stream 0: stereo: LEFT_FRONT, RIGHT_FRONT&lt;br /&gt;
 stream 1: mono: CENTER&lt;br /&gt;
 stream 2: stereo: LEFT_REAR, RIGHT_REAR&lt;br /&gt;
 stream 3: mono: LFE&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12975</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12975"/>
		<updated>2011-08-17T19:06:36Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Approaching consensus on gain tags&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodeable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding&lt;br /&gt;
 - stream mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if stream mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c-1]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
All two-output streams come first, so if the stream index is &amp;lt; 2*M,&lt;br /&gt;
decode the (index/2)th Opus stream as stereo, selecting the (index%2)th output&lt;br /&gt;
(left for even, right for odd). If index &amp;gt;= 2*M, decode the (index - M)th stream&lt;br /&gt;
as mono and use that as the output. As a special case, a stream index of 255&lt;br /&gt;
means to write silence to that output channel.&lt;br /&gt;
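&lt;br /&gt;
As a rough sketch of the index rule above (Python, illustrative only; the function name resolve_channel and the sample 5.1 table are assumptions made for this example, not part of the draft):&lt;br /&gt;

```python
def resolve_channel(index, m):
    """Map one stream-index byte to (stream number, output selector).

    m is the number of two-output (stereo) streams, which come first.
    Returns None for the special silence value 255.
    """
    if index == 255:
        return None  # write silence to this output channel
    if index in range(2 * m):
        # Stereo stream: even index selects left, odd index selects right.
        stream, side = divmod(index, 2)
        return (stream, "left" if side == 0 else "right")
    # Mono stream: mono streams are numbered after the m stereo streams.
    return (index - m, "mono")


# Hypothetical 5.1 table in Vorbis channel order (FL, C, FR, RL, RR, LFE),
# using two stereo streams (front pair, rear pair) and two mono streams.
mapping = [0, 4, 1, 2, 3, 5]
for channel, index in enumerate(mapping):
    print(channel, resolve_channel(index, m=2))
```

With the two stereo streams first, the front and rear pairs each share a stream, while CENTER and LFE fall through to mono streams 2 and 3.&lt;br /&gt;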
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
One new comment field is introduced for Ogg Opus:&lt;br /&gt;
 R128_TRACK_GAIN=-7.03 dB&lt;br /&gt;
similar to the [[VorbisComment#Replay_Gain|REPLAYGAIN_TRACK_GAIN field in Vorbis]].  (There is no field corresponding to ReplayGain&#039;s ALBUM_GAIN; that information should be stored in the header&#039;s &amp;quot;output gain&amp;quot; field.)  An Ogg Opus file MUST NOT have more than one such field, and if present its value MUST be of the form &amp;quot;[number] dB&amp;quot; in 7-bit ASCII.&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature values allow codec identification and are human readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream mapping&#039;&#039;&#039;&lt;br /&gt;
We want to support multichannel. This defines the order and semantic meaning of the various channels encoded in each Opus packet.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. Or, when routing multitrack audio between mixing boards, it helps to be able to flag which instruments should be treated as mono and which are stereo.&lt;br /&gt;
&lt;br /&gt;
We don&#039;t need 8 bits of separate channel meanings, so if we want to make it easier to parse the number of channels, we can make that part of some of the stream mappings: 0 = mono, 1 = stereo, 2 = 5.1 in vorbis order, 3 = 6.0 in some order, 4 = 7.1, etc.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback. The idea is to mitigate transients, and to allow sample-accurate editing through Ogg chaining.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;output gain&#039;&#039;&#039;&lt;br /&gt;
This is a gain to be applied by the decoder.  Virtually all players and media frameworks should apply it by default.  When using the R128_TRACK_GAIN, or any other gain modification, players must also apply the output gain in order to achieve playback at the desired volume.&lt;br /&gt;
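&lt;br /&gt;
For illustration, the Q7.8 field can be turned into a linear scale factor as sketched below (Python; the helper names are hypothetical, and the little-endian byte order in read_q78 is an assumption, since the field list above does not state one):&lt;br /&gt;

```python
def q78_to_db(q78):
    """Interpret a signed 16-bit Q7.8 value as decibels (8 fractional bits)."""
    return q78 / 256.0


def db_to_linear(gain_db):
    """Convert a gain in dB to an amplitude scale factor."""
    return 10.0 ** (gain_db / 20.0)


def read_q78(field_bytes, byteorder="little"):
    """Read the 2-byte field as a signed integer (byte order is assumed)."""
    return int.from_bytes(field_bytes, byteorder, signed=True)


def playback_scale(output_gain_q78, track_gain_db=0.0):
    """Output gain always applies; a tag gain such as R128_TRACK_GAIN adds to it."""
    return db_to_linear(q78_to_db(output_gain_q78) + track_gain_db)


# Example: a header gain of -1.5 dB combined with a tag gain of -7.03 dB.
raw = (-384).to_bytes(2, "little", signed=True)
print(playback_scale(read_q78(raw), track_gain_db=-7.03))
```

Summing the two gains in dB before converting mirrors the requirement that players using R128_TRACK_GAIN still apply the output gain.&lt;br /&gt;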
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, supporting 8, 12, 16, 24, and 48 kHz signals. Which mode is chosen can be switched dynamically from packet to packet in the stream, but the reference decoder can generate output at any of those sample rates from the compressed data. Fidelity to the original sample rate of the encode input is not preserved by the lossy compression. Therefore, if the playback system supports one of those modes natively, &#039;&#039;the best option is to not resample&#039;&#039; but to play back directly at 48 kHz for best quality regardless of the value of this field.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata. We felt this could be useful downstream, and as something intended for machine consumption, didn&#039;t belong in the tag header. For example, a decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
It is necessary to describe the number of streams so the decoder can correctly parse the packed frames inside the packet. We store the count-minus-one here, to rule out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream description&#039;&#039;&#039;&lt;br /&gt;
For each Opus stream framed into the ogg packets of this logical bitstream, we define whether to decode it as mono or stereo, and give a channel index for how it should be mapped to playback. The semantic meaning of each channel index is defined by the &#039;&#039;stream mapping&#039;&#039; byte. E.g. it might be LEFT_REAR or CENTER.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. So we want to be able to say:&lt;br /&gt;
&lt;br /&gt;
 (stream mapping: vorbis channel order)&lt;br /&gt;
 stream 0: stereo: LEFT_FRONT, RIGHT_FRONT&lt;br /&gt;
 stream 1: mono: CENTER&lt;br /&gt;
 stream 2: stereo: LEFT_REAR, RIGHT_REAR&lt;br /&gt;
 stream 3: mono: LFE&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12972</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12972"/>
		<updated>2011-08-11T20:12:52Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: /* Built-in R128 */ Clarification&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodeable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding (under discussion)&lt;br /&gt;
 - stream mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if stream mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c-1]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
All two-output streams come first, so if the stream index is &amp;lt; 2*M,&lt;br /&gt;
decode the (index/2)th Opus stream as stereo, selecting the (index%2)th output&lt;br /&gt;
(left for even, right for odd). If index &amp;gt;= 2*M, decode the (index - M)th stream&lt;br /&gt;
as mono and use that as the output. As a special case, a stream index of 255&lt;br /&gt;
means to write silence to that output channel.&lt;br /&gt;
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature values allow codec identification and are human readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream mapping&#039;&#039;&#039;&lt;br /&gt;
We want to support multichannel. This defines the order and semantic meaning of the various channels encoded in each Opus packet.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. Or, when routing multitrack audio between mixing boards, it helps to be able to flag which instruments should be treated as mono and which are stereo.&lt;br /&gt;
&lt;br /&gt;
We don&#039;t need 8 bits of separate channel meanings, so if we want to make it easier to parse the number of channels, we can make that part of some of the stream mappings: 0 = mono, 1 = stereo, 2 = 5.1 in vorbis order, 3 = 6.0 in some order, 4 = 7.1, etc.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback. The idea is to mitigate transients, and to allow sample-accurate editing through Ogg chaining.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, supporting 8, 12, 16, 24, and 48 kHz signals. Which mode is chosen can be switched dynamically from packet to packet in the stream, but the reference decoder can generate output at any of those sample rates from the compressed data. Fidelity to the original sample rate of the encode input is not preserved by the lossy compression. Therefore, if the playback system supports one of those modes natively, &#039;&#039;the best option is to not resample&#039;&#039; but to play back directly at 48 kHz for best quality regardless of the value of this field.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata. We felt this could be useful downstream, and as something intended for machine consumption, didn&#039;t belong in the tag header. For example, a decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
It is necessary to describe the number of streams so the decoder can correctly parse the packed frames inside the packet. We store the count-minus-one here, to rule out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream description&#039;&#039;&#039;&lt;br /&gt;
For each Opus stream framed into the ogg packets of this logical bitstream, we define whether to decode it as mono or stereo, and give a channel index for how it should be mapped to playback. The semantic meaning of each channel index is defined by the &#039;&#039;stream mapping&#039;&#039; byte. E.g. it might be LEFT_REAR or CENTER.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. So we want to be able to say:&lt;br /&gt;
&lt;br /&gt;
 (stream mapping: vorbis channel order)&lt;br /&gt;
 stream 0: stereo: LEFT_FRONT, RIGHT_FRONT&lt;br /&gt;
 stream 1: mono: CENTER&lt;br /&gt;
 stream 2: stereo: LEFT_REAR, RIGHT_REAR&lt;br /&gt;
 stream 3: mono: LFE&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;br /&gt;
&lt;br /&gt;
== Proposals for handling Opus gain and/or ReplayGain ==&lt;br /&gt;
=== Built-in R128 ===&lt;br /&gt;
Three Q7.9 dB fields consisting of&lt;br /&gt;
# The &amp;quot;file normalization gain&amp;quot;, i.e. the gain needed to make this file R128 normal. (a la REPLAYGAIN_TRACK)&lt;br /&gt;
# The &amp;quot;program normalization gain&amp;quot;, i.e. the gain that will make some set of files normal. (a la REPLAYGAIN_ALBUM).  Most players should use this by default.&lt;br /&gt;
# The &amp;quot;raw gain&amp;quot;, i.e. a gain to apply if the goal is to reproduce the input faithfully.  The true raw gain needed to reproduce the input in a sane encoder is always zero, so the real purpose of this gain is to provide adjustment capability even when decoding through a non-player system that will lose these tags.  For example, software like ffmpeg might desire to provide faithful input reproduction for transcoding, and might also fail to expose any parameter for users to activate normalization instead.  If the newly encoded file also loses the normalization information, then it will play back at the wrong (i.e. unmodified) volume.  If ffmpeg supports raw gain, then at least the user may manipulate the input&#039;s raw gain so that the tagless transcoded file has appropriate raw volume.  (Note, however, that ffmpeg also provides command-line volume adjustment knobs, so this utility is only relevant if it is more convenient to modify the input header than to modify the transcoding invocation.)  Similar arguments may apply to web browsers and other frameworks that (a) may be used for both input reproduction and playback, and (b) do not provide convenient parameters to the user to activate normalization ... although it would typically be sufficient for these frameworks to operate with program normalization, which would also be the library default, and possibly the opusdec default behavior.&lt;br /&gt;
&lt;br /&gt;
The exact normalization scheme might be noted in a comment in OpusTags, for the sake of keeping up with trends in volume measurement algorithms.&lt;br /&gt;
&lt;br /&gt;
For the sake of sanity, in this proposal it would probably be best to outlaw ReplayGain tags altogether, although we have not reproduced the clipping-level functionality.&lt;br /&gt;
&lt;br /&gt;
=== Built-in ReplayGain ===&lt;br /&gt;
Include in the Opus header a section for the 4 standard ReplayGain values, stored as binary numbers with a value reserved for &amp;quot;unset&amp;quot;.  Generalize the &amp;quot;Album gain&amp;quot; to &amp;quot;expected listening context gain&amp;quot; for items that do not come from albums.  Include a complete ReplayGain implementation in the Ogg-Opus library so that players get support that is on by default (but can be turned off for WAV output).  Mandate that decoders must ignore any replaygain comment fields, and encoders must not add them.&lt;br /&gt;
=== Compatible mandatory ReplayGain ===&lt;br /&gt;
Preserve the VorbisComment REPLAYGAIN tags, but make support mandatory and provide an implementation, so that new decoders may simply hand over all header packets to the library and get replaygain support for free.&lt;br /&gt;
=== Inseparable OpusGain ===&lt;br /&gt;
Include a header field that encodes a gain.  This gain is regarded as part of the bitstream, and the data is never decoded without applying this gain.&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12971</id>
		<title>OggOpus</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=OggOpus&amp;diff=12971"/>
		<updated>2011-08-11T20:04:49Z</updated>

		<summary type="html">&lt;p&gt;Bemasc: Add description of latest proposal&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ogg mapping for Opus ==&lt;br /&gt;
&lt;br /&gt;
The IETF Opus codec is a low-latency audio codec optimized for both voice and general-purpose audio. See [tools.ietf.org/html/draft-ietf-codec-opus the spec] for technical details.&lt;br /&gt;
&lt;br /&gt;
Almost everything about this codec is either fixed or dynamically switchable, so the usual id and setup header parameters in the header packets of an Ogg encapsulation aren&#039;t useful. In particular, bitrate, frame size, mono/stereo, and coding modes are all dynamically switchable from packet to packet. A one-byte header on each data packet defines the parameters for that particular packet.&lt;br /&gt;
&lt;br /&gt;
Remaining parameters we need to signal are:&lt;br /&gt;
&lt;br /&gt;
* magic number for stream identification&lt;br /&gt;
* comment/metadata tags&lt;br /&gt;
&lt;br /&gt;
Additionally there&#039;s been a desire to support some kind of channel bonding for surround, and some kind of option signalling for &amp;quot;Opus Custom&amp;quot;, in particular the granulerate.&lt;br /&gt;
&lt;br /&gt;
=== Draft spec ===&lt;br /&gt;
&lt;br /&gt;
Granulepos is the count of decodeable samples at a fixed rate of 48 kHz.&lt;br /&gt;
&lt;br /&gt;
Two headers: id, comment&lt;br /&gt;
&lt;br /&gt;
Id header:&lt;br /&gt;
&lt;br /&gt;
 - &amp;quot;OpusHead&amp;quot; (64 bits)&lt;br /&gt;
 - version number (8 bits) zero for this spec&lt;br /&gt;
 - Number of channels &#039;c&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Pre-skip (16 bits)&lt;br /&gt;
 - Input sample rate (32 bits) (informational only)&lt;br /&gt;
 - Output gain (16 bits, signed Q7.8 in dB) to apply when decoding (under discussion)&lt;br /&gt;
 - stream mapping (8 bits)&lt;br /&gt;
  --  0 = one stream, RTP order, 1 = channels in vorbis spec order, 2..254 reserved (treat as 255), 255 = no defined channel meaning&lt;br /&gt;
 if stream mapping &amp;gt; 0&lt;br /&gt;
 - Number of streams &#039;N&#039; (8 bits) (must be &amp;gt; 0)&lt;br /&gt;
 - Number of two-output streams &#039;M&#039; (8 bits) (M+N strictly smaller than 255)&lt;br /&gt;
 - for each output channel [0..c-1]&lt;br /&gt;
   -- read stream index (8 bits) (255 means silent throughout the file)&lt;br /&gt;
&lt;br /&gt;
All two-output streams come first, so if the stream index is &amp;lt; 2*M,&lt;br /&gt;
decode the (index/2)th Opus stream as stereo, selecting the (index%2)th output&lt;br /&gt;
(left for even, right for odd). If index &amp;gt;= 2*M, decode the (index - M)th stream&lt;br /&gt;
as mono and use that as the output. As a special case, a stream index of 255&lt;br /&gt;
means to write silence to that output channel.&lt;br /&gt;
&lt;br /&gt;
Comment header:&lt;br /&gt;
&lt;br /&gt;
 - 8 byte &#039;OpusTags&#039; magic signature (64 bits)&lt;br /&gt;
 - rest follows the vorbis-comment header design used in OggVorbis, OggTheora, and Speex.&lt;br /&gt;
  ** Vendor string (always present)&lt;br /&gt;
  ** tag=value metadata strings (zero or more)&lt;br /&gt;
&lt;br /&gt;
Some discussion is in order.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;magic signature&#039;&#039;&#039;&lt;br /&gt;
The magic signature values allow codec identification and are human readable. Starting with &#039;Op&#039; helps distinguish them from data packets.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;version&#039;&#039;&#039;&lt;br /&gt;
Version number. Must always be zero for this version of the encapsulation spec. In general revising the spec later isn&#039;t a good idea, but this also acts as a null terminator for the signature bytes and helps align the rest of the fields.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream mapping&#039;&#039;&#039;&lt;br /&gt;
We want to support multichannel. This defines the order and semantic meaning of the various channels encoded in each Opus packet.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. Or, when routing multitrack audio between mixing boards, it helps to be able to flag which instruments should be treated as mono and which are stereo.&lt;br /&gt;
&lt;br /&gt;
We don&#039;t need 8 bits of separate channel meanings, so if we want to make it easier to parse the number of channels, we can make that part of some of the stream mappings: 0 = mono, 1 = stereo, 2 = 5.1 in vorbis order, 3 = 6.0 in some order, 4 = 7.1, etc.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;pre-skip&#039;&#039;&#039;&lt;br /&gt;
This is the number of samples (at 48 kHz) to discard from the decoder output before starting playback. The idea is to mitigate transients, and to allow sample-accurate editing through Ogg chaining.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;input rate&#039;&#039;&#039;&lt;br /&gt;
This is &#039;&#039;not&#039;&#039; the sample rate for playback of the encoded data.&lt;br /&gt;
&lt;br /&gt;
Opus has a handful of coding modes, supporting 8, 12, 16, 24, and 48 kHz signals. Which mode is chosen can be switched dynamically from packet to packet in the stream, but the reference decoder can generate output at any of those sample rates from the compressed data. Fidelity to the original sample rate of the encode input is not preserved by the lossy compression. Therefore, if the playback system supports one of those modes natively, &#039;&#039;the best option is to not resample&#039;&#039; but to play back directly at 48 kHz for best quality regardless of the value of this field.&lt;br /&gt;
&lt;br /&gt;
However, the Ogg mapping allows the encoder to pass the sample rate of the original input stream as metadata. We felt this could be useful downstream, and as something intended for machine consumption, didn&#039;t belong in the tag header. For example, a decoder writing PCM format to disk might choose to resample the output audio back to the original input rate to reduce surprise.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream count&#039;&#039;&#039;&lt;br /&gt;
It is necessary to describe the number of streams so the decoder can correctly parse the packed frames inside the packet. We store the count-minus-one here, to rule out the invalid configuration of zero Opus streams in this Ogg stream.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;stream description&#039;&#039;&#039;&lt;br /&gt;
For each Opus stream framed into the ogg packets of this logical bitstream, we define whether to decode it as mono or stereo, and give a channel index for how it should be mapped to playback. The semantic meaning of each channel index is defined by the &#039;&#039;stream mapping&#039;&#039; byte. E.g. it might be LEFT_REAR or CENTER.&lt;br /&gt;
&lt;br /&gt;
For example, we can&#039;t just code 5.1 as three stereo Opus streams, because then LFE ends up sharing a stereo pair with another channel (RR in the Vorbis channel order) which isn&#039;t a good idea, while 6 mono channels wastes bandwidth. So we want to be able to say:&lt;br /&gt;
&lt;br /&gt;
 (stream mapping: vorbis channel order)&lt;br /&gt;
 stream 0: stereo: LEFT_FRONT, RIGHT_FRONT&lt;br /&gt;
 stream 1: mono: CENTER&lt;br /&gt;
 stream 2: stereo: LEFT_REAR, RIGHT_REAR&lt;br /&gt;
 stream 3: mono: LFE&lt;br /&gt;
&lt;br /&gt;
== Test vectors ==&lt;br /&gt;
&lt;br /&gt;
* [[OggOpus/testvectors|Planned test vectors for OggOpus]]&lt;br /&gt;
* Opus test vectors&lt;br /&gt;
&lt;br /&gt;
== Proposals for handling Opus gain and/or ReplayGain ==&lt;br /&gt;
=== Built-in R128 ===&lt;br /&gt;
Three Q7.9 dB fields consisting of&lt;br /&gt;
# The &amp;quot;file normalization gain&amp;quot;, i.e. the gain needed to make this file R128 normal. (a la REPLAYGAIN_TRACK)&lt;br /&gt;
# The &amp;quot;program normalization gain&amp;quot;, i.e. the gain that will make some set of files normal. (a la REPLAYGAIN_ALBUM).  Most players should use this by default.&lt;br /&gt;
# The &amp;quot;raw gain&amp;quot;, i.e. a gain to apply if the goal is to reproduce the input faithfully.  The true raw gain needed to reproduce the input in a sane encoder is always zero, so the real purpose of this gain is to provide adjustment capability even when decoding through a non-player system that will lose these tags.  For example, software like ffmpeg might desire to provide faithful input reproduction for transcoding, and might also fail to expose any parameter for users to choose normalization instead.  If the newly encoded file also loses the normalization information, then it will play back at the wrong volume.  If ffmpeg supports raw gain, then at least the user may manipulate the input&#039;s raw gain so that the tagless transcoded file has appropriate raw volume.  (Note, however, that ffmpeg also provides command-line volume adjustment knobs, so this utility is only relevant if it is more convenient to modify the input than to modify the transcoding invocation.)  Similar arguments may apply to web browsers and other frameworks that (a) may be used for both input reproduction and playback, and (b) do not provide convenient parameters to the user to control normalization state ... although it would typically be sufficient for these frameworks to operate with program normalization, which would also be the library default.&lt;br /&gt;
&lt;br /&gt;
The exact normalization scheme might be noted in a comment in OpusTags, for the sake of keeping up with trends in volume measurement algorithms.&lt;br /&gt;
&lt;br /&gt;
For the sake of sanity, in this proposal it would probably be best to outlaw ReplayGain tags altogether, although we have not reproduced the clipping-level functionality.&lt;br /&gt;
&lt;br /&gt;
=== Built-in ReplayGain ===&lt;br /&gt;
Include in the Opus header a section for the 4 standard ReplayGain values, stored as binary numbers with a value reserved for &amp;quot;unset&amp;quot;.  Generalize the &amp;quot;Album gain&amp;quot; to &amp;quot;expected listening context gain&amp;quot; for items that do not come from albums.  Include a complete ReplayGain implementation in the Ogg-Opus library so that players get support that is on by default (but can be turned off for WAV output).  Mandate that decoders must ignore any replaygain comment fields, and encoders must not add them.&lt;br /&gt;
=== Compatible mandatory ReplayGain ===&lt;br /&gt;
Preserve the VorbisComment REPLAYGAIN tags, but make support mandatory and provide an implementation, so that new decoders may simply hand over all header packets to the library and get replaygain support for free.&lt;br /&gt;
=== Inseparable OpusGain ===&lt;br /&gt;
Include a header field that encodes a gain.  This gain is regarded as part of the bitstream, and the data is never decoded without applying this gain.&lt;/div&gt;</summary>
		<author><name>Bemasc</name></author>
	</entry>
</feed>