Videos/Digital Show and Tell
<hr />
<div><small>''Wiki edition''</small><br />
[[Image:dsat_001.jpg|400px|right]]<br />
<br />
Continuing in the "firehose" tradition of [[Videos/A_Digital_Media_Primer_For_Geeks|Episode 01]], Xiph.Org's second video on digital media explores multiple facets of digital audio signals and how they ''really'' behave in the real world.<br />
<br />
Demonstrations of sampling, quantization, bit-depth, and dither explore digital audio behavior on real audio equipment using both modern digital analysis and vintage analog bench equipment, just in case we can't trust those newfangled digital gizmos. You can download the source code for each demo and try it all for yourself!<br />
<br/><br/><br/><br />
<center><font size="+2">[http://www.xiph.org/video/vid2.shtml Download or Watch online]</font></center><br />
<br style="clear:both;"/><br />
Supported players: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/ Firefox], [http://www.chromium.org/Home Chrome], [http://www.opera.com/ Opera]. Or see [http://www.webmproject.org/users/ other WebM] or [[TheoraSoftwarePlayers|other Theora]] players.<br />
<br />
If you're having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.<br />
<br/><br />
<hr/><br />
<br/><br/><br/><br />
[[Image:Xiph_ep02_test.png|400px|right]]<br />
<br />
&ldquo;Hi, I'm Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].<br />
<br />
&ldquo;A few months ago, I wrote<br />
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don't make sense].<br />
In the article, I<br />
mentioned--almost in passing--that a digital waveform is<br />
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],<br />
and you certainly don't get a stairstep when you<br />
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].<br />
<br />
&ldquo;Of everything in the entire article, '''that''' was the number one thing<br />
people wrote about. In fact, more than half the mail I got was questions and<br />
comments about basic digital signal behavior. Since there's interest, let's<br />
take a little time to play with some ''simple'' digital signals. &rdquo;<br />
<br />
==Veritas ex machina==<br />
[[Image:Dsat_002.jpg|200px|right]]<br />
[[Image:Dsat_003.jpg|200px|right]]<br />
[[Image:Dsat_004.jpg|200px|right]]<br />
[[Image:Dsat_005.jpg|200px|right]]<br />
<br />
If we pretend for a moment that we have no idea how digital signals really<br />
behave, then it doesn't make sense for us to use digital test<br />
equipment. Fortunately for this exercise, there's still plenty<br />
of working analog lab equipment out there.<br />
<br />
We need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input<br />
signals--in this case, an<br />
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&nid=-536900197.536896863&cc=SE&lc=swe HP3325]<br />
from 1978.<br />
<br />
We'll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],<br />
like this Tektronix 2246 from the mid-90s, one of the last and best analog scopes made.<br />
<br />
Finally, we'll inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an<br />
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this<br />
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&nid=-536900197.536897319&cc=SE&lc=swe HP3585]<br />
from the same product line as<br />
the signal generator. Like the other equipment here, it has<br />
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],<br />
but the signal path<br />
from input to what you see on the screen is completely analog.<br />
<br />
All of this equipment is vintage, but the specs are still quite good.<br />
We start with the signal generator set to output a 1 [[WikiPedia:Hertz#SI_multiples|kHz]]<br />
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].<br />
We see the sine wave on the oscilloscope and can verify that it is indeed<br />
1 kHz at 1 Volt RMS, which is 2.8 Volts<br />
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],<br />
and that matches the<br />
measurement on the spectrum analyzer as well.<br />
<br />
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]<br />
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],<br />
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below<br />
[[WikiPedia:Fundamental_frequency|the fundamental]].<br />
This doesn't matter to the demos, but it's good to take notice of it now to avoid confusion later.<br />
<br />
For digital conversion, we use a boring, consumer-grade, eMagic USB1<br />
audio device. It's more than ten years old at this point, and it's<br />
getting obsolete.<br />
<br />
A recent converter can easily have an order of magnitude better specs.<br />
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],<br />
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],<br />
[[WikiPedia:Jitter#Sampling_jitter|jitter]],<br />
[[WikiPedia:Noise_floor|noise behavior]],<br />
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...<br />
You may not<br />
have noticed, though. Just because we can measure an improvement doesn't<br />
mean we can hear it, and even these old consumer boxes were already at<br />
the edge of ideal transparency.<br />
<br />
The eMagic connects to my ThinkPad, which displays a digital<br />
waveform and spectrum for comparison, then the ThinkPad<br />
sends the digital signal right back out to the eMagic for<br />
re-conversion to analog and observation on the output scopes.<br />
<br />
<br style="clear:both;"/><br />
<br />
==Stairsteps==<br />
[[Image:Dsat 007.png|360px|right]]<br />
[[Image:Dsat 006.jpg|360px|right]]<br />
First demo: We begin by converting an analog signal to digital and<br />
then right back to analog again with no other steps.<br />
<br />
The signal generator is set to produce a 1kHz sine wave just like<br />
before and we can see the analog sine wave on the input-side oscilloscope. The eMagic digitizes our signal to<br />
[[Videos/A_Digital_Media_Primer_For_Geeks#Raw_.28digital_audio.29_meat|16 bit PCM at 44.1kHz]],<br />
same as on a CD. The spectrum of the digitized signal on the ThinkPad matches what we saw earlier and what we see now on the analog spectrum analyzer, aside from its <br />
[[WikiPedia:High_impedance|high-impedance input]] being just a smidge noisier. For now, the waveform display shows our digitized sine wave as a<br />
stairstep pattern, one step for each sample.<br />
<br />
When we look at the output signal that's been converted<br />
from digital back to analog, we see that it's exactly like the original sine wave. No stairsteps.<br />
<br />
1 kHz is still a fairly low frequency, so perhaps the stairsteps are just<br />
hard to see, or they're being smoothed away. Next, we set the signal generator to 15 kHz, which is much closer to [[WikiPedia:Nyquist_frequency|Nyquist]].<br />
Now the sine wave is represented by fewer than three samples per cycle, and the digital waveform appears rather poor! Yet the analog output is still a perfect sine wave, exactly like the original.<br />
As we keep increasing frequency, all the way to 20kHz, the output waveform is still perfect. No jagged edges, no dropoff, no stairsteps.<br />
<br />
So where'd the stairsteps go? It's a trick question; they were never there. Drawing a digital waveform as a stairstep was wrong to begin with.<br />
<br />
A stairstep is a continuous-time function. It's jagged, and it's piecewise, but it has a defined value at every point in time.<br />
A sampled signal is entirely different. It's discrete-time: it has a value only at each instantaneous sample point, and everywhere in between it's<br />
undefined; there is no value at all. A discrete-time signal is properly drawn as a lollipop graph.<br />
The continuous, analog counterpart of a digital signal passes smoothly through each sample point, and that's just as true for high<br />
frequencies as it is for low.<br />
<br />
[[Image:Dsat 008.png|360px|right]]<br />
The interesting and non-obvious bit is that [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there's only one<br />
bandlimited signal that passes exactly through each sample point]]; it's a unique solution. If you sample a bandlimited signal and then convert it back, the original input is also the only possible output.<br />
A signal that differs even minutely from the original includes frequency content at or beyond Nyquist, breaks the bandlimiting requirement and isn't a valid solution.<br />
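This uniqueness is easy to check numerically. Below is a minimal sketch (assuming NumPy is available; the frequencies, lengths, and grid are arbitrary choices, not taken from the video) of Whittaker–Shannon sinc interpolation: summing shifted sinc functions weighted by the sample values reconstructs the original near-Nyquist sine, with no stairstep anywhere.

```python
# Sketch: sinc (Whittaker-Shannon) interpolation of a near-Nyquist sine.
# The one bandlimited signal through the samples is the original sine.
import numpy as np

fs = 44100.0                         # sample rate, as in the demo
f = 15000.0                          # close to Nyquist (22050 Hz)
n = np.arange(200)                   # sample indices
x = np.sin(2 * np.pi * f * n / fs)   # the discrete-time samples

# Evaluate the reconstruction on a fine grid away from the edges
# (truncating the infinite sinc sum causes small errors near the ends).
t = np.linspace(50, 150, 2000)       # time, in units of the sample period
y = np.array([np.sum(x * np.sinc(ti - n)) for ti in t])

# Compare against the original continuous-time sine
err = np.max(np.abs(y - np.sin(2 * np.pi * f * t / fs)))
print(err)   # small; no stairstep in sight
```

The error that remains is purely from truncating the sinc sum to 200 samples; with more samples it shrinks further.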
<br />
So how did everyone get confused and start thinking of digital signals as stairsteps? I can think of two good reasons.<br />
<br />
First: it's easy to convert a sampled signal to a true stairstep. Just<br />
extend each sample value forward until the next sample period. This is<br />
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it's an important part of how some<br />
digital-to-analog converters work, especially the simplest ones.<br />
As a result, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or<br />
digital-to-analog conversion]] is probably going to see a diagram of a<br />
stairstep waveform somewhere, but that's not a finished conversion,<br />
and it's not the signal that comes out.<br />
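To make the distinction concrete, here's a small numeric sketch (NumPy; the rates are picked only for illustration): a zero-order-hold stairstep built from the samples carries spectral images above the original Nyquist frequency, and removing that image energy is exactly the job of the converter's analog output filter. The finished output has none of it.

```python
# A zero-order hold repeats each sample value, producing a true stairstep.
# Its spectrum contains images above Nyquist that the reconstruction
# filter removes; the stairstep is an intermediate step, not the output.
import numpy as np

fs = 8000                  # original sample rate (illustrative)
f = 1000                   # test tone
n = np.arange(256)
x = np.sin(2 * np.pi * f * n / fs)

hold = np.repeat(x, 16)    # 16x oversampled zero-order-hold stairstep
win = np.hanning(hold.size)
spectrum = np.abs(np.fft.rfft(hold * win))
freqs = np.fft.rfftfreq(hold.size, d=1.0 / (16 * fs))

tone_bin = np.argmin(np.abs(freqs - f))
image_bin = np.argmin(np.abs(freqs - (fs - f)))   # first image, at 7 kHz

ratio = spectrum[image_bin] / spectrum[tone_bin]
print(ratio)   # clearly nonzero: the stairstep carries image energy
```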
<br />
Second, and this is probably the more likely reason, engineers who<br />
supposedly know better (yes, even I) draw stairsteps even though they're<br />
technically wrong. It's a sort of one-dimensional version of<br />
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].<br />
Pixels aren't squares either, they're samples of a 2-dimensional<br />
function space and so they're also, conceptually, infinitely small<br />
points. Practically, it's a real pain in the ass to see or manipulate<br />
infinitely small anything, so big squares it is.<br />
<br />
Digital stairstep drawings are exactly the same thing. It's just a convenient drawing. The stairsteps aren't really there.<br />
<div style="clear:both"></div><br />
<br />
==Bit-depth==<br />
[[Image:Dsat_009.jpg|360px|right]]<br />
[[Image:Dsat_010.jpg|260px|right]]<br />
<br />
When we convert a digital signal back to analog, the result is<br />
''also'' smooth regardless of the [[WikiPedia:Audio_bit_depth|bit depth]]. It doesn't matter if it's 24 bits or 16 bits or 8 bits.<br />
So does that mean that the digital bit depth makes no difference at<br />
all? Of course not.<br />
<br />
Channel 2 is the same sine wave input, but we quantize it with<br />
[[WikiPedia:Dither|dither]] down to 8 bits.<br />
On the scope, we still see a nice<br />
smooth sine wave on channel 2. Look very close, and you'll also see a<br />
bit more noise. That's a clue.<br />
<br />
If we look at the spectrum of the signal, our sine wave is<br />
still there unaffected, but the noise level of the 8-bit signal on<br />
the second channel is much higher. And that's the difference, the only difference, the number of bits makes.<br />
<br />
When we digitize a signal, first we sample it. The<br />
sampling step is perfect; it loses nothing. But then we [[WikiPedia:Quantization_(sound_processing)|quantize]] it,<br />
and [[WikiPedia:Quantization_error|quantization adds noise]].<br />
The number of bits determines how much noise and so the level of the<br />
noise floor.<br />
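The rule of thumb is that each bit of depth buys roughly 6 dB of noise floor. A quick numeric sketch (NumPy; the test tone, seed, and dithered-quantizer details here are illustrative assumptions, not the demo code) comparing 8-bit and 16-bit quantization:

```python
# Measure the SNR of a dithered quantizer at two bit depths; the gap
# should be close to 8 bits x 6 dB = 48 dB.
import numpy as np

rng = np.random.default_rng(0)
n = np.arange(1 << 16)
x = np.sin(2 * np.pi * 997 * n / 44100)   # near-full-scale test tone

def snr_at_bits(bits):
    step = 2.0 / (1 << bits)              # quantizer step over [-1, 1)
    tpdf = (rng.random(x.size) - rng.random(x.size)) * step  # TPDF dither
    q = np.round((x + tpdf) / step) * step
    noise = q - x
    return 10 * np.log10(np.mean(x ** 2) / np.mean(noise ** 2))

diff = snr_at_bits(16) - snr_at_bits(8)
print(diff)   # close to 48 dB
```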
<br />
What does this dithered quantization noise sound like?<br />
Those of you who have used analog recording equipment might think to yourselves, "My goodness! That sounds like tape hiss!"<br />
Well, it doesn't just sound like tape hiss, it acts like it too, and<br />
if we use a [[WikiPedia:Dither#Different_types|gaussian dither]] then it's<br />
[[WikiPedia:Central_limit_theorem|mathematically equivalent]] in every way. It ''is'' tape hiss.<br />
<br />
Intuitively, that means that we can measure tape hiss and thus the noise floor<br />
of [[WikiPedia:Magnetic_tape_sound_recording|magnetic audio tape]]<br />
in [[WikiPedia:Shannon–Hartley_theorem#Examples|bits instead of decibels]], in order to put things in a<br />
digital perspective. [[WikiPedia:Compact cassettes|Compact cassettes]], for those of you who are old enough to remember them, could reach as<br />
deep as 9 bits in perfect conditions. 5 to 6 bits was<br />
more typical, especially if it was a recording made on a<br />
[[WikiPedia:Cassette_deck|tape deck]]. That's right; your old mix tapes were only about 6 bits<br />
deep if you were lucky!<br />
<br />
The very best professional [[WikiPedia:Reel-to-reel_audio_tape_recording|open reel tape]] used in studios could barely<br />
hit 13 bits ''with'' [[WikiPedia:Reel-to-reel_audio_tape_recording#Noise_reduction|advanced noise reduction]]. <br />
That's why seeing '[[WikiPedia:SPARS_code|D D D]]' on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,<br />
high-end deal.<br />
<br />
<div style="clear:both"></div><br />
<br />
==Dither==<br />
[[Image:Dsat_011.png|360px|right]]<br />
<br />
We've been quantizing with [[Wikipedia:dither|dither]]. What is dither<br />
exactly and, more importantly, what does it do?<br />
<br />
The simplest way to quantize a signal is to choose the digital<br />
amplitude value [[WikiPedia:Rounding|closest to the original analog amplitude]].<br />
Unfortunately, the exact noise that results from this simple<br />
quantization scheme depends somewhat on the input signal.<br />
It may be inconsistent, cause distortion, or be<br />
undesirable in some other way.<br />
<br style="clear:both;"/><br />
<br />
<center><div style="background-color:#DDDDFF;border-color:#CCCCDD;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
'''Going deeper…'''<br />
*Cameron Nicklaus Christou's thesis [http://uwspace.uwaterloo.ca/bitstream/10012/3867/1/thesis.pdf Optimal Dither and Noise Shaping in Image Processing] provides an ''excellent'' explanation of dither and noise shaping.<br />
</div>&nbsp;</center><br />
<br />
Dither is specially-constructed noise that substitutes for the noise<br />
produced by simple quantization. Dither doesn't [[WikiPedia:Sound_masking|drown out or mask]]<br />
quantization noise, it replaces it with noise characteristics<br />
of our choosing that aren't influenced by the input.<br />
<br />
The signal generator has too much noise for this test, so we produce a mathematically perfect sine wave with the ThinkPad and quantize it to 8 bits with dithering.<br />
We see the sine wave on the waveform display and the output scope, and <br />
a clean frequency peak with a uniform noise floor on both spectral displays<br />
just like before. Again, this is with dither.<br />
<br />
Now I turn dithering off.<br />
<br />
The quantization noise that dither had spread out into a nice, flat noise<br />
floor piles up into harmonic distortion peaks. The noise floor is<br />
lower, but the level of distortion becomes nonzero, and the distortion<br />
peaks sit higher than the dithering noise did.<br />
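You can reproduce this on/off comparison numerically. The sketch below (NumPy; the tone, bit depth, and seed are arbitrary) quantizes the same sine with and without TPDF dither and inspects the tone's odd harmonics: without dither they stick up as distortion peaks; with dither those bins hold nothing but the flat noise floor.

```python
# Undithered quantization error correlates with the signal and piles up
# into harmonics; TPDF dither replaces it with signal-independent noise.
import numpy as np

rng = np.random.default_rng(1)
N = 1 << 15
k = 1024                                   # integer bin: exact-frequency tone
x = 0.5 * np.sin(2 * np.pi * k * np.arange(N) / N)
step = 2.0 / 256                           # 8-bit quantizer step

plain = np.round(x / step) * step          # undithered quantization
tpdf = (rng.random(N) - rng.random(N)) * step
dithered = np.round((x + tpdf) / step) * step

def spectrum_db(sig):
    return 20 * np.log10(np.abs(np.fft.rfft(sig)) / N + 1e-12)

# Look at a few odd harmonics of the tone in both spectra
harmonics = [3 * k, 5 * k, 7 * k, 9 * k]
worst_plain = max(spectrum_db(plain)[h] for h in harmonics)
worst_dith = max(spectrum_db(dithered)[h] for h in harmonics)
print(worst_plain, worst_dith)   # distortion peaks vs. flat noise floor
```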
<br />
At 8 bits this effect is exaggerated. At 16 bits,<br />
even without dither, harmonic distortion is going to be so low as to<br />
be completely inaudible. Still, we can use dither to eliminate it completely if we so choose.<br />
<br />
Turning the dither off again for a moment, you'll notice that the<br />
absolute level of distortion from undithered quantization stays<br />
approximately constant regardless of the input amplitude.<br />
But when the signal level drops below a half a bit, everything<br />
quantizes to zero.<br />
<br />
In a sense, everything quantizing to zero is just 100% distortion!<br />
Dither eliminates this distortion too. When we reenable dither, we clearly see our signal at 1/4 bit with a nice flat noise floor.<br />
<br />
The noise floor doesn't have to be flat. Dither is noise of our<br />
choosing, so it makes sense to choose a noise as [http://www.acoustics.salford.ac.uk/res/cox/sound_quality/?content=subjective inoffensive] and<br />
[[WikiPedia:Absolute_threshold_of_hearing|difficult to notice]]<br />
as possible.<br />
<br />
Human hearing is [[WikiPedia:Equal-loudness_contour|most sensitive in the midrange from 2kHz to 4kHz]]; that's where background noise is going to be the most obvious.<br />
We can [[WikiPedia:Noise_shaping|shape dithering noise]] away from sensitive frequencies to where<br />
hearing is less sensitive, usually the highest frequencies.<br />
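As a sketch of the idea (NumPy; this is a generic first-order error-feedback shaper, not the particular shaper used in the demos): feeding each sample's quantization error back into the next sample gives the noise a rising, high-frequency-weighted spectrum, leaving the low band much quieter.

```python
# First-order noise shaping: the error feedback gives the quantization
# noise a high-pass (1 - z^-1) spectrum, pushing it away from low
# frequencies where hearing is most sensitive.
import numpy as np

rng = np.random.default_rng(2)
N = 1 << 14
x = 0.25 * np.sin(2 * np.pi * 300 * np.arange(N) / N)
step = 2.0 / 256

def quantize_shaped(x, step):
    out = np.empty_like(x)
    err = 0.0
    for i, s in enumerate(x):
        v = s + err                               # feed back the last error
        d = (rng.random() - rng.random()) * step  # TPDF dither
        out[i] = np.round((v + d) / step) * step
        err = v - out[i]
    return out

noise = quantize_shaped(x, step) - x
mag = np.abs(np.fft.rfft(noise))
n8 = mag.size // 8
low = np.sum(mag[:n8] ** 2)        # lowest eighth of the band
high = np.sum(mag[-n8:] ** 2)      # highest eighth of the band
print(low / high)  # far below 1: the noise has moved up in frequency
```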
<br />
Lastly, dithered quantization noise ''is'' higher [[WikiPedia:Sound_power|power]] overall<br />
than undithered quantization noise, even though it often sounds quieter, and<br />
you can see that on a [[WikiPedia:VU_meter|VU meter]] during passages of near-silence. However,<br />
dither isn't only an on or off choice. We can reduce the dither's<br />
power to balance less noise against a bit of distortion to minimize<br />
the overall effect.<br />
<br />
For the next test, we also [[WikiPedia:Amplitude_modulation|modulate the input signal]] like this to show how a varying input affects the quantization noise. At<br />
full dithering power, the noise is uniform, constant, and featureless<br />
just like we expect.<br />
<br />
As we reduce the dither's power, the input increasingly<br />
affects the amplitude and the character of the quantization noise.<br />
Shaped dither behaves similarly, but noise shaping lends one more nice<br />
advantage; it can use a somewhat lower<br />
dither power before the input has as much effect on the output.<br />
<br />
Despite all this text spent on dither, the differences exist 100 decibels or more below [[WikiPedia:Full_scale|full scale]]. If the CD had been<br />
[http://www.research.philips.com/technologies/projects/cd/index.html 14 bits as originally designed],<br />
perhaps dither ''might'' be<br />
more important. At 16 bits it's mostly a wash. It's reasonable to treat<br />
dither as an insurance policy that gives several extra<br />
decibels of dynamic range, just in case. That said, no<br />
one ever ruined a great recording by not dithering the final master.<br />
<br />
==Bandlimitation and timing==<br />
[[image:Dsat_016.jpg|360px|right]]<br />
<br />
We've been using [[WikiPedia:Sine_wave|sine waves]]. They're the obvious choice when what we<br />
want to see is a system's behavior at a given isolated frequency. Now let's look at something a bit more complex. What should we expect to happen when I change the input to a [[WikiPedia:Square_wave|square wave]]? <br />
<br />
The input scope confirms a 1kHz square wave. The output scope shows... exactly what it should.<br />
<br />
What is a square wave really? <br />
<br />
We can say it's a waveform that's some positive value for half a cycle and then transitions instantaneously to a negative value for the other half.<br />
<br />
:<math><br />
\ squarewave(t) = \begin{cases} 1, & 0 \leq t < {1 \over 2}T \\ -1, & {1 \over 2}T \leq t < T \end{cases}<br />
</math><br />
<br />
But that doesn't really tell us anything useful about how that input becomes this output.<br />
<br />
We remember that any waveform is also [[WikiPedia:Fourier_series|the sum of discrete frequencies]],<br />
and a square wave is a particularly simple sum: a fundamental and an infinite series of [[WikiPedia:Even_and_odd_functions#Harmonics|odd harmonics]].<br />
<br />
[[image:Dsat_013.jpg|360px|right]]<br />
:<math>\begin{align}<br />
\ squarewave(t) = \frac{4}{\pi}\sin(\omega t) + \frac{4}{3\pi}\sin(3\omega t) + \frac{4}{5\pi}\sin(5\omega t) + \\<br />
\frac{4}{7\pi}\sin(7\omega t) + \frac{4}{9\pi}\sin(9\omega t) + \frac{4}{11\pi}\sin(11\omega t) + \\ <br />
\frac{4}{13\pi}\sin(13\omega t) + \frac{4}{15\pi}\sin(15\omega t) + \frac{4}{17\pi}\sin(17\omega t) + \\<br />
\frac{4}{19\pi}\sin(19\omega t) + \frac{4}{21\pi}\sin(21\omega t) + \frac{4}{23\pi}\sin(23\omega t) + \\<br />
\frac{4}{25\pi}\sin(25\omega t) + \frac{4}{27\pi}\sin(27\omega t) + \frac{4}{29\pi}\sin(29\omega t) + \\<br />
\frac{4}{31\pi}\sin(31\omega t) + \frac{4}{33\pi}\sin(33\omega t) + \cdots <br />
\end{align}</math><br />
<br />
At first glance, that doesn't seem very useful either; you'd have to sum an infinite number of harmonics to get the answer! However, we don't have an infinite number of harmonics.<br />
<br />
We're using a quite sharp [[WikiPedia:Low-pass_filter|anti-aliasing filter]] that cuts off right<br />
above 20kHz, so our signal is [[WikiPedia:Bandlimiting|bandlimited]]. Only the first ten terms make it through, and that's exactly what we see on the output scope.<br />
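You can reproduce that output waveform directly from the series (NumPy; the time grid is an arbitrary choice): with a 1 kHz fundamental and a cutoff just above 20 kHz, only the odd harmonics 1 through 19 survive, and summing those ten terms gives the rippled, bandlimited square wave seen on the scope.

```python
# Sum the ten surviving odd-harmonic terms of the square-wave series.
import numpy as np

f0 = 1000.0                        # fundamental (Hz)
t = np.arange(4410) / 441000.0     # 10 ms on a fine grid

square = np.zeros_like(t)
for k in range(1, 21, 2):          # odd harmonics 1, 3, ..., 19: ten terms
    square += (4 / (k * np.pi)) * np.sin(2 * np.pi * k * f0 * t)

# The partial sum overshoots near each edge (the Gibbs effect),
# peaking at roughly 1.18 instead of 1.
peak = square.max()
print(peak)
```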
<br />
<br style="clear:both;"/><br />
<center><div style="background-color:#DDDDFF;border-color:#CCCCDD;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
'''Going deeper…'''<br />
*In modern web browsers you can program audio synthesizers directly in JavaScript. Use the two square wave formulas to get a square wave out of [http://js.do/blog/sound-waves-with-javascript/ this page]. (Note: The scope is not very accurate/useful.) <br />
</div></center><br />
<div>&nbsp;</div><br />
<br />
[[Image:dsat_015.jpg|360px|right]]<br />
The rippling you see around sharp edges in a bandlimited signal is called the [[WikiPedia:Gibbs phenomenon|Gibbs effect]]. It happens whenever you slice off part of the frequency domain in the middle of nonzero energy.<br />
<br />
The usual rule of thumb you'll hear is "the sharper the cutoff, the<br />
stronger the rippling", which is approximately true, but we have to be<br />
careful how we think about it. For example, what would you expect our quite sharp anti-aliasing filter<br />
to do if I run our signal through it a second time?<br />
<br />
Aside from adding a few fractional cycles of delay, the answer is:<br />
Nothing at all. The signal is already bandlimited. Bandlimiting it<br />
again doesn't do anything. A second pass can't remove frequencies<br />
that we already removed.<br />
<br />
That's important. People tend to think of the ripples as<br />
a kind of [[WikiPedia:Sonic_artifact|artifact]] that's added by anti-aliasing and [[WikiPedia:Reconstruction_filter|anti-imaging]]<br />
filters, implying that the ripples get worse each time the signal<br />
passes through. We've just seen that in this case it didn't happen, so<br />
it wasn't really the filter that added the ripples the first time<br />
through. It's a subtle distinction, but Gibbs effect<br />
ripples aren't added by filters, they're just part of what a<br />
bandlimited signal ''is''.<br />
<br />
Even if we synthetically construct what looks like a perfect digital<br />
square wave it's still limited to the channel bandwidth. Remember that<br />
the stairstep representation is misleading. What we really have here are instantaneous sample points<br />
and only one bandlimited signal fits those points. All we did when we<br />
drew our apparently perfect square wave was line up the sample points<br />
just right so it appeared that there were no ripples if we played<br />
[[WikiPedia:Interpolation|connect-the-dots]]. The original bandlimited signal, complete with ripples, was<br />
still there.<br />
<br />
[[image:Dsat_014.gif|360px|right]]<br />
That leads us to one more important point. You've probably heard<br />
that the timing precision of a digital signal is limited by its sample<br />
rate; put another way, that digital signals can't represent anything that falls between the<br />
samples, implying that [[WikiPedia:Dirac_delta_function|impulses]] or<br />
[[WikiPedia:Synthesizer#ADSR_envelope|fast attacks]] have to align exactly<br />
with a sample, or the timing gets mangled or they just disappear.<br />
At this point, we can easily see why that's wrong.<br />
<br />
Again, our input signals are bandlimited. And digital signals are<br />
samples, not stairsteps, not 'connect-the-dots'. We most certainly<br />
can, for example, put the rising edge of our bandlimited square wave<br />
anywhere we want between samples.<br />
<br />
It's represented perfectly and it's reconstructed perfectly.<br />
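As a final sketch (NumPy; the 0.3-sample offset and the test waveform are arbitrary choices for illustration): sample a bandlimited edge with a fractional-sample delay, sinc-interpolate the samples, and the reconstructed edge lands right back at that fractional offset.

```python
# A bandlimited edge placed between samples is represented exactly:
# sinc interpolation recovers its position to a fraction of a sample.
import numpy as np

def bl_edge(t):
    # A bandlimited square wave (period 128 samples, ten odd harmonics),
    # comfortably below Nyquist, used here for its sharp rising edges.
    s = np.zeros_like(t)
    for k in range(1, 21, 2):
        s += (4 / (k * np.pi)) * np.sin(2 * np.pi * k * t / 128)
    return s

shift = 0.3                        # sub-sample offset of the rising edge
n = np.arange(256.0)
x = bl_edge(n - shift)             # sampled with the fractional delay

# Sinc-interpolate on a fine grid around the rising edge near t = 128
fine = np.arange(123.0, 133.0, 0.01)
y = np.array([np.sum(x * np.sinc(tf - n)) for tf in fine])

crossing = fine[np.argmin(np.abs(y))]
print(crossing - 128)   # close to 0.3: the edge sits between the samples
```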
<br />
==Epilogue==<br />
<br />
[[Image:Moffey.jpg|360px|right]]<br />
<br />
Like in [[Videos/A Digital Media Primer For Geeks|A Digital Media Primer for Geeks]], we've covered a broad range of topics, and yet barely scratched the surface of each one. If anything, my sins of omission are greater this time around.<br />
<br />
Thus I encourage you to dig deeper and experiment. I chose my demos carefully to be simple and give clear results. You can reproduce every one of them on your own if you like, but let's face it: sometimes we learn the most about a spiffy toy by breaking it open and studying all the pieces that fall out. Play with the demo parameters, hack up the code, set up alternate experiments. The [[#Use The Source Luke|source code for everything, including the little push-button demo application]], is at the end of this transcript.<br />
<br />
In the course of experimentation, you're likely to run into something that you didn't expect and can't explain. Don't worry! My earlier snark aside, Wikipedia is fantastic for exactly this kind of casual research. If you're really serious about understanding signals, several universities have advanced materials online, such as the [http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-003-signals-and-systems-fall-2011/ 6.003] and [http://ocw.mit.edu/resources/res-6-007-signals-and-systems-spring-2011/ RES.6-007] Signals and Systems modules at MIT OpenCourseWare. And, of course, there's always the [http://webchat.freenode.net/?channels=xiph community here at Xiph.Org].<br />
<br />
==Credits==<br />
[[Image:Dmpfg_019.png|360px|right]]<br />
Written by: Christopher (Monty) Montgomery and the Xiph.Org Community<br />
<br />
Special thanks to:<br />
*Heidi Baumgartner, for the second Tektronix oscilloscope<br />
*Gregory Maxwell and Dr. Timothy Terriberry, for additional technical review<br />
<br />
Intro, title and credits music:<br><br />
"[http://music.lousyrobot.com/track/andy-warhol-is-gone Andy Warhol Is Gone]", by Lousy Robot<br><br />
Used by permission of Lousy Robot.<br><br />
Original source track All Rights Reserved.<br><br />
[http://www.lousyrobot.com www.lousyrobot.com]<br />
<br />
This Video Was Produced Entirely With Free and Open Source Software:<br><br />
<br />
*[http://www.gnu.org/ GNU]<br><br />
*[http://www.linux.org/ Linux]<br><br />
*[http://fedoraproject.org/ Fedora]<br><br />
*[http://cinelerra.org/ Cinelerra]<br><br />
*[http://www.gimp.org/ The Gimp]<br><br />
*[http://audacity.sourceforge.net/ Audacity]<br><br />
*[http://svn.xiph.org/trunk/postfish/README Postfish]<br><br />
*[http://gstreamer.freedesktop.org/ Gstreamer]<br><br />
<br />
All trademarks are the property of their respective owners. <br />
<br />
*''Complete video'' [http://creativecommons.org/licenses/by-sa/3.0/legalcode CC-BY-SA]<br><br />
*''Text transcript and Wiki edition'' [http://creativecommons.org/licenses/by-sa/3.0/legalcode CC-BY-SA]<br><br />
<br />
A Co-Production of Xiph.Org and Red Hat, Inc.<br><br />
(C) 2012-2013, Some Rights Reserved<br><br />
<hr/><br />
<br />
== Use The Source Luke ==<br />
<br />
As stated in the Epilogue, everything that appears in the video demos is driven by open source software, which means the source is both available for inspection and freely usable by the community. The ThinkPad that appears in the video was running Fedora 17 and GNOME Shell (GNOME 3). The demonstration software does not require Fedora specifically, but it does require GNU/Linux to run in its current form. In all, the video involved just under 50,000 lines of new and custom-purpose code (including contributions to non-Xiph projects such as Cinelerra and [http://sourceforge.net/projects/gromit-mpx/ Gromit]).<br />
<br />
=== The Spectrum and Waveform Viewer ===<br />
<br />
The realtime software spectrum analyzer application that appears in the video was a preexisting application that was dusted off and updated for use in the video. The waveform viewer (effectively a simple software oscilloscope) was written from scratch making use of some of the internals from the spectrum analyzer application. Both are available from Xiph.Org svn:<br />
<br />
<center><div style="background-color:#DDDDDD;border-color:#CCCCCC;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
*Source for the Spectrum and Waveform applications is found at:<br />
https://svn.xiph.org/trunk/spectrum/<br />
*The source can be checked out of svn using the following command line:<br />
svn co https://svn.xiph.org/trunk/spectrum<br />
*Trac is a convenient way to browse the source without checking out a copy:<br />
https://trac.xiph.org/browser/trunk/spectrum<br />
</div></center><br />
<br />
Spectrum and Waveform both expect an input stream on the command line, either as raw data or as a WAV file.<br />
<br />
=== GTK-Bounce ===<br />
<br />
The touch-controlled application used in the video is named 'gtk-bounce' and was custom-written for the sole purpose of the in-video demonstrations. It is so named because, for the most part, all it does is read the input from an audio device, and then immediately write the same data back out for playback. It also forwards a copy of this data to up to two external monitoring applications, and in several demos, applies simple filters or generates simple waveforms. It also contains several demo panels not shown in the video.<br />
<br />
<center><div style="background-color:#DDDDDD;border-color:#CCCCCC;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
*Source for gtk-bounce is found at:<br />
https://svn.xiph.org/trunk/Xiph-episode-II/bounce/<br />
*The source can be checked out of svn using the following command line:<br />
svn co https://svn.xiph.org/trunk/Xiph-episode-II/bounce/<br />
*Trac is a convenient way to browse the source without checking out a copy:<br />
https://trac.xiph.org/browser/trunk/Xiph-episode-II/bounce/<br />
</div></center><br />
<br />
==== Starting Gtk-bounce ====<br />
The application is somewhat hardwired for specific demo parameters, but most of the hardwired settings can be found at the top of each source file. As found in SVN, the application expects an ALSA hardware audio device at hw:1, and if none is found, it will wait for one to appear. Once a sound device is successfully initialized, it expects to find and open two pipes named pipe0 and pipe1 for output in the current directory. In the video, the waveform and spectrum applications are started to take input from pipe0 and pipe1 respectively. The output sent to the two pipes is identical, and in most demos matches the output data sent to the hardware device for conversion to analog. The only exception is the tenth demo panel (which does not appear in the video) where gtk-bounce can be set to monitor the hardware inputs instead while the outputs are used to produce test waveforms.<br />
<br />
Assuming gtk-bounce, spectrum and waveform have been checked out and built, the configuration seen in the video can be started using the following commands:<br />
<center><div style="background-color:#DDDDDD;border-color:#CCCCCC;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
* make the pipe fifos for the applications to communicate (only needs to be done once)<br />
mkfifo pipe0 pipe1<br />
* start all three applications<br />
waveform pipe0 & spectrum pipe1 & gtk-bounce &<br />
</div></center><br />
<br />
==== Using Gtk-bounce ====<br />
<br />
Gtk-bounce consists of eleven pushbutton panels (numbered zero through ten) that can be selected by scrolling up and down with the arrow buttons on the right side. Each panel is intended for a specific demo or part of a demo.<br />
<br />
<center><div style="background-color:#DDDDDD;border-color:#CCCCCC;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
<br />
[[Image:Dsat-panel0.png|700px|center]]<br />
* '''Panel 0''': This panel presents buttons that allow the sound card to be configured in several sampling rates and bit depths. Samples read from the audio inputs are sent to the output pipes and audio outputs for playback without modification.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel1.png|700px|center]]<br />
* '''Panel 1''': Both channels are forwarded to the outputs; however, the user may select the bit depth of each channel independently. When the sound card is running in 16-bit mode and 16-bit depth is selected, the data is untouched. Requantization to a lower bit depth is performed with a flat triangle dither.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel2.png|700px|center]]<br />
* '''Panel 2''': Both channels are re-quantized to the selected bit depth. Requantization to a lower bit depth is performed with a flat triangle dither.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel3.png|700px|center]]<br />
* '''Panel 3''': 'generate sine wave' discards the audio inputs and instead internally generates a sine wave at 32 bit precision, which is then quantized to the selected bit depth, optionally with dither. The resulting signal is then forwarded to the output. <br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel4.png|700px|center]]<br />
* '''Panel 4''': gtk-bounce generates a 16-bit sine wave of the selected amplitude, optionally with dither, and forwards the resulting signal to the outputs. The audio input from the audio device is discarded. Note that the slider sets the peak amplitude, not the peak-to-peak amplitude.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel5.png|700px|center]]<br />
* '''Panel 5''': generates a 16-bit sine wave, optionally quantized using dither. The user may additionally select a flat or a shaped dither. The 'notch and gain' button applies a notch filter to the resulting signal, and boosts the gain of the remaining noise so that it's easily audible. The audio input from the audio device is discarded.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel6.png|700px|center]]<br />
* '''Panel 6''': allows the user to play with the power of the dithering noise applied before quantizing the sine wave. Shaped or flat dither are available. The sine wave may also be modulated with a varying amplitude to highlight correlations between the input and the resulting quantization noise. The 'notch and gain' button applies a notch filter to the resulting signal, and boosts the gain of the remaining noise so that it's easily audible. The audio input from the audio device is discarded.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel7.png|700px|center]]<br />
* '''Panel 7''': applies a sharper antialiasing (lowpass) filter than is likely to be built into the sound-card hardware (as there's generally no reason to use a filter quite this sharp in practice). The very sharp filter allows us to bandlimit the demonstration square wave without any harmonics landing in the transition band. The input is read from the audio device, passed through this sharper filter, and then forwarded to the outputs.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel8.png|700px|center]]<br />
* '''Panel 8''': when selected, generates a synthetic 'square wave' (not quite equivalent to a bandlimited analog square wave; the harmonic amplitudes are a bit different) that, when aligned just right with the sampling phase, gives the appearance of infinitely fast rise and fall times. The slider allows us to shift the waveform's sample alignment back and forth by +/- one sample to reveal that the underlying signal is still band-limited.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel9.png|700px|center]]<br />
* '''Panel 9''': as in panel 8, generates a 'perfect' synthetic 'square wave'. However, the slider now allows us to shift the sample alignment of the second channel with respect to the first, instead of shifting both channels. This allows us to trigger/lock the scope timing to the channel 1 waveform so we can see the fractional-sample movement and alignment of the waveform on channel 2. The audio input from the audio device is discarded.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel10.png|700px|center]]<br />
* '''Panel 10''': not used in the video. The audio device is configured for 24-bit input/output. The user may produce one of a range of test signals that are output to both the external applications and the audio device on the first channel. The input on the second channel is passed through to the applications and audio device outputs unchanged. The first channel input is unused unless 'two input mode' is selected; in that mode, both input channels are read and their data sent to the external applications, while generated test signals are sent only to the audio hardware (on the first channel). This combination of test signals and input modes allows self-referenced frequency response, phase, noise, distortion and crosstalk testing of a given audio device.<br />
<br />
<div style="clear: both">&nbsp;</div><br />
<br />
</div></center><br />
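The requantization performed by panels 1 through 3 can be sketched in a few lines. This is an illustrative reimplementation in Python, not the actual gtk-bounce code; the sample rate, amplitude, and seed are arbitrary choices, but the flat triangular-PDF (TPDF) dither of +/- one step matches the 'flat triangle dither' described above.

```python
import numpy as np

def requantize(x, bits, dither=True, seed=0):
    """Requantize float samples in [-1, 1) to the given bit depth,
    optionally adding a flat triangular-PDF (TPDF) dither of +/- 1 step."""
    rng = np.random.default_rng(seed)
    step = 2.0 / (1 << bits)              # size of one quantization step
    d = 0.0
    if dither:
        # sum of two uniform variables -> triangular PDF spanning +/- 1 step
        d = rng.uniform(-0.5, 0.5, x.shape) + rng.uniform(-0.5, 0.5, x.shape)
    q = np.round(x / step + d) * step
    return np.clip(q, -1.0, 1.0 - step)

# a 1 kHz sine at 48 kHz, 0.9 of full scale, requantized to 8 bits
n = np.arange(4800)
x = 0.9 * np.sin(2 * np.pi * 1000 * n / 48000)
y = requantize(x, 8)
```

With dither enabled the output never strays more than about 1.5 steps from the input, and at most 2^bits distinct output values can occur; without dither the error stays within half a step but becomes correlated with the signal, which is exactly what the dither demos later in the video are about.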
<br />
=== Cairo Animations ===<br />
<br />
The animations featured throughout the Episode 2 video were rapid-development spaghetti hack-jobs coded by hand in raw Cairo. Each module generated a series of PNG stills that were then stitched into an animation with Cinelerra or mplayer. In the interest of pointing and laughing at what really bad code looks like...<br />
<br />
<center><div style="background-color:#DDDDDD;border-color:#CCCCCC;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
*Source for the Cairo animations is found at:<br />
https://svn.xiph.org/trunk/Xiph-episode-II/cairo/<br />
*The source can be checked out of svn using the following command line:<br />
svn co https://svn.xiph.org/trunk/Xiph-episode-II/cairo/<br />
*Trac is a convenient way to browse the source without checking out a copy:<br />
https://trac.xiph.org/browser/trunk/Xiph-episode-II/cairo/<br />
</div></center></div>Lee Carréhttps://wiki.xiph.org/index.php?title=Videos/Digital_Show_and_Tell&diff=14210Videos/Digital Show and Tell2013-07-02T16:08:47Z<p>Lee Carré: /* Epilogue */ replaced 404-returning MIT OCW link, wiki-link underscores to spaces, minor copyediting</p>
<hr />
<div>[[Image:Xiph_ep02_test.png|400px|right]]<br />
<br />
&ldquo;Hi, I'm Monty Montgomery from [http://www.redhat.com/ Red Hat] and [http://xiph.org/ Xiph.Org].<br />
<br />
&ldquo;A few months ago, I wrote<br />
[http://people.xiph.org/~xiphmont/demo/neil-young.html an article on digital audio and why 24bit/192kHz music downloads don't make sense].<br />
In the article, I<br />
mentioned--almost in passing--that a digital waveform is<br />
[http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_sfam not a stairstep],<br />
and you certainly don't get a stairstep when you<br />
[[WikiPedia:Digital-to-analog_converter|convert from digital back to analog]].<br />
<br />
&ldquo;Of everything in the entire article, '''that''' was the number one thing<br />
people wrote about. In fact, more than half the mail I got was questions and<br />
comments about basic digital signal behavior. Since there's interest, let's<br />
take a little time to play with some ''simple'' digital signals. &rdquo;<br />
<br />
==Veritas ex machina==<br />
[[Image:Dsat_002.jpg|200px|right]]<br />
[[Image:Dsat_003.jpg|200px|right]]<br />
[[Image:Dsat_004.jpg|200px|right]]<br />
[[Image:Dsat_005.jpg|200px|right]]<br />
<br />
If we pretend for a moment that we have no idea how digital signals really<br />
behave, then it doesn't make sense for us to use digital test<br />
equipment. Fortunately for this exercise, there's still plenty<br />
of working analog lab equipment out there.<br />
<br />
We need a [[WikiPedia:Function_generator|signal generator]] to provide us with analog input<br />
signals--in this case, an<br />
[http://www.home.agilent.com/en/pd-3325A%3Aepsg%3Apro-pn-3325A/synthesizer-function-generator?pm=PL&nid=-536900197.536896863&cc=SE&lc=swe HP3325]<br />
from 1978.<br />
<br />
We'll observe our analog waveforms on [[WikiPedia:Oscilloscope_types#Cathode-ray_oscilloscope_.28CRO.29|analog oscilloscopes]],<br />
like this Tektronix 2246 from the mid-90s, one of the last and best analog scopes made.<br />
<br />
Finally, we'll inspect the [[WikiPedia:Spectral_density#Electrical_engineering|frequency spectrum]] of our signals using an<br />
[[WikiPedia:Spectrum_analyzer#Swept-tuned|analog spectrum analyzer]], this<br />
[http://www.home.agilent.com/en/pd-3585A%3Aepsg%3Apro-pn-3585A/spectrum-analyzer-high-perf-20hz-40mhz?pm=PL&nid=-536900197.536897319&cc=SE&lc=swe HP3585]<br />
from the same product line as<br />
the signal generator. Like the other equipment here it has<br />
[http://www.hp9845.net/9845/hardware/processors/ a rudimentary and hilariously large microcontroller],<br />
but the signal path<br />
from input to what you see on the screen is completely analog.<br />
<br />
All of this equipment is vintage, but the specs are still quite good.<br />
We start with the signal generator set to output a 1 [[WikiPedia:Hertz#SI_multiples|kHz]]<br />
sine wave at one [[WikiPedia:Volt|Volt]] [[WikiPedia:Amplitude#Root_mean_square_amplitude|RMS]].<br />
We see the sine wave on the oscilloscope and can verify that it is indeed<br />
1 kHz at 1 Volt RMS, which is 2.8 Volts<br />
[[WikiPedia:Amplitude#Peak-to-peak_amplitude|peak-to-peak]],<br />
and that matches the<br />
measurement on the spectrum analyzer as well.<br />
<br />
The analyzer also shows some low-level [[WikiPedia:White_noise|white noise]]<br />
and just a bit of [[WikiPedia:Harmonic_distortion#Harmonic_distortion|harmonic distortion]],<br />
with the highest peak about 70[[WikiPedia:Decibel|dB]] or so below<br />
[[WikiPedia:Fundamental_frequency|the fundamental]].<br />
This doesn't matter to the demos, but it's good to take notice of it now to avoid confusion later.<br />
<br />
For digital conversion, we use a boring, consumer-grade, eMagic USB1<br />
audio device. It's more than ten years old at this point, and it's<br />
getting obsolete.<br />
<br />
A recent converter can easily have an order of magnitude better specs.<br />
[[WikiPedia:Reconstruction_filter#Sampled_data_reconstruction_filters|Flatness]],<br />
[[WikiPedia:Analog-to-digital_converter#Non-linearity|linearity]],<br />
[[WikiPedia:Jitter#Sampling_jitter|jitter]],<br />
[[WikiPedia:Noise_floor|noise behavior]],<br />
[[WikiPedia:Digital-to-analog_converter#DAC_performance|everything]]...<br />
Not that you'd necessarily notice: just because we can measure an improvement doesn't mean we can hear it, and even these old consumer boxes were already at the edge of ideal transparency.<br />
<br />
The eMagic connects to my ThinkPad, which displays a digital<br />
waveform and spectrum for comparison, then the ThinkPad<br />
sends the digital signal right back out to the eMagic for<br />
re-conversion to analog and observation on the output scopes.<br />
<br />
<br style="clear:both;"/><br />
<br />
==Stairsteps==<br />
[[Image:Dsat 007.png|360px|right]]<br />
[[Image:Dsat 006.jpg|360px|right]]<br />
First demo: We begin by converting an analog signal to digital and<br />
then right back to analog again with no other steps.<br />
<br />
The signal generator is set to produce a 1kHz sine wave just like<br />
before and we can see the analog sine wave on the input-side oscilloscope. The eMagic digitizes our signal to<br />
[[Videos/A_Digital_Media_Primer_For_Geeks#Raw_.28digital_audio.29_meat|16 bit PCM at 44.1kHz]],<br />
same as on a CD. The spectrum of the digitized signal on the Thinkpad matches what we saw earlier and what we see now on the analog spectrum analyzer, aside from its <br />
[[WikiPedia:High_impedance|high-impedance input]] being just a smidge noisier. For now, the waveform display shows our digitized sine wave as a<br />
stairstep pattern, one step for each sample.<br />
<br />
When we look at the output signal that's been converted<br />
from digital back to analog, we see that it's exactly like the original sine wave. No stairsteps.<br />
<br />
1 kHz is still a fairly low frequency, so perhaps the stairsteps are just<br />
hard to see or they're being smoothed away. Next, we set the signal generator to 15kHz, which is much closer to [[WikiPedia:Nyquist_frequency|Nyquist]].<br />
Now the sine wave is represented by fewer than three samples per cycle, and the digital waveform appears rather poor! Yet the analog output is still a perfect sine wave, exactly like the original.<br />
As we keep increasing frequency, all the way to 20kHz, the output waveform is still perfect. No jagged edges, no dropoff, no stairsteps.<br />
<br />
So where'd the stairsteps go? It's a trick question; they were never there. Drawing a digital waveform as a stairstep was wrong to begin with.<br />
<br />
A stairstep is a continuous-time function. It's jagged, and it's piecewise, but it has a defined value at every point in time.<br />
A sampled signal is entirely different. It's discrete-time; it has a value only at each instantaneous sample point and is<br />
undefined (there is no value at all) everywhere in between. A discrete-time signal is properly drawn as a lollipop graph.<br />
The continuous, analog counterpart of a digital signal passes smoothly through each sample point, and that's just as true for high<br />
frequencies as it is for low.<br />
<br />
[[Image:Dsat 008.png|360px|right]]<br />
The interesting and non-obvious bit is that [[WikiPedia:Nyquist%E2%80%93Shannon_sampling_theorem|there's only one<br />
bandlimited signal that passes exactly through each sample point]]; it's a unique solution. If you sample a bandlimited signal and then convert it back, the original input is also the only possible output.<br />
A signal that differs even minutely from the original includes frequency content at or beyond Nyquist, breaks the bandlimiting requirement and isn't a valid solution.<br />
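The uniqueness claim is easy to check numerically. A sketch using NumPy (the frequency and block length are arbitrary illustrative choices): sample a sine at fewer than four samples per cycle, then reconstruct values between the samples with the Whittaker-Shannon interpolation formula, a sum of sinc functions. The result lands back on the original sine, not on a stairstep. The truncated sum only approximates the infinite one, so we evaluate deep in the interior of the block where the truncation costs almost nothing.

```python
import numpy as np

f = 0.31                       # frequency in cycles per sample: under Nyquist (0.5)
n = np.arange(4096)
x = np.sin(2 * np.pi * f * n)  # the samples: fewer than four per cycle

def reconstruct(t):
    """Whittaker-Shannon interpolation: the unique bandlimited signal
    through the sample points (truncated to the samples we have)."""
    return np.sum(x * np.sinc(t - n))  # np.sinc(u) = sin(pi*u)/(pi*u)

# evaluate halfway between samples, far from the ends of the block
errs = [abs(reconstruct(t) - np.sin(2 * np.pi * f * t))
        for t in 2048.5 + np.arange(8.0)]
```

The reconstructed values match the original continuous sine to well within a percent even exactly between sample points, where a stairstep or connect-the-dots picture would predict something quite different.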
<br />
So how did everyone get confused and start thinking of digital signals as stairsteps? I can think of two good reasons.<br />
<br />
First: it's easy to convert a sampled signal to a true stairstep. Just<br />
extend each sample value forward until the next sample period. This is<br />
called a [[WikiPedia:Zero-order hold|zero-order hold]], and it's an important part of how some<br />
digital-to-analog converters work, especially the simplest ones.<br />
As a result, anyone who looks up [[WikiPedia:Digital-to-analog_converter#Practical_operation|digital-to-analog converter or<br />
digital-to-analog conversion]] is probably going to see a diagram of a<br />
stairstep waveform somewhere, but that's not a finished conversion,<br />
and it's not the signal that comes out.<br />
<br />
Second, and this is probably the more likely reason, engineers who<br />
supposedly know better (yes, even I) draw stairsteps even though they're<br />
technically wrong. It's a sort of one-dimensional version of<br />
[[WikiPedia:MacPaint#Development|fat bits in an image editor]].<br />
Pixels aren't squares either, they're samples of a 2-dimensional<br />
function space and so they're also, conceptually, infinitely small<br />
points. Practically, it's a real pain in the ass to see or manipulate<br />
infinitely small anything, so big squares it is.<br />
<br />
Digital stairstep drawings are exactly the same thing. It's just a convenient drawing. The stairsteps aren't really there.<br />
<div style="clear:both"></div><br />
<br />
==Bit-depth==<br />
[[Image:Dsat_009.jpg|360px|right]]<br />
[[Image:Dsat_010.jpg|260px|right]]<br />
<br />
When we convert a digital signal back to analog, the result is<br />
''also'' smooth regardless of the [[WikiPedia:Audio_bit_depth|bit depth]]. It doesn't matter if it's 24 bits or 16 bits or 8 bits.<br />
So does that mean that the digital bit depth makes no difference at<br />
all? Of course not.<br />
<br />
Channel 2 is the same sine wave input, but we quantize it with<br />
[[WikiPedia:Dither|dither]] down to 8 bits.<br />
On the scope, we still see a nice<br />
smooth sine wave on channel 2. Look very close, and you'll also see a<br />
bit more noise. That's a clue.<br />
<br />
If we look at the spectrum of the signal, our sine wave is<br />
still there unaffected, but the noise level of the 8-bit signal on<br />
the second channel is much higher. And that's the difference, the only difference, the number of bits makes.<br />
<br />
When we digitize a signal, first we sample it. The<br />
sampling step is perfect; it loses nothing. But then we [[WikiPedia:Quantization_(sound_processing)|quantize]] it,<br />
and [[WikiPedia:Quantization_error|quantization adds noise]].<br />
The number of bits determines how much noise and so the level of the<br />
noise floor.<br />
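The relationship between bit depth and noise floor is simple arithmetic: each bit of depth is worth about 6 dB of signal-to-noise ratio. For a full-scale sine against flat quantization noise the textbook figure is SNR &asymp; 6.02&middot;N + 1.76 dB. A quick check in Python (these are the standard idealized figures, not measurements of the demo hardware):

```python
def snr_db(bits):
    """Textbook SNR of an ideally dithered quantizer driven by a
    full-scale sine: about 6.02 dB per bit plus a 1.76 dB constant."""
    return 6.02 * bits + 1.76

# 8 bits ~ 50 dB, 16-bit CD audio ~ 98 dB, 24 bits ~ 146 dB
table = {bits: round(snr_db(bits), 1) for bits in (8, 16, 24)}

def equivalent_bits(snr):
    """Invert the formula to express an analog noise floor in bits."""
    return (snr - 1.76) / 6.02
```

Running the inverse on a hypothetical tape with a 55 dB signal-to-noise ratio gives roughly 9 bits, which is the same scale as the cassette figures discussed below.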
<br />
What does this dithered quantization noise sound like?<br />
Those of you who have used analog recording equipment might think to yourselves, "My goodness! That sounds like tape hiss!"<br />
Well, it doesn't just sound like tape hiss, it acts like it too, and<br />
if we use a [[WikiPedia:Dither#Different_types|gaussian dither]] then it's<br />
[[WikiPedia:Central_limit_theorem|mathematically equivalent]] in every way. It ''is'' tape hiss.<br />
<br />
Intuitively, that means that we can measure tape hiss and thus the noise floor<br />
of [[WikiPedia:Magnetic_tape_sound_recording|magnetic audio tape]]<br />
in [[WikiPedia:Shannon–Hartley_theorem#Examples|bits instead of decibels]], in order to put things in a<br />
digital perspective. [[WikiPedia:Compact cassettes|Compact cassettes]], for those of you who are old enough to remember them, could reach as<br />
deep as 9 bits in perfect conditions. 5 to 6 bits was<br />
more typical, especially if it was a recording made on a<br />
[[WikiPedia:Cassette_deck|tape deck]]. That's right; your old mix tapes were only about 6 bits<br />
deep if you were lucky!<br />
<br />
The very best professional [[WikiPedia:Reel-to-reel_audio_tape_recording|open reel tape]] used in studios could barely<br />
hit 13 bits ''with'' [[WikiPedia:Reel-to-reel_audio_tape_recording#Noise_reduction|advanced noise reduction]]. <br />
That's why seeing '[[WikiPedia:SPARS_code|D D D]]' on a [[WikiPedia:Compact_disk|Compact Disc]] used to be such a big,<br />
high-end deal.<br />
<br />
<div style="clear:both"></div><br />
<br />
==Dither==<br />
[[Image:Dsat_011.png|360px|right]]<br />
<br />
We've been quantizing with [[Wikipedia:dither|dither]]. What is dither<br />
exactly and, more importantly, what does it do?<br />
<br />
The simplest way to quantize a signal is to choose the digital<br />
amplitude value [[WikiPedia:Rounding|closest to the original analog amplitude]].<br />
Unfortunately, the exact noise that results from this simple<br />
quantization scheme depends somewhat on the input signal.<br />
It may be inconsistent, cause distortion, or be<br />
undesirable in some other way.<br />
<br style="clear:both;"/><br />
<br />
<center><div style="background-color:#DDDDFF;border-color:#CCCCDD;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
'''Going deeper…'''<br />
*Cameron Nicklaus Christou's thesis [http://uwspace.uwaterloo.ca/bitstream/10012/3867/1/thesis.pdf Optimal Dither and Noise Shaping in Image Processing] provides an ''excellent'' explanation of dither and noise shaping.<br />
</div>&nbsp;</center><br />
<br />
Dither is specially-constructed noise that substitutes for the noise<br />
produced by simple quantization. Dither doesn't [[WikiPedia:Sound_masking|drown out or mask]]<br />
quantization noise, it replaces it with noise characteristics<br />
of our choosing that aren't influenced by the input.<br />
<br />
The signal generator has too much noise for this test, so we produce a mathematically perfect sine wave with the ThinkPad and quantize it to 8 bits with dithering.<br />
We see the sine wave on the waveform display and output scope, and <br />
a clean frequency peak with a uniform noise floor on both spectral displays<br />
just like before. Again, this is with dither.<br />
<br />
Now I turn dithering off.<br />
<br />
The quantization noise that dither had spread out into a nice, flat noise<br />
floor, piles up into harmonic distortion peaks. The noise floor is<br />
lower, but the level of distortion becomes nonzero, and the distortion<br />
peaks sit higher than the dithering noise did.<br />
<br />
At 8 bits this effect is exaggerated. At 16 bits,<br />
even without dither, harmonic distortion is going to be so low as to<br />
be completely inaudible. Still, we can use dither to eliminate it completely if we so choose.<br />
<br />
Turning the dither off again for a moment, you'll notice that the<br />
absolute level of distortion from undithered quantization stays<br />
approximately constant regardless of the input amplitude.<br />
But when the signal level drops below half a bit, everything<br />
quantizes to zero.<br />
<br />
In a sense, everything quantizing to zero is just 100% distortion!<br />
Dither eliminates this distortion too. When we reenable dither, we clearly see our signal at 1/4 bit with a nice flat noise floor.<br />
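The 'signal at 1/4 bit' result is easy to reproduce numerically. A sketch, working in units of one quantization step (sample rate and seed are arbitrary choices): without dither, a sine with a peak of a quarter step rounds to zero everywhere; with flat TPDF dither, the rounded output still contains the sine, recoverable by correlating against it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = np.arange(48000)
ref = np.sin(2 * np.pi * 1000 * n / 48000)  # 1 kHz at 48 kHz
x = 0.25 * ref                              # peak amplitude: 1/4 of one step

undithered = np.round(x)                    # every sample rounds to zero

tpdf = rng.uniform(-0.5, 0.5, n.size) + rng.uniform(-0.5, 0.5, n.size)
dithered = np.round(x + tpdf)               # quantize with flat triangular dither

# least-squares amplitude of the 1 kHz component in each output
amp_undithered = np.dot(undithered, ref) / np.dot(ref, ref)
amp_dithered = np.dot(dithered, ref) / np.dot(ref, ref)
```

The undithered output is silence; the dithered output's 1 kHz component comes back at almost exactly a quarter step, sitting on a flat noise floor, just as on the spectrum analyzer in the demo.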
<br />
The noise floor doesn't have to be flat. Dither is noise of our<br />
choosing, so it makes sense to choose a noise as [http://www.acoustics.salford.ac.uk/res/cox/sound_quality/?content=subjective inoffensive] and<br />
[[WikiPedia:Absolute_threshold_of_hearing|difficult to notice]]<br />
as possible.<br />
<br />
Human hearing is [[WikiPedia:Equal-loudness_contour|most sensitive in the midrange from 2kHz to 4kHz]]; that's where background noise is going to be the most obvious.<br />
We can [[WikiPedia:Noise_shaping|shape dithering noise]] away from sensitive frequencies to where<br />
hearing is less sensitive, usually the highest frequencies.<br />
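A minimal sketch of the mechanism, using first-order error feedback (far simpler than the psychoacoustically shaped dither in the demo, and all names here are illustrative): the previous sample's quantization error is subtracted from the next input, which turns the total error into a first difference and pushes its energy toward high frequencies.

```python
import numpy as np

def quantize_shaped(x, seed=2):
    """Round to integers with TPDF dither and first-order error feedback.
    The total error (out - x) becomes e[i] - e[i-1]: a first difference,
    i.e. highpass-shaped noise."""
    rng = np.random.default_rng(seed)
    out = np.empty_like(x)
    e = 0.0
    for i, v in enumerate(x):
        c = v - e                                # feed back the last error
        d = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        out[i] = np.round(c + d)
        e = out[i] - c                           # error of this step
    return out

x = 100.0 * np.sin(2 * np.pi * np.arange(8192) * 0.013)
noise = quantize_shaped(x) - x
spectrum = np.abs(np.fft.rfft(noise)) ** 2
low = spectrum[: len(spectrum) // 2].sum()       # lower half of the band
high = spectrum[len(spectrum) // 2 :].sum()      # upper half of the band
```

The shaped noise carries several times more energy in the upper half of the band than the lower half; a real mastering-grade shaper uses a higher-order filter matched to the ear's sensitivity curve rather than this simple first difference.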
<br />
Lastly, dithered quantization noise ''is'' higher [[WikiPedia:Sound_power|power]] overall<br />
than undithered quantization noise, even though it often sounds quieter, and<br />
you can see that on a [[WikiPedia:VU_meter|VU meter]] during passages of near-silence. However,<br />
dither isn't only an on or off choice. We can reduce the dither's<br />
power to balance less noise against a bit of distortion to minimize<br />
the overall effect.<br />
<br />
For the next test, we also [[WikiPedia:Amplitude_modulation|modulate the input signal]] like this to show how a varying input affects the quantization noise. At<br />
full dithering power, the noise is uniform, constant, and featureless<br />
just like we expect.<br />
<br />
As we reduce the dither's power, the input increasingly<br />
affects the amplitude and the character of the quantization noise.<br />
Shaped dither behaves similarly, but noise shaping lends one more nice<br />
advantage: it can use a somewhat lower<br />
dither power before the input has as much effect on the output.<br />
<br />
Despite all this text spent on dither, the differences exist 100 decibels or more below [[WikiPedia:Full_scale|full scale]]. If the CD had been<br />
[http://www.research.philips.com/technologies/projects/cd/index.html 14 bits as originally designed],<br />
perhaps dither ''might'' be<br />
more important. At 16 bits it's mostly a wash. It's reasonable to treat<br />
dither as an insurance policy that gives several extra<br />
decibels of dynamic range just in case. That said, no<br />
one ever ruined a great recording by not dithering the final master.<br />
<br />
==Bandlimitation and timing==<br />
[[image:Dsat_016.jpg|360px|right]]<br />
<br />
We've been using [[WikiPedia:Sine_wave|sine waves]]. They're the obvious choice when what we<br />
want to see is a system's behavior at a given isolated frequency. Now let's look at something a bit more complex. What should we expect to happen when I change the input to a [[WikiPedia:Square_wave|square wave]]? <br />
<br />
The input scope confirms a 1kHz square wave. The output scope shows... exactly what it should.<br />
<br />
What is a square wave really? <br />
<br />
We can say it's a waveform that's some positive value for half a cycle and then transitions instantaneously to a negative value for the other half.<br />
<br />
:<math><br />
\ squarewave(t) = \begin{cases} 1, & 0 \le t < {1 \over 2}T \\ -1, & {1 \over 2}T \le t < T \end{cases}<br />
</math><br />
<br />
But that doesn't really tell us anything useful about how that input becomes this output.<br />
<br />
We remember that any waveform is also [[WikiPedia:Fourier_series|the sum of discrete frequencies]],<br />
and a square wave is a particularly simple sum: a fundamental and an infinite series of [[WikiPedia:Even_and_odd_functions#Harmonics|odd harmonics]].<br />
<br />
[[image:Dsat_013.jpg|360px|right]]<br />
:<math>\begin{align}<br />
\ squarewave(t) = \frac{4}{\pi}\sin(\omega t) + \frac{4}{3\pi}\sin(3\omega t) + \frac{4}{5\pi}\sin(5\omega t) + \\<br />
\frac{4}{7\pi}\sin(7\omega t) + \frac{4}{9\pi}\sin(9\omega t) + \frac{4}{11\pi}\sin(11\omega t) + \\ <br />
\frac{4}{13\pi}\sin(13\omega t) + \frac{4}{15\pi}\sin(15\omega t) + \frac{4}{17\pi}\sin(17\omega t) + \\<br />
\frac{4}{19\pi}\sin(19\omega t) + \frac{4}{21\pi}\sin(21\omega t) + \frac{4}{23\pi}\sin(23\omega t) + \\<br />
\frac{4}{25\pi}\sin(25\omega t) + \frac{4}{27\pi}\sin(27\omega t) + \frac{4}{29\pi}\sin(29\omega t) + \\<br />
\frac{4}{31\pi}\sin(31\omega t) + \frac{4}{33\pi}\sin(33\omega t) + \cdots <br />
\end{align}</math><br />
<br />
At first glance, that doesn't seem very useful either; you'd have to sum an infinite number of harmonics to get the answer! However, we don't have an infinite number of harmonics.<br />
<br />
We're using a quite sharp [[WikiPedia:Low-pass_filter|anti-aliasing filter]] that cuts off right<br />
above 20kHz, so our signal is [[WikiPedia:Bandlimiting|bandlimited]]. Only the first ten terms make it through, and that's exactly what we see on the output scope.<br />
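We can verify this by summing just those first ten terms ourselves. A sketch in Python (a 1 kHz fundamental bandlimited to 20 kHz keeps the odd harmonics 1 through 19): the partial sum sits close to &plusmn;1 on the flats, with the Gibbs ripple peaking roughly 18% above the step.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 20000, endpoint=False)  # one cycle, unit period
s = np.zeros_like(t)
for k in range(1, 20, 2):                         # odd harmonics 1..19: ten terms
    s += (4.0 / (k * np.pi)) * np.sin(2.0 * np.pi * k * t)

mid_plateau = s[len(t) // 4]   # quarter period: middle of the 'high' flat
overshoot = s.max()            # Gibbs peak next to each transition
```

Plotting `s` reproduces the rippled square wave seen on the output scope; the ripples are not an artifact of the plotting, they are the bandlimited waveform itself.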
<br />
<br style="clear:both;"/><br />
<center><div style="background-color:#DDDDFF;border-color:#CCCCDD;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
'''Going deeper…'''<br />
*In modern web browsers you can program audio synthesizers directly in javascript. Use the two square wave formulas to get a square wave out of [http://js.do/blog/sound-waves-with-javascript/ this page]. (Note: The scope is not very accurate/useful.) <br />
</div></center><br />
<div>&nbsp;</div><br />
<br />
[[Image:dsat_015.jpg|360px|right]]<br />
The rippling you see around sharp edges in a bandlimited signal is called the [[WikiPedia:Gibbs phenomenon|Gibbs effect]]. It happens whenever you slice off part of the frequency domain in the middle of nonzero energy.<br />
<br />
The usual rule of thumb you'll hear is "the sharper the cutoff, the<br />
stronger the rippling", which is approximately true, but we have to be<br />
careful how we think about it. For example, what would you expect our quite sharp anti-aliasing filter<br />
to do if I run our signal through it a second time?<br />
<br />
Aside from adding a few fractional cycles of delay, the answer is:<br />
Nothing at all. The signal is already bandlimited. Bandlimiting it<br />
again doesn't do anything. A second pass can't remove frequencies<br />
that we already removed.<br />
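This idempotence is easy to demonstrate with an FFT brickwall filter standing in for the anti-aliasing filter (a sketch; the bin counts and cutoff are arbitrary, and an FFT zeroing filter is only exactly ideal for a periodic signal like this one): filtering a naive square wave once produces the Gibbs ripples, and filtering the result again changes nothing.

```python
import numpy as np

def brickwall(x, keep_bins):
    """Zero every frequency bin at or above keep_bins: an ideal
    (brickwall) lowpass for a periodic signal."""
    X = np.fft.rfft(x)
    X[keep_bins:] = 0.0
    return np.fft.irfft(X, n=len(x))

naive_square = np.sign(np.sin(2 * np.pi * np.arange(1024) * 8 / 1024))
once = brickwall(naive_square, 100)    # bandlimited: ripples appear
twice = brickwall(once, 100)           # already bandlimited: nothing changes
```

The first pass genuinely alters the signal; the second pass returns it bit-for-bit (up to float rounding), because there is nothing left above the cutoff to remove.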
<br />
That's important. People tend to think of the ripples as<br />
a kind of [[WikiPedia:Sonic_artifact|artifact]] that's added by anti-aliasing and [[WikiPedia:Reconstruction_filter|anti-imaging]]<br />
filters, implying that the ripples get worse each time the signal<br />
passes through. We see that in this case that didn't happen, so<br />
it wasn't really the filter that added the ripples the first time<br />
through. It's a subtle distinction, but Gibbs effect<br />
ripples aren't added by filters, they're just part of what a<br />
bandlimited signal ''is''.<br />
<br />
Even if we synthetically construct what looks like a perfect digital<br />
square wave it's still limited to the channel bandwidth. Remember that<br />
the stairstep representation is misleading. What we really have here are instantaneous sample points<br />
and only one bandlimited signal fits those points. All we did when we<br />
drew our apparently perfect square wave was line up the sample points<br />
just right so it appeared that there were no ripples if we played<br />
[[WikiPedia:Interpolation|connect-the-dots]]. The original bandlimited signal, complete with ripples, was<br />
still there.<br />
<br />
[[image:Dsat_014.gif|360px|right]]<br />
That leads us to one more important point. You've probably heard<br />
that the timing precision of a digital signal is limited by its sample<br />
rate; put another way, that digital signals can't represent anything that falls between the<br />
samples, implying that [[WikiPedia:Dirac_delta_function|impulses]] or<br />
[[WikiPedia:Synthesizer#ADSR_envelope|fast attacks]] have to align exactly<br />
with a sample, or the timing gets mangled or they just disappear.<br />
At this point, we can easily see why that's wrong.<br />
<br />
Again, our input signals are bandlimited. And digital signals are<br />
samples, not stairsteps, not 'connect-the-dots'. We most certainly<br />
can, for example, put the rising edge of our bandlimited square wave<br />
anywhere we want between samples.<br />
<br />
It's represented perfectly and it's reconstructed perfectly.<br />
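The sub-sample claim can be checked numerically too. A sketch (period, harmonic count, and offset are illustrative choices): synthesize a bandlimited square wave whose rising edge sits 0.3 samples after sample zero, recover the continuous waveform by FFT upsampling (exact here, because the signal is periodic and bandlimited), and locate the zero crossing. It lands at 0.3 samples, nowhere near a sample point.

```python
import numpy as np

period = 64                      # samples per cycle
offset = 0.3                     # put the rising edge 0.3 samples after n = 0
n = np.arange(period)
x = np.zeros(period)
for k in range(1, 16, 2):        # odd harmonics, all below Nyquist (k < 32)
    x += (4.0 / (k * np.pi)) * np.sin(2 * np.pi * k * (n - offset) / period)

# upsample 16x by zero-padding the spectrum (exact for this signal)
up = 16
X = np.fft.rfft(x)
Y = np.zeros(period * up // 2 + 1, dtype=complex)
Y[: len(X)] = X
fine = np.fft.irfft(Y, n=period * up) * up

# find the first rising zero crossing and refine it by linear interpolation
i = int(np.argmax(fine > 0))
frac = -fine[i - 1] / (fine[i] - fine[i - 1])
crossing = (i - 1 + frac) / up   # in units of the original samples
```

The recovered edge position matches the 0.3-sample offset we asked for: the samples themselves land nowhere special, yet the bandlimited waveform they define places the edge exactly where we put it.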
<br />
==Epilogue==<br />
<br />
[[Image:Moffey.jpg|360px|right]]<br />
<br />
Like in [[Videos/A Digital Media Primer For Geeks|A Digital Media Primer for Geeks]], we've covered a broad range of topics, and yet barely scratched the surface of each one. If anything, my sins of omission are greater this time around.<br />
<br />
Thus I encourage you to dig deeper and experiment. I chose my demos carefully to be simple and give clear results. You can reproduce every one of them on your own if you like, but let's face it: sometimes we learn the most about a spiffy toy by breaking it open and studying all the pieces that fall out. Play with the demo parameters, hack up the code, set up alternate experiments. The source code for everything, including the little pushbutton demo application, is at the [[Videos/Digital Show and Tell#Use The Source Luke|bottom of this page]].<br />
<br />
In the course of experimentation, you're likely to run into something that you didn't expect and can't explain. Don't worry! My earlier snark aside, Wikipedia is fantastic for exactly this kind of casual research. If you're really serious about understanding signals, several universities have advanced materials online, such as the [http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-003-signals-and-systems-fall-2011/ 6.003] and [http://ocw.mit.edu/resources/res-6-007-signals-and-systems-spring-2011/ RES.6-007] Signals and Systems modules at MIT OpenCourseWare. And, of course, there's always the [http://webchat.freenode.net/?channels=xiph community here at Xiph.Org].<br />
<br />
==Credits==<br />
[[Image:Dmpfg_019.png|360px|right]]<br />
Written by: Christopher (Monty) Montgomery and the Xiph.Org Community<br />
<br />
Special thanks to:<br />
*Heidi Baumgartner, for the second Tektronix oscilloscope<br />
*Gregory Maxwell and Dr. Timothy Terriberry, for additional technical review<br />
<br />
Intro, title and credits music:<br><br />
"[http://music.lousyrobot.com/track/andy-warhol-is-gone Andy Warhol Is Gone]", by Lousy Robot<br><br />
Used by permission of Lousy Robot.<br><br />
Original source track All Rights Reserved.<br><br />
[http://www.lousyrobot.com www.lousyrobot.com]<br />
<br />
This Video Was Produced Entirely With Free and Open Source Software:<br><br />
<br />
*[http://www.gnu.org/ GNU]<br><br />
*[http://www.linux.org/ Linux]<br><br />
*[http://fedoraproject.org/ Fedora]<br><br />
*[http://cinelerra.org/ Cinelerra]<br><br />
*[http://www.gimp.org/ The Gimp]<br><br />
*[http://audacity.sourceforge.net/ Audacity]<br><br />
*[http://svn.xiph.org/trunk/postfish/README Postfish]<br><br />
*[http://gstreamer.freedesktop.org/ Gstreamer]<br><br />
<br />
All trademarks are the property of their respective owners. <br />
<br />
*''Complete video'' [http://creativecommons.org/licenses/by-sa/3.0/legalcode CC-BY-SA]<br><br />
*''Text transcript and Wiki edition'' [http://creativecommons.org/licenses/by-sa/3.0/legalcode CC-BY-SA]<br><br />
<br />
A Co-Production of Xiph.Org and Red Hat, Inc.<br><br />
(C) 2012-2013, Some Rights Reserved<br><br />
<hr/><br />
<br />
== Use The Source Luke ==<br />
<br />
As stated in the Epilogue, everything that appears in the video demos is driven by open source software, which means the source is both available for inspection and freely usable by the community. The Thinkpad that appears in the video was running Fedora 17 and Gnome Shell (Gnome 3). The demonstration software does not require Fedora specifically, but it does require Gnu/Linux to run in its current form. In all, the video involved just under 50,000 lines of new and custom-purpose code (including contributions to non-Xiph projects such as Cinelerra and [http://sourceforge.net/projects/gromit-mpx/ Gromit]).<br />
<br />
=== The Spectrum and Waveform Viewer ===<br />
<br />
The realtime software spectrum analyzer application that appears in the video was a preexisting application that was dusted off and updated for use in the video. The waveform viewer (effectively a simple software oscilloscope) was written from scratch making use of some of the internals from the spectrum analyzer application. Both are available from Xiph.Org svn:<br />
<br />
<center><div style="background-color:#DDDDDD;border-color:#CCCCCC;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
*Source for the Spectrum and Waveform applications is found at:<br />
https://svn.xiph.org/trunk/spectrum/<br />
*The source can be checked out of svn using the following command line:<br />
svn co https://svn.xiph.org/trunk/spectrum<br />
*Trac is a convenient way to browse the source without checking out a copy:<br />
https://trac.xiph.org/browser/trunk/spectrum<br />
</div></center><br />
<br />
Spectrum and Waveform both expect an input stream on the command line, either as raw data or as a WAV file.<br />
<br />
=== GTK-Bounce ===<br />
<br />
The touch-controlled application used in the video is named 'gtk-bounce' and was custom-written for the sole purpose of the in-video demonstrations. It is so named because, for the most part, all it does is read the input from an audio device, and then immediately write the same data back out for playback. It also forwards a copy of this data to up to two external monitoring applications, and in several demos, applies simple filters or generates simple waveforms. It includes several demos not included in the video.<br />
<br />
<center><div style="background-color:#DDDDDD;border-color:#CCCCCC;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
*Source for gtk-bounce is found at:<br />
https://svn.xiph.org/trunk/Xiph-episode-II/bounce/<br />
*The source can be checked out of svn using the following command line:<br />
svn co https://svn.xiph.org/trunk/Xiph-episode-II/bounce/<br />
*Trac is a convenient way to browse the source without checking out a copy:<br />
https://trac.xiph.org/browser/trunk/Xiph-episode-II/bounce/<br />
</div></center><br />
<br />
==== Starting Gtk-bounce ====<br />
The application is somewhat hardwired for specific demo parameters, but most of the hardwired settings can be found at the top of each source file. As found in SVN, the application expects an ALSA hardware audio device at hw:1, and if none if found, it will wait for one to appear. Once a sound device is successfully initialized, it expects to find and open two pipes named pipe0 and pipe1 for output in the current directory. In the video, the waveform and spectrum applications are started to take input from pipe0 and pipe1 respectively. The output sent to the two pipes is identical, and in most demos matches the output data sent to the hardware device for conversion to analog. The only exception is the tenth demo panel (which does not appear in the video) where gtk-bounce can be set to monitor the hardware inputs instead while the outputs are used to produce test waveforms.<br />
<br />
Assuming gtk-bounce, spectrum and waveform have been checked out and built, the configuration seen in the video can be started using the following commands:<br />
<center><div style="background-color:#DDDDDD;border-color:#CCCCCC;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
* make the pipe fifos for the applications to communicate (only needs to be done once)<br />
mkfifo pipe0 pipe1<br />
* start all three applications<br />
waveform pipe0 & spectrum pipe1 & gtk-bounce &<br />
</div></center><br />
<br />
==== Using Gtk-bounce ====<br />
<br />
Gtk-bounce consists of eleven pushbutton panels (numbered zero through ten) that can be selected by scrolling up and down with the arrow buttons on the right side. Each panel is intended for a specific demo or part of a demo.<br />
<br />
<center><div style="background-color:#DDDDDD;border-color:#CCCCCC;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
<br />
[[Image:Dsat-panel0.png|700px|center]]<br />
* '''Panel 0''': This panel presents buttons that allow the sound card to be configured in several sampling rates and bit depths. Samples read from the audio inputs are sent to the output pipes and audio outputs for playback without modification.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel1.png|700px|center]]<br />
* '''Panel 1''': Both channels are forwarded to the outputs, however the user may select the bit depth of each channel independently. When the sound card is running in 16 bit mode and 16-bit depth is selected, the data is untouched. Requantization to a lower bit depth is performed with a flat triangle dither.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel2.png|700px|center]]<br />
* '''Panel 2''': Both channels are re-quantized to the selected bit depth. Requantization to a lower bit depth is performed with a flat triangle dither.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel3.png|700px|center]]<br />
* '''Panel 3''': 'generate sine wave' discards the audio inputs and instead internally generates a sine wave at 32 bit precision, which is then quantized to the selected bit depth, optionally with dither. The resulting signal is then forwarded to the output. <br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel4.png|700px|center]]<br />
* '''Panel 4''': gtk-bounce generates a 16-bit sine wave of the selected amplitude, optionally with dither, and forwards the resulting signal to the outputs. The audio input from the audio device is discarded. Note that the slider sets the peak amplitude, not the peak-to-peak amplitude.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel5.png|700px|center]]<br />
* '''Panel 5''': generates a 16-bit sine wave, optionally quantized using dither. The user may additionally select a flat or a shaped dither. The 'notch and gain' button applies a notch filter to the resulting signal, and boosts the gain of the remaining noise so that it's easily audible. The audio input from the audio device is discarded.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel6.png|700px|center]]<br />
* '''Panel 6''': allows the user to play with the power of the dithering noise applied before quantizing the sine wave. Shaped or flat dither are available. The sine wave may also be modulated with a varying amplitude to highlight correlations between the input and the resulting quantization noise. The 'notch and gain' button applies a notch filter to the resulting signal, and boosts the gain of the remaining noise so that it's easily audible. The audio input from the audio device is discarded.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel7.png|700px|center]]<br />
* '''Panel 7''': applies a sharper antialiasing (lowpass) filter than is likely to be built into the sound-card hardware (as there's generally no reason to use a filter quite this sharp in practice). The very sharp filter allows us to bandpass the demonstration square wave without any harmonics landing in the transition band. The input is read from the audio device, passed through this sharper filter, and then forwarded to the outputs.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel8.png|700px|center]]<br />
* '''Panel 8''': when selected, generate a synthetic 'square wave' (this is not quite equivalent to a bandlimited analog square wave; the harmonic amplitudes are a bit different) that when aligned with the sampling phase just right gives the appearance of having infinite rise and fall time. The slider allows us to shift the waveform sample alignment back and forth by +/- one sample to reveal that the underlying signal is still band-limited.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel9.png|700px|center]]<br />
* '''Panel 9''': as in panel 8, generate a 'perfect' synthetic 'square wave'. However, the slider now allows us to shift the sample alignment of the second channel with respect to the first, instead of shifting both channels. This allows us the trigger/lock the scope timing to the channel 1 waveform so we can see the fractional sample movement and alignment of the waveform on channel 2. The audio input from the audio device is discarded.<br />
<div style="clear: both">&nbsp;</div><br />
<br />
[[Image:Dsat-panel10.png|700px|center]]<br />
* '''Panel 10''': not used in the video; The audio device is configured to 24-bit input/output. The user may produce one of a range of test signals that are output to both the external applications and the audio device on the first channel. The input on the second channel is passed-through to the applications and audio device outputs unchanged. The first channel input is unused unless 'two input mode' is selected. When two input mode is selected, both input channels are read and the data sent to the external applications. Generated test signals are sent only to the audio hardware (on the first channel). This combination of test signals and input modes allows self-references frequency response, phase, noise, distortion and crosstalk testing of a given audio device.<br />
<br />
<div style="clear: both">&nbsp;</div><br />
<br />
</div></center><br />
<br />
=== Cairo Animations ===<br />
<br />
The animations featured throughout the Episode 2 video were rapid-development spaghetti hack-jobs coded by hand in raw Cairo. Each module generated a series of PNG stills that were then stitched into an animation with Cinelerra or mplayer. In the interest of pointing and laughing at what really bad code looks like...<br />
<br />
<center><div style="background-color:#DDDDDD;border-color:#CCCCCC;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
*Source for the Cairo animations is found at:<br />
https://svn.xiph.org/trunk/Xiph-episode-II/cairo/<br />
*The source can be checked out of svn using the following command line:<br />
svn co https://svn.xiph.org/trunk/Xiph-episode-II/cairo/<br />
*Trac is a convenient way to browse the source without checking out a copy:<br />
https://trac.xiph.org/browser/trunk/Xiph-episode-II/cairo/<br />
</div></center></div>Lee Carréhttps://wiki.xiph.org/index.php?title=Videos&diff=13233Videos2012-02-13T15:17:19Z<p>Lee Carré: Removed underscores from intra-wiki links</p>
<hr />
<div>“A Digital Media Primer for Geeks” is a video series, from [https://www.xiph.org/ Xiph.Org], which presents the technical foundations of modern digital media as increasingly detailed instalments.<br />
One community member described the [[/Episode 01|first episode]] as “a Uni[versity] lecture I never got[,] but really wanted”.<br />
<br />
The series explains modern digital media from historical origins, basic concepts, to modern implementation.<br />
It’s intended for engineers, [http://www.catb.org/~esr/faqs/hacker-howto.html (software) hackers], mathematicians — the people who are interested in discovering and making things and building the technology itself — and budding geeks wanting to begin exploring video coding, as well as the technically‐curious who want to know more about the media they wrangle for work or play.<br />
<br />
So, without any further ado, welcome to one hell of a new hobby.<br />
<br />
==Episodes==<br />
* [[/Episode 01|Episode 01]] — History & basic concepts<br />
<br />
==Playback Software==<br />
If you’re having trouble with playback in a modern browser or player, please visit our [[Playback Troubleshooting|playback troubleshooting and discussion]] page.<br />
<br />
===Players supporting [http://www.webmproject.org/ WebM]===<br />
* [http://www.videolan.org/vlc/ VLC] v1·1+<br />
* [http://www.firefox.com/ Mozilla Firefox]<br />
* [http://www.chromium.org/ Google Chrome]<br />
* [http://www.opera.com/ Opera]<br />
* [http://www.webmproject.org/users/ Other WebM players…]<br />
<br />
===Players supporting Ogg/[[Theora]]===<br />
* [http://www.videolan.org/vlc/ VLC]<br />
* [http://www.firefox.com/ Mozilla Firefox]<br />
* [http://www.opera.com/ Opera]<br />
* [[TheoraSoftwarePlayers|Other Theora players…]]</div>Lee Carréhttps://wiki.xiph.org/index.php?title=Videos&diff=13232Videos2012-02-13T15:05:56Z<p>Lee Carré: Copy-editing: disambiguation of terms (Internationalisation)</p>
<hr />
<div>“A Digital Media Primer for Geeks” is a video series, from [https://www.xiph.org/ Xiph.Org], which presents the technical foundations of modern digital media as increasingly detailed instalments.<br />
One community member described the [[/Episode 01|first episode]] as “a Uni[versity] lecture I never got[,] but really wanted”.<br />
<br />
The series explains modern digital media from historical origins, basic concepts, to modern implementation.<br />
It’s intended for engineers, [http://www.catb.org/~esr/faqs/hacker-howto.html (software) hackers], mathematicians — the people who are interested in discovering and making things and building the technology itself — and budding geeks wanting to begin exploring video coding, as well as the technically‐curious who want to know more about the media they wrangle for work or play.<br />
<br />
So, without any further ado, welcome to one hell of a new hobby.<br />
<br />
==Episodes==<br />
* [[/Episode 01|Episode 01]] — History & basic concepts<br />
<br />
==Playback Software==<br />
If you’re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.<br />
<br />
===Players supporting [http://www.webmproject.org/ WebM]===<br />
* [http://www.videolan.org/vlc/ VLC] v1·1+<br />
* [http://www.firefox.com/ Mozilla Firefox]<br />
* [http://www.chromium.org/ Google Chrome]<br />
* [http://www.opera.com/ Opera]<br />
* [http://www.webmproject.org/users/ Other WebM players…]<br />
<br />
===Players supporting Ogg/[[Theora]]===<br />
* [http://www.videolan.org/vlc/ VLC]<br />
* [http://www.firefox.com/ Mozilla Firefox]<br />
* [http://www.opera.com/ Opera]<br />
* [[TheoraSoftwarePlayers|Other Theora players…]]</div>Lee Carréhttps://wiki.xiph.org/index.php?title=Videos&diff=13231Videos2012-02-13T15:00:45Z<p>Lee Carré: Copy-editing: grammar/punctuation, linkification, document structure (headings), accessibility (link text)</p>
<hr />
<div>“A Digital Media Primer for Geeks” is a video series, from [https://www.xiph.org/ Xiph.Org], which presents the technical foundations of modern digital media as increasingly detailed instalments.<br />
One community member described the [[/Episode 01|first episode]] as “a Uni lecture I never got[,] but really wanted”.<br />
<br />
The series explains modern digital media from historical origins, basic concepts, to modern implementation.<br />
It’s intended for engineers, [http://www.catb.org/~esr/faqs/hacker-howto.html (software) hackers], mathematicians — the people who are interested in discovering and making things and building the technology itself — and budding geeks wanting to begin exploring video coding, as well as the technically‐curious who want to know more about the media they wrangle for work or play.<br />
<br />
So, without any further ado, welcome to one hell of a new hobby.<br />
<br />
==Episodes==<br />
* [[/Episode 01|Episode 01]] — History & basic concepts<br />
<br />
==Playback Software==<br />
If you’re having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.<br />
<br />
===Players supporting [http://www.webmproject.org/ WebM]===<br />
* [http://www.videolan.org/vlc/ VLC] v1·1+<br />
* [http://www.firefox.com/ Mozilla Firefox]<br />
* [http://www.chromium.org/ Google Chrome]<br />
* [http://www.opera.com/ Opera]<br />
* [http://www.webmproject.org/users/ Other WebM players…]<br />
<br />
===Players supporting Ogg/[[Theora]]===<br />
* [http://www.videolan.org/vlc/ VLC]<br />
* [http://www.firefox.com/ Mozilla Firefox]<br />
* [http://www.opera.com/ Opera]<br />
* [[TheoraSoftwarePlayers|Other Theora players…]]</div>Lee Carréhttps://wiki.xiph.org/index.php?title=Videos&diff=13049Videos2011-09-16T22:37:06Z<p>Lee Carré: /* Playback Software */ clean-up</p>
<hr />
<div>“A Digital Media Primer for Geeks” is a video series, from Xiph.Org, which presents the technical foundations of modern digital media as increasingly detailed instalments.<br />
One community member described the [[/Episode 01|first episode]] as “a Uni lecture I never got but really wanted”.<br />
<br />
The series explains modern digital media from origins, basic concepts, to implementation.<br />
It's intended for engineers, (software) hackers, mathematicians — the people who are interested in discovering and making things and building the technology itself — and budding geeks looking to begin exploring video coding, as well as the technically curious who want to know more about the media they wrangle for work or play.<br />
<br />
So, without any further ado, welcome to one hell of a new hobby.<br />
<br />
==Episodes==<br />
* [[/Episode 01|Episode 01]] — History & basic concepts<br />
<br />
==Playback Software==<br />
<br />
If you're having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.<br />
<br />
Players supporting [[WebM]]:<br />
* [http://www.videolan.org/vlc/ VLC] v1·1+<br />
* [http://www.firefox.com/ Mozilla Firefox]<br />
* [http://www.chromium.org/ Google Chrome]<br />
* [http://www.opera.com/ Opera]<br />
* [http://www.webmproject.org/users/ more…]<br />
<br />
Players supporting Ogg/Theora:<br />
* [http://www.videolan.org/vlc/ VLC]<br />
* [http://www.firefox.com/ Mozilla Firefox]<br />
* [http://www.opera.com/ Opera]<br />
* [[TheoraSoftwarePlayers|more…]]</div>Lee Carréhttps://wiki.xiph.org/index.php?title=Videos&diff=13048Videos2011-09-16T22:33:24Z<p>Lee Carré: Initial overview page</p>
<hr />
<div>“A Digital Media Primer for Geeks” is a video series, from Xiph.Org, which presents the technical foundations of modern digital media as increasingly detailed instalments.<br />
One community member described the [[/Episode 01|first episode]] as “a Uni lecture I never got but really wanted”.<br />
<br />
The series explains modern digital media from origins, basic concepts, to implementation.<br />
It's intended for engineers, (software) hackers, mathematicians — the people who are interested in discovering and making things and building the technology itself — and budding geeks looking to begin exploring video coding, as well as the technically curious who want to know more about the media they wrangle for work or play.<br />
<br />
So, without any further ado, welcome to one hell of a new hobby.<br />
<br />
==Episodes==<br />
* [[/Episode 01|Episode 01]] — History & basic concepts<br />
<br />
==Playback Software==<br />
<br />
If you're having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.<br />
<br />
Players supporting [[WebM]]:<br />
* [http://www.videolan.org/vlc/ VLC 1.1+]<br />
* [http://www.firefox.com/ Mozilla Firefox]<br />
* [http://www.chromium.org/getting-involved/dev-channel Chrome (development versions)]<br />
* [http://www.opera.com/ Opera]<br />
* [http://www.webmproject.org/users/ more…]<br />
<br />
Players supporting Ogg/Theora:<br />
* [http://www.videolan.org/vlc/ VLC]<br />
* [http://www.firefox.com/ Firefox]<br />
* [http://www.opera.com/ Opera]<br />
* [[TheoraSoftwarePlayers|more…]]</div>Lee Carréhttps://wiki.xiph.org/index.php?title=Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)&diff=13046Talk:A Digital Media Primer For Geeks (episode 1)2011-09-16T21:57:22Z<p>Lee Carré: moved Talk:A Digital Media Primer For Geeks (episode 1) to Talk:A Digital Media Primer For Geeks/Episode 01: Establishing a hierarchical structure, especially in anticipation of future episodes. This allows for meta pages, such as giving a su...</p>
<hr />
<div>#REDIRECT [[Talk:A Digital Media Primer For Geeks/Episode 01]]</div>Lee Carréhttps://wiki.xiph.org/index.php?title=Talk:Videos/A_Digital_Media_Primer_For_Geeks&diff=13045Talk:Videos/A Digital Media Primer For Geeks2011-09-16T21:57:22Z<p>Lee Carré: moved Talk:A Digital Media Primer For Geeks (episode 1) to Talk:A Digital Media Primer For Geeks/Episode 01: Establishing a hierarchical structure, especially in anticipation of future episodes. This allows for meta pages, such as giving a su...</p>
<hr />
<div>Welcome to the discussion. <br />
<br />
To discuss the video, make an account and hit edit. Please feel free to point out errata, suggested additional resources, or just ask questions!<br />
<br />
<br />
==Introduction==<br />
<br />
==Analog vs Digital==<br />
<br />
==Raw (digital audio) meat==<br />
Don't forget when talking about higher sampling rates that frequency and temporal response are inherently linked. One often overlooked aspect of this is the value of higher sampling rates in presenting subtle differences in multi-channel timing (e.g. the stereo field). Even fairly uncritical listeners presented sample audio blind can notice this. --Chaboud<br />
<br />
:They aren't merely "technically linked". They're mathematically indistinguishable. If a system doesn't has a response beyond some frequency it also lacks time resolution beyond some point.<br />
:To the best of my knowledge a perceptually justified need for higher rates is not supported by the available science on the subject. Not only is there no real physiological mechanism proposed for this kind of sensitivity, well controlled blind listening tests don't support it— well controlled being key, loudspeakers can suffer from considerable non-linear effects including intermodulation, and having a lot of otherwise inaudible ultrasonics can produce audible distortion at lower frequencies. Another common error is running the DAC at different frequencies— with the obvious interactions with the reconstruction and analog filters. A correct test for determining the audibility differences of higher sample rates needs to use a single DAC stage at the highest frequency, re-sampling digitally to create the bandpass... etc. I'm not aware of any such test supporting a need for information beyond 24kHz. <br />
:I normally suggest to people looking for increased to look into acoustic holography techniques like higher-order ambisonics and wavefield synthesis. <br />
:The beyond 48kHz sampling subject subject has been [http://www.google.com/custom?domains=hydrogenaudio.org&q=96khz&sa=Google+Search&sitesearch=hydrogenaudio.org&client=pub-4544327213918729&forid=1&channel=7051718642&ie=ISO-8859-1&oe=ISO-8859-1&flav=0000&sig=6_g3ghDcS6bRpfcd&cof=GALT%3A%23008000%3BGL%3A1%3BDIV%3A%23336699%3BVLC%3A663399%3BAH%3Acenter%3BBGC%3AFFFFFF%3BLBGC%3AFFFFFF%3BALC%3A0000FF%3BLC%3A0000FF%3BT%3A000000%3BGFNT%3A0000FF%3BGIMP%3A0000FF%3BLH%3A50%3BLW%3A262%3BL%3Ahttp%3A%2F%2Fwww.hydrogenaudio.org%2Fforums%2Flogo50.png%3BS%3Ahttp%3A%2F%2Fwww.hydrogenaudio.org%3BFORID%3A1&hl=en discussed a number of times on hydrogen audio], I recommend reading the thread there. They are quite informative. Most audio groups out there online and off are not very scientifically oriented (e.g. evidence based)— HA is special because it is one of the few that are.--[[User:Gmaxwell|Gmaxwell]] 06:00, 24 September 2010 (UTC)<br />
::I don't think higher Fs's than 48000 Hz are justified for psychoacoustic reasons, but we definitely use them in production for sound effects and music mastering, because it significantly improves the quality of pitch shifting and time stretching. On features we record all sound effects at 96 kHz (at least) so we have the liberty to pitch it down an octave. We shoot at 96k or 192k, and I archive to FLAC and use Apple Lossless .m4a's for online use, since they're better supported by our DAWs. As a distribution format though a base rate of 48kHz is definitely all you need for the home listening environment. [[User:Iluvcapra|Iluvcapra]] 18:39, 26 September 2010 (UTC)<br />
<br />
==Video vegetables (they're good for you!)==<br />
<br />
An interesting point is that the discussion of the linear segment in the normal display responses (e.g. sRGB) is incorrect, or at best incomplete, though I've coming up short on good citations for this, so Wikipedia remains uncorrected at this time.--[[User:Gmaxwell|Gmaxwell]] 05:15, 22 September 2010 (UTC)<br />
<br />
<br />
Hi there, great tutorial, but in fact the most common DVD standard is 720 pixels by 480 pixels, with a pixel ratio of 0.9, yielding a device aspect ratio of 1.35. I understand that you're trying to simplify the lecture to 4:3 aspect (1.333) for newbies, I think this is ultimately misleading, since the vast majority of DVDs are not sampled at 704x480. --Dryo<br />
<br />
: Sort of-- the most common encoding is 720x480, but with the crop area set to 704x480; that's what the standard calls for (I was being sneaky when I said 'display resolution of 704x480'). Many software players ignore the crop rectangle and also display the horizontal overscan area. Many software encoders also just blindly encode 720x480 without setting the crop area. It is a source of *much* confusion. --[[User:Xiphmont|Monty]]<br />
<br />
::"The standard" here being— Rec. 601? Is there anything else? We should probably at least link [[Wikipedia:overscan]]. --[[User:Gmaxwell|Gmaxwell]] 13:13, 24 September 2010 (UTC)<br />
<br />
::OK, thanks for the clarification Monty... I did not even know that the horizontal crop area existed.<br />
<br />
"''[...] most displays use [RGB] colors [...]''". Doesn't that sentence contradict this one : "''[...] video usually is represented as a [...] luma channel along with additional [...] chroma channels, the color''". I don't understand what "''position the chroma pixels''" means exactly. Are we talking of real points on a display ? Thanks, great video ! --[[User:Ledahulevogyre|Ledahulevogyre]] 13:59, 24 September 2010 (UTC)<br />
<br />
:Display devices use RGB. Most video is actually encoded as YUV, luma plus two color "difference" channels. This reduces the bandwidth of raw video by cleverly exploiting limitations in human perception. Additionally, color samples need not be as frequent as luminance samples. So "chroma pixels" are the color data samples, not the pixels on a real display. --Dryo<br />
<br />
::Thanks Dryo ! that's what I thought. Then I don't quite understand what this chroma samples positioning/siting is about. Is it actually defining the algorithm you should use to compute RGB pixels from YUV samples ? Is is defining the influence zone of chroma samples over luminance ones ? What I don't get is how you can talk about spatial positioning for something that is, well... not spatial (samples). Thank you again ! --[[User:Ledahulevogyre|Ledahulevogyre]] 09:52, 25 September 2010 (UTC)<br />
<br />
:::Imagine a small 2x2 image, with the top two pixels blue, and the bottom two pixels red. Luminance will be sampled at each pixel, but (for 4:2:0), only one sample of Cr will be taken for this 2x2 set, so you'll have to decide where. If you place the sample on the middle horizontally, but aligned with every even or odd line, you'll get a sample from either blue, or red. If you place the sample horizontally and vertically, you'll get a sample from pink. Similarly for each other possible placement algorithm. [[User:Ogg.k.ogg.k|Ogg.k.ogg.k]] 10:24, 25 September 2010 (UTC)<br />
<br />
=== color and colorspace ===<br />
I'm not 100% sure of this, but when I was messing with analogue and digitised-analogue video at University, I thought the key difference between 4:1:1 YUV and 4:2:0 YUV is that the former has the same color sub-sampling on every field (dealing with interlaced content), each at one-quarter of full rate, where as 4:2:0 YUV has only Y and U in the first field, then only Y and V in the second.<br />
<br />
Effectively the "4:2:0" signal is successively 4:2:0 and 4:0:2, which is why the V component doesn't go away altogether as the name implies. The reason for such a strange encoding standard and it's use for PAL-format DV encoding is that the analogue PAL signal already does the throwing away of half the colour information from each field in the analogue composite signal used to get video around production facilities, so there would only be reduced temporal resolution U and V input to be digitised. It obviously makes a lot less sense for progressive-scan content without the interlace (it would look pretty poor to reduce the colour frame rate to 12.5 fps). There is an argument that the conversion of 4:2:0 interlaced content into progressive data for computer display converts it into 4:1:1 material when the color planes of the two fields are married up?<br />
<br />
The only important thing to come out of this is that the diagram on the whiteboard looks a lot more like 4:1:1 video, and I would expect that to be the correct choice for progressive-scan content (which I take your images to be, it being simpler). The narration of the next scene also uses 4:1:1 rather than 4:2:0, which tends to emphasise the same point.<br />
<br />
==Containers==<br />
<br />
==General discussion==<br />
<br />
The video hasn't yet been formally released but we have all the sites up early in order to get everything debugged... Feedback on site functionality prior to the official release would be very helpful. --[[User:Gmaxwell|Gmaxwell]] 15:15, 22 September 2010 (UTC)<br />
:Released now, but still tell us about bugs :-) --[[User:Xiphmont|Monty]]<br />
<br />
When do you plan to create and/or release the next episode in this series? --[[User:Minerva|Minerva]] 05:52, 16 November 2010 (UTC)<br />
<br />
Mad props to you for taking the time and spending the effort to make this video. I've been waiting for episode 2 since this first came out, any chance of that being produced soon? --[[User:StFS|StFS]] 17:20, 14 June 2011 (PDT)<br />
<br />
=== Atom/RSS feed ===<br />
<br />
Could not find an Atom/RSS feed for the video episodes. A videocast URL with video-link enclosures would be ideal for getting future episodes. But even an announce-only feed would be convenient for tracking new episode releases. --[[User:Gsauthof|Gsauthof]] 17:41, 24 September 2010 (UTC)<br />
:One does not exist yet— as a stopgap you can follow the [http://xiphmont.livejournal.com/tag/xiph Xiph tag on Monty's blog] and you'll be sure to hear about new videos. This has to be the most requested feature— I'll make sure we do it before the next video.--[[User:Gmaxwell|Gmaxwell]] 20:50, 24 September 2010 (UTC)<br />
::Monty uses the tag 'admpfg' for these videos on his blog, and LiveJournal supports RSS feeds for tags. So, here's an RSS feed: http://xiphmont.livejournal.com/data/rss?tag=admpfg [[User:Nerd65536|Nerd65536]] 17:52, 26 September 2010 (UTC)<br />
<br />
== 44100 Hz Trivia ==<br />
<br />
The reason CDs use a 44,100 Hz sample rate (actually 44,056 Hz in the United States) is that, before dedicated digital recorders became mainstream, the only way a recording engineer or producer could record digital audio was with a piece of gear called a "PCM processor" or "PCM adaptor" (like a Sony PCM-F1 or PCM-501). These would take an audio input and, after running it through the A/D if necessary, modulate it onto a baseband monochrome NTSC or PAL video signal that could then be recorded onto a 3/4" U-Matic video tape. The processors would accept two channels of 16-bit samples, giving (at 44,100 Hz) a total bit rate of 1,411,200 bps. This number has the serendipitous property of being evenly divisible by both 30 and 25 (yielding 47,040 and 56,448 bits per frame), and those numbers allow both NTSC and PAL to encode the same number of bits per scan line, 98 (with the NTSC 480-line raster and PAL 576-line raster). It was just a convenient selection of integers. CDs would be recorded at 44.1k in Europe as they were mastered onto 25 fps tapes, while CDs recorded in the US at a "nominal" 30 fps were actually at 44,056 Hz, but the difference in tone is basically inaudible. [[User:Iluvcapra|Iluvcapra]] 18:44, 24 September 2010 (UTC)<br />
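The divisibility argument above can be checked in a few lines (a worked verification only, not part of the original comment):<br />

```python
# Checking the arithmetic: stereo 16-bit audio at 44.1 kHz.
bits_per_second = 44100 * 2 * 16
assert bits_per_second == 1411200

# Evenly divisible by both frame rates:
assert bits_per_second % 30 == 0 and bits_per_second % 25 == 0
bits_per_ntsc_frame = bits_per_second // 30   # 47040
bits_per_pal_frame = bits_per_second // 25    # 56448

# Both rasters carry the same payload per scan line:
assert bits_per_ntsc_frame // 480 == 98       # NTSC, 480 active lines
assert bits_per_pal_frame // 576 == 98        # PAL, 576 active lines
```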
<br />
:Note that the PCM audio signal, once modulated to NTSC or PAL, can be recorded on any video recorder, not just U-matic. The most common tape format for PCM audio was Sony Betamax. Sony sold Betamax decks bundled with external PCM A/D converter units for the pro audio market. The PCM-F1 was designed to be used with Betacam VCRs. -- Dryo</div>Lee Carréhttps://wiki.xiph.org/index.php?title=A_Digital_Media_Primer_For_Geeks_(episode_1)/making&diff=13044A Digital Media Primer For Geeks (episode 1)/making2011-09-16T21:57:22Z<p>Lee Carré: moved A Digital Media Primer For Geeks (episode 1)/making to A Digital Media Primer For Geeks/Episode 01/making: Establishing a hierarchical structure, especially in anticipation of future episodes. This allows for meta pages, such as giving ...</p>
<hr />
<div>#REDIRECT [[A Digital Media Primer For Geeks/Episode 01/making]]</div>Lee Carréhttps://wiki.xiph.org/index.php?title=Videos/A_Digital_Media_Primer_For_Geeks/making&diff=13043Videos/A Digital Media Primer For Geeks/making2011-09-16T21:57:22Z<p>Lee Carré: moved A Digital Media Primer For Geeks (episode 1)/making to A Digital Media Primer For Geeks/Episode 01/making: Establishing a hierarchical structure, especially in anticipation of future episodes. This allows for meta pages, such as giving ...</p>
<hr />
<div>[[Image:Dmpfg_mo_001.jpg|360px|right]]<br />
This page documents some of the background information behind the production of Digital Media Primer For Geeks. To see the video or its wiki-edition visit [[A Digital Media Primer For Geeks (episode 1)|the main video page]].<br />
<br />
=The making of…=<br />
<br />
==Equipment==<br />
===Camera===<br />
Canon HV40 HDV camera w/ wide-angle lens operating on a tripod. At the time I was looking for MyFirstVideoCamera, the six people I asked who did video work all recommended this same camera, and two said not to get it without the wide angle lens. I took their advice and have been happy with it. Among other nifty features, the camera offers true progressive scan modes, live firewire output, and the ability to act as a digitizer for external video input. With the patches I made in my Git repo, Cinelerra natively handles the Canon HDV progressive modes.<br />
<br />
The wide angle lens gives the camera a nice close macro mode, and approximately triples the amount of light coming into the sensor for a given zoom/aperture. Useful for shooting indoors at night (e.g., this entire video).<br />
<br />
No additional lighting kit was used.<br />
<br />
===Audio===<br />
<br />
Two Crown PCC160 boundary microphones placed on a table approximately 4-8 feet in front of the speaker, run through a cheap Behringer portable mixer and into the camera's microphone input. <br />
<br />
No additional audio kit was used.<br />
<br />
===Sundries===<br />
<br />
Whiteboard markers by 'Bic'<br />
<br />
Drawing aids by Staedtler, McMaster Carr, and 'Generic'.<br />
<br />
==Video shooting sequence==<br />
<br />
Scenes were pre-scripted and memorized, usually with lots of on-the-fly revision. In the future... I'm getting a teleprompter. OTOH, I can totally rattle off the entire video script from beginning to end as a party trick, thus ensuring I'll not be invited to many parties.<br />
<br />
Diagrams were drawn by hand on a physical whiteboard with whiteboard markers and magnetic T-squares, triangles, and yardsticks. Despite looking a lot like greenscreen work, there is no image compositing in use (actually-- there are two small composites where an error in a whiteboard diagram was corrected by subtracting part of the original image and then adding a corrected version of the diagram).<br />
<br />
Camera operated in 24F shutter priority mode (Tv set to "24") with exposure and white balance both calibrated to the white board (or a white piece of paper) and locked. Microphone attenuation setting was active, with gain locked such that room noise peaked at -40dB (all the rooms in the shooting sequences were noisy due to the building's ventilation system, or active equipment). Lighting in the whiteboard rooms tended to be odd, with little relative light cast on a presenter standing just in front of the whiteboard; a presenter is practically standing in the room's only shadow. Most of the room light is focused on the table and walls. Additional fill lighting kit would have been useful, but for the first vid, I didn't want 'perfect' to be the enemy of 'good'.<br />
<br />
Autofocus used for whiteboard scenes, manual focus used for several workshop scenes as the autofocus tended to hunt continuously in very low light.<br />
<br />
Continuous capture to a Thinkpad with firewire input via a simple [http://people.xiph.org/~xiphmont/video/gst-rec gstreamer script].<br />
<br />
==Production sequence==<br />
===All hail Cinelerra. You better hail, or Cinelerra will get pissy about it.===<br />
<br />
Most of the production sequence hinged on making Cinelerra happy; it is a hulking rusty cast iron WWI tank of a program that can seem like it's composed entirely of compressed bugs. That said, it was neither particularly crashy nor did it ever accidentally corrupt or lose work. It was also the only FOSS editor with a working 2D compositor. It got the job done once I found a workflow it would cope with (and fixed a number of bugs; these fixes are available from my cinelerra Git repo at http://git.xiph.org/?p=users/xiphmont/cinelerraCV.git;a=summary)<br />
<br />
===Choosing takes===<br />
<br />
Each shooting session yielded four to six hours of raw video. The first step was to load the raw video into the cinelerra timeline, label each complete take, compare and choose the take to use, then render the chosen take out to a raw clip as a YUV4MPEG raw video file and a WAV raw audio file. Be careful that Settings->Align Cursor On Frames is set, else the audio and video renders won't start on the same boundary.<br />
<br />
===Postprocessing===<br />
<br />
At this point, the raw video clips were adjusted for gamma, contrast, and saturation in gstreamer and mplayer. In the earlier shoots the camera was underexposing due to pilot error, which required quite a bit of gamma and saturation inflation to 'correct' (there is no real correction as the low-end data is gone, but it's possible to make it look better). Later shoots used saner settings and the adjustments were mostly to keep different shooting sessions more uniform. The whiteboard tends not to look white because it's mildly reflective, and picked up the color of the cyan and orange audio baffles in the room like a big diffuse mirror.<br />
<br />
The audio was both noisy (due to the building's ventilation system which either sounded like a low loud rumble or a jet-engine taking off) and reverberant (the rooms were glass on two sides and plaster on the other two). Early takes used no additional sound absorbing material in the rooms, and the Postfish filtering and deverb was used heavily. It gives the early audio in the vid a slightly odd, processed feel (I had almost decided the original audio was simply unusable). Later takes used some big fleece 'soft flats' in the room to absorb some additional reverb, and the later takes are less heavily filtered.<br />
<br />
The postfish filtering chain used declip (for the occasional overrange oops), deverb (remove room reverberation), multicompand (noise gating), single compand (for volume levelling) and EQ (the Crown mics are nice, but are very midrange heavy).<br />
<br />
===Special Effects===<br />
<br />
Audio special effects were one-offs, mostly done using SoX. The processed demo sections of audio were then spliced back into the original audio takes using Audacity.<br />
<br />
Video special effects (e.g., removing a color channel) were done by writing quick, one-off filters in C for y4oi. A few effects were done by dumping a take as a directory full of PNGs, batch-processing the PNGs with a one-off C program, then reassembling with mplayer. Video effects were then stitched back into the original video takes in Cinelerra.<br />
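As a purely illustrative stand-in for one of those one-off filters (the real ones were quick C programs; the packed-RGB buffer layout here is an assumption for the demo), dropping a color channel is about this much code:<br />

```python
# Illustrative one-off filter: zero out the red channel of a raw
# 24-bit packed RGB frame buffer (RGBRGB...).
def drop_red(frame: bytes) -> bytes:
    """Return the frame with every red byte set to zero."""
    out = bytearray(frame)
    out[0::3] = bytes(len(out[0::3]))   # every 3rd byte starting at 0 is R
    return bytes(out)

# A 2-pixel "frame": white, mid gray.
frame = bytes([255, 255, 255, 128, 128, 128])
print(drop_red(frame))   # b'\x00\xff\xff\x00\x80\x80'
```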
<br />
===Editing===<br />
<br />
All editing was done in Cinelerra. This primarily consisted of stitching the individual takes back together with crossfades. All input and rendering output were done with raw YUV4MPEG and WAV files. Note that making this work well and correctly required several patches to the YUV4MPEG handler and colorspace conversion code.<br />
<br />
===Encoding===<br />
<br />
I encoded by hand external to Cinelerra using mplayer for final postprocessing, the encoder_example tool included with the Ptalarbvorm Theora source distribution, and ivfenc for WebM. I synced subtitles to the video by hand with Audacity (I already had the script) in SRT format [for easy editing/translation and syncing with the video in HTML5], and transcoded to Ogg Kate using kateenc. The Kate subs were then muxed with the Ogg video encoding using oggz-merge, and finally indexing was added to the Ogg with OggIndex.<br />
<br />
<center><div style="background-color:#DDDDFF;border-color:#CCCCDD;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
'''Sample Ogg command lines…'''<br />
...for producing 360p, 128-ish (a4) audio and 500-ish (v50) video with subtitles and index<br />
<br />
* perform a little denoising, scale, and deband the raw render:<br />
mplayer -vf hqdn3d,scale=640:360,gradfun=1.5,unsharp=l3x3:.1 complete.y4m -fast -noconsolecontrols -vo yuv4mpeg:file=filtered.y4m<br />
* encode the basic Ogg Vorbis/Theora file:<br />
encoder_example -a 4 -v 50 -k 240 complete.wav filtered.y4m -o basic.ogv<br />
* produce Kate subs from the SRT input file:<br />
kateenc -t srt -l en_US -c SUB -o subs.kate subs.srt<br />
* add the subs to the Ogg video file:<br />
oggz-merge basic.ogv subs.kate -o subbed.ogv<br />
* add index for faster seeking on the Web:<br />
OggIndex subbed.ogv -o A_Digital_Media_Primer_For_Geeks-360p.ogv<br />
</div></center><br />
<br />
<br style="clear:both;"/><br />
<br />
<center><div style="background-color:#DDDDFF;border-color:#CCCCDD;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
'''Sample WebM command lines…'''<br />
...for producing 360p, 128-ish (a4) audio and 500kbps video with index<br />
<br />
* Might as well reuse the Vorbis encoding already done for the Ogg file:<br />
oggz-rip -c vorbis A_Digital_Media_Primer_For_Geeks-360p.ogv -o vorbis.ogg<br />
* Produce VP8 encoding from the y4m file used for Theora<br />
ivfenc filtered.y4m vp8.ivf -p 2 -t 4 --best --target-bitrate=1500 --end-usage=0 --auto-alt-ref=1 -v --minsection-pct=5 --maxsection-pct=800 --lag-in-frames=16 --kf-min-dist=0 --kf-max-dist=120 --static-thresh=0 --drop-frame=0 --min-q=0 --max-q=60<br />
* Mux the audio and video into our first-stage WebM file<br />
mkvmerge vorbis.ogg vp8.ivf -o first-stage.webm<br />
* mkvmerge by itself doesn't generate a fully-compliant WebM file; mkclean will make the last necessary alterations<br />
mkclean --remux first-stage.webm A_Digital_Media_Primer_For_Geeks-360p.webm<br />
</div></center><br />
<br />
<br style="clear:both;"/><br />
<br />
==Web Presentation==<br />
<br />
HTML5 is new, so I found (to my unpleasant surprise) that I got to script all my website controls from scratch. Virtually everything preexisting was either very large, inscrutable, and inflexible (a 'complete web video solution!'), offering features I did not want while missing features I did, or was a proof of concept that was obviously unfinished, unpolished, and not well tested.<br />
<br />
===Playback Controls===<br />
<br />
I wanted more than the standard set of controls, but I did *not* want to fall into the usual web geek trap: a UI with 50 buttons in a big heap with no thought to usability, and extra points for using at least twelve colors. I wanted new controls to be unobtrusive but obvious when you wanted them, and to blend into the preexisting controls.<br />
<br />
Clearly the best way to do this would be to put a transparent canvas layer over the video window and implement completely fresh controls. This would probably give the most bug-proofness/future-proofness and definitely give the most consistent look and feel across browsers. I also estimate it would take several weeks of full-time scripting to make it work as expected (remember, HTML5 is new and still a draft, so there are endless inconsistencies and implementation bugs to deal with. Writing a script is easy and fast. Making it work consistently is time-consuming and frustrating).<br />
<br />
Adding a fade-in bar that approximately matched the existing controls in most players would be finicky, shorter-lived and not as pretty, but it could be made practical and working far faster than the overkill solution of reimplementing everything. As HTML5 is as yet a draft and I'll probably have to revisit any site scripting regularly anyway, option two seemed the sensible way to go.<br />
<br />
The nice thing about HTML and JavaScript both is that they're inherently Open Source; anyone can inspect the code I wrote (and point and laugh).<br />
<br />
===Subtitles===<br />
<br />
Although I think external subtitles aren't the best overall direction, it's all HTML5 currently offers. The Ogg files include Kate format subtitles, but HTML5 offers no API for accessing them. What HTML5 does give is a high-resolution playback timer, and the ability to load and parse subtitle files. <br />
<br />
[http://www.xiph.org/video/subtitles.js subtitles.js] is an updated version of jQuery.srt that loads and parses SRT format subs on demand from any URL, and places the text of each subtitle into a &lt;div&gt; element in synchronization with the video playback timer. A little additional CSS is all that's necessary to put a translucent background behind it, and display it over the video frame.<br />
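The timing logic involved can be sketched like this (a Python paraphrase of the idea only; subtitles.js itself is JavaScript, and the function names here are made up for the sketch):<br />

```python
import re

# Parse SRT cues, then pick the cue covering the current playback time,
# as reported by the HTML5 media element's high-resolution timer.
def parse_srt(text):
    cues = []
    for block in re.split(r'\n\s*\n', text.strip()):
        lines = block.splitlines()
        m = re.match(r'(\d+):(\d+):(\d+)[,.](\d+) --> '
                     r'(\d+):(\d+):(\d+)[,.](\d+)', lines[1])
        h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, m.groups())
        start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
        end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
        cues.append((start, end, '\n'.join(lines[2:])))
    return cues

def cue_at(cues, t):
    """Text to display at playback time t (seconds), or '' for no cue."""
    for start, end, text in cues:
        if start <= t < end:
            return text
    return ''

srt = """1
00:00:01,000 --> 00:00:04,000
Hi, I'm Monty.

2
00:00:05,000 --> 00:00:08,000
Welcome to the primer."""
```

In the browser, `cue_at` would run from a `timeupdate` handler against `video.currentTime`, writing its result into the subtitle &lt;div&gt;.<br />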
<br />
===Resolution / stream switching===<br />
<br />
This was considerably less elegant due to some apparent inadequacies in the HTML5 draft spec. There seem to be two basic ways of changing the currently playing video in the current draft.<br />
<br />
The first way to change streams is to create a new video element via JavaScript, wait for it to load, then replace the current video with the new one. Unfortunately, HTML5 gives no way to prevent the original video, even when stopped, from using all available bandwidth to keep buffering as fast as it can. This starves the replacement video of network access, causing a lengthy delay when loading. It looks very nice and seamless when it finally works, but can easily result in switching video streams taking 15-30 seconds or more.<br />
<br />
The second option is to switch the preexisting video element to a new stream. This is much faster as the original stream stops sinking bandwidth immediately, but upon loading the new stream always starts from the beginning and, in current browsers, also displays the first frame even if playback isn't started. After the load completes, it's possible to seek forward to where the original stream left off. It doesn't look as good, but it's much faster in practice.<br />
<br />
I use the second, faster option, so there's a brief flash back to the beginning of the video upon resolution switch. <br />
<br />
===Chapter Navigation===<br />
<br />
Nothing special here; it's just a &lt;select&gt; dropdown with an onchange handler that sets a new 'video.currentTime'.<br />
<br />
===Control pop/unpop===<br />
<br />
Oddly enough this was the hardest part, not because it's hard to do, but because it's hard to make consistent across browsers. Every browser fires radically different UI events for the same mouse/keyboard actions.<br />
<br />
===Dimming===<br />
<br />
...In retrospect, not as gratuitous as it seemed when I first wrote it. Many aspects of how video is made and presented assume viewing in a relatively dim environment, where the video being watched is the brightest thing in sight [or close to it]. Xiph's web styling uses white backgrounds, which I found actively distracting and out of place, but altering the style of the site for just the video pages also seemed clearly wrong. So I added an animated dim/undim on playback/pause (instantaneous dim/undim was jarring). I'm now convinced it was a good call, assuming it actually works everywhere as intended (it won't work on browsers using the Cortado fallback).</div>Lee Carréhttps://wiki.xiph.org/index.php?title=A_Digital_Media_Primer_For_Geeks_(episode_1)&diff=13042A Digital Media Primer For Geeks (episode 1)2011-09-16T21:57:21Z<p>Lee Carré: moved A Digital Media Primer For Geeks (episode 1) to A Digital Media Primer For Geeks/Episode 01: Establishing a hierarchical structure, especially in anticipation of future episodes. This allows for meta pages, such as giving a summary of e...</p>
<hr />
<div>#REDIRECT [[A Digital Media Primer For Geeks/Episode 01]]</div>Lee Carréhttps://wiki.xiph.org/index.php?title=Videos/A_Digital_Media_Primer_For_Geeks&diff=13041Videos/A Digital Media Primer For Geeks2011-09-16T21:57:21Z<p>Lee Carré: moved A Digital Media Primer For Geeks (episode 1) to A Digital Media Primer For Geeks/Episode 01: Establishing a hierarchical structure, especially in anticipation of future episodes. This allows for meta pages, such as giving a summary of e...</p>
<hr />
<div><small>''Wiki edition''</small><br />
[[Image:Dmpfg_001.jpg|360px|right]]<br />
<br />
This first video from Xiph.Org presents the technical foundations of modern digital media via a half-hour firehose of information. One community member called it "a Uni lecture I never got but really wanted."<br />
<br />
The program offers a brief history of digital media, a quick summary of the sampling theorem, and myriad details of low level audio and video characterization and formatting. It's intended for budding geeks looking to get into video coding, as well as the technically curious who want to know more about the media they wrangle for work or play.<br />
<br/><br/><br/><br />
<center><font size="+2">[http://www.xiph.org/video/vid1.shtml Download or Watch online]</font></center><br />
<br style="clear:both;"/><br />
Players supporting WEBM: [http://www.videolan.org/vlc/ VLC 1.1+], [https://www.mozilla.com/en-US/firefox/all-beta.html Firefox 4 (beta)], [http://www.chromium.org/getting-involved/dev-channel Chrome (development versions)], [http://www.opera.com/ Opera], [http://www.webmproject.org/users/ more…]<br />
<br />
Players supporting Ogg/Theora: [http://www.videolan.org/vlc/ VLC], [http://www.firefox.com/ Firefox], [http://www.opera.com/ Opera], [[TheoraSoftwarePlayers|more…]]<br />
<br />
If you're having trouble with playback in a modern browser or player, please visit our [[Playback_Troubleshooting|playback troubleshooting and discussion]] page.<br />
<br/><br />
<hr/><br />
<br />
==Introduction==<br />
[[Image:Dmpfg_000.jpg|360px|right]]<br />
[[Image:Dmpfg_002.jpg|360px|right]]<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Introduction|Discuss this section]]</small><br />
<br />
Workstations and high-end personal computers have been able to<br />
manipulate digital audio pretty easily for about fifteen years now.<br />
It's only been about five years that a decent workstation's been able<br />
to handle raw video without a lot of expensive special purpose<br />
hardware.<br />
<br />
But today even most cheap home PCs have the processor power and<br />
storage necessary to really toss raw video around, at least without<br />
too much of a struggle. So now that everyone has all of this cheap media-capable hardware, <br />
more people, not surprisingly, want to do interesting<br />
things with digital media, especially streaming. YouTube was the first huge<br />
success, and now everybody wants in.<br />
<br />
Well good! Because this stuff is a lot of fun!<br />
<br />
<br />
It's no problem finding consumers for digital media. But here I'd<br />
like to address the engineers, the mathematicians, the hackers, the<br />
people who are interested in discovering and making things and<br />
building the technology itself. The people after my own heart.<br />
<br />
Digital media, compression especially, is perceived to be super-elite,<br />
somehow incredibly more difficult than anything else in computer<br />
science. The big industry players in the field don't mind this<br />
perception at all; it helps justify the staggering number of very<br />
basic patents they hold. They like the image that their media<br />
researchers "are the best of the best, so much smarter than anyone<br />
else that their brilliant ideas can't even be understood by mere<br />
mortals." This is bunk. <br />
<br />
Digital audio and video and streaming and compression offer endless<br />
deep and stimulating mental challenges, just like any other<br />
discipline. It seems elite because so few people have been<br />
involved. So few people have been involved perhaps because so few<br />
people could afford the expensive, special-purpose equipment it<br />
required. But today, just about anyone watching this video has a<br />
cheap, general-purpose computer powerful enough to play with the big<br />
boys. There are battles going on today around HTML5 and browsers and<br />
video and open vs. closed. So now is a pretty good time to get<br />
involved. The easiest place to start is probably understanding the<br />
technology we have right now.<br />
<br />
This is an introduction. Since it's an introduction, it glosses over a<br />
ton of details so that the big picture's a little easier to see.<br />
Quite a few people watching are going to be way past anything that I'm<br />
talking about, at least for now. On the other hand, I'm probably<br />
going to go too fast for folks who really are brand new to all of<br />
this, so if this is all new, relax. The important thing is to pick out<br />
any ideas that really grab your imagination. Especially pay attention<br />
to the terminology surrounding those ideas, because with those, and<br />
Google, and Wikipedia, you can dig as deep as interests you.<br />
<br />
So, without any further ado, welcome to one hell of a new hobby.<br />
<br />
<center><div style="background-color:#DDDDFF;border-color:#CCCCDD;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
'''Going deeper…'''<br />
*[http://www.xiph.org/about/ About Xiph.Org]: Why you should care about open media<br />
*[http://www.0xdeadbeef.com/weblog/2010/01/html5-video-and-h-264-what-history-tells-us-and-why-were-standing-with-the-web/ HTML5 Video and H.264: what history tells us and why we're standing with the web]: Chris Blizzard of Mozilla on free formats and the open web<br />
*[http://diveintohtml5.org/video.html Dive into HTML5]: tutorial on HTML5 web video<br />
*[http://webchat.freenode.net/?channels=xiph Chat with the creators of the video] via freenode IRC in #xiph.<br />
</div></center><br />
<br />
<br style="clear:both;"/><br />
<br />
==Analog vs Digital==<br />
[[Image:Dmpfg_004.jpg|360px|right]]<br />
[[Image:Dmpfg_006.jpg|360px|right]]<br />
[[Image:Dmpfg_007.jpg|360px|right]]<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Analog_vs_Digital|Discuss this section]]</small><br />
<br />
<br />
[[WikiPedia:Sound|Sound]] is the propagation of pressure waves through air, spreading out<br />
from a source like ripples spread from a stone tossed into a pond. A<br />
microphone, or the human ear for that matter, transforms these passing<br />
ripples of pressure into an electric signal. Right, this is<br />
middle school science class, everyone remembers this. Moving on.<br />
<br />
That audio signal is a one-dimensional function, a single value<br />
varying over time. If we slow the [[WikiPedia:Oscilloscope|'scope]] down a bit... that should be<br />
a little easier to see. A few other aspects of the signal are<br />
important. It's [[WikiPedia:Continuous_function|continuous]] in both value and time; that is, at any<br />
given time it can have any real value, and there's a smoothly varying<br />
value at every point in time. No matter how much we zoom in, there<br />
are no discontinuities, no singularities, no instantaneous steps or<br />
points where the signal ceases to exist. It's defined<br />
everywhere. Classic continuous math works very well on these signals.<br />
<br />
A digital signal on the other hand is [[WikiPedia:Discrete_math|discrete]] in both value and time.<br />
In the simplest and most common system, called [[WikiPedia:Pulse code modulation|Pulse Code Modulation]],<br />
one of a fixed number of possible values directly represents the<br />
instantaneous signal amplitude at points in time spaced a fixed<br />
distance apart. The end result is a stream of digits.<br />
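As a toy illustration of that description (the rate, bit depth, and test tone below are arbitrary demo choices, not anything from the video):<br />

```python
import math

# PCM in miniature: sample a continuous signal at fixed time steps and
# round each sample to one of a fixed number of levels (16-bit here).
sample_rate = 8000            # samples per second
bits = 16
scale = 2 ** (bits - 1) - 1   # 32767

def pcm_sample(signal, n):
    """n-th PCM sample of signal(t): discrete in time AND in value."""
    t = n / sample_rate                 # fixed spacing in time
    return round(signal(t) * scale)     # one of 2**bits integer values

signal = lambda t: math.sin(2 * math.pi * 440 * t)   # a 440 Hz tone
stream = [pcm_sample(signal, n) for n in range(8)]   # "a stream of digits"
```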
<br />
Now this looks an awful lot like this. It seems intuitive that we<br />
should somehow be able to rigorously transform one into the other, and<br />
good news, the [[WikiPedia:Nyquist-Shannon sampling theorem|Sampling Theorem]] says we can and tells us<br />
how. Published in its most recognizable form by [[WikiPedia:Claude Shannon|Claude Shannon]] in 1949<br />
and built on the work of [[WikiPedia:Harry Nyquist|Nyquist]], and [[WikiPedia:Ralph Hartley|Hartley]], and tons of others, the<br />
sampling theorem not only says that we can go back and<br />
forth between analog and digital, but also lays<br />
down a set of conditions under which conversion is lossless and the two<br />
representations become equivalent and interchangeable. When the<br />
lossless conditions aren't met, the sampling theorem tells us how and<br />
how much information is lost or corrupted.<br />
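The lossless case can be illustrated numerically (a sketch only: the theorem's infinite reconstruction sum is truncated here, so the result is merely very close rather than exact; all values are demo choices):<br />

```python
import math

# Sample a tone well below the Nyquist frequency, then reconstruct the
# continuous signal between sample instants with (truncated) sinc
# interpolation, per the sampling theorem.
fs = 100.0                    # sample rate, Hz; Nyquist frequency is 50 Hz
f = 10.0                      # tone frequency, well below Nyquist
x = lambda t: math.sin(2 * math.pi * f * t)

def sinc(u):
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(t, n_taps=500):
    """Shannon reconstruction from samples x(n/fs), truncated to +/- n_taps."""
    n0 = round(fs * t)
    return sum(x(n / fs) * sinc(fs * t - n)
               for n in range(n0 - n_taps, n0 + n_taps + 1))

t = 0.01234                   # a point between sample instants
err = abs(reconstruct(t) - x(t))   # tiny: the samples carry the signal
```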
<br />
Up until very recently, analog technology was the basis for<br />
practically everything done with audio, and that's not because most<br />
audio comes from an originally analog source. You may also think that<br />
since computers are fairly recent, analog signal technology must have<br />
come first. Nope. Digital is actually older. The [[WikiPedia:Telegraph|telegraph]] predates<br />
the telephone by half a century and was already fully mechanically<br />
automated by the 1860s, sending coded, multiplexed digital signals<br />
long distances. You know... [[WikiPedia:Tickertape|tickertape]]. Harry Nyquist of [[WikiPedia:Bell_labs|Bell Labs]] was<br />
researching telegraph pulse transmission when he published his<br />
description of what later became known as the [[WikiPedia:Nyquist_frequency|Nyquist frequency]], the<br />
core concept of the sampling theorem. Now, it's true the telegraph<br />
was transmitting symbolic information, text, not a digitized analog<br />
signal, but with the advent of the telephone and radio, analog and<br />
digital signal technology progressed rapidly and side-by-side.<br />
<br />
Audio had always been manipulated as an analog signal because... well,<br />
gee, it's so much easier. A [[WikiPedia:Low-pass_filter#Continuous-time_low-pass_filters|second-order low-pass filter]], for example,<br />
requires two passive components. An all-analog [[WikiPedia:Short-time_Fourier_transform|short-time Fourier<br />
transform]], a few hundred. Well, maybe a thousand if you want to build<br />
something really fancy (bang on the [http://www.testequipmentdepot.com/usedequipment/hewlettpackard/spectrumanalyzers/3585a.htm 3585]). Processing signals<br />
digitally requires millions to billions of transistors running at<br />
microwave frequencies, support hardware at very least to digitize and<br />
reconstruct the analog signals, a complete software ecosystem for<br />
programming and controlling that billion-transistor juggernaut,<br />
digital storage just in case you want to keep any of those bits for<br />
later...<br />
<br />
So we come to the conclusion that analog is the only practical way to<br />
do much with audio... well, unless you happen to have a billion<br />
transistors and all the other things just lying around. And [[WikiPedia:File:Transistor_Count_and_Moore's_Law_-_2008.svg|since we<br />
do]], digital signal processing becomes very attractive.<br />
<br />
For one thing, analog componentry just doesn't have the flexibility of<br />
a general purpose computer. Adding a new function to this<br />
beast [the 3585]... yeah, it's probably not going to happen. On a digital<br />
processor though, just write a new program. Software isn't trivial,<br />
but it is a lot easier.<br />
<br />
Perhaps more importantly though every analog component is an<br />
approximation. There's no such thing as a perfect transistor, or a<br />
perfect inductor, or a perfect capacitor. In analog, every component<br />
adds [[WikiPedia:Johnson–Nyquist_noise|noise]] and [[WikiPedia:Distortion#Electronic_signals|distortion]], usually not very much, but it adds up. Just<br />
transmitting an analog signal, especially over long distances,<br />
progressively, measurably, irretrievably corrupts it. Besides, all of<br />
those single-purpose analog components take up a lot of space. Two<br />
lines of code on the billion transistors back here can implement a<br />
filter that would require an [[WikiPedia:Inductor|inductor]] the size of a refrigerator.<br />
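That "two lines of code" is barely an exaggeration. As an illustration (a sketch in Python, not anything from the video), a first-order low-pass&mdash;the digital cousin of a single resistor-capacitor stage&mdash;is essentially one update per sample:<br />

```python
def one_pole_lowpass(samples, a):
    """First-order IIR low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1]).

    The coefficient a (0 < a <= 1) sets the cutoff; smaller = lower."""
    y, out = 0.0, []
    for x in samples:
        y += a * (x - y)          # the entire filter is this one line
        out.append(y)
    return out
```

Feeding it a unit step shows the familiar exponential RC-style rise: one_pole_lowpass([1.0] * 5, 0.5) yields 0.5, 0.75, 0.875, 0.9375, 0.96875.<br />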
<br />
Digital systems don't have these drawbacks. Digital signals can be<br />
stored, copied, manipulated, and transmitted without adding any noise<br />
or distortion. We do use [[WikiPedia:Lossy_compression|lossy]] algorithms from time to time, but the<br />
only unavoidably non-ideal steps are digitization and reconstruction,<br />
where digital has to interface with all of that messy analog. Messy<br />
or not, modern [[WikiPedia:Digital-to-analog_converter|conversion stages]] are very, very good. By the<br />
standards of our ears, we can consider them practically lossless as<br />
well.<br />
<br />
With a little extra hardware, then, most of which is now small and<br />
inexpensive due to our modern industrial infrastructure, digital audio<br />
is the clear winner over analog. So let us then go about storing it,<br />
copying it, manipulating it, and transmitting it.<br />
<br />
<center><div style="background-color:#DDDDFF;border-color:#CCCCDD;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
'''Going deeper…'''<br />
*Wikipedia: [[WikiPedia:Nyquist–Shannon_sampling_theorem|Nyquist–Shannon sampling theorem]]<br />
*MIT OpenCourseWare [http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-003-signals-and-systems-spring-2010/lecture-notes/ Lecture notes from 6.003 signals and systems.]<br />
*Wikipedia: [[WikiPedia:Passive_analogue_filter_development|The history of analog filters]] such as the [[WikiPedia:RC circuit|RC low-pass]] shown connected to the [[wikipedia:Spectrum_analyzer|spectrum analyzer]] in the video.<br />
</div></center><br />
<br />
<br style="clear:both;"/><br />
<br />
==Raw (digital audio) meat==<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Raw_.28digital_audio.29_meat|Discuss this section]]</small><br />
<br />
Pulse Code Modulation is the most common representation for <br />
raw audio. Other practical representations do exist: for example, the<br />
[[WikiPedia:Delta-sigma_modulation|Sigma-Delta coding]] used by the [[WikiPedia:Super_Audio_CD|SACD]], which is a form of [[wikipedia:Pulse-density_modulation|Pulse Density<br />
Modulation]]. That said, Pulse Code Modulation is far<br />
and away dominant, mainly because it's so mathematically<br />
convenient. An audio engineer can spend an entire career without<br />
running into anything else.<br />
<br />
PCM encoding can be characterized in three parameters, making it easy<br />
to account for every possible PCM variant with mercifully little<br />
hassle.<br />
<br style="clear:both;"/><br />
===sample rate===<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Raw_.28digital_audio.29_meat|Discuss this section]]</small><br />
[[Image:Dmpfg_009.jpg|360px|right]]<br />
[[Image:Dmpfg_008.jpg|360px|right]]<br />
The first parameter is the [[wikipedia:Sampling_rate|sampling rate]]. The highest frequency an<br />
encoding can represent is called the Nyquist Frequency. The Nyquist<br />
frequency of PCM happens to be exactly half the sampling rate.<br />
Therefore, the sampling rate directly determines the highest possible<br />
frequency in the digitized signal.<br />
<br />
Analog telephone systems traditionally [[wikipedia:Bandlimiting|band-limited]] voice channels to<br />
just under 4kHz, so digital telephony and most classic voice<br />
applications use an 8kHz sampling rate: the minimum sampling rate<br />
necessary to capture the entire bandwidth of a 4kHz channel. This is<br />
what an 8kHz sampling rate sounds like&mdash;a bit muffled but perfectly<br />
intelligible for voice. This is the lowest sampling rate that's ever<br />
been used widely in practice.<br />
<br />
From there, as power, and memory, and storage increased, consumer<br />
computer hardware went to offering 11, and then 16, and then 22, and<br />
then 32kHz sampling. With each increase in the sampling rate and the<br />
Nyquist frequency, it's obvious that the high end becomes a little<br />
clearer and the sound more natural.<br />
<br />
The Compact Disc uses a 44.1kHz sampling rate, which is again slightly<br />
better than 32kHz, but the gains are becoming less distinct. 44.1kHz<br />
is a bit of an oddball choice, a legacy of early digital recorders that<br />
stored audio samples on modified video equipment, but the huge success<br />
of the CD has made it a common rate.<br />
<br />
The most common hi-fidelity sampling rate aside from the CD is 48kHz.<br />
There's virtually no audible difference between the two. This video,<br />
or at least the original version of it, was shot and produced with<br />
48kHz audio, which happens to be the original standard for<br />
high-fidelity audio with video.<br />
<br />
Super-hi-fidelity sampling rates of 88, and 96, and 192kHz have also<br />
appeared. The reason for the sampling rates beyond 48kHz isn't to<br />
extend the audible high frequencies further. It's for a different<br />
reason.<br />
<br />
Stepping back for just a second, the French mathematician [[wikipedia:Joseph_Fourier|Jean<br />
Baptiste Joseph Fourier]] showed that we can also think of signals like<br />
audio as a set of component frequencies. This [[wikipedia:Frequency_domain|frequency-domain]]<br />
representation is equivalent to the time representation; the signal is<br />
exactly the same, we're just looking at it [[wikipedia:Basis_(linear_algebra)|a different way]]. Here we see the<br />
frequency-domain representation of a hypothetical analog signal we<br />
intend to digitally sample.<br />
<br />
The sampling theorem tells us two essential things about the sampling<br />
process. First, that a digital signal can't represent any<br />
frequencies above the Nyquist frequency. Second, and this is the new<br />
part, if we don't remove those frequencies with a low-pass [[wikipedia:Audio_filter|filter]]<br />
before sampling, the sampling process will fold them down into the<br />
representable frequency range as [[wikipedia:Aliasing|aliasing distortion]].<br />
<br />
Aliasing, in a nutshell, sounds freakin' awful, so it's essential to<br />
remove any beyond-Nyquist frequencies before sampling and after<br />
reconstruction.<br />
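To make the fold-down concrete, here's a small numerical sketch (Python; the 30kHz tone is a hypothetical example, not from the video). Sampled at 48kHz, a tone 6kHz above the 24kHz Nyquist frequency produces exactly the same sample values as a tone 6kHz below it:<br />

```python
import math

FS = 48000  # sampling rate; Nyquist frequency is FS / 2 = 24 kHz

def sample_tone(freq_hz, count):
    """Sample a unit-amplitude sine of the given frequency at FS."""
    return [math.sin(2 * math.pi * freq_hz * n / FS) for n in range(count)]

above = sample_tone(30000, 16)   # 6 kHz beyond Nyquist
alias = sample_tone(18000, 16)   # its reflection: 24k - 6k = 18 kHz

# Identical magnitudes, inverted phase: once sampled, the two tones
# are indistinguishable, which is exactly why the low-pass must come
# *before* the sampler.
assert all(abs(a + b) < 1e-9 for a, b in zip(above, alias))
```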
<br />
Human frequency perception is considered to extend to about 20kHz. In<br />
44.1 or 48kHz sampling, the low pass before the sampling stage has to<br />
be extremely sharp to avoid cutting any audible frequencies below<br />
[[wikipedia:Hearing_range|20kHz]] but still not allow frequencies above the Nyquist to leak<br />
forward into the sampling process. This is a difficult filter to<br />
build, and no practical filter succeeds completely. If the sampling<br />
rate is 96kHz or 192kHz on the other hand, the low pass has an extra<br />
[[wikipedia:Octave_(electronics)|octave]] or two for its [[wikipedia:Transition_band|transition band]]. This is a much easier filter to<br />
build. Sampling rates beyond 48kHz are actually one of those messy<br />
analog stage compromises.<br />
<br style="clear:both;"/><br />
<br />
===sample format===<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Raw_.28digital_audio.29_meat|Discuss this section]]</small><br />
[[Image:Dmpfg_anim.gif|right]]<br />
<br />
The second fundamental PCM parameter is the sample format; that is,<br />
the format of each digital number. A number is a number, but a number<br />
can be represented in bits a number of different ways.<br />
<br />
Early PCM was [[wikipedia:Quantization_(sound_processing)#Audio_quantization|eight-bit]] [[wikipedia:Linear_pulse_code_modulation|linear]], encoded as an [[wikipedia:Signedness|unsigned]] [[wikipedia:Integer_(computer_science)#Bytes_and_octets|byte]]. The<br />
[[wikipedia:Dynamic_range#Audio|dynamic range]] is limited to about [[wikipedia:Decibel|50dB]] and the [[wikipedia:Quantization_error|quantization noise]], as<br />
you can hear, is pretty severe. Eight-bit audio is vanishingly rare<br />
today.<br />
<br />
Digital telephony typically uses one of two related non-linear eight<br />
bit encodings called [[wikipedia:A-law_algorithm|A-law]] and [[wikipedia:Μ-law_algorithm|μ-law]]. These formats encode a roughly<br />
[[wikipedia:Audio_bit_depth#Dynamic_range|14 bit dynamic range]] into eight bits by spacing the higher amplitude<br />
values farther apart. A-law and mu-law obviously improve quantization<br />
noise compared to linear 8-bit, and voice harmonics especially hide<br />
the remaining quantization noise well. All three eight-bit encodings,<br />
linear, A-law, and mu-law, are typically paired with an 8kHz sampling<br />
rate, though I'm demonstrating them here at 48kHz.<br />
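The companding idea itself fits in a few lines of Python. This sketch follows the textbook mu-law curve with mu = 255; the real G.711 wire format packs its bits differently, so treat it as an illustration of the curve, not the codec:<br />

```python
import math

MU = 255.0   # the mu parameter used in North American / Japanese telephony

def mulaw_encode(x):
    """Compress a linear sample in [-1, 1] to one of 256 levels.

    Quiet samples get closely spaced levels, loud samples widely spaced
    ones -- roughly a 14-bit range squeezed into 8 bits."""
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return round((y + 1.0) / 2.0 * 255)          # 8-bit code, 0..255

def mulaw_decode(code):
    """Expand an 8-bit code back to a linear sample."""
    y = code / 255.0 * 2.0 - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# Round-trip error near zero is far smaller than a linear 8-bit step:
quiet = mulaw_decode(mulaw_encode(0.01))
assert abs(quiet - 0.01) < 1.0 / 256   # a linear 8-bit step is ~1/128
```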
<br />
Most modern PCM uses 16- or 24-bit [[wikipedia:Two's_complement|two's-complement]] signed integers to<br />
encode the range from negative infinity to zero decibels in 16 or 24<br />
bits of precision. The maximum absolute value corresponds to zero decibels.<br />
As with all the sample formats so far, signals beyond zero decibels, and thus<br />
beyond the maximum representable range, are [[wikipedia:Clipping_(audio)|clipped]].<br />
<br />
In mixing and mastering, it's not unusual to use [[wikipedia:Floating_point|floating-point]]<br />
numbers for PCM instead of [[wikipedia:Integer_(computer_science)|integers]]. A 32 bit [[wikipedia:IEEE_754-2008|IEEE754]] float, that's<br />
the normal kind of floating point you see on current computers, has 24<br />
bits of resolution, but an eight-bit floating-point exponent increases<br />
the representable range. Floating point usually represents zero<br />
decibels as +/-1.0, and because floats can obviously represent<br />
considerably beyond that, temporarily exceeding zero decibels during<br />
the mixing process doesn't cause clipping. Floating-point PCM takes<br />
up more space, so it tends to be used only as an intermediate<br />
production format.<br />
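A small sketch (Python, invented numbers) of why that float headroom matters: summing two signals can push an intermediate result past +/-1.0 harmlessly, but the final conversion to 16-bit integers clips anything beyond zero decibels:<br />

```python
def float_to_int16(x):
    """Convert a +/-1.0 float sample to 16-bit PCM, clipping at 0 dB."""
    return max(-32768, min(32767, round(x * 32767)))

a = [0.8, 0.5]
b = [0.6, 0.2]
mix = [s1 + s2 for s1, s2 in zip(a, b)]   # [1.4, 0.7]

assert mix[0] > 1.0                       # fine as a float intermediate
pcm = [float_to_int16(s) for s in mix]
assert pcm[0] == 32767                    # ...but clipped on conversion
assert pcm[1] == 22937                    # in-range samples survive intact
```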
<br />
Lastly, most general purpose computers still read and<br />
write data in octet bytes, so it's important to remember that samples<br />
bigger than eight bits can be in [[wikipedia:Endianness|big- or little-endian order]], and both<br />
endiannesses are common. For example, Microsoft [[wikipedia:WAV|WAV]] files are little-endian,<br />
and Apple [[wikipedia:AIFC|AIFC]] files tend to be big-endian. Be aware of it.<br />
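Python's struct module makes the distinction easy to see (a sketch; '<' requests little-endian packing, '>' big-endian):<br />

```python
import struct

sample = -12345                          # one 16-bit signed sample

little = struct.pack('<h', sample)       # WAV-style byte order
big    = struct.pack('>h', sample)       # AIFC-style byte order

assert little == big[::-1]               # same value, bytes swapped
# Misreading the byte order silently decodes a very different sample:
assert struct.unpack('>h', little)[0] != sample
assert struct.unpack('<h', little)[0] == sample
```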
<br style="clear:both;"/><br />
<br />
===channels===<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Raw_.28digital_audio.29_meat|Discuss this section]]</small><br />
<br />
The third PCM parameter is the number of [[wikipedia:Multichannel_audio|channels]]. The convention in<br />
raw PCM is to encode multiple channels by interleaving the samples of<br />
each channel together into a single stream. Straightforward and extensible.<br />
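In code, the interleaving convention is a one-liner each way (a sketch with two toy channels):<br />

```python
left  = [10, 11, 12]
right = [20, 21, 22]

# Interleave: L0 R0 L1 R1 L2 R2 -- one "frame" of samples per instant.
stream = [s for frame in zip(left, right) for s in frame]
assert stream == [10, 20, 11, 21, 12, 22]

# De-interleaving is just striding back out of the stream.
assert stream[0::2] == left
assert stream[1::2] == right
```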
<br style="clear:both;"/><br />
===done!===<br />
<br />
And that's it! That describes every PCM representation ever. Done.<br />
Digital audio is ''so easy''! There's more to do of course, but at this<br />
point we've got a nice useful chunk of audio data, so let's get some<br />
video too.<br />
<br />
<br />
<center><div style="background-color:#DDDDFF;border-color:#CCCCDD;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
'''Going deeper…'''<br />
* [[wikipedia:Roll-off|Wikipedia's article on filter roll-off]], to learn why it's hard to build analog filters with a very narrow [[wikipedia:Transition_band|transition band]] between the [[wikipedia:Passband|passband]] and the [[wikipedia:Stopband|stopband]]. Filters that achieve such hard edges often do so at the expense of increased [[wikipedia:Ripple_(filters)#Frequency-domain_ripple|ripple]] and [http://www.ocf.berkeley.edu/~ashon/audio/phase/phaseaud2.htm phase distortion].<br />
* [http://wiki.multimedia.cx/index.php?title=PCM Some more minutiae] about PCM in practice.<br />
* [[wikipedia:DPCM|DPCM]] and [[wikipedia:ADPCM|ADPCM]], simple audio codecs loosely inspired by PCM.<br />
</div></center><br />
<br />
<br style="clear:both;"/><br />
<br />
==Video vegetables (they're good for you!)==<br />
[[Image:Dmpfg_010.jpg|360px|right]]<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Video_vegetables_.28they.27re_good_for_you.21.29|Discuss this section]]</small><br />
<br />
One could think of video as being like audio but with two additional<br />
spatial dimensions, X and Y, in addition to the dimension of time.<br />
This is mathematically sound. The Sampling Theorem applies to all<br />
three video dimensions just as it does the single time dimension of<br />
audio.<br />
<br />
Audio and video are obviously quite different in practice. For one,<br />
compared to audio, video is huge. [[wikipedia:Red_Book_(audio_Compact_Disc_standard)#Technical_details|Raw CD audio]] is about 1.4 megabits<br />
per second. Raw [[wikipedia:1080i|1080i]] HD video is over 700 megabits per second. That's<br />
more than 500 times more data to capture, process, and store per<br />
second. By [[wikipedia:Moore's_law|Moore's law]]... that's... let's see... roughly nine<br />
doublings at eighteen months or so apiece, so yeah, computers requiring<br />
about an extra fifteen years to handle raw video after getting raw<br />
audio down pat was about right.<br />
<br />
Basic raw video is also just more complex than basic raw audio. The<br />
sheer volume of data currently necessitates a representation more<br />
efficient than the linear PCM used for audio. In addition, electronic<br />
video comes almost entirely from broadcast television alone, and the<br />
standards committees that govern broadcast video have always been very<br />
concerned with backward compatibility. Up until just last year in the<br />
US, a sixty-year-old black and white television could still show a<br />
normal [[wikipedia:NTSC|analog television broadcast]]. That's actually a really neat<br />
trick.<br />
<br />
The downside to backward compatibility is that once a detail makes it<br />
into a standard, you can't ever really throw it out again. Electronic<br />
video has never started over from scratch the way audio has multiple<br />
times. Sixty years worth of clever but obsolete hacks necessitated by<br />
the passing technology of a given era have built up into quite a pile,<br />
and because digital standards also come from broadcast television, all<br />
these eldritch hacks have been brought forward into the digital<br />
standards as well.<br />
<br />
In short, there are a whole lot more details involved in digital video<br />
than there were with audio. There's no hope of covering them<br />
all completely here, so we'll cover the broad fundamentals.<br />
<br style="clear:both;"/><br />
===resolution and aspect===<br />
[[Image:Dmpfg_011.jpg|360px|right]]<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Video_vegetables_.28they.27re_good_for_you.21.29|Discuss this section]]</small><br />
<br />
The most obvious raw video parameters are the width and height of the<br />
picture in pixels. As simple as that may sound, the pixel dimensions<br />
alone don't actually specify the absolute width and height of the<br />
picture, as most broadcast-derived video doesn't use square pixels.<br />
The number of [[wikipedia:Scan_line|scanlines]] in a broadcast image was fixed, but the<br />
effective number of horizontal pixels was a function of channel<br />
[[wikipedia:Bandwidth_(signal_processing)|bandwidth]]. Effective horizontal resolution could result in pixels that<br />
were either narrower or wider than the spacing between scanlines.<br />
<br />
Standards have generally specified that digitally sampled video should<br />
reflect the real resolution of the original analog source, so a large<br />
amount of digital video also uses non-square pixels. For example, a<br />
normal 4:3 aspect NTSC DVD is typically encoded with a display<br />
resolution of [[wikipedia:DVD-Video#Frame_size_and_frame_rate|704 by 480]], a ratio wider than 4:3. In this case, the<br />
pixels themselves are assigned an aspect ratio of [[wikipedia:Standard-definition_television#Resolution|10:11]], making them<br />
taller than they are wide and narrowing the image horizontally to the<br />
correct aspect. Such an image has to be resampled to show properly on<br />
a digital display with square pixels.<br />
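The arithmetic is worth doing once (a sketch in Python, using the DVD numbers above): scaling the 704 stored pixels by the 10:11 pixel aspect ratio recovers the familiar 4:3 picture:<br />

```python
from fractions import Fraction

stored_w, stored_h = 704, 480        # pixel dimensions as encoded
par = Fraction(10, 11)               # pixel aspect ratio: width:height

square_pixel_w = stored_w * par      # width in square-pixel units
assert square_pixel_w == 640         # 704 * 10/11

display_aspect = square_pixel_w / stored_h
assert display_aspect == Fraction(4, 3)
```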
<br style="clear:both;"/><br />
===frame rate and interlacing===<br />
[[Image:Dmpfg_012.jpg|360px|right]]<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Video_vegetables_.28they.27re_good_for_you.21.29|Discuss this section]]</small><br />
<br />
The second obvious video parameter is the [[wikipedia:Frame_rate|frame rate]], the number of<br />
full frames per second. Several standard frame rates are in active<br />
use. Digital video, in one form or another, can use all of them. Or,<br />
any other frame rate. Or even variable rates where the frame rate<br />
changes adaptively over the course of the video. The higher the frame<br />
rate, the smoother the motion, and that brings us, unfortunately, to<br />
[[wikipedia:Interlace|interlacing]].<br />
<br />
In the very earliest days of broadcast video, engineers sought the<br />
fastest practical frame rate to smooth motion and to minimize [[wikipedia:Flicker_(screen)|flicker]]<br />
on phosphor-based [[wikipedia:Cathode_ray_tube|CRTs]]. They were also under pressure to use the<br />
least possible bandwidth for the highest resolution and fastest frame<br />
rate. Their solution was to interlace the video where the even lines<br />
are sent in one pass and the odd lines in the next. Each pass is<br />
called a field and two fields sort of produce one complete frame.<br />
"Sort of", because the even and odd fields aren't actually from the<br />
same source frame. In a 60 field per second picture, the source frame<br />
rate is actually 60 full frames per second, and half of each frame,<br />
every other line, is simply discarded. This is why we can't<br />
[[wikipedia:Deinterlacing|deinterlace]] a video simply by combining two fields into one frame;<br />
they're not actually from one frame to begin with.<br />
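A toy sketch (Python; the "scanlines" are just labels) of why a naive weave fails: the even field comes from one source frame and the odd field from the next, so recombining them mixes two instants:<br />

```python
# Two consecutive source frames, six scanlines each, captured 1/60s apart.
frame_t0 = ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']
frame_t1 = ['B0', 'B1', 'B2', 'B3', 'B4', 'B5']

even_field = frame_t0[0::2]      # lines 0, 2, 4 -- the rest is discarded
odd_field  = frame_t1[1::2]      # lines 1, 3, 5 of the *next* frame

# Weaving the fields back into one "frame" interleaves the two instants,
# which is what produces combing on anything in motion.
woven = [line for pair in zip(even_field, odd_field) for line in pair]
assert woven == ['A0', 'B1', 'A2', 'B3', 'A4', 'B5']
```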
<br style="clear:both;"/><br />
<br />
===gamma===<br />
[[Image:Dmpfg_013.jpg|360px|right]]<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Video_vegetables_.28they.27re_good_for_you.21.29|Discuss this section]]</small><br />
<br />
The cathode ray tube was the only available display technology for<br />
most of the history of electronic video. A CRT's output brightness is<br />
nonlinear, approximately equal to the input controlling voltage raised<br />
to the 2.5th power. This exponent, 2.5, is designated gamma, and so<br />
it's often referred to as the gamma of a display. Cameras, though,<br />
are linear, and if you feed a CRT a linear input signal, it looks a<br />
bit like this.<br />
<br />
As there were originally to be very few cameras, which were<br />
fantastically expensive anyway, and hopefully many, many television<br />
sets which had best be as inexpensive as possible, engineers decided to<br />
add the necessary [[wikipedia:Gamma_correction|gamma correction]] circuitry to the cameras rather<br />
than the sets. Video transmitted over the airwaves would thus have a<br />
nonlinear intensity using the inverse of the set's gamma exponent, so that<br />
once a camera's signal was finally displayed on the CRT, the overall<br />
response of the system from camera to set was back to linear again.<br />
<br />
Almost.<br />
<br />
There were also two other tweaks. A television camera actually uses a<br />
gamma exponent that's the inverse of 2.2, not 2.5. That's just a<br />
correction for viewing in a dim environment. Also, the exponential<br />
curve transitions to a linear ramp near black. That's just an old<br />
hack for suppressing sensor noise in the camera.<br />
<br />
Gamma correction also had a lucky benefit. It just so happens that the<br />
human eye has a perceptual gamma of about 3. This is relatively close<br />
to the CRT's gamma of 2.5. An image using gamma correction devotes<br />
more resolution to lower intensities, where the eye happens to have<br />
its finest intensity discrimination, and therefore uses the available<br />
scale resolution more efficiently. Although CRTs are currently<br />
vanishing, a standard [[wikipedia:sRGB|sRGB]] computer display still uses a nonlinear<br />
intensity curve similar to television, with a linear ramp near black,<br />
followed by an exponential curve with a gamma exponent of 2.4. This<br />
encodes a sixteen bit linear range down into eight bits.<br />
<br style="clear:both;"/><br />
<br />
===color and colorspace===<br />
[[Image:Dmpfg_014.jpg|360px|right]]<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Video_vegetables_.28they.27re_good_for_you.21.29|Discuss this section]]</small><br />
<br />
The human eye has three apparent color channels, red, green, and blue,<br />
and most displays use these three colors as [[wikipedia:Additive_color|additive primaries]] to<br />
produce a full range of color output. The primary pigments in<br />
printing are [[wikipedia:CMYK|Cyan, Magenta, and Yellow]] for the same reason; pigments<br />
are [[wikipedia:Subtractive_color|subtractive]], and each of these pigments subtracts one pure color<br />
from reflected light. Cyan subtracts red, magenta subtracts green, and<br />
yellow subtracts blue.<br />
<br />
Video can be, and sometimes is, represented with red, green, and blue<br />
color channels, but RGB video is atypical. The human eye is far more<br />
sensitive to [[wikipedia:Luminance_(relative)|luminosity]] than it is to color, and RGB tends to spread<br />
the energy of an image across all three color channels. That is, the<br />
red plane looks like a red version of the original picture, the green<br />
plane looks like a green version of the original picture, and the blue<br />
plane looks like a blue version of the original picture. Black and<br />
white times three. Not efficient.<br />
<br />
For those reasons and because, oh hey, television just happened to<br />
start out as black and white anyway, video usually is represented as a<br />
high resolution [[wikipedia:Luma_(video)|luma channel]]&mdash;the black & white&mdash;along with<br />
additional, often lower resolution [[wikipedia:Chrominance|chroma channels]], the color. The<br />
luma channel, Y, is produced by weighting and then adding the separate<br />
red, green and blue signals. The chroma channels U and V are then<br />
produced by subtracting the luma signal from blue and the luma signal<br />
from red.<br />
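Written out with the classic Rec. 601 luma weights (a sketch; real digital systems also scale and offset the difference channels, which is skipped here):<br />

```python
def rgb_to_yuv(r, g, b):
    """Rec. 601 luma plus raw (unscaled) color-difference channels."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # weighted sum of R, G, B
    u = b - y                               # chroma: blue minus luma
    v = r - y                               # chroma: red minus luma
    return y, u, v

# A grey pixel carries luma only -- both difference channels vanish:
y, u, v = rgb_to_yuv(0.5, 0.5, 0.5)
assert abs(y - 0.5) < 1e-9
assert abs(u) < 1e-9 and abs(v) < 1e-9
```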
<br />
When YUV is scaled, offset, and quantized for digital video, it's<br />
usually more correctly called [[wikipedia:Y'CbCr|Y'CbCr]], but the more generic term YUV is<br />
widely used to describe all the analog and digital variants of this<br />
color model.<br />
<br style="clear:both;"/><br />
<br />
===chroma subsampling===<br />
[[Image:Dmpfg_015.jpg|360px|right]]<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Video_vegetables_.28they.27re_good_for_you.21.29|Discuss this section]]</small><br />
<br />
The U and V chroma channels can have the same resolution as the Y<br />
channel, but because the human eye has far less spatial color<br />
resolution than spatial luminosity resolution, chroma resolution is<br />
usually [[wikipedia:Chroma_subsampling|halved or even quartered]] in the horizontal direction, the<br />
vertical direction, or both, usually without any significant impact on the<br />
apparent raw image quality. Practically every possible subsampling<br />
variant has been used at one time or another, but the common choices<br />
today are [[wikipedia:Chroma_subsampling#4:4:4_Y.27CbCr|4:4:4]] video, which isn't actually subsampled at all, [[wikipedia:Chroma_subsampling#4:2:2|4:2:2]] video in<br />
which the horizontal resolution of the U and V channels is halved, and<br />
most common of all, [[wikipedia:Chroma_subsampling#4:2:0|4:2:0]] video in which both the horizontal and vertical<br />
resolutions of the chroma channels are halved, resulting in U and V<br />
planes that are each one quarter the size of Y.<br />
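The bookkeeping for those plane sizes is a couple of lines (a sketch; the divisions assume even frame dimensions):<br />

```python
def plane_sizes(width, height, subsampling):
    """Samples per plane (Y, U, V) for the common subsampling choices."""
    divisors = {'4:4:4': (1, 1),   # no subsampling
                '4:2:2': (2, 1),   # halve chroma horizontally
                '4:2:0': (2, 2)}   # halve chroma both ways
    h, v = divisors[subsampling]
    luma = width * height
    chroma = (width // h) * (height // v)
    return luma, chroma, chroma

y, u, v = plane_sizes(704, 480, '4:2:0')
assert u == v == y // 4      # each chroma plane: a quarter the size of Y
assert plane_sizes(704, 480, '4:2:2')[1] == y // 2
```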
<br />
The terms 4:2:2, 4:2:0, [[wikipedia:Chroma_subsampling#4:1:1|4:1:1]], and so on and so forth, aren't complete<br />
descriptions of a chroma subsampling. There are multiple possible ways<br />
to position the chroma pixels relative to luma, and again, several<br />
variants are in active use for each subsampling. For example, [[wikipedia:Motion_Jpeg|motion<br />
JPEG]], [[wikipedia:MPEG-1#Part_2:_Video|MPEG-1 video]], [[wikipedia:MPEG-2#Video_coding_.28simplified.29|MPEG-2 video]], [[wikipedia:DV#DV_Compression|DV]], [[wikipedia:Theora|Theora]], and [[wikipedia:WebM|WebM]] all use or can<br />
use 4:2:0 subsampling, but they site the chroma pixels [http://www.mir.com/DMG/chroma.html three different ways].<br />
<br />
Motion JPEG, MPEG-1 video, Theora and WebM all site chroma pixels<br />
between luma pixels both horizontally and vertically.<br />
<br />
MPEG-2 sites chroma pixels between lines, but horizontally aligned with<br />
every other luma pixel. Interlaced modes complicate things somewhat,<br />
resulting in a siting arrangement that's a tad bizarre.<br />
<br />
And finally PAL-DV, which is always interlaced, places the chroma<br />
pixels in the same position as every other luma pixel in the<br />
horizontal direction, and vertically alternates chroma channel on<br />
each line.<br />
<br />
That's just 4:2:0 video. I'll leave the other subsamplings as homework for the<br />
viewer. Got the basic idea, moving on.<br />
<br style="clear:both;"/><br />
<br />
===pixel formats===<br />
[[Image:Dmpfg_016.jpg|360px|right]]<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Video_vegetables_.28they.27re_good_for_you.21.29|Discuss this section]]</small><br />
<br />
In audio, we always represent multiple channels in a PCM stream by<br />
interleaving the samples from each channel in order. Video uses both<br />
packed formats that interleave the color channels, as well as planar<br />
formats that keep the pixels from each channel together in separate<br />
planes stacked in order in the frame. There are at least [http://www.fourcc.org/yuv.php 50 different formats] in<br />
these two broad categories with possibly ten or fifteen in common use.<br />
<br />
Each chroma subsampling and different bit-depth requires a different<br />
packing arrangement, and so a different pixel format. For a given<br />
unique subsampling, there are usually also several equivalent formats<br />
that consist of trivial channel order rearrangements or repackings, due either to<br />
convenience once-upon-a-time on some particular piece of hardware, or<br />
sometimes just good old-fashioned spite.<br />
<br />
Pixel formats are described by a unique name or [[wikipedia:FourCC|fourcc]] code. There<br />
are quite a few of these and there's no sense going over each one now.<br />
Google is your friend. Be aware that fourcc codes for raw video<br />
specify the pixel arrangement and chroma subsampling, but generally<br />
don't imply anything certain about chroma siting or color space. [http://www.fourcc.org/yuv.php#YV12 YV12]<br />
video, to pick one, can use JPEG, MPEG-2 or DV chroma siting, and any<br />
one of [[wikipedia:YUV#BT.709_and_BT.601|several YUV colorspace definitions]].<br />
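For instance, a YV12 buffer is conventionally laid out as the full Y plane followed by the quarter-size V plane and then U (its near-twin I420 swaps U and V). Computing the plane offsets is straightforward (a sketch, assuming even dimensions and no row padding):<br />

```python
def yv12_layout(width, height):
    """Byte offsets of the planes in a tightly packed 8-bit YV12 buffer."""
    y_size = width * height
    c_size = (width // 2) * (height // 2)    # one 4:2:0 chroma plane
    return {'Y': 0,
            'V': y_size,                     # V precedes U in YV12
            'U': y_size + c_size,
            'total': y_size + 2 * c_size}

buf = yv12_layout(704, 480)
assert buf['total'] == 704 * 480 * 3 // 2    # 1.5 bytes per pixel overall
assert buf['U'] - buf['V'] == (704 // 2) * (480 // 2)
```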
<br style="clear:both;"/><br />
<br />
===done!===<br />
<br />
That wraps up our not-so-quick and yet very incomplete tour of raw<br />
video. The good news is we can already get quite a lot of real work<br />
done using that overview. In plenty of situations, a frame of video<br />
data is a frame of video data. The details matter, greatly, when it<br />
comes time to write software, but for now I am satisfied that the<br />
esteemed viewer is broadly aware of the relevant issues.<br />
<br />
<center><div style="background-color:#DDDDFF;border-color:#CCCCDD;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
'''Going deeper…'''<br />
* YCbCr is defined in terms of RGB by the ITU in two incompatible standards: [[wikipedia:Rec. 601|Rec. 601]] and [[wikipedia:Rec. 709|Rec. 709]]. Both conversion standards are lossy, which has prompted some to adopt a lossless alternative called [http://wiki.multimedia.cx/index.php?title=YCoCg YCoCg].<br />
* Learn about [[wikipedia:High_dynamic_range_imaging|high dynamic range imaging]], which achieves better representation of the full range of brightnesses in the real world by using more than 8 bits per channel.<br />
* Learn about how [[wikipedia:Trichromatic_vision|trichromatic color vision]] works in humans, and how human color perception is encoded in the [[wikipedia:CIE 1931 color space|CIE 1931 XYZ color space]].<br />
** Compare with the [[wikipedia:Lab_color_space|Lab color space]], mathematically equivalent but structured to account for "perceptual uniformity".<br />
** If we were all [[wikipedia:Dichromacy|dichromats]] then video would only need two color channels. Some humans might be [[wikipedia:Tetrachromacy#Possibility_of_human_tetrachromats|tetrachromats]], in which case they would need an additional color channel for video to fully represent their vision.<br />
** [http://www.xritephoto.com/ph_toolframe.aspx?action=coloriq Test your color vision] (or at least your monitor).<br />
</div></center><br />
<br />
<br style="clear:both;"/><br />
<br />
==Containers==<br />
[[Image:Dmpfg_017.jpg|360px|right]]<br />
<small>[[Talk:A_Digital_Media_Primer_For_Geeks_(episode_1)#Containers|Discuss this section]]</small><br />
<br />
So. We have audio data. We have video data. What remains is the more<br />
familiar non-signal data and straight-up engineering that software<br />
developers are used to, and plenty of it.<br />
<br />
Chunks of raw audio and video data have no externally-visible<br />
structure, but they're often uniformly sized. We could just string<br />
them together in a rigid predetermined ordering for streaming and<br />
storage, and some simple systems do approximately that. Compressed<br />
frames, though, aren't necessarily a predictable size, and we usually want<br />
some flexibility in using a range of different data types in streams.<br />
If we string random formless data together, we lose the boundaries<br />
that separate frames and don't necessarily know what data belongs to<br />
which streams. A stream needs some generalized structure to be<br />
generally useful.<br />
<br />
In addition to our signal data, we also have our PCM and video<br />
parameters. There's probably plenty of other [[wikipedia:Metadata#Video|metadata]] we also want to<br />
deal with, like audio tags and video chapters and subtitles, all<br />
essential components of rich media. It makes sense to place this<br />
metadata&mdash;that is, data about the data&mdash;within the media itself.<br />
<br />
Storing and structuring formless data and disparate metadata is the<br />
job of a [[wikipedia:Container_format_(digital)|container]]. Containers provide framing for the data blobs,<br />
interleave and identify multiple data streams, provide timing<br />
information, and store the metadata necessary to parse, navigate,<br />
manipulate, and present the media. In general, any container can hold<br />
any kind of data. And data can be put into any container.<br />
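As a toy illustration of the most basic container job&mdash;framing&mdash;here's an invented length-prefixed format in Python (emphatically not Ogg or any real container; just enough structure that blob boundaries and stream identities survive concatenation):<br />

```python
import struct

def mux(packets):
    """packets: iterable of (stream_id, payload_bytes).

    Each blob gets a 5-byte frame header: a 1-byte stream id plus a
    4-byte little-endian payload length."""
    out = bytearray()
    for stream_id, blob in packets:
        out += struct.pack('<BI', stream_id, len(blob)) + blob
    return bytes(out)

def demux(data):
    """Recover the (stream_id, payload) list from a muxed byte string."""
    packets, pos = [], 0
    while pos < len(data):
        stream_id, length = struct.unpack_from('<BI', data, pos)
        pos += 5
        packets.append((stream_id, data[pos:pos + length]))
        pos += length
    return packets

packets = [(0, b'audio frame'), (1, b'video frame'), (0, b'more audio')]
assert demux(mux(packets)) == packets
```

A real container adds timestamps, seeking structures, and checksums on top of this framing, but the interleave-and-identify core is the same idea.<br />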
<br />
<br />
<center><div style="background-color:#DDDDFF;border-color:#CCCCDD;border-style:solid;width:80%;padding:0 1em 1em 1em;text-align:left;"><br />
'''Going deeper…'''<br />
* There are several common general-purpose container formats: [[wikipedia:Audio_Video_Interleave|AVI]], [[wikipedia:Matroska|Matroska]], [[wikipedia:Ogg|Ogg]], [[wikipedia:QuickTime_File_Format|QuickTime]], and [[wikipedia:Comparison_of_container_formats|many others]]. These can contain and interleave many different types of media streams.<br />
* Some special-purpose containers have been designed that can only hold one format:<br />
** [http://wiki.multimedia.cx/index.php?title=YUV4MPEG2 The y4m format] is the most common single-purpose container for raw YUV video. It can also be stored in a general-purpose container, for example in Ogg using [[OggYUV]].<br />
** MP3 files use a [[wikipedia:MP3#File_structure|special single-purpose file format]].<br />
** [[wikipedia:WAV|WAV]] and [[wikipedia:AIFC|AIFC]] are semi-single-purpose formats. They're audio-only, and typically contain raw PCM audio, but are occasionally used to store other kinds of audio data ... even MP3!<br />
</div></center><br />
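As a taste of how simple a single-purpose container can be, the y4m format mentioned above begins with one plain-text line of space-separated parameter tags (width, height, frame rate, and so on) before the raw frame data. A minimal parser sketch in Python (the function name is our own, and only a few common tags are handled):<br />

```python
def parse_y4m_header(line):
    # Parse a YUV4MPEG2 stream header such as:
    #   "YUV4MPEG2 W640 H480 F30000:1001 Ip A1:1 C420jpeg"
    fields = line.strip().split(" ")
    if fields[0] != "YUV4MPEG2":
        raise ValueError("not a y4m stream")
    params = {}
    for f in fields[1:]:
        tag, value = f[0], f[1:]
        if tag in ("W", "H"):          # width and height in pixels
            params[tag] = int(value)
        elif tag == "F":               # frame rate as a num:den ratio
            num, den = value.split(":")
            params["fps"] = int(num) / int(den)
        else:                          # interlacing, aspect, colourspace, ...
            params[tag] = value
    return params
```

After this header, the actual stream is just `FRAME` markers followed by raw planar YUV data, which is why y4m needs so little machinery compared to a general-purpose container.<br />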
<br />
<br style="clear:both;"/><br />
<br />
==Credits==<br />
[[Image:Dmpfg_018.jpg|360px|right]]<br />
[[Image:Dmpfg_019.png|360px|right]]<br />
<br />
In the past thirty minutes, we've covered digital audio, video, some<br />
history, some math and a little engineering. We've barely scratched the<br />
surface, but it's time for a well-earned break.<br />
<br />
There's so much more to talk about, so I hope you'll join me again in<br />
our next episode. Until then&mdash;Cheers!<br />
<br />
Written by:<br />
Christopher (Monty) Montgomery<br />
and the Xiph.Org Community<br />
<br />
Intro, title and credits music:<br><br />
"Boo Boo Coming", by Joel Forrester<br><br />
Performed by the [http://microscopicseptet.com/ Microscopic Septet]<br><br />
Used by permission of Cuneiform Records.<br><br />
Original source track All Rights Reserved.<br><br />
[http://www.cuneiformrecords.com www.cuneiformrecords.com]<br />
<br />
This Video Was Produced Entirely With Free and Open Source Software:<br><br />
<br />
[http://www.gnu.org/ GNU]<br><br />
[http://www.linux.org/ Linux]<br><br />
[http://fedoraproject.org/ Fedora]<br><br />
[http://cinelerra.org/ Cinelerra]<br><br />
[http://www.gimp.org/ The Gimp]<br><br />
[http://audacity.sourceforge.net/ Audacity]<br><br />
[http://svn.xiph.org/trunk/postfish/README Postfish]<br><br />
[http://gstreamer.freedesktop.org/ Gstreamer]<br><br />
<br />
All trademarks are the property of their respective owners. <br />
<br />
''Complete video'' [http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode CC-BY-NC-SA]<br><br />
''Text transcript and Wiki edition'' [http://creativecommons.org/licenses/by-sa/3.0/legalcode CC-BY-SA]<br><br />
<br />
A Co-Production of Xiph.Org and Red Hat Inc.<br><br />
(C) 2010, Some Rights Reserved<br><br />
<br />
<br style="clear:both;"/><hr/><br />
<center><font size="+1">''[[A Digital Media Primer For Geeks (episode 1)/making|Learn more about the making of this video…]]''</font></center></div>Lee Carréhttps://wiki.xiph.org/index.php?title=StaticPlayers&diff=9390StaticPlayers2008-08-27T20:04:54Z<p>Lee Carré: /* Car Audio */ replaced <s> with more semantic <del></p>
<hr />
<div>== Introduction ==<br />
<br />
On this page you can find all static players that are known to support Vorbis. This includes Hi-Fi components such as CD/DVD players and car audio equipment. For hardware that is able to run third-party software (such as PDAs and video game consoles), please visit [[VorbisSoftwarePlayers]].<br />
<br />
== Hi-Fi components ==<br />
<br />
* [http://actiontec.com/products/tech/broadband/wdmp/wdmp_overview.html Actiontec] Wireless Digital Media Player<br />
:This player is a streaming client for video, audio and images. It supports MP3, AC3, AAC, WAV, WMA, Vorbis and internet radio. Supported picture formats are JPEG, GIF, TIF, BMP and PNG. It can play back MPEG-1/-2/-4, Xvid, RMP4. It has RCA connectors, a digital output, supports HDTV and can surf the internet.<br />
<br />
* [http://www.adstech.com/ ADS Tech's] Media-Link<br />
:This is a streaming client that uses ethernet and WLAN for connecting. It has composite, component and s-video outputs, plus stereo and S/PDIF audio out. It supports MPEG-1/-2/-4, DivX, Xvid, MOV, MP3, Vorbis, AC3, WMA, JPG, BMP, GIF. The server software seems to support only Windows.<br />
<br />
* [http://www.arcam.co.uk/ ARCAM] DV137, DV139, Solo Movie 5.1<br />
:These high-end British home cinema products are primarily DVD-Video and DVD-Audio playback devices. All support playback of Vorbis, MP3 and WMA files from CD-R and DVD-R discs. Other supported media include SACD. Audio performance competes with dedicated high-end CD/DVD-A/SACD players, whilst video can be upscaled to HD resolutions over HDMI. The DV137 and DV139 are player components, whilst the Solo Movie 5.1 is an all-in-one system that includes a DAB/AM/FM radio (territory dependent), various auxiliary inputs and five channels of amplification (5 x 50W RMS into 8 Ohms).<br />
<br />
* [http://www.buffalotech.com/ Buffalo's] PC-P3LWG/DVD<br />
:This product is a DVD player and streaming client with HDTV support. It has wireless and wired networking and a USB port. The media server software only runs on Windows (UPnP AV). It supports many formats: video (SVCD/DVD/DivX HD/Xvid/RealMedia/WMV HD), audio (MP3, Vorbis, WAV, AAC, WMA, AC3) and picture (JPG, GIF, BMP, TIF, PNG). It can be integrated with the NAS solution LinkStation/TeraStation for media storage such that no PC is required.<br />
<br />
* [http://www.cyberhome.com/ Cyberhome's] DVD 635s<br />
:According to this [http://www.dv-rec.de/test/player2005/635/635.html review (German)] on [http://www.dv-rec.de DV-REC], it plays Vorbis and has '''buggy''' OGM video support. The sound quality appears to be very good according to the review, although the review does not specifically address Vorbis sound quality. Some users report troubling noises from the built-in CD/DVD drive.<br />
<br />
* [http://www.digitaltechniques.com/ Digital Technique's] 080S, 160A, 160S, 300A <br />
:These are music servers based on PC technology with a capacity from 80 to 300 GB. They support MP3, Vorbis, FLAC and WAV. <br />
<br />
* [http://www.digitalrise.biz/ DigitalRise's] Xstream Player<br />
:This item is part of the new generation of DVD players like the Kiss DP-600 and the models from I-O Data and Buffalo: it can play DVDs, but also WMV-HD DVDs, and supports all kinds of audio and video codecs: MPEG-1/-2/-4 (incl. DivX), WMV9, AAC, MP3, WMA and Vorbis.<br />
<br />
* [http://www.dlink.com/ D-Link's] DSM-320<br />
:A wired and wireless UPnP streaming media player. Supports decoding Vorbis as of the 1.03 firmware. <br />
<br />
* [http://www.ethernut.de/en/hardware/eir/ EIR Project] Elektor Internet Radio<br />
:Open Source Hardware and Software Project featuring an ARM7 based Internet Radio, which uses VLSI's VS1053 decoder chip.<br />
<br />
* [http://www.hermstedt.de/hifidelio/ Hermstedt's] Hifidelio, Hifidelio Pro<br />
:The Hifidelio is a music server in hi-fi format and designed to produce high-quality sound. It uses a CD/DVD combo drive and can thus rip Audio-CDs and read from DVD-Rs, and is also able to burn CDs. It has an in-built 4-port ethernet switch, a WLAN interface, can connect to the iPod and other portable players through USB 2.0. It can connect to other Hifidelios through the UPnP/AV standard and to iTunes shares (iTunes shopping is a future feature). The songs are stored on the 80 GB harddisk. Supported formats for decoding are: MP3, Vorbis, AAC, WMA, FLAC, WAV. The Hifidelio Pro has a 160 GB hdd and some other advanced features.<br />
<br />
* [http://www.iodata.com/ I-O Data's] AVeL LinkPlayer2<br />
:This piece of hardware is a DVD player and an HDTV streaming client. It supports MPEG-2, DivX, XviD and WMV9 (WMV HD), and as audio tracks PCM, AC3, MP3, AAC, WMA and Vorbis. It can use ethernet, WLAN and USB 2.0 to connect to media. It is available in Japan from September.<br />
<br />
* [http://www.kenwood.com/ Kenwood's] VRS-N8100, DVF-N7080<br />
:The new line of networked hi-fi components is supposed to decode Vorbis over the Ethernet port: the A/V receiver VRS-N8100 and the DVD player DVF-N7080.<br />
<br />
* [http://www.kiss-technology.com/ KISS Technology's] DVD player models (basically all)<br />
:Except for one older model (the DP-330) all DVD/DivX players from Kiss can play Vorbis files from CD-Rs and CD-RWs (but reportedly have trouble with UTF-8 comments that aren&#x2019;t also ASCII), as well as DivX (but not DivX Vorbis).<br />
:<strong>There are reportedly problems with some versions of the firmware (2.6.6 &#x2264; <i>x</i> &#60; 2.7.1)</strong>, where playback is awful for bitrates greater than 128 kbit/s.<br />
<br />
* [http://www.medainc.com/ Meda Systems'] Bravo, Bravado<br />
:These are media servers with up to 500 GB storage. They can be controlled via PDA and support MP3, WAV, WMA, Vorbis and FLAC. They can also connect to the local network via ethernet.<br />
<br />
* [http://www.xbox.com/ Microsoft's] Xbox<br />
:The Xbox is a gaming console based on PC hardware, including a 733 MHz processor, 8 GB harddisk, a DVD drive and an Ethernet port. The console can be [http://waltercedric.com/Mambo/index.php?option=com_content&task=view&id=58&Itemid=40 modded] to allow the installation of third-party software, such as the [http://www.xboxmediacenter.de/ Xbox Media Center] project. Once installed, the Xbox becomes a media center and streaming client. It supports a vast number of audio, video and picture formats, including Vorbis and FLAC.<br />
<br />
* [http://www.mvixusa.com/product.php?product=mx780 Mvix] Wireless Media Player<br />
:Does not include a hard drive; you have to supply your own IDE or SATA drive. Supports a wide variety of wireless and component connections and audio/video formats, including Ogg Vorbis.<br />
<br />
* [http://www.momitsu.com/ Momitsu's] V880N<br />
:The V880N is a disc player and streaming client. It supports DVD, VCD/SVCD, Audio CD, Picture CD, MP3, JPEG, DivX, Xvid on discs and MOV, Vorbis, AAC, WMA, AC3 and internet radio over ethernet. In addition to the usual TV connection it supports digital video (DVI) and audio (coaxial/optical) output in HDTV. It has a LAN interface and a PC card slot for a WLAN card.<br />
<br />
* [http://www.mpsharp.com/ MP Sharp Technologies'] Digital Jukebox<br />
:The MPST Digital Jukebox is a Linux PC designed for audio playback and sold as a stereo component, which of course can play Vorbis.<br />
<br />
* [http://www.netgem.com Netgem's] iPlayer<br />
:The iPlayer is primarily a DVB-T receiver, which includes an in-built modem and can also use a small range of USB ethernet adaptors to connect to a network. Supported media formats include MPEG and MPEG2, MP2 and MP3 and, in the latest release, Vorbis. Technical limitations in the USB controller limit the practical bandwidth of media to around 4 megabits/second. Perhaps the reason for the rather limited range of media formats supported is that the iPlayer is based on low-cost hardware - in the UK Netgem's own branded iPlayer usually retails for around £90. Netgem also host a [http://forum.netgem.com forum]. In addition to the Netgem branded iPlayer in the UK, branded devices are available from other manufacturers such as [http://player.teac.com.au/ Teac] (the ITV-D500, for the Australian market). With the imminent launch of DTT in France, Netgem is also expected to launch a model there.<br />
<br />
* [http://www.neurostechnology.com/ Neuros] [http://www.neurostechnology.com/neuros-osd-specifications OSD]<br />
:Hackable, nay, hack-encouraged, open-source streaming media client that plays many video and audio formats, including Vorbis and FLAC.<br />
<br />
* [http://www.neuston.com/ Neuston's] Maestro DVX-1201<br />
:This is a standalone DVD player that supports Vorbis.<br />
<br />
* [http://www.tuxbox.org/ Nokia/Philips/Sagem] DBox2<br />
:This device, manufactured by Nokia, Philips and Sagem until 2002 in huge numbers for the German Pay-TV provider Premiere, is a DVB-C or DVB-S receiver. It features a 10Mbit Ethernet interface and a nifty graphics display. The original software on this device was always a bit flakey. The alternate Linux-based [http://www.tuxbox.org/ Tuxbox] project includes an audio player that perfectly plays Vorbis files from a NFS or CIFS share. Streaming is in beta state.<br />
<br />
* [http://www.olive.us/ Olive Inc's] Musica<br />
:This is apparently a relabeled Hifidelio Pro for the US market. For details see the Hermstedt entry.<br />
<br />
* [http://support.packardbell.com/uk/item/index.php?m=step2&i=menu_dvd Packard Bell's] DivX 350 DVD, DivX 450 Pro, DVX 460 USB<br />
:According to Packard Bell's website, these players should all be able to play Vorbis audio files. The 350 model needs a firmware upgrade to [http://support.packardbell.com/se/item/index.php?i=instr_releasenotes_fw_divx350pb&pi=platform_divx350pb&dhepn=A000088300 v2.19] to play Vorbis. The 450 Pro exists in three different hardware revisions, not all of which may be Vorbis-enabled.<br />
<br />
* [http://www.phatnoise.com/products/index.php PhatNoise's] Home Player<br />
:The Home Digital Media Player uses the same cartridges as the PhatBox, and supports Vorbis out of the box.<br />
<br />
* [http://www.pinnacleaudio.co.uk Pinnacle Audio] Athenaeum<br />
:The Pinnacle Audio Athenaeum is a high-end music server that plays FLAC and Vorbis. It automatically rips CDs to FLAC, but can also encode to Vorbis. It also supports encoding and playing MP3, but does not support DRM.<br />
<br />
* [http://www.philips.com Philips] DVP-5500S/5505 DVD/DIVX/CD/SACD Player<br />
:Although it's not written in the manual, this player does indeed support Vorbis out of the box (as well as Vorbis in an AVI container, and DivX/Xvid in an OGM container). It is unclear whether there are any limitations, or why this support is not advertised.<br />
<br />
* [http://www.pinnaclesys.com/ Pinnacle's] ShowCenter 200<br />
:This is a streaming box for audio and video. It supports MPEG-1, MPEG-2, MPEG-2 VOB, MPEG-4 AVI, Xvid, WMV9 and even WMV-HD video. Picture formats are JPEG, BMP, PNG and GIF. The box has native support for MP3, WAV, WMA and Vorbis (the latter requires a software and firmware upgrade to version 2.5, freely available from [http://www.pinnaclesys.com/ Pinnacle]).<br />
<br />
* [http://www.pontis.de/site_e/home_e.htm Pontis'] MediaServer MS300, MS330<br />
:The website unfortunately doesn't mention Vorbis support, but it is there, along with MP3. The MS300 is a music server that runs Linux and comes with 80 or a whopping 300 GB of storage. It has an ethernet port that lets other desktops access the music via Samba, and supports hardware streaming clients that use the Slimserver protocol ([http://www.slimdevices.com/ Slimdevices], [http://www.rokulabs.com/ Roku]). The USB port and the memory card slot can be used to read in music from portable players and photos from digital cameras. Pictures can be viewed via SCART on the TV. The MS330 is similar to the MS300, but can also burn CDs from the CD drive, has a 6-in-1 memory card slot and supports MP3, Vorbis and FLAC.<br />
<br />
* [http://www.request.com/ ReQuest Multimedia] all products<br />
:ReQuest home theatre music systems play FLAC and Vorbis songs, and can edit FLAC and Ogg comments. They can encode CDs to FLAC, and transcode WAV to FLAC, but currently cannot encode to Vorbis. FLAC support has been there for many years; they were one of the first hardware makers to support it. Vorbis support has been there since their 2.0 software release. (They also support MP3 and WAV. They do not support any DRM formats and do not enforce any DRM rules.)<br />
<br />
* [http://www.reson.de/ Reson's] rh1<br />
:The rh1 is a Hifidelio which has been modified for audiophile requirements (new D/A component, etc.).<br />
<br />
* [http://www.rokulabs.com/ Roku's] HD1000, M1000, M2000<br />
:Roku's streaming audio clients support the Slimserver from Slim Devices' products (for details see below).<br />
<br />
* [http://www.skipjam.com/imedia_audio_player.php SkipJam's] iMedia Audio Player, iMedia Audio Player Pro<br />
:The iMedia Audio Player is a streaming client with two Ethernet ports and supports MP3, WAV, PCM, WMA, AAC, AC3, FLAC, and Vorbis directly. Through PC server software it also plays M4A and M4P. It has two digital (optical/coaxial) and one analog output. The Pro version can stream the same formats through ethernet or through built-in HomePlug power-line networking, and has a built-in 30W/channel digital amp. The Pro unit is designed for in-wall installation in a 6-gang junction box.<br />
<br />
* [http://www.mysilvercrest.de/ Silvercrest's] KH6510, KH6511, KH6515, KH6516 DVD players<br />
:According to [http://www.hdtv-praxis.de/modules.php?op=modload&name=PagEd&file=index&topic_id=2&page_id=151&ppart=2 these (German) reviews], these players can play Vorbis stereo files, but not multichannel files. Silvercrest is a brand of the German discounter LIDL.<br />
<br />
* [http://www.slimdevices.com/ Slim Devices] Squeezebox, Squeezebox2, Squeezebox3, Transporter<br />
:The Squeezebox is a streaming receiver that uses LAN or WLAN to stream audio. It supports decoding of MP3 and raw PCM. The server software is open source and available for a number of platforms (Windows, Mac OS X, Linux, FreeBSD) and decodes other formats, like Vorbis and FLAC, on the fly to PCM before streaming. The Squeezebox2 uses the same server software, but can decode FLAC natively, which considerably lowers network traffic for formats other than MP3. The Squeezebox3 has basically the same features as version 2, but the design has been revamped completely and is more luxurious.<br />
<br />
* [http://www.sonos.com/ Sonos'] Multi Zone Digital Music System<br />
:Sonos is a complete music system for a house that consists of speakers that are connected wirelessly to a media server. The system also supports Vorbis and FLAC.<br />
<br />
* [http://www.playstation.com/ Sony's] Playstation 2<br />
:The [http://www.trend-express.com/en/medio.html Medio Digital Media Player] transforms the Playstation2 into a streaming client, supporting various audio and video formats, including Vorbis.<br />
<br />
* [http://www.streamit.eu/ Streamit's] Lukas II, SIR120 and SIR120PRO<br />
:The Lukas II is a streaming receiver with an integrated loudspeaker that uses LAN or dialup to stream audio. The SIR120 and SIR120PRO are 19" rack-mountable streaming receivers with an SD card slot which use LAN to stream audio. All these devices support MP3, WMA, AAC+ and Vorbis streaming.<br />
<br />
* [http://www.my-noxon.com/ Terratec's] Noxon iRadio, Noxon2Radio for iPod, Noxon2Audio.<br />
:A WiFi radio for streaming music from the computer and the Internet.<br />
<br />
* [http://www.trans-technology.com/ Transgear's] DVX-500E, DVX-700 M20<br />
:The DVX-500E is a DVD player and streaming client. It supports MPEG-1/-2/DivX/Xvid/VOB/DVB and WAV/MP3/WMA/AAC/Vorbis and JPG/BMP/GIF/TIF/PNG. The DVX-700 can do the same, plus it has digital video connectors, supports HD video formats and has a slot for exchangeable 3.5" HDDs.<br />
<br />
* [http://www.tversity.com TVersity Media Server]: <br />
:A UPNP/AV compliant media server that uses the Vorbis libraries to transcode audio files to the Vorbis format.<br />
<br />
* [http://www.umax.de/ Umax/Yamada] <br />
**DVX-6600: a future firmware for this DVD/DivX player is supposed to be able to decode Vorbis, but there is no release date yet.<br />
**[http://www.umax.de/WebNew/Produkte/9_HomeEntertainment/DVX-6700/DVX-6700.htm DVX-6700] <br />
<br />
* [http://www.watterott.net/webradio.php WebRadio Project]<br />
:Open Source Internet Radio Project based on an AVR microcontroller and VLSI's VS1053 audio codec.<br />
<br />
* [http://www.yamakawa.de/ Yamakawa's] DVD-375<br />
:The Yamakawa DVD-375 supports Vorbis.<br />
<br />
* [http://www.z500series.com/ Zensonic's] Z500<br />
:The Z500 is a networked multimedia player. It is almost unbelievable how many media types are supported. Video formats: HDTV, DVD, WMV9, DivX, MPEG-1, MPEG-2, MPEG-4, HighMAT, Matroska. Audio formats: Audio CD, MP3, FLAC, Vorbis, AAC, WMA, DVD Audio, and internet radios. Pictures: JPEG, PNG, TIF etc. It supports USB mass storage devices and connects through Gigabit Ethernet or WLAN to the network. The server software runs on Windows, Mac and Linux (UPnP Streaming). Among other connectors it supports the new HDMI standard.<br />
<br />
== Car Audio ==<br />
* Acoustic Solutions ICS-160<br />
:Plays Vorbis, MP3 and WMA from CD, USB and SD card. Can rip from CD/radio/aux to MP3 or WMA, but cannot rip to Vorbis. Displays metadata for MP3, but seems to ignore metadata for Vorbis. (Metadata display not tested for WMA.) Available in the UK in the [http://www.argos.co.uk/static/Product/partNumber/5005316.htm Spring/Summer 2007 Argos catalogue]. Appears to be based on the same architecture as the Yakumo Hypersound Car Eazy (see below), as the digital display and software appear to be identical, and the two models appear to have identical specifications. However, the design of the fascia is completely different.<br />
<br />
* <del>Alpine CDE-9846R/RM and CDE-9848RB</del><br />
:'''Cannot play''' Vorbis.<br />
<br />
* AudioVox VME 9112<br />
:Plays Vorbis from CD, at least up to q6.<br />
<br />
* <del>[http://www.blaupunkt.com/au/7647573510_main.asp Blaupunkt London MP37]</del><br />
:Cannot play Ogg Vorbis files. (In fact, support for Vorbis is ''almost'' present: it can be tricked into playing an Ogg Vorbis file by putting it into a subdirectory on the CD, but that's it.)<br />
<br />
* [http://www.dension.com Dension] [http://www.dension.com/icelinkgateway300.php ice>Link Gateway 300], [http://www.dension.com/icelinkgateway400.php 400] and [http://www.dension.com/icelinkgateway500.php 500]<br />
:Dension develops <i>connected car infotainment systems: Either as a direct stand-alone equipment, or accessory, or complete systems.</i> For fitting either by the OEM or aftermarket, Dension offers three different (hardware) gateways to connect audio players (3.5mm jack), iPods (special connector) or mass storage devices (USB), the latter of which can have Vorbis files stored on them amongst other popular formats. The products are called [http://www.dension.com/icelinkgateway300.php ice>Link Gateway 300], [http://www.dension.com/icelinkgateway400.php 400] and [http://www.dension.com/icelinkgateway500.php 500], and the support knowledge-base [http://support.dension.com/support-center/index.php?x=&mod_id=2&root=11&id=79 lists all supported formats]. The gateways are compatible with various OEM systems and aftermarket head units; the system used by Volkswagen (see below) may well be supplied by Dension.<br />
<br />
* [http://www.hb-direct.com/ H&B] CA-7475 / CA-7475BTi<br />
:This device seems to be similar to the PLU2 P2-106USB, but also has Bluetooth support. It is mainly sold in France, but it is not on H&B's website, so it may be a phased-out model. [http://www.bestofmicro.com/actualite/22493-H-B-CA-7475BTi.html (fr)]<br />
<br />
* [http://www.hb-direct.com/ H&B] CA-7575BTI<br />
:[http://www.ldlc.com/fiche/PB00061312.html (fr)]<br />
<br />
* Hyundai H-CDM8030<br />
:Can play Vorbis from USB flash drive up to q7.<br />
<br />
* [http://www.insignia-products.com/ Insignia's] NS-C5111 CD Car deck<br />
:It has been sold at [http://www.bestbuy.com/ Best Buy] since April 2006 and will play Vorbis from a USB drive, an SD card, or Oggs encoded onto data CDs. The Vorbis ability is undocumented. There are complaints similar to (or the same as) those noted about the Yakumo unit below: TOC reads are long, and the Random button causes a track change. The system has frozen a couple of times, requiring use of the reset button (it has one). Problems have also been experienced with nested directories. It seems to read only filenames from .ogg files and displays no tag info, but it constantly displays stats about the currently-playing file.<br />
<br />
* [http://mobile.jvc.com/product.jsp?modelId=MODL027694&pathId=54&page=1 JVC KD-G720] and [http://mobile.jvc.com/product.jsp?modelId=MODL027693&pathId=54&page=1 KD-G820]<br />
:Both support Vorbis according to [http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=29933&view=findpost&p=392489 this post], however according to tommyj's review on [http://www.crutchfield.com/S-YynrlAPfgcF/cgi-bin/ProdView.asp?g=300&I=257KDG820&id=review this page] Vorbis support is limited to the USB connector and is also quite flakey. Another source suggests that JVC [http://www.jvc.ca/en/consumer/product-detail.asp?model=KD-G720 KD-G720] and KD-G820 both have undocumented, partial Vorbis support. Vorbis files can be played from a USB device attached to the USB port, but not from a CD. They do not support tags. For the vast majority of songs, q6 seems to be the highest they can reliably play. These decks are a good option for anybody looking to play Vorbis in their car because they are available at major retailers (e.g. Best Buy) and are relatively inexpensive.<br />
<br />
*JVC KD-G722 and KD-G721, KD-G821, KD-SH1000<br />
:The JVC 2005 generation of car audio can play Vorbis from USB devices. They do not recognize Vorbis tracks on other media (neither CD, nor SD card on the SH1000). Their USB slot is not powerful enough to power a real hard drive, but USB flash is no problem. The 721/722 can play Vorbis up to q7 (the 721 and 722 differ only in colour, grey or black). The 821 can play up to q5. The KD-SH1000 also plays Vorbis from USB (it is unknown which quality it supports).<br />
<br />
*<del>JVC KD-G731/831</del><br />
:These do '''not''' play Vorbis. They are the successors to the 72x/82x series, but the (undocumented) Vorbis support was '''dropped''' here.<br />
:Official reply from JVC regarding support: ''"The following models from 2006 are the only ones to support Ogg Vorbis, KD-SH1000, KD-G821, KD-G721/722. The 2007 and planned 2008 range will not be compatible with Ogg Vorbis."''<br />
<br />
* [http://www.kenwood.com/ Kenwood's] Music Keg <br />
:The [http://www.kenwoodusa.com/products/ListProduct.aspx?k1=2&k2=5&k3=71&pr=2008 Music Keg KHD-C710] uses the same system as the PhatBox below, which means Vorbis support is available. However, it seems that the software can only encode to the hard disk; playback of Vorbis from the Music Keg may not work. [http://www.crutchfieldadvisor.com/S-g2FinmVl7fe/cgi-bin/ProdView.asp?print=Y&g=50800&id=detailed_info&i=113KHDC710]<br />
<br />
* Lynx CRM 2005<br />
:Low-cost car radio with support for reading Vorbis from CD, USB 1.1, SD and MMC. In Germany it's labeled as "Tevion CRM-2005" and was sold by Aldi-Süd.<br />
<br />
* [http://origin-community.ministryofsound.com/audio/range.htm Ministry of Sound] [http://shop.ministryofsound.com/Cultures/en-GB/Products/MOSCA104X5.htm?MSCSProfile=9E133C53BD3D92DF1CE9F907D3646C9255036D7AFA803EF7A1C19406E5739EB04CA3BBA8EABD4803AC7F85E26AE78DC143DE377C1060D36EE764E752F8748B9C37DA7AE4DC53D986D49D1C7ADE21AEE447308E31C3159353F77EB0DD5B9A4EA78160B1E4E075A977762313FF570F8494A1229CE23CB601E9992AF7076FC531CC?CatalogNavigationBreadCrumbs=MinistryofSound|Audio|Car_Audio CD tuner]<br />
:It is likely that it uses Roadstar electronics as well, because both brands are owned by Alba Plc.<br />
<br />
* [http://www.mutant.uk.com/mt1106mp3.html Mutant MT1106MP3]<br />
:Head unit with removable 512MB audio player. Supports Vorbis according to [http://www.ciao.co.uk/Mutant_MT1106USB__Review_5649304 this review].<br />
<br />
* [http://www.phatnoise.com/products/index.php PhatNoise's] PhatBox<br />
:The PhatBox is an audio entertainment system for the car. It uses a cartridge to store the music, which can be filled through a docking station for the PC. As of version 3.1 of the desktop software (PhatNoise Music Manager), Vorbis is supported out of the box.<br />
<br />
* [http://www.plu2.de/ PLU2] P2-106USB<br />
:Plays Vorbis from CD, SD and USB. An eBay link is on the discussion page.<br />
<br />
* [http://www.roadstar.com/ Roadstar] [http://www.roadstar.com/newsite/index.php?left=family&id=1300&center=productdetail&id_prd=365&right=productdownload&id_fam=113 CD-258US/512] <br />
:Car CD tuner with MP3/WMA/Vorbis disc playback and a detachable front panel with 512 MB of internal Flash memory. Favourite songs can be uploaded via USB from a PC to the internal memory inside the detachable panel (MP3, WMA or Vorbis format). Music can be encoded in MP3 format from CD/radio/aux-in sources to the internal Flash memory or to USB/SD/MMC, and MP3/WMA/Vorbis files can be transferred between CD, internal Flash memory, USB, SD and MMC.<br />
:It displays no tag info for Ogg files, but it constantly displays the filename and stats of the currently-playing file.<br />
<br />
* [http://www.mysilvercrest.de Silvercrest] ([http://www.mysilvercrest.de/en/artikel.php?a=62 KH 2389], [http://www.mysilvercrest.de/en/artikel.php?a=98 KH 2380] and [http://www.mysilvercrest.de/en/artikel.php?a=127 CRB-530]): In-dash CD/MP3 players. A USB stick or an SD card can be plugged into them, and Vorbis works from USB stick, SD card and CD. Silvercrest is a brand of the German discounter LIDL.<br />
<br />
:Although LIDL's advertisement for the KH 2380 in December 2006 made a show of its Vorbis support, this is not mentioned in the manual or any accompanying documentation. Initial impressions suggest that q3 playback is good and correctly plays entire tracks, but is not gapless.<br />
<br />
: The [http://www.mysilvercrest.de/en/artikel.php?a=127 CRB-530] has documented Ogg Vorbis compatibility, and playback is smooth.<br />
<br />
* <del>VDO Dayton CD 2803, CD 2737 B</del><br />
:'''Cannot play''' Vorbis.<br />
<br />
* [http://www.vdodayton.com VDO Dayton] [http://www.vdodayton.com/default2_and_fz_menu=cd_1537_x.aspx CD 1537 X] and [http://www.vdodayton.com/default2_and_fz_menu=cd_1737_x.aspx CD 1737 X]<br />
:The manufacturer's site clearly states that these are able to play Ogg Vorbis from CD, SD/MMC and USB 1.1 devices. The 1737 manual states that it can play files between 8 and 192 kbit/s, and up to 99 files in 99 directories (presumably meaning 99 in each), with names of up to 32 characters. There is a favourable review on [http://www.cnetfrance.fr/produits/materiels/systemes-auto-embarques/test/0,3800002254,39367497,00.htm Cnet France] (in French).<br />
<br />
* [http://www.volkswagen-individual.de/ Volkswagen's] Golf, Golf Plus, Touran<br />
:This is a great development for Vorbis hardware support. From January 2006 onwards, all Golf, Golf Plus and Touran models offer a USB port which supports USB sticks with music. Supported formats include MP3, WAV, WMA and Vorbis. More information is available in German at [http://www.volkswagen.de/vwcms_publish/vwcms/master_public/virtualmaster/de3/modelle/golf/golf0/rund_um_den_golf/individualisierung.html]. On a related note, the iPod is supported, too.<br />
<br />
:Volkswagen optionally offers a USB interface for its Golf V models, to which a USB mass storage device containing music can be attached. MP3, WAV, WMA and Vorbis formats will be played through the car's stereo. Source: [http://volkswagen.de/vwcms_publish/vwcms/master_public/virtualmaster/de3/modelle/golf/golf0/zahlen___fakten/infomaterial___preise.html German PDF price list]<br />
<br />
* [http://www.yakumo.com/produkte/index.php?pid=1&ag=Autoradio Yakumo's] Hypersound Car<br />
:This in-dash car CD player supports Vorbis, MP3 and WMA playback from CD, USB stick or MMC/SD card. Vorbis support is not obvious, but it is clearly specified on the Technical Specifications page of the user manual, and has been verified to work with both the UK and German versions. Reservations [http://www.tomergabel.com/TheQuestForTheHolyErSound.aspx have been made] regarding the product's quality, in particular stability and performance. (There was also a Yakumo support forum discussion, but Yakumo seem to have taken their forums offline as of March 2007; a partial archive is [http://www.moteprime.org/article.php?id=30 here].) As of early 2006 its firmware is notoriously flaky, no firmware update is available, and it also has poor tuner sensitivity. The player is also supplied in unbranded form at various retailers, but it has a distinctive look. [http://www.yakumo.de/produkte/index.php?pid=1&ag=Autoradio Yakumo Car Entertainment]. [http://www.yakumo.com/datafiles/produkte/manuals/man_1037991_38_2_yakumo_hypersound_car_eazy.pdf Online manual]. See also Acoustic Solutions ICS-160.<br />
<br />
''Note: Some of this information was moved from the Mobile Players page, so there may be some duplication.''<br />
<br />
== Media Storage ==<br />
<br />
* [http://www.gennetworks.com/ GenNetwork's] GenMedia DivXStorage<br />
:This is an external hard drive for video storage that connects to TV sets. It comes in various versions and storage sizes, and includes USB 2.0 and a remote control. HDTV resolution, 5.1 sound and the following file formats are supported: MPEG-4/DVD/VCD/SVCD/AudioCD/JPEG/MP3. For the [http://www.gennetworks.com/pro_genmedia02.htm 3,5"] and deck versions, the Vorbis format is mentioned.<br />
<br />
* [http://www.numark.com/ Numark's] HDX, HDMix<br />
:These are DJ media players with an 80 GB hard drive on board and a CD drive. They support hard-drive playback of MP3, WMA, WAV, Vorbis, and FLAC (lossless) formats. See the [http://www.numark.com/ homepage] for more.<br />
<br />
[[Category:Vorbis]]</div>Lee Carréhttps://wiki.xiph.org/index.php?title=Games_that_use_Vorbis&diff=9389Games that use Vorbis2008-08-27T18:12:11Z<p>Lee Carré: improved typography — use of correct characters for punctuation etc.</p>
<hr />
<div>The following games use [[Vorbis]], most frequently for their in-game music or sound effects:<br />
<br />
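For context, the .ogg files these games ship are Vorbis packets wrapped in Ogg pages. The sketch below (hypothetical helper names, not from any game's code; layout per RFC 3533, the Ogg bitstream specification) shows how a single Ogg page header can be parsed:

```python
import struct

def parse_ogg_page_header(data: bytes) -> dict:
    """Parse one Ogg page header (RFC 3533) from the start of `data`."""
    if data[:4] != b"OggS":  # every Ogg page begins with this capture pattern
        raise ValueError("not an Ogg page: missing 'OggS' capture pattern")
    # Fixed fields after the capture pattern: version, flags, granule
    # position (int64 LE), serial, page sequence, CRC, segment count.
    version, flags, granule, serial, seq, crc, nsegs = struct.unpack_from(
        "<BBqIIIB", data, 4)
    segment_table = data[27:27 + nsegs]  # lacing values, one byte each
    return {
        "version": version,
        "continued": bool(flags & 0x01),   # packet continued from prior page
        "first_page": bool(flags & 0x02),  # beginning of stream
        "last_page": bool(flags & 0x04),   # end of stream
        "granule_position": granule,
        "serial": serial,
        "sequence": seq,
        "crc": crc,
        "header_size": 27 + nsegs,
        "body_size": sum(segment_table),   # page body follows the header
    }

# Build a minimal synthetic page: first page of a stream, one 30-byte segment.
page = (b"OggS" + bytes([0, 0x02])
        + struct.pack("<qIII", 0, 0xDEADBEEF, 0, 0)
        + bytes([1, 30]) + b"\x00" * 30)
hdr = parse_ogg_page_header(page)
print(hdr["first_page"], hdr["body_size"])  # True 30
```

Summing the lacing values in the segment table gives each page's body size, which is how a player walks from page to page inside a file.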
* All games by [http://www.reflexive.com/index.php?CAT=Search&SEARCH=dev%3AReflexive+Entertainment&PAGE=GameList Reflexive Entertainment].<br />
<br />
* [http://www.mobygames.com/game/windows/007-nightfire 007: Nightfire]: Uses Ogg Vorbis for background soundtrack.<br />
<br />
* [http://www.ageofconan.com/ Age of Conan — Hyborian Adventures]: Uses Ogg Vorbis for all audio.<br />
<br />
* [http://www.americasarmy.com/ America’s Army]: Uses Ogg Vorbis for main theme.<br />
<br />
* [http://www.lionhead.com/bw2/ Black & White 2]: Uses Ogg Vorbis for music.<br />
<br />
* [http://www.pyrogon.com/games/candycruncher/ Candy Cruncher]: This cute puzzle game from Brian Hook’s company, Pyrogon, uses Vorbis for the addictive music you hear while you race the clock.<br />
<br />
* [http://www.callofcthulhu.com/ Call of Cthulhu] is a first-person horror game that combines intense action and adventure elements. It uses Ogg Vorbis for music and speech.<br />
<br />
* [http://www.mobygames.com/game/windows/catechumen Catechumen] is a Christian-themed FPS that uses Ogg Vorbis.<br />
<br />
* [http://www.atari.com/crashday/ Crashday]: Stunt racing game, developed by independent German studio Moon Byte. Uses Ogg Vorbis for music.<br />
<br />
* [http://buenavistagames.go.com/product/chickenLittlePC.html Chicken Little]: The PC edition of this children’s adventure game, inspired by the motion picture, uses Vorbis for dialogue and music. (Unconfirmed whether sound effects use it too.)<br />
<br />
* [http://www.cossacks2.de/ Cossacks 2]: “Cossacks II: Napoleonic Wars” is the sequel to “Cossacks: European Wars”. Ogg Vorbis 1.0 files are in \data\music\<br />
<br />
* [http://www.darwinia.co.uk/ Darwinia]: The second title from indie developer Introversion Software. Darwinia is a stylised retro title — Tron meets Cannon Fodder. It uses Vorbis for all in-game sound effects and music.<br />
<br />
* [http://www.introversion.co.uk/defcon/ DEFCON]: The third title from Introversion Software. Uses Vorbis for music, effects, everything, like Darwinia.<br />
<br />
* [http://devilmaycry.com/ Devil May Cry 4] (for the PC, at least): Uses (occasionally multichannel) Ogg Vorbis for ingame and cutscene music.<br />
<br />
* [http://www.eidos.co.uk/gss/dxiw/ Deus Ex: Invisible War] by Ion Storm/Eidos: Uses Ogg Vorbis for music and voice (and possibly for sound fx too).<br />
<br />
* [http://www.idsoftware.com/games/doom/doom3/ DOOM 3]: The latest title in this famous first-person shooter series from id Software uses Vorbis for the theme music as well as ambient and game sounds.<br />
<br />
* [http://mobygames.com/game/sheet/p,3/gameId,6505/ Duke Nukem: Manhattan Project]: This game from 3D Realms, released in 2002, uses Vorbis for its music. (Official website is down, using MobyGames link)<br />
<br />
* [http://www.popcap.com/games/free/dynomite Dynomite]: Puzzle Bobble/Bust A Move clone for Windows by PopCap Games, with mouse control. Uses Ogg Vorbis for nearly all sound effects.<br />
<br />
* [http://www.nexuiz.com/ Nexuiz], a fast-paced FPS with roots in Quake I, uses Vorbis for background music. The minstagib mod uses Vorbis for all of its sound.<br />
<br />
* [http://www.mobygames.com/game/enclave/ Enclave] by Starbreeze/Black Label Games: Uses Ogg Vorbis for music (and possibly for sound fx and voice too).<br />
<br />
* [http://www.eve-online.com EVE Online]: CCP Games’ Iceland-based, single-shard persistent-world space game uses Ogg Vorbis for its music.<br />
<br />
* [http://www.lionhead.com/fabletlc/ Fable: The Lost Chapters]: Uses Ogg Vorbis for music and cutscenes (Ancient libVorbis version, 1.0 RC2).<br />
<br />
* [http://farcry.ubi.com/ FarCry] by Crytek: uses Ogg Vorbis for music and effects.<br />
<br />
* [http://www.freedom-fighters.co.uk/ Freedom Fighters] by IO Interactive: String search reveals “libVorbis I 20011217” in freedom.exe.<br />
<br />
* [http://www.siriusgames.dk/index.php?pageid=67 Gangland] by MediaMobsters: Uses Ogg Vorbis for music and cutscenes (Data\streams\). Encoded with Xiph.Org libVorbis I 20020717. Decoder library: FMOD 3.71.<br />
<br />
* [http://www.rockstargames.com/sanandreas/ Grand Theft Auto: San Andreas] by Rockstar Games/Rockstar North uses Ogg Vorbis to store music, radio, ambient sounds, police messages and cutscene audio. Players can also store their custom tracks (accessible in-game via the “User Track Player” radio station) in Ogg Vorbis.<br />
<br />
* [http://www.guiltygearx2reload.com/ Guilty Gear XX]: The PC version, at least, uses Ogg Vorbis for all the music.<br />
<br />
* [http://www.guitarherogame.com/gh2/ Guitar Hero II] by Red Octane (Activision), XBox360 platform only (multichannel Vorbis with 5 or 6 channels per song)<br />
<br />
* [http://halo.bungie.org/ Halo]: Mac and PC versions of Halo use Ogg Vorbis for all audio, it seems. The Xiph license and dynamically linked libraries of Ogg and Vorbis are included in the Halo directory. XBox version does not use Ogg Vorbis.<br />
<br />
* [http://harrypotter.ea.com/cofs/index.html Harry Potter II (Chamber of Secrets)]: This is unsubstantiated: it was reported on one of the Vorbis mailing lists, but there is little evidence either way on this title. EA has been supportive of Vorbis, though, so it’s not entirely impossible. If anyone can confirm or deny this, please do.<br />
<br />
* [http://www.mightandmagicgame.com/HeroesV/ Heroes of Might and Magic V]: Uses Vorbis for audio and Theora for video.<br />
<br />
* [http://www.eidosinteractive.com/games/info.html?gmid=118 Hitman 2]: uses Vorbis. (PC only or consoles too?)<br />
<br />
* [http://www.codemasters.com/igi2/front.htm IGI2: Covert Strike]: Not a Norwegian first-person shooter.<br />
<br />
* [http://www.inthegroove.com In The Groove]: The premier dance game created by [http://www.roxorgames.com Roxor Games, Inc.] Uses Vorbis for all of the in-game music.<br />
<br />
* [http://www.p3int.com/KULT/ KULT Heretic Kingdoms] by 3D People/Project 3 Interactive: Uses Vorbis (1.0) for music, voice and sound effects.<br />
<br />
* Recent Legacy of Kain Games: On the PC, both <b>Soul Reaver 2</b> and <b>Blood Omen 2</b> by Crystal Dynamics/Eidos use Ogg Vorbis for music and sound effects. (Source: [http://www.thelostworlds.net/FAQ.HTML#ogg])<br />
<br />
* [http://www.ncsoft.net/eng/ncgames/lineage2_intro.asp Lineage II]: NCSoft Corporation’s 3D MMORPG Lineage II uses Ogg Vorbis for its music. They use 1.0beta3, though.<br />
<br />
* [http://www.liveforspeed.net/ Live for Speed]: Online racing simulator uses Ogg for all audio and sound effects.<br />
<br />
* [http://www.mafia-game.com/ Mafia: The City Of Lost Heaven]: Not sure about any console version, but PC version is reported to use Ogg Vorbis.<br />
<br />
* [http://www.capcom.co.jp/rockmanx8/ Mega Man X8]: The PC version of Mega Man X8 makes use of Vorbis for music and dialogue during cutscenes.<br />
<br />
* [http://www.mobygames.com/game/gamecube/metal-gear-solid-the-twin-snakes Metal Gear Solid: The Twin Snakes]: Uses Ogg Vorbis for certain cut scenes.<br />
<br />
* MotoGP: This motorcycle racing sim uses Vorbis for the music and allows players to drop their own .ogg files into the music dir to listen to them in-game.<br />
<br />
* [http://www.mystrevelation.com/ Myst IV: Revelation]: Fourth game in the Myst series. Uses Ogg Vorbis for all music, speech and sound effects.<br />
<br />
* [http://www.mystvgame.com/ Myst V: End of Ages]: Fifth and final game in the Myst series. Uses Ogg Vorbis for all music, speech and sound effects.<br />
<br />
* NASCAR racing games from Papyrus: They had this to say about their decision and experience:<br />
<br />
“We’re using a lot of spoken audio in this title (a first for us) and<br />
your codec has allowed us to reduce more than 350MB of audio data to<br />
about 40MB, a huge savings of memory and disk space! We are very<br />
impressed.” —Tom Faiano, Producer <br />
<br />
“Incorporating Ogg Vorbis into our codebase was quite painless, and in the<br />
end, even refreshing. No fuss, no muss. Thank you for your efforts!”<br />
—Bill Farquhar, Soundguy du jour<br />
<br />
* [http://www.codemasters.com/flashpoint/ Operation Flashpoint]: This highly successful military simulation/action game from Codemasters uses Vorbis for the in-game music.<br />
<br />
* [http://www.orunner.com/ Ostrich Runner] by Geleos: This funny Russian cartoon-style game, for kids and adults alike, uses Ogg Vorbis for sound, speech and music.<br />
<br />
* [http://www.ysagoon.com/glob2/ Globulation 2]: A state-of-the-art GPL-licensed strategy game!<br />
<br />
* [http://www.psobb.com/index.php Phantasy Star Online: Blue Burst]: Uses Ogg Vorbis for music, stored in data/ogg.<br />
<br />
* [http://www.gopostal.com/ Postal 2]: Probably not the game we want to use to showcase Vorbis, but it’s being used in this Unreal-engine-powered ultra-violent game.<br />
<br />
* [http://www.praetoriansgame.com/ Praetorians]: This very successful game from Pyro Studios uses Vorbis for its music.<br />
<br />
* [http://www.psychonauts.com/ Psychonauts]: Includes vorbis.dll and vorbisfile.dll.<br />
<br />
* [http://www.quake4game.com/ Quake 4]: Quake 4 is the fourth title in the series of Quake FPS computer games. All game music, speech and sound effects make use of Vorbis.<br />
<br />
* [http://www.restricted-area.net/ Restricted Area]: by Master Creating uses Ogg Vorbis for music and VP3 for videos.<br />
<br />
* Ricochet: An addictive version of Breakout.<br />
<br />
* [http://www.rockband.com/ Rock Band]: XBox360 version uses the same type of multichannel Vorbis files as Guitar Hero II, but with more channels to handle the drums and vocals separately.<br />
<br />
* [http://www.rockmanager.net/ Rock Manager]: Vorbis is used in this “new rock ’n roll management sim for PC from Pan Vision and Monsterland”.<br />
<br />
* [http://www.sacred2.com/ Sacred 2] by Studio II: uses multichannel(!) Ogg Vorbis for music, speech and sound effects.<br />
<br />
* [http://www.s2games.com/savage/ Savage]: This S2 Games “RTSS” hybrid genre game uses Vorbis for all the in-game music.<br />
<br />
* [http://www.serioussam.com/se/ Serious Sam: The Second Encounter]: uses Vorbis for the music, although it is slightly obfuscated so as not to be easily playable by standard Ogg Vorbis players.<br />
<br />
* [http://www.serioussam2.com/ Serious Sam 2]: not only uses Vorbis for the music but even Theora for the videos<br />
<br />
* [http://www.totalwar.com/community/warlord.htm Shogun: Total War]: Shogun uses Vorbis, but only to distribute — everything is decompressed to wav during the install.<br />
<br />
* [http://www.singles2.com/englisch/index.html Singles 2]: Uses Ogg Vorbis for sound.<br />
<br />
* [http://www.lart.pl/en/portfolioItem.php?id=91 Ski Jumping 2004]: A commercial game that accurately models the sport of ski jumping. The game contains over 700 Ogg Vorbis files.<br />
<br />
* [http://mobygames.com/game/sheet/p,3/gameId,3453/ Star Trek: Away Team]: Vorbis is used for all sound in the game — music, voiceover and SFX. This squad-based strategy game is set in the Star Trek Next Generation universe. (Official website is down, using Mobygames link)<br />
<br />
* [http://supertux.lethargik.org/ Super Tux]: Uses Vorbis for music.<br />
<br />
* [http://www.splintercell3.com/ Tom Clancy’s Splinter Cell Chaos Theory]: .LS0 files are in fact Ogg Vorbis files.<br />
<br />
* [http://www.lucasarts.com/games/swrepubliccommando/ Star Wars Republic Commando]: Vorbis is used in the ambient and game music in this latest action game from LucasArts.<br />
<br />
* [http://www.reflexive.net/index.php?PAGE=game_detail&AID=30 Swarm]: A fun little arcade shooter.<br />
<br />
* [http://www.swat4.com/ SWAT 4]: SWAT 4 uses Ogg Vorbis for audio files.<br />
<br />
* [http://www.there.com/ There]: uses both Ogg Vorbis for the sound effects and Ogg Speex for realtime group voice chat, a first for an immersive consumer-oriented world. “Voice has become a very popular part of our product!” (posted by [http://david.weekly.org David Weekly], a There developer)<br />
<br />
* [http://www.riddickgame.com/ The Chronicles of Riddick: Escape From Butcher’s Bay (Director’s Cut)]: Uses Vorbis for all audio and Theora for cutscenes.<br />
<br />
* [http://www.thethinggames.com/ The Thing]: Uses Vorbis<br />
<br />
“The original multilanguage distro took three CDs, and went down to <br />
only one after I converted all wavs to oggs. Nifty :) Sadly enough, <br />
marketing decided to not have one language per CD anyway (probably to <br />
annoy people who migrate) :/ Thanks for a very cool (and easy to use)<br />
lib/format!”<br />
<br />
—Vincent Penquerc’h<br />
<br />
* [http://www.asahi-net.or.jp/~cs8k-cyu/windows/tt_e.html Torus Trooper]: Frantic 3D shootemup, using Vorbis for the music. (see also the [http://www.emhsoft.net/ttrooper/ Linux port] and [http://www.apple.com/downloads/macosx/games/action_adventure/torustrooper.html MacOS version])<br />
<br />
* [http://www.mikeoldfield.com/ Tr3s Lunas] (aka Music VR episode 1): This game, featuring the music of Mike Oldfield, uses Vorbis for the music.<br />
<br />
* [http://www.tribesvengeance.com Tribes: Vengeance] by Irrational Games/Sierra uses Ogg Vorbis for music.<br />
<br />
* [http://www.mobygames.com/game/gamecube/true-crime-new-york-city True Crime: New York City]: GameCube version contains over 11,500 Ogg Vorbis files. It is likely that other platform ports also use the same files (note that the [http://www.mobygames.com/game/xbox/true-crime-new-york-city Xbox version] uses Windows Media Audio files in place of Ogg Vorbis files)<br />
<br />
* [http://tuxtype.sourceforge.net/ Tuxtyping 2]: Educational typing tutor for kids of all ages! <br />
<br />
* [http://www.ufo-aftershock.com/ UFO: Aftershock]: Uses Vorbis for music.<br />
<br />
* [http://www.ufo-afterlight.com/ UFO: Afterlight]: Uses Vorbis for music.<br />
<br />
* [http://www.atari.com/us/games/unreal2/pc Unreal 2]: PC version uses Vorbis, usage on consoles not confirmed.<br />
<br />
“We went with Ogg Vorbis due to its excellent playback and compression,<br />
and we used it not only for music but also all of the in-game voice.<br />
Without it, we never would have been able to fit on two CDs.”<br />
<br />
— http://www.4unrealers.com/entrevistas/263/<br />
<br />
* [http://www.unrealtournament.com/ut2003/ Unreal Tournament 2003]: This overwhelmingly-popular multiplayer first person shooter PC title uses Vorbis for its music.<br />
<br />
* [http://www.unrealtournament.com/ut2004/ Unreal Tournament 2004]: Yet another Unreal game which uses Vorbis for the music (What about effects and voice? Does anyone know?). The readme file of the demo even mentions Speex!<br />
<br />
* [http://sc2.sourceforge.net/ The Ur-Quan Masters]: Port of Star Control 2 to modern computers. Toys for Bob released the source of this amazing game under the GPL in 2002. Ogg Vorbis is used for the dialogue and the background music.<br />
<br />
* [http://uru.ubi.com/ Uru: Ages Beyond Myst]: Spinoff from the Myst series. Uses Ogg Vorbis for all music, speech and sound effects.<br />
<br />
* [http://mobygames.com/game/sheet/p,3/gameId,8635/ Lionheart — Legacy of the Crusader]: A 3/4-view RPG from Black Isle. Uses Vorbis for all audio. Thanks to all the guys who made Vorbis great. (I even donated money myself; someday maybe I can convince the company to kick in some bucks as well.) Official site is down, using MobyGames link.<br />
<br />
* [http://www.global-gaming.com/Dominion/ Urban Dominion] (beta): First Person Massively Multiplayer Online Role-Playing Game by Global-Gaming. Uses Ogg Vorbis for the sound system.<br />
<br />
* [http://www.vietcong-game.com/ Vietcong]: Vietnam War first-person shooter by Pterodon. Believed to use Ogg Vorbis for the background music.<br />
<br />
* [http://vegastrike.sourceforge.net/ Vega Strike]: A free space sim. Ogg Vorbis files are stored in \music\.<br />
<br />
* [http://www.gathering.com/wingsofwar/ Wings Of War]: An arcade shooter set during WWI. The game includes ogg.dll, vorbis.dll and vorbisfile.dll — but the *.ogg files are not accessible.<br />
<br />
* [http://jonof.edgenetwork.org/winbuild/ WinBuild]: Winbuild is a port of Ken Silverman’s [http://www.advsys.net/ken/buildsrc/default.htm original Build engine demo] (for DOS) to Windows. It uses Vorbis compression for the music.<br />
<br />
* [http://www.worldofwarcraft.com/ World of Warcraft]: This popular massively multiplayer online role-playing game from Blizzard Entertainment uses Vorbis for speech and sound effects.<br />
<br />
* [http://www.zax-game.com/ Zax — The Alien Hunter]: A large 3/4-view action-adventure game.<br />
<br />
[[Category:Vorbis]]</div>Lee Carréhttps://wiki.xiph.org/index.php?title=Xiph.Org_Foundation&diff=9297Xiph.Org Foundation2008-08-10T21:31:12Z<p>Lee Carré: added link text</p>
<hr />
<div>The '''Xiph.Org Foundation''' is a non-profit corporation dedicated to protecting the foundations of Internet multimedia from control by private interests. Our purpose is to support and develop free, open protocols and software to serve the public, developer, and business markets.<br />
<br />
Please see the [http://xiph.org/about.html main Xiph About page] for more details about why we're here.<br />
<br />
==External links==<br />
* [http://www.xiph.org/ Xiph.Org homepage]</div>Lee Carré