Talk:Videos/Digital Show and Tell
Greetings, Feel free to comment here— just log in to edit— or join us on IRC chat.
The wiki version of the video isn't yet as complete as the last video's, due to schedules and timelines. In particular, I think it could use some more going-deeper coverage. I'm surprised that I couldn't find better HTML5 audio API examples of the "type your own JS, get audio and a scope" kind; if anyone knows of a better one than the one we have now, that would be great.
--Gmaxwell 02:54, 26 February 2013 (PST)
Just dropping a line to say thank you! I'm continually impressed by the guides and overall outreach coming out of the xiph team. The latest video was a great introduction that managed to walk that fine line between theory and application without falling over or flailing about madly (in my opinion, anyway). Not to mention, I'm going through the gtk-bounce and waveform code now and really like it! It's not so trivial a piece of software as to be meaningless when learning to code useful applications, but it's not so gigantic as to be unapproachable either. Hell, I think it would serve as a great example for the GNOME folks to use in their documentation. Most guides on GTK just have you draw shapes on the screen and leave it at that. All in all, I'm really impressed and hope to have a similar setup replicated in a few weeks at my university, just for the sake of it.
--Aggroskater 23:16, 26 February 2013 (PST)
- Some parts are better written than others... I used enough cut & paste to warn against taking it too seriously :-) --Xiphmont 10:50, 28 February 2013 (PST)
@Monty: Thanks for this information. And as a non-native English speaker I want to thank you for your clear pronunciation. What did you mean by "no one ever ruined a great recording by not dithering the final master."? Do you mean that nobody would ever forget it, or that it was not ruinous? That "CSS is awesome" cup made me really nervous. I hope it means something like "Cascading Style Sheets", and not, what would fit better in this context, "Content Scramble System" [shudder]! --Akf 15:20, 27 February 2013 (PST)
- I meant that "not adding dither is not ruinous". I recall in at least one listening test on the subject, a minority of participants had a statistically significant preference for undithered versions, at least on those samples where it was in fact audible and the testers were encouraged to increase gain and listen to fade-outs. *However* I can't find the results of that test now that I've gone back to look for it, so my memory may be faulty. I've asked the HA folks to help me find it again if it really existed :-)
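The point about dither can be illustrated with a toy NumPy sketch (the signal, level, and parameters here are my own choices, not from the thread): a sine whose peak amplitude is under half an LSB disappears entirely under plain rounding, but survives when TPDF dither is added before quantization.

```python
# Toy sketch: why dither matters for very quiet signals.
# A sine at 0.4 LSB peak amplitude is quantized with and without
# 2-LSB peak-to-peak TPDF dither (all numbers are in units of 1 LSB).
import numpy as np

rng = np.random.default_rng(0)
n = np.arange(50_000)
x = 0.4 * np.sin(2 * np.pi * 440 * n / 48_000)   # always below 1/2 LSB

undithered = np.round(x)                          # plain quantization
tpdf = rng.uniform(-0.5, 0.5, n.size) + rng.uniform(-0.5, 0.5, n.size)
dithered = np.round(x + tpdf)                     # TPDF-dithered quantization

# Without dither the signal is truncated away completely; with dither
# it survives as a noisy but clearly correlated component.
print(np.abs(undithered).max())                   # 0.0
print(np.corrcoef(x, dithered)[0, 1])             # clearly positive
```

Whether the *absence* of this noise-for-distortion trade is ever audible on real material is exactly the listening-test question discussed above.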
- CSS refers to Cascading Style Sheets; hence the typesetting that overflows the box. --Xiphmont 10:50, 28 February 2013 (PST)
- I once built an 8-bit ADC from transistors, for experimentation. One strange result was how few bits are needed for speech. 8kHz with just one bit resolution is still quite intelligible (though rather noisy). --RichardNeill 08:08, 4 March 2013 (PST)
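The one-bit observation is easy to reproduce in a few lines (a sketch with my own stand-in "speech" signal, a mix of tones at 8 kHz): reducing every sample to its sign alone still leaves a waveform strongly correlated with the original, which is why it stays intelligible, if noisy.

```python
# Toy illustration: quantizing a waveform to a single bit keeps a
# surprising amount of the signal.  A speech-like mix of tones,
# sampled at 8 kHz, is reduced to sign-only (+1/-1) samples.
import numpy as np

fs = 8_000
t = np.arange(fs) / fs                        # one second
x = (np.sin(2 * np.pi * 200 * t)              # crude formant-like mix
     + 0.5 * np.sin(2 * np.pi * 700 * t)
     + 0.25 * np.sin(2 * np.pi * 1500 * t))

one_bit = np.sign(x)                          # 1-bit "ADC"

# The 1-bit version remains strongly correlated with the original.
print(np.corrcoef(x, one_bit)[0, 1])
```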
Thanks for these resources! One question: In the vid, you mention the Gibbs phenomenon. Is that in any way related to the Fourier uncertainty principle? These days, in various audio-related forums, people throw this Oppenheim and Magnasco (2013) paper entitled "Human hearing beats the Fourier uncertainty principle" around in response to your 24/192 article. Does the paper qualify any of the results presented in the video and/or your 24/192 article? (Just fixed the reference.) Lenfaki 13:14, 28 February 2013 (PST)
- It is related to the Fourier uncertainty principle in that all of these effects are in some way related by the same math. As for the "Human hearing beats the Fourier uncertainty principle" paper floating around: a) the headline is effectively wrong; b) the effect described as 'newly discovered' has been understood for roughly 100 years, and this paper merely adds some new hard measurements to the data set; c) the Gabor limit does not even apply to the detection task they're describing. So either the authors or their editor are partly confused. There's been a decent discussion of it at Hydrogen Audio, with none other than James Johnston and Ethan Winer weighing in. --Xiphmont 10:50, 28 February 2013 (PST)
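For reference, the Gabor limit mentioned in the reply above is the time-frequency form of the uncertainty relation; in one common normalization, with RMS widths <code>σ_t</code> and <code>σ_f</code>, it reads:

```latex
\sigma_t \, \sigma_f \;\ge\; \frac{1}{4\pi}
```

It bounds how tightly a single signal can simultaneously be localized in time and in frequency; as the reply notes, it does not directly bound how precisely a listener can judge the timing or pitch of a tone they have already detected.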
You talk about discrete values (whether the analog sample points, or the infinitesimal image pixels). But these are, in some way, averages. In a digital camera, the pixel value is the integral across about 90% of the pixel pitch. In analog audio, is it an instantaneous sample, or an average over the preceding sample interval, or is it sometimes even more "blurred" than that? Also, when performing DAC, how do we get rid of the stairstep so perfectly, without distortion? --RichardNeill 07:51, 4 March 2013 (PST)
- The pixel values in a camera are area averages exactly as you say, a necessary compromise in order to have enough light to work with. The sensor sits behind an optical lowpass filter that is intentionally blurring the image to prevent much of the aliasing distortion (Moiré) that would otherwise occur. Despite that, cameras still alias a bit, and if you _remove_ that anti-aliasing filter, you get much more (I have such a camera, danged filter was bonded to the hot filter, so both had to go to photograph hydrogen alpha lines).
- Audio does in fact use as close to an instantaneous sample as possible. The 'stairsteps' of a zero-order hold are quite regular in the frequency domain; they're folded mirror images of the original spectrum extending to infinity. All the anti-imaging filter has to do is cut off everything above the original channel bandwidth, and it doesn't even have to do a great job to beat a human ear :-) --Xiphmont 02:56, 12 March 2013 (PDT)
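The "folded mirror images" in the reply above can be made concrete with a small NumPy sketch (the tone, rates, and upsampling factor are my own toy numbers): holding each sample of a 1 kHz tone sampled at 8 kHz produces spectral images at 8 kHz ± 1 kHz, 16 kHz ± 1 kHz, and so on, with nothing in between, which is why a simple lowpass above the channel bandwidth is all the anti-imaging filter needs to do.

```python
# Sketch: the "staircase" of a zero-order hold is a regular set of
# spectral images of the original tone.  A 1 kHz tone sampled at
# 8 kHz is held for 6 ticks each to simulate its output at 48 kHz.
import numpy as np

f0, fs_lo, up = 1000, 8_000, 6
fs_hi = fs_lo * up

m = np.arange(fs_lo)                     # one second of samples
samples = np.sin(2 * np.pi * f0 * m / fs_lo)
held = np.repeat(samples, up)            # zero-order hold

mag = np.abs(np.fft.rfft(held))          # bin index == frequency in Hz here
# Images sit at fs_lo - f0, fs_lo + f0, 2*fs_lo - f0, ...
print(mag[7000] / mag[1000])             # image at 7 kHz: clearly nonzero
print(mag[3000] / mag[1000])             # between tone and image: ~zero
```

The sinc-shaped rolloff of the hold also slightly attenuates the images, but it is the lowpass filter that actually removes them.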
What's the correct way to plot a reconstructed waveform? If I have an array of samples and play them back through a DAC, the oscilloscope shows a smooth curve. But plotting them with eg matplotlib shows a stairstep. Thanks --RichardNeill 07:58, 4 March 2013 (PST)
- Well, a fully reconstructed waveform is equal to the original input; it's a smooth, continuous waveform. OTOH, if you want to plot an actual zero-order hold, a zero-order hold really is a staircase waveform.
- If you want to plot the digital waveform pre-reconstruction, a mathematician would always use lollipops (a stem plot); an engineer will use whatever's most convenient. --Xiphmont 02:56, 12 March 2013 (PDT)
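One way to plot the reconstructed curve rather than the stairstep is to evaluate the sinc interpolation between the samples yourself. A minimal sketch, with my own test tone (a windowed sum rather than a true infinite sum, so it is only accurate away from the edges):

```python
# Sketch: evaluating the sinc-interpolated (reconstructed) waveform
# between sample points.  x holds one second of a 1 kHz tone at 8 kHz.
import numpy as np

fs, f0 = 8_000, 1_000
n = np.arange(8_000)
x = np.sin(2 * np.pi * f0 * n / fs)

def reconstruct(t):
    """Evaluate the sinc-interpolated waveform at time t (seconds)."""
    return np.sum(x * np.sinc(t * fs - n))

# Halfway between two samples, near the middle of the buffer, the
# reconstruction closely matches the original continuous sine.
t_mid = 2000.5 / fs
print(reconstruct(t_mid), np.sin(2 * np.pi * f0 * t_mid))
```

For plotting, you would evaluate `reconstruct` on a dense time grid; a practical implementation would use a windowed sinc or polyphase resampler instead of this brute-force sum.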
I have a strange issue with the gtk-bounce program - on (k)ubuntu 12.10, spectrum and waveform work just fine, but if I scroll into the gtk-bounce panel, the cursor disappears. Anyone seen that behaviour? - Julf
- Edit out the calls to "hide_mouse()" in gtk-bounce-widget.c. It's hiding the mouse on purpose because it's supposedly a touch application :-) --Xiphmont 02:56, 12 March 2013 (PDT)
One issue I'm not sure you have covered relates to content with short, loud sections (e.g. more like a movie or classical music, less like pop music). I understand that a 10dB change in sound level is perceived as twice as loud. Let's say we have some content where one section is 4 times (20dB) louder than the rest (not unreasonable; that is the difference between a conversation and street noise according to this page). If each extra bit adds 6dB of dynamic range, then the quieter sections will effectively be quantized using roughly 3 fewer bits than the louder section (e.g. 13 bits rather than 16 bits). If the majority of the content is relatively quiet (i.e. quantized using 13 bits or fewer, depending on how quiet it is relative to the peaks), then is it really fair to claim "16-bit quality" for the entire piece of content? Is this a real problem? Is this ever audible? Klodj 21:59, 11 March 2013 (PDT)
- The bit depth doesn't affect the correctness or 'fineness' of the reconstruction; it only changes the noise floor. It is the same and sounds the same as what happens when recording quieter-than-full-range on analogue tape. Compare a 16-bit digital signal recorded at -20dBFS to the same signal recorded on tape at -20dBV. Both will be smooth and analogue, but the tape will be noisier ;-) --Xiphmont 02:56, 12 March 2013 (PDT)
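The fixed-noise-floor point is easy to verify numerically (a sketch with my own choice of test tone, rates, and a simple dithered mid-tread quantizer): quantize the same tone at full scale and at -20 dBFS and compare the RMS quantization error. The error stays the same; only the signal-to-noise ratio changes.

```python
# Sketch: the quantization noise floor of a 16-bit channel is fixed.
# Recording at a lower level lowers the SNR, but the noise itself
# (RMS error of a TPDF-dithered quantizer) is unchanged.
import numpy as np

def rms_quantization_error(level_dbfs, bits=16, n=100_000):
    rng = np.random.default_rng(1)
    amp = 10 ** (level_dbfs / 20)
    x = amp * np.sin(2 * np.pi * 997 * np.arange(n) / 48_000)
    q = 2.0 ** -(bits - 1)                    # LSB size for full scale +-1
    tpdf = (rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)) * q
    y = np.round((x + tpdf) / q) * q          # dithered 16-bit quantizer
    return np.sqrt(np.mean((y - x) ** 2))

print(rms_quantization_error(0.0))    # full-scale signal
print(rms_quantization_error(-20.0))  # 20 dB down: same noise floor
```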
- Thank you for your reply! Ok, I understand that, when comparing digital to "analog" (tape), this is a non-issue. The tape noise floor is higher. But could you clarify a related point that is entirely in the digital domain? Let's say we have a 16-bit recording of the 1812 Overture where the cannons don't clip. The average level is going to depend on how the dynamic range is compressed to handle the cannons, but let's say it's -36dBFS. If I adjust the volume to suit the average level, then won't I effectively be hearing a noise floor equivalent to 10-bit quantization (16 bits - 36dB/6dB per bit) for the majority of the recording (dither aside)? --Klodj 19:13, 13 March 2013 (PDT)
- Yes. --Xiphmont 00:54, 14 March 2013 (PDT)
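The back-of-envelope figure in the exchange above checks out directly (using the exact 20·log10(2) ≈ 6.02 dB per bit rather than the rounded 6 dB):

```python
# Quick check of the arithmetic above: each bit of depth is worth
# 20*log10(2) ~ 6.02 dB, so a -36 dBFS average level leaves roughly
# 16 - 36/6.02 ~ 10 bits between the signal and the noise floor.
bits = 16
db_per_bit = 6.0206              # 20 * log10(2)
effective_bits = bits - 36 / db_per_bit
print(round(effective_bits, 1))  # ~10.0
```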
I've really enjoyed your videos, many thanks for your work producing them. In keeping with your message in the video about breaking things I was curious to see what would happen as you pushed the signal up to and beyond the Nyquist frequency. Would the filters mean that it would simply fade out smoothly (what order filter is employed?)? I suppose I could check this out myself, but would be interested to hear your answer. Also is the Gibbs phenomenon audible to any degree? --Stuarticus 14:00, 11 June 2013 (PDT)
The presenter is very confused. Each DAC does put out a held sample until the next sample is available. The staircase effect is not a convenient drawing; it is what is actually there. If you pull the data off of a file, of course no steps are visible. It appears the presenter is not familiar with the concept of a reconstruction filter. Its purpose is to remove the heterodyning of the sample frequency with the audio frequencies (the sample-and-hold that is there if the presenter cared to take the top off the DAC and probe the op amp connected to it). Since a sinc function is used to smooth the output waveform, you will always see a smooth sine wave on the output of a DAC, no matter what frequency (below Nyquist) you use.

In fact, if the presenter showed a square wave and did this presentation all over again, he would find out why high sample rates are used in professional audio. They are used to move the reconstruction filter effects (which cause ringing in the waveform) to ultrasonic frequencies, where we won't hear them the way we hear the nasties that 44.1kHz anti-aliasing and reconstruction filters cause. This is a glib presentation that shows only a partial understanding of the construction and the mathematics behind digital sampling and reconstruction, and I suggest consulting folks who are more expert in the field before exposing your ignorance of the issues you are discussing. You should show what a 10kHz square wave looks like when sampled and played back at 44.1kHz. It looks like a sine wave! Now do the same thing at 192kHz and you will see why high sample rates are used. It is not that we hear up to 80kHz; it is that what we do hear, up to around 16kHz, has much truer fidelity when you can move the effect of the sampling and reconstruction filters to an ultrasonic range. Regards, jkorten.
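The 10 kHz square-wave example in the comment above can at least be checked arithmetically (a neutral sketch; it does not settle the audibility argument): a square wave contains only odd harmonics, so count how many of them fit under each Nyquist limit.

```python
# Quick check of the square-wave example: a square wave has only odd
# harmonics (f0, 3*f0, 5*f0, ...), so count those below Nyquist.
def harmonics_below_nyquist(f0, fs):
    return [k * f0 for k in range(1, 1000, 2) if k * f0 < fs / 2]

print(harmonics_below_nyquist(10_000, 44_100))   # [10000] -> a pure sine
print(harmonics_below_nyquist(10_000, 192_000))  # 10, 30, 50, 70, 90 kHz
```

At 44.1 kHz, only the 10 kHz fundamental survives, so the reconstructed wave is indeed a sine; whether the harmonics above ~20 kHz retained at 192 kHz are audible is the point the video and this comment disagree on.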