Videos/A Digital Media Primer For Geeks/making
This page documents some of the background information behind the production of A Digital Media Primer For Geeks. To see the video or its wiki edition, visit the main video page.
The making of…
Canon HV40 HDV camera w/ wide-angle lens operating on a tripod. At the time I was looking for MyFirstVideoCamera, all six people [who did video work] recommended this same camera, and two said not to get it without the wide angle lens. I took their advice and have been happy with it. Among other nifty features, the camera offers true progressive scan modes, live firewire output, and the ability to act as a digitizer for external video input.
The wide angle lens gives the camera a nice close macro mode, and approximately triples the amount of light coming into the sensor for a given zoom/aperture. That's useful for shooting indoors at night (eg, this entire video).
No additional lighting kit was used.
Two Crown PCC160 boundary microphones placed on a table approximately 4-8 feet in front of the speaker, run through a cheap Behringer portable mixer and into the camera's microphone input.
No additional audio kit was used.
Whiteboard markers by 'Bic'
Drawing aids by Staedtler, McMaster Carr, and 'Generic'.
Video shooting sequence
Scenes were pre-scripted and memorized, usually with lots of on-the-fly revision. In the future... I'm getting a teleprompter. OTOH, I can totally rattle off the entire video script from beginning to end as a party trick, thus ensuring I'll not be invited to many parties.
Diagrams were drawn by hand on a physical whiteboard with whiteboard markers and magnetic T-squares, triangles, and yardsticks. Despite looking a lot like greenscreen work, there is no image compositing in use (actually-- there are two small composites where an error in a whiteboard diagram was corrected by subtracting part of the original image and then adding a corrected version of the diagram).
Camera operated in 24F shutter priority mode (Tv set to "24") with exposure and white balance both calibrated to the white board (or a white piece of paper) and locked. Microphone attenuation setting was active, with gain locked such that room noise peaked at -40dB (all the rooms in the shooting sequences were noisy due to the building's ventilation system, or active equipment). Lighting in the whiteboard rooms tended to be odd, with little relative light cast on a presenter standing just in front of the whiteboard; a presenter is practically standing in the room's only shadow. Most of the room light is focused on the table and walls. Additional fill lighting kit would have been useful, but for the first vid, I didn't want 'perfect' to be the enemy of 'good'.
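For reference, a peak figure like the -40dB mentioned above can be computed from raw PCM samples. This is only a minimal sketch: it assumes the figure is relative to digital full scale (dBFS) and that the audio is 16-bit signed PCM, neither of which is stated above, and `peak_dbfs` is a hypothetical helper name.

```python
import math

def peak_dbfs(samples, full_scale=32768):
    """Peak level of integer PCM samples, in dB relative to full scale.

    Assumption: 16-bit signed PCM, so full scale is 32768.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # pure digital silence has no finite level
    return 20.0 * math.log10(peak / full_scale)

# A sample value of about 328 sits roughly 40 dB below 16-bit full scale.
room_noise = [12, -150, 328, -200, 75]
print(round(peak_dbfs(room_noise), 1))
```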
Autofocus used for whiteboard scenes, manual focus used for several workshop scenes as the autofocus tended to hunt continuously in very low light.
Continuous capture to a Thinkpad with firewire input via a simple gstreamer script.
All hail Cinelerra. You better hail, or Cinelerra will get pissy about it.
Most of the production sequence hinged on making Cinelerra happy; it is a hulking rusty cast iron WWI tank of a program that can seem like it's composed entirely of compressed bugs. That said, it was neither particularly crashy nor did it ever accidentally corrupt or lose work. It was also the only FOSS editor with a working 2D compositor. It got the job done once I found a workflow it would cope with (and fixed a number of bugs; these fixes are available from my cinelerra Git repo at http://git.xiph.org/?p=users/xiphmont/cinelerraCV.git;a=summary).
Each shooting session yielded four to six hours of raw video. The first step was to load the raw video into the cinelerra timeline, label each complete take, compare and choose the take to use, then render the chosen take out to a raw clip as a YUV4MPEG raw video file and a WAV raw audio file. Be careful that Settings->Align Cursor On Frames is set, else the audio and video renders won't start on the same boundary.
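To see why frame alignment matters, here is a rough sketch of the arithmetic. It assumes a frame rate of 24000/1001 fps and 48 kHz audio; neither number is given above, so treat both as placeholders. `snap_to_frame` is a hypothetical helper, not anything Cinelerra exposes.

```python
from fractions import Fraction

FRAME_RATE = Fraction(24000, 1001)  # assumption: "24F" footage at 23.976 fps
SAMPLE_RATE = 48000                 # assumption: 48 kHz audio

# Exact number of audio samples covered by one video frame.
SAMPLES_PER_FRAME = SAMPLE_RATE / FRAME_RATE

def snap_to_frame(t_seconds):
    """Snap a cut point to the nearest frame boundary.

    Returns (frame_index, audio_sample_index) so that a video render and
    an audio render starting there begin at the same instant.
    """
    frame = round(Fraction(t_seconds) * FRAME_RATE)
    sample = frame * SAMPLES_PER_FRAME
    return frame, int(sample)

print(SAMPLES_PER_FRAME)   # samples per frame at the assumed rates
print(snap_to_frame(1.0))  # a cut at 1.0 s moves to the nearest frame
```

With these assumed rates a frame lasts about 41.7 ms, so an unaligned cut can leave the audio render starting up to roughly half a frame away from the nearest point where the video render can start.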
At this point, the raw video clips were adjusted for gamma, contrast, and saturation in gstreamer and mplayer. In the earlier shoots the camera was underexposing due to pilot error, which required quite a bit of gamma and saturation inflation to 'correct' (there is no real correction as the low-end data is gone, but it's possible to make it look better). Later shoots used saner settings and the adjustments were mostly to keep different shooting sessions more uniform. The whiteboard tends not to look white because it's mildly reflective, and picked up the color of the cyan and orange audio baffles in the room like a big diffuse mirror.
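The idea behind a gamma lift can be sketched with a simple lookup table. This is not how gstreamer or mplayer implement their filters; it is just the underlying curve, assuming 8-bit luma in the 16..235 video range (an assumption, not something stated above). `gamma_lut` is a hypothetical name.

```python
def gamma_lut(gamma, black=16, white=235):
    """Build a 256-entry table applying a gamma curve to 8-bit luma.

    Assumption: video-range luma, black at 16 and white at 235.
    Values outside that range are clamped to it.
    gamma > 1 brightens midtones; black and white stay where they are.
    """
    span = white - black
    lut = []
    for v in range(256):
        x = min(max((v - black) / span, 0.0), 1.0)  # normalize to 0..1
        y = x ** (1.0 / gamma)                      # the gamma curve itself
        lut.append(round(black + y * span))
    return lut

lut = gamma_lut(2.0)
# An underexposed shot that only used luma 16..80 still has at most 65
# distinct input levels; the curve spreads them out but cannot add more.
print(len({lut[v] for v in range(16, 81)}))
```

The final line illustrates the point made above about underexposure: stretching the low end makes the picture look brighter, but the number of distinct levels never exceeds what was captured.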
The audio was both noisy (due to the building's ventilation system, which sounded like either a low, loud rumble or a jet engine taking off) and reverberant (the rooms were glass on two sides and plaster on the other two). Early takes used no additional sound-absorbing material in the rooms, and the Postfish filtering and deverb were used heavily. This gives the early audio in the vid a slightly odd, processed feel (I had almost decided the original audio was simply unusable). Later takes used some big fleece 'soft flats' in the room to absorb some additional reverb, and the later takes are less heavily filtered.
The postfish filtering chain used declip (for the occasional overrange oops), deverb (remove room reverberation), multicompand (noise gating), single compand (for volume levelling) and EQ (the Crown mics are nice, but are very midrange heavy).
Audio special effects were one-offs, mostly done using Sox. The processed demo sections of audio were then spliced back into the original audio takes using Audacity.
Video special effects (eg, removing a color channel, etc) were done by writing quick, one-off filters in C for y4oi. A few effects were done by dumping a take as a directory full of PNGs and then batch-processing the PNGs again using a one-off C program, then reassembling with mplayer. Video effects were then stitched back into the original video takes in Cinelerra.
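The original filters were written in C and are not reproduced here. As a rough illustration of what a one-off YUV4MPEG filter looks like, the sketch below flattens one chroma plane to neutral gray (128). It assumes an 8-bit 4:2:0 stream, and the choice of effect is a guess at what "removing a color channel" might mean. `flatten_chroma_plane` is a hypothetical name.

```python
import io

# 8-bit 4:2:0 chroma tags this sketch accepts (assumption: nothing else needed).
SUPPORTED_CHROMA = {b"420", b"420jpeg", b"420mpeg2", b"420paldv"}

def flatten_chroma_plane(src, dst, plane="U"):
    """Copy a YUV4MPEG2 stream from src to dst, setting one chroma plane to 128."""
    if plane not in ("U", "V"):
        raise ValueError("plane must be 'U' or 'V'")
    header = src.readline()
    if not header.startswith(b"YUV4MPEG2 "):
        raise ValueError("not a YUV4MPEG2 stream")
    width = height = None
    chroma = b"420jpeg"  # assumed default when the header has no C tag
    for tag in header.split()[1:]:
        if tag[:1] == b"W":
            width = int(tag[1:])
        elif tag[:1] == b"H":
            height = int(tag[1:])
        elif tag[:1] == b"C":
            chroma = tag[1:]
    if width is None or height is None or chroma not in SUPPORTED_CHROMA:
        raise ValueError("need W, H and 8-bit 4:2:0 chroma")
    dst.write(header)
    y_size = width * height
    c_size = ((width + 1) // 2) * ((height + 1) // 2)
    while True:
        frame_header = src.readline()
        if not frame_header:
            break  # clean end of stream
        if not frame_header.startswith(b"FRAME"):
            raise ValueError("expected FRAME marker")
        # Planes are stored in the order Y, U (Cb), V (Cr).
        y, u, v = src.read(y_size), src.read(c_size), src.read(c_size)
        if len(y) != y_size or len(u) != c_size or len(v) != c_size:
            raise ValueError("truncated frame")
        if plane == "U":
            u = bytes([128]) * c_size
        else:
            v = bytes([128]) * c_size
        dst.write(frame_header + y + u + v)

# Tiny in-memory demo: one 4x2 frame (8 luma bytes, 2 + 2 chroma bytes).
demo_in = io.BytesIO(b"YUV4MPEG2 W4 H2 F24:1 Ip A1:1 C420jpeg\nFRAME\n" + bytes(12))
demo_out = io.BytesIO()
flatten_chroma_plane(demo_in, demo_out, plane="U")
```

On real files the same function could be pointed at `sys.stdin.buffer` and `sys.stdout.buffer` so it sits in a pipe between other tools.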
All editing was done in Cinelerra. This primarily consisted of stitching the individual takes back together with crossfades. All input and rendering output were done with raw YUV4MPEG and WAV files. Note that making this work well and correctly required several patches to the YUV4MPEG handler and colorspace conversion code.
I encoded by hand external to Cinelerra using mplayer for final postprocessing, the encoder_example included with the [Ptalarbvorm] Theora source distribution, and ivfenc for WebM. I synced subtitles to the video by hand with Audacity (I already had the script) in SRT format [for easy editing/translation and syncing with the video in HTML5], and transcoded to Ogg Kate using kateenc. The Kate subs were then muxed with the Ogg video encoding using oggz-merge, and finally indexing was added to the Ogg with OggIndex.
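For anyone unfamiliar with SRT, each cue is just an index, a time range, and the text, followed by a blank line. The sketch below writes one cue from start/end times in milliseconds; the helper names and the example timings are made up for illustration and are not taken from the actual subtitle file.

```python
def srt_timestamp(ms):
    """Format a time in milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"

def srt_cue(index, start_ms, end_ms, text):
    """Build one SRT cue: index line, time range, text, blank line."""
    return (f"{index}\n"
            f"{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}\n"
            f"{text}\n\n")

# Hypothetical cue; the timings would come from marks placed by hand.
print(srt_cue(1, 8_124, 11_378, "Example subtitle text"), end="")
```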
Sample Theora encode command lines:
- 360p, 128-ish (a4) audio + 500-ish (v50) video
- perform a little denoising, scale, and deband the raw render:
mplayer -vf hqdn3d,scale=640:360,gradfun=1.5,unsharp=l3x3:.1 complete.y4m -fast -noconsolecontrols -vo yuv4mpeg:file=filtered.y4m
- encode the basic Ogg Vorbis/Theora file:
encoder_example -a 4 -v 50 -k 240 complete.wav filtered.y4m -o basic.ogv
- produce Kate subs from the SRT input file:
kateenc -t srt -l en_US -c SUB -o subs.kate subs.srt
- add the subs to the Ogg video file:
oggz-merge basic.ogv subs.kate -o subbed.ogv
- add index for faster seeking on the Web:
OggIndex subbed.ogv -o A_Digital_Media_Primer_For_Geeks-360p.ogv
Sample WebM command lines:
- Might as well reuse the Vorbis encoding already done for the Ogg file:
oggz-rip -c vorbis A_Digital_Media_Primer_For_Geeks-360p.ogv -o vorbis.ogg
- Produce VP8 encoding from the y4m file used for Theora
ivfenc filtered.y4m vp8.ivf -p 2 -t 4 --best --target-bitrate=1500 --end-usage=0 --auto-alt-ref=1 -v --minsection-pct=5 --maxsection-pct=800 --lag-in-frames=16 --kf-min-dist=0 --kf-max-dist=120 --static-thresh=0 --drop-frame=0 --min-q=0 --max-q=60
- Mux the audio and video into WebM file
mkvmerge vorbis.ogg vp8.ivf -o A_Digital_Media_Primer_For_Geeks-360p.webm