CMML
CMML stands for Continuous Media Markup Language and is to audio or video what HTML is to text. CMML is essentially a timed text codec. It structures a time-continuously sampled data file by dividing it into temporal sections (so-called clips) and attaches additional information to each clip. This information is HTML-like and is essentially a textual representation of the audio or video file. CMML thereby enables textual searches on these otherwise binary files.
The detailed CMML specifications are part of the Annodex technologies. This description gives only a quick introduction. For the full specifications, see http://www.annodex.net/specifications.html.
CMML specification
Before describing the actual data that goes into a logical Ogg bitstream, we need to understand what the stand-alone "codec" contains.
CMML basically consists of:
- a head tag which contains information for the complete audio/video file
- a set of clip tags, each of which contains information on a temporal section of the file
- for authoring purposes, CMML also allows a stream tag which specifies the file it describes
An example CMML file looks like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE cmml SYSTEM "cmml.dtd">
<cmml lang="en" id="simple" granulerate="1000/1">
  <stream id="fish" basetime="0">
    <import id="videosrc" lang="en" title="Video fish" granulerate="25/1" contenttype="video/theora" src="fish.ogg" start="0" end="360">
      <param id="vheight" name="video.height" value="250"/>
      <param id="vwidth" name="video.width" value="180"/>
    </import>
  </stream>
  <head>
    <title>Types of fish</title>
    <meta name="Producer" content="Joe Ordinary"/>
    <meta name="DC.Author" content="Joe's friend"/>
  </head>
  <clip id="intro" start="0">
    <a href="http://www.example.com/fish.html">Read more about fish</a>
    <desc>This is the introduction to the film Joe made about fish.</desc>
  </clip>
  <clip id="dolphin" start="npt:3.5" end="npt:5:5.9">
    <img src="dolphin.jpg"/>
    <desc>Here, Joe caught sight of a dolphin in the ocean.</desc>
    <meta name="Subject" content="dolphin"/>
  </clip>
  <clip id="goldfish" start="npt:5:5.9">
    <a href="http://www.example.com/morefish.anx?id=goldfish">More video clips on goldfish.</a>
    <img src="http://www.example.com/goldfish.jpg"/>
    <desc>Joe has a fishtank at home with many colourful fish. The common goldfish is one of them and Joe's favourite. Here are some fabulous pictures he has taken of them.</desc>
    <meta name="Location" content="Joe's fishtank"/>
    <meta name="Subject" content="goldfish"/>
  </clip>
</cmml>
The head element is a standard head element from HTML.
Clips contain (among other things) the following information:
- a name in the id attribute so that clips can be addressed, as in http://www.example.com/morefish.anx?id=goldfish (the Web server needs to support this)
- a start and possibly an end attribute, which tell where the clip is located in time
- a title attribute to give it a short description
- meta elements to provide it with structured metadata as name-value pairs
- an img element which links to a picture that represents the content of the clip visually
- an a element which puts a hyperlink to another Web resource into the clip
- a desc element giving a long, free-text description/annotation/transcription for the clip
Most of this information is optional.
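As a rough illustration of how this clip information can be read and searched, the following Python sketch parses a shortened copy of the example file with the standard library. The helper names describe_clips and search_clips are made up for this illustration and are not part of any CMML library; the DOCTYPE line is left out so that the parser does not need cmml.dtd.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example CMML file (DOCTYPE and stream tag omitted).
CMML_DOC = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cmml lang="en" id="simple" granulerate="1000/1">
  <head>
    <title>Types of fish</title>
  </head>
  <clip id="intro" start="0">
    <a href="http://www.example.com/fish.html">Read more about fish</a>
    <desc>This is the introduction to the film Joe made about fish.</desc>
  </clip>
  <clip id="dolphin" start="npt:3.5" end="npt:5:5.9">
    <img src="dolphin.jpg"/>
    <desc>Here, Joe caught sight of a dolphin in the ocean.</desc>
    <meta name="Subject" content="dolphin"/>
  </clip>
</cmml>
"""


def describe_clips(root):
    """Collect the per-clip information listed above (hypothetical helper)."""
    clips = []
    for clip in root.findall("clip"):
        clips.append({
            "id": clip.get("id"),
            "start": clip.get("start"),
            "end": clip.get("end"),  # None when the clip has no end attribute
            "desc": clip.findtext("desc", default=""),
            "links": [a.get("href") for a in clip.findall("a")],
            "meta": {m.get("name"): m.get("content") for m in clip.findall("meta")},
        })
    return clips


def search_clips(clips, word):
    """Return the ids of all clips whose description mentions the word."""
    word = word.lower()
    return [c["id"] for c in clips if word in c["desc"].lower()]


# Parse from bytes so that the encoding declaration is honoured.
root = ET.fromstring(CMML_DOC.encode("utf-8"))
clips = describe_clips(root)
matches = search_clips(clips, "dolphin")
```

Here matches holds the ids of the clips whose desc text mentions "dolphin", which is the kind of textual search on a media file that CMML is meant to enable.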
CMML mapping into Ogg
When CMML is mapped into an Ogg logical bitstream, it needs to be serialised first. XML is a hierarchical file format and is therefore not generally serialisable. CMML, however, has been designed so that it can be serialised easily.
CMML is serialised by having some initial header pages that set up the CMML decoding environment and contain header-type information. The content of a CMML logical bitstream then consists of clip tags only. The stream tag is not copied into the CMML bitstream, as it is used for authoring only.
All of the CMML bitstream information is text. As it gets encoded into a binary bitstream, an encoding format has to be specified. To simplify things, UTF-8 is defined as the mandatory encoding format for all data in a CMML binary bitstream. Also, the encoding process MUST ensure that newline characters are represented as LF (or "\n" in C) only, and MUST replace any newlines that come as CR LF combinations (or "\r\n" in C) with a single LF.
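The encoding rule above can be sketched in a few lines of Python. The function name encode_cmml_text is an assumption made for this example. Only CR LF pairs are rewritten, since that is the only case the text names; a lone CR is left untouched here.

```python
def encode_cmml_text(text: str) -> bytes:
    """Prepare CMML text for a bitstream packet (hypothetical helper).

    Newlines given as CR LF are reduced to LF, then the text is
    encoded as UTF-8, the mandatory encoding for CMML bitstream data.
    """
    normalised = text.replace("\r\n", "\n")
    return normalised.encode("utf-8")


payload = encode_cmml_text("<desc>Joe's caf\u00e9</desc>\r\n")
```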
The CMML ident header packet
The first header packet of a CMML logical bitstream is the CMML ident header. It contains all information required to identify the CMML bitstream and to set up a CMML decoder. It has the following format:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identifier 'CMML\0\0\0\0'                                     | 0-3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 4-7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version major                 | Version minor                 | 8-11
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ...
The CMML version as described here is major=2 minor=1.
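The following Python sketch checks the part of the ident header that is shown above: the 8-byte identifier and the two 16-bit version fields in bytes 8-11. The byte order of the version fields is not stated in this description; little-endian is assumed here and should be checked against the full specification. The fields after byte 11 are not covered, and parse_ident_prefix is a hypothetical name.

```python
import struct

CMML_IDENTIFIER = b"CMML\x00\x00\x00\x00"  # bytes 0-7 of the ident header


def parse_ident_prefix(packet: bytes):
    """Return (major, minor) from the first 12 bytes of an ident header."""
    if len(packet) < 12:
        raise ValueError("packet too short for a CMML ident header")
    if packet[:8] != CMML_IDENTIFIER:
        raise ValueError("not a CMML ident header")
    # ASSUMPTION: the two 16-bit version fields are little-endian.
    major, minor = struct.unpack_from("<HH", packet, 8)
    return major, minor


# A hand-built header prefix for version 2.1, using the same assumption.
example = CMML_IDENTIFIER + struct.pack("<HH", 2, 1)
version = parse_ident_prefix(example)
```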
The CMML secondary headers
The CMML secondary headers are a sequence of two packets that contain the CMML and XML "setup" information:
- one packet with the XML preamble and the cmml tag
- one packet with the CMML head tag
These packets contain textual, not binary information.
The CMML preamble tags are all single-line tags, such as the XML declaration (<?xml...>) and the document type declaration (<!DOCTYPE...>).
The only CMML tag that cannot be taken over unchanged from a CMML file is the cmml tag, as it encloses all the other content tags. To serialise it, the cmml start tag is transformed into a processing instruction that retains all its attributes (<?cmml ...>), and the cmml end tag is deleted.
The first CMML secondary header packet has the following format:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| <?xml ...                                                     | 0-
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ...                                                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| <!DOCTYPE ...                                                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ...                                                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| <?cmml ...                                                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
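The transformation of the cmml start tag into a processing instruction, and the assembly of the first secondary header packet, might look like the Python sketch below. The function names are made up for this example. The XML declaration and the DOCTYPE line are passed in as strings because the standard parser does not expose them, and separating the three preamble tags with LF is an assumption, not something this description prescribes.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def cmml_tag_to_pi(root):
    """Turn the cmml start tag into a <?cmml ...?> processing instruction,
    keeping all of its attributes (hypothetical helper)."""
    attrs = "".join(
        f" {name}={quoteattr(value)}" for name, value in root.attrib.items()
    )
    return f"<?cmml{attrs}?>"


def build_preamble_packet(xml_decl, doctype, root):
    """Build the text payload of the first secondary header packet.

    ASSUMPTION: the three single-line tags are separated by LF.
    """
    text = "\n".join([xml_decl, doctype, cmml_tag_to_pi(root)]) + "\n"
    return text.encode("utf-8")


root = ET.fromstring('<cmml lang="en" id="simple" granulerate="1000/1"></cmml>')
packet = build_preamble_packet(
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>',
    '<!DOCTYPE cmml SYSTEM "cmml.dtd">',
    root,
)
```

The cmml end tag never appears in the packet, which matches the rule that it is deleted during serialisation.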
The second CMML secondary header packet contains the CMML head element with all its attributes and contained elements, and has the following format:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| <head ...                                                     | 0-
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ...                                                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| </head>                                                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The CMML data packets
The data packets of the CMML bitstream contain the CMML clip elements. Their start and end attributes, however, exist only for authoring purposes and are not copied into the bitstream (to avoid contradictory information). Instead, they are represented through the time mapping of the encapsulation format that interleaves CMML data with data from other time-continuous bitstreams. Generally, the time mapping is done through some timestamp representation and through the position in the stream.
A clip tag is encoded with all its attributes and child elements (except for the start and end attributes) as a string printed into a clip packet. The clip tag's start attribute tells the encapsulator at what time to insert the clip packet into the bitstream. If an end attribute is present, it leads to the creation of a second clip packet at the end time, unless another clip packet starts on the same track beforehand. This second clip packet contains an "empty" clip tag, i.e. a clip tag without meta, a, img or desc elements and with no attribute values except for a copy of the track attribute from the original clip tag. A clip packet has the following format:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| <clip ...                                                     | 0-
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ...                                                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| </clip>                                                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
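The handling of the start and end attributes can be sketched in Python as follows. The function clip_to_packets is a hypothetical helper that returns (time, text) pairs for an encapsulator to place in the stream. Times are kept as the opaque attribute strings, the track value "default" is only an example, and the rule that an end packet is dropped when another clip starts earlier on the same track is left to the caller.

```python
import copy
import xml.etree.ElementTree as ET


def clip_to_packets(clip):
    """Split one authored clip element into (time, packet_text) pairs."""
    body = copy.deepcopy(clip)  # leave the authored element untouched
    body.tail = None            # do not serialise whitespace after the tag
    start = body.attrib.pop("start", None)
    end = body.attrib.pop("end", None)
    packets = [(start, ET.tostring(body, encoding="unicode"))]
    if end is not None:
        # "Empty" clip tag: no child elements, only a copy of the track.
        empty = ET.Element("clip")
        if "track" in clip.attrib:
            empty.set("track", clip.get("track"))
        packets.append((end, ET.tostring(empty, encoding="unicode")))
    return packets


clip = ET.fromstring(
    '<clip id="dolphin" track="default" start="npt:3.5" end="npt:5:5.9">'
    "<desc>Here, Joe caught sight of a dolphin in the ocean.</desc>"
    "</clip>"
)
packets = clip_to_packets(clip)
```

For this clip the result holds two entries: the clip text without start and end, to be inserted at the start time, and an empty clip tag carrying only the track attribute, to be inserted at the end time.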