

9,685 bytes added, 04:24, 15 June 2012
Reverted edits by WikiCleaner (talk) to last revision by Ogg.k.ogg.k
== Disclaimer ==
This is not a Xiph codec, though it may be embedded in Ogg alongside other Xiph codecs, such as Vorbis and Theora. As such, please do not assume that Xiph has anything to do with this, much less responsibility.
== What is Kate? ==
Kate is an overlay codec, originally designed for karaoke and text, that can be multiplexed in Ogg. Text and images can be carried by a Kate stream, and animated. Most of the time, this would be multiplexed with audio/video to carry subtitles, song lyrics (with or without karaoke data), etc, but doesn't have to be.

Series of curves (splines, segments, etc) may be attached to various properties (text position, font size, etc) to create animated overlays. This allows scrolling or fading text to be defined. This can even be used to draw arbitrary shapes, so hand drawing can also be represented by a Kate stream.

Example uses of Kate streams are movie subtitles for Theora videos, either text based, as may be created by [ ffmpeg2theora], or image based, such as created by [http://thoggen.net Thoggen] (patching needed), and lyrics, as created by oggenc, from vorbis-tools.
== Why a new codec? ==
*text data: text/image and optional motions, accompanied by optional overrides for style, region, language, etc
*keepalive: can be emitted at any time to help a demuxer know where we're at, but those packets are optional
*repeats: a verbatim repeat of a text packet's payload, in order to bound any backward seeking needed when starting to play a stream partway through. These are also optional.
*end data [EOS]: marks the end of the stream, it doesn't have any useful payload
::0x00 text data (including optional motions and overrides)
::0x01 keepalive
::0x02 repeat
::0x7f end packet (EOS)
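The type codes listed above can be turned into a small lookup table. The following is a minimal sketch rather than libkate code: it assumes the type code is carried in the first byte of each packet, and the names <code>KATE_DATA_PACKET_TYPES</code> and <code>describe_packet</code> are hypothetical.

```python
# Data packet type codes for bitstream 0.x, as listed above.
KATE_DATA_PACKET_TYPES = {
    0x00: "text data",
    0x01: "keepalive",
    0x02: "repeat",
    0x7F: "end packet (EOS)",
}


def describe_packet(packet: bytes) -> str:
    """Name a data packet from its type code.

    Assumption: the type code is the first byte of the packet.
    """
    if not packet:
        raise ValueError("empty packet")
    code = packet[0]
    # Codes not in the table are reported rather than rejected.
    return KATE_DATA_PACKET_TYPES.get(code, "unknown (0x%02x)" % code)
```

A demuxer could use such a table to skip keepalive packets, since they carry nothing to display.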
The format described here is for bitstream version 0.x.
As of 19 December 2008, the latest bitstream version is 0.4.
For more detailed information, refer to the format documentation
*Cortado (wikimedia version)
I have patches for the following with Kate support:
*MPlayer (for multiplexed per-language subtitles - all region/style info is ignored)
*xine (everything Kate supports, as xine is my testbed)
*and more...
These may be found in the libkate source distribution (see [[#Downloading|Downloading]]
for a stream holds that information in the granule_shift field,
so each part may be reconstructed from a granulepos.
The timestamp T of a given Kate packet is split into a base B and an offset O, and these are stored in the granulepos of that packet. The split is done such that B is the time of the earliest event still active at the time, and O is the time elapsed between B and T. Thus, T = B + O. This mimics the way Theora stores its own timestamps in granulepos, where the base acts as a keyframe, and the offset acts as the position of an inter frame from the previous keyframe. Since Kate allows time overlapping events, however, the choice of the base to use is slightly more complex, as it may not be the starting time of the previous event, if the stream contains time overlapping events.
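The base/offset split can be sketched as follows. This is an illustration under stated assumptions, not the libkate implementation: it assumes times are integer ticks, and that the base sits in the high bits and the offset in the low <code>granule_shift</code> bits, in the Theora style mentioned above. All function names are hypothetical.

```python
def split_timestamp(t, earliest_active_start):
    """Split timestamp T into (B, O), with B the start of the earliest active event."""
    base = earliest_active_start
    offset = t - base
    if offset < 0:
        raise ValueError("base must not be later than the timestamp")
    return base, offset


def pack_granulepos(base, offset, granule_shift):
    """Pack (B, O) into one integer; assumes the base occupies the high bits."""
    if offset >= (1 << granule_shift):
        raise ValueError("offset does not fit in granule_shift bits")
    return (base << granule_shift) | offset


def unpack_granulepos(granulepos, granule_shift):
    """Recover (B, O) from a granulepos; T is then B + O."""
    base = granulepos >> granule_shift
    offset = granulepos & ((1 << granule_shift) - 1)
    return base, offset
```

For example, a packet at tick 1500 emitted while an event that started at tick 1000 is still active gets B = 1000 and O = 500. A decoder seeking to that packet then knows it has to go back to tick 1000 to pick up every event still on screen.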
The kate_info structure for a stream holds a rational fraction
=== Generic timing ===
There are a few things to solve before the Kate bitstream format can be considered good
enough to be frozen:
Note: the following is mostly solved, and the bitstream is now stable, and has been
backward and forward compatible since the first released version. This will be updated
when I get some time.
=== Seeking and memory ===
* spu-subtitles - movie subtitles in DVD style paletted images
* lyrics - song lyrics
* transcript - exact words of a speech
* commentary - running commentary about accompanying media, e.g. a video
* narration - narration of accompanying media, e.g. a video
* book - a full book as text, might be a lone Kate stream (or muxed with other languages)
Please remember the 15 character limit if proposing other categories.
Note that the list of categories is subject to change, and will likely
be replaced by new, more "identifier like" ones. The three ones above,
however, would be kept for backward compatibility as they're already used.
== Text to speech ==
instead of simple paletted bitmaps in a Kate stream. Comments would be most welcome on
whether this is going too far, however.
I am also investigating SVG images. These allow for very small footprint images for simple
vector drawings, and could be very useful for things like background gradients below text.
A possible solution to the duplication issue is to have another stream in the container
== Reference encoder/decoder ==
An encoder (kateenc) and a decoder (katedec) are included in the tools directory. The encoder supports input from several different formats:
* a custom text based file format (see [[#The Kate file format|The Kate file format]]), which is by no means meant to be part of the Kate bitstream specification itself
* SubRip (.srt), the most common subtitle format
* LRC lyrics format

As an example for the widely used SRT subtitles format, the following command line creates a Kate subtitles stream from an SRT file:

 kateenc -l en -c subtitles -t srt -o subtitles.ogg

The reverse is possible, to recover an SRT file from a Kate stream, with katedec. Note that the subtitles.ogg file should then be multiplexed into the A/V stream, using either ogg-tools or oggz-tools.
The Kate bitstreams encoded and decoded by those tools are (supposed to be) correct for this
And after all, some people might prefer editing the XML version.
=== Packaging ===
It would be really nice to have packages for libkate/libtiger for many distros.
If you're a packager for a distro which doesn't yet have packages for libkate
or libtiger, please consider helping :)
In particular, packages for Debian would be grand.
== Matroska mapping ==
== Downloading ==
libkate encodes and decodes Kate streams, and is API and ABI stable.
The libkate source distribution is available at [].
A public git repository is available at [;a=summary;a=summary].

libtiger renders Kate streams using Pango and Cairo, and is alpha, with API changes still possible.
The libtiger source distribution is available at [].
A public git repository is available at [;a=summary;a=summary].

== HOWTOs ==

These paragraphs describe a few ways to use Kate streams:

=== Text movie subtitles ===

Kate streams can carry Unicode text (that is, text that can represent pretty much any existing language/script). If several Kate streams are multiplexed along with a video, subtitles in various languages can be made for that movie.

An easy way to create such subtitles is to use ffmpeg2theora, which can create Kate streams from SubRip (.srt) format files, a simple but common text subtitles format. ffmpeg2theora 0.21 or later is needed. At its simplest:

 ffmpeg2theora -o video-with-subtitles.ogg --subtitles video-without-subtitles.avi

Several languages may be created and tagged with their language code for easy selection in a media player:

 ffmpeg2theora -o video-with-subtitles.ogg video-without-subtitles.avi --subtitles --subtitles-language ja --subtitles --subtitles-language cy --subtitles --subtitles-language en_GB

Alternatively, kateenc (which comes with the libkate distribution) can create Kate streams from SubRip files as well. These can then be merged with a video with oggz-tools:

 kateenc -t srt -c SUB -l it -o subtitles.ogg
 oggz merge -o movie-with-subtitles.ogg movie-without-subtitles.ogg subtitles.ogg

This second method can also be used to add subtitles to a video which is already encoded to Theora, as it will not transcode the video again.

=== DVD subtitles ===

DVD subtitles are not text, but images. Thoggen, a DVD ripper program, can convert these subtitles to Kate streams (at the time of writing, Thoggen and GStreamer have not applied the necessary patches for this to be possible out of the box, so patching them will be required).

When configuring how to rip DVD tracks, any subtitles will be detected by Thoggen, and selecting them in the GUI will cause them to be saved as Kate tracks along with the movie.

=== Song lyrics ===

Kate streams carrying song lyrics can be embedded in an Ogg file.
The oggenc Vorbis encoding tool from the Xiph.Org Vorbis tools allows lyrics to be loaded from an LRC or SRT text file and converted to a Kate stream multiplexed with the resulting Vorbis audio. At the time of writing, the patch to oggenc was not applied yet, so it will have to be patched manually with the patch found in the diffs directory.

 oggenc -o song-with-lyrics.ogg --lyrics lyrics.lrc --lyrics-language en_US song.wav

So called 'enhanced LRC' files (containing extra karaoke timing information) are supported, and a simple karaoke color change scheme will be saved out for these files. For more complex karaoke effects (such as more complex style changes, or sprite animation), kateenc should be used with a Kate description file to create a separate Kate stream, which can then be merged with a Vorbis only song with oggz-tools:

 oggenc -o song.ogg song.wav
 kateenc -t kate -c LRC -l en_US -o lyrics.ogg lyrics-with-karaoke.kate
 oggz merge -o song-with-karaoke.ogg lyrics.ogg song.ogg

This latter method may also be used if you already have an encoded Vorbis song with no lyrics, and just want to add the lyrics without reencoding.

=== Metadata ===

Metadata can be attached to events, or to styles, bitmaps, regions, etc. Metadata are free form tag/value pairs, and can be used to enrich their attached data with extra information. However, how this information is interpreted is up to the application layer.

It is worth noting that an event may not have attached text, so it is possible to create an empty timed event with attached metadata.
For instance, let's say we have a documentary, with footage from various places, as well as short interviews, and we want two things:
* tag footage with metadata about the location and date that footage was shot
* subtitle the interviews and tag those subtitles with information about the speaker

You can then create an empty Kate event for each footage part, synchronized with the footage, and attach a new metadata item called GEO_LOCATION, filled with the latitude and longitude of the place the footage was shot at. Similarly, for each subtitle event, a metadata item called SPEAKER can be attached.

An empty event to tag a 4:20 long piece of footage shot in Tokyo on 2011/08/12, and inserted at 18:30 in the documentary, could look like:

 event {
   00:18:30,000 --> 00:22:50,000
   meta "GEO_LOCATION" = "35.42; 139.42"
   meta "DATE" = "2011-08-12"
 }

Here's an example for a line spoken by Dr Joe Bloggs at 18:30 into the documentary:

 event {
   00:18:30,000 --> 00:18:32,000
   "Notice how the subtitles for my words have metadata attached to them"
   meta "SPEAKER" = "Dr Joe Bloggs"
   meta "URL" = ""
 }

Notice how another metadata item, URL, is also present. The application will have to be aware of those metadata in order to do something with them, though. Since those are free form, it is up to you to think of what metadata you want, and make use of it.

Note that metadata may be attached to other objects, such as regions. This way, you can for example create a region tagged with a name, and track a person's movements with that region. Or you can tag a bitmap with a copyright and a URL to a larger version of the image.
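When there are many footage parts to tag, such events could be generated rather than written by hand. The sketch below is only an illustration: it imitates the layout of the examples in this section and does not claim to cover the Kate description format (in particular, it does not escape quotes inside text or values). The function names <code>format_time</code> and <code>format_event</code> are hypothetical.

```python
def format_time(total_ms):
    """Format milliseconds as HH:MM:SS,mmm, the notation used in the examples."""
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, ms = divmod(rest, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, seconds, ms)


def format_event(start_ms, duration_ms, metadata, text=None):
    """Build an event block; text is optional, so empty tagged events are possible."""
    end_ms = start_ms + duration_ms
    lines = ["event {", "  %s --> %s" % (format_time(start_ms), format_time(end_ms))]
    if text is not None:
        lines.append('  "%s"' % text)
    # One meta line per free form tag/value pair, in insertion order.
    for tag, value in metadata.items():
        lines.append('  meta "%s" = "%s"' % (tag, value))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    start = (18 * 60 + 30) * 1000      # inserted at 18:30
    duration = (4 * 60 + 20) * 1000    # 4:20 of footage
    tags = {"GEO_LOCATION": "35.42; 139.42", "DATE": "2011-08-12"}
    print(format_event(start, duration, tags))
```

Run as a script, this prints an event spanning 00:18:30,000 to 00:22:50,000 with the two tags from the Tokyo footage example.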
=== Changing a Kate stream embedded in an Ogg stream ===

If you need to change a Kate stream already embedded in an Ogg stream (eg, you have a movie with subtitles, and you want to fix a spelling mistake, or want to bring one of the subtitles forward in time, etc), you can do this easily with KateDJ, a tool that will extract Kate streams, decode them to a temporary location, and rebuild the original stream after you've made whatever changes you want.

KateDJ (included with the libkate distribution) is a GUI program using wxPython, a Python module for the wxWidgets GUI library, and the oggz tools (both need to be installed separately if they are not already present).

The procedure consists of:

* Run KateDJ
* Click 'Load Ogg stream' and select the file to load
* Click 'Demux file' to decode Kate streams in a temporary location
* Edit the Kate streams (a message box tells you where they are placed)
* When done, click 'Remux file from parts'
* If any errors are reported, continue editing until the remux step succeeds

== Frequently Asked Questions ==

=== Does libkate work on other platforms than Linux ? ===

Yes, libkate is not Linux specific in any way. It optionally relies on libogg and libpng, two libraries widely ported to various platforms. It has been reported to work on Windows and MacOS X as well as UNIX platforms.

However, libtiger, a rendering library for Kate streams, relies on Pango and Cairo, which are not easy to build on Windows, though they can be. The Tiger renderer is however completely separate from libkate, and is not needed for full encoding and decoding of Kate streams.

=== Where can I find some example files ? ===

The libkate distribution can generate various examples, but already built files can be found there: [ -subtitles.ogg] []

These files use raw text only.
== Things I need to get feedback on ==
* is it a good idea to avoid floating point usage altogether ?
[[Category:Ogg Mappings]]
