From XiphWiki
Revision as of 19:19, 31 October 2004 by (talk)


Ogg Writ is a text phrase codec. While its primary purpose is to embed subtitles or captions in a Theora stream, its design makes it useful for many other purposes. It could provide lyrics to a song encoded in Vorbis, a transcript of a political debate encoded in Speex, or even incorporate a live chat session as part of a continuous video stream.

One of the unique aspects of Writ is its discontinuous nature: unlike other Ogg codecs, the granule ranges that separate packets affect may overlap. See the Granules and Muxing section below for how this works.


Current Ogg Writ development is on Xiph SVN as /trunk/writ/. It's being developed to use libogg2, so you'll need both to work on it. The reference encoder and decoder are available as part of the py-ogg2 package which is available on Xiph SVN as /trunk/py-ogg2/.

This is a (near final) working draft of the spec.
Writ has been designed so that encoders/decoders can support a bare minimum and be fully compatible with future subversions. Each subversion adds a new feature, some building on others, adding a new header packet and likely a new field to each body packet.

Decoders should ignore header packets beyond what they were written to support and also ignore extra fields in data packets beyond their current version. This allows new features to be added without requiring all software, or even most software, to support them.
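
The rule above can be sketched as a small packet classifier. This is only an illustration, not the reference decoder: `classify_packet` and `MAX_SUPPORTED_HEADER` are hypothetical names, and the leading-byte values (header number, or 0xFF for data) are taken from the packet layouts in this draft.

```python
MAX_SUPPORTED_HEADER = 0  # hypothetical: a decoder written for subversion 0 only


def classify_packet(packet: bytes) -> str:
    """Return 'data', 'header', or 'ignore' for a raw Writ packet."""
    if not packet:
        return "ignore"
    kind = packet[0]
    if kind == 0xFF:
        return "data"
    # Header packets carry their number in the first byte, followed by "writ".
    if packet[1:5] == b"writ" and kind <= MAX_SUPPORTED_HEADER:
        return "header"
    # Headers from later subversions (or anything unrecognized) are skipped.
    return "ignore"
```

A decoder built this way keeps working when a stream carries Header Packet 1 or 2; it simply never sees them.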

We will be conservative about adding future subversions.

Header Packet 0 (BOS, 16 bytes):
 0x00                                   ( 8 bit Header 0)
 "writ" (LSB 0x74697277)                (32 bit codec identification)
 version                                ( 8 bit unsigned int, 0 = Alpha)
 subversion                             ( 8 bit unsigned int)
 granulerate_numerator                  (32 bit unsigned int)
 granulerate_denominator                (32 bit unsigned int)

Data Packet (each):
 0xFF                                   ( 8 bit 0xFF = data packet)
 granule_start                          (64 bit signed integer)
 granule_duration                       (32 bit unsigned integer)
 text_length                            ( 8 bit unsigned integer)
 text_string                            (variable-length UTF-8 string)
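
As a rough sketch, the two subversion 0 layouts above could be packed with Python's `struct` module. The function names are hypothetical, and little-endian byte order is an assumption based on the "LSB" note and the bytes shown in the Example Stream. Note that the fields as listed for Header Packet 0 add up to 15 bytes.

```python
import struct


def pack_header0(version, subversion, rate_num, rate_den):
    """Pack Header Packet 0 (assumed little-endian, no padding)."""
    # B = 8 bit, 4s = 4 raw bytes, I = 32 bit unsigned
    return struct.pack("<B4sBBII", 0x00, b"writ",
                       version, subversion, rate_num, rate_den)


def pack_data_v0(granule_start, granule_duration, text):
    """Pack a subversion 0 data packet holding a single text string."""
    encoded = text.encode("utf-8")
    if len(encoded) > 255:
        raise ValueError("text_length is one byte; text is too long")
    # q = 64 bit signed, I = 32 bit unsigned, B = 8 bit length prefix
    head = struct.pack("<BqIB", 0xFF, granule_start,
                       granule_duration, len(encoded))
    return head + encoded
```

Here `text_length` is taken to count encoded bytes rather than characters, which the draft does not state explicitly.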

<B>Subversion 1 adds multiple language support</B>

Header Packet 1 (Language Definition, 8+ bytes) :
 0x01                                   ( 8 bit Header 1)
 "writ" (LSB 0x74697277)                (32 bit codec identification)
 num_languages                          ( 8 bit unsigned int)
 [repeated 1+num_languages times] :
   language_length                      ( 8 bit unsigned int)
   language_string                      (0+language_length rfc3066)
   language_desc_length                 ( 8 bit unsigned int)
   language_desc_string                 (0+language_desc_length UTF-8)
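
A possible way to build Header Packet 1 is sketched below. `pack_header1` is a hypothetical helper. The listing above says the entry is repeated 1+num_languages times, while the Example Stream shows num_languages 2 with two entries; this sketch follows the Example Stream and writes the plain entry count, which is an assumption.

```python
import struct


def pack_header1(languages):
    """languages: list of (rfc3066_tag, description) pairs."""
    # Assumption: num_languages is written as the number of entries.
    out = bytearray(struct.pack("<B4sB", 0x01, b"writ", len(languages)))
    for tag, desc in languages:
        tag_bytes = tag.encode("ascii")
        desc_bytes = desc.encode("utf-8")
        if len(tag_bytes) > 255 or len(desc_bytes) > 255:
            raise ValueError("length fields are one byte each")
        out.append(len(tag_bytes))
        out += tag_bytes
        out.append(len(desc_bytes))
        out += desc_bytes
    return bytes(out)
```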

Data Packet (each):
 0xFF                                   ( 8 bit 0xFF = data packet)
 granule_start                          (64 bit signed integer)
 granule_duration                       (32 bit unsigned integer)
 [repeated num_languages times] :
   text_length                          ( 8 bit unsigned integer)
   text_string                          (variable-length UTF-8 string)

<B>Subversion 2 adds text window support</B>

Header Packet 2 (Window Definition, 10+ bytes) :
 0x02                                   ( 8 bit Header 2)
 "writ" (LSB 0x74697277)                (32 bit codec identification)
 location_scale_x                       (16 bit unsigned int)
 location_scale_y                       (16 bit unsigned int)
 num_windows                            ( 8 bit unsigned int)
 [if (num_windows > 0) repeated num_windows times] :
   location_x                           (variable length, see below)
   location_y                           (variable length, see below)
   location_width                       (variable length, see below)
   location_height                      (variable length, see below)
   alignment_x                          ( 2 bit alignment, see below)
   alignment_y                          ( 2 bit alignment, see below)

Data Packet (each):
 0xFF                                   ( 8 bit 0xFF = data packet)
 granule_start                          (64 bit signed integer)
 granule_duration                       (32 bit unsigned integer)
 [repeated num_languages times] :
   text_length                          ( 8 bit unsigned integer)
   text_string                          (variable-length UTF-8 string)
 [if (num_windows > 1)] :
   window_id                            ( 8 bit unsigned integer)

<B>Example Stream</B>
 Header Packet 0
  version 0
  subversion 2
  granulerate_numerator 1
  granulerate_denominator 1

 Header Packet 1
  num_languages 2
   Language 0:
    language en
    language_desc English
   Language 1:
    language es
    language_desc Spanish

 Header Packet 2
  location_scale_x 4000 (12 bits)
  location_scale_y 270  ( 9 bits)
  num_windows 2
   Window 0:
    location_x 1
    location_y 2
    location_width 3
    location_height 1
    alignment_x 3 (Full)
    alignment_y 3 (Full)
   Window 1:
    location_x 5
    location_y 6
    location_width 7
    location_height 1
    alignment_x 3 (Full)
    alignment_y 3 (Full)

 Phrase Packet:
  granule_start 5
  granule_duration 10
  Language 0: "Hello World!"
  Language 1: "Hola, Mundo!"
  window_id 0
 \xff\x05\x00\x00\x00\x00\x00\x00\x00\x0a\x00\x00\x00\x0cHello World!\x0cHola, Mundo!\x00

 Phrase Packet:
  granule_start 12
  granule_duration 15
  Language 0: "It's a beautiful day to be born."
  Language 1: "Es un día hermoso para que se llevará."
  window_id 1
 \xff\x0c\x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x00\x20It's a beautiful day to be born.\x28Es un d\xc3\xada hermoso para que se llevar\xc3\xa1.\x01
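
The first Phrase Packet above can be reproduced with a short encoder sketch. `pack_phrase` is a hypothetical name, little-endian integers are assumed from the bytes shown, and `text_length` is assumed to count UTF-8 bytes.

```python
import struct


def pack_phrase(granule_start, granule_duration, texts, window_id=None):
    """Pack a data packet with one text per language and an optional window_id."""
    out = bytearray(struct.pack("<BqI", 0xFF, granule_start, granule_duration))
    for text in texts:
        encoded = text.encode("utf-8")
        if len(encoded) > 255:
            raise ValueError("text_length is one byte; text is too long")
        out.append(len(encoded))  # 8 bit length prefix
        out += encoded
    if window_id is not None:
        # Per the subversion 2 layout, window_id is only written when
        # more than one window is defined.
        out.append(window_id)
    return bytes(out)


packet = pack_phrase(5, 10, ["Hello World!", "Hola, Mundo!"], window_id=0)
```

With these inputs, `packet` matches the byte string listed under the first Phrase Packet.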

Granules and Muxing

In Writ (as well as future discontinuous codecs), the granulepos of a page is the start time, not the end time, of the data it carries. This greatly simplifies this specification (see the old method below).

All Writ phrases will be provided at and given the granulepos of their start time, ordered by their start time within the logical bitstream.

Phrase packets with long durations should be repeated in the logical bitstream at regular intervals to ensure that a player seeking to the middle of their duration will still see them. These packet copies will be identical to their original, including the start and duration fields; only the granulepos of the page they reside on is advanced for each copy, to place it further forward in the logical bitstream.
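
One way an encoder might choose the page granulepos values for a phrase and its copies is sketched below. The helper name and the fixed `interval` parameter are hypothetical; this draft does not prescribe an interval.

```python
def copy_granulepos(start, duration, interval):
    """Page granulepos for the original phrase and each repeated copy."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    end = start + duration
    # The original goes at its start time; copies follow every `interval`
    # granules for as long as the phrase is still being displayed.
    positions = list(range(start, end, interval))
    return positions or [start]
```

For example, a phrase starting at granule 23 with an assumed duration of 37 and an interval of 15 would be placed at 23, 38 and 53. The packet body is the same each time; only the page position changes.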

No two phrases can start on the same granule. On decoding, each packet's start granule is checked against already known packets. If a match is found, the new packet is ignored. This prevents phrase copies from being interpreted as new phrases.
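
The duplicate check can be sketched with a dictionary keyed by start granule. `PhraseStore` is a hypothetical class for illustration and is not part of libwrit.

```python
class PhraseStore:
    """Remembers phrases by start granule so repeated copies are dropped."""

    def __init__(self):
        self._by_start = {}

    def add(self, granule_start, granule_duration, texts):
        """Store a phrase; return False if this start granule is already known."""
        if granule_start in self._by_start:
            return False  # a copy of a phrase we already have
        self._by_start[granule_start] = (granule_duration, texts)
        return True
```

Because the start granule alone identifies a phrase, the check does not need to compare text or duration.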

Seeking Example

Here is a timeline (granule numbers at top, read down) of a sample stream:

                        <- Granules ->
 ___________  ____________  ____________  ____________  _____________
 ____________________   ____________________________________
|_A____________>_____| |_D____________>______________>______|
     _________      ___    __________     ___________
    |_B_______|    |_C_|  |_E________|   |_F_________|

 (note: these have been separated vertically for easy viewing only)

Packet  Granule Description
 V H0   0       Vorbis Header 0x01 (page by itself)
 W H0   0       Writ Header 0 (page by itself)
 V H1   0       Vorbis Header 0x03
 V H2   0       Vorbis Header 0x05
 W H1   0       Writ Header 1 (Language Defs)
 W H2   0       Writ Header 2 (Window Defs)
 W A    0       Writ Phrase A
 W B    4       Writ Phrase B
 V      12      Vorbis 0-12
 W A    15      Writ Phrase A
 W C    19      Writ Phrase C
 W D    23      Writ Phrase D
 V      26      Vorbis 13-26
 W E    26      Writ Phrase E
 W D    38      Writ Phrase D
 V      40      Vorbis 27-40
 W F    41      Writ Phrase F
 W D    53      Writ Phrase D (EOF)
 V      54      Vorbis 41-54
 V      69      Vorbis 55-69 (EOF)

The player begins decoding at the beginning of the stream. It reads the BOS pages for both codecs, then receives a non-BOS page. At this point it knows that it has two bitstreams to decode and has resolved that one is Writ and the other Vorbis. It'll continue processing the headers for both.

Next it's going to find two Writ packets (phrases A and B) and toss them into libwrit. Then it'll get to the first Vorbis data page. It now has data from both bitstreams, and it knows (from the granulepos on the Vorbis page) that it has enough data to run until 12. If there were any Writ packets before 12 they would have appeared first.

At around granule 9 the listener seeks forward to 24. This will cause a rapid seek through the file to find the first page with a granulepos greater than the seek position and begin decoding at that point.

It'll find a Vorbis packet containing 13-26 (and not use 13-23) and Writ phrase E. Again, having data from both bitstreams it can begin playing. D would normally appear at granule 24 but is not known about yet. The player knows that this is only enough to decode until 26 so, knowing enough to prebuffer, continues reading the file as it plays the media.

The next packet it finds is Writ phrase D. On passing it to libwrit, the player finds that the current granulepos is within the phrase's duration. The phrase is thus displayed immediately, as it's prebuffered, without waiting for granulepos 38. The player keeps reading (because the maximum decoded Vorbis is still 26) and finds a Vorbis packet with a granulepos of 40.

As it nears 38 it'll read the file again and find Writ phrase F, which takes it out to 41. Vorbis only goes until 40, so it'll have to keep reading until the next Vorbis packet.

Next it'll find Writ phrase D, which will be ignored by libwrit because phrase D is already known (matches start granule of earlier D), and the EOF on that page marks this as the last of the Writ stream.

It'll continue reading for the next Vorbis data and find the packet for granule 54, followed by the Vorbis packet for granule 69. With that it's EOS, EOF, finished.
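
The test applied when the late copy of phrase D arrives (is the current position inside the phrase's duration?) might look like the sketch below. `active_phrases` is a hypothetical helper, the durations are illustrative values rather than figures from the table, and treating the end of the duration as exclusive is an assumption.

```python
def active_phrases(phrases, position):
    """phrases: iterable of (granule_start, granule_duration, text) tuples."""
    return [text for start, duration, text in phrases
            if start <= position < start + duration]


known = [(23, 37, "D"), (26, 11, "E")]  # illustrative durations
showing = active_phrases(known, 27)     # both D and E cover granule 27
```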

The seeking example above is of course simplistic; Writ and Vorbis granules will rarely represent the same amount of time. Each bitstream has its own granule -> time mapping, which is calculated when muxing concurrent bitstreams within the file. So if there are 44100 Vorbis granules per second and only 4 Writ granules per second, pages would be ordered as W25 V297892 W31 V385932 W39 W41 V463057 etc. The logic used in the above example works after this granule-time mapping is calculated.
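
The ordering in that example can be checked by converting each page's granulepos to seconds and sorting. This is a small sketch: the rates are the ones quoted in the paragraph above, and the names are hypothetical.

```python
from fractions import Fraction

# Granules per second for each stream, as quoted in the text above.
RATES = {"W": Fraction(4), "V": Fraction(44100)}


def page_time(stream, granulepos):
    """Timestamp in seconds represented by a page's granulepos."""
    return Fraction(granulepos) / RATES[stream]


def mux_order(pages):
    """Sort (stream, granulepos) pages by timestamp; ties keep input order."""
    return sorted(pages, key=lambda page: page_time(*page))


pages = [("V", 297892), ("W", 25), ("V", 385932), ("W", 31),
         ("W", 39), ("V", 463057), ("W", 41)]
ordered = mux_order(pages)
```

Exact fractions are used instead of floats so that pages falling on the same timestamp compare as equal rather than differing by rounding error.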

Ongoing Discussion

  • How does this get "encoded" and "merged"?
    • <purple_haese> The muxing rule is pages are arranged in ascending order by the timestamp that is represented by their granulepos.
  • Why is there a 0x00 or 0xFF byte at the beginning of header and data packets respectively?
    • <xiphmont> If, after a seek, I hand your codec a header packet, what does the codec do?
    • <xiphmont> It does *nothing*. If I haven't told it to reset, the header is not data, *it must ignore the header*.
    • <xiphmont> this eliminates a huge raft of special cases in Ogg seeking.

"The Old Way"

The section below is for historical purposes only!

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  In a lengthy discussion with Monty and Derf the decision to change the
  behavior of discontinuous bitstreams in Ogg, or rather, extend the
  current Ogg specification to handle discontinuous codecs, was made.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

The Ogg granulepos of each page is equal to the expiration of the text; packets are ordered by expiration time and may overlap. So, at or before text A is to be displayed, the following sequence is included:

Physical        Text    Text    Text
Location        Packet  Start   Expire  (text expire = page granulepos)
00              B       04      14
00              D       19      23
00              C       09      24
00              F       27      34
00              E       26      37
00              G       35      47
00              H       42      54
00              A       00      59
51              I       51      66

So B, D, C, F, E, G, and H are all defined before A, building a FIFO (first in first out) buffer in the player. Encoders should limit the extent of this behavior to reduce the necessary buffer size on the player side by prematurely expiring captions and recreating them periodically.

The screen should not be updated with the new captions until they've all been processed to prevent "flicker". New caption data to the same position will scroll the previous data upwards with no line breaks separating them (unless present in text).