Timed Divs HTML

Revision as of 21:24, 4 January 2009


Introduction

This page specifies a subclass of HTML documents that serves as a time-aligned text format for audio-visual content. We call the format "timed divs within HTML", or TDHT. It is intended to be used only in a World Wide Web context, that is, wherever Web browser functionality is available. Use cases for the format are subtitles, captions, annotations, and other time-aligned text as listed at http://wiki.xiph.org/index.php/OggText#Categories_of_Text_Codecs .

TDHT resembles W3C TimedText DFXP in many respects, but unlike DFXP it does not reinvent HTML, CSS, and effects; it uses existing HTML, CSS, and JavaScript for these. The purpose of DFXP is to create a web-independent exchange format for timed text, which is why it cannot be specified directly as a subpart of HTML.

TDHT, in contrast, is HTML with a minimal number of changes. It is parsable by any HTML parser and works with CSS and JavaScript, so no new functionality has to be defined for it.


File Extension

Files in this format use the MIME type text/x-tdht.

Files in this format should have the file extension .tdht.
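
As an illustration only, the mapping between the extension and the MIME type could be registered in a script as follows. This is a minimal sketch using Python's standard mimetypes module; the file name episode.tdht is a made-up example, and nothing here is part of the TDHT specification itself.

```python
import mimetypes

# Register the TDHT mapping described above: .tdht <-> text/x-tdht.
mimetypes.add_type("text/x-tdht", ".tdht")

# "episode.tdht" is a hypothetical file name used only for illustration.
mime_type, encoding = mimetypes.guess_type("episode.tdht")
print(mime_type)
```

A Web server would need an equivalent mapping in its own configuration so that browsers receive the intended content type.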


The TDHT format restrictions from HTML

TDHT files are time-aligned text. This means that blocks of text are associated with time intervals and that those blocks can be sought by time.
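
Time-based seeking can be sketched as a lookup over cues sorted by start time. The snippet below is a rough illustration, not part of the specification: the cue tuples, the millisecond units, the function name cue_at, and the assumptions that cues do not overlap and that the end time is exclusive are all choices made for this example. The values are borrowed from the subtitle example on this page.

```python
from bisect import bisect_right

# Hypothetical in-memory cue list: (start_ms, end_ms, text), sorted by start.
CUES = [
    (70, 2270, "Previously on..."),
    (2280, 4270, "We had an agreement to keep things casual."),
    (4280, 6660, "Susan made her feelings clear."),
]
STARTS = [cue[0] for cue in CUES]


def cue_at(position_ms):
    """Return the cue active at position_ms, or None if no cue is active.

    Assumes cues do not overlap and treats the end time as exclusive.
    """
    # Index of the last cue whose start time is <= position_ms.
    index = bisect_right(STARTS, position_ms) - 1
    if index >= 0 and position_ms < CUES[index][1]:
        return CUES[index]
    return None


print(cue_at(1000))
print(cue_at(2275))
```

The binary search keeps each lookup logarithmic in the number of cues, which matters mostly for long files.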

Here is an example TDHT file for subtitles:

<html>
  <head>
    <title>Desperate Housewives - Season 5, Episode 6</title>
  </head>
  <body>
    <div start="00:00:00,070" end="00:00:02,270">
      <p>Previously on...</p>
    </div>
    <div start="00:00:02,280" end="00:00:04,270">
      <p>We had an agreement to keep things casual.</p>
    </div>
    <div start="00:00:04,280" end="00:00:06,660">
      <p>Susan made her feelings clear.</p>
    </div>
    <div start="00:00:06,800" end="00:00:10,100">
      <p>So if I was with another woman, that wouldn't bother you? No, it wouldn't.</p>
    </div>
  </body>
</html>
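
Because a TDHT file is plain HTML, a generic HTML parser is enough to read the timed blocks. The following is a minimal sketch using Python's standard html.parser. It assumes the HH:MM:SS,mmm timestamp form seen in the example above (the page does not define the format formally), and the names to_milliseconds, TimedDivParser, and parse_tdht are invented for this illustration.

```python
from html.parser import HTMLParser


def to_milliseconds(timestamp):
    """Convert 'HH:MM:SS,mmm' (the form used in the example) to milliseconds."""
    clock, millis = timestamp.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)


class TimedDivParser(HTMLParser):
    """Collects (start_ms, end_ms, text) for each div with start and end."""

    def __init__(self):
        super().__init__()
        self.cues = []
        self._open = None  # [start_ms, end_ms, text_parts] of the div being read
        self._depth = 0    # div nesting depth, counting the timed div itself

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._open is not None:
            self._depth += 1  # a div nested inside the current timed div
            return
        attributes = dict(attrs)
        if "start" in attributes and "end" in attributes:
            self._open = [to_milliseconds(attributes["start"]),
                          to_milliseconds(attributes["end"]), []]
            self._depth = 1

    def handle_data(self, data):
        if self._open is not None:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if tag != "div" or self._open is None:
            return
        self._depth -= 1
        if self._depth == 0:
            start, end, parts = self._open
            # Collapse the indentation whitespace around the text.
            self.cues.append((start, end, " ".join("".join(parts).split())))
            self._open = None


def parse_tdht(document):
    parser = TimedDivParser()
    parser.feed(document)
    parser.close()
    return parser.cues


SAMPLE = """<html>
  <body>
    <div start="00:00:00,070" end="00:00:02,270">
      <p>Previously on...</p>
    </div>
    <div start="00:00:02,280" end="00:00:04,270">
      <p>We had an agreement to keep things casual.</p>
    </div>
  </body>
</html>"""

for cue in parse_tdht(SAMPLE):
    print(cue)
```

Times are kept as integer milliseconds to avoid floating-point rounding when cues are compared.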

TDHT is currently based on HTML4.01 (http://www.w3.org/TR/html401/), but it should also be possible to base it on HTML5 (http://www.whatwg.org/specs/web-apps/current-work/), which is still in flux.

TDHT imposes the following restrictions on HTML4.01:


1. The body element

In HTML4.01, the body element is defined as follows:

<!ELEMENT BODY O O (%block;|SCRIPT)+ +(INS|DEL) -- document body -->
<!ATTLIST BODY
  %attrs;                              -- %coreattrs, %i18n, %events --
  onload          %Script;   #IMPLIED  -- the document has been loaded --
  onunload        %Script;   #IMPLIED  -- the document has been removed --
  >

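In TDHT1.0 we restrict it to the following, so that only DIV and SCRIPT elements may appear directly inside the body and the INS and DEL inclusions are removed: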

<!ELEMENT BODY O O (DIV|SCRIPT)+ -- document body -->
<!ATTLIST BODY
  %attrs;                              -- %coreattrs, %i18n, %events --
  onload          %Script;   #IMPLIED  -- the document has been loaded --
  onunload        %Script;   #IMPLIED  -- the document has been removed --
  >
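
The restricted content model can be checked mechanically. The sketch below, again using Python's standard html.parser, reports any element other than DIV or SCRIPT that appears directly inside the body. It is an approximation rather than a DTD validator: it assumes the body tags are written out explicitly (HTML4.01 allows them to be omitted), and the names BodyChecker and check_body are invented for this example.

```python
from html.parser import HTMLParser

# HTML4.01 elements without an end tag; they are never pushed on the stack.
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}
ALLOWED_BODY_CHILDREN = {"div", "script"}


class BodyChecker(HTMLParser):
    """Records violations of the (DIV|SCRIPT)+ content model of BODY."""

    def __init__(self):
        super().__init__()
        self.stack = []     # currently open elements, innermost last
        self.children = 0   # number of elements seen directly inside body
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if self.stack and self.stack[-1] == "body":
            self.children += 1
            if tag not in ALLOWED_BODY_CHILDREN:
                self.errors.append(f"<{tag}> is not allowed directly inside <body>")
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop up to the matching open element, closing omitted end tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if self.stack and self.stack[-1] == "body" and data.strip():
            self.errors.append("text is not allowed directly inside <body>")


def check_body(document):
    checker = BodyChecker()
    checker.feed(document)
    checker.close()
    errors = list(checker.errors)
    if checker.children == 0:
        errors.append("<body> must contain at least one <div> or <script>")
    return errors


VALID = '<html><body><div start="00:00:00,070" end="00:00:02,270"><p>Hi</p></div></body></html>'
INVALID = "<html><body><p>Hi</p></body></html>"

print(check_body(VALID))
print(check_body(INVALID))
```

Elements nested inside a div, such as the p elements in the subtitle example, are not affected by this restriction and are therefore not reported.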