What is it
OggUVS is an uncompressed video codec for Ogg. It's a simple way to store and transfer uncompressed video within an Ogg container. It is similar to OggYUV and OggRGB, but is intended to support both formats.
This is a work in progress and not a final proposal.
Why is it
This format is intended to be used as an interchange format. It is also useful for storing time-synced decoded audio/video, as opposed to using RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files. It is intended to be less complex to use than RIFF/AVI.
A stream is composed of a main header packet, one comment packet, zero or more additional header packets, and one or more data packets. At this time, one additional header packet is specified to describe the data packet layout. This packet SHOULD be present in all streams. Data packets are of fixed length as specified in the main header. A special zero-length data packet with the EOS flag set is permitted. Data packets must contain exactly one image. This stream format is field based rather than frame based. It supports only two fields per frame, and one field per data packet. Packets (fields) must appear in temporal order.
Unless otherwise noted, all multi-byte fields use network byte order (big endian). The first packet in a stream MUST be the main header packet. The second packet MUST be the comment packet. Extra header packets MAY be included after the comment packet, provided they are counted in the 'Number of extra headers' field of the main header. The packets that follow MUST all be data packets.
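The packet ordering above can be sketched as a small classifier. This is only an illustration: `packet_kind` is a hypothetical helper, not part of the format, and it assumes the extra header count has already been read from the main header.

```python
def packet_kind(index: int, extra_headers: int) -> str:
    """Classify a packet by its zero-based position in the stream.

    extra_headers is the 'Number of extra headers' value from the main header.
    """
    if index < 0 or extra_headers < 0:
        raise ValueError("index and extra_headers must be non-negative")
    if index == 0:
        return "main header"
    if index == 1:
        return "comment"
    if index < 2 + extra_headers:
        return "extra header"
    return "data"


# A stream that declares one extra header (the data layout packet):
kinds = [packet_kind(i, extra_headers=1) for i in range(5)]
```

With one extra header declared, the first data packet is the fourth packet in the stream.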
Main Header Packet
The main header packet MUST be the first packet in the stream.
Bits  Value   Field
32    "UVS "  Codec Identifier Word 0
32    " "     Codec Identifier Word 1
16    [uint]  Version Major (incrementing breaks backwards compatibility)
16    [uint]  Version Minor (backwards compatible, i.e., more supported format IDs)
16    [uint]  Display Width
16    [uint]  Display Height
16    [uint]  Pixel Aspect Ratio Numerator
16    [uint]  Pixel Aspect Ratio Denominator
16    [uint]  Field Rate Numerator
16    [uint]  Field Rate Denominator
32    [uint]  Timebase (Hz)
32    [uint]  Field Image Size (in bytes)
32    [uint]  Number of extra headers
32    [enum]  Colorspace
31    [uint]  Reserved
1     [flg]   Interlaced
32    [enum]  Layout ID
- The 'Number of extra headers' field counts the number of headers following the comment packet and preceding the data.
- Field Rate and Timebase: The Timebase field is used to change the time base of the granule position. The special value 0 indicates the value (1/Field Rate). If the Field Rate values are set to zero, the content uses a variable field rate. In all cases the absolute field time is determined by (granulepos/Timebase). At least one of these two values must be declared. Examples of valid descriptions of 29.98 fps video:
- Field Rate = 2998/100, Timebase = 90000, granulepos of first frame = 3002
- Field Rate = 0/0, Timebase = 90000, granulepos of first frame = 3002
- Field Rate = 2998/100, Timebase = 0, granulepos of first frame = 1
- The Layout ID field is used to describe the layout of the image buffer in memory. This provides an easy means of selecting amongst common storage methods. If this field is set to zero, a Data Layout Packet MUST be included in the stream, and the contents of that packet should be parsed to determine the image buffer layout. The valid values for this field are:
Value       Short Name           Description
0x32315659  OGGUVS_FMT_YV12      8-bpp Y plane, followed by 8-bpp 2×2 V and U planes.
0x56555949  OGGUVS_FMT_IYUV      8-bpp Y plane, followed by 8-bpp 2×2 U and V planes.
0x32595559  OGGUVS_FMT_YUY2      UV downsampled 2:1 horizontally, ordered Y0 U0 Y1 V0
0x59565955  OGGUVS_FMT_UYVY      UV downsampled 2:1 horizontally, ordered U0 Y0 V0 Y1
0x55595659  OGGUVS_FMT_YVYU      UV downsampled 2:1 horizontally, ordered Y0 V0 Y1 U0
0x80808081  OGGUVS_FMT_RGB24DIB  8 bits per component, stored BGR, rows aligned to a 32 bit boundary, rows stored bottom first.
0x80808082  OGGUVS_FMT_RGB32DIB  8 bits per component, stored BGRx (x is don't care), rows stored bottom first.
0x80808083  OGGUVS_FMT_ARGBDIB   8 bits per component, stored BGRA, rows stored bottom first.
By convention, layouts with a registered fourcc should use that fourcc for this value. Other formats should set the MSB of each byte and use an OggUVS-specific value. Layout IDs with 0xFF as the most significant byte are considered application specific.
- The Colorspace field is used to identify all colorspaces supported by this format and is defined as follows:
Value       Short Name               Description
0x00000001  OGGUVS_CSP_UNSPEC_RGB    Unspecified R'G'B'
0x00000002  OGGUVS_CSP_UNSPEC_YCBCR  Unspecified Y'CbCr
Other useful colorspaces could go here.
The unspecified colorspaces are intended to be used only when the actual colorspace used is not known. This situation arises when getting decompressed frames from proprietary codecs, for instance. Applications should make every effort to properly identify the colorspace and use the proper value in this field.
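As an illustration of the main header fields described above, here is a minimal Python sketch that decodes the packet with the standard `struct` module. It rests on several assumptions that the draft does not settle: the function names are hypothetical, the Interlaced flag is taken to be the least significant bit of its 32-bit word, the second identifier word is skipped rather than checked, and a zero Field Rate numerator or denominator is treated as "variable field rate".

```python
import struct
from fractions import Fraction

# Big-endian layout transcribed from the main header table: two 4-byte
# identifier words, eight 16-bit fields, then six 32-bit words (48 bytes).
MAIN_HEADER = struct.Struct(">4s4s8H6I")


def parse_main_header(packet: bytes) -> dict:
    """Decode the fixed-size main header into a dictionary (hypothetical helper)."""
    if len(packet) < MAIN_HEADER.size:
        raise ValueError("main header packet is too short")
    (word0, _word1, ver_major, ver_minor, disp_w, disp_h,
     par_num, par_den, rate_num, rate_den,
     timebase, image_size, extra_headers, colorspace, flags,
     layout_id) = MAIN_HEADER.unpack_from(packet)
    if word0 != b"UVS ":
        raise ValueError("missing 'UVS ' codec identifier")
    # Assumption: a zero numerator or denominator means variable field rate.
    field_rate = Fraction(rate_num, rate_den) if rate_num and rate_den else None
    if field_rate is None and timebase == 0:
        raise ValueError("neither Field Rate nor Timebase is declared")
    return {
        "version": (ver_major, ver_minor),
        "display_size": (disp_w, disp_h),
        "pixel_aspect": (par_num, par_den),
        "field_rate": field_rate,
        "timebase": timebase,
        "image_size": image_size,
        "extra_headers": extra_headers,
        "colorspace": colorspace,
        # Assumption: the 1-bit flag is the least significant bit of the word.
        "interlaced": bool(flags & 1),
        "layout_id": layout_id,
    }


def layout_id_kind(layout_id: int) -> str:
    """Classify a Layout ID following the conventions described above."""
    if layout_id == 0:
        return "see data layout packet"
    if layout_id >> 24 == 0xFF:
        return "application specific"
    if layout_id & 0x80808080 == 0x80808080:
        return "OggUVS specific"
    return "fourcc"
```

The listed fourcc values correspond to the four ASCII characters read as a little-endian integer, for example `int.from_bytes(b"YV12", "little") == 0x32315659`.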
The comment packet MUST be present and MUST be the second packet in the stream.
The contents of the comment packet are undefined at this time; they will probably follow whatever Theora uses.
Data Layout Packet
The data layout packet MUST be included if the 'Layout ID' field in the main header packet is set to zero. The data layout packet SHOULD be included in all streams. If a 'Layout ID' field is specified, the data layout packet MUST NOT be modified from its standard definition. Applications that have a native understanding of the storage format specified by the 'Layout ID' MAY parse the data layout packet, but are not required to.
Bits  Value   Field
32    0x1     Data Layout Header Packet ID
16    [uint]  Version Major (incrementing breaks backwards compatibility)
16    [uint]  Version Minor (backwards compatible, i.e., more supported format IDs)
16    [uint]  Luma Height
16    [uint]  Luma Width
16    [uint]  Chroma Height
16    [uint]  Chroma Width
-- Repeat all fields below this point for interlaced storage.
32    [uint]  Alpha channel offset
32    [uint]  Y/R channel offset
32    [uint]  U/G channel offset
32    [uint]  V/B channel offset
32    [sint]  Alpha Y Stride
32    [sint]  Y/R Y Stride
32    [sint]  U/G Y Stride
32    [sint]  V/B Y Stride
32    [sint]  Y/R X Stride
32    [sint]  U/G X Stride
32    [sint]  V/B X Stride
- This layout packet is for 8 bit per channel formats only.
- The width and height fields reflect the storage size, not the displayed size, of the field.
- The offset fields specify the offset, in bytes, from the start of the data packet to the top leftmost sample for the specified channel.
- The Y stride field indicates the number of bytes to add to the current position to get the corresponding sample one row down. For the alpha channel, this value should be set to zero if the channel is not present.
- The X stride field indicates the number of bytes to add to the current position to get the corresponding sample one pixel to the right. For the alpha channel, this value should be set to zero if the channel is not present.
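The offset and stride fields combine into a simple address calculation. The sketch below is illustrative only: `sample_position` is a hypothetical helper, and for chroma channels the `x` and `y` arguments are assumed to be in chroma sample units rather than display pixels.

```python
def sample_position(offset: int, x_stride: int, y_stride: int,
                    x: int, y: int) -> int:
    """Byte position of sample (x, y) of one channel, relative to the same
    base that the channel offset is measured from."""
    return offset + y * y_stride + x * x_stride


# Values for a 4-pixel-wide YUY2 field: bytes run Y0 U0 Y1 V0 Y2 U1 Y3 V1,
# so each row is 8 bytes long.
row_bytes = 8
luma = sample_position(0, 2, row_bytes, x=1, y=1)  # Y1 on the second row
cb = sample_position(1, 4, row_bytes, x=1, y=1)    # U1 on the second row
cr = sample_position(3, 4, row_bytes, x=1, y=1)    # V1 on the second row
print(luma, cb, cr)  # prints: 10 13 15
```

Because the Y stride is signed, the same arithmetic covers bottom-first storage: moving one row down with a negative stride yields a smaller byte position.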
Implementation Notes: Great care must be exercised when using the layout packet directly. The following are a few checks that should be made to validate the data:
For all channels:
  Width <= abs(Y_Stride)
For alpha and luma channels:
  offset + y_stride*luma_h >= 0
  offset + y_stride*luma_h <= image size (from main header)
For chroma channels:
  offset + y_stride*chroma_h >= 0
  offset + y_stride*chroma_h <= image size (from main header)
More to be added later.
32  'FLD0'  Field 0 (Top) header
..  [data]  Data for whole field

32  'FLD1'  Field 1 (Bottom) header
..  [data]  Data for whole field
Discussion: The length of the data packet must be exactly equal to the image size specified in the main header plus four bytes for the field header.
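The length rule can be checked while splitting a data packet into its field header and image data. This is a sketch under stated assumptions: `split_data_packet` is a hypothetical helper, `image_size` is the Field Image Size from the main header, and the `eos` argument stands for the Ogg end-of-stream flag, which the caller is assumed to obtain from its Ogg library.

```python
FIELD_IDS = {b"FLD0": 0, b"FLD1": 1}  # 0 = top field, 1 = bottom field


def split_data_packet(packet: bytes, image_size: int, eos: bool = False):
    """Return (field_number, image_bytes), or None for the empty EOS packet."""
    if len(packet) == 0:
        if eos:
            return None  # the special zero-length packet is allowed at EOS
        raise ValueError("empty data packet without the EOS flag")
    if len(packet) != image_size + 4:
        raise ValueError("data packet length does not match image size + 4")
    tag = packet[:4]
    if tag not in FIELD_IDS:
        raise ValueError("unknown field header")
    return FIELD_IDS[tag], packet[4:]


field, image = split_data_packet(b"FLD1" + bytes(6), image_size=6)
```

Here `field` is 1 (the bottom field) and `image` holds the six image bytes.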
Encapsulation in Ogg
The time base of the granule position is defined in the main header packet, and may vary from stream to stream.
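A sketch of converting a granule position to a presentation time, using the Timebase and Field Rate rules from the main header section. The treatment of Timebase = 0 is an interpretation, not something the draft states outright: it is assumed here that each granule position unit then lasts 1/Field Rate seconds. The function name `field_time` is hypothetical.

```python
from fractions import Fraction


def field_time(granulepos: int, timebase: int,
               rate_num: int, rate_den: int) -> Fraction:
    """Absolute field time in seconds for a given granule position."""
    if timebase != 0:
        return Fraction(granulepos, timebase)
    if rate_num == 0 or rate_den == 0:
        raise ValueError("neither Field Rate nor Timebase is declared")
    # Assumption: with Timebase = 0, one granule unit is 1/Field Rate seconds.
    return Fraction(granulepos * rate_den, rate_num)


# The three example descriptions of 29.98 fps video from the main header notes:
t_a = field_time(3002, 90000, 2998, 100)
t_b = field_time(3002, 90000, 0, 0)
t_c = field_time(1, 0, 2998, 100)
```

Under that assumption, all three descriptions place the field at roughly 0.03336 seconds; exact fractions are used to avoid accumulating rounding error over long streams.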
Predefined Layout Packets
The following packets are defined as the standard layout packets for the various defined formats. For those formats that declare a fourcc, it is illegal to modify the values of the layout packet. The following abbreviations are used in the formulae below:
- disp_h: Display Height, from main header packet
- disp_w: Display Width, from main header packet
- l_h: Luma height, from data layout packet
- l_w: Luma width, from data layout packet
- c_h: Chroma height, from data layout packet
- c_w: Chroma width, from data layout packet
- Layout ID: 0x32315659
l_h = (disp_h + 1) & ~1
l_w = (disp_w + 1) & ~1
c_h = l_h / 2
c_w = l_w / 2
a_offset = 0
yr_offset = 0
ug_offset = l_h * l_w + c_h * c_w
vb_offset = l_h * l_w
a_y_stride = 0
yr_y_stride = l_w
ug_y_stride = c_w
vb_y_stride = c_w
a_x_stride = 0
yr_x_stride = 1
ug_x_stride = 1
vb_x_stride = 1
- Layout ID: 0x32595559
l_h = disp_h
l_w = (disp_w + 1) & ~1
c_h = l_h
c_w = l_w / 2
a_offset = 0
yr_offset = 0
ug_offset = 1
vb_offset = 3
a_y_stride = 0
yr_y_stride = l_w * 2
ug_y_stride = l_w * 2
vb_y_stride = l_w * 2
a_x_stride = 0
yr_x_stride = 2
ug_x_stride = 4
vb_x_stride = 4
- Layout ID: 0x80808081
l_h = disp_h
l_w = disp_w
c_h = disp_h
c_w = disp_w
a_offset = 0
yr_offset = 2 + (disp_h-1) * -yr_y_stride
ug_offset = 1 + (disp_h-1) * -yr_y_stride
vb_offset = 0 + (disp_h-1) * -yr_y_stride
a_y_stride = 0
yr_y_stride = -1 * (((disp_w*3)+3) & ~3)
ug_y_stride = -1 * (((disp_w*3)+3) & ~3)
vb_y_stride = -1 * (((disp_w*3)+3) & ~3)
a_x_stride = 0
yr_x_stride = 3
ug_x_stride = 3
vb_x_stride = 3
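For reference, the formulae for the two YUV layouts (0x32315659 and 0x32595559) can be evaluated with a short Python sketch. The function names and the dictionary representation are illustrative choices, not part of the format; the keys reuse the abbreviations from the formulae.

```python
def yv12_layout(disp_w: int, disp_h: int) -> dict:
    """Standard layout values for Layout ID 0x32315659 (YV12)."""
    l_h = (disp_h + 1) & ~1  # round up to an even number of rows
    l_w = (disp_w + 1) & ~1  # round up to an even number of columns
    c_h, c_w = l_h // 2, l_w // 2
    return {
        "l_h": l_h, "l_w": l_w, "c_h": c_h, "c_w": c_w,
        "a_offset": 0, "yr_offset": 0,
        "ug_offset": l_h * l_w + c_h * c_w,  # U plane follows the V plane
        "vb_offset": l_h * l_w,              # V plane follows the Y plane
        "a_y_stride": 0, "yr_y_stride": l_w,
        "ug_y_stride": c_w, "vb_y_stride": c_w,
        "a_x_stride": 0, "yr_x_stride": 1,
        "ug_x_stride": 1, "vb_x_stride": 1,
    }


def yuy2_layout(disp_w: int, disp_h: int) -> dict:
    """Standard layout values for Layout ID 0x32595559 (YUY2)."""
    l_h = disp_h
    l_w = (disp_w + 1) & ~1
    row = l_w * 2  # two bytes per pixel in the packed Y0 U0 Y1 V0 order
    return {
        "l_h": l_h, "l_w": l_w, "c_h": l_h, "c_w": l_w // 2,
        "a_offset": 0, "yr_offset": 0, "ug_offset": 1, "vb_offset": 3,
        "a_y_stride": 0, "yr_y_stride": row,
        "ug_y_stride": row, "vb_y_stride": row,
        "a_x_stride": 0, "yr_x_stride": 2,
        "ug_x_stride": 4, "vb_x_stride": 4,
    }


layout = yv12_layout(disp_w=5, disp_h=3)
```

For a 5×3 display size, the YV12 storage size rounds up to 6×4 luma samples and 3×2 chroma samples, which places the V plane at byte 24 and the U plane at byte 30.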