https://wiki.xiph.org/api.php?action=feedcontributions&user=Arc&feedformat=atomXiphWiki - User contributions [en]2024-03-28T18:37:35ZUser contributionsMediaWiki 1.40.1https://wiki.xiph.org/index.php?title=IceShare&diff=3030IceShare2005-12-07T17:33:14Z<p>Arc: </p>
<hr />
<div><H1><font color=#FF0000>Help Wanted</Font></H1><br />
Before getting into what IceShare is, I'd like to ask for help getting it finished sooner. We (the Xiph.org Foundation) could really use some help from people with one or more of the following skills:<ul><br />
<li>technical documents writer (for libraries, protocols, etc)<br />
<li>crypto guru - the IceShare system needs some help with hashing and encryption in general<br />
<li>python programmers - to help complete the prototype suite<br />
<li>player integration - getting this system available to users<br />
</ul><br />
<br />
Please contact Arc <arc@Xiph.org> if you can help. Thanks!<br />
<br />
<br />
== What is it? ==<br />
<br />
IceShare is a library that distributes Ogg streams on a pseudo-P2P network. It is heavily based on BitTorrent, but works on the Ogg page level, and unlike PeerCast it works with files as well as continuous streams.<br />
<br />
It's designed to allow musicians, video producers, radio and television stations, or anyone else to inexpensively distribute audio/video on the web. It's intended to be initiated from websites, with links to icet:// URLs. It is not designed for P2P searching of the kind that Gnutella, Kazaa, and eDonkey provide; however, websites may be set up to easily search content on one or more IceTracker servers.<br />
<br />
<br />
== Overview ==<br />
<br />
IceShare is called pseudo-P2P because the network relies on a traditional client-server model for managing transfers between IceShare peers on the network.<br />
<br />
The media players are the level at which P2P takes place: listeners who have available upstream bandwidth can help distribute the content they're listening to among other listeners. This helps Icecast servers scale non-linearly to much larger listener loads and reduces the bandwidth requirements for hosting static Ogg multimedia on websites.<br />
<br />
The IceShare library allows these features to be easily added to media players, including support for seeking to "not downloaded yet" parts of the media and available bandwidth detection/reporting for multi-bitrate streams.<br />
<br />
IceTracker is a server that keeps track of who's actively sharing certain media and each of their send/receive ratios. IceTracker helps direct IceShare users to better hosts and tracks individual users' bandwidth and level of participation to reward high bandwidth/participation users with faster peers. IceTracker servers track users anonymously by a DSA key generated by each IceShare client.<br />
<br />
Icecast connects to an IceTracker as a client to provide live stream information (page numbers, checksums, etc.) and to receive guidance on which less-participating listeners to drop when bandwidth is tight.<br />
<br />
<br />
== Media Players ==<br />
<br />
URLs in the form icet://&lt;icetracker&gt;:&lt;port&gt;/&lt;media&gt; direct the media player to connect to an IceTracker using the IceT protocol via the IceShare library. IceShare will state that it needs the specified media.<br />
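<br />
As a rough illustration of the URL form above, here is a minimal Python sketch that splits an icet:// URL into tracker host, port, and media path using only the standard library. The function name and the example host are made up for this page; nothing here is part of the IceShare API.<br />

```python
from urllib.parse import urlsplit

def parse_icet_url(url):
    """Split an icet:// URL into (tracker host, port, media path).

    Hypothetical helper: the name and the error handling are assumptions.
    """
    parts = urlsplit(url)
    if parts.scheme != "icet":
        raise ValueError("not an icet:// URL: %r" % url)
    if not parts.hostname or parts.port is None:
        raise ValueError("icet:// URL needs an explicit tracker host and port")
    # The path names the media on that tracker; drop the leading slash.
    return parts.hostname, parts.port, parts.path.lstrip("/")

# tracker.example.org is a placeholder, not a real IceTracker.
print(parse_icet_url("icet://tracker.example.org:8000/shows/episode1.ogg"))
```
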
<br />
The IceTracker for that media should then respond with general information about the media in question, how many pages it has, how long its playtime is (or if it is continuous), and generally how long it should take to transfer it. This information should allow the media player to set up the seek bar and know how much it should buffer before beginning play.<br />
<br />
IceTracker should then start directing IceShare to hosts from which pieces of the media can be accessed. IceShare does not know how much of the media each of those hosts has, since many may have only partial transfers. IceTracker specifies which page, or set of pages, to download from each host. IceShare responds with a quick "I got it" for each page, thereby letting IceTracker know that the reported page is ready to be shared with others. This also helps IceTracker keep track of latency and bandwidth between peers so that it can provide the client with better hosts.<br />
<br />
If the player seeks to a not-yet-downloaded part of the media, IceShare can express this to IceTracker, which will change its transfer focus to the seek point and beyond. In this way, especially for long pieces of media, the whole file does not have to be transferred to access a specific section of it.<br />
<br />
IceShare also provides media players access to its "page table". The media player can use this to reflect media transfer stats in the seek bar, perhaps using an alternative background color to indicate sections of the media which have been downloaded.<br />
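<br />
To make the seek-bar idea concrete, here is a small Python sketch that collapses a page table into contiguous downloaded ranges a player could shade. It assumes the page table can be viewed as one boolean per page; the real IceShare page table layout isn't defined on this page, so treat the representation and the function name as hypothetical.<br />

```python
def downloaded_ranges(page_table):
    """Collapse a per-page "have it?" list into (first, last) page ranges."""
    ranges = []
    start = None
    for pageno, have in enumerate(page_table):
        if have and start is None:
            start = pageno                      # a downloaded run begins
        elif not have and start is not None:
            ranges.append((start, pageno - 1))  # the run ended on the previous page
            start = None
    if start is not None:
        ranges.append((start, len(page_table) - 1))
    return ranges

# Pages 0-1, 3, and 6-8 have arrived; the rest are still missing.
table = [True, True, False, True, False, False, True, True, True]
print(downloaded_ranges(table))
```

A player would scale each range by the total page count to decide which part of the seek bar to draw in the alternative color.<br />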
<br />
IceShare handles incoming <A HREF="IceHTTP">HTTP</A> connections from peers. Information about uploads on the P2P network is available to the media player but is not necessary. The media player can tune the level of participation, limiting the amount of bandwidth or length of time a piece of media is available. For the most part, it's in the user's interest to participate as much as they're able to, since this will earn them faster access to other media through the same IceTracker.<br />
<br />
A slightly-extended <A HREF="IceHTTP">HTTP/1.1</A> is used to specify page-ranges. IceShare should also support byte-ranges for traditional HTTP download agents which are attempting to resume a lost transfer.<br />
<br />
<br />
== Media Distributors ==<br />
<br />
IceShare can also be used to distribute original media on the P2P network. A distribution client can use IceShare to connect to an IceTracker and inform it of the new media's statistics. This client should have enough upstream bandwidth to send the first few copies by itself, after which those who have downloaded it should begin sharing the load.<br />
<br />
Icecast is a good example of a distribution client. It can use IceShare to inform IceTracker of its streams and continue to send it page information for each of its ongoing streams. Icecast servers using IceShare will still need enough bandwidth to send at least one (preferably more) stream to listeners, who can then redistribute it to other listeners.<br />
<br />
IceTracker will allow IceShare clients to request current listeners and total "hits" for any media that it is tracking. This can be used by Icecast to accurately track listeners.<br />
<br />
<br />
== Alternative Streams ==<br />
<br />
IceShare also includes support for alternative bitrates and codecs to published media. These alternatives can be used to meet the needs of each individual user on the network. For instance, a stream could be provided in 64kbps Vorbis, 24kbps Speex, and 24kbps Vorbis (in that order). Those with enough bandwidth will receive the default 64kbps Vorbis stream, while modem users will switch to either the Speex or the low bitrate Vorbis based on their ability to support Speex. This makes it possible for every IceShare peer to receive a continuous stream in the highest quality format their software and network connection allows them.<br />
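<br />
The selection rule in that example can be sketched in a few lines of Python. This assumes the published order is the preference order ("in that order") and that a peer simply takes the first alternative it can both decode and afford; how IceShare really measures bandwidth or negotiates codecs isn't specified here, and all the names below are made up.<br />

```python
from collections import namedtuple

# Hypothetical record for one published alternative.
Alternative = namedtuple("Alternative", "codec kbps")

def pick_stream(alternatives, available_kbps, supported_codecs):
    """Return the first listed alternative the peer can decode and afford."""
    for alt in alternatives:
        if alt.codec in supported_codecs and alt.kbps <= available_kbps:
            return alt
    return None  # nothing fits this peer

# The example from the text: 64kbps Vorbis, 24kbps Speex, 24kbps Vorbis.
ALTERNATIVES = [
    Alternative("vorbis", 64),
    Alternative("speex", 24),
    Alternative("vorbis", 24),
]

print(pick_stream(ALTERNATIVES, 128, {"vorbis", "speex"}))  # broadband listener
print(pick_stream(ALTERNATIVES, 40, {"vorbis"}))            # modem user without Speex
```
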
<br />
<br />
== Payload Protocols ==<br />
<br />
One of the interesting things about IceShare is that it's designed to use many different protocols for the actual file transfer. This allows a combination of protocols to be used, even in between the same two hosts, in the effort to get the media deployed in the most efficient manner possible.<br />
<br />
This also allows IceShare to be combined with other P2P systems. For instance, if someone chose to ignore PeerCast's GPL license addendum, which doesn't allow modified clients to connect to their metaserver, a broadcaster could choose to stream to both PeerCast and IceShare, and any IceShare peer with a PeerCast plugin could be sent to grab parts or all of the stream from that P2P network. The same is also true for BitTorrent, Gnutella, or any other P2P system.<br />
<br />
Here are the requirements for an IceShare Payload Protocol:<br />
* You must be able to request a path (local URL) for a media<br />
** Each media must be at the same path for every protocol<br />
** Local media path sorting and arranging is up to the local implementation<br />
*** All media may be in one directory or many, even layers deep<br />
*** Media may be moved locally (after notifying the Tracker)<br />
* You must be able to specify a range of data for download<br />
** Supporting only byte ranges limits a protocol's usage<br />
*** A peer may not necessarily know where in a given binary stream a series of pages belongs<br />
*** Byte ranges are only known for pages in a continuous series from the start of the media<br />
** Time ranges are more usable, but may result in wasted bandwidth<br />
** Supporting (Ogg) Page ranges allows total usability <br />
** Range can be requested in any way necessary for the protocol<br />
*** Some payload protocols may even append it to the media's path<br />
* Binary data must be able to be transferred by some means<br />
** Ogg Pages do not need to be separated in this data<br />
*** libogg2 provides a fast and efficient manner for separating Pages by the receiver<br />
** Delivery of any segment does not need to be guaranteed<br />
*** The IceTracker will make sure everyone gets a specific Page<br />
*** Retransmission by IceTracker is only supported with Page granularity<br />
** Error detection does not need to be implemented<br />
*** Each Ogg Page has its own CRC checksum in its header<br />
** Order '' should '' be guaranteed for data within a single Ogg Page<br />
*** Ogg Pages may be sent complete in a single packet for this<br />
*** Many packets may carry sequential Pages with packet ordering provided in a small header<br />
* The IceTracker does not need to know any details of a payload protocol<br />
** It will learn new protocol names when support is advertised by peers<br />
** It will learn their strengths and weaknesses by building statistical data<br />
** Protocols will be used based on the bandwidth, reliability, and timing needs of a peer<br />
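<br />
One way to read the requirements above is as a small plugin interface: a protocol takes a media path and a range, and hands back raw bytes that may hold several unseparated Ogg pages. The Python sketch below is only that reading; none of the class or method names come from IceShare, and the in-memory "protocol" exists purely to exercise the interface.<br />

```python
from abc import ABC, abstractmethod

class PayloadProtocol(ABC):
    """Hypothetical payload protocol plugin; not the IceShare API."""

    name = "abstract"
    # Kinds of range this protocol can express: "page", "time", and/or "byte".
    range_kinds = ()

    @abstractmethod
    def fetch(self, host, path, first_page, last_page):
        """Return raw bytes holding the requested pages, not necessarily separated."""

class InMemoryProtocol(PayloadProtocol):
    """Toy protocol that serves pages from a dict instead of a network."""

    name = "memory"
    range_kinds = ("page",)

    def __init__(self, store):
        # store maps (host, path) to a list of page byte strings
        self.store = store

    def fetch(self, host, path, first_page, last_page):
        pages = self.store[(host, path)]
        # Pages are concatenated; the receiver is expected to split them again.
        return b"".join(pages[first_page:last_page + 1])

proto = InMemoryProtocol({("peer1", "/a.ogg"): [b"p0", b"p1", b"p2"]})
print(proto.name, proto.fetch("peer1", "/a.ogg", 1, 2))
```

The same path is used regardless of protocol, which matches the rule that each media must be at the same path for every protocol.<br />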
<br />
== Security Model ==<br />
<br />
Three layers of "security" are provided by the IceShare system to ensure data is transferred without errors, without alteration by peers, and without alteration of trusted content from trackers.<br />
<br />
== What's the Holdup? ==<br />
<br />
While we're apparently fairly close to wrapping up this baby and starting some massive plugin coding, there are vital things missing from other Xiph projects which this needs. <br />
<br />
Specifically: <br />
* <strike>OggFile</strike> [[OggStream]] granule handling needs to be decided on so we can get the IceT timing specs to match <br />
* <strike>we need to get the final scoop on discontinuous bitstreams</strike> done and integrated into libogg2<br />
* a fully functional public tracker needs to be written (even if it's just in Python for now). <br />
* it makes a lot of sense for the plugins to be packaged with <strike>OggFile</strike> OggStream as a joint distribution effort.<br />
<br />
Current estimates, given the commitments from various Xiph developers and the pace of development, are that IceShare will begin getting deployed in summer <strike>2004</strike> 2005. <br />
<br />
''' Updated: 1/10/05 ''' -- I've had to delay IceShare in order to get [[OggStream]] (formerly OggFile) ready. This explains why this project is so overdue.<br />
<br />
If you're reading this page and thinking "damned, that's awesome!" and want to speed up the above timeline, we can always use volunteers. Even if you don't know any code, but perhaps you're good at HTML or graphic design, or can do no more than help test the system, we could use your help. Email Arc <arc@xiph.org> for more information.<br />
<br />
== Discussion ==<br />
Discussion has been moved to the Talk page to keep load times down. Please add comments to [[Talk:IceShare]].</div>Archttps://wiki.xiph.org/index.php?title=OggWrit&diff=3094OggWrit2005-11-16T05:12:56Z<p>Arc: /* Application Support */</p>
<hr />
<div>== Introduction ==<br />
<br />
Ogg Writ is a text phrase codec. While its primary purpose is to embed subtitles or captions in a [[Theora]] stream, its design makes it useful for many other purposes. It could provide lyrics to a song encoded in [[Vorbis]], a transcript to a political debate or oral history recording encoded in [[Speex]], or even incorporate a live chat session as part of a continuous video stream.<br />
<br />
One of the unique aspects of Writ is its discontinuous nature: unlike other Ogg codecs, the granule ranges that separate packets affect may overlap. See the Granules and Muxing section below for how this works.<br />
<br />
== SVN ==<br />
<br />
Current Ogg Writ development is on Xiph CVS as package "writ". It's being developed to use libogg2, so you'll need both to work on it. The reference encoder and decoder are available as part of the py-ogg2 package which is available on Xiph SVN at http://svn.xiph.org/trunk/py-ogg2/<br />
<br />
== Application Support ==<br />
<br />
Writ has been endorsed by Xiph as a timed-text codec. It is used by example code, but because its implementation depends on the yet-unreleased libogg2, it is not supported by any end-user applications at this time.<br />
<br />
== Format ==<br />
Writ has been designed so that encoders/decoders can support a bare minimum and be fully compatible with future minor versions. Each minor version adds a new feature, some building on others, adding a new header packet and likely a new field to each body packet. <br />
<br />
Decoders should ignore header packets beyond what they were written to support and also ignore extra fields in data packets beyond their current version. This allows new features to be added without requiring all software, or even most software, to support them. <br />
<br />
 Header Packet 0 (BOS, 15 bytes):<br />
8 0x00 (Packet ID, Header 0)<br />
32 "writ" (LSB 0x74697277) (Codec Identification)<br />
8 version (unsigned int, 0 = Alpha)<br />
8 minor version (unsigned int)<br />
32 granulerate_numerator (unsigned int)<br />
32 granulerate_denominator (unsigned int)<br />
<br />
Data Packet (each):<br />
8 0xFF (Packet ID, Data Packet)<br />
64 granule_start (signed integer)<br />
32 granule_duration (unsigned integer)<br />
8 text_length (unsigned integer)<br />
** text_string (variable-length UTF-8 string)<br />
<br />
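The two layouts above map directly onto Python's struct module. This is a minimal sketch, not the reference encoder from py-ogg2: it assumes little-endian fields (as the "LSB" note suggests) and that text_length counts UTF-8 bytes rather than characters, and the function names are made up. The listed header fields add up to 120 bits, so the packed header is 15 bytes long.<br />

```python
import struct

# Little-endian, no padding: packet id, "writ" magic, version, minor version,
# granulerate numerator, granulerate denominator.
HEADER0 = struct.Struct("<B4sBBII")

def pack_header0(version, minor, rate_num, rate_den):
    """Build Header Packet 0 (hypothetical helper name)."""
    return HEADER0.pack(0x00, b"writ", version, minor, rate_num, rate_den)

def pack_phrase_v0(granule_start, granule_duration, text):
    """Build a minor-version-0 data packet holding a single text string."""
    data = text.encode("utf-8")
    if len(data) > 255:
        raise ValueError("text does not fit the 8-bit length field")
    # Assumption: text_length is a byte count.
    head = struct.pack("<BqIB", 0xFF, granule_start, granule_duration, len(data))
    return head + data

def unpack_phrase_v0(packet):
    """Parse a minor-version-0 data packet into (start, duration, text)."""
    pid, start, duration, length = struct.unpack_from("<BqIB", packet)
    if pid != 0xFF:
        raise ValueError("not a data packet")
    return start, duration, packet[14:14 + length].decode("utf-8")

print(HEADER0.size, pack_header0(0, 0, 1, 1))
print(unpack_phrase_v0(pack_phrase_v0(5, 10, "Hello World!")))
```

<br />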
<br />
'''Minor version 1''' adds multiple language support <br />
<br />
Header Packet 1 (Language Definition, 8+ bytes) :<br />
8 0x01 (Packet ID, SubHeader 1)<br />
32 "writ" (LSB 0x74697277) (Codec Identification)<br />
8 num_languages (unsigned int)<br />
[repeated 1+num_languages times] :<br />
8 language_length (unsigned int)<br />
** language_string (0+language_length rfc3066)<br />
8 language_desc_length (unsigned int)<br />
** language_desc_string (0+language_desc_length UTF-8)<br />
<br />
Data Packet (each):<br />
8 0xFF (Packet ID, Data Packet)<br />
64 granule_start (signed integer)<br />
32 granule_duration (unsigned integer)<br />
   [repeated 1+num_languages times] :<br />
8 text_length (unsigned integer)<br />
** text_string (variable-length UTF-8 string)<br />
<br />
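Here is a short Python sketch of the language definition header. It follows the "1+num_languages" wording, so the stored field is the language count minus one; text lengths are again assumed to be byte counts, and the function names are hypothetical.<br />

```python
def pack_header1(languages):
    """Build Header Packet 1 from a list of (rfc3066_tag, description) pairs."""
    if not 1 <= len(languages) <= 256:
        raise ValueError("need between 1 and 256 languages")
    out = bytearray(b"\x01writ")
    out.append(len(languages) - 1)        # field stores the count minus one
    for tag, desc in languages:
        tag_bytes = tag.encode("ascii")
        desc_bytes = desc.encode("utf-8")
        out.append(len(tag_bytes))        # append() raises ValueError above 255
        out += tag_bytes
        out.append(len(desc_bytes))
        out += desc_bytes
    return bytes(out)

def unpack_header1(packet):
    """Parse Header Packet 1 back into a list of (tag, description) pairs."""
    if packet[:5] != b"\x01writ":
        raise ValueError("not a Writ language header")
    count = packet[5] + 1
    pos = 6
    languages = []
    for _ in range(count):
        n = packet[pos]
        tag = packet[pos + 1:pos + 1 + n].decode("ascii")
        pos += 1 + n
        n = packet[pos]
        desc = packet[pos + 1:pos + 1 + n].decode("utf-8")
        pos += 1 + n
        languages.append((tag, desc))
    return languages

print(pack_header1([("en", "English"), ("es", "Spanish")]))
```

<br />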
<br />
'''Minor version 2''' adds text window support<br />
<br />
Header Packet 2 (Window Definition, 10+ bytes) :<br />
8 0x02 (Packet ID, SubHeader 2)<br />
32 "writ" (LSB 0x74697277) (Codec Identification)<br />
16 location_scale_x (unsigned int)<br />
16 location_scale_y (unsigned int)<br />
8 num_windows (unsigned int)<br />
   [if (num_windows > 0) repeated num_windows times] :<br />
** location_x (variable length, see below)<br />
** location_y (variable length, see below)<br />
** location_width (variable length, see below)<br />
** location_height (variable length, see below)<br />
2 alignment_x (horizontal alignment, see below)<br />
2 alignment_y (vertical alignment, see below)<br />
<br />
Data Packet (each):<br />
8 0xFF (Packet ID, Data Packet)<br />
64 granule_start (signed integer)<br />
32 granule_duration (unsigned integer)<br />
   [repeated 1+num_languages times] :<br />
8 text_length (unsigned integer)<br />
** text_string (variable-length UTF-8 string)<br />
    [if (num_windows > 1)] :<br />
8 window_id (unsigned integer)<br />
<br />
<br />
=== Example Stream ===<br />
Header Packet 0<br />
version 0<br />
minor version 2<br />
granulenum 1<br />
granuledom 1<br />
\x00writ\x00\x02\x01\x00\x00\x00\x01\x00\x00\x00<br />
<br />
Header Packet 1<br />
num_languages 2<br />
Language 0:<br />
language en<br />
language_desc English<br />
Language 1:<br />
language es<br />
language_desc Spanish<br />
\x01writ\x01\x02en\x07English\x02es\x07Spanish<br />
<br />
Header Packet 2<br />
location_scale_x 4000 (12 bits)<br />
location_scale_y 270 ( 9 bits)<br />
num_windows 2<br />
Window 0:<br />
location_x 1<br />
location_y 2<br />
location_width 3<br />
location_height 1<br />
alignment_x 3 (Full)<br />
alignment_y 3 (Full)<br />
Window 1:<br />
location_x 5<br />
location_y 6<br />
location_width 7<br />
location_height 1<br />
alignment_x 3 (Full)<br />
alignment_y 3 (Full)<br />
\x02writ\xa0\x0f\x0e\x01\x02\x01\x20\x60\x00\x02\x7c\x01\x18\x38\x80\x00\x0f<br />
<br />
Phrase Packet:<br />
granule_start 5<br />
granule_duration 10<br />
Language 0: "Hello World!"<br />
Language 1: "Hola, Mundo!"<br />
window_id 0<br />
\xff\x05\x00\x00\x00\x00\x00\x00\x00\x0a\x00\x00\x00\x0cHello World!\x0cHola, Mundo!\x00<br />
<br />
Phrase Packet:<br />
granule_start 12<br />
granule_duration 15<br />
Language 0: "It's a beautiful day to be born."<br />
Language 1: "Es un día hermoso para que se llevará."<br />
window_id 1<br />
 \xff\x0c\x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x00\x20It's a beautiful day to be born.\x28Es un d\xc3\xada hermoso para que se llevar\xc3\xa1.\x01<br />
<br />
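As a sanity check on the example stream, the first phrase packet can be decoded with a few lines of Python. This is a sketch under two assumptions that the format section doesn't state outright: fields are little-endian and text_length counts bytes. The function name is made up, and the caller is expected to pass in the language and window counts learned from the headers.<br />

```python
import struct

def unpack_phrase(packet, language_count, window_count):
    """Parse a minor-version-2 phrase packet (hypothetical helper)."""
    pid, start, duration = struct.unpack_from("<BqI", packet)
    if pid != 0xFF:
        raise ValueError("not a data packet")
    pos = 13                              # 1 + 8 + 4 bytes consumed so far
    texts = []
    for _ in range(language_count):
        length = packet[pos]
        texts.append(packet[pos + 1:pos + 1 + length].decode("utf-8"))
        pos += 1 + length
    # window_id is only present when more than one window is defined.
    window_id = packet[pos] if window_count > 1 else None
    return start, duration, texts, window_id

# First phrase packet from the example stream above.
EXAMPLE = (b"\xff\x05\x00\x00\x00\x00\x00\x00\x00\x0a\x00\x00\x00"
           b"\x0cHello World!\x0cHola, Mundo!\x00")

print(unpack_phrase(EXAMPLE, language_count=2, window_count=2))
```

<br />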
<br />
=== Granules and Muxing ===<br />
<br />
The granulepos in Writ (as well as in future discontinuous codecs) marks the start time, not the end time, of the data in a given page. This greatly simplifies this specification.<br />
<br />
All Writ phrases will be provided at and given the granulepos of their start time, ordered by their start time within the logical bitstream.<br />
<br />
Phrase packets with long durations should be repeated in the logical bitstream at regular intervals to ensure that a player seeking to the middle of their duration will still see them. These packet copies will be identical to their original, including the start and duration fields; only the granulepos of the page they reside on will be incremented for each copy to place it forward on the logical bitstream.<br />
<br />
No two phrases can start on the same granule. On decoding, each packet's start granule is checked against already known packets. If a match is found the new packet is ignored. This prevents phrase copies from being interpreted as new phrases.<br />
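<br />
The duplicate check described above amounts to keying phrases by their start granule. A minimal Python sketch follows; it is not how libwrit is written, the class is hypothetical, and it assumes a phrase covers the half-open range from its start up to (but not including) start + duration.<br />

```python
class PhraseTracker:
    """Remembers phrases by start granule and ignores repeated copies."""

    def __init__(self):
        self.phrases = {}   # start granule -> (duration, text)

    def submit(self, start, duration, text):
        """Return True if the phrase is new, False if it is a repeated copy."""
        if start in self.phrases:
            return False
        self.phrases[start] = (duration, text)
        return True

    def active_at(self, granule):
        """Texts whose [start, start + duration) range covers the granule."""
        return [text
                for start, (duration, text) in sorted(self.phrases.items())
                if start <= granule < start + duration]

tracker = PhraseTracker()
print(tracker.submit(23, 36, "D"))   # first sighting of a long phrase
print(tracker.submit(23, 36, "D"))   # a later copy of the same phrase
print(tracker.active_at(24))
```
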
<br />
=== Seeking Example ===<br />
<br />
Here is a timeline (granule numbers at top, read down) of a sample stream:<br />
<br />
<- Granules -><br />
0000000000111111111122222222223333333333444444444455555555556666666666<br />
0123456789012345678901234567890123456789012345678901234567890123456789<br />
___________ ____________ ____________ ____________ _____________<br />
|_Vorbis____||_Vorbis_____||_Vorbis_____||_Vorbis_____||_Vorbis______|<br />
____________________ ____________________________________<br />
|_A____________>_____| |_D____________>______________>______|<br />
_________ ___ __________ ___________<br />
|_B_______| |_C_| |_E________| |_F_________|<br />
.<br />
 (note: these have been separated vertically for easy viewing only)<br />
.<br />
Packet Granule Description<br />
V H0 0 Vorbis Header 0x01 (page by itself, BOS)<br />
W H0 0 Writ Header 0 (page by itself, BOS)<br />
V H1 0 Vorbis Header 0x03<br />
V H2 0 Vorbis Header 0x05<br />
W H1 0 Writ Header 1 (Language Defs)<br />
W H2 0 Writ Header 2 (Window Defs)<br />
W A 0 Writ Phrase A<br />
W B 4 Writ Phrase B<br />
V 12 Vorbis 0-12<br />
W A 15 Writ Phrase A<br />
W C 19 Writ Phrase C<br />
W D 23 Writ Phrase D<br />
V 26 Vorbis 13-26<br />
W E 26 Writ Phrase E<br />
W D 38 Writ Phrase D<br />
V 40 Vorbis 27-40<br />
W F 41 Writ Phrase F<br />
W D 53 Writ Phrase D (EOS)<br />
V 54 Vorbis 41-54<br />
V 69 Vorbis 55-69 (EOS) <br />
<br />
<br />
Player begins decoding at beginning of stream. It reads the BOS pages for both codecs, then receives a non-BOS page. At this point it knows that it has two bitstreams to decode and has resolved that one is Writ and the other Vorbis. It'll continue processing the headers for both.<br />
<br />
Next it's going to find two Writ packets (phrases A and B) and toss them into libwrit. Then it'll get to the first Vorbis data page. It now has data from both bitstreams, and it knows (from the granulepos on the Vorbis page) that it has enough data to run until 12. If there were any Writ packets before 12 they would have appeared first.<br />
<br />
At around granule 9 the listener seeks forward to 24. This will cause a rapid seek through the file to find the first page with a granulepos greater than the seek position and begin decoding at that point.<br />
<br />
It'll find a Vorbis packet containing 13-26 (and not use 13-23) and Writ phrase E. Again, having data from both bitstreams it can begin playing. D would normally appear at granule 24 but is not known about yet. The player knows that this is only enough to decode until 26 so, knowing enough to prebuffer, continues reading the file as it plays the media.<br />
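<br />
The seek step can be mimicked over the table above with a binary search. The Python sketch below works on an in-memory list of data pages rather than on byte offsets in a file, and it relies on the simplification used throughout this example that both streams share the same granule units. The list layout and function name are made up.<br />

```python
from bisect import bisect_right

# (stream, granulepos, description) for the data pages in the table, in file order.
PAGES = [
    ("W", 0, "Phrase A"), ("W", 4, "Phrase B"), ("V", 12, "Vorbis 0-12"),
    ("W", 15, "Phrase A"), ("W", 19, "Phrase C"), ("W", 23, "Phrase D"),
    ("V", 26, "Vorbis 13-26"), ("W", 26, "Phrase E"), ("W", 38, "Phrase D"),
    ("V", 40, "Vorbis 27-40"), ("W", 41, "Phrase F"), ("W", 53, "Phrase D"),
    ("V", 54, "Vorbis 41-54"), ("V", 69, "Vorbis 55-69"),
]

def seek(pages, target):
    """Index of the first page whose granulepos is greater than target."""
    positions = [granulepos for _, granulepos, _ in pages]
    return bisect_right(positions, target)

i = seek(PAGES, 24)
print(PAGES[i], PAGES[i + 1])
```

Binary search works here only because pages are muxed in ascending granulepos order.<br />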
<br />
The next packet it finds is Writ phrase D. When it is passed to libwrit, the current granulepos is found to be within its duration, so it is displayed immediately, as it's prebuffered, without waiting for granulepos 38. It'll keep reading (because the maximum decoded Vorbis is still 26) and find a Vorbis packet with a 40 granulepos.<br />
<br />
As it nears 38 it'll read the file again and find Writ phrase F, which takes it out to 41. Vorbis only goes until 40, so it'll have to keep reading until the next Vorbis packet.<br />
<br />
Next it'll find Writ phrase D, which will be ignored by libwrit because phrase D is already known (matches start granule of earlier D), and the EOS on that page marks this as the last of the Writ stream.<br />
<br />
It'll continue reading for the next Vorbis data and find the packet for granule 54, followed by the Vorbis packet for granule 69. With that it's EOS, EOF, finished.<br />
<br />
This is of course a simplistic example; Writ and Vorbis will rarely have granules which equal the same amount of time. Each bitstream has its own granule -> time mapping which is calculated when muxing concurrent bitstreams within the file. So if there are 44100 Vorbis granules per second and only 4 Writ granules per second, pages would be ordered as W25 V297892 W31 V385932 W39 W41 V463057 etc. The logic used in the above example works after this granule-time mapping is calculated.<br />
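<br />
The page order quoted above can be checked by converting each granulepos to seconds with its stream's rate. This Python sketch uses exact fractions to avoid rounding; the label format ("W25", "V297892") is just the shorthand from the paragraph above, and the helper name is made up.<br />

```python
from fractions import Fraction

# Granules per second for each stream, as given in the text.
RATES = {"W": Fraction(4), "V": Fraction(44100)}

def page_time(label):
    """Convert a label such as 'V297892' to seconds as an exact fraction."""
    return Fraction(int(label[1:])) / RATES[label[0]]

ORDER = ["W25", "V297892", "W31", "V385932", "W39", "W41", "V463057"]
times = [page_time(label) for label in ORDER]

for label, t in zip(ORDER, times):
    print(label, float(t))

# The muxing rule: pages appear in ascending order of timestamp.
print(times == sorted(times))
```
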
<br />
== Past Discussion ==<br />
<br />
=== How does this get "encoded" and "merged"? ===<br />
<purple_haese> The muxing rule is pages are arranged in ascending order by the timestamp that is represented by their granulepos.<br />
<br />
=== For what reason is the 0x00 and 0xFF byte at the beginning of header and data packet respectively? ===<br />
<xiphmont> If, after a seek, I hand your codec a header packet, what does the codec do?<br />
<xiphmont> It does nothing. If I haven't told it to reset, the header is not data, it must ignore the header.<br />
<xiphmont> this eliminates a huge raft of special cases in Ogg seeking.</div>Archttps://wiki.xiph.org/index.php?title=Main_Page&diff=2101Main Page2005-11-16T00:51:16Z<p>Arc: /* Codecs */ it very well is being implemented.</p>
<hr />
<div>= Projects/Formats =<br />
<br />
In an effort to bring open-source ideals to the world of multimedia, the Xiph.org Foundation ([[XiphOrg]]) develops a multitude of amazing products. <br />
<br />
== Container Formats ==<br />
<br />
* [[Ogg]]: Media container. This is our native format and the recommended container for Xiph codecs.<br />
* [[OggSkeleton]]: Skeleton information on all logical content bitstreams in Ogg<br />
<br />
* [[SpeexRTP]]: RTP payload format for voice<br />
* [[VorbisRTP]]: RTP payload format for general audio<br />
* [[TheoraRTP]]: RTP payload format for video<br />
* [[XSPF]]: XML playlist format<br />
<br />
== Codecs ==<br />
* '''Compressed Codecs:'''<br />
** [[Vorbis]]: Audio codec with a [[Tremor|fixed point decoder]]<br />
** [[Theora]]: Video codec<br />
** [[FLAC]]: Free Lossless Audio Codec<br />
** [[Speex]]: Speech codec<br />
** [[OggMNG]]: A mapping for encapsulating the MNG animation format in Ogg<br />
* '''[[RawCodecs|Uncompressed Codecs]]:'''<br />
** Audio:<br />
*** [[OggPCM]]: Uncompressed PCM audio, primarily as an interchange codec<br />
*** [[OggPCM2|Another PCM format]]: Uncompressed PCM audio, under active development<br />
*** [[OggPCM3|Humorous PCM format]]: Uncompressed PCM audio - and a lot more!<br />
** Video:<br />
*** [[OggRGB]]: Uncompressed RGB video, primarily as an interchange codec, under active development <br />
*** [[OggYUV]]: Uncompressed YUV video, primarily as an interchange codec, under active development<br />
*** [[OggUVS]]: Uncompressed RGB and YUV video, under active development as an alternative to OggRGB and OggYUV.<br />
** Text & Hyperlinking:<br />
*** [[OggWrit]]: Text phrase codec (e.g. subtitles)<br />
*** [http://www.annodex.net/TR/draft-pfeiffer-cmml-01.html CMML]: Continuous Media Markup Language, used for [http://www.annodex.net/ Annodex] and subtitles (xine and gstreamer support)<br />
* '''Metadata Codecs:'''<br />
** [[Metadata]]: Arbitrary metadata stream format (vapourware so far)<br />
<br />
== Software for distributing media ==<br />
<br />
* [[Icecast]]: Streaming server<br />
* [[Ices]]: Source client for Icecast servers<br />
* [[IceShare]]: P2P content distribution<br />
<br />
== Other software ==<br />
<br />
* [[OggComponent/VorbisComponent]]: Wrappers to integrate Ogg-Vorbis into Mac OS X<br />
<br />
= Demonstrations =<br />
<br />
Want to hear Xiph in action? These projects are using our codecs, formats, or libraries.<br />
<br />
* [[VorbisStreams]]: Stations streaming with the Vorbis codec<br />
* [[Games that use Vorbis]]: Games using the Vorbis codec for music or sound effects<br />
* [[VorbisHardware]]: Hardware players using the Vorbis codec<br />
* [http://www.tversity.com TVersity Media Server]: A UPNP/AV compliant media server that uses the Ogg Vorbis libraries to transcode audio files to the Ogg Vorbis format.<br />
<br />
= Project management =<br />
<br />
* [[MonthlyMeeting]]<br />
* [[MailingLists]]<br />
* [[Bounties]]<br />
* [[HyperFish]]<br />
<br />
= Wiki internal =<br />
* [[Sandbox]]: Testbed for testing editing skills.<br />
* [[Translations]]: What about some translation work</div>Archttps://wiki.xiph.org/index.php?title=Main_Page&diff=2095Main Page2005-11-16T00:41:27Z<p>Arc: /* Codecs */ Welcome to codec spratl</p>
<hr />
<div>= Projects/Formats =<br />
<br />
In an effort to bring open-source ideals to the world of multimedia, the Xiph.org Foundation ([[XiphOrg]]) develops a multitude of amazing products. <br />
<br />
== Container Formats ==<br />
<br />
* [[Ogg]]: Media container. This is our native format and the recommended container for Xiph codecs.<br />
* [[OggSkeleton]]: Skeleton information on all logical content bitstreams in Ogg<br />
<br />
* [[SpeexRTP]]: RTP payload format for voice<br />
* [[VorbisRTP]]: RTP payload format for general audio<br />
* [[TheoraRTP]]: RTP payload format for video<br />
* [[XSPF]]: XML playlist format<br />
<br />
== Codecs ==<br />
* '''Compressed Codecs:'''<br />
** [[Vorbis]]: Audio codec with a [[Tremor|fixed point decoder]]<br />
** [[Theora]]: Video codec<br />
** [[FLAC]]: Free Lossless Audio Codec<br />
** [[Speex]]: Speech codec<br />
** [[OggMNG]]: A mapping for encapsulating the MNG animation format in Ogg<br />
* '''[[RawCodecs|Uncompressed Codecs]]:'''<br />
** Audio:<br />
*** [[OggPCM]]: Uncompressed PCM audio, primarily as an interchange codec<br />
*** [[OggPCM2|Another PCM format]]: Uncompressed PCM audio, under active development<br />
*** [[OggPCM3|Humorous PCM format]]: Uncompressed PCM audio - and a lot more!<br />
** Video:<br />
*** [[OggRGB]]: Uncompressed RGB video, primarily as an interchange codec, under active development <br />
*** [[OggYUV]]: Uncompressed YUV video, primarily as an interchange codec, under active development<br />
*** [[OggUVS]]: Uncompressed RGB and YUV video, under active development as an alternative to OggRGB and OggYUV.<br />
** Text & Hyperlinking:<br />
*** [[OggWrit]]: Text phrase codec (e.g. subtitles)<br />
*** [http://www.annodex.net/TR/draft-pfeiffer-cmml-01.html CMML]: Continuous Media Markup Language, foundation for [http://www.annodex.net/ Annodex]<br />
* '''Metadata Codecs:'''<br />
** [[Metadata]]: Arbitrary metadata stream format (vapourware so far)<br />
<br />
== Software for distributing media ==<br />
<br />
* [[Icecast]]: Streaming server<br />
* [[Ices]]: Source client for Icecast servers<br />
* [[IceShare]]: P2P content distribution<br />
<br />
== Other software ==<br />
<br />
* [[OggComponent/VorbisComponent]]: Wrappers to integrate Ogg-Vorbis into Mac OS X<br />
<br />
= Demonstrations =<br />
<br />
Want to hear Xiph in action? These projects are using our codecs, formats, or libraries.<br />
<br />
* [[VorbisStreams]]: Stations streaming with the Vorbis codec<br />
* [[Games that use Vorbis]]: Games using the Vorbis codec for music or sound effects<br />
* [[VorbisHardware]]: Hardware players using the Vorbis codec<br />
* [http://www.tversity.com TVersity Media Server]: A UPNP/AV compliant media server that uses the Ogg Vorbis libraries to transcode audio files to the Ogg Vorbis format.<br />
<br />
= Project management =<br />
<br />
* [[MonthlyMeeting]]<br />
* [[MailingLists]]<br />
* [[Bounties]]<br />
* [[HyperFish]]<br />
<br />
= Wiki internal =<br />
* [[Sandbox]]: Testbed for testing editing skills.<br />
* [[Translations]]: What about some translation work</div>Archttps://wiki.xiph.org/index.php?title=Main_Page&diff=2094Main Page2005-11-16T00:36:20Z<p>Arc: wiki is for non-pov information - some of this was also clearly false.</p>
<hr />
<div>= Projects/Formats =<br />
<br />
In an effort to bring open-source ideals to the world of multimedia, the Xiph.org Foundation ([[XiphOrg]]) develops a multitude of amazing products. <br />
<br />
== Container Formats ==<br />
<br />
* [[Ogg]]: Media container. This is our native format and the recommended container for Xiph codecs.<br />
* [[OggSkeleton]]: Skeleton information on all logical content bitstreams in Ogg<br />
<br />
* [[SpeexRTP]]: RTP payload format for voice<br />
* [[VorbisRTP]]: RTP payload format for general audio<br />
* [[TheoraRTP]]: RTP payload format for video<br />
* [[XSPF]]: XML playlist format<br />
<br />
== Codecs ==<br />
* '''Compressed Codecs:'''<br />
** [[Vorbis]]: Audio codec with a [[Tremor|fixed point decoder]]<br />
** [[Theora]]: Video codec<br />
** [[FLAC]]: Free Lossless Audio Codec<br />
** [[Speex]]: Speech codec<br />
** [[OggMNG]]: A mapping for encapsulating the MNG animation format in Ogg<br />
* '''[[RawCodecs|Uncompressed Codecs]]:'''<br />
** [[OggPCM]]: Uncompressed PCM audio, primarily as an interchange codec<br />
** [[OggPCM2]]: Uncompressed PCM audio, under active development<br />
** [[OggRGB]]: Uncompressed RGB video, primarily as an interchange codec, under active development <br />
** [[OggYUV]]: Uncompressed YUV video, primarily as an interchange codec, under active development<br />
** [[OggUVS]]: Uncompressed RGB and YUV video, under active development as an alternative to OggRGB and OggYUV.<br />
** [[OggWrit]]: Text phrase codec (e.g. subtitles)<br />
** [http://www.annodex.net/TR/draft-pfeiffer-cmml-01.html CMML]: Continuous Media Markup Language, foundation for [http://www.annodex.net/ Annodex]<br />
* '''Metadata Codecs:'''<br />
** [[Metadata]]: Arbitrary metadata stream format (vapourware so far)<br />
<br />
== Software for distributing media ==<br />
<br />
* [[Icecast]]: Streaming server<br />
* [[Ices]]: Source client for Icecast servers<br />
* [[IceShare]]: P2P content distribution<br />
<br />
== Other software ==<br />
<br />
* [[OggComponent/VorbisComponent]]: Wrappers to integrate Ogg Vorbis into Mac OS X<br />
<br />
= Demonstrations =<br />
<br />
Want to hear Xiph in action? These projects are using our codecs, formats, or libraries.<br />
<br />
* [[VorbisStreams]]: Stations streaming with the Vorbis codec<br />
* [[Games that use Vorbis]]: Games using the Vorbis codec for music or sound effects<br />
* [[VorbisHardware]]: Hardware players using the Vorbis codec<br />
* [http://www.tversity.com TVersity Media Server]: A UPNP/AV compliant media server that uses the Ogg Vorbis libraries to transcode audio files to the Ogg Vorbis format.<br />
<br />
= Project management =<br />
<br />
* [[MonthlyMeeting]]<br />
* [[MailingLists]]<br />
* [[Bounties]]<br />
* [[HyperFish]]<br />
<br />
= Wiki internal =<br />
* [[Sandbox]]: Testbed for testing editing skills.<br />
* [[Translations]]: What about some translation work</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=3170OggPCM Draft12005-11-16T00:31:49Z<p>Arc: </p>
<hr />
<div>'''OggPCM is currently a topic of [http://lists.xiph.org/pipermail/ogg-dev/2005-November/thread.html heated debate].<br />
<br />
The following is a draft spec.'''<br />
<br />
== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container. For the purposes of this document, the term PCM is used to describe a digital representation of an audio signal, where volume samples are taken at regular uniform intervals and then quantized into a digital (usually binary) code. A more complete definition of PCM and related terminology can be found at [[Wikipedia:Pulse-code_modulation|Wikipedia]].<br />
<br />
== Why is it ==<br />
The intention for this format is as an interchange format, for example for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video, as opposed to using RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files as was done during [[Theora]] development. It is intended to be less complex to use than either RIFF or AIFF.<br />
<br />
== Stream Description ==<br />
A stream is composed of a header packet, zero or more comment packets, and one or more data packets. Data packets may be of variable length, including zero. The only valid use of a zero length data packet is to mark the end of stream. Data packets must contain samples for all channels. That is to say, the length of a data packet must be a multiple of the number of channels times the storage size of a single sample. For instance, for a stream containing 6 channels at 2 bytes per sample, the length of the data packet must be a multiple of 12 bytes.<br />
<br />
The degenerate stream is a single header packet followed by the raw data packets. While this degenerate stream is not incredibly useful for long term storage or as a general purpose container, it is useful for applications where other data describing the stream is available out of band, for instance amongst cooperating applications in an inter-process communication scheme. Streams providing the extra defined comment packets are intended to be useful for long term storage and communication amongst diverse applications.<br />
<br />
== Packet Format ==<br />
Header and comment packets are processed as per the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future expandability which does not break the data format. An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
The header packet contains a field indicating the number of comment packets preceding the raw data. Applications must either parse or skip exactly this many packets, in addition to the header packet, before treating the stream as raw data.<br />
<br />
=== Header Packet ===<br />
Multibyte fields in the header packets are packed in big endian order, to be consistent with network byte order. A header packet contains the following fields: <br />
<br />
Packet 0, BOS, 16 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., more supported format IDs)<br />
8 [uint] Number of header packets preceding data<br />
8 [uint] Number of Channels, 0 = 256<br />
-<br />
16 [flag] Flags<br />
16 [enum] PCM Format ID<br />
-<br />
32 [uint] Sample Rate<br />
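<br />
The field layout above can be read with a fixed 16-byte big-endian unpack. The following is a minimal sketch in Python, not a reference implementation: the names <code>OggPcmHeader</code>, <code>parse_header</code> and <code>extra_packets</code> are hypothetical, and rejecting any major version other than 1 is an assumption based on the note that a major increment breaks compatibility.<br />

```python
import struct
from typing import NamedTuple

# Big endian, no padding: ID(1) + "PCM"(3) + major(1) + minor(1)
# + packet count(1) + channels(1) + flags(2) + format ID(2) + rate(4) = 16 bytes.
_HEADER = struct.Struct(">B3sBBBBHHI")


class OggPcmHeader(NamedTuple):  # hypothetical container, not part of the draft
    version_major: int
    version_minor: int
    extra_packets: int  # the 8-bit packet count field that precedes the channel count
    channels: int
    flags: int
    format_id: int
    sample_rate: int


def parse_header(packet: bytes) -> OggPcmHeader:
    """Decode the 16-byte stream header packet described in the table."""
    if len(packet) != _HEADER.size:
        raise ValueError("header packet must be exactly 16 bytes")
    (packet_id, codec, major, minor, extra,
     channels, flags, format_id, rate) = _HEADER.unpack(packet)
    if packet_id != 0x00 or codec != b"PCM":
        raise ValueError("not an OggPCM stream header")
    if major != 1:
        # Assumption: a reader written for version 1.x refuses other majors.
        raise ValueError("unsupported major version")
    # A stored channel count of 0 means 256 channels.
    return OggPcmHeader(major, minor, extra, channels or 256,
                        flags, format_id, rate)
```

For example, a header built with <code>struct.pack(">B3sBBBBHHI", 0, b"PCM", 1, 0, 1, 2, 0, 0x0009, 44100)</code> decodes to two channels of 16 bit little endian signed samples at 44100 Hz.<br />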
<br />
The flags field is defined as follows:<br />
Bit Description<br />
15 (MSB) Interleaved/Chunked - If set, data in the packets is "chunked" by channel. In a data<br />
packet containing 3 channels and 2 samples/channel, the chunked storage order would be<br />
001122. For the interleaved storage format (default), the order would be 012012.<br />
others Reserved<br />
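<br />
The two storage orders can be illustrated by listing which channel each sample slot in a packet belongs to. This is only a sketch; <code>CHUNKED_FLAG</code> and <code>storage_order</code> are hypothetical names, and the mask value follows from bit 15 being the most significant bit of the 16-bit flags field.<br />

```python
CHUNKED_FLAG = 0x8000  # bit 15 (MSB) of the 16-bit flags field


def storage_order(channels: int, samples_per_channel: int, flags: int) -> list:
    """Return the channel index of each sample slot, in packet order."""
    if flags & CHUNKED_FLAG:
        # Chunked: all samples of channel 0, then all of channel 1, ...
        return [ch for ch in range(channels)
                for _ in range(samples_per_channel)]
    # Interleaved (default): one sample from every channel, repeated.
    return [ch for _ in range(samples_per_channel)
            for ch in range(channels)]
```

With 3 channels and 2 samples per channel this yields 0, 0, 1, 1, 2, 2 when the flag is set and 0, 1, 2, 0, 1, 2 when it is clear, matching the orders given in the flag description.<br />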
<br />
Applications conforming to version 1.0 of this spec MUST:<ul><br />
<li>set all reserved flags to false (zero) when creating these streams.</li><br />
<li>preserve all values of all reserved flags when reading or modifying these streams, unless the application sets the minor version field to zero, in which case the reserved flags must be set to false as well.</li><br />
</ul><br />
<br />
=== Comment Packets ===<br />
At this time, there is only one defined comment packet.<br />
Comment Header Packet<br />
8 0x01 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
 -- Continues as [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment Vorbis's Comment Header]<br />
<br />
=== Data Packets ===<br />
Data packets have no header word. This is done to preserve the alignment of the data payload. The contents of the data packets are specified by a combination of the 'PCM Format ID' field and the 'Flags' field. The length of the data packet must be a multiple of the number of channels specified in the header times the storage size of a single sample, as specified by the 'PCM Format ID' field.<br />
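<br />
The length rule can be checked with a single modulo operation. The sketch below assumes the channel count and per-sample storage size have already been taken from the header; <code>samples_per_channel</code> is a hypothetical helper name.<br />

```python
def samples_per_channel(packet_len: int, channels: int, sample_bytes: int) -> int:
    """Return how many samples per channel a data packet holds.

    Raises ValueError if the length is not a multiple of
    channels * sample_bytes, which the draft does not allow.
    """
    unit = channels * sample_bytes
    if unit <= 0:
        raise ValueError("channels and sample_bytes must be positive")
    if packet_len % unit != 0:
        raise ValueError("data packet length is not a multiple of %d" % unit)
    return packet_len // unit
```

For 6 channels at 2 bytes per sample the unit is 12 bytes, so a 24 byte packet holds 2 samples per channel and a 25 byte packet is rejected. A zero length packet returns 0, consistent with its use as an end of stream marker.<br />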
<br />
=== Supported PCM Formats ===<br />
Formats are identified within a header packet by a 16 bit "format type" field. While<br />
most applications will treat this as an opaque type, it is possible to discern some<br />
information about the format from the value of this field itself. Specifically, the<br />
format's storage size, in bytes, and its byte ordering, can be discerned by parsing<br />
the lower 6 bits of the value. These values are exposed so that it is possible to<br />
extract individual samples without necessarily understanding the coding scheme involved.<br />
While for practical purposes, due to performance concerns, most applications will<br />
choose to operate on a buffer directly, it is nonetheless possible to work a sample<br />
at a time. <br />
<br />
Binary Value Meaning<br />
..xxxx00 N/A, or data not accurately described by this scheme.<br />
..xxxx01 Least significant byte first. Bytes are MS bit first.<br />
..xxxx10 Most significant byte first. Bytes are MS bit first.<br />
..xxxx11 Data is machine endian<br />
..0000xx Data can not be described by this bytepacking scheme.<br />
..0001xx Samples are stored using one byte per sample<br />
..0010xx Samples are stored using two bytes per sample<br />
..0011xx Samples are stored using three bytes per sample<br />
..0100xx Samples are stored using four bytes per sample<br />
..1000xx Samples are stored using eight bytes per sample<br />
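<br />
The bit patterns in the table above can be decoded with a mask and a shift. This is a minimal sketch; <code>describe_packing</code> and the strings used for the byte order are hypothetical choices, not names defined by the draft.<br />

```python
# Lowest two bits: byte order. None means "not described by this scheme".
_BYTE_ORDER = {0b00: None, 0b01: "little", 0b10: "big", 0b11: "machine"}


def describe_packing(format_id: int):
    """Return (bytes_per_sample, byte_order) from the lower 6 bits.

    bytes_per_sample is 0 when the data cannot be described by the
    byte packing scheme.
    """
    byte_order = _BYTE_ORDER[format_id & 0b11]
    bytes_per_sample = (format_id >> 2) & 0b1111
    return bytes_per_sample, byte_order
```

For instance, a format ID of 0x0009 (binary 001001 in the lower 6 bits) decodes to two bytes per sample, least significant byte first.<br />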
<br />
The remaining 10 bits describe the coding scheme used to convert the digital value<br />
to an audio signal. The following formats are defined for version 1.0 of this<br />
format. For purposes of attribution, it should be noted that these formats are the<br />
PCM formats supported by the Advanced Linux Sound Architecture (ALSA) project, and<br />
should be fairly comprehensive.<br />
<br />
Format ID Short Name Description<br />
-- Signed integer coding (0)<br />
0x0004 OGGPCM_FMT_S8 Signed integer 8 bit<br />
0x0009 OGGPCM_FMT_S16_LE Signed integer 16 bit little endian<br />
0x000A OGGPCM_FMT_S16_BE Signed integer 16 bit big endian<br />
0x000B OGGPCM_FMT_S16 Signed integer 16 bit machine endian<br />
0x000D OGGPCM_FMT_S24_3LE Signed integer 24 bit little endian<br />
0x000E OGGPCM_FMT_S24_3BE Signed integer 24 bit big endian<br />
0x0011 OGGPCM_FMT_S32_LE Signed integer 32 bit little endian<br />
0x0012 OGGPCM_FMT_S32_BE Signed integer 32 bit big endian<br />
0x0013 OGGPCM_FMT_S32 Signed integer 32 bit machine endian<br />
--<br />
-- Unsigned integer coding (1)<br />
0x0044 OGGPCM_FMT_U8 Unsigned integer 8 bit<br />
0x0049 OGGPCM_FMT_U16_LE Unsigned integer 16 bit little endian<br />
0x004A OGGPCM_FMT_U16_BE Unsigned integer 16 bit big endian<br />
0x004B OGGPCM_FMT_U16 Unsigned integer 16 bit machine endian<br />
0x004D OGGPCM_FMT_U24_3LE Unsigned integer 24 bit little endian<br />
0x004E OGGPCM_FMT_U24_3BE Unsigned integer 24 bit big endian<br />
0x0051 OGGPCM_FMT_U32_LE Unsigned integer 32 bit little endian<br />
0x0052 OGGPCM_FMT_U32_BE Unsigned integer 32 bit big endian<br />
0x0053 OGGPCM_FMT_U32 Unsigned integer 32 bit machine endian<br />
--<br />
-- IEEE Floating point coding (2)<br />
0x0091 OGGPCM_FMT_FLT_LE IEEE Float (-1,1) 32 bit little endian<br />
0x0092 OGGPCM_FMT_FLT_BE IEEE Float (-1,1) 32 bit big endian<br />
0x0093 OGGPCM_FMT_FLT IEEE Float (-1,1) 32 bit machine endian<br />
0x00A1 OGGPCM_FMT_FLT64_LE IEEE Float (-1,1) 64 bit little endian<br />
0x00A2 OGGPCM_FMT_FLT64_BE IEEE Float (-1,1) 64 bit big endian<br />
0x00A3 OGGPCM_FMT_FLT64 IEEE Float (-1,1) 64 bit machine endian<br />
--<br />
-- IEC958 coding (?) (3)<br />
0x00CD OGGPCM_FMT_IEC958_3LE IEC958 Subframe, 24 bit little endian<br />
0x00CE OGGPCM_FMT_IEC958_3BE IEC958 Subframe, 24 bit big endian<br />
0x00D1 OGGPCM_FMT_IEC958_LE IEC958 Subframe, 32 bit little endian<br />
0x00D2 OGGPCM_FMT_IEC958_BE IEC958 Subframe, 32 bit big endian<br />
 0x00D3 OGGPCM_FMT_IEC958 IEC958 Subframe, 32 bit machine endian<br />
--<br />
-- Mu-Law coding (4)<br />
0x0104 OGGPCM_FMT_MU_LAW Mu-Law<br />
--<br />
-- A-Law coding (5)<br />
0x0144 OGGPCM_FMT_A_LAW A-Law<br />
--<br />
-- ADPCM coding (6)<br />
0x0180 OGGPCM_FMT_ADPCM Ima-ADPCM <br />
--<br />
-- GSM coding (7)<br />
0x01C0 OGGPCM_FMT_GSM GSM<br />
--<br />
-- 24 bit signed integer in 32 bit storage (8)<br />
0x0211 OGGPCM_FMT_S24_LE Signed integer 24 bit little endian<br />
0x0212 OGGPCM_FMT_S24_BE Signed integer 24 bit big endian<br />
0x0213 OGGPCM_FMT_S24 Signed integer 24 bit machine endian<br />
--<br />
-- 24 bit unsigned integer in 32 bit storage (9)<br />
0x0251 OGGPCM_FMT_U24_LE Unsigned integer 24 bit little endian<br />
0x0252 OGGPCM_FMT_U24_BE Unsigned integer 24 bit big endian<br />
0x0253 OGGPCM_FMT_U24 Unsigned integer 24 bit machine endian<br />
--<br />
-- 20 bit signed integer in 24 bit storage (10)<br />
0x028D OGGPCM_FMT_S20_3LE Signed integer 20 bit little endian<br />
0x028E OGGPCM_FMT_S20_3BE Signed integer 20 bit big endian<br />
--<br />
-- 20 bit unsigned integer in 24 bit storage (11)<br />
0x02CD OGGPCM_FMT_U20_3LE Unsigned integer 20 bit little endian<br />
0x02CE OGGPCM_FMT_U20_3BE Unsigned integer 20 bit big endian<br />
--<br />
-- 18 bit signed integer in 24 bit storage (12)<br />
0x030D OGGPCM_FMT_S18_3LE Signed integer 18 bit little endian<br />
0x030E OGGPCM_FMT_S18_3BE Signed integer 18 bit big endian<br />
--<br />
-- 18 bit unsigned integer in 24 bit storage (13)<br />
0x034D OGGPCM_FMT_U18_3LE Unsigned integer 18 bit little endian<br />
0x034E OGGPCM_FMT_U18_3BE Unsigned integer 18 bit big endian<br />
--<br />
Other coding schemes supported by ALSA but not specified here:<br />
MPEG<br />
--<br />
TODO: ADPCM and GSM need further specification (or elimination) since these aren't really<br />
byte packed like the other formats here are.<br />
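<br />
Every format ID in the table can be rebuilt from its three parts: the coding scheme number shown in parentheses after each group heading, the storage size in bytes, and the two byte order bits. The sketch below shows that composition; <code>make_format_id</code> and the constant names are hypothetical.<br />

```python
# Byte order bits as used in the lower two bits of the format ID.
ORDER_NONE, ORDER_LE, ORDER_BE, ORDER_MACHINE = 0, 1, 2, 3


def make_format_id(coding: int, sample_bytes: int, order_bits: int) -> int:
    """Pack coding scheme (upper 10 bits), storage size (4 bits)
    and byte order (2 bits) into a 16-bit format ID."""
    if not (0 <= coding < 1 << 10 and 0 <= sample_bytes < 1 << 4
            and 0 <= order_bits < 1 << 2):
        raise ValueError("field out of range")
    return (coding << 6) | (sample_bytes << 2) | order_bits
```

As a check against the table, signed integer coding (0) with two bytes, little endian gives 0x0009; IEEE float coding (2) with eight bytes, big endian gives 0x00A2; and 18 bit unsigned in 24 bit storage (13) with three bytes, little endian gives 0x034D.<br />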
<br />
== Encapsulation in Ogg ==<br />
Following standard terminology for uncompressed audio, an audio frame is the collection of samples for all channels for a single sampling period. For example, an audio frame for a stereo signal is a pair of sample values for the left and right channels.<br />
<br />
The granulepos of an Ogg page indicates the presentation time of the last presentable element in the last complete packet within that page; for '''OggPCM''', a granule is an audio frame. The granule position specified is the total audio frames in the stream including the last complete packet in a page. Audio frames must not be split across packets. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is in samples as well as provides the current stream position to seeking routines. A truncated stream will still return the proper number of audio frames that can be decoded fully.</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=2092OggPCM Draft12005-11-16T00:30:20Z<p>Arc: Removing spam: Use the frontpage or a See Also section at the bottom</p>
<hr />
<div>'''Warning: OggPCM is currently a topic of [http://lists.xiph.org/pipermail/ogg-dev/2005-November/thread.html heated debate]. The following is *not* the final spec.'''<br />
<br />
== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container. For the purposes of this document, the term PCM is used to describe a digital representation of an audio signal, where volume samples are taken at regular uniform intervals and then quantized into a digital (usually binary) code. A more complete definition of PCM and related terminology can be found at [[Wikipedia:Pulse-code_modulation|Wikipedia]].<br />
<br />
== Why is it ==<br />
The intention for this format is as an interchange format, for example for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video, as opposed to using RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files as was done during [[Theora]] development. It is intended to be less complex to use than either RIFF or AIFF.<br />
<br />
== Stream Description ==<br />
A stream is composed of a header packet, zero or more comment packets, and one or more data packets. Data packets may be of variable length, including zero. The only valid use of a zero length data packet is to mark the end of stream. Data packets must contain samples for all channels. That is to say, the length of a data packet must be a multiple of the number of channels times the storage size of a single sample. For instance, for a stream containing 6 channels at 2 bytes per sample, the length of the data packet must be a multiple of 12 bytes.<br />
<br />
The degenerate stream is a single header packet followed by the raw data packets. While this degenerate stream is not incredibly useful for long term storage or as a general purpose container, it is useful for applications where other data describing the stream is available out of band, for instance amongst cooperating applications in an inter-process communication scheme. Streams providing the extra defined comment packets are intended to be useful for long term storage and communication amongst diverse applications.<br />
<br />
== Packet Format ==<br />
Header and comment packets are processed as per the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future expandability which does not break the data format. An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
The header packet contains a field indicating the number of comment packets preceding the raw data. Applications must either parse or skip exactly this many packets, in addition to the header packet, before treating the stream as raw data.<br />
<br />
=== Header Packet ===<br />
Multibyte fields in the header packets are packed in big endian order, to be consistent with network byte order. A header packet contains the following fields: <br />
<br />
Packet 0, BOS, 16 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., more supported format IDs)<br />
8 [uint] Number of header packets preceding data<br />
8 [uint] Number of Channels, 0 = 256<br />
-<br />
16 [flag] Flags<br />
16 [enum] PCM Format ID<br />
-<br />
32 [uint] Sample Rate<br />
<br />
The flags field is defined as follows:<br />
Bit Description<br />
15 (MSB) Interleaved/Chunked - If set, data in the packets is "chunked" by channel. In a data<br />
packet containing 3 channels and 2 samples/channel, the chunked storage order would be<br />
001122. For the interleaved storage format (default), the order would be 012012.<br />
others Reserved<br />
<br />
Applications conforming to version 1.0 of this spec MUST:<ul><br />
<li>set all reserved flags to false (zero) when creating these streams.</li><br />
<li>preserve all values of all reserved flags when reading or modifying these streams, unless the application sets the minor version field to zero, in which case the reserved flags must be set to false as well.</li><br />
</ul><br />
<br />
=== Comment Packets ===<br />
At this time, there is only one defined comment packet.<br />
Comment Header Packet<br />
8 0x01 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
 -- Continues as [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment Vorbis's Comment Header]<br />
<br />
=== Data Packets ===<br />
Data packets have no header word. This is done to preserve the alignment of the data payload. The contents of the data packets are specified by a combination of the 'PCM Format ID' field and the 'Flags' field. The length of the data packet must be a multiple of the number of channels specified in the header times the storage size of a single sample, as specified by the 'PCM Format ID' field.<br />
<br />
=== Supported PCM Formats ===<br />
Formats are identified within a header packet by a 16 bit "format type" field. While<br />
most applications will treat this as an opaque type, it is possible to discern some<br />
information about the format from the value of this field itself. Specifically, the<br />
format's storage size, in bytes, and its byte ordering, can be discerned by parsing<br />
the lower 6 bits of the value. These values are exposed so that it is possible to<br />
extract individual samples without necessarily understanding the coding scheme involved.<br />
While for practical purposes, due to performance concerns, most applications will<br />
choose to operate on a buffer directly, it is nonetheless possible to work a sample<br />
at a time. <br />
<br />
Binary Value Meaning<br />
..xxxx00 N/A, or data not accurately described by this scheme.<br />
..xxxx01 Least significant byte first. Bytes are MS bit first.<br />
..xxxx10 Most significant byte first. Bytes are MS bit first.<br />
..xxxx11 Data is machine endian<br />
..0000xx Data can not be described by this bytepacking scheme.<br />
..0001xx Samples are stored using one byte per sample<br />
..0010xx Samples are stored using two bytes per sample<br />
..0011xx Samples are stored using three bytes per sample<br />
..0100xx Samples are stored using four bytes per sample<br />
..1000xx Samples are stored using eight bytes per sample<br />
<br />
The remaining 10 bits describe the coding scheme used to convert the digital value<br />
to an audio signal. The following formats are defined for version 1.0 of this<br />
format. For purposes of attribution, it should be noted that these formats are the<br />
PCM formats supported by the Advanced Linux Sound Architecture (ALSA) project, and<br />
should be fairly comprehensive.<br />
<br />
Format ID Short Name Description<br />
-- Signed integer coding (0)<br />
0x0004 OGGPCM_FMT_S8 Signed integer 8 bit<br />
0x0009 OGGPCM_FMT_S16_LE Signed integer 16 bit little endian<br />
0x000A OGGPCM_FMT_S16_BE Signed integer 16 bit big endian<br />
0x000B OGGPCM_FMT_S16 Signed integer 16 bit machine endian<br />
0x000D OGGPCM_FMT_S24_3LE Signed integer 24 bit little endian<br />
0x000E OGGPCM_FMT_S24_3BE Signed integer 24 bit big endian<br />
0x0011 OGGPCM_FMT_S32_LE Signed integer 32 bit little endian<br />
0x0012 OGGPCM_FMT_S32_BE Signed integer 32 bit big endian<br />
0x0013 OGGPCM_FMT_S32 Signed integer 32 bit machine endian<br />
--<br />
-- Unsigned integer coding (1)<br />
0x0044 OGGPCM_FMT_U8 Unsigned integer 8 bit<br />
0x0049 OGGPCM_FMT_U16_LE Unsigned integer 16 bit little endian<br />
0x004A OGGPCM_FMT_U16_BE Unsigned integer 16 bit big endian<br />
0x004B OGGPCM_FMT_U16 Unsigned integer 16 bit machine endian<br />
0x004D OGGPCM_FMT_U24_3LE Unsigned integer 24 bit little endian<br />
0x004E OGGPCM_FMT_U24_3BE Unsigned integer 24 bit big endian<br />
0x0051 OGGPCM_FMT_U32_LE Unsigned integer 32 bit little endian<br />
0x0052 OGGPCM_FMT_U32_BE Unsigned integer 32 bit big endian<br />
0x0053 OGGPCM_FMT_U32 Unsigned integer 32 bit machine endian<br />
--<br />
-- IEEE Floating point coding (2)<br />
0x0091 OGGPCM_FMT_FLT_LE IEEE Float (-1,1) 32 bit little endian<br />
0x0092 OGGPCM_FMT_FLT_BE IEEE Float (-1,1) 32 bit big endian<br />
0x0093 OGGPCM_FMT_FLT IEEE Float (-1,1) 32 bit machine endian<br />
0x00A1 OGGPCM_FMT_FLT64_LE IEEE Float (-1,1) 64 bit little endian<br />
0x00A2 OGGPCM_FMT_FLT64_BE IEEE Float (-1,1) 64 bit big endian<br />
0x00A3 OGGPCM_FMT_FLT64 IEEE Float (-1,1) 64 bit machine endian<br />
--<br />
-- IEC958 coding (?) (3)<br />
0x00CD OGGPCM_FMT_IEC958_3LE IEC958 Subframe, 24 bit little endian<br />
0x00CE OGGPCM_FMT_IEC958_3BE IEC958 Subframe, 24 bit big endian<br />
0x00D1 OGGPCM_FMT_IEC958_LE IEC958 Subframe, 32 bit little endian<br />
0x00D2 OGGPCM_FMT_IEC958_BE IEC958 Subframe, 32 bit big endian<br />
 0x00D3 OGGPCM_FMT_IEC958 IEC958 Subframe, 32 bit machine endian<br />
--<br />
-- Mu-Law coding (4)<br />
0x0104 OGGPCM_FMT_MU_LAW Mu-Law<br />
--<br />
-- A-Law coding (5)<br />
0x0144 OGGPCM_FMT_A_LAW A-Law<br />
--<br />
-- ADPCM coding (6)<br />
0x0180 OGGPCM_FMT_ADPCM Ima-ADPCM <br />
--<br />
-- GSM coding (7)<br />
0x01C0 OGGPCM_FMT_GSM GSM<br />
--<br />
-- 24 bit signed integer in 32 bit storage (8)<br />
0x0211 OGGPCM_FMT_S24_LE Signed integer 24 bit little endian<br />
0x0212 OGGPCM_FMT_S24_BE Signed integer 24 bit big endian<br />
0x0213 OGGPCM_FMT_S24 Signed integer 24 bit machine endian<br />
--<br />
-- 24 bit unsigned integer in 32 bit storage (9)<br />
0x0251 OGGPCM_FMT_U24_LE Unsigned integer 24 bit little endian<br />
0x0252 OGGPCM_FMT_U24_BE Unsigned integer 24 bit big endian<br />
0x0253 OGGPCM_FMT_U24 Unsigned integer 24 bit machine endian<br />
--<br />
-- 20 bit signed integer in 24 bit storage (10)<br />
0x028D OGGPCM_FMT_S20_3LE Signed integer 20 bit little endian<br />
0x028E OGGPCM_FMT_S20_3BE Signed integer 20 bit big endian<br />
--<br />
-- 20 bit unsigned integer in 24 bit storage (11)<br />
0x02CD OGGPCM_FMT_U20_3LE Unsigned integer 20 bit little endian<br />
0x02CE OGGPCM_FMT_U20_3BE Unsigned integer 20 bit big endian<br />
--<br />
-- 18 bit signed integer in 24 bit storage (12)<br />
0x030D OGGPCM_FMT_S18_3LE Signed integer 18 bit little endian<br />
0x030E OGGPCM_FMT_S18_3BE Signed integer 18 bit big endian<br />
--<br />
-- 18 bit unsigned integer in 24 bit storage (13)<br />
0x034D OGGPCM_FMT_U18_3LE Unsigned integer 18 bit little endian<br />
0x034E OGGPCM_FMT_U18_3BE Unsigned integer 18 bit big endian<br />
--<br />
Other coding schemes supported by ALSA but not specified here:<br />
MPEG<br />
--<br />
TODO: ADPCM and GSM need further specification (or elimination) since these aren't really<br />
byte packed like the other formats here are.<br />
<br />
== Encapsulation in Ogg ==<br />
Following standard terminology for uncompressed audio, an audio frame is the collection of samples for all channels for a single sampling period. For example, an audio frame for a stereo signal is a pair of sample values for the left and right channels.<br />
<br />
The granulepos of an Ogg page indicates the presentation time of the last presentable element in the last complete packet within that page; for '''OggPCM''', a granule is an audio frame. The granule position specified is the total audio frames in the stream including the last complete packet in a page. Audio frames must not be split across packets. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is in samples as well as provides the current stream position to seeking routines. A truncated stream will still return the proper number of audio frames that can be decoded fully.</div>Archttps://wiki.xiph.org/index.php?title=RawCodecs&diff=3211RawCodecs2005-11-16T00:24:26Z<p>Arc: RawCodecs moved to OggRaw</p>
<hr />
<div>#REDIRECT [[OggRaw]]<br />
</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=2090OggPCM Draft12005-11-15T18:21:10Z<p>Arc: removed link to unrelated project and meaningless vote</p>
<hr />
<div>'''Warning: OggPCM is currently a topic of [http://lists.xiph.org/pipermail/ogg-dev/2005-November/thread.html heated debate]. The following is *not* the final spec.'''<br />
<br />
== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container. For the purposes of this document, the term PCM is used to describe a digital representation of an audio signal, where volume samples are taken at regular uniform intervals and then quantized into a digital (usually binary) code. A more complete definition of PCM and related terminology can be found at [[Wikipedia:Pulse-code_modulation|Wikipedia]].<br />
<br />
== Why is it ==<br />
The intention for this format is as an interchange format, for example for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video, as opposed to using RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files as was done during [[Theora]] development. It is intended to be less complex to use than either RIFF or AIFF.<br />
<br />
== Stream Description ==<br />
A stream is composed of a header packet, zero or more comment packets, and one or more data packets. Data packets may be of variable length, including zero. The only valid use of a zero length data packet is to mark the end of stream. Data packets must contain samples for all channels. That is to say, the length of a data packet must be a multiple of the number of channels times the storage size of a single sample. For instance, for a stream containing 6 channels at 2 bytes per sample, the length of the data packet must be a multiple of 12 bytes.<br />
<br />
The degenerate stream is a single header packet followed by the raw data packets. While this degenerate stream is not incredibly useful for long term storage or as a general purpose container, it is useful for applications where other data describing the stream is available out of band, for instance amongst cooperating applications in an inter-process communication scheme. Streams providing the extra defined comment packets are intended to be useful for long term storage and communication amongst diverse applications.<br />
<br />
== Packet Format ==<br />
Header and comment packets are processed as per the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future expandability which does not break the data format. An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
The header packet contains a field indicating the number of comment packets preceding the raw data. Applications must either parse or skip exactly this many packets, in addition to the header packet, before treating the stream as raw data.<br />
<br />
=== Header Packet ===<br />
Multibyte fields in the header packets are packed in big endian order, to be consistent with network byte order. A header packet contains the following fields: <br />
<br />
Packet 0, BOS, 16 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., more supported format IDs)<br />
8 [uint] Number of header packets preceding data<br />
8 [uint] Number of Channels, 0 = 256<br />
-<br />
16 [flag] Flags<br />
16 [enum] PCM Format ID<br />
-<br />
32 [uint] Sample Rate<br />
<br />
The flags field is defined as follows:<br />
Bit Description<br />
15 (MSB) Interleaved/Chunked - If set, data in the packets is "chunked" by channel. In a data<br />
packet containing 3 channels and 2 samples/channel, the chunked storage order would be<br />
001122. For the interleaved storage format (default), the order would be 012012.<br />
others Reserved<br />
<br />
Applications conforming to version 1.0 of this spec MUST:<ul><br />
<li>set all reserved flags to false (zero) when creating these streams.</li><br />
<li>preserve all values of all reserved flags when reading or modifying these streams, unless the application sets the minor version field to zero, in which case the reserved flags must be set to false as well.</li><br />
</ul><br />
<br />
=== Comment Packets ===<br />
At this time, there is only one defined comment packet.<br />
Comment Header Packet<br />
8 0x01 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
 -- Continues as [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment Vorbis's Comment Header]<br />
<br />
=== Data Packets ===<br />
Data packets have no header word. This is done to preserve the alignment of the data payload. The contents of the data packets are specified by a combination of the 'PCM Format ID' field and the 'Flags' field. The length of the data packet must be a multiple of the number of channels specified in the header times the storage size of a single sample, as specified by the 'PCM Format ID' field.<br />
<br />
=== Supported PCM Formats ===<br />
Formats are identified within a header packet by a 16 bit "format type" field. While<br />
most applications will treat this as an opaque type, it is possible to discern some<br />
information about the format from the value of this field itself. Specifically, the<br />
format's storage size, in bytes, and its byte ordering, can be discerned by parsing<br />
the lower 6 bits of the value. These values are exposed so that it is possible to<br />
extract individual samples without necessarily understanding the coding scheme involved.<br />
While for practical purposes, due to performance concerns, most applications will<br />
choose to operate on a buffer directly, it is nonetheless possible to work a sample<br />
at a time. <br />
<br />
Binary Value Meaning<br />
..xxxx00 N/A, or data not accurately described by this scheme.<br />
..xxxx01 Least significant byte first. Bytes are MS bit first.<br />
..xxxx10 Most significant byte first. Bytes are MS bit first.<br />
..xxxx11 Data is machine endian<br />
..0000xx Data can not be described by this bytepacking scheme.<br />
..0001xx Samples are stored using one byte per sample<br />
..0010xx Samples are stored using two bytes per sample<br />
..0011xx Samples are stored using three bytes per sample<br />
..0100xx Samples are stored using four bytes per sample<br />
..1000xx Samples are stored using eight bytes per sample<br />
<br />
The remaining 10 bits describe the coding scheme used to convert the digital value<br />
to an audio signal. The following formats are defined for version 1.0 of this<br />
format. For purposes of attribution, it should be noted that these formats are the<br />
PCM formats supported by the Advanced Linux Sound Architecture (ALSA) project, and<br />
should be fairly comprehensive.<br />
<br />
Format ID Short Name Description<br />
-- Signed integer coding (0)<br />
0x0004 OGGPCM_FMT_S8 Signed integer 8 bit<br />
0x0009 OGGPCM_FMT_S16_LE Signed integer 16 bit little endian<br />
0x000A OGGPCM_FMT_S16_BE Signed integer 16 bit big endian<br />
0x000B OGGPCM_FMT_S16 Signed integer 16 bit machine endian<br />
0x000D OGGPCM_FMT_S24_3LE Signed integer 24 bit little endian<br />
0x000E OGGPCM_FMT_S24_3BE Signed integer 24 bit big endian<br />
0x0011 OGGPCM_FMT_S32_LE Signed integer 32 bit little endian<br />
0x0012 OGGPCM_FMT_S32_BE Signed integer 32 bit big endian<br />
0x0013 OGGPCM_FMT_S32 Signed integer 32 bit machine endian<br />
--<br />
-- Unsigned integer coding (1)<br />
0x0044 OGGPCM_FMT_U8 Unsigned integer 8 bit<br />
0x0049 OGGPCM_FMT_U16_LE Unsigned integer 16 bit little endian<br />
0x004A OGGPCM_FMT_U16_BE Unsigned integer 16 bit big endian<br />
0x004B OGGPCM_FMT_U16 Unsigned integer 16 bit machine endian<br />
0x004D OGGPCM_FMT_U24_3LE Unsigned integer 24 bit little endian<br />
0x004E OGGPCM_FMT_U24_3BE Unsigned integer 24 bit big endian<br />
0x0051 OGGPCM_FMT_U32_LE Unsigned integer 32 bit little endian<br />
0x0052 OGGPCM_FMT_U32_BE Unsigned integer 32 bit big endian<br />
0x0053 OGGPCM_FMT_U32 Unsigned integer 32 bit machine endian<br />
--<br />
-- IEEE Floating point coding (2)<br />
0x0091 OGGPCM_FMT_FLT_LE IEEE Float (-1,1) 32 bit little endian<br />
0x0092 OGGPCM_FMT_FLT_BE IEEE Float (-1,1) 32 bit big endian<br />
0x0093 OGGPCM_FMT_FLT IEEE Float (-1,1) 32 bit machine endian<br />
0x00A1 OGGPCM_FMT_FLT64_LE IEEE Float (-1,1) 64 bit little endian<br />
0x00A2 OGGPCM_FMT_FLT64_BE IEEE Float (-1,1) 64 bit big endian<br />
0x00A3 OGGPCM_FMT_FLT64 IEEE Float (-1,1) 64 bit machine endian<br />
--<br />
-- IEC958 coding (?) (3)<br />
0x00CD OGGPCM_FMT_IEC958_3LE IEC958 Subframe, 24 bit little endian<br />
0x00CE OGGPCM_FMT_IEC958_3BE IEC958 Subframe, 24 bit big endian<br />
0x00D1 OGGPCM_FMT_IEC958_LE IEC958 Subframe, 32 bit little endian<br />
0x00D2 OGGPCM_FMT_IEC958_BE IEC958 Subframe, 32 bit big endian<br />
 0x00D3 OGGPCM_FMT_IEC958 IEC958 Subframe, 32 bit machine endian<br />
--<br />
-- Mu-Law coding (4)<br />
0x0104 OGGPCM_FMT_MU_LAW Mu-Law<br />
--<br />
-- A-Law coding (5)<br />
0x0144 OGGPCM_FMT_A_LAW A-Law<br />
--<br />
-- ADPCM coding (6)<br />
0x0180 OGGPCM_FMT_ADPCM Ima-ADPCM <br />
--<br />
-- GSM coding (7)<br />
0x01C0 OGGPCM_FMT_GSM GSM<br />
--<br />
-- 24 bit signed integer in 32 bit storage (8)<br />
0x0211 OGGPCM_FMT_S24_LE Signed integer 24 bit little endian<br />
0x0212 OGGPCM_FMT_S24_BE Signed integer 24 bit big endian<br />
0x0213 OGGPCM_FMT_S24 Signed integer 24 bit machine endian<br />
--<br />
-- 24 bit unsigned integer in 32 bit storage (9)<br />
0x0251 OGGPCM_FMT_U24_LE Unsigned integer 24 bit little endian<br />
0x0252 OGGPCM_FMT_U24_BE Unsigned integer 24 bit big endian<br />
0x0253 OGGPCM_FMT_U24 Unsigned integer 24 bit machine endian<br />
--<br />
-- 20 bit signed integer in 24 bit storage (10)<br />
0x028D OGGPCM_FMT_S20_3LE Signed integer 20 bit little endian<br />
0x028E OGGPCM_FMT_S20_3BE Signed integer 20 bit big endian<br />
--<br />
-- 20 bit unsigned integer in 24 bit storage (11)<br />
0x02CD OGGPCM_FMT_U20_3LE Unsigned integer 20 bit little endian<br />
0x02CE OGGPCM_FMT_U20_3BE Unsigned integer 20 bit big endian<br />
--<br />
-- 18 bit signed integer in 24 bit storage (12)<br />
0x030D OGGPCM_FMT_S18_3LE Signed integer 18 bit little endian<br />
0x030E OGGPCM_FMT_S18_3BE Signed integer 18 bit big endian<br />
--<br />
-- 18 bit unsigned integer in 24 bit storage (13)<br />
0x034D OGGPCM_FMT_U18_3LE Unsigned integer 18 bit little endian<br />
0x034E OGGPCM_FMT_U18_3BE Unsigned integer 18 bit big endian<br />
--<br />
Other coding schemes supported by ALSA but not specified here:<br />
MPEG<br />
--<br />
TODO: ADPCM and GSM need further specification (or elimination) since these aren't really<br />
byte packed like the other formats here are.<br />
<br />
== Encapsulation in Ogg ==<br />
Following standard terminology for uncompressed audio, an audio frame is the collection of samples for all channels for a single sampling period. For example, an audio frame for a stereo signal is a pair of sample values for the left and right channels.<br />
<br />
The granulepos of an Ogg page indicates the presentation time of the last presentable element in the last complete packet within that page; for '''OggPCM''', a granule is an audio frame. The granule position specified is the total audio frames in the stream including the last complete packet in a page. Audio frames must not be split across packets. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is in samples as well as provides the current stream position to seeking routines. A truncated stream will still return the proper number of audio frames that can be decoded fully.</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=2067OggPCM Draft12005-11-14T13:22:50Z<p>Arc: /* Header Packet */</p>
<hr />
<div>'''Warning: OggPCM is currently a topic of [http://lists.xiph.org/pipermail/ogg-dev/2005-November/thread.html heated debate]. The following is *not* the final spec.'''<br />
<br />
== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container. For the purposes of this document, the term PCM is used to describe a digital representation of an audio signal, where volume samples are taken at regular uniform intervals and then quantized into a digital (usually binary) code. A more complete definition of PCM and related terminology can be found at [[Wikipedia:Pulse-code_modulation|Wikipedia]].<br />
<br />
== Why is it ==<br />
This format is intended as an interchange format, for example for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video, as opposed to using RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files as was done during [[Theora]] development. It is intended to be less complex to use than either RIFF or AIFF.<br />
<br />
== Stream Description ==<br />
A stream is composed of a header packet, zero or more comment packets, and one or more data packets. Data packets may be of variable length, including zero. The only valid use of a zero length data packet is to mark the end of stream. Data packets must contain samples for all channels. That is to say, the length of a data packet must be a multiple of the number of channels times the storage size of a single sample. For instance, for a stream containing 6 channels at 2 bytes per sample, the length of each data packet must be a multiple of 12 bytes.<br />
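The length rule above can be sketched in a few lines of Python. This is only an illustration; the function name is hypothetical and not part of the draft.<br />

```python
def is_valid_data_packet(length: int, channels: int, bytes_per_sample: int) -> bool:
    """Return True if a data packet of `length` bytes holds whole samples for all channels."""
    # One sample for every channel occupies channels * bytes_per_sample bytes.
    return length % (channels * bytes_per_sample) == 0
```

With 6 channels at 2 bytes per sample, lengths of 0, 12 and 24 bytes pass the check, while 25 bytes does not.<br />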
<br />
The degenerate stream is a single header packet followed by the raw data packets. While this degenerate stream is not incredibly useful for long term storage or as a general purpose container, it is useful for applications where other data describing the stream is available out of band, for instance amongst cooperating applications in an inter-process communication scheme. Streams providing the extra defined comment packets are intended to be useful for long term storage and communication amongst diverse applications.<br />
<br />
== Packet Format ==<br />
Header and comment packets are processed as per the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future extensions without breaking the data format. An example of how this can be useful is the proposed ReplayGain extension to the .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
The header packet contains a field indicating the number of comment packets preceding the raw data. Applications must either parse or skip exactly this many packets, in addition to the header packet, before treating the stream as raw data.<br />
<br />
=== Header Packet ===<br />
Multibyte fields in the header packets are packed in big endian order, to be consistent with network byte order. A header packet contains the following fields: <br />
<br />
Packet 0, BOS, 16 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., more supported format IDs)<br />
8 [uint] Number of header packets preceding data<br />
8 [uint] Number of Channels, 0 = 256<br />
-<br />
16 [flag] Flags<br />
16 [enum] PCM Format ID<br />
-<br />
32 [uint] Sample Rate<br />
<br />
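As a rough sketch of how the 16 byte layout above might be read, the following Python uses the standard <code>struct</code> module with big endian fields. The names <code>PcmHeader</code>, <code>parse_header</code> and <code>extra_packets</code> are illustrative assumptions, not part of the draft.<br />

```python
import struct
from typing import NamedTuple

# ">" selects big endian; the fields follow the table above:
# packet ID, "PCM", major, minor, packet count, channels, flags, format ID, sample rate
HEADER_STRUCT = struct.Struct(">B3sBBBBHHI")


class PcmHeader(NamedTuple):
    version_major: int
    version_minor: int
    extra_packets: int   # the "number of ... packets preceding data" field
    channels: int
    flags: int
    format_id: int
    sample_rate: int


def parse_header(packet: bytes) -> PcmHeader:
    """Decode a 16 byte OggPCM stream header packet as laid out in this draft."""
    if len(packet) != HEADER_STRUCT.size:
        raise ValueError("header packet must be exactly 16 bytes")
    (packet_id, codec, major, minor, extra,
     channels, flags, format_id, rate) = HEADER_STRUCT.unpack(packet)
    if packet_id != 0x00 or codec != b"PCM":
        raise ValueError("not an OggPCM stream header packet")
    # A stored channel count of 0 stands for 256 channels.
    return PcmHeader(major, minor, extra, channels or 256, flags, format_id, rate)
```

A caller would normally also compare <code>version_major</code> against the version it understands before trusting the remaining fields.<br />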
The flags field is defined as follows:<br />
Bit Description<br />
15 (MSB) Interleaved/Chunked - If set, data in the packets is "chunked" by channel. In a data<br />
packet containing 3 channels and 2 samples/channel, the chunked storage order would be<br />
001122. For the interleaved storage format (default), the order would be 012012.<br />
others Reserved<br />
<br />
Applications conforming to version 1.0 of this spec MUST:<ul><br />
<li>set all reserved flags to false (zero) when creating these streams.</li><br />
<li>preserve all values of all reserved flags when reading or modifying these streams, unless the application sets the minor version field to zero, in which case the reserved flags must be set to false as well.</li><br />
</ul><br />
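The two storage orders selected by bit 15 can be illustrated with a short Python sketch. It assumes fixed-size samples; <code>CHUNKED_FLAG</code> and the function names are made up for this example.<br />

```python
CHUNKED_FLAG = 0x8000  # bit 15 (MSB) of the 16 bit flags field


def is_chunked(flags: int) -> bool:
    """True if the data packets store each channel's samples contiguously."""
    return bool(flags & CHUNKED_FLAG)


def chunked_to_interleaved(payload: bytes, channels: int, sample_size: int) -> bytes:
    """Reorder a chunked data packet (001122) into interleaved order (012012)."""
    frame_bytes = channels * sample_size
    if len(payload) % frame_bytes:
        raise ValueError("packet length is not a whole number of audio frames")
    frames = len(payload) // frame_bytes
    out = bytearray()
    for frame in range(frames):
        for channel in range(channels):
            # In chunked order, channel c occupies samples [c*frames, (c+1)*frames).
            start = (channel * frames + frame) * sample_size
            out += payload[start:start + sample_size]
    return bytes(out)
```

Using one-byte samples, the 3 channel example from the flags table, <code>b"001122"</code>, becomes <code>b"012012"</code>.<br />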
<br />
=== Comment Packets ===<br />
At this time, there is only one defined comment packet.<br />
Comment Header Packet<br />
8 0x01 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
 -- Continues as [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment Vorbis's Comment Header]<br />
<br />
=== Data Packets ===<br />
Data packets have no header word. This is done to preserve the alignment of the data payload. The contents of the data packets are specified by a combination of the 'PCM Format ID' field and the 'Flags' field. The length of the data packet must be a multiple of the number of channels specified in the header multiplied by the storage size of a single sample, as specified by the 'PCM Format ID' field.<br />
<br />
=== Supported PCM Formats ===<br />
Formats are identified within a header packet by a 16 bit "format type" field. While<br />
most applications will treat this as an opaque type, it is possible to discern some<br />
information about the format from the value of this field itself. Specifically, the<br />
format's storage size, in bytes, and its byte ordering, can be discerned by parsing<br />
the lower 6 bits of the value. These values are exposed so that it is possible to<br />
extract individual samples without necessarily understanding the coding scheme involved.<br />
While for practical purposes, due to performance concerns, most applications will<br />
choose to operate on a buffer directly, it is nonetheless possible to work a sample<br />
at a time. <br />
<br />
Binary Value Meaning<br />
..xxxx00 N/A, or data not accurately described by this scheme.<br />
..xxxx01 Least significant byte first. Bytes are MS bit first.<br />
..xxxx10 Most significant byte first. Bytes are MS bit first.<br />
..xxxx11 Data is machine endian<br />
..0000xx Data can not be described by this bytepacking scheme.<br />
..0001xx Samples are stored using one byte per sample<br />
..0010xx Samples are stored using two bytes per sample<br />
..0011xx Samples are stored using three bytes per sample<br />
..0100xx Samples are stored using four bytes per sample<br />
..1000xx Samples are stored using eight bytes per sample<br />
<br />
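The bit table above can be read off with two small helpers. This is a sketch under the assumption that the four size bits are simply a byte count (which matches every row of the table); values the table does not list are returned unchanged rather than rejected. The helper names are hypothetical.<br />

```python
# Meaning of the two lowest bits of the format ID, per the table above.
BYTE_ORDERS = {0b00: None, 0b01: "little", 0b10: "big", 0b11: "machine"}


def storage_size(format_id: int) -> int:
    """Bytes used to store one sample (bits 2-5); 0 means 'not described by this scheme'."""
    return (format_id >> 2) & 0x0F


def byte_order(format_id: int):
    """Byte order named by bits 0-1, or None when it is not described."""
    return BYTE_ORDERS[format_id & 0b11]
```

For example, format ID 0x0009 decodes to two bytes per sample in little endian order, without consulting the coding scheme in the upper bits.<br />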
The remaining 10 bits describe the coding scheme used to convert the digital value<br />
to an audio signal. The following formats are defined for version 1.0 of this<br />
format. For purposes of attribution, it should be noted that these formats are the<br />
PCM formats supported by the Advanced Linux Sound Architecture (ALSA) project, and<br />
should be fairly comprehensive.<br />
<br />
Format ID Short Name Description<br />
-- Signed integer coding (0)<br />
0x0004 OGGPCM_FMT_S8 Signed integer 8 bit<br />
0x0009 OGGPCM_FMT_S16_LE Signed integer 16 bit little endian<br />
0x000A OGGPCM_FMT_S16_BE Signed integer 16 bit big endian<br />
0x000B OGGPCM_FMT_S16 Signed integer 16 bit machine endian<br />
0x000D OGGPCM_FMT_S24_3LE Signed integer 24 bit little endian<br />
0x000E OGGPCM_FMT_S24_3BE Signed integer 24 bit big endian<br />
0x0011 OGGPCM_FMT_S32_LE Signed integer 32 bit little endian<br />
0x0012 OGGPCM_FMT_S32_BE Signed integer 32 bit big endian<br />
0x0013 OGGPCM_FMT_S32 Signed integer 32 bit machine endian<br />
--<br />
-- Unsigned integer coding (1)<br />
0x0044 OGGPCM_FMT_U8 Unsigned integer 8 bit<br />
0x0049 OGGPCM_FMT_U16_LE Unsigned integer 16 bit little endian<br />
0x004A OGGPCM_FMT_U16_BE Unsigned integer 16 bit big endian<br />
0x004B OGGPCM_FMT_U16 Unsigned integer 16 bit machine endian<br />
0x004D OGGPCM_FMT_U24_3LE Unsigned integer 24 bit little endian<br />
0x004E OGGPCM_FMT_U24_3BE Unsigned integer 24 bit big endian<br />
0x0051 OGGPCM_FMT_U32_LE Unsigned integer 32 bit little endian<br />
0x0052 OGGPCM_FMT_U32_BE Unsigned integer 32 bit big endian<br />
0x0053 OGGPCM_FMT_U32 Unsigned integer 32 bit machine endian<br />
--<br />
-- IEEE Floating point coding (2)<br />
0x0091 OGGPCM_FMT_FLT_LE IEEE Float (-1,1) 32 bit little endian<br />
0x0092 OGGPCM_FMT_FLT_BE IEEE Float (-1,1) 32 bit big endian<br />
0x0093 OGGPCM_FMT_FLT IEEE Float (-1,1) 32 bit machine endian<br />
0x00A1 OGGPCM_FMT_FLT64_LE IEEE Float (-1,1) 64 bit little endian<br />
0x00A2 OGGPCM_FMT_FLT64_BE IEEE Float (-1,1) 64 bit big endian<br />
0x00A3 OGGPCM_FMT_FLT64 IEEE Float (-1,1) 64 bit machine endian<br />
--<br />
-- IEC958 coding (?) (3)<br />
0x00CD OGGPCM_FMT_IEC958_3LE IEC958 Subframe, 24 bit little endian<br />
0x00CE OGGPCM_FMT_IEC958_3BE IEC958 Subframe, 24 bit big endian<br />
0x00D1 OGGPCM_FMT_IEC958_LE IEC958 Subframe, 32 bit little endian<br />
0x00D2 OGGPCM_FMT_IEC958_BE IEC958 Subframe, 32 bit big endian<br />
 0x00D3 OGGPCM_FMT_IEC958 IEC958 Subframe, 32 bit machine endian<br />
--<br />
-- Mu-Law coding (4)<br />
0x0104 OGGPCM_FMT_MU_LAW Mu-Law<br />
--<br />
-- A-Law coding (5)<br />
0x0144 OGGPCM_FMT_A_LAW A-Law<br />
--<br />
-- ADPCM coding (6)<br />
0x0180 OGGPCM_FMT_ADPCM Ima-ADPCM <br />
--<br />
-- GSM coding (7)<br />
0x01C0 OGGPCM_FMT_GSM GSM<br />
--<br />
-- 24 bit signed integer in 32 bit storage (8)<br />
0x0211 OGGPCM_FMT_S24_LE Signed integer 24 bit little endian<br />
0x0212 OGGPCM_FMT_S24_BE Signed integer 24 bit big endian<br />
0x0213 OGGPCM_FMT_S24 Signed integer 24 bit machine endian<br />
--<br />
-- 24 bit unsigned integer in 32 bit storage (9)<br />
0x0251 OGGPCM_FMT_U24_LE Unsigned integer 24 bit little endian<br />
0x0252 OGGPCM_FMT_U24_BE Unsigned integer 24 bit big endian<br />
0x0253 OGGPCM_FMT_U24 Unsigned integer 24 bit machine endian<br />
--<br />
-- 20 bit signed integer in 24 bit storage (10)<br />
0x028D OGGPCM_FMT_S20_3LE Signed integer 20 bit little endian<br />
0x028E OGGPCM_FMT_S20_3BE Signed integer 20 bit big endian<br />
--<br />
-- 20 bit unsigned integer in 24 bit storage (11)<br />
0x02CD OGGPCM_FMT_U20_3LE Unsigned integer 20 bit little endian<br />
0x02CE OGGPCM_FMT_U20_3BE Unsigned integer 20 bit big endian<br />
--<br />
-- 18 bit signed integer in 24 bit storage (12)<br />
0x030D OGGPCM_FMT_S18_3LE Signed integer 18 bit little endian<br />
0x030E OGGPCM_FMT_S18_3BE Signed integer 18 bit big endian<br />
--<br />
-- 18 bit unsigned integer in 24 bit storage (13)<br />
0x034D OGGPCM_FMT_U18_3LE Unsigned integer 18 bit little endian<br />
0x034E OGGPCM_FMT_U18_3BE Unsigned integer 18 bit big endian<br />
--<br />
Other coding schemes supported by ALSA but not specified here:<br />
MPEG<br />
--<br />
TODO: ADPCM and GSM need further specification (or elimination) since these aren't really<br />
byte packed like the other formats here are.<br />
<br />
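The format IDs in the table appear to be composed from three fields: the coding scheme number shown in parentheses (upper 10 bits), the storage size in bytes (4 bits) and the byte order (2 bits). The following Python sketch rebuilds a few table entries from those fields; <code>make_format_id</code> and <code>split_format_id</code> are illustrative names only.<br />

```python
def make_format_id(coding: int, size: int, order: int) -> int:
    """Pack coding scheme (10 bits), storage size (4 bits) and byte order (2 bits)."""
    if not (0 <= coding < 1024 and 0 <= size < 16 and 0 <= order < 4):
        raise ValueError("field out of range")
    return (coding << 6) | (size << 2) | order


def split_format_id(format_id: int):
    """Inverse of make_format_id: return (coding, size, order)."""
    return format_id >> 6, (format_id >> 2) & 0x0F, format_id & 0b11
```

For instance, coding 0 (signed integer), size 2 and order 1 (little endian) give 0x0009, matching OGGPCM_FMT_S16_LE, and coding 10, size 3, order 1 give 0x028D, matching OGGPCM_FMT_S20_3LE.<br />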
== Encapsulation in Ogg ==<br />
Following standard terminology for uncompressed audio, an audio frame is the collection of samples for all channels for a single sampling period. For example, an audio frame for a stereo signal is a pair of sample values for the left and right channels.<br />
<br />
The granulepos of an Ogg page indicates the presentation time of the last presentable element in the last complete packet within that page; for '''OggPCM''', a granule is an audio frame. The granule position specified is the total audio frames in the stream including the last complete packet in a page. Audio frames must not be split across packets. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is in samples as well as provides the current stream position to seeking routines. A truncated stream will still return the proper number of audio frames that can be decoded fully.</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=2055OggPCM Draft12005-11-14T13:22:29Z<p>Arc: /* Comment Packets */</p>
<hr />
<div>'''Warning: OggPCM is currently a topic of [http://lists.xiph.org/pipermail/ogg-dev/2005-November/thread.html heated debate]. The following is *not* the final spec.'''<br />
<br />
== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container. For the purposes of this document, the term PCM is used to describe a digital representation of an audio signal, where volume samples are taken at regular uniform intervals and then quantized into a digital (usually binary) code. A more complete definition of PCM and related terminology can be found at [[Wikipedia:Pulse-code_modulation|Wikipedia]].<br />
<br />
== Why is it ==<br />
This format is intended as an interchange format, for example for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video, as opposed to using RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files as was done during [[Theora]] development. It is intended to be less complex to use than either RIFF or AIFF.<br />
<br />
== Stream Description ==<br />
A stream is composed of a header packet, zero or more comment packets, and one or more data packets. Data packets may be of variable length, including zero. The only valid use of a zero length data packet is to mark the end of stream. Data packets must contain samples for all channels. That is to say, the length of a data packet must be a multiple of the number of channels times the storage size of a single sample. For instance, for a stream containing 6 channels at 2 bytes per sample, the length of each data packet must be a multiple of 12 bytes.<br />
<br />
The degenerate stream is a single header packet followed by the raw data packets. While this degenerate stream is not incredibly useful for long term storage or as a general purpose container, it is useful for applications where other data describing the stream is available out of band, for instance amongst cooperating applications in an inter-process communication scheme. Streams providing the extra defined comment packets are intended to be useful for long term storage and communication amongst diverse applications.<br />
<br />
== Packet Format ==<br />
Header and comment packets are processed as per the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future extensions without breaking the data format. An example of how this can be useful is the proposed ReplayGain extension to the .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
The header packet contains a field indicating the number of comment packets preceding the raw data. Applications must either parse or skip exactly this many packets, in addition to the header packet, before treating the stream as raw data.<br />
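A reader following these two rules might look roughly like the Python sketch below. It assumes the packets have already been extracted from the Ogg pages into a list of byte strings, and that the count sits at byte offset 6 of the header packet as in the Header Packet layout; <code>split_stream</code> is a hypothetical helper name.<br />

```python
def split_stream(packets):
    """Separate comment packets from data packets; packets[0] must be the stream header."""
    header = packets[0]
    if len(header) != 16 or header[0] != 0x00 or header[1:4] != b"PCM":
        raise ValueError("first packet is not an OggPCM stream header")
    extra_count = header[6]              # packets to parse or skip before raw data
    extras = packets[1:1 + extra_count]
    if len(extras) < extra_count:
        raise ValueError("stream ends before the announced packets")
    # ID 0x01 is the only defined comment packet; unknown IDs are silently ignored.
    comments = [p for p in extras if p[:4] == b"\x01PCM"]
    data = packets[1 + extra_count:]     # everything after this point is raw sample data
    return comments, data
```

Because the split is driven by the count rather than by inspecting each packet, a data packet whose first byte happens to be 0x01 is never mistaken for a comment packet.<br />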
<br />
=== Header Packet ===<br />
Multibyte fields in the header packets are packed in big endian order, to be consistent with network byte order. A header packet contains the following fields: <br />
<br />
Packet 0, BOS, 16 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., more supported format IDs)<br />
8 [uint] Number of comment packets preceding data<br />
8 [uint] Number of Channels, 0 = 256<br />
-<br />
16 [flag] Flags<br />
16 [enum] PCM Format ID<br />
-<br />
32 [uint] Sample Rate<br />
<br />
The flags field is defined as follows:<br />
Bit Description<br />
15 (MSB) Interleaved/Chunked - If set, data in the packets is "chunked" by channel. In a data<br />
packet containing 3 channels and 2 samples/channel, the chunked storage order would be<br />
001122. For the interleaved storage format (default), the order would be 012012.<br />
others Reserved<br />
<br />
Applications conforming to version 1.0 of this spec MUST:<ul><br />
<li>set all reserved flags to false (zero) when creating these streams.</li><br />
<li>preserve all values of all reserved flags when reading or modifying these streams, unless the application sets the minor version field to zero, in which case the reserved flags must be set to false as well.</li><br />
</ul><br />
<br />
=== Comment Packets ===<br />
At this time, there is only one defined comment packet.<br />
Comment Header Packet<br />
8 0x01 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
 -- Continues as [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment Vorbis's Comment Header]<br />
<br />
=== Data Packets ===<br />
Data packets have no header word. This is done to preserve the alignment of the data payload. The contents of the data packets are specified by a combination of the 'PCM Format ID' field and the 'Flags' field. The length of the data packet must be a multiple of the number of channels specified in the header multiplied by the storage size of a single sample, as specified by the 'PCM Format ID' field.<br />
<br />
=== Supported PCM Formats ===<br />
Formats are identified within a header packet by a 16 bit "format type" field. While<br />
most applications will treat this as an opaque type, it is possible to discern some<br />
information about the format from the value of this field itself. Specifically, the<br />
format's storage size, in bytes, and its byte ordering, can be discerned by parsing<br />
the lower 6 bits of the value. These values are exposed so that it is possible to<br />
extract individual samples without necessarily understanding the coding scheme involved.<br />
While for practical purposes, due to performance concerns, most applications will<br />
choose to operate on a buffer directly, it is nonetheless possible to work a sample<br />
at a time. <br />
<br />
Binary Value Meaning<br />
..xxxx00 N/A, or data not accurately described by this scheme.<br />
..xxxx01 Least significant byte first. Bytes are MS bit first.<br />
..xxxx10 Most significant byte first. Bytes are MS bit first.<br />
..xxxx11 Data is machine endian<br />
..0000xx Data can not be described by this bytepacking scheme.<br />
..0001xx Samples are stored using one byte per sample<br />
..0010xx Samples are stored using two bytes per sample<br />
..0011xx Samples are stored using three bytes per sample<br />
..0100xx Samples are stored using four bytes per sample<br />
..1000xx Samples are stored using eight bytes per sample<br />
<br />
The remaining 10 bits describe the coding scheme used to convert the digital value<br />
to an audio signal. The following formats are defined for version 1.0 of this<br />
format. For purposes of attribution, it should be noted that these formats are the<br />
PCM formats supported by the Advanced Linux Sound Architecture (ALSA) project, and<br />
should be fairly comprehensive.<br />
<br />
Format ID Short Name Description<br />
-- Signed integer coding (0)<br />
0x0004 OGGPCM_FMT_S8 Signed integer 8 bit<br />
0x0009 OGGPCM_FMT_S16_LE Signed integer 16 bit little endian<br />
0x000A OGGPCM_FMT_S16_BE Signed integer 16 bit big endian<br />
0x000B OGGPCM_FMT_S16 Signed integer 16 bit machine endian<br />
0x000D OGGPCM_FMT_S24_3LE Signed integer 24 bit little endian<br />
0x000E OGGPCM_FMT_S24_3BE Signed integer 24 bit big endian<br />
0x0011 OGGPCM_FMT_S32_LE Signed integer 32 bit little endian<br />
0x0012 OGGPCM_FMT_S32_BE Signed integer 32 bit big endian<br />
0x0013 OGGPCM_FMT_S32 Signed integer 32 bit machine endian<br />
--<br />
-- Unsigned integer coding (1)<br />
0x0044 OGGPCM_FMT_U8 Unsigned integer 8 bit<br />
0x0049 OGGPCM_FMT_U16_LE Unsigned integer 16 bit little endian<br />
0x004A OGGPCM_FMT_U16_BE Unsigned integer 16 bit big endian<br />
0x004B OGGPCM_FMT_U16 Unsigned integer 16 bit machine endian<br />
0x004D OGGPCM_FMT_U24_3LE Unsigned integer 24 bit little endian<br />
0x004E OGGPCM_FMT_U24_3BE Unsigned integer 24 bit big endian<br />
0x0051 OGGPCM_FMT_U32_LE Unsigned integer 32 bit little endian<br />
0x0052 OGGPCM_FMT_U32_BE Unsigned integer 32 bit big endian<br />
0x0053 OGGPCM_FMT_U32 Unsigned integer 32 bit machine endian<br />
--<br />
-- IEEE Floating point coding (2)<br />
0x0091 OGGPCM_FMT_FLT_LE IEEE Float (-1,1) 32 bit little endian<br />
0x0092 OGGPCM_FMT_FLT_BE IEEE Float (-1,1) 32 bit big endian<br />
0x0093 OGGPCM_FMT_FLT IEEE Float (-1,1) 32 bit machine endian<br />
0x00A1 OGGPCM_FMT_FLT64_LE IEEE Float (-1,1) 64 bit little endian<br />
0x00A2 OGGPCM_FMT_FLT64_BE IEEE Float (-1,1) 64 bit big endian<br />
0x00A3 OGGPCM_FMT_FLT64 IEEE Float (-1,1) 64 bit machine endian<br />
--<br />
-- IEC958 coding (?) (3)<br />
0x00CD OGGPCM_FMT_IEC958_3LE IEC958 Subframe, 24 bit little endian<br />
0x00CE OGGPCM_FMT_IEC958_3BE IEC958 Subframe, 24 bit big endian<br />
0x00D1 OGGPCM_FMT_IEC958_LE IEC958 Subframe, 32 bit little endian<br />
0x00D2 OGGPCM_FMT_IEC958_BE IEC958 Subframe, 32 bit big endian<br />
 0x00D3 OGGPCM_FMT_IEC958 IEC958 Subframe, 32 bit machine endian<br />
--<br />
-- Mu-Law coding (4)<br />
0x0104 OGGPCM_FMT_MU_LAW Mu-Law<br />
--<br />
-- A-Law coding (5)<br />
0x0144 OGGPCM_FMT_A_LAW A-Law<br />
--<br />
-- ADPCM coding (6)<br />
0x0180 OGGPCM_FMT_ADPCM Ima-ADPCM <br />
--<br />
-- GSM coding (7)<br />
0x01C0 OGGPCM_FMT_GSM GSM<br />
--<br />
-- 24 bit signed integer in 32 bit storage (8)<br />
0x0211 OGGPCM_FMT_S24_LE Signed integer 24 bit little endian<br />
0x0212 OGGPCM_FMT_S24_BE Signed integer 24 bit big endian<br />
0x0213 OGGPCM_FMT_S24 Signed integer 24 bit machine endian<br />
--<br />
-- 24 bit unsigned integer in 32 bit storage (9)<br />
0x0251 OGGPCM_FMT_U24_LE Unsigned integer 24 bit little endian<br />
0x0252 OGGPCM_FMT_U24_BE Unsigned integer 24 bit big endian<br />
0x0253 OGGPCM_FMT_U24 Unsigned integer 24 bit machine endian<br />
--<br />
-- 20 bit signed integer in 24 bit storage (10)<br />
0x028D OGGPCM_FMT_S20_3LE Signed integer 20 bit little endian<br />
0x028E OGGPCM_FMT_S20_3BE Signed integer 20 bit big endian<br />
--<br />
-- 20 bit unsigned integer in 24 bit storage (11)<br />
0x02CD OGGPCM_FMT_U20_3LE Unsigned integer 20 bit little endian<br />
0x02CE OGGPCM_FMT_U20_3BE Unsigned integer 20 bit big endian<br />
--<br />
-- 18 bit signed integer in 24 bit storage (12)<br />
0x030D OGGPCM_FMT_S18_3LE Signed integer 18 bit little endian<br />
0x030E OGGPCM_FMT_S18_3BE Signed integer 18 bit big endian<br />
--<br />
-- 18 bit unsigned integer in 24 bit storage (13)<br />
0x034D OGGPCM_FMT_U18_3LE Unsigned integer 18 bit little endian<br />
0x034E OGGPCM_FMT_U18_3BE Unsigned integer 18 bit big endian<br />
--<br />
Other coding schemes supported by ALSA but not specified here:<br />
MPEG<br />
--<br />
TODO: ADPCM and GSM need further specification (or elimination) since these aren't really<br />
byte packed like the other formats here are.<br />
<br />
== Encapsulation in Ogg ==<br />
Following standard terminology for uncompressed audio, an audio frame is the collection of samples for all channels for a single sampling period. For example, an audio frame for a stereo signal is a pair of sample values for the left and right channels.<br />
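Counting audio frames as defined above is simple arithmetic, and the running total is what the granule position records, as the paragraph that follows describes. The Python sketch below is only an illustration with made-up function names; it assumes fixed-size samples and that every packet holds whole frames.<br />

```python
def frames_in_packet(length: int, channels: int, sample_size: int) -> int:
    """Number of audio frames in a data packet of `length` bytes."""
    frame_bytes = channels * sample_size
    if length % frame_bytes:
        raise ValueError("audio frames must not be split across packets")
    return length // frame_bytes


def running_frame_totals(packet_lengths, channels: int, sample_size: int):
    """Total audio frames in the stream up to and including each data packet."""
    total = 0
    totals = []
    for length in packet_lengths:
        total += frames_in_packet(length, channels, sample_size)
        totals.append(total)
    return totals
```

For 16 bit stereo, packets of 400, 400 and 200 bytes hold 100, 100 and 50 frames, so the running totals are 100, 200 and 250; a page whose last complete packet is the third one would carry a granule position of 250.<br />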
<br />
The granulepos of an Ogg page indicates the presentation time of the last presentable element in the last complete packet within that page; for '''OggPCM''', a granule is an audio frame. The granule position specified is the total audio frames in the stream including the last complete packet in a page. Audio frames must not be split across packets. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is in samples as well as provides the current stream position to seeking routines. A truncated stream will still return the proper number of audio frames that can be decoded fully.</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=3171Talk:OggPCM Draft12005-11-13T09:10:48Z<p>Arc: /* Do we want/need the 32-bit data packet header? */</p>
<hr />
<div>== Needs ==<br />
As primarily an audio interchange codec, '''OggPCM''' should support all the capabilities of current Ogg audio codecs and any feature we'll conceivably need in the near future. These should be supported in a way which is easy to implement.<br />
<br />
Not all features need to be supported by all software; for example, an application need not support more than two channels or 8-bit audio.<br />
<br />
Current issues should be moved to the top.<br />
<br />
<br />
=== Separate fields or unified table? ===<br />
* This has been the most contested issue to date, and one which I believe has been solved in a mutually acceptable way, since the two are not mutually exclusive. A table, designed with non-linear values such that the value of bits within that table can be tested within the context of a simple flow chart, can be used to discover the format. Meanwhile, a table can be implemented as desired and be in full compatibility since the flow chart only permits valid choices.<br />
--[[User:Arc|Arc]] 00:48, 13 Nov 2005 (PST)<br />
<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* The occurrence of N bit PCM where N is not a multiple of 8 bits is so rare that it should probably be ignored. In addition, there really isn't any reason to treat 10 bit data packed into the 10 most significant bits of a 16 bit int any different from a real 16 bit value. So why make any distinction?<br />
--[[Erikd|Erikd]]<br />
<br />
* 10-bit values have a range of -512 to +511. When you shift them up the range is -32768 to 32704, so they need scaling if you want them to have their proper range in a normalised system.<br />
<br />
* Precisions that aren't a multiple of 8 bit aren't at all rare, but they're normally rounded off to a multiple for compatibility.<br />
--[[User:Gumboot|Gumboot]] 02:02, 10 Nov 2005 (PST)<br />
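<br />
The range arithmetic above can be checked with a minimal Python sketch. The helper names <code>shift_up</code> and <code>signed_range</code> are made up for illustration and are not part of any draft; the sketch only shows that a 10-bit sample shifted into the top of a 16-bit word never reaches the 16-bit positive peak.<br />

```python
def signed_range(bits):
    """Return (min, max) of a two's-complement integer of the given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def shift_up(sample, from_bits, to_bits):
    """Place a signed from_bits sample in the most significant bits of a to_bits word."""
    return sample << (to_bits - from_bits)


lo10, hi10 = signed_range(10)
shifted = (shift_up(lo10, 10, 16), shift_up(hi10, 10, 16))
native16 = signed_range(16)

# The negative peaks agree, but the shifted positive peak falls short of 32767.
shortfall = native16[1] - shifted[1]
```

Whether that shortfall matters depends on whether software normalises by the nominal word size or by the real precision, which is the distinction being argued here.<br />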
<br />
* Support for non-octet based sample sizes has been removed with the introduction of a data type table. We no longer need to worry about this topic.<br />
--[[User:Arc|Arc]] 23:48, 10 Nov 2005 (PST)<br />
<br />
* A new table is being worked on which allows 10-bit to 32-bit signed int data, with or without padding to an octet boundary. Padding may be up to 8 bits, which allows (for example) 24-bit values to be padded to 32-bit words.<br />
--[[User:Arc|Arc]] 00:48, 13 Nov 2005 (PST)<br />
<br />
<br />
=== Do we want/need the 32-bit data packet header? ===<br />
* The issue was raised on the ogg-dev mailing list of whether this is necessary. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement.<br />
--[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* On UltraSparc and Alpha CPUs (both 64 bit) accessing a 64 bit double at an address that is not 8 byte aligned causes a segmentation fault. On x86 (ie 32 bit) accessing unaligned doubles works, but it is slower than accessing aligned doubles. You might want to consider this.<br />
--[[Erikd|Erikd]]<br />
<br />
* I cannot see why that data header is necessary. No other uncompressed audio format requires extra framing information, so I cannot see how future additional header fields would require to be added. It should be clear from the bos page how many samples go into a packet and thus this field is just complicating decoding with an extra parsing step IMHO.<br />
--[[User:Silvia|Silvia]]<br />
<br />
* This header is unnecessary. Ogg already provides packet framing, and the existing headers (BOS, comments) can be determined by sequence order. The BOS header already contains forwards compatibility versioning for extra header fields. Even if new headers were to be created, they could be indicated by an 'extra_headers' field in the BOS header, as is done in Speex.<br />
--[[User:Conrad|Conrad]]<br />
<br />
* This issue remains one of few left contested, however, I believe that for uniformity with Vorbis and Theora, this is the correct method to identify packet types within the current version of the Ogg container.<br />
--[[User:Arc|Arc]] 23:51, 10 Nov 2005 (PST)<br />
<br />
* As to whether we need it: we need a way to tell header packets from data packets, as we need a comment header to carry comments from decoded Vorbis/FLAC/etc (or to be encoded to Vorbis/FLAC/etc). Erik's comment re: 64-bit floats is one I'd like to highlight yellow. Pending a check on libogg2 to see what 64-bit alignment is available on which platforms, we could:<br />
** Extend the data packet header to 64-bits, perhaps only with 64-bit data<br />
** Have a packet0 field which specifies how many header packets there are, as [[User:Conrad|Conrad]] suggested<br />
** Have the last header packet of ID \xFF which marks the end of the headers<br />
--[[User:Arc|Arc]] 01:10, 13 Nov 2005 (PST)<br />
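<br />
To make the alignment concern concrete, here is a small Python sketch of the arithmetic. It assumes (this is not confirmed for libogg2) that the packet buffer itself starts on an 8-byte boundary; the function name is hypothetical.<br />

```python
def payload_is_aligned(header_bytes, sample_bytes, buffer_alignment=8):
    """True if the first sample after the data packet header is naturally aligned.

    Assumes the packet buffer starts on a buffer_alignment-byte boundary,
    which is an assumption of this sketch, not a documented guarantee.
    """
    if buffer_alignment % sample_bytes:
        # The buffer start itself is not guaranteed to suit this sample size.
        return False
    return header_bytes % sample_bytes == 0


# 32-bit (4-byte) header in front of 64-bit doubles: payload lands at offset 4.
doubles_after_32bit_header = payload_is_aligned(4, 8)
# Header extended to 64 bits (8 bytes): payload lands at offset 8.
doubles_after_64bit_header = payload_is_aligned(8, 8)
```

Under that assumption the 32-bit header is harmless for samples of up to 32 bits and only becomes a problem for 64-bit floats, which is why the first option above limits the longer header to 64-bit data.<br />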
<br />
=== Signed/Unsigned data flag? ===<br />
<br />
* Not really. The data can be easily changed to signed as default losslessly. Unsigned 8-bit data (where 128 is the median) is easily changed to signed, and changed back if being saved as RIFF/WAV (which only supports unsigned 8-bit). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec.<br />
--[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally.<br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
* I see no reason to support any unsigned PCM format other than 8 bit. For instance, I know of no container format which supports unsigned 16 bit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; unsigned support is provided for 8bit samples only.<br />
--[[User:Arc|Arc]] 23:39, 10 Nov 2005 (PST)<br />
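<br />
For reference, the lossless conversion mentioned at the top of this section can be sketched in a few lines of Python. The function names are illustrative only; the sketch assumes unsigned 8-bit samples use 128 as the midpoint, as described above.<br />

```python
def u8_to_s8(data):
    """Convert unsigned 8-bit samples (midpoint 128) to signed values in -128..127."""
    return [b - 128 for b in data]


def s8_to_u8(samples):
    """Convert signed 8-bit samples back to unsigned bytes (midpoint 128)."""
    return bytes(s + 128 for s in samples)


original = bytes(range(256))          # every possible unsigned 8-bit value
signed = u8_to_s8(original)
round_trip = s8_to_u8(signed)
```

Because the mapping is a fixed offset, every value survives the round trip, so either representation can be stored without loss.<br />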
<br />
<br />
=== Int/Float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
* Please don't make determination of the data format depend on multiple fields. Instead use an enumeration so that something like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16 and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64. This scheme is far more transparent and self documenting. If the format field is 8 bits, this scheme supports 256 formats; if it's 16 bits it will support 65536 formats.<br />
<br />
I also suggest leaving the format associated with a value of zero as an invalid format.<br />
--[[Erikd|Erikd]]<br />
<br />
* It would ''not'' support 256 formats. It would support the small set of formats that somebody bothered to define early on, and it would not be able to expand because many implementations would fail to follow the changing specification thereby forcing everybody to limit themselves to the initial set.<br />
--[[User:Gumboot|Gumboot]] 02:08, 10 Nov 2005 (PST)<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; float support is provided for 32bit and 64bit samples only.<br />
--[[User:Arc|Arc]] 23:40, 10 Nov 2005 (PST)<br />
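<br />
The enumeration proposal in this section could look roughly like the Python sketch below. The two format names come from the comment above; the numeric values, the <code>INVALID</code> member and the <code>lookup_format</code> helper are hypothetical and do not appear in any draft.<br />

```python
from enum import IntEnum


class OggPcmFormat(IntEnum):
    """Hypothetical single-field format enumeration; values are made up."""
    INVALID = 0                # zero is deliberately left as an invalid format
    OGG_PCM_LE_PCM_16 = 1      # little endian 16 bit PCM
    OGG_PCM_BE_FLOAT_64 = 2    # big endian 64 bit float


def lookup_format(value):
    """Map a raw field value to a known format, rejecting zero and unknown IDs."""
    try:
        fmt = OggPcmFormat(value)
    except ValueError:
        raise ValueError("unknown format id %d" % value)
    if fmt is OggPcmFormat.INVALID:
        raise ValueError("format id 0 is reserved as invalid")
    return fmt
```

The rejection of unknown IDs is the behaviour the objection above is worried about: a reader built against an early table cannot decode formats added later.<br />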
<br />
<br />
=== Endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly, so one should probably be settled on for the data and stuck with. It's a fairly low-CPU process to change the endian on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Big and little endian data formats should both be supported with equal status. There should not even be a default; the endian-ness should be explicit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; an endian flag is provided separate from the data format, though it will not affect 8bit sample types.<br />
--[[User:Arc|Arc]] 23:42, 10 Nov 2005 (PST)<br />
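<br />
As a rough illustration of how cheap the conversion is, here is a Python sketch that swaps the byte order of packed samples. The function name is illustrative; the sketch assumes samples are stored back to back with a fixed width in bytes.<br />

```python
def swap_endian(data, sample_bytes):
    """Reverse the byte order of every sample in a packed buffer."""
    if len(data) % sample_bytes:
        raise ValueError("buffer does not hold a whole number of samples")
    out = bytearray(len(data))
    for i in range(0, len(data), sample_bytes):
        # Reverse the bytes of one sample and store it at the same position.
        out[i:i + sample_bytes] = data[i:i + sample_bytes][::-1]
    return bytes(out)


little = b"\x01\x02\x03\x04"          # two 16-bit samples
big = swap_endian(little, 2)
restored = swap_endian(big, 2)
```

With a sample width of one byte the function returns the input unchanged, which matches the note that the flag has no effect on 8bit sample types.<br />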
<br />
<br />
=== Vorbiscomment-style header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
* Agree<br />
--[[User:Conrad|Conrad]]<br />
<br />
* A comment header, identical to vorbis's comment header, has been added to the most recent draft [[OggPCM#Format|format]]<br />
--[[User:Arc|Arc]] 23:44, 10 Nov 2005 (PST)<br />
<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Many file formats (WAV, AIFF, AU and others) support 64 bit float data. WAV stores floats as little endian data and AIFF stores it as big endian data. OggPCM should support both 32 and 64 bit floats of both endian-nesses (is that a word?). I don't know of any other floating point format that needs consideration.<br />
--[[Erikd|Erikd]]<br />
<br />
* With the introduction of a data type lookup table for the most recent [[OggPCM#Format|format]], float types other than 32bit and 64bit are no longer available. If other sizes of float are needed they may be added in a future minor revision with an extended type.<br />
--[[User:Arc|Arc]] 23:46, 10 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=2036Talk:OggPCM Draft12005-11-13T08:48:26Z<p>Arc: </p>
<hr />
<div>== Needs ==<br />
As primarily an audio interchange codec, '''OggPCM''' should support all the capabilities of current Ogg audio codecs and any feature we'll conceivably need in the near future. These should be supported in a way which is easy to implement.<br />
<br />
Not all features need to be supported by all software; for example, support for more than two channels or for 8-bit audio is optional.<br />
<br />
Current issues should be moved to the top.<br />
<br />
<br />
=== Separate fields or unified table? ===<br />
* This has been the most contested issue to date, and I believe it has been solved in a mutually acceptable way, since the two approaches are not mutually exclusive. A table whose values are assigned non-linearly, so that individual bits of a value can be tested within a simple flow chart, can be used to discover the format. Meanwhile, a plain lookup table can be implemented as desired and remain fully compatible, since the flow chart only permits valid choices.<br />
--[[User:Arc|Arc]] 00:48, 13 Nov 2005 (PST)<br />
<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* The occurrence of N bit PCM where N is not a multiple of 8 bits is so rare that it should probably be ignored. In addition, there really isn't any reason to treat 10 bit data packed into the 10 most significant bits of a 16 bit int any different from a real 16 bit value. So why make any distinction?<br />
--[[Erikd|Erikd]]<br />
<br />
* 10-bit values have a range of -512 to +511. When you shift them up the range is -32768 to 32704, so they need scaling if you want them to have their proper range in a normalised system.<br />
<br />
* Precisions that aren't a multiple of 8 bit aren't at all rare, but they're normally rounded off to a multiple for compatibility.<br />
--[[User:Gumboot|Gumboot]] 02:02, 10 Nov 2005 (PST)<br />
<br />
* Support for non-octet based sample sizes has been removed with the introduction of a data type table. We no longer need to worry about this topic.<br />
--[[User:Arc|Arc]] 23:48, 10 Nov 2005 (PST)<br />
<br />
* A new table is being worked on which allows 10-bit to 32-bit signed int data, with or without padding to an octet boundary. Padding may be up to 8 bits, which allows (for example) 24-bit values to be padded to 32-bit words.<br />
--[[User:Arc|Arc]] 00:48, 13 Nov 2005 (PST)<br />
<br />
<br />
=== Do we want/need the 32-bit data packet header? ===<br />
* The issue was raised on the ogg-dev mailing list of whether this is necessary. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement.<br />
--[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* On UltraSparc and Alpha CPUs (both 64 bit) accessing a 64 bit double at an address that is not 8 byte aligned causes a segmentation fault. However, accessing unaligned doubles on x86 (ie 32 bit) is slower than accessing aligned doubles. You might want to consider this.<br />
--[[Erikd|Erikd]]<br />
<br />
* I cannot see why that data header is necessary. No other uncompressed audio format requires extra framing information, so I cannot see how future additional header fields would require to be added. It should be clear from the bos page how many samples go into a packet and thus this field is just complicating decoding with an extra parsing step IMHO.<br />
--[[User:Silvia|Silvia]]<br />
<br />
* This header is unnecessary. Ogg already provides packet framing, and the existing headers (BOS, comments) can be determined by sequence order. The BOS header already contains forwards compatibility versioning for extra header fields. Even if new headers were to be created, they could be indicated by an 'extra_headers' field in the BOS header, as is done in Speex.<br />
--[[User:Conrad|Conrad]]<br />
<br />
* This issue remains one of few left contested, however, I believe that for uniformity with Vorbis and Theora, this is the correct method to identify packet types within the current version of the Ogg container.<br />
--[[User:Arc|Arc]] 23:51, 10 Nov 2005 (PST)<br />
<br />
<br />
=== Signed/Unsigned data flag? ===<br />
<br />
* Not really. The data can be easily changed to signed as default losslessly. Unsigned 8-bit data (where 128 is the median) is easily changed to signed, and changed back if being saved as RIFF/WAV (which only supports unsigned 8-bit). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec.<br />
--[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally.<br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
* I see no reason to support any unsigned PCM format other than 8 bit. For instance, I know of no container format which supports unsigned 16 bit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; unsigned support is provided for 8bit samples only.<br />
--[[User:Arc|Arc]] 23:39, 10 Nov 2005 (PST)<br />
<br />
<br />
=== Int/Float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
* Please don't make determination of the data format depend on multiple fields. Instead use an enumeration so that something like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16 and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64. This scheme is far more transparent and self documenting. If the format field is 8 bits, this scheme supports 256 formats; if it's 16 bits it will support 65536 formats.<br />
<br />
I also suggest leaving the format associated with a value of zero as an invalid format.<br />
--[[Erikd|Erikd]]<br />
<br />
* It would ''not'' support 256 formats. It would support the small set of formats that somebody bothered to define early on, and it would not be able to expand because many implementations would fail to follow the changing specification thereby forcing everybody to limit themselves to the initial set.<br />
--[[User:Gumboot|Gumboot]] 02:08, 10 Nov 2005 (PST)<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; float support is provided for 32bit and 64bit samples only.<br />
--[[User:Arc|Arc]] 23:40, 10 Nov 2005 (PST)<br />
<br />
<br />
=== Endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly, so one should probably be settled on for the data and stuck with. It's a fairly low-CPU process to change the endian on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Big and little endian data formats should both be supported with equal status. There should not even be a default; the endian-ness should be explicit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; an endian flag is provided separate from the data format, though it will not affect 8bit sample types.<br />
--[[User:Arc|Arc]] 23:42, 10 Nov 2005 (PST)<br />
<br />
<br />
=== Vorbiscomment-style header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
* Agree<br />
--[[User:Conrad|Conrad]]<br />
<br />
* A comment header, identical to vorbis's comment header, has been added to the most recent draft [[OggPCM#Format|format]]<br />
--[[User:Arc|Arc]] 23:44, 10 Nov 2005 (PST)<br />
<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Many file formats (WAV, AIFF, AU and others) support 64 bit float data. WAV stores floats as little endian data and AIFF stores it as big endian data. OggPCM should support both 32 and 64 bit floats of both endian-nesses (is that a word?). I don't know of any other floating point format that needs consideration.<br />
--[[Erikd|Erikd]]<br />
<br />
* With the introduction of a data type lookup table for the most recent [[OggPCM#Format|format]], float types other than 32bit and 64bit are no longer available. If other sizes of float are needed they may be added in a future minor revision with an extended type.<br />
--[[User:Arc|Arc]] 23:46, 10 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggYUV&diff=3224Talk:OggYUV2005-11-13T08:23:57Z<p>Arc: /* Aspect ratio */</p>
<hr />
<div>=== Interlace Flag? ===<br />
* The interlacing information doesn't seem complete to me. How do you know which field(s) you have in any given packet, for example? How do you distinguish between a 25Hz shutter and a 50Hz shutter? Field order switching? Mixing with uninterlaced data?<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* In my experience, all interlace is every other frame, even scanlines followed by odd scanlines. Is there any video codec which supports more than an interlace flag? <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Variable frame-rates ===<br />
* There doesn't seem to be any handling of variable frame-rate data, or a specification for a timebase for the granulepos.<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* Granulepos is the last frame decodable in the current packet/page. As far as variable framerates within a single stream, is there any codec which supports this currently? <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Codec Identifier ===<br />
* The identifier seems a little short. You'd get false positives if somebody wanted to use a "YUVx" format, for example.<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* I believe that's OK with raw formats, if someone wanted to use a YUV-like codec they could use a prefix, vs a suffix, to identify it by. Also, if their header packet ID is something other than 0x00, it will not generate a false positive to have a YUV* codec identifier since the YUV plugins only support streams which begin with packet id 0. <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Aspect ratio ===<br />
* Is the aspect ratio the pixel aspect or the frame aspect? <br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* Frame aspect, this acts exactly like the aspect ratio in the Theora header, right down to having the same bit-size for the fields. Typically, the ratio is 4:3 or 16:9. <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
* Arc, Theora uses pixel aspect ratio, from the spec:<br />
13. Read a 24-bit unsigned integer as PARN.<br />
14. Read a 24-bit unsigned integer as PARD. Together with PARN, these<br />
specify the aspect ratio of the pixels within a frame, defined as the ratio<br />
of the physical width of a pixel to its physical height. This is given by<br />
the ratio PARN : PARD. If either of these fields are zero, this indicates<br />
that pixel aspect ratio information was not available to the encoder. In<br />
this case it MAY be specified by the application via an external means,<br />
or a default value of 1 : 1 MAY be used.<br />
--[[User:J^|J^]] 13:42, 11 Nov 2005 (PST)<br />
<br />
* Sorry, my mistake or typo. Thanks for pointing this out. :-)<br />
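<br />
Since the two kinds of aspect ratio are easy to mix up, here is a minimal Python sketch relating them. It follows the quoted Theora text in treating a zero PARN or PARD as "unknown" and falling back to 1:1; the function name and the sample frame sizes and ratios are illustrative values, not taken from any header.<br />

```python
from fractions import Fraction


def frame_aspect(width, height, parn, pard):
    """Frame (display) aspect ratio from frame size and pixel aspect PARN:PARD."""
    if parn == 0 or pard == 0:
        # Pixel aspect not available to the encoder: fall back to square pixels.
        parn, pard = 1, 1
    return Fraction(width * parn, height * pard)


# Non-square pixels: a 720x480 frame with pixels narrower than they are tall.
narrow_pixels = frame_aspect(720, 480, 8, 9)
# Unknown pixel aspect on a 640x480 frame falls back to 1:1.
square_default = frame_aspect(640, 480, 0, 0)
```

Both example frames come out at 4:3 even though their pixel dimensions differ, which is why a header needs to state which of the two ratios it carries.<br />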
<br />
=== Chroma Subsampling Methods ===<br />
* We need to know two things, what the size/shape of chroma pixels are, and if they are packed, what order they are provided in the bitstream.<br />
** This separates the order of the data and the processing of the data, both of which are important but get very complicated if mixed<br />
** The order must match the shape it's based on, ie, 4:4:4 should be in "0:0" order, any other value (which is illegal) should not be supported by any software if encountered and should never be generated.<br />
<br />
Chroma Pixel "Shapes"<br />
=====================<br />
ID Shape Used-In<br />
0 #--- 4:4:4<br />
----<br />
----<br />
----<br />
.<br />
1 ##-- 4:2:2<br />
----<br />
---- <br />
----<br />
.<br />
2 #### 4:1:1<br />
----<br />
----<br />
----<br />
.<br />
3 #--- ?<br />
#---<br />
----<br />
----<br />
.<br />
4 ##-- 4:2:0<br />
##--<br />
----<br />
----<br />
.<br />
5 #### 4:1:0<br />
####<br />
----<br />
----<br />
.<br />
6 #### ?<br />
####<br />
####<br />
####<br />
.<br />
7 Extended Shape, unsupported in v1.0<br />
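
One way to read the table above is that each shape gives the block of luma pixels sharing a single chroma sample. Under that reading (an assumption, as is rounding odd frame sizes up), the chroma plane size can be sketched in Python as follows; the names are hypothetical.<br />

```python
# Shape ID -> (block width, block height) in luma pixels, read off the '#' marks.
CHROMA_SHAPES = {
    0: (1, 1),  # 4:4:4
    1: (2, 1),  # 4:2:2
    2: (4, 1),  # 4:1:1
    3: (1, 2),
    4: (2, 2),  # 4:2:0
    5: (4, 2),  # 4:1:0
    6: (4, 4),
}


def chroma_plane_size(width, height, shape_id):
    """Chroma plane dimensions for a luma plane of width x height."""
    if shape_id not in CHROMA_SHAPES:
        raise ValueError("shape %d is not supported in v1.0" % shape_id)
    block_w, block_h = CHROMA_SHAPES[shape_id]
    # Ceiling division so a partial block at the edge still gets a chroma sample.
    return (-(-width // block_w), -(-height // block_h))
```

This covers only the size of the chroma planes; the sampling phase and packing order are separate questions.<br />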
<br />
* In things like 420 you also need to know the relative phase of the sampling of the chroma. Some standards have it as being sampled exactly on the top left, some halfway across the top row, some halfway down the left edge, some in the centre of everything... then there are all the interlaced variants.<br />
: You also have the matter of whether or not there are unused areas at the top and bottom of the range of legal values, and many other things along those lines.<br />
--[[User:Gumboot|Gumboot]] 02:58, 10 Nov 2005 (PST)<br />
<br />
I'm a little confused about the x:y:z notation, so I'm not going to use that here. But here are a couple formats that aren't described in this table that I use. Dash (-) indicates a luma sample without a chroma sample, and # indicates a luma and chroma sample, I'm not sure if this is the same as used above or not..<br />
<br />
####<br />
 #### Luma and Chroma sampled at every pixel.
####<br />
####<br />
.<br />
---- #-#-<br />
#-#- and ---- Typically sampled with a phase offset, but ignoring that for now. One <br />
---- #-#- chroma sample per two luma horizontally, every other row. <br />
#-#- ----<br />
.<br />
#-#- <br />
#-#- One chroma sample per two luma horizontally, every row. <br />
#-#- <br />
#-#- <br />
.<br />
There may be variants of #-#- (eg -#-#, maybe even -##- or #--#) out there too<br />
--[[User:Jkoleszar|Jkoleszar]] 11:09, 10 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggRGB&diff=3225Talk:OggRGB2005-11-13T08:20:56Z<p>Arc: /* Do we need a gamma correction field? */</p>
<hr />
<div>=== Do we need a gamma correction field? ===<br />
<br />
<br />
=== YUV+RGB? ===<br />
* How about specifying the color space (ie sRGB)? And why not merge OggRGB and OggYUV and add an additional color space identifier? There's only one raw codec for audio although it may come in different flavours as well (float/integer samples, channel count, sampling rate) -- Sebi<br />
<br />
* It's not purely an issue of colorspace: YUV does many things RGB does not do, most specifically chroma subsampling, which RGB doesn't worry about. It's different from PCM - PCM is a format of samples, no matter what format those samples are in. The RGB vs YUV issue is, by contrast, the difference between delta encoded audio and plain sample encoding. Additionally, there are OggStream processing methods which are complicating this issue; I'll explain more with time.<br />
<br />
=== Is there ever a case where RGB video has non-square pixels, ie, do we need an aspect ratio? ===<br />
<br />
Sebi: Storing that won't hurt. I'd do so. (Recall DVD resolutions, 720x480 (NTSC) or 720x576 is supposed to be DAR 4:3 or 16:9 (DAR = display aspect ratio))<br />
<br />
=== How about support for skipped frames or variable duration? ===<br />
<br />
<br />
=== Is there a need for a VorbisComment -like header? ===</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=2028OggPCM Draft12005-11-11T08:17:08Z<p>Arc: /* Format */</p>
<hr />
<div>== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container.<br />
<br />
<br />
== Why is it ==<br />
The intention for this format is as an interchange format, for example for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video, as opposed to using RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files as was done during [[Theora]] development.<br />
<br />
It is also less complex than either .wav (RIFF) or .aiff (AIFF), both of these formats being designed for generic multimedia (audio, video, etc). Full compatibility with these formats includes support for non-PCM data.<br />
<br />
Using raw PCM data, on the other hand, doesn't give that all-important header which carries information about the number of channels, sample width, and sample frequency. So what is needed is a header followed by raw PCM data - nothing more complicated.<br />
<br />
== Format ==<br />
<br />
'' This is the current working draft, a compromise between the different proposed elements ''<br />
<br />
Packets are processed according to the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future expandability which does not break the data format. Multibyte fields in the header packets are packed in little endian order. Multibyte fields in the data packet are packed according to the endian flag in the stream header packet.<br />
<br />
An audio frame consists of one sample from each audio channel encoded in sequence. The granule position specified is the total audio frames in the stream including the last complete packet in a page. Audio frames must not be split across packets. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is in samples as well as provides the current stream position to seeking routines. A truncated stream will still return the proper number of audio frames that can be decoded fully.<br />
<br />
An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
Note that no such extension is planned, nor is the need for a future format foreseen, but history has shown that even the most basic formats eventually become obsolete.<br />
<br />
Packet 0, BOS, 12 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)
 8 0x00 Version Minor (backwards compatible, ie, via extended header)
8 [int] Number of Channels (1-256)<br />
1 [flg] False = MSB, True = LSB<br />
3 [int] PCM Data Type (see table below)<br />
4 [nil] Padding to byte, may be used in later minor version<br />
-<br />
32 [int] Samplerate (samples/second)<br />
<br />
Comment Header Packet<br />
8 0x03 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
-- Continues as [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment Vorbis's Comment Header]<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data<br />
<br />
PCM Data Type<br />
=============<br />
ID# Bits Type<br />
0 8 signed (char)<br />
1 8 unsigned (char)<br />
2 16 signed (short int)<br />
3 24 signed (int + 8bit padding)<br />
4 32 signed (int)<br />
5 32 float (float)<br />
6 64 float (double)<br />
7 ? Extended unsupported by 1.0 software<br />
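<br />
As a rough illustration of the stream header layout above, here is a minimal Python sketch that packs and parses the 12-byte packet. Two details are assumptions, since the draft does not spell them out: the channel count is stored minus one (so that 1-256 fits in 8 bits), and the flag byte is filled from the least significant bit upward (endian flag in bit 0, data type in bits 1-3, padding in bits 4-7). The function names are hypothetical.<br />
<br />
```python
import struct

# Bits per sample for each PCM Data Type ID in the table above.
# ID 7 (Extended) is left out on purpose: version 1.0 software must not decode it.
PCM_TYPE_BITS = {0: 8, 1: 8, 2: 16, 3: 24, 4: 32, 5: 32, 6: 64}

# Little endian, no padding: ID, "PCM", major, minor, channels, flag byte, samplerate.
HEADER_LAYOUT = "<B3sBBBBI"


def pack_stream_header(channels, little_endian, data_type, samplerate):
    """Build the 12-byte stream header packet under the assumptions stated above."""
    if not 1 <= channels <= 256:
        raise ValueError("channels must be in the range 1-256")
    if data_type not in PCM_TYPE_BITS:
        raise ValueError("data type is not decodable by version 1.0 software")
    # ASSUMPTION: endian flag in bit 0, data type in bits 1-3, padding in bits 4-7.
    flag_byte = (1 if little_endian else 0) | (data_type << 1)
    # ASSUMPTION: the channel count is stored minus one so that 1-256 fits in a byte.
    return struct.pack(HEADER_LAYOUT, 0x00, b"PCM", 1, 0,
                       channels - 1, flag_byte, samplerate)


def parse_stream_header(packet):
    """Parse a stream header packet, rejecting anything version 1.0 cannot decode."""
    if len(packet) != struct.calcsize(HEADER_LAYOUT):
        raise ValueError("stream header packet must be 12 bytes")
    packet_id, codec, major, minor, channels, flag_byte, samplerate = \
        struct.unpack(HEADER_LAYOUT, packet)
    if packet_id != 0x00 or codec != b"PCM":
        raise ValueError("not an OggPCM stream header packet")
    if major != 1:
        raise ValueError("incompatible major version")
    data_type = (flag_byte >> 1) & 0x07
    if data_type not in PCM_TYPE_BITS:
        raise ValueError("Extended data type: version 1.0 must not decode")
    return {
        "version": (major, minor),
        "channels": channels + 1,
        "little_endian": bool(flag_byte & 0x01),
        "data_type": data_type,
        "bits_per_sample": PCM_TYPE_BITS[data_type],
        "samplerate": samplerate,
    }
```
<br />
For example, <code>parse_stream_header(pack_stream_header(2, True, 2, 44100))</code> describes a stereo, little endian, 16 bit signed stream at 44100 Hz.<br />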
<br />
'''Encapsulation in Ogg'''<br />
<br />
The granulepos of an Ogg page indicates the presentation time of the last presentable element in the last complete packet within that page; for '''OggPCM''', a granule is an audio frame.<br />
<br />
Following standard terminology for uncompressed audio, an audio frame is the collection of samples for all channels for a single sampling period. For example, an audio frame for a stereo signal is a pair of sample values for the left and right channels.<br />
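<br />
Because a granule is one audio frame, converting a granulepos to a time or a byte count is simple arithmetic. The sketch below illustrates this; the function names are hypothetical and the byte count assumes samples occupy exactly their nominal width.<br />
<br />
```python
def granulepos_to_seconds(granulepos, samplerate):
    """Presentation time of the last audio frame completed on a page."""
    return granulepos / samplerate


def frames_to_bytes(frames, channels, bits_per_sample):
    """Raw PCM payload size for a number of audio frames.

    ASSUMPTION: every sample is stored in exactly bits_per_sample bits.
    """
    return frames * channels * (bits_per_sample // 8)


# Two seconds of 44100 Hz stereo 16 bit audio.
duration = granulepos_to_seconds(88200, 44100)
payload = frames_to_bytes(88200, 2, 16)
```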
<br />
'''Constraints'''<br />
<br />
* Version 1.0 codec software MUST NOT attempt to decode when the Extended (7) Data Type is specified.<br />
<br />
* An OggPCM packet MUST NOT be constructed with a partial frame; i.e., an audio frame must not span two Ogg packets.<br />
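<br />
A decoder can enforce both constraints when it splits a data packet into audio frames. The following Python sketch is one way to do that; <code>decode_data_packet</code> is a hypothetical name, and data type 3 (24 bit) is left out because the draft does not say whether its padding byte is stored in the stream.<br />
<br />
```python
import struct

# struct format codes for the data types with an unambiguous storage width.
# Type 3 (24 bit) is omitted: the draft leaves its on-disk width unclear.
SAMPLE_CODES = {0: "b", 1: "B", 2: "h", 4: "i", 5: "f", 6: "d"}


def decode_data_packet(packet, channels, little_endian, data_type):
    """Split one data packet into a list of audio frames (tuples of samples)."""
    if data_type == 7:
        raise ValueError("Extended data type: version 1.0 must not decode")
    if data_type not in SAMPLE_CODES:
        raise ValueError("data type not handled by this sketch")
    if packet[:4] != b"\xffPCM":
        raise ValueError("not an OggPCM data packet")
    code = SAMPLE_CODES[data_type]
    sample_bytes = struct.calcsize("<" + code)
    payload = packet[4:]
    # A packet must hold a whole number of audio frames.
    if len(payload) % (channels * sample_bytes) != 0:
        raise ValueError("packet contains a partial audio frame")
    count = len(payload) // sample_bytes
    order = "<" if little_endian else ">"
    samples = struct.unpack(f"{order}{count}{code}", payload)
    return [samples[i:i + channels] for i in range(0, count, channels)]
```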
<br />
== Alternative Format ==<br />
<br />
''This format was written by [[User:Jkoleszar|Jkoleszar]], and has since been combined with other ideas into the primary format (above)''<br />
<br />
It is intended to support channels from the same source having different sampling parameters.<br />
<br />
'''Packet structure'''<br />
<br />
Packet 0, BOS, tbd bytes<br />
8 0x00 Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
8 0x01 Version Major (breaks backwards compatibility to increment)<br />
8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
8 [uint] Source ID (Unique amongst all OggPCM streams in the physical stream)<br />
8 [uint] Channel Block<br />
-<br />
16 [bitfield] Indicates which of the 16 channels in this channel block <br />
are present in this logical OGGPCM stream.<br />
8 [enum] Sample format (OGGPCM_FMT_U8, OGGPCM_FMT_LE_S16, OGGPCM_FMT_BE_S16, etc) <br />
24 [uint] Sample rate ** this field crosses a 32bit-word boundary ** <br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data, packing defined by Sample Format field in header<br />
<br />
'''Sample Format'''<br />
<br />
OGG_PCM_S8 = 0x1 /* Signed 8 bit. */<br />
OGG_PCM_S16 = 0x2<br />
OGG_PCM_S24 = 0x3<br />
OGG_PCM_S32 = 0x4<br />
OGG_PCM_U8 = 0x5 /* Unsigned 8 bit */<br />
OGG_PCM_FLOAT32 = 0x6<br />
OGG_PCM_FLOAT64 = 0x7<br />
<br />
<br />
<br />
'''Discussion'''<br />
<br />
This seems to make it easy to support the simple/normal cases and possible to support the pathological cases, for instance:<br />
{| border="1" cellpadding="1"<br />
| Source ID || Channel Bitfield || Sample Rate || Sample Format || Comment<br />
|-<br />
| 0x00 || 0000 0000 0000 0011 || 96000 || OGGPCM_FMT_LE_S24 || Front Stereo Pair<br />
|-<br />
| 0x00 || 0000 0000 0011 1100 || 44100 || OGGPCM_FMT_LE_S16 || Center And Surrounds<br />
|-<br />
| 0x00 || 0000 0000 0010 0000 || 8000 || OGGPCM_FMT_LE_S16 || LFE Channel<br />
|-<br />
| 0x01 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || PC Speaker<br />
|-<br />
| 0x02 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || Microphone<br />
|-<br />
| 0x03 || 0000 0000 0000 0011 || 8000 || OGGPCM_FMT_LE_S16 || Voice Chat<br />
|}<br />
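<br />
To make the Channel Block and bitfield columns concrete, here is a small Python sketch that expands them into channel numbers. It assumes that bit 0 is the first channel of a block and that block N covers channels 16N to 16N+15; the draft does not define either, and the function name is hypothetical.<br />
<br />
```python
def channels_in_stream(channel_block, bitfield):
    """List the channel numbers carried by one logical OggPCM stream.

    ASSUMPTION: bit 0 is the first channel of the block, and block N
    covers channel numbers 16*N through 16*N + 15.
    """
    return [channel_block * 16 + bit
            for bit in range(16)
            if bitfield & (1 << bit)]


front_pair = channels_in_stream(0, 0b0000000000000011)
```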
<br />
Each entry in the table is a logical Ogg stream. [[User:Jkoleszar|Jkoleszar]] is not convinced that the source id and channel block are necessary, but figured he'd throw it out there.</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=2035Talk:OggPCM Draft12005-11-11T07:53:15Z<p>Arc: adding a second newline between sections</p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can easily and losslessly be converted to signed as the default. Unsigned 8-bit data (where 128 is the midpoint) is easily changed to signed, and changed back if being saved as RIFF/WAV (whose 8-bit samples are always unsigned). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec.<br />
--[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally.<br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
* I see no reason to support any unsigned PCM format other than 8 bit. For instance, I know of no container format which supports unsigned 16 bit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; unsigned support is provided for 8bit samples only.<br />
--[[User:Arc|Arc]] 23:39, 10 Nov 2005 (PST)<br />
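<br />
For reference, the lossless 8-bit conversion mentioned above amounts to flipping the top bit of every byte. A minimal Python sketch (the function name is hypothetical):<br />
<br />
```python
def flip_sign_bit_8(data: bytes) -> bytes:
    """Convert 8-bit PCM between unsigned (midpoint 128) and signed (midpoint 0).

    XOR with 0x80 maps 0..255 onto the two's complement bytes for -128..127,
    and applying it a second time restores the original data.
    """
    return bytes(b ^ 0x80 for b in data)


signed = flip_sign_bit_8(bytes([0, 128, 255]))   # bytes for -128, 0, 127
restored = flip_sign_bit_8(signed)               # back to 0, 128, 255
```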
<br />
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
* Please don't make determination of the data format depend on multiple fields. Instead use an enumeration so that something like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16 and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64. This scheme is far more transparent and self documenting. If the format field is 8 bits, this scheme supports 256 formats; if it's 16 bits it will support 65536 formats.<br />
<br />
I also suggest leaving the format associated with a value of zero as an invalid format.<br />
--[[Erikd|Erikd]]<br />
* It would ''not'' support 256 formats. It would support the small set of formats that somebody bothered to define early on, and it would not be able to expand because many implementations would fail to follow the changing specification thereby forcing everybody to limit themselves to the initial set.<br />
--[[User:Gumboot|Gumboot]] 02:08, 10 Nov 2005 (PST)<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; float support is provided for 32bit and 64bit samples only.<br />
--[[User:Arc|Arc]] 23:40, 10 Nov 2005 (PST)<br />
<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly; we should probably settle on one for the data and stick with it. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Big and little endian data formats should both be supported with equal status. There should not even be a default; the endian-ness should be explicit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; an endian flag is provided separate from the data format, though it will not affect 8bit sample types.<br />
--[[User:Arc|Arc]] 23:42, 10 Nov 2005 (PST)<br />
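<br />
To illustrate how cheap the endian conversion discussed above is, here is a minimal Python sketch that reverses the bytes of every sample in a buffer. The function name is hypothetical, and a real plugin would more likely do this in C or with a vectorised library.<br />
<br />
```python
def swap_sample_endianness(data: bytes, sample_bytes: int) -> bytes:
    """Reverse the byte order of each fixed-width sample in a PCM buffer."""
    if sample_bytes < 1 or len(data) % sample_bytes != 0:
        raise ValueError("buffer is not a whole number of samples")
    out = bytearray(len(data))
    for i in range(0, len(data), sample_bytes):
        # Copy one sample with its bytes in reverse order.
        out[i:i + sample_bytes] = data[i:i + sample_bytes][::-1]
    return bytes(out)


swapped = swap_sample_endianness(b"\x01\x02\x03\x04", 2)
```
<br />
Swapping twice returns the original buffer, and 8 bit samples pass through unchanged, which matches the note that the endian flag does not apply to 8bit sample types.<br />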
<br />
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
* Agree<br />
--[[User:Conrad|Conrad]]<br />
<br />
* A comment header, identical to vorbis's comment header, has been added to the most recent draft [[OggPCM#Format|format]]<br />
--[[User:Arc|Arc]] 23:44, 10 Nov 2005 (PST)<br />
<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Many file formats (WAV, AIFF, AU and others) support 64 bit float data. WAV stores floats as little endian data and AIFF stores them as big endian data. OggPCM should support both 32 and 64 bit floats of both endian-nesses (is that a word?). I don't know of any other floating point format that needs consideration.<br />
--[[Erikd|Erikd]]<br />
<br />
* With the introduction of a data type lookup table in the most recent [[OggPCM#Format|format]], float types of sizes other than 32 or 64 bits are no longer available. If other sizes of float are needed they may be added in a future minor revision with an extended type.<br />
--[[User:Arc|Arc]] 23:46, 10 Nov 2005 (PST)<br />
<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* The occurrence of N bit PCM where N is not a multiple of 8 bits is so rare that it should probably be ignored. In addition, there really isn't any reason to treat 10 bit data packed into the 10 most significant bits of a 16 bit int any different from a real 16 bit value. So why make any distinction?<br />
--[[Erikd|Erikd]]<br />
<br />
* 10-bit values have a range of -512 to +511. When you shift them up the range is -32768 to 32704, so they need scaling if you want them to have their proper range in a normalised system.<br />
<br />
* Precisions that aren't a multiple of 8 bit aren't at all rare, but they're normally rounded off to a multiple for compatibility.<br />
--[[User:Gumboot|Gumboot]] 02:02, 10 Nov 2005 (PST)<br />
<br />
* Support for non-octet based sample sizes has been removed with the introduction of a data type table. We no longer need to worry about this topic.<br />
--[[User:Arc|Arc]] 23:48, 10 Nov 2005 (PST)<br />
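<br />
The range arithmetic quoted above for 10-bit samples can be checked with a short Python sketch (the function name is hypothetical):<br />
<br />
```python
def left_justify(sample, bits, container_bits=16):
    """Shift a signed sample of the given width into the top bits of a container."""
    return sample << (container_bits - bits)


low = left_justify(-512, 10)    # most negative 10-bit value
high = left_justify(511, 10)    # most positive 10-bit value
headroom = 32767 - high         # distance from the 16-bit positive peak
```
<br />
The shifted values span -32768 to 32704, leaving 63 steps unused below the 16-bit positive peak.<br />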
<br />
<br />
=== Do we want/need the 32-bit data packet header? ===<br />
* The issue was raised on the ogg-dev mailing list of whether this is necessary. With only a single header packet, it could be considered an unneeded complication; however, additional header packets (current or future) will make this a requirement.<br />
--[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* On UltraSparc and Alpha CPUs (both 64 bit) accessing a 64 bit double at an address that is not 8 byte aligned causes a segmentation fault. However, accessing unaligned doubles on x86 (ie 32 bit) is slower than accessing aligned doubles. You might want to consider this.<br />
--[[Erikd|Erikd]]<br />
<br />
* I cannot see why that data header is necessary. No other uncompressed audio format requires extra framing information, so I cannot see how future additional header fields would require to be added. It should be clear from the bos page how many samples go into a packet and thus this field is just complicating decoding with an extra parsing step IMHO.<br />
--[[User:Silvia|Silvia]]<br />
<br />
* This header is unnecessary. Ogg already provides packet framing, and the existing headers (BOS, comments) can be determined by sequence order. The BOS header already contains forwards compatibility versioning for extra header fields. Even if new headers were to be created, they could be indicated by an 'extra_headers' field in the BOS header, as is done in Speex.<br />
--[[User:Conrad|Conrad]]<br />
<br />
* This issue remains one of the few left contested; however, I believe that for uniformity with Vorbis and Theora, this is the correct method to identify packet types within the current version of the Ogg container.<br />
--[[User:Arc|Arc]] 23:51, 10 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=2019Talk:OggPCM Draft12005-11-11T07:52:05Z<p>Arc: /* Do we want/need the 32-bit data packet header? */</p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can easily and losslessly be converted to signed as the default. Unsigned 8-bit data (where 128 is the midpoint) is easily changed to signed, and changed back if being saved as RIFF/WAV (whose 8-bit samples are always unsigned). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec.<br />
--[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally.<br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
* I see no reason to support any unsigned PCM format other than 8 bit. For instance, I know of no container format which supports unsigned 16 bit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; unsigned support is provided for 8bit samples only.<br />
--[[User:Arc|Arc]] 23:39, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
* Please don't make determination of the data format depend on multiple fields. Instead use an enumeration so that something like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16 and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64. This scheme is far more transparent and self documenting. If the format field is 8 bits, this scheme supports 256 formats; if it's 16 bits it will support 65536 formats.<br />
<br />
I also suggest leaving the format associated with a value of zero as an invalid format.<br />
--[[Erikd|Erikd]]<br />
* It would ''not'' support 256 formats. It would support the small set of formats that somebody bothered to define early on, and it would not be able to expand because many implementations would fail to follow the changing specification thereby forcing everybody to limit themselves to the initial set.<br />
--[[User:Gumboot|Gumboot]] 02:08, 10 Nov 2005 (PST)<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; float support is provided for 32bit and 64bit samples only.<br />
--[[User:Arc|Arc]] 23:40, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly; we should probably settle on one for the data and stick with it. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Big and little endian data formats should both be supported with equal status. There should not even be a default; the endian-ness should be explicit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; an endian flag is provided separate from the data format, though it will not affect 8bit sample types.<br />
--[[User:Arc|Arc]] 23:42, 10 Nov 2005 (PST)<br />
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
* Agree<br />
--[[User:Conrad|Conrad]]<br />
<br />
* A comment header, identical to vorbis's comment header, has been added to the most recent draft [[OggPCM#Format|format]]<br />
--[[User:Arc|Arc]] 23:44, 10 Nov 2005 (PST)<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Many file formats (WAV, AIFF, AU and others) support 64 bit float data. WAV stores floats as little endian data and AIFF stores them as big endian data. OggPCM should support both 32 and 64 bit floats of both endian-nesses (is that a word?). I don't know of any other floating point format that needs consideration.<br />
--[[Erikd|Erikd]]<br />
<br />
* With the introduction of a data type lookup table in the most recent [[OggPCM#Format|format]], float types of sizes other than 32 or 64 bits are no longer available. If other sizes of float are needed they may be added in a future minor revision with an extended type.<br />
--[[User:Arc|Arc]] 23:46, 10 Nov 2005 (PST)<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* The occurrence of N bit PCM where N is not a multiple of 8 bits is so rare that it should probably be ignored. In addition, there really isn't any reason to treat 10 bit data packed into the 10 most significant bits of a 16 bit int any different from a real 16 bit value. So why make any distinction?<br />
--[[Erikd|Erikd]]<br />
<br />
* 10-bit values have a range of -512 to +511. When you shift them up the range is -32768 to 32704, so they need scaling if you want them to have their proper range in a normalised system.<br />
<br />
* Precisions that aren't a multiple of 8 bit aren't at all rare, but they're normally rounded off to a multiple for compatibility.<br />
--[[User:Gumboot|Gumboot]] 02:02, 10 Nov 2005 (PST)<br />
<br />
* Support for non-octet based sample sizes has been removed with the introduction of a data type table. We no longer need to worry about this topic.<br />
--[[User:Arc|Arc]] 23:48, 10 Nov 2005 (PST)<br />
<br />
=== Do we want/need the 32-bit data packet header? ===<br />
* The issue was raised on the ogg-dev mailing list of whether this is necessary. With only a single header packet, it could be considered an unneeded complication; however, additional header packets (current or future) will make this a requirement.<br />
--[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* On UltraSparc and Alpha CPUs (both 64 bit) accessing a 64 bit double at an address that is not 8 byte aligned causes a segmentation fault. However, accessing unaligned doubles on x86 (ie 32 bit) is slower than accessing aligned doubles. You might want to consider this.<br />
--[[Erikd|Erikd]]<br />
<br />
* I cannot see why that data header is necessary. No other uncompressed audio format requires extra framing information, so I cannot see how future additional header fields would require to be added. It should be clear from the bos page how many samples go into a packet and thus this field is just complicating decoding with an extra parsing step IMHO.<br />
--[[User:Silvia|Silvia]]<br />
<br />
* This header is unnecessary. Ogg already provides packet framing, and the existing headers (BOS, comments) can be determined by sequence order. The BOS header already contains forwards compatibility versioning for extra header fields. Even if new headers were to be created, they could be indicated by an 'extra_headers' field in the BOS header, as is done in Speex.<br />
--[[User:Conrad|Conrad]]<br />
<br />
* This issue remains one of few left contested, however, I believe that for uniformity with Vorbis and Theora, this is the correct method to identify packet types within the current version of the Ogg container.<br />
--[[User:Arc|Arc]] 23:51, 10 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=2018Talk:OggPCM Draft12005-11-11T07:51:33Z<p>Arc: /* Do we want/need the 32-bit data packet header? */</p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can easily and losslessly be converted to signed as the default. Unsigned 8-bit data (where 128 is the midpoint) is easily changed to signed, and changed back if being saved as RIFF/WAV (whose 8-bit samples are always unsigned). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec.<br />
--[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally.<br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
* I see no reason to support any unsigned PCM format other than 8 bit. For instance, I know of no container format which supports unsigned 16 bit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; unsigned support is provided for 8bit samples only.<br />
--[[User:Arc|Arc]] 23:39, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
* Please don't make determination of the data format depend on multiple fields. Instead use an enumeration so that something like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16 and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64. This scheme is far more transparent and self documenting. If the format field is 8 bits, this scheme supports 256 formats; if it's 16 bits it will support 65536 formats.<br />
<br />
I also suggest leaving the format associated with a value of zero as an invalid format.<br />
--[[Erikd|Erikd]]<br />
* It would ''not'' support 256 formats. It would support the small set of formats that somebody bothered to define early on, and it would not be able to expand because many implementations would fail to follow the changing specification thereby forcing everybody to limit themselves to the initial set.<br />
--[[User:Gumboot|Gumboot]] 02:08, 10 Nov 2005 (PST)<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; float support is provided for 32bit and 64bit samples only.<br />
--[[User:Arc|Arc]] 23:40, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly; we should probably settle on one for the data and stick with it. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Big and little endian data formats should both be supported with equal status. There should not even be a default; the endian-ness should be explicit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; an endian flag is provided separate from the data format, though it will not affect 8bit sample types.<br />
--[[User:Arc|Arc]] 23:42, 10 Nov 2005 (PST)<br />
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
* Agree<br />
--[[User:Conrad|Conrad]]<br />
<br />
* A comment header, identical to vorbis's comment header, has been added to the most recent draft [[OggPCM#Format|format]]<br />
--[[User:Arc|Arc]] 23:44, 10 Nov 2005 (PST)<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Many file formats (WAV, AIFF, AU and others) support 64 bit float data. WAV stores floats as little endian data and AIFF stores them as big endian data. OggPCM should support both 32 and 64 bit floats of both endian-nesses (is that a word?). I don't know of any other floating point format that needs consideration.<br />
--[[Erikd|Erikd]]<br />
<br />
* With the introduction of a data type lookup table in the most recent [[OggPCM#Format|format]], float types of sizes other than 32 or 64 bits are no longer available. If other sizes of float are needed they may be added in a future minor revision with an extended type.<br />
--[[User:Arc|Arc]] 23:46, 10 Nov 2005 (PST)<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* The occurrence of N bit PCM where N is not a multiple of 8 bits is so rare that it should probably be ignored. In addition, there really isn't any reason to treat 10 bit data packed into the 10 most significant bits of a 16 bit int any different from a real 16 bit value. So why make any distinction?<br />
--[[Erikd|Erikd]]<br />
<br />
* 10-bit values have a range of -512 to +511. When you shift them up the range is -32768 to 32704, so they need scaling if you want them to have their proper range in a normalised system.<br />
<br />
* Precisions that aren't a multiple of 8 bit aren't at all rare, but they're normally rounded off to a multiple for compatibility.<br />
--[[User:Gumboot|Gumboot]] 02:02, 10 Nov 2005 (PST)<br />
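The range figures quoted above can be checked with a small sketch. The helper name <code>widen_10_to_16</code> is hypothetical and not part of any OggPCM draft; it only illustrates packing a signed 10-bit sample into the most significant bits of a 16-bit word.<br />

```python
def widen_10_to_16(sample):
    """Place a signed 10-bit sample in the top 10 bits of a 16-bit word."""
    if not -512 <= sample <= 511:
        raise ValueError("not a signed 10-bit value")
    # Multiplying by 64 is the same as shifting left by 6 bits.
    return sample * 64


# Extremes of the 10-bit range after widening.
print(widen_10_to_16(-512), widen_10_to_16(511))  # -32768 32704
```

The positive extreme lands on 32704 rather than 32767, which is the gap the comment above refers to when it says the shifted values need scaling in a normalised system.<br />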
<br />
* Support for non-octet based sample sizes has been removed with the introduction of a data type table. We no longer need to worry about this topic.<br />
--[[User:Arc|Arc]] 23:48, 10 Nov 2005 (PST)<br />
<br />
== Do we want/need the 32-bit data packet header? ==<br />
* The question of whether this is necessary was raised on the ogg-dev mailing list. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement.<br />
--[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* On UltraSparc and Alpha CPUs (both 64 bit) accessing a 64 bit double at an address that is not 8 byte aligned causes a segmentation fault. However, accessing unaligned doubles on x86 (ie 32 bit) is slower than accessing aligned doubles. You might want to consider this.<br />
--[[Erikd|Erikd]]<br />
<br />
* I cannot see why that data header is necessary. No other uncompressed audio format requires extra framing information, so I cannot see how future additional header fields would require to be added. It should be clear from the bos page how many samples go into a packet and thus this field is just complicating decoding with an extra parsing step IMHO.<br />
--[[User:Silvia|Silvia]]<br />
<br />
* This header is unnecessary. Ogg already provides packet framing, and the existing headers (BOS, comments) can be determined by sequence order. The BOS header already contains forwards compatibility versioning for extra header fields. Even if new headers were to be created, they could be indicated by an 'extra_headers' field in the BOS header, as is done in Speex.<br />
--[[User:Conrad|Conrad]]<br />
<br />
* This issue remains one of few left contested, however, I believe that for uniformity with Vorbis and Theora, this is the correct method to identify packet types within the current version of the Ogg container.<br />
--[[User:Arc|Arc]] 23:51, 10 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=2017Talk:OggPCM Draft12005-11-11T07:48:43Z<p>Arc: /* Are samples padded to some round number of bits? */</p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can easily and losslessly be changed to signed as the default. Unsigned 8-bit data (where 128 is the midpoint) is easily changed to signed, and changed back if being saved as RIFF/WAV (whose 8-bit samples are always unsigned). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec.<br />
--[[User:Arc|Arc]]<br />
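A minimal sketch of the lossless 8-bit conversion described above, assuming the usual offset-binary convention where 128 represents silence. The function names are hypothetical and only for illustration.<br />

```python
def u8_to_s8(data):
    """Convert unsigned 8-bit samples (bytes) to signed integers."""
    return [b - 128 for b in data]


def s8_to_u8(samples):
    """Convert signed 8-bit samples (-128..127) back to unsigned bytes."""
    return bytes(s + 128 for s in samples)


original = bytes([0, 128, 255])
signed = u8_to_s8(original)
print(signed)  # [-128, 0, 127]
# The round trip returns the original bytes, so nothing is lost.
print(s8_to_u8(signed) == original)  # True
```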
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally.<br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
* I see no reason to support any unsigned PCM format other than 8 bit. For instance, I know of no container format which supports unsigned 16 bit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; unsigned support is provided for 8bit samples only.<br />
--[[User:Arc|Arc]] 23:39, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
* Please don't make determination of the data format depend on multiple fields. Instead use an enumeration so that something like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16 and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64. This scheme is far more transparent and self documenting. If the format field is 8 bits, this scheme supports 256 formats; if it's 16 bits it will support 65536 formats.<br />
<br />
I also suggest leaving the format associated with a value of zero as an invalid format.<br />
--[[Erikd|Erikd]]<br />
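The enumeration idea could look roughly like the sketch below. Only the two identifiers OGG_PCM_LE_PCM_16 and OGG_PCM_BE_FLOAT_64 come from the comment above; the numeric values, the class name <code>OggPcmFormat</code> and the helper <code>parse_format</code> are assumptions made for illustration and do not reflect any actual draft.<br />

```python
from enum import IntEnum


class OggPcmFormat(IntEnum):
    """Hypothetical single-field format codes; values are placeholders."""
    INVALID = 0               # zero is deliberately left as an invalid format
    OGG_PCM_LE_PCM_16 = 1     # little endian 16 bit integer PCM
    OGG_PCM_BE_FLOAT_64 = 2   # big endian 64 bit float


def parse_format(code):
    """Map a raw format code to an enum member, rejecting 0 and unknown codes."""
    try:
        fmt = OggPcmFormat(code)
    except ValueError:
        raise ValueError(f"unknown format code {code}") from None
    if fmt is OggPcmFormat.INVALID:
        raise ValueError("format code 0 is reserved as invalid")
    return fmt


print(parse_format(1).name)  # OGG_PCM_LE_PCM_16
```

A decoder built this way rejects unknown codes explicitly instead of guessing from a combination of fields.<br />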
* It would ''not'' support 256 formats. It would support the small set of formats that somebody bothered to define early on, and it would not be able to expand because many implementations would fail to follow the changing specification thereby forcing everybody to limit themselves to the initial set.<br />
--[[User:Gumboot|Gumboot]] 02:08, 10 Nov 2005 (PST)<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; float support is provided for 32bit and 64bit samples only.<br />
--[[User:Arc|Arc]] 23:40, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly; we should probably settle on one for the data and stick with it. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
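To illustrate why the byte order can be changed losslessly, here is a small sketch for 16-bit samples using Python's standard <code>struct</code> module. The function name <code>swap_endian_16</code> is hypothetical; a real application would more likely do this in C or through the bitpacker.<br />

```python
import struct


def swap_endian_16(data):
    """Reverse the byte order of every 16-bit sample in a buffer."""
    if len(data) % 2:
        raise ValueError("buffer length must be a multiple of 2")
    count = len(data) // 2
    # Read the samples in one byte order and write them back in the other.
    samples = struct.unpack(f"<{count}h", data)
    return struct.pack(f">{count}h", *samples)


little = struct.pack("<3h", -2, 0, 258)
big = swap_endian_16(little)
print(big == struct.pack(">3h", -2, 0, 258))  # True
# Swapping twice restores the original buffer.
print(swap_endian_16(big) == little)  # True
```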
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Big and little endian data formats should both be supported with equal status. There should not even be a default; the endian-ness should be explicit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; an endian flag is provided separate from the data format, though it will not affect 8bit sample types.<br />
--[[User:Arc|Arc]] 23:42, 10 Nov 2005 (PST)<br />
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
* Agree<br />
--[[User:Conrad|Conrad]]<br />
<br />
* A comment header, identical to vorbis's comment header, has been added to the most recent draft [[OggPCM#Format|format]]<br />
--[[User:Arc|Arc]] 23:44, 10 Nov 2005 (PST)<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Many file formats (WAV, AIFF, AU and others) support 64 bit float data. WAV stores floats as little endian data and AIFF stores them as big endian data. OggPCM should support both 32 and 64 bit floats of both endian-nesses (is that a word?). I don't know of any other floating point format that needs consideration.<br />
--[[Erikd|Erikd]]<br />
<br />
* With the introduction of a data type lookup table in the most recent [[OggPCM#Format|format]], float types of any size other than 32 or 64 bits are no longer available. If other float sizes are needed, they may be added in a future minor revision with an extended type.<br />
--[[User:Arc|Arc]] 23:46, 10 Nov 2005 (PST)<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* The occurrence of N bit PCM where N is not a multiple of 8 bits is so rare that it should probably be ignored. In addition, there really isn't any reason to treat 10 bit data packed into the 10 most significant bits of a 16 bit int any different from a real 16 bit value. So why make any distinction?<br />
--[[Erikd|Erikd]]<br />
<br />
* 10-bit values have a range of -512 to +511. When you shift them up the range is -32768 to 32704, so they need scaling if you want them to have their proper range in a normalised system.<br />
<br />
* Precisions that aren't a multiple of 8 bit aren't at all rare, but they're normally rounded off to a multiple for compatibility.<br />
--[[User:Gumboot|Gumboot]] 02:02, 10 Nov 2005 (PST)<br />
<br />
* Support for non-octet based sample sizes has been removed with the introduction of a data type table. We no longer need to worry about this topic.<br />
--[[User:Arc|Arc]] 23:48, 10 Nov 2005 (PST)<br />
<br />
== Do we want/need the 32-bit data packet header? ==<br />
* The question of whether this is necessary was raised on the ogg-dev mailing list. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement. --[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* On UltraSparc and Alpha CPUs (both 64 bit) accessing a 64 bit double at an address that is not 8 byte aligned causes a segmentation fault. However, accessing unaligned doubles on x86 (ie 32 bit) is slower than accessing aligned doubles. You might want to consider this.<br />
--[[Erikd|Erikd]]<br />
<br />
* I cannot see why that data header is necessary. No other uncompressed audio format requires extra framing information, so I cannot see how future additional header fields would require to be added. It should be clear from the bos page how many samples go into a packet and thus this field is just complicating decoding with an extra parsing step IMHO.<br />
--[[User:Silvia|Silvia]]<br />
<br />
* This header is unnecessary. Ogg already provides packet framing, and the existing headers (BOS, comments) can be determined by sequence order. The BOS header already contains forwards compatibility versioning for extra header fields. Even if new headers were to be created, they could be indicated by an 'extra_headers' field in the BOS header, as is done in Speex.<br />
--[[User:Conrad|Conrad]]</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=2016Talk:OggPCM Draft12005-11-11T07:46:37Z<p>Arc: /* How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? */</p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can easily and losslessly be changed to signed as the default. Unsigned 8-bit data (where 128 is the midpoint) is easily changed to signed, and changed back if being saved as RIFF/WAV (whose 8-bit samples are always unsigned). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec.<br />
--[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally.<br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
* I see no reason to support any unsigned PCM format other than 8 bit. For instance, I know of no container format which supports unsigned 16 bit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; unsigned support is provided for 8bit samples only.<br />
--[[User:Arc|Arc]] 23:39, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
* Please don't make determination of the data format depend on multiple fields. Instead use an enumeration so that something like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16 and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64. This scheme is far more transparent and self documenting. If the format field is 8 bits, this scheme supports 256 formats; if it's 16 bits it will support 65536 formats.<br />
<br />
I also suggest leaving the format associated with a value of zero as an invalid format.<br />
--[[Erikd|Erikd]]<br />
* It would ''not'' support 256 formats. It would support the small set of formats that somebody bothered to define early on, and it would not be able to expand because many implementations would fail to follow the changing specification thereby forcing everybody to limit themselves to the initial set.<br />
--[[User:Gumboot|Gumboot]] 02:08, 10 Nov 2005 (PST)<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; float support is provided for 32bit and 64bit samples only.<br />
--[[User:Arc|Arc]] 23:40, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly; we should probably settle on one for the data and stick with it. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Big and little endian data formats should both be supported with equal status. There should not even be a default; the endian-ness should be explicit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; an endian flag is provided separate from the data format, though it will not affect 8bit sample types.<br />
--[[User:Arc|Arc]] 23:42, 10 Nov 2005 (PST)<br />
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
* Agree<br />
--[[User:Conrad|Conrad]]<br />
<br />
* A comment header, identical to vorbis's comment header, has been added to the most recent draft [[OggPCM#Format|format]]<br />
--[[User:Arc|Arc]] 23:44, 10 Nov 2005 (PST)<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Many file formats (WAV, AIFF, AU and others) support 64 bit float data. WAV stores floats as little endian data and AIFF stores them as big endian data. OggPCM should support both 32 and 64 bit floats of both endian-nesses (is that a word?). I don't know of any other floating point format that needs consideration.<br />
--[[Erikd|Erikd]]<br />
<br />
* With the introduction of a data type lookup table in the most recent [[OggPCM#Format|format]], float types of any size other than 32 or 64 bits are no longer available. If other float sizes are needed, they may be added in a future minor revision with an extended type.<br />
--[[User:Arc|Arc]] 23:46, 10 Nov 2005 (PST)<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* The occurrence of N bit PCM where N is not a multiple of 8 bits is so rare that it should probably be ignored. In addition, there really isn't any reason to treat 10 bit data packed into the 10 most significant bits of a 16 bit int any different from a real 16 bit value. So why make any distinction?<br />
--[[Erikd|Erikd]]<br />
<br />
* 10-bit values have a range of -512 to +511. When you shift them up the range is -32768 to 32704, so they need scaling if you want them to have their proper range in a normalised system.<br />
* Precisions that aren't a multiple of 8 bit aren't at all rare, but they're normally rounded off to a multiple for compatibility.<br />
--[[User:Gumboot|Gumboot]] 02:02, 10 Nov 2005 (PST)<br />
<br />
== Do we want/need the 32-bit data packet header? ==<br />
* The question of whether this is necessary was raised on the ogg-dev mailing list. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement. --[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* On UltraSparc and Alpha CPUs (both 64 bit) accessing a 64 bit double at an address that is not 8 byte aligned causes a segmentation fault. However, accessing unaligned doubles on x86 (ie 32 bit) is slower than accessing aligned doubles. You might want to consider this.<br />
--[[Erikd|Erikd]]<br />
<br />
* I cannot see why that data header is necessary. No other uncompressed audio format requires extra framing information, so I cannot see how future additional header fields would require to be added. It should be clear from the bos page how many samples go into a packet and thus this field is just complicating decoding with an extra parsing step IMHO.<br />
--[[User:Silvia|Silvia]]<br />
<br />
* This header is unnecessary. Ogg already provides packet framing, and the existing headers (BOS, comments) can be determined by sequence order. The BOS header already contains forwards compatibility versioning for extra header fields. Even if new headers were to be created, they could be indicated by an 'extra_headers' field in the BOS header, as is done in Speex.<br />
--[[User:Conrad|Conrad]]</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=2015Talk:OggPCM Draft12005-11-11T07:44:41Z<p>Arc: /* Is it worth supporting a vorbiscomment header? */</p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can easily and losslessly be changed to signed as the default. Unsigned 8-bit data (where 128 is the midpoint) is easily changed to signed, and changed back if being saved as RIFF/WAV (whose 8-bit samples are always unsigned). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec.<br />
--[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally.<br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
* I see no reason to support any unsigned PCM format other than 8 bit. For instance, I know of no container format which supports unsigned 16 bit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; unsigned support is provided for 8bit samples only.<br />
--[[User:Arc|Arc]] 23:39, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
* Please don't make determination of the data format depend on multiple fields. Instead use an enumeration so that something like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16 and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64. This scheme is far more transparent and self documenting. If the format field is 8 bits, this scheme supports 256 formats; if it's 16 bits it will support 65536 formats.<br />
<br />
I also suggest leaving the format associated with a value of zero as an invalid format.<br />
--[[Erikd|Erikd]]<br />
* It would ''not'' support 256 formats. It would support the small set of formats that somebody bothered to define early on, and it would not be able to expand because many implementations would fail to follow the changing specification thereby forcing everybody to limit themselves to the initial set.<br />
--[[User:Gumboot|Gumboot]] 02:08, 10 Nov 2005 (PST)<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; float support is provided for 32bit and 64bit samples only.<br />
--[[User:Arc|Arc]] 23:40, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly; we should probably settle on one for the data and stick with it. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Big and little endian data formats should both be supported with equal status. There should not even be a default; the endian-ness should be explicit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; an endian flag is provided separate from the data format, though it will not affect 8bit sample types.<br />
--[[User:Arc|Arc]] 23:42, 10 Nov 2005 (PST)<br />
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
* Agree<br />
--[[User:Conrad|Conrad]]<br />
<br />
* A comment header, identical to vorbis's comment header, has been added to the most recent draft [[OggPCM#Format|format]]<br />
--[[User:Arc|Arc]] 23:44, 10 Nov 2005 (PST)<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Many file formats (WAV, AIFF, AU and others) support 64 bit float data. WAV stores floats as little endian data and AIFF stores them as big endian data. OggPCM should support both 32 and 64 bit floats of both endian-nesses (is that a word?). I don't know of any other floating point format that needs consideration.<br />
--[[Erikd|Erikd]]<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* The occurrence of N bit PCM where N is not a multiple of 8 bits is so rare that it should probably be ignored. In addition, there really isn't any reason to treat 10 bit data packed into the 10 most significant bits of a 16 bit int any different from a real 16 bit value. So why make any distinction?<br />
--[[Erikd|Erikd]]<br />
<br />
* 10-bit values have a range of -512 to +511. When you shift them up the range is -32768 to 32704, so they need scaling if you want them to have their proper range in a normalised system.<br />
* Precisions that aren't a multiple of 8 bit aren't at all rare, but they're normally rounded off to a multiple for compatibility.<br />
--[[User:Gumboot|Gumboot]] 02:02, 10 Nov 2005 (PST)<br />
<br />
== Do we want/need the 32-bit data packet header? ==<br />
* The question of whether this is necessary was raised on the ogg-dev mailing list. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement. --[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* On UltraSparc and Alpha CPUs (both 64 bit) accessing a 64 bit double at an address that is not 8 byte aligned causes a segmentation fault. However, accessing unaligned doubles on x86 (ie 32 bit) is slower than accessing aligned doubles. You might want to consider this.<br />
--[[Erikd|Erikd]]<br />
<br />
* I cannot see why that data header is necessary. No other uncompressed audio format requires extra framing information, so I cannot see how future additional header fields would require to be added. It should be clear from the bos page how many samples go into a packet and thus this field is just complicating decoding with an extra parsing step IMHO.<br />
--[[User:Silvia|Silvia]]<br />
<br />
* This header is unnecessary. Ogg already provides packet framing, and the existing headers (BOS, comments) can be determined by sequence order. The BOS header already contains forward-compatibility versioning for extra header fields. Even if new headers were to be created, they could be indicated by an 'extra_headers' field in the BOS header, as is done in Speex.<br />
--[[User:Conrad|Conrad]]</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=2014Talk:OggPCM Draft12005-11-11T07:42:39Z<p>Arc: /* Do we need to offer endian data flag? If not, which is used? */</p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can be easily changed to signed as default losslessly. Unsigned 8-bit data (where 128 is the median) is easily changed to signed, and changed back if being saved as RIFF/WAV (which only supports unsigned 8-bit). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec.<br />
--[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally.<br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
* I see no reason to support any unsigned PCM format other than 8 bit. For instance, I know of no container format which supports unsigned 16 bit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; unsigned support is provided for 8bit samples only.<br />
--[[User:Arc|Arc]] 23:39, 10 Nov 2005 (PST)<br />
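The lossless signed/unsigned conversion for 8-bit data mentioned in this thread can be sketched as follows. This is only an illustration; the function names are hypothetical and are not part of any OggPCM draft or library.

```python
def u8_to_s8(samples):
    """Map unsigned 8-bit samples (midpoint 128) to signed values in -128..127."""
    return [b - 128 for b in samples]


def s8_to_u8(samples):
    """Inverse mapping, back to the unsigned form used by 8-bit RIFF/WAV."""
    return bytes(s + 128 for s in samples)


raw = bytes([0, 127, 128, 255])
signed = u8_to_s8(raw)          # [-128, -1, 0, 127]
assert s8_to_u8(signed) == raw  # the round trip loses nothing
```

Because the mapping is a fixed offset of 128, either representation can be recovered exactly from the other.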
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
* Please don't make determination of the data format depend on multiple fields. Instead use an enumeration, so that something like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16 and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64. This scheme is far more transparent and self-documenting. If the format field is 8 bits, this scheme supports 256 formats; if it is 16 bits, it supports 65536 formats.<br />
<br />
I also suggest leaving the format associated with a value of zero as an invalid format.<br />
--[[Erikd|Erikd]]<br />
* It would ''not'' support 256 formats. It would support the small set of formats that somebody bothered to define early on, and it would not be able to expand because many implementations would fail to follow the changing specification thereby forcing everybody to limit themselves to the initial set.<br />
--[[User:Gumboot|Gumboot]] 02:08, 10 Nov 2005 (PST)<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; float support is provided for 32bit and 64bit samples only.<br />
--[[User:Arc|Arc]] 23:40, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly, so we should probably settle on one byte order for the data and stick with it. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross-architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Big and little endian data formats should both be supported with equal status. There should not even be a default; the endian-ness should be explicit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; an endian flag is provided separately from the data format, though it will not affect 8bit sample types.<br />
--[[User:Arc|Arc]] 23:42, 10 Nov 2005 (PST)<br />
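To illustrate the point that byte order can be converted losslessly and cheaply on the application side, here is a minimal sketch for 16-bit signed samples using Python's standard struct module. The function name is hypothetical; a real plugin would more likely do this in C on whole buffers.

```python
import struct


def swap_endian_16(data):
    """Convert little endian 16-bit signed samples to big endian byte order."""
    if len(data) % 2:
        raise ValueError("buffer does not hold a whole number of 16-bit samples")
    count = len(data) // 2
    values = struct.unpack("<%dh" % count, data)   # read as little endian
    return struct.pack(">%dh" % count, *values)    # write as big endian


le = b"\x01\x02\x03\x04"
be = swap_endian_16(le)
assert be == b"\x02\x01\x04\x03"
```

Since the operation only reorders bytes within each sample, applying the same swap to the result restores the original buffer.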
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
* Agree<br />
--[[User:Conrad|Conrad]]<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Many file formats (WAV, AIFF, AU and others) support 64 bit float data. WAV stores floats as little endian data and AIFF stores it as big endian data. OggPCM should support both 32 and 64 bit floats of both endian-nesses (is that a word?). I don't know of any other floating point format that needs consideration.<br />
--[[Erikd|Erikd]]<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* The occurrence of N bit PCM where N is not a multiple of 8 bits is so rare that it should probably be ignored. In addition, there really isn't any reason to treat 10 bit data packed into the 10 most significant bits of a 16 bit int any different from a real 16 bit value. So why make any distinction?<br />
--[[Erikd|Erikd]]<br />
<br />
* 10-bit values have a range of -512 to +511. When you shift them up the range is -32768 to 32704, so they need scaling if you want them to have their proper range in a normalised system.<br />
* Precisions that aren't a multiple of 8 bit aren't at all rare, but they're normally rounded off to a multiple for compatibility.<br />
--[[User:Gumboot|Gumboot]] 02:02, 10 Nov 2005 (PST)<br />
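The range arithmetic quoted above can be checked with a short sketch. The helper name is hypothetical, and the code only reproduces the numbers; it does not settle whether a format should distinguish padded 10-bit data from true 16-bit data.

```python
def pad_to_16(sample_10bit):
    """Place a signed 10-bit sample in the top 10 bits of a 16-bit word."""
    if not -512 <= sample_10bit <= 511:
        raise ValueError("not a signed 10-bit value")
    return sample_10bit << 6   # 16 - 10 = 6 low bits are left as zero


low, high = pad_to_16(-512), pad_to_16(511)
shortfall = 32767 - high       # distance from 16-bit positive full scale
assert (low, high, shortfall) == (-32768, 32704, 63)
```

The negative extreme lands exactly on the 16-bit minimum, while the positive extreme stops 63 steps short of 32767, which is the difference the comment refers to.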
<br />
== Do we want/need the 32-bit data packet header? ==<br />
* The question of whether this is necessary was raised on the ogg-dev mailing list. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement. --[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* On UltraSparc and Alpha CPUs (both 64 bit) accessing a 64 bit double at an address that is not 8 byte aligned causes a segmentation fault. However, accessing unaligned doubles on x86 (ie 32 bit) is slower than accessing aligned doubles. You might want to consider this.<br />
--[[Erikd|Erikd]]<br />
<br />
* I cannot see why that data header is necessary. No other uncompressed audio format requires extra framing information, so I cannot see how future additional header fields would require to be added. It should be clear from the bos page how many samples go into a packet and thus this field is just complicating decoding with an extra parsing step IMHO.<br />
--[[User:Silvia|Silvia]]<br />
<br />
* This header is unnecessary. Ogg already provides packet framing, and the existing headers (BOS, comments) can be determined by sequence order. The BOS header already contains forward-compatibility versioning for extra header fields. Even if new headers were to be created, they could be indicated by an 'extra_headers' field in the BOS header, as is done in Speex.<br />
--[[User:Conrad|Conrad]]</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=2013Talk:OggPCM Draft12005-11-11T07:40:43Z<p>Arc: /* Do we need to record int/float data flag? */</p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can be easily changed to signed as default losslessly. Unsigned 8-bit data (where 128 is the median) is easily changed to signed, and changed back if being saved as RIFF/WAV (which only supports unsigned 8-bit). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec.<br />
--[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally.<br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
* I see no reason to support any unsigned PCM format other than 8 bit. For instance, I know of no container format which supports unsigned 16 bit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; unsigned support is provided for 8bit samples only.<br />
--[[User:Arc|Arc]] 23:39, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
* Please don't make determination of the data format depend on multiple fields. Instead use an enumeration, so that something like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16 and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64. This scheme is far more transparent and self-documenting. If the format field is 8 bits, this scheme supports 256 formats; if it is 16 bits, it supports 65536 formats.<br />
<br />
I also suggest leaving the format associated with a value of zero as an invalid format.<br />
--[[Erikd|Erikd]]<br />
* It would ''not'' support 256 formats. It would support the small set of formats that somebody bothered to define early on, and it would not be able to expand because many implementations would fail to follow the changing specification thereby forcing everybody to limit themselves to the initial set.<br />
--[[User:Gumboot|Gumboot]] 02:08, 10 Nov 2005 (PST)<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; float support is provided for 32bit and 64bit samples only.<br />
--[[User:Arc|Arc]] 23:40, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly, so we should probably settle on one byte order for the data and stick with it. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross-architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Big and little endian data formats should both be supported with equal status. There should not even be a default; the endian-ness should be explicit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
* Agree<br />
--[[User:Conrad|Conrad]]<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Many file formats (WAV, AIFF, AU and others) support 64 bit float data. WAV stores floats as little endian data and AIFF stores it as big endian data. OggPCM should support both 32 and 64 bit floats of both endian-nesses (is that a word?). I don't know of any other floating point format that needs consideration.<br />
--[[Erikd|Erikd]]<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* The occurrence of N bit PCM where N is not a multiple of 8 bits is so rare that it should probably be ignored. In addition, there really isn't any reason to treat 10 bit data packed into the 10 most significant bits of a 16 bit int any different from a real 16 bit value. So why make any distinction?<br />
--[[Erikd|Erikd]]<br />
<br />
* 10-bit values have a range of -512 to +511. When you shift them up the range is -32768 to 32704, so they need scaling if you want them to have their proper range in a normalised system.<br />
* Precisions that aren't a multiple of 8 bit aren't at all rare, but they're normally rounded off to a multiple for compatibility.<br />
--[[User:Gumboot|Gumboot]] 02:02, 10 Nov 2005 (PST)<br />
<br />
== Do we want/need the 32-bit data packet header? ==<br />
* The question of whether this is necessary was raised on the ogg-dev mailing list. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement. --[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* On UltraSparc and Alpha CPUs (both 64 bit) accessing a 64 bit double at an address that is not 8 byte aligned causes a segmentation fault. However, accessing unaligned doubles on x86 (ie 32 bit) is slower than accessing aligned doubles. You might want to consider this.<br />
--[[Erikd|Erikd]]<br />
<br />
* I cannot see why that data header is necessary. No other uncompressed audio format requires extra framing information, so I cannot see how future additional header fields would require to be added. It should be clear from the bos page how many samples go into a packet and thus this field is just complicating decoding with an extra parsing step IMHO.<br />
--[[User:Silvia|Silvia]]<br />
<br />
* This header is unnecessary. Ogg already provides packet framing, and the existing headers (BOS, comments) can be determined by sequence order. The BOS header already contains forward-compatibility versioning for extra header fields. Even if new headers were to be created, they could be indicated by an 'extra_headers' field in the BOS header, as is done in Speex.<br />
--[[User:Conrad|Conrad]]</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=2012Talk:OggPCM Draft12005-11-11T07:39:11Z<p>Arc: /* Do we need signed/unsigned data flag? */</p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can be easily changed to signed as default losslessly. Unsigned 8-bit data (where 128 is the median) is easily changed to signed, and changed back if being saved as RIFF/WAV (which only supports unsigned 8-bit). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec.<br />
--[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally.<br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
* I see no reason to support any unsigned PCM format other than 8 bit. For instance, I know of no container format which supports unsigned 16 bit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
* This issue has been resolved in the most recent [[OggPCM#Format|Format]] draft; unsigned support is provided for 8bit samples only.<br />
--[[User:Arc|Arc]] 23:39, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
* Please don't make determination of the data format depend on multiple fields. Instead use an enumeration, so that something like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16 and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64. This scheme is far more transparent and self-documenting. If the format field is 8 bits, this scheme supports 256 formats; if it is 16 bits, it supports 65536 formats.<br />
<br />
I also suggest leaving the format associated with a value of zero as an invalid format.<br />
--[[Erikd|Erikd]]<br />
* It would ''not'' support 256 formats. It would support the small set of formats that somebody bothered to define early on, and it would not be able to expand because many implementations would fail to follow the changing specification thereby forcing everybody to limit themselves to the initial set.<br />
--[[User:Gumboot|Gumboot]] 02:08, 10 Nov 2005 (PST)<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly, so we should probably settle on one byte order for the data and stick with it. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross-architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Big and little endian data formats should both be supported with equal status. There should not even be a default; the endian-ness should be explicit.<br />
--[[User:Erikd|Erikd]]<br />
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
* Agree<br />
--[[User:Conrad|Conrad]]<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* Many file formats (WAV, AIFF, AU and others) support 64 bit float data. WAV stores floats as little endian data and AIFF stores it as big endian data. OggPCM should support both 32 and 64 bit floats of both endian-nesses (is that a word?). I don't know of any other floating point format that needs consideration.<br />
--[[Erikd|Erikd]]<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* The occurrence of N bit PCM where N is not a multiple of 8 bits is so rare that it should probably be ignored. In addition, there really isn't any reason to treat 10 bit data packed into the 10 most significant bits of a 16 bit int any different from a real 16 bit value. So why make any distinction?<br />
--[[Erikd|Erikd]]<br />
<br />
* 10-bit values have a range of -512 to +511. When you shift them up the range is -32768 to 32704, so they need scaling if you want them to have their proper range in a normalised system.<br />
* Precisions that aren't a multiple of 8 bit aren't at all rare, but they're normally rounded off to a multiple for compatibility.<br />
--[[User:Gumboot|Gumboot]] 02:02, 10 Nov 2005 (PST)<br />
<br />
== Do we want/need the 32-bit data packet header? ==<br />
* The question of whether this is necessary was raised on the ogg-dev mailing list. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement. --[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
* On UltraSparc and Alpha CPUs (both 64 bit) accessing a 64 bit double at an address that is not 8 byte aligned causes a segmentation fault. However, accessing unaligned doubles on x86 (ie 32 bit) is slower than accessing aligned doubles. You might want to consider this.<br />
--[[Erikd|Erikd]]<br />
<br />
* I cannot see why that data header is necessary. No other uncompressed audio format requires extra framing information, so I cannot see how future additional header fields would require to be added. It should be clear from the bos page how many samples go into a packet and thus this field is just complicating decoding with an extra parsing step IMHO.<br />
--[[User:Silvia|Silvia]]<br />
<br />
* This header is unnecessary. Ogg already provides packet framing, and the existing headers (BOS, comments) can be determined by sequence order. The BOS header already contains forward-compatibility versioning for extra header fields. Even if new headers were to be created, they could be indicated by an 'extra_headers' field in the BOS header, as is done in Speex.<br />
--[[User:Conrad|Conrad]]</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=2020OggPCM Draft12005-11-11T07:34:07Z<p>Arc: Moved Conrad's Encapsulation/Constraints sections to primary Format section, corrected Ogg terminology</p>
<hr />
<div>== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container.<br />
<br />
<br />
== Why is it ==<br />
The intention for this format is as an interchange format, for example for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video, as opposed to using RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files as was done during [[Theora]] development.<br />
<br />
It is also less complex than either .wav (RIFF) or .aiff (AIFF), both of which were designed for generic multimedia (audio, video, etc). Full compatibility with these formats includes support for non-PCM data.<br />
<br />
Using raw PCM data, on the other hand, doesn't give that all-important header which carries information about the number of channels, sample width, and sample frequency. So what is needed is a header followed by raw PCM data - nothing more complicated.<br />
<br />
== Format ==<br />
<br />
''This is the current working draft, a compromise between the different proposed elements.''<br />
<br />
Packets are processed according to the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future extensions without breaking the data format. Multibyte fields in the header packet are packed in big endian order. Other fields are stored MSB first. Multibyte fields in the data packet are packed in little endian order.<br />
<br />
The granule position specified is the total samples encoded after including all samples on the page. Samples must not be split across pages. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is. A truncated stream will still return the proper number of samples that can be decoded fully.<br />
<br />
An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
Note that no such extension is planned, nor is the need for a future format foreseen, but history has shown that even the most basic formats eventually become obsolete.<br />
<br />
Packet 0, BOS, 12 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (incrementing breaks backwards compatibility)<br />
 8 0x00 Version Minor (backwards compatible, ie, via extended header)<br />
8 [int] Number of Channels (1-256)<br />
1 [flg] False = MSB, True = LSB<br />
3 [int] PCM Data Type (see table below)<br />
4 [nil] Padding to byte, may be used in later minor version<br />
-<br />
32 [int] Samplerate (samples/second)<br />
<br />
Comment Header Packet<br />
8 0x03 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
-- Continues as [[http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment|Vorbis's Comment Header]]<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data<br />
<br />
PCM Data Type<br />
=============<br />
ID# Bits Type<br />
0 8 signed (char)<br />
1 8 unsigned (char)<br />
2 16 signed (short int)<br />
3 24 signed (int + 8bit padding)<br />
4 32 signed (int)<br />
5 32 float (float)<br />
6 64 float (double)<br />
7 ? Extended unsupported by 1.0 software<br />
<br />
'''Encapsulation in Ogg'''<br />
<br />
The granulepos of an Ogg page indicates the presentation time of the last presentable element in the last complete packet within that page; for '''OggPCM''', a granule is an audio frame.<br />
<br />
Following standard terminology for uncompressed audio, an audio frame is the collection of samples for all channels for a single sampling period. For example, an audio frame for a stereo signal is a pair of sample values for the left and right channels.<br />
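As a rough sketch of how frames relate to packet sizes and granule positions under this draft, the following counts the audio frames in one data packet and converts a granulepos to seconds. The function names are hypothetical, and the byte widths are an assumption read off the PCM Data Type table; in particular, type 3 (24-bit with 8 bits of padding) is assumed to occupy 4 bytes per sample.

```python
# Bytes per sample for each PCM Data Type ID (assumed from the table above;
# type 3 is taken to be 24 bits stored in a padded 32-bit word).
BYTES_PER_SAMPLE = {0: 1, 1: 1, 2: 2, 3: 4, 4: 4, 5: 4, 6: 8}


def frames_in_packet(packet, channels, data_type):
    """Count audio frames in one data packet (4-byte 0xFF 'PCM' header + data)."""
    if packet[:4] != b"\xffPCM":
        raise ValueError("not an OggPCM data packet")
    frame_size = channels * BYTES_PER_SAMPLE[data_type]
    payload = len(packet) - 4
    if payload % frame_size:
        raise ValueError("packet ends with a partial frame")
    return payload // frame_size


def granule_to_seconds(granulepos, sample_rate):
    """A granule is one audio frame, so time is frames divided by the rate."""
    return granulepos / sample_rate


packet = b"\xffPCM" + bytes(8)   # 8 payload bytes of silence
assert frames_in_packet(packet, channels=2, data_type=2) == 2
```

Stereo 16-bit audio uses 4 bytes per frame, so the 8 payload bytes above hold two frames.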
<br />
'''Constraints'''<br />
<br />
* Version 1.0 codec software MUST NOT attempt to decode when the Extended (7) Data Type is specified.<br />
<br />
* An OggPCM packet MUST NOT be constructed with a partial frame; i.e., an audio frame must not span two Ogg packets.<br />
<br />
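The partial-frame constraint can be checked mechanically. The sketch below assumes that the 24-bit type occupies four bytes per sample (reading "int + 8bit padding" literally) and that every data packet starts with the 4-byte <code>0xFF "PCM"</code> prefix; the names <code>BYTES_PER_SAMPLE</code> and <code>frames_in_data_packet</code> are illustrative only.<br />

```python
# Bytes per sample for each PCM Data Type ID (type 3 assumed padded to 4 bytes).
BYTES_PER_SAMPLE = {0: 1, 1: 1, 2: 2, 3: 4, 4: 4, 5: 4, 6: 8}

def frames_in_data_packet(packet: bytes, channels: int, data_type: int) -> int:
    """Return the number of whole audio frames carried by a data packet."""
    if packet[:4] != b"\xffPCM":
        raise ValueError("not an OggPCM data packet")
    # One audio frame holds one sample for every channel.
    frame_size = channels * BYTES_PER_SAMPLE[data_type]
    payload_len = len(packet) - 4
    if payload_len % frame_size != 0:
        raise ValueError("packet ends with a partial audio frame")
    return payload_len // frame_size
```

For stereo 16-bit audio a frame is 4 bytes, so an 8-byte payload holds two frames and a 6-byte payload is rejected.<br />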
<br />
== Alternative Format ==<br />
<br />
''This format was written by [[User:Jkoleszar|Jkoleszar]], and has since been combined with other ideas into the primary format (above)''<br />
<br />
It is intended to support channels from the same source having different sampling parameters.<br />
<br />
'''Packet structure'''<br />
<br />
Packet 0, BOS, tbd bytes<br />
8 0x00 Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
8 [uint] Source ID (Unique amongst all OggPCM streams in the physical stream)<br />
8 [uint] Channel Block<br />
-<br />
16 [bitfield] Indicates which of the 16 channels in this channel block <br />
are present in this logical OGGPCM stream.<br />
8 [enum] Sample format (OGGPCM_FMT_U8, OGGPCM_FMT_LE_S16, OGGPCM_FMT_BE_S16, etc) <br />
24 [uint] Sample rate ** this field crosses a 32bit-word barrier ** <br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data, packing defined by Sample Format field in header<br />
<br />
'''Sample Format'''<br />
<br />
OGG_PCM_S8 = 0x1 /* Signed 8 bit. */<br />
OGG_PCM_S16 = 0x2<br />
OGG_PCM_S24 = 0x3<br />
OGG_PCM_S32 = 0x4<br />
OGG_PCM_U8 = 0x5 /* Unsigned 8 bit */<br />
OGG_PCM_FLOAT32 = 0x6<br />
OGG_PCM_FLOAT64 = 0x7<br />
<br />
<br />
<br />
'''Discussion'''<br />
<br />
This seems to make it easy to support the simple/normal cases and possible to support the pathological cases, for instance:<br />
{| border="1" cellpadding="1"<br />
| Source ID || Channel Bitfield || Sample Rate || Sample Format || Comment<br />
|-<br />
| 0x00 || 0000 0000 0000 0011 || 96000 || OGGPCM_FMT_LE_S24 || Front Stereo Pair<br />
|-<br />
| 0x00 || 0000 0000 0011 1100 || 44100 || OGGPCM_FMT_LE_S16 || Center And Surrounds<br />
|-<br />
| 0x00 || 0000 0000 0010 0000 || 8000 || OGGPCM_FMT_LE_S16 || LFE Channel<br />
|-<br />
| 0x01 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || PC Speaker<br />
|-<br />
| 0x02 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || Microphone<br />
|-<br />
| 0x03 || 0000 0000 0000 0011 || 8000 || OGGPCM_FMT_LE_S16 || Voice Chat<br />
|}<br />
<br />
Each entry in the table is a logical Ogg stream. [[User:Jkoleszar|Jkoleszar]] is not convinced that the source id and channel block are necessary, but figured he'd throw it out there.</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=2010OggPCM Draft12005-11-11T07:10:19Z<p>Arc: /* Alternative Format */</p>
<hr />
<div>== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container.<br />
<br />
<br />
== Why is it ==<br />
The intention for this format is as an interchange format, for example for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video, as opposed to using RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files as was done during [[Theora]] development.<br />
<br />
It is also less complex than either .wav (RIFF) or .aiff (AIFF), both of these formats being designed for generic multimedia (audio, video, etc). Full compatibility with these formats includes support for non-PCM data.<br />
<br />
Using raw PCM data, on the other hand, doesn't give that all-important header which carries information about the number of channels, sample width, and sample frequency. So what is needed is a header followed by raw PCM data - nothing more complicated.<br />
<br />
== Format ==<br />
<br />
'' This is the current working draft, a compromise between the different proposed elements ''<br />
<br />
Packets are processed as per the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future extensions without breaking the data format. Multibyte fields in the header packet are packed in big endian order. Other fields are stored MSB first. Multibyte fields in the data packet are packed in little endian order.<br />
<br />
The granule position specified is the total samples encoded after including all samples on the page. Samples must not be split across pages. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is. A truncated stream will still return the proper number of samples that can be decoded fully.<br />
<br />
An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
Note that no such extension is planned, nor is the need for a future format foreseen, but history has shown that even the most basic formats eventually become obsolete.<br />
<br />
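Processing packets by their first byte might look like the sketch below. The packet IDs come from the layouts in this draft; the function name <code>classify_packet</code> and the decision to also ignore packets that lack the "PCM" identifier are assumptions made for illustration.<br />

```python
# Packet IDs defined by this draft.
PACKET_KINDS = {0x00: "header", 0x03: "comment", 0xFF: "data"}

def classify_packet(packet: bytes) -> str:
    """Classify an OggPCM packet by its first byte; unknown IDs are ignored."""
    if len(packet) < 4 or packet[1:4] != b"PCM":
        return "ignored"
    # Unknown IDs fall through to "ignored" so later extensions do not break
    # older decoders.
    return PACKET_KINDS.get(packet[0], "ignored")
```

A decoder built this way skips a hypothetical future packet type (say ID 0x7F) without raising an error, which is the behaviour the paragraph above asks for.<br />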
Packet 0, BOS, 12 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
8 [int] Number of Channels (1-256)<br />
1 [flg] False = MSB, True = LSB<br />
3 [int] PCM Data Type (see table below)<br />
4 [nil] Padding to byte, may be used in later minor version<br />
-<br />
32 [int] Samplerate (samples/second)<br />
<br />
Comment Header Packet<br />
8 0x03 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
 -- Continues as [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment Vorbis's Comment Header]<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data<br />
<br />
PCM Data Type<br />
=============<br />
ID# Bits Type<br />
0 8 signed (char)<br />
1 8 unsigned (char)<br />
2 16 signed (short int)<br />
3 24 signed (int + 8bit padding)<br />
4 32 signed (int)<br />
5 32 float (float)<br />
6 64 float (double)<br />
7 ? Extend - unsupported by 1.0-only software<br />
<br />
== Alternative Format ==<br />
<br />
''This format was written by [[User:Jkoleszar|Jkoleszar]], and has since been combined with other ideas into the primary format (above)''<br />
<br />
It is intended to support channels from the same source having different sampling parameters.<br />
<br />
'''Packet structure'''<br />
<br />
Packet 0, BOS, tbd bytes<br />
8 0x00 Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
8 [uint] Source ID (Unique amongst all OggPCM streams in the physical stream)<br />
8 [uint] Channel Block<br />
-<br />
16 [bitfield] Indicates which of the 16 channels in this channel block <br />
are present in this logical OGGPCM stream.<br />
8 [enum] Sample format (OGGPCM_FMT_U8, OGGPCM_FMT_LE_S16, OGGPCM_FMT_BE_S16, etc) <br />
24 [uint] Sample rate ** this field crosses a 32bit-word barrier ** <br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data, packing defined by Sample Format field in header<br />
<br />
'''Sample Format'''<br />
<br />
OGG_PCM_S8 = 0x1 /* Signed 8 bit. */<br />
OGG_PCM_S16 = 0x2<br />
OGG_PCM_S24 = 0x3<br />
OGG_PCM_S32 = 0x4<br />
OGG_PCM_U8 = 0x5 /* Unsigned 8 bit */<br />
OGG_PCM_FLOAT32 = 0x6<br />
OGG_PCM_FLOAT64 = 0x7<br />
<br />
'''Encapsulation in Ogg'''<br />
<br />
Ogg provides encapsulation of data in packets, which may be marked with a granulepos. The granulepos of an Ogg packet indicates the presentation time of the last presentable element in the packet; for audio, this corresponds to the timestamp of the last audio frame.<br />
<br />
Following standard terminology for uncompressed audio, an audio frame is the collection of samples for all channels for a single sampling period. For example, an audio frame for a stereo signal is a pair of sample values for the left and right channels.<br />
<br />
An OggPCM packet MUST NOT be constructed with a partial frame; i.e., an audio frame must not span two Ogg packets.<br />
<br />
'''Constraints'''<br />
<br />
This format can support only specified sample formats. Each logical stream can support up to 16 channels sharing a fixed sample rate. Logical streams from the same source may be multiplexed to provide up to 4096 channels per source, each with their own sample rate. Up to 256 Sources may be multiplexed within a physical Ogg stream, unless an application takes other measures to logically partition the stream. <br />
<br />
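The limits in the constraints above follow from the field widths: 256 channel blocks of 16 channels each give 16 &times; 256 = 4096 channels per source. A minimal sketch of mapping a header's channel block and bitfield to per-source channel numbers is given below. It assumes bit 0 (the least significant bit) is the first channel of the block, which this draft does not state, and the name <code>channels_in_stream</code> is hypothetical.<br />

```python
CHANNELS_PER_BLOCK = 16   # width of the channel bitfield
BLOCKS_PER_SOURCE = 256   # range of the 8-bit Channel Block field

def channels_in_stream(channel_block: int, bitfield: int) -> list:
    """List the per-source channel numbers present in one logical stream."""
    base = channel_block * CHANNELS_PER_BLOCK
    # Assumption: bit 0 (LSB) of the bitfield is the first channel in the block.
    return [base + bit for bit in range(CHANNELS_PER_BLOCK)
            if bitfield & (1 << bit)]
```

With this numbering a bitfield of <code>0000 0000 0000 0011</code> in block 0 selects channels 0 and 1, and the highest addressable channel is 4095.<br />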
'''Discussion'''<br />
<br />
This seems to make it easy to support the simple/normal cases and possible to support the pathological cases, for instance:<br />
{| border="1" cellpadding="1"<br />
| Source ID || Channel Bitfield || Sample Rate || Sample Format || Comment<br />
|-<br />
| 0x00 || 0000 0000 0000 0011 || 96000 || OGGPCM_FMT_LE_S24 || Front Stereo Pair<br />
|-<br />
| 0x00 || 0000 0000 0011 1100 || 44100 || OGGPCM_FMT_LE_S16 || Center And Surrounds<br />
|-<br />
| 0x00 || 0000 0000 0010 0000 || 8000 || OGGPCM_FMT_LE_S16 || LFE Channel<br />
|-<br />
| 0x01 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || PC Speaker<br />
|-<br />
| 0x02 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || Microphone<br />
|-<br />
| 0x03 || 0000 0000 0000 0011 || 8000 || OGGPCM_FMT_LE_S16 || Voice Chat<br />
|}<br />
<br />
Each entry in the table is a logical Ogg stream. [[User:Jkoleszar|Jkoleszar]] is not convinced that the source id and channel block are necessary, but figured he'd throw it out there.</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=2009OggPCM Draft12005-11-11T07:06:47Z<p>Arc: Corrected incorrect information entered by User:Conrad</p>
<hr />
<div>== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container.<br />
<br />
<br />
== Why is it ==<br />
The intention for this format is as an interchange format, for example for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video, as opposed to using RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files as was done during [[Theora]] development.<br />
<br />
It is also less complex than either .wav (RIFF) or .aiff (AIFF), both of these formats being designed for generic multimedia (audio, video, etc). Full compatibility with these formats includes support for non-PCM data.<br />
<br />
Using raw PCM data, on the other hand, doesn't give that all-important header which carries information about the number of channels, sample width, and sample frequency. So what is needed is a header followed by raw PCM data - nothing more complicated.<br />
<br />
== Format ==<br />
<br />
'' This is the current working draft, a compromise between the different proposed elements ''<br />
<br />
Packets are processed as per the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future extensions without breaking the data format. Multibyte fields in the header packet are packed in big endian order. Other fields are stored MSB first. Multibyte fields in the data packet are packed in little endian order.<br />
<br />
The granule position specified is the total samples encoded after including all samples on the page. Samples must not be split across pages. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is. A truncated stream will still return the proper number of samples that can be decoded fully.<br />
<br />
An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
Note that no such extension is planned, nor is the need for a future format foreseen, but history has shown that even the most basic formats eventually become obsolete.<br />
<br />
Packet 0, BOS, 12 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
8 [int] Number of Channels (1-256)<br />
1 [flg] False = MSB, True = LSB<br />
3 [int] PCM Data Type (see table below)<br />
4 [nil] Padding to byte, may be used in later minor version<br />
-<br />
32 [int] Samplerate (samples/second)<br />
<br />
Comment Header Packet<br />
8 0x03 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
 -- Continues as [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment Vorbis's Comment Header]<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data<br />
<br />
PCM Data Type<br />
=============<br />
ID# Bits Type<br />
0 8 signed (char)<br />
1 8 unsigned (char)<br />
2 16 signed (short int)<br />
3 24 signed (int + 8bit padding)<br />
4 32 signed (int)<br />
5 32 float (float)<br />
6 64 float (double)<br />
7 ? Extend - unsupported by 1.0-only software<br />
<br />
== Alternative Format ==<br />
<br />
''This format was written by [[User:Jkoleszar]], and has since been combined with other ideas into the primary format (above)''<br />
<br />
It is intended to support channels from the same source having different sampling parameters.<br />
<br />
'''Packet structure'''<br />
<br />
Packet 0, BOS, tbd bytes<br />
8 0x00 Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
8 [uint] Source ID (Unique amongst all OggPCM streams in the physical stream)<br />
8 [uint] Channel Block<br />
-<br />
16 [bitfield] Indicates which of the 16 channels in this channel block <br />
are present in this logical OGGPCM stream.<br />
8 [enum] Sample format (OGGPCM_FMT_U8, OGGPCM_FMT_LE_S16, OGGPCM_FMT_BE_S16, etc) <br />
24 [uint] Sample rate ** this field crosses a 32bit-word barrier ** <br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data, packing defined by Sample Format field in header<br />
<br />
'''Sample Format'''<br />
<br />
OGG_PCM_S8 = 0x1 /* Signed 8 bit. */<br />
OGG_PCM_S16 = 0x2<br />
OGG_PCM_S24 = 0x3<br />
OGG_PCM_S32 = 0x4<br />
OGG_PCM_U8 = 0x5 /* Unsigned 8 bit */<br />
OGG_PCM_FLOAT32 = 0x6<br />
OGG_PCM_FLOAT64 = 0x7<br />
<br />
'''Encapsulation in Ogg'''<br />
<br />
Ogg provides encapsulation of data in packets, which may be marked with a granulepos. The granulepos of an Ogg packet indicates the presentation time of the last presentable element in the packet; for audio, this corresponds to the timestamp of the last audio frame.<br />
<br />
Following standard terminology for uncompressed audio, an audio frame is the collection of samples for all channels for a single sampling period. For example, an audio frame for a stereo signal is a pair of sample values for the left and right channels.<br />
<br />
An OggPCM packet MUST NOT be constructed with a partial frame; i.e., an audio frame must not span two Ogg packets.<br />
<br />
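As a rough illustration of the granulepos bookkeeping described above, the sketch below keeps a running count of audio frames and converts a granulepos to seconds. It assumes one granule equals one audio frame and that counting starts at zero; both function names are hypothetical.<br />

```python
def granulepos_after(frame_counts):
    """Return the granulepos after each packet, given frames per packet."""
    total = 0
    positions = []
    for frames in frame_counts:
        total += frames          # granulepos counts whole audio frames so far
        positions.append(total)
    return positions

def granulepos_to_seconds(granulepos: int, sample_rate: int) -> float:
    """Convert a granulepos (in audio frames) to a presentation time."""
    return granulepos / sample_rate
```

Because packets never contain partial frames, the count is always an integer, and a truncated stream still reports exactly the number of frames that can be decoded.<br />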
'''Constraints'''<br />
<br />
This format can support only specified sample formats. Each logical stream can support up to 16 channels sharing a fixed sample rate. Logical streams from the same source may be multiplexed to provide up to 4096 channels per source, each with their own sample rate. Up to 256 Sources may be multiplexed within a physical Ogg stream, unless an application takes other measures to logically partition the stream. <br />
<br />
'''Discussion'''<br />
<br />
This seems to make it easy to support the simple/normal cases and possible to support the pathological cases, for instance:<br />
{| border="1" cellpadding="1"<br />
| Source ID || Channel Bitfield || Sample Rate || Sample Format || Comment<br />
|-<br />
| 0x00 || 0000 0000 0000 0011 || 96000 || OGGPCM_FMT_LE_S24 || Front Stereo Pair<br />
|-<br />
| 0x00 || 0000 0000 0011 1100 || 44100 || OGGPCM_FMT_LE_S16 || Center And Surrounds<br />
|-<br />
| 0x00 || 0000 0000 0010 0000 || 8000 || OGGPCM_FMT_LE_S16 || LFE Channel<br />
|-<br />
| 0x01 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || PC Speaker<br />
|-<br />
| 0x02 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || Microphone<br />
|-<br />
| 0x03 || 0000 0000 0000 0011 || 8000 || OGGPCM_FMT_LE_S16 || Voice Chat<br />
|}<br />
<br />
Each entry in the table is a logical Ogg stream. [[User:Jkoleszar]] is not convinced that the source id and channel block are necessary, but figured he'd throw it out there.</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=2003OggPCM Draft12005-11-11T05:08:17Z<p>Arc: /* Alternative Format */</p>
<hr />
<div>== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container.<br />
<br />
<br />
== Why is it ==<br />
The intention for this format is as an interchange format, especially for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video for development, vs RIFF/WAV (.wav) and YUV4MPEG (.yuv) in separate files as we did with [[Theora]].<br />
<br />
It is also less complex than either .wav (RIFF) or .aiff (AIFF), both of these formats being designed for generic multimedia (audio, video, etc). Full compatibility with these formats includes support for non-PCM data.<br />
<br />
Using raw PCM data, on the other hand, doesn't give us that all-important header which carries information about the number of channels, sample width, and sample frequency. So what we need is a header followed by raw PCM data - nothing more complicated.<br />
<br />
== Format ==<br />
<br />
Packets are processed as per the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future extensions without breaking the data format. Multibyte fields in the header packet are packed in big endian order. Other fields are stored MSB first. Multibyte fields in the data packet are packed in little endian order.<br />
<br />
The granule position specified is the total samples encoded after including all samples on the page. Samples must not be split across pages. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is. A truncated stream will still return the proper number of samples that can be decoded fully.<br />
<br />
An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
Note that no such extension is planned, nor is the need for a future format foreseen, but history has shown that even the most basic formats eventually become obsolete.<br />
<br />
Packet 0, BOS, 12 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
8 [int] Number of Channels (1-256)<br />
1 [flg] False = MSB, True = LSB<br />
3 [int] PCM Data Type (see table below)<br />
4 [nil] Padding to byte, may be used in later minor version<br />
-<br />
32 [int] Samplerate (samples/second)<br />
<br />
Comment Header Packet<br />
8 0x03 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
 -- Continues as [http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment Vorbis's Comment Header]<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data<br />
<br />
PCM Data Type<br />
=============<br />
ID# Bits Type<br />
0 8 signed (char)<br />
1 8 unsigned (char)<br />
2 16 signed (short int)<br />
3 24 signed (int + 8bit padding)<br />
4 32 signed (int)<br />
5 32 float (float)<br />
6 64 float (double)<br />
7 ? Extend - unsupported by 1.0-only software<br />
<br />
== Alternative Format ==<br />
<br />
The primary difference between this format and the one above is that it is intended to support channels from the same source having different sampling parameters.<br />
<br />
Packet 0, BOS, tbd bytes<br />
8 0x00 Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
8 [uint] Source ID (Unique amongst all OggPCM streams in the physical stream)<br />
8 [uint] Channel Block<br />
-<br />
16 [bitfield] Indicates which of the 16 channels in this channel block <br />
are present in this logical OGGPCM stream.<br />
8 [enum] Sample format (OGGPCM_FMT_U8, OGGPCM_FMT_LE_S16, OGGPCM_FMT_BE_S16, etc) <br />
24 [uint] Sample rate ** this field crosses a 32bit-word barrier ** <br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data, packing defined by Sample Format field in header<br />
<br />
Constraints:<br />
This format can support any __documented and registered__ format, since it uses an enumeration. Each logical stream can support up to 16 channels sharing a fixed sample rate. Logical streams from the same source may be multiplexed to provide up to 4096 channels per source, each with their own sample rate. Up to 256 Sources may be multiplexed within a physical Ogg stream, unless an application takes other measures to logically partition the stream. <br />
<br />
Discussion:<br />
This seems to make it easy to support the simple/normal cases and possible to support the pathological cases, for instance:<br />
{| border="1" cellpadding="1"<br />
| Source ID || Channel Bitfield || Sample Rate || Sample Format || Comment<br />
|-<br />
| 0x00 || 0000 0000 0000 0011 || 96000 || OGGPCM_FMT_LE_S24 || Front Stereo Pair<br />
|-<br />
| 0x00 || 0000 0000 0011 1100 || 44100 || OGGPCM_FMT_LE_S16 || Center And Surrounds<br />
|-<br />
| 0x00 || 0000 0000 0010 0000 || 8000 || OGGPCM_FMT_LE_S16 || LFE Channel<br />
|-<br />
| 0x01 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || PC Speaker<br />
|-<br />
| 0x02 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || Microphone<br />
|-<br />
| 0x03 || 0000 0000 0000 0011 || 8000 || OGGPCM_FMT_LE_S16 || Voice Chat<br />
|}<br />
<br />
Each entry in the table is a logical Ogg stream. I'm not convinced that the source id and channel block are necessary, but figured I'd throw it out there.</div>Archttps://wiki.xiph.org/index.php?title=Main_Page&diff=2026Main Page2005-11-11T05:02:29Z<p>Arc: /* Codecs */</p>
<hr />
<div>= Projects/Formats =<br />
<br />
In an effort to bring open-source ideals to the world of multimedia The Xiph.org Foundation ([[XiphOrg]]) develops a multitude of amazing products. <br />
<br />
== Container Formats ==<br />
<br />
* [[Ogg]]: Media container. This is our native format and the recommended container for Xiph codecs.<br />
* [[OggSkeleton]]: Skeleton information on all logical content bitstreams in Ogg<br />
<br />
* [[SpeexRTP]]: RTP payload format for voice<br />
* [[VorbisRTP]]: RTP payload format for general audio<br />
* [[TheoraRTP]]: RTP payload format for video<br />
* [[XSPF]]: XML playlist format<br />
<br />
== Codecs ==<br />
* '''Compressed Codecs:'''<br />
** [[Vorbis]]: Audio codec with a [[Tremor|fixed point decoder]]<br />
** [[Theora]]: Video codec<br />
** [[FLAC]]: Free Lossless Audio Codec<br />
** [[Speex]]: Speech codec<br />
** [[OggMNG]]: A mapping for encapsulating the MNG animation format in Ogg<br />
* '''[[RawCodecs|Uncompressed Codecs]]:'''<br />
** [[OggPCM]]: Uncompressed PCM audio, primarily as an interchange codec<br />
** [[OggRGB]]: Uncompressed RGB video, primarily as an interchange codec<br />
** [[OggYUV]]: Uncompressed YUV video, primarily as an interchange codec, undergoing heavy debate<br />
** [[OggWrit]]: Text phrase codec (e.g. subtitles)<br />
* '''Metadata Codecs:'''<br />
** [[Metadata]]: Arbitrary metadata stream format (vapourware so far)<br />
<br />
== Software for distributing media ==<br />
<br />
* [[Icecast]]: Streaming server<br />
* [[Ices]]: Source client for Icecast servers<br />
* [[IceShare]]: P2P content distribution<br />
<br />
== Other software ==<br />
<br />
* [[OggComponent/VorbisComponent]]: Wrappers to integrate Ogg-Vorbis into Mac OS X<br />
<br />
= Demonstrations =<br />
<br />
Want to hear Xiph in action? These projects are using our codecs, formats, or libraries.<br />
<br />
* [[VorbisStreams]]: Stations streaming with the Vorbis codec<br />
* [[Games that use Vorbis]]: Games using the Vorbis codec for music or sound effects<br />
* [[VorbisHardware]]: Hardware players using the Vorbis codec<br />
* [http://www.tversity.com TVersity Media Server]: A UPNP/AV compliant media server that uses the Ogg Vorbis libraries to transcode audio files to the Ogg Vorbis format.<br />
<br />
= Project management =<br />
<br />
* [[MonthlyMeeting]]<br />
* [[MailingLists]]<br />
* [[Bounties]]<br />
* [[HyperFish]]<br />
<br />
= Wiki internal =<br />
* [[Sandbox]]: Testbed for testing editing skills.<br />
* [[Translations]]: What about some translation work</div>Archttps://wiki.xiph.org/index.php?title=Ogg_Writ&diff=3176Ogg Writ2005-11-11T04:47:18Z<p>Arc: Ogg Writ moved to OggWrit</p>
<hr />
<div>#redirect [[OggWrit]]<br />
</div>Archttps://wiki.xiph.org/index.php?title=OggRaw&diff=3227OggRaw2005-11-11T04:44:53Z<p>Arc: /* Why Not FourCC */</p>
<hr />
<div>== Purpose ==<br />
<br />
=== For the Ogg Media Framework ===<br />
Within OggStream, codecs are recoded (encoded, decoded, or transcoded) from one format to another. For example, a Vorbis codec plugin could be used to convert a Vorbis I stream to a PCM stream. Ogg packets of these streams are imported and exported from OggStream, and several of these conversions can be used in sequence (plugin chaining) to attain a desired output from any supported input within that media type.<br />
<br />
Some codec plugins will only support one or two raw codecs, providing translations between different formatting options as needed. For example, if an Ogg FLAC file contains 64-bit float data and the media player attempting to play it only supports 16-bit signed int data, a second plugin could provide the conversion from 64-bit float to 16-bit signed int.<br />
<br />
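The kind of conversion such a second plugin would perform can be sketched as below. This is only an illustration under stated assumptions: float samples are taken to be little endian with full scale at &plusmn;1.0, out-of-range values are clipped, NaN input is not handled, and the name <code>float64_to_s16</code> is hypothetical rather than part of any OggStream API.<br />

```python
import struct

def float64_to_s16(samples_le: bytes) -> bytes:
    """Convert little-endian 64-bit float PCM to 16-bit signed int PCM."""
    if len(samples_le) % 8 != 0:
        raise ValueError("input is not a whole number of 64-bit samples")
    count = len(samples_le) // 8
    floats = struct.unpack("<%dd" % count, samples_le)
    # Scale by 32767 (assumed full scale of +/-1.0) and clip to the int16 range.
    ints = [max(-32768, min(32767, int(round(x * 32767.0)))) for x in floats]
    return struct.pack("<%dh" % count, *ints)
```

Chaining this after a decoder that emits 64-bit floats would give a player that only understands 16-bit signed data something it can use, at the cost of the precision lost in rounding.<br />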
Having these uncompressed codecs is, thus, essential for implementing the new Ogg media framework as it depends on interchange codecs which all applications desiring to work with a certain media type can reasonably support.<br />
<br />
<br />
=== For Low-CPU Storage ===<br />
While losslessly compressed Ogg codecs are available for both audio and video, some applications (e.g., live recording and editing) find the higher CPU requirements for processing these formats less desirable than the need for additional storage capacity. Many of these applications require syncing information not provided sufficiently by RIFF (.wav/.avi) or Quicktime, where using an uncompressed codec within Ogg provides excellent cross-bitstream syncing. Alternatively, the application may be designed around the Ogg media framework, where storing data in an uncompressed Ogg codec makes it easier to encode later while keeping comments and other metadata.<br />
<br />
<br />
=== For Codec Development ===<br />
As we experienced with Ogg [[Theora]] development, there is a shortage of simple raw data formats which support the capabilities being tested in codec development. Additionally, the lack of inter-codec sync information (e.g., when using .wav & yuv4mpeg2) for these non-Ogg raw formats makes debugging more difficult than it should be.<br />
<br />
These uncompressed Ogg codecs will hopefully solve these problems, by allowing a wide variety of data formatting options and proper inter-codec syncing for testing and development. We should not be limiting ourselves to what existing raw formats support.<br />
<br />
<br />
== Design ==<br />
<br />
=== Why Not FourCC ===<br />
The RIFF/Quicktime set of codecs have several dozen raw codecs each supporting very specific formatting options. Many of these are special purpose, never used on a wide scale, and many formatting options are not available in this system. This situation was designed under the philosophy that if an application supports a codec, as identified by the 32-bit codec identifier (aka FourCC), that it would be expected to support all the format options possible with that codec.<br />
<br />
That system, at its heart, is what we call "FourCC". Media frameworks designed around FourCC identify a codec purely by the 32-bit identifier, without versioning information, without further formatting information, so that by a simple table of 32-bit IDs the media framework could know which plugin to use and whether the application could support it. This, inevitably, creates a situation where most applications must support a pool of popular uncompressed codecs, increasing the footprint and complexity of the application considerably.<br />
<br />
We can do better.<br />
<br />
=== Our Philosophy ===<br />
Ogg, by contrast, doesn't have a unified codec identifier. Codec software is only required to accurately distinguish its own streams from other codecs' streams, based on information in the first packet of those streams. Thus, we don't look at a universal identifier, match it against a table, and then know exactly which plugin to load. Instead, we pass packet 0 to our codec plugins and thereby learn which plugins support the specific version, feature and formatting sets, and other properties which cannot fit into a 32-bit identifier.<br />
<br />
By this, codec plugins may, and can even be expected to, support only a subset of the formatting options available with a codec. A video codec plugin could, for example, support only low bitrate video, but do so very well, or support only non-interlaced video. This allows the codec specification to include clean backwards compatibility, where a Theora to VP32 plugin could be written to only support Theora streams which do not use options unavailable to VP32.<br />
<br />
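The probing approach described above, where packet 0 is offered to each plugin rather than looked up in an identifier table, can be sketched as follows. The class names, the <code>accepts</code> method, and <code>select_plugin</code> are hypothetical and do not describe an existing OggStream interface; the OggPCM check uses the 12-byte header layout from the [[OggPCM]] draft.<br />

```python
class VorbisPlugin:
    name = "vorbis"

    def accepts(self, packet0: bytes) -> bool:
        # A Vorbis identification header starts with 0x01 followed by "vorbis".
        return packet0[:7] == b"\x01vorbis"

class OggPCMPlugin:
    name = "oggpcm"

    def accepts(self, packet0: bytes) -> bool:
        # This plugin supports only a subset: major version 1, and any
        # data type except Extended (7).
        if len(packet0) != 12 or packet0[:4] != b"\x00PCM":
            return False
        data_type = (packet0[7] >> 4) & 0x07
        return packet0[4] == 1 and data_type != 7

def select_plugin(packet0: bytes, plugins):
    """Return the first plugin that claims the stream, or None."""
    for plugin in plugins:
        if plugin.accepts(packet0):
            return plugin
    return None
```

Because each plugin inspects the whole first packet, it can decline streams that use options it does not implement, which a bare 32-bit identifier cannot express.<br />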
This changes the paradigm for our uncompressed codecs, as unlike FourCC, we only need a handful of unified, generalized codecs which each support a wide variety of format options, perhaps far more than any application or codec plugin would ever use. This makes sense: it eliminates artificial limitations, previously imposed on codecs to simplify implementation as support was expected to be binary (either none at all or complete).<br />
<br />
Through minor revisions, we do not even need to support every possible format in the first implemented version. Values on formatting options, however, should be reserved for "extended" settings and a minor version field available to specify which version can, at minimum, parse the meaning of extended settings.<br />
<br />
== See Also ==<br />
<br />
* [[OggPCM]] - uncompressed audio codec<br />
* [[OggRGB]] - uncompressed RGB video codec<br />
* [[OggYUV]] - uncompressed YUV video codec</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=2000OggPCM Draft12005-11-11T03:24:57Z<p>Arc: /* Alternative Format */</p>
<hr />
<div>== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container.<br />
<br />
<br />
== Why is it ==<br />
This format is intended as an interchange format, especially for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video during development, instead of keeping RIFF/WAV (.wav) and YUV4MPEG (.yuv) data in separate files as we did with [[Theora]].<br />
<br />
It is also less complex than either .wav (RIFF) or .aiff (AIFF), both of these formats being designed for generic multimedia (audio, video, etc). Full compatibility with these formats includes support for non-PCM data.<br />
<br />
Using raw PCM data, on the other hand, doesn't give us that all-important header which carries information about the number of channels, sample width, and sample frequency. So what we need is a header followed by raw PCM data - nothing more complicated.<br />
<br />
== Format ==<br />
<br />
Packets are processed according to the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future extensions without breaking the data format. Multibyte fields in the header packet are packed in big endian order. Other fields are stored MSB first. Multibyte fields in the data packet are packed in little endian order.<br />
<br />
The granule position specified is the total samples encoded after including all samples on the page. Samples must not be split across pages. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is. A truncated stream will still return the proper number of samples that can be decoded fully.<br />
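<br />
The granule position rule can be sketched in Python as follows. This is only an illustration under stated assumptions: a "sample" is counted as one frame (one value per channel), every data packet starts with the 4-byte 0xFF "PCM" prefix described below, and the function names are made up for this example.<br />

```python
DATA_PACKET_PREFIX = 4  # 0xFF packet ID plus the 3-byte "PCM" identifier


def frames_in_packet(packet_len: int, channels: int, bytes_per_sample: int) -> int:
    """Count whole frames in one data packet (assumes 1 frame = 1 sample per channel)."""
    payload = packet_len - DATA_PACKET_PREFIX
    frame_size = channels * bytes_per_sample
    if payload < 0 or payload % frame_size != 0:
        raise ValueError("packet does not hold a whole number of frames")
    return payload // frame_size


def granule_positions(pages, channels: int, bytes_per_sample: int):
    """Return the running sample total at the end of each page.

    `pages` is a list of pages, each a list of data packet lengths in bytes.
    """
    total = 0
    positions = []
    for page in pages:
        for packet_len in page:
            total += frames_in_packet(packet_len, channels, bytes_per_sample)
        positions.append(total)
    return positions


# Stereo 16-bit audio: 4 bytes per frame.
pages = [[4 + 4 * 100, 4 + 4 * 50], [4 + 4 * 25]]
print(granule_positions(pages, channels=2, bytes_per_sample=2))
```

Because the total on each page only counts samples that are completely present, a stream cut off after the first page still reports a length that can be decoded in full.<br />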
<br />
An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
Note that no such extension is planned, nor is the need for a future format foreseen, but history has shown that even the most basic formats eventually become obsolete.<br />
<br />
Packet 0, BOS, 12 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
  8 0x01   Version Major (incrementing breaks backwards compatibility)<br />
  8 0x00   Version Minor (backwards compatible, e.g. via extended header)<br />
8 [int] Number of Channels (1-256)<br />
1 [flg] False = MSB, True = LSB<br />
3 [int] PCM Data Type (see table below)<br />
4 [nil] Padding to byte, may be used in later minor version<br />
-<br />
32 [int] Samplerate (samples/second)<br />
<br />
Comment Header Packet<br />
8 0x03 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
-- Continues as [[http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment|Vorbis's Comment Header]]<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data<br />
<br />
PCM Data Type<br />
=============<br />
ID# Bits Type<br />
0 8 signed (char)<br />
1 8 unsigned (char)<br />
2 16 signed (short int)<br />
3 24 signed (int + 8bit padding)<br />
4 32 signed (int)<br />
5 32 float (float)<br />
6 64 float (double)<br />
7 ? Extend - unsupported by 1.0-only software<br />
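<br />
To make the 12-byte layout of packet 0 concrete, here is a minimal Python sketch that packs and parses it. The function names are hypothetical. Two points are assumptions rather than part of the draft: the channel count 1-256 is stored as <code>channels - 1</code> so that it fits in 8 bits, and the 1-bit flag occupies the most significant bit of its byte, followed by the 3-bit data type and 4 padding bits.<br />

```python
import struct

HEADER_FORMAT = ">B3sBBBBI"  # big endian, no padding: 1+3+1+1+1+1+4 = 12 bytes


def pack_header(channels, data_type, sample_rate, lsb_flag=True, major=1, minor=0):
    """Build packet 0 of an OggPCM stream (draft layout, see assumptions above)."""
    if not 1 <= channels <= 256:
        raise ValueError("channels must be in 1-256")
    if not 0 <= data_type <= 7:
        raise ValueError("data type is a 3-bit value")
    # Flag in bit 7, data type in bits 6-4, padding (zero) in bits 3-0.
    flags = (int(lsb_flag) << 7) | (data_type << 4)
    return struct.pack(HEADER_FORMAT, 0x00, b"PCM", major, minor,
                       channels - 1, flags, sample_rate)


def parse_header(packet):
    """Decode packet 0 into a dict, rejecting anything that is not OggPCM."""
    if len(packet) != struct.calcsize(HEADER_FORMAT):
        raise ValueError("header packet must be 12 bytes")
    packet_id, codec, major, minor, chan, flags, rate = struct.unpack(HEADER_FORMAT, packet)
    if packet_id != 0x00 or codec != b"PCM":
        raise ValueError("not an OggPCM stream header")
    return {
        "version": (major, minor),
        "channels": chan + 1,           # assumed channels - 1 encoding
        "lsb_flag": bool(flags & 0x80),
        "data_type": (flags >> 4) & 0x07,
        "sample_rate": rate,
    }


header = pack_header(channels=2, data_type=2, sample_rate=44100)
print(parse_header(header))
```

A reader that sees data type 7 ("Extend") or a higher major version would stop here rather than guess at the payload format.<br />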
<br />
== Alternative Format ==<br />
<br />
The primary difference between this format and the one above is that it is intended to support channels from the same source having different sampling parameters.<br />
<br />
Packet 0, BOS, tbd bytes<br />
8 0x00 Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
   8 0x01       Version Major (incrementing breaks backwards compatibility)<br />
   8 0x00       Version Minor (backwards compatible, e.g. via extended header)<br />
8 [uint] Source ID (Unique amongst all OggPCM streams in the physical stream)<br />
8 [uint] Channel Block<br />
-<br />
16 [bitfield] Indicates which of the 16 channels in this channel block <br />
are present in this logical OGGPCM stream.<br />
8 [enum] Sample format (OGGPCM_FMT_U8, OGGPCM_FMT_LE_S16, OGGPCM_FMT_BE_S16, etc) <br />
24 [uint] Sample rate ** this field falls on a 32bit-word barrier ** <br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data, packing defined by Sample Format field in header<br />
<br />
Constraints:<br />
This format can support any __documented and registered__ sample format, since it uses an enumeration. Each logical stream can support up to 16 channels sharing a fixed sample rate. Logical streams from the same source may be multiplexed to provide up to 4096 channels per source, each stream with its own sample rate. Up to 256 sources may be multiplexed within a physical Ogg stream, unless an application takes other measures to logically partition the stream. <br />
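<br />
The channel arithmetic behind these limits (256 channel blocks of 16 channels each, giving 4096 channels per source) can be sketched in Python. The function name is hypothetical, and the bit order is an assumption: the least significant bit of the bitfield is taken to be the first channel of the block.<br />

```python
CHANNELS_PER_BLOCK = 16
BLOCKS_PER_SOURCE = 256  # Channel Block is an 8-bit field


def channels_in_stream(channel_block: int, bitfield: int):
    """List the source-wide channel indices carried by one logical stream."""
    if not 0 <= channel_block < BLOCKS_PER_SOURCE:
        raise ValueError("channel block must fit in 8 bits")
    if not 0 <= bitfield <= 0xFFFF:
        raise ValueError("channel bitfield must fit in 16 bits")
    base = channel_block * CHANNELS_PER_BLOCK
    # Assumed bit order: bit 0 (LSB) is the first channel of the block.
    return [base + bit for bit in range(CHANNELS_PER_BLOCK) if bitfield & (1 << bit)]


print(channels_in_stream(0, 0b0000_0000_0000_0011))  # a stereo pair in block 0
print(channels_in_stream(255, 0x8000))               # last channel of the last block
print(BLOCKS_PER_SOURCE * CHANNELS_PER_BLOCK)        # channels per source
```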
<br />
Discussion:<br />
This seems to make it easy to support the simple/normal cases and possible to support the pathological cases, for instance:<br />
{| border="1" cellpadding="1"<br />
| Source ID || Channel Bitfield || Sample Rate || Sample Format || Comment<br />
|-<br />
| 0x00 || 0000 0000 0000 0011 || 96000 || OGGPCM_FMT_LE_S24 || Front Stereo Pair<br />
|-<br />
| 0x00 || 0000 0000 0011 1100 || 44100 || OGGPCM_FMT_LE_S16 || Center And Surrounds<br />
|-<br />
| 0x00 || 0000 0000 0010 0000 || 8000 || OGGPCM_FMT_LE_S16 || LFE Channel<br />
|-<br />
| 0x01 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || PC Speaker<br />
|-<br />
| 0x02 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || Microphone<br />
|-<br />
| 0x03 || 0000 0000 0000 0011 || 8000 || OGGPCM_FMT_LE_S16 || Voice Chat<br />
|}<br />
<br />
Each entry in the table is a logical Ogg stream. I'm not convinced that the source id and channel block are necessary, but figured I'd throw it out there.</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=1997OggPCM Draft12005-11-11T03:23:25Z<p>Arc: /* Format */</p>
<hr />
<div>== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container.<br />
<br />
<br />
== Why is it ==<br />
This format is intended as an interchange format, especially for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video during development, instead of keeping RIFF/WAV (.wav) and YUV4MPEG (.yuv) data in separate files as we did with [[Theora]].<br />
<br />
It is also less complex than either .wav (RIFF) or .aiff (AIFF), both of these formats being designed for generic multimedia (audio, video, etc). Full compatibility with these formats includes support for non-PCM data.<br />
<br />
Using raw PCM data, on the other hand, doesn't give us that all-important header which carries information about the number of channels, sample width, and sample frequency. So what we need is a header followed by raw PCM data - nothing more complicated.<br />
<br />
== Format ==<br />
<br />
Packets are processed according to the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future extensions without breaking the data format. Multibyte fields in the header packet are packed in big endian order. Other fields are stored MSB first. Multibyte fields in the data packet are packed in little endian order.<br />
<br />
The granule position specified is the total samples encoded after including all samples on the page. Samples must not be split across pages. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is. A truncated stream will still return the proper number of samples that can be decoded fully.<br />
<br />
An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
Note that no such extension is planned, nor is the need for a future format foreseen, but history has shown that even the most basic formats eventually become obsolete.<br />
<br />
Packet 0, BOS, 12 bytes<br />
8 0x00 Stream Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
  8 0x01   Version Major (incrementing breaks backwards compatibility)<br />
  8 0x00   Version Minor (backwards compatible, e.g. via extended header)<br />
8 [int] Number of Channels (1-256)<br />
1 [flg] False = MSB, True = LSB<br />
3 [int] PCM Data Type (see table below)<br />
4 [nil] Padding to byte, may be used in later minor version<br />
-<br />
32 [int] Samplerate (samples/second)<br />
<br />
Comment Header Packet<br />
8 0x03 Comment Header Packet ID<br />
24 "PCM" Codec Identifier<br />
-- Continues as [[http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#vorbis-spec-comment|Vorbis's Comment Header]]<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data<br />
<br />
PCM Data Type<br />
=============<br />
ID# Bits Type<br />
0 8 signed (char)<br />
1 8 unsigned (char)<br />
2 16 signed (short int)<br />
3 24 signed (int + 8bit padding)<br />
4 32 signed (int)<br />
5 32 float (float)<br />
6 64 float (double)<br />
7 ? Extend - unsupported by 1.0-only software<br />
<br />
== Alternative Format ==<br />
<br />
The primary difference between this format and the one above is that it is intended to support channels from the same source having different sampling parameters.<br />
<br />
Packet 0, BOS, tbd bytes<br />
8 0x00 Header Packet ID<br />
24 "PCM" Codec identifier <br />
   8 0x01       Version Major (incrementing breaks backwards compatibility)<br />
   8 0x00       Version Minor (backwards compatible, e.g. via extended header)<br />
8 [uint] Source ID (Unique amongst all OggPCM streams in the physical stream)<br />
8 [uint] Channel Block<br />
16 [bitfield] Indicates which of the 16 channels in this channel block <br />
are present in this logical OGGPCM stream.<br />
8 [enum] Sample format (OGGPCM_FMT_U8, OGGPCM_FMT_LE_S16, OGGPCM_FMT_BE_S16, etc) <br />
24 [uint] Sample rate<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data, packing defined by Sample Format field in header<br />
<br />
Constraints:<br />
This format can support any __documented and registered__ sample format, since it uses an enumeration. Each logical stream can support up to 16 channels sharing a fixed sample rate. Logical streams from the same source may be multiplexed to provide up to 4096 channels per source, each stream with its own sample rate. Up to 256 sources may be multiplexed within a physical Ogg stream, unless an application takes other measures to logically partition the stream. <br />
<br />
Discussion:<br />
This seems to make it easy to support the simple/normal cases and possible to support the pathological cases, for instance:<br />
{| border="1" cellpadding="1"<br />
| Source ID || Channel Bitfield || Sample Rate || Sample Format || Comment<br />
|-<br />
| 0x00 || 0000 0000 0000 0011 || 96000 || OGGPCM_FMT_LE_S24 || Front Stereo Pair<br />
|-<br />
| 0x00 || 0000 0000 0011 1100 || 44100 || OGGPCM_FMT_LE_S16 || Center And Surrounds<br />
|-<br />
| 0x00 || 0000 0000 0010 0000 || 8000 || OGGPCM_FMT_LE_S16 || LFE Channel<br />
|-<br />
| 0x01 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || PC Speaker<br />
|-<br />
| 0x02 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || Microphone<br />
|-<br />
| 0x03 || 0000 0000 0000 0011 || 8000 || OGGPCM_FMT_LE_S16 || Voice Chat<br />
|}<br />
<br />
Each entry in the table is a logical Ogg stream. I'm not convinced that the source id and channel block are necessary, but figured I'd throw it out there.</div>Archttps://wiki.xiph.org/index.php?title=OggRaw&diff=1998OggRaw2005-11-11T01:05:47Z<p>Arc: /* See Also */</p>
<hr />
<div>== Purpose ==<br />
<br />
=== For the Ogg Media Framework ===<br />
Within OggStream, codecs are recoded (encoded, decoded, or transcoded) from one format to another. For example, a Vorbis codec plugin could be used to convert a Vorbis I stream to a PCM stream. Ogg packets of these streams are imported and exported from OggStream, and several of these conversions can be used in sequence (plugin chaining) to attain a desired output from any supported input within that media type.<br />
<br />
Some codec plugins will only support one or two raw codecs, with translation between different formatting options provided as needed. For example, if an Ogg FLAC file contains 64-bit float data and the media player attempting to play it only supports 16-bit signed int data, a second plugin could provide the conversion from 64-bit float to 16-bit signed int.<br />
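<br />
The core of such a conversion plugin might look like the Python sketch below. It is only an illustration: the function name is made up, samples are assumed to be little-endian 64-bit floats normalised to the range -1.0 to 1.0, and out-of-range values are clipped rather than wrapped. A real plugin would also rewrite the stream header to advertise the new sample format.<br />

```python
import struct


def float64_to_s16(data: bytes) -> bytes:
    """Convert little-endian 64-bit float samples to 16-bit signed ints."""
    if len(data) % 8 != 0:
        raise ValueError("input must hold whole 64-bit samples")
    count = len(data) // 8
    samples = struct.unpack("<%dd" % count, data)
    converted = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))  # assumed nominal range
        converted.append(int(round(clipped * 32767)))
    return struct.pack("<%dh" % count, *converted)


source = struct.pack("<3d", 0.0, 1.0, -2.0)
print(struct.unpack("<3h", float64_to_s16(source)))
```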
<br />
Having these uncompressed codecs is, thus, essential for implementing the new Ogg media framework as it depends on interchange codecs which all applications desiring to work with a certain media type can reasonably support.<br />
<br />
<br />
=== For Low-CPU Storage ===<br />
While losslessly compressed Ogg codecs are available for both audio and video, some applications (e.g. live recording, editing) find the higher CPU requirements of these formats less desirable than the need for additional storage capacity. Many of these applications require syncing information not sufficiently provided by RIFF (.wav/.avi) or Quicktime, whereas an uncompressed codec within Ogg provides excellent cross-bitstream syncing. Alternatively, the application may be designed around the Ogg media framework, in which case storing data in an uncompressed Ogg codec makes it easier to encode later while keeping comments, etc.<br />
<br />
<br />
=== For Codec Development ===<br />
As we experienced with Ogg [[Theora]] development, there is a shortage of simple raw data formats which support the capabilities being tested in codec development. Additionally, the lack of inter-codec sync information (e.g. when using .wav and yuv4mpeg2) for these non-Ogg raw formats makes debugging more difficult than it should be.<br />
<br />
These uncompressed Ogg codecs will hopefully solve these problems, by allowing a wide variety of data formatting options and proper inter-codec syncing for testing and development. We should not be limiting ourselves to what existing raw formats support.<br />
<br />
<br />
== Design ==<br />
<br />
=== Why Not FourCC ===<br />
The RIFF/Quicktime (FourCC) set of codecs includes several dozen raw codecs, each supporting very specific formatting options. Many of these are special purpose and were never used on a wide scale, yet many formatting options are still not available in this system. It was designed under the philosophy that if an application supports a codec, as identified by the 32-bit codec identifier (aka FourCC), it is expected to support all the format options possible with that codec.<br />
<br />
That system, at its heart, is what we call "FourCC". Media frameworks designed around FourCC identify a codec purely by its 32-bit identifier, with no versioning or further formatting information, so that a simple table of 32-bit IDs tells the framework which plugin to use and whether the application can support it. This inevitably creates a situation where most applications support a pool of popular uncompressed codecs, while newer formats are delayed: they must wait for a central authority to approve a new uncompressed codec for the pool, then for application developers to add support for it.<br />
<br />
We can do better.<br />
<br />
<br />
=== Our Philosophy ===<br />
Ogg, by contrast, doesn't have a unified codec identifier. Codec software is only required to accurately distinguish its own streams from other codecs' streams, based on information in the first packet of each stream. Thus, we don't match a universal identifier against a table to learn exactly which plugin to load. Instead, we pass packet 0 to our codec plugins and thereby learn which plugins support the stream's specific version, feature and formatting sets, and other things that cannot fit into a 32-bit identifier.<br />
<br />
With this approach, codec plugins may, and can even be expected to, support only a subset of the formatting options available with a codec. A video codec plugin could, for example, support only low-bitrate video, but do so very well, or support only non-interlaced video. This allows the codec specification to include clean backwards compatibility, where a Theora to VP32 plugin could be written to support only those Theora streams which do not use options unavailable to VP32.<br />
<br />
This changes the paradigm for our uncompressed codecs: unlike FourCC, we only need a handful of unified, generalized codecs which each support a wide variety of format options, perhaps far more than any application or codec plugin would ever use. This makes sense: it eliminates artificial limitations that were previously imposed on codecs to simplify implementation, because support was expected to be binary (either none at all or complete).<br />
<br />
Through minor revisions, we do not even need to support every possible format in the first implemented version. Some values of each formatting option should, however, be reserved for "extended" settings, and a minor version field should be available to specify the minimum version that can parse the meaning of those extended settings.<br />
<br />
== See Also ==<br />
<br />
* [[OggPCM]] - uncompressed audio codec<br />
* [[OggRGB]] - uncompressed RGB video codec<br />
* [[OggYUV]] - uncompressed YUV video codec</div>Archttps://wiki.xiph.org/index.php?title=OggRaw&diff=1995OggRaw2005-11-11T01:05:14Z<p>Arc: </p>
<hr />
<div>== Purpose ==<br />
<br />
=== For the Ogg Media Framework ===<br />
Within OggStream, codecs are recoded (encoded, decoded, or transcoded) from one format to another. For example, a Vorbis codec plugin could be used to convert a Vorbis I stream to a PCM stream. Ogg packets of these streams are imported and exported from OggStream, and several of these conversions can be used in sequence (plugin chaining) to attain a desired output from any supported input within that media type.<br />
<br />
Some codec plugins will only support one or two raw codecs, with translation between different formatting options provided as needed. For example, if an Ogg FLAC file contains 64-bit float data and the media player attempting to play it only supports 16-bit signed int data, a second plugin could provide the conversion from 64-bit float to 16-bit signed int.<br />
<br />
Having these uncompressed codecs is, thus, essential for implementing the new Ogg media framework as it depends on interchange codecs which all applications desiring to work with a certain media type can reasonably support.<br />
<br />
<br />
=== For Low-CPU Storage ===<br />
While losslessly compressed Ogg codecs are available for both audio and video, some applications (e.g. live recording, editing) find the higher CPU requirements of these formats less desirable than the need for additional storage capacity. Many of these applications require syncing information not sufficiently provided by RIFF (.wav/.avi) or Quicktime, whereas an uncompressed codec within Ogg provides excellent cross-bitstream syncing. Alternatively, the application may be designed around the Ogg media framework, in which case storing data in an uncompressed Ogg codec makes it easier to encode later while keeping comments, etc.<br />
<br />
<br />
=== For Codec Development ===<br />
As we experienced with Ogg [[Theora]] development, there is a shortage of simple raw data formats which support the capabilities being tested in codec development. Additionally, the lack of inter-codec sync information (e.g. when using .wav and yuv4mpeg2) for these non-Ogg raw formats makes debugging more difficult than it should be.<br />
<br />
These uncompressed Ogg codecs will hopefully solve these problems, by allowing a wide variety of data formatting options and proper inter-codec syncing for testing and development. We should not be limiting ourselves to what existing raw formats support.<br />
<br />
<br />
== Design ==<br />
<br />
=== Why Not FourCC ===<br />
The RIFF/Quicktime (FourCC) set of codecs includes several dozen raw codecs, each supporting very specific formatting options. Many of these are special purpose and were never used on a wide scale, yet many formatting options are still not available in this system. It was designed under the philosophy that if an application supports a codec, as identified by the 32-bit codec identifier (aka FourCC), it is expected to support all the format options possible with that codec.<br />
<br />
That system, at its heart, is what we call "FourCC". Media frameworks designed around FourCC identify a codec purely by its 32-bit identifier, with no versioning or further formatting information, so that a simple table of 32-bit IDs tells the framework which plugin to use and whether the application can support it. This inevitably creates a situation where most applications support a pool of popular uncompressed codecs, while newer formats are delayed: they must wait for a central authority to approve a new uncompressed codec for the pool, then for application developers to add support for it.<br />
<br />
We can do better.<br />
<br />
<br />
=== Our Philosophy ===<br />
Ogg, by contrast, doesn't have a unified codec identifier. Codec software is only required to accurately distinguish its own streams from other codecs' streams, based on information in the first packet of each stream. Thus, we don't match a universal identifier against a table to learn exactly which plugin to load. Instead, we pass packet 0 to our codec plugins and thereby learn which plugins support the stream's specific version, feature and formatting sets, and other things that cannot fit into a 32-bit identifier.<br />
<br />
With this approach, codec plugins may, and can even be expected to, support only a subset of the formatting options available with a codec. A video codec plugin could, for example, support only low-bitrate video, but do so very well, or support only non-interlaced video. This allows the codec specification to include clean backwards compatibility, where a Theora to VP32 plugin could be written to support only those Theora streams which do not use options unavailable to VP32.<br />
<br />
This changes the paradigm for our uncompressed codecs: unlike FourCC, we only need a handful of unified, generalized codecs which each support a wide variety of format options, perhaps far more than any application or codec plugin would ever use. This makes sense: it eliminates artificial limitations that were previously imposed on codecs to simplify implementation, because support was expected to be binary (either none at all or complete).<br />
<br />
Through minor revisions, we do not even need to support every possible format in the first implemented version. Some values of each formatting option should, however, be reserved for "extended" settings, and a minor version field should be available to specify the minimum version that can parse the meaning of those extended settings.<br />
<br />
== See Also ==<br />
<br />
* OggPCM - uncompressed audio codec<br />
* OggRGB - uncompressed RGB video codec<br />
* OggYUV - uncompressed YUV video codec</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=1996OggPCM Draft12005-11-10T23:40:02Z<p>Arc: /* Format */</p>
<hr />
<div>== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container.<br />
<br />
<br />
== Why is it ==<br />
This format is intended as an interchange format, especially for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video during development, instead of keeping RIFF/WAV (.wav) and YUV4MPEG (.yuv) data in separate files as we did with [[Theora]].<br />
<br />
It is also less complex than either .wav (RIFF) or .aiff (AIFF), both of these formats being designed for generic multimedia (audio, video, etc). Full compatibility with these formats includes support for non-PCM data.<br />
<br />
Using raw PCM data, on the other hand, doesn't give us that all-important header which carries information about the number of channels, sample width, and sample frequency. So what we need is a header followed by raw PCM data - nothing more complicated.<br />
<br />
== Format ==<br />
<br />
Packets are processed according to the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future extensions without breaking the data format. Multibyte fields in the header packet are packed in big endian order. Other fields are stored MSB first. Multibyte fields in the data packet are packed in little endian order.<br />
<br />
The granule position specified is the total samples encoded after including all samples on the page. Samples must not be split across pages. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is. A truncated stream will still return the proper number of samples that can be decoded fully.<br />
<br />
An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
Note that no such extension is planned, nor is the need for a future format foreseen, but history has shown that even the most basic formats eventually become obsolete.<br />
<br />
Packet 0, BOS, 12 bytes<br />
8 0x00 Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
  8 0x01   Version Major (incrementing breaks backwards compatibility)<br />
  8 0x00   Version Minor (backwards compatible, e.g. via extended header)<br />
8 [int] Number of Channels (1-256)<br />
8 [int] Samples per Second<br />
-<br />
1 [flg] False = MSB, True = LSB<br />
3 [int] PCM Data Type (see table below)<br />
4 [nil] Padding to byte, may be used for "extended" data type<br />
[ Channel Identifiers ]<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data<br />
<br />
PCM Data Type<br />
=============<br />
ID# Bits Type<br />
0 8 signed (char)<br />
1 8 unsigned (char)<br />
2 16 signed (short int)<br />
3 24 signed (int + 8bit padding)<br />
4 32 signed (int)<br />
5 32 float (float)<br />
6 64 float (double)<br />
7 ? Extend - unsupported by 1.0-only software<br />
<br />
== Alternative Format ==<br />
<br />
The primary difference between this format and the one above is that it is intended to support channels from the same source having different sampling parameters.<br />
<br />
Packet 0, BOS, tbd bytes<br />
8 0x00 Header Packet ID<br />
24 "PCM" Codec identifier <br />
   8 0x01       Version Major (incrementing breaks backwards compatibility)<br />
   8 0x00       Version Minor (backwards compatible, e.g. via extended header)<br />
8 [uint] Source ID (Unique amongst all OggPCM streams in the physical stream)<br />
8 [uint] Channel Block<br />
16 [bitfield] Indicates which of the 16 channels in this channel block <br />
are present in this logical OGGPCM stream.<br />
8 [enum] Sample format (OGGPCM_FMT_U8, OGGPCM_FMT_LE_S16, OGGPCM_FMT_BE_S16, etc) <br />
24 [uint] Sample rate<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data, packing defined by Sample Format field in header<br />
<br />
Constraints:<br />
This format can support any __documented and registered__ sample format, since it uses an enumeration. Each logical stream can support up to 16 channels sharing a fixed sample rate. Logical streams from the same source may be multiplexed to provide up to 4096 channels per source, each stream with its own sample rate. Up to 256 sources may be multiplexed within a physical Ogg stream, unless an application takes other measures to logically partition the stream. <br />
<br />
Discussion:<br />
This seems to make it easy to support the simple/normal cases and possible to support the pathological cases, for instance:<br />
{| border="1" cellpadding="1"<br />
| Source ID || Channel Bitfield || Sample Rate || Sample Format || Comment<br />
|-<br />
| 0x00 || 0000 0000 0000 0011 || 96000 || OGGPCM_FMT_LE_S24 || Front Stereo Pair<br />
|-<br />
| 0x00 || 0000 0000 0011 1100 || 44100 || OGGPCM_FMT_LE_S16 || Center And Surrounds<br />
|-<br />
| 0x00 || 0000 0000 0010 0000 || 8000 || OGGPCM_FMT_LE_S16 || LFE Channel<br />
|-<br />
| 0x01 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || PC Speaker<br />
|-<br />
| 0x02 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || Microphone<br />
|-<br />
| 0x03 || 0000 0000 0000 0011 || 8000 || OGGPCM_FMT_LE_S16 || Voice Chat<br />
|}<br />
<br />
Each entry in the table is a logical Ogg stream. I'm not convinced that the source id and channel block are necessary, but figured I'd throw it out there.</div>Archttps://wiki.xiph.org/index.php?title=OggPCM_Draft1&diff=1993OggPCM Draft12005-11-10T21:35:10Z<p>Arc: /* Format */</p>
<hr />
<div>== What is it ==<br />
<br />
'''OggPCM''' is a pulse-code modulation (PCM) audio codec for Ogg. Similar to Microsoft's .wav or Apple's .aiff formats, it's a simple way to store and transfer uncompressed audio within an Ogg container.<br />
<br />
<br />
== Why is it ==<br />
This format is intended as an interchange format, especially for use with [[OggStream]]. It is also useful for storing time-synced decoded audio/video during development, instead of keeping RIFF/WAV (.wav) and YUV4MPEG (.yuv) data in separate files as we did with [[Theora]].<br />
<br />
It is also less complex than either .wav (RIFF) or .aiff (AIFF), both of these formats being designed for generic multimedia (audio, video, etc). Full compatibility with these formats includes support for non-PCM data.<br />
<br />
Using raw PCM data, on the other hand, doesn't give us that all-important header which carries information about the number of channels, sample width, and sample frequency. So what we need is a header followed by raw PCM data - nothing more complicated.<br />
<br />
== Format ==<br />
<br />
Packets are processed according to the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future extensions without breaking the data format. Multibyte fields in the header packet are packed in big endian order. Other fields are stored MSB first. Multibyte fields in the data packet are packed in little endian order.<br />
<br />
The granule position specified is the total samples encoded after including all samples on the page. Samples must not be split across pages. The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is. A truncated stream will still return the proper number of samples that can be decoded fully.<br />
<br />
An example of how this can be useful is the proposed ReplayGain extension to .wav format: http://replaygain.hydrogenaudio.org/file_format_wav.html<br />
<br />
Note that no such extension is planned, nor is the need for a future format foreseen, but history has shown that even the most basic formats eventually become obsolete.<br />
<br />
Packet 0, BOS, 12 bytes<br />
8 0x00 Header Packet ID<br />
24 "PCM" Codec identifier <br />
-<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
8 [int] Number of Channels (1-256)<br />
8 [int] Samples per Second<br />
-<br />
4 [int] Bytes per Sample (*8)<br />
1 [flg] False = MSB, True = LSB<br />
1 [flg] Float, if >2bytes/sample, Unsigned if 1byte/sample (False=signed int)<br />
2 [nil] Padding to byte/int - may be used for "extended" data type<br />
[ Channel Identifiers ]<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data<br />
<br />
== Alternative Format ==<br />
<br />
The primary difference between this format and the one above is that it is intended to support channels from the same source having different sampling parameters.<br />
<br />
Packet 0, BOS, tbd bytes<br />
8 0x00 Header Packet ID<br />
24 "PCM" Codec identifier <br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
8 [uint] Source ID (Unique amongst all OggPCM streams in the physical stream)<br />
8 [uint] Channel Block<br />
16 [bitfield] Indicates which of the 16 channels in this channel block <br />
are present in this logical OGGPCM stream.<br />
8 [enum] Sample format (OGGPCM_FMT_U8, OGGPCM_FMT_LE_S16, OGGPCM_FMT_BE_S16, etc) <br />
24 [uint] Sample rate<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "PCM" Codec identifier, pads data to 32-bits<br />
.. [data] variable length pcm data, packing defined by Sample Format field in header<br />
<br />
Constraints:<br />
This format can support any __documented and registered__ format, since it uses an enumeration. Each logical stream can support up to 16 channels sharing a fixed sample rate. Logical streams from the same source may be multiplexed to provide up to 4096 channels per source, each with their own sample rate. Up to 256 Sources may be multiplexed within a physical Ogg stream, unless an application takes other measures to logically partition the stream. <br />
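<br />
A rough sketch of reading the alternative header (Python; <code>AltHeader</code> and <code>parse_alt_header</code> are hypothetical names). The fields listed in the draft add up to 14 bytes. The sketch assumes big endian packing for the multibyte fields, as stated for the header packet of the first format, and leaves the sample format as a raw number because the draft does not assign enum values:<br />
<br />
```python
import struct
from typing import NamedTuple


class AltHeader(NamedTuple):
    version_major: int
    version_minor: int
    source_id: int
    channel_block: int
    channel_bitfield: int
    sample_format: int  # raw enum value; numbering is not defined in the draft
    sample_rate: int


def parse_alt_header(packet: bytes) -> AltHeader:
    """Parse the 14 bytes listed in the alternative header draft."""
    if len(packet) < 14:
        raise ValueError("header packet too short")
    # ">" selects big endian with no padding: 1+3+1+1+1+1+2+1+3 = 14 bytes.
    (packet_id, codec, major, minor, source, block,
     bitfield, fmt, rate_bytes) = struct.unpack(">B3sBBBBHB3s", packet[:14])
    if packet_id != 0x00 or codec != b"PCM":
        raise ValueError("not an OggPCM header packet")
    # The 24-bit sample rate has no struct code, so decode it separately.
    rate = int.from_bytes(rate_bytes, "big")
    return AltHeader(major, minor, source, block, bitfield, fmt, rate)
```
<br />
Any bytes after the first 14 are ignored here, which leaves room for an extended header in a later minor version.<br />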
<br />
Discussion:<br />
This seems to make it easy to support the simple/normal cases and possible to support the pathological cases, for instance:<br />
{| border="1" cellpadding="1"<br />
| Source ID || Channel Bitfield || Sample Rate || Sample Format || Comment<br />
|-<br />
| 0x00 || 0000 0000 0000 0011 || 96000 || OGGPCM_FMT_LE_S24 || Front Stereo Pair<br />
|-<br />
| 0x00 || 0000 0000 0011 1100 || 44100 || OGGPCM_FMT_LE_S16 || Center And Surrounds<br />
|-<br />
| 0x00 || 0000 0000 0010 0000 || 8000 || OGGPCM_FMT_LE_S16 || LFE Channel<br />
|-<br />
| 0x01 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || PC Speaker<br />
|-<br />
| 0x02 || 0000 0000 0000 0001 || 8000 || OGGPCM_FMT_U8 || Microphone<br />
|-<br />
| 0x03 || 0000 0000 0000 0011 || 8000 || OGGPCM_FMT_LE_S16 || Voice Chat<br />
|}<br />
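<br />
To show how a channel block and bitfield might expand into channel numbers, here is a small sketch (Python; <code>channels_in_stream</code> is a hypothetical helper). It assumes that bit 0 is the first channel of a block and that block N covers channels N*16 to N*16+15, which fits the 4096-channel limit but is not spelled out in the draft:<br />
<br />
```python
from typing import List


def channels_in_stream(channel_block: int, bitfield: int) -> List[int]:
    """List the absolute channel numbers present in one logical stream.

    Assumption: bit 0 of the bitfield is the lowest channel of the block,
    and block N starts at channel N * 16.
    """
    if not 0 <= channel_block <= 0xFF or not 0 <= bitfield <= 0xFFFF:
        raise ValueError("field out of range")
    base = channel_block * 16
    return [base + bit for bit in range(16) if bitfield & (1 << bit)]
```
<br />
With 256 blocks of 16 channels, the highest channel number under this interpretation is 4095.<br />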
<br />
Each entry in the table is a logical Ogg stream. I'm not convinced that the source id and channel block are necessary, but figured I'd throw it out there.</div>Archttps://wiki.xiph.org/index.php?title=OggRaw&diff=1994OggRaw2005-11-10T19:19:46Z<p>Arc: </p>
<hr />
<div>== Purpose ==<br />
<br />
=== For the Ogg Media Framework ===<br />
Within OggStream, codecs are recoded (encoded, decoded, or transcoded) from one format to another. For example, a Vorbis codec plugin could be used to convert a Vorbis I stream to a PCM stream. Ogg packets of these streams are imported and exported from OggStream, and several of these conversions can be used in sequence (plugin chaining) to attain a desired output from any supported input within that media type.<br />
<br />
Some codec plugins will only support one or two raw codecs, providing translations between different formatting options as needed. For example, if an Ogg FLAC file contains 64-bit float data and the media player attempting to play it only supports 16-bit signed int data, a second plugin could provide the conversion from 64-bit float to 16-bit signed int.<br />
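<br />
The kind of conversion such a second plugin would perform can be sketched as follows (Python; <code>float64_to_s16</code> is a hypothetical name, and the nominal float range of -1.0 to 1.0 with a scale factor of 32767 is an assumption, since no range is specified here):<br />
<br />
```python
def float64_to_s16(samples):
    """Convert float samples to 16-bit signed integers.

    Assumption: floats use a nominal range of -1.0..1.0. Values outside
    that range are clipped instead of wrapping around.
    """
    out = []
    for x in samples:
        x = max(-1.0, min(1.0, x))       # clip out-of-range values
        out.append(int(round(x * 32767)))  # scale to the 16-bit range
    return out
```
<br />
A real plugin would likely also offer dithering; this sketch only covers scaling and clipping.<br />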
<br />
Having these uncompressed codecs is, thus, essential for implementing the new Ogg media framework as it depends on interchange codecs which all applications desiring to work with a certain media type can reasonably support.<br />
<br />
<br />
=== For Low-CPU Storage ===<br />
While losslessly compressed Ogg codecs are available for both audio and video, some applications (e.g., live recording, editing, etc) find the higher CPU requirements for processing these formats less desirable than the need for additional storage capacity. Many of these applications require syncing information not sufficiently provided by RIFF (.wav/.avi) or Quicktime, where using an uncompressed codec within Ogg provides excellent cross-bitstream syncing. Alternatively, the application may be designed around the Ogg media framework, where storing data in an uncompressed Ogg codec makes it easier to encode later while keeping comments, etc.<br />
<br />
<br />
=== For Codec Development ===<br />
As we experienced with Ogg [[Theora]] development, there is a shortage of simple raw data formats which support the capabilities being tested in codec development. Additionally, the lack of inter-codec sync information (ie, when using .wav & yuv4mpeg2) for these non-Ogg raw formats makes debugging more difficult than it should be.<br />
<br />
These uncompressed Ogg codecs will hopefully solve these problems, by allowing a wide variety of data formatting options and proper inter-codec syncing for testing and development. We should not be limiting ourselves to what existing raw formats support.<br />
<br />
<br />
== Design ==<br />
<br />
=== Why Not FourCC ===<br />
The RIFF/Quicktime (FourCC) set of codecs has several dozen raw codecs, each supporting very specific formatting options. Many of these are special purpose, never used on a wide scale, and many formatting options are not available in this system. This situation was designed under the philosophy that if an application supports a codec, as identified by the 32-bit codec identifier (aka FourCC), it would be expected to support all the format options possible with that codec.<br />
<br />
That system, at its heart, is what we call "FourCC". Media frameworks designed around FourCC identify a codec purely by the 32-bit identifier, without versioning information and without further formatting information, so that from a simple table of 32-bit IDs the media framework can know which plugin to use and whether the application can support it. This, inevitably, creates a situation where most applications support a pool of popular uncompressed codecs while newer formats are delayed as they wait for approval from a central authority to introduce a new uncompressed codec to the pool, then for application developers to add support for it.<br />
<br />
We can do better.<br />
<br />
<br />
=== Our Philosophy ===<br />
Ogg, by contrast, doesn't have a unified codec identifier. Codec software is only required to be able to accurately distinguish its own streams from other codecs' streams, based on information in the first packet of those streams. Thus, we don't look at a universal identifier, match it against a table, and then know exactly which plugin to load. Instead, we pass packet 0 to our codec plugins and thereby learn which plugins support a specific version, particular feature and formatting sets, and other things which cannot be fit into a 32-bit identifier.<br />
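<br />
A minimal sketch of this identification step (Python; the plugin classes, <code>identify</code> and <code>find_plugins</code> are hypothetical and not an OggStream API; the byte layouts follow the draft OggPCM and OggYUV headers, where byte 0 is the packet ID, bytes 1-3 the codec identifier and byte 4 the major version):<br />
<br />
```python
class PcmPlugin:
    name = "OggPCM"

    @staticmethod
    def identify(packet0: bytes) -> bool:
        # Claims only draft OggPCM streams with major version 1.
        return packet0[:4] == b"\x00PCM" and packet0[4:5] == b"\x01"


class YuvPlugin:
    name = "OggYUV"

    @staticmethod
    def identify(packet0: bytes) -> bool:
        # Claims only draft OggYUV streams with major version 1.
        return packet0[:4] == b"\x00YUV" and packet0[4:5] == b"\x01"


def find_plugins(packet0, plugins):
    """Offer packet 0 to every plugin and return the names of those that claim it."""
    return [p.name for p in plugins if p.identify(packet0)]
```
<br />
Because each plugin inspects the whole first packet, it can refuse a stream for reasons a fixed 32-bit table lookup could not express, such as an unsupported major version.<br />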
<br />
By this, codec plugins may, and can even be expected to, support only a subset of the formatting options available with a codec. A video codec plugin could, for example, support only low bitrate video, but do so very well, or support only non-interlaced video. This allows the codec specification to include clean backwards compatibility, where a Theora to VP32 plugin could be written to only support Theora streams which do not use options unavailable to VP32.<br />
<br />
This changes the paradigm for our uncompressed codecs: unlike FourCC, we only need a handful of unified, generalized codecs which each support a wide variety of format options, perhaps far more than any application or codec plugin would ever use. This makes sense, as it eliminates artificial limitations previously imposed on codecs to simplify implementation when support was expected to be binary (either none at all or complete).<br />
<br />
Through minor revisions, we do not even need to support every possible format in the first implemented version. Values on formatting options, however, should be reserved for "extended" settings and a minor version field available to specify which version can, at minimum, parse the meaning of extended settings.</div>Archttps://wiki.xiph.org/index.php?title=OggRaw&diff=1988OggRaw2005-11-10T18:49:59Z<p>Arc: </p>
<hr />
<div>== Purpose ==<br />
<br />
=== For the Ogg Media Framework ===<br />
Within OggStream, codecs are recoded (encoded, decoded, or transcoded) from one format to another. For example, a Vorbis codec plugin could be used to convert a Vorbis I stream to a PCM stream. Ogg packets of these streams are imported and exported from OggStream, and several of these conversions can be used in sequence (plugin chaining) to attain a desired output from any supported input within that media type.<br />
<br />
Some codec plugins will only support one or two raw codecs, providing translations between different formatting options as needed. For example, if an Ogg FLAC file contains 64-bit float data and the media player attempting to play it only supports 16-bit signed int data, a second plugin could provide the conversion from 64-bit float to 16-bit signed int.<br />
<br />
Having these uncompressed codecs is, thus, essential for implementing the new Ogg media framework as it depends on interchange codecs which all applications desiring to work with a certain media type can reasonably support.<br />
<br />
<br />
=== For Low-CPU Storage ===<br />
While losslessly compressed Ogg codecs are available for both audio and video, some applications (e.g., live recording, editing, etc) find the higher CPU requirements for processing these formats less desirable than the need for additional storage capacity. Many of these applications require syncing information not sufficiently provided by RIFF (.wav/.avi) or Quicktime, where using an uncompressed codec within Ogg provides excellent cross-bitstream syncing. Alternatively, the application may be designed around the Ogg media framework, where storing data in an uncompressed Ogg codec makes it easier to encode later while keeping comments, etc.<br />
<br />
<br />
=== For Codec Development ===<br />
As we experienced with Ogg [[Theora]] development, there is a shortage of simple raw data formats which support the capabilities being tested in codec development. Additionally, the lack of inter-codec sync information (ie, when using .wav & yuv4mpeg2) for these non-Ogg raw formats makes debugging more difficult than it should be.<br />
<br />
These uncompressed Ogg codecs will hopefully solve these problems, by allowing a wide variety of data formatting options and proper inter-codec syncing for testing and development. We should not be limiting ourselves to what existing raw formats support.</div>Archttps://wiki.xiph.org/index.php?title=Main_Page&diff=1999Main Page2005-11-10T18:13:32Z<p>Arc: /* Codecs */</p>
<hr />
<div>= Projects/Formats =<br />
<br />
In an effort to bring open-source ideals to the world of multimedia The Xiph.org Foundation ([[XiphOrg]]) develops a multitude of amazing products. <br />
<br />
== Container Formats ==<br />
<br />
* [[Ogg]]: Media container. This is our native format and the recommended container for Xiph codecs.<br />
* [[OggSkeleton]]: Skeleton information on all logical content bitstreams in Ogg<br />
<br />
* [[SpeexRTP]]: RTP payload format for voice<br />
* [[VorbisRTP]]: RTP payload format for general audio<br />
* [[TheoraRTP]]: RTP payload format for video<br />
* [[XSPF]]: XML playlist format<br />
<br />
== Codecs ==<br />
<br />
* [[Vorbis]]: Audio codec<br />
* [[Tremor]]: Fixed-point decoder<br />
* [[Theora]]: Video codec<br />
* [[FLAC]]: Free Lossless Audio Codec<br />
* [[Speex]]: Speech codec<br />
* [[Ogg Writ]]: Text phrase codec (e.g. subtitles)<br />
* '''Under Development:'''<br />
** '''Metadata Codecs:'''<br />
*** [[Metadata]]: Arbitrary metadata stream format (vapourware so far)<br />
** '''Compressed Codecs:'''<br />
*** [[OggMNG]]: A mapping for encapsulating the MNG animation format in Ogg<br />
** '''[[RawCodecs|Raw/Uncompressed Codecs]]:'''<br />
*** [[OggPCM]]: Uncompressed PCM audio, primarily as an interchange codec<br />
*** [[OggRGB]]: Uncompressed RGB video, primarily as an interchange codec<br />
*** [[OggYUV]]: Uncompressed YUV video, primarily as an interchange codec, undergoing heavy debate<br />
<br />
== Software for distributing media ==<br />
<br />
* [[Icecast]]: Streaming server<br />
* [[Ices]]: Source client for Icecast servers<br />
* [[IceShare]]: P2P content distribution<br />
<br />
== Other software ==<br />
<br />
* [[OggComponent/VorbisComponent]]: Wrappers to integrate Ogg Vorbis into Mac OS X<br />
<br />
= Demonstrations =<br />
<br />
Want to hear Xiph in action? These projects are using our codecs, formats, or libraries.<br />
<br />
* [[VorbisStreams]]: Stations streaming with the Vorbis codec<br />
* [[Games that use Vorbis]]: Games using the Vorbis codec for music or sound effects<br />
* [[VorbisHardware]]: Hardware players using the Vorbis codec<br />
* [http://www.tversity.com TVersity Media Server]: A UPNP/AV compliant media server that uses the Ogg Vorbis libraries to transcode audio files to the Ogg Vorbis format.<br />
<br />
= Project management =<br />
<br />
* [[MonthlyMeeting]]<br />
* [[MailingLists]]<br />
* [[Bounties]]<br />
* [[HyperFish]]<br />
<br />
= Wiki internal =<br />
* [[Sandbox]]: Testbed for testing editing skills.<br />
* [[Translations]]: What about some translation work</div>Archttps://wiki.xiph.org/index.php?title=OggYUV&diff=1986OggYUV2005-11-10T06:08:15Z<p>Arc: /* Arc's Draft */</p>
<hr />
<div>== What is it ==<br />
<br />
'''OggYUV''' is an uncompressed YUV (YCbCr) video codec for Ogg. It's a simple way to store and transfer uncompressed video within an Ogg container.<br />
<br />
<br />
== Why is it ==<br />
<br />
The purpose of '''OggYUV''' is as a raw video interchange format within [[OggStream]] and other media frameworks. The format design is to be simple, complete, and efficient enough to be reasonably used to export decoded video to a media player for display.<br />
<br />
It can also replace our dependence on yuv4mpeg2 for lossless video storage, which Ogg [[Theora]] originally used as an encoding source. The main advantage over yuv4mpeg2 files is that, within an Ogg container, it can be time-synced with [[OggPCM]] in a much more reliable way than yuv4mpeg2 + wav audio.<br />
<br />
<br />
== Format ==<br />
<br />
Packets are processed according to the value of their first byte. Packets of unknown ID should be silently ignored, providing a convenient way to add future extensions without breaking the data format. Additional data in packet 0 (the header packet) must also be silently ignored.<br />
<br />
=== Arc's Draft ===<br />
Packet 0, BOS, 24 bytes<br />
8 0x00 Header Packet ID<br />
24 "YUV" Codec identifier <br />
--<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
 1 [flg] Color (false = B/W, next 8 bits must be null)<br />
3 [int] Chroma Pixel "Shape" (see [[Talk:OggYUV#Chroma_Subsampling_Methods|Draft Table]])<br />
1 [flg] 50% physical horizontal offset for Cr samples<br />
 1 [flg] 50% physical vertical offset for Cr samples<br />
2 [int] Chroma blending: 0=None, 1=UpperLeft Sampled/Blend Others, 2=Blend All<br />
1 [flg] Packed (false = Planar, next 2 bits must be null)<br />
1 [flg] Cr Staggered Horizontally in Bytestream<br />
1 [flg] Cr Staggered Vertically in Bytestream<br />
1 [flg] Big Endian/LSB (false = Little Endian/MSB)<br />
1 [flg] Interlaced (false = Progressive)<br />
3 ????? Interlace Options, other bytestream options<br />
--<br />
8 [int] Alpha channel bpp<br />
8 [int] Luma channel bpp<br />
8 [int] Chroma channels bpp<br />
8 [int] Colorspace (same as in [[theora]])<br />
--<br />
24 [int] Frame Width<br />
24 [int] Frame Height<br />
24 [int] Aspect Numerator<br />
24 [int] Aspect Denominator<br />
--<br />
32 [int] Framerate Numerator<br />
32 [int] Framerate Denominator<br />
<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "YUV" Codec identifier, pads data to 32-bits<br />
.. [data] variable length packed [a]rgb frame<br />
<br />
=== John's Draft ===<br />
Packet 0, BOS, 24 bytes<br />
8 0x00 Header Packet ID<br />
24 "UVS" Codec identifier (uncompressed video stream)<br />
 8 0x01 Version Major (breaks backwards compatibility to increment)<br />
 8 0x00 Version Minor (backwards compatible, i.e., via extended header)<br />
---- Video description ----<br />
16 [uint] Display Width<br />
16 [uint] Display Height<br />
16 [uint] Frame Aspect Ratio Numerator<br />
 16 [uint] Frame Aspect Ratio Denominator<br />
 16 [uint] Field Rate Numerator<br />
 16 [uint] Field Rate Denominator<br />
8 [uint] Fields Per Frame<br />
8 [enum] Colorspace<br />
32 [uint] FourCC (optional, set to zero if N/A or unknown)<br />
---- Sampling description ----<br />
16 [uint] Stored Width<br />
16 [uint] Stored Height<br />
2 [uint] U Channel Horiz. (X) Samples/Macropixel, 0 = 4.<br />
2 [uint] U Channel Vert. (Y) Samples/Macropixel, 0 = 4.<br />
2 [uint] U Channel Horiz. Sample Offset (Fraction of pixel stride)<br />
2 [uint] U Channel Vert. Sample Offset (Fraction of pixel stride)<br />
2 [uint] V Channel Horiz. (X) Samples/Macropixel, 0 = 4.<br />
2 [uint] V Channel Vert. (Y) Samples/Macropixel, 0 = 4.<br />
2 [uint] V Channel Horiz. Sample Offset (Fraction of pixel stride)<br />
2 [uint] V Channel Vert. Sample Offset (Fraction of pixel stride)<br />
16 [uint] Pad<br />
---- Storage description ----<br />
8 [uint] A Channel Bits Per Sample<br />
8 [uint] Y/R Channel Bits Per Sample<br />
8 [uint] U/G Channel Bits Per Sample<br />
8 [uint] V/B Channel Bits Per Sample<br />
32 [uint] A Channel Data Offset (in bits)<br />
32 [ int] A Channel X Stride (in bits)<br />
32 [ int] A Channel Y Stride (in bits)<br />
32 [uint] Y/R Channel Data Offset (in bits)<br />
32 [ int] Y/R Channel X Stride (in bits)<br />
32 [ int] Y/R Channel Y Stride (in bits)<br />
32 [uint] U/G Channel Data Offset (in bits)<br />
32 [ int] U/G Channel X Stride (in bits)<br />
32 [ int] U/G Channel Y Stride (in bits)<br />
32 [uint] V/B Channel Data Offset (in bits)<br />
32 [ int] V/B Channel X Stride (in bits)<br />
32 [ int] V/B Channel Y Stride (in bits)<br />
<br />
Data Packet<br />
8 0xFF Data Packet ID<br />
24 "UVS" Codec identifier, pads data to 32-bits<br />
.. [data] variable length packed field</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggYUV&diff=1981Talk:OggYUV2005-11-10T05:30:20Z<p>Arc: /* Chroma Subsampling Methods */</p>
<hr />
<div>=== Interlace Flag? ===<br />
* The interlacing information doesn't seem complete to me. How do you know which field(s) you have in any given packet, for example? How do you distinguish between a 25Hz shutter and a 50Hz shutter? Field order switching? Mixing with uninterlaced data?<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* In my experience, all interlace is every other frame, even scanlines followed by odd scanlines. Is there any video codec which supports more than an interlace flag? <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Variable frame-rates ===<br />
* There doesn't seem to be any handling of variable frame-rate data, or a specification for a timebase for the granulepos.<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* Granulepos is the last frame decodable in the current packet/page. As far as variable framerates within a single stream, is there any codec which supports this currently? <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Codec Identifier ===<br />
* The identifier seems a little short. You'd get false positives if somebody wanted to use a "YUVx" format, for example.<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* I believe that's OK with raw formats, if someone wanted to use a YUV-like codec they could use a prefix, vs a suffix, to identify it by. Also, if their header packet ID is something other than 0x00, it will not generate a false positive to have a YUV* codec identifier since the YUV plugins only support streams which begin with packet id 0. <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Aspect ratio ===<br />
* Is the aspect ratio the pixel aspect or the frame aspect? <br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* Frame aspect, this acts exactly like the aspect ratio in the Theora header, right down to having the same bit-size for the fields. Typically, the ratio is 4:3 or 16:9. <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Chroma Subsampling Methods ===<br />
* We need to know two things, what the size/shape of chroma pixels are, and if they are packed, what order they are provided in the bitstream.<br />
** This separates the order of the data and the processing of the data, both of which are important but get very complicated if mixed<br />
** The order must match the shape it's based on, ie, 4:4:4 should be in "0:0" order, any other value (which is illegal) should not be supported by any software if encountered and should never be generated.<br />
<br />
Chroma Pixel "Shapes"<br />
=====================<br />
ID Shape Used-In<br />
0 #--- 4:4:4<br />
----<br />
----<br />
----<br />
.<br />
1 ##-- 4:2:2<br />
----<br />
---- <br />
----<br />
.<br />
2 #### 4:1:1<br />
----<br />
----<br />
----<br />
.<br />
3 #--- ?<br />
#---<br />
----<br />
----<br />
.<br />
4 ##-- 4:2:0<br />
##--<br />
----<br />
----<br />
.<br />
5 #### 4:1:0<br />
####<br />
----<br />
----<br />
.<br />
6 #### ?<br />
####<br />
####<br />
####<br />
.<br />
7 Extended Shape, unsupported in v1.0</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggYUV&diff=1974Talk:OggYUV2005-11-10T03:36:36Z<p>Arc: </p>
<hr />
<div>=== Interlace Flag? ===<br />
* The interlacing information doesn't seem complete to me. How do you know which field(s) you have in any given packet, for example? How do you distinguish between a 25Hz shutter and a 50Hz shutter? Field order switching? Mixing with uninterlaced data?<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* In my experience, all interlace is every other frame, even scanlines followed by odd scanlines. Is there any video codec which supports more than an interlace flag? <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Variable frame-rates ===<br />
* There doesn't seem to be any handling of variable frame-rate data, or a specification for a timebase for the granulepos.<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* Granulepos is the last frame decodable in the current packet/page. As far as variable framerates within a single stream, is there any codec which supports this currently? <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Codec Identifier ===<br />
* The identifier seems a little short. You'd get false positives if somebody wanted to use a "YUVx" format, for example.<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* I believe that's OK with raw formats, if someone wanted to use a YUV-like codec they could use a prefix, vs a suffix, to identify it by. Also, if their header packet ID is something other than 0x00, it will not generate a false positive to have a YUV* codec identifier since the YUV plugins only support streams which begin with packet id 0. <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Aspect ratio ===<br />
* Is the aspect ratio the pixel aspect or the frame aspect? <br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* Frame aspect, this acts exactly like the aspect ratio in the Theora header, right down to having the same bit-size for the fields. Typically, the ratio is 4:3 or 16:9. <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Chroma Subsampling Methods ===<br />
# There are fewer than 32 possible methods for chroma subsampling within a 4x2 block, so it'd be fairly simple to put these into a table and refer to them by index# <br>--[[User:Arc|Arc]] 19:36, 9 Nov 2005 (PST)<br />
#* This method seems a bit.. non-optimal, and also doesn't define how those samples are mapped to neighboring pixels.<br>--[[User:Arc|Arc]] 19:36, 9 Nov 2005 (PST)<br />
# U and V can be defined separately, with more patterns specified. These patterns are listed below. --[[User:Arc|Arc]] 19:36, 9 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggYUV&diff=1973Talk:OggYUV2005-11-09T21:17:41Z<p>Arc: syntax unification, comments as list items, followed by sig on newline</p>
<hr />
<div>=== Interlace Flag? ===<br />
* The interlacing information doesn't seem complete to me. How do you know which field(s) you have in any given packet, for example? How do you distinguish between a 25Hz shutter and a 50Hz shutter? Field order switching? Mixing with uninterlaced data?<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* In my experience, all interlace is every other frame, even scanlines followed by odd scanlines. Is there any video codec which supports more than an interlace flag? <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Variable frame-rates ===<br />
* There doesn't seem to be any handling of variable frame-rate data, or a specification for a timebase for the granulepos.<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* Granulepos is the last frame decodable in the current packet/page. As far as variable framerates within a single stream, is there any codec which supports this currently? <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Codec Identifier ===<br />
* The identifier seems a little short. You'd get false positives if somebody wanted to use a "YUVx" format, for example.<br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* I believe that's OK with raw formats, if someone wanted to use a YUV-like codec they could use a prefix, vs a suffix, to identify it by. Also, if their header packet ID is something other than 0x00, it will not generate a false positive to have a YUV* codec identifier since the YUV plugins only support streams which begin with packet id 0. <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Aspect ratio ===<br />
* Is the aspect ratio the pixel aspect or the frame aspect? <br />
--[[User:Gumboot|Gumboot]] 03:00, 9 Nov 2005 (PST)<br />
<br />
* Frame aspect, this acts exactly like the aspect ratio in the Theora header, right down to having the same bit-size for the fields. Typically, the ratio is 4:3 or 16:9. <br />
--[[User:Arc|Arc]] 10:42, 9 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=1967Talk:OggPCM Draft12005-11-09T21:13:19Z<p>Arc: </p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can be easily changed to signed as default losslessly. Unsigned 8-bit data (where 128 is the median) is easily changed to signed, and changed back if being saved as RIFF/WAV (which only supports unsigned 8-bit). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec. <br />
--[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally. <br />
--[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
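<br />
For reference, the lossless conversion mentioned in this thread can be sketched as follows (Python; the helper names are hypothetical). Unsigned 8-bit samples use 128 as the midpoint, so the mapping is a fixed offset in each direction:<br />
<br />
```python
def u8_to_s8(data: bytes) -> list:
    """Map unsigned 8-bit samples (midpoint 128) to signed values -128..127."""
    return [b - 128 for b in data]


def s8_to_u8(samples) -> bytes:
    """Inverse mapping, e.g. before saving as 8-bit RIFF/WAV."""
    return bytes(s + 128 for s in samples)
```
<br />
Since the two functions are exact inverses, fixing one representation in the format costs converters nothing in fidelity.<br />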
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. <br />
--[[User:Arc|Arc]]<br />
<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly, so we should probably settle on one for the data and stick with it. It's a fairly low-CPU process to change the endian on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. <br />
--[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross architecture application that can deal with WAVs will already know how to support it. <br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
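<br />
To illustrate how small the conversion discussed here is, a sketch of a byte-order swap over a buffer of fixed-width samples (Python; <code>swap_sample_endianness</code> is a hypothetical helper, not a libogg2 or OggStream function):<br />
<br />
```python
def swap_sample_endianness(data: bytes, bytes_per_sample: int) -> bytes:
    """Reverse the byte order of every sample in a PCM buffer."""
    if bytes_per_sample < 1 or len(data) % bytes_per_sample:
        raise ValueError("buffer is not a whole number of samples")
    out = bytearray(len(data))
    for i in range(0, len(data), bytes_per_sample):
        # Reverse one sample's bytes in place of the original order.
        out[i:i + bytes_per_sample] = data[i:i + bytes_per_sample][::-1]
    return bytes(out)
```
<br />
Applying the function twice returns the original buffer, so a plugin could convert in either direction with the same code.<br />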
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. <br />
--[[User:Arc|Arc]]<br />
<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
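<br />
The MSB packing suggested here could look roughly like this (Python; <code>pad_to_msbs</code> is a hypothetical helper, and the sketch covers only the left-justification, not the rounding remark):<br />
<br />
```python
def pad_to_msbs(sample: int, bits: int) -> int:
    """Left-justify a `bits`-wide sample in the next whole number of bytes.

    A 10-bit sample lands in the top 10 bits of a 16-bit word, so software
    that only understands 16-bit audio sees it at roughly the right level.
    """
    container_bits = ((bits + 7) // 8) * 8  # round up to a byte boundary
    return sample << (container_bits - bits)
```
<br />
Samples whose width is already a multiple of 8 bits pass through unchanged.<br />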
<br />
<br />
== Do we want/need the 32-bit data packet header? ==<br />
* The issue was raised on the ogg-dev mailing list of whether this is necessary. With only a single header packet, it could be considered an unneeded complication, however, additional header packets (current or future) will make this a requirement. --[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. <br />
--[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=1964Talk:OggPCM Draft12005-11-09T21:12:10Z<p>Arc: /* Do we need to offer endian data flag? If not, which is used? */</p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can be easily changed to signed as default losslessly. Unsigned 8-bit data (where 128 is the median) is easily changed to signed, and changed back if being saved as RIFF/WAV (which only supports unsigned 8-bit). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec. --[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally. --[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
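
A minimal sketch of the lossless signed/unsigned conversion discussed above, assuming 8-bit samples whose unsigned midpoint is 128. The function names are made up for illustration and are not part of any OggPCM draft.

```python
def u8_to_s8(data):
    """Map unsigned 8-bit samples (midpoint 128) to signed values (midpoint 0)."""
    return [b - 128 for b in data]


def s8_to_u8(samples):
    """Map signed 8-bit samples (-128..127) back to unsigned bytes (0..255)."""
    # bytes() raises ValueError if a result falls outside 0..255.
    return bytes(s + 128 for s in samples)
```

Because the mapping is a fixed offset, converting in one direction and then back returns the original bytes, which is what makes a fixed signed default workable.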
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. --[[User:Arc|Arc]]<br />
<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly, so one should probably be settled on for the data and stuck with. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. --[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross-architecture application that can deal with WAVs will already know how to support it. --[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. --[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
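
To illustrate why the conversion is cheap and lossless, here is a rough sketch of swapping the byte order of fixed-width samples. The helper name and the bytes-per-sample parameter are assumptions for the example, not anything defined by the draft.

```python
def swap_endian(data, sample_bytes):
    """Reverse the byte order of every sample in a buffer of fixed-width samples."""
    if sample_bytes < 1 or len(data) % sample_bytes:
        raise ValueError("buffer length must be a multiple of the sample width")
    return b"".join(
        data[i:i + sample_bytes][::-1]
        for i in range(0, len(data), sample_bytes)
    )
```

Applying the swap twice gives back the original buffer, so a plugin could convert in either direction without loss.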
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. --[[User:Arc|Arc]]<br />
<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? --[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
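
One possible reading of the "standardize on IEEE floats" answer, sketched as a decoder that accepts only 32-bit and 64-bit float samples and rejects everything else. Little-endian byte order is assumed here because the endian discussion above leans that way; the function name is hypothetical.

```python
import struct


def decode_float_samples(data, bits_per_sample):
    """Decode little-endian IEEE 754 samples; only 32 and 64 bits are accepted."""
    codes = {32: "f", 64: "d"}
    if bits_per_sample not in codes:
        raise ValueError("float data must be 32 or 64 bits per sample")
    width = bits_per_sample // 8
    if len(data) % width:
        raise ValueError("buffer length is not a whole number of samples")
    count = len(data) // width
    return list(struct.unpack("<%d%s" % (count, codes[bits_per_sample]), data))
```

Under this reading, a file that declares float data with any other sample width is simply invalid rather than something a decoder has to interpret.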
<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
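
A small sketch of the MSB packing suggested above, assuming signed integer samples and a container rounded up to the next whole byte. It only models the zero-filled low bits; the rounding remark is not modelled, and all names are illustrative.

```python
def container_bits(bits):
    """Round a sample width up to the next whole number of bytes, in bits."""
    return (bits + 7) // 8 * 8


def pad_to_msb(sample, bits):
    """Place a signed sample in the most significant bits of its container."""
    return sample << (container_bits(bits) - bits)


def unpad_from_msb(value, bits):
    """Recover the original sample from its MSB-aligned container value."""
    return value >> (container_bits(bits) - bits)
```

With this layout a 10-bit sample occupies the top ten bits of a 16-bit word, so software that only understands 16-bit audio still sees a correctly scaled signal.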
<br />
<br />
== Do we want/need the 32-bit data packet header? ==<br />
* The question of whether this is necessary was raised on the ogg-dev mailing list. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement. --[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. --[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=1963Talk:OggPCM Draft12005-11-09T21:11:52Z<p>Arc: </p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can be easily changed to signed as default losslessly. Unsigned 8-bit data (where 128 is the median) is easily changed to signed, and changed back if being saved as RIFF/WAV (which only supports unsigned 8-bit). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec. --[[User:Arc|Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally. --[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. --[[User:Arc|Arc]]<br />
<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly, so one should probably be settled on for the data and stuck with. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. --[[User:Arc|Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross-architecture application that can deal with WAVs will already know how to support it.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. --[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. --[[User:Arc|Arc]]<br />
<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? --[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
<br />
== Do we want/need the 32-bit data packet header? ==<br />
* The question of whether this is necessary was raised on the ogg-dev mailing list. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement. --[[User:Arc|Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. --[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Talk:OggPCM_Draft1&diff=1962Talk:OggPCM Draft12005-11-09T21:11:06Z<p>Arc: </p>
<hr />
<div>=== Do we need signed/unsigned data flag? ===<br />
<br />
* Not really. The data can be easily changed to signed as default losslessly. Unsigned 8-bit data (where 128 is the median) is easily changed to signed, and changed back if being saved as RIFF/WAV (which only supports unsigned 8-bit). However, it wouldn't hurt to support it. Applications can be built to support one or multiple formats, thus requesting conversion if not supported by the codec. --[[User:Arc]]<br />
<br />
* I don't agree with that. It just puts more conditional code into packages that would normally have only one native format and it gives them more opportunity to fail to support variants of the format. If it's fixed then a few packages will always have to modify the data, and most will never get it wrong. If it's variable then every package will have to do something sometimes, or fail occasionally. --[[User:Gumboot|Gumboot]] 01:28, 8 Nov 2005 (PST)<br />
<br />
=== Do we need to record int/float data flag? ===<br />
<br />
* Some codecs (Vorbis) use floating point samples natively. Others only support int. Support for int/float data flag is thus important. --[[User:Arc]]<br />
<br />
<br />
=== Do we need to offer endian data flag? If not, which is used? ===<br />
<br />
* LSB/MSB can be changed losslessly, so one should probably be settled on for the data and stuck with. It's a fairly low-CPU process to change the endianness on the application side in any event, and if the application uses the bitpacker, this isn't even an issue. Supporting both is possible, too, but adds complexity to a format intended to be ''simple''. --[[User:Arc]]<br />
<br />
* We should just standardize on little endian ordering for the data. It's commonly used and well supported in hardware and software. Any cross-architecture application that can deal with WAVs will already know how to support it.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I agree that we should use little endian as standard, however, I'm questioning if big endian should be supported as well... after all, it'd be trivial for a plugin to convert from one to another. --[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
=== Is it worth supporting a vorbiscomment header? ===<br />
<br />
* It'd be useful to be able to carry information like what was decoded, or CDDB IDs, or replaygain information. Besides, if you don't put it in then five other people will do it five different ways. --[[User:Arc]]<br />
<br />
<br />
=== How does one interpret a file where the Bits per Sample is neither 32 nor 64 and the Data Type is float? ===<br />
* One doesn't. Standardize on IEEE floats and be done with it. Simple, remember? :)<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I'm uncertain exactly what this question is. Hopefully the submitter can clarify? --[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)<br />
<br />
<br />
=== Are samples padded to some round number of bits? ===<br />
* I don't know of any PCM formats for non-octet based samples, but if you want to specify something, I'd say pack them into the MSB's of the next larger byte boundary, round toward zero, on a per channel basis. This should allow software that knows how to handle 16 bit audio but not 10 bit to operate on the data.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
<br />
== Do we want/need the 32-bit data packet header? ==<br />
* The question of whether this is necessary was raised on the ogg-dev mailing list. With only a single header packet it could be considered an unneeded complication; however, additional header packets (current or future) will make it a requirement. --[[User:Arc]]<br />
<br />
* I can definitely see people wanting to use comment pages, so I'd say leave the header on the data pages as well. On the other hand, if ogg provides guarantees about the alignment of packet data from packetout, I could see getting rid of it since there are benefits to working on buffers aligned to larger boundaries on some architectures. As far as I can tell, either no guarantees are made, or you'll get a buffer aligned to a word boundary, in which case having the header has no penalty.<br />
--[[User:Jkoleszar|Jkoleszar]] 11:48, 9 Nov 2005 (PST)<br />
<br />
* I believe that 64-bit platforms still use 32-bit memory space (I may be wrong!). Yes, libogg2 buffers should always begin on a 32-bit word boundary, so the beginning of the data should also be on a boundary. This was done intentionally, as was the choice to use a three letter codec identifier for raw codecs (since the packet ID + codec ID = 32bits this way), after an extended IRC discussion on the subject. If ending on a 64-bit boundary is something we're really worried about, we could always add 4 bytes, but I really don't think it should be necessary. --[[User:Arc|Arc]] 13:11, 9 Nov 2005 (PST)</div>Archttps://wiki.xiph.org/index.php?title=Main_Page&diff=1982Main Page2005-11-09T20:59:15Z<p>Arc: /* Codecs */</p>
<hr />
<div>= Projects/Formats =<br />
<br />
In an effort to bring open-source ideals to the world of multimedia, the Xiph.org Foundation ([[XiphOrg]]) develops a multitude of amazing products.<br />
<br />
== Container Formats ==<br />
<br />
* [[Ogg]]: Media container. This is our native format and the recommended container for Xiph codecs.<br />
* [[OggSkeleton]]: Skeleton information on all logical content bitstreams in Ogg<br />
<br />
* [[SpeexRTP]]: RTP payload format for voice<br />
* [[VorbisRTP]]: RTP payload format for general audio<br />
* [[TheoraRTP]]: RTP payload format for video<br />
* [[XSPF]]: XML playlist format<br />
<br />
== Codecs ==<br />
<br />
* [[Vorbis]]: Audio codec<br />
* [[Tremor]]: Fixed-point decoder<br />
* [[Theora]]: Video codec<br />
* [[FLAC]]: Free Lossless Audio Codec<br />
* [[Speex]]: Speech codec<br />
* [[Ogg Writ]]: Text phrase codec (e.g. subtitles)<br />
* '''Under Development:'''<br />
** [[Metadata]]: Arbitrary metadata stream format (vapourware so far)<br />
** [[OggMNG]]: A mapping for encapsulating the MNG animation format in Ogg<br />
** [[OggPCM]]: Uncompressed PCM audio, primarily as an interchange codec<br />
** [[OggRGB]]: Uncompressed RGB video, primarily as an interchange codec<br />
** [[OggYUV]]: Uncompressed YUV video, primarily as an interchange codec, undergoing heavy debate<br />
<br />
== Software for distributing media ==<br />
<br />
* [[Icecast]]: Streaming server<br />
* [[Ices]]: Source client for Icecast servers<br />
* [[IceShare]]: P2P content distribution<br />
<br />
== Other software ==<br />
<br />
* [[OggComponent/VorbisComponent]]: Wrappers to integrate Ogg Vorbis into Mac OS X<br />
<br />
= Demonstrations =<br />
<br />
Want to hear Xiph in action? These projects are using our codecs, formats, or libraries.<br />
<br />
* [[VorbisStreams]]: Stations streaming with the Vorbis codec<br />
* [[Games that use Vorbis]]: Games using the Vorbis codec for music or sound effects<br />
* [[VorbisHardware]]: Hardware players using the Vorbis codec<br />
* [http://www.tversity.com TVersity Media Server]: A UPNP/AV compliant media server that uses the Ogg Vorbis libraries to transcode audio files to the Ogg Vorbis format.<br />
<br />
= Project management =<br />
<br />
* [[MonthlyMeeting]]<br />
* [[MailingLists]]<br />
* [[Bounties]]<br />
* [[HyperFish]]<br />
<br />
= Wiki internal =<br />
* [[Sandbox]]: Testbed for testing editing skills.<br />
* [[Translations]]: What about some translation work</div>Arc