Summer of Code 2008: Difference between revisions

Revision as of 02:25, 20 March 2007

Current Ideas

This is our ideas page for Google Summer of Code projects with Xiph.org and Annodex. The two projects are participating jointly this year under Xiph's name.

We need a primary and backup mentor volunteer for any project that is to become an official proposal, but submit something and we'll see who we can round up. :)

Note: for Google Summer of Code 2007, mentoring organizations apply between March 5 and March 12, and students apply between March 14 and March 24.

Students: please use the template at Summer of Code Applications when applying for a GSoC position.

Mentors: for details of our mentor application and plan, please see Summer of Code 2007.

Students should also check out projects related to the Elphel Open Source cameras.

Optimize Theora encoding/decoding speed, SSE/SSE2

Work on MMX, SSE/SSE2 implementations of the crucial encoding and decoding elements in libtheora and/or theora-exp. This could include porting the vp3 mmx and altivec code to the libtheora decoder, and writing sse improvements on the mmx work that has already been done. The results must still build cleanly on other archs and do run-time capability detection.
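The run-time capability detection requirement can be sketched as a small dispatch table. This is only an illustration in Python, not libtheora code: the flag names, the `sad_*` routines, and `select_impl` are all hypothetical, and the "accelerated" variants simply call the portable reference so the example stays runnable anywhere.

```python
# Hypothetical kernels for illustration; a real port would supply
# MMX/SSE2 assembly behind the same interface.
def sad_generic(a, b):
    """Portable reference: sum of absolute differences of two pixel rows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def sad_mmx(a, b):
    # Stand-in for an MMX routine; must return the same result as the reference.
    return sad_generic(a, b)


def sad_sse2(a, b):
    # Stand-in for an SSE2 routine; must return the same result as the reference.
    return sad_generic(a, b)


# Ordered from most to least preferred. The portable version needs no flag,
# so the build still works on architectures without any of these extensions.
CANDIDATES = [("sse2", sad_sse2), ("mmx", sad_mmx), (None, sad_generic)]


def select_impl(cpu_flags):
    """Pick the best routine supported by the detected CPU flags (a set of strings)."""
    for flag, impl in CANDIDATES:
        if flag is None or flag in cpu_flags:
            return impl
    return sad_generic  # not reached: the last candidate always matches
```

The selection would run once at start-up, and the chosen routine would then be called through a function pointer, so a binary built with all variants still runs on a CPU that lacks them.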

You could start by improving this MMX loop filter patch for theora-exp, which was never completed or merged into the current theora-exp (see the whole list thread).

Mentor: Ralph Giles, Timothy Terriberry, backup: Jan Gerber, Mike Smith

Encode support in theora-exp

Implement a rate-distortion optimized encoding mode for theora-exp, including R-D optimized mode decision and quantization (e.g., constant lambda). Then, use the above routines to implement a medium-latency ABR encoding mode (e.g., varying lambda), with a default target buffer size of approximately 2 seconds.
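As a rough illustration of R-D optimized mode decision, each candidate mode can be scored with the usual Lagrangian cost J = D + lambda * R and the cheapest one chosen. The sketch below is not theora-exp code: the function names are hypothetical, and the mode labels and the distortion and rate figures are made up for the example.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def choose_mode(candidates, lam):
    """candidates maps a mode name to (distortion, rate_bits); return the cheapest mode."""
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))


# Illustrative numbers only: (distortion, bits) for one macroblock.
block = {
    "INTER_NOMV": (900.0, 2),
    "INTER_MV": (400.0, 30),
    "INTRA": (250.0, 120),
}
```

With a small lambda the encoder spends bits freely and picks the lowest-distortion mode. As lambda grows, cheaper modes win. An ABR mode could exploit this by raising lambda as its buffer fills and lowering it as the buffer drains.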

Mentor: Timothy "Derf" Terriberry, backup: Mike Smith, Ralph Giles

Development assistant for the "Ghost" audio codec

Designing a cutting-edge perceptual codec is a very daunting task. Xiph is in the research stage on a new low-latency, general purpose audio codec, code-named "Ghost". This is basically a "code assistant" position, where you will be asked to implement, test, and give feedback on ideas from Christopher Montgomery, designer of the Ogg Vorbis format. Be prepared to learn a lot about audio coding, or apply what you already know. While there's less "ownership" potential in this project proposal, it will be a great opportunity to learn about compression algorithm design, practice your programming chops, and learn to work in a team.

Mentor: Christopher "Monty" Montgomery, backup: Jean-Marc Valin

OggMNG implementation

Implement OggMNG decode support in gstreamer and/or illi's dshow filters. Implement encoding support based on byzanz or Istanbul. Bonus points for overlay support. Details on the OggMNG specifications here.

Mentors: Mike Smith, Ralph Giles

Theora reference encoder quality optimization

The libtheora encoder could make more use of some features present in the spec but not currently implemented in the encoder. This is a little open ended, but suggestions are: quant matrix tuning, per-block qi choice, 4:2:2 and 4:4:4 chroma support.

Mentor: Ralph Giles, backup: Timothy Terriberry

Subtitle Definition

There has been a long-standing need for the introduction of subtitles into Ogg. Several means have been suggested and various implementations exist. However, there has been no standard way that is supported by Xiph at this stage.

The CMML format with its time-aligned means of interleaving text into Ogg bitstreams is a platform on which we would very much like to define a standard means of including subtitles.

In this project, a standard means of interleaving subtitles (as found on DVDs or in srt files) into Ogg will be defined using CMML.

The project requires making changes to the CMML definition and extending it in several ways. CMML needs to have a valid XML schema or DTD definition associated with it, so that standard XML tools will parse it. The associated documentation should then be updated and software written to put e.g. an srt file into CMML inside Ogg. If there is enough time, it would also be good to implement support for this format in a media player such as vlc or mplayer or xine.
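To give a feel for the srt-to-CMML step, here is a minimal sketch that turns srt cues into time-aligned clips. The mapping is an assumption, since defining it is the point of the project: each cue becomes a `clip` element with `start` and `end` attributes in `npt:` seconds and the subtitle text in a `desc` child. The function names are hypothetical, and muxing the result into Ogg is not shown.

```python
import re
import xml.etree.ElementTree as ET

# Matches srt timestamps such as 00:01:02,500
TIME = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")


def srt_time_to_seconds(stamp):
    """Convert an srt timestamp (hh:mm:ss,mmm) to seconds."""
    hours, minutes, seconds, millis = TIME.match(stamp.strip()).groups()
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000.0


def srt_to_cmml(srt_text):
    """Build an XML string with one clip per srt cue (assumed mapping)."""
    root = ET.Element("cmml")
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # skip malformed cues: need index, timing, and text
        start, end = lines[1].split("-->")
        clip = ET.SubElement(root, "clip", {
            "id": "sub" + lines[0].strip(),
            "start": "npt:%.3f" % srt_time_to_seconds(start),
            "end": "npt:%.3f" % srt_time_to_seconds(end),
        })
        ET.SubElement(clip, "desc").text = "\n".join(lines[2:])
    return ET.tostring(root, encoding="unicode")


SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Hello

2
00:01:00,250 --> 00:01:02,000
Two
lines
"""
```

Calling `srt_to_cmml(SAMPLE)` yields two clips. A real converter would also need to settle character encoding, styling, and validation against whatever schema or DTD the project defines.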

Mentor: Silvia Pfeiffer, backup: Conrad Parker

Theora support in ekiga

Implement support for Theora as a video codec in the ekiga chat application. Overlaps with GNOME.

Mentor: Ralph Giles

MXF support in gstreamer

Implement an MXF mux/demux for gstreamer, with mappings for Vorbis and Theora.

Mentors: Christian Schaller, Mike Smith

Cascading Style Sheet support for CMML in GStreamer

Implement support for Cascading Style Sheets to add styling and positioning hints to CMML text overlays on Theora video. Doing so allows for advanced titling features visually similar to TV-style news headlines, sports scores, and scrolling text. The advantage over conventional "burnt-in" titling is that the stylesheet-driven approach is machine-readable, allowing indexing for search and improved accessibility.

The project involves:

YUV compositing support in the textoverlay plugin in gstreamer in order to complete the low-level support for color and font attributes. Textoverlay already includes partial support for pango text, but lacks the necessary colorspace conversions. This portion of the project necessarily involves C programming.
Implementation of a test application to play back video marked up with CSS titling hints. It is recommended that this portion of the project be implemented in a higher-level language for which GStreamer support and CSS parsing libraries exist, such as Python, Ruby or Haskell.
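The colorspace conversion that the textoverlay work depends on can be sketched as follows. This is not GStreamer code: the function names are hypothetical, and the sketch assumes 8-bit studio-range BT.601 Y'CbCr (Y' in 16..235, chroma centred on 128) and a single coverage value per pixel. The real plugin would be written in C and would also have to handle chroma subsampling.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to 8-bit studio-range BT.601 Y'CbCr (assumed target)."""
    rn, gn, bn = r / 255.0, g / 255.0, b / 255.0
    y = 16 + 65.481 * rn + 128.553 * gn + 24.966 * bn
    cb = 128 - 37.797 * rn - 74.203 * gn + 112.0 * bn
    cr = 128 + 112.0 * rn - 93.786 * gn - 18.214 * bn
    return tuple(int(round(v)) for v in (y, cb, cr))


def blend(video, text, alpha):
    """Composite a text pixel over a video pixel, both (Y, Cb, Cr).

    alpha is the glyph coverage in [0, 1]: 0 keeps the video, 1 shows the text.
    """
    return tuple(int(round(alpha * t + (1.0 - alpha) * v))
                 for v, t in zip(video, text))
```

A CSS colour such as white would be converted once with `rgb_to_ycbcr(255, 255, 255)`, then blended into each covered pixel using the coverage values from the rendered pango glyphs.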

Further work could include support for style sheets in liboggplay / Firefox, or direct support for style sheet retrieval and rendering in GStreamer (via the cmmldec plugin and/or a new style-sheet-aware textoverlay plugin).

Mentor: Conrad Parker, backup: Mike Smith, Silvia Pfeiffer

Hardware implementation of Theora decoding

Work on a hardware Theora decoder that can be used in embedded devices, DVD players and video pods. Presumably GPL Verilog source to run on an FPGA. See http://sourceforge.net/projects/elphel/ for a rough encoder implementation. This was a successful project in 2006.

Intel to AT&T x86 assembly translation

There is a general need for cross-platform projects to be able to compile the same asm acceleration code on both GCC and MSVC. Unfortunately, at least on x86, they have incompatible assembly formats. Currently people either convert one to the other by hand (a maintenance nightmare) or require an external compile/assemble step on one or the other platform.

Start with the (unmaintained?) intel2gas script. Spruce it up to support all of the recent MMX, SSE, SSE2 and SSE3 instructions. Then implement the reverse translation. Once both are working, write some glue code so it can be easily used as part of a GNU autotools build to derive one set of source from the other at build or package time.
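The core of the Intel-to-AT&T direction is reversing operand order and adding the `%` and `$` sigils. The toy translator below shows just that and is not derived from intel2gas. It assumes register and immediate operands only; memory operands, size suffixes, and directives, which are the hard part of the real project, make it raise an error. All names are hypothetical.

```python
import re

# A deliberately small register set for the sketch.
REGISTERS = (
    {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}
    | {"mm%d" % i for i in range(8)}
    | {"xmm%d" % i for i in range(8)}
)


def convert_operand(operand):
    """Translate one Intel-syntax operand to AT&T syntax."""
    operand = operand.strip()
    if operand.lower() in REGISTERS:
        return "%" + operand.lower()
    if re.fullmatch(r"-?\d+|0[xX][0-9a-fA-F]+", operand):
        return "$" + operand
    raise ValueError("unsupported operand: %r" % operand)


def intel_to_att(line):
    """Translate 'op dst, src' (Intel) into 'op src, dst' (AT&T)."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [convert_operand(o) for o in rest.split(",")] if rest.strip() else []
    operands.reverse()  # AT&T puts the source first, the destination last
    return (mnemonic.lower() + " " + ", ".join(operands)).strip()
```

For example, `intel_to_att("paddw mm0, mm1")` gives `paddw %mm1, %mm0`. The reverse translation would strip the sigils and swap the operands back.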

Speex and FLAC encoders in Xiph QuickTime Components

Implement Speex and FLAC Core Audio encoders.

XiphQT has a Vorbis encoder component that could be used as a reference and starting point.

Mentor: Arek Korbik

New vocoder for Speex

This is a real speech coding project! Speex currently has a very low bit-rate (2.15 kbps) mode that is implemented as a trivial vocoder. This mode has four "reserved" bits per frame, which means it would be possible to transmit more information. The idea of this project would be to make use of these bits to improve the quality of the 2.15 kbps mode. Changes to both the encoder and the decoder are allowed, provided that they are compatible with older versions. This means that the new bit-stream should be decodable by the old decoder with only minor loss in quality. This still leaves plenty of room for improvement. Upon successful completion, there will be a new vocoder that can be merged into the main Speex tree with improved quality and (non-exact) backward compatibility with the previous vocoder. The project requires some signal processing knowledge.

Mentor: Jean-Marc Valin

New Speex VAD/VBR code

The current Speex VAD/VBR code is a quick hack, put together a long time ago. This project consists of rewriting it to perform much better under all kinds of conditions. Performance criteria are: classification accuracy, quality in VAD/DTX mode, simplicity, and of course speed. This project involves pattern classification and speech processing, so it requires at least some knowledge in these fields (or at least one of them).
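To make the "classification accuracy" criterion concrete, here is a deliberately naive baseline: a fixed energy threshold per frame, scored against hand-labelled reference decisions. It is not the Speex algorithm, and a real VAD would need noise tracking, hangover logic, and spectral features. The function names and the threshold are hypothetical.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of PCM samples."""
    return sum(x * x for x in frame) / len(frame)


def classify(frames, threshold):
    """Return True (speech) for every frame whose energy exceeds the threshold."""
    return [frame_energy(frame) > threshold for frame in frames]


def accuracy(predicted, reference):
    """Fraction of frames where the decision matches the reference labels."""
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)
```

A harness along these lines, run over recordings with different noise levels, would let a rewritten VAD be compared against the current code on the same labelled material.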

Mentor: Jean-Marc Valin

rehuff: a tool to losslessly compress Vorbis files

It would be nice to have an updated version of "rehuff", a tool to losslessly compress Vorbis files. There was an experimental version of it (see rehuff status), but it had some limits:

it's not free software;
it has a bug that prevents rehuffed files from seeking correctly;
it works only with stereo files.

It would be nice to have an updated rehuff without the previous limits, and with a library part that would be included in libvorbisenc, so that all encoders could use it (rehuff binary, oggenc, ...). This is not an official Xiph.org project, only a user-proposed idea.

Ogg and Annodex integration into open source web Content Management Systems

The goal of this project is to make the use of Ogg Theora video in existing CMSs as easy as possible. The project would consist of integrating in-browser playback and structured CMML metadata into existing CMSs such as MediaWiki, Drupal or WordPress.

In-browser playback support will be handled by the vlc or Java cortado plugin, and then by liboggplay and native browser decoding as that project matures. mv_embed may be a starting point for client plugin detection. The extension package should also handle thumbnail generation via mplayer or ffmpeg, and ideally support transcoding via shell calls to ffmpeg2theora. Metadata attributes for the video in the CMS could be exportable as CMML (a standardized XML format for tagging continuous video).
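The thumbnail and transcoding calls might be wrapped roughly as below. This sketch only builds the argument lists and does not run anything. The helper names are hypothetical, and the flags (`-ss`, `-i` and `-vframes` for ffmpeg, `-o` for ffmpeg2theora) should be checked against the versions actually installed on the server.

```python
def thumbnail_command(video_path, image_path, at_seconds=5):
    """Argument list asking ffmpeg for a single frame near at_seconds."""
    return ["ffmpeg", "-ss", str(at_seconds), "-i", video_path,
            "-vframes", "1", image_path]


def transcode_command(source_path, ogg_path):
    """Argument list asking ffmpeg2theora to transcode an upload to Ogg Theora."""
    return ["ffmpeg2theora", source_path, "-o", ogg_path]
```

Passing such a list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with user-supplied file names, which matters when the CMS accepts uploads.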

Mentor: Michael Dale, backup: Silvia Pfeiffer

OggPlay: Time Offset Acceleration

OggPlay is a new library that enables developers to drop ogg media support into applications. OggPlay will be used to implement native Ogg/Annodex support in Firefox, and supports a range of features including playing Oggs provided in TCP streams.

This project involves adding time offset optimisation support to OggPlay in TCP mode. Upon successful completion, applications will be able to notify the library of "interesting" time regions of the file, either at the current time point or in the past or future. The library in turn will ensure that these regions are pinned in local memory or on disk using a combination of compressed stream and uncompressed frames. If the application later attempts to seek to a pinned time region, then access to the stream at that point will be much faster than other regions.
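One way to picture the bookkeeping behind "interesting" time regions is a set of merged, pinned intervals that the seek path consults. The class below is a sketch under that assumption. It is not the OggPlay API, its names are hypothetical, and it leaves out the actual caching of compressed stream data and decoded frames.

```python
class PinnedRegions:
    """Tracks which time ranges (in seconds) the application has pinned."""

    def __init__(self):
        self._regions = []  # sorted, non-overlapping (start, end) pairs

    def pin(self, start, end):
        """Pin [start, end], merging it with any regions it overlaps."""
        if end <= start:
            raise ValueError("end must be after start")
        merged = []
        for s, e in self._regions:
            if e < start or s > end:
                merged.append((s, e))  # disjoint: keep unchanged
            else:
                start, end = min(s, start), max(e, end)  # overlap: absorb
        merged.append((start, end))
        merged.sort()
        self._regions = merged

    def is_pinned(self, t):
        """True if a seek to time t would land in locally held data."""
        return any(s <= t <= e for s, e in self._regions)

    def regions(self):
        return list(self._regions)
```

A seek request would first check `is_pinned(t)` and read from local storage when it returns True, falling back to the slower path through the TCP stream otherwise.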

Time offset acceleration will be useful to projects such as Annodex, which allows annotation of time regions ('clips') in Ogg, as well as direct access to individual clips over the WWW, URL-based naming of clips, and linking between clips.

Mentor: Shane Stephens, backups: Marcin Lubonski, Michael Martin

Guidelines for Applying

Remember that many people will apply to work on the Summer of Code.

Keep in mind that those of us evaluating your application do not know you. We do not know what kind of experience you have or what you have done in the past, and we have to pick the people best suited for a particular task.

Hence, it is very important that you tell us in your email why you should be considered to implement a particular project. Please use the application template at Summer of Code Applications as a starting point.

@@ Line 119: / Line 119: @@
 === New Speex VAD/VBR code ===
-The current Speex VAD/VBR code is a quick hack, put together a long time ago. This project would consist of rewriting it to perform much better under all kinds of conditions. This project involves pattern classification and speech processing. Hence, it requires at least some signal processing knowledge.
+The current Speex VAD/VBR code is a quick hack, put together a long time ago. This project consists of rewriting it to perform much better under all kinds of conditions. Performance criteria are: classification accuracy, quality in VAD/DTX mode, simplicity, and of course speed. This project involves pattern classification and speech processing, so it requires at least some knowledge in these fields (or at least one of them).
 ''Mentor: Jean-Marc Valin''