Summer of Code 2008


This is our ideas page for Google Summer of Code projects with Xiph and Annodex. The two projects are participating jointly this year under Xiph's name.

Students please use the template at Summer of Code Applications when applying for a GSoC position.

Mentors please visit Summer of Code Mentoring and help us prepare our application as a mentoring organization.

Current Ideas

We need a primary and backup mentor volunteer for any project that is to become an official proposal, but submit something and we'll see who we can round up. :)

Project ideas go here

  • Transcode/Tag/Upload tool for Theora et al. (ideally as a Firefox extension so web CMS integration is easy); see the OggPusher details below
  • Theora encoding support in GIMP
  • Cross-platform qt4 wrapper around Xiph encoders. A do-it-all encoder in one simple GUI, possibly drag&drop a la OggDropXPd. This would make it tremendously easy for end-users to encode Theora et al
  • Theora Java port directly (semi)automatically derived from the reference sources
  • Optimisations for Oggenc and Co. as done by Lancer, which gives around 3x more speed. If one fears quality loss, make it a ./configure option. Lancer's diffs only work for Windows.
The last statement shows why this is a bad idea.
  • Speex support in IceS
  • Better stream source gui for dvswitch
  • XSPF support in ogg123 and oggenc (playlist creation)
  • Initial support for OggPCM in some of our tools
  • OggMNG tools
Is this really necessary? I mean, OggMNG seems to have gone nowhere and serve no niche.--Ivo
We still keep getting asked for a format where speex and images together make up a movie.--Silvia
  • ROE implementation for network: using ROE in a client-server negotiation to dynamically request a specific multi-track ogg file using skeleton (Silvia)
  • create Ogg caption support for vlc using CMML
  • ffmpeg improvements for Xiph codecs:
    • add Speex support
    • add Ogg Skeleton support
    • fix seeking bugs involving Ogg Theora
    • fix bugs in Ogg Theora decoder
    • improve ogg muxer
  • Ogg Cutter, a GUI to cut out segments from Ogg Videos, this could be based on oggz-chop (part of oggz-tools) or done with Gstreamer (starting with
  • Improve Xiph QuickTime Components:
    • add Ogg Skeleton support (would make XiphQT able to properly play streams served with mod_annodex)
    • add FLAC and Speex encoding support
    • improve user interface of the Ogg exporter
    • add AudioFile components supporting Ogg and FLAC files (to make XiphQT available to applications using only CoreAudio without QuickTime)
  • Portable oggenc2 --Fp 02:26, 14 May 2008 (PDT)
    • oggenc2 is a fork of oggenc available for the Win32 platform. Sources are available under the GPL. Unfortunately it does not compile under POSIX systems. Oggenc2 has a lot of features and bug fixes over oggenc, e.g.:
      • use of libsamplerate for resampling, giving a higher quality;
      • support of 32 bit and floating point WAV format for input;
    • this project should port all the improvements in oggenc2 to oggenc. Note that the two projects have diverged somewhat, so oggenc may have some features that oggenc2 lacks, and a straight port of oggenc2 to POSIX may not be the right approach. The best way is to get the sources of all versions of oggenc2, diff each version against the previous one, and try to apply the changes to oggenc.
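The version-by-version diffing suggested for oggenc2 could be sketched roughly as follows. This is only an illustration using Python's standard difflib; the version names and file contents are made up, and in practice one would more likely run a diff tool over whole source trees.

```python
import difflib


def version_diffs(versions):
    """Return one unified diff per consecutive pair of (name, text) versions."""
    diffs = []
    for (old_name, old_text), (new_name, new_text) in zip(versions, versions[1:]):
        lines = difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=old_name,
            tofile=new_name,
        )
        diffs.append("".join(lines))
    return diffs


# Hypothetical stand-ins for two releases of the same source file.
releases = [
    ("oggenc2-old/encode.c", "open_input();\nresample_linear();\n"),
    ("oggenc2-new/encode.c", "open_input();\nresample_libsamplerate();\n"),
]
patches = version_diffs(releases)
```

Each entry in `patches` is a small, reviewable change that can be judged on its own before being applied to oggenc.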

Detailed Project Description

Mv_Embed: Accessibility and [re]usability:

Mentor: Michael Dale, Anna (EngageMedia)
Existing Feature Set: Mv_Embed is an existing JavaScript library that takes the HTML5 <video> tag and rewrites it to support in-page Ogg Theora playback in contemporary browsers. Mv_Embed supports many browsers and plugins, including: native browser support such as the Firefox 3 video builds; the oggplay plugin for Firefox 2 on Windows, Mac and Linux; the VLC ActiveX control/plugin for IE and Firefox on Windows, and for Firefox on Mac and Linux; mplayer and totem on Linux; and Java Cortado on the Microsoft, Sun and Apple Java VMs for IE, Firefox and Safari. Mv_Embed maps all these plugin JavaScript systems to a ~near~ HTML5 spec API, so web application developers can use a uniform JavaScript API for video control and interaction without having to worry about the underlying plugin systems. Mv_Embed is used as part of the metavidWiki Project (screen cast).

Proposed Development: Mv_Embed will be enhanced around two goals: integration into prominent open source Content Management Systems, and better accessibility of closed captions and associated video metadata.

Mv_Embed will integrate into existing CMS video extensions for quick "one-off" ogg theora support.

  • FilmForge (Drupal)
  • ShowInABox (Wordpress)
  • Plumi (Plone)

Additional server side components like transcoding to Theora, generating thumbnails, and exporting metadata will also be developed. Where time and resources permit, server side hooks into ffmpeg2theora (for transcoding) and mplayer (for generating thumbnails) will be developed for the CMS systems as well. As OggPusher matures, simple hooks will be added to the CMSs to support direct Ogg Theora clip uploads.

Accessibility & CMML: The accessibility components of Mv_Embed consist of obtaining the metadata and putting it into the DOM as a child of the video element. Mv_Embed will offer a reference JavaScript interface for client interactions with that metadata. The metadata will be structured in Continuous Media Markup Language (CMML). CMML is part of the Annodex technology set and can either be muxed into the Ogg stream or be requested separately as XML. Mv_Embed will negotiate a transport method for the metadata that works for the given plugin type. (Currently only the oggplay plugin supports Ogg Skeleton and exposes muxed CMML tracks in the Ogg stream.)
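To make the kind of metadata lookup involved more concrete, here is a minimal sketch of finding the caption text that applies at a given playback time in a separately requested CMML document. It is written in Python rather than JavaScript purely for illustration. The sample document is invented, and it assumes clip times are given as plain seconds in the `start` and `end` attributes, which is a simplification of the time formats CMML allows.

```python
import xml.etree.ElementTree as ET

# Invented sample; real CMML documents carry more structure (head, stream, ...).
SAMPLE_CMML = """<cmml>
  <clip id="intro" start="0" end="5"><desc>Opening remarks</desc></clip>
  <clip id="vote" start="5" end="12"><desc>The vote begins</desc></clip>
</cmml>"""


def captions_at(cmml_text, seconds):
    """Return the <desc> text of every clip whose time range covers `seconds`."""
    root = ET.fromstring(cmml_text)
    found = []
    for clip in root.iter("clip"):
        # Assumption: times are plain seconds; a missing end means "open ended".
        start = float(clip.get("start", "0"))
        end = float(clip.get("end", "inf"))
        if start <= seconds < end:
            desc = clip.find("desc")
            if desc is not None and desc.text:
                found.append(desc.text.strip())
    return found


current = captions_at(SAMPLE_CMML, 6.0)
```

A player interface would call something like `captions_at` on each time update and show the result alongside or over the video.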

Mv_Embed is part of MetavidWiki, which enables community-authored transcripts and exposes these multiple layers in CMML. The proposed work on Mv_Embed will generalize the development efforts taking place in the metavid project for other CMSs, and improve the usability and accessibility of these metadata layers in JavaScript-based interfaces and multi-plugin playback environments.

Theora Java port directly (semi)automatically derived from the reference sources

The current Java decoder port (jheora) is rapidly heading towards obsolescence. It was based on the C reference implementation during the alpha development stages, which means it cannot decode advanced Theora streams using non-VP3 features. Current Theora mainline features a completely new decoder implementing all bitstream features, and a new encoder needing these advanced decoder capabilities is expected to arrive soon. jheora, however, appears to be unmaintainable for the very same reasons the original alpha decoder was dropped. To make matters worse, there is a very noticeable lack of people who are at least moderately skilled in Java AND skilled in video coding AND writing Java code with acceptable speed (video decoding should be realtime). Any conventional manual Java source port may quickly bitrot into an unmaintainable state.

Thankfully there *are* technologies to get C code to execute in the Java Virtual Machine. The obvious idea would be to translate the actual source code to Java using an automated process, but no reliable tools exist for this (and given the concept clash between C and Java in some areas, it's unlikely something really nice will emerge). Projects like NestedVM and Cibyl instead do language-agnostic translations to Java bytecode, using the GCC toolchain.

In the first step the code to be ported is compiled to MIPS ELF binaries. Those are then converted to Java bytecode. This works pretty well because MIPS is pretty similar to Java bytecode and most instructions can be mapped directly.

Craziness? Work of mad men, living in nuclear families, fighting rampaging robots with nuclear missiles? Does this actually work? Yup, it does work: some Xiph encoders/decoders have already been successfully converted with NestedVM, and figures provided by the Cibyl project indicate that the MIPS-to-Java approach isn't actually slower than a "real" Java port - it's sometimes faster, sometimes slower.

The problem with NestedVM is that there appears to be no means to generate a Java interface from the converted binaries - which means that while the converted binaries work fine on Java there's no way to call the functionality of the converted code by other Java classes, which would be necessary to e.g. write a player applet.

Cibyl, on the other hand, does provide means to generate Java interfaces, given the binary and the header files. Cibyl, however, needs to link some helper symbols into the MIPS binary, which apparently requires some tricks to work in the usual autoconf setup. So for the Cibyl port to work, some autoconf magic may be necessary.

So what should this project do:

  • Create and document a working setup for doing language-agnostic Java conversions
  • Demonstrate this for Theora
  • Find a way to generate a Java interface that is automated as much as possible

This project is most likely directly bound to progress made with either NestedVM or Cibyl. The upside of this is that any results may be directly applied to other projects, too.

--Maikmerten 03:43, 12 March 2008 (PDT)


Mentor: Michael Dale ... or anyone else with more experience with firefox extensions/ffmpeg2theora ?
Abstract: OggPusher is a proposed cross platform packaging of ffmpeg2theora as a browser extension. This exposes JavaScript hooks to web applications enabling easy client side transcodes from high quality source originals such as DV or MPEG2 and uploading into web based content management systems.
Sample Application Flow: A user visits an oggPusher-enabled web service. The Firefox user is prompted to install a browser extension via Firefox's .xpi extension framework. Once it is enabled, the web service's upload interface calls oggPusher to show an "open file" dialog box on the client. The web service uses the oggPusher API to set the requested transcode bitrate and other transcode options (such as interlace, number of audio channels, resolution etc). The client selects the high quality local file and begins transcoding to a temporary location on local disk. If there is an error in transcoding, the upload is aborted and an error is exposed to the web application. Once the file is done transcoding, the web interface has the client issue a POST of the transcoded file (if the server supports the more efficient PUT, then that can be used). The amount of the file that has been transcoded and the amount uploaded are exposed via JavaScript hooks so that the web application's JavaScript interface can update the client on upload progress. If the upload connection is reset, an AJAX request on the client can request "bytes uploaded so far" from the server and have oggPusher begin uploading from that point in the temporary local Ogg file. A local file hash could be rechecked to ensure the local file has not changed. The server can then do a simple join on the uploaded pieces, enabling resumable uploads over the existing HTTP protocol. If the server does not support resumes, the file will be uploaded from the start.
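The resume logic in this flow can be sketched with an in-memory stand-in for the server. Everything here is hypothetical (class and function names, the tiny chunk size, keying uploads by a SHA-1 of the file); it only illustrates the "ask how many bytes arrived, continue from there, join the pieces" idea and does not describe an actual oggPusher API.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow


class FakeServer:
    """In-memory stand-in for the web service (hypothetical, not a real API)."""

    def __init__(self):
        self.pieces = {}  # file hash -> list of received byte chunks

    def bytes_uploaded_so_far(self, file_hash):
        return sum(len(p) for p in self.pieces.get(file_hash, []))

    def receive(self, file_hash, offset, chunk):
        # Reject a chunk that does not start where the previous one ended.
        if offset != self.bytes_uploaded_so_far(file_hash):
            raise ValueError("unexpected offset")
        self.pieces.setdefault(file_hash, []).append(chunk)

    def joined(self, file_hash):
        # The "simple join" of the uploaded pieces.
        return b"".join(self.pieces.get(file_hash, []))


def upload(data, server, fail_after=None):
    """Send data in chunks, resuming from whatever the server already holds.

    fail_after simulates a connection reset after that many chunks.
    Returns True once the whole file has been sent.
    """
    # Keying by hash means a changed local file restarts from byte 0.
    file_hash = hashlib.sha1(data).hexdigest()
    offset = server.bytes_uploaded_so_far(file_hash)
    sent = 0
    while offset < len(data):
        if fail_after is not None and sent >= fail_after:
            return False  # simulated connection reset
        chunk = data[offset:offset + CHUNK_SIZE]
        server.receive(file_hash, offset, chunk)
        offset += len(chunk)
        sent += 1
    return True


server = FakeServer()
clip = b"transcoded ogg bytes"
first_try = upload(clip, server, fail_after=2)  # reset after two chunks
second_try = upload(clip, server)               # continues at byte 8
```

In a real deployment the two server methods would be HTTP requests, and the progress hooks would report `offset` back to the page.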

Features for initial Release:

  • A .xpi extension based on ffmpeg2theora that supports uploading of local files of any type that ffmpeg accepts.
  • Supports two modes of operation
    • zero server side config where oggPusher just gives the option of uploading theora video where it finds a form file input type.
    • server side config where the server/service hooks into oggPusher for extra functionality, like resuming transfers and status updates integrated with the web application.
  • A simple JavaScript API for controlling ffmpeg2theora encoding options. These options will be predetermined and JavaScript input will be scrubbed to avoid client side security risks.
  • A set of javascript hooks for oggPusher that expose upload progress, encoding progress and transcoding errors.
  • A sample server side implementation using php/html/javascript for grabbing ogg files from oggPusher.
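The "predetermined options, scrubbed input" idea from the feature list might look something like the sketch below. The whitelist, the numeric ranges and the function name are illustrative assumptions; the long flag names are believed to match ffmpeg2theora's command line but should be checked against `ffmpeg2theora --help` for the version being packaged.

```python
# Hypothetical whitelist: option name -> (command line flag, min, max).
# Ranges are placeholders, not limits taken from ffmpeg2theora itself.
ALLOWED_OPTIONS = {
    "videobitrate": ("--videobitrate", 50, 5000),  # kb/s
    "audiobitrate": ("--audiobitrate", 32, 500),   # kb/s
    "width": ("--width", 16, 1920),
    "height": ("--height", 16, 1080),
    "channels": ("--channels", 1, 2),
}


def build_command(source_path, output_path, requested):
    """Turn options requested by a web page into a safe argument list.

    Only whitelisted names with in-range integer values are accepted, and the
    result is an argv list (no shell), so a value can never smuggle in extra
    flags or shell commands.
    """
    argv = ["ffmpeg2theora", source_path, "--output", output_path]
    for name, value in requested.items():
        if name not in ALLOWED_OPTIONS:
            raise ValueError(f"option not allowed: {name!r}")
        flag, low, high = ALLOWED_OPTIONS[name]
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an integer")
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        argv += [flag, str(value)]
    return argv


argv = build_command("in.dv", "out.ogg", {"videobitrate": 700, "channels": 2})
```

Note that the output path is chosen by the extension, not by the page: a request that tries to set it, or that passes a string where a number is expected, is rejected outright.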

Future Feature RoadMap: Once the basic implementation has been deployed the following features will be targeted for future versions:

  • Integration with popular open source CMSs; the first target is MediaWiki.
  • Hooks for connecting into "live" interfaces such as firewire digital video input or USB web cams.
    • Extend oggfwd and server side components for in browser live streaming to web services.
  • Extend to support ffmpeg2Dirac and future open source media codecs.
  • Enable JavaScript hooks for grabbing high-quality JPG or PNG screen grabs from the original source to be uploaded alongside the encoded video.
  • Enable BitTorrent uploads

XSPF support in oggenc and ogg123 applications

Mentor: Ivo Emanuel Gonçalves
Existing Feature Set: oggenc and ogg123 are part of a toolset named vorbis-tools, where oggenc is a Vorbis encoder and ogg123 an audio player. XSPF is an XML-based playlist format, extensible, but simple and efficient.

Proposed Development: this project would extend those two applications (oggenc and ogg123) to support XSPF. Namely, oggenc would be able to generate a playlist from the encoded files, and ogg123 would be able to parse a playlist for supported media for playback. This is a C project, with the intention of using code from or actually linking to the BSD-licensed libSpiff, which is a C++ XSPF library.
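To show how little is needed for a basic XSPF playlist, here is a rough sketch of both directions: writing a playlist for a set of encoded files (the oggenc side) and reading the track locations back (the ogg123 side). The actual project is meant to be C, possibly on top of libSpiff; Python's standard ElementTree is used here only to illustrate the format, and the file URIs and titles are made up.

```python
import xml.etree.ElementTree as ET

XSPF_NS = "http://xspf.org/ns/0/"


def write_playlist(tracks):
    """Build an XSPF document from (location URI, title) pairs."""
    ET.register_namespace("", XSPF_NS)  # emit XSPF as the default namespace
    playlist = ET.Element(f"{{{XSPF_NS}}}playlist", {"version": "1"})
    track_list = ET.SubElement(playlist, f"{{{XSPF_NS}}}trackList")
    for location, title in tracks:
        track = ET.SubElement(track_list, f"{{{XSPF_NS}}}track")
        ET.SubElement(track, f"{{{XSPF_NS}}}location").text = location
        ET.SubElement(track, f"{{{XSPF_NS}}}title").text = title
    return ET.tostring(playlist, encoding="unicode")


def read_locations(xspf_text):
    """Return the location of every track, in playlist order."""
    root = ET.fromstring(xspf_text)
    ns = {"x": XSPF_NS}
    return [loc.text for loc in root.findall("x:trackList/x:track/x:location", ns)]


# Hypothetical encoded files.
encoded = [
    ("file:///music/01-intro.ogg", "Intro"),
    ("file:///music/02-theme.ogg", "Theme"),
]
playlist_xml = write_playlist(encoded)
locations = read_locations(playlist_xml)
```

A player would then filter `locations` down to the media types it can actually decode before starting playback.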

php_annodex: wrapper to libannodex or liboggz for doing media stuff

Mentor: Silvia Pfeiffer ... or anyone else with a PHP background, e.g. Michael Dale

What is it? Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. php_annodex can e.g. be used to extend Drupal, MediaWiki and other php-based applications with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through php_annodex.

What is the project? An initial version of php_annodex exists, but it is incomplete and not up-to-date. This is in comparison with such support in python through pyannodex. A GSoC student would be expected to bring the support for Xiph and Annodex technology in php_annodex up-to-date. In addition, he/she could extend this work by also implementing media support in a plugin, e.g. the Drupal module Acidfree. php_annodex is simply a php wrapper around the C-libraries libannodex and liboggz. It may suffice to just focus on liboggz.

ruby_annodex: wrapper to libannodex or liboggz for doing media stuff

Mentor: Silvia Pfeiffer

What is it? Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. ruby_annodex can e.g. be used to extend rails with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through ruby_annodex.

What is the project? A python wrapper of similar type called pyannodex exists. The ruby_annodex wrapper should provide similar functionality to ruby, in particular with a view of using it from within rails for the development of Web applications. Development of an example application in ruby on rails would be part of this. Extension of this project to include media support into a ruby-based CMS is possible.

Using ROE to create multi-track Ogg files

Mentor: Silvia Pfeiffer ... and anyone else interested in ROE, e.g. Ralph Giles, Conrad Parker, Michael Dale, Shane Stephens

What is it? ROE is a small XML description language for multi-track media files. It can be used for authoring multi-track media files from separate physical files on disk. It can also be used on a Web server to dynamically create multi-track media resources where the tracks are selected through the request from the client.

What is the project? In this project, we only implement and experiment with the file multiplexing side of things. The ROE specification is very new and potentially incomplete, so part of the project will be to validate this specification. The other part will be to create an authoring tool that can take a ROE file, parse it, pull in all the input audio, video, text etc files and create an Ogg file with a Skeleton that contains the equivalent of ROE inside the binary file. The project will start with a focus on multiplexing vorbis audio and theora video, but also include speex, FLAC, CMML, and possibly MNG data. If this is achieved in a short time frame, the project can continue onto developing support for these multi-track files in e.g. vlc or ffmpeg. This can even extend to providing a full tool-chain from authoring captions for a video file, to creating the respective multitrack Ogg file, and finally to playing them back inside vlc where the captions are shown as overlays.

SHARE application for the Spread Open Media project

Mentor: Ivo Emanuel Gonçalves
Existing Feature Set: Spread Open Media is a community project to promote the different free formats for multimedia and otherwise. SHARE is a practical step to build on this community and spread more files.

Proposed Development: SHARE is intended to be a PHP project. We do not discard the possibility of using Rails or Python, but the current SOM server does not support these. SHARE will be a WebJay-like clone, as in users will be able to register, vote, comment and upload their own XSPF playlists. Basically, it is a playlist sharing application. Using OpenID for registration and Cortado (an existing Java applet) for playback would be welcome additions.

Cross-platform Xiph encoder wrapper in qt4

Mentor: Not specified yet
Existing Feature Set: qt4 is a cross-platform C++ widget toolkit, which makes it easy to create GUI programs. Xiph has encoders for all of its main formats, but they are command line only, which is a big no-no for the average user.

Proposed Development: The idea is to create a qt4 wrapper around those encoders to make it easier for anyone to encode media into Vorbis, Speex, Theora and FLAC. This would likely boost the popularity of said formats tremendously.

Dirac support in liboggplay and liboggz

Mentor: ??

Right now liboggplay only supports Theora video. Your aim for this project is to add support for Dirac, which should be done using libschrodinger. In doing this, you will add OggDirac support to the OggPlay Browser Plugin and the upcoming <video> tag support in Firefox.

Guidelines for Applying

Remember that many people will apply to work on the Summer of Code.

Keep in mind that those of us evaluating your application do not know you, we do not know what kind of experience you have, we do not know what you have done in the past and we have to pick the best people suited for a particular task.

Hence, it is very important that you tell us in your email why you should be considered to implement a particular project. Please use the application template at Summer of Code Applications as a starting point.

See Also