Summer of Code 2008: Difference between revisions

From XiphWiki
(→‎Current Ideas: student application deadline has been extended to 26th)
m (Summer of Code moved to Summer of Code 2008: moved 2008 summer of code page to 2008)
 
(42 intermediate revisions by 11 users not shown)
Line 1: Line 1:
This is our ideas page for [http://code.google.com/soc/ Google Summer of Code] projects with [http://xiph.org Xiph.org] and [http://annodex.org/ Annodex]. The two projects participate jointly this year under Xiph's name.
'''Students''' please use the template at [[Summer of Code Applications]] when applying for a GSoC position.
'''Mentors''' please visit [[Summer of Code Mentoring]] and help us prepare our application as a mentoring organization.
== Current Ideas ==


This is our ideas page for [http://code.google.com/soc/ Google Summer of Code] projects with [http://xiph.org Xiph.org] and [http://annodex.net/ Annodex]. The two projects are participating jointly this year under Xiph's name.
We need a primary and backup mentor volunteer for any project that is to become an official proposal, but submit something and we'll see who we can round up. :)


''Project ideas go here''
* Transcode/Tag/Upload tool for Theora et al. (ideally as a firefox extension so web cms integration is easy) '''OggPusher''' details below:
* Theora encoding support in GIMP
* Cross-platform qt4 wrapper around Xiph encoders.  A do-it-all encoder in one simple GUI, possibly drag&drop a la OggDropXPd.  This would make it tremendously easy for end-users to encode Theora et al
* Theora Java port directly (semi)automatically derived from the reference sources
* Optimisations for Oggenc and Co. as done by Lancer (http://homepage3.nifty.com/blacksword/), which give around 3x more speed. If quality loss is a concern, make it a ./configure option. Lancer's diffs only work on Windows.
:The last statement shows why this is a bad idea.
* Speex support in IceS
* Better stream source gui for dvswitch
* XSPF support in ogg123 and oggenc (playlist creation)
* Initial support for OggPCM in some of our tools
* OggMNG tools
:Is this really necessary?  I mean, OggMNG seems to have gone nowhere and serve no niche.--Ivo
::We still keep getting asked for a format where speex and images together make up a movie.--Silvia
* ROE implementation for network: using ROE in a client-server negotiation to dynamically request a specific multi-track ogg file using skeleton (Silvia)
* create Ogg caption support for vlc using CMML
* ffmpeg improvements for Xiph codecs:
** add Speex support
** add Ogg Skeleton support
** fix seeking bugs involving Ogg Theora
** fix bugs in Ogg Theora decoder
** improve ogg muxer
* Ogg Cutter, a GUI to cut out segments from Ogg Videos, this could be based on oggz-chop (part of oggz-tools) or done with Gstreamer (starting with [http://webcvs.freedesktop.org/gstreamer/gst-python/examples/remuxer.py?content-type=text%2Fplain&view=co remuxer.py])
* Improve [http://xiph.org/quicktime/ Xiph QuickTime Components]:
** add Ogg Skeleton support (would make XiphQT able to properly play streams served with mod_annodex)
** add FLAC and Speex encoding support
** improve user interface of the Ogg exporter
** add AudioFile components supporting Ogg and FLAC files (to make XiphQT available to applications using only CoreAudio without QuickTime)
* Portable [http://rarewares.org/ogg-oggenc.php oggenc2] --[[User:Fp|Fp]] 02:26, 14 May 2008 (PDT)
** oggenc2 is a fork of xiph.org oggenc available for the Win32 platform. Sources are available under the GPL. Unfortunately it does not compile under POSIX systems. Oggenc2 has a lot of features and bug fixes over xiph.org oggenc, e.g.:
*** use of libsamplerate for resampling, giving a higher quality;
*** support of 32 bit and floating point WAV format for input;
** this project should port all the improvements in oggenc2 to oggenc. Note that the two projects have diverged somewhat, so oggenc may have features that oggenc2 lacks; a straight port of oggenc2 to POSIX may therefore not be the right approach. A better way is to get the sources of all versions of oggenc2, diff successive versions, and apply the changes to oggenc.
 
==Detailed Project Description==
 
===Mv_Embed: Accessibility and [re]usability:===
'''Mentor:''' Michael Dale, Anna (EngageMedia) <br/>
'''Existing Feature Set:''' [http://metavid.ucsc.edu/wiki/index.php/Mv_Embed Mv_Embed] is an existing JavaScript library that takes the HTML5 <video> tag and rewrites it to support in-page Ogg Theora playback in contemporary browsers. Mv_Embed supports many browsers and plugins, including: native browser support such as Firefox 3 video builds; the oggplay plugin for Firefox 2 on Windows, Mac and Linux; the VLC ActiveX control/plugin for IE and Firefox on Windows and for Firefox on Mac and Linux; mplayer and totem on Linux; and Java Cortado on the Microsoft, Sun and Apple Java VMs for IE, Firefox and Safari. Mv_Embed maps all these plugin JavaScript systems to a near-HTML5-spec API, giving web application developers a uniform JavaScript API for video control and interaction without having to worry about the underlying plugin systems. Mv_Embed is used as part of the metavidWiki Project (screen cast).


note: Google Summer of Code 2007, mentoring organizations to apply between March 5 and March 12, students March 14 - March 26
'''Proposed Development:''' Mv_Embed will be enhanced around two goals: integration into prominent open source Content Management Systems, and better accessibility of closed captions and associated video metadata.


'''Students''' please use the template at [[Summer of Code Applications]] when applying for a GSoC position.
Mv_Embed will integrate into existing CMS video extensions for quick "one-off" ogg theora support.
* FilmForge (Drupal)
* ShowInABox (Wordpress)
* Plumi (Plone)


'''Mentors''' for details of our mentor application and plan, please see [[Summer of Code 2007]].
Additional server side components for transcoding to Theora, generating thumbnails, and exporting metadata will also be developed. Where time and resources permit, server side hooks into ffmpeg2theora (for transcoding) and mplayer (for generating thumbnails) will be developed for the CMS systems as well. As OggPusher matures, simple hooks will be added to the CMSs to support direct Ogg Theora clip uploads.


Students should also check out projects related to the [http://wiki.elphel.com/index.php?title=SoC Elphel Open Source cameras].
'''Accessibility & CMML'''
Accessible components of mv_embed consist of obtaining the metadata and putting it into the DOM as a child of the video element. Mv_Embed will offer a reference JavaScript interface for client interactions with that metadata. The metadata will be structured in Continuous Media Markup Language (CMML). CMML is part of the Annodex technology set and can either be muxed into the Ogg stream or be requested separately as XML. Mv_Embed will negotiate a transport method for the metadata that works for the given plugin type. (Currently only the oggplay plugin supports Ogg Skeleton and exposes muxed CMML tracks in the Ogg stream.)


=== Optimize Theora encoding/decoding speed, SSE/SSE2 ===
Mv_Embed is part of [http://metavid.ucsc.edu/wiki/index.php/MetaVidWiki MetavidWiki], which enables community-authored transcripts and exposes these multiple layers in CMML. The proposed work on Mv_Embed will generalize the development efforts taking place in the Metavid project for other CMSs and improve the usability and accessibility of these metadata layers in JavaScript-based interfaces and multi-plugin playback environments.
Work on MMX, SSE/SSE2 implementations of the crucial encoding and
decoding  elements in [http://svn.xiph.org/trunk/theora/ libtheora] and/or [http://svn.xiph.org/trunk/theora-exp theora-exp].
This could include porting the vp3 mmx and altivec code to the libtheora decoder, and writing sse improvements on the
mmx work that has already been done. The results must still build cleanly on other archs and do
run-time capability detection.


You could start improving this [http://lists.xiph.org/pipermail/theora-dev/2005-August/002838.html MMX loop filter patch for theora-exp] that was never completed nor merged in current theora-exp (see all list thread).
=== Theora Java port directly (semi)automatically derived from the reference sources ===


''Mentor: Ralph Giles, Timothy Terriberry, backup: Jan Gerber, Mike Smith''
The current Java decoder port (jheora) is rapidly heading towards becoming obsolete. It was based on the C reference implementation during alpha development stages, which means it cannot decode advanced Theora streams using non-VP3 features. Current Theora mainline features a completely new decoder, implementing all bitstream features, and a new encoder needing these advanced decoder capabilities is expected to arrive soon. jheora, however, appears to be unmaintainable for the very same reasons the original alpha decoder was dropped. To make matters worse, there is a very noticeable lack of people who are at least moderately skilled in Java, skilled in video coding, and able to write Java code that runs at acceptable speed (video decoding should be realtime). Any conventional manual Java source port may quickly bitrot to an unmaintainable state.


=== Encode support in theora-exp ===
Thankfully there *are* technologies to get C code to execute in the Java Virtual Machine. The obvious idea would be to translate the actual source code to Java using an automated process, but no reliable tools exist doing this (and given the concept-clash in some areas between C and Java it's unlikely something really nice will emerge). Projects like NestedVM (http://nestedvm.ibex.org/) and Cibyl (http://spel.bth.se/index.php/Cibyl) are doing '''language agnostic translations to Java bytecode''', using the GCC toolchain.
Implement a rate-distortion optimized encoding mode for [http://svn.xiph.org/trunk/theora-exp/ theora-exp],
including R-D optimized mode decision and quantization (e.g., constant
lambda). Then, use the above routines to implement a medium-latency ABR
encoding mode (e.g., varying lambda), with a default target buffer size
of approximately 2 seconds.


''Mentor: Timothy "Derf" Terriberry, backup: Mike Smith, Ralph Giles''
In the first step the code to be ported is compiled to MIPS ELF binaries. Those are then converted to Java bytecode. This works pretty well because MIPS is pretty similar to Java bytecode and most instructions can be mapped directly.


=== Development assistant for the "Ghost" audio codec ===
Craziness? Work of mad men, living in nuclear families, fighting rampaging robots with nuclear missiles? Does this actually work? Yup, it does work, and some Xiph encoders/decoders have been successfully converted with NestedVM already (http://groups.google.com/group/nestedvm/browse_thread/thread/df96ef7337f390e4/a45fdd66534e7641?#a45fdd66534e7641), and figures provided by the Cibyl project indicate that the MIPS-to-Java approach isn't actually slower than a "real" Java port (http://spel.bth.se/index.php/Cibyl_performance): it's sometimes faster, sometimes slower.
Designing a cutting edge perceptual codec is a very daunting task. Xiph
is in the research stage on a new low-latency, general purpose audio codec,
code-named "Ghost". This is basically a "code assistant" position, where you
will be asked to implement, test, and give feedback on ideas from Christopher
Montgomery, designer of the Ogg Vorbis format. Be prepared to learn a lot about
audio coding, or apply what you already know. While there's less "ownership"
potential in this project proposal, it will be a great opportunity to learn
about compression algorithm design, practice your programming chops, and learn
to work in a team.


''Mentor: Christopher "Monty" Montgomery, backup: Jean-Marc Valin''
The problem with NestedVM is that there appears to be no means to generate a Java interface from the converted binaries - which means that while the converted binaries work fine on Java there's no way to call the functionality of the converted code by other Java classes, which would be necessary to e.g. write a player applet.


=== OggMNG implementation ===
Cibyl, on the other hand, does provide means to generate Java interfaces, given the binary and the header files. Cibyl, however, needs to link some helper symbols into the MIPS binary, which apparently requires some tricks to work in the usual autoconf setup (http://groups.google.com/group/cibyl-devel/browse_thread/thread/584e5fc3b9bc7e2c). So for the Cibyl port to work some autoconf magic may be necessary.
Implement the OggMNG decode support in [http://gstreamer.freedesktop.org/dev/ gstreamer] and/or [http://www.illiminable.com/ogg/ illi's dshow filters].
Implement encoding support based on [http://www.advogato.org/person/company/diary.html?start=18 byzanz] or [http://live.gnome.org/Istanbul Istanbul]. Bonus points for
overlay support. Details on the OggMNG specifications [http://wiki.xiph.org/index.php/OggMNG here]


''Mentors: Mike Smith, Ralph Giles''
So what should this project do:


=== Theora reference encoder quality optimization ===
* Create and document a working setup for doing language-agnostic Java conversions
The [http://theora.org/download.html libtheora] encoder could make more use of some features present in the spec
* Demonstrate this for Theora
but not currently implemented in the encoder. This is a little open ended, but
* Find a way to generate a Java interface in a way being automated as much as possible
suggestions are: quant matrix tuning, per-block qi choice, 4:2:2 and 4:4:4 chroma
support.


''Mentor: Ralph Giles, backup: Timothy Terriberry''
This project is most likely directly bound to progress made with either NestedVM or Cibyl. The upside of this is that any results may be directly applied to other projects, too.


=== Subtitle Definition ===
--[[User:Maikmerten|Maikmerten]] 03:43, 12 March 2008 (PDT)
There has been a long-standing need for the introduction of subtitles into Ogg. Several means have been suggested and various implementations exist. However, there has been no standard way that is supported by Xiph at this stage.


The [http://annodex.net/TR/draft-pfeiffer-cmml-03.html CMML] format with its time-aligned means of interleaving text into Ogg bitstreams is a platform on which we would very much like to define a standard means of including subtitles.


In this project, a standard means of interleaving subtitles (as found on DVDs or in srt files) into Ogg will be defined using CMML.
===OggPusher===
'''Mentor:''' Michael Dale ... or anyone else with more experience with firefox extensions/ffmpeg2theora ? <br>
'''Abstract:''' OggPusher is a proposed cross platform packaging of ffmpeg2theora as a browser extension. This exposes JavaScript hooks to web applications enabling easy client side transcodes from high quality source originals such as DV or MPEG2 and uploading into web based content management systems.<br>
'''Sample Application Flow:''' A user visits an oggPusher-enabled web service. The Firefox user is prompted to install a browser extension via Firefox's .xpi extension framework. Once it is enabled, the web service's upload interface calls oggPusher to open a file dialog box on the client. The web service uses the oggPusher API to set the requested transcode bitrate and other transcode options (such as interlacing, number of audio channels, resolution etc.). The client selects the high quality local file and begins transcoding to a temporary location on local disk. If there is an error in transcoding, the upload is aborted and an error is exposed to the web application. Once the file is done transcoding, the web interface has the client issue a POST of the transcoded file (if the server supports the more efficient PUT, then that can be used). The amount of the file that has been transcoded and the amount uploaded are exposed via JavaScript hooks so that the web application's JavaScript interface can update the client on upload progress. If the upload connection is reset, an Ajax request on the client can request "bytes uploaded so far" from the server and have oggPusher resume uploading from that point in the temporary local Ogg file. A local file hash could be rechecked to ensure the local file has not changed. The server can then do a simple join on the uploaded pieces, enabling resumable uploads over the existing HTTP protocol. If the server does not support resumes, the file will be uploaded from the start.
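The resume step of this flow can be illustrated with a minimal Python sketch. It is not part of any existing oggPusher code: the function names <code>plan_resume</code>, <code>digest</code> and <code>join_pieces</code> are hypothetical, the chunk size is arbitrary, and SHA-1 is only one possible choice for the local file hash. The sketch computes which byte ranges still need to be sent after a reset and shows the server-side join.

```python
import hashlib


def plan_resume(total_size, bytes_on_server, chunk_size=1 << 20):
    """Return the (offset, length) ranges that still have to be uploaded.

    bytes_on_server is the server's answer to "bytes uploaded so far".
    """
    if not 0 <= bytes_on_server <= total_size:
        raise ValueError("server reports an impossible offset")
    chunks = []
    offset = bytes_on_server
    while offset < total_size:
        length = min(chunk_size, total_size - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


def digest(data):
    """Hash of the local file, rechecked before resuming (SHA-1 is an assumption)."""
    return hashlib.sha1(data).hexdigest()


def join_pieces(pieces):
    """Server side: concatenate the uploaded pieces in arrival order."""
    return b"".join(pieces)


# Stand-in for the temporary transcoded Ogg file.
local_file = bytes(range(256)) * 10
hash_before = digest(local_file)

# The connection is reset after 1000 bytes have reached the server.
received = [local_file[:1000]]

# Before resuming, make sure the local file has not changed.
if digest(local_file) == hash_before:
    for offset, length in plan_resume(len(local_file), 1000, chunk_size=1024):
        received.append(local_file[offset:offset + length])

uploaded = join_pieces(received)
```

A real extension would send each range over HTTP instead of slicing an in-memory buffer; the arithmetic for choosing the ranges stays the same.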


The project requires changes to the CMML definition, extending it in several ways. CMML needs to have a valid XML schema or DTD definition associated with it, so that standard XML tools can parse it. The associated documentation should then be updated and software written to put e.g. an srt file into CMML inside Ogg. If there is enough time, it would also be good to implement support for this format in a media player such as vlc, mplayer or xine.
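As a rough illustration of the srt-to-CMML step, here is a minimal Python sketch. It assumes that each subtitle cue maps to one CMML <code>clip</code> element with <code>start</code> and <code>end</code> attributes in seconds and the cue text in a <code>desc</code> child; that mapping is only an assumption for illustration, since defining the actual mapping is the point of this project. All function names are hypothetical, and muxing the result into Ogg is not shown.

```python
import re
from xml.sax.saxutils import escape

# SRT timestamps look like 00:01:02,345 (hours:minutes:seconds,milliseconds).
TIME = re.compile(r"(\d+):(\d\d):(\d\d)[,.](\d{3})")


def srt_time_to_seconds(stamp):
    """Convert one SRT timestamp to seconds as a float."""
    hours, minutes, seconds, millis = (int(x) for x in TIME.match(stamp.strip()).groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000.0


def parse_srt(text):
    """Return a list of (start, end, caption) tuples from SRT text."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # Expect: index line, timing line, then one or more text lines.
        if len(lines) < 3 or "-->" not in lines[1]:
            continue
        start, end = lines[1].split("-->")
        cues.append((srt_time_to_seconds(start),
                     srt_time_to_seconds(end),
                     "\n".join(lines[2:])))
    return cues


def cues_to_cmml(cues):
    """Emit one <clip> per cue (element and attribute usage is an assumption)."""
    out = ["<cmml>"]
    for number, (start, end, caption) in enumerate(cues, 1):
        out.append('  <clip id="sub%d" start="%.3f" end="%.3f"><desc>%s</desc></clip>'
                   % (number, start, end, escape(caption)))
    out.append("</cmml>")
    return "\n".join(out)


SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Hello

2
00:01:00,250 --> 00:01:02,000
A & B
"""

cmml_text = cues_to_cmml(parse_srt(SAMPLE))
```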
'''Features for initial Release:'''
* A .xpi extension based on ffmpeg2theora that supports uploading of local files of any type that ffmpeg accepts.
* Supports two modes of operation
** zero server side config, where oggPusher just gives the option of uploading Theora video wherever it finds a form file input type.
** server side config, where the server/service hooks into oggPusher for extra functionality, like resuming transfers and status updates integrated with the web application.
* A simple JavaScript API for controlling ffmpeg2theora encoding options. These options will be predetermined, and JavaScript input will be scrubbed to avoid client side security risks.
* A set of javascript hooks for oggPusher that expose upload progress, encoding progress and transcoding errors.
* A sample server side implementation using php/html/javascript for grabbing ogg files from oggPusher.


''Mentor: Silvia Pfeiffer, backup: Conrad Parker''


=== Theora support in ekiga ===
'''Future Feature RoadMap:'''
Implement support for Theora as a video codec in the [http://www.gnomemeeting.org/ ekiga] chat application.
Once the basic implementation has been deployed the following features will be targeted for future versions:
Overlaps with [http://live.gnome.org/SummerOfCode2006/Ideas GNOME].


''Mentor: Ralph Giles''
* Integration with popular open source CMSs; the first target is MediaWiki.
* Hooks for connecting into "live" interfaces such as firewire digital video input or USB web cams.
** Extend oggfwd and server side components for in browser live streaming to web services.
* Extend to support ffmpeg2Dirac and future open source media codecs.
* Enable JavaScript hooks for grabbing high-quality JPG or PNG screen grabs from the original source to be uploaded alongside the encoded video.
* Enable Bittorrent uploads


=== MXF support in gstreamer ===
===XSPF support in oggenc and ogg123 applications===
Implement an [http://www.digitalpreservation.gov/formats/fdd/fdd000013.shtml MXF] mux/demux for [http://gstreamer.freedesktop.org/dev/ gstreamer], with mappings for [http://xiph.org/vorbis/ Vorbis] and [http://xiph.org/theora Theora].


''Mentors: Christian Schaller, Mike Smith''
Mentor: Ivo Emanuel Gonçalves<br/>
Existing Feature Set: oggenc and ogg123 are part of a toolset named vorbis-tools, where oggenc is a Vorbis encoder and ogg123 an audio player.  XSPF is a XML-based playlist format, extensible, but simple and efficient.


=== Cascading Style Sheet support for CMML in GStreamer ===
Proposed Development: this project would extend those two applications (oggenc and ogg123) to support XSPF.  Namely, oggenc would be able to generate a playlist from the encoded files, and ogg123 would be able to parse a playlist for supported media for playback.  This is a C project, with the intention of using code from or actually linking to the BSD-licensed libSpiff, which is a C++ XSPF library.
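The two directions described here (writing a playlist after encoding, reading one for playback) can be sketched briefly. The project itself is meant to be written in C, possibly on top of libSpiff; the Python below only illustrates the data flow with the standard library, and the function names <code>build_xspf</code> and <code>read_locations</code> as well as the example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

XSPF_NS = "http://xspf.org/ns/0/"


def _tag(name):
    """Qualified tag name in the XSPF namespace."""
    return "{%s}%s" % (XSPF_NS, name)


def build_xspf(tracks):
    """Build an XSPF playlist from (location, title) pairs (the oggenc side)."""
    ET.register_namespace("", XSPF_NS)
    playlist = ET.Element(_tag("playlist"), {"version": "1"})
    track_list = ET.SubElement(playlist, _tag("trackList"))
    for location, title in tracks:
        track = ET.SubElement(track_list, _tag("track"))
        ET.SubElement(track, _tag("location")).text = location
        ET.SubElement(track, _tag("title")).text = title
    return ET.tostring(playlist, encoding="unicode")


def read_locations(xspf_text):
    """Return the track locations in playlist order (the ogg123 side)."""
    root = ET.fromstring(xspf_text)
    return [element.text for element in root.iter(_tag("location"))]


encoded = [("file:///music/01-intro.ogg", "Intro"),
           ("file:///music/02-theme.ogg", "Theme")]
playlist_xml = build_xspf(encoded)
to_play = read_locations(playlist_xml)
```

A player would additionally skip tracks whose media type it cannot decode; that filtering is left out here.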


Implement support for [http://www.w3.org/Style/CSS/ Cascading Style Sheets] to add styling and positioning hints to
[http://annodex.net/TR/draft-pfeiffer-cmml-03.html CMML] text overlays on [http://xiph.org/theora Theora] video. Doing so allows for advanced titling features visually similar to TV-style news headlines, sports scores, and scrolling text. The advantage over conventional "burnt-in" titling is that the stylesheet-driven approach is machine-readable, allowing indexing for search and improved accessibility.


The project involves:
===php_annodex: wrapper to libannodex or liboggz for doing media stuff===
# YUV compositing support in the [http://gstreamer.freedesktop.org/data/doc/gstreamer/0.10.1/gst-plugins-base-plugins/html/gst-plugins-base-plugins-textoverlay.html textoverlay] plugin in [http://gstreamer.freedesktop.org/dev/ gstreamer] in order to complete the low-level support for color and font attributes. Textoverlay already includes partial support for [http://developer.gnome.org/doc/API/2.0/pango/PangoMarkupFormat.html pango text], but lacks the necessary colorspace conversions. This portion of the project necessarily involves C programming.
# Implementation of a test application to playback video marked up with CSS titling hints. It is recommended that this portion of the project be implemented in a higher-level language for which GStreamer support and CSS parsing libraries exist, such as Python, Ruby or Haskell.


Further work could include support for style sheets in liboggplay / Firefox, or direct support for style sheet retrieval and rendering in GStreamer (via the [http://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good-plugins/html/gst-plugins-good-plugins-cmmldec.html  cmmldec] plugin and/or a new style-sheet-aware textoverlay plugin).
'''Mentor:''' Silvia Pfeiffer ... or anyone else with a PHP background, e.g. Michael Dale<br/>


''Mentor: Conrad Parker, backup: Mike Smith, Silvia Pfeiffer''
'''What is it?'''
Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. php_annodex can e.g. be used to extend Drupal, MediaWiki and other php-based applications with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through php_annodex.


=== Hardware implementation of Theora decoding ===
'''What is the project?'''
Working on a hardware theora decoder, that can be used in embedded
An initial version of [http://annodex.net/software/phpannodex/index.html php_annodex exists], but it is incomplete and not up-to-date. This is in comparison with such support in python through [http://annodex.net/taxonomy_menu/1/19 pyannodex]. A GSoC student would be expected to bring the support for Xiph and Annodex technology in php_annodex up-to-date. In addition, he/she could extend this work by also implementing media support in a plugin, e.g. the Drupal module [http://annodex.net/software/phpannodex/index.html Acidfree]. php_annodex is simply a php wrapper around the C-libraries libannodex and liboggz. It may suffice to just focus on liboggz.
devices, dvd players and video pods. Presumably GPL verilog source
to run on an FPGA. See http://sourceforge.net/projects/elphel/ for a rough encoder implementation. This was a successful project in 2006.


=== Intel to AT&T x86 assembly translation ===
There is a general need for cross-platform projects to be able to compile the same asm acceleration code with both GCC and MSVC. Unfortunately, at least on x86, they have incompatible assembly formats. Currently people either convert one to the other by hand (a maintenance nightmare) or require an external compile/assemble step on one or the other platform.


Start with the (unmaintained?) [http://www.niksula.hut.fi/~mtiihone/intel2gas/ intel2gas] script. Spruce it up to support all of recent MMX, SSE, SSE2, SSE3 instructions. Then implement the reverse translation. Once both are working, write some glue code so it can be easily used as part of a GNU autotools build to derive one set of source from the other at build or package time.
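To give a feel for the core of such a translation, here is a deliberately tiny Python sketch. It is not derived from intel2gas and handles only register and decimal-immediate operands: it reverses the operand order (Intel puts the destination first, AT&T puts it last) and adds the <code>%</code> and <code>$</code> prefixes. Memory operands, size suffixes and directives, which make up most of the real work, are not covered, and all names are hypothetical.

```python
# Registers recognised by this sketch (32-bit general purpose, MMX, SSE).
REGISTERS = (
    {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}
    | {"mm%d" % i for i in range(8)}
    | {"xmm%d" % i for i in range(8)}
)


def operand_to_att(operand):
    """Translate one Intel-syntax operand to AT&T syntax."""
    operand = operand.strip().lower()
    if operand in REGISTERS:
        return "%" + operand
    if operand.lstrip("-").isdigit():
        return "$" + operand  # decimal immediate
    raise ValueError("unsupported operand: %r" % operand)


def intel_to_att(line):
    """Translate a simple 'mnemonic dst, src' line to 'mnemonic src, dst'."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [operand_to_att(o) for o in rest.split(",")] if rest.strip() else []
    operands.reverse()
    return (mnemonic.lower() + " " + ", ".join(operands)).strip()


translated = [intel_to_att(l) for l in ("movq mm0, mm1", "paddw xmm0, xmm1", "emms")]
```

The reverse translation follows the same pattern with the prefixes stripped and the operand order restored.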
===ruby_annodex: wrapper to libannodex or liboggz for doing media stuff===


=== Speex and FLAC encoders in Xiph QuickTime Components ===
'''Mentor:''' Silvia Pfeiffer<br/>
Implement Speex and FLAC [http://developer.apple.com/documentation/MusicAudio/Reference/CoreAudio/index.html Core Audio] encoders.


[http://xiph.org/quicktime/ XiphQT] has a Vorbis encoder component that could be used as a reference and starting point.
'''What is it?'''
Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. ruby_annodex can e.g. be used to extend rails with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through ruby_annodex.


''Mentor: Arek Korbik''
'''What is the project?'''
A python wrapper of similar type called [http://annodex.net/taxonomy_menu/1/19 pyannodex] exists. The ruby_annodex wrapper should provide similar functionality to ruby, in particular with a view of using it from within rails for the development of Web applications. Development of an example application in ruby on rails would be part of this. Extension of this project to include media support into a ruby-based CMS is possible.


=== New vocoder for Speex ===
This is a real speech coding project! Speex currently has a very low bit-rate (2.15 kbps) mode that is implemented as a trivial vocoder. This mode has four "reserved" bits per frame, which means it would be possible to transmit more information. The idea of this project would be to make use of these bits to improve the quality of the 2.15 kbps mode. Changes to both the encoder and the decoder are allowed, provided that they are compatible with older versions. This means that the new bit-stream should be decodable by the old decoder with only minor loss in quality. This still leaves plenty of room for improvement. Upon successful completion, there will be a new vocoder that can be merged into the main Speex tree with improved quality and (non-exact) backward compatibility with the previous vocoder. The project requires some signal processing knowledge.


''Mentor: Jean-Marc Valin''
===Using ROE to create multi-track Ogg files===


=== New Speex VAD/VBR code ===
'''Mentor:''' Silvia Pfeiffer ... and anyone else interested in ROE, e.g. Ralph Giles, Conrad Parker, Michael Dale, Shane Stephens<br/>
The current Speex VAD/VBR code is a quick hack, put together a long time ago. This project consists of rewriting it to perform much better under all kinds of conditions. Performance criteria are: classification accuracy, quality in VAD/DTX mode, simplicity, and of course speed. This project involves pattern classification and speech processing, so it requires at least some knowledge in these fields (or at least one of them).


''Mentor: Jean-Marc Valin''
'''What is it?'''
[http://trac.annodex.net/wiki/MovieDescriptionLanguage ROE] is a small XML description language for multi-track media files. It can be used for authoring multi-track media files from separate physical files on disk. It can also be used on a Web server to dynamically create multi-track media resources where the tracks are selected through the request from the client.


'''What is the project?'''
In this project, we only implement and experiment with the file multiplexing side of things. The ROE specification is very new and potentially incomplete, so part of the project will be to validate this specification. The other part will be to create an authoring tool that can take a ROE file, parse it, pull in all the input audio, video, text etc files and create an Ogg file with a Skeleton that contains the equivalent of ROE inside the binary file. The project will start with a focus on multiplexing vorbis audio and theora video, but also include speex, FLAC, CMML, and possibly MNG data. If this is achieved in a short time frame, the project can continue onto developing support for these multi-track files in e.g. vlc or ffmpeg. This can even extend to providing a full tool-chain from authoring captions for a video file, to creating the respective multitrack Ogg file, and finally to playing them back inside vlc where the captions are shown as overlays.


=== rehuff: a tool to losslessly compress Vorbis files ===
It would be nice to have an updated version of "rehuff", a tool to losslessly compress Vorbis files. There was an experimental version of it (see [http://lists.xiph.org/pipermail/vorbis-dev/2006-August/018522.html rehuff status]), but it had some limits:
* it's not free software;
* it has a bug that prevents rehuffed files from seeking correctly;
* it works only with stereo files.


Would be nice to have an updated rehuff, without the previous limits, and with a library part that will be included in libvorbisenc, so all encoders could use it (rehuff binary, oggenc, ...).
===SHARE application for the Spread Open Media project===
''Not an official Xiph.org project, only a user proposed idea.''


=== Ogg and Annodex integration into open source web Content Management Systems ===
Mentor: Ivo Emanuel Gonçalves<br/>
The goal of this project is to make the use of ogg theora video in existing CMSs as easy as possible.  
Existing Feature Set: Spread Open Media is a community project to promote the different free formats for multimedia and otherwise. SHARE is a practical step to build on this community and spread more files.
The project would consist of integrating in browser playback and structured cmml metadata into existing CMSs like mediaWiki, drupal or wordpress.  


In browser playback support will be handled by vlc or java cortado plugin and then liboggplay & native browser decoding as that project matures. [http://metavid.ucsc.edu/wiki/index.php/Mv_embed mv_embed] may be a starting point for client plugin detection.  
Proposed Development: SHARE is intended to be a PHP project. We do not discard the possibility of using Rails or Python, but the current SOM server does not support these. SHARE will be a WebJay-like clone, as in users will be able to register, vote, comment and upload their own XSPF playlists. Basically, it is a playlist sharing application.  Using OpenID for registration and Cortado (an existing Java applet) for playback would be welcome additions.
The extension package should also handle thumbnail generation via mplayer or ffmpeg, and ideally support transcoding via shell calls to ffmpeg2theora. Metadata attributes for the video in the CMS could be exportable as CMML (a standardized XML format for tagging continuous video).


''Mentor: Michael Dale'', ''backup: Silvia Pfeiffer''
=== Cross-platform Xiph encoder wrapper in qt4 ===


=== OggPlay: Time Offset Acceleration ===
Mentor: Not specified yet<br/>
Existing Feature Set: qt4 is a cross-platform C++ widget toolkit, which makes it easy to create GUI programs.  Xiph has encoders for all of its main formats, but they are command line only, which is a big no-no for the average user.


OggPlay is a new library that enables developers to drop Ogg media support into applications. OggPlay will be used to implement native Ogg/Annodex support in Firefox, and supports a range of features including playing Oggs provided in TCP streams.
Proposed Development: The idea is to create a qt4 wrapper around those encoders to make it easier for anyone to encode media into Vorbis, Speex, Theora and FLAC. This would likely boost the popularity of said formats tremendously.


This project involves adding time offset optimisation support to OggPlay in TCP mode.  Upon successful completion, applications will be able to notify the library of "interesting" time regions of the file, either at the current time point or in the past or future.  The library in turn will ensure that these regions are pinned in local memory or on disk using a combination of compressed stream and uncompressed frames.  If the application later attempts to seek to a pinned time region, then access to the stream at that point will be much faster than other regions.
===Dirac support in liboggplay and liboggz===


Time offset acceleration will be useful to projects such as Annodex, which allows annotation of time regions ('clips') in Ogg, as well as direct access to individual clips over the WWW, URL-based naming of clips, and linking between clips.
'''Mentor:''' ??


''Mentor: Shane Stephens'', ''backups: Marcin Lubonski, Michael Martin''
Right now [http://wiki.xiph.org/index.php/OggPlay liboggplay] only supports Theora video.
Your aim for this project is to add support for [http://dirac.sourceforge.net/ Dirac];
this should be done using [http://www.diracvideo.org/ libschrodinger].
Doing this, you will add OggDirac support to the OggPlay Browser Plugin and the upcoming <video> tag support in Firefox.


== Guidelines for Applying ==


== See Also ==
*[[Todo]]
*[[Bounties]]
*[[CodingGuidelines]]
*[[MIT approach to design and implementation]]
*[[How to do a release]]
*[[Summer of Code 2007]]
*[[Summer of Code 2006]]

Latest revision as of 18:14, 26 February 2009

This is our ideas page for Google Summer of Code projects with Xiph.org and Annodex. The two projects participate jointly this year under Xiph's name.

Students please use the template at Summer of Code Applications when applying for a GSoC position.

Mentors please visit Summer of Code Mentoring and help us prepare our application as a mentoring organization.

Current Ideas

We need a primary and backup mentor volunteer for any project that is to become an official proposal, but submit something and we'll see who we can round up. :)

Project ideas go here

  • Transcode/Tag/Upload tool for Theora et al. (ideally as a firefox extension so web cms integration is easy) OggPusher details below:
  • Theora encoding support in GIMP
  • Cross-platform qt4 wrapper around Xiph encoders. A do-it-all encoder in one simple GUI, possibly drag&drop a la OggDropXPd. This would make it tremendously easy for end-users to encode Theora et al
  • Theora Java port directly (semi)automatically derived from the reference sources
  • Optimisations for Oggenc and Co. as done by Lancer (http://homepage3.nifty.com/blacksword/), which gives around 3x more speed. If one fears quality loss, make it a ./configure option. Lancer's diffs only work for Windows.
The last statement shows why this is a bad idea.
  • Speex support in IceS
  • Better stream source gui for dvswitch
  • XSPF support in ogg123 and oggenc (playlist creation)
  • Initial support for OggPCM in some of our tools
  • OggMNG tools
Is this really necessary? I mean, OggMNG seems to have gone nowhere and serve no niche.--Ivo
We still keep getting asked for a format where speex and images together make up a movie.--Silvia
  • ROE implementation for network: using ROE in a client-server negotiation to dynamically request a specific multi-track ogg file using skeleton (Silvia)
  • create Ogg caption support for vlc using CMML
  • ffmpeg improvements for Xiph codecs:
    • add Speex support
    • add Ogg Skeleton support
    • fix seeking bugs involving Ogg Theora
    • fix bugs in Ogg Theora decoder
    • improve ogg muxer
  • Ogg Cutter, a GUI to cut out segments from Ogg Videos, this could be based on oggz-chop (part of oggz-tools) or done with Gstreamer (starting with remuxer.py)
  • Improve Xiph QuickTime Components:
    • add Ogg Skeleton support (would make XiphQT able to properly play streams served with mod_annodex)
    • add FLAC and Speex encoding support
    • improve user interface of the Ogg exporter
    • add AudioFile components supporting Ogg and FLAC files (to make XiphQT available to applications using only CoreAudio without QuickTime)
  • Portable oggenc2 --Fp 02:26, 14 May 2008 (PDT)
    • oggenc2 is a fork of xiph.org oggenc available for the Win32 platform. Sources are available under the GPL. Unfortunately it does not compile under POSIX systems. Oggenc2 has a lot of features and bug fixes over xiph.org oggenc, e.g.:
      • use of libsamplerate for resampling, giving a higher quality;
      • support of 32 bit and floating point WAV format for input;
    • this project should port all the improvements in oggenc2 to oggenc. Note that the two projects have diverged somewhat, so oggenc may have features that oggenc2 lacks; a straight port of oggenc2 to POSIX may therefore not be the right approach. The best way is to get the source of all versions of oggenc2, diff consecutive versions, and try to apply the changes to oggenc.
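
A minimal sketch of the diff-between-versions approach from the last item, in Python, assuming the source of each oggenc2 release is already available as text. The function name `version_diffs` and the version labels are hypothetical, and applying the resulting diffs to oggenc would still need manual review because the two code bases have diverged.

```python
import difflib


def version_diffs(versions):
    """Yield (old_name, new_name, unified_diff_text) for consecutive versions.

    versions is a list of (name, source_text) pairs, oldest first.
    """
    for (old_name, old_src), (new_name, new_src) in zip(versions, versions[1:]):
        diff = difflib.unified_diff(
            old_src.splitlines(keepends=True),
            new_src.splitlines(keepends=True),
            fromfile=old_name,
            tofile=new_name,
        )
        # Each diff isolates one release's changes, so features and bug
        # fixes can be reviewed and ported one step at a time.
        yield old_name, new_name, "".join(diff)
```

Keeping one diff per release step, rather than a single diff between the oldest and newest versions, should make it easier to tell individual improvements apart.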

Detailed Project Description

Mv_Embed: Accessibility and [re]usability:

Mentor: Michael Dale, Anna (EngageMedia)
Existing Feature Set: Mv_Embed is an existing JavaScript library that takes the HTML5 <video> tag and rewrites it to support in-page Ogg Theora playback in contemporary browsers. Mv_Embed supports many browsers and plugins, including: native browser support such as Firefox 3 video builds; the oggplay plugin for Firefox 2 on Windows, Mac, and Linux; the VLC ActiveX control/plugin for IE and Firefox on Windows and for Firefox on Mac and Linux; mplayer and totem for Linux; and Java Cortado for the Microsoft, Sun, and Apple Java VMs in IE, Firefox, and Safari. Mv_Embed maps all these plugin JavaScript systems to a near-HTML5-spec API, enabling web application developers to take advantage of a uniform JavaScript API for video control and interaction without having to worry about the underlying plugin systems. Mv_Embed is used as part of the metavidWiki Project (screen cast).

Proposed Development: Mv_Embed will be enhanced around two goals: integration into prominent open source Content Management Systems, and better accessibility of closed captions and associated video metadata.

Mv_Embed will integrate into existing CMS video extensions for quick "one-off" ogg theora support.

  • FilmForge (Drupal)
  • ShowInABox (Wordpress)
  • Plumi (Plone)

Additional server-side components for transcoding to Theora, generating thumbnails, and exporting metadata will also be developed. Where time and resources permit, server-side hooks into ffmpeg2theora (for transcoding) and mplayer (for generating thumbnails) will be developed for the CMS systems as well. As OggPusher matures, simple hooks will be added to the CMSs to support direct Ogg Theora clip uploads.

Accessibility & CMML: The accessibility components of mv_embed consist of obtaining the metadata and putting it into the DOM as a child of the video element. Mv_Embed will offer a reference JavaScript interface for client interactions with that metadata. The metadata will be structured in Continuous Media Markup Language (CMML). CMML is part of the Annodex technology set and can either be muxed into the Ogg stream or be requested separately as XML. Mv_Embed will negotiate a transport method for the metadata that will work for the given plugin type. (Currently only the oggplay plugin supports ogg-skeleton and exposing muxed CMML tracks in the Ogg stream.)

Mv_Embed is part of MetavidWiki, which enables community-authored transcripts and exposes these multiple layers in CMML. Proposed work on Mv_Embed will generalize the development efforts taking place in the metavid project for other CMSs and improve the usability and accessibility of these metadata layers in JavaScript-based interfaces and multi-plugin playback environments.

Theora Java port directly (semi)automatically derived from the reference sources

The current Java decoder port (jheora) is rapidly heading towards becoming obsolete. It was based on the C reference implementation during alpha development stages, which means it cannot decode advanced Theora streams using non-VP3 features. Current Theora mainline features a completely new decoder, implementing all bitstream features, and a new encoder needing these advanced decoder capabilities is expected to arrive soon. jheora, however, appears to be unmaintainable for the very same reasons the original alpha decoder was dropped. To make matters worse, there is a very noticeable lack of people who are at least moderately skilled in Java AND skilled in video coding AND able to write Java code with acceptable speed (video decoding should be realtime). Any conventional manual Java source port may quickly bitrot to an unmaintainable state.

Thankfully there *are* technologies to get C code to execute in the Java Virtual Machine. The obvious idea would be to translate the actual source code to Java using an automated process, but no reliable tools exist doing this (and given the concept-clash in some areas between C and Java it's unlikely something really nice will emerge). Projects like NestedVM (http://nestedvm.ibex.org/) and Cibyl (http://spel.bth.se/index.php/Cibyl) are doing language agnostic translations to Java bytecode, using the GCC toolchain.

In the first step the code to be ported is compiled to MIPS ELF binaries. Those are then converted to Java bytecode. This works pretty well because MIPS is pretty similar to Java bytecode and most instructions can be mapped directly.

Craziness? Work of mad men, living in nuclear families, fighting rampaging robots with nuclear missiles? Does this actually work? Yup, it does work, and some Xiph encoders/decoders have been successfully converted with NestedVM already (http://groups.google.com/group/nestedvm/browse_thread/thread/df96ef7337f390e4/a45fdd66534e7641?#a45fdd66534e7641) and figures provided by the Cibyl project indicate that the MIPS-to-Java approach isn't actually slower than a "real" Java port (http://spel.bth.se/index.php/Cibyl_performance) - it's sometimes faster, sometimes slower.

The problem with NestedVM is that there appears to be no means to generate a Java interface from the converted binaries - which means that while the converted binaries work fine on Java there's no way to call the functionality of the converted code by other Java classes, which would be necessary to e.g. write a player applet.

Cibyl, on the other hand, does provide means to generate Java interfaces, given the binary and the header files. Cibyl, however, needs to link some helper symbols into the MIPS binary, which apparently requires some tricks to work in the usual autoconf setup (http://groups.google.com/group/cibyl-devel/browse_thread/thread/584e5fc3b9bc7e2c). So for the Cibyl port to work some autoconf magic may be necessary.

So what should this project do:

  • Create and document a working setup for doing language-agnostic Java conversions
  • Demonstrate this for Theora
  • Find a way to generate a Java interface in a way being automated as much as possible

This project is most likely directly bound to progress made with either NestedVM or Cibyl. The upside of this is that any results may be directly applied to other projects, too.

--Maikmerten 03:43, 12 March 2008 (PDT)


OggPusher

Mentor: Michael Dale ... or anyone else with more experience with firefox extensions/ffmpeg2theora ?
Abstract: OggPusher is a proposed cross platform packaging of ffmpeg2theora as a browser extension. This exposes JavaScript hooks to web applications enabling easy client side transcodes from high quality source originals such as DV or MPEG2 and uploading into web based content management systems.
Sample Application Flow: A user visits an oggPusher-enabled web service. The Firefox user is prompted to install a browser extension via Firefox's .xpi extension framework. Once enabled, the web service's upload interface calls oggPusher to expose an "open file" dialog box on the client. The web service accesses the oggPusher API to set the requested transcode bitrate and other transcode options (such as interlacing, number of audio channels, resolution, etc.). The client selects the high-quality local file and begins transcoding to a temporary location on local disk. If there is an error in transcoding, the upload is aborted and an error is exposed to the web application. Once the file is done transcoding, the web interface has the client issue a POST of the transcoded file (if the server supports the more efficient PUT, then that can be used). The amount of the file that has been transcoded and the amount uploaded are exposed via JavaScript hooks so that the web application's JavaScript interface can update the client on upload progress. If the upload connection is reset, an AJAX request on the client can request "bytes uploaded so far" from the server and have oggPusher begin uploading from that point in the temporary local Ogg file. A local file hash could be rechecked to ensure the local file has not changed. The server can then do a simple join on the uploaded pieces, enabling resumable uploads over the existing HTTP protocol. If the server does not support resumes, the file will be uploaded from the start.
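
The resume step in the flow above can be sketched as follows. This is only an in-memory illustration in Python, not the proposed extension: `UploadServer`, `upload`, the `chunk_size` and `fail_after` parameters, and the choice of SHA-1 for the file hash are all assumptions made for the example, and a real implementation would talk to the server over HTTP.

```python
import hashlib


class UploadServer:
    """In-memory stand-in for the web service (hypothetical, not a real API)."""

    def __init__(self):
        self.pieces = []

    def bytes_received(self):
        # Answers the client's "bytes uploaded so far" request.
        return sum(len(piece) for piece in self.pieces)

    def receive(self, chunk):
        self.pieces.append(chunk)

    def joined(self):
        # The "simple join" of the uploaded pieces.
        return b"".join(self.pieces)


def sha1(data):
    return hashlib.sha1(data).hexdigest()


def upload(data, expected_hash, server, chunk_size=4, fail_after=None):
    """Send data starting at the server's current offset.

    Returns True when the upload is complete and False when the simulated
    connection reset (after fail_after chunks) interrupts it.
    """
    # Recheck the local file hash so a resumed upload never mixes two files.
    if sha1(data) != expected_hash:
        raise ValueError("local file changed since the upload started")
    offset = server.bytes_received()
    sent = 0
    while offset < len(data):
        if fail_after is not None and sent >= fail_after:
            return False
        chunk = data[offset:offset + chunk_size]
        server.receive(chunk)
        offset += len(chunk)
        sent += 1
    return True
```

Calling `upload` a second time with the same server object continues from the byte count the server reports, which is the behaviour the flow describes for a reset connection.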

Features for initial Release:

  • A .xpi extension based on ffmpeg2theora that supports uploading of local files of any type that ffmpeg accepts.
  • Supports two modes of operation
    • zero server side config where oggPusher just gives the option of uploading theora video where it finds a form file input type.
    • server side config where the server/service hooks into oggPusher for extra functionality, like resuming transfers and status updates integrated with the web application.
  • A simple javascript api for controlling ffmpeg2theora encoding options. These options will be predetermined and javascript input will be scrubbed to avoid client side security risks.
  • A set of javascript hooks for oggPusher that expose upload progress, encoding progress and transcoding errors.
  • A sample server side implementation using php/html/javascript for grabbing ogg files from oggPusher.


Future Feature RoadMap: Once the basic implementation has been deployed the following features will be targeted for future versions:

  • Integration with popular open source CMS's first target is mediaWiki.
  • Hooks for connecting into "live" interfaces such as firewire digital video input or USB web cams.
    • Extend oggfwd and server side components for in browser live streaming to web services.
  • Extend to support ffmpeg2Dirac and future open source media codecs.
  • Enable javascript hooks for grabbing high-quality jpg or png screen grabs from the original source to be uploaded alongside the encoded video.
  • Enable Bittorrent uploads

XSPF support in oggenc and ogg123 applications

Mentor: Ivo Emanuel Gonçalves
Existing Feature Set: oggenc and ogg123 are part of a toolset named vorbis-tools, where oggenc is a Vorbis encoder and ogg123 an audio player. XSPF is an XML-based playlist format, extensible, but simple and efficient.

Proposed Development: this project would extend those two applications (oggenc and ogg123) to support XSPF. Namely, oggenc would be able to generate a playlist from the encoded files, and ogg123 would be able to parse a playlist for supported media for playback. This is a C project, with the intention of using code from or actually linking to the BSD-licensed libSpiff, which is a C++ XSPF library.
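
To illustrate the two directions described above (writing a playlist for encoded files, and reading track locations back for playback), here is a small Python sketch using only the standard library. The project itself is meant to be written in C, so this only shows the shape of the data; the function names `build_xspf` and `read_locations` and the default title are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

XSPF_NS = "http://xspf.org/ns/0/"


def build_xspf(paths, title="Encoded files"):
    """Return an XSPF playlist (as a string) listing the given files."""
    ET.register_namespace("", XSPF_NS)
    playlist = ET.Element(f"{{{XSPF_NS}}}playlist", {"version": "1"})
    ET.SubElement(playlist, f"{{{XSPF_NS}}}title").text = title
    track_list = ET.SubElement(playlist, f"{{{XSPF_NS}}}trackList")
    for path in paths:
        track = ET.SubElement(track_list, f"{{{XSPF_NS}}}track")
        # XSPF locations are URIs, so local paths become file:// URIs.
        location = ET.SubElement(track, f"{{{XSPF_NS}}}location")
        location.text = Path(path).resolve().as_uri()
    return ET.tostring(playlist, encoding="unicode")


def read_locations(xspf_text):
    """Return the track locations from an XSPF playlist, in document order."""
    root = ET.fromstring(xspf_text)
    return [loc.text for loc in root.iter(f"{{{XSPF_NS}}}location")]
```

A player would still have to filter the returned locations down to the media types it supports, as the proposal notes for ogg123.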


php_annodex: wrapper to libannodex or liboggz for doing media stuff

Mentor: Silvia Pfeiffer ... or anyone else with a php background e.g. Michael Dale

What is it? Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. php_annodex can e.g. be used to extend Drupal, MediaWiki and other php-based applications with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through php_annodex.

What is the project? An initial version of php_annodex exists, but it is incomplete and not up-to-date. This is in comparison with such support in python through pyannodex. A GSoC student would be expected to bring the support for Xiph and Annodex technology in php_annodex up-to-date. In addition, he/she could extend this work by also implementing media support in a plugin, e.g. the Drupal module Acidfree. php_annodex is simply a php wrapper around the C-libraries libannodex and liboggz. It may suffice to just focus on liboggz.


ruby_annodex: wrapper to libannodex or liboggz for doing media stuff

Mentor: Silvia Pfeiffer

What is it? Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. ruby_annodex can e.g. be used to extend rails with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through ruby_annodex.

What is the project? A python wrapper of similar type called pyannodex exists. The ruby_annodex wrapper should provide similar functionality to ruby, in particular with a view of using it from within rails for the development of Web applications. Development of an example application in ruby on rails would be part of this. Extension of this project to include media support into a ruby-based CMS is possible.


Using ROE to create multi-track Ogg files

Mentor: Silvia Pfeiffer ... and anyone else interested in ROE, e.g. Ralph Giles, Conrad Parker, Michael Dale, Shane Stephens

What is it? ROE is a small XML description language for multi-track media files. It can be used for authoring multi-track media files from separate physical files on disk. It can also be used on a Web server to dynamically create multi-track media resources where the tracks are selected through the request from the client.

What is the project? In this project, we only implement and experiment with the file multiplexing side of things. The ROE specification is very new and potentially incomplete, so part of the project will be to validate this specification. The other part will be to create an authoring tool that can take a ROE file, parse it, pull in all the input audio, video, text etc files and create an Ogg file with a Skeleton that contains the equivalent of ROE inside the binary file. The project will start with a focus on multiplexing vorbis audio and theora video, but also include speex, FLAC, CMML, and possibly MNG data. If this is achieved in a short time frame, the project can continue onto developing support for these multi-track files in e.g. vlc or ffmpeg. This can even extend to providing a full tool-chain from authoring captions for a video file, to creating the respective multitrack Ogg file, and finally to playing them back inside vlc where the captions are shown as overlays.


SHARE application for the Spread Open Media project

Mentor: Ivo Emanuel Gonçalves
Existing Feature Set: Spread Open Media is a community project to promote the different free formats for multimedia and otherwise. SHARE is a practical step to build on this community and spread more files.

Proposed Development: SHARE is intended to be a PHP project. We do not discard the possibility of using Rails or Python, but the current SOM server does not support these. SHARE will be a WebJay-like clone, as in users will be able to register, vote, comment and upload their own XSPF playlists. Basically, it is a playlist sharing application. Using OpenID for registration and Cortado (an existing Java applet) for playback would be welcome additions.

Cross-platform Xiph encoder wrapper in qt4

Mentor: Not specified yet
Existing Feature Set: qt4 is a cross-platform C++ widget toolkit, which makes it easy to create GUI programs. Xiph has encoders for all of its main formats, but they are command line only, which is a big no-no for the average user.

Proposed Development: The idea is to create a qt4 wrapper around those encoders to make it easier for anyone to encode media into Vorbis, Speex, Theora and FLAC. This would likely boost the popularity of said formats tremendously.

Dirac support in liboggplay and liboggz

Mentor: ??

Right now liboggplay only supports Theora video. Your aim for this project is to add support for Dirac; this should be done using libschrodinger. Doing this, you will add OggDirac support to the OggPlay Browser Plugin and the upcoming <video> tag support in Firefox.

Guidelines for Applying

Remember that many people will apply to work on the Summer of Code.

Keep in mind that those of us evaluating your application do not know you, we do not know what kind of experience you have, we do not know what you have done in the past and we have to pick the best people suited for a particular task.

Hence, it is very important that you tell us in your email why you should be considered to implement a particular project. Please use the application template at Summer of Code Applications as a starting point.

See Also