Summer of Code 2008: Difference between revisions

From XiphWiki
(delete last year's project, which is finished (Skeleton support))
m (Summer of Code moved to Summer of Code 2008: moved 2008 summer of code page to 2008)
 
(51 intermediate revisions by 13 users not shown)
Line 1: Line 1:
== CURRENT IDEAS ==
This is our ideas page for [http://code.google.com/soc/ Google Summer of Code] projects with [http://xiph.org Xiph.org] and [http://annodex.org/ Annodex]. The two projects participate jointly this year under Xiph's name.


This is our ideas page for [http://code.google.com/soc/ Google Summer of Code] projects with [http://xiph.org Xiph.org] and [http://annodex.net/ Annodex]. The two projects are participating jointly this year under Xiph's name.
'''Students''' please use the template at [[Summer of Code Applications]] when applying for a GSoC position.
 
'''Mentors''' please visit [[Summer of Code Mentoring]] and help us prepare our application as a mentoring organization.
 
== Current Ideas ==


We need a primary and backup mentor volunteer for any project that is to become an official proposal, but submit something and we'll see who we can round up. :)


note: Google Summer of Code 2007, mentoring organizations to apply between March 5 and March 12, students March 14 - March 24
''Project ideas go here''
* Transcode/Tag/Upload tool for Theora et al. (ideally as a firefox extension so web cms integration is easy) '''OggPusher''' details below:
* Theora encoding support in GIMP
* Cross-platform qt4 wrapper around Xiph encoders.  A do-it-all encoder in one simple GUI, possibly drag&drop a la OggDropXPd.  This would make it tremendously easy for end-users to encode Theora et al
* Theora Java port directly (semi)automatically derived from the reference sources
* Optimisations for Oggenc and Co. as done by Lancer (http://homepage3.nifty.com/blacksword/), which give roughly a 3x speedup. If quality loss is a concern, make it a ./configure option. Lancer's diffs only work on Windows.
:The last statement shows why this is a bad idea.
* Speex support in IceS
* Better stream source gui for dvswitch
* XSPF support in ogg123 and oggenc (playlist creation)
* Initial support for OggPCM in some of our tools
* OggMNG tools
:Is this really necessary?  I mean, OggMNG seems to have gone nowhere and serve no niche.--Ivo
::We still keep getting asked for a format where speex and images together make up a movie.--Silvia
* ROE implementation for network: using ROE in a client-server negotiation to dynamically request a specific multi-track ogg file using skeleton (Silvia)
* create Ogg caption support for vlc using CMML
* ffmpeg improvements for Xiph codecs:
** add Speex support
** add Ogg Skeleton support
** fix seeking bugs involving Ogg Theora
** fix bugs in Ogg Theora decoder
** improve ogg muxer
* Ogg Cutter, a GUI to cut out segments from Ogg Videos, this could be based on oggz-chop (part of oggz-tools) or done with Gstreamer (starting with [http://webcvs.freedesktop.org/gstreamer/gst-python/examples/remuxer.py?content-type=text%2Fplain&view=co remuxer.py])
* Improve [http://xiph.org/quicktime/ Xiph QuickTime Components]:
** add Ogg Skeleton support (would make XiphQT able to properly play streams served with mod_annodex)
** add FLAC and Speex encoding support
** improve user interface of the Ogg exporter
** add AudioFile components supporting Ogg and FLAC files (to make XiphQT available to applications using only CoreAudio without QuickTime)
* Portable [http://rarewares.org/ogg-oggenc.php oggenc2] --[[User:Fp|Fp]] 02:26, 14 May 2008 (PDT)
** oggenc2 is a fork of xiph.org oggenc available for the Win32 platform. Sources are available under the GPL. Unfortunately it does not compile under POSIX systems. Oggenc2 has a lot of features and bug fixes over xiph.org oggenc, e.g.:
*** use of libsamplerate for resampling, giving a higher quality;
*** support of 32 bit and floating point WAV format for input;
** This project should port all the improvements in oggenc2 to oggenc. Note that the two projects have diverged somewhat, so oggenc may have features that oggenc2 lacks, and a straight port of oggenc2 to POSIX may not be the right approach. The best way is to get the source of all versions of oggenc2, diff successive versions, and try to apply the changes to oggenc.
 
==Detailed Project Description==


'''Students''' please use the template at [[Summer of Code Applications]] when applying for a GSoC position.
===Mv_Embed: Accessibility and [re]usability:===
'''Mentor:''' Michael Dale, Anna (EngageMedia) <br/>
'''Existing Feature Set:''' [http://metavid.ucsc.edu/wiki/index.php/Mv_Embed Mv_Embed] is an existing javascript library that takes the html5 <video> tag and rewrites it to support in-page ogg theora playback in contemporary browsers. Mv_Embed supports many browsers and plugins, including: native browser support such as firefox 3 video builds; the oggplay plugin for firefox2 on win, mac, linux; the VLC ActiveX/plugin for IE on win and for firefox on win, mac, linux; mplayer & totem for linux; and java cortado for the microsoft, sun, apple java VMs in IE, firefox & safari. Mv_Embed maps all these plugin javascript systems to a ~near~ html5 spec api, enabling web application developers to take advantage of a uniform javascript API for video control and interaction without having to worry about the underlying plugin systems. Mv_Embed is used as part of the metavidWiki Project (screen cast).
 
'''Proposed Development:''' Mv_Embed will be enhanced around two goals: integration into prominent open source Content Management Systems, and better accessibility of closed captions and associated video metadata.
 
Mv_Embed will integrate into existing CMS video extensions for quick "one-off" ogg theora support.
* FilmForge (Drupal)
* ShowInABox (Wordpress)
* Plumi (Plone)
 
Additional server side components like transcoding to theora, generating thumbnails, and exporting metadata will also be developed. Where time / resources permit, server side hooks into ffmpeg2theora (for transcoding) and mplayer (for generating thumbnails) will be developed for the CMS systems as well. As OggPusher matures, simple hooks will be added to the CMSs to support direct ogg theora clip uploads.
 
'''Accessibility & CMML'''
The accessibility components of mv_embed consist of obtaining the metadata and putting it into the DOM as a child of the video element. Mv_Embed will offer a reference javascript interface for client interactions with that metadata. The metadata will be structured in Continuous Media Markup Language (CMML). CMML is part of the annodex technology set and can either be muxed into the ogg stream or be requested separately as XML. Mv_Embed will negotiate a transport method for the metadata that works for the given plugin type. (Currently only the oggplay plugin supports ogg-skeleton and exposes muxed CMML tracks in the ogg stream.)
 
Mv_Embed is part of [http://metavid.ucsc.edu/wiki/index.php/MetaVidWiki MetavidWiki], which enables community-authored transcripts and exposes these multiple layers in CMML. The proposed work on Mv_Embed will generalize the development efforts taking place in the metavid project for other CMSs and improve the usability and accessibility of these metadata layers in javascript-based interfaces and multi-plugin playback environments.
 
=== Theora Java port directly (semi)automatically derived from the reference sources ===
 
The current Java decoder port (jheora) is rapidly heading towards obsolescence. It was based on the C reference implementation during its alpha development stages, which means it cannot decode advanced Theora streams that use non-VP3 features. Current Theora mainline features a completely new decoder implementing all bitstream features, and a new encoder that needs these advanced decoder capabilities is expected to arrive soon. jheora, however, appears to be unmaintainable for the very same reasons the original alpha decoder was dropped. To make matters worse, there is a very noticeable lack of people who are at least moderately skilled in Java AND skilled in video coding AND able to write Java code that runs at acceptable speed (video decoding should be realtime). Any conventional manual Java source port may quickly bitrot to an unmaintainable state.
 
Thankfully there *are* technologies to get C code to execute in the Java Virtual Machine. The obvious idea would be to translate the actual source code to Java using an automated process, but no reliable tools exist doing this (and given the concept-clash in some areas between C and Java it's unlikely something really nice will emerge). Projects like NestedVM (http://nestedvm.ibex.org/) and Cibyl (http://spel.bth.se/index.php/Cibyl) are doing '''language agnostic translations to Java bytecode''', using the GCC toolchain.
 
In the first step the code to be ported is compiled to MIPS ELF binaries. Those are then converted to Java bytecode. This works pretty well because MIPS is pretty similar to Java bytecode and most instructions can be mapped directly.
 
Craziness? Work of mad men, living in nuclear families, fighting rampaging robots with nuclear missiles? Does this actually work? Yup, it does work, and some Xiph encoders/decoders have been successfully converted with NestedVM already (http://groups.google.com/group/nestedvm/browse_thread/thread/df96ef7337f390e4/a45fdd66534e7641?#a45fdd66534e7641) and figures provided by the Cibyl project indicate that the MIPS-to-Java approach isn't actually slower than a "real" Java port (http://spel.bth.se/index.php/Cibyl_performance) - it's sometimes faster, sometimes slower.
 
The problem with NestedVM is that there appears to be no means to generate a Java interface from the converted binaries - which means that while the converted binaries work fine on Java there's no way to call the functionality of the converted code by other Java classes, which would be necessary to e.g. write a player applet.
 
Cibyl, on the other hand, does provide means to generate Java interfaces, given the binary and the header files. Cibyl, however, needs to link some helper symbols into the MIPS binary, which apparently requires some tricks to work in the usual autoconf setup (http://groups.google.com/group/cibyl-devel/browse_thread/thread/584e5fc3b9bc7e2c). So for the Cibyl port to work some autoconf magic may be necessary.
 
So what should this project do:
 
* Create and document a working setup for doing language-agnostic Java conversions
* Demonstrate this for Theora
* Find a way to generate a Java interface that is automated as much as possible
 
This project is most likely directly bound to progress made with either NestedVM or Cibyl. The upside of this is that any results may be directly applied to other projects, too.


'''Mentors''' for details of our mentor application and plan, please see [[Summer of Code 2007]].
--[[User:Maikmerten|Maikmerten]] 03:43, 12 March 2008 (PDT)


Students should also check out projects related to the [http://wiki.elphel.com/index.php?title=SoC Elphel Open Source cameras].


=== Optimize Theora encoding/decoding speed, SSE/SSE2 ===
===OggPusher===
Work on MMX, SSE/SSE2 implementations of the crucial encoding and
'''Mentor:''' Michael Dale ... or anyone else with more experience with firefox extensions/ffmpeg2theora ? <br>
decoding  elements in [http://svn.xiph.org/trunk/theora/ libtheora] and/or [http://svn.xiph.org/trunk/theora-exp theora-exp].
'''Abstract:''' OggPusher is a proposed cross platform packaging of ffmpeg2theora as a browser extension. This exposes JavaScript hooks to web applications enabling easy client side transcodes from high quality source originals such as DV or MPEG2 and uploading into web based content management systems.<br>
This could include porting the vp3 mmx and altivec code to the libtheora decoder, and writing sse improvements on the
'''Sample Application Flow:''' A user visits an oggPusher enabled web service. The firefox user is prompted to install a browser extension via firefox's .xpi extension framework. Once enabled, the web service upload interface calls oggPusher to expose an "open file" dialog box on the client. The web service accesses the oggPusher api to set the requested transcode bitrate and other transcode options (such as interlace, number of audio channels, resolution etc). The client selects the high quality local file and begins transcoding to a temporary location on local disk. If there is an error in transcoding, the upload is aborted and an error is exposed to the web application. Once the file is done transcoding, the web interface has the client issue a POST of the transcoded file (if the server supports the more efficient PUT, then that can be used). The amount of the file that has been transcoded and the amount uploaded are exposed via javascript hooks so that the web application's javascript interface can update the client on upload progress. If the upload connection is reset, an ajax request on the client can request "bytes uploaded so far" from the server and have oggPusher begin uploading from that point in the temporary local ogg file. A local file hash could be rechecked to ensure the local file has not changed. The server can then do a simple join on the uploaded pieces, enabling resumable uploads over the existing http protocol. If the server does not support resumes, the file will be uploaded from the start.
mmx work that has already been done. The results must still build cleanly on other archs and do
run-time capability detection.


You could start improving this [http://lists.xiph.org/pipermail/theora-dev/2005-August/002838.html MMX loop filter patch for theora-exp] that was never completed nor merged in current theora-exp (see all list thread).
'''Features for initial Release:'''
* A .xpi extension based on ffmpeg2theora that supports uploading of local files of any type that ffmpeg accepts.
* Supports two modes of operation
** zero server side config where oggPusher just gives the option of uploading theora video where it finds a form file input type.
** server side config where the server/service hooks into oggPusher for extra functionality, like resuming transfers and status updates integrated with the web application.
* A simple javascript api for controlling ffmpeg2theora encoding options. These options will be predetermined and javascript input will be scrubbed to avoid client side security risks.
* A set of javascript hooks for oggPusher that expose upload progress, encoding progress and transcoding errors.
* A sample server side implementation using php/html/javascript for grabbing ogg files from oggPusher.
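The resume step of the sample application flow can be sketched as follows. This is only an illustration in Python, not oggPusher code: the function names (<code>file_digest</code>, <code>plan_resume</code>, <code>remaining_chunks</code>, <code>join_pieces</code>), the choice of SHA-1 for the local file hash, and the chunk size are all assumptions made for the example, and no network transfer is performed.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Hash of the transcoded file, recorded when the upload first starts (SHA-1 is an assumption)."""
    return hashlib.sha1(data).hexdigest()


def plan_resume(local: bytes, recorded_digest: str, server_bytes: int,
                supports_resume: bool = True) -> int:
    """Return the byte offset at which the upload should continue.

    Falls back to 0 (upload from the start) when the server cannot resume,
    when the local file changed since the first attempt, or when the
    server-reported byte count is implausible.
    """
    if not supports_resume:
        return 0
    if file_digest(local) != recorded_digest:
        return 0
    if server_bytes < 0 or server_bytes > len(local):
        return 0
    return server_bytes


def remaining_chunks(local: bytes, offset: int, chunk_size: int = 65536):
    """Yield the pieces of the file that still have to be sent."""
    for start in range(offset, len(local), chunk_size):
        yield local[start:start + chunk_size]


def join_pieces(pieces) -> bytes:
    """Server side: a simple join of the pieces received so far."""
    return b"".join(pieces)
```

A client would call <code>plan_resume</code> with the "bytes uploaded so far" value returned by the server, then send whatever <code>remaining_chunks</code> yields from that offset.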


''Mentor: Ralph Giles, Timothy Terriberry, backup: Jan Gerber, Mike Smith''


=== Encode support in theora-exp ===
'''Future Feature RoadMap:'''
Implement a rate-distortion optimized encoding mode for [http://svn.xiph.org/trunk/theora-exp/ theora-exp],
Once the basic implementation has been deployed the following features will be targeted for future versions:
including R-D optimized mode decision and quantization (e.g., constant
lambda). Then, use the above routines to implement a medium-latency ABR
encoding mode (e.g., varying lambda), with a default target buffer size
of approximately 2 seconds.


''Mentor: Timothy "Derf" Terriberry, backup: Mike Smith, Ralph Giles''
* Integration with popular open source CMSs; the first target is MediaWiki.
* Hooks for connecting into "live" interfaces such as firewire digital video input or USB web cams.
** Extend oggfwd and server side components for in browser live streaming to web services.
* Extend to support ffmpeg2Dirac and future open source media codecs.
* Enable javascript hooks for grabbing high quality jpg or png screen grabs from the original source to be uploaded alongside the encoded video.
* Enable Bittorrent uploads


=== Development assistant for the "Ghost" audio codec ===
===XSPF support in oggenc and ogg123 applications===
Designing a cutting edge perceptual codec is a very daunting task. Xiph
is in the research stage on a new low-latency, general purpose audio codec,
code-named "Ghost". This is basically a "code assistant" position, where you
will be asked to implement, test, and give feedback on ideas from Christopher
Montgomery, designer of the Ogg Vorbis format. Be prepared to learn a lot about
audio coding, or apply what you already know. While there's less "ownership"
potential in this project proposal, it will be a great opportunity to learn
about compression algorithm design, practice your programming chops, and learn
to work in a team.


''Mentor: Christopher "Monty" Montgomery, backup: Jean-Marc Valin''
Mentor: Ivo Emanuel Gonçalves<br/>
Existing Feature Set: oggenc and ogg123 are part of a toolset named vorbis-tools, where oggenc is a Vorbis encoder and ogg123 an audio player.  XSPF is an XML-based playlist format that is extensible, yet simple and efficient.


=== OggMNG implementation ===
Proposed Development: this project would extend those two applications (oggenc and ogg123) to support XSPF. Namely, oggenc would be able to generate a playlist from the encoded files, and ogg123 would be able to parse a playlist for supported media for playback. This is a C project, with the intention of using code from or actually linking to the BSD-licensed libSpiff, which is a C++ XSPF library.
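To make the two halves of the proposal concrete, here is a minimal sketch of writing and reading an XSPF playlist. It is written in Python with the standard library purely for illustration; the actual project is meant to be done in C, possibly on top of libSpiff, and the helper names <code>build_xspf</code> and <code>read_locations</code> are hypothetical. Only the XSPF version 1 namespace and the <code>playlist</code>, <code>trackList</code>, <code>track</code>, <code>location</code> and <code>title</code> elements are used.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

XSPF_NS = "http://xspf.org/ns/0/"


def _tag(name: str) -> str:
    """Qualify an element name with the XSPF namespace."""
    return f"{{{XSPF_NS}}}{name}"


def build_xspf(paths, title=None) -> str:
    """What oggenc would do: emit a playlist listing the encoded files."""
    ET.register_namespace("", XSPF_NS)  # serialize XSPF as the default namespace
    playlist = ET.Element(_tag("playlist"), {"version": "1"})
    if title:
        ET.SubElement(playlist, _tag("title")).text = title
    track_list = ET.SubElement(playlist, _tag("trackList"))
    for p in paths:
        track = ET.SubElement(track_list, _tag("track"))
        # XSPF locations are URIs, so convert the file path to a file:// URI
        ET.SubElement(track, _tag("location")).text = Path(p).absolute().as_uri()
        ET.SubElement(track, _tag("title")).text = Path(p).stem
    return ET.tostring(playlist, encoding="unicode")


def read_locations(xspf_text: str):
    """What ogg123 would do: collect the track locations for playback."""
    root = ET.fromstring(xspf_text)
    path = "/".join(_tag(n) for n in ("trackList", "track", "location"))
    return [loc.text for loc in root.iterfind(path)]
```

A player would still have to filter the returned locations down to media it supports, which this sketch does not attempt.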
Implement the OggMNG decode support in [http://gstreamer.freedesktop.org/dev/ gstreamer] and/or [http://www.illiminable.com/ogg/ illi's dshow filters].
Implement encoding support based on [http://www.advogato.org/person/company/diary.html?start=18 byzanz] or [http://live.gnome.org/Istanbul Istanbul]. Bonus points for
overlay support. Details on the OggMNG specifications [http://wiki.xiph.org/index.php/OggMNG here]


''Mentors: Mike Smith, Ralph Giles''


=== Theora reference encoder quality optimization ===
===php_annodex: wrapper to libannodex or liboggz for doing media stuff===
The [http://theora.org/download.html libtheora] encoder could make more use of some features present in the spec
but not currently implemented in the encoder. This is a little open ended, but
suggestions are: quant matrix tuning, per-block qi choice, 4:2:2 and 4:4:4 chroma
support.


''Mentor: Ralph Giles, backup: Timothy Terriberry''
'''Mentor:''' Silvia Pfeiffer ... or anyone else with a php background, e.g. Michael Dale<br/>


=== Subtitle Definition ===
'''What is it?'''
There has been a long-standing need for the introduction of subtitles into Ogg. Several means have been suggested and various implementations exist. However, there has been no standard way that is supported by Xiph at this stage.
Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. php_annodex can e.g. be used to extend Drupal, MediaWiki and other php-based applications with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through php_annodex.


The [http://annodex.net/TR/draft-pfeiffer-cmml-03.html CMML] format with its time-aligned means of interleaving text into Ogg bitstreams is a platform on which we would very much like to define a standard means of including subtitles.
'''What is the project?'''
An initial version of [http://annodex.net/software/phpannodex/index.html php_annodex exists], but it is incomplete and not up-to-date. This is in comparison with such support in python through [http://annodex.net/taxonomy_menu/1/19 pyannodex]. A GSoC student would be expected to bring the support for Xiph and Annodex technology in php_annodex up-to-date. In addition, he/she could extend this work by also implementing media support in a plugin, e.g. the Drupal module [http://annodex.net/software/phpannodex/index.html Acidfree]. php_annodex is simply a php wrapper around the C-libraries libannodex and liboggz. It may suffice to just focus on liboggz.


In this project, a standard means of interleaving subtitles (as found on DVDs or in srt files) into Ogg will be defined using CMML.


The project requires changes to the CMML definition and several extensions to it. CMML needs a valid XML schema or DTD definition associated with it, so that standard XML tools will parse it. CMML will need to be extended with tags for subtitles that also allow formatting information to be provided for the text, e.g. using style sheets. The associated documentation should then be updated and software written to put e.g. an srt file into CMML inside Ogg. If there is enough time, it would also be good to implement support for this format in a media player such as vlc or mplayer or xine.
===ruby_annodex: wrapper to libannodex or liboggz for doing media stuff===


''Mentor: Silvia Pfeiffer, backup: Conrad Parker''
'''Mentor:''' Silvia Pfeiffer<br/>


=== Theora support in ekiga ===
'''What is it?'''
Implement support for Theora as a video codec in the [http://www.gnomemeeting.org/ ekiga] chat application.
Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. ruby_annodex can e.g. be used to extend rails with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through ruby_annodex.
Overlaps with [http://live.gnome.org/SummerOfCode2006/Ideas GNOME].


''Mentor: Ralph Giles''
'''What is the project?'''
A python wrapper of similar type called [http://annodex.net/taxonomy_menu/1/19 pyannodex] exists. The ruby_annodex wrapper should provide similar functionality to ruby, in particular with a view of using it from within rails for the development of Web applications. Development of an example application in ruby on rails would be part of this. Extension of this project to include media support into a ruby-based CMS is possible.


=== MXF support in gstreamer ===
Implement an [http://www.digitalpreservation.gov/formats/fdd/fdd000013.shtml MXF] mux/demux for [http://gstreamer.freedesktop.org/dev/ gstreamer], with mappings for [http://xiph.org/vorbis/ Vorbis] and [http://xiph.org/theora Theora].


''Mentors: Christian Schaller, Mike Smith''
===Using ROE to create multi-track Ogg files===


=== Hardware implementation of Theora decoding ===
'''Mentor:''' Silvia Pfeiffer ... and anyone else interested in ROE, e.g. Ralph Giles, Conrad Parker, Michael Dale, Shane Stephens<br/>
Working on a hardware theora decoder, that can be used in embedded
devices, dvd players and video pods. Presumably GPL verilog source
to run on an FPGA. See http://sourceforge.net/projects/elphel/ for a rough encoder implementation. This was a successful project in 2006.


=== Intel to AT&T x86 assembly translation ===
'''What is it?'''
There is a general need for cross platform projects to be able to compile the same asm acceleration code on both GCC and MSVC. Unfortunately, at least on x86, they have incompatible assembly formats. Currently people either convert one to the other by hand (a maintenance nightmare) or require an external compile/assemble step on one or the other platform.
[http://trac.annodex.net/wiki/MovieDescriptionLanguage ROE] is a small XML description language for multi-track media files. It can be used for authoring multi-track media files from separate physical files on disk. It can also be used on a Web server to dynamically create multi-track media resources where the tracks are selected through the request from the client.


Start with the (unmaintained?) [http://www.niksula.hut.fi/~mtiihone/intel2gas/ intel2gas] script. Spruce it up to support all of recent MMX, SSE, SSE2, SSE3 instructions. Then implement the reverse translation. Once both are working, write some glue code so it can be easily used as part of a GNU autotools build to derive one set of source from the other at build or package time.
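The core of the Intel to AT&T direction can be illustrated with a deliberately tiny sketch. It is written in Python for illustration and is not derived from intel2gas; the function names and the register list are assumptions for the example. It only handles instructions whose operands are plain registers or numeric immediates: the operand order is reversed, registers get a <code>%</code> prefix and immediates a <code>$</code> prefix. Memory operands (e.g. <code>[eax+4]</code> becoming <code>4(%eax)</code>), size suffixes, and directives are out of scope here and are where most of the real work lies.

```python
import re

# Small illustrative subset of x86 register names (assumption: 32-bit mode).
REGISTER = re.compile(
    r"e?[abcd]x|e?[sd]i|e?[sb]p|[abcd][lh]|mm[0-7]|xmm[0-7]", re.IGNORECASE
)
# Decimal or 0x-prefixed hexadecimal immediates; Intel "10h" style is not handled.
IMMEDIATE = re.compile(r"-?(0x[0-9a-f]+|\d+)", re.IGNORECASE)


def convert_operand(operand: str) -> str:
    """Translate one Intel-syntax operand into AT&T syntax."""
    operand = operand.strip()
    if REGISTER.fullmatch(operand):
        return "%" + operand.lower()
    if IMMEDIATE.fullmatch(operand):
        return "$" + operand
    raise ValueError(f"unsupported operand in this sketch: {operand!r}")


def intel_to_att(line: str) -> str:
    """Translate one 'mnemonic dst, src' line into 'mnemonic src, dst'."""
    parts = line.strip().split(None, 1)
    mnemonic = parts[0].lower()
    if len(parts) == 1:
        return mnemonic  # instruction without operands, e.g. emms
    operands = [convert_operand(op) for op in parts[1].split(",")]
    operands.reverse()  # AT&T lists the source first and the destination last
    return f"{mnemonic} {', '.join(operands)}"
```

For example, <code>intel_to_att("paddw mm0, mm1")</code> gives <code>paddw %mm1, %mm0</code>. The reverse translation would strip the prefixes and reverse the operands again.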
'''What is the project?'''
In this project, we only implement and experiment with the file multiplexing side of things. The ROE specification is very new and potentially incomplete, so part of the project will be to validate this specification. The other part will be to create an authoring tool that can take a ROE file, parse it, pull in all the input audio, video, text etc files and create an Ogg file with a Skeleton that contains the equivalent of ROE inside the binary file. The project will start with a focus on multiplexing vorbis audio and theora video, but also include speex, FLAC, CMML, and possibly MNG data. If this is achieved in a short time frame, the project can continue onto developing support for these multi-track files in e.g. vlc or ffmpeg. This can even extend to providing a full tool-chain from authoring captions for a video file, to creating the respective multitrack Ogg file, and finally to playing them back inside vlc where the captions are shown as overlays.


=== Speex and FLAC encoders in Xiph QuickTime Components ===
Implement Speex and FLAC [http://developer.apple.com/documentation/MusicAudio/Reference/CoreAudio/index.html Core Audio] encoders.


[http://xiph.org/quicktime/ XiphQT] has a Vorbis encoder component that could be used as a reference and starting point.
===SHARE application for the Spread Open Media project===


''Mentor: Arek Korbik''
Mentor: Ivo Emanuel Gonçalves<br/>
Existing Feature Set: Spread Open Media is a community project to promote the different free formats for multimedia and otherwise.  SHARE is a practical step to build on this community and spread more files.


=== New vocoder for Speex ===
Proposed Development: SHARE is intended to be a PHP project. We do not discard the possibility of using Rails or Python, but the current SOM server does not support these. SHARE will be a WebJay-like clone, as in users will be able to register, vote, comment and upload their own XSPF playlists.  Basically, it is a playlist sharing application. Using OpenID for registration and Cortado (an existing Java applet) for playback would be welcome additions.
Speex currently has a very low bit-rate (2.15 kbps) mode that is implemented as a trivial vocoder. This mode has four "reserved" bits per frame, which means it would be possible to transmit more information. The idea of this project would be to make use of these bits to improve the quality of the 2.15 kbps mode. Changes to both the encoder and the decoder are allowed, provided that they are compatible with older version. This means that the new bit-stream should be decodable by the old decoder with only minor loss in quality. This still leaves plenty of room for improvement. Requires signal processing knowledge.


''Mentor: Jean-Marc Valin''
=== Cross-platform Xiph encoder wrapper in qt4 ===


=== New Speex VAD/VBR code ===
Mentor: Not specified yet<br/>
The current Speex VAD/VBR code is a quick hack, put together a long time ago. This project would consist of rewriting it to perform much better under all kinds of conditions. Requires signal processing knowledge.
Existing Feature Set: qt4 is a cross-platform C++ widget toolkit, which makes it easy to create GUI programs.  Xiph has encoders for all of its main formats, but they are command line only, which is a big no-no for the average user.


''Mentor: Jean-Marc Valin''
Proposed Development: The idea is to create a qt4 wrapper around those encoders to make it easier for anyone to encode media into Vorbis, Speex, Theora and FLAC.  This would likely boost the popularity of said formats tremendously.


=== rehuff: a tool to losslessly compress Vorbis files ===
===Dirac support in liboggplay and liboggz===
It would be nice to have an updated version of "rehuff", a tool to losslessly compress Vorbis files. There was an experimental version of it (see [http://lists.xiph.org/pipermail/vorbis-dev/2006-August/018522.html rehuff status]), but it had some limits:
* it's not free software;
* it has a bug that prevents rehuffed files from seeking correctly;
* it works only with stereo files.


Would be nice to have an updated rehuff, without the previous limits, and with a library part that will be included in libvorbisenc, so all encoders could use it (rehuff binary, oggenc, ...).
'''Mentor:''' ??


''Not an official Xiph.org project, only a user proposed idea.''
Right now [http://wiki.xiph.org/index.php/OggPlay liboggplay] only supports Theora video.
Your aim for this project is to add support for [http://dirac.sourceforge.net/ Dirac];
this should be done using [http://www.diracvideo.org/ libschrodinger].
Doing this, you will add OggDirac support to the OggPlay Browser Plugin and the upcoming <video> tag support in Firefox.


== GUIDELINES FOR APPLYING ==
== Guidelines for Applying ==


Remember that many people will apply to work on the Summer of Code.
Hence, it is very important that you tell us in your email why you should be considered to implement a  
particular project. Please use the application template at [[Summer of Code Applications]] as a starting point.
== See Also ==
*[[Todo]]
*[[Bounties]]
*[[CodingGuidelines]]
*[[MIT approach to design and implementation]]
*[[How to do a release]]
*[[Summer of Code 2007]]
*[[Summer of Code 2006]]

Latest revision as of 18:14, 26 February 2009

This is our ideas page for Google Summer of Code projects with Xiph.org and Annodex. The two projects participate jointly this year under Xiph's name.

Students please use the template at Summer of Code Applications when applying for a GSoC position.

Mentors please visit Summer of Code Mentoring and help us prepare our application as a mentoring organization.

Current Ideas

We need a primary and backup mentor volunteer for any project that is to become an official proposal, but submit something and we'll see who we can round up. :)

Project ideas go here

  • Transcode/Tag/Upload tool for Theora et al. (ideally as a firefox extension so web cms integration is easy) OggPusher details below:
  • Theora encoding support in GIMP
  • Cross-platform qt4 wrapper around Xiph encoders. A do-it-all encoder in one simple GUI, possibly drag&drop a la OggDropXPd. This would make it tremendously easy for end-users to encode Theora et al
  • Theora Java port directly (semi)automatically derived from the reference sources
  • Optimisations for Oggenc and Co. as done by Lancer (http://homepage3.nifty.com/blacksword/), which give roughly a 3x speedup. If quality loss is a concern, make it a ./configure option. Lancer's diffs only work on Windows.
The last statement shows why this is a bad idea.
  • Speex support in IceS
  • Better stream source gui for dvswitch
  • XSPF support in ogg123 and oggenc (playlist creation)
  • Initial support for OggPCM in some of our tools
  • OggMNG tools
Is this really necessary? I mean, OggMNG seems to have gone nowhere and serve no niche.--Ivo
We still keep getting asked for a format where speex and images together make up a movie.--Silvia
  • ROE implementation for network: using ROE in a client-server negotiation to dynamically request a specific multi-track ogg file using skeleton (Silvia)
  • create Ogg caption support for vlc using CMML
  • ffmpeg improvements for Xiph codecs:
    • add Speex support
    • add Ogg Skeleton support
    • fix seeking bugs involving Ogg Theora
    • fix bugs in Ogg Theora decoder
    • improve ogg muxer
  • Ogg Cutter, a GUI to cut out segments from Ogg Videos, this could be based on oggz-chop (part of oggz-tools) or done with Gstreamer (starting with remuxer.py)
  • Improve Xiph QuickTime Components:
    • add Ogg Skeleton support (would make XiphQT able to properly play streams served with mod_annodex)
    • add FLAC and Speex encoding support
    • improve user interface of the Ogg exporter
    • add AudioFile components supporting Ogg and FLAC files (to make XiphQT available to applications using only CoreAudio without QuickTime)
  • Portable oggenc2 --Fp 02:26, 14 May 2008 (PDT)
    • oggenc2 is a fork of xiph.org oggenc available for the Win32 platform. Sources are available under the GPL. Unfortunately it does not compile on POSIX systems. oggenc2 has a lot of features and bug fixes over xiph.org oggenc, e.g.:
      • use of libsamplerate for resampling, giving a higher quality;
      • support of 32 bit and floating point WAV format for input;
    • this project should port all the improvements in oggenc2 to oggenc. Note that the two projects have diverged somewhat, so oggenc may have some features that oggenc2 lacks, and a straight port of oggenc2 to POSIX may not be the right approach. The best way is to get the source of all versions of oggenc2, diff consecutive versions against each other, and try to apply the resulting changes to oggenc.
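As a rough illustration of the "diff the oggenc2 versions" step, the following Python sketch produces a unified diff between two revisions of one file. The file names and contents are made up for the example; real work would of course run over the actual oggenc2 release tarballs, and the standard diff and patch tools would do the same job.

```python
import difflib


def version_diff(old_source, new_source, old_name, new_name):
    """Return a unified diff between two revisions of the same file."""
    lines = difflib.unified_diff(
        old_source.splitlines(keepends=True),
        new_source.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    return "".join(lines)


# Hypothetical file contents standing in for two oggenc2 releases.
OLD = "int bits = 16;\nresample(in, out);\n"
NEW = "int bits = 32;\nresample(in, out);\n"

patch = version_diff(OLD, NEW, "oggenc2-a/audio.c", "oggenc2-b/audio.c")
print(patch)
```

Each such patch isolates one oggenc2 change, which can then be reviewed and applied to the diverged oggenc tree by hand where it does not apply cleanly.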

Detailed Project Description

Mv_Embed: Accessibility and [re]usability:

Mentor: Michael Dale, Anna (EngageMedia)
Existing Feature Set: Mv_Embed is an existing JavaScript library that takes the HTML5 <video> tag and rewrites it to support in-page Ogg Theora playback in contemporary browsers. Mv_Embed supports many browsers and plugins, including: native browser support such as Firefox 3 video builds; the oggplay plugin for Firefox 2 on Windows, Mac, and Linux; the VLC ActiveX control/plugin for IE and Firefox on Windows and for Firefox on Mac and Linux; mplayer and totem on Linux; and Java Cortado on the Microsoft, Sun, and Apple Java VMs for IE, Firefox, and Safari. Mv_Embed maps all these plugin JavaScript systems to a near-HTML5-spec API, giving web application developers a uniform JavaScript API for video control and interaction without having to worry about the underlying plugin systems. Mv_Embed is used as part of the metavidWiki Project (screen cast).

Proposed Development: Mv_Embed will be enhanced around two goals: integration into prominent open source Content Management Systems, and better accessibility of closed captions and associated video metadata.

Mv_Embed will integrate into existing CMS video extensions for quick "one-off" ogg theora support.

  • FilmForge (Drupal)
  • ShowInABox (Wordpress)
  • Plumi (Plone)

Additional server-side components such as transcoding to Theora, generating thumbnails, and exporting metadata will also be developed. Where time and resources permit, server-side hooks into ffmpeg2theora (for transcoding) and mplayer (for generating thumbnails) will be developed for the CMS systems as well. As OggPusher matures, simple hooks will be added to the CMSs to support direct Ogg Theora clip uploads.

Accessibility & CMML: The accessibility components of Mv_Embed consist of obtaining the metadata and putting it into the DOM as a child of the video element. Mv_Embed will offer a reference JavaScript interface for client interactions with that metadata. The metadata will be structured in Continuous Media Markup Language (CMML). CMML is part of the Annodex technology set and can either be muxed into the Ogg stream or be requested separately as XML. Mv_Embed will negotiate a transport method for the metadata that will work for the given plugin type. (Currently only the oggplay plugin supports Ogg Skeleton and exposes muxed CMML tracks in the Ogg stream.)
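To make the "CMML requested separately as XML" path a little more concrete, here is a small Python sketch that reads clip descriptions out of a CMML-like document and looks up the caption for a given playback time. It is only a sketch under assumptions: the sample document is heavily simplified, clip times are written as plain seconds rather than the richer time specifications real CMML allows, and the function names load_clips and caption_at are invented for this example.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written sample; not taken from a real stream.
SAMPLE = """<cmml>
  <head><title>Demo</title></head>
  <clip id="intro" start="0" end="5.5"><desc>Opening remarks</desc></clip>
  <clip id="q1" start="5.5" end="12"><desc>First question</desc></clip>
</cmml>"""


def load_clips(xml_text):
    """Parse clip elements into dictionaries sorted by start time."""
    root = ET.fromstring(xml_text)
    clips = []
    for clip in root.iter("clip"):
        clips.append({
            "id": clip.get("id"),
            "start": float(clip.get("start")),
            "end": float(clip.get("end")),
            "text": (clip.findtext("desc") or "").strip(),
        })
    return sorted(clips, key=lambda c: c["start"])


def caption_at(clips, seconds):
    """Return the description of the clip active at the given time, if any."""
    for clip in clips:
        if clip["start"] <= seconds < clip["end"]:
            return clip["text"]
    return None


clips = load_clips(SAMPLE)
print(caption_at(clips, 6.0))
```

A JavaScript interface in Mv_Embed would do the equivalent lookup on each time update and place the text next to, or over, the video element.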

Mv_Embed is part of MetavidWiki, which enables community-authored transcripts and exposes these multiple layers in CMML. Proposed work on Mv_Embed will generalize the development efforts taking place in the metavid project for other CMSs and improve the usability and accessibility of these metadata layers in JavaScript-based interfaces and multi-plugin playback environments.

Theora Java port directly (semi)automatically derived from the reference sources

The current Java decoder port (jheora) is rapidly heading towards becoming obsolete. It was based on the C reference implementation during its alpha development stages, which means it cannot decode advanced Theora streams using non-VP3 features. Current Theora mainline features a completely new decoder, implementing all bitstream features, and a new encoder needing these advanced decoder capabilities is expected to arrive soon. jheora, however, appears to be unmaintainable for the very same reasons the original alpha decoder was dropped. To make matters worse, there's a very noticeable lack of people who are at least moderately skilled in Java AND skilled in video coding AND able to write Java code with acceptable speed (video decoding should be realtime). Any conventional manual Java source port may quickly bitrot to an unmaintainable state.

Thankfully there *are* technologies to get C code to execute in the Java Virtual Machine. The obvious idea would be to translate the actual source code to Java using an automated process, but no reliable tools exist doing this (and given the concept-clash in some areas between C and Java it's unlikely something really nice will emerge). Projects like NestedVM (http://nestedvm.ibex.org/) and Cibyl (http://spel.bth.se/index.php/Cibyl) are doing language agnostic translations to Java bytecode, using the GCC toolchain.

In the first step the code to be ported is compiled to MIPS ELF binaries. Those are then converted to Java bytecode. This works pretty well because MIPS is pretty similar to Java bytecode and most instructions can be mapped directly.

Craziness? Work of mad men, living in nuclear families, fighting rampaging robots with nuclear missiles? Does this actually work? Yup, it does work: some Xiph encoders/decoders have been successfully converted with NestedVM already (http://groups.google.com/group/nestedvm/browse_thread/thread/df96ef7337f390e4/a45fdd66534e7641?#a45fdd66534e7641), and figures provided by the Cibyl project indicate that the MIPS-to-Java approach isn't actually slower than a "real" Java port (http://spel.bth.se/index.php/Cibyl_performance). It's sometimes faster, sometimes slower.

The problem with NestedVM is that there appears to be no means to generate a Java interface from the converted binaries - which means that while the converted binaries work fine on Java there's no way to call the functionality of the converted code by other Java classes, which would be necessary to e.g. write a player applet.

Cibyl, on the other hand, does provide means to generate Java interfaces, given the binary and the header files. Cibyl, however, needs to link some helper symbols into the MIPS binary, which apparently requires some tricks to work in the usual autoconf setup (http://groups.google.com/group/cibyl-devel/browse_thread/thread/584e5fc3b9bc7e2c). So for the Cibyl port to work some autoconf magic may be necessary.

So what should this project do:

  • Create and document a working setup for doing language-agnostic Java conversions
  • Demonstrate this for Theora
  • Find a way to generate a Java interface that is automated as much as possible

This project is most likely directly bound to progress made with either NestedVM or Cibyl. The upside of this is that any results may be directly applied to other projects, too.

--Maikmerten 03:43, 12 March 2008 (PDT)


OggPusher

Mentor: Michael Dale ... or anyone else with more experience with firefox extensions/ffmpeg2theora ?
Abstract: OggPusher is a proposed cross platform packaging of ffmpeg2theora as a browser extension. This exposes JavaScript hooks to web applications enabling easy client side transcodes from high quality source originals such as DV or MPEG2 and uploading into web based content management systems.
Sample Application Flow: A user visits an oggPusher-enabled web service. The Firefox user is prompted to install a browser extension via Firefox's .xpi extension framework. Once enabled, the web service upload interface calls oggPusher to expose an "open file" dialog box on the client. The web service accesses the oggPusher API to set the requested transcode bitrate and other transcode options (such as interlacing, number of audio channels, resolution, etc.). The client selects the high-quality local file and begins transcoding to a temporary location on local disk. If there is an error in transcoding, the upload is aborted and an error is exposed to the web application. Once the file is done transcoding, the web interface has the client issue a POST of the transcoded file (if the server supports the more efficient PUT, then that can be used). The amount of the file that has been transcoded and the amount uploaded are exposed via JavaScript hooks so that the web application's JavaScript interface can update the client on upload progress. If the upload connection is reset, an AJAX request on the client can request "bytes uploaded so far" from the server and have oggPusher begin uploading from that point in the temporary local Ogg file. A local file hash could be rechecked to ensure the local file has not changed. The server can then do a simple join on the uploaded pieces, enabling resumable uploads over the existing HTTP protocol. If the server does not support resumes, the file will be uploaded from the start.
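The resume logic in this flow can be sketched in a few lines of Python. Everything here is illustrative: FakeServer is an in-memory stand-in for the web service, its bytes_so_far and append methods are hypothetical names for the "bytes uploaded so far" query and the chunk upload, and the connection reset is simulated with a fail_after parameter.

```python
import hashlib


class FakeServer:
    """In-memory stand-in for the web service (hypothetical interface)."""

    def __init__(self):
        self.data = bytearray()

    def bytes_so_far(self):
        return len(self.data)

    def append(self, offset, chunk):
        # The server joins pieces only if they continue where it left off.
        if offset != len(self.data):
            raise ValueError("offset does not match bytes already stored")
        self.data.extend(chunk)


def upload(server, payload, expected_sha256, chunk_size=4, fail_after=None):
    """Send payload from the server-reported offset; True when complete."""
    # Recheck the local file hash so a changed file is never spliced in.
    if hashlib.sha256(payload).hexdigest() != expected_sha256:
        raise RuntimeError("local file changed since the upload started")
    offset = server.bytes_so_far()
    sent = 0
    while offset < len(payload):
        if fail_after is not None and sent >= fail_after:
            return False  # simulated connection reset
        chunk = payload[offset:offset + chunk_size]
        server.append(offset, chunk)
        offset += len(chunk)
        sent += 1
    return True


payload = b"transcoded ogg bytes"
digest = hashlib.sha256(payload).hexdigest()
server = FakeServer()
first = upload(server, payload, digest, fail_after=2)  # interrupted early
second = upload(server, payload, digest)               # resumes at byte 8
```

A server without resume support would simply report zero bytes, which makes the same loop upload the file from the start.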

Features for initial Release:

  • A .xpi extension based on ffmpeg2theora that supports uploading local files of any type that ffmpeg accepts.
  • Supports two modes of operation
    • zero server side config where oggPusher just gives the option of uploading theora video where it finds a form file input type.
    • server side config where the server/service hooks into oggPusher for extra functionality, like resuming transfers and status updates integrated with the web application.
  • A simple JavaScript API for controlling ffmpeg2theora encoding options. These options will be predetermined, and JavaScript input will be scrubbed to avoid client-side security risks.
  • A set of javascript hooks for oggPusher that expose upload progress, encoding progress and transcoding errors.
  • A sample server side implementation using php/html/javascript for grabbing ogg files from oggPusher.
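One possible shape for the "predetermined and scrubbed" encoding options from the list above is a whitelist with numeric ranges, sketched here in Python. The option names are meant to mirror ffmpeg2theora's long options, but both the names and the ranges are assumptions of this example and should be checked against the installed ffmpeg2theora before use; build_command is a hypothetical helper.

```python
# Allowed options and their inclusive integer ranges (ranges are assumed).
ALLOWED_OPTIONS = {
    "videobitrate": (1, 16000),
    "audiobitrate": (32, 500),
    "width": (16, 1920),
    "height": (16, 1080),
    "channels": (1, 2),
}


def build_command(requested, source, target):
    """Turn web-supplied options into an argument list, rejecting the rest."""
    argv = ["ffmpeg2theora"]
    for name in sorted(requested):
        value = requested[name]
        if name not in ALLOWED_OPTIONS:
            raise ValueError(f"option not allowed: {name!r}")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"option {name!r} must be an integer")
        low, high = ALLOWED_OPTIONS[name]
        if not low <= value <= high:
            raise ValueError(f"option {name!r} out of range {low}..{high}")
        argv += [f"--{name}", str(value)]
    # source and target come from the local file dialog, never from the page.
    argv += ["--output", target, source]
    return argv


print(build_command({"width": 320, "height": 240}, "in.dv", "out.ogg"))
```

Returning an argument list instead of a shell string means the values are never interpreted by a shell, so a hostile page cannot smuggle extra commands through an option value.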


Future Feature RoadMap: Once the basic implementation has been deployed the following features will be targeted for future versions:

  • Integration with popular open source CMSs; the first target is MediaWiki.
  • Hooks for connecting into "live" interfaces such as firewire digital video input or USB web cams.
    • Extend oggfwd and server side components for in browser live streaming to web services.
  • Extend to support ffmpeg2Dirac and future open source media codecs.
  • Enable JavaScript hooks for grabbing high-quality JPG or PNG screen grabs from the original source to be uploaded alongside the encoded video.
  • Enable Bittorrent uploads

XSPF support in oggenc and ogg123 applications

Mentor: Ivo Emanuel Gonçalves
Existing Feature Set: oggenc and ogg123 are part of a toolset named vorbis-tools, where oggenc is a Vorbis encoder and ogg123 an audio player. XSPF is an XML-based playlist format that is extensible, yet simple and efficient.

Proposed Development: this project would extend those two applications (oggenc and ogg123) to support XSPF. Namely, oggenc would be able to generate a playlist from the encoded files, and ogg123 would be able to parse a playlist for supported media for playback. This is a C project, with the intention of using code from or actually linking to the BSD-licensed libSpiff, which is a C++ XSPF library.
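The two halves of this proposal, writing a playlist after encoding and reading one back for playback, can be sketched in Python before committing to a C design. This is only an illustration: the helper names and the list of supported file extensions are assumptions, and the real project would use libSpiff or hand-written C rather than this code.

```python
import xml.etree.ElementTree as ET

XSPF_NS = "http://xspf.org/ns/0/"
# Extensions assumed playable for this example.
SUPPORTED = (".ogg", ".oga", ".flac", ".spx")


def write_playlist(tracks):
    """Build an XSPF document from (location, title) pairs."""
    ET.register_namespace("", XSPF_NS)
    playlist = ET.Element(f"{{{XSPF_NS}}}playlist", {"version": "1"})
    track_list = ET.SubElement(playlist, f"{{{XSPF_NS}}}trackList")
    for location, title in tracks:
        track = ET.SubElement(track_list, f"{{{XSPF_NS}}}track")
        ET.SubElement(track, f"{{{XSPF_NS}}}location").text = location
        ET.SubElement(track, f"{{{XSPF_NS}}}title").text = title
    return ET.tostring(playlist, encoding="unicode")


def playable_locations(xml_text):
    """Return track locations whose extension looks playable."""
    root = ET.fromstring(xml_text)
    locations = [el.text or "" for el in root.iter(f"{{{XSPF_NS}}}location")]
    return [loc for loc in locations if loc.lower().endswith(SUPPORTED)]


xml_text = write_playlist([
    ("file:///music/one.ogg", "One"),
    ("file:///music/two.mp3", "Two"),
])
print(playable_locations(xml_text))
```

The writer corresponds to what oggenc would emit after a batch encode, and the reader to the filtering ogg123 would do before queueing tracks.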


php_annodex: wrapper to libannodex or liboggz for doing media stuff

Mentor: Silvia Pfeiffer ... or anyone else with a PHP background, e.g. Michael Dale

What is it? Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. php_annodex can e.g. be used to extend Drupal, MediaWiki and other php-based applications with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through php_annodex.

What is the project? An initial version of php_annodex exists, but it is incomplete and not up-to-date. This is in comparison with such support in python through pyannodex. A GSoC student would be expected to bring the support for Xiph and Annodex technology in php_annodex up-to-date. In addition, he/she could extend this work by also implementing media support in a plugin, e.g. the Drupal module Acidfree. php_annodex is simply a php wrapper around the C-libraries libannodex and liboggz. It may suffice to just focus on liboggz.


ruby_annodex: wrapper to libannodex or liboggz for doing media stuff

Mentor: Silvia Pfeiffer

What is it? Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. ruby_annodex can e.g. be used to extend rails with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through ruby_annodex.

What is the project? A python wrapper of similar type called pyannodex exists. The ruby_annodex wrapper should provide similar functionality to ruby, in particular with a view of using it from within rails for the development of Web applications. Development of an example application in ruby on rails would be part of this. Extension of this project to include media support into a ruby-based CMS is possible.


Using ROE to create multi-track Ogg files

Mentor: Silvia Pfeiffer ... and anyone else interested in ROE, e.g. Ralph Giles, Conrad Parker, Michael Dale, Shane Stephens

What is it? ROE is a small XML description language for multi-track media files. It can be used for authoring multi-track media files from separate physical files on disk. It can also be used on a Web server to dynamically create multi-track media resources where the tracks are selected through the request from the client.

What is the project? In this project, we only implement and experiment with the file multiplexing side of things. The ROE specification is very new and potentially incomplete, so part of the project will be to validate this specification. The other part will be to create an authoring tool that can take a ROE file, parse it, pull in all the input audio, video, text etc files and create an Ogg file with a Skeleton that contains the equivalent of ROE inside the binary file. The project will start with a focus on multiplexing vorbis audio and theora video, but also include speex, FLAC, CMML, and possibly MNG data. If this is achieved in a short time frame, the project can continue onto developing support for these multi-track files in e.g. vlc or ffmpeg. This can even extend to providing a full tool-chain from authoring captions for a video file, to creating the respective multitrack Ogg file, and finally to playing them back inside vlc where the captions are shown as overlays.


SHARE application for the Spread Open Media project

Mentor: Ivo Emanuel Gonçalves
Existing Feature Set: Spread Open Media is a community project to promote the different free formats for multimedia and otherwise. SHARE is a practical step to build on this community and spread more files.

Proposed Development: SHARE is intended to be a PHP project. We do not discard the possibility of using Rails or Python, but the current SOM server does not support these. SHARE will be a WebJay-like clone, as in users will be able to register, vote, comment and upload their own XSPF playlists. Basically, it is a playlist sharing application. Using OpenID for registration and Cortado (an existing Java applet) for playback would be welcome additions.

Cross-platform Xiph encoder wrapper in qt4

Mentor: Not specified yet
Existing Feature Set: qt4 is a cross-platform C++ widget toolkit, which makes it easy to create GUI programs. Xiph has encoders for all of its main formats, but they are command line only, which is a big no-no for the average user.

Proposed Development: The idea is to create a qt4 wrapper around those encoders to make it easier for anyone to encode media into Vorbis, Speex, Theora and FLAC. This would likely boost the popularity of said formats tremendously.

Dirac support in liboggplay and liboggz

Mentor: ??

Right now liboggplay only supports Theora video. Your aim for this project is to add support for Dirac; this should be done using libschrodinger. Doing this, you will add OggDirac support to the OggPlay Browser Plugin and the upcoming <video> tag support in Firefox.

Guidelines for Applying

Remember that many people will apply to work on the Summer of Code.

Keep in mind that those of us evaluating your application do not know you, we do not know what kind of experience you have, we do not know what you have done in the past and we have to pick the best people suited for a particular task.

Hence, it is very important that you tell us in your email why you should be considered to implement a particular project. Please use the application template at Summer of Code Applications as a starting point.

See Also