Summer of Code 2008: Difference between revisions

m (Summer of Code moved to Summer of Code 2008: moved 2008 summer of code page to 2008)
 
(71 intermediate revisions by 14 users not shown)
== CURRENT IDEAS ==
This is our ideas page for [http://code.google.com/soc/ Google Summer of Code] projects with [http://xiph.org Xiph.org] and [http://annodex.org/ Annodex]. The two projects participate jointly this year under Xiph's name.


'''Students''' please use the template at [[Summer of Code Applications]] when applying for a GSoC position.
 
'''Mentors''' please visit [[Summer of Code Mentoring]] and help us prepare our application as a mentoring organization.
 
== Current Ideas ==


We need a primary and backup mentor volunteer for any project that is to become an official proposal, but submit something and we'll see who we can round up. :)


'''If you're interested''' in doing a Summer of Code project with Xiph.org, [http://code.google.com/soc/student_step1.html sign up here] and submit an application.
''Project ideas go here''
* Transcode/Tag/Upload tool for Theora et al. (ideally as a firefox extension so web cms integration is easy) '''OggPusher''' details below:
* Theora encoding support in GIMP
* Cross-platform qt4 wrapper around Xiph encoders.  A do-it-all encoder in one simple GUI, possibly drag&drop a la OggDropXPd.  This would make it tremendously easy for end-users to encode Theora et al
* Theora Java port directly (semi)automatically derived from the reference sources
* Optimisations for Oggenc and Co. as done by Lancer (http://homepage3.nifty.com/blacksword/), which give around 3x more speed. If quality loss is a concern, make it a ./configure option. Lancer's diffs only work on Windows.
:The last statement shows why this is a bad idea.
* Speex support in IceS
* Better stream source gui for dvswitch
* XSPF support in ogg123 and oggenc (playlist creation)
* Initial support for OggPCM in some of our tools
* OggMNG tools
:Is this really necessary?  I mean, OggMNG seems to have gone nowhere and serve no niche.--Ivo
::We still keep getting asked for a format where speex and images together make up a movie.--Silvia
* ROE implementation for network: using ROE in a client-server negotiation to dynamically request a specific multi-track ogg file using skeleton (Silvia)
* create Ogg caption support for vlc using CMML
* ffmpeg improvements for Xiph codecs:
** add Speex support
** add Ogg Skeleton support
** fix seeking bugs involving Ogg Theora
** fix bugs in Ogg Theora decoder
** improve ogg muxer
* Ogg Cutter, a GUI to cut out segments from Ogg videos. This could be based on oggz-chop (part of oggz-tools) or done with GStreamer (starting with [http://webcvs.freedesktop.org/gstreamer/gst-python/examples/remuxer.py?content-type=text%2Fplain&view=co remuxer.py])
* Improve [http://xiph.org/quicktime/ Xiph QuickTime Components]:
** add Ogg Skeleton support (would make XiphQT able to properly play streams served with mod_annodex)
** add FLAC and Speex encoding support
** improve user interface of the Ogg exporter
** add AudioFile components supporting Ogg and FLAC files (to make XiphQT available to applications using only CoreAudio without QuickTime)
* Portable [http://rarewares.org/ogg-oggenc.php oggenc2] --[[User:Fp|Fp]] 02:26, 14 May 2008 (PDT)
** oggenc2 is a fork of xiph.org oggenc available for the Win32 platform. Sources are available under the GPL. Unfortunately it does not compile under POSIX systems. Oggenc2 has a lot of features and bug fixes over xiph.org oggenc, e.g.:
*** use of libsamplerate for resampling, giving a higher quality;
*** support of 32 bit and floating point WAV format for input;
** This project should port all the improvements in oggenc2 to oggenc. Note that the two projects have diverged somewhat, so oggenc may have features that oggenc2 lacks, and a straight port of oggenc2 to POSIX may not be the right approach. A better way is to get the source of all versions of oggenc2, diff successive versions, and try to apply each change to oggenc.
 
==Detailed Project Description==
 
===Mv_Embed: Accessibility and [re]usability:===
'''Mentor:''' Michael Dale, Anna (EngageMedia) <br/>
'''Existing Feature Set:''' [http://metavid.ucsc.edu/wiki/index.php/Mv_Embed Mv_Embed] is an existing JavaScript library that takes the HTML5 <video> tag and rewrites it to support in-page Ogg Theora playback in contemporary browsers. Mv_Embed supports many browsers and plugins, including: native browser support such as Firefox 3 video builds; the oggplay plugin for Firefox 2 on Windows, Mac and Linux; the VLC ActiveX control/plugin for IE and Firefox on Windows and for Firefox on Mac and Linux; mplayer and totem on Linux; and Java Cortado on the Microsoft, Sun and Apple Java VMs for IE, Firefox and Safari. Mv_Embed maps all these plugin JavaScript systems to a near-HTML5-spec API, so web application developers can use a uniform JavaScript API for video control and interaction without having to worry about the underlying plugin systems. Mv_Embed is used as part of the metavidWiki Project (screen cast).
 
'''Proposed Development:''' Mv_Embed will be enhanced around two goals: integration into prominent open source Content Management Systems, and better accessibility of closed captions and associated video metadata.
 
Mv_Embed will integrate into existing CMS video extensions for quick "one-off" ogg theora support.
* FilmForge (Drupal)
* ShowInABox (Wordpress)
* Plumi (Plone)
 
Additional server side components for transcoding to Theora, generating thumbnails, and exporting metadata will also be developed. Where time and resources permit, server side hooks into ffmpeg2theora (for transcoding) and mplayer (for generating thumbnails) will be developed for the CMSs as well. As OggPusher matures, simple hooks will be added to the CMSs to support direct Ogg Theora clip uploads.
 
'''Accessibility & CMML'''
The accessibility components of Mv_Embed consist of obtaining the metadata and putting it into the DOM as a child of the video element. Mv_Embed will offer a reference JavaScript interface for client interactions with that metadata. The metadata will be structured in Continuous Media Markup Language (CMML). CMML is part of the Annodex technology set and can either be muxed into the Ogg stream or be requested separately as XML. Mv_Embed will negotiate a transport method for the metadata that will work for the given plugin type. (Currently only the oggplay plugin supports Ogg Skeleton and exposes muxed CMML tracks in the Ogg stream.)
 
Mv_Embed is part of [http://metavid.ucsc.edu/wiki/index.php/MetaVidWiki MetavidWiki], which enables community-authored transcripts and exposes these multiple layers in CMML. The proposed work on Mv_Embed will generalize the development efforts taking place in the metavid project for other CMSs and improve the usability and accessibility of these metadata layers in JavaScript-based interfaces and multi-plugin playback environments.
 
=== Theora Java port directly (semi)automatically derived from the reference sources ===


=== Optimize Theora encoding/decoding speed, SSE/SSE2 ===
The current Java decoder port (jheora) is rapidly heading towards obsolescence. It was based on the C reference implementation during its alpha development stages, which means it cannot decode advanced Theora streams using non-VP3 features. Current Theora mainline features a completely new decoder implementing all bitstream features, and a new encoder needing these advanced decoder capabilities is expected to arrive soon. jheora, however, appears to be unmaintainable for the very same reasons the original alpha decoder was dropped. To make matters worse, there is a very noticeable lack of people who are at least moderately skilled in Java, skilled in video coding, and able to write Java code that runs at acceptable speed (video decoding should be realtime). Any conventional manual Java source port may quickly bitrot to an unmaintainable state.
Work on MMX, SSE/SSE2 implementations of the crucial encoding and
decoding  elements in [http://svn.xiph.org/trunk/theora/ libtheora]. This could continue in theora-exp
where with ruik's work or consolidate theora-mmx, theora-oil work into
trunk. The results must still build cleanly on other archs and do
run-time capability detection.


''Mentor: Ralph Giles, backup: Jan Gerber, Mike Smith''
Thankfully there *are* technologies to get C code to execute in the Java Virtual Machine. The obvious idea would be to translate the actual source code to Java using an automated process, but no reliable tools exist doing this (and given the concept-clash in some areas between C and Java it's unlikely something really nice will emerge). Projects like NestedVM (http://nestedvm.ibex.org/) and Cibyl (http://spel.bth.se/index.php/Cibyl) are doing '''language agnostic translations to Java bytecode''', using the GCC toolchain.


=== Theora reference encoder quality optimization ===
In the first step the code to be ported is compiled to MIPS ELF binaries. Those are then converted to Java bytecode. This works pretty well because MIPS is pretty similar to Java bytecode and most instructions can be mapped directly.
The [http://theora.org/download.html libtheora] encoder could make more use of some features present in the spec
but not currently implemented in the encoder. This is a little open ended, but
suggestions are: quant matrix tuning, per-block qi choice, 4:2:2 and 4:4:4 chroma
support.


''Mentor: Ralph Giles, backup: Timothy Terriberry''
Craziness? Work of mad men, living in nuclear families, fighting rampaging robots with nuclear missiles? Does this actually work? Yup, it does work. Some Xiph encoders/decoders have already been successfully converted with NestedVM (http://groups.google.com/group/nestedvm/browse_thread/thread/df96ef7337f390e4/a45fdd66534e7641?#a45fdd66534e7641), and figures provided by the Cibyl project indicate that the MIPS-to-Java approach isn't actually slower than a "real" Java port (http://spel.bth.se/index.php/Cibyl_performance): it's sometimes faster, sometimes slower.


=== Encode support in theora-exp ===
The problem with NestedVM is that there appears to be no means to generate a Java interface from the converted binaries - which means that while the converted binaries work fine on Java there's no way to call the functionality of the converted code by other Java classes, which would be necessary to e.g. write a player applet.
Implement a rate-distortion optimized encoding mode for [http://svn.xiph.org/trunk/theora-exp/ theora-exp],
including R-D optimized mode decision and quantization (e.g., constant
lambda). Then, use the above routines to implement a medium-latency ABR
encoding mode (e.g., varying lambda), with a default target buffer size
of approximately 2 seconds.


''Mentor: Timothy "Derf" Terriberry, backup: Mike Smith, Ralph Giles''
Cibyl, on the other hand, does provide means to generate Java interfaces, given the binary and the header files. Cibyl, however, needs to link some helper symbols into the MIPS binary, which apparently requires some tricks to work in the usual autoconf setup (http://groups.google.com/group/cibyl-devel/browse_thread/thread/584e5fc3b9bc7e2c). So for the Cibyl port to work some autoconf magic may be necessary.


=== Development assistant for the "Ghost" audio codec ===
So what should this project do:
Designing a cutting edge perceptual codec is a very daunting task. Xiph
is in the research stage on a new low-latency, general purpose audio codec,
code-named "Ghost". This is basically a "code assistant" position, where you
will be asked to implement, test, and give feedback on ideas from Christopher
Montgomery, designer of the Ogg Vorbis format. Be prepared to learn a lot about
audio coding, or apply what you already know. While there's less "ownership"
potential in this project proposal, it will be a great opportunity to learn
about compression algorithm design, practice your programming chops, and learn
to work in a team.


''Mentor: Christopher "Monty" Montgomery, backup: Jean-Marc Valin''
* Create and document a working setup for doing language-agnostic Java conversions
* Demonstrate this for Theora
* Find a way to generate a Java interface in a way being automated as much as possible


=== OggMNG implementation ===
This project is most likely directly bound to progress made with either NestedVM or Cibyl. The upside of this is that any results may be directly applied to other projects, too.
Implement the OggMNG decode support in [http://gstreamer.freedesktop.org/dev/ gstreamer] and/or [http://www.illiminable.com/ogg/ illi's dshow filters].
Implement encoding support based on [http://www.advogato.org/person/company/diary.html?start=18 byzanz] or [http://live.gnome.org/Istanbul Istanbul]. Bonus points for
overlay support. Details on the OggMNG specifications [http://wiki.xiph.org/index.php/OggMNG here]


''Mentors: Mike Smith, Ralph Giles''
--[[User:Maikmerten|Maikmerten]] 03:43, 12 March 2008 (PDT)


=== Subtitle Editor ===
This project would also consolidate the various proposals for
subtitles. From what I saw on #annodex,
using [http://annodex.net/TR/draft-pfeiffer-cmml-03.html CMML] might be the way to go. Coordinate with [http://annodex.net/ Annodex.net].


another option is to pick an existing format (e.g. srt) and get a
===OggPusher===
gui mockup done. It should: give you playback with scrubbing, let
'''Mentor:''' Michael Dale ... or anyone else with more experience with firefox extensions/ffmpeg2theora ? <br>
you set in and out points, and write out the results in an Ogg File.
'''Abstract:''' OggPusher is a proposed cross platform packaging of ffmpeg2theora as a browser extension. This exposes JavaScript hooks to web applications enabling easy client side transcodes from high quality source originals such as DV or MPEG2 and uploading into web based content management systems.<br>
'''Sample Application Flow:''' A user visits an oggPusher-enabled web service. The Firefox user is prompted to install a browser extension via Firefox's .xpi extension framework. Once it is enabled, the web service's upload interface calls oggPusher to open an "open file" dialog box on the client. The web service uses the oggPusher API to set the requested transcode bitrate and other transcode options (such as interlacing, number of audio channels, resolution, etc.). The client selects the high quality local file and begins transcoding to a temporary location on local disk. If there is an error in transcoding, the upload is aborted and an error is exposed to the web application. Once the file is done transcoding, the web interface has the client issue a POST of the transcoded file (if the server supports the more efficient PUT, then that can be used). The amount of the file that has been transcoded and the amount uploaded are exposed via JavaScript hooks so that the web application's JavaScript interface can update the client on upload progress. If the upload connection is reset, an AJAX request on the client can request "bytes uploaded so far" from the server and have oggPusher resume uploading from that point in the temporary local Ogg file. A local file hash could be rechecked to ensure the local file has not changed. The server can then do a simple join on the uploaded pieces, enabling resumable uploads over the existing HTTP protocol. If the server does not support resumes, the file will be uploaded from the start.
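The resume step in the flow above can be sketched as follows. This is only an illustration in Python, not part of the proposed extension: the function names (<code>file_digest</code>, <code>resume_offset</code>, <code>remaining_bytes</code>, <code>join_pieces</code>) and the choice of SHA-1 are assumptions made for the example, and the "bytes uploaded so far" value is passed in directly instead of being fetched from a server.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hash the local transcoded file so later attempts can detect changes."""
    h = hashlib.sha1()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(64 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def resume_offset(path: Path, digest_at_start: str, bytes_on_server: int,
                  server_supports_resume: bool) -> int:
    """Return the byte offset from which the next upload attempt should start."""
    if not server_supports_resume:
        return 0  # no resume support: upload from the start
    if file_digest(path) != digest_at_start:
        return 0  # the local file changed, so the server's pieces are stale
    if not 0 <= bytes_on_server <= path.stat().st_size:
        return 0  # the server reported a nonsensical count
    return bytes_on_server


def remaining_bytes(path: Path, offset: int) -> bytes:
    """Read the part of the file that still has to be sent."""
    with path.open("rb") as f:
        f.seek(offset)
        return f.read()


def join_pieces(pieces: list[bytes]) -> bytes:
    """Server side: a simple join of the uploaded pieces, in arrival order."""
    return b"".join(pieces)
```

A client would record <code>file_digest()</code> once transcoding finishes, then call <code>resume_offset()</code> after each connection reset with whatever count the server reports.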


=== OggSkeleton tool support ===
'''Features for initial Release:'''
Implement [http://wiki.xiph.org/OggSkeleton OggSkeleton] production in various xiph tool packages, e.g. oggenc, vorbisenc, speexenc, theora encoder_example or ffmpeg2theora. Possibly also implement support for the QuickTime media framework (most other media frameworks already support OggSkeleton e.g. gstreamer, xine, vlc, DirectShow). It may be interesting to do these as a general library, e.g. on top [http://annodex.net/software/liboggz/ liboggz].
* A .xpi extension based on ffmpeg2theora that supports uploading of local files of any type that ffmpeg accepts.
* Supports two modes of operation
** zero server side config where oggPusher just gives the option of uploading theora video where it finds a form file input type.
** server side config where the server/service hooks into oggPusher for extra functionality, like resuming transfers and status updates integrated with the web application.
* A simple JavaScript API for controlling ffmpeg2theora encoding options. These options will be predetermined, and JavaScript input will be sanitized to avoid client side security risks.
* A set of javascript hooks for oggPusher that expose upload progress, encoding progress and transcoding errors.
* A sample server side implementation using php/html/javascript for grabbing ogg files from oggPusher.


''Mentor: Conrad Parker, backup: Silvia Pfeiffer''


=== RTP payloaders and depayloaders for Vorbis and Theora ===
'''Future Feature RoadMap:'''
Write a set of payloaders and depayloader plugins for Vorbis and Theora for [http://gstreamer.net/ GStreamer] and [http://farsight.sourceforge.net/ Farsight].
Once the basic implementation has been deployed the following features will be targeted for future versions:
These plugins should implement the current specifications for [http://xiph.org/vorbis/ Vorbis] and [http://xiph.org/theora/ Theora]. Philipe Khalaf and Rob Taylor
of the Farsight project are prepared to mentor this project.


''Mentors: Philipe Khalaf, Rob Taylor Backup: Christian Schaller''
* Integration with popular open source CMSs; the first target is MediaWiki.
* Hooks for connecting into "live" interfaces such as FireWire digital video input or USB web cams.
** Extend oggfwd and server side components for in-browser live streaming to web services.
* Extend to support ffmpeg2Dirac and future open source media codecs.
* Enable JavaScript hooks for grabbing high-quality JPEG or PNG screen grabs from the original source to be uploaded alongside the encoded video.
* Enable BitTorrent uploads


=== Theora support in ekiga ===
===XSPF support in oggenc and ogg123 applications===
Implement support for Theora as a video codec in the [http://www.gnomemeeting.org/ ekiga] chat application.
Overlaps with [http://live.gnome.org/SummerOfCode2006/Ideas GNOME].


''Mentor: Ralph Giles''
Mentor: Ivo Emanuel Gonçalves<br/>
Existing Feature Set: oggenc and ogg123 are part of a toolset named vorbis-tools, where oggenc is a Vorbis encoder and ogg123 an audio player. XSPF is an XML-based playlist format that is extensible, yet simple and efficient.


=== MXF support in gstreamer ===
Proposed Development: this project would extend those two applications (oggenc and ogg123) to support XSPF. Namely, oggenc would be able to generate a playlist from the encoded files, and ogg123 would be able to parse a playlist for supported media for playback. This is a C project, with the intention of using code from or actually linking to the BSD-licensed libSpiff, which is a C++ XSPF library.
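To make the two halves of this proposal concrete, here is a minimal sketch of writing and reading an XSPF playlist. It is written in Python with the standard library purely for illustration; the proposed project is in C and would use libSpiff. The function names <code>build_xspf</code> and <code>playable_locations</code>, and the list of file extensions treated as "supported media", are assumptions for the example.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath
from urllib.parse import quote

XSPF_NS = "http://xspf.org/ns/0/"


def build_xspf(paths):
    """Encoder side: build an XSPF playlist listing the encoded files."""
    ET.register_namespace("", XSPF_NS)  # serialize with a default namespace
    playlist = ET.Element(f"{{{XSPF_NS}}}playlist", {"version": "1"})
    track_list = ET.SubElement(playlist, f"{{{XSPF_NS}}}trackList")
    for p in paths:
        track = ET.SubElement(track_list, f"{{{XSPF_NS}}}track")
        location = ET.SubElement(track, f"{{{XSPF_NS}}}location")
        location.text = "file://" + quote(p)  # absolute POSIX path assumed
        title = ET.SubElement(track, f"{{{XSPF_NS}}}title")
        title.text = PurePosixPath(p).stem
    return ET.tostring(playlist, encoding="unicode")


def playable_locations(xml_text, extensions=(".ogg", ".oga", ".flac", ".spx")):
    """Player side: return track locations whose extension looks supported."""
    root = ET.fromstring(xml_text)
    locations = [el.text for el in root.iter(f"{{{XSPF_NS}}}location") if el.text]
    return [loc for loc in locations if loc.lower().endswith(extensions)]
```

For example, <code>playable_locations(build_xspf(["/music/a b.ogg", "/music/c.mp3"]))</code> keeps only the first entry, with the space percent-encoded in the file URI.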
Implement an [http://www.digitalpreservation.gov/formats/fdd/fdd000013.shtml MXF] mux/demux for [http://gstreamer.freedesktop.org/dev/ gstreamer], with mappings for [http://xiph.org/vorbis/ Vorbis] and [http://xiph.org/theora Theora].


''Mentors: Christian Schaller, Mike Smith''


=== Hardware implementation of Theora decoding ===
===php_annodex: wrapper to libannodex or liboggz for doing media stuff===
Work on a hardware Theora decoder that can be used in embedded
devices, DVD players and video pods. Presumably GPL Verilog source
to run on an FPGA. See http://sourceforge.net/projects/elphel/ for a rough encoder implementation.


=== Intel to AT&T x86 assembly translation ===
'''Mentor:''' Silvia Pfeiffer ... or anyone else with a PHP background, e.g. Michael Dale<br/>
There is a general need for cross platform projects to be able to compile the same asm acceleration code on both GCC and MSVC. Unfortunately, at least on x86, they have incompatible assembly formats. Currently people either convert one to the other by hand (a maintenance nightmare) or require an external compile/assemble step on one or the other platform.


Start with the (unmaintained?) [http://www.niksula.hut.fi/~mtiihone/intel2gas/ intel2gas] script. Spruce it up to support all of recent MMX, SSE, SSE2, SSE3 instructions. Then implement the reverse translation. Once both are working, write some glue code so it can be easily used as part of a GNU autotools build to derive one set of source from the other at build or package time.
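The core of the Intel-to-AT&T direction is reversing the operand order and adding the <code>%</code> and <code>$</code> sigils. The toy sketch below, in Python, shows only that much. It is not derived from intel2gas, the function names are hypothetical, and it deliberately rejects memory operands, size suffixes and anything else a real translator would have to handle.

```python
import re

# Register names recognized by this toy example (32-bit GPRs, MMX, SSE).
REGISTERS = (
    {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}
    | {f"mm{i}" for i in range(8)}
    | {f"xmm{i}" for i in range(8)}
)


def convert_operand(op: str) -> str:
    """Convert one Intel-syntax operand to AT&T syntax."""
    op = op.strip()
    if op.lower() in REGISTERS:
        return "%" + op.lower()  # registers get a % prefix
    if re.fullmatch(r"-?(\d+|0x[0-9a-fA-F]+)", op):
        return "$" + op  # immediates get a $ prefix
    raise ValueError(f"unsupported operand in this sketch: {op!r}")


def intel_to_att(line: str) -> str:
    """Translate 'op dst, src' (Intel) into 'op src, dst' (AT&T)."""
    code = line.split(";", 1)[0].strip()  # drop Intel-style ';' comments
    if not code:
        return ""
    mnemonic, _, rest = code.partition(" ")
    operands = [convert_operand(o) for o in rest.split(",")] if rest.strip() else []
    operands.reverse()  # AT&T puts the source first
    return (mnemonic.lower() + " " + ", ".join(operands)).strip()
```

The reverse translation would undo the same two steps, which is why handling both directions in one table-driven tool is attractive.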
'''What is it?'''
Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. php_annodex can e.g. be used to extend Drupal, MediaWiki and other php-based applications with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through php_annodex.


=== Theora encoder and Ogg exporter/muxer QuickTime components ===
'''What is the project?'''
[http://developer.apple.com/quicktime/ QuickTime] is a major multimedia framework, used in many professional audio and video applications. The framework is flexible and its functionality can be extended by means of plugins - ''components''.
An initial version of [http://annodex.net/software/phpannodex/index.html php_annodex exists], but it is incomplete and not up-to-date. This is in comparison with such support in python through [http://annodex.net/taxonomy_menu/1/19 pyannodex]. A GSoC student would be expected to bring the support for Xiph and Annodex technology in php_annodex up-to-date. In addition, he/she could extend this work by also implementing media support in a plugin, e.g. the Drupal module [http://annodex.net/software/phpannodex/index.html Acidfree]. php_annodex is simply a php wrapper around the C-libraries libannodex and liboggz. It may suffice to just focus on liboggz.
While the number of existing tools able to export video in Theora format is still limited, Theora encoder/exporter components would somewhat improve the situation allowing applications based on QuickTime to produce content in Theora format.


The project idea is to implement two QuickTime components: Theora encoder and Ogg exporter.
In the Theora encoder component [http://theora.org/download.html libtheora] should be used for the encoding functionality and focus here is on integrating that functionality with the rest of the QuickTime framework.
The Ogg file exporter would be built around [http://www.xiph.org/ogg/ libogg] for transferring internal QuickTime movie structures into physical Ogg streams. The component would also need to support [http://www.xiph.org/ogg/doc/oggstream.html concurrent multiplexing] and Ogg Theora mapping ([http://www.theora.org/doc/Theora_I_spec.pdf Theora spec, appendix A]). [http://www.xiph.org/ogg/doc/oggstream.html Sequential multiplexing] and possibly [[OggSkeleton]] support would be a plus.


QuickTime is [http://developer.apple.com/quicktime/ extensively documented] including [http://developer.apple.com/documentation/QuickTime/QuickTimeComponentCreation-date.html#doclist component creation] and [http://developer.apple.com/samplecode/QuickTime/idxQuickTimeComponentCreation-date.html#doclist example components]. The old [http://qtcomponents.sourceforge.net/ qtcomponents] project also contains an Ogg/Vorbis exporter component.
===ruby_annodex: wrapper to libannodex or liboggz for doing media stuff===


''Mentor: Arek Korbik''
'''Mentor:''' Silvia Pfeiffer<br/>


=== Audio encoders in QuickTime/CoreAudio ===
'''What is it?'''
This is a complementary idea to the Theora and Ogg QuickTime components above, adding audio format support to the Ogg exporter component.
Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. ruby_annodex can e.g. be used to extend rails with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through ruby_annodex.


Implement Vorbis, Speex and possibly FLAC encoders as [http://developer.apple.com/documentation/MusicAudio/Reference/CoreAudio/index.html Core Audio AudioCodecs].
'''What is the project?'''
A python wrapper of similar type called [http://annodex.net/taxonomy_menu/1/19 pyannodex] exists. The ruby_annodex wrapper should provide similar functionality to ruby, in particular with a view of using it from within rails for the development of Web applications. Development of an example application in ruby on rails would be part of this. Extension of this project to include media support into a ruby-based CMS is possible.


Besides documentation, [http://developer.apple.com/audio/download/ Core Audio SDK] includes an example AudioCodec implementation.


''Mentor: Arek Korbik''
===Using ROE to create multi-track Ogg files===


== GUIDELINES FOR APPLYING ==
'''Mentor:''' Silvia Pfeiffer ... and anyone else interested in ROE, e.g. Ralph Giles, Conrad Parker, Michael Dale, Shane Stephens<br/>


Remember that many people will apply to work on the Summer of Code.
'''What is it?'''
[http://trac.annodex.net/wiki/MovieDescriptionLanguage ROE] is a small XML description language for multi-track media files. It can be used for authoring multi-track media files from separate physical files on disk. It can also be used on a Web server to dynamically create multi-track media resources where the tracks are selected through the request from the client.
 
'''What is the project?'''
In this project, we only implement and experiment with the file multiplexing side of things. The ROE specification is very new and potentially incomplete, so part of the project will be to validate this specification. The other part will be to create an authoring tool that can take a ROE file, parse it, pull in all the input audio, video, text etc files and create an Ogg file with a Skeleton that contains the equivalent of ROE inside the binary file. The project will start with a focus on multiplexing vorbis audio and theora video, but also include speex, FLAC, CMML, and possibly MNG data. If this is achieved in a short time frame, the project can continue onto developing support for these multi-track files in e.g. vlc or ffmpeg. This can even extend to providing a full tool-chain from authoring captions for a video file, to creating the respective multitrack Ogg file, and finally to playing them back inside vlc where the captions are shown as overlays.
 
 
===SHARE application for the Spread Open Media project===
 
Mentor: Ivo Emanuel Gonçalves<br/>
Existing Feature Set: Spread Open Media is a community project to promote the different free formats for multimedia and otherwise. SHARE is a practical step to build on this community and spread more files.
 
Proposed Development: SHARE is intended to be a PHP project.  We do not discard the possibility of using Rails or Python, but the current SOM server does not support these.  SHARE will be a WebJay-like clone, as in users will be able to register, vote, comment and upload their own XSPF playlists.  Basically, it is a playlist sharing application.  Using OpenID for registration and Cortado (an existing Java applet) for playback would be welcome additions.
 
=== Cross-platform Xiph encoder wrapper in qt4 ===
 
Mentor: Not specified yet<br/>
Existing Feature Set: qt4 is a cross-platform C++ widget toolkit, which makes it easy to create GUI programs. Xiph has encoders for all of its main formats, but they are command line only, which is a big no-no for the average user.


Keep in mind that those of us evaluating your application do not know you, we do not know what kind of
Proposed Development: The idea is to create a qt4 wrapper around those encoders to make it easier for anyone to encode media into Vorbis, Speex, Theora and FLAC.  This would likely boost the popularity of said formats tremendously.
experience you have, we do not know what you have done in the past and we have to pick the best people
suited for a particular task.


Hence, it is very important that you tell us in your email why you should be considered to implement a
===Dirac support in liboggplay and liboggz===
particular project. Projects with one line applications will probably get discarded so don't make an application
like this:


I would like to work on project XYZ
'''Mentor:''' ??


Do not cut-and-paste the text from this page in your application. We know full well what the text here is.  
Right now [http://wiki.xiph.org/index.php/OggPlay liboggplay] only supports Theora video.
Instead explain to us your take on the problem "I could implement this using this and that", "I would need to  
Your aim for this project is to add support for [http://dirac.sourceforge.net/ Dirac],
research these areas", "I might need help sorting this out", etc.
this should be done using [http://www.diracvideo.org/ libschrodinger].
Doing this, you will add OggDirac support to the OggPlay Browser Plugin and the upcoming <video> tag support in Firefox.


Explain to us why you are a good candidate, also explain which projects interest you (in case that there is more
== Guidelines for Applying ==
than one) and why.


During the summer of code, we will invest significant resources from existing team members to guide you, answer your questions, and help you architect the software in a way that is acceptable to Xiph.org and that has a high chance of  
Remember that many people will apply to work on the Summer of Code.
having an impact on the larger community Xiph.org works with.


Note that if you are a student who wants to apply for the Summer of Code, you should go through the standard Google process.
Keep in mind that those of us evaluating your application do not know you, we do not know what kind of  
experience you have, we do not know what you have done in the past and we have to pick the best people
suited for a particular task.


If you have questions about these projects you can for instance come to the #theora channel on irc.freenode.com
Hence, it is very important that you tell us in your email why you should be considered to implement a
particular project. Please use the application template at [[Summer of Code Applications]] as a starting point.


(thanks to the Mono project for these general guidelines)
== See Also ==
*[[Todo]]
*[[Bounties]]
*[[CodingGuidelines]]
*[[MIT approach to design and implementation]]
*[[How to do a release]]
*[[Summer of Code 2007]]
*[[Summer of Code 2006]]

Latest revision as of 18:14, 26 February 2009

This is our ideas page for Google Summer of Code projects with Xiph.org and Annodex. The two projects participate jointly this year under Xiph's name.

Students please use the template at Summer of Code Applications when applying for a GSoC position.

Mentors please visit Summer of Code Mentoring and help us prepare our application as a mentoring organization.

Current Ideas

We need a primary and backup mentor volunteer for any project that is to become an official proposal, but submit something and we'll see who we can round up. :)

Project ideas go here

  • Transcode/Tag/Upload tool for Theora et al. (ideally as a firefox extension so web cms integration is easy) OggPusher details below:
  • Theora encoding support in GIMP
  • Cross-platform qt4 wrapper around Xiph encoders. A do-it-all encoder in one simple GUI, possibly drag&drop a la OggDropXPd. This would make it tremendously easy for end-users to encode Theora et al
  • Theora Java port directly (semi)automatically derived from the reference sources
  • Optimisations for Oggenc and Co. as done by Lancer (http://homepage3.nifty.com/blacksword/), which gives around 3x more speed. If one fears quality loss, make it a ./configure option. Lancer's diffs only work for Windows.
The last statement shows why this is a bad idea.
  • Speex support in IceS
  • Better stream source gui for dvswitch
  • XSPF support in ogg123 and oggenc (playlist creation)
  • Initial support for OggPCM in some of our tools
  • OggMNG tools
Is this really necessary? I mean, OggMNG seems to have gone nowhere and serve no niche.--Ivo
We still keep getting asked for a format where speex and images together make up a movie.--Silvia
  • ROE implementation for network: using ROE in a client-server negotiation to dynamically request a specific multi-track ogg file using skeleton (Silvia)
  • create Ogg caption support for vlc using CMML
  • ffmpeg improvements for Xiph codecs:
    • add Speex support
    • add Ogg Skeleton support
    • fix seeking bugs involving Ogg Theora
    • fix bugs in Ogg Theora decoder
    • improve ogg muxer
  • Ogg Cutter, a GUI to cut out segments from Ogg Videos, this could be based on oggz-chop (part of oggz-tools) or done with Gstreamer (starting with remuxer.py)
  • Improve Xiph QuickTime Components:
    • add Ogg Skeleton support (would make XiphQT able to properly play streams served with mod_annodex)
    • add FLAC and Speex encoding support
    • improve user interface of the Ogg exporter
    • add AudioFile components supporting Ogg and FLAC files (to make XiphQT available to applications using only CoreAudio without QuickTime)
  • Portable oggenc2 --Fp 02:26, 14 May 2008 (PDT)
    • oggenc2 is a fork of xiph.org oggenc available for the Win32 platform. Sources are available under the GPL. Unfortunately it does not compile under POSIX systems. Oggenc2 has a lot of features and bug fixes over xiph.org oggenc, e.g.:
      • use of libsamplerate for resampling, giving a higher quality;
      • support of 32 bit and floating point WAV format for input;
    • this project should port all the improvements in oggenc2 to oggenc. Note that the two projects have diverged somewhat, so oggenc may have some features that oggenc2 lacks; a straight port of oggenc2 to POSIX may therefore not be the right approach. The best way is to get the source of all versions of oggenc2, diff successive versions, and try to apply the changes to oggenc.

Detailed Project Description

Mv_Embed: Accessibility and [re]usability:

Mentor: Michael Dale, Anna (EngageMedia)
Existing Feature Set: Mv_Embed is an existing JavaScript library that takes the HTML5 <video> tag and rewrites it to support in-page Ogg Theora playback in contemporary browsers. Mv_Embed supports many browsers and plugins, including: native browser support such as Firefox 3 video builds; the oggplay plugin for Firefox 2 on Windows, Mac and Linux; the VLC ActiveX control/plugin for IE and Firefox on Windows and for Firefox on Mac and Linux; mplayer and totem for Linux; and Java Cortado for the Microsoft, Sun and Apple Java VMs in IE, Firefox and Safari. Mv_Embed maps all these plugin JavaScript systems to a near-HTML5-spec API, enabling web application developers to take advantage of a uniform JavaScript API for video control and interaction without having to worry about the underlying plugin systems. Mv_Embed is used as part of the MetavidWiki project (screen cast).

Proposed Development: Mv_Embed will be enhanced around two goals: integration into prominent open source Content Management Systems, and better accessibility of closed captions and associated video metadata.

Mv_Embed will integrate into existing CMS video extensions for quick "one-off" ogg theora support.

  • FilmForge (Drupal)
  • ShowInABox (Wordpress)
  • Plumi (Plone)

Additional server side components like transcoding to Theora, generating thumbnails, and exporting metadata will also be developed. Where time and resources permit, server side hooks into ffmpeg2theora (for transcoding) and mplayer (for generating thumbnails) will be developed for the CMS systems as well. As OggPusher matures, simple hooks will be added to the CMSs to support direct Ogg Theora clip uploads.

Accessibility & CMML: The accessibility components of mv_embed consist of obtaining the metadata and putting it into the DOM as a child of the video element. Mv_Embed will offer a reference JavaScript interface for client interactions with that metadata. The metadata will be structured in Continuous Media Markup Language (CMML). CMML is part of the Annodex technology set and can either be muxed into the Ogg stream or be requested separately as XML. Mv_Embed will negotiate a transport method for the metadata that will work for the given plugin type. (Currently only the oggplay plugin supports Ogg Skeleton and exposes muxed CMML tracks in the Ogg stream.)
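
As a rough illustration of the client-side lookup such a reference interface would need, the sketch below parses a small CMML-like document and finds the clip that is active at a given playback position. It is written in Python purely for illustration (Mv_Embed itself is JavaScript). The sample document, the function names load_clips and clip_at, and the assumption that every clip start time is a plain number of seconds are hypothetical simplifications, not part of Mv_Embed and not a full reading of the CMML specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified CMML document: only <clip> elements with
# an id, a start time in plain seconds (an assumption) and a <desc> child.
SAMPLE_CMML = """<cmml>
  <clip id="intro" start="0"><desc>Opening remarks</desc></clip>
  <clip id="debate" start="15.5"><desc>Floor debate begins</desc></clip>
</cmml>"""


def load_clips(cmml_text):
    """Return (start_seconds, clip_id, description) tuples sorted by start."""
    root = ET.fromstring(cmml_text)
    clips = []
    for clip in root.findall("clip"):
        start = float(clip.get("start", "0"))
        clips.append((start, clip.get("id"), clip.findtext("desc", default="")))
    return sorted(clips, key=lambda c: c[0])


def clip_at(clips, seconds):
    """Return the last clip starting at or before `seconds`, or None."""
    current = None
    for clip in clips:
        if clip[0] <= seconds:
            current = clip
    return current


if __name__ == "__main__":
    clips = load_clips(SAMPLE_CMML)
    print(clip_at(clips, 20.0))
```

A player would call something like clip_at on each time update and show the returned description as a caption; how the metadata reaches the page (muxed or as separate XML) does not change this lookup.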

Mv_Embed is part of MetavidWiki, which enables community-authored transcripts and exposes these multiple layers in CMML. Proposed work on Mv_Embed will generalize these development efforts taking place in the Metavid project for other CMSs and improve the usability and accessibility of these metadata layers in JavaScript-based interfaces and multi-plugin playback environments.

Theora Java port directly (semi)automatically derived from the reference sources

The current Java decoder port (jheora) is rapidly heading towards becoming obsolete. It was based on the C reference implementation during its alpha development stages, which means it cannot decode advanced Theora streams using non-VP3 features. Current Theora mainline features a completely new decoder, implementing all bitstream features, and a new encoder needing these advanced decoder capabilities is expected to arrive soon. jheora, however, appears to be unmaintainable for the very same reasons the original alpha decoder was dropped. To make matters worse, there is a very noticeable lack of people who are at least moderately skilled in Java AND skilled in video coding AND able to write Java code that runs at acceptable speed (video decoding should be realtime). Any conventional manual Java source port may quickly bitrot to an unmaintainable state.

Thankfully there *are* technologies to get C code to execute in the Java Virtual Machine. The obvious idea would be to translate the actual source code to Java using an automated process, but no reliable tools exist doing this (and given the concept-clash in some areas between C and Java it's unlikely something really nice will emerge). Projects like NestedVM (http://nestedvm.ibex.org/) and Cibyl (http://spel.bth.se/index.php/Cibyl) are doing language agnostic translations to Java bytecode, using the GCC toolchain.

In the first step the code to be ported is compiled to MIPS ELF binaries. Those are then converted to Java bytecode. This works pretty well because MIPS is pretty similar to Java bytecode and most instructions can be mapped directly.

Craziness? Work of mad men, living in nuclear families, fighting rampaging robots with nuclear missiles? Does this actually work? Yup, it does work, and some Xiph encoders/decoders have been successfully converted with NestedVM already (http://groups.google.com/group/nestedvm/browse_thread/thread/df96ef7337f390e4/a45fdd66534e7641?#a45fdd66534e7641) and figures provided by the Cibyl project indicate that the MIPS-to-Java approach isn't actually slower than a "real" Java port (http://spel.bth.se/index.php/Cibyl_performance) - it's sometimes faster, sometimes slower.

The problem with NestedVM is that there appears to be no means to generate a Java interface from the converted binaries - which means that while the converted binaries work fine on Java there's no way to call the functionality of the converted code by other Java classes, which would be necessary to e.g. write a player applet.

Cibyl, on the other hand, does provide means to generate Java interfaces, given the binary and the header files. Cibyl, however, needs to link some helper symbols into the MIPS binary, which apparently requires some tricks to work in the usual autoconf setup (http://groups.google.com/group/cibyl-devel/browse_thread/thread/584e5fc3b9bc7e2c). So for the Cibyl port to work some autoconf magic may be necessary.

So what should this project do:

  • Create and document a working setup for doing language-agnostic Java conversions
  • Demonstrate this for Theora
  • Find a way to generate a Java interface that is automated as much as possible

This project is most likely directly bound to progress made with either NestedVM or Cibyl. The upside of this is that any results may be directly applied to other projects, too.

--Maikmerten 03:43, 12 March 2008 (PDT)


OggPusher

Mentor: Michael Dale ... or anyone else with more experience with firefox extensions/ffmpeg2theora ?
Abstract: OggPusher is a proposed cross platform packaging of ffmpeg2theora as a browser extension. This exposes JavaScript hooks to web applications enabling easy client side transcodes from high quality source originals such as DV or MPEG2 and uploading into web based content management systems.
Sample Application Flow: A user visits an oggPusher-enabled web service. The Firefox user is prompted to install a browser extension via Firefox's .xpi extension framework. Once enabled, the web service's upload interface calls oggPusher to expose an "open file" dialog box on the client. The web service accesses the oggPusher API to set the requested transcode bitrate and other transcode options (such as interlacing, number of audio channels, resolution, etc.). The client selects the high quality local file and begins transcoding to a temporary location on local disk. If there is an error in transcoding, the upload is aborted and an error is exposed to the web application. Once the file is done transcoding, the web interface has the client issue a POST of the transcoded file (if the server supports the more efficient PUT, then that can be used). The amount of the file that has been transcoded and the amount uploaded are exposed via JavaScript hooks so that the web application's JavaScript interface can update the client on upload progress. If the upload connection is reset, an AJAX request on the client can request "bytes uploaded so far" from the server and have oggPusher begin uploading from that point in the temporary local Ogg file. A local file hash could be rechecked to ensure the local file has not changed. The server can then do a simple join on the uploaded pieces, enabling resumable uploads over the existing HTTP protocol. If the server does not support resumes, the file will be uploaded from the start.
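
The resume logic in this flow can be sketched as follows. This is a minimal in-memory model in Python, not the proposed extension code: the UploadServer class, the upload and file_digest functions, the choice of SHA-256 as the hash, the tiny chunk size and the fail_after parameter (which simulates a connection reset) are all assumptions made for illustration.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Hash of the transcoded file, rechecked before an upload resumes."""
    return hashlib.sha256(data).hexdigest()


class UploadServer:
    """In-memory stand-in for the server: stores pieces and joins them."""

    def __init__(self):
        self.pieces = []

    def bytes_received(self) -> int:
        # Answers the client's "bytes uploaded so far" request.
        return sum(len(piece) for piece in self.pieces)

    def receive(self, piece: bytes) -> None:
        self.pieces.append(piece)

    def joined(self) -> bytes:
        # The "simple join" on the uploaded pieces.
        return b"".join(self.pieces)


def upload(data, server, expected_digest, chunk_size=4, fail_after=None):
    """Send data starting at the server's offset; return True when complete."""
    if file_digest(data) != expected_digest:
        raise ValueError("local file changed since the upload started")
    offset = server.bytes_received()
    sent = 0
    while offset < len(data):
        if fail_after is not None and sent >= fail_after:
            return False  # simulated connection reset
        piece = data[offset:offset + chunk_size]
        server.receive(piece)
        offset += len(piece)
        sent += 1
    return True


if __name__ == "__main__":
    data = b"OggS-transcoded-bytes"
    server = UploadServer()
    digest = file_digest(data)
    upload(data, server, digest, fail_after=2)   # interrupted attempt
    upload(data, server, digest)                 # resumes at the offset
    print(server.joined() == data)
```

The point of the design is that the server only needs to report a byte count and concatenate what it received, so resuming works over plain HTTP requests without a dedicated protocol.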

Features for initial Release:

  • A .xpi extension based on ffmpeg2theora that supports uploading of local files of any type that ffmpeg accepts.
  • Supports two modes of operation
    • zero server side config where oggPusher just gives the option of uploading theora video where it finds a form file input type.
    • server side config where the server/service hooks into oggPusher for extra functionality, like resuming transfers and status updates integrated with the web application.
  • A simple JavaScript API for controlling ffmpeg2theora encoding options. These options will be predetermined and JavaScript input will be scrubbed to avoid client side security risks.
  • A set of javascript hooks for oggPusher that expose upload progress, encoding progress and transcoding errors.
  • A sample server side implementation using php/html/javascript for grabbing ogg files from oggPusher.
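
The idea of predetermined options with scrubbed input, mentioned in the feature list above, can be sketched as an allow-list check. The sketch is in Python for illustration only (the real API would be JavaScript inside the extension), and the option names, value ranges and the scrub_options function are hypothetical; they are not actual ffmpeg2theora flags.

```python
# Hypothetical allow-list: option name -> (required type, validity check).
ALLOWED_OPTIONS = {
    "video_bitrate_kbps": (int, lambda v: 45 <= v <= 2000),
    "audio_channels": (int, lambda v: v in (1, 2)),
    "deinterlace": (bool, lambda v: True),
}


def scrub_options(requested: dict) -> dict:
    """Return a copy of `requested` if every option passes the allow-list.

    Unknown option names and values of the wrong type or range are rejected
    outright instead of being passed on to the encoder.
    """
    clean = {}
    for name, value in requested.items():
        if name not in ALLOWED_OPTIONS:
            raise ValueError(f"option not allowed: {name!r}")
        expected_type, is_valid = ALLOWED_OPTIONS[name]
        # Exact type match: bool is a subclass of int in Python, so
        # isinstance() would let True through as a channel count.
        if type(value) is not expected_type or not is_valid(value):
            raise ValueError(f"bad value for {name!r}: {value!r}")
        clean[name] = value
    return clean


if __name__ == "__main__":
    print(scrub_options({"video_bitrate_kbps": 400, "audio_channels": 2}))
```

Because only typed, range-checked values from a fixed set of names survive, a web page cannot smuggle file paths or shell fragments into the transcoder invocation through the options it sets.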


Future Feature RoadMap: Once the basic implementation has been deployed the following features will be targeted for future versions:

  • Integration with popular open source CMSs; the first target is MediaWiki.
  • Hooks for connecting into "live" interfaces such as firewire digital video input or USB web cams.
    • Extend oggfwd and server side components for in browser live streaming to web services.
  • Extend to support ffmpeg2Dirac and future open source media codecs.
  • Enable JavaScript hooks for grabbing high quality jpg or png screen grabs from the original source to be uploaded alongside the encoded video.
  • Enable Bittorrent uploads

XSPF support in oggenc and ogg123 applications

Mentor: Ivo Emanuel Gonçalves
Existing Feature Set: oggenc and ogg123 are part of a toolset named vorbis-tools, where oggenc is a Vorbis encoder and ogg123 an audio player. XSPF is an XML-based playlist format, extensible, but simple and efficient.

Proposed Development: this project would extend those two applications (oggenc and ogg123) to support XSPF. Namely, oggenc would be able to generate a playlist from the encoded files, and ogg123 would be able to parse a playlist for supported media for playback. This is a C project, with the intention of using code from or actually linking to the BSD-licensed libSpiff, which is a C++ XSPF library.
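
To make the two halves of the proposal concrete, here is a small sketch of writing an XSPF playlist for a set of encoded files and reading the playable locations back out. The project itself is meant to be C (possibly linking to libSpiff), so this Python version is only an illustration; the function names build_playlist and playable_locations and the list of supported file extensions are assumptions, while the element names and namespace follow XSPF version 1.

```python
import xml.etree.ElementTree as ET

XSPF_NS = "http://xspf.org/ns/0/"


def build_playlist(tracks):
    """Return an XSPF document (as a string) for (location, title) pairs."""
    ET.register_namespace("", XSPF_NS)
    playlist = ET.Element(f"{{{XSPF_NS}}}playlist", {"version": "1"})
    track_list = ET.SubElement(playlist, f"{{{XSPF_NS}}}trackList")
    for location, title in tracks:
        track = ET.SubElement(track_list, f"{{{XSPF_NS}}}track")
        ET.SubElement(track, f"{{{XSPF_NS}}}location").text = location
        ET.SubElement(track, f"{{{XSPF_NS}}}title").text = title
    return ET.tostring(playlist, encoding="unicode")


def playable_locations(xspf_text, extensions=(".ogg", ".oga", ".flac", ".spx")):
    """Return track locations whose extension is in the supported set."""
    root = ET.fromstring(xspf_text)
    ns = {"x": XSPF_NS}
    found = root.findall("x:trackList/x:track/x:location", ns)
    locations = [element.text for element in found]
    return [loc for loc in locations if loc and loc.lower().endswith(extensions)]


if __name__ == "__main__":
    doc = build_playlist([
        ("file:///music/a.ogg", "Track A"),
        ("file:///music/b.mp3", "Track B"),
    ])
    print(playable_locations(doc))
```

The first function corresponds to what oggenc would do after encoding, and the second to what ogg123 would do before playback: skip entries it cannot decode rather than fail on the whole playlist.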


php_annodex: wrapper to libannodex or liboggz for doing media stuff

Mentor: Silvia Pfeiffer ... or anyone else with a PHP background, e.g. Michael Dale

What is it? Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. php_annodex can e.g. be used to extend Drupal, MediaWiki and other php-based applications with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through php_annodex.

What is the project? An initial version of php_annodex exists, but it is incomplete and not up-to-date. This is in comparison with such support in python through pyannodex. A GSoC student would be expected to bring the support for Xiph and Annodex technology in php_annodex up-to-date. In addition, he/she could extend this work by also implementing media support in a plugin, e.g. the Drupal module Acidfree. php_annodex is simply a php wrapper around the C-libraries libannodex and liboggz. It may suffice to just focus on liboggz.


ruby_annodex: wrapper to libannodex or liboggz for doing media stuff

Mentor: Silvia Pfeiffer

What is it? Direct interaction with Ogg video and audio files from within a Web scripting language is key to providing further support to existing and new Web media applications. ruby_annodex can e.g. be used to extend rails with function calls to control opening, closing, seeking, playing, pausing, telling position and similar interactions with audio/video. Further, since Annodex has CMML for time-aligned annotations, hyperlinks to other places, and textual descriptions (such as captions) can be accessed and used through ruby_annodex.

What is the project? A python wrapper of similar type called pyannodex exists. The ruby_annodex wrapper should provide similar functionality to ruby, in particular with a view of using it from within rails for the development of Web applications. Development of an example application in ruby on rails would be part of this. Extension of this project to include media support into a ruby-based CMS is possible.


Using ROE to create multi-track Ogg files

Mentor: Silvia Pfeiffer ... and anyone else interested in ROE, e.g. Ralph Giles, Conrad Parker, Michael Dale, Shane Stephens

What is it? ROE is a small XML description language for multi-track media files. It can be used for authoring multi-track media files from separate physical files on disk. It can also be used on a Web server to dynamically create multi-track media resources where the tracks are selected through the request from the client.

What is the project? In this project, we only implement and experiment with the file multiplexing side of things. The ROE specification is very new and potentially incomplete, so part of the project will be to validate this specification. The other part will be to create an authoring tool that can take a ROE file, parse it, pull in all the input audio, video, text etc files and create an Ogg file with a Skeleton that contains the equivalent of ROE inside the binary file. The project will start with a focus on multiplexing vorbis audio and theora video, but also include speex, FLAC, CMML, and possibly MNG data. If this is achieved in a short time frame, the project can continue onto developing support for these multi-track files in e.g. vlc or ffmpeg. This can even extend to providing a full tool-chain from authoring captions for a video file, to creating the respective multitrack Ogg file, and finally to playing them back inside vlc where the captions are shown as overlays.


SHARE application for the Spread Open Media project

Mentor: Ivo Emanuel Gonçalves
Existing Feature Set: Spread Open Media is a community project to promote the different free formats for multimedia and otherwise. SHARE is a practical step to build on this community and spread more files.

Proposed Development: SHARE is intended to be a PHP project. We do not discard the possibility of using Rails or Python, but the current SOM server does not support these. SHARE will be a WebJay-like clone, as in users will be able to register, vote, comment and upload their own XSPF playlists. Basically, it is a playlist sharing application. Using OpenID for registration and Cortado (an existing Java applet) for playback would be welcome additions.

Cross-platform Xiph encoder wrapper in qt4

Mentor: Not specified yet
Existing Feature Set: qt4 is a cross-platform C++ widget toolkit, which makes it easy to create GUI programs. Xiph has encoders for all of its main formats, but they are command line only, which is a big no-no for the average user.

Proposed Development: The idea is to create a qt4 wrapper around those encoders to make it easier for anyone to encode media into Vorbis, Speex, Theora and FLAC. This would likely boost the popularity of said formats tremendously.

Dirac support in liboggplay and liboggz

Mentor: ??

Right now liboggplay only supports Theora video. Your aim for this project is to add support for Dirac, which should be done using libschrodinger. By doing this, you will add OggDirac support to the OggPlay Browser Plugin and the upcoming <video> tag support in Firefox.

Guidelines for Applying

Remember that many people will apply to work on the Summer of Code.

Keep in mind that those of us evaluating your application do not know you, we do not know what kind of experience you have, we do not know what you have done in the past and we have to pick the best people suited for a particular task.

Hence, it is very important that you tell us in your email why you should be considered to implement a particular project. Please use the application template at Summer of Code Applications as a starting point.

See Also