<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.xiph.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Izx</id>
	<title>XiphWiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.xiph.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Izx"/>
	<link rel="alternate" type="text/html" href="https://wiki.xiph.org/Special:Contributions/Izx"/>
	<updated>2026-04-24T13:43:39Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.45.1</generator>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6495</id>
		<title>User:Izx:GSoC2007</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6495"/>
		<updated>2007-03-20T19:22:56Z</updated>

		<summary type="html">&lt;p&gt;Izx: /* My background */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ishaan Dalal == &lt;br /&gt;
&lt;br /&gt;
E-mail: [ishaand [at] gmail dot com]&lt;br /&gt;
&lt;br /&gt;
== My background ==&lt;br /&gt;
&lt;br /&gt;
Currently, I&#039;m working towards my master&#039;s degree in Electrical Engineering (EE) at the Cooper Union for the Advancement of Science and Art in New York, NY.  I graduated from Cooper with a B.E. in EE in 2006. Source coding (audio and video in particular) has always been my primary academic interest; it began in high school when I tried to discover *how* MP3 let me digitize my tape collection to such small sizes and still remain eminently listenable.&lt;br /&gt;
&lt;br /&gt;
To a great degree, this interest has dictated my upper-level and graduate coursework. Besides the usual EE undergrad curriculum, math coursework relevant to this proposal includes upper-level linear algebra and some graduate stochastics. Graduate EE courses, most of which I took when I was a junior/senior, included the theory of digital video and audio coding, adaptive filtering and multiresolution techniques (filterbanks/wavelets). Each of these also had a &amp;quot;final project&amp;quot;, which were a hierarchical DCT-based video codec, a from-scratch MIDI player and a simple subband audio codec, an embedded zerotree wavelet (EZW) image codec and non-separable 3D DWTs for image volumes (i.e. video) respectively.&lt;br /&gt;
&lt;br /&gt;
My hands-on experience with audio coding includes a simple MDCT-based codec with a rudimentary  psychoacoustic model, whose goal was primarily implementing real-time 44.1 KHz stereo decoding/playback on a 16-bit RISC microcontroller (the dsPIC). The decoder was written from scratch in PIC assembly, including sine windowing, MDCT, and RLE/Huffman. My senior project also focused on signal processing: a software-defined RF receiver and image processor for biomedical applications (MRI), implemented on an FPGA; it won the IEEE Region 1 Student Paper Contest and was also the subject of two conference papers (http://i.zdom.org/pubs).&lt;br /&gt;
&lt;br /&gt;
I consider myself quite proficient at coding &amp;quot;math&amp;quot; in both C and octave/matlab; however, my experience with coding GUIs is very limited. I have not contributed, code-wise, to an open-source project before this; I&#039;m an irregular on the lame (mp3-encoder) mailing list, where I try to answer questions from others when I can.&lt;br /&gt;
&lt;br /&gt;
== Project ==&lt;br /&gt;
&lt;br /&gt;
I&#039;d like to help in developing Ghost. Considering that Ghost is still fairly nascent, I&#039;ll try to formulate what I&#039;d like to do as best as I can, after having had discussions with both Monty and JM. Ghost will probably be a &amp;quot;sinusoidal+noise&amp;quot; codec. The &amp;quot;sinusoidal&amp;quot; portion requires a method of estimating the primary sinusoids in a given frame, classifying them psychoacoustically, and then quantizing them. Both Monty and JM have/are experimenting with STFT/basis methods for estimation, and I will help them with implementing and testing their ideas, based on their advice. This could vary from e.g. testing/tweaking their basis solvers, trying out dynamic windowing, quantization, etc.&lt;br /&gt;
&lt;br /&gt;
Another sinusoidal estimation technique that seems promising, and has not been tried out yet by either Monty or JM to the best of my knowledge, is &amp;quot;Matching Pursuit&amp;quot; (MP) [1]. If given a chance, I would like to see whether MP is qualitatively and computationally feasible for sinusoidal estimation in a real-time low-latency codec. The first part would probably include testing different kinds of dictionaries - audio researchers have been successful with atoms that are damped sinusoids [2], complex exponentials [3], harmonic sinusoids. Psycho-adaptive and hybrid MP methods, such as Bark-band-type splitting followed by MP [4], both estimate and also classify to a degree, which might be computationally more efficient. Finally, there are also fast MP techniques such as [5], that use the DFT for fast correlation.&lt;br /&gt;
== Project outcomes ==&lt;br /&gt;
&lt;br /&gt;
Again, it&#039;s hard to give specific numbers here. Over the summer, I would basically like help come to a consensus about the direction Ghost will take re sinusoidal analysis, including preliminary investigations into possible approaches to improve the computational efficiency and R-D tradeoff for the chosen  analysis technique.&lt;br /&gt;
&lt;br /&gt;
== Schedule ==&lt;br /&gt;
&lt;br /&gt;
A rough schedule would be:&lt;br /&gt;
&lt;br /&gt;
May 28 - July 10: Experiment with and decide on a promising sinusoidal estimation technique&lt;br /&gt;
&lt;br /&gt;
July 11 - July 31: Figure out sinusoidal classification - deltas, psych-model picking, etc.&lt;br /&gt;
&lt;br /&gt;
August 1 - August 20: Work on efficient quantization of estimated/classified sinusoids.&lt;br /&gt;
&lt;br /&gt;
I have no other formal commitments for the summer, and expect to devote the time-equivalent of a full-time job to this project. I will be away for two weeks at some point during the summer, visiting family in California. I&#039;ve also submitted a paper to the MWSCAS 2007 conference in Montreal; if accepted, I will be there from August 5-8.&lt;br /&gt;
&lt;br /&gt;
== Why me? ==&lt;br /&gt;
&lt;br /&gt;
The ability to write good code will be a common denominator among all the applicants. What&#039;s relatively unique about me is a broad knowledge of the underlying math and signal processing theory, which will allow me to assist with Ghost&#039;s design in a more substantial and efficient manner, than a &amp;quot;mercenary&amp;quot; coder. As a hypothetical example, I could interpret certain results and &amp;quot;fix&amp;quot; the algorithm/code, without having to be micromanaged by Monty/JM. My experience with implementing audio/signal processing algorithms in assembly and hardware also lets me easily break out of the high-level-language/floating-point mold when optimization is necessary.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
1. S. Mallat and Z. Zhang, &amp;quot;Matching Pursuits with time-frequency dictionaries&amp;quot;, IEEE Trans. Sig. Proc., Vol. 41, No. 12, 1993.&lt;br /&gt;
&lt;br /&gt;
2. M. Goodwin and M. Vetterli, &amp;quot;Matching Pursuit and Atomic Signal Models Based on Recursive Filter Banks&amp;quot;, IEEE Trans. Sig. Proc., Vol. 47, No. 7, July 1999&lt;br /&gt;
&lt;br /&gt;
3. R. Heusdens, et al. &amp;quot;Sinusoidal modeling of audio and speech using psychoacoustic-adaptive matching pursuits&amp;quot;. ICASSP &#039;01, pp. 3281-3284.&lt;br /&gt;
&lt;br /&gt;
4. H. Jang and S. Park, &amp;quot;Multiresolution Sinusoidal Model With Dynamic Segmentation for Timescale Modification of Polyphonic Audio Signals&amp;quot;, IEEE Trans. Speech and Audio Proc., Vol 13, No. 2, March 2005.&lt;br /&gt;
&lt;br /&gt;
5. T.S. Verma and T.H.Y. Meng, &amp;quot;Sinusoidal modeling using frame-based perceptually weighted matching pursuits&amp;quot;, ICASSP &#039;99, pp. 981-984.&lt;/div&gt;</summary>
		<author><name>Izx</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6494</id>
		<title>User:Izx:GSoC2007</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6494"/>
		<updated>2007-03-20T19:21:59Z</updated>

		<summary type="html">&lt;p&gt;Izx: /* My background */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ishaan Dalal == &lt;br /&gt;
&lt;br /&gt;
E-mail: [ishaand [at] gmail dot com]&lt;br /&gt;
&lt;br /&gt;
== My background ==&lt;br /&gt;
&lt;br /&gt;
Currently, I&#039;m working towards my master&#039;s degree in Electrical Engineering (EE) at the Cooper Union for the Advancement of Science and Art in New York, NY.  I graduated from Cooper with a B.E. in EE in 2006. Source coding (audio and video in particular) has always been my primary academic interest; it began in high school when I tried to discover *how* MP3 let me digitize my tape collection to such small sizes and still remain eminently listenable.&lt;br /&gt;
&lt;br /&gt;
To a great degree, this interest has dictated my upper-level and graduate coursework. Besides the usual EE undergrad curriculum, math coursework relevant to this proposal includes upper-level linear algebra and some graduate stochastics. Graduate EE courses, most of which I took when I was a junior/senior, included the theory of digital video and audio coding, adaptive filtering and multiresolution techniques (filterbanks/wavelets). Each of these also had a &amp;quot;final project&amp;quot;, which were a hierarchical DCT-based video codec, a from-scratch MIDI player and a simple subband audio codec, an embedded zerotree wavelet (EZW) image codec and non-separable 3D DWTs for image volumes (i.e. video) respectively.&lt;br /&gt;
&lt;br /&gt;
My hands-on experience with audio coding includes a simple MDCT-based codec with a rudimentary  psychoacoustic model, whose goal was primarily implementing real-time 44.1 KHz stereo decoding/playback on a 16-bit RISC microcontroller (the dsPIC). The decoder was written from scratch in PIC assembly, including sine windowing, MDCT, ATH-based. quant. and RLE/Huffman. My senior project also focused on signal processing: a software-defined RF receiver and image processor for biomedical applications (MRI), implemented on an FPGA; it won the IEEE Region 1 Student Paper Contest and was also the subject of two conference papers (http://i.zdom.org/pubs).&lt;br /&gt;
&lt;br /&gt;
I consider myself quite proficient at coding &amp;quot;math&amp;quot; in both C and octave/matlab; however, my experience with coding GUIs is very limited. I have not contributed, code-wise, to an open-source project before this; I&#039;m an irregular on the lame (mp3-encoder) mailing list, where I try to answer questions from others when I can.&lt;br /&gt;
&lt;br /&gt;
== Project ==&lt;br /&gt;
&lt;br /&gt;
I&#039;d like to help in developing Ghost. Considering that Ghost is still fairly nascent, I&#039;ll try to formulate what I&#039;d like to do as best as I can, after having had discussions with both Monty and JM. Ghost will probably be a &amp;quot;sinusoidal+noise&amp;quot; codec. The &amp;quot;sinusoidal&amp;quot; portion requires a method of estimating the primary sinusoids in a given frame, classifying them psychoacoustically, and then quantizing them. Both Monty and JM have/are experimenting with STFT/basis methods for estimation, and I will help them with implementing and testing their ideas, based on their advice. This could vary from e.g. testing/tweaking their basis solvers, trying out dynamic windowing, quantization, etc.&lt;br /&gt;
&lt;br /&gt;
Another sinusoidal estimation technique that seems promising, and has not been tried out yet by either Monty or JM to the best of my knowledge, is &amp;quot;Matching Pursuit&amp;quot; (MP) [1]. If given a chance, I would like to see whether MP is qualitatively and computationally feasible for sinusoidal estimation in a real-time low-latency codec. The first part would probably include testing different kinds of dictionaries - audio researchers have been successful with atoms that are damped sinusoids [2], complex exponentials [3], harmonic sinusoids. Psycho-adaptive and hybrid MP methods, such as Bark-band-type splitting followed by MP [4], both estimate and also classify to a degree, which might be computationally more efficient. Finally, there are also fast MP techniques such as [5], that use the DFT for fast correlation.&lt;br /&gt;
== Project outcomes ==&lt;br /&gt;
&lt;br /&gt;
Again, it&#039;s hard to give specific numbers here. Over the summer, I would basically like help come to a consensus about the direction Ghost will take re sinusoidal analysis, including preliminary investigations into possible approaches to improve the computational efficiency and R-D tradeoff for the chosen  analysis technique.&lt;br /&gt;
&lt;br /&gt;
== Schedule ==&lt;br /&gt;
&lt;br /&gt;
A rough schedule would be:&lt;br /&gt;
&lt;br /&gt;
May 28 - July 10: Experiment with and decide on a promising sinusoidal estimation technique&lt;br /&gt;
&lt;br /&gt;
July 11 - July 31: Figure out sinusoidal classification - deltas, psych-model picking, etc.&lt;br /&gt;
&lt;br /&gt;
August 1 - August 20: Work on efficient quantization of estimated/classified sinusoids.&lt;br /&gt;
&lt;br /&gt;
I have no other formal commitments for the summer, and expect to devote the time-equivalent of a full-time job to this project. I will be away for two weeks at some point during the summer, visiting family in California. I&#039;ve also submitted a paper to the MWSCAS 2007 conference in Montreal; if accepted, I will be there from August 5-8.&lt;br /&gt;
&lt;br /&gt;
== Why me? ==&lt;br /&gt;
&lt;br /&gt;
The ability to write good code will be a common denominator among all the applicants. What&#039;s relatively unique about me is a broad knowledge of the underlying math and signal processing theory, which will allow me to assist with Ghost&#039;s design in a more substantial and efficient manner, than a &amp;quot;mercenary&amp;quot; coder. As a hypothetical example, I could interpret certain results and &amp;quot;fix&amp;quot; the algorithm/code, without having to be micromanaged by Monty/JM. My experience with implementing audio/signal processing algorithms in assembly and hardware also lets me easily break out of the high-level-language/floating-point mold when optimization is necessary.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
1. S. Mallat and Z. Zhang, &amp;quot;Matching Pursuits with time-frequency dictionaries&amp;quot;, IEEE Trans. Sig. Proc., Vol. 41, No. 12, 1993.&lt;br /&gt;
&lt;br /&gt;
2. M. Goodwin and M. Vetterli, &amp;quot;Matching Pursuit and Atomic Signal Models Based on Recursive Filter Banks&amp;quot;, IEEE Trans. Sig. Proc., Vol. 47, No. 7, July 1999&lt;br /&gt;
&lt;br /&gt;
3. R. Heusdens, et al. &amp;quot;Sinusoidal modeling of audio and speech using psychoacoustic-adaptive matching pursuits&amp;quot;. ICASSP &#039;01, pp. 3281-3284.&lt;br /&gt;
&lt;br /&gt;
4. H. Jang and S. Park, &amp;quot;Multiresolution Sinusoidal Model With Dynamic Segmentation for Timescale Modification of Polyphonic Audio Signals&amp;quot;, IEEE Trans. Speech and Audio Proc., Vol 13, No. 2, March 2005.&lt;br /&gt;
&lt;br /&gt;
5. T.S. Verma and T.H.Y. Meng, &amp;quot;Sinusoidal modeling using frame-based perceptually weighted matching pursuits&amp;quot;, ICASSP &#039;99, pp. 981-984.&lt;/div&gt;</summary>
		<author><name>Izx</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6493</id>
		<title>User:Izx:GSoC2007</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6493"/>
		<updated>2007-03-20T19:20:17Z</updated>

		<summary type="html">&lt;p&gt;Izx: /* Why me? */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ishaan Dalal == &lt;br /&gt;
&lt;br /&gt;
E-mail: [ishaand [at] gmail dot com]&lt;br /&gt;
&lt;br /&gt;
== My background ==&lt;br /&gt;
&lt;br /&gt;
Currently, I&#039;m working towards my master&#039;s degree in Electrical Engineering (EE) at the Cooper Union for the Advancement of Science and Art in New York, NY.  I graduated from Cooper with a B.E. in EE in 2006. Source coding (audio and video in particular) has always been my primary academic interest; it began in high school when I tried to discover *how* MP3 let me digitize my tape collection to such small sizes and still remain eminently listenable.&lt;br /&gt;
&lt;br /&gt;
To a great degree, this interest has dictated my upper-level and graduate coursework. Besides the usual EE undergrad curriculum, math coursework relevant to this proposal includes upper-level linear algebra and some graduate stochastics. Graduate EE courses, most of which I took when I was a junior/senior, included the theory of digital video and audio coding, adaptive filtering and multiresolution techniques (filterbanks/wavelets). Each of these also had a &amp;quot;final project&amp;quot;, which were a hierarchical DCT-based video codec, a from-scratch MIDI player and a simple subband audio codec, an embedded zerotree wavelet (EZW) image codec and non-separable 3D DWTs for image volumes (i.e. video) respectively.&lt;br /&gt;
&lt;br /&gt;
My hands-on experience with audio coding includes a simple MDCT-based codec with a rudimentary  psychoacoustic model, whose goal was primarily implementing real-time 44.1 KHz stereo decoding/playback on a 16-bit RISC microcontroller (the dsPIC). The decoder was written from scratch in PIC assembly, including sine windowing, MDCT, ATH-based. quant. and RLE/Huffman. My senior project also focused on signal processing: a software-defined RF receiver and image processor for biomedical applications (MRI), implemented on an FPGA; it won the IEEE Region 1 Student Paper Contest and was also the subject of two conference papers (http://i.zdom.org/pubs).&lt;br /&gt;
&lt;br /&gt;
I consider myself quite proficient at coding &amp;quot;math&amp;quot; in both C and octave/matlab; however, my experience with coding GUIs is very limited. I have not contributed, code-wise, to an open-source project before this.&lt;br /&gt;
&lt;br /&gt;
== Project ==&lt;br /&gt;
&lt;br /&gt;
I&#039;d like to help in developing Ghost. Considering that Ghost is still fairly nascent, I&#039;ll try to formulate what I&#039;d like to do as best as I can, after having had discussions with both Monty and JM. Ghost will probably be a &amp;quot;sinusoidal+noise&amp;quot; codec. The &amp;quot;sinusoidal&amp;quot; portion requires a method of estimating the primary sinusoids in a given frame, classifying them psychoacoustically, and then quantizing them. Both Monty and JM have/are experimenting with STFT/basis methods for estimation, and I will help them with implementing and testing their ideas, based on their advice. This could vary from e.g. testing/tweaking their basis solvers, trying out dynamic windowing, quantization, etc.&lt;br /&gt;
&lt;br /&gt;
Another sinusoidal estimation technique that seems promising, and has not been tried out yet by either Monty or JM to the best of my knowledge, is &amp;quot;Matching Pursuit&amp;quot; (MP) [1]. If given a chance, I would like to see whether MP is qualitatively and computationally feasible for sinusoidal estimation in a real-time low-latency codec. The first part would probably include testing different kinds of dictionaries - audio researchers have been successful with atoms that are damped sinusoids [2], complex exponentials [3], harmonic sinusoids. Psycho-adaptive and hybrid MP methods, such as Bark-band-type splitting followed by MP [4], both estimate and also classify to a degree, which might be computationally more efficient. Finally, there are also fast MP techniques such as [5], that use the DFT for fast correlation.&lt;br /&gt;
== Project outcomes ==&lt;br /&gt;
&lt;br /&gt;
Again, it&#039;s hard to give specific numbers here. Over the summer, I would basically like help come to a consensus about the direction Ghost will take re sinusoidal analysis, including preliminary investigations into possible approaches to improve the computational efficiency and R-D tradeoff for the chosen  analysis technique.&lt;br /&gt;
&lt;br /&gt;
== Schedule ==&lt;br /&gt;
&lt;br /&gt;
A rough schedule would be:&lt;br /&gt;
&lt;br /&gt;
May 28 - July 10: Experiment with and decide on a promising sinusoidal estimation technique&lt;br /&gt;
&lt;br /&gt;
July 11 - July 31: Figure out sinusoidal classification - deltas, psych-model picking, etc.&lt;br /&gt;
&lt;br /&gt;
August 1 - August 20: Work on efficient quantization of estimated/classified sinusoids.&lt;br /&gt;
&lt;br /&gt;
I have no other formal commitments for the summer, and expect to devote the time-equivalent of a full-time job to this project. I will be away for two weeks at some point during the summer, visiting family in California. I&#039;ve also submitted a paper to the MWSCAS 2007 conference in Montreal; if accepted, I will be there from August 5-8.&lt;br /&gt;
&lt;br /&gt;
== Why me? ==&lt;br /&gt;
&lt;br /&gt;
The ability to write good code will be a common denominator among all the applicants. What&#039;s relatively unique about me is a broad knowledge of the underlying math and signal processing theory, which will allow me to assist with Ghost&#039;s design in a more substantial and efficient manner, than a &amp;quot;mercenary&amp;quot; coder. As a hypothetical example, I could interpret certain results and &amp;quot;fix&amp;quot; the algorithm/code, without having to be micromanaged by Monty/JM. My experience with implementing audio/signal processing algorithms in assembly and hardware also lets me easily break out of the high-level-language/floating-point mold when optimization is necessary.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
1. S. Mallat and Z. Zhang, &amp;quot;Matching Pursuits with time-frequency dictionaries&amp;quot;, IEEE Trans. Sig. Proc., Vol. 41, No. 12, 1993.&lt;br /&gt;
&lt;br /&gt;
2. M. Goodwin and M. Vetterli, &amp;quot;Matching Pursuit and Atomic Signal Models Based on Recursive Filter Banks&amp;quot;, IEEE Trans. Sig. Proc., Vol. 47, No. 7, July 1999&lt;br /&gt;
&lt;br /&gt;
3. R. Heusdens, et al. &amp;quot;Sinusoidal modeling of audio and speech using psychoacoustic-adaptive matching pursuits&amp;quot;. ICASSP &#039;01, pp. 3281-3284.&lt;br /&gt;
&lt;br /&gt;
4. H. Jang and S. Park, &amp;quot;Multiresolution Sinusoidal Model With Dynamic Segmentation for Timescale Modification of Polyphonic Audio Signals&amp;quot;, IEEE Trans. Speech and Audio Proc., Vol 13, No. 2, March 2005.&lt;br /&gt;
&lt;br /&gt;
5. T.S. Verma and T.H.Y. Meng, &amp;quot;Sinusoidal modeling using frame-based perceptually weighted matching pursuits&amp;quot;, ICASSP &#039;99, pp. 981-984.&lt;/div&gt;</summary>
		<author><name>Izx</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6492</id>
		<title>User:Izx:GSoC2007</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6492"/>
		<updated>2007-03-20T19:19:30Z</updated>

		<summary type="html">&lt;p&gt;Izx: /* Why me? */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ishaan Dalal == &lt;br /&gt;
&lt;br /&gt;
E-mail: [ishaand [at] gmail dot com]&lt;br /&gt;
&lt;br /&gt;
== My background ==&lt;br /&gt;
&lt;br /&gt;
Currently, I&#039;m working towards my master&#039;s degree in Electrical Engineering (EE) at the Cooper Union for the Advancement of Science and Art in New York, NY.  I graduated from Cooper with a B.E. in EE in 2006. Source coding (audio and video in particular) has always been my primary academic interest; it began in high school when I tried to discover *how* MP3 let me digitize my tape collection to such small sizes and still remain eminently listenable.&lt;br /&gt;
&lt;br /&gt;
To a great degree, this interest has dictated my upper-level and graduate coursework. Besides the usual EE undergrad curriculum, math coursework relevant to this proposal includes upper-level linear algebra and some graduate stochastics. Graduate EE courses, most of which I took when I was a junior/senior, included the theory of digital video and audio coding, adaptive filtering and multiresolution techniques (filterbanks/wavelets). Each of these also had a &amp;quot;final project&amp;quot;, which were a hierarchical DCT-based video codec, a from-scratch MIDI player and a simple subband audio codec, an embedded zerotree wavelet (EZW) image codec and non-separable 3D DWTs for image volumes (i.e. video) respectively.&lt;br /&gt;
&lt;br /&gt;
My hands-on experience with audio coding includes a simple MDCT-based codec with a rudimentary  psychoacoustic model, whose goal was primarily implementing real-time 44.1 KHz stereo decoding/playback on a 16-bit RISC microcontroller (the dsPIC). The decoder was written from scratch in PIC assembly, including sine windowing, MDCT, ATH-based. quant. and RLE/Huffman. My senior project also focused on signal processing: a software-defined RF receiver and image processor for biomedical applications (MRI), implemented on an FPGA; it won the IEEE Region 1 Student Paper Contest and was also the subject of two conference papers (http://i.zdom.org/pubs).&lt;br /&gt;
&lt;br /&gt;
I consider myself quite proficient at coding &amp;quot;math&amp;quot; in both C and octave/matlab; however, my experience with coding GUIs is very limited. I have not contributed, code-wise, to an open-source project before this.&lt;br /&gt;
&lt;br /&gt;
== Project ==&lt;br /&gt;
&lt;br /&gt;
I&#039;d like to help in developing Ghost. Considering that Ghost is still fairly nascent, I&#039;ll try to formulate what I&#039;d like to do as best as I can, after having had discussions with both Monty and JM. Ghost will probably be a &amp;quot;sinusoidal+noise&amp;quot; codec. The &amp;quot;sinusoidal&amp;quot; portion requires a method of estimating the primary sinusoids in a given frame, classifying them psychoacoustically, and then quantizing them. Both Monty and JM have/are experimenting with STFT/basis methods for estimation, and I will help them with implementing and testing their ideas, based on their advice. This could vary from e.g. testing/tweaking their basis solvers, trying out dynamic windowing, quantization, etc.&lt;br /&gt;
&lt;br /&gt;
Another sinusoidal estimation technique that seems promising, and has not been tried out yet by either Monty or JM to the best of my knowledge, is &amp;quot;Matching Pursuit&amp;quot; (MP) [1]. If given a chance, I would like to see whether MP is qualitatively and computationally feasible for sinusoidal estimation in a real-time low-latency codec. The first part would probably include testing different kinds of dictionaries - audio researchers have been successful with atoms that are damped sinusoids [2], complex exponentials [3], harmonic sinusoids. Psycho-adaptive and hybrid MP methods, such as Bark-band-type splitting followed by MP [4], both estimate and also classify to a degree, which might be computationally more efficient. Finally, there are also fast MP techniques such as [5], that use the DFT for fast correlation.&lt;br /&gt;
== Project outcomes ==&lt;br /&gt;
&lt;br /&gt;
Again, it&#039;s hard to give specific numbers here. Over the summer, I would basically like help come to a consensus about the direction Ghost will take re sinusoidal analysis, including preliminary investigations into possible approaches to improve the computational efficiency and R-D tradeoff for the chosen  analysis technique.&lt;br /&gt;
&lt;br /&gt;
== Schedule ==&lt;br /&gt;
&lt;br /&gt;
A rough schedule would be:&lt;br /&gt;
&lt;br /&gt;
May 28 - July 10: Experiment with and decide on a promising sinusoidal estimation technique&lt;br /&gt;
&lt;br /&gt;
July 11 - July 31: Figure out sinusoidal classification - deltas, psych-model picking, etc.&lt;br /&gt;
&lt;br /&gt;
August 1 - August 20: Work on efficient quantization of estimated/classified sinusoids.&lt;br /&gt;
&lt;br /&gt;
I have no other formal commitments for the summer, and expect to devote the time-equivalent of a full-time job to this project. I will be away for two weeks at some point during the summer, visiting family in California. I&#039;ve also submitted a paper to the MWSCAS 2007 conference in Montreal; if accepted, I will be there from August 5-8.&lt;br /&gt;
&lt;br /&gt;
== Why me? ==&lt;br /&gt;
&lt;br /&gt;
The ability to write good code will be a common denominator among all the applicants. What&#039;s relatively unique about me is a broad knowledge of the underlying math and signal processing theory, which will allow me to assist with Ghost&#039;s design in a more substantial and efficient manner, than a &amp;quot;mercenary&amp;quot; coder. As a hypothetical example, I could interpret certain results and &amp;quot;fix&amp;quot; the algorithm/code, without having to be micromanaged by Monty/JM. My experience with implementing audio/signal processing algorithms in assembly and hardware also lets me easily break out of the high-level-language/floating-point mold when speed or other factors become too cumbersome.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
1. S. Mallat and Z. Zhang, &amp;quot;Matching Pursuits with time-frequency dictionaries&amp;quot;, IEEE Trans. Sig. Proc., Vol. 41, No. 12, 1993.&lt;br /&gt;
&lt;br /&gt;
2. M. Goodwin and M. Vetterli, &amp;quot;Matching Pursuit and Atomic Signal Models Based on Recursive Filter Banks&amp;quot;, IEEE Trans. Sig. Proc., Vol. 47, No. 7, July 1999&lt;br /&gt;
&lt;br /&gt;
3. R. Heusdens, et al. &amp;quot;Sinusoidal modeling of audio and speech using psychoacoustic-adaptive matching pursuits&amp;quot;. ICASSP &#039;01, pp. 3281-3284.&lt;br /&gt;
&lt;br /&gt;
4. H. Jang and S. Park, &amp;quot;Multiresolution Sinusoidal Model With Dynamic Segmentation for Timescale Modification of Polyphonic Audio Signals&amp;quot;, IEEE Trans. Speech and Audio Proc., Vol 13, No. 2, March 2005.&lt;br /&gt;
&lt;br /&gt;
5. T.S. Verma and T.H.Y. Meng, &amp;quot;Sinusoidal modeling using frame-based perceptually weighted matching pursuits&amp;quot;, ICASSP &#039;99, pp. 981-984.&lt;/div&gt;</summary>
		<author><name>Izx</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6491</id>
		<title>User:Izx:GSoC2007</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6491"/>
		<updated>2007-03-20T19:13:39Z</updated>

		<summary type="html">&lt;p&gt;Izx: added initial body&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Ishaan Dalal == &lt;br /&gt;
&lt;br /&gt;
E-mail: [ishaand [at] gmail dot com]&lt;br /&gt;
&lt;br /&gt;
== My background ==&lt;br /&gt;
&lt;br /&gt;
Currently, I&#039;m working towards my master&#039;s degree in Electrical Engineering (EE) at the Cooper Union for the Advancement of Science and Art in New York, NY.  I graduated from Cooper with a B.E. in EE in 2006. Source coding (audio and video in particular) has always been my primary academic interest; it began in high school when I tried to discover *how* MP3 let me digitize my tape collection to such small sizes and still remain eminently listenable.&lt;br /&gt;
&lt;br /&gt;
To a great degree, this interest has dictated my upper-level and graduate coursework. Besides the usual EE undergrad curriculum, math coursework relevant to this proposal includes upper-level linear algebra and some graduate stochastics. Graduate EE courses, most of which I took when I was a junior/senior, included the theory of digital video and audio coding, adaptive filtering and multiresolution techniques (filterbanks/wavelets). Each of these also had a &amp;quot;final project&amp;quot;, which were a hierarchical DCT-based video codec, a from-scratch MIDI player and a simple subband audio codec, an embedded zerotree wavelet (EZW) image codec and non-separable 3D DWTs for image volumes (i.e. video) respectively.&lt;br /&gt;
&lt;br /&gt;
My hands-on experience with audio coding includes a simple MDCT-based codec with a rudimentary  psychoacoustic model, whose goal was primarily implementing real-time 44.1 KHz stereo decoding/playback on a 16-bit RISC microcontroller (the dsPIC). The decoder was written from scratch in PIC assembly, including sine windowing, MDCT, ATH-based. quant. and RLE/Huffman. My senior project also focused on signal processing: a software-defined RF receiver and image processor for biomedical applications (MRI), implemented on an FPGA; it won the IEEE Region 1 Student Paper Contest and was also the subject of two conference papers (http://i.zdom.org/pubs).&lt;br /&gt;
&lt;br /&gt;
I consider myself quite proficient at coding &amp;quot;math&amp;quot; in both C and octave/matlab; however, my experience with coding GUIs is very limited. I have not contributed, code-wise, to an open-source project before this.&lt;br /&gt;
&lt;br /&gt;
== Project ==&lt;br /&gt;
&lt;br /&gt;
I&#039;d like to help in developing Ghost. Considering that Ghost is still fairly nascent, I&#039;ll try to formulate what I&#039;d like to do as best as I can, after having had discussions with both Monty and JM. Ghost will probably be a &amp;quot;sinusoidal+noise&amp;quot; codec. The &amp;quot;sinusoidal&amp;quot; portion requires a method of estimating the primary sinusoids in a given frame, classifying them psychoacoustically, and then quantizing them. Both Monty and JM have/are experimenting with STFT/basis methods for estimation, and I will help them with implementing and testing their ideas, based on their advice. This could vary from e.g. testing/tweaking their basis solvers, trying out dynamic windowing, quantization, etc.&lt;br /&gt;
&lt;br /&gt;
Another sinusoidal estimation technique that seems promising, and has not been tried out yet by either Monty or JM to the best of my knowledge, is &amp;quot;Matching Pursuit&amp;quot; (MP) [1]. If given a chance, I would like to see whether MP is qualitatively and computationally feasible for sinusoidal estimation in a real-time low-latency codec. The first part would probably include testing different kinds of dictionaries - audio researchers have been successful with atoms that are damped sinusoids [2], complex exponentials [3], harmonic sinusoids. Psycho-adaptive and hybrid MP methods, such as Bark-band-type splitting followed by MP [4], both estimate and also classify to a degree, which might be computationally more efficient. Finally, there are also fast MP techniques such as [5], that use the DFT for fast correlation.&lt;br /&gt;
== Project outcomes ==&lt;br /&gt;
&lt;br /&gt;
Again, it&#039;s hard to give specific numbers here. Over the summer, I would basically like help come to a consensus about the direction Ghost will take re sinusoidal analysis, including preliminary investigations into possible approaches to improve the computational efficiency and R-D tradeoff for the chosen  analysis technique.&lt;br /&gt;
&lt;br /&gt;
== Schedule ==&lt;br /&gt;
&lt;br /&gt;
A rough schedule would be:&lt;br /&gt;
&lt;br /&gt;
May 28 - July 10: Experiment with and decide on a promising sinusoidal estimation technique&lt;br /&gt;
&lt;br /&gt;
July 11 - July 31: Figure out sinusoidal classification - deltas, psych-model picking, etc.&lt;br /&gt;
&lt;br /&gt;
August 1 - August 20: Work on efficient quantization of estimated/classified sinusoids.&lt;br /&gt;
&lt;br /&gt;
I have no other formal commitments for the summer, and expect to devote the time-equivalent of a full-time job to this project. I will be away for two weeks at some point during the summer, visiting family in California. I&#039;ve also submitted a paper to the MWSCAS 2007 conference in Montreal; if accepted, I will be there from August 5-8.&lt;br /&gt;
&lt;br /&gt;
== Why me? ==&lt;br /&gt;
&lt;br /&gt;
The ability to write good code will be a common denominator among all the applicants. What&#039;s somewhat unique about me is a broad knowledge of the underlying math and signal processing theory, which will allow me to assist with Ghost&#039;s design in a more substantial and efficient manner, than a &amp;quot;mercenary&amp;quot; coder. As a hypothetical example, I could interpret certain results and &amp;quot;fix&amp;quot; the algorithm/code, without having to be micromanaged by Monty/JM. My experience with implementing audio/signal processing algorithms in assembly and hardware also lets me easily break out of the high-level-language/floating-point mold when speed or other factors become too cumbersome.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
1. S. Mallat and Z. Zhang, &amp;quot;Matching Pursuits with time-frequency dictionaries&amp;quot;, IEEE Trans. Sig. Proc., Vol. 41, No. 12, 1993.&lt;br /&gt;
&lt;br /&gt;
2. M. Goodwin and M. Vetterli, &amp;quot;Matching Pursuit and Atomic Signal Models Based on Recursive Filter Banks&amp;quot;, IEEE Trans. Sig. Proc., Vol. 47, No. 7, July 1999&lt;br /&gt;
&lt;br /&gt;
3. R. Heusdens, et al. &amp;quot;Sinusoidal modeling of audio and speech using psychoacoustic-adaptive matching pursuits&amp;quot;. ICASSP &#039;01, pp. 3281-3284.&lt;br /&gt;
&lt;br /&gt;
4. H. Jang and S. Park, &amp;quot;Multiresolution Sinusoidal Model With Dynamic Segmentation for Timescale Modification of Polyphonic Audio Signals&amp;quot;, IEEE Trans. Speech and Audio Proc., Vol 13, No. 2, March 2005.&lt;br /&gt;
&lt;br /&gt;
5. T.S. Verma and T.H.Y. Meng, &amp;quot;Sinusoidal modeling using frame-based perceptually weighted matching pursuits&amp;quot;, ICASSP &#039;99, pp. 981-984.&lt;/div&gt;</summary>
		<author><name>Izx</name></author>
	</entry>
	<entry>
		<id>https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6479</id>
		<title>User:Izx:GSoC2007</title>
		<link rel="alternate" type="text/html" href="https://wiki.xiph.org/index.php?title=User:Izx:GSoC2007&amp;diff=6479"/>
		<updated>2007-03-18T01:57:35Z</updated>

		<summary type="html">&lt;p&gt;Izx: created placeholder text&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Placeholder for extended GSoC 2007 Proposal&lt;/div&gt;</summary>
		<author><name>Izx</name></author>
	</entry>
</feed>