Ghost

Revision as of 18:26, 22 December 2005

This page is meant to track ideas about low-delay, high-quality audio coding. The work has just started, so don't expect anything in the near future (or at all for that matter).

Signal types

There are many signal types that can be found:

  • Sinusoids
    • A few pure (or nearly pure) tones
  • Harmonic
    • Periodic waveforms (e.g. voice)
    • Many (sometimes closely spaced) harmonics
  • Shaped noise
    • Signals that are (or are indistinguishable from) filtered (coloured) white noise
  • Transients
    • Whatever doesn't fit above, I guess

Signal analysis

Sinusoidal

Good when most of the energy is contained in a few sinusoids. May be problematic for very harmonic signals, e.g. a male voice may have close to a hundred harmonics in the full audio band.

Pitch

Good for harmonic signals. Hard to estimate and code when extra sinusoids and noise are present. At 48 kHz, there is no need for fractional pitch or anything like that, but sub-band pitch analysis or multi-tap gain is a good idea. Also, there needs to be a way to remove the effect of sinusoids and noise. Even then, removing the "noise" also removes all excitation to the pitch predictor, so that's a problem.
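
To give a feel for the integer-lag case, here is a minimal sketch of an open-loop pitch search that picks the integer lag with the highest normalised correlation and returns a single-tap gain. It is only an illustration, not a method these notes commit to; the name `best_integer_lag` and the search range are hypothetical. For scale: at 48 kHz a 100 Hz pitch is a lag of 480 samples, so moving the lag by one sample changes the pitch by roughly 0.2%.

```python
import math

def best_integer_lag(x, min_lag, max_lag):
    """Return (lag, gain) maximising the normalised correlation
    between x[n] and x[n - lag]. Requires len(x) > max_lag."""
    best_lag, best_gain, best_score = min_lag, 0.0, -1.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the current signal with its delayed copy.
        num = sum(x[n] * x[n - lag] for n in range(max_lag, len(x)))
        den = sum(x[n - lag] ** 2 for n in range(max_lag, len(x)))
        if den <= 0.0:
            continue
        # Only positive correlations are useful for a pitch predictor.
        score = num * num / den if num > 0.0 else 0.0
        if score > best_score:
            best_lag, best_gain, best_score = lag, num / den, score
    return best_lag, best_gain

# Toy harmonic signal: 100 Hz fundamental plus four harmonics at 48 kHz.
fs = 48000
signal = [sum(math.sin(2 * math.pi * k * 100 * n / fs) / k for k in range(1, 6))
          for n in range(1560)]
lag, gain = best_integer_lag(signal, 200, 600)
```

A multi-tap or sub-band variant would replace the single gain with one gain per tap or per band. This sketch also works on a clean signal and ignores the problem of extra sinusoids and noise.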

MDCT

Very general. Can code anything, but not very good at anything. High delay (2x frame size). Could put several "MDCT frames" in each codec frame to make latency smaller.
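
The delay comes from the overlap: each block of N new coefficients needs a window of 2N input samples. The sketch below is a direct (slow, O(N^2)) MDCT/IMDCT pair with a sine window, written only to make that structure visible; the function names are hypothetical and a real codec would use an FFT-based implementation. Using a smaller N for each transform shortens the window, which is the idea behind putting several "MDCT frames" in one codec frame.

```python
import math

def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def _basis(n, k, N):
    return math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))

def mdct(block, N):
    """2N input samples -> N coefficients (window applied here)."""
    w = sine_window(N)
    return [sum(w[n] * block[n] * _basis(n, k, N) for n in range(2 * N))
            for k in range(N)]

def imdct(coefs, N):
    """N coefficients -> 2N windowed, time-aliased output samples."""
    w = sine_window(N)
    return [2.0 / N * w[n] * sum(coefs[k] * _basis(n, k, N) for k in range(N))
            for n in range(2 * N)]

def analysis_synthesis(x, N):
    """Run MDCT then IMDCT with hop size N and overlap-add the blocks."""
    out = [0.0] * len(x)
    for start in range(0, len(x) - 2 * N + 1, N):
        y = imdct(mdct(x[start:start + 2 * N], N), N)
        for n in range(2 * N):
            out[start + n] += y[n]
    return out

N = 8
x = [math.sin(0.3 * n) + 0.1 * n for n in range(5 * N)]
y = analysis_synthesis(x, N)
# Samples covered by two overlapping blocks are reconstructed;
# the first and last N samples are not, since they lack a neighbour.
max_err = max(abs(y[n] - x[n]) for n in range(N, 4 * N))
```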

Wavelets

Just a fancy name for sub-bands with non-uniform width. Probably similar to having an MDCT with few sub-bands, except that the sub-bands could follow (roughly) the critical bands.
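
As an illustration of the non-uniform widths, the sketch below lists the nominal band edges of a plain octave-band (dyadic) wavelet tree, assuming ideal brick-wall splitting filters. The function name is hypothetical. Octave spacing is only a rough stand-in for critical bands; a tree that follows them more closely would split some of the bands further.

```python
def dyadic_band_edges(sample_rate, levels):
    """Nominal (low, high) band edges in Hz of an octave-band wavelet
    tree with the given number of splits, listed from low to high."""
    hi = sample_rate / 2.0          # start from the Nyquist frequency
    edges = []
    for _ in range(levels):
        lo = hi / 2.0               # each split halves the low band
        edges.append((lo, hi))
        hi = lo
    edges.append((0.0, hi))         # the remaining lowpass band
    return edges[::-1]

bands = dyadic_band_edges(48000, 3)
```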

LPC + stochastic cb

Like CELP with no pitch. Could be used to code the noisy part of the signal at a low bit-rate. Would need to figure out how to preserve the energy of the noise when going to 1/2 bit per sample or less.
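
The energy problem can be seen in a small sketch. A least-squares (waveform-matching) gain never produces more energy than the target, and it produces much less when the codebook vector matches poorly, which is the usual situation at very low rates. One possible workaround, shown here purely as an assumption and not as the method these notes settle on, is to pick the gain that matches the target energy instead. All names are hypothetical.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def ls_gain(target, code):
    """Gain minimising the squared error between target and gain * code."""
    return dot(target, code) / dot(code, code)

def energy_gain(target, code):
    """Gain making gain * code carry the same energy as the target."""
    return math.sqrt(dot(target, target) / dot(code, code))

def energy(v, gain=1.0):
    return gain * gain * dot(v, v)

target = [1.0, -2.0, 0.5, 3.0]      # toy noise-like residual
code = [1.0, 1.0, -1.0, 1.0]        # toy stochastic codebook entry
e_target = energy(target)
e_ls = energy(code, ls_gain(target, code))
e_match = energy(code, energy_gain(target, code))
```

With this toy data the least-squares gain keeps only a small fraction of the target energy, while the energy-matching gain keeps all of it at the cost of a larger waveform error.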

Codec Structure Ideas

Sinusoidal + wavelet

  • Preemphasis
  • Extract as many sinusoids as possible
  • Wavelet transform
  • Code wavelet coefs using VQ
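
For the last step, a minimal sketch of vector quantisation by exhaustive nearest-neighbour search is given below. The codebook is a made-up toy, not a trained one, and the function names are hypothetical; a practical coder would need a trained and probably structured codebook.

```python
def vq_encode(vec, codebook):
    """Return the index of the codeword closest to vec (squared error)."""
    def dist(i):
        return sum((v - c) ** 2 for v, c in zip(vec, codebook[i]))
    return min(range(len(codebook)), key=dist)

def vq_decode(index, codebook):
    """Return a copy of the codeword stored at the given index."""
    return list(codebook[index])

# Toy 2-dimensional codebook with four entries (2 bits per vector).
toy_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
index = vq_encode([0.9, 0.2], toy_codebook)
decoded = vq_decode(index, toy_codebook)
```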

Sinusoidal, pitch and noise

  • Preemphasis
  • Joint pitch + sinusoidal estimation
  • LPC analysis
  • CELP-like coding of the residual (mainly noise)
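
Both structures above start with preemphasis. A minimal sketch of a first-order preemphasis filter and its inverse follows; the coefficient 0.9 is an arbitrary placeholder, not a value taken from these notes, and the function names are hypothetical.

```python
def preemphasis(x, alpha=0.9):
    """y[n] = x[n] - alpha * x[n-1]: boosts high frequencies."""
    out, prev = [], 0.0
    for s in x:
        out.append(s - alpha * prev)
        prev = s
    return out

def deemphasis(y, alpha=0.9):
    """x[n] = y[n] + alpha * x[n-1]: undoes preemphasis at the decoder."""
    out, prev = [], 0.0
    for s in y:
        prev = s + alpha * prev
        out.append(prev)
    return out

samples = [1.0, 2.0, 3.0, -1.0, 0.5]
round_trip = deemphasis(preemphasis(samples))
```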

Estimation Ideas

Sinusoid Estimation

Very hard to do, especially with reasonable complexity and low delay. Some ideas:

  • Least-square type matching
  • Phase lock loop (PLL)