Difference between revisions of "CEFT"

From XiphWiki
m (CEFT: expand acronym once on this page...)
m (Promoted everything up a level heading (Level 1 is the page title))

Revision as of 09:03, 23 August 2015

Constrained-Energy Fourier Transforms

Problems with current implementation and possible solutions

Overlapped FFT not critically sampled

In the current implementation, we encode 4/3 as many samples as necessary: we use 256-point FFTs with 64 samples of overlap, so each frame advances by only 192 new samples (256/192 = 4/3).
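The overhead figure can be checked with a couple of lines of arithmetic. This is only a sketch; the function name `oversampling_ratio` is made up for illustration and is not part of any CEFT code.

```python
from fractions import Fraction


def oversampling_ratio(fft_size: int, overlap: int) -> Fraction:
    """Values encoded per new input sample for an overlapped FFT.

    A real-input FFT of `fft_size` points carries `fft_size` real values,
    but each frame only advances by `fft_size - overlap` new samples.
    """
    hop = fft_size - overlap  # new samples consumed per frame
    if hop <= 0:
        raise ValueError("overlap must be smaller than the FFT size")
    return Fraction(fft_size, hop)


# Figures quoted on this page: 256-point FFT, 64 samples overlap.
print(oversampling_ratio(256, 64))  # 4/3
```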

Ideas:

  • Use an MDCT instead of the FFT
  • Extrapolate the input and use a wider FFT, then optimise the search only over the "real" (non-extrapolated) samples
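To illustrate why the MDCT idea addresses the overhead, here is a minimal pure-Python sketch of a windowed MDCT round trip. It is not CEFT code: the sine window, the tiny frame size, the zero padding at the edges and all function names are assumptions chosen for the example. Each frame of 2N samples yields only N coefficients, with a hop of N samples, and overlap-add of the inverse transforms cancels the time-domain aliasing.

```python
import math


def sine_window(n_coeffs):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    length = 2 * n_coeffs
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(frame, window):
    """Windowed MDCT: 2N input samples -> N coefficients."""
    n_coeffs = len(frame) // 2
    n0 = 0.5 + n_coeffs / 2
    return [
        sum(window[n] * frame[n]
            * math.cos(math.pi / n_coeffs * (n + n0) * (k + 0.5))
            for n in range(2 * n_coeffs))
        for k in range(n_coeffs)
    ]


def imdct(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N aliased samples."""
    n_coeffs = len(coeffs)
    n0 = 0.5 + n_coeffs / 2
    return [
        window[n] * (2.0 / n_coeffs)
        * sum(coeffs[k] * math.cos(math.pi / n_coeffs * (n + n0) * (k + 0.5))
              for k in range(n_coeffs))
        for n in range(2 * n_coeffs)
    ]


def mdct_roundtrip(signal, n_coeffs):
    """Analyse and resynthesise `signal` (length a multiple of N).

    Returns the reconstructed signal and the number of coefficients used.
    """
    window = sine_window(n_coeffs)
    # Pad N zeros on each side so every real sample is covered by two frames.
    padded = [0.0] * n_coeffs + list(signal) + [0.0] * n_coeffs
    output = [0.0] * len(padded)
    total_coeffs = 0
    for start in range(0, len(padded) - n_coeffs, n_coeffs):  # hop of N
        coeffs = mdct(padded[start:start + 2 * n_coeffs], window)
        total_coeffs += len(coeffs)
        for n, value in enumerate(imdct(coeffs, window)):
            output[start + n] += value  # overlap-add cancels the aliasing
    return output[n_coeffs:-n_coeffs], total_coeffs


signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(32)]
rebuilt, used = mdct_roundtrip(signal, 8)
print(used)  # 40: 32 samples plus one extra 8-coefficient frame for the edges
print(max(abs(a - b) for a, b in zip(signal, rebuilt)))
```

In steady state this costs N coefficients per N new samples, i.e. critical sampling, compared with 256 values per 192 new samples for the overlapped FFT described above.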

Non-harmonic signals (e.g. music)

CEFT currently works well only on speech, because most of its coding efficiency comes from the pitch predictor.

Ideas:

  • Sinusoidal prediction
  • Use two (or more) pitch periods and choose one for each bin/band/whatever
  • Use two (or more) pitch periods at the same time and use energy conservation to keep everything stable.
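One possible reading of the last idea is sketched below; the page does not say how the periods would be combined, so everything here is an assumption. The sketch mixes the predictions from two pitch periods with fixed gains and then rescales the mix to a target energy, so that the prediction cannot grow or decay on its own whatever gains are chosen. The names `delayed`, `mixed_prediction`, `gains` and `target_energy` are hypothetical.

```python
import math


def energy(x):
    """Sum of squares of a sequence."""
    return sum(v * v for v in x)


def delayed(history, period, length):
    """Predict `length` samples by repeating the last `period` samples."""
    start = len(history) - period
    return [history[start + (n % period)] for n in range(length)]


def mixed_prediction(history, periods, gains, length, target_energy):
    """Mix several pitch predictions, then force the mix to a fixed energy.

    Assumption for this sketch: `target_energy` is known separately
    (for example, coded as side information).
    """
    mix = [0.0] * length
    for period, gain in zip(periods, gains):
        for n, value in enumerate(delayed(history, period, length)):
            mix[n] += gain * value
    mix_energy = energy(mix)
    if mix_energy == 0.0:
        return mix  # nothing to rescale
    scale = math.sqrt(target_energy / mix_energy)
    return [scale * v for v in mix]


# Toy "two-note" signal with periods of 20 and 31 samples.
history = [math.sin(2 * math.pi * n / 20) + 0.5 * math.sin(2 * math.pi * n / 31)
           for n in range(200)]
prediction = mixed_prediction(history, [20, 31], [0.6, 0.4], 16, 4.0)
print(energy(prediction))
```

The same helper could serve the "choose one for each bin/band" variant by passing a single period per band, but that selection logic is not shown here.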

Sparse spectrum

CEFT tends to produce musical noise, especially at high frequencies, where very few bits are available per bin.

Ideas:

  • Use a "rotation matrix"
  • Prediction from lower frequencies
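The "rotation matrix" idea is not spelled out on this page. As a rough illustration only, the sketch below applies a chain of 2-D rotations to adjacent bins: a single isolated pulse gets smeared over its neighbours, the total energy is unchanged because each rotation is orthogonal, and the decoder can undo it exactly. The function name `rotate_pairs` and the angle of 0.4 rad are arbitrary choices for the example.

```python
import math


def rotate_pairs(vec, angle, inverse=False):
    """Apply the same 2-D rotation to each adjacent pair of bins in turn.

    With inverse=True the rotations are undone in the opposite order,
    which restores the original vector.
    """
    c, s = math.cos(angle), math.sin(angle)
    if inverse:
        s = -s
    out = list(vec)
    indices = range(len(out) - 1)
    if inverse:
        indices = reversed(indices)
    for i in indices:
        a, b = out[i], out[i + 1]
        out[i] = c * a - s * b
        out[i + 1] = s * a + c * b
    return out


sparse = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # one coded pulse
spread = rotate_pairs(sparse, 0.4)
restored = rotate_pairs(spread, 0.4, inverse=True)

print(sum(1 for v in spread if abs(v) > 1e-12))  # bins now carrying energy
print(sum(v * v for v in spread))                # energy is preserved
```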