Opus tuning

From XiphWiki

Latest revision as of 06:44, 14 November 2016

This page is meant to provide help on tuning the current Opus encoder. It assumes you are working on the master branch, not 1.0.x.

As work progresses, some info (especially any line numbers) may become outdated.

The numbers in parentheses represent (in order):

  • Impact on quality
  • Quality of current tuning
  • Difficulty

Parameters

These are some parameters that can be tuned to improve the Opus encoder quality (in no particular order).

Tonality (10 5 1)

Around line 1276 of celt_encoder.c:

       tonal_target = target + (opus_int32)((coded_bins << BITRES) * 1.2f * tonal);

Replacing the constant 1.2f will change the impact of tonality on the bit-rate. The higher the value, the higher the bit-rate of tonal frames/samples.
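
To get a feel for the size of the effect before editing the encoder, the arithmetic of that line can be reproduced in isolation. The following is a minimal sketch, not encoder code: the helper name tonal_target_sketch() and its tonal_gain parameter are hypothetical and stand in for the hard-coded 1.2f, plain int32_t replaces opus_int32, and target is assumed to be expressed in the same 1/8 bit units implied by BITRES (3 in the Opus sources).

```c
#include <assert.h>
#include <stdint.h>

/* BITRES is 3 in the Opus sources, so bit counts are in 1/8 bit units. */
#define BITRES 3

/* Hypothetical helper: same arithmetic as the line above, with the
   1.2f constant exposed as tonal_gain so that values can be compared. */
static int32_t tonal_target_sketch(int32_t target, int coded_bins,
                                   float tonal, float tonal_gain)
{
    assert(coded_bins >= 0);
    return target + (int32_t)((coded_bins << BITRES) * tonal_gain * tonal);
}
```

For example, with target = 1000, coded_bins = 100 and tonal = 0.5, a constant of 1.2 gives 1480, while 2.0 gives 1800 (both in 1/8 bit units).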

Trim (7 7 8)

Trim controls the bit allocation balance between low and high frequencies.

This is currently computed by alloc_trim_analysis() around line 780 of celt_encoder.c.

A high trim value signals more bits to the low frequencies, while a lower trim value signals more bits to the high frequencies.

DynAlloc (5 5 9)

Dynamic allocation is the part of the Opus bitstream that makes it possible to increase the allocation of any band(s).

This is currently computed in dynalloc_analysis() around line 956 of celt_encoder.c.

Stereo Saving (4 6 6)

Stereo saving controls the bitrate reduction applied for stereo signals with a narrow image.

It is computed in alloc_trim_analysis(), around line 780 of celt_encoder.c. It is applied in celt_encode_with_ec(), around line 1329 of celt_encoder.c.

Spreading (4 7 7)

See spreading_decision() in bands.c, around line 413.

Tapset (1 4 9)

See spreading_decision() in bands.c, around line 413.

Transient Estimator (10 8 8)

See transient_analysis() in celt_encoder.c, around line 209.

Transient Boost (8 6 6)

See transient_analysis() in celt_encoder.c, around line 209.

TF Estimator (6 6 9)

See tf_analysis() in celt_encoder.c, around line 475.

Intensity Threshold (7 8 3)

The intensity threshold is the first band to be coded as intensity stereo. When a band is intensity-coded, the coding noise in that band is lower, at the expense of a narrower stereo image.

In celt_encoder.c, around line 1492:

     static const opus_val16 intensity_thresholds[21]=
     /* 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19  20  off*/
       { 16,21,23,25,27,29,31,33,35,38,42,46,50,54,58,63,68,75,84,102,130};
     static const opus_val16 intensity_histeresis[21]=
       {  2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5, 6,  8, 12};

The intensity_thresholds[] table defines the intensity threshold as a function of the effective bitrate. For example, between 63 and 68 kb/s effective rate, the intensity threshold is 16. Above 130 kb/s, intensity stereo is completely disabled.

Note that for 20 ms frames, the effective rate is equal to the actual rate minus 4 kb/s.
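
The table lookup can be tried outside the encoder. The sketch below is an illustration under stated assumptions: the two tables are copied from the listing above, but the helper names effective_rate_20ms() and intensity_lookup_sketch() are hypothetical, plain int replaces opus_val16, and the way the hysteresis margins are applied (a change of band is accepted only once the rate clears the neighbouring threshold by its margin) is an assumption that should be checked against the source.

```c
#include <assert.h>

#define NB_THRESHOLDS 21

/* Tables copied from the listing above (effective rate in kb/s). */
static const int intensity_thresholds[NB_THRESHOLDS] =
  { 16,21,23,25,27,29,31,33,35,38,42,46,50,54,58,63,68,75,84,102,130};
static const int intensity_histeresis[NB_THRESHOLDS] =
  {  2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5, 6,  8, 12};

/* Effective rate for 20 ms frames only, per the note above. */
static int effective_rate_20ms(int actual_kbps)
{
    return actual_kbps - 4;
}

/* Hypothetical lookup: returns the intensity threshold (first
   intensity-coded band) for an effective rate, given the value used in
   the previous frame. A result of NB_THRESHOLDS means "off". */
static int intensity_lookup_sketch(int rate, int prev)
{
    int i;
    assert(prev >= 0 && prev <= NB_THRESHOLDS);
    /* Find the first table entry that the rate is below. */
    for (i = 0; i < NB_THRESHOLDS; i++)
        if (rate < intensity_thresholds[i])
            break;
    /* Assumed hysteresis: keep the previous value unless the rate has
       moved past the relevant threshold by more than its margin. */
    if (i > prev && rate < intensity_thresholds[prev] + intensity_histeresis[prev])
        i = prev;
    if (i < prev && rate > intensity_thresholds[prev - 1] - intensity_histeresis[prev - 1])
        i = prev;
    return i;
}
```

With these assumptions, an effective rate of 65 kb/s yields 16 when the previous value was already 16, but stays at 15 when the previous value was 15, because 65 is still inside the 3 kb/s margin above the 63 kb/s entry. An effective rate of 150 kb/s returns 21, the "off" column.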

Skip Threshold (3 6 4)

In interp_bits2pulses() in rate.c, around line 356:

           if (codedBands<=start+2 || (band_bits > ((j<prev?7:9)*band_width<<LM<<BITRES)>>4 && j<=signalBandwidth))

The important constants are the 7 and the 9. They determine the minimum allocation for a band to be coded, with hysteresis. The units are 1/16 bit per sample.

Decreasing these values increases the quality of the highest frequency bands, at the expense of all other bands.

The code above means that if a band was coded in the previous frame, it needs more than 7/16 bit per sample to be coded in this frame, but if it wasn't coded, it needs more than 9/16 bit per sample. If a band isn't coded, its contents are replaced by a copy of the lower MDCT spectrum, or by noise.
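
The threshold part of that condition can be isolated to check how a change to the 7 or the 9 translates into bits. This is a minimal sketch with hypothetical helper names. It covers only the band_bits comparison and leaves out the codedBands<=start+2 and j<=signalBandwidth terms. It assumes that BITRES is 3 (allocations in 1/8 bit units), that band_width counts bins at the shortest frame size, and that LM is the log2 frame-size scale, which is what makes the result come out in 1/16 bit per sample.

```c
#include <assert.h>

/* BITRES is 3 in the Opus sources: allocations are in 1/8 bit units. */
#define BITRES 3

/* Hypothetical helper: minimum allocation (in 1/8 bit units) that a band
   must exceed in order to be coded. coded_in_prev_frame plays the role
   of (j<prev) in the condition above. */
static int skip_threshold_sketch(int band_width, int LM, int coded_in_prev_frame)
{
    assert(band_width > 0 && LM >= 0);
    return ((coded_in_prev_frame ? 7 : 9) * band_width << LM << BITRES) >> 4;
}

/* Hypothetical helper: nonzero if band_bits clears the threshold. */
static int band_passes_skip_test(int band_bits, int band_width, int LM,
                                 int coded_in_prev_frame)
{
    return band_bits > skip_threshold_sketch(band_width, LM, coded_in_prev_frame);
}
```

For a band of 8 bins at LM = 3 (64 coefficients), the threshold is 224 units (28 bits, or 7/16 bit per sample) if the band was coded in the previous frame, and 288 units (36 bits, or 9/16 bit per sample) otherwise. An allocation of 250 units therefore keeps an already-coded band alive but is not enough to start coding a band that was skipped.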

Mode/Bandwidth Decisions (6 4 4)