Opus tuning

From XiphWiki

This page is meant to provide help on tuning the current Opus encoder. It assumes you are working on the master branch, not 1.0.x.

As work progresses, some info (especially any line numbers) may become outdated.

The numbers in parentheses represent (in order):

  • Impact on quality
  • Quality of current tuning
  • Difficulty


These are some parameters that can be tuned to improve the Opus encoder quality (in no particular order).

Tonality (10 5 1)

Around line 1276 of celt_encoder.c:

       tonal_target = target + (opus_int32)((coded_bins << BITRES) * 1.2f * tonal);

Changing the constant 1.2f changes how strongly tonality affects the bit-rate: the higher the value, the higher the bit-rate of tonal frames/samples.
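To experiment with this constant, it can help to isolate the arithmetic. The following is a minimal sketch, not code from the encoder: tonal_target_sketch() and its tonal_gain parameter are hypothetical names, opus_int32 is replaced by int32_t so the snippet builds on its own, and BITRES is assumed to be 3 (bit counts kept in 1/8 bit units).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the encoder keeps bit counts in 1/8 bit units (BITRES = 3). */
#define BITRES 3

/* Hypothetical helper: same arithmetic as the line quoted above, with the
   1.2f constant exposed as the tonal_gain parameter. */
static int32_t tonal_target_sketch(int32_t target, int coded_bins,
                                   float tonal, float tonal_gain)
{
    assert(coded_bins >= 0);
    /* The adjustment scales with the number of coded bins and with the
       tonality estimate; the result is truncated to an integer. */
    return target + (int32_t)((coded_bins << BITRES) * tonal_gain * tonal);
}
```

With tonal equal to 0 the target is returned unchanged, whatever the gain. For a positive tonal value, raising tonal_gain raises the returned target in proportion to coded_bins.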

Trim (7 7 8)

Trim controls the bit allocation balance between low and high frequencies.

This is currently computed by alloc_trim_analysis() around line 780 of celt_encoder.c.

A higher trim value shifts bits toward the low frequencies, while a lower trim value shifts bits toward the high frequencies.

DynAlloc (5 5 9)

Dynamic allocation is the part of the Opus bitstream that makes it possible to increase the allocation of any band(s).

This is currently computed in dynalloc_analysis() around line 956 of celt_encoder.c.

Stereo Saving (4 6 6)

Stereo saving controls the bitrate reduction applied for stereo signals with a narrow image.

It is computed in alloc_trim_analysis(), around line 780 of celt_encoder.c. It is applied in celt_encode_with_ec(), around line 1329 of celt_encoder.c.

Spreading (4 7 7)

See spreading_decision(), around line 413 of bands.c.

Tapset (1 4 9)

See spreading_decision(), around line 413 of bands.c.

Transient Estimator (10 8 8)

See transient_analysis(), around line 209 of celt_encoder.c.

Transient Boost (8 6 6)

See transient_analysis(), around line 209 of celt_encoder.c.

TF Estimator (6 6 9)

See tf_analysis(), around line 475 of celt_encoder.c.

Intensity Threshold (7 8 3)

The intensity threshold is the first band to be coded as intensity stereo. When a band is intensity-coded, the coding noise in that band is lower, at the expense of a narrower stereo image.

In celt_encoder.c, around line 1492:

     static const opus_val16 intensity_thresholds[21]=
     /* 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19  20  off*/
       { 16,21,23,25,27,29,31,33,35,38,42,46,50,54,58,63,68,75,84,102,130};
     static const opus_val16 intensity_histeresis[21]=
       {  2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5, 6,  8, 12};

The intensity_thresholds[] table defines the intensity threshold as a function of the effective bitrate. For example, between 63 and 68 kb/s effective rate, the intensity threshold is 16. Above 130 kb/s, intensity stereo is completely disabled.

Note that for 20 ms frames, the effective rate is equal to the actual rate minus 4 kb/s.
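The table lookup described above can be sketched as follows. This is an illustration under stated assumptions, not the encoder's own routine: intensity_lookup_sketch() is a hypothetical name, opus_val16 is replaced by float so the snippet builds standalone, and the way the intensity_histeresis[] table is applied (keep the previous decision until the rate has moved past the boundary by the listed margin) is one plausible reading that may differ from the actual code.

```c
#include <assert.h>

#define N_THRESHOLDS 21

/* Tables copied from the listing above (float stands in for opus_val16). */
static const float intensity_thresholds[N_THRESHOLDS] =
  { 16,21,23,25,27,29,31,33,35,38,42,46,50,54,58,63,68,75,84,102,130 };
static const float intensity_histeresis[N_THRESHOLDS] =
  {  2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5, 6,  8, 12 };

/* Hypothetical helper: maps an effective rate in kb/s to an intensity
   threshold in 0..21, where 21 means intensity stereo is off.
   prev is the value chosen for the previous frame. */
static int intensity_lookup_sketch(float effective_kbps, int prev)
{
    int i;
    assert(prev >= 0 && prev <= N_THRESHOLDS);
    /* Find the first table entry that the rate is below. */
    for (i = 0; i < N_THRESHOLDS; i++)
        if (effective_kbps < intensity_thresholds[i])
            break;
    /* Assumed hysteresis: do not move up unless the rate clears the
       boundary above prev by its margin... */
    if (i > prev &&
        effective_kbps < intensity_thresholds[prev] + intensity_histeresis[prev])
        i = prev;
    /* ...and do not move down unless it drops below the boundary under
       prev by its margin. */
    if (i < prev &&
        effective_kbps > intensity_thresholds[prev - 1] - intensity_histeresis[prev - 1])
        i = prev;
    return i;
}
```

Under these assumptions, an effective rate of 65 kb/s with a previous value of 16 stays at 16, matching the example in the text, and any rate above 130 kb/s with a previous value of 21 returns 21 (off). The margins only matter near a boundary: with a previous value of 16, a rate of 69 kb/s still returns 16, while 73 kb/s moves to 17.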

Skip Threshold (3 6 4)

In interp_bits2pulses(), around line 356 of rate.c:

           if (codedBands<=start+2 || (band_bits > ((j<prev?7:9)*band_width<<LM<<BITRES)>>4 && j<=signalBandwidth))

The important constants are the 7 and the 9. They determine the minimum allocation for a band to be coded, with hysteresis. The units are 1/16 bit per sample.

Decreasing these values increases the quality of the highest frequency bands, at the expense of all other bands.

The code above means that a band that was coded in the previous frame needs more than 7/16 bit per sample to be coded in this frame, while a band that was not coded needs more than 9/16 bit per sample. If a band isn't coded, its contents are replaced by a copy of the lower MDCT spectrum, or by noise.
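The threshold arithmetic can be checked in isolation. The sketch below assumes BITRES is 3 (bit counts in 1/8 bit units), that band_width is the band's width in bins at the shortest frame size, and that LM is the log2 of the frame size multiplier, as the shifts in the expression suggest. skip_threshold_sketch() is a hypothetical name, not a function in rate.c.

```c
#include <assert.h>

/* Assumption: bit counts are kept in 1/8 bit units (BITRES = 3). */
#define BITRES 3

/* Hypothetical helper: minimum allocation, in 1/8 bit units, that a band
   must exceed to be coded. coded_in_prev_frame plays the role of the
   (j<prev) test in the quoted expression. */
static int skip_threshold_sketch(int band_width, int LM, int coded_in_prev_frame)
{
    assert(band_width > 0 && LM >= 0);
    /* 7/16 or 9/16 bit per sample, times the band_width<<LM samples in
       the band, expressed in 1/8 bit units. */
    return ((coded_in_prev_frame ? 7 : 9) * band_width << LM << BITRES) >> 4;
}
```

For example, a band with band_width 8 and LM 3 holds 64 samples. If it was coded in the previous frame the threshold is 224 eighth-bits (28 bits, or 7/16 bit per sample); otherwise it is 288 eighth-bits (36 bits, or 9/16 bit per sample).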

Mode/Bandwidth Decisions (6 4 4)