Opus tuning

From XiphWiki

Revision as of 21:52, 4 February 2013

This page is meant to provide help on tuning the current Opus encoder. It assumes you are working on the master branch, not 1.0.x. As work progresses, some info (especially any line numbers) may become outdated. The numbers in parentheses represent (in order):

  • Impact on quality
  • Quality of current tuning
  • Difficulty

Parameters

These are some parameters that can be tuned to improve the quality of the Opus encoder (in no particular order):

tonality (10 5 1)

around line 1574 of celt_encoder.c:

       tonal_target = target + (opus_int32)((coded_bins<<BITRES)*1.2f*tonal);

Replacing the constant 1.2f changes the impact of tonality on the bit-rate. The higher the value, the higher the bit-rate of tonal frames/samples.
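To get a feel for how much this constant moves the target, the arithmetic of that line can be tried outside the encoder. The following is only a sketch: tonal_target_sketch() and tonal_gain are hypothetical names, plain int stands in for opus_int32, and BITRES is assumed to be 3 (bit counts kept in 1/8-bit units).

```c
#include <assert.h>

#define BITRES 3   /* assumption: bit counts are kept in 1/8-bit units */

/* Hypothetical helper, not part of the encoder. It repeats the arithmetic
   of the tonal_target line, with the 1.2f constant exposed as tonal_gain
   so that candidate values can be compared side by side. */
static int tonal_target_sketch(int target, int coded_bins,
                               float tonal, float tonal_gain)
{
   assert(coded_bins >= 0);
   /* The boost scales with the number of coded bins and with tonality. */
   return target + (int)((coded_bins << BITRES) * tonal_gain * tonal);
}
```

Under these assumptions, a frame with coded_bins = 100 and tonal = 0.5 gets 480 extra units (60 bits) with the 1.2f constant, and 800 units (100 bits) if the constant is raised to 2.0f.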

trim (7 7 8)

Trim controls the bit allocation balance between low and high frequencies. This is currently computed by alloc_trim_analysis() around line 677 of celt_encoder.c. A higher trim value gives more bits to the low frequencies, while a lower trim value gives more bits to the high frequencies.

dynalloc (5 5 9)

Dynamic allocation is the part of the Opus bitstream that makes it possible to increase the allocation of any band(s). This is currently computed in dynalloc_analysis() around line 810 of celt_encoder.c.

stereo saving (4 6 6)

Stereo saving controls the bitrate reduction applied for stereo signals with a narrow image. It is computed in alloc_trim_analysis(), around line 733 of celt_encoder.c. It is applied in celt_encode_with_ec(), around line 1558 of celt_encoder.c.

spreading (4 7 7)

See spreading_decision() in bands.c, around line 413.

tapset (1 4 9)

See spreading_decision() in bands.c, around line 413.

transient estimator (10 8 8)

See transient_analysis() in celt_encoder.c, around line 209.

transient boost (8 6 6)

See transient_analysis() in celt_encoder.c, around line 209.

tf estimator (6 6 9)

tf_analysis() in celt_encoder.c around line 475.

Intensity threshold (7 8 3)

The intensity threshold is the first band to be coded as intensity stereo. When a band is intensity-coded, the coding noise in that band is lower, at the expense of a narrower stereo image. In celt_encoder.c around line 1492:

     static const opus_val16 intensity_thresholds[21]=
     /* 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19  20  off*/
       { 16,21,23,25,27,29,31,33,35,38,42,46,50,54,58,63,68,75,84,102,130};
     static const opus_val16 intensity_histeresis[21]=
       {  2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5, 6,  8, 12};

The intensity_thresholds[] table defines the intensity threshold as a function of the effective bitrate. For example, between 63 and 68 kb/s effective rate, the intensity threshold is 16. Above 130 kb/s, intensity stereo is completely disabled. Note that for 20 ms frames, the effective rate is equal to the actual rate, minus 4 kb/s.
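The table lookup described above can be sketched as follows. This is an illustration, not the encoder's code: intensity_lookup() and effective_rate_20ms() are hypothetical helpers, plain int stands in for opus_val16, rates are in kb/s, and the way the hysteresis table is applied is an assumption (one plausible reading of how such a table would be used). Index 21 stands for the "off" column of the table comment.

```c
#include <assert.h>

/* Tables copied from the excerpt above; int stands in for opus_val16. */
static const int intensity_thresholds[21] =
   { 16,21,23,25,27,29,31,33,35,38,42,46,50,54,58,63,68,75,84,102,130};
static const int intensity_histeresis[21] =
   {  2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5, 6,  8, 12};

/* Hypothetical helper: for 20 ms frames, effective rate = actual rate - 4 kb/s. */
static int effective_rate_20ms(int actual_kbps)
{
   return actual_kbps - 4;
}

/* Hypothetical lookup: returns the first index whose threshold exceeds the
   effective rate (21 means intensity stereo is off). prev is the index
   chosen for the previous frame; the decision only moves away from prev
   once the rate clears the neighbouring threshold by the hysteresis margin
   (assumed behaviour). */
static int intensity_lookup(int rate_kbps, int prev)
{
   int i;
   assert(prev >= 0 && prev <= 21);
   for (i = 0; i < 21; i++)
      if (rate_kbps < intensity_thresholds[i])
         break;
   /* Moving up: stay on prev until the rate exceeds threshold + margin. */
   if (i > prev && rate_kbps < intensity_thresholds[prev] + intensity_histeresis[prev])
      i = prev;
   /* Moving down: stay on prev until the rate drops below threshold - margin. */
   if (i < prev && rate_kbps > intensity_thresholds[prev-1] - intensity_histeresis[prev-1])
      i = prev;
   return i;
}
```

With this sketch, an actual rate of 69 kb/s at 20 ms gives an effective rate of 65 kb/s, which maps to 16 when the previous frame also used 16. A frame that previously used 15 stays at 15 until the effective rate reaches 66 kb/s.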

skip threshold (3 6 4)

In function interp_bits2pulses() in rate.c around line 356.

           if (codedBands<=start+2 || (band_bits > ((j<prev?7:9)*band_width<<LM<<BITRES)>>4 && j<=signalBandwidth))

The important constants are the 7 and the 9. They determine the minimum allocation for a band to be coded, with hysteresis. The units are 1/16 bit per sample. The code above means that if a band was coded in the previous frame, it needs 7/16 bit per sample in this frame, but if it wasn't coded, it needs 9/16 bit per sample. If a band isn't coded, its contents get replaced by a copy of the lower MDCT spectrum, or by noise. Decreasing these values increases the quality of the highest frequency bands, at the expense of all other bands.
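As a worked example of the threshold arithmetic, the right-hand side of the comparison can be isolated in a small function. This is a sketch only: skip_threshold_sketch() is a hypothetical name, BITRES is assumed to be 3 (so band_bits is in 1/8-bit units), and LM is assumed to be the log2 frame-size multiplier (3 for 20 ms frames).

```c
#include <assert.h>

#define BITRES 3   /* assumption: band_bits is expressed in 1/8-bit units */

/* Hypothetical helper: the minimum band_bits a band needs in order to be
   coded, mirroring ((j<prev?7:9)*band_width<<LM<<BITRES)>>4 from the
   condition above. coded_in_prev_frame plays the role of (j<prev). */
static int skip_threshold_sketch(int band_width, int LM, int coded_in_prev_frame)
{
   int sixteenths = coded_in_prev_frame ? 7 : 9;   /* 1/16 bit per sample */
   assert(band_width > 0 && LM >= 0);
   /* band_width<<LM is the number of samples in the band for this frame;
      <<BITRES converts to 1/8-bit units and >>4 divides by 16. */
   return ((sixteenths * band_width) << LM << BITRES) >> 4;
}
```

Under these assumptions, a band with band_width = 8 at LM = 3 covers 64 samples. It needs 224 units (28 bits) to stay coded and 288 units (36 bits) to become coded, which matches 7/16 and 9/16 bit per sample.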

mode/bandwidth decisions (6 4 4)