Opus tuning

Latest revision as of 06:44, 14 November 2016

This page is meant to provide help on tuning the current Opus encoder. It assumes you are working on the master branch, not 1.0.x.

As work progresses, some info (especially any line numbers) may become outdated.

The numbers in parentheses represent (in order):

  • Impact on quality
  • Quality of current tuning
  • Difficulty

Parameters

These are some parameters that can be tuned to improve the Opus encoder quality (in no particular order).

Tonality (10 5 1)

Around line 1276 of celt_encoder.c:

       tonal_target = target + (opus_int32)((coded_bins << BITRES) * 1.2f * tonal);

Replacing the constant 1.2f will change the impact of tonality on the bit-rate. The higher the value, the higher the bit-rate of tonal frames/samples.
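To get a feel for the size of the effect, the expression above can be isolated in a small helper. This is only a sketch: tonal_target_sketch() and its tonal_gain parameter are hypothetical names that do not exist in the Opus source, and BITRES is assumed to be 3 (allocations counted in 1/8-bit units).

```c
#include <assert.h>
#include <stdint.h>

#define BITRES 3 /* assumption: allocations are counted in 1/8-bit units */

/* Hypothetical helper: the same expression as in celt_encoder.c, with the
   1.2f constant exposed as tonal_gain so that values can be compared. */
static int32_t tonal_target_sketch(int32_t target, int coded_bins,
                                   float tonal, float tonal_gain)
{
    assert(coded_bins >= 0 && tonal_gain >= 0.0f);
    return target + (int32_t)((coded_bins << BITRES) * tonal_gain * tonal);
}
```

With target = 1000, coded_bins = 100 and tonal = 0.5, a gain of 1.0 gives 1400 and a gain of 2.0 gives 1800 (both in 1/8-bit units). A frame with tonal = 0 keeps its original target whatever the constant is.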

Trim (7 7 8)

Trim controls the bit allocation balance between low and high frequencies.

This is currently computed by alloc_trim_analysis() around line 780 of celt_encoder.c.

A high trim value signals more bits to the low frequencies, while a lower trim value signals more bits to the high frequencies.

DynAlloc (5 5 9)

Dynamic allocation is the part of the Opus bitstream that makes it possible to increase the allocation of any band(s).

This is currently computed in dynalloc_analysis() around line 956 of celt_encoder.c.

Stereo Saving (4 6 6)

Stereo saving controls the bitrate reduction applied for stereo signals with a narrow image.

It is computed in alloc_trim_analysis(), around line 780 of celt_encoder.c. It is applied in celt_encode_with_ec(), around line 1329 of celt_encoder.c.

Spreading (4 7 7)

See spreading_decision() in bands.c, around line 413.

Tapset (1 4 9)

See spreading_decision() in bands.c, around line 413.

Transient Estimator (10 8 8)

See transient_analysis() in celt_encoder.c, around line 209.

Transient Boost (8 6 6)

See transient_analysis() in celt_encoder.c, around line 209.

TF Estimator (6 6 9)

See tf_analysis() in celt_encoder.c, around line 475.

Intensity Threshold (7 8 3)

The intensity threshold is the first band to be coded as intensity stereo. When a band is intensity-coded, the coding noise in that band is lower, at the expense of a narrower stereo image.

In celt_encoder.c, around line 1492:

     static const opus_val16 intensity_thresholds[21]=
     /* 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19  20  off*/
       { 16,21,23,25,27,29,31,33,35,38,42,46,50,54,58,63,68,75,84,102,130};
     static const opus_val16 intensity_histeresis[21]=
       {  2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5, 6,  8, 12};

The intensity_thresholds[] table defines the intensity threshold as a function of the effective bitrate. For example, between 63 and 68 kb/s effective rate, the intensity threshold is 16. Above 130 kb/s, intensity stereo is completely disabled.

Note that for 20 ms frames, the effective rate is equal to the actual rate minus 4 kb/s.
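The lookup described above can be sketched as follows. The table values are copied from the snippet; the function names are hypothetical, the tables are written as float (assuming a floating-point build), and the way the hysteresis table is applied here is an assumption about the intent rather than a copy of the encoder code.

```c
#include <assert.h>

#define NB_THRESHOLDS 21

static const float thresholds_kbps[NB_THRESHOLDS] =
  { 16,21,23,25,27,29,31,33,35,38,42,46,50,54,58,63,68,75,84,102,130 };
static const float hysteresis_kbps[NB_THRESHOLDS] =
  {  2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5, 6,  8, 12 };

/* For 20 ms frames: effective rate = actual rate minus 4 kb/s. */
static float effective_rate_20ms_sketch(float actual_kbps)
{
    return actual_kbps - 4.0f;
}

/* Returns the intensity threshold (0..21, where 21 means "off") for an
   effective rate in kb/s. prev is the threshold used in the previous frame. */
static int intensity_lookup_sketch(float rate_kbps, int prev)
{
    int i;
    assert(prev >= 0 && prev <= NB_THRESHOLDS);
    /* Count how many table entries the rate has reached. */
    for (i = 0; i < NB_THRESHOLDS; i++)
        if (rate_kbps < thresholds_kbps[i])
            break;
    /* Assumed hysteresis: only move away from prev once the rate has
       cleared the boundary by the margin given in the second table. */
    if (i > prev && rate_kbps < thresholds_kbps[prev] + hysteresis_kbps[prev])
        i = prev;
    if (i < prev && rate_kbps > thresholds_kbps[prev - 1] - hysteresis_kbps[prev - 1])
        i = prev;
    return i;
}
```

For instance, an actual rate of 69 kb/s with 20 ms frames is an effective rate of 65 kb/s, which falls between 63 and 68 and maps to a threshold of 16. Under the assumed hysteresis, a frame at 64 kb/s that follows a frame with threshold 15 stays at 15, because 64 is still below 63 + 3.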

Skip Threshold (3 6 4)

In the function interp_bits2pulses() in rate.c, around line 356:

           if (codedBands<=start+2 || (band_bits > ((j<prev?7:9)*band_width<<LM<<BITRES)>>4 && j<=signalBandwidth))

The important constants are the 7 and the 9. They determine the minimum allocation for a band to be coded, with hysteresis. The units are 1/16 bit per sample.

Decreasing these values increases the quality of the highest frequency bands, at the expense of all other bands.

The code above means that if a band was coded in the previous frame, it needs 7/16 bit per sample in this frame, but if it wasn't coded, it needs 9/16 bit per sample. If a band isn't coded, its contents get replaced by a copy of the lower MDCT spectrum, or by noise.
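As a worked example of the units, the right-hand side of the comparison can be pulled out on its own. This is a sketch under assumptions: skip_threshold_sketch() is a hypothetical name, was_coded stands in for the j<prev test, band_width is taken to be the band's width in bins before the <<LM scaling, and BITRES is assumed to be 3 so that the result is in 1/8-bit units.

```c
#include <assert.h>

#define BITRES 3 /* assumption: allocations are counted in 1/8-bit units */

/* Minimum allocation, in 1/8-bit units, for a band to be coded.
   was_coded selects the 7 (band coded in the previous frame) or the 9. */
static int skip_threshold_sketch(int was_coded, int band_width, int LM)
{
    assert(band_width > 0 && LM >= 0);
    return (((was_coded ? 7 : 9) * band_width) << LM << BITRES) >> 4;
}
```

With band_width = 8 and LM = 3 the band holds 64 samples. The threshold is then 224 eighth-bits (28 bits, or 7/16 bit per sample) if the band was coded in the previous frame, and 288 eighth-bits (36 bits, or 9/16 bit per sample) otherwise.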

Mode/Bandwidth Decisions (6 4 4)