Summer of Code 2021
Revision as of 14:07, 2 March 2021
Introduction
This year Xiph.org is focusing on the rav1e AV1 encoder for its GSoC participation. Both video and still images are currently hot topics, especially with the recent support of AVIF within browsers.
Below you'll find descriptions of the GSoC project ideas around the rav1e project.
If you want to know more about a particular idea, please get in touch with the people listed under "possible mentors". While there is no guarantee that the person will be the actual mentor for the task, they know it well and will be happy to answer your questions.
In our previous participation we focused a lot on our multimedia codec projects. This turned out to be very challenging for students, so this year we're not offering project ideas from those. If you're a student interested in codec work, have previous experience in it, and are confident that you can convince us, you're welcome to get in touch.
Detailed Project Descriptions
These ideas were suggested by various members of the developer community as projects that would be beneficial and which we feel we can mentor. Students should feel free to select one of these, develop a variation, or propose their own ideas, ideally on this page.
rav1e-by-gop integration
rav1e-by-gop is an extended command line encoder that provides additional encoding strategies such as by-gop parallel encoding across multiple machines.
Problem / Intro
rav1e-by-gop is currently only a command line program. Some users might want to use its extended features from other programs, including programs that are not encoders themselves.
Solution / Task
Make rav1e-by-gop expose the same API as the normal rav1e, so that the multiple-machine encoding features are easy to use from other programs.
Requirements
The student should be familiar with Rust or C programming.
Video knowledge is not strictly necessary, but a basic understanding of the concepts is very beneficial.
Possible Mentors
Support WebAssembly SIMD
rav1e supports the WASI platform, and its JavaScript API bindings rely on it.
Problem / Intro
WebAssembly SIMD is getting closer to being available, and rav1e should support it.
Solution / Task
- Implement the dispatch logic for WASM SIMD as done already for x86_64 and aarch64.
- Implement the sum of absolute differences (SAD) and the sum of absolute transformed differences (SATD).
- Implement the inverse transforms (idct, iadst, identity, ...)
- Implement the motion compensation.
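As a rough illustration of the first two tasks, the sketch below pairs a portable scalar SAD with a compile-time dispatch point. It is not rav1e's actual dispatch code: the names `SadFn`, `sad_scalar`, `sad_simd128`, and `select_sad` are hypothetical, and the SIMD kernel is only a placeholder that forwards to the scalar version. It assumes 8-bit samples and that WASM SIMD is enabled at compile time through the `simd128` target feature.

```rust
/// Signature shared by every SAD kernel so that kernels are interchangeable.
type SadFn = fn(&[u8], &[u8]) -> u32;

/// Portable fallback: sum of absolute differences over two equally sized
/// blocks of 8-bit samples.
fn sad_scalar(src: &[u8], reference: &[u8]) -> u32 {
    assert_eq!(src.len(), reference.len());
    src.iter()
        .zip(reference)
        .map(|(&a, &b)| u32::from(a.abs_diff(b)))
        .sum()
}

/// Hypothetical SIMD kernel, compiled only when the `simd128` target
/// feature is enabled for a wasm32 build.
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
fn sad_simd128(src: &[u8], reference: &[u8]) -> u32 {
    // Placeholder: a real kernel would use `core::arch::wasm32` intrinsics.
    sad_scalar(src, reference)
}

/// Picks the SIMD kernel on wasm32 builds with `simd128` enabled.
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
fn select_sad() -> SadFn {
    sad_simd128
}

/// Picks the scalar kernel on every other target.
#[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
fn select_sad() -> SadFn {
    sad_scalar
}
```

Calling `select_sad()` once and storing the returned function pointer keeps the choice of kernel out of the inner loops. Keeping the scalar version around also gives every SIMD kernel a reference to be tested against.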
Requirements
The student should be familiar with Rust, WASM, wasmtime and related tools. Knowledge of x86 or arm assembly is not needed but will help.
Possible Mentors
Implement butteraugli in av-metrics
av-metrics is a collection of video quality metrics. butteraugli is a promising psychovisual similarity metric.
Problem / Intro
Currently the implementation of butteraugli exists as a stand-alone codebase. The code is readable, but it could be faster.
Solution / Task
- Implement Rust bindings to the reference butteraugli.
- Implement butteraugli in pure Rust within av-metrics.
- Write integration and unit tests to make sure the implementation does not diverge from the reference.
- Write criterion benchmarks.
- Implement x86_64 or aarch64 optimizations for it, using intrinsics or plain ASM.
Requirements
The student should be familiar with Rust, C and C++. Knowledge of x86 or arm assembly is welcome.
Possible Mentors
Grain synthesis implementation inside of the rav1e encoder
Grain synthesis models the noise in a video temporally and spatially, using noise estimation.
Problem / Intro
Keeping high-frequency detail and noise (dithering, camera noise, grain) using traditional encoder techniques is very expensive in terms of bitrate allocation. Some tools implemented to address that problem can create additional artifacts that are unpleasant for the viewer or detrimental to the fidelity of the image.
Solution / Task
Implement grain synthesis that models the noise parameters of a video and applies the generated noise parameters during the decoding process, saving a large amount of bitrate while providing very high subjective visual fidelity and appeal.
Make it faster than other forms of grain synthesis through smarter algorithms and various forms of threading, such as tile threading and integration with rav1e-by-gop, so that it can be used as part of any encoding workflow. This will help the technique become as widely adopted as possible.
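One possible starting point for the noise estimation step is sketched below. It is an assumption, not the project's chosen design: it presumes a denoised copy of the frame is already available, treats the difference between source and denoised samples as grain, and measures the grain's standard deviation separately for each intensity range. The function name `grain_strength_by_intensity` and the equal-width binning are hypothetical, and a real model would also need to capture spatial correlation.

```rust
/// Estimates grain strength (standard deviation of source minus denoised)
/// for each of `bins` equal-width intensity ranges of the denoised frame.
/// Returns `None` for ranges that contain no pixels.
fn grain_strength_by_intensity(
    source: &[u8],
    denoised: &[u8],
    bins: usize,
) -> Vec<Option<f64>> {
    assert_eq!(source.len(), denoised.len());
    assert!(bins > 0 && bins <= 256);

    let mut count = vec![0u64; bins];
    let mut sum = vec![0.0f64; bins];
    let mut sum_sq = vec![0.0f64; bins];

    for (&s, &d) in source.iter().zip(denoised) {
        // Bin by the denoised intensity, which approximates the clean signal.
        let bin = usize::from(d) * bins / 256;
        let noise = f64::from(s) - f64::from(d);
        count[bin] += 1;
        sum[bin] += noise;
        sum_sq[bin] += noise * noise;
    }

    (0..bins)
        .map(|i| {
            if count[i] == 0 {
                return None;
            }
            let n = count[i] as f64;
            let mean = sum[i] / n;
            // Clamp to zero to absorb small negative rounding errors.
            let variance = (sum_sq[i] / n - mean * mean).max(0.0);
            Some(variance.sqrt())
        })
        .collect()
}
```

The per-range strengths could then serve as candidate noise parameters to pass to the decoder. Because each frame is processed independently, this kind of estimation is also easy to run in parallel.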
Requirements
The student should be familiar with Rust and C, and must have a light background in general visual media encoding, such as video and image compression. Complexity: Medium.
Possible Mentors
User:Lu_zero && XX
Adaptive quantization
Adaptive quantization is the process of an algorithm trying to efficiently allocate bitrate among the various macroblocks found in a frame by varying the quantizer across each of them according to different visual targets.
Problem / Intro
Often, an encoder does not know the best way to allocate the bitrate budget across a frame. It may overspend a considerable amount of bitrate on regions that do not benefit from a low quantizer (low distortion, so less compression) while not giving enough bitrate to zones that actually need it. This can also cause issues temporally, as bitrate allocation within a group of pictures (GOP) may be skewed towards more complex and high-motion frames while leaving other frames with barely any bitrate to work with, creating visual artifacts such as blocking and banding.
Solution / Task
Implement one to three forms of adaptive quantization in rav1e, based on variance, complexity, and/or variance with a bias in low-contrast frames. This will make bitrate allocation more efficient, avoid overspending bitrate in areas that do not need a lower quantizer, avoid lower quality frames that might detract from the viewer experience, and make objective and subjective quality targets easier to achieve. The combination of powerful adaptive quantization and grain synthesis would allow for a higher subjective quality viewing experience at lower bitrates while potentially lowering computational complexity by a good margin.
Make a smart adaptive AQ mode in which the encoder chooses which adaptive quantization algorithm to use depending on the scene featured in a GOP. Potentially difficult.
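To make the variance-based variant more concrete, here is a minimal sketch under stated assumptions. It is not rav1e code: `block_variance`, `aq_offsets`, and the `strength` parameter are hypothetical names, blocks are assumed to be non-empty slices of 8-bit luma samples, and the log-variance mapping is just one common way to turn variance into a quantizer offset.

```rust
/// Variance of the 8-bit samples in one non-empty block.
fn block_variance(block: &[u8]) -> f64 {
    let n = block.len() as f64;
    let mean = block.iter().map(|&p| f64::from(p)).sum::<f64>() / n;
    block
        .iter()
        .map(|&p| {
            let d = f64::from(p) - mean;
            d * d
        })
        .sum::<f64>()
        / n
}

/// Per-block quantizer offsets relative to the frame average.
/// Flat blocks get a negative offset (lower quantizer, more bits) and
/// busy blocks a positive one. `strength` scales the effect.
fn aq_offsets(blocks: &[Vec<u8>], strength: f64) -> Vec<f64> {
    if blocks.is_empty() {
        return Vec::new();
    }
    // Log scale, so that very busy blocks do not dominate the average.
    let energy: Vec<f64> = blocks
        .iter()
        .map(|b| (block_variance(b) + 1.0).log2())
        .collect();
    let average = energy.iter().sum::<f64>() / energy.len() as f64;
    energy.iter().map(|e| strength * (e - average)).collect()
}
```

Because the offsets are centered on the frame average, they sum to zero and only move bits between blocks. How the offsets map onto the encoder's actual quantizer range is left open here.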
Requirements
The student should be familiar with Rust and C. A background in general visual media encoding, such as video and image compression, is recommended.
Difficulty: Medium, depending on which targets the student chooses to follow.
Possible Mentors
User:Lu_zero && XX
Visual metric targeting in rav1e-by-gop
Objective metrics are used to evaluate an encoder's performance in a diverse set of scenarios. Different metrics such as PSNR, SSIM, DSSIM, VMAF and some closed no-reference metrics are used in the field to record encoder performance changes across versions trying to correlate closely with human perception.
Problem / Intro
Classical methods of rate control such as ABR (Average BitRate), fixed quantizers, and even CRF (Constant Rate Factor) do not target a specific quality level. This can result in starved encodes when the bitrate budget has to be kept low so that the viewer can watch without interruption: some scenes look exceptionally good because bitrate is overspent on them, while others look very poor because they get too little, detracting from the viewer experience. More advanced forms of rate control like CRF help somewhat, but they still have to overshoot so that the lower quality scenes do not suffer, and they do not adapt to the different types of content encoded, resulting in encodes of variable quality.
Solution / Task
Implement visual metric targeting based on VMAF (mainly used for video) and butteraugli (mainly used for images) as part of rav1e-by-gop, as a secondary rate control option.
Visual metric targeting in rav1e-by-gop would take full advantage of its adaptive keyframe placement and smart scene detection. This would allow for the best rate control possible, as short scenes of 1-15 s are where visual metrics such as VMAF shine the most. The idea is to encode first with a very fast speed preset to gauge the quality at a preset quantizer. If the visual metric target is not achieved, the encoder tries again once or twice until it gets the right result. With this method, instead of targeting an average bitrate, you would target a visual score, getting higher efficiency and higher subjective quality. This would also be advantageous in terms of encoding time, as encoder complexity could be dialed back while keeping overall efficiency the same or higher, with efficiency being a function of both encoder efficiency and rate control.
Implement butteraugli quality targeting as an option when using rav1e for AVIF images. Since visual quality requirements are considerably higher for intra-only (image-only) media, keeping high visual fidelity is even more important there than in video compression. Quality targeting iterations would also be quite useful here.
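The probe-and-retry idea described above can be sketched as a bounded search over the quantizer. This is a minimal sketch, not rav1e-by-gop's design: `Probe`, `target_quality`, and the `probe` callback (standing in for "encode the scene at a fast preset and measure the metric") are hypothetical, the quantizer range of 0-255 is taken as given, and the score is assumed to fall monotonically as the quantizer rises.

```rust
/// One encode attempt: the quantizer used and the metric score it reached.
struct Probe {
    quantizer: u8,
    score: f64,
}

/// Searches for the coarsest quantizer whose score still meets `target`,
/// using at most `max_attempts` probes. Returns `None` if no probe met it.
fn target_quality<F>(
    mut probe: F,
    target: f64,
    start_q: u8,
    max_attempts: u32,
) -> Option<Probe>
where
    F: FnMut(u8) -> f64,
{
    // Inclusive range of quantizers that are still candidates.
    let (mut lo, mut hi) = (0u16, 255u16);
    let mut q = u16::from(start_q);
    let mut best = None;

    for _ in 0..max_attempts {
        let score = probe(q as u8);
        if score >= target {
            // Target met: remember it, then try a coarser quantizer.
            best = Some(Probe { quantizer: q as u8, score });
            if q == hi {
                break;
            }
            lo = q + 1;
        } else {
            // Target missed: only finer quantizers remain candidates.
            if q == lo {
                break;
            }
            hi = q - 1;
        }
        q = (lo + hi) / 2;
    }
    best
}
```

With a small `max_attempts` this matches the "try again once or twice" approach. A larger budget narrows the result towards the cheapest quantizer that still reaches the target score.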
Requirements
The student should be familiar with Rust and C. General interest in image and video coding is recommended.
Difficulty: Low-medium.
Possible Mentors
User:Lu_zero && XX