# Meeting 2015-03-17

Mumble: mf4.xiph.org:64738

# Agenda

- reviews
- prepare for IETF 92 hackathon
- lapping approach data from jm, TD, (unlord?)
- quilting bugs

# Attending

azita, derf, jac, jmspeex, smarter, td-linux, unlord, xiphmont, yushin

# reviews

# IETF hackathon

- d: We need to identify stuff for people to work on.
- j: Is there a list from the code parties?
- d: The best list right now is the issues in GitHub. We have a number of them tagged as Bug or Easy.
- j: We have a week, so hopefully others can think of some easy bugs and put them in there.
- jm: What day is the BoF?
- d: Tuesday at 9am Central.
- u: Do you need an updated graph for the presentation, Tim?
- d: Yes. I am hoping to get the new lapping added first. I will ping you about that when it's ready. Perhaps after the hackathon.

# lapping stuff

- d: Yushin's is the only new experiment that we did, which had some interesting results.
- y: What I tried was using the variable lapping from master on top of the fixed lapping from Jean-Marc's RDO block size decision. It increases rate 1-2%. And then I used his basis magnitude compensation; if I turn that off, rate increases another 1-2%. So we understand that if the block size decision is based on fixed lapping, it will only get its best efficiency with fixed lapping. The block size decision is somehow better than Thomas's (from the Tran paper).
- d: Given that, I think we should go ahead with fixed lapping. The question is whether anyone disagrees with that assessment.
- jm: No!
- u: Are we saying that the fact that the lapping ... 4pt being covered with 8pt is the cause of the improvement?
- d: No.
- u: We still agree that this is a subset of the lapping cases that you get with the full lapping?
- jm: No, you would never get that with full lapping; not the way it is structured right now. This code will keep 8x8 on the exterior of 4x4s, because every level is lapped the same.
The good news is that if we go with this approach we can revisit things which previously did not work. One of those is TF. I suspect we can use TF to go from 16x16 to 32x32, or as a 64x64 level.
- d: Why do you think that will work?
- jm: Before, we weren't willing to add a matrix to compensate for each coefficient. It was causing scaling issues and we didn't want to scale each coefficient differently. But the way the new code works, I'm doing that anyway.
- d: I thought the issue was that you didn't know how to do that.
- jm: No. It was always a possibility, but compared to just doing a 32x32 DCT there was no point. But basically it is an option. Another option is that, the way the fixed lapping is structured, you could derive some TF with a second stage that gives you exactly what the DCT gives you. So there is a matrix that will change the DCT coeffs of one 8x8 block into the DCT coeffs of four 4x4s. And there is a matrix that will change a superblock into 32x32. Whether it is sparse or not is another thing.
- d: That's only if you don't have interior lapping, right?
- jm: No. The interior lapping is still linear.
- d: Sure, but it would be undoing that.
- jm: Yes.
- d: That would still be true so long as we had the lapping order done this way, right? Never mind; I'm being stupid.
- u: Going back to the original patch, where is the gain coming from? Is it things like the changes to the Haar DC, or some of the other places?
- jm: The improvement is due to actually selecting good block sizes and having the block size RDO match what is actually happening.
- d: Yushin's experiment demonstrates that having the actual encoding match wasn't that important.
- jm: Mostly I mean the RDO itself being self-consistent. Having decent metrics in the RDO.
- d: Just looking at the only difference between what Yushin is doing and what master does (changing the decision), it got almost all of the gain.
- u: That's not what I heard. He said the cost went up.
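The fixed matrix jm describes, one that maps a block's DCT coefficients to the coefficients of its sub-blocks, can be sketched in 1-D. This is illustrative only: it uses orthonormal DCT-II matrices and ignores lapping entirely; none of the names below come from the Daala code.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II matrix; rows are the basis functions.
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] *= np.sqrt(1.0 / n)
    m[1:] *= np.sqrt(2.0 / n)
    return m

D8 = dct_matrix(8)
D4 = dct_matrix(4)
# Block-diagonal transform: two independent 4-point DCTs on the halves.
D44 = np.block([[D4, np.zeros((4, 4))], [np.zeros((4, 4)), D4]])

# T converts 8-point DCT coefficients into two sets of 4-point DCT
# coefficients in one linear step (D8 is orthonormal, so D8^-1 = D8^T).
T = D44 @ D8.T

x = np.random.randn(8)
assert np.allclose(T @ (D8 @ x), D44 @ x)
```

Whether the 2-D equivalent of `T` is sparse enough to be cheaper than redoing the transform is exactly the open question raised above.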
- d: 1-2% better than JM's branch, which is 7-9% better than master. The point is that all the other things that are not the decision account for only 1-2% of this.
- jm: The improvements I could make in Haar and compensating for magnitudes, I would say that is on par with what I lose with fixed vs. variable lapping. I lose 1-2% there and get that back by tuning. There are side benefits, like fixed lapping being much easier to understand.
- u: I still can't reason about 8pt lapping over a 4x4 block. I'm sure I'll figure that out.
- jm: Look at what's happening in tools/compute_basis.
- d: It means you lose linear phase, which is not the end of the world. We lost it when we switched block sizes anyway. Except now we don't in some cases where we did before.
- jm: We may be able to split HF even further to get 2x2 resolution. Another one is jointly coding four 4x4s as if they were an 8x8. A bit like what Opus does with short blocks.
- s: Can you explain in more detail?
- jm: You can take the coeffs from four 4x4s, interleave them, and pretend that it's 8x8 data.
- d: He's just talking about coeff coding.
- jm: Not just that, also in PVQ itself; quantization and coding.
- s: Doesn't that break AM?
- jm: No, it means you'd be able to do it. Now you can use decent bands for 4x4.
- s: Do you do some transform on the coeffs?
- d: No transformation, just rearrangement.
- s: How are they rearranged?
- d: You interleave them.
- s: It still seems like AM would be very different.
- d: It would be, but keep in mind most codecs do AM on a macroblock level, and we don't do it on 4x4 at all because of edges.

# quilting

- x: Most of the stuff got done this week by Jean-Marc. He replicated my tests and found an error. I had determined that full precision reference frames were not going to save us; he ran the test too and found they do save us. We can also eliminate the quilting by setting the scale factors to 64, which renders the lapping filters perfectly reversible.
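The interleaving d and jm describe could look like the following sketch: coefficient (i, j) of each of the four 4x4 blocks lands in a 2x2 neighborhood at (2i, 2j), so the result has roughly the frequency layout of an 8x8 block. The exact ordering Daala's TF would use may differ; this is one plausible arrangement.

```python
import numpy as np

# Four 4x4 coefficient blocks, filled with distinct values 0..63 so the
# rearrangement is easy to see.
blocks = [np.arange(16).reshape(4, 4) + 16 * i for i in range(4)]

# Interleave: like frequencies from each block end up adjacent, so the
# 8x8 result can be banded and coded as if it were one 8x8 transform.
inter = np.zeros((8, 8), dtype=int)
inter[0::2, 0::2] = blocks[0]  # top-left of each 2x2 group
inter[0::2, 1::2] = blocks[1]  # top-right
inter[1::2, 0::2] = blocks[2]  # bottom-left
inter[1::2, 1::2] = blocks[3]  # bottom-right

# Every input coefficient appears exactly once; nothing is transformed.
assert sorted(inter.flatten().tolist()) == list(range(64))
```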
Neither strategy is suitable for production, but the evidence is strong that this just comes down to rounding behavior. The next step is to understand the rounding behavior to find another, less expensive way to eliminate the artifact.
- jm: The way I like to think of it is that there are two types of quilting: noise quilting and pattern quilting.
- x: I disagree that these are different.
- jm: It's the same issue, but it shows up in two different ways, and they may have different solutions. The pattern quilting is a bias in the noise, and the other one is growing variance from additive noise. You can prevent a bias in the noise, which would get rid of one, but you'd still have the other behavior. The noise is due to non-perfect reversibility and gets amplified by the downshifting to 8 bits.
- d: It would take more than 1 unit in 12 bits to push the rounding across the threshold.
- jm: Not if you have a value of 7 right next to a value of 8.
- d: Why would we have values of 7 if all we're doing is DC?
- jm: Because you are applying the postfilter and you have all the intermediate values.
- x: Even if your four LSBs are all zero, when you apply the postfilter they are smoothed across and become non-zero.
- d: If we only have two postfilters to worry about now, it seems like we can do something about it.
- x: We know two ways we can fix this, but we need to find ones that we are willing to deploy.
- u: I think he means we could train new filters if we knew what to train.
- d: You could construct the coeffs of the filter so that given a ...
- jm: I don't see how you can do that for every possible step size.
- d: Not at high qualities, because you don't have enough resolution.
- jm: High quality is irrelevant here, because it only happens at low quality.
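x's point that clean LSBs don't survive the postfilter can be seen with a toy smoothing step across a block edge. This is not Daala's actual postfilter (which is a lifting filter); the taps here are made up purely to show the effect.

```python
import numpy as np

# Two neighboring blocks of 12-bit values whose four LSBs are all zero
# (exact multiples of 16), with a small DC step at the block boundary.
pix = np.array([96, 96, 96, 96, 112, 112, 112, 112])

# A toy 2-tap smoothing across the edge (stand-in for the real filter).
out = pix.copy()
out[3] = (3 * pix[3] + pix[4]) >> 2  # pulled toward the right block
out[4] = (pix[3] + 3 * pix[4]) >> 2  # pulled toward the left block

assert (pix & 15 == 0).all()         # input LSBs were all clean
assert not ((out & 15) == 0).all()   # smoothing produced non-zero LSBs
```

Once those intermediate values exist, the 12-to-8-bit downshift has something to round, which is where the "7 next to an 8" scenario comes from.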
I thought of things like changing the downshifting code to take one superblock at a time, quantizing with different rounding biases, measuring the amount of 1-bit noise in the output, and picking whichever bias had the least noise. That got rid of some of the pattern quilting; you could probably tweak it to get rid of some or all of it, but you'd still get the other one.
- x: I think you are reading far too much into that result.
- jm: It was adaptive. I got rid of all the pattern quilting but still had noise quilting. I tried to do it automatically, trying all superblocks and all biases, and that was not nearly as good as doing it manually, but it was decent at reducing it.
- x: I still think you were poking at something, got some result, and are reading far too much into it. What happens when you take the full precision frame and downshift? It injects quantization noise. The prefilter takes that noise and blows it up. In Sintel, we essentially have a soft gradient that happens block by block. The noise gets preserved when we filter back. Any quantization noise is going to do that. I think you are imagining the bias. I went looking for biases and couldn't find them.
- jm: Look at the image I pasted. I'm not saying there is a bias in the filter itself. I'm saying that the entire loop has a bias.
- x: So we pick some random magic value that gets rid of this bias for DC updates, but screws up the rest of the filter. These are halfhearted band-aids.
- jm: I'm not saying I want this fix; it's not good. I'm just trying to observe what's happening. If you look at http://jmvalin.ca/video/bias/no_060.png there are very clear lines. I've written some 1D code that plays with the DCs. I can make it such that every single frame gets a full 8-bit integer added; after 60 frames the quilting will be 60. If you look at the other one, which is a hack I do not recommend (changing the shifting bias), it just removes the bias in the quantization, which adds 1 every time.
- x: That does not look different to me.
- jm: In both cases the variance is building.
- x: They are quantitatively different, not qualitatively different.
- jm: In the second one there is no bias, just noise. It has clear white lines.
- x: I can do this in Gimp by adding a mask. There's no surprise that changing the rounding point changes the rounding. So we don't actually have a solution to this right now. We're pretty certain at this point that it's a rounding problem.
- d: Well, we have a solution which increases memory bandwidth requirements by 2x, which will not be popular.
- x: Or requires det1 filters, which may impact coding gain.
- d: Also, full resolution references require MC work.
- jm: In any case, what's not helping us is that the postfilter adds a bit of noise and the prefilter blows it up. In normal codecs the filter may add noise, but there is nothing that amplifies it on the next frame.
- d: I'm not sure that fits my model of what's going on.
- jm: In 264, they could have decided to use our postfilter instead of something adaptive. It would not have caused this problem, because the postfilter would always be blurring things. You add a bit of noise, it gets blurred on the next pass, and it's not a big deal. Our thing amplifies what the postfilter dampens.
- y: ???
- x: Subtractive dither is still adding noise.
- y: +/- 1 added noise does not change the rounding bias?
- jm: I was going to say that something like dithering would completely get rid of the lines from no_060, but you would still get the noise behavior from the shift image http://jmvalin.ca/video/bias/shift_060.png
- y: How about adding noise to the DCT coeffs, so the iDCT inverse transforms the noise?
- jm: The noise is going to build up no matter what you do.
- y: But we want it to build up randomly so that it's not visible.
- jm: Look at shift_060; this is what you are asking for. The noise is uncorrelated but still there. That's what you get with dithering.
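The distinction jm draws, a per-frame bias that accumulates (pattern quilting) versus unbiased noise whose variance still grows (noise quilting), can be modeled with a toy 12-to-8-bit round trip. The bit depths match the discussion; the noise model and constants are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
ref = rng.integers(256, 3840, size=10000)  # 12-bit "reference" samples
frames = 60

biased = ref.copy()
dithered = ref.copy()
for _ in range(frames):
    # The lapping filters perturb the low bits every frame; model that
    # as small zero-mean noise, then do the 12->8->12-bit round trip.
    n = rng.integers(-8, 9, size=ref.size)
    # Truncating shift rounds toward -inf: a systematic per-frame bias.
    biased = ((biased + n) >> 4) << 4
    # Uniform dither before the shift makes the rounding unbiased, but
    # the error variance still grows frame after frame.
    d = rng.integers(0, 16, size=ref.size)
    dithered = ((dithered + n + d) >> 4) << 4

bias_err = (biased - ref).mean()    # large negative drift: "pattern"
dith_err = (dithered - ref).mean()  # near zero, but variance remains
```

The biased path drifts by several units per frame (the visible lines), while the dithered path stays centered yet still accumulates a random walk in each sample, matching "uncorrelated but still there".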
- y: In the pixel domain you have 8-bit precision, right?
- jm: Dithering is only in the pixel domain, and only where there is lapping. You are rounding to 8 bits on every pass, so the noise you are adding is on the order of 1 in the 8-bit domain.
- y: Coeffs are 12-bit, though. The iDCT transforms these to 12 bits and then the postfilter works on that, right? How about adding small noise to the coeffs themselves?
- jm: That will just create more noise. That will prevent bias but not prevent noise; it adds noise. The best you can get by adding noise is the result in the shift_060 image. That being said, maybe we'll decide we can live with this noise pattern.
- x: The next thing was where the extra LSB spikes are coming from. I still don't understand what I'm supposed to be looking for there.
- d: I'd like to be able to point to the calculations that actually cause some of those values to be more likely than others. We're trying to analyze this at the level of functional blocks of code, and I'd like to look at the specific computations that are causing this.
- x: That makes sense, but we can talk about it offline. I'm still not sure exactly what I'm supposed to be looking for. The other part was that you had talked about potentially restructuring the filters in some way that would render them more symmetrical. The whole alternating-the-filter-back-and-forth idea.
- jm: I tried that and it didn't work.
- x: I don't like that idea because there are many reasons it can't work. But were there more things in that vein that would have a better chance?
- d: We can make it symmetrical with some increased complexity.
- jm: Can you define symmetrical?
- d: If you reverse the inputs, the initial computations are the butterflies. Then we shift down, and that introduces an asymmetry.
- x: I thought we shifted and then subtracted.
- d: Actually, yes, we do do that. Which actually makes this harder.
- jm: If you look at the no_060 file I showed, the lines are symmetrical. So I don't think fixing the symmetry will help you there.
- x: It will if you flip and negate.
- jm: That doesn't work for another reason. The quilting does not happen if you boost all the DCs. The quilting is related to t4 to t7 (the difference terms that are lifted and processed around). These are the terms that are causing the quilting.
- d: Those are exactly the terms where the filter is not bidirectionally reversible and where all the asymmetry is.
- jm: Unless you update the DC every frame, it won't help you. If you increase it every other frame, you still get the buildup. Whatever you did to reverse the effect on odd frames will never get triggered.
- d: I think we all agreed that alternating the filter won't help. What I'm suggesting is altering the structure of the filter, which we can do. About two years ago I tried altering it for a different reason. Currently we have our sums in one place and our differences in another. I flipped it to be the other way and there was less rounding in that path. When I did this, the performance of the codec as a whole got dramatically worse and I didn't understand why at the time. But it seems relevant to try and understand that now. We can talk about this offline.
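The asymmetry d describes comes from the right shift inside the lifting steps: an arithmetic shift rounds toward minus infinity, so negating (or reversing) the inputs does not negate the output, and "flip and negate" cannot cancel the rounding. A minimal illustration, with made-up constants rather than Daala's actual filter taps:

```python
def lift(a, c=45, k=6):
    # One integer lifting-style multiply. The arithmetic right shift
    # rounds toward -infinity, so lift(-a) != -lift(a) in general.
    return (a * c) >> k

assert lift(10) == 7      # 450 >> 6 == 7  (450 // 64)
assert lift(-10) == -8    # -450 >> 6 == -8 (floor), not -7
```

Any scheme that relies on odd frames undoing even frames has to contend with this floor behavior on every shifted term.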