DaalaMeeting20150217

# Meeting 2015-02-17

Mumble: mf4.xiph.org:64738

# Agenda

- reviews
- switch to vidyo? (jack)
- Suggestion: more technical meeting time (jmspeex)
- New lapping branch (jmspeex)
- ??

# Attending

daala-person, derf, jack, jmspeex, smarter, td-linux, unlord, yushin

# reviews

# vidyo

several people are opposed to having any proprietary software, even though the result is everyone in Mountain View sitting in the same room, each with headphones on and maximally distant.

# more technical meeting time

- jm: I want more meeting time, so I was thinking every day at noon or something
- u: I would like some more structure to them than just that.
- jm: I was thinking more watercooler time.
- d: That's a different thing. I don't think we should make this meeting longer just for that.
- td: I think the other meeting should have particular topics, otherwise we will get unbounded conversation time.
- jm: I'm trying to go one step in the direction of what happens when we have our work weeks. Those are more productive than average.
- d: Did you want to pick a day and time? Once a week to start with and adjust from there? Noon is great if you never want me to show up.
- td: I'm available all the times you met last week.
- jm: For me, after 3pm eastern it gets hard.
- d: Try 11 pacific on Wednesday?
- jm: That works fine. (no objections from others)
- j: Who's coming up with topics?
- jm: Blocksize will be a good topic.

# new lapping branch

- jm: There is a new lapping branch.
- u: Is it different than my patch?
- jm: Yes.
- jm: https://github.com/jmvalin/daala/
- jm: Essentially I've come to the conclusion that our lapping is a dead end when it comes to being able to write a reasonable encoder. The github branch I have is doing lapping differently. It uses fixed size lapping everywhere.
- td: On all frames? Or just inter?
- jm: Right now it was simpler to make it the same on intra and inter. It might be good enough with that. We could always go back and have intra use the old lapping. I'm really focused on inter at this point. What it does is lap recursively, just like nathan's patch. The difference is that the lapping at each level is always the same no matter what the blocksize is. Right now it's configured with superblock at 8pt, 16x16 level at 8pt, 8x8 at 8pt, and at 4x4 it uses 4pt. 16x16 and 32x32 are lapped half of what they would normally be. 4x4 is weird because it's lapped asymmetrically. Running this was 2-3% worse. Part of that is the fact that changing the lapping changes the basis scaling and I haven't retuned the quant matrices. I figure whatever we lose we have to give up because the other option isn't working. It's either this or ditch lapping completely.
- u: What makes you say it's not working?
- jm: I don't know who else tried block size switching with current master, but whoever works on that will come to the same conclusion. A lot of this idea comes from Guillaume and I pushed it a little further. With the current master there is no reasonable way to search for blocksize with some kind of dynamic programming algorithm. At first we thought the search was just hard, but it's actually much harder than we thought. With current master you can't have a reasonable distortion metric to make the block size decision.
- td: Which branch in your github has this?
- jm: master. On my branch I do the lapping differently and I have some basic block size RDO. On FourPeople at low bitrate the block size decision makes sense already. The metric is trying to optimize for PSNR but it fights with PVQ so Guillaume and I are trying to come up with better metrics there. At low bitrate it's also fighting the untuned quantization matrices, but I know what the problem is there and can make a solution. Intraprediction may be possible now because you can unlap enough.
- u: What about top block in a superblock?
- jm: In all cases you can not unlap everything, but you can unlap enough. The lapped region will always be lapped the same way so you can use these samples. Unlike with freq domain intra where we tried to use the info from lapped pixels without accounting for how they were lapped.
- u: We still have to do the experiments and we can't know this will work until we've tried it.
- jm: Yes, of course. But unlike what we were doing before, this has a chance of working. I wouldn't suggest doing those experiments now. Let's get inter actually working.
- d: We would also have to do IPR research.
- jm: If you wanted to you could run a 4x4 DCT on the boundary just like what we were doing and we would get to freq domain intra that has a chance of working because all the 4x4s are the same. That's the idea anyway. It potentially undooms intra prediction.
- u: We still have the same problem with intra superblocks in an inter frame?
- jm: I don't see what's special about that one. You have the coefficients and you do the post filtering and you have the pixel domain coeffs. It's not fundamentally different.
- d: We still have the same problem that the motion search does not consider intra.
- jm: Sure, and that will be fun because blocks aren't even aligned.
- d: All that is going to change anyway because we will have bigger blocks.
- jm: The good news is that I've been looking at motion compensation on FourPeople. Not sure about the rate itself, but in terms of prediction it is doing a good job.
- d: I had the same thought after replacing the motion compensation with another codec's and finding it was actually worse. I was hoping ours was broken :)
- jm: The known things to do before we get decent BSS on this branch are 1) new quantization matrices and 2) a better distortion metric than MSE.
- d: What are your plans for that metric?
- jm: Guillaume?
- s: We talked about doing something different like PSNR-HVS. I don't know if you have a better idea than that.
- td: Doesn't x264 use a hadamard that is weighted?
- s: x264 uses sum of squared error. It also does hadamard at 4x4 and 8x8 levels. It's a way to keep the energy of whatever you are coding.
- td: Ok. psnrhvs seems slow but good, and we can try to make it faster.
- d: Except that we know that psnrhvs is the wrong metric to optimize for at low bitrates.
- s: Maybe we could interpolate between two metrics based on bitrate?
- d: It all sounds very expensive, but let's get something that works first and then try to make it fast.
(some discussion of metrics used in real time coders (SAD, SATD, MSE))
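
For readers unfamiliar with the three metrics named above, here is a minimal sketch of each on a 4x4 block. It is illustrative only and is not taken from Daala or x264: the function names are made up for this example, blocks are plain lists of rows, and the Hadamard transform is left unnormalized (real encoders typically apply a scale factor).

```python
def residual(cur, ref):
    """Per-pixel difference between the current block and its prediction."""
    return [[c - r for c, r in zip(cur_row, ref_row)]
            for cur_row, ref_row in zip(cur, ref)]

def sad(cur, ref):
    """Sum of absolute differences."""
    return sum(abs(e) for row in residual(cur, ref) for e in row)

def sse(cur, ref):
    """Sum of squared errors (MSE is this divided by the pixel count)."""
    return sum(e * e for row in residual(cur, ref) for e in row)

def hadamard4(v):
    """Unnormalized 4-point Hadamard transform using two butterfly stages."""
    a, b, c, d = v
    s0, s1, d0, d1 = a + b, c + d, a - b, c - d
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]

def satd4x4(cur, ref):
    """Sum of absolute values of the 2D Hadamard transform of the residual."""
    rows = [hadamard4(r) for r in residual(cur, ref)]   # transform each row
    cols = [hadamard4(c) for c in zip(*rows)]           # then each column
    return sum(abs(e) for col in cols for e in col)

# Two toy residuals against an all-zero prediction (assumed inputs).
zero = [[0] * 4 for _ in range(4)]
flat = [[1] * 4 for _ in range(4)]                      # constant offset
impulse = [[1, 0, 0, 0]] + [[0] * 4 for _ in range(3)]  # single-pixel error
```

With these toy inputs the flat residual and the single-pixel impulse have very different SAD and SSE, but the same unnormalized SATD, because the flat residual is compacted into one coefficient while the impulse spreads over all sixteen.
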
- d: Have you looked at trying to use TF to get your 4x4s instead of doing 4pt lapping and actual DCTs?
- jm: I have not looked at that. Do you think that would be better?
- d: It's still asymmetric but you might have less leakage on the tail. This was the design I had 3 years ago until Greg shot all my ideas down with his experiments. The thought I had before was that we'd have 8pt lapping and TF down to 4x4 instead of doing 4x4 DCTs with 4pt lapping. If I remember the way the basis functions looked you end up with less ringing but the coding gain may not be as good.
- jm: I think your way would have more ringing because I have support of 12 and you're talking about support of 16.
- d: I have graphs of these.
- u: Are your graphs with 4x4 on one side and 8x8 on the other side?
- d: No, 8pt everywhere, with TFed down 4x4s. What I'm looking at is slide 36 from intro to video.
- jm: I don't know what filter you used for that, but what I'm getting with the current lapping filter is much worse than that.
- d: That was with the ramp constraint filter.
- jm: Uploading what it looks like now: http://jmvalin.ca/video/lapping4_tf.png and, for comparison, what the current code is running: http://jmvalin.ca/video/lapping4_8.png . One of them has a lot of ringing and the DC goes below zero, and the other does not.
- d: I agree that is still a problem.
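
The block size search discussed in this section can be illustrated with a generic bottom-up quadtree decision. This is only a sketch of the kind of dynamic programming that becomes possible if each block's cost does not depend on its neighbours' block sizes, which is the property the fixed-size lapping is meant to provide. It is not the code on the branch: `best_split`, the `cost` callback, `toy_cost` and the numbers are hypothetical, and the rate of signalling the split itself is ignored.

```python
def best_split(x, y, size, min_size, cost, lam):
    """Return (rd_cost, blocks) for the square block at (x, y).

    cost(x, y, size) -> (distortion, rate) is a hypothetical callback that
    must depend only on the block itself, not on neighbouring decisions.
    blocks is a list of (x, y, size) tuples covering the block.
    """
    dist, rate = cost(x, y, size)
    whole = dist + lam * rate
    if size <= min_size:
        return whole, [(x, y, size)]
    half = size // 2
    split_cost, split_blocks = 0.0, []
    for dy in (0, half):
        for dx in (0, half):
            c, blocks = best_split(x + dx, y + dy, half, min_size, cost, lam)
            split_cost += c
            split_blocks += blocks
    # Keep the larger block on ties; split only if it is strictly cheaper.
    if split_cost < whole:
        return split_cost, split_blocks
    return whole, [(x, y, size)]

def toy_cost(x, y, size):
    """Made-up cost: one detail pixel at (5, 5) that only 4x4 blocks code well."""
    covers_detail = x <= 5 < x + size and y <= 5 < y + size
    distortion = 100 if covers_detail and size > 4 else 0
    rate = 10  # flat per-block rate, purely for illustration
    return distortion, rate

total, blocks = best_split(0, 0, 32, 4, toy_cost, lam=1.0)
```

In this toy case the search splits only along the path that contains the detail pixel and leaves the rest of the 32x32 superblock in large blocks. The design point is that each subtree is decided once and its result reused by the parent, which is valid only under the independence assumption stated above.
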