DaalaMeeting20140909

From XiphWiki
Revision as of 05:57, 16 September 2014 by Tomvanbraeckel (talk | contribs) (Unscrew the formatting)
# Meeting 2014-09-09

Mumble:  mf4.xiph.org:64738

# Agenda

- reviews
- deringing
- intraprediction
- non-photographic video

# Attending
- unlord, tmatth, smarter, jack


# Reviews

https://review.xiph.org/455/
https://review.xiph.org/456/

# deringing

- jm: I've continued on using paint as deringing. I have results that are slightly cheating but getting better. http://jmvalin.ca/video/dering/ What this is doing is computing a paint image and, from the coded image, figuring out (from variances and stuff) the gain that should apply to the deringing, i.e., how much of the paint image we should use vs. how much of the decoded, pre-paint image we should use. It's doing a reasonable job but it's not perfect. On top of this I'm adding a single flag per block which disables the filter entirely for the block (gain of 0). I don't have crossfade, but I didn't notice much of an issue with blocking artifacts. I didn't generate images because I wasn't quite sure what the best thing to generate was.
- d: How about Clovis?
- x: Good torture test.
- jm:
off: http://jmvalin.ca/video/dering/clovis0.png
on: http://jmvalin.ca/video/dering/clovis.png
 We can see easily. It's at least helping on the edges of the dunes. The color edges are just chroma contrast so they don't show up. I'm not sure yet how to handle chroma. There are many ways to handle it. One way is through CfL, i.e., not apply deringing to chroma at all, but just on luma.
- d: That means you have to delay chroma until a whole line of superblocks has been done. That is specifically what the HW people didn't like about CfL. We have an advantage because of doing it in the frequency domain.
- jm: Actually more than that because of the filter. Two superblock rows.
- jm: Those are -v60. Output is 18k. This is much below what I normally used for testing.
- d: I notice it does a fairly good job on the big balloon in the center on the upper left edge. But on the right edge seems like it does nothing.
- jm: It can't do much for features that are horizontal or vertical because the 2d dct is already a directional basis and its artifacts will be aligned with the direction of the image. If you look just below the bulge where its 45 degrees you see a huge difference.
- d: It seems bad that it can't do anything about horizontal or vertical.
- jm: The ringing is worst on the diagonals usually.
- d: The big question is "how slow is this?"
- jm: The normal encoder would take 0.4 seconds, and deringing takes a little under 3s. The vast majority is spent in the direction search.
- j: 6x slower!
- d: To be clear, just for this feature he is using 150x the decoder budget!
- d: Does the operation scale linearly with blocksize?
- jm: 16x16 is at most 2x as complex as 8x8 but it depends on what directional resolution we want. So per-pixel it's the same cost. The number of ops per pixel is on the order of 50. Assuming we don't do any downscaling before we find the direction.
- d: That's not completely unreasonable so why is it 150x slower?
- jm: I'm doing optimal interpolation weights around all four edges etc. If you look at intrapaint.c there is a pixel_interp function and that takes most of the cost. You call this for every pixel and it uses square roots etc. The numbers I gave you are too optimistic. I forgot computing the distortion. It may be closer to 100 or 150.
- d: ekr is trying to hook us up with someone who's done speed optimizations for openssl, etc. Would it make sense for you to talk to him for trying to make this function super fast?
- jm: Yeah. Unlike openssl, I'm pretty flexible on the outputs.
- d: If all you want is a direction, why not just take an edge detection algorithm and run an arctan?
- jm: I'm not a video guy so I've never run an edge detector. Actually that wouldn't work. I am not necessarily looking at edges, I'm looking at stuff like low variance in one direction and low variance in another. Like the tail of the horse.
- d: I'm talking about running a gradient operator. If one direction is dominant you should be able to find it. At lower rate you need a larger support for your gradient operator. We need to figure out if it is possible to make it sufficiently fast. Are you interested Monty?
- x: I think it is doomed. I'm not convinced it is generally useful. It does give a pleasing effect at low bitrates. I don't see what it buys Daala; it's not codec specific.
- jm: Right now it's about the only thing we have to deal with ringing since we don't have intra.
- d: Monty to be clear, this is not a post processor. It's in-loop.
- x: Oh, I missed that. This is interesting, but we have other problems to ship a codec and I don't see how this solves our problems.
- d: Right now directional edges and features are not captured well since we haven't figured out intra.
- x: But we've also failed to make deringing from other codecs work. Figuring that out seems like a much more interesting problem.
- jm: I don't think the 265 stuff can get rid of the ringing we get at low bitrates. It just changes individual samples.
- x: Won't do that in 265 either right?
- jm: No, but 265 has intra prediction!
- j: Sounds like you want to spend more time on intra?
- x: Yes.
- d: Nathan is working on this. Sounds like you should move on to the next thing in your pipeline.
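
Note on the gradient-operator suggestion above: the following is a minimal sketch of the kind of thing d describes, not the direction search in intrapaint.c. It assumes central-difference gradients accumulated into a 2x2 structure tensor, followed by a single arctan per block; the function name `dominant_direction` and the coherence measure are illustrative choices, not anything from Daala. Whether such an estimate is good enough for paint, especially at low rates where a larger gradient support would be needed, is exactly the question left open in the discussion.

```python
import math


def dominant_direction(block):
    """Return (angle_degrees, coherence) for a 2-D list of luma samples.

    angle_degrees is in [0, 180), measured from the +x axis with y pointing
    down, and is the direction along which the block varies least.
    coherence is in [0, 1]; values near 0 mean no single direction dominates.
    """
    height, width = len(block), len(block[0])
    jxx = jyy = jxy = 0.0
    # Accumulate the structure tensor from central-difference gradients,
    # skipping the one-sample border where the difference is undefined.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (block[y][x + 1] - block[y][x - 1]) / 2.0
            gy = (block[y + 1][x] - block[y - 1][x]) / 2.0
            jxx += gx * gx
            jyy += gy * gy
            jxy += gx * gy
    trace = jxx + jyy
    if trace == 0.0:
        # Perfectly flat block: no direction to report.
        return 0.0, 0.0
    # Orientation of the strongest gradient, found with one arctan.
    grad_angle = 0.5 * math.atan2(2.0 * jxy, jxx - jyy)
    # Features run perpendicular to the gradient.
    angle = (math.degrees(grad_angle) + 90.0) % 180.0
    coherence = math.hypot(jxx - jyy, 2.0 * jxy) / trace
    return angle, coherence


# Example: a diagonal ramp, constant along lines where x + y is fixed.
ramp = [[x + y for x in range(8)] for y in range(8)]
angle, coherence = dominant_direction(ramp)
print(angle, coherence)
```

The per-pixel cost here is a handful of multiply-adds, with the arctan and square root paid once per block. A low coherence value could serve as a cheap signal that the block has no usable direction.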

# intraprediction

- u: I have the still image test bed and it produces RD curves now. I'll post some of the curves. I don't think that AC coeffs from neighbors is a reasonable approach. It requires too large a support.
- d: Especially for shallow angles.
- u: At 45 you can go one block away, but at others you have to go 2 away. I'll post curves later today. I can't point you at the code Monty. It's super simple.
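
Note on the support argument above: a rough geometric sketch of why shallow angles need more neighbors. It assumes a model that is not stated in the meeting: a directional predictor for an N x N block follows a straight line back to the row above the block, so the horizontal reach is N / tan(angle). The function name `neighbor_blocks_needed` is hypothetical.

```python
import math


def neighbor_blocks_needed(angle_deg, block_size=8):
    """Number of block widths a prediction line reaches sideways.

    angle_deg is the prediction angle measured from the horizontal,
    restricted to (0, 90]. Assumed model: the line starts at the bottom
    row of the block and must hit the row above the block.
    """
    if not 0.0 < angle_deg <= 90.0:
        raise ValueError("angle_deg must be in (0, 90]")
    reach = block_size / math.tan(math.radians(angle_deg))
    # The small epsilon absorbs floating-point error at exact multiples.
    return max(1, math.ceil(reach / block_size - 1e-9))


for angle in (45, 30, 20):
    print(angle, neighbor_blocks_needed(angle))
```

Under this model 45 degrees stays within one block width, while anything shallower than that reaches into a second block or beyond, which is consistent with the remark that shallow angles are the expensive case.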

# non-photographic video

- jm: I think this is very useful and we should start planning for it.
- d: I was not going to worry about it until next year.
- jm: This has the potential to be a killer feature.
- x: That all by itself would get RedHat's attention.
- j: I brought this up as a differentiator but Tim didn't seem super supportive.
- x: It's hard.
- d: I don't think it's that hard.
- jm: There is a big piece that is missing from what I did for paint. Being able to encode antialiased text.
- d: Anything you can imagine you can do in the spatial domain and then transform it into the frequency domain.
- jm: Why would you want to do that?
- d: Because that way you don't have to worry about mode switching and all this other stuff. It makes a bunch of stuff simpler.
- jm: If you get that to work, then you get paint to work.
- u: Didn't we talk about this with Greg in Orlando?
- d: I don't remember this discussion at all. I remember the Thai food, but not the discussion. Go ask Greg if he remembers.
- jm: This would be even worse than paint. How do you disable it? Having no safe fallback... On top of that, if you code something, you're going to ring all over the place. If you ever code a DCT on top of the first step then it rings all over the place. In every piece that is non-photographic you can't code a dct on top of a basis function anyway. Attempting to code the whole thing with DCT and then using beefed up version of deringing to fix things up, ensuring the background for the text is flat, etc, you would already have wasted tons of bits on other things. I don't see a 2 pass version really working.
- d: Try to think of something that does work and in conjunction with motion compensation.
- jm: A separate codec that you switch to on a block-by-block basis is the kind of thing I'm thinking of. In the desktop case, the motion will be flat. Either you're moving not at all or large areas at the same speed.
- j: With all the animations window managers have, that kind of motion doesn't seem realistic.
- jm: The text is really the thing to focus on.
- d: The particular tool you use is not important. How it integrates into the codec is the hardest part of this problem.