DaalaMeeting20141118

From XiphWiki
Revision as of 09:04, 25 November 2014 by Jack
# Meeting 2014-11-18

Mumble:  mf4.xiph.org:64738

# Agenda

- reviews
- avx intrinsics
- directional transforms
- Doing non-intra work (when?)

# Attending

unlord, derf, jack, jmspeex, td-linux

# Reviews

- jm: Nathan, are you done with 516? Or is there more stuff to check?
- u: Looks good. I'm going to try to land it and see if there was any other piece we could remove. I can try to rebase on that version you gave me and see what that does.
- jm: I'd like to land it as an actual merge commit and not have to fix a rebase later. I have a merge ready locally, shall I commit?
- u: Sounds good.

# AVX Intrinsics

- d: How's that coming?
- td: I have one mostly done.
- d: Which one?
- td: AVX2 is the one I did first.

# Unlapping

- d: Nathan, are you planning to continue on unlapping?
- u: Yes, just updating status now to add that. Will probably work on that before the other things.

# Directional Transforms

- x: Filters didn't take the form I expected. Ran a bunch of tests at IETF. What I have now is that instead of just the first row, I use the first row and column. Those behave how I expect. It is not the case with the correct objective I found in ABQ. They converge to work best with one specific lapping. If I train one to work with a few lappings at once, it splits the difference. I'll verify today that that is what is actually happening.
- jm: How would you handle BSS?
- x: I don't know that I would. When I say splitting the difference, the filters aren't all that different. Is that inherent to the filter design? If it is, are they useful anyway? The resulting behavior may be good enough. I'm beginning to feel like they are believable but I'm not sure they are good enough. It's a little better than half the return we were wanting. I was going to stick it in the codec and plot what gain it gets for different directional blocks. I want to run that last test to see if the return is sort of like what Nathan got and what the other directional transform people got. It looks like the additional complexity is fairly high.
- d: The stuff people get with other directional transforms is not a peak at each direction. It's a peak at 45 degrees and that's it.
- x: What I saw in Nathan's stuff was kind of jaggy. The big peak was at 45 and smaller peaks at different directions.
- u: There are two curves. Jaggy on lapped is because it's just crappy.
- x: The tests I had intended to run I still plan to run and compare to Nathan's results. If it's the same, there is no point to fixing any of this.
- d: Did you look at stability at all?
- x: That looks quite good. In ABQ it was abysmal. I now have an objective that inherently enforces stability. As soon as I was not magnifying error amplitude, the training started moving energy into the right places. When you inject noise into the original transform or the quantized coeffs, the noise you get out is of a similar magnitude. Unsurprisingly, the noise has the same directional basis nature as the direction of the trained filter.
- d: I'm not surprised you had to go to a row and column.
- x: I was surprised I had to go to a full row and column. At least in the naive stuff, it wants to have full frequency representation in the row and column. I can cut it down, but the behavior becomes less desirable.
- jm: Setting slightly less ambitious goals, is there anything that we can salvage from all this? What if we use this to predict the diagonal bands? We code the first 15 coeffs, horizontal, and vertical, and then use everything we have to predict the diagonal 8x8 frequencies. I understood what you are doing is essentially a prediction.
- x: I was trying to do that. I can train a predictor and it works, but the form I was using was not yielding a perfect predictor.
- jm: What are you doing now that is not a prediction?
- x: I don't have a prediction-only filter that is capable of representing the lapping.
- jm: So this is equivalent to an IIR filter?
- x: If it was pure prediction you would be able to predict coeffs from the ones you already have.
- jm: I don't understand why it's not a prediction.
- x: You decode the first row and column, and you predict everything else. If it is a pure edge in that direction, it will predict perfectly.
- jm: If that's the case, then with the current code, where we're coding all horizontal and all vertical, you should be able to use that to predict the diagonal.
- x: There is a prefilter that effects a modification to the transform itself. The first row and column you use for input is not the same as the one you'd normally use.
- u: Jean-Marc, you're suggesting this is still signal-free, right? So we could train something like this for the lapping on the four edges.
- jm: The other part of my idea is that you can use horizontal and vertical to figure out the direction without having to signal it. Which means that if it doesn't work at all then we are no worse than we were.
- x: There's an even easier test to do first. Drop it in the way it is now. I suppose I could do it with the partitions you have now (they are overkill). I was suggesting taking the intragain tool and running it and comparing it against the others so I can compare with Nathan's PCA graphs.
- jm: Intragains are completely meaningless. Nathan was testing on synthetic images, which was even worse.
- x: I don't think that is actually true. On the other hand, we've already tried the other techniques. If this looks like them then there may not be any point. But the visual qualities given by the directional transforms are quite good.
- x: I'm using more information than is necessary to represent all the degrees of freedom. I need to check if the different lappings are splitting the difference in the training. I also wanted to see what tolerance the training had for leeway in the directions. Because I am coding more DoF now, it seemed possible I might be able to generate filters that could detune the filters. Or it might be possible to code a filter that might handle more than one direction. The real goal is to do something similar to what Nathan did. I do want to see the CG. I agree the CG is not an accurate measure of what this tool is worth; however, if the CG is much worse than Nathan's, that could mean something. Since adding the DoF I have not looked to see if there were more techniques I could use to get a better pure predictor.
- d: Of all that stuff, the most important to me is hearing about the filter form and why you are using it.
- x: I was simply using a big linear matrix to predict all the diagonals. What was happening is that I'd get a predictor, but it didn't handle the lapping reflections. If the physical edge was in the block I could get a perfect fit. If the edge was just outside the block, it wouldn't fit that. Or rather I could fit one or the other but not both. The next filter form I tried added another layer of lifts. That layer was capable of moving energy out of the space I was trying to predict to the space I was predicting from. As soon as I did that, I was able to predict both things. The problem with that is that I was getting error in the things I was coding. The quantization error was very bad. Then I ran the inverse prefilter after the post filter. I got near unity scaling too. They still aren't even, but they aren't way out there.
- d: That is too high level for me. I need to see the DoF.
- x: I'll make some diagrams and scan them in.
- d: I wonder what would happen if we could do some of this by taking into account the band structure we already have.
- x: It's not too far off.
- d: The first 15 coeffs is the lower 4x4 quadrant.
- x: I'm sure I could train something to use just those.
- d: The other thought I had is that if we have to code all of those, perhaps there is enough information there that we can highly bias the probability for the mode, or not code one. If we can code the first three bands, there may be enough information that the decoder itself can do the search for the direction.
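
The stability check described in the directional transforms discussion (inject noise into the coefficients and compare the magnitude of the noise that comes out) can be sketched roughly as below. This is only an illustration under stated assumptions: the inverse transform is modeled as a plain matrix, and the function names (`apply_matrix`, `noise_gain`) and the two example matrices are hypothetical, not taken from the Daala code or from the trained filters discussed in the meeting.

```python
import math
import random


def apply_matrix(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def noise_gain(inverse, trials=2000, sigma=1.0, seed=0):
    """Estimate RMS(output noise) / RMS(input noise) for a linear inverse.

    Gaussian noise is injected in the coefficient domain and pushed
    through the (hypothetical) inverse transform. A gain near 1 means
    coefficient error is not magnified on reconstruction.
    """
    rng = random.Random(seed)
    n = len(inverse[0])
    in_energy = 0.0
    out_energy = 0.0
    for _ in range(trials):
        noise = [rng.gauss(0.0, sigma) for _ in range(n)]
        out = apply_matrix(inverse, noise)
        in_energy += sum(v * v for v in noise)
        out_energy += sum(v * v for v in out)
    return math.sqrt(out_energy / in_energy)


# Hypothetical example 1: an orthonormal 2-point transform (unit gain).
c = math.sqrt(0.5)
orthonormal = [[c, c], [c, -c]]

# Hypothetical example 2: an inverse with a large off-diagonal term,
# which magnifies noise in the second coefficient.
unstable = [[1.0, 10.0], [0.0, 1.0]]

if __name__ == "__main__":
    print("orthonormal gain:", noise_gain(orthonormal))
    print("unstable gain:", noise_gain(unstable))
```

A gain close to 1 corresponds to the "noise you get out is of a similar magnitude" behavior reported above, while a gain well above 1 corresponds to magnifying error amplitude.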