DaalaMeeting20140513

# Meeting 2014-05-13

Mumble:  mf4.xiph.org:64738

# Agenda

- reviews
- Confirm dates for work week and coding party
- VQ
- Unlapping

# Attending

# Confirm dates

- jack: we'll try to get paperwork started for workweek 6/6-6/13
- jmspeex: code party moved to 8/18-8/22
- T-shirt sizes https://daala.etherpad.mozilla.org/daala-shirts

# VQ

- jmspeex: what i realized this week is that of the large gain we had, about 2/3rds could be explained by removing intra prediction. in your slides there was a large gain between current daala and vq, and 2/3rds of it was due to disabling intra whether or not you used VQ. the rest was divided in three between some assumption i didn't know about in the entropy coder, a symbol that i sometimes forgot to code, and actual real gain from vq. if i correct for all that, vq is now 1% better in metrics and gets the edges slightly better. i'm no longer sure we really want to go that way.
- derf: what are we going to do instead?
- jack: i thought these images were amazing? what happened?
- jmspeex: the intra was hurting us, but disabling that is not to the credit of vq. the images are still slightly better despite the metrics being very close.
- derf: it's an awful lot of complexity for 1%.
- jm: exactly. i don't know how the actual quality relates (aside from raw metrics). once i saw 1% i figured it wasn't worth all the complexity. i still haven't tried linking HF VQ to LF VQ.
- d: are you still planning to do that?
- jm: eventually. not in the short term. if you want to play with it I have a decoder patch for VQ now.
- d: can you put that on review?
- jm: sure. i have about 10 codebooks that each give slightly different results.

(some discussion of codebooks)

- jm: the first version i had: in the bands where vq was active, you could either encode the vq and then an angle, or code noref and use PVQ. the second path is now gone. if you're a band at 15 that has no prediction, then you are using the vq; you can't use noref. it removes some redundancy.
- g: your prediction being bad is probably an issue for the intrapredictors. you only have a limited degree of freedom in choosing them, not 7 bits of choice.
- jm: if your prediction is bad, you have a problem here with the vq. you have a sign wrong or something.
- d: do we only use noref if the angle is > pi/2?
- jm: not only. right now i disable it whenever the correlation is < 0.5 (see the sketch below), but that could be changed. basically at pi/2 you want noref, because theta can't go to pi/2.
- d: it seems like that decision needs to be RD'ed.
- jm: it is. the vq seems to produce cleaner edges in clovis fest, but i'm wondering if there is another way to get the same results. maybe some kind of deringing filter.
- d: i had it in mind that that could be one of our intern projects.
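As an aside on the noref gating discussed above: in PVQ with prediction, the input is coded as a gain times (cos(theta) times the reference direction plus sin(theta) times a residual), so cos(theta) is exactly the normalized correlation between the band and its reference, and at theta = pi/2 the prediction contributes nothing. A minimal sketch of the < 0.5 gate (hypothetical helper, not the actual Daala code):

```python
import numpy as np

def use_prediction(x, r, min_corr=0.5):
    """Gate the predicted (theta) path on normalized correlation.

    x: input band coefficients; r: reference (prediction) band.
    cos(theta) between x and r is their normalized correlation; when it
    drops below min_corr (0.5 in the discussion), fall back to noref.
    """
    corr = np.dot(x, r) / (np.linalg.norm(x) * np.linalg.norm(r) + 1e-15)
    return corr >= min_corr

# a reference pointing the wrong way (theta > pi/2) forces noref
x = np.array([1.0, 2.0, 3.0])
print(use_prediction(x, x + 0.1))  # True: strong correlation, code theta
print(use_prediction(x, -x))       # False: cos(theta) = -1, use noref
```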

# Interns

- j: who will mentor the interns?
- d: probably greg and i since we're local
- j: are you fine with that?
- d & g: yep

(switched to talking about possible intra projects for interns)

- jm: before trying to train 16x16, which i think is doomed (way too hard, and we don't code edges at 16x16), how about we get 4x4 to work with TF? just getting 4x4 to have decent prediction would be a huge gain. let's forget about 16x16 and 8x8 for now. it's no use saying we're going to try and do nice edges at 16x16 when we won't select that size and 4x4 isn't working well.
- d: along those lines, nathan has unlapping stuff going
- u: i was able to solve for the horizontal predictors. i don't have a number on that, but i was able to run the matlab solver and get something that was linear. the next step is to run an experiment to find out how that works by replacing the horizontal mode with this instead.
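As a rough illustration of the kind of linear solve described above (data and dimensions hypothetical, not the actual Matlab setup): collect input/output pixel pairs from training blocks and solve for the predictor weights in the least-squares sense.

```python
import numpy as np

# hypothetical training data: each row of A holds the boundary pixels a
# horizontal predictor sees; each row of B holds the pixels to predict
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 8))                        # 8 inputs per example
true_w = rng.standard_normal((8, 16))
B = A @ true_w + 0.01 * rng.standard_normal((1000, 16))   # 16 outputs + noise

# least-squares solve for a linear predictor W such that B ~ A @ W
W, residuals, rank, _ = np.linalg.lstsq(A, B, rcond=None)
print(np.allclose(W, true_w, atol=0.01))  # recovers the linear map
```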
- jm: back to training, i think doing 4x4 with images i flagged as being more edge-y could probably help.
- d: for 4x4 i think that makes sense
- u: would you use the flagged images' 4x4 blocks for training?
- jm: i picked only blocks that were classified as 4x4, and then filtered them by the gain
- u: are you suggesting I look at only those 4x4 blocks with a certain gain?
- jm: don't even look at the gain. if you have a landscape with texture some blocks are bound to be selected as 4x4. these images are ones i flagged as having nice clean edges. the ones where edges were important to the image.
- u: what you're suggesting is to remove noise from training, but a synthetic covariance matrix could do the same thing.
- g: ???
- d: right now we're not getting dc scaling right when the block sizes change, which is why i wanted to look at unlapping instead of ratholing on training some more.
- jm: construct an image with wide vertical lines, pick blocks that are just white or just black, and compute the covariance matrix.
- g: that won't get you the same covariance matrix. you'll still be correlated in the orthogonal direction with other edges. we do least squares directly from the covariance matrix, so we can set its entries directly and we know what it would look like.
- u: when we do training we build a giant covariance matrix of how pixels are related to each other.
- d: you could make a covariance matrix where you are highly correlated along one angle and have a very low correlation in the orthogonal direction.
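A sketch of the synthetic-covariance idea being debated (all parameters hypothetical): build a covariance over a patch where correlation decays slowly along a chosen edge direction and quickly across it, then read the least-squares predictor directly off its blocks as W = C_yx C_xx^{-1}, matching g's point that least squares is done directly from the covariance matrix.

```python
import numpy as np

def directional_cov(n, angle, rho_along=0.99, rho_across=0.5):
    """Covariance over an n x n patch: correlation decays with distance,
    slowly along `angle` (the edge direction) and fast across it."""
    ys, xs = np.mgrid[0:n, 0:n]
    pts = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    d = pts[:, None, :] - pts[None, :, :]            # pairwise offsets
    u = np.array([np.sin(angle), np.cos(angle)])     # along the edge
    v = np.array([-u[1], u[0]])                      # across the edge
    return rho_along ** np.abs(d @ u) * rho_across ** np.abs(d @ v)

n = 8
C = directional_cov(n, angle=np.pi / 4)
# least-squares prediction of the last row of the patch from the rest,
# taken directly from covariance blocks: W = C_yx @ inv(C_xx)
idx_x = np.arange(n * (n - 1))                 # context pixels
idx_y = np.arange(n * (n - 1), n * n)          # pixels to predict
W = C[np.ix_(idx_y, idx_x)] @ np.linalg.inv(C[np.ix_(idx_x, idx_x)])
print(W.shape)  # (8, 56): one weight per (predicted, context) pixel pair
```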
- jm: the vq thing i've been working on can serve as the bar that intraprediction has to pass.
- d: definitely. i'm just sad that the bar doesn't get the rate lower.
- jm: next thing i want to look at is trying to encode dc at the 32x32 level and working our way down from that, as we discussed in the work week. i have ideas for a few variants to try there. two ideas i had: one was doing the walsh-hadamard on the DC and stopping when there is no DC because of the block size, which would be sort of equivalent to a WHT (sketched below).
- d: what do you mean, sort of equivalent?
- jm: Well, if everything was 4x4 then it would be a real WHT, but if some blocks are not split all the way down to 4x4 then you can't apply all of the stages to them.
- d: that's sort of a standard thing that gets done.
- jm: one possible drawback of that is that if the block is almost flat but there is some 4x4 in there, the DC can be offset. i.e., if the entire superblock is white except for a corner, you don't want to average. so i was thinking to average per block size: if there is a corner with 8x8 and 4x4, the average of the 8x8s would be coded as the differential from the other average. you're always coding a difference.
- d: that sounds really hard for the encoder to search. you're using block size decisions as input
- jm: Well, we already assume you know the block sizes
- d: you make a bunch of assumptions about independence that aren't really true.
- jm: you can't take averages until you select blocksize
- d: it's an interesting idea and is definitely not how things are done normally
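A toy sketch of the partial-WHT variant jm described above (quadtree represented as nested Python lists; bitstream and normalization details glossed over): each split applies one 2x2 Walsh-Hadamard stage to its children's DCs, and unsplit blocks simply skip the missing stages, which is also where the DC scaling mismatch across block sizes comes from.

```python
def dc_tree_encode(node):
    """Combine child DCs with 2x2 Walsh-Hadamard stages, skipping
    stages wherever the quadtree was not split further.

    A node is a float (leaf DC) or a list of four child nodes.
    Returns (combined_dc, details); the details would be entropy coded.
    Note the combined DC is scaled by 2 per stage, so leaves at
    different depths end up at different scales.
    """
    if isinstance(node, float):
        return node, []                        # unsplit block: no stage here
    (a, d1), (b, d2), (c, d3), (d, d4) = (dc_tree_encode(ch) for ch in node)
    details = d1 + d2 + d3 + d4
    # one orthonormal 2x2 WHT stage: average up, three details coded down
    details += [(a - b + c - d) / 2.0,
                (a + b - c - d) / 2.0,
                (a - b - c + d) / 2.0]
    return (a + b + c + d) / 2.0, details

# a superblock whose first quadrant was split one level further
tree = [[100.0, 101.0, 99.0, 100.0], 100.0, 100.0, 100.0]
print(dc_tree_encode(tree))
```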
- g: have you looked at the data to see how the DC is related when the sizes are different?
- jm: i've looked at scale factors
- g: not scale factors, distributions
- jm: i'm not sure what you mean
- g: if we know the sizes of the blocks, what info have we learned about how the DCs are related, after scaling correction?
- jm: if you are 4x4 there is likely to be something going on
- d: just as a thought experiment: a 32x32 superblock with some edge in the middle, and you split down to 4x4 along that edge. pretend it's 64x64 (32, 16, 8, 4, 8, 16, 32). if the dc is piecewise continuous, then for the 4x4 stuff in the middle you'd expect things to be in between. i'm trying to figure out how your model would account for that.
https://pastebin.mozilla.org/5148426
- jm: what i described would work when it's mostly flat except for something going on somewhere.
- jm: Consider a wire running across the sky.
- jm: back to the coding party and work week.