DaalaMeeting20140211

# Meeting 2014-02-11

Mumble:  mf4.xiph.org:64738

# Agenda

- review review (11 open issues)
- current issues
- activity masking
- project plan updates

# Attending

jmspeex, jack, unlord, gmaxwell, derf

# review and current codec results

- d: i'm rebasing 32x32.
- g: it didn't work even with pre-filters turned off.
- d: ok. i can look at that.
- d: nathan are you reviewing greg's det1 training patch?
- u: yes
- jm: what's the status with the scaling patch?
- u: i sent an updated one but you found some rounding thing that i'm looking at now.
- jm: there were two things. one was removing the scaling for lossless and the other one was shifting wrong.
- u: we also talked about using FP on the commandline to specify quantization.
- d: we want to map integers on the commandline. we have to code these in the bitstream.
- jm: i don't think it matters in the short term
- d: sure, but it's not hard, let's just do it correctly. the thing you want is to store them in log space. this is the way we should be testing too.
- u: i proposed some mapping where 1 would go to 1 + smallest increment
- d: we don't want the spacing to be linear. there's no reason it needs to be.
- u: what would we like it to be?
- d: logarithmic.
- g: because that will be linear in rate
- u: how do you pick what that space looks like?
- d: you say it doubles every N number of steps
- jm: as soon as we introduce activity masking it will be a bit more interesting
- g: the status of activity masking is that it's producing much worse performance on SSIM than not having it. that's the opposite of what we want.
- jm: can you send me whatever intermediate patch you have so I can look?
- g: sure.
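
The log-space mapping discussed above (an integer on the commandline and in the bitstream, with the quantizer doubling every N steps) can be sketched as follows. This is only an illustration: `BASE_QUANT`, `STEPS_PER_DOUBLING` and both function names are hypothetical and do not come from the Daala source.

```python
import math

# Hypothetical parameters, not taken from Daala: the smallest quantizer
# step size and the number of index steps per doubling (the "N" above).
BASE_QUANT = 1.0
STEPS_PER_DOUBLING = 8


def index_to_quantizer(index: int) -> float:
    """Map an integer index to a quantizer step size.

    The step size doubles every STEPS_PER_DOUBLING indices, so the
    indices are evenly spaced in log space rather than linearly.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    return BASE_QUANT * 2.0 ** (index / STEPS_PER_DOUBLING)


def quantizer_to_index(quantizer: float) -> int:
    """Return the nearest integer index for a requested step size."""
    if quantizer < BASE_QUANT:
        raise ValueError("quantizer is below the smallest step size")
    return round(STEPS_PER_DOUBLING * math.log2(quantizer / BASE_QUANT))


if __name__ == "__main__":
    # Print a few indices and the step sizes they map to.
    for i in range(0, 25, 4):
        print(i, round(index_to_quantizer(i), 3))
```

Each index step multiplies the quantizer by the same ratio, which is the property that makes the spacing roughly linear in rate as g points out.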

# current problems

- jm: i've been doing experiments with everything off and 8x8 only and 16x16 only and it's worse with 16x16. i'm looking at psnr vs. rate, no BSS, no nothing. pure coding gain.
- u: shouldn't we expect 16x16 to be worse than 8x8 in certain places?
- jm: yes, but overall it should do better.
- u: because we're measuring it on AR95 and some images aren't AR95
- jm: have we measured the 2d coding gain?
- d: yes. that's how the current filters were trained. 2d coding gain on actual images.
- jm: i need to do more experiments to make sure it's not the entropy coding screwing up. that looked pretty fishy. adding intra meant that 8x8 was clearly better. there's no way to get better with 16x16 because it's much worse. i think as soon as you turn on lapping filters, 8x8 is already better than 16x16. and once everything is on, the gap is too big to make up.
- u: you were just running transform + prefilters and scalar?
- jm: only scalar. no intra. even tried without prefilter.
- d: in my status report, i said i would be working on putting a quantization matrix in
- jm: if you do that, can you make sure to split basis function magnitude and the quant matrix? we'll want this to work for pvq
- d: i planned to do exactly that.
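
For reference, the theoretical coding gain mentioned above can be computed for an AR(1) source with correlation 0.95 ("AR95"). The sketch below is a simplified 1D illustration using a plain orthonormal DCT with no lapping filters. It is not the 2D measurement on actual images that the filters were trained with, and the function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis, returned as a list of n rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def ar1_covariance(n, rho):
    """Covariance matrix of a unit-variance AR(1) process."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def coding_gain_db(n, rho=0.95):
    """Coding gain in dB: arithmetic mean of the transform coefficient
    variances divided by their geometric mean."""
    basis = dct_matrix(n)
    cov = ar1_covariance(n, rho)
    total = 0.0
    log_sum = 0.0
    for row in basis:
        # Variance of this coefficient: row * cov * row^T.
        cov_row = [sum(cov[i][j] * row[j] for j in range(n))
                   for i in range(n)]
        var = sum(row[i] * cov_row[i] for i in range(n))
        total += var
        log_sum += math.log10(var)
    return 10.0 * (math.log10(total / n) - log_sum / n)


if __name__ == "__main__":
    for size in (8, 16):
        print(size, round(coding_gain_db(size), 4))
```

On this idealized source the 16-point transform has the higher gain, which matches the expectation that 16x16 should do better overall. Real images that are not AR95 can behave differently, as u notes.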

# activity masking

- jm: i've been investigating what to do close to zero. whether we should code a single pulse or adding noise, etc.
- d: i'm hoping you could work with monty to see if you both could come up with something.
- jm: i think the first step would be to get 16x16 working.
- d: i agree
- jm: the 8x8 seems to not be too bad. before the scaling patch the intra on 8x8 made things worse, but now it's helping a bit.
- d: how much is it helping?
- jm: at high rate it's not helping at all. at low rate it seems to be improving distortion a bit, by maybe 0.2dB. it doesn't change rate.
- d: it should be helping more than that.
- jm: prefilter definitely helps for 8x8, mostly on distortion. it gives us about 2dB at low rate. assuming i didn't screw up these experiments. hold on, this is fastssim, which is crap. intra makes a tiny difference at high rate and at low rate it's 0.1dB.
- d: that doesn't sound like what we were expecting. 
- jm: PSNR is on the order of a dB for the filter.
- d: i think intra should be bigger than the lapping filter. one thing we never looked at is the mode decision stuff. that isn't adapted to rate at all is it?
- jm: no
- g: we've known that the mode decision stuff was broken. a year ago i inverted it and got better results.
- u: that might account for these things not showing improvement.
- d: maybe i will take a look at that also.
- jm: disabling RDO in the mode decision and not use curves that don't account for the rate of the mode decisions?
- d: that would certainly give you a best case.
- jm: the weighting we've known was wrong. i believe i came up with a proof of why it was wrong, but i don't remember the details. the weighting of the distortion metric for intra mode selection.
- u: is that like RDO?
- d: except that it doesn't consider rate
- u: other than the transform scaling, what would you weight SATD by?
- jm: we're trying to improve the entropy not just the final distortion.
- u: where are these weights coming from?
- jm: we're going to code a residual after that. if you're reducing the error on DC by a small amount it won't change the rate too much because coding one extra pulse in DC is cheap. if you're reducing HF then improving that can help the rate more. on the other hand if you weren't going to code it it doesn't help you much.
- u: is this pvq only?
- jm: no it affects scalar as well. i think this is what was wrong because there's a point at which you're not going to code anything and at that point the weighting is completely useless.
- u: we're looking at the residual after prediction... you're going to bias SATD by position of coeff?
- jm: essentially yes
- u: aren't we coding these with same prob in scalar?
- jm: absolutely not. the prob for close to DC is a lot higher. the first few AC coeff. the last one will have a small variance and encoding a 1 there is going to be really expensive. for small values, we're actually interested in distortion because we're not going to be coding anything. it should grow as the square of the value. for large values it should grow as a weight times the absolute value. we ended up in the square region more often than not, which is why the reverse scaling worked better. at least in terms of fixing the metric, it's probably something i should have a look at. i at least have some delusion that i understand what's going on.
- d: normal codecs pick a lambda and optimize for d + lambda*r. i'll probably try that on thursday.
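
The two ideas at the end of this discussion can be sketched in a few lines: a per-coefficient cost that grows as the square for small values and as a weight times the absolute value for large values, and a mode choice that minimizes d + lambda*r. This is a minimal sketch under stated assumptions: the crossover point is assumed to sit where the two curves meet (at |value| equal to the weight), and all function names, weights and candidate numbers are hypothetical rather than taken from Daala.

```python
def coefficient_cost(value, weight):
    """Cost of one residual coefficient.

    Small values are treated as pure distortion (value squared), since
    nothing would be coded for them. Large values cost weight * |value|.
    Assumption: the two regions meet at |value| == weight, which keeps
    the cost continuous.
    """
    a = abs(value)
    return a * a if a < weight else weight * a


def residual_cost(residual, weights):
    """Sum the per-coefficient cost with position-dependent weights."""
    if len(residual) != len(weights):
        raise ValueError("residual and weights must have the same length")
    return sum(coefficient_cost(v, w) for v, w in zip(residual, weights))


def pick_mode(candidates, lam):
    """Pick the candidate minimizing distortion + lam * rate.

    candidates is a list of (name, distortion, rate_bits) tuples.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]


if __name__ == "__main__":
    # Made-up numbers: a larger lambda favors the cheaper mode.
    modes = [("mode_a", 100.0, 2.0), ("mode_b", 80.0, 10.0)]
    print(pick_mode(modes, 1.0))
    print(pick_mode(modes, 5.0))
```

With a cost of this shape, coefficients in the square region contribute distortion only, which is consistent with jm's remark that the weighting is useless once nothing is going to be coded.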