DaalaMeeting20140722: Difference between revisions


Latest revision as of 16:01, 6 February 2015

# Meeting 2014-07-22

Mumble:  mf4.xiph.org:64738

# Agenda
- reviews
- coding party
- demo progress
- video-1-short results
- edi
- deringing
- New intra pred

# Attending

jmspeex, td-linux, gmaxwell, bkoc, smarter

# Reviews

Looks like daala-ts is waiting on nathan for three reviews; nathan also has a review without a reviewer assigned.

# Coding party

- sounds like tmath will be able to attend

# demo progress

m: I'll update the demo
jm: the only thing I'm working on that will improve still image compression is a long way away
m: my update says my progress; doing JS bindings for everything

# video-1-short results

ffmpeg -i akiyo_cif.y4m -to 00:01 akiyo_cif_short.y4m
ffmpeg -i ducks_take_off_420_720p50.y4m -ss 00:01 -to 00:02 ducks_take_off_short.y4m
ffmpeg -i grandma_qcif.y4m -ss 03 -to 04 grandma_short.y4m
ffmpeg -i park_joy_420_720p50.y4m -ss 04 -to 05 park_joy_short.y4m
ffmpeg -i sintel_trailer_2k_720p24.y4m -ss 06 -to 07 sintel_trailer_2k_short.y4m
ffmpeg -i soccer_4cif.y4m -to 1 soccer_short.y4m
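
The six trimming commands above share one pattern, so they could be generated from a small table. A minimal sketch, assuming Python 3; the `CLIPS` table and the `trim_command` helper are hypothetical names for illustration and are not part of the Daala tooling. It only builds the argument lists and does not run ffmpeg.

```python
# Hypothetical helper: rebuilds the ffmpeg trim commands listed above.
# Each entry is (input file, start time or None, end time, output file),
# copied from the commands in the notes.
CLIPS = [
    ("akiyo_cif.y4m", None, "00:01", "akiyo_cif_short.y4m"),
    ("ducks_take_off_420_720p50.y4m", "00:01", "00:02", "ducks_take_off_short.y4m"),
    ("grandma_qcif.y4m", "03", "04", "grandma_short.y4m"),
    ("park_joy_420_720p50.y4m", "04", "05", "park_joy_short.y4m"),
    ("sintel_trailer_2k_720p24.y4m", "06", "07", "sintel_trailer_2k_short.y4m"),
    ("soccer_4cif.y4m", None, "1", "soccer_short.y4m"),
]


def trim_command(src, start, end, dst):
    """Return the ffmpeg argument list that cuts [start, end] out of src."""
    cmd = ["ffmpeg", "-i", src]
    if start is not None:
        # -ss is only present for clips that do not start at the beginning.
        cmd += ["-ss", start]
    cmd += ["-to", end, dst]
    return cmd


COMMANDS = [trim_command(*clip) for clip in CLIPS]
```

Each list in `COMMANDS` could then be passed to `subprocess.run(cmd, check=True)` on a machine that has ffmpeg and the source sequences available.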

- t: I set up the ec2 thing to run on a set of 1 second videos for Tim's IETF presentation today.

http://a.pomf.se/aawqys.png http://a.pomf.se/yjkbuh.png
http://a.pomf.se/sijtig.png http://a.pomf.se/yppxju.png

- jm: I thought we sucked a lot more than that; that said, the only place we're winning is on ms-ssim at higher rates.
- jm: how long does it take to generate this curve?
- td: about 15 minutes, on two large amazon machines
- jm: means we can add more stuff
- gm: one thing about the result is that x264/x265 are tuned for psnr
- td: x264 is not, it's running with its defaults; x265 probably is.
- jm: 264 is clearly winning on ssim so I assume it's doing ssim tuning
- jm: You were going to add more videos
- td: Yes, I'm going to add two sequences and scale to 720.
- jm: should we have 1080p?
- td: we could
- gm: how are these weighted?
- td: By pixels
- gm: So grandma is basically irrelevant?
- td: Yes, I was going to make it longer
- jm: I guess your test-2 is going to be enough for a while, at least until we realize that it's missing something - making
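
To put a number on the "weighted by pixels" exchange above, here is a rough sketch that computes each sequence's share of the total. The frame sizes are assumptions read off the standard format names in the file names (QCIF, CIF, 4CIF, 720p); treating the weight as pixels per frame, and ignoring the differing frame rates, is also an assumption, since the notes only say "by pixels". The names `FRAME_SIZES` and `pixel_weights` are illustrative.

```python
# Assumed frame sizes, inferred from the format names in the file names.
# The sintel trailer is taken as 1280x720 from the "720p" in its name.
FRAME_SIZES = {
    "akiyo_cif": (352, 288),
    "ducks_take_off_720p": (1280, 720),
    "grandma_qcif": (176, 144),
    "park_joy_720p": (1280, 720),
    "sintel_trailer_720p": (1280, 720),
    "soccer_4cif": (704, 576),
}


def pixel_weights(sizes):
    """Return each sequence's fraction of the total pixels per frame."""
    pixels = {name: w * h for name, (w, h) in sizes.items()}
    total = sum(pixels.values())
    return {name: count / total for name, count in pixels.items()}


WEIGHTS = pixel_weights(FRAME_SIZES)

# Print the shares from smallest to largest.
for name, weight in sorted(WEIGHTS.items(), key=lambda kv: kv[1]):
    print(f"{name:24s} {100 * weight:5.2f}%")
```

Under these assumptions the QCIF grandma clip gets less than 1% of the total weight, while each 720p clip gets more than a quarter, which is consistent with gm's remark that grandma is basically irrelevant.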

# deringing

- bkoc: Right now I'm working on another paper since the prior one didn't work too well.
- bkoc: currently .04 PSNR/PSNR-HVS improvement, which decreases the file size by less than 1%.
- jm: what is a lot more relevant here is what it actually looks like, because for something like deringing I don't trust our metrics. Do you have something we can look at?
- bkoc: I can prepare something and send it out
- jm: that would be useful


# edi

- smarter: Working on porting the edi code; currently have some disagreements between the results from the JS and the C code, so currently debugging that.

# Intra

- jm: Thomas and I have been working on a potential new way of doing intra. Monty, were you there?
- m: I was there for some of it but don't really have a good understanding. Have you written this up?
- jm: Not yet. I had some interesting results:
- http://jmvalin.ca/video/new_intra/
- jm: basic idea is if we can't do intra prediction with lapping, we might as well do an intra-prediction pass on the whole image using 1d DCTs. That is, coding the entire image in a simple way first, like "coding an SVG of the image first" (though not actually, due to the search)
- jm: what I'm actually doing: look at the images that have a suffix of 16.
- jm: say fruits. For each 16x16 block I am coding a 16pt 1d row and an angle.
- m: ah. you're explicitly signaling the pixels
- jm: this creates a lot of blocking artifacts, as expected. What I'm trying to implement now: you already have the left and top as predicted from the previous block, then you code the bottom right and do a bi-prediction, and for each edge you can jointly optimize for both sides.
- td: you can lap the 1d dcts
- m: you can also imply the pixels, looking at the other edges
- jm: I was going to look at the direction and predict the other edge
- m: This sounds interesting, or at least interesting to play with
- gm: This reminds me of a paper I saw a while ago that was kind of crazy where it coded the bottom right of a block using a 1d dct, then used the up and left from neighbors and then solved some wave-equation energy minimization for the inside of the box.
- jm: my thought about the dc mode is that it might be doing something like bilinear, or whatever it takes to predict a gradient without an explicit tm mode, e.g. dc and tm being the same.
- gm: you keep referring to it as multiple passes, but there is no reason this couldn't run per superblock.
- jm: you don't need to implement this as two complete passes but conceptually it is, since there is no feedback.
- jm: The really interesting one, though I don't think it's a good image to test, is adventure_8.
- jm: at some point we would probably want multiple block sizes but using an entirely different blocksize algorithm, I'm not sure what it looks like since there doesn't appear to be a dynamic programming solution.
- m: This is promising.
- jm: I intend to continue working on this, help is welcome.
- jm: There are probably many ways to implement this, e.g. the one I described where you code edges and blend within blocks, or one which is closer to what greg mentioned, similar to what I have here: doing this within blocks and blending further
- jm: Then there is the issue of controlling how much rate to use: in areas that can easily be predicted you want to keep using more bits right up to the scale, in areas which are not you don't
- jm: in terms of directions, what I was working on right now was to have 64 directions for 16x16 and 32 directions for 8x8, basically one offset for every pixel. It may not help that much visually but it should work better for matching edges
- jm: there is also the idea of coding the orthogonal (to the directionality) DCT but there are some issues, like weighting: e.g. for the 45deg case the edge pixels are basically only used for one pixel, and then the issue is that blending between blocks is harder.
- jm: we might just want to have multiple block sizes, and if edges cross you just decrease the size in that area
- m: those of us who are interested in this should all go out and think about this!
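
The "16pt 1d row and an angle" idea from the discussion above can be illustrated with a toy sketch. This is not jm's implementation: the search, the bi-prediction and the blending between blocks are all left out, and the way the direction is expressed (a horizontal shift in pixels per line), the clamping at the row ends, and the quantizer step are assumptions made for illustration. It codes one row with a 1D DCT and fills a block by sliding that row along a direction.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def predict_block(row, slope):
    """Fill an n x n block by shifting `row` sideways by `slope` pixels per line.

    Positions that fall outside the row are clamped to its ends, and
    fractional positions are linearly interpolated (both are assumptions).
    """
    n = len(row)
    block = []
    for y in range(n):
        line = []
        for x in range(n):
            pos = min(max(x - y * slope, 0.0), n - 1.0)
            lo = int(math.floor(pos))
            hi = min(lo + 1, n - 1)
            frac = pos - lo
            line.append((1.0 - frac) * row[lo] + frac * row[hi])
        block.append(line)
    return block


# Toy example: a step edge in a 16-point row.
N = 16
ROW = [0.0] * 8 + [255.0] * 8
STEP = 16.0  # hypothetical quantizer step size
COEFFS = [round(c / STEP) for c in dct_1d(ROW)]      # what would be coded
DECODED = idct_1d([q * STEP for q in COEFFS])        # decoder-side row
BLOCK = predict_block(ROW, 1.0)  # the edge moves one pixel right per line
```

With a slope of 1.0 the step edge runs diagonally through the block, which is the kind of directional structure the per-block row and angle are meant to capture. A real encoder would predict from `DECODED` rather than `ROW`, and would still need something like the edge blending discussed above to avoid blocking artifacts.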