Add blur support #11

PandorasFox · 2016-10-15T01:25:13Z

Notes for myself:

do something similar to this
maybe incrementally blur so it gradually blurs into the lockscreen
(also maybe allow any keystroke in that period to cause it to exit?)
potentially let i3lock also capture the screen and blur that
allowing blurring of an image passed in is a must (needs to hold it in RAM, though)
need to play around and see how it handles large blur operations and what it displays in the meantime (i.e. if nothing displays, or if it shows the "locking..." thing)

PandorasFox · 2016-10-15T01:34:22Z

I believe setting some libev timers to handle the progressive blurring should work pretty well. I'll do some playing around with that.

PandorasFox · 2016-10-15T02:10:04Z

c3a95b8 adds initial blur support. Some notes:

A white 'fog' appears around the edges. I'll need to figure out how to fix this.
I need to update to allow for custom blur radius
There's a noticeable delay before i3lock actually shows up. This is problematic, and I need to fork off the blurring with libev or something similar.
Thank god @shiver has code that grabs the XCB framebuffer if there's no image passed in with the blur command. Dude's a wizard.

PandorasFox · 2016-10-15T02:10:54Z

@meskarune thoughts? (Time to see if you get notifications for @ mentions)

frysztak · 2016-10-16T09:11:40Z

Hi, I profiled your blurring code with Callgrind and it turns out that application spends over 90% of its runtime inside blur_image_surface function. It's gonna need some optimization :).

I have some experience manually vectorising code (see here) and I can take a look at the blurring code if you'd like.

PandorasFox · 2016-10-16T15:24:31Z

That'd be excellent! I was mostly just pulling from shiver/i3lock@69b40f1 to begin with and was then going to work on optimizing the blurring/working on an iterative blur process, and any help would be appreciated.

I knew the runtime was going to be pretty awful at first; I was just using that as a base and was then going to see how it went from there.

I'm going to play around with it some more today and see what happens when I blur the image in another thread while also reading it and drawing, mostly just to see if it can be parallelized at all.

frysztak · 2016-10-16T16:53:02Z

Great, I'll start with SSE2 as it's the most widespread. We'll see if we'll need multiple threads for this.

About progressive blurring - that's definitely a cool idea, but I think it should be optional.

PandorasFox · 2016-10-16T20:47:14Z

Oh definitely, it'd be optional since it'd definitely increase CPU load (on a workstation it'd be pretty neat; on a laptop you'd want it locked as soon as possible so that you can hibernate/suspend/whatever).

Threading it off would mostly just be so that the blurring can be done in the background while i3lock grabs the display and locks (and perhaps display the image unblurred, until it's done blurring). If the blurring can be done quickly enough then it likely won't matter for most resolutions; I'll do some testing with some virtual desktops and extremely high resolutions once you have some blurring stuff done.

frysztak · 2016-10-20T15:19:09Z

A quick update: I have a working version implemented with SSE intrinsics. It still needs a lot of work, so I didn't even benchmark it. I'd be further along if it wasn't for university.

Meanwhile I discovered that Makefile doesn't set any optimization flags. It's pretty bizarre. Anyway, I recommend adding -O2 to CFLAGS. I didn't measure the speedup for naive implementation, but it feels faster.

PandorasFox · 2016-10-20T18:20:31Z

I'll look into that, thanks!

University has kept me pretty busy this semester. I'm a c++ data structures TA and I've been grading tests for most of this past week D:

PandorasFox · 2016-10-20T18:22:51Z

Also yeah; I just tried that out (and committed it); it definitely does seem to take about 1/3 as much time as before from invocation to blurred lockscreen.

frysztak · 2016-10-22T13:55:11Z

It's there: https://github.com/sebastian-frysztak/i3lock-color/commits/fast-blur. And it's way faster.

Blurring a 1080p image on my Thinkpad x220 with an i7 takes about 40-60 ms. It used to take about 260 ms. The delay is almost unnoticeable now and IMHO my code handles borders way better than the one from the Cairo cookbook. And there's still some room for optimization.

Passing arguments from command line is currently not implemented, so to change "blur factor" you have to change this line.

Code from Cairo cookbook used radius as the only parameter, which doesn't make much sense to me. I use standard deviation. Radius is constant (and equals 3), which means kernel is 7x7. This is more or less SIMD-friendly size. Because kernel size is constant, increasing stddev only makes sense up to a certain point. Afterwards, we're limited by radius (and smooth Gaussian blur basically turns into ordinary and ugly average).

I'd like you to let me know if the current kernel size is fine. If output images are not blurry enough, then I'll either increase the kernel size manually, or maybe figure out a way to implement dynamic kernel size.

frysztak · 2016-10-22T17:56:35Z

I tested slightly larger radii at constant stddev. You can take a look here: http://imgur.com/a/hEI2g.

PandorasFox · 2016-10-23T01:42:05Z

I'll play around with it some tomorrow and tinker with some different values / look at implementing CLI and stuff. It's looking great so far, and thanks for the work!

frysztak · 2016-10-23T17:24:36Z

No problem, I like tinkering with stuff like that. It's more fun than high-level programming.

I've been thinking and I came to a conclusion that using dynamic kernel size is probably the best solution. This way we can produce moderately blurry images fast, and very blurry images slower (but not too slow, I hope). It leaves the most space for end-user customization.

And as for input parameters, we can use ImageMagick's approach: radius = 3*stddev, where only stddev is user-visible.
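A minimal sketch of that rule, assuming the radius is rounded up to a whole number of pixels (the thread does not say how fractional values would be handled). The helper names `blur_radius_from_sigma` and `blur_kernel_size` are hypothetical and are not taken from the branch.

```c
#include <assert.h>

/* Hypothetical helper: derive the blur radius from the user-visible
 * standard deviation with the radius = 3 * stddev rule quoted above.
 * Rounding up is an assumption of this sketch. */
static int blur_radius_from_sigma(double sigma) {
    assert(sigma > 0.0);
    double r = 3.0 * sigma;
    int radius = (int)r;      /* truncates toward zero */
    if ((double)radius < r)
        radius++;             /* round up without needing libm */
    return radius;
}

/* A kernel covering [-radius, radius] has 2 * radius + 1 taps per axis. */
static int blur_kernel_size(int radius) {
    assert(radius >= 0);
    return 2 * radius + 1;
}
```

With this rule a stddev of 1.0 gives radius 3, which matches the fixed 7x7 kernel mentioned earlier in the thread, and larger stddev values grow the kernel instead of flattening it.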

I have some ideas how to further boost performance, so I hope that won't be an issue for large kernels.

In the future, I could implement other effects (pixelisation comes to mind as an obvious choice).

Oh, and I almost forgot: I made a mistake labelling those pictures. 7, 9, and 11 are kernel sizes, not radii.

PandorasFox · 2016-10-23T19:29:18Z

Ah, alright.

I think pixelisation should just be a quick downscale / upscale by the inverse of the scaling factor, right? I think resizing it to pixelize should be the fastest way to do so.
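As a rough sketch of the downscale/upscale idea: with nearest-neighbour sampling, and assuming the sample is taken at the top-left corner of each tile, the two resizes collapse into one pass that copies each tile's first pixel over the rest of the tile. The function name `pixelise` and the packed 32-bit pixel layout are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pixelise in place: every block x block tile takes the colour of its
 * top-left pixel. This is what a nearest-neighbour downscale by `block`
 * followed by an upscale by `block` produces under a top-left sampling
 * convention. Pixels are treated as opaque 32-bit values. */
static void pixelise(uint32_t *px, int width, int height, int block) {
    assert(px != NULL && width > 0 && height > 0 && block > 0);
    for (int y = 0; y < height; y++) {
        int sy = (y / block) * block;      /* row of the tile's sample */
        for (int x = 0; x < width; x++) {
            int sx = (x / block) * block;  /* column of the tile's sample */
            px[(size_t)y * width + x] = px[(size_t)sy * width + sx];
        }
    }
}
```

Because no channel arithmetic is involved, this is a single read and write per pixel. Averaging each tile instead of sampling it would look smoother but costs more.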

I haven't actually had much time to play with it yet since I'm basically doing all of a term project for a group this weekend 😂

frysztak · 2016-10-29T13:23:36Z

School keeps me rather busy too, but I managed to find some time to extend kernel size to 15x15, improve edges handling and implement SSSE3 version. I also added some benchmarking code. For 10 runs and beyond, SSSE3 version takes about 20 ms per run, SSE2 - 65 ms and 'naive' - 264 ms.

Since blurring is so fast now, we can perform it multiple times to produce strong blur.

The only problem with SSSE3 version is that after several iterations it darkens the image. This is, I suppose, due to quantization errors. I think it's fixable, but solution will slow down the code. Not that it matters that much :).

PandorasFox · 2016-10-29T17:11:50Z

Awesome! I'll try and play around with this some when I get the time, though that might be a few days.

PandorasFox · 2016-11-01T13:48:11Z

I haven't gotten around to playing with it much (I do think it's working excellently, though), but I think I may have figured out a way to get around this problem:

I have no idea how this is really going to help you in the long run, since I'll need to add support for overlaying text on the lockscreen as well. The lock icon can probably be handled as well, but I'll have to see.

I think I could probably rig up something with -I to allow passing in an arbitrary number of images, which'll then be layered over the pixmap that gets blurred. (Or at least, allow passing in one image... you get the idea).

If I get that working I'll be pretty happy. Hopefully I'll have time to work on this soon.

frysztak · 2016-11-01T15:19:43Z

I'm glad you like it :).
However, I'm not sure how useful overlaying multiple images would be. Are they all going to be centered? If not, the user would have to specify their positions. I see no real use for it (but I'm not one of those hardcore desktop ricers). But specifying one image (which might even default to a nice key icon) is a good idea IMO.

PandorasFox · 2016-11-01T19:23:36Z

Yeah, multiple images seems kinda silly in retrospect, but a single image (i.e. for a lock icon with text, over the blurred lockscreen) has its uses.

frysztak · 2016-11-01T19:39:47Z

So what's the plan for now? Do you need my help with anything? (apart from fixing SSSE3 impl.)

PandorasFox · 2016-11-01T20:33:16Z

I don't think so. The blurring stuff you've done so far is tremendous and I think that once you finish tidying it up it'll be pretty much ready to ship as it is now (maybe with some changes to make it better match the structure/style of what's there so far, although honestly, I kinda need to do that for a lot of the other stuff I've hacked into this fork).

I should be able to do this soon-ish. I've got a few interviews this month, so I'll try to do this on one of my bus trips/flights, assuming I'm not drowning in schoolwork on those, lol.

I know about what I need to do (should just be moving the stuff here to export the blurred background to a new cairo_surface_t, and then painting that onto the xcb_ctx before I paint the img onto it), but it just comes down to "when can I sit down for a few hours and do this properly (I might do it during a meeting tonight)" ;-;

Thanks for all the work so far, though! Seriously, it's amazing.

PandorasFox · 2016-11-01T20:39:33Z

Also, tangentially related (I haven't had the time to fully grok what the hell this is doing), but how do you think the performance of your code stacks up against ffmpeg's gaussian blur stuff? @Airblader mentioned it over here, and I thought it was somewhat interesting (I have a feeling he'd look at this fork and probably wonder wtf I was doing with half these hacks).

I'm admittedly pretty inexperienced when it comes to low-level image manipulation, so I'm kinda just poking around and trying to learn some while tinkering when I can.

PandorasFox · 2016-11-01T21:16:33Z

It turns out that it was easier to do the image overlaying than I thought it would be, and I've got it implemented in my blur branch now. I'll start tidying everything up some soon.

frysztak · 2016-11-01T21:21:18Z

Like I said before, I like this low-level stuff, so really, I'm just glad I can finally be more active in open source community.

Regarding style of code - it seems that the original i3lock uses the Clang formatter. I don't particularly like the style the original authors went with, but nevertheless I ran it over the initial SSE2 commit. I might have forgotten to do it for SSSE3, though.
Before I forget, a list of things I need to do:

fix SSSE3
add one more optimization I have in my mind
add CPU-capabilities detection code (some old hardware doesn't support SSSE3)
add AVX2 version? this one could be tricky for my CPU doesn't support AVX2, I would have to use an emulator to test correctness. do you have AVX2-capable CPU?
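For the CPU-capabilities item, one possible sketch uses the GCC/Clang builtins `__builtin_cpu_init()` and `__builtin_cpu_supports()`, which exist on x86 targets. The enum, the function names, and the idea of dispatching between four variants are assumptions for illustration. Nothing here is taken from the fast-blur branch.

```c
#include <assert.h>

/* Hypothetical set of blur variants, ordered from least to most capable. */
enum blur_impl { BLUR_GENERIC, BLUR_SSE2, BLUR_SSSE3, BLUR_AVX2 };

/* Pick the widest variant the running CPU supports. On non-x86 targets,
 * or with other compilers, this falls back to the generic code path. */
static enum blur_impl pick_blur_impl(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return BLUR_AVX2;
    if (__builtin_cpu_supports("ssse3"))
        return BLUR_SSSE3;
    if (__builtin_cpu_supports("sse2"))
        return BLUR_SSE2;
#endif
    return BLUR_GENERIC;
}

/* Human-readable name, e.g. for a debug log line. */
static const char *blur_impl_name(enum blur_impl impl) {
    static const char *const names[] = { "generic", "sse2", "ssse3", "avx2" };
    assert((int)impl >= 0 && (int)impl <= (int)BLUR_AVX2);
    return names[impl];
}
```

The check runs once at startup and the result can select a function pointer, so a single binary still works on hardware without SSSE3 or AVX2.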

Once those things are done, I'll create a pull request and then we can talk about adjusting style.

I took a brief look at this FFmpeg code. They don't use intrinsic functions, but if packagers compile it with -O3, some of those loops should be automatically vectorised.
It's hard to compare performance, because FFmpeg does quite a lot more than my code, but time returns about 400 ms. Thing is, to me, this blur in FFmpeg looks odd, and what's worse - there are blocky artifacts on perfectly flat surfaces.

PandorasFox · 2016-11-01T21:28:56Z

It doesn't appear my laptop's CPU (i7 3632QM) supports AVX2, but I'm pretty sure the 4690k in my desktop will (I'll check when I get back tonight, but from what google tells me, it does support it!).

I noticed the artifacts as well, but wasn't sure if they were noticeable for most displays (they're somewhat noticeable on my 1366*768 laptop, but I dunno how noticeable it'd be on my 3840*1080 desktop).

I was assuming that the ffmpeg solution wouldn't be the best since it seems to do more than is necessary, but again, not super familiar with this stuff, so I figured I'd ask.

PandorasFox · 2016-11-01T21:36:02Z

Oh yeah, just kinda what the image overlaying looks like right now. Stuff is just aligned to the top-left corner.

frysztak · 2016-11-01T21:39:39Z

Hey, that's pretty good. Even transparency works. So it's just cairo_paint(), essentially?

PandorasFox · 2016-11-01T21:44:43Z

Pretty much. I just set up an additional cairo_surface_t for storing the blur_img separately, and then I paint that onto the surface before painting the image passed in.

I felt like there'd be some edge bugs I'd find or some tweaks needed, but it seems to work perfectly fine on the first try (well, I think that there might be some memory leaks; I never checked that...)

frysztak · 2016-11-03T19:14:09Z

I updated SSSE3 implementation and I don't like how it looks. I'll let images speak.
SSE2 (floating point) after 5 iterations:

SSSE3 (integer) after 5 iterations:

It must be due to roundings that happen when kernel is scaled and rounded to integers. I don't think I can do anything about it, I'm already using biggest scaling factors I can. So, I think I'll drop SSSE3 version. Instead I'll prepare something else, I'm not sure what exactly will make the most sense yet. I'm considering AVX2, and perhaps mixed AVX and SSE2, so that folks like myself can still benefit from wider registers.
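To illustrate one way integer rounding of kernel weights can shift brightness: if the weights are scaled by 256 and truncated, their sum can fall below 256, and every pass then scales even a flat region down a little. The sketch below is a toy model with seven equal weights, not the SSSE3 kernel from the branch, and the thread does not establish that this is what happened there. All names are hypothetical.

```c
#include <assert.h>

#define TAPS 7
#define SHIFT 8  /* weights are stored as integers scaled by 1 << SHIFT */

/* Quantise TAPS equal weights (1/TAPS each) by truncation; return their sum. */
static int quantise_uniform(int w[TAPS]) {
    int sum = 0;
    for (int i = 0; i < TAPS; i++) {
        w[i] = (1 << SHIFT) / TAPS;   /* 256 / 7 truncates to 36 */
        sum += w[i];
    }
    return sum;                       /* 7 * 36 = 252, not 256 */
}

/* Give the rounding deficit to the centre tap so the sum is 1 << SHIFT. */
static int renormalise(int w[TAPS], int sum) {
    w[TAPS / 2] += (1 << SHIFT) - sum;
    return 1 << SHIFT;
}

/* Blur a perfectly flat region of value v: every tap sees the same v. */
static int blur_flat(const int w[TAPS], int v, int passes) {
    for (int p = 0; p < passes; p++) {
        int acc = 0;
        for (int i = 0; i < TAPS; i++)
            acc += w[i] * v;
        v = acc >> SHIFT;
    }
    return v;
}
```

In this toy model a flat value of 200 drops to 183 after five passes with the truncated weights, and stays at 200 once the deficit is folded back into the centre tap. Renormalising fixes the overall gain but not the loss of precision in each individual weight.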

PandorasFox · 2016-11-03T19:29:31Z

Hm, is it just darker? That's rather curious.

Sounds good to me either way.

frysztak · 2016-11-03T19:49:02Z

Kind of. The background itself is not actually darker, but all the text is (it looks more blended-in compared to SSE2). There are also some artifacts on the right-hand side, near the border.

PandorasFox · 2016-11-04T02:06:12Z

Ah, I sort of see, now. I'm travelling for an interview and only have my laptop (wonderful old laptop with great guts, but it has a 1366*768 display...), so I won't really be able to tell the images apart easily until the weekend.

That's definitely problematic.

frysztak · 2016-11-04T12:20:41Z

Maybe it's too late, but if not, I wanted to wish you best of luck :).

PandorasFox · 2016-11-04T17:28:27Z

Thanks! I think I did pretty well on it :)

I'll be playing with the code some on my bus back, I think.

frysztak · 2016-11-11T21:06:24Z

You might want to check out box-blur branch. It approximates Gaussian blur very closely and is faster.
I think I'm going to prepare a generic version, to replace that blurring code that was written for Cairo. So that, you know, blurred image will look the same, regardless of system's capabilities.
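For context on why a box blur can stand in for a Gaussian: repeating a box filter a few times yields a bell-shaped response, and a running sum makes each pass cost the same regardless of radius. The sketch below is a plain C, one-row version with three passes. The pass count, the clamped edge handling, and the function names are assumptions. The thread does not describe how the box-blur branch is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One box-blur pass over a row: each output is the rounded mean of the
 * 2 * radius + 1 inputs around it, with edge pixels repeated. The running
 * sum keeps the per-pixel cost independent of the radius. */
static void box_blur_row(const uint8_t *src, uint8_t *dst, int n, int radius) {
    assert(src != NULL && dst != NULL && src != dst && n > 0 && radius >= 0);
    int window = 2 * radius + 1;
    int sum = 0;
    for (int k = -radius; k <= radius; k++) {
        int i = k < 0 ? 0 : (k >= n ? n - 1 : k);   /* clamp to the row */
        sum += src[i];
    }
    for (int x = 0; x < n; x++) {
        dst[x] = (uint8_t)((sum + window / 2) / window);
        int add = x + radius + 1;   /* sample entering the window */
        int sub = x - radius;       /* sample leaving the window */
        if (add >= n) add = n - 1;
        if (sub < 0) sub = 0;
        sum += src[add] - src[sub];
    }
}

/* Three passes; tmp must hold n bytes. The result ends up back in row. */
static void approx_gaussian_row(uint8_t *row, uint8_t *tmp, int n, int radius) {
    box_blur_row(row, tmp, n, radius);
    box_blur_row(tmp, row, n, radius);
    box_blur_row(row, tmp, n, radius);
    memcpy(row, tmp, (size_t)n);
}
```

A full image blur would run this over every row and then every column. Because the code is scalar C, it produces the same output on any CPU, which is the property the generic version is after.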

PandorasFox · 2016-11-12T18:38:49Z

I'll check it out when I can. I glanced at the code and it looks pretty good.

PandorasFox · 2017-02-14T17:38:47Z

Hey, where's this at, currently? Courses have been eating up all my time, but I figured I should check.

I think last time I tried, things weren't blurred quite enough/there was some darkening of the screen buffer after blurring for some reason, so I can't really merge / push yet. Any idea what would need to be changed/how it'd need to be implemented? I may give a try at this later.

frysztak · 2017-02-14T20:14:49Z

I've been using the code from the box-blur branch since November. Haven't had a single issue.
There's only SSE2-based version, SSSE3 and AVX got removed, as they don't seem to be necessary.
I should probably implement generic, SSE2-free version for really old x86 and ARM CPUs. It's a very niche market I imagine, but it ought to be done. I'll send you a pull request once I'm done, okay?

edit - I forgot one thing: blurring factor. I'll add that too.

PandorasFox · 2017-02-15T00:15:52Z

Alright, thanks! If you need any help / testing, just let me know

PandorasFox · 2017-02-15T16:18:32Z

Implemented with #17 :D

PandorasFox self-assigned this Oct 15, 2016

PandorasFox added the feature request label Oct 15, 2016

PandorasFox mentioned this issue Oct 15, 2016

I'm probably going to move blurring into i3lock-color soonish meskarune/i3lock-fancy#57

Closed

PandorasFox closed this as completed Feb 15, 2017
