-
Do you use the v26 model in Fooocus or ComfyUI?
-
An informative comparison, thanks! So it looks like Xinsir inpainting at present doesn't qualify as a replacement for the Fooocus patch as the inpainting component in the Krita AI plugin; the seams and color aberrations in the outpainted areas are too evident. I have already experimented a little with Xinsir under Forge, both for outpainting and inpainting (mostly for hi-res images), with mixed results. Actually, I couldn't get it to work at all for anything that uses a mask, since, as you mentioned in your post on the SD subreddit, Xinsir ProMax takes as input the image with the masked area all black, which I find rather strange and unhelpful.

Did you know there is another inpaint model for SDXL, by Kataragi (https://huggingface.co/kataragi)? It's quite capable; I liked the results of its outpainting under Forge, way better than Fooocus's results with the same test image and prompts (the latter is also rather slow). Worth checking out, now that you are in comparison mode :))

What I miss a lot in the Krita AI diffusion plugin is the inpainting functionality available with the inpaint_global_harmonious preprocessor under both A1111 and Forge (the implementation in the latter is a bit incomplete, though). It really delivers some magical results! Unfortunately there is no Comfy implementation of it (also according to mcmonkey of SwarmUI fame, whom I asked about it on Discord), and it's such a pity. Any chance of making a custom node for it in Comfy and bringing it to the plugin eventually? If you haven't seen what it does, I can show you examples. (Here's one for starters: https://civitai.com/articles/5834, a less-demanding SUPIR alternative.)
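For anyone who wants to try that input convention anyway, here is a minimal sketch of how I'd prepare the control image; the file names and the mask threshold are my own assumptions, not anything from the model card:

```python
# Sketch: Xinsir ProMax inpaint reportedly wants the control image itself
# with the region to repaint painted black, instead of a separate mask input.
# File names and the >127 threshold are placeholders/assumptions.
from PIL import Image
import numpy as np

def make_promax_control(image_path: str, mask_path: str) -> Image.Image:
    image = np.array(Image.open(image_path).convert("RGB"))
    mask = np.array(Image.open(mask_path).convert("L"))
    image[mask > 127] = 0  # black out the area to be inpainted/outpainted
    return Image.fromarray(image)

make_promax_control("input.png", "mask.png").save("promax_control.png")
```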
-
As an illustration, here's a quick example of Kataragi's inpainting I just did in Forge: a 1536x864 vector-art frame for a fantasy landscape image, using an enveloping mask. (Most of my experiments are actually with outpainting, but the model works equally well for both, and it does support inpaint_global_harmonious.)
-
Kataragi inpaint (weight=0.5; I also ran it at 1.0, with worse results). It's pretty good in some situations, but not a winner. Maybe it does better with artwork/anime, since that seems to be its focus.
-
It took me a while to investigate the various aspects of outpainting to address your comments; this is one of the more complicated areas of SD. Worse yet, outpainting via inpainting under Forge (which is my primary SD tool) works very differently depending on the currently active tab: img2img, Inpaint, or Inpaint Upload, even with the same ControlNet settings, the same input image/mask, the same everything, in short. Only the Inpaint tab can produce the artistic frame images which I used for the illustration above; they come out universally beautiful (with the right settings, of course).

However, the settings in Inpaint must be chosen very carefully for this to work. The main trick is to set the Ending Control Step (ECS) very low; for some models, as low as the slider allows, i.e. 0.01.

Timing: all SDXL inpainting models I tried worked fine for this task and rather fast. With the DPM++ 2M SDE Heun Exponential sampler and 24 steps, they typically finished in under 16 sec. after the initial preprocessing run (the card used was an RTX 4070 Ti Super with 16 GB VRAM). The exception was Xinsir ProMax, which took an unreasonably long time (8 min.) to process the same 1536x864 image and returned all-black output. Testing outpainting was also made difficult by the fact that the time a model took to process the image was inconsistent (even for the same input and parameters) and varied greatly from session to session, for no clear reason.

Before testing, I found two more inpaint ControlNet models on HuggingFace: EcomXL Inpaint ControlNet and Controlnet - Inpainting dreamer by Desitech. They work OK (EcomXL slower than the rest, though), but probably won't be competition for those we have already tried.

inpaint_v26.fooocus proved the slowest of the bunch: 30 sec. at ECS 0.25, the setting at which it produced the best output. (This is despite what you might have read on the web about this particular aspect; the usual suggestion is to set it to 0.5.) Going any lower than 0.25 would result in much slower initial processing times (up to 10 min.), with no visible benefit in the output.

Outpainting under Fooocus itself, on the other hand, with the same input images and checkpoint and in Quality mode, was very consistent time-wise: always 30-32 sec. to produce 2456x1382 output (I don't know why it always renders an upscaled version; there seems to be no way to disable it). Output quality with the initial preset and default style, however, was nowhere near what I am getting under Forge with the various inpainting models. Perhaps I should experiment with Fooocus's presets and styles, given the time.

I will return to inpaint_global_harmonious later, since I find it a very special preprocessor worthy of at least a discussion, but it's a bit of a separate topic.
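In case anyone wants to reproduce the low-ECS trick outside Forge, here is roughly how I'd map these settings onto a diffusers pipeline. Treat it as a sketch under assumptions: the model IDs, the scheduler choice, and the Forge-to-diffusers parameter mapping are mine, untested.

```python
# Rough diffusers sketch of the "very low Ending Control Step" trick;
# model IDs and the Forge->diffusers parameter mapping are assumptions.
import torch
from diffusers import (
    ControlNetModel,
    DPMSolverMultistepScheduler,
    StableDiffusionXLControlNetInpaintPipeline,
)
from PIL import Image

controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-union-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # any SDXL checkpoint
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")
# Closest scheduler I know of in diffusers terms to DPM++ 2M SDE:
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, algorithm_type="sde-dpmsolver++"
)

init = Image.open("frame_input.png").convert("RGB")  # 1536x864 source
mask = Image.open("frame_mask.png").convert("L")     # white = area to repaint

result = pipe(
    prompt="vector art frame, fantasy landscape",
    image=init,
    mask_image=mask,
    control_image=init,                  # per-model preprocessing may differ
    num_inference_steps=24,
    controlnet_conditioning_scale=0.5,   # ControlNet weight in Forge
    control_guidance_end=0.01,           # Ending Control Step, set very low
).images[0]
result.save("outpainted.png")
```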
-
I'm testing the inpaint mode of the latest "Union" ControlNet by Xinsir. This is a comparison to the Fooocus inpaint patch used at the moment (which I believe is based on the Diffusers inpainting model).
Test Setup
All tests use JuggernautXL. The checkpoint is not important here; both methods work with all SDXL checkpoints (excluding Pony derivatives). Tests also use the same default settings (DPM++ 2M Karras at 20 steps), the same pre-processing, and identical seeds.
ControlNet strength is at 0.5 for the results below (the only case where 1.0 was better is the bear that was cut off).
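For reference, a minimal sketch of these settings expressed with diffusers; the checkpoint ID and file paths are placeholder assumptions, and this is not the exact harness used for the images below:

```python
# Sketch of the test settings in diffusers terms (placeholder IDs/paths;
# not the exact harness used for the comparisons below).
import torch
from diffusers import (
    ControlNetModel,
    DPMSolverMultistepScheduler,
    StableDiffusionXLControlNetInpaintPipeline,
)
from PIL import Image

controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-union-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetInpaintPipeline.from_pretrained(
    "RunDiffusion/Juggernaut-XL-v9",  # assumed JuggernautXL repo
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True  # DPM++ 2M Karras
)

image = Image.open("scene.png").convert("RGB")
mask = Image.open("mask.png").convert("L")

out = pipe(
    prompt="photo of a canal in bruges, belgium",
    image=image,
    mask_image=mask,
    control_image=image,
    num_inference_steps=20,
    controlnet_conditioning_scale=0.5,                 # strength 0.5
    generator=torch.Generator("cuda").manual_seed(0),  # identical seed per pair
).images[0]
```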
Apple tree - illustration - outpaint (expand)
Left: IP-Adapter | Right: "children's illustration of kids next to an apple tree"
Xinsir Union ControlNet
Diffusers Inpainting (Fooocus patch)
Canal - photo - inpaint (fill)
Left: IP-Adapter | Right: "photo of a canal in bruges, belgium"
Xinsir Union ControlNet
Diffusers Inpainting (Fooocus patch)
Cornfield - anime - outpaint (expand)
Left: IP-Adapter | Right: "anime artwork of girl in a cornfield"
Xinsir Union ControlNet
Diffusers Inpainting (Fooocus patch)
Jungle - painting - inpaint (fill)
Left: IP-Adapter | Right: "concept artwork of a lake in a forest"
Xinsir Union ControlNet
Diffusers Inpainting (Fooocus patch)
Nature - photo - inpaint (add object)
"photo of a black bear standing in a stony river bed"
Xinsir Union ControlNet
Diffusers Inpainting (Fooocus patch)
Park - photo - inpaint (add object)
"photo of a lady sitting on a bench in a park"
Xinsir Union ControlNet
Diffusers Inpainting (Fooocus patch)
Street - photo - inpaint (remove object)
Left: IP-Adapter | Right: "photo of a street in tokyo"
Xinsir Union ControlNet
Diffusers Inpainting (Fooocus patch)
Superman - photo - outpaint (expand)
Left: IP-Adapter | Right: "superman giving a speech at a congress hall filled with people"
Xinsir Union ControlNet
Diffusers Inpainting (Fooocus patch)
Torii - photo - inpaint (fill)
Left: IP-Adapter | Right: "photo of torii, japanese garden"
Xinsir Union ControlNet
Diffusers Inpainting (Fooocus patch)
Conclusions
Some general trade-offs with the Xinsir ControlNet method:
- Image quality is pretty good, but not convincingly better than the previous method, I'd say.