Automatic Segmentation and Refining is broken (with solution proposal) #473

Closed
3ddc-solaris opened this issue Dec 13, 2024 · 3 comments
Labels: Bug (Something isn't working)

3ddc-solaris commented Dec 13, 2024

Expected Behavior

Automatic Segmentation and Refining should work as described here:
https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Features/Prompt%20Syntax.md#automatic-segmentation-and-refining

and here:
Stability-AI/StableSwarmUI#11 (comment)

Actual Behavior

ComfyUI execution error: Input image size (352*352) doesn't match model (224*224).

Steps to Reproduce

Generating an image with the following prompt:

Professional 4k photo of a woman sitting on her bed.
<segment:face>  Black-rimmed glasses, detailed face, detailed eyes

leads to this error:

ComfyUI execution error: Input image size (352*352) doesn't match model (224*224).

Debug Logs

https://paste.denizenscript.com/View/129007

Other

I did some googling and found the following issues that could be related to the error:
comfyanonymous/ComfyUI#5402
huggingface/transformers#34415

I have solved the problem for myself by modifying the file “SwarmUI/src/BuiltinExtensions/ComfyUIBackend/ExtraNodes/SwarmComfyCommon/SwarmClipSeg.py” as follows:

  1. Added "import inspect"

  2. Lines 57-58, original:

    with torch.no_grad():
        mask = model(**processor(text=match_text, images=img, return_tensors="pt", padding=True))[0]

Changed as follows:

    # use inspect.signature to check whether model.forward accepts interpolate_pos_encoding
    kwargs = (
        {"interpolate_pos_encoding": True}
        if "interpolate_pos_encoding" in inspect.signature(model.forward).parameters
        else {}
    )
    with torch.no_grad():
        mask = model(**processor(text=match_text, images=img, return_tensors="pt", padding=True), **kwargs)[0]

The modified file is attached.
SwarmClipSeg.zip

That solved the problem for me. The segmentation works as expected again.

But since I don't have a clue about SwarmUI internals, Python or generative AI, I can't judge what risks and side effects this “solution” has. Please take it only as an indication of where the problem might lie.
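For readers who want to try the signature check from the change above in isolation, here is a minimal sketch. It does not load the real model: `forward_old` and `forward_new` are hypothetical stand-ins for an older and a newer `model.forward`, and `optional_kwargs` is a helper name made up for this example.

```python
import inspect


def optional_kwargs(func, **candidates):
    """Return only those candidate keyword arguments that func declares by name."""
    params = inspect.signature(func).parameters
    return {name: value for name, value in candidates.items() if name in params}


# Hypothetical stand-ins for an older and a newer model.forward signature.
def forward_old(pixel_values):
    return "old"


def forward_new(pixel_values, interpolate_pos_encoding=False):
    return "interpolated" if interpolate_pos_encoding else "fixed-size"


# The flag is dropped for the old signature and kept for the new one.
old_kwargs = optional_kwargs(forward_old, interpolate_pos_encoding=True)
new_kwargs = optional_kwargs(forward_new, interpolate_pos_encoding=True)

print(forward_old("img", **old_kwargs))
print(forward_new("img", **new_kwargs))
```

One limitation of this kind of check: a function that only accepts the flag through a catch-all `**kwargs` does not list it by name, so the flag would be silently dropped for it.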

3ddc-solaris added the Bug (Something isn't working) label Dec 13, 2024
@mcmonkey4eva (Member)

See my answer to the first thread you linked: comfyanonymous/ComfyUI#5402 (comment)

And note that the second thread you linked is literally from me reporting the bug and requesting a fix. They then pushed a fix, which was included in the next release.

So, adjust the answer in the previous thread from "install an old version" to "install the new version".

Open a terminal in (Swarm)\dlbackend\comfy and run

    python_embeded\python -s -m pip install -U transformers

to update transformers to the latest version (4.47.0), which already has the bug fixed.
(On Linux, do it with the venv instead of python_embeded.)
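To confirm the update took effect, a small check along these lines could be run with the same interpreter that ComfyUI uses (the embedded Python or the venv). This is only a sketch: `parse_release` and `has_fix` are helper names invented here, and the 4.47.0 threshold is taken from the reply above rather than from the transformers changelog.

```python
import re
from importlib.metadata import PackageNotFoundError, version

FIXED_IN = (4, 47, 0)  # version named in the reply above


def parse_release(version_string):
    """Turn '4.47.0' or '4.47.0.dev0' into a 3-tuple of its leading integers.

    Pre-release suffixes such as '.dev0' are ignored in this sketch.
    """
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_string!r}")
    parts = tuple(int(part) for part in match.group(0).split("."))[:3]
    return parts + (0,) * (3 - len(parts))  # pad '4.47' to (4, 47, 0)


def has_fix(version_string):
    """True if the given version is at least FIXED_IN."""
    return parse_release(version_string) >= FIXED_IN


if __name__ == "__main__":
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        print("transformers is not installed in this environment")
    else:
        status = "is at least 4.47.0" if has_fix(installed) else "is older than 4.47.0"
        print(f"transformers {installed} {status}")
```

Running it with a different Python than the one ComfyUI launches with would report on the wrong environment, so the interpreter path matters as much as the script.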

@mcmonkey4eva (Member)

also, obviously, delete any stray changes to source code you made lol

@3ddc-solaris (Author)

Haha, I didn't see that the linked threads are also from you. As I said, I have no idea... :D

Thanks for the quick reply!
