add cache of load checkpoints #3545
Conversation
comfy/node_cache.py
Outdated
print(f"[==Cache==] Removed {k} from cache.")

def add_cache(self, k, item):
    """Model patching renames the keys, so keep a copy of the key and a reference to the value before caching."""
Should keep comments in English for PRs to the main repo
Sorry about that, I've fixed it already.
comfy/node_cache.py
Outdated
# get matched ckpt, vae, controlnet queue
cache_mapper = caches_mapping.get(dir_name, None)

if cache_mapper is not None and device in (None, torch.device("cpu"), {}):
If the device is different but the model is in cache, wouldn't it be better to use the cached copy and .to(device)?
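A minimal sketch of that suggestion, assuming the cache maps a path key to a loaded model; load_fn, cache_mapper, and k are placeholder names rather than the PR's actual identifiers:

```python
import torch

def get_or_load(cache_mapper, k, load_fn, device=None):
    cached = cache_mapper.get(k) if cache_mapper is not None else None
    if cached is not None:
        # Reuse the cached copy and move it to the requested device,
        # instead of bypassing the cache when the device differs.
        return cached.to(device) if device is not None else cached
    model = load_fn(k)
    if cache_mapper is not None:
        cache_mapper[k] = model
    return model
```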
Yes, I will try it later. Some tensors are loaded to the GPU.
I don't think the checkpoint cache should be managed automatically. If the cache is released too lazily, even in a typical memory environment with around 32GB, using a few checkpoints can quickly lead to a shortage of RAM. On the other hand, if the cache is released too aggressively, checkpoints that shouldn't be released may be unloaded, requiring them to be reloaded. The release and loading of checkpoints should be very deliberate and manually controllable. That's why, in the Inspire Pack, I manage the checkpoint cache through keys and provide a feature to manually remove keys.
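A rough sketch of that key-based, manually controlled approach; the names are hypothetical and this is not the Inspire Pack's actual code:

```python
# Hypothetical manual checkpoint cache keyed by user-chosen names;
# entries are released only when the user explicitly removes them.
checkpoint_cache = {}

def cache_checkpoint(key, checkpoint):
    checkpoint_cache[key] = checkpoint

def get_cached_checkpoint(key):
    return checkpoint_cache.get(key)

def remove_cached_checkpoint(key):
    # Deliberate, user-driven release of a cached checkpoint.
    checkpoint_cache.pop(key, None)
```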
There's a place for both systems imo -- manual is great for people like you and me who know exactly what we're doing and are willing to monitor it closely, but automatic is better for less experienced users, or shared instances (i.e. you're not the only person mucking with what's loaded or not), or just people who want things to load fast without having to think about it. There's a great point about considering memory load -- the existing ComfyUI code has a VRAM management module that automatically unloads from VRAM when it's too full to fit things in. The cache here should probably operate similarly -- automatically unload cached data from RAM to make room when things might get tight. There should probably even be a variable setting the minimum amount of free space the cache needs to guarantee (since the cache can't track every memory allocation, a minimum free amount would reduce the risk of overloading).
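A rough sketch of that idea, assuming psutil for the free-RAM check and an OrderedDict as the cache; MIN_FREE_RAM and the function name are illustrative, not part of the PR:

```python
import collections
import psutil

# Hypothetical minimum free RAM (bytes) the cache must leave available.
MIN_FREE_RAM = 4 * 1024 ** 3

cache = collections.OrderedDict()

def add_with_eviction(key, item):
    """Insert into the cache, evicting least-recently-used entries
    whenever available system RAM drops below the configured floor."""
    cache[key] = item
    cache.move_to_end(key)
    while len(cache) > 0 and psutil.virtual_memory().available < MIN_FREE_RAM:
        cache.popitem(last=False)  # drop the oldest entry first
```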
Actually, there's not much memory cost here; it just caches references. Reducing the MAX_CACHE_xxx environment variables can reduce the cost.
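A sketch of how such an environment knob could be read; MAX_CACHE_CKPT is a hypothetical stand-in for whichever MAX_CACHE_xxx variables the PR actually defines:

```python
import os

# Hypothetical: cap the number of cached checkpoint references via an
# environment variable (the PR's real MAX_CACHE_xxx names may differ).
MAX_CACHE_CKPT = int(os.environ.get("MAX_CACHE_CKPT", "1"))

def add_cache(cache, key, item, max_size=MAX_CACHE_CKPT):
    cache[key] = item
    while len(cache) > max_size:
        # Evict the oldest inserted entry; only the reference is dropped,
        # so memory is freed once nothing else holds the object.
        cache.pop(next(iter(cache)))
```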
Oh, I didn't look at the code properly last night and mistakenly thought caching was based on the count. If caching is managed based on the model size, it shouldn't be a big issue.
comment to english
Do not filter the load device
Sorry, there are some errors here. I'll update soon.
Please check it here. #3605
Here is an example.