A couple of things I am wondering:
1. It is generally about 4 GB larger in terms of disk usage for a 72B-sized model, even without counting the PiSSA init folder. Not sure why.
2. Can I just fine-tune directly on a pre-processed 4-bit model, and will the saved checkpoint also be a 4-bit model?
3. Last thing: do you have pre-processed Qwen2.5-series models? I only saw Qwen2 on Hugging Face, and I am not sure how much GPU memory is needed to pre-process a model as large as 72B myself (see the sketch below).
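For item 3, this is roughly the pre-processing flow I have in mind, as a minimal sketch only: the model name, output path, and bitsandbytes settings are my own assumptions, not taken from this repo's scripts.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Hypothetical target checkpoint; any Qwen2.5 model should load the same way.
model_id = "Qwen/Qwen2.5-72B-Instruct"

# NF4 quantization config. At ~0.5 byte per parameter, the 4-bit weights of a
# 72B model alone are roughly 36 GB, so I assume I would need one ~48 GB GPU
# (or several smaller ones via device_map="auto") just to hold the model.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Recent transformers/bitsandbytes versions can serialize 4-bit weights,
# so the saved folder is already a 4-bit checkpoint.
model.save_pretrained("qwen2.5-72b-bnb-4bit")
tokenizer.save_pretrained("qwen2.5-72b-bnb-4bit")
```

If the real pre-processing also needs the full-precision weights for the PiSSA SVD step, the memory requirement would of course be higher than this sketch suggests.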
Thanks for your attention on this matter.
chuangzhidan changed the title from "why pre-processed 4 bit model is larger than normal 4 bit model ? and what about Qwen2.5,only saw Qwen2" to "why pre-processed 4 bit model on huggingface is larger than normal 4 bit model ? and what about Qwen2.5,only saw Qwen2" on Jan 2, 2025