
ReFixmatch #235

Open
hujiahujia opened this issue Oct 18, 2024 · 8 comments

@hujiahujia

I ran refixmatch.py with the provided code, without revising it, but the accuracy on the validation set is low. Are there any mistakes in the original code? I'd appreciate it if you could answer my question.

@JiaQuan1203

I have run into the same problem when running FreeMatch. What could the issue be?

@Hhhhhhao
Contributor

Can you post the training log here?

@JiaQuan1203

JiaQuan1203 commented Dec 16, 2024

Thank you for your quick response. I have followed your provided configuration exactly for the training. I hope to receive your guidance. My GPU is a 4080.
@Hhhhhhao
By the way, USB is an excellent piece of work that has helped me a lot. I sincerely appreciate it.

[2024-12-17 01:06:43,869 INFO] model saved: ./saved_models/usb_cv/freematch_cifar100_400_0/latest_model.pth
[2024-12-17 01:06:43,870 INFO] 204800 iteration, USE_EMA: False, train/sup_loss: 0.0000, train/unsup_loss: 0.0100, train/total_loss: -0.0159, train/util_ratio: 0.8750, train/run_time: 0.0593, eval/loss: 1.1764, eval/top-1-acc: 0.8231, eval/top-5-acc: 0.9714, eval/balanced_acc: 0.8231, eval/precision: 0.8248, eval/recall: 0.8231, eval/F1: 0.8221, lr: 0.0001, train/prefecth_time: 0.0020 BEST_EVAL_ACC: 0.8258, at 139264 iters
[2024-12-17 01:06:47,194 INFO] model saved: ./saved_models/usb_cv/freematch_cifar100_400_0/latest_model.pth
[2024-12-17 01:06:47,194 INFO] Model result - eval/best_acc : 0.8258
[2024-12-17 01:06:47,194 INFO] Model result - eval/best_it : 139263
[2024-12-17 01:06:47,194 WARNING] GPU 0 training is FINISHED

[2024-12-16 18:31:44,008 INFO] Use GPU: 0 for training
[2024-12-16 18:31:44,983 INFO] unlabeled data number: 50000, labeled data number 400
[2024-12-16 18:31:45,188 INFO] Create train and test data loaders
[2024-12-16 18:31:45,188 INFO] [!] data loader keys: dict_keys(['train_lb', 'train_ulb', 'eval'])
[2024-12-16 18:31:45,683 INFO] Create optimizer and scheduler
[2024-12-16 18:31:45,689 INFO] Number of Trainable Params: 21436900
[2024-12-16 18:31:45,745 INFO] Arguments: Namespace(save_dir='./saved_models/usb_cv/', save_name='freematch_cifar100_400_0', resume=True, load_path='./saved_models/usb_cv//freematch_cifar100_400_0/latest_model.pth', overwrite=True, use_tensorboard=True, use_wandb=False, use_aim=False, epoch=200, num_train_iter=204800, num_warmup_iter=5120, num_eval_iter=2048, num_log_iter=256, num_labels=400, batch_size=8, uratio=1, eval_batch_size=16, ema_m=0.0, ulb_loss_ratio=1.0, optim='AdamW', lr=0.0005, momentum=0.9, weight_decay=0.0005, layer_decay=0.5, net='vit_small_patch2_32', net_from_name=False, use_pretrain=True, pretrain_path='https://github.com/microsoft/Semi-supervised-learning/releases/download/v.0.0.0/vit_small_patch2_32_mlp_im_1k_32.pth', algorithm='freematch', use_cat=True, amp=False, clip_grad=0, imb_algorithm=None, data_dir='./data', dataset='cifar100', num_classes=100, train_sampler='RandomSampler', num_workers=4, include_lb_to_ulb=True, lb_imb_ratio=1, ulb_imb_ratio=1, ulb_num_labels=None, img_size=32, crop_ratio=0.875, max_length=512, max_length_seconds=4.0, sample_rate=16000, world_size=1, rank=0, dist_url='tcp://127.0.0.1:28769', dist_backend='nccl', seed=0, gpu=0, multiprocessing_distributed=True, c='config/usb_cv/freematch/freematch_cifar100_400_0.yaml', hard_label=True, T=0.5, ema_p=0.999, ent_loss_ratio=0.001, use_quantile=False, clip_thresh=False, clip=0.0, distributed=True, ulb_dest_len=50000, lb_dest_len=400)
[2024-12-16 18:31:45,745 INFO] Resume load path ./saved_models/usb_cv//freematch_cifar100_400_0/latest_model.pth does not exist
[2024-12-16 18:31:45,745 INFO] Model training

@Hhhhhhao
Contributor


Can you refer to the hyper-parameters in this log file: https://drive.google.com/drive/folders/1oON5Vyjvb3vWxOQl7hdUl-eh0K-TEPPS? It seems from an earlier issue that we might have used parameters different from the config file.
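A quick way to spot such differences is to diff the `Namespace(...)` argument dumps that appear in the two training logs. A minimal sketch, assuming the log format shown above (the parsing regex and helper names here are illustrative, not part of the USB codebase):

```python
import re

def parse_namespace(line):
    """Extract key=value pairs from an argparse Namespace(...) dump in a log line."""
    m = re.search(r"Namespace\((.*)\)", line)
    if not m:
        return {}
    # values in these logs are either quoted strings or comma-free tokens
    return {k: v.strip() for k, v in re.findall(r"(\w+)=('[^']*'|[^,]+)", m.group(1))}

def diff_configs(ours, theirs):
    """Return keys whose values differ between two parsed configs (None = key absent)."""
    keys = set(ours) | set(theirs)
    return {k: (ours.get(k), theirs.get(k)) for k in keys
            if ours.get(k) != theirs.get(k)}
```

Running `diff_configs` on the `Arguments:` lines of the local log and the released log would surface mismatched keys such as `clip_thresh` directly.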

@JiaQuan1203

Okay, there are indeed some differences in the parameters. I understand now. I will refer to your log. Thank you for your reply.

@Hhhhhhao
Contributor

Let me know if there is any remaining issue.

@JiaQuan1203

Hi, I reviewed past issues and compared your log with mine. There seems to be only one difference: the clip_thresh parameter is not present in your config. Is it important? Did you set this parameter? Upon inspecting the code, I found that it is disabled by default. Looking forward to your reply.

@Hhhhhhao
Contributor


It has been too long and I cannot remember exactly which parameters we used for the released models. But from your training log, the accuracy doesn't seem far from what we reported (82.58 vs. 84.01)?
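For context on what such a flag typically does: in FreeMatch the global confidence threshold is an EMA of the batch-mean max probability, and a clip flag can cap it so it never saturates near 1.0. A minimal plain-Python sketch of this idea (illustrative only, not the USB source; the 0.95 cap and variable names are assumptions):

```python
def update_global_threshold(time_p, probs_ulb, ema_p=0.999, clip_thresh=False):
    """EMA update of a FreeMatch-style self-adaptive global threshold.

    time_p:    current running threshold (float)
    probs_ulb: per-sample class-probability lists for the unlabeled batch
    """
    # mean of the max class confidence over the unlabeled batch
    batch_max_mean = sum(max(p) for p in probs_ulb) / len(probs_ulb)
    # exponential moving average toward the batch statistic
    time_p = ema_p * time_p + (1 - ema_p) * batch_max_mean
    if clip_thresh:
        # optionally cap the threshold so high-confidence batches
        # cannot push it arbitrarily close to 1.0
        time_p = min(time_p, 0.95)
    return time_p
```

With `clip_thresh=False` (the reported default) the cap is simply never applied, which is consistent with the flag being absent from the released log.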
