
ReFixmatch #235

Open
hujiahujia opened this issue Oct 18, 2024 · 8 comments

@hujiahujia

I ran refixmatch.py with the provided code, without revising it, but the accuracy on the validation set is low. Are there any mistakes in the original code? I'd appreciate it if you could answer my question.

@JiaQuan1203

I have run into the same problem when running FreeMatch. What could the issue be?

@Hhhhhhao
Contributor

Can you post the training log here?

@JiaQuan1203

JiaQuan1203 commented Dec 16, 2024

Thank you for your quick response. I have followed your provided configuration exactly for the training. I hope to receive your guidance. My GPU is a 4080.
@Hhhhhhao
By the way, USB is an excellent piece of work that has helped me a lot. I sincerely appreciate it.

[2024-12-17 01:06:43,869 INFO] model saved: ./saved_models/usb_cv/freematch_cifar100_400_0/latest_model.pth
[2024-12-17 01:06:43,870 INFO] 204800 iteration, USE_EMA: False, train/sup_loss: 0.0000, train/unsup_loss: 0.0100, train/total_loss: -0.0159, train/util_ratio: 0.8750, train/run_time: 0.0593, eval/loss: 1.1764, eval/top-1-acc: 0.8231, eval/top-5-acc: 0.9714, eval/balanced_acc: 0.8231, eval/precision: 0.8248, eval/recall: 0.8231, eval/F1: 0.8221, lr: 0.0001, train/prefecth_time: 0.0020 BEST_EVAL_ACC: 0.8258, at 139264 iters
[2024-12-17 01:06:47,194 INFO] model saved: ./saved_models/usb_cv/freematch_cifar100_400_0/latest_model.pth
[2024-12-17 01:06:47,194 INFO] Model result - eval/best_acc : 0.8258
[2024-12-17 01:06:47,194 INFO] Model result - eval/best_it : 139263
[2024-12-17 01:06:47,194 WARNING] GPU 0 training is FINISHED

[2024-12-16 18:31:44,008 INFO] Use GPU: 0 for training
[2024-12-16 18:31:44,983 INFO] unlabeled data number: 50000, labeled data number 400
[2024-12-16 18:31:45,188 INFO] Create train and test data loaders
[2024-12-16 18:31:45,188 INFO] [!] data loader keys: dict_keys(['train_lb', 'train_ulb', 'eval'])
[2024-12-16 18:31:45,683 INFO] Create optimizer and scheduler
[2024-12-16 18:31:45,689 INFO] Number of Trainable Params: 21436900
[2024-12-16 18:31:45,745 INFO] Arguments: Namespace(save_dir='./saved_models/usb_cv/', save_name='freematch_cifar100_400_0', resume=True, load_path='./saved_models/usb_cv//freematch_cifar100_400_0/latest_model.pth', overwrite=True, use_tensorboard=True, use_wandb=False, use_aim=False, epoch=200, num_train_iter=204800, num_warmup_iter=5120, num_eval_iter=2048, num_log_iter=256, num_labels=400, batch_size=8, uratio=1, eval_batch_size=16, ema_m=0.0, ulb_loss_ratio=1.0, optim='AdamW', lr=0.0005, momentum=0.9, weight_decay=0.0005, layer_decay=0.5, net='vit_small_patch2_32', net_from_name=False, use_pretrain=True, pretrain_path='https://github.com/microsoft/Semi-supervised-learning/releases/download/v.0.0.0/vit_small_patch2_32_mlp_im_1k_32.pth', algorithm='freematch', use_cat=True, amp=False, clip_grad=0, imb_algorithm=None, data_dir='./data', dataset='cifar100', num_classes=100, train_sampler='RandomSampler', num_workers=4, include_lb_to_ulb=True, lb_imb_ratio=1, ulb_imb_ratio=1, ulb_num_labels=None, img_size=32, crop_ratio=0.875, max_length=512, max_length_seconds=4.0, sample_rate=16000, world_size=1, rank=0, dist_url='tcp://127.0.0.1:28769', dist_backend='nccl', seed=0, gpu=0, multiprocessing_distributed=True, c='config/usb_cv/freematch/freematch_cifar100_400_0.yaml', hard_label=True, T=0.5, ema_p=0.999, ent_loss_ratio=0.001, use_quantile=False, clip_thresh=False, clip=0.0, distributed=True, ulb_dest_len=50000, lb_dest_len=400)
[2024-12-16 18:31:45,745 INFO] Resume load path ./saved_models/usb_cv//freematch_cifar100_400_0/latest_model.pth does not exist
[2024-12-16 18:31:45,745 INFO] Model training

@Hhhhhhao
Contributor


Can you refer to the hyper-parameters in this log file: https://drive.google.com/drive/folders/1oON5Vyjvb3vWxOQl7hdUl-eh0K-TEPPS? It seems from an earlier issue that we might have used parameters different from the config file.
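A quick way to spot such differences is to diff the `Namespace(...)` argument dumps that appear in the two training logs. A minimal sketch, assuming the log format shown above (the parsing regex and helper names here are illustrative, not part of the USB codebase):

```python
import re

def parse_namespace(line):
    """Extract key=value pairs from an argparse Namespace(...) dump in a log line."""
    m = re.search(r"Namespace\((.*)\)", line)
    if not m:
        return {}
    # values in these logs are either quoted strings or comma-free tokens
    return {k: v.strip() for k, v in re.findall(r"(\w+)=('[^']*'|[^,]+)", m.group(1))}

def diff_configs(ours, theirs):
    """Return keys whose values differ between two parsed configs (None = key absent)."""
    keys = set(ours) | set(theirs)
    return {k: (ours.get(k), theirs.get(k)) for k in keys
            if ours.get(k) != theirs.get(k)}
```

Running `diff_configs` on the `Arguments:` lines of the local log and the released log would surface mismatched keys such as `clip_thresh` directly.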

@JiaQuan1203

Okay, there are indeed some differences in the parameters. I understand now. I will refer to your log. Thank you for your reply.

@Hhhhhhao
Contributor

Let me know if there is any remaining issue.

@JiaQuan1203

Hi, I reviewed past issues and compared your log with mine. There seems to be only one difference: the clip_thresh parameter is not present in your config. Is it important? Did you set this parameter? Upon inspecting the code, I found that it is disabled by default. Looking forward to your reply.

@Hhhhhhao
Contributor


It has been too long and I cannot remember exactly which parameters we used for the released models. But from your training log, the accuracy doesn't seem far from what we reported (82.58 vs. 84.01)?
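For context on what such a flag typically does: in FreeMatch the global confidence threshold is an EMA of the batch-mean max probability, and a clip flag can cap it so it never saturates near 1.0. A minimal plain-Python sketch of this idea (illustrative only, not the USB source; the 0.95 cap and variable names are assumptions):

```python
def update_global_threshold(time_p, probs_ulb, ema_p=0.999, clip_thresh=False):
    """EMA update of a FreeMatch-style self-adaptive global threshold.

    time_p:    current running threshold (float)
    probs_ulb: per-sample class-probability lists for the unlabeled batch
    """
    # mean of the max class confidence over the unlabeled batch
    batch_max_mean = sum(max(p) for p in probs_ulb) / len(probs_ulb)
    # exponential moving average toward the batch statistic
    time_p = ema_p * time_p + (1 - ema_p) * batch_max_mean
    if clip_thresh:
        # optionally cap the threshold so high-confidence batches
        # cannot push it arbitrarily close to 1.0
        time_p = min(time_p, 0.95)
    return time_p
```

With `clip_thresh=False` (the reported default) the cap is simply never applied, which is consistent with the flag being absent from the released log.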
