Training issues with custom dataset (Missing) and learning rate fluctuations #11804
Comments
@RangiLyu @MiXaiLL76 I need help as soon as possible, please. Thank you.
This is normal; it means that your model did not predict a single bbox :)
In your setup:
This means that initially you have a low LR, which reaches its normal value around epoch 5 and then decreases as MultiStepLR steps at epochs 8 and 12. Why did you tag me? :) I am not a developer of this library :)
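The schedule described above (a low starting LR, the full value around epoch 5, then MultiStepLR drops at epochs 8 and 12) can be sketched in plain Python. This is only an illustration, not the library's scheduler: the function name `lr_at_epoch`, the base LR of 0.01, the warmup start factor of 0.001, and the decay factor of 0.1 are all assumptions, since the actual config values are not shown in this thread.

```python
def lr_at_epoch(epoch, base_lr=0.01, warmup_epochs=5, start_factor=0.001,
                milestones=(8, 12), gamma=0.1):
    """Approximate LR at the start of `epoch` (hypothetical values, see above)."""
    # Linear warmup: scale the LR from start_factor up to 1.0 over warmup_epochs.
    if epoch < warmup_epochs:
        factor = start_factor + (1.0 - start_factor) * epoch / warmup_epochs
    else:
        factor = 1.0
    # Multi-step decay: multiply by gamma once for every milestone already reached.
    decay = gamma ** sum(epoch >= m for m in milestones)
    return base_lr * factor * decay


if __name__ == "__main__":
    for e in range(14):
        print(f"epoch {e:2d}: lr = {lr_at_epoch(e):.6f}")
```

Printing the values per epoch shows the same shape as the LR curve in the training log: a ramp up, a plateau, and two step decreases.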
@MiXaiLL76 I read in a previous issue [#2942] that reducing the learning rate by 10x would resolve the issue, but it didn't. I'm also not sure why the training images are cut in half: I have 600 training images in the dataset, but only 300 are trained. I need someone to help me as soon as possible, since I have no idea how to resolve this and the developers are busy. That's why I tagged you.
There may be some problems with the data. Send it to me by email; I can quickly train the model and check it.
@MiXaiLL76 I have emailed you. Did you receive it?
Hello! Yes, I received your message. I'll take a look today when I'm free from work.
pipeline.zip
+----------+-------+--------+--------+-------+-------+-------+
| category | mAP   | mAP_50 | mAP_75 | mAP_s | mAP_m | mAP_l |
+----------+-------+--------+--------+-------+-------+-------+
| SMD      | 0.544 | 0.835  | 0.643  | 0.0   | 0.486 | 0.719 |
| THP      | 0.605 | 0.854  | 0.652  | nan   | 0.371 | 0.687 |
+----------+-------+--------+--------+-------+-------+-------+
What was the issue in my config file that caused half of the training set to go missing?
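In your setup:
train_dataloader = dict(
    batch_sampler=dict(type='AspectRatioBatchSampler'),
    batch_size=2,
Total frames = 600, and 600 / 2 = 300. The 300 shown in the log counts iterations (batches of 2 images) per epoch, not images, so no images are missing.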
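The arithmetic above can be checked in a few lines. This is a simplified sketch: `iters_per_epoch` is a hypothetical helper written for illustration, not an mmdetection function, and it ignores how the batch sampler groups images by aspect ratio.

```python
import math


def iters_per_epoch(num_images, batch_size, drop_last=False):
    """Number of batches (logged iterations) needed to see every image once."""
    if drop_last:
        # An incomplete final batch is discarded.
        return num_images // batch_size
    # An incomplete final batch still counts as one iteration.
    return math.ceil(num_images / batch_size)


if __name__ == "__main__":
    print(iters_per_epoch(600, 2))  # 600 images in batches of 2
    print(iters_per_epoch(600, 1))  # with batch_size=1 the count equals the image count
```

With 600 images and `batch_size=2`, each epoch has 300 iterations, and each iteration processes 2 images.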
@MiXaiLL76 I have a question. What does it mean when the bbox_mAP, bbox_mAP_50, and bbox_mAP_75 values are the same as AP at IoU=0.50:0.95, IoU=0.50, and IoU=0.75 respectively?
These are the same metrics, just named differently.
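The correspondence between the two sets of names can be written down as a small lookup. This is an illustrative sketch only: the dictionary `METRIC_TO_COCO` and the helper `mean_ap` are hypothetical names, and the AP values in the usage example are made up.

```python
# IoU thresholds covered by "IoU=0.50:0.95": 0.50, 0.55, ..., 0.95 (10 values).
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]

# Log key on the left, the matching line of the COCO summary on the right.
METRIC_TO_COCO = {
    "bbox_mAP": "AP @ IoU=0.50:0.95",
    "bbox_mAP_50": "AP @ IoU=0.50",
    "bbox_mAP_75": "AP @ IoU=0.75",
}


def mean_ap(ap_per_iou):
    """Average AP over all IoU thresholds, i.e. the IoU=0.50:0.95 figure."""
    return sum(ap_per_iou[t] for t in IOU_THRESHOLDS) / len(IOU_THRESHOLDS)


if __name__ == "__main__":
    # Made-up per-threshold AP values, for illustration only.
    fake_ap = {t: 1.0 - t for t in IOU_THRESHOLDS}
    print(IOU_THRESHOLDS)
    print(mean_ap(fake_ap))
```

The averaged figure is always bounded by the best and worst per-threshold AP, which is why bbox_mAP is typically lower than bbox_mAP_50.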
@Warcry25 I think it's time to close this issue.
Describe the bug
I trained RetinaNet on my own custom dataset. I have 600 images in the training set, but during training only 300 are trained; the other 300 are missing and I can't figure out the cause.
The early stage of training also seems off: mmengine - ERROR - /content/mmdetection/mmdet/evaluation/metrics/coco_metric.py - compute_metrics - 465 - The testing results of the whole dataset is empty. From the 5th epoch onwards it is OK. The learning rate looks off as well. Everything can be seen in the log file below.
20240619_183332.log