Home
Welcome to cytounet's wiki. 😃
Compiled below is a list of frequently asked questions.
Training with data from load_augmentations leads to the model getting stuck after one epoch
This is related to the steps_per_epoch argument of keras' model.fit method. If you set this in model.fit, there may not be enough samples and the method will simply get stuck with no warning.
Instead, use the batch_size argument and choose an appropriate batch_size based on your data.
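If you do want to keep steps_per_epoch, one way to avoid the mismatch is to derive it from the number of samples instead of picking it by hand. A minimal sketch, assuming a hypothetical dataset size; the helper name steps_for is illustrative and not part of cytounet or Keras:

```python
import math


def steps_for(n_samples: int, batch_size: int) -> int:
    """Number of batches needed to see every sample once per epoch."""
    if n_samples <= 0 or batch_size <= 0:
        raise ValueError("n_samples and batch_size must be positive")
    # Round up so the last, possibly smaller, batch is still counted.
    return math.ceil(n_samples / batch_size)


# Hypothetical numbers: 130 augmented images in batches of 8.
steps = steps_for(n_samples=130, batch_size=8)
print(steps)
```

If steps_per_epoch is larger than this value and the data source does not repeat, the training loop can end up waiting for batches that never arrive.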
Loss goes up and down
- Try using a LRScheduler to "sequentially" decrease the learning rate.
- If using SGD, it has been suggested to use a "raw" SGD, i.e. one without such parameters as decay. You can set momentum to None or delete this altogether. At the time of writing, this is not possible without manually editing cytounet's source code. See this discussion.
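As an illustration of the first point, a step-decay schedule could look like the sketch below. The constants are assumed values chosen for the example, and whether cytounet's training helpers accept Keras callbacks is not confirmed here.

```python
INITIAL_LR = 1e-2      # assumed starting learning rate
DROP = 0.5             # assumed factor applied at every drop
EPOCHS_PER_DROP = 10   # assumed number of epochs between drops


def step_decay(epoch, lr=None):
    """Learning rate for a 0-indexed epoch.

    The second argument mirrors the (epoch, lr) signature that Keras passes
    to a schedule function; it is ignored in this sketch.
    """
    return INITIAL_LR * DROP ** (epoch // EPOCHS_PER_DROP)


for epoch in (0, 9, 10, 25):
    print(epoch, step_decay(epoch))

# With Keras installed, the function can be wrapped in a callback:
#   from tensorflow.keras.callbacks import LearningRateScheduler
#   callbacks = [LearningRateScheduler(step_decay)]
```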
How can I choose an optimal batch_size?
This will vary depending on your dataset and/or computational resources. For a general overview of the effects of batch_size, take a look at this paper, this and this.
In general, it is stated that batch sizes of powers of two perform well for most data sets.
A simple way to think of this is:
Small batch sizes --> Fast(er) training, faster convergence, greater error in the gradient estimate.
Large batch sizes --> Better estimate, require more computational power and may take more time to converge.
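The trade-off in the gradient estimate can be seen with a toy experiment. This is a sketch only: the random numbers stand in for per-sample gradients and are not produced by any model.

```python
import random
import statistics

random.seed(0)

# Stand-in for per-sample gradients: noisy values around a true mean of 1.0.
per_sample = [random.gauss(1.0, 2.0) for _ in range(10_000)]
full_mean = statistics.fmean(per_sample)


def mean_abs_error(batch_size, trials=200):
    """Average |mini-batch mean - full mean| over random mini-batches."""
    errors = []
    for _ in range(trials):
        batch = random.sample(per_sample, batch_size)
        errors.append(abs(statistics.fmean(batch) - full_mean))
    return statistics.fmean(errors)


# Powers of two, as suggested above.
for batch_size in (8, 32, 128, 512):
    print(batch_size, round(mean_abs_error(batch_size), 3))
```

The printed error shrinks as the batch grows, while each batch costs proportionally more to compute.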
Accuracy and Loss are stuck at the same value
- Try a different optimiser. The default is SGD, which may not perform well on some datasets. Try using Adam and see how that affects your results.
- You probably have an imbalanced data set. In the train/test/validation data generators, try using a batch_size that ensures a balanced number of samples.
- Too high/too low learning rate. Use a learning rate scheduler or write a callback that stops training on plateau.
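The plateau check mentioned in the last point can be sketched in plain Python. The function name and thresholds are illustrative assumptions; Keras also ships EarlyStopping and ReduceLROnPlateau callbacks built around the same idea.

```python
def should_stop(losses, patience=3, min_delta=1e-4):
    """Return True when the last `patience` epochs failed to beat the best
    earlier loss by at least `min_delta`."""
    if len(losses) <= patience:
        return False  # not enough history to judge yet
    best_before = min(losses[:-patience])
    recent_best = min(losses[-patience:])
    return recent_best > best_before - min_delta


stuck = [0.9, 0.7, 0.6, 0.6, 0.6, 0.6]       # hypothetical loss history
improving = [0.9, 0.7, 0.6, 0.5, 0.4, 0.3]   # hypothetical loss history
print(should_stop(stuck), should_stop(improving))
```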
My model trains so slowly, what can I do to get faster training times?
- Try reducing the batch_size.
- Try increasing the number of steps_per_epoch, or use a more powerful GPU.
- You can also try to reduce the input_size to the data generators.
My model is not improving, what should I do?
Try reducing the learning rate. Alternatively, try using a learning rate scheduler or try different combinations of hyper-parameters. This is on the TODO list.
My model has a low loss but does poorly on training data.
Please use generate_validation_data and feed the result to fit_generator. That way, you can see how the model does on the training and validation set. Alternatively, try using a different metric. We currently support dice and binary_crossentropy. You can implement your own and feed it to the metrics argument in unet.
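For reference, the arithmetic behind a Dice score on binary masks can be sketched as follows. This is a plain-Python illustration of the formula, not cytounet's implementation, and the smoothing term is an assumption. A metric passed to Keras would need the same arithmetic written with tensor operations.

```python
def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Dice score for two flat binary masks of equal length.

    dice = (2 * overlap + smooth) / (sum(y_true) + sum(y_pred) + smooth)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("masks must have the same length")
    overlap = sum(t * p for t, p in zip(y_true, y_pred))
    return (2.0 * overlap + smooth) / (sum(y_true) + sum(y_pred) + smooth)


mask = [1, 1, 0, 0]        # hypothetical ground truth
prediction = [1, 0, 0, 0]  # hypothetical prediction
print(dice_coefficient(mask, prediction))
```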
Model cannot train, throws out of resource memory error
This is fairly common and is tied to your machine's computational power. The ideal way would be to have a strong(er) machine. If you're low on resources however, please reduce the batch_size, reduce the input_shape and/or increase the number of steps_per_epoch. This will lead to more training time however.
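To get a feel for why batch_size and input_shape matter, the size of a single input batch can be estimated as below. This is a rough sketch assuming 32-bit floats; the helper name is illustrative, and the model's intermediate activations take considerably more memory than the inputs, although they scale with batch size and image size in the same way.

```python
def input_batch_megabytes(batch_size, height, width, channels=1, bytes_per_value=4):
    """Memory taken by one batch of input images, in MiB (float32 by default)."""
    total_bytes = batch_size * height * width * channels * bytes_per_value
    return total_bytes / (1024 ** 2)


# Hypothetical settings: halving the image side cuts memory by four,
# and a quarter of the batch size cuts it by four again.
print(input_batch_megabytes(32, 512, 512))
print(input_batch_megabytes(8, 256, 256))
```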
If training with augmented data, try using higher steps_per_epoch in the .fit method. Alternatively, you can set a lower batch_size as this reduces the memory load per step. Note that this may result in lower accuracy.
Model gives a black prediction
It is likely that the model converged too soon and/or you used a very low/high learning rate. Try to change these or train with more data. You can use load_augmentations, for example, to process augmented images and increase your training set.
Thank you and do let us know if you have further questions.