NelsonGon edited this page Jul 26, 2020 · 14 revisions

Welcome to cytounet's wiki. First we look at a list of frequently asked questions.

Loss goes up and down

  • Try using a learning rate scheduler (for example Keras' LearningRateScheduler callback) to gradually decrease the learning rate.

  • If using SGD, it has been suggested to use a "raw" SGD, i.e. one without such parameters as decay. You can set momentum to None or remove it altogether. At the time of writing, this is not possible without manually editing cytounet's source code. See this discussion.
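As a minimal sketch of the first suggestion, the schedule function below halves the learning rate every ten epochs. The name step_decay and the values of drop_every and factor are illustrative assumptions, not part of cytounet. Keras' LearningRateScheduler callback accepts a function with this (epoch, lr) signature.

```python
def step_decay(epoch, lr, drop_every=10, factor=0.5):
    """Multiply the current rate by `factor` once every `drop_every` epochs."""
    if epoch > 0 and epoch % drop_every == 0:
        return lr * factor
    return lr


# Simulate how the rate evolves over 30 epochs, starting from 0.01.
lr = 0.01
history = []
for epoch in range(30):
    lr = step_decay(epoch, lr)
    history.append(lr)

print(history[0], history[10], history[20])

# With Keras installed, the same function could be used as a callback:
#   from tensorflow.keras.callbacks import LearningRateScheduler
#   callbacks = [LearningRateScheduler(step_decay)]
```

The resulting callbacks list would then be passed to the training call. Whether cytounet forwards callbacks unchanged is an assumption to verify against its source.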

How can I choose an optimal batch_size?

This will vary depending on your dataset and/or computational resources. For a general overview of the effects of batch_size, take a look at this paper, this and this.

As a rule of thumb, batch sizes that are powers of two (e.g. 16, 32, 64) are commonly reported to perform well for most datasets.

A simple way to think of this is:

Small batch sizes --> Fast(er) training and faster convergence, but greater error (more noise) in the gradient estimate.

Large batch sizes --> Better gradient estimate, but they require more computational power and may take more time to converge.
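The trade-off above also shows up in how many steps one epoch needs. The sketch below assumes a hypothetical dataset of 1000 images and computes the number of batches required to see every image once for a few power-of-two batch sizes. The helper name steps_for_one_epoch is illustrative only.

```python
import math


def steps_for_one_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


n_samples = 1000  # hypothetical dataset size
for batch_size in (8, 16, 32, 64):
    print(batch_size, steps_for_one_epoch(n_samples, batch_size))
```

Smaller batches mean more (but cheaper) steps per epoch. Larger batches mean fewer steps, each needing more memory.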

Accuracy and Loss are stuck at the same value

  • Try a different optimiser. The default is SGD, which may not perform well on some datasets. Try using Adam and see how that affects your results.

  • You may have an imbalanced dataset. In the train/test/validation data generators, try using a batch_size that ensures a balanced number of samples.

  • The learning rate may be too high or too low. Use a learning rate scheduler or write a callback that stops training when the loss plateaus.
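To make "stops training on plateau" concrete, here is a minimal sketch of the check such a callback would perform. The function name has_plateaued and the default values of patience and min_delta are assumptions for illustration. Keras also ships ready-made EarlyStopping and ReduceLROnPlateau callbacks that implement the same idea.

```python
def has_plateaued(losses, patience=3, min_delta=1e-4):
    """True if the best loss of the last `patience` epochs improved on the
    best earlier loss by less than `min_delta`."""
    if len(losses) <= patience:
        return False  # not enough history to decide yet
    best_before = min(losses[:-patience])
    best_recent = min(losses[-patience:])
    return best_before - best_recent < min_delta


improving = [0.9, 0.7, 0.5, 0.4]
stuck = [0.69, 0.69, 0.69, 0.69, 0.69]
print(has_plateaued(improving), has_plateaued(stuck))

# With Keras installed, the built-in equivalents could be used instead:
#   from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
#   callbacks = [EarlyStopping(monitor="loss", patience=3, min_delta=1e-4),
#                ReduceLROnPlateau(monitor="loss", factor=0.5, patience=2)]
```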

My model trains so slowly, what can I do to get faster training times?

  • Try reducing the batch_size.

  • Try increasing the number of steps_per_epoch, or use a more powerful GPU.

  • You can also try to reduce the input_size to the data generators.

My model is not improving, what should I do?

Try reducing the learning rate. Alternatively, try using a learning rate scheduler or different combinations of hyper-parameters. This is on the TODO list.

My model has a low loss but does poorly on training data.

Please use generate_validation_data and feed the result to fit_generator. That way, you can see how the model does on both the training and the validation set. Alternatively, try using a different metric. We currently support dice and binary_crossentropy. You can also implement your own and pass it to the metrics argument in unet.
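For readers who want to write their own metric, the sketch below shows one common formulation of the Dice coefficient on flattened binary masks, in plain Python. It is not necessarily how cytounet's dice is implemented, and the smooth term is an assumption (a frequently used trick to avoid division by zero). A Keras-compatible metric would take the same (y_true, y_pred) arguments but operate on tensors.

```python
def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Dice = (2 * |A ∩ B| + smooth) / (|A| + |B| + smooth) on flat masks."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    return (2.0 * intersection + smooth) / (sum(y_true) + sum(y_pred) + smooth)


mask = [1, 0, 1, 1]
print(dice_coefficient(mask, mask))                  # perfect overlap
print(dice_coefficient([1, 1, 0, 0], [0, 0, 1, 1]))  # no overlap
print(dice_coefficient([1, 1, 0, 0], [0, 0, 0, 0]))  # all-black prediction
```

A score near 1 means the predicted mask overlaps the ground truth well. A low score alongside a low loss is a hint that the loss alone is not telling the whole story.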

Model cannot train, throws out of resource memory error

This is fairly common and is tied to your machine's computational power. Ideally, you would use a more powerful machine. If you are low on resources, however, reduce the batch_size, reduce the input_shape and/or increase the number of steps_per_epoch. Note that this will lead to longer training times.
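As a rough illustration of why these settings matter, the sketch below estimates the memory taken by a single batch of input images stored as 32-bit floats. The helper name batch_bytes and the example sizes are assumptions. Real memory use is considerably higher because the network's intermediate activations scale in the same way, but the proportions are what matter here.

```python
def batch_bytes(batch_size, height, width, channels=1, bytes_per_value=4):
    """Bytes needed to hold one batch of float32 input images."""
    return batch_size * height * width * channels * bytes_per_value


full = batch_bytes(16, 512, 512)
smaller_batch = batch_bytes(8, 512, 512)   # halving batch_size
smaller_input = batch_bytes(16, 256, 256)  # halving each image side

print(full, smaller_batch, smaller_input)
```

Halving the batch_size halves the estimate, while halving each side of the input cuts it to a quarter.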

If training with augmented data, try using a higher steps_per_epoch in the .fit method together with a smaller batch_size, as smaller batches reduce the memory needed per step. Note that this may result in lower accuracy.

Model gives a black prediction

It is likely that the model converged too soon and/or that you used a learning rate that is too low or too high. Try changing the learning rate or training with more data. For example, you can use load_augmentations to process augmented images and enlarge your training set.
