Did you actually test batch sizes that aren't powers of 2? #14

It is fine to use batch sizes that aren't powers of 2. I have no evidence that non-powers of two cause major problems, though I can't claim to know what works best on every hardware platform. Personally, I often use batch sizes that are divisible by a large power of 2 without being powers of 2 themselves, such as 1536.
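A quick sketch (illustrative only, not from the discussion) of what "divisible by a power of 2 but not a power of 2" means for a value like 1536:

```python
def largest_pow2_divisor(n: int) -> int:
    """Return the largest power of two that divides n (isolates the lowest set bit)."""
    return n & -n

for batch_size in (32, 42, 1536):
    p = largest_pow2_divisor(batch_size)
    print(batch_size, p, p == batch_size)
# 1536 = 512 * 3: divisible by 2^9 yet not itself a power of two,
# so it can still align well with hardware tile/warp sizes.
```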

Sometimes you can train faster with a batch size of 42 than 32, simply because the reduction in the number of training steps needed to reach a given result outweighs the increase in time per step.

Answer selected by Ca-ressemble-a-du-fake