-
Hi @numpee. We plan on adding `QUASI_RANDOM` support to distributed mode soon, hopefully. However, in your situation nothing forces you to use these parameters if you don't have the resources required to do so (some machines have 700GB of RAM, but I assume yours doesn't). You can simply use JPEG, or a combination of JPEG+RAW, so that the dataset fits in your RAM (I have no idea how much RAM you have). The Benchmark section of our website has a paragraph that lists different settings and their respective ImageNet sizes.
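The JPEG+RAW mix mentioned above is controlled at write time. A minimal sketch of what that could look like with FFCV's `DatasetWriter` and `RGBImageField` — the path, resolution, and probability values here are placeholders, not recommendations:

```python
from ffcv.writer import DatasetWriter
from ffcv.fields import RGBImageField, IntField

# 'proportion' mode stores each image as raw pixels with probability
# compress_probability, and as JPEG otherwise, trading file size for
# decode speed. All numeric values below are illustrative placeholders.
writer = DatasetWriter('/path/to/imagenet.beton', {
    'image': RGBImageField(write_mode='proportion',
                           max_resolution=512,
                           compress_probability=0.5,
                           jpeg_quality=90),
    'label': IntField(),
}, num_workers=16)

# my_dataset is any map-style dataset yielding (image, label) pairs
writer.from_indexed_dataset(my_dataset)
```

Raising `compress_probability` toward 1.0 (more RAW) grows the file; lowering it (more JPEG) shrinks it, which is how you can tune the `.beton` size to fit your RAM.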
-
Hi there,
I'm trying out FFCV for the first time and I've been writing the ImageNet dataset with the writing parameters listed here. It seems that at around 50k images, the writer uses around 27GB of storage. Assuming a linear scaling, I expect the ImageNet file to be ~700GB.
But is it ever feasible to load 700GB of data onto the RAM?
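The linear extrapolation above can be checked with quick arithmetic (using the standard ImageNet-1k training set size of 1,281,167 images):

```python
# Linear extrapolation of .beton file size from a partial write.
gb_per_image = 27 / 50_000           # observed: ~27GB for ~50k images
imagenet_train_images = 1_281_167    # ImageNet-1k training set size
estimated_gb = gb_per_image * imagenet_train_images
print(f"~{estimated_gb:.0f} GB")     # ≈ 692 GB, consistent with the ~700GB estimate
```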
According to the tuning guide, `os_cache` should always be set to `True` for distributed training, since only the `RANDOM` ordering is supported in distributed mode. Does that mean that if I don't have a machine that can fit the ImageNet dataset into RAM, I can't use FFCV (+ distributed)?

Thanks,