Commit 0a6386b: "update readme" (zhvng, Mar 8, 2023). 1 parent: 50a363c. Showing 1 changed file, README.md, with 15 additions and 0 deletions.
```shell
python ./scripts/train_fine_stage.py \
    ... \
    --kmeans_path PATH_TO_KMEANS_CHECKPOINT # path to previously trained kmeans
```

## Preprocessing
In the setup above, CLAP, HuBERT, and Encodec generate the clap, semantic, and acoustic tokens on the fly during training. However, these models take up GPU memory, and it is wasteful to recompute the same tokens across multiple runs on the same data. Instead, we can compute the tokens ahead of time and iterate over them during training.
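As a sketch of this idea (not the repository's actual implementation; `extract_fn` and the JSON cache layout are hypothetical placeholders), precomputing tokens once and reading them back during training might look like:

```python
import json
from pathlib import Path

def precompute_tokens(extract_fn, clips, cache_dir):
    """Run the (expensive) token extractor once per clip and cache results on disk."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    for i, clip in enumerate(clips):
        path = cache_dir / f"{i}.json"
        if not path.exists():  # skip work already done on a previous run
            path.write_text(json.dumps(extract_fn(clip)))

def iter_cached_tokens(cache_dir):
    """The training loop reads tokens from disk; no token model sits on the GPU."""
    for path in sorted(Path(cache_dir).glob("*.json"), key=lambda p: int(p.stem)):
        yield json.loads(path.read_text())
```

On a second run over the same data, `precompute_tokens` finds the cached files and does no extraction at all, which is the whole point of preprocessing ahead of time.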

To do this, fill in the `data_preprocessor_cfg` field in the config and set `use_preprocessed_data` to True in the trainer configs (see `train_fma_preprocess.json` for a working example). Then run the following to preprocess the dataset, and run your training script afterwards.

```shell
# --stage: stage(s) to preprocess for: all | semantic | coarse | fine
# --rvq_path / --kmeans_path: previously trained RVQ and k-means checkpoints
python ./scripts/preprocess_data.py \
    --stage all \
    --model_config ./configs/model/musiclm_small.json \
    --training_config ./configs/training/train_fma_preprocess.json \
    --rvq_path PATH_TO_RVQ_CHECKPOINT \
    --kmeans_path PATH_TO_KMEANS_CHECKPOINT
```
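For illustration, the config fields mentioned above might be shaped roughly like this; the nested field names and the trainer section name are hypothetical, and `train_fma_preprocess.json` in the repository is the authoritative example:

```json
{
  "data_preprocessor_cfg": {
    "folder": "./data/fma",
    "results_folder": "./preprocessed_data"
  },
  "trainer_cfg": {
    "use_preprocessed_data": true
  }
}
```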
Note: make sure to preprocess enough data for the number of training steps you plan to run. Once the trainer exhausts the data it cycles back to the beginning, but since there is no random cropping in this case, the samples will repeat exactly.
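As a rough back-of-the-envelope check (this helper is illustrative, not part of the repository; the gradient-accumulation parameter is an assumption about the training setup), the number of preprocessed samples should be at least the number of samples the trainer will consume:

```python
def min_samples_to_preprocess(train_steps, batch_size, grad_accum_every=1):
    """Lower bound on dataset size so that no sample repeats during training."""
    # Each training step consumes batch_size * grad_accum_every samples.
    return train_steps * batch_size * grad_accum_every

# e.g. 10000 steps at batch size 4 consumes 40000 samples
```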

## Inference
Generate multiple samples and use CLAP to select the best ones.
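A minimal sketch of that selection step, assuming CLAP embeddings for the text prompt and each generated sample are already available as plain vectors (none of these function names come from the repository):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def pick_best(text_emb, audio_embs):
    """Index of the generated sample whose CLAP audio embedding is
    most similar to the text prompt's CLAP embedding."""
    sims = [cosine(text_emb, a) for a in audio_embs]
    return max(range(len(sims)), key=sims.__getitem__)
```

Generating several candidates and keeping only the top CLAP match trades extra compute for noticeably more on-prompt results.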
