Commit 0a6386b: "update readme" (zhvng, Mar 8, 2023). 1 parent: 50a363c. Showing 1 changed file, README.md, with 15 additions and 0 deletions.
```shell
python ./scripts/train_fine_stage.py \
    ... \
    --kmeans_path PATH_TO_KMEANS_CHECKPOINT # path to previously trained kmeans
```

## Preprocessing
In the setup above, CLAP, HuBERT, and Encodec generate the clap, semantic, and acoustic tokens on the fly during training. However, these models take up GPU memory, and it is wasteful to recompute the same tokens across multiple runs on the same data. Instead, we can compute the tokens ahead of time and iterate over them during training.
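As a sketch of this idea (not the repository's actual implementation; `extract_fn` and the JSON cache layout are hypothetical placeholders), precomputing tokens once and reading them back during training might look like:

```python
import json
from pathlib import Path

def precompute_tokens(extract_fn, clips, cache_dir):
    """Run the (expensive) token extractor once per clip and cache results on disk."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    for i, clip in enumerate(clips):
        path = cache_dir / f"{i}.json"
        if not path.exists():  # skip work already done on a previous run
            path.write_text(json.dumps(extract_fn(clip)))

def iter_cached_tokens(cache_dir):
    """The training loop reads tokens from disk; no token model sits on the GPU."""
    for path in sorted(Path(cache_dir).glob("*.json"), key=lambda p: int(p.stem)):
        yield json.loads(path.read_text())
```

On a second run over the same data, `precompute_tokens` finds the cached files and does no extraction at all, which is the whole point of preprocessing ahead of time.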

To do this, fill in the `data_preprocessor_cfg` field in the config and set `use_preprocessed_data` to True in the trainer configs (see `train_fma_preprocess.json` for a working example). Then run the following to preprocess the dataset, and run your training script afterwards.

```shell
# --stage: stage(s) to preprocess for: all | semantic | coarse | fine
# --rvq_path / --kmeans_path: previously trained RVQ and k-means checkpoints
python ./scripts/preprocess_data.py \
    --stage all \
    --model_config ./configs/model/musiclm_small.json \
    --training_config ./configs/training/train_fma_preprocess.json \
    --rvq_path PATH_TO_RVQ_CHECKPOINT \
    --kmeans_path PATH_TO_KMEANS_CHECKPOINT
```
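For illustration, the config fields mentioned above might be shaped roughly like this; the nested field names and the trainer section name are hypothetical, and `train_fma_preprocess.json` in the repository is the authoritative example:

```json
{
  "data_preprocessor_cfg": {
    "folder": "./data/fma",
    "results_folder": "./preprocessed_data"
  },
  "trainer_cfg": {
    "use_preprocessed_data": true
  }
}
```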
Note: make sure to preprocess enough data for the number of training steps you plan to run. Once the trainer exhausts the data it cycles back to the beginning, but since there is no random cropping in this case, the samples will repeat exactly.
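As a rough back-of-the-envelope check (this helper is illustrative, not part of the repository; the gradient-accumulation parameter is an assumption about the training setup), the number of preprocessed samples should be at least the number of samples the trainer will consume:

```python
def min_samples_to_preprocess(train_steps, batch_size, grad_accum_every=1):
    """Lower bound on dataset size so that no sample repeats during training."""
    # Each training step consumes batch_size * grad_accum_every samples.
    return train_steps * batch_size * grad_accum_every

# e.g. 10000 steps at batch size 4 consumes 40000 samples
```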

## Inference
Generate multiple samples and use CLAP to select the best ones.
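A minimal sketch of that selection step, assuming CLAP embeddings for the text prompt and each generated sample are already available as plain vectors (none of these function names come from the repository):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def pick_best(text_emb, audio_embs):
    """Index of the generated sample whose CLAP audio embedding is
    most similar to the text prompt's CLAP embedding."""
    sims = [cosine(text_emb, a) for a in audio_embs]
    return max(range(len(sims)), key=sims.__getitem__)
```

Generating several candidates and keeping only the top CLAP match trades extra compute for noticeably more on-prompt results.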
