fastsag

This is a PyTorch/GPU implementation of the IJCAI 2024 paper FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation. A demo page is also available.

@article{chen2024fastsag,
  title={FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation},
  author={Chen, Jianyi and Xue, Wei and Tan, Xu and Ye, Zhen and Liu, Qifeng and Guo, Yike},
  journal={arXiv preprint arXiv:2405.07682},
  year={2024}
}

Preparation

  1. Download this code:
git clone https://github.com/chenjianyi/fastsag/
cd fastsag
  2. Download the FastSAG checkpoint from here and put all weights in fastsag/weights.

    BigVGAN checkpoints can be downloaded from BigVGAN. The checkpoint we used is "bigvgan_24khz_100band". BigVGAN has since been upgraded to BigVGAN-v2 in this code, and its checkpoints are downloaded automatically.

    MERT pretrained checkpoints are downloaded automatically from Hugging Face. Please make sure your server can access Hugging Face.
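
Before training or generating, it can help to confirm that the weights are in place and that Hugging Face is reachable. The following is a minimal sketch and not part of the repository: the function names `list_weight_files` and `can_reach` are hypothetical, the check assumes the checkpoint files sit directly under `fastsag/weights`, and it treats a successful request to https://huggingface.co as a rough proxy for being able to download MERT.

```python
import urllib.request
from pathlib import Path


def list_weight_files(weights_dir):
    """Return the sorted names of regular files found directly in weights_dir."""
    root = Path(weights_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_file())


def can_reach(url="https://huggingface.co", timeout=5.0):
    """Return True if an HTTP request to url succeeds within timeout seconds.

    Assumption: reaching the site front page implies model downloads will work.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return False
```

Run from inside the `fastsag` directory, `list_weight_files("weights")` should list the downloaded checkpoint files, and `can_reach()` should return `True` on a server with access to Hugging Face.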

Dataset

  1. Source separation:
cd preprocessing
python3 demucs_processing.py  # you may need to change root_dir and out_dir in this file
  2. Clip to 10 s and filter salient clips:
python3 clip_to_10s.py  # change src_root and des_root for your dataset
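
The script `clip_to_10s.py` is not reproduced here. As an illustration of the general idea only, the sketch below splits a mono waveform into non-overlapping 10 s clips and keeps those whose RMS energy reaches a threshold. The RMS criterion, the threshold value, the 24 kHz default sample rate (borrowed from the "bigvgan_24khz_100band" checkpoint name), and the names `clip_bounds`, `rms`, and `salient_clips` are all assumptions, not the repository's actual logic. Plain Python lists are used to keep the example dependency-free; real audio code would more likely use NumPy or torchaudio.

```python
import math


def clip_bounds(num_samples, sample_rate, clip_seconds=10.0):
    """Return (start, end) sample indices of consecutive full-length clips.

    A trailing remainder shorter than one clip is discarded.
    """
    clip_len = int(round(sample_rate * clip_seconds))
    if clip_len <= 0:
        raise ValueError("clip length must be positive")
    return [(s, s + clip_len)
            for s in range(0, num_samples - clip_len + 1, clip_len)]


def rms(samples):
    """Root-mean-square amplitude of a sequence of float samples."""
    if len(samples) == 0:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def salient_clips(waveform, sample_rate=24000, clip_seconds=10.0, min_rms=0.01):
    """Return bounds of clips whose RMS is at least min_rms.

    Assumption: "salient" is approximated here by a simple energy threshold.
    """
    kept = []
    for start, end in clip_bounds(len(waveform), sample_rate, clip_seconds):
        if rms(waveform[start:end]) >= min_rms:
            kept.append((start, end))
    return kept
```

For example, a 25 s waveform at 24 kHz yields two candidate clips (0 to 10 s and 10 to 20 s); the final 5 s are dropped, and a near-silent clip is filtered out by the threshold.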

Training

cd ../sde_diffusion
python3 train.py --data_dir YOUR_TRAIN_DATA --data_dir_testset YOUR_TEST_DATA --results_folder RESULTS

Generation

python3 generate.py --ckpt TRAINED_MODEL --data_dir DATA_DIR --result_dir OUTPUT
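
To script the training and generation steps, the commands above can be assembled programmatically. The sketch below is a hypothetical helper, not part of the repository: the names `train_command`, `generate_command`, and `run` are made up for illustration, only the flags shown in the commands above are used, and it assumes both scripts are launched from the `sde_diffusion` directory.

```python
import subprocess


def train_command(data_dir, data_dir_testset, results_folder, python="python3"):
    """Build the argument list for the train.py invocation shown above."""
    return [
        python, "train.py",
        "--data_dir", str(data_dir),
        "--data_dir_testset", str(data_dir_testset),
        "--results_folder", str(results_folder),
    ]


def generate_command(ckpt, data_dir, result_dir, python="python3"):
    """Build the argument list for the generate.py invocation shown above."""
    return [
        python, "generate.py",
        "--ckpt", str(ckpt),
        "--data_dir", str(data_dir),
        "--result_dir", str(result_dir),
    ]


def run(command, cwd="."):
    """Run a command in cwd and raise CalledProcessError if it fails."""
    subprocess.run(command, cwd=cwd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when dataset paths contain spaces.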

Acknowledgements and references

  1. Grad-TTS
  2. CoMoSpeech