StereoSampleGAN: A computationally inexpensive approach to high-fidelity stereo audio sample generation.

shuklabhay/stereo-sample-gan

StereoSampleGAN

Model Usage

1. Prereqs

  • Optional but highly recommended: Set up a Python virtual environment.
    • The audio loading package librosa requires an older version of NumPy, so an isolated environment avoids conflicts with other projects.
  • Install requirements by running pip3 install -r requirements.txt
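
The setup steps above can also be scripted. This is a minimal sketch, assuming the repository root is the working directory. The environment directory name `.venv` and the helper functions `venv_python` and `setup_environment` are illustrative and not part of the repository. It relies only on the standard-library `venv` and `subprocess` modules.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir, os_name=os.name):
    """Return the path of the interpreter inside a virtual environment."""
    env_dir = Path(env_dir)
    # Windows places executables in Scripts/, other platforms in bin/.
    if os_name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def setup_environment(env_dir=".venv", requirements="requirements.txt"):
    """Create the environment and install the project's requirements into it."""
    # with_pip=True bootstraps pip inside the new environment.
    venv.create(env_dir, with_pip=True)
    subprocess.run(
        [str(venv_python(env_dir)), "-m", "pip", "install", "-r", requirements],
        check=True,  # raise CalledProcessError if the install fails
    )
```

Calling `setup_environment()` once from the repository root should leave an isolated interpreter at the path returned by `venv_python(".venv")`, which can then be used to run the scripts below.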

2. Generate audio from pretrained models

Specify the number of samples to generate, the output settings, etc. in usage_params.py

  • Generate audio from the Curated Kick model by running python3 src/run_pretrained/generate_curated_kick.py
  • Generate audio from the Diverse Kick model by running python3 src/run_pretrained/generate_diverse_kick.py
  • Generate audio from the Instrument One Shot model by running python3 src/run_pretrained/generate_instrument_one_shot.py
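
The three commands above can be wrapped in a small launcher. This is a sketch under stated assumptions: the script paths are the ones listed above, while the short model keys and the functions `generation_command` and `generate` are hypothetical names introduced here for illustration. It should be run from the repository root.

```python
import subprocess
import sys

# Script paths as listed above, keyed by short illustrative names.
PRETRAINED_SCRIPTS = {
    "curated_kick": "src/run_pretrained/generate_curated_kick.py",
    "diverse_kick": "src/run_pretrained/generate_diverse_kick.py",
    "instrument_one_shot": "src/run_pretrained/generate_instrument_one_shot.py",
}


def generation_command(model):
    """Build the command that runs one pretrained model's generation script."""
    if model not in PRETRAINED_SCRIPTS:
        raise ValueError(
            f"unknown model {model!r}; choose from {sorted(PRETRAINED_SCRIPTS)}"
        )
    # sys.executable keeps the run inside the currently active environment.
    return [sys.executable, PRETRAINED_SCRIPTS[model]]


def generate(model):
    """Run the generation script; raises CalledProcessError on failure."""
    subprocess.run(generation_command(model), check=True)
```

For example, `generate("diverse_kick")` would run the Diverse Kick script with whatever settings are currently in usage_params.py.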

3. Train model

Specify training data parameters in usage_params.py

  • I recommend between 4,000 and 8,000 training examples (any multiple of 8) and audio shorter than 1.5 seconds (longer audio hasn't been fully tested)
  • Prepare training data by running python3 src/data_processing/encode_audio_data.py
  • Train model by running python3 src/stereo_sample_gan.py
  • Generate audio (based on current usage_params.py) by running python3 src/generate.py
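
Before encoding, it may help to check a dataset against the recommendations above. The following is a minimal sketch, not part of the repository: `check_training_set` and its parameter names are hypothetical, and it assumes the clip durations (in seconds) have already been measured with an audio library of your choice.

```python
def check_training_set(durations_sec, min_count=4000, max_count=8000,
                       count_multiple=8, max_duration=1.5):
    """Return a list of warnings for a dataset described by its clip durations."""
    warnings = []
    count = len(durations_sec)

    # Recommended dataset size: between 4,000 and 8,000 examples.
    if not min_count <= count <= max_count:
        warnings.append(
            f"{count} examples is outside the recommended "
            f"{min_count}-{max_count} range"
        )

    # The example count should be a multiple of 8.
    if count % count_multiple != 0:
        usable = count - count % count_multiple
        warnings.append(
            f"{count} is not a multiple of {count_multiple}; "
            f"dropping {count - usable} clips would leave {usable}"
        )

    # Clips of 1.5 s or longer fall outside the tested range.
    too_long = sum(1 for d in durations_sec if d >= max_duration)
    if too_long:
        warnings.append(
            f"{too_long} clips are {max_duration} s or longer (not fully tested)"
        )
    return warnings
```

An empty list means the dataset fits all three recommendations; otherwise each string names one thing to adjust before running the encoding script.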

Training progress visualization (training Diverse Kick Drum Model):

Diverse kick training progress

Pretrained Models

Diverse Kick Drum

Kick drum generation model trained on ~8000 essentially random kick drums.

  • More variation between generated samples, but the audio is occasionally inconsistent and noisy.

Diverse kick model generated examples

Curated Kick Drum

Kick drum generation model trained on ~4400 kick drums with closer matching overall characteristics.

  • Less variation in decay and tone between generated drum samples.

Curated kick model generated examples

Instrument One Shot

Instrument one shot generation model, trained on ~3000 semi-curated instrument one shots.

  • Demonstrates the model's capability to generate longer audio, yet it fails to generate coherent and usable instrument one shots.

Instrument one shot model generated examples

Directories

  • outputs: Trained model and generated audio
  • paper: Research paper / model writeup
  • static: Static resources
  • src: Model source code
    • utils: Model and data utilities
    • data_processing: Training data processing scripts
