A minimal implementation of live audio streaming with an xtts_v2 model

Jcwscience/xtts_live

xtts_live

This project provides a simple wrapper around XTTS-v2 that streams audio output with low latency.

Requirements

  • numpy
  • librosa
  • TTS
  • An audio stream backend such as pyaudio or sounddevice
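
A quick way to confirm that the requirements are importable before loading the model is sketched below. The module names are assumptions based on the usual import names of the packages listed above (`TTS` for the Coqui package, `pyaudio` or `sounddevice` for the audio backend), and `missing_modules` and `check_environment` are hypothetical helpers, not part of xtts_live.

```python
from importlib.util import find_spec

# Import names assumed for the packages listed above.
REQUIRED = ["numpy", "librosa", "TTS"]
AUDIO_BACKENDS = ["pyaudio", "sounddevice"]  # any one of these is enough


def missing_modules(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


def check_environment():
    """Return a list of human-readable problems; an empty list means all good."""
    problems = [f"missing package: {name}" for name in missing_modules(REQUIRED)]
    if len(missing_modules(AUDIO_BACKENDS)) == len(AUDIO_BACKENDS):
        problems.append("no audio backend found (install pyaudio or sounddevice)")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

The check only looks modules up on the import path; it does not import them, so it stays fast even though `TTS` itself is slow to load.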

Getting The Model

If you do not already have the xtts_v2 model, download it by following the instructions at https://huggingface.co/coqui/XTTS-v2, then specify its path when running the script.
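
One possible way to fetch the model from Python is the `snapshot_download` function from the `huggingface_hub` package, which is not listed in the requirements above and is an assumption here. The sketch also includes a hypothetical helper, `missing_model_files`, that checks a directory for files an XTTS-v2 checkpoint is assumed to contain (`config.json`, `model.pth`, `vocab.json`); adjust the list if your copy differs.

```python
from pathlib import Path

# Files assumed to be present in an XTTS-v2 checkpoint directory.
EXPECTED_FILES = ("config.json", "model.pth", "vocab.json")


def missing_model_files(model_dir):
    """Return the expected file names that are absent from `model_dir`."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def download_model(target_dir):
    """Download coqui/XTTS-v2 into `target_dir` and return the local path.

    Requires the huggingface_hub package and network access.
    """
    from huggingface_hub import snapshot_download  # imported lazily on purpose

    return snapshot_download(repo_id="coqui/XTTS-v2", local_dir=str(target_dir))
```

Calling `missing_model_files` on the model path before constructing `TextToSpeech` gives an early, readable error instead of a failure deep inside model loading.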

Usage

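# Import the wrapper
from xtts_live import TextToSpeech

# Placeholder paths: replace with your model directory and reference voice sample(s).
# speaker_wavs is assumed here to be a list of wav paths; see demo.py for the exact form.
model_path = "path/to/XTTS-v2"
speaker_wavs = ["path/to/speaker.wav"]

# Initialize an instance of the TextToSpeech class
tts = TextToSpeech(model_path, speaker_wavs)

# Add text to the processing queue
tts.speak("Text to be spoken.")

# Read frames from the audio buffer (the argument is the number of samples to retrieve)
samples = tts.audio_buffer.get_samples(1024)

# Clean up the tts buffers and threads
tts.stop()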

See demo.py for example stream setup and integration.
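
For orientation, the sketch below shows one way the audio buffer could feed a sounddevice output stream. It is not taken from demo.py. It assumes that `audio_buffer.get_samples(n)` returns a one-dimensional array of at most `n` mono float samples and that the audio is at 24 kHz (the rate XTTS-v2 synthesizes at); `make_callback` and `play` are hypothetical helpers.

```python
import numpy as np

SAMPLE_RATE = 24000  # assumed output rate; change it if the wrapper resamples


def make_callback(audio_buffer):
    """Build a sounddevice-style output callback that drains `audio_buffer`."""

    def callback(outdata, frames, time, status):
        # Assumption: get_samples(n) returns up to n mono samples.
        samples = np.asarray(audio_buffer.get_samples(frames), dtype=np.float32)
        samples = samples.reshape(-1)[:frames]
        outdata.fill(0.0)  # play silence if the buffer underruns
        outdata[: len(samples), 0] = samples

    return callback


def play(tts, seconds=5.0, samplerate=SAMPLE_RATE):
    """Open a mono output stream and play from tts.audio_buffer for `seconds`."""
    import sounddevice as sd  # imported lazily so the callback is testable alone

    with sd.OutputStream(samplerate=samplerate, channels=1, dtype="float32",
                         callback=make_callback(tts.audio_buffer)):
        sd.sleep(int(seconds * 1000))
```

Zero-filling the output block first means a slow synthesis step produces a short gap of silence rather than stale or uninitialized audio.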
