Add ElevenLabs text-to-speech integration #115645

Merged Jul 31, 2024 (30 commits)
Changes from 16 commits
Commits
0d8ce61
Add ElevenLabs text-to-speech integration
sorgfresser Apr 15, 2024
195d1ca
Remove commented out code
sorgfresser Apr 15, 2024
94edd8c
Use model_id instead of model_name for elevenlabs api
sorgfresser Apr 15, 2024
29035b6
Apply suggestions from code review
sorgfresser Apr 16, 2024
bde8a62
Use async client instead of sync
sorgfresser Apr 20, 2024
c205b18
Add ElevenLabs code owner
sorgfresser Apr 20, 2024
6bd9268
Apply suggestions from code review
sorgfresser May 24, 2024
28f50bf
Set entity title to voice
sorgfresser May 24, 2024
3b9751b
Rename to elevenlabs
sorgfresser May 24, 2024
96a37b1
Apply suggestions from code review
sorgfresser May 29, 2024
5449e08
Allow multiple voices and options flow
synesthesiam Jun 7, 2024
7502be1
Sort default voice at beginning
sorgfresser Jun 8, 2024
b6a7f8e
Rework config flow to include default model and reloading on options …
sorgfresser Jun 22, 2024
f33e6dc
Add error to strings
sorgfresser Jun 22, 2024
6586f43
Add ElevenLabsData and suggestions from code review
sorgfresser Jun 30, 2024
c92f0ea
Shorten options and config flow
sorgfresser Jul 6, 2024
eaab9bb
Fix comments
joostlek Jul 27, 2024
8b21925
Merge branch 'dev' into elevenlabs
joostlek Jul 27, 2024
19fb2ff
Fix comments
joostlek Jul 27, 2024
d7bbbd2
Add wip
sorgfresser Jul 30, 2024
990c6f2
Fix
sorgfresser Jul 30, 2024
25879d7
Cleanup
sorgfresser Jul 30, 2024
1f34556
Bump elevenlabs version
sorgfresser Jul 30, 2024
8824ab8
Add data description
sorgfresser Jul 30, 2024
f6d6641
Merge branch 'dev' into elevenlabs
joostlek Jul 31, 2024
8608aa2
Merge branch 'dev' into elevenlabs
sorgfresser Jul 31, 2024
b619c75
Merge branch 'dev' into elevenlabs
cdce8p Jul 31, 2024
4cdc67a
Merge branch 'dev' into elevenlabs
joostlek Jul 31, 2024
b36a3f1
Merge branch 'dev' into elevenlabs
joostlek Jul 31, 2024
2773a00
Fix
joostlek Jul 31, 2024
1 change: 1 addition & 0 deletions .strict-typing
@@ -165,6 +165,7 @@ homeassistant.components.ecowitt.*
homeassistant.components.efergy.*
homeassistant.components.electrasmart.*
homeassistant.components.electric_kiwi.*
homeassistant.components.elevenlabs.*
homeassistant.components.elgato.*
homeassistant.components.elkm1.*
homeassistant.components.emulated_hue.*
2 changes: 2 additions & 0 deletions CODEOWNERS
@@ -371,6 +371,8 @@ build.json @home-assistant/supervisor
/tests/components/electrasmart/ @jafar-atili
/homeassistant/components/electric_kiwi/ @mikey0000
/tests/components/electric_kiwi/ @mikey0000
/homeassistant/components/elevenlabs/ @sorgfresser
/tests/components/elevenlabs/ @sorgfresser
/homeassistant/components/elgato/ @frenck
/tests/components/elgato/ @frenck
/homeassistant/components/elkm1/ @gwww @bdraco
56 changes: 56 additions & 0 deletions homeassistant/components/elevenlabs/__init__.py
@@ -0,0 +1,56 @@
"""The ElevenLabs text-to-speech integration."""

from __future__ import annotations

from dataclasses import dataclass

from elevenlabs.client import AsyncElevenLabs
from elevenlabs.core import ApiError

from homeassistant.config_entries import ConfigEntry
from homeassistant.const import CONF_API_KEY, Platform
from homeassistant.core import HomeAssistant
from homeassistant.exceptions import ConfigEntryAuthFailed

from .const import CONF_MODEL, DEFAULT_MODEL
from .tts import get_model_by_id

PLATFORMS: list[Platform] = [Platform.TTS]


@dataclass(kw_only=True, slots=True)
class ElevenLabsData:
    """ElevenLabs data type."""

    client: AsyncElevenLabs


async def async_setup_entry(hass: HomeAssistant, entry: ConfigEntry) -> bool:
    """Set up ElevenLabs text-to-speech from a config entry."""
    entry.add_update_listener(update_listener)
    client = AsyncElevenLabs(api_key=entry.data[CONF_API_KEY])
    model_id = entry.options.get(CONF_MODEL, entry.data.get(CONF_MODEL))
Member commented:
Why is it present in either options or data? Let's just pick one place for it to be and just use entry.something.get(CONF_MODEL, DEFAULT_MODEL)

Member commented:
That should be options, as users can change it.
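
The two comments above ask for the model to live in one place (options) with a fallback to DEFAULT_MODEL. A minimal sketch of that lookup, using a plain mapping as a stand-in for entry.options; the helper name resolve_model_id is hypothetical and not part of this PR:

```python
from collections.abc import Mapping
from typing import Any

CONF_MODEL = "model"
DEFAULT_MODEL = "eleven_multilingual_v2"


def resolve_model_id(options: Mapping[str, Any]) -> str:
    """Return the model ID stored in options, or the default if unset."""
    # One source of truth (options) plus one fallback replaces the
    # options -> data -> default chain used in the diff.
    return options.get(CONF_MODEL, DEFAULT_MODEL)


print(resolve_model_id({}))
print(resolve_model_id({CONF_MODEL: "some_other_model"}))
```

With a single lookup like this, the separate "Fallback to default" step in async_setup_entry would no longer be needed.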

    # Fallback to default
    model_id = model_id if model_id is not None else DEFAULT_MODEL
    try:
        model = await get_model_by_id(client, model_id)
    except ApiError as err:
        raise ConfigEntryAuthFailed from err
Member commented:
You don't have a reauth flow, please raise ConfigEntryError instead


    if model is None or (not model.languages):
        return False

    entry.runtime_data = ElevenLabsData(client=client)
    await hass.config_entries.async_forward_entry_setups(entry, PLATFORMS)

    return True


async def async_unload_entry(hass: HomeAssistant, entry: ConfigEntry) -> bool:
    """Unload a config entry."""
    return await hass.config_entries.async_unload_platforms(entry, PLATFORMS)


async def update_listener(hass: HomeAssistant, config_entry: ConfigEntry) -> None:
    """Handle options update."""
    await hass.config_entries.async_reload(config_entry.entry_id)
151 changes: 151 additions & 0 deletions homeassistant/components/elevenlabs/config_flow.py
@@ -0,0 +1,151 @@
"""Config flow for ElevenLabs text-to-speech integration."""

from __future__ import annotations

import logging
from types import MappingProxyType
from typing import Any

from elevenlabs.client import AsyncElevenLabs
from elevenlabs.core import ApiError
import voluptuous as vol

from homeassistant.config_entries import (
    ConfigEntry,
    ConfigFlow,
    ConfigFlowResult,
    OptionsFlow,
)
from homeassistant.const import CONF_API_KEY
from homeassistant.helpers.selector import (
    SelectOptionDict,
    SelectSelector,
    SelectSelectorConfig,
)

from .const import CONF_MODEL, CONF_VOICE, DEFAULT_MODEL, DOMAIN

STEP_USER_DATA_SCHEMA_NO_AUTH = vol.Schema({vol.Required(CONF_API_KEY): str})
Member commented:
Why no_auth? The API key is auth, right?



_LOGGER = logging.getLogger(__name__)


async def get_voices_models(api_key: str) -> tuple[dict[str, str], dict[str, str]]:
    """Get available voices and models as dicts."""
    client = AsyncElevenLabs(api_key=api_key)
    voices = (await client.voices.get_all()).voices
    models = await client.models.get_all()
    voices_dict = {
        voice.voice_id: voice.name
        for voice in sorted(voices, key=lambda v: v.name or "")
        if voice.name
    }
    models_dict = {
        model.model_id: model.name
        for model in sorted(models, key=lambda m: m.name or "")
        if model.name and model.can_do_text_to_speech
    }
    return voices_dict, models_dict


class ElevenLabsConfigFlow(ConfigFlow, domain=DOMAIN):
Member commented:
Is there something unique we can fetch to avoid adding the same user account twice?

    """Handle a config flow for ElevenLabs text-to-speech."""

    VERSION = 1

    async def async_step_user(
        self, user_input: dict[str, Any] | None = None
    ) -> ConfigFlowResult:
        """Handle the initial step."""
        errors = {}
        if user_input is None:
            return self.async_show_form(
                step_id="user", data_schema=STEP_USER_DATA_SCHEMA_NO_AUTH
            )
        # Validate auth, get voices
        try:
            _, models = await get_voices_models(user_input[CONF_API_KEY])
        except ApiError:
            errors["base"] = "invalid_api_key"
        if errors:
            return self.async_show_form(
                step_id="user", data_schema=STEP_USER_DATA_SCHEMA_NO_AUTH, errors=errors
            )

        return self.async_create_entry(
            title=f"{models[DEFAULT_MODEL]}", data=user_input
Member commented:
Suggested change
-            title=f"{models[DEFAULT_MODEL]}", data=user_input
+            title=models[DEFAULT_MODEL], data=user_input

        )
Member commented:
Personal opinion, this flow could look cleaner if you flip it around:

errors = {}
if user_input is not None:
    ...getting the voices and checking it works
return self.async_show_form

This leaves you with only 1 async_show_form call
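
The flipped structure suggested above can be sketched outside Home Assistant as follows. Everything here is a stand-in chosen for illustration: ApiError mimics elevenlabs.core.ApiError, fetch_models is a hypothetical validator, and the returned dicts stand in for the results of async_create_entry and async_show_form.

```python
from __future__ import annotations

import asyncio
from typing import Any


class ApiError(Exception):
    """Stand-in for elevenlabs.core.ApiError."""


async def fetch_models(api_key: str) -> dict[str, str]:
    """Hypothetical validator: raise ApiError when the key is rejected."""
    if api_key != "valid-key":
        raise ApiError("invalid key")
    return {"eleven_multilingual_v2": "Example model"}


async def step_user(user_input: dict[str, Any] | None = None) -> dict[str, Any]:
    """Handle the user step with a single form-rendering exit."""
    errors: dict[str, str] = {}
    if user_input is not None:
        try:
            await fetch_models(user_input["api_key"])
        except ApiError:
            errors["base"] = "invalid_api_key"
        else:
            # Success is the only early return.
            return {"type": "create_entry", "data": user_input}
    # The first visit and a failed validation share this one form call.
    return {"type": "form", "step_id": "user", "errors": errors}


print(asyncio.run(step_user()))
print(asyncio.run(step_user({"api_key": "wrong"})))
print(asyncio.run(step_user({"api_key": "valid-key"})))
```

The design point is that the form is built in exactly one place, so the schema and the errors argument cannot drift apart between the two call sites.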

Member commented:
Can we maybe just put the DEFAULT_MODEL in options already?


    @staticmethod
    def async_get_options_flow(
        config_entry: ConfigEntry,
    ) -> OptionsFlow:
        """Create the options flow."""
        return ElevenLabsOptionsFlow(config_entry)


class ElevenLabsOptionsFlow(OptionsFlow):
    """ElevenLabs options flow."""

    # id -> name
    voices: dict[str, str] = {}
    models: dict[str, str] = {}
Member commented:
Don't create the dicts as class variables; create them in the constructor as instance variables.
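
The reason behind the comment above is general Python behaviour: a mutable class attribute is created once and shared by every instance. A small illustration with hypothetical class names (not from this PR):

```python
class SharedState:
    """Anti-pattern: the dict is created once and shared by all instances."""

    voices: dict[str, str] = {}


class PerInstanceState:
    """Preferred: each instance gets its own dict in the constructor."""

    def __init__(self) -> None:
        self.voices: dict[str, str] = {}


a, b = SharedState(), SharedState()
a.voices["id-1"] = "Voice One"  # mutates the class-level dict in place
print(b.voices)  # b sees the entry added through a

c, d = PerInstanceState(), PerInstanceState()
c.voices["id-1"] = "Voice One"
print(d.voices)  # d is unaffected
```

Note that rebinding the attribute (as the diff does with `self.voices, self.models = ...`) creates an instance attribute and does not leak by itself; the class-level default is still fragile because any in-place mutation would be shared.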


    def __init__(self, config_entry: ConfigEntry) -> None:
        """Initialize options flow."""
        self.config_entry = config_entry
        self.api_key: str = self.config_entry.data[CONF_API_KEY]

    async def async_step_init(
        self, user_input: dict[str, Any] | None = None
    ) -> ConfigFlowResult:
        """Manage the options."""
        if not self.voices or not self.models:
            self.voices, self.models = await get_voices_models(self.api_key)

        assert self.models and self.voices

        if user_input is not None:
            return self.async_create_entry(
                title=f"{self.models[user_input[CONF_MODEL]]}",
Contributor Author commented:
changing title to ElevenLabs here as well

                data=user_input,
            )

        schema = self.elevenlabs_config_option_schema(self.config_entry.options)
        return self.async_show_form(
            step_id="init",
            data_schema=schema,
        )

    def elevenlabs_config_option_schema(
        self, options: MappingProxyType[str, Any]
    ) -> vol.Schema:
        """Elevenlabs options schema."""
        return self.add_suggested_values_to_schema(
            vol.Schema(
                {
                    vol.Required(
                        CONF_MODEL,
                    ): SelectSelector(
                        SelectSelectorConfig(
                            options=[
                                SelectOptionDict(label=model_name, value=model_id)
                                for model_id, model_name in self.models.items()
                            ]
                        )
                    ),
                    vol.Required(
                        CONF_VOICE,
                    ): SelectSelector(
                        SelectSelectorConfig(
                            options=[
                                SelectOptionDict(label=voice_name, value=voice_id)
                                for voice_id, voice_name in self.voices.items()
                            ]
                        )
                    ),
                }
            ),
            options,
        )
7 changes: 7 additions & 0 deletions homeassistant/components/elevenlabs/const.py
@@ -0,0 +1,7 @@
"""Constants for the ElevenLabs text-to-speech integration."""

CONF_VOICE = "voice"
CONF_MODEL = "model"
DOMAIN = "elevenlabs"

DEFAULT_MODEL = "eleven_multilingual_v2"
11 changes: 11 additions & 0 deletions homeassistant/components/elevenlabs/manifest.json
@@ -0,0 +1,11 @@
{
  "domain": "elevenlabs",
  "name": "ElevenLabs",
  "codeowners": ["@sorgfresser"],
  "config_flow": true,
  "documentation": "https://www.home-assistant.io/integrations/elevenlabs",
  "integration_type": "service",
  "iot_class": "cloud_polling",
  "loggers": ["elevenlabs"],
  "requirements": ["elevenlabs==1.1.2"]
}
30 changes: 30 additions & 0 deletions homeassistant/components/elevenlabs/strings.json
@@ -0,0 +1,30 @@
{
  "config": {
    "step": {
      "user": {
        "data": {
          "api_key": "[%key:common::config_flow::data::api_key%]"
        }
      },
      "voice": {
        "data": {
          "voice": "Voice",
          "model": "Model"
        }
      }
Member commented:
unused

    },
    "error": {
      "invalid_api_key": "[%key:common::config_flow::error::invalid_api_key%]"
    }
  },
  "options": {
    "step": {
      "init": {
        "data": {
          "voice": "Voice",
          "model": "Model"
        }
      }
    }
  }
}
117 changes: 117 additions & 0 deletions homeassistant/components/elevenlabs/tts.py
@@ -0,0 +1,117 @@
"""Support for the ElevenLabs text-to-speech service."""

from __future__ import annotations

from functools import cached_property
import logging
from typing import Any

from elevenlabs.client import AsyncElevenLabs
from elevenlabs.core import ApiError
from elevenlabs.types import Model, Voice

from homeassistant.components import tts
from homeassistant.config_entries import ConfigEntry
from homeassistant.core import HomeAssistant
from homeassistant.exceptions import HomeAssistantError
from homeassistant.helpers.entity_platform import AddEntitiesCallback

from .const import CONF_MODEL, CONF_VOICE, DEFAULT_MODEL

_LOGGER = logging.getLogger(__name__)


async def get_model_by_id(client: AsyncElevenLabs, model_id: str) -> Model | None:
    """Get ElevenLabs model from their API by the model_id."""
    models = await client.models.get_all()
    for maybe_model in models:
        if maybe_model.model_id == model_id:
            return maybe_model
    return None


async def async_setup_entry(
    hass: HomeAssistant,
    config_entry: ConfigEntry,
    async_add_entities: AddEntitiesCallback,
) -> None:
    """Set up ElevenLabs tts platform via config entry."""
    client = config_entry.runtime_data.client
    # Get model, fallback to default
    model_id = config_entry.options.get(CONF_MODEL, DEFAULT_MODEL)
    model = await get_model_by_id(client, model_id)
    assert model is not None, "Model was not found in async_setup_entry"
    voices = (await client.voices.get_all()).voices
    default_voice_id = config_entry.options.get(CONF_VOICE, voices[0].voice_id)
Contributor Author commented:
Suggested change
-    default_voice_id = config_entry.options.get(CONF_VOICE, voices[0].voice_id)
+    assert CONF_VOICE in config_entry.options and CONF_MODEL in config_entry.options
+    default_voice_id = config_entry.options[CONF_VOICE]

Member commented:
This is not needed right?

Contributor Author commented:
The assert? Agreed.

Contributor Author (@sorgfresser, Jul 28, 2024) commented:
The change from .get() to key access isn't needed either, though with my new changes the CONF_VOICE key should always be set, so I would remove the .get() here in favour of __getitem__.

Member commented:
Yea that's awesome!
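
The trade-off discussed in this thread, shown on a plain dict standing in for config_entry.options. The function names are hypothetical; the point is only that `.get()` with a fallback hides a missing key, while indexing surfaces it immediately as a KeyError.

```python
CONF_VOICE = "voice"


def default_voice_with_get(options: dict[str, str], fallback: str) -> str:
    """Tolerant lookup: silently substitutes a fallback when the key is missing."""
    return options.get(CONF_VOICE, fallback)


def default_voice_strict(options: dict[str, str]) -> str:
    """Strict lookup: raises KeyError if the key is unexpectedly missing."""
    return options[CONF_VOICE]


print(default_voice_strict({CONF_VOICE: "voice-id-123"}))

try:
    default_voice_strict({})
except KeyError as err:
    print(f"missing option: {err}")
```

If the options flow guarantees that the voice is always stored, the strict form documents that invariant without a separate assert.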

    async_add_entities(
        [ElevenLabsTTSEntity(config_entry, client, model, voices, default_voice_id)]
    )


class ElevenLabsTTSEntity(tts.TextToSpeechEntity):
    """The ElevenLabs API entity."""

    def __init__(
        self,
        config_entry: ConfigEntry,
        client: AsyncElevenLabs,
        model: Model,
        voices: list[Voice],
        default_voice_id: str,
    ) -> None:
        """Init ElevenLabs TTS service."""
        self._client = client
        self._model = model
        self._default_voice_id = default_voice_id
        self._voices = sorted(
            (tts.Voice(v.voice_id, v.name) for v in voices if v.name),
            key=lambda v: v.name,
        )
        # Default voice first
        voice_indices = [
            idx for idx, v in enumerate(self._voices) if v.voice_id == default_voice_id
        ]
        if voice_indices:
            self._voices.insert(0, self._voices.pop(voice_indices[0]))
        self._attr_name = config_entry.title
Member commented:
Please use device_info to create a device with entry_type -> Service, and then leave the name empty. All new integrations should use _attr_has_entity_name = True. Then also add _attr_name = None so the entity takes the name of the device, which will take the name of the config entry

Contributor Author commented:
Please, could you tell me why we need this separate Service instead of the default tts.speak one? I am most likely missing something here. This code is heavily derived from the google translate tts integration, so maybe part of the things I did are not up to date

Member commented:
Oh, it's not a service as in a service call, rather a device that is a service. So if you open your integration list on your environment, you see a lot of x DEVICES and x SERVICES underneath each integration; this makes it a service

        self._attr_unique_id = config_entry.entry_id
Member commented:
Rather just make the entry_id a parameter instead of the whole config entry

Member commented:
but there is only 1 entity?

Member commented:
You don't have to pull in a whole object for only 1 field, having it as parameter is clearer

        self._config_entry = config_entry
Member commented:
unused


    @cached_property
    def default_language(self) -> str:
        """Return the default language."""
        return self.supported_languages[0]

    @cached_property
    def supported_languages(self) -> list[str]:
        """Return list of supported languages."""
        return [lang.language_id for lang in self._model.languages or []]

    @property
    def supported_options(self) -> list[str]:
        """Return a list of supported options."""
        return [tts.ATTR_VOICE]
Contributor commented:
With #115684 merged, you can convert these properties to _attr instance variables.
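
A rough sketch of the conversion the comment above describes, assuming the base entity exposes properties that read from `_attr_*` variables. BaseEntity here is a hypothetical stand-in, not Home Assistant's actual TextToSpeechEntity, and the attribute name and the "voice" value are illustrative only.

```python
from __future__ import annotations


class BaseEntity:
    """Stand-in base class whose property reads an _attr_* variable."""

    _attr_supported_options: list[str] | None = None

    @property
    def supported_options(self) -> list[str] | None:
        return self._attr_supported_options


class PropertyStyleEntity(BaseEntity):
    """Before: the subclass overrides the property itself."""

    @property
    def supported_options(self) -> list[str]:
        return ["voice"]


class AttrStyleEntity(BaseEntity):
    """After: the subclass only assigns the backing instance variable."""

    def __init__(self) -> None:
        self._attr_supported_options = ["voice"]


print(PropertyStyleEntity().supported_options)
print(AttrStyleEntity().supported_options)
```

Both styles expose the same value to callers; the second removes the per-property boilerplate from the integration.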


    def async_get_supported_voices(self, language: str) -> list[tts.Voice]:
        """Return a list of supported voices for a language."""
        return self._voices

    async def async_get_tts_audio(
        self, message: str, language: str, options: dict[str, Any]
    ) -> tts.TtsAudioType:
        """Load tts audio file from the engine."""
        _LOGGER.debug("Getting TTS audio for %s", message)
        voice_id = options[tts.ATTR_VOICE]
        try:
            audio = await self._client.generate(
                text=message,
                voice=voice_id,
                model=self._model.model_id,
            )
            bytes_combined = b"".join([byte_seg async for byte_seg in audio])
        except ApiError as exc:
            _LOGGER.warning(
                "Error during processing of TTS request %s", exc, exc_info=True
            )
            raise HomeAssistantError(exc) from exc
        return "mp3", bytes_combined