
[Inference API] Fix Azure AI Studio Integration for Completions and Embeddings #119818

Draft · wants to merge 5 commits into base: main

Conversation

brendan-jugan-elastic

This draft PR fixes the Inference API integration with Azure AI Foundry (previously Azure AI Studio). The previous integration was broken for both completions and embeddings models due to API changes from Microsoft.

Core Changes:

  • the integration no longer references an approved list of providers
  • a required AzureAiStudioDeploymentType is introduced in the service settings
    • either azure_ai_model_inference_service or serverless_api
  • the auth configuration is modified for each deployment type
  • slight request-format modifications to match Microsoft's API changes
  • testing and rebranding changes are in progress; I wanted to get some eyes on the implementation while I complete them

Once testing is complete, I will add more detailed docs describing the deployment types and their configurations, and explaining how to use this integration, with screenshots from the Azure console.

Local Testing:

Embeddings:

PUT http://localhost:9200/_inference/text_embedding/cohere_serverless_embed

curl --location --request PUT 'http://localhost:9200/_inference/text_embedding/cohere_serverless_embed' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "service": "azureaistudio",
  "service_settings": {
    "api_key": "*****",
    "target": "https://example-target.eastus.models.ai.azure.com/embeddings",
    "deployment_type": "serverless_api",
    "deployment_name": "Cohere-embed-v3-english-hmcek"
  }
}'

POST http://localhost:9200/_inference/text_embedding/cohere_serverless_embed

curl --location 'http://localhost:9200/_inference/text_embedding/cohere_serverless_embed' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "input": "What is Elastic?"
}'



PUT http://localhost:9200/_inference/text_embedding/cohere_amlis_embed

curl --location --request PUT 'http://localhost:9200/_inference/text_embedding/cohere_amlis_embed' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "service": "azureaistudio",
  "service_settings": {
    "api_key": "*****",
    "target": "https://example-target/models/embeddings",
    "deployment_type": "azure_ai_model_inference_service",
    "deployment_name": "Cohere-embed-v3-english"
  }
}'

curl --location 'http://localhost:9200/_inference/text_embedding/cohere_amlis_embed' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "input": "What is Elastic?"
}'

Completions:

PUT http://localhost:9200/_inference/completion/cohere_serverless_completion

curl --location --request PUT 'http://localhost:9200/_inference/completion/cohere_serverless_completion' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "service": "azureaistudio",
  "service_settings": {
    "api_key": "*****",
    "target": "https://example-target.eastus.models.ai.azure.com/chat/completions",
    "deployment_type": "serverless_api",
    "deployment_name": "Cohere-command-r"
  }
}'

POST http://localhost:9200/_inference/completion/cohere_serverless_completion

curl --location 'http://localhost:9200/_inference/completion/cohere_serverless_completion' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "input": "What is Elastic?"
}'



PUT http://localhost:9200/_inference/completion/cohere_amlis_completion

curl --location --request PUT 'http://localhost:9200/_inference/completion/cohere_amlis_completion' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "service": "azureaistudio",
  "service_settings": {
    "api_key": "*****",
    "target": "https://example-target.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview",
    "deployment_type": "azure_ai_model_inference_service",
    "deployment_name": "Cohere-command-r"
  }
}'

POST http://localhost:9200/_inference/completion/cohere_amlis_completion

curl --location 'http://localhost:9200/_inference/completion/cohere_amlis_completion' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "input": "What is Elastic?"
}'

Related Issues:

Helpful Links:

@brendan-jugan-elastic brendan-jugan-elastic changed the title WIP(azure_ai_foundry): fix implementation for completions and embeddings [(WIP) Inference API] Fix Azure AI Foundry Integration for Completions and Embeddings Jan 9, 2025
Contributor

@timgrein timgrein left a comment

Nice, good stuff! 👏 Gave it a first pass and left some comments; already looking good.

@brendan-jugan-elastic brendan-jugan-elastic changed the title [(WIP) Inference API] Fix Azure AI Foundry Integration for Completions and Embeddings [Inference API] Fix Azure AI Studio Integration for Completions and Embeddings Jan 10, 2025
@brendan-jugan-elastic
Author

Note: I'm waiting to complete the Azure AI Studio -> Azure AI Foundry renaming until tomorrow. The above commits contain all of the functional/test changes for this Inference API fix.

Contributor

@timgrein timgrein left a comment


Mainly left comments on the transport-level changes we need to address. We can also sync on that; it can be a bit confusing in the beginning.

out.writeEnum(provider);
out.writeEnum(endpointType);
if (out.getTransportVersion().before(AZURE_AI_FOUNDRY_INTEGRATION_FIX_1_10_25)) {
    out.writeEnum(AzureAiFoundryProvider.NONE);
}

The old node does not know about the enum value AzureAiFoundryProvider.NONE, as that is added in this PR. Enums are written by their ordinal values, so the old node will read the ordinal value for AzureAiFoundryProvider.NONE (let's say it is 2) but then throw an error because it does not know of any AzureAiFoundryProvider enum with ordinal value 2.

If there isn't a logical mapping from the old fields (provider, endpointType) to the new ones (deploymentType, model), then it is a question of how we want this to fail.

Inference endpoint creation is a master node action. The request will be serialised from whichever node it lands on to the master node, and the response in turn will be serialised back to the originating node. PutInferenceModelAction.Response contains the ServiceSettings (we return the new endpoint configuration); if the originating node is an old node that doesn't know about the new options, it won't return the proper config, as some fields will have been lost in serialisation. We can solve for that by not allowing new AzureAiFoundry inference endpoints to be created in a mixed cluster where some nodes do not know about this change.
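The ordinal mismatch described above can be illustrated with a small, self-contained sketch. The enums here are simplified, hypothetical stand-ins for the real AzureAiFoundryProvider, and readOnOldNode only models the by-ordinal read that the transport layer performs:

```java
// Simplified stand-ins for the provider enum before and after this PR
// (hypothetical names, not the real Elasticsearch classes).
public class EnumOrdinalBwcDemo {
    enum OldProvider { OPENAI, COHERE }          // an old node's view
    enum NewProvider { OPENAI, COHERE, NONE }    // a new node's view: NONE added

    // Models an old node reading an enum off the wire: enums travel as ordinals,
    // and an ordinal outside the known values() array cannot be resolved.
    static String readOnOldNode(int wireOrdinal) {
        OldProvider[] known = OldProvider.values();
        if (wireOrdinal < 0 || wireOrdinal >= known.length) {
            return "error: unknown ordinal " + wireOrdinal;
        }
        return known[wireOrdinal].name();
    }

    public static void main(String[] args) {
        // A new node writes NONE by its ordinal (2)...
        int wireOrdinal = NewProvider.NONE.ordinal();
        // ...and the old node cannot map it back to any constant it knows.
        System.out.println(readOnOldNode(wireOrdinal)); // error: unknown ordinal 2
    }
}
```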

namedWriteables.add(
    new NamedWriteableRegistry.Entry(
        ServiceSettings.class,
        AzureAiFoundryChatCompletionServiceSettings.NAME,
        AzureAiFoundryChatCompletionServiceSettings::new
    )
);

NAME has changed from azure_ai_studio_chat_completion_service_settings to azure_ai_foundry_chat_completion_service_settings. When an old node writes a named writeable with the old name azure_ai_studio_chat_completion_service_settings, this node will not know about it.

Because there is logic in the AzureAiFoundryChatCompletionServiceSettings serialisation code to handle backwards compatibility, it is better not to change the name. Just add a comment explaining that NAME hasn't changed, to maintain BWC.
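A minimal sketch of that suggestion (the class here is a hypothetical stand-in, and the comment wording is illustrative, not the actual Elasticsearch source):

```java
// Hypothetical sketch of the reviewer's suggestion: the class is rebranded
// from Studio to Foundry, but the wire name stays on the old string.
public class AzureAiFoundryChatCompletionServiceSettingsSketch {
    // NOTE: deliberately NOT renamed to "azure_ai_foundry_...". Nodes resolve
    // named writeables by this string, so changing it would break BWC with
    // old nodes that still write the azure_ai_studio_... name.
    public static final String NAME = "azure_ai_studio_chat_completion_service_settings";
}
```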
