Azure Cognitive Search

Azure Cognitive Search is a complete retrieval cloud service that supports vector search, text search, and hybrid search (vectors and text combined to yield the best of the two approaches). Azure Cognitive Search also offers an optional L2 re-ranking step to further improve result quality.

You can find the Azure Cognitive Search documentation here. If you don't have an Azure account, you can start setting one up here.

Signing up for vector search

Azure Cognitive Search supports searching using pure vectors, pure text, or a hybrid mode where both are combined. For the vector-based cases, you'll need to sign up for the vector search private preview. To sign up, please fill in this form: https://aka.ms/VectorSearchSignUp

Environment variables

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| DATASTORE | Yes | Datastore name, set to azuresearch | |
| BEARER_TOKEN | Yes | Secret token | |
| OPENAI_API_KEY | Yes | OpenAI API key | |
| AZURESEARCH_SERVICE | Yes | Name of your search service | |
| AZURESEARCH_INDEX | Yes | Name of your search index | |
| AZURESEARCH_API_KEY | No | Your API key, if using key-based auth instead of Azure managed identity | Uses managed identity |
| AZURESEARCH_DISABLE_HYBRID | No | Disable hybrid search and only use vector similarity | Use hybrid search |
| AZURESEARCH_SEMANTIC_CONFIG | No | Enable L2 re-ranking with this configuration name (see re-ranking below) | L2 not enabled |
| AZURESEARCH_LANGUAGE | No | If using L2 re-ranking, language for queries/documents (valid values listed here) | en-us |
| AZURESEARCH_DIMENSIONS | No | Vector size for embeddings | 1536 (OpenAI's Ada002) |
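
As an illustration of how the variables above fit together, here is a minimal Python sketch that reads them, checks the required ones, and applies the documented defaults. The `AzureSearchSettings` class and `load_settings` function are hypothetical names used only for this example, not part of the plugin, and treating any non-empty `AZURESEARCH_DISABLE_HYBRID` value as "disabled" is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variables marked "Yes" in the Required column of the table above.
REQUIRED = (
    "DATASTORE",
    "BEARER_TOKEN",
    "OPENAI_API_KEY",
    "AZURESEARCH_SERVICE",
    "AZURESEARCH_INDEX",
)


@dataclass(frozen=True)
class AzureSearchSettings:
    """Hypothetical container for the Azure-specific settings."""
    service: str
    index: str
    api_key: Optional[str]          # None means: fall back to managed identity
    disable_hybrid: bool            # False means: use hybrid search
    semantic_config: Optional[str]  # None means: L2 re-ranking not enabled
    language: str
    dimensions: int


def load_settings(env: Mapping[str, str] = os.environ) -> AzureSearchSettings:
    """Read the settings from a mapping, applying the defaults from the table."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required environment variables: " + ", ".join(missing))
    return AzureSearchSettings(
        service=env["AZURESEARCH_SERVICE"],
        index=env["AZURESEARCH_INDEX"],
        api_key=env.get("AZURESEARCH_API_KEY") or None,
        # Assumption: any non-empty value counts as "disable hybrid".
        disable_hybrid=bool(env.get("AZURESEARCH_DISABLE_HYBRID")),
        semantic_config=env.get("AZURESEARCH_SEMANTIC_CONFIG") or None,
        language=env.get("AZURESEARCH_LANGUAGE", "en-us"),
        dimensions=int(env.get("AZURESEARCH_DIMENSIONS", "1536")),
    )
```

Passing a plain dictionary instead of `os.environ` makes it easy to try the function without touching the real environment.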

Authentication Options

  • API key: key-based authentication is enabled by default on the search service; you can obtain the key in the Azure Portal or using the Azure CLI, and pass it to the plugin through AZURESEARCH_API_KEY.
  • Managed identity: If the plugin is running in Azure, you can enable managed identity for the host and give that identity access to the service, without having to manage keys (avoiding secret storage, rotation, etc.). More details here.
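
The choice between the two options can be sketched as a small Python function. This is only an illustration under stated assumptions: `resolve_auth` and `endpoint_for` are hypothetical helpers, and the endpoint format shown is the one used by search services in the public Azure cloud. With the Azure SDK for Python, the two outcomes would typically translate to `AzureKeyCredential(key)` from `azure.core.credentials` and `DefaultAzureCredential()` from `azure.identity`.

```python
from typing import Mapping, Optional, Tuple


def resolve_auth(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return ("api_key", key) if a key is configured, else ("managed_identity", None)."""
    key = env.get("AZURESEARCH_API_KEY")
    if key:
        # Key-based auth: the key would be wrapped in AzureKeyCredential(key).
        return ("api_key", key)
    # No key configured: rely on the host's managed identity,
    # for example through DefaultAzureCredential().
    return ("managed_identity", None)


def endpoint_for(service: str) -> str:
    """Build the service endpoint (assumes the public Azure cloud domain)."""
    return f"https://{service}.search.windows.net"
```

Keeping the decision in one place means the rest of the code never needs to know which authentication mode is active.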

Re-ranking

Azure Cognitive Search offers the option to enable a second (L2) ranking step after retrieval to further improve result quality. This only applies when using text or hybrid search. Because it has latency and cost implications, you need to opt in explicitly: enable "semantic search" in your Cognitive Search service, create a semantic search configuration for your index, and set AZURESEARCH_SEMANTIC_CONFIG to the name of that configuration.
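
To make the interaction between hybrid search and re-ranking concrete, the following Python sketch derives a query plan from the environment variables. It is a simplified model, not the plugin's actual code: `QueryPlan` and `plan_query` are hypothetical names, and the sketch assumes that any non-empty `AZURESEARCH_DISABLE_HYBRID` value turns hybrid search off.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class QueryPlan:
    """Hypothetical summary of how a query would be executed."""
    use_vector: bool
    use_text: bool
    semantic_config: Optional[str]  # None means no L2 re-ranking
    language: Optional[str]         # only meaningful when re-ranking is on


def plan_query(env: Mapping[str, str]) -> QueryPlan:
    # Assumption: any non-empty value disables hybrid search.
    hybrid = not env.get("AZURESEARCH_DISABLE_HYBRID")
    semantic = env.get("AZURESEARCH_SEMANTIC_CONFIG") or None
    # Re-ranking only applies to text or hybrid search, so it is
    # dropped when the query is vector-only.
    if not hybrid:
        semantic = None
    language = env.get("AZURESEARCH_LANGUAGE", "en-us") if semantic else None
    return QueryPlan(
        use_vector=True,
        use_text=hybrid,
        semantic_config=semantic,
        language=language,
    )
```

For example, setting both `AZURESEARCH_DISABLE_HYBRID` and `AZURESEARCH_SEMANTIC_CONFIG` yields a vector-only plan with no re-ranking, since there is no text query for the L2 step to work with.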

Using existing search indexes

If an existing index has fields that match what the retrieval plugin needs but differ only in name, you can map your fields to the plugin fields using the following environment variables:

| Plugin field name | Environment variable to override it |
| --- | --- |
| id | AZURESEARCH_FIELDS_ID |
| text | AZURESEARCH_FIELDS_TEXT |
| embedding | AZURESEARCH_FIELDS_EMBEDDING |
| document_id | AZURESEARCH_FIELDS_DOCUMENT_ID |
| source | AZURESEARCH_FIELDS_SOURCE |
| source_id | AZURESEARCH_FIELDS_SOURCE_ID |
| url | AZURESEARCH_FIELDS_URL |
| created_at | AZURESEARCH_FIELDS_CREATED_AT |
| author | AZURESEARCH_FIELDS_AUTHOR |
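
The mapping above can be sketched in Python as follows. This is an illustrative example rather than the plugin's implementation: `field_names` and `to_plugin_record` are hypothetical helpers, and the sketch assumes that a field without an override keeps the plugin's own name in the index.

```python
from typing import Any, Dict, Mapping

# Plugin field names from the table above.
PLUGIN_FIELDS = (
    "id", "text", "embedding", "document_id", "source",
    "source_id", "url", "created_at", "author",
)


def field_names(env: Mapping[str, str]) -> Dict[str, str]:
    """Map each plugin field to the index field name that stores it."""
    return {
        field: env.get(f"AZURESEARCH_FIELDS_{field.upper()}", field)
        for field in PLUGIN_FIELDS
    }


def to_plugin_record(doc: Mapping[str, Any], mapping: Mapping[str, str]) -> Dict[str, Any]:
    """Rename the keys of an index document to the plugin's field names."""
    return {plugin: doc.get(index) for plugin, index in mapping.items()}
```

With `AZURESEARCH_FIELDS_TEXT=content`, for instance, an index document that stores its body under `content` would be exposed to the plugin under `text`, while all other fields keep their default names.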