These days I'm playing a lot, really a lot, with Azure OpenAI and the different models you can request access to, looking at how, from a development point of view, their use is implemented from scratch for different scenarios. In this article I want to share my Terraform with you to create the OpenAI resource and deploy the different models, so you don't have to do it by hand.
Access to Azure OpenAI
To deploy this resource and its model deployments, you first need access to Azure OpenAI. If you haven't requested it yet, you can do so through the following link.
Terraform to deploy Azure OpenAI
And here is my Terraform to deploy everything, with comments on what each model is for:
# Provider
provider "azurerm" {
features {
cognitive_account {
purge_soft_delete_on_destroy = true
}
resource_group {
prevent_deletion_if_contains_resources = false
}
}
}
terraform {
required_providers {
azapi = {
source = "azure/azapi"
}
azurerm = {
source = "hashicorp/azurerm"
}
}
}
# Variables
variable "location" {
type = string
default = "westeurope"
validation {
condition = contains(["westeurope", "eastus"], var.location)
error_message = "Location must be either westeurope or eastus"
}
}
# Resource Group
resource "azurerm_resource_group" "rg" {
name = "azure-openai-demos"
location = var.location
}
# OpenAI resource
resource "azurerm_cognitive_account" "openai" {
name = "azopenai"
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
kind = "OpenAI"
sku_name = "S0"
}
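# Optional addition (not in the original file): expose the endpoint and one of
# the access keys as outputs so you can consume the service right away. The
# attribute names come from the azurerm provider's cognitive account resource.
output "openai_endpoint" {
  value = azurerm_cognitive_account.openai.endpoint
}
output "openai_primary_key" {
  value     = azurerm_cognitive_account.openai.primary_access_key
  sensitive = true
}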
# Deployments
# https://learn.microsoft.com/en-us/azure/cognitive-services/openai/concepts/models#model-summary-table-and-region-availability
##########################################################################
######################### Naming convention ##############################
##########################################################################
# The naming convention for the models is as follows:
# {capability}-{family}[-{input-type}]-{identifier}
# {capability} The capability of the model. For example, GPT-3 models use text, while Codex models use code.
# {family} The relative family of the model. For example, GPT-3 models include ada, babbage, curie, and davinci.
# {input-type} (Embeddings models only) The input type of the embedding supported by the model. For example, text search embedding models support doc and query.
# {identifier} The version identifier of the model.
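# For example, applying this convention:
#   text-davinci-003        -> capability: text, family: davinci, identifier: 003
#   text-search-ada-doc-001 -> capability: text-search, family: ada, input-type: doc, identifier: 001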
##########################################################################
######################### Finding the right model ########################
##########################################################################
# We recommend starting with the most capable model in a model family to confirm whether the model capabilities meet your requirements.
# Then you can stay with that model or move to a model with lower capability and cost, optimizing around that model's capabilities.
##########################################################################
######################### GPT-3 models ##################################
##########################################################################
# The GPT-3 models can understand and generate natural language. The service offers four model capabilities,
# each with different levels of power and speed suitable for different tasks. Davinci is the most capable model,
# while Ada is the fastest. In the order of greater to lesser capability, the models are:
### Davinci ####
# Davinci is the most capable model and can perform any task the other models can perform, often with less instruction.
# For applications requiring deep understanding of the content, like summarization for a specific audience and creative content generation,
# Davinci produces the best results. The increased capabilities provided by Davinci require more compute resources, so Davinci costs more and isn't as fast as other models.
# Use for: Complex intent, cause and effect, summarization for audience.
resource "azurerm_cognitive_deployment" "text_davinci_003" {
name = "text-davinci-003"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-davinci-003"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
### Curie ####
# Curie is powerful, yet fast. While Davinci is stronger when it comes to analyzing complicated text,
# Curie is capable for many nuanced tasks like sentiment classification and summarization.
# Curie is also good at answering questions and performing Q&A and as a general service chatbot.
# Use for: Language translation, complex classification, text sentiment, summarization.
resource "azurerm_cognitive_deployment" "text_curie_001" {
name = "text-curie-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-curie-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
### Babbage ####
# Babbage can perform straightforward tasks like simple classification. It’s also capable when it comes to semantic search, ranking how well documents match up with search queries.
# Use for: Moderate classification, semantic search classification.
resource "azurerm_cognitive_deployment" "text_babbage_001" {
name = "text-babbage-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-babbage-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
### Ada ####
# Ada is usually the fastest model and can perform tasks like parsing text, address correction and certain kinds of classification tasks
# that don’t require too much nuance. Ada’s performance can often be improved by providing more context.
# Use for: Parsing text, simple classification, address correction, keywords
resource "azurerm_cognitive_deployment" "text_ada_001" {
name = "text-ada-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-ada-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
### ChatGPT (gpt-35-turbo) ###
# The ChatGPT model (gpt-35-turbo) is a language model designed for conversational interfaces, and it behaves differently from previous GPT-3 models.
# Previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt. However,
# the ChatGPT model is conversation-in and message-out. The model expects a prompt string formatted in a specific chat-like transcript format,
# and returns a completion that represents a model-written message in the chat.
resource "azurerm_cognitive_deployment" "gpt_35_turbo" {
name = "gpt-35-turbo"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "gpt-35-turbo"
version = "0301"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
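# As an illustration (not part of the original file), the chat-like transcript
# that the 0301 version of gpt-35-turbo expects when called through the
# Completions API looks roughly like this (ChatML-style markers):
#
#   <|im_start|>system
#   You are a helpful assistant.
#   <|im_end|>
#   <|im_start|>user
#   What is Terraform?
#   <|im_end|>
#   <|im_start|>assistant
#
# With the Chat Completions API you send the same content as a list of
# role/content messages instead of a raw prompt string.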
##########################################################################
######################### Codex Models ##################################
##########################################################################
# The Codex models are descendants of our base GPT-3 models that can understand and generate code. Their training data contains both natural language and billions of lines of public code from GitHub.
# They’re most capable in Python and proficient in over a dozen languages, including C#, JavaScript, Go, Perl, PHP, Ruby, Swift, TypeScript, SQL, and Shell. In the order of greater to lesser capability, the Codex models are:
### Davinci-Codex ####
# Similar to GPT-3, Davinci is the most capable Codex model and can perform any task the other models can perform, often with less instruction.
# For applications requiring deep understanding of the content, Davinci produces the best results. Greater capabilities require more compute resources,
# so Davinci costs more and isn't as fast as other models.
resource "azurerm_cognitive_deployment" "code_davinci_002" {
name = "code-davinci-002"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "code-davinci-002"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
### Cushman-Codex ####
# Cushman is powerful, yet fast. While Davinci is stronger when it comes to analyzing complicated tasks,
# Cushman is a capable model for many code generation tasks. Cushman typically runs faster and cheaper than Davinci, as well.
resource "azurerm_cognitive_deployment" "code_cushman_001" {
name = "code-cushman-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "code-cushman-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
##########################################################################
######################### Embeddings Models #############################
##########################################################################
# An embedding is a special format of data representation that can be easily utilized by machine learning models and algorithms.
# The embedding is an information dense representation of the semantic meaning of a piece of text.
# Currently, we offer three families of Embeddings models for different functionalities: similarity, text search, and code search.
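# As a quick illustration (not part of the original file): each embedding model
# turns a piece of text into a vector of floating-point numbers, and two texts
# can then be compared with a distance metric such as cosine similarity:
#   similarity(a, b) = (a . b) / (||a|| * ||b||)
# The closer the similarity is to 1, the more related the two texts are.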
### Similarity embedding ###
# The similarity embedding models are trained to understand the semantic similarity between two pieces of text.
resource "azurerm_cognitive_deployment" "text_similarity_ada_001" {
name = "text-similarity-ada-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-similarity-ada-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "text_similarity_babbage_001" {
name = "text-similarity-babbage-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-similarity-babbage-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "text_similarity_curie_001" {
name = "text-similarity-curie-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-similarity-curie-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "text_similarity_davinci_001" {
name = "text-similarity-davinci-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-similarity-davinci-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
### Text search embedding ###
# These models help measure whether long documents are relevant to a short search query.
# There are two input types supported by this family: doc, for embedding the documents to be retrieved, and query, for embedding the search query.
resource "azurerm_cognitive_deployment" "text_search_ada_doc_001" {
name = "text-search-ada-doc-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-search-ada-doc-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "text_search_ada_query_001" {
name = "text-search-ada-query-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-search-ada-query-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "text_search_babbage_doc_001" {
name = "text-search-babbage-doc-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-search-babbage-doc-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "text_search_babbage_query_001" {
name = "text-search-babbage-query-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-search-babbage-query-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "text_search_curie_doc_001" {
name = "text-search-curie-doc-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-search-curie-doc-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "text_search_curie_query_001" {
name = "text-search-curie-query-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-search-curie-query-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "text_search_davinci_doc_001" {
name = "text-search-davinci-doc-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-search-davinci-doc-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "text_search_davinci_query_001" {
name = "text-search-davinci-query-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "text-search-davinci-query-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
### Code search embedding ###
# Similar to text search embedding models, there are two input types supported by this family:
# code, for embedding code snippets to be retrieved, and text, for embedding natural language search queries.
resource "azurerm_cognitive_deployment" "code_search_ada_code_001" {
name = "code-search-ada-code-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "code-search-ada-code-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "code_search_ada_text_001" {
name = "code-search-ada-text-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "code-search-ada-text-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "code_search_babbage_code_001" {
name = "code-search-babbage-code-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "code-search-babbage-code-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
resource "azurerm_cognitive_deployment" "code_search_babbage_text_001" {
name = "code-search-babbage-text-001"
cognitive_account_id = azurerm_cognitive_account.openai.id
model {
format = "OpenAI"
name = "code-search-babbage-text-001"
version = "1"
}
scale {
type = "Standard"
}
depends_on = [azurerm_cognitive_account.openai]
}
##########################################################################
######################### GPT-4 models ##################################
##########################################################################
# A set of models that improve on GPT-3.5 and can understand as well as generate natural language and code.
# Due to high demand, access to this model series is currently only available by request. To request access, existing Azure OpenAI customers can apply by filling out this form: https://aka.ms/oai/get-gpt4
# These models can only be used with the Chat Completion API.
# # GPT-4 (Max request tokens: 8,192)
# resource "azurerm_cognitive_deployment" "gpt-4" {
# name = "gpt-4"
# cognitive_account_id = azurerm_cognitive_account.openai.id
# model {
# format = "OpenAI"
# name = "gpt-4"
# version = "1"
# }
# scale {
# type = "Standard"
# }
# depends_on = [azurerm_cognitive_account.openai]
# }
# # GPT-4 32k (Max request tokens: 32,768)
# resource "azurerm_cognitive_deployment" "gpt-4" {
# name = "gpt-4-32k"
# cognitive_account_id = azurerm_cognitive_account.openai.id
# model {
# format = "OpenAI"
# name = "gpt-4-32k"
# version = "1"
# }
# scale {
# type = "Standard"
# }
# depends_on = [azurerm_cognitive_account.openai]
# }
Cheers!