From 56e9ea48170d6916ce8986025c87826c8a1741d0 Mon Sep 17 00:00:00 2001 From: msakande <17515964+msakande@users.noreply.github.com> Date: Thu, 29 Aug 2024 14:45:04 -0500 Subject: [PATCH 01/11] add serverless support for phi 3-5 vision --- .../how-to/deploy-models-phi-3-5-vision.md | 219 +++++++++++++++++- 1 file changed, 216 insertions(+), 3 deletions(-) diff --git a/articles/ai-studio/how-to/deploy-models-phi-3-5-vision.md b/articles/ai-studio/how-to/deploy-models-phi-3-5-vision.md index 207dd6c34a..55b896d8f2 100644 --- a/articles/ai-studio/how-to/deploy-models-phi-3-5-vision.md +++ b/articles/ai-studio/how-to/deploy-models-phi-3-5-vision.md @@ -5,7 +5,7 @@ description: Learn how to use Phi-3.5 chat model with vision with Azure AI Studi ms.service: azure-ai-studio manager: scottpolly ms.topic: how-to -ms.date: 08/19/2024 +ms.date: 08/29/2024 ms.reviewer: kritifaujdar reviewer: fkriti ms.author: mopeakande @@ -41,6 +41,15 @@ To use Phi-3.5 chat model with vision with Azure AI Studio, you need the followi ### A model deployment +**Deployment to serverless APIs** + +Phi-3.5 chat model with vision can be deployed to serverless API endpoints with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. + +Deployment to a serverless API endpoint doesn't require quota from your subscription. If your model isn't deployed already, use the Azure AI Studio, Azure Machine Learning SDK for Python, the Azure CLI, or ARM templates to [deploy the model as a serverless API](deploy-models-serverless.md). 
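Once the deployment exists, the endpoint accepts standard HTTPS requests with key-based authentication. The following standard-library sketch shows roughly what such a request looks like on the wire. The URL path, authorization header, and payload fields here are illustrative assumptions based on the Azure AI Model Inference API, not values taken from this article, and the request is only constructed, never sent:

```python
import json
import urllib.request

def build_chat_request(endpoint_url, api_key, messages):
    """Construct (without sending) a chat-completions request for a
    serverless API endpoint. Field names mirror the request bodies
    shown later in this article; the path and auth header are assumed."""
    body = json.dumps({"messages": messages, "temperature": 0, "max_tokens": 2048})
    return urllib.request.Request(
        url=f"{endpoint_url.rstrip('/')}/chat/completions",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"},
        method="POST",
    )

# Placeholder endpoint and key, for illustration only
request = build_chat_request(
    "https://your-endpoint.inference.example",
    "<your-api-key>",
    [{"role": "user", "content": "How many languages are in the world?"}],
)
```

In practice, prefer the `azure-ai-inference` client used throughout the following sections, which constructs and sends these requests for you.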
+ +> [!div class="nextstepaction"] +> [Deploy the model to serverless API endpoints](deploy-models-serverless.md) + **Deployment to a self-hosted managed compute** Phi-3.5 chat model with vision can be deployed to our self-hosted managed inference solution, which allows you to customize and control all the details about how the model is served. @@ -103,6 +112,9 @@ client = ChatCompletionsClient( ) ``` +> [!NOTE] +> Currently, serverless API endpoints do not support using Microsoft Entra ID for authentication. + ### Get the model's capabilities The `/info` route returns information about the model that is deployed to the endpoint. Return the model's information by calling the following method: @@ -215,7 +227,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). ```python -from azure.ai.inference.models import ChatCompletionsResponseFormatText +from azure.ai.inference.models import ChatCompletionsResponseFormat response = client.complete( messages=[ @@ -228,7 +240,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format=ChatCompletionsResponseFormatText(), + response_format={ "type": ChatCompletionsResponseFormat.TEXT }, ) ``` @@ -266,6 +278,42 @@ The following extra parameters can be passed to Phi-3.5 chat model with vision: | `n` | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. | `int` | +### Apply content safety + +The Azure AI model inference API supports [Azure AI content safety](https://aka.ms/azureaicontentsafety). 
When you use deployments with Azure AI content safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions. + +The following example shows how to handle events when the model detects harmful content in the input prompt and content safety is enabled. + + +```python +from azure.ai.inference.models import AssistantMessage, UserMessage, SystemMessage + +try: + response = client.complete( + messages=[ + SystemMessage(content="You are an AI assistant that helps people find information."), + UserMessage(content="Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills."), + ] + ) + + print(response.choices[0].message.content) + +except HttpResponseError as ex: + if ex.status_code == 400: + response = ex.response.json() + if isinstance(response, dict) and "error" in response: + print(f"Your request triggered an {response['error']['code']} error:\n\t {response['error']['message']}") + else: + raise + raise +``` + +> [!TIP] +> To learn more about how you can configure and control Azure AI content safety settings, check the [Azure AI content safety documentation](https://aka.ms/azureaicontentsafety). + +> [!NOTE] +> Azure AI content safety is only available for models deployed as serverless API endpoints. + ## Use chat completions with images Phi-3.5-vision-Instruct can reason across text and images and generate text completions based on both kinds of input. 
In this section, you explore the capabilities of Phi-3.5-vision-Instruct for vision in a chat fashion: @@ -361,6 +409,15 @@ To use Phi-3.5 chat model with vision with Azure AI Studio, you need the followi ### A model deployment +**Deployment to serverless APIs** + +Phi-3.5 chat model with vision can be deployed to serverless API endpoints with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. + +Deployment to a serverless API endpoint doesn't require quota from your subscription. If your model isn't deployed already, use the Azure AI Studio, Azure Machine Learning SDK for Python, the Azure CLI, or ARM templates to [deploy the model as a serverless API](deploy-models-serverless.md). + +> [!div class="nextstepaction"] +> [Deploy the model to serverless API endpoints](deploy-models-serverless.md) + **Deployment to a self-hosted managed compute** Phi-3.5 chat model with vision can be deployed to our self-hosted managed inference solution, which allows you to customize and control all the details about how the model is served. @@ -421,6 +478,9 @@ const client = new ModelClient( ); ``` +> [!NOTE] +> Currently, serverless API endpoints do not support using Microsoft Entra ID for authentication. + ### Get the model's capabilities The `/info` route returns information about the model that is deployed to the endpoint. Return the model's information by calling the following method: @@ -603,6 +663,48 @@ The following extra parameters can be passed to Phi-3.5 chat model with vision: | `n` | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. | `int` | +### Apply content safety + +The Azure AI model inference API supports [Azure AI content safety](https://aka.ms/azureaicontentsafety). 
When you use deployments with Azure AI content safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions. + +The following example shows how to handle events when the model detects harmful content in the input prompt and content safety is enabled. + + +```javascript +try { + var messages = [ + { role: "system", content: "You are an AI assistant that helps people find information." }, + { role: "user", content: "Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills." }, + ]; + + var response = await client.path("/chat/completions").post({ + body: { + messages: messages, + } + }); + + console.log(response.body.choices[0].message.content); +} +catch (error) { + if (error.status_code == 400) { + var response = JSON.parse(error.response._content); + if (response.error) { + console.log(`Your request triggered an ${response.error.code} error:\n\t ${response.error.message}`); + } + else + { + throw error; + } + } +} +``` + +> [!TIP] +> To learn more about how you can configure and control Azure AI content safety settings, check the [Azure AI content safety documentation](https://aka.ms/azureaicontentsafety). + +> [!NOTE] +> Azure AI content safety is only available for models deployed as serverless API endpoints. + ## Use chat completions with images Phi-3.5-vision-Instruct can reason across text and images and generate text completions based on both kinds of input. 
In this section, you explore the capabilities of Phi-3.5-vision-Instruct for vision in a chat fashion: @@ -704,6 +806,15 @@ To use Phi-3.5 chat model with vision with Azure AI Studio, you need the followi ### A model deployment +**Deployment to serverless APIs** + +Phi-3.5 chat model with vision can be deployed to serverless API endpoints with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. + +Deployment to a serverless API endpoint doesn't require quota from your subscription. If your model isn't deployed already, use the Azure AI Studio, Azure Machine Learning SDK for Python, the Azure CLI, or ARM templates to [deploy the model as a serverless API](deploy-models-serverless.md). + +> [!div class="nextstepaction"] +> [Deploy the model to serverless API endpoints](deploy-models-serverless.md) + **Deployment to a self-hosted managed compute** Phi-3.5 chat model with vision can be deployed to our self-hosted managed inference solution, which allows you to customize and control all the details about how the model is served. @@ -779,6 +890,9 @@ client = new ChatCompletionsClient( ); ``` +> [!NOTE] +> Currently, serverless API endpoints do not support using Microsoft Entra ID for authentication. + ### Get the model's capabilities The `/info` route returns information about the model that is deployed to the endpoint. Return the model's information by calling the following method: @@ -958,6 +1072,48 @@ The following extra parameters can be passed to Phi-3.5 chat model with vision: | `n` | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. | `int` | +### Apply content safety + +The Azure AI model inference API supports [Azure AI content safety](https://aka.ms/azureaicontentsafety). 
When you use deployments with Azure AI content safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions. + +The following example shows how to handle events when the model detects harmful content in the input prompt and content safety is enabled. + + +```csharp +try +{ + requestOptions = new ChatCompletionsOptions() + { + Messages = { + new ChatRequestSystemMessage("You are an AI assistant that helps people find information."), + new ChatRequestUserMessage( + "Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills." + ), + }, + }; + + response = client.Complete(requestOptions); + Console.WriteLine(response.Value.Choices[0].Message.Content); +} +catch (RequestFailedException ex) +{ + if (ex.ErrorCode == "content_filter") + { + Console.WriteLine($"Your query has trigger Azure Content Safety: {ex.Message}"); + } + else + { + throw; + } +} +``` + +> [!TIP] +> To learn more about how you can configure and control Azure AI content safety settings, check the [Azure AI content safety documentation](https://aka.ms/azureaicontentsafety). + +> [!NOTE] +> Azure AI content safety is only available for models deployed as serverless API endpoints. + ## Use chat completions with images Phi-3.5-vision-Instruct can reason across text and images and generate text completions based on both kinds of input. In this section, you explore the capabilities of Phi-3.5-vision-Instruct for vision in a chat fashion: @@ -1044,6 +1200,15 @@ To use Phi-3.5 chat model with vision with Azure AI Studio, you need the followi ### A model deployment +**Deployment to serverless APIs** + +Phi-3.5 chat model with vision can be deployed to serverless API endpoints with pay-as-you-go billing. 
This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. + +Deployment to a serverless API endpoint doesn't require quota from your subscription. If your model isn't deployed already, use the Azure AI Studio, Azure Machine Learning SDK for Python, the Azure CLI, or ARM templates to [deploy the model as a serverless API](deploy-models-serverless.md). + +> [!div class="nextstepaction"] +> [Deploy the model to serverless API endpoints](deploy-models-serverless.md) + **Deployment to a self-hosted managed compute** Phi-3.5 chat model with vision can be deployed to our self-hosted managed inference solution, which allows you to customize and control all the details about how the model is served. @@ -1073,6 +1238,9 @@ First, create the client to consume the model. The following code uses an endpoi When you deploy the model to a self-hosted online endpoint with **Microsoft Entra ID** support, you can use the following code snippet to create a client. +> [!NOTE] +> Currently, serverless API endpoints do not support using Microsoft Entra ID for authentication. + ### Get the model's capabilities The `/info` route returns information about the model that is deployed to the endpoint. Return the model's information by calling the following method: @@ -1323,6 +1491,47 @@ The following extra parameters can be passed to Phi-3.5 chat model with vision: | `n` | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. | `int` | +### Apply content safety + +The Azure AI model inference API supports [Azure AI content safety](https://aka.ms/azureaicontentsafety). 
When you use deployments with Azure AI content safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions. + +The following example shows how to handle events when the model detects harmful content in the input prompt and content safety is enabled. + + +```json +{ + "messages": [ + { + "role": "system", + "content": "You are an AI assistant that helps people find information." + }, + { + "role": "user", + "content": "Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills." + } + ] +} +``` + + +```json +{ + "error": { + "message": "The response was filtered due to the prompt triggering Microsoft's content management policy. Please modify your prompt and retry.", + "type": null, + "param": "prompt", + "code": "content_filter", + "status": 400 + } +} +``` + +> [!TIP] +> To learn more about how you can configure and control Azure AI content safety settings, check the [Azure AI content safety documentation](https://aka.ms/azureaicontentsafety). + +> [!NOTE] +> Azure AI content safety is only available for models deployed as serverless API endpoints. + ## Use chat completions with images Phi-3.5-vision-Instruct can reason across text and images and generate text completions based on both kinds of input. In this section, you explore the capabilities of Phi-3.5-vision-Instruct for vision in a chat fashion: @@ -1413,6 +1622,10 @@ For more examples of how to use Phi-3 family models, see the following examples | LiteLLM | Python | [Link](https://aka.ms/phi-3/litellm-sample) | +## Cost and quota considerations for Phi-3 family models deployed as serverless API endpoints + +Quota is managed per deployment. 
Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios. + ## Cost and quota considerations for Phi-3 family models deployed to managed compute Phi-3 family models deployed to managed compute are billed based on core hours of the associated compute instance. The cost of the compute instance is determined by the size of the instance, the number of instances running, and the run duration. From ac40dda360e20358765b6142d5f08baad679a223 Mon Sep 17 00:00:00 2001 From: msakande <17515964+msakande@users.noreply.github.com> Date: Thu, 29 Aug 2024 16:06:22 -0500 Subject: [PATCH 02/11] update azure ML article --- .../how-to-deploy-models-phi-3-5-vision.md | 215 +++++++++++++++++- 1 file changed, 214 insertions(+), 1 deletion(-) diff --git a/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md b/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md index b8df2003ca..2c5381714b 100644 --- a/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md +++ b/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md @@ -6,7 +6,7 @@ ms.service: azure-machine-learning ms.subservice: inferencing manager: scottpolly ms.topic: how-to -ms.date: 08/19/2024 +ms.date: 08/29/2024 ms.reviewer: kritifaujdar reviewer: fkriti ms.author: mopeakande @@ -40,6 +40,16 @@ To use Phi-3.5 chat model with vision with Azure Machine Learning, you need the ### A model deployment +**Deployment to serverless APIs** + +Phi-3.5 chat model with vision can be deployed to serverless API endpoints with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. 
+ +Deployment to a serverless API endpoint doesn't require quota from your subscription. If your model isn't deployed already, use the Azure Machine Learning studio, Azure Machine Learning SDK for Python, the Azure CLI, or ARM templates to [deploy the model as a serverless API](how-to-deploy-models-serverless.md). + +> [!div class="nextstepaction"] +> [Deploy models as serverless API endpoints](how-to-deploy-models-serverless.md) + + **Deployment to a self-hosted managed compute** Phi-3.5 chat model with vision can be deployed to our self-hosted managed inference solution, which allows you to customize and control all the details about how the model is served. @@ -102,6 +112,10 @@ client = ChatCompletionsClient( ) ``` +> [!NOTE] +> Currently, serverless API endpoints do not support using Microsoft Entra ID for authentication. + + ### Get the model's capabilities The `/info` route returns information about the model that is deployed to the endpoint. Return the model's information by calling the following method: @@ -265,6 +279,44 @@ The following extra parameters can be passed to Phi-3.5 chat model with vision: | `n` | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. | `int` | +### Apply content safety +The Azure AI model inference API supports [Azure AI content safety](https://aka.ms/azureaicontentsafety). When you use deployments with Azure AI content safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions. +The following example shows how to handle events when the model detects harmful content in the input prompt and content safety is enabled. 
+```python
+from azure.ai.inference.models import AssistantMessage, UserMessage, SystemMessage
+from azure.core.exceptions import HttpResponseError
+
+try:
+    response = client.complete(
+        messages=[
+            SystemMessage(content="You are an AI assistant that helps people find information."),
+            UserMessage(content="Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills."),
+        ]
+    )
+
+    print(response.choices[0].message.content)
+
+except HttpResponseError as ex:
+    if ex.status_code == 400:
+        response = ex.response.json()
+        if isinstance(response, dict) and "error" in response:
+            print(f"Your request triggered an {response['error']['code']} error:\n\t {response['error']['message']}")
+        else:
+            raise
+    raise
+```
+
+
+> [!TIP]
+> To learn more about how you can configure and control Azure AI content safety settings, check the [Azure AI content safety documentation](https://aka.ms/azureaicontentsafety).
+
+> [!NOTE]
+> Azure AI content safety is only available for models deployed as serverless API endpoints.
+
 ## Use chat completions with images
 
 Phi-3.5-vision-Instruct can reason across text and images and generate text completions based on both kinds of input. In this section, you explore the capabilities of Phi-3.5-vision-Instruct for vision in a chat fashion:
@@ -360,6 +412,16 @@ To use Phi-3.5 chat model with vision with Azure Machine Learning studio, you ne
 
 ### A model deployment
 
+**Deployment to serverless APIs**
+
+Phi-3.5 chat model with vision can be deployed to serverless API endpoints with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need.
+
+Deployment to a serverless API endpoint doesn't require quota from your subscription. If your model isn't deployed already, use the Azure Machine Learning studio, Azure Machine Learning SDK for Python, the Azure CLI, or ARM templates to [deploy the model as a serverless API](how-to-deploy-models-serverless.md).
+ +> [!div class="nextstepaction"] +> [Deploy models as serverless API endpoints](how-to-deploy-models-serverless.md) + + **Deployment to a self-hosted managed compute** Phi-3.5 chat model with vision can be deployed to our self-hosted managed inference solution, which allows you to customize and control all the details about how the model is served. @@ -420,6 +482,10 @@ const client = new ModelClient( ); ``` +> [!NOTE] +> Currently, serverless API endpoints do not support using Microsoft Entra ID for authentication. + + ### Get the model's capabilities The `/info` route returns information about the model that is deployed to the endpoint. Return the model's information by calling the following method: @@ -602,6 +668,44 @@ The following extra parameters can be passed to Phi-3.5 chat model with vision: | `n` | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. | `int` | +### Apply content safety +The Azure AI model inference API supports [Azure AI content safety](https://aka.ms/azureaicontentsafety). When you use deployments with Azure AI content safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions. +The following example shows how to handle events when the model detects harmful content in the input prompt and content safety is enabled. +```csharp +try +{ + requestOptions = new ChatCompletionsOptions() + { + Messages = { + new ChatRequestSystemMessage("You are an AI assistant that helps people find information."), + new ChatRequestUserMessage( + "Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills." 
+            ),
+        },
+    };
+
+    response = client.Complete(requestOptions);
+    Console.WriteLine(response.Value.Choices[0].Message.Content);
+}
+catch (RequestFailedException ex)
+{
+    if (ex.ErrorCode == "content_filter")
+    {
+        Console.WriteLine($"Your query has triggered Azure AI Content Safety: {ex.Message}");
+    }
+    else
+    {
+        throw;
+    }
+}
+```
+
+
+> [!TIP]
+> To learn more about how you can configure and control Azure AI content safety settings, check the [Azure AI content safety documentation](https://aka.ms/azureaicontentsafety).
+
+> [!NOTE]
+> Azure AI content safety is only available for models deployed as serverless API endpoints.
+
 ## Use chat completions with images
 
 Phi-3.5-vision-Instruct can reason across text and images and generate text completions based on both kinds of input. In this section, you explore the capabilities of Phi-3.5-vision-Instruct for vision in a chat fashion:
@@ -703,6 +807,16 @@ To use Phi-3.5 chat model with vision with Azure Machine Learning studio, you ne
 
 ### A model deployment
 
+**Deployment to serverless APIs**
+
+Phi-3.5 chat model with vision can be deployed to serverless API endpoints with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need.
+
+Deployment to a serverless API endpoint doesn't require quota from your subscription. If your model isn't deployed already, use the Azure Machine Learning studio, Azure Machine Learning SDK for Python, the Azure CLI, or ARM templates to [deploy the model as a serverless API](how-to-deploy-models-serverless.md).
+ +> [!div class="nextstepaction"] +> [Deploy models as serverless API endpoints](how-to-deploy-models-serverless.md) + + **Deployment to a self-hosted managed compute** Phi-3.5 chat model with vision can be deployed to our self-hosted managed inference solution, which allows you to customize and control all the details about how the model is served. @@ -778,6 +892,10 @@ client = new ChatCompletionsClient( ); ``` +> [!NOTE] +> Currently, serverless API endpoints do not support using Microsoft Entra ID for authentication. + + ### Get the model's capabilities The `/info` route returns information about the model that is deployed to the endpoint. Return the model's information by calling the following method: @@ -957,6 +1075,44 @@ The following extra parameters can be passed to Phi-3.5 chat model with vision: | `n` | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. | `int` | +### Apply content safety +The Azure AI model inference API supports [Azure AI content safety](https://aka.ms/azureaicontentsafety). When you use deployments with Azure AI content safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions. +The following example shows how to handle events when the model detects harmful content in the input prompt and content safety is enabled. +```csharp +try +{ + requestOptions = new ChatCompletionsOptions() + { + Messages = { + new ChatRequestSystemMessage("You are an AI assistant that helps people find information."), + new ChatRequestUserMessage( + "Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills." 
+            ),
+        },
+    };
+
+    response = client.Complete(requestOptions);
+    Console.WriteLine(response.Value.Choices[0].Message.Content);
+}
+catch (RequestFailedException ex)
+{
+    if (ex.ErrorCode == "content_filter")
+    {
+        Console.WriteLine($"Your query has triggered Azure AI Content Safety: {ex.Message}");
+    }
+    else
+    {
+        throw;
+    }
+}
+```
+
+
+> [!TIP]
+> To learn more about how you can configure and control Azure AI content safety settings, check the [Azure AI content safety documentation](https://aka.ms/azureaicontentsafety).
+
+> [!NOTE]
+> Azure AI content safety is only available for models deployed as serverless API endpoints.
+
 ## Use chat completions with images
 
 Phi-3.5-vision-Instruct can reason across text and images and generate text completions based on both kinds of input. In this section, you explore the capabilities of Phi-3.5-vision-Instruct for vision in a chat fashion:
@@ -1043,6 +1199,16 @@ To use Phi-3.5 chat model with vision with Azure Machine Learning studio, you ne
 
 ### A model deployment
 
+**Deployment to serverless APIs**
+
+Phi-3.5 chat model with vision can be deployed to serverless API endpoints with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need.
+
+Deployment to a serverless API endpoint doesn't require quota from your subscription. If your model isn't deployed already, use the Azure Machine Learning studio, Azure Machine Learning SDK for Python, the Azure CLI, or ARM templates to [deploy the model as a serverless API](how-to-deploy-models-serverless.md).
+ +> [!div class="nextstepaction"] +> [Deploy models as serverless API endpoints](how-to-deploy-models-serverless.md) + + **Deployment to a self-hosted managed compute** Phi-3.5 chat model with vision can be deployed to our self-hosted managed inference solution, which allows you to customize and control all the details about how the model is served. @@ -1072,6 +1238,9 @@ First, create the client to consume the model. The following code uses an endpoi When you deploy the model to a self-hosted online endpoint with **Microsoft Entra ID** support, you can use the following code snippet to create a client. +> [!NOTE] +> Currently, serverless API endpoints do not support using Microsoft Entra ID for authentication. + ### Get the model's capabilities The `/info` route returns information about the model that is deployed to the endpoint. Return the model's information by calling the following method: @@ -1322,6 +1491,47 @@ The following extra parameters can be passed to Phi-3.5 chat model with vision: | `n` | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. | `int` | +### Apply content safety + +The Azure AI model inference API supports [Azure AI content safety](https://aka.ms/azureaicontentsafety). When you use deployments with Azure AI content safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions. + +The following example shows how to handle events when the model detects harmful content in the input prompt and content safety is enabled. + + +```json +{ + "messages": [ + { + "role": "system", + "content": "You are an AI assistant that helps people find information." 
+ }, + { + "role": "user", + "content": "Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills." + } + ] +} +``` + + +```json +{ + "error": { + "message": "The response was filtered due to the prompt triggering Microsoft's content management policy. Please modify your prompt and retry.", + "type": null, + "param": "prompt", + "code": "content_filter", + "status": 400 + } +} +``` + +> [!TIP] +> To learn more about how you can configure and control Azure AI content safety settings, check the [Azure AI content safety documentation](https://aka.ms/azureaicontentsafety). + +> [!NOTE] +> Azure AI content safety is only available for models deployed as serverless API endpoints. + ## Use chat completions with images Phi-3.5-vision-Instruct can reason across text and images and generate text completions based on both kinds of input. In this section, you explore the capabilities of Phi-3.5-vision-Instruct for vision in a chat fashion: @@ -1412,6 +1622,9 @@ For more examples of how to use Phi-3 family models, see the following examples | LiteLLM | Python | [Link](https://aka.ms/phi-3/litellm-sample) | +## Cost and quota considerations for Phi-3 family models deployed as serverless API endpoints +Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios. + ## Cost and quota considerations for Phi-3 family models deployed to managed compute Phi-3 family models deployed to managed compute are billed based on core hours of the associated compute instance. The cost of the compute instance is determined by the size of the instance, the number of instances running, and the run duration. 
From b92629c5b2e870a10ccd8cea485ff81fb03616ac Mon Sep 17 00:00:00 2001 From: msakande <17515964+msakande@users.noreply.github.com> Date: Thu, 29 Aug 2024 16:08:54 -0500 Subject: [PATCH 03/11] code update --- .../machine-learning/how-to-deploy-models-phi-3-5-vision.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md b/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md index 2c5381714b..68f69e9fc7 100644 --- a/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md +++ b/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md @@ -228,7 +228,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](reference-model-inference-api.md). ```python -from azure.ai.inference.models import ChatCompletionsResponseFormatText +from azure.ai.inference.models import ChatCompletionsResponseFormat response = client.complete( messages=[ @@ -241,7 +241,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format=ChatCompletionsResponseFormatText(), + response_format={ "type": ChatCompletionsResponseFormat.TEXT }, ) ``` From f60c57c0c09d811972bfd55e26657bf582461e8e Mon Sep 17 00:00:00 2001 From: msakande <17515964+msakande@users.noreply.github.com> Date: Thu, 29 Aug 2024 16:16:06 -0500 Subject: [PATCH 04/11] updating region availability and model catalog --- articles/ai-studio/how-to/model-catalog-overview.md | 2 +- articles/ai-studio/includes/region-availability-maas.md | 1 + articles/machine-learning/concept-model-catalog.md | 2 +- 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/articles/ai-studio/how-to/model-catalog-overview.md b/articles/ai-studio/how-to/model-catalog-overview.md index dbd0c560bd..4ebed5d79a 100644 --- 
a/articles/ai-studio/how-to/model-catalog-overview.md +++ b/articles/ai-studio/how-to/model-catalog-overview.md @@ -68,7 +68,7 @@ Llama family models | Llama-2-7b
Llama-2-7b-chat
Llama-2-13b
Llam Mistral family models | mistralai-Mixtral-8x22B-v0-1
mistralai-Mixtral-8x22B-Instruct-v0-1
mistral-community-Mixtral-8x22B-v0-1
mistralai-Mixtral-8x7B-v01
mistralai-Mistral-7B-Instruct-v0-2
mistralai-Mistral-7B-v01
mistralai-Mixtral-8x7B-Instruct-v01
mistralai-Mistral-7B-Instruct-v01 | Mistral-large (2402)
Mistral-large (2407)
Mistral-small
Mistral-NeMo Cohere family models | Not available | Cohere-command-r-plus
Cohere-command-r
Cohere-embed-v3-english
Cohere-embed-v3-multilingual
Cohere-rerank-v3-english
Cohere-rerank-v3-multilingual JAIS | Not available | jais-30b-chat -Phi-3 family models | Phi-3-mini-4k-Instruct
Phi-3-mini-128k-Instruct
Phi-3-small-8k-Instruct
Phi-3-small-128k-Instruct
Phi-3-medium-4k-instruct
Phi-3-medium-128k-instruct
Phi-3-vision-128k-Instruct
Phi-3.5-mini-Instruct
Phi-3.5-vision-Instruct
Phi-3.5-MoE-Instruct | Phi-3-mini-4k-Instruct
Phi-3-mini-128k-Instruct
Phi-3-small-8k-Instruct
Phi-3-small-128k-Instruct
Phi-3-medium-4k-instruct
Phi-3-medium-128k-instruct

Phi-3.5-mini-Instruct +Phi-3 family models | Phi-3-mini-4k-Instruct
Phi-3-mini-128k-Instruct
Phi-3-small-8k-Instruct
Phi-3-small-128k-Instruct
Phi-3-medium-4k-instruct
Phi-3-medium-128k-instruct
Phi-3-vision-128k-Instruct
Phi-3.5-mini-Instruct
Phi-3.5-vision-Instruct
Phi-3.5-MoE-Instruct | Phi-3-mini-4k-Instruct
Phi-3-mini-128k-Instruct
Phi-3-small-8k-Instruct
Phi-3-small-128k-Instruct
Phi-3-medium-4k-instruct
Phi-3-medium-128k-instruct

Phi-3.5-mini-Instruct
Phi-3.5-vision-Instruct Nixtla | Not available | TimeGEN-1 Other models | Available | Not available diff --git a/articles/ai-studio/includes/region-availability-maas.md b/articles/ai-studio/includes/region-availability-maas.md index bfcc93ff1e..340256619e 100644 --- a/articles/ai-studio/includes/region-availability-maas.md +++ b/articles/ai-studio/includes/region-availability-maas.md @@ -44,6 +44,7 @@ Llama 3.1 405B Instruct | [Microsoft Managed Countries](/partner-center/marketp |Model |Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine tuning | |---------|---------|---------|---------| +Phi-3.5-vision-Instruct | Not applicable | East US 2
Sweden Central | Not available | Phi-3.5-Mini-Instruct | Not applicable | East US 2
Sweden Central | Not available | Phi-3-Mini-4k-Instruct
Phi-3-Mini-128K-Instruct | Not applicable | East US 2
Sweden Central | East US 2 | Phi-3-Small-8K-Instruct
Phi-3-Small-128K-Instruct | Not applicable | East US 2
Sweden Central | Not available | diff --git a/articles/machine-learning/concept-model-catalog.md b/articles/machine-learning/concept-model-catalog.md index 59b615520f..7a3eae4092 100644 --- a/articles/machine-learning/concept-model-catalog.md +++ b/articles/machine-learning/concept-model-catalog.md @@ -59,7 +59,7 @@ Llama family models | Llama-2-7b
Llama-2-7b-chat
Llama-2-13b
Lla Mistral family models | mistralai-Mixtral-8x22B-v0-1
mistralai-Mixtral-8x22B-Instruct-v0-1
mistral-community-Mixtral-8x22B-v0-1
mistralai-Mixtral-8x7B-v01
mistralai-Mistral-7B-Instruct-v0-2
mistralai-Mistral-7B-v01
mistralai-Mixtral-8x7B-Instruct-v01
mistralai-Mistral-7B-Instruct-v01 | Mistral-large (2402)
Mistral-large (2407)
Mistral-small
Mistral-Nemo Cohere family models | Not available | Cohere-command-r-plus
Cohere-command-r
Cohere-embed-v3-english
Cohere-embed-v3-multilingual
Cohere-rerank-3-english
Cohere-rerank-3-multilingual JAIS | Not available | jais-30b-chat -Phi-3 family models | Phi-3-mini-4k-Instruct
Phi-3-mini-128k-Instruct
Phi-3-small-8k-Instruct
Phi-3-small-128k-Instruct
Phi-3-medium-4k-instruct
Phi-3-medium-128k-instruct
Phi-3-vision-128k-Instruct
Phi-3.5-mini-Instruct
Phi-3.5-vision-Instruct
Phi-3.5-MoE-Instruct | Phi-3-mini-4k-Instruct
Phi-3-mini-128k-Instruct
Phi-3-small-8k-Instruct
Phi-3-small-128k-Instruct
Phi-3-medium-4k-instruct
Phi-3-medium-128k-instruct

Phi-3.5-mini-Instruct +Phi-3 family models | Phi-3-mini-4k-Instruct
Phi-3-mini-128k-Instruct
Phi-3-small-8k-Instruct
Phi-3-small-128k-Instruct
Phi-3-medium-4k-instruct
Phi-3-medium-128k-instruct
Phi-3-vision-128k-Instruct
Phi-3.5-mini-Instruct
Phi-3.5-vision-Instruct
Phi-3.5-MoE-Instruct | Phi-3-mini-4k-Instruct
Phi-3-mini-128k-Instruct
Phi-3-small-8k-Instruct
Phi-3-small-128k-Instruct
Phi-3-medium-4k-instruct
Phi-3-medium-128k-instruct

Phi-3.5-mini-Instruct
Phi-3.5-vision-Instruct Nixtla | Not available | TimeGEN-1 Other models | Available | Not available From 948e5642887d56d882feaf0c9fc0f42b56ec5f2a Mon Sep 17 00:00:00 2001 From: Larry <890747+Blackmist@users.noreply.github.com> Date: Fri, 30 Aug 2024 14:18:47 -0400 Subject: [PATCH 05/11] removing old breadcrumbs --- articles/machine-learning/breadcrumb/toc.yml | 7 ------- 1 file changed, 7 deletions(-) diff --git a/articles/machine-learning/breadcrumb/toc.yml b/articles/machine-learning/breadcrumb/toc.yml index 79887229f0..ef87a8bcfe 100644 --- a/articles/machine-learning/breadcrumb/toc.yml +++ b/articles/machine-learning/breadcrumb/toc.yml @@ -73,10 +73,3 @@ tocHref: /azure/devops/pipelines/languages/ topicHref: /azure/devops/pipelines/index -- name: Azure - tocHref: /azure/ - topicHref: /azure/index - items: - - name: Machine Learning - tocHref: /power-bi/connect-data/ - topicHref: /azure/machine-learning/v1/introduction From 9f0f854ec1b47d2131ad7e9dafc800562d48460e Mon Sep 17 00:00:00 2001 From: Zoubaolian <82649822+sally-baolian@users.noreply.github.com> Date: Tue, 3 Sep 2024 16:09:59 +0800 Subject: [PATCH 06/11] Update speech-synthesis-markup-voice.md --- .../ai-services/speech-service/speech-synthesis-markup-voice.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/articles/ai-services/speech-service/speech-synthesis-markup-voice.md b/articles/ai-services/speech-service/speech-synthesis-markup-voice.md index 00a16202c1..cb2ad2cb7e 100644 --- a/articles/ai-services/speech-service/speech-synthesis-markup-voice.md +++ b/articles/ai-services/speech-service/speech-synthesis-markup-voice.md @@ -244,6 +244,8 @@ The following table describes the usage of the `` element's attri > [!NOTE] > The `` element is incompatible with the `prosody` and `break` elements. You can't adjust pause and prosody like pitch, contour, rate, or volume in this element. +> +> Non-multilingual voices don't support the `` element by design. 
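As a quick illustration of the note above, here is a hypothetical Python helper (not part of the article) that assembles SSML wrapping text in a `<lang>` element; the voice name and locale shown are placeholder assumptions, and only multilingual voices honor the element:

```python
def build_lang_ssml(voice_name: str, locale: str, text: str) -> str:
    """Assemble minimal SSML that wraps text in a <lang> element.
    Only multilingual voices support <lang>."""
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f'<voice name="{voice_name}">'
        f'<lang xml:lang="{locale}">{text}</lang>'
        "</voice></speak>"
    )

# Placeholder multilingual voice and locale for illustration
ssml = build_lang_ssml("en-US-AvaMultilingualNeural", "es-ES", "Hola, mundo.")
```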
### Multilingual voices with the lang element From a3d0629c6f54bbe235349b9285773e1cc6e06b64 Mon Sep 17 00:00:00 2001 From: Larry <890747+Blackmist@users.noreply.github.com> Date: Tue, 3 Sep 2024 09:36:05 -0400 Subject: [PATCH 07/11] addressing feedback --- articles/machine-learning/how-to-configure-environment.md | 2 ++ articles/machine-learning/v1/how-to-configure-environment.md | 4 +++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/articles/machine-learning/how-to-configure-environment.md b/articles/machine-learning/how-to-configure-environment.md index fdf2dc9311..887fedfcad 100644 --- a/articles/machine-learning/how-to-configure-environment.md +++ b/articles/machine-learning/how-to-configure-environment.md @@ -15,6 +15,8 @@ ms.custom: devx-track-python, devx-track-azurecli, py-fresh-zinc # Set up a Python development environment for Azure Machine Learning +[!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)] + Learn how to configure a Python development environment for Azure Machine Learning. The following table shows each development environment covered in this article, along with pros and cons. diff --git a/articles/machine-learning/v1/how-to-configure-environment.md b/articles/machine-learning/v1/how-to-configure-environment.md index 48b992d896..eb08b64078 100644 --- a/articles/machine-learning/v1/how-to-configure-environment.md +++ b/articles/machine-learning/v1/how-to-configure-environment.md @@ -15,6 +15,8 @@ ms.custom: UpdateFrequency5, devx-track-python, devx-track-azurecli, sdkv1, buil # Set up a Python development environment for Azure Machine Learning (v1) +[!INCLUDE [sdk v1](../includes/machine-learning-sdk-v1.md)] + Learn how to configure a Python development environment for Azure Machine Learning. The following table shows each development environment covered in this article, along with pros and cons. 
@@ -218,5 +220,5 @@ For more information, see [Data Science Virtual Machines](https://azure.microsof ## Next steps -- [Train and deploy a model](../tutorial-train-deploy-notebook.md) on Azure Machine Learning with the MNIST dataset. +- [Train and deploy a model](tutorial-train-deploy-notebook.md) on Azure Machine Learning with the MNIST dataset. - See the [Azure Machine Learning SDK for Python reference](/python/api/overview/azure/ml/intro). From d80b4316af761b12db05c10f9425f4dc0fedd415 Mon Sep 17 00:00:00 2001 From: Sheri Gilley Date: Tue, 3 Sep 2024 09:26:03 -0500 Subject: [PATCH 08/11] freshness updates --- .../tutorial-azure-ml-in-a-day.md | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/articles/machine-learning/tutorial-azure-ml-in-a-day.md b/articles/machine-learning/tutorial-azure-ml-in-a-day.md index 19166b7dbd..9b6bbce2f7 100644 --- a/articles/machine-learning/tutorial-azure-ml-in-a-day.md +++ b/articles/machine-learning/tutorial-azure-ml-in-a-day.md @@ -9,7 +9,7 @@ ms.topic: quickstart author: sdgilley ms.author: sgilley ms.reviewer: sgilley -ms.date: 10/20/2023 +ms.date: 09/03/2024 ms.custom: - sdkv2 - build-2023 @@ -55,7 +55,7 @@ Watch this video for an overview of the steps in this quickstart. [!INCLUDE [notebook set kernel](includes/prereq-set-kernel.md)] - + ## Create handle to workspace @@ -78,9 +78,9 @@ from azure.identity import DefaultAzureCredential # authenticate credential = DefaultAzureCredential() -SUBSCRIPTION="" -RESOURCE_GROUP="" -WS_NAME="" +SUBSCRIPTION = "" +RESOURCE_GROUP = "" +WS_NAME = "" # Get a handle to the workspace ml_client = MLClient( credential=credential, @@ -95,10 +95,10 @@ ml_client = MLClient( ```python -# Verify that the handle works correctly. +# Verify that the handle works correctly. # If you ge an error here, modify your SUBSCRIPTION, RESOURCE_GROUP, and WS_NAME in the previous cell. 
ws = ml_client.workspaces.get(WS_NAME) -print(ws.location,":", ws.resource_group) +print(ws.location, ":", ws.resource_group) ``` ## Create training script @@ -238,7 +238,6 @@ You might need to select **Refresh** to see the new folder and script in your ** Now that you have a script that can perform the desired tasks, and a compute cluster to run the script, you'll use a general purpose **command** that can run command line actions. This command line action can directly call system commands or run a script. Here, you'll create input variables to specify the input data, split ratio, learning rate and registered model name. The command script will: - * Use an *environment* that defines software and runtime libraries needed for the training script. Azure Machine Learning provides many curated or ready-made environments, which are useful for common training and inference scenarios. You'll use one of those environments here. In [Tutorial: Train a model in Azure Machine Learning](tutorial-train-model.md), you'll learn how to create a custom environment. * Configure the command line action itself - `python main.py` in this case. The inputs/outputs are accessible in the command via the `${{ ... }}` notation. * In this sample, we access the data from a file on the internet. 
@@ -263,7 +262,7 @@ job = command( ), code="./src/", # location of source code command="python main.py --data ${{inputs.data}} --test_train_ratio ${{inputs.test_train_ratio}} --learning_rate ${{inputs.learning_rate}} --registered_model_name ${{inputs.registered_model_name}}", - environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest", + environment="azureml://registries/azureml/environments/sklearn-1.5/labels/latest", display_name="credit_default_prediction", ) ``` From 64f79876af4ec873c751929ebdad1497c290a7e3 Mon Sep 17 00:00:00 2001 From: Sheri Gilley Date: Tue, 3 Sep 2024 09:43:08 -0500 Subject: [PATCH 09/11] acrolinx pass --- .../tutorial-azure-ml-in-a-day.md | 42 +++++++++---------- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/articles/machine-learning/tutorial-azure-ml-in-a-day.md b/articles/machine-learning/tutorial-azure-ml-in-a-day.md index 9b6bbce2f7..6e242f9dfb 100644 --- a/articles/machine-learning/tutorial-azure-ml-in-a-day.md +++ b/articles/machine-learning/tutorial-azure-ml-in-a-day.md @@ -22,13 +22,13 @@ ms.custom: [!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)] -This tutorial is an introduction to some of the most used features of the Azure Machine Learning service. In it, you will create, register and deploy a model. This tutorial will help you become familiar with the core concepts of Azure Machine Learning and their most common usage. +This tutorial is an introduction to some of the most used features of the Azure Machine Learning service. In it, you create, register, and deploy a model. This tutorial helps you become familiar with the core concepts of Azure Machine Learning and their most common usage. -You'll learn how to run a training job on a scalable compute resource, then deploy it, and finally test the deployment. +You learn how to run a training job on a scalable compute resource, then deploy it, and finally test the deployment. 
-You'll create a training script to handle the data preparation, train and register a model. Once you train the model, you'll *deploy* it as an *endpoint*, then call the endpoint for *inferencing*. +You create a training script to handle the data preparation, train, and register a model. Once you train the model, you deploy it as an *endpoint*, then call the endpoint for *inferencing*. -The steps you'll take are: +The steps you take are: > [!div class="checklist"] > * Set up a handle to your Azure Machine Learning workspace @@ -61,13 +61,13 @@ Watch this video for an overview of the steps in this quickstart. Before we dive in the code, you need a way to reference your workspace. The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. -You'll create `ml_client` for a handle to the workspace. You'll then use `ml_client` to manage resources and jobs. +You create `ml_client` for a handle to the workspace. You then use `ml_client` to manage resources and jobs. In the next cell, enter your Subscription ID, Resource Group name and Workspace name. To find these values: 1. In the upper right Azure Machine Learning studio toolbar, select your workspace name. -1. Copy the value for workspace, resource group and subscription ID into the code. -1. You'll need to copy one value, close the area and paste, then come back for the next one. +1. Copy the value for workspace, resource group and subscription ID into the code. +1. You need to copy one value, close the area and paste, then come back for the next one. :::image type="content" source="media/tutorial-azure-ml-in-a-day/find-credentials.png" alt-text="Screenshot: find the credentials for your code in the upper right of the toolbar."::: @@ -117,7 +117,7 @@ os.makedirs(train_src_dir, exist_ok=True) This script handles the preprocessing of the data, splitting it into test and train data. 
It then consumes this data to train a tree-based model and returns the output model.

-[MLFlow](how-to-log-mlflow-models.md) will be used to log the parameters and metrics during our pipeline run.
+[MLFlow](how-to-log-mlflow-models.md) is used to log the parameters and metrics during our pipeline run.

The cell below uses IPython magic to write the training script into the directory you just created.

@@ -235,13 +235,13 @@ You might need to select **Refresh** to see the new folder and script in your **

## Configure the command

-Now that you have a script that can perform the desired tasks, and a compute cluster to run the script, you'll use a general purpose **command** that can run command line actions. This command line action can directly call system commands or run a script.
+Now that you have a script that can perform the desired tasks, and a compute cluster to run the script, you use a general purpose **command** that can run command line actions. This command line action can directly call system commands or run a script.

-Here, you'll create input variables to specify the input data, split ratio, learning rate and registered model name. The command script will:
-* Use an *environment* that defines software and runtime libraries needed for the training script. Azure Machine Learning provides many curated or ready-made environments, which are useful for common training and inference scenarios. You'll use one of those environments here. In [Tutorial: Train a model in Azure Machine Learning](tutorial-train-model.md), you'll learn how to create a custom environment.
+Here, you create input variables to specify the input data, split ratio, learning rate, and registered model name. The command script will:
+* Use an *environment* that defines software and runtime libraries needed for the training script. Azure Machine Learning provides many curated or ready-made environments, which are useful for common training and inference scenarios. 
You use one of those environments here. In [Tutorial: Train a model in Azure Machine Learning](tutorial-train-model.md), you learn how to create a custom environment. * Configure the command line action itself - `python main.py` in this case. The inputs/outputs are accessible in the command via the `${{ ... }}` notation. * In this sample, we access the data from a file on the internet. -* Since a compute resource was not specified, the script will be run on a [serverless compute cluster](how-to-use-serverless-compute.md) that is automatically created. +* Since a compute resource wasn't specified, the script is run on a [serverless compute cluster](how-to-use-serverless-compute.md) that is automatically created. ```python @@ -269,7 +269,7 @@ job = command( ## Submit the job -It's now time to submit the job to run in Azure Machine Learning. This time you'll use `create_or_update` on `ml_client`. +It's now time to submit the job to run in Azure Machine Learning. This time you use `create_or_update` on `ml_client`. ```python @@ -280,7 +280,7 @@ ml_client.create_or_update(job) View the job in Azure Machine Learning studio by selecting the link in the output of the previous cell. -The output of this job will look like this in the Azure Machine Learning studio. Explore the tabs for various details like metrics, outputs etc. Once completed, the job will register a model in your workspace as a result of training. +The output of this job looks like this in the Azure Machine Learning studio. Explore the tabs for various details like metrics, outputs etc. Once completed, the job registers a model in your workspace as a result of training. :::image type="content" source="media/tutorial-azure-ml-in-a-day/view-job.gif" alt-text="Screenshot shows the overview page for the job."::: @@ -291,11 +291,11 @@ The output of this job will look like this in the Azure Machine Learning studio. 
Now deploy your machine learning model as a web service in the Azure cloud, an [`online endpoint`](concept-endpoints.md). -To deploy a machine learning service, you'll use the model you registered. +To deploy a machine learning service, you use the model you registered. ## Create a new online endpoint -Now that you have a registered model, it's time to create your online endpoint. The endpoint name needs to be unique in the entire Azure region. For this tutorial, you'll create a unique name using [`UUID`](https://en.wikipedia.org/wiki/Universally_unique_identifier). +Now that you have a registered model, it's time to create your online endpoint. The endpoint name needs to be unique in the entire Azure region. For this tutorial, you create a unique name using [`UUID`](https://en.wikipedia.org/wiki/Universally_unique_identifier). ```python @@ -349,9 +349,9 @@ print( ## Deploy the model to the endpoint -Once the endpoint is created, deploy the model with the entry script. Each endpoint can have multiple deployments. Direct traffic to these deployments can be specified using rules. Here you'll create a single deployment that handles 100% of the incoming traffic. We have chosen a color name for the deployment, for example, *blue*, *green*, *red* deployments, which is arbitrary. +Once the endpoint is created, deploy the model with the entry script. Each endpoint can have multiple deployments. Direct traffic to these deployments can be specified using rules. Here you create a single deployment that handles 100% of the incoming traffic. We chose a color name for the deployment, for example, *blue*, *green*, *red* deployments, which is arbitrary. -You can check the **Models** page on Azure Machine Learning studio, to identify the latest version of your registered model. Alternatively, the code below will retrieve the latest version number for you to use. 
+You can check the **Models** page on Azure Machine Learning studio, to identify the latest version of your registered model. Alternatively, the code below retrieves the latest version number for you to use. ```python @@ -362,7 +362,7 @@ latest_model_version = max( print(f'Latest model is version "{latest_model_version}" ') ``` -Deploy the latest version of the model. +Deploy the latest version of the model. ```python @@ -428,7 +428,7 @@ ml_client.online_endpoints.invoke( ## Clean up resources -If you're not going to use the endpoint, delete it to stop using the resource. Make sure no other deployments are using an endpoint before you delete it. +If you're not going to use the endpoint, delete it to stop using the resource. Make sure no other deployments are using an endpoint before you delete it. > [!NOTE] @@ -461,7 +461,7 @@ Now that you have an idea of what's involved in training and deploying a model, |Tutorial |Description | |---------|---------| -| [Upload, access and explore your data in Azure Machine Learning](tutorial-explore-data.md) | Store large data in the cloud and retrieve it from notebooks and scripts | +| [Upload, access, and explore your data in Azure Machine Learning](tutorial-explore-data.md) | Store large data in the cloud and retrieve it from notebooks and scripts | | [Model development on a cloud workstation](tutorial-cloud-workstation.md) | Start prototyping and developing machine learning models | | [Train a model in Azure Machine Learning](tutorial-train-model.md) | Dive in to the details of training a model | | [Deploy a model as an online endpoint](tutorial-deploy-model.md) | Dive in to the details of deploying a model | From 8228951a9d0a09f4304f353a6fd3c44d3c666663 Mon Sep 17 00:00:00 2001 From: Larry <890747+Blackmist@users.noreply.github.com> Date: Tue, 3 Sep 2024 11:44:07 -0400 Subject: [PATCH 10/11] freshness --- articles/machine-learning/how-to-assign-roles.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff 
--git a/articles/machine-learning/how-to-assign-roles.md b/articles/machine-learning/how-to-assign-roles.md index 2d7d02b289..416f3941e9 100644 --- a/articles/machine-learning/how-to-assign-roles.md +++ b/articles/machine-learning/how-to-assign-roles.md @@ -9,9 +9,10 @@ ms.topic: how-to ms.reviewer: None ms.author: larryfr author: Blackmist -ms.date: 03/11/2024 -ms.custom: how-to, devx-track-azurecli, devx-track-arm-template +ms.date: 09/03/2024 +ms.custom: how-to, devx-track-azurecli, devx-track-arm-template, FY25Q1-Linter monikerRange: 'azureml-api-1 || azureml-api-2' +# Customer Intent: As an admin, I want to understand what permissions I need to assign resources so my users can accomplish their tasks. --- # Manage access to Azure Machine Learning workspaces @@ -190,7 +191,7 @@ The following table is a summary of Azure Machine Learning activities and the pe | Submitting any type of run (V2) | Not required | Not required | Owner, contributor, or custom role allowing: `/workspaces/*/read`, `/workspaces/environments/write`, `/workspaces/jobs/*`, `/workspaces/metadata/artifacts/write`, `/workspaces/metadata/codes/*/write`, `/workspaces/environments/build/action`, `/workspaces/environments/readSecrets/action` | | Publishing pipelines and endpoints (V1) | Not required | Not required | Owner, contributor, or custom role allowing: `/workspaces/endpoints/pipelines/*`, `/workspaces/pipelinedrafts/*`, `/workspaces/modules/*` | | Publishing pipelines and endpoints (V2) | Not required | Not required | Owner, contributor, or custom role allowing: `/workspaces/endpoints/pipelines/*`, `/workspaces/pipelinedrafts/*`, `/workspaces/components/*` | -| Attach an AKS resource 2 | Not required | Owner or contributor on the resource group that contains AKS | +| Attach an AKS resource 2 | Not required | Owner or contributor on the resource group that contains AKS | | | Deploying a registered model on an AKS/ACI resource | Not required | Not required | Owner, contributor, or custom role 
allowing: `/workspaces/services/aks/write`, `/workspaces/services/aci/write` | | Scoring against a deployed AKS endpoint | Not required | Not required | Owner, contributor, or custom role allowing: `/workspaces/services/aks/score/action`, `/workspaces/services/aks/listkeys/action` (when you don't use Microsoft Entra auth) OR `/workspaces/read` (when you use token auth) | | Accessing storage using interactive notebooks | Not required | Not required | Owner, contributor, or custom role allowing: `/workspaces/computes/read`, `/workspaces/notebooks/samples/read`, `/workspaces/notebooks/storage/*`, `/workspaces/listStorageAccountKeys/action`, `/workspaces/listNotebookAccessToken/read`| @@ -210,7 +211,7 @@ The following table is a summary of Azure Machine Learning activities and the pe There are certain differences between actions for V1 APIs and V2 APIs. -| Asset | Action path for V1 API | Action path for V2 API +| Asset | Action path for V1 API | Action path for V2 API | | ----- | ----- | ----- | | Dataset | Microsoft.MachineLearningServices/workspaces/datasets | Microsoft.MachineLearningServices/workspaces/datasets/versions | | Experiment runs and jobs | Microsoft.MachineLearningServices/workspaces/experiments | Microsoft.MachineLearningServices/workspaces/jobs | @@ -492,7 +493,7 @@ Here are a few things to be aware of while you use Azure RBAC: - It can sometimes take up to one hour for your new role assignments to take effect over cached permissions across the stack. 
-## Next steps +## Related content - [Enterprise security and governance for Azure Machine Learning](concept-enterprise-security.md) - [Secure Azure Machine Learning workspace resources using virtual networks](how-to-network-security-overview.md) From 20d14fdcde452d17cc60a337fe06631ed5cdbd3b Mon Sep 17 00:00:00 2001 From: Larry <890747+Blackmist@users.noreply.github.com> Date: Tue, 3 Sep 2024 11:50:36 -0400 Subject: [PATCH 11/11] acrolinx --- .../machine-learning/how-to-assign-roles.md | 28 +++++++++---------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/articles/machine-learning/how-to-assign-roles.md b/articles/machine-learning/how-to-assign-roles.md index 416f3941e9..fd4da2e872 100644 --- a/articles/machine-learning/how-to-assign-roles.md +++ b/articles/machine-learning/how-to-assign-roles.md @@ -31,7 +31,7 @@ This article explains how to manage access (authorization) to Azure Machine Lear ## Default roles -Azure Machine Learning workspaces have built-in roles that are available by default. When adding users to a workspace, they can be assigned one of the following roles. +Azure Machine Learning workspaces have built-in roles that are available by default. When you add users to a workspace, they can be assigned one of the following roles. | Role | Access level | | --- | --- | @@ -41,13 +41,13 @@ Azure Machine Learning workspaces have built-in roles that are available by defa | **Contributor** | View, create, edit, or delete (where applicable) assets in a workspace. For example, contributors can create an experiment, create or attach a compute cluster, submit a run, and deploy a web service. | | **Owner** | Full access to the workspace, including the ability to view, create, edit, or delete (where applicable) assets in a workspace. Additionally, you can change role assignments. 
| -In addition, [Azure Machine Learning registries](how-to-manage-registries.md) have an AzureML Registry User role that can be assigned to a registry resource to grant user-level permissions to data scientists. For administrator-level permissions to create or delete registries, use the Contributor or Owner role. +In addition, [Azure Machine Learning registries](how-to-manage-registries.md) have an Azure Machine Learning Registry User role that can be assigned to a registry resource to grant user-level permissions to data scientists. For administrator-level permissions to create or delete registries, use the Contributor or Owner role. | Role | Access level | | --- | --- | | **AzureML Registry User** | Can get registries, and read, write, and delete assets within them. Can't create new registry resources or delete them. | -You can combine the roles to grant different levels of access. For example, you can grant a workspace user both AzureML Data Scientist and AzureML Compute Operator roles to permit the user to perform experiments while creating computes in a self-service manner. +You can combine the roles to grant different levels of access. For example, you can grant a workspace user both **AzureML Data Scientist** and **AzureML Compute Operator** roles to permit the user to perform experiments while creating computes in a self-service manner. > [!IMPORTANT] > Role access can be scoped to multiple levels in Azure. For example, someone with owner access to a workspace may not have owner access to the resource group that contains the workspace. For more information, see [How Azure RBAC works](/azure/role-based-access-control/overview#how-azure-rbac-works). @@ -74,13 +74,13 @@ az role assignment create --role "Contributor" --assignee "joe@contoso.com" --re You can use Microsoft Entra security groups to manage access to workspaces. 
This approach has the following benefits:
 * Team or project leaders can manage user access to the workspace as security group owners, without needing the Owner role on the workspace resource directly.
- * You can organize, manage and revoke users' permissions on workspace and other resources as a group, without having to manage permissions on user-by-user basis.
+ * You can organize, manage, and revoke users' permissions on the workspace and other resources as a group, without having to manage permissions on a user-by-user basis.
 * Using Microsoft Entra groups helps you to avoid reaching the [subscription limit](/azure/role-based-access-control/troubleshoot-limits) on role assignments.

To use Microsoft Entra security groups:
1. [Create a security group](/azure/active-directory/fundamentals/active-directory-groups-view-azure-portal).
2. [Add a group owner](/azure/active-directory/fundamentals/how-to-manage-groups#add-or-remove-members-and-owners). This user has permissions to add or remove group members. The group owner isn't required to be a group member, or to have a direct RBAC role on the workspace.
- 3. Assign the group an RBAC role on the workspace, such as AzureML Data Scientist, Reader, or Contributor.
+ 3. Assign the group an RBAC role on the workspace, such as **AzureML Data Scientist**, **Reader**, or **Contributor**.
4. [Add group members](/azure/active-directory/fundamentals/how-to-manage-groups#add-or-remove-members-and-owners). The members gain access to the workspace.

## Create custom role

@@ -115,7 +115,7 @@ To create a custom role, first construct a role definition JSON file that specif

> [!TIP]
> You can change the `AssignableScopes` field to set the scope of this custom role at the subscription level, the resource group level, or a specific workspace level.
-> The above custom role is just an example, see some suggested [custom roles for the Azure Machine Learning service](#customroles). 
+> The previous custom role is just an example; see some suggested [custom roles for the Azure Machine Learning service](#customroles).

This custom role can do everything in the workspace except for the following actions:

@@ -172,7 +172,7 @@ You need to have permissions on the entire scope of your new role definition. Fo

## Use Azure Resource Manager templates for repeatability

-If you anticipate that you'll need to recreate complex role assignments, an Azure Resource Manager template can be a significant help. The [machine-learning-dependencies-role-assignment template](https://github.com/Azure/azure-quickstart-templates/tree/master//quickstarts/microsoft.machinelearningservices/machine-learning-dependencies-role-assignment) shows how role assignments can be specified in source code for reuse.
+If you anticipate that you need to recreate complex role assignments, an Azure Resource Manager template can be a significant help. The [machine-learning-dependencies-role-assignment template](https://github.com/Azure/azure-quickstart-templates/tree/master//quickstarts/microsoft.machinelearningservices/machine-learning-dependencies-role-assignment) shows how role assignments can be specified in source code for reuse.

## Common scenarios

@@ -223,7 +223,7 @@ You can make custom roles compatible with both V1 and V2 APIs by including both

### Create a workspace using a customer-managed key

-When using a customer-managed key (CMK), an Azure Key Vault is used to store the key. The user or service principal used to create the workspace must have owner or contributor access to the key vault.
+When you use a customer-managed key (CMK), an Azure Key Vault is used to store the key. The user or service principal used to create the workspace must have owner or contributor access to the key vault.

If your workspace is configured with a **user-assigned managed identity**, the identity must be granted the following roles. 
These roles allow the managed identity to create the Azure Storage, Azure Cosmos DB, and Azure Search resources that are used with a customer-managed key:

@@ -232,7 +232,7 @@ If your workspace is configured with a **user-assigned managed identity**, the i
- `Microsoft.DocumentDB/databaseAccounts/write`

-Within the key vault, the user or service principal must have create, get, delete, and purge access to the key through a key vault access policy. For more information, see [Azure Key Vault security](/azure/key-vault/general/security-features#controlling-access-to-key-vault-data).
+Within the key vault, the user or service principal must have **create**, **get**, **delete**, and **purge** access to the key through a key vault access policy. For more information, see [Azure Key Vault security](/azure/key-vault/general/security-features#controlling-access-to-key-vault-data).

### User-assigned managed identity with Azure Machine Learning compute cluster

@@ -244,8 +244,8 @@ To perform MLflow operations with your Azure Machine Learning workspace, use the

| MLflow operation | Scope |
| --- | --- |
-| (V1) List, read, create, update or delete experiments | `Microsoft.MachineLearningServices/workspaces/experiments/*` |
-| (V2) List, read, create, update or delete jobs | `Microsoft.MachineLearningServices/workspaces/jobs/*` |
+| (V1) List, read, create, update, or delete experiments | `Microsoft.MachineLearningServices/workspaces/experiments/*` |
+| (V2) List, read, create, update, or delete jobs | `Microsoft.MachineLearningServices/workspaces/jobs/*` |
| Get registered model by name, fetch a list of all registered models in the registry, search for registered models, latest version models for each request's stage, get a registered model's version, search model versions, get URI where a model version's artifacts are stored, search for runs by experiment IDs | `Microsoft.MachineLearningServices/workspaces/models/*/read` |
| Create a new registered model, update a registered 
model's name/description, rename existing registered model, create new version of the model, update a model version's description, transition a registered model to one of the stages | `Microsoft.MachineLearningServices/workspaces/models/*/write` | | Delete a registered model along with all its version, delete specific versions of a registered model | `Microsoft.MachineLearningServices/workspaces/models/*/delete` | @@ -448,7 +448,7 @@ Allows you to perform all operations within the scope of a workspace, **except** * Creating a new workspace * Assigning subscription or workspace level quotas -The workspace admin also cannot create a new role. It can only assign existing built-in or custom roles within the scope of their workspace: +The workspace admin also can't create a new role. It can only assign existing built-in or custom roles within the scope of their workspace: *workspace_admin_custom_role.json* : @@ -475,7 +475,7 @@ The workspace admin also cannot create a new role. It can only assign existing b ### Data labeling -There is a built-in role for data labeling, scoped only to labeling data. The following custom roles give other levels of access for a data labeling project. +There's a built-in role for data labeling, scoped only to labeling data. The following custom roles give other levels of access for a data labeling project. [!INCLUDE [custom-role-data-labeling](includes/custom-role-data-labeling.md)] @@ -483,7 +483,7 @@ There is a built-in role for data labeling, scoped only to labeling data. The fo Here are a few things to be aware of while you use Azure RBAC: -- When you create a resource in Azure, such as a workspace, you're not directly the owner of the resource. Your role is inherited from the highest scope role that you're authorized against in that subscription. 
As an example if you're a Network Administrator, and have the permissions to create a Machine Learning workspace, you would be assigned the Network Administrator role against that workspace, and not the Owner role.
+- When you create a resource in Azure, such as a workspace, you're not directly the owner of the resource. Your role is inherited from the highest scope role that you're authorized against in that subscription. As an example, if you're a Network Administrator and have the permissions to create a Machine Learning workspace, you would be assigned the **Network Administrator** role against that workspace, not the **Owner** role.
- To perform quota operations in a workspace, you need subscription level permissions. This means setting either subscription level quota or workspace level quota for your managed compute resources can only happen if you have write permissions at the subscription scope.
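A note beyond the patch itself: the custom-role JSON files this article works with (for example, *workspace_admin_custom_role.json*) all share one shape — `Name`, `IsCustom`, `Description`, `Actions`, `NotActions`, and `AssignableScopes`. The following Python is a minimal sketch of assembling such a definition before handing it to the CLI; the role name, subscription ID, and action strings are illustrative placeholders, not values taken from the article.

```python
import json

def make_role_definition(name, description, subscription_id,
                         actions, not_actions=()):
    """Build a custom-role definition dict in the shape Azure RBAC expects."""
    return {
        "Name": name,
        "IsCustom": True,
        "Description": description,
        "Actions": list(actions),
        "NotActions": list(not_actions),
        # Scope can also be a resource group or a single workspace.
        "AssignableScopes": [f"/subscriptions/{subscription_id}"],
    }

# Hypothetical role: read and act within a workspace, but no compute writes
# and no workspace deletion.
role = make_role_definition(
    name="Data Scientist Custom",
    description="Workspace access without compute creation or deletion.",
    subscription_id="00000000-0000-0000-0000-000000000000",
    actions=["Microsoft.MachineLearningServices/workspaces/*/read",
             "Microsoft.MachineLearningServices/workspaces/*/action"],
    not_actions=["Microsoft.MachineLearningServices/workspaces/delete",
                 "Microsoft.MachineLearningServices/workspaces/computes/*/write"],
)

# Emit the JSON you would save to a file and pass to the Azure CLI.
print(json.dumps(role, indent=4))
```

Saved to a file, the definition can then be registered with `az role definition create --role-definition @role.json`, run by a user who holds `Microsoft.Authorization/roleDefinitions/write` over every listed assignable scope — consistent with the article's note that you need permissions on the entire scope of the new role definition.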