Azure Data Factory: Keep your secrets in Azure Key Vault

Tibor Vekony
6 min read · Jan 5, 2022

Summary of the Article

This article will showcase the options currently available to link Azure Key Vault (AKV) with Azure Data Factory (ADF). Besides the basic how-to, I’ll also show how the linked AKV can be used in ADF.

In the next two sections, I’ll quickly go over what ADF and AKV are. If you’re already familiar with them, feel free to skip ahead!

What is Azure Data Factory (ADF)?

Azure Data Factory (ADF) is a service on the Microsoft Azure platform. It is a fully managed, no-code, serverless integration service for building & orchestrating ETL/ELT pipelines. “No-code” means that components are dragged & dropped onto a canvas. “Serverless” means that the servers are taken care of by Microsoft, costs are consumption-based and computing power is allocated on demand (yes, “serverless” is a misnomer).

What is Azure Key Vault (AKV)?

Azure Key Vault (AKV) is a service on the Microsoft Azure platform for securely storing and accessing secrets. A secret is anything that shouldn’t be shared publicly or even within the company. Prime candidates are API keys, passwords, user names, connection strings, certificates & cryptographic keys, among many other things. AKV can also log which secrets were accessed, when & by whom.

How to Connect ADF to AKV?

It’s pretty simple, really.

In ADF, navigate to the “Manage” menu, create a new Linked Service, of the Azure Key Vault type:

Creating a new Azure Key Vault (AKV) Linked Service in Azure Data Factory (ADF).

The first thing we’ll need to decide is the method of authentication for ADF to use when connecting to AKV. There are only 2, so we don’t have many choices:

  1. Managed Identity
  2. User-assigned Managed Identity

Which one to choose boils down to two questions: should the identity be kept if the ADF resource gets deleted, and will the same identity be assigned to multiple resources? If the answer to either is yes, select User-assigned Managed Identity. If not, select Managed Identity (the system-assigned identity, which lives and dies with the ADF resource).

Screen in Azure Data Factory, when creating an Azure Key Vault Linked Service.

For this exercise, I’ve chosen Managed Identity, but the process is the same with a User-assigned Managed Identity, once one has been created & selected. They can be created [here].

Screen in Azure Data Factory for selecting a User-assigned Managed Identity.

Next, select the Azure subscription containing the AKV from the “Azure subscription” dropdown menu. If the correct subscription was selected, the AKV will be available in the “Azure key vault name” dropdown menu.

Below the dropdown menus, the UI very helpfully shows the name & object ID of the Managed Identity of the ADF resource (I’ve removed them from the picture above), as well as a [link] to the documentation on how to grant ADF access to AKV. Access is granted in the AKV itself, by adding an Access Policy for that identity. Note the Save button!

How to add Access Policy in AKV.
Creating a new Access Policy for the Managed Identity.

Usually, only the List and Get permissions are assigned, for any type (Secret, Key, Certificate), as ADF is a user/consumer of secrets & doesn’t usually manage them.

After assigning the permissions, ensure the correct principal is selected. Click on the “None selected” button beside the “Select principal” label & paste the object ID or name that the UI in ADF listed.

Click on “Add”, then make sure to click the Save button on the Access Policies screen, otherwise the Access Policy won’t be applied!

Afterwards, click the Create button in ADF to create the AKV Linked Service, which can then be used in other Linked Services in ADF to retrieve secrets from that AKV.
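For reference, the Linked Service created by these steps boils down to a small JSON definition. Below is a minimal sketch in Python that builds it. The helper `akv_linked_service` and the names `AzureKeyVaultLS`, `my-vault` & `my-uami-credential` are hypothetical, and the `credential` reference for a User-assigned Managed Identity is my assumption of how ADF stores that choice, so compare it with the JSON of your own Linked Service.

```python
import json


def akv_linked_service(name, vault_name, credential_name=None):
    """Build the JSON definition of an Azure Key Vault Linked Service.

    credential_name is only set for a User-assigned Managed Identity
    (assumption: ADF references it as a credential object). When it is
    omitted, the Managed Identity of the ADF resource itself is used.
    """
    type_properties = {"baseUrl": f"https://{vault_name}.vault.azure.net/"}
    if credential_name is not None:
        type_properties["credential"] = {
            "referenceName": credential_name,
            "type": "CredentialReference",
        }
    return {
        "name": name,
        "properties": {
            "type": "AzureKeyVault",
            "typeProperties": type_properties,
        },
    }


if __name__ == "__main__":
    # Hypothetical names, purely for illustration.
    print(json.dumps(akv_linked_service("AzureKeyVaultLS", "my-vault"), indent=2))
```

The only real input is the base URL of the vault. The identity itself never shows up as a secret in the definition, which is the whole point of using Managed Identities.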

How to access AKV in ADF

There are 2 ways to access & use the secrets stored in AKV:

  1. via the Linked Service
  2. via the AKV’s REST API

Both can be valid. Which one to use will depend on where the secret will be used.

If the secret is needed within another Linked Service, then the AKV Linked Service should be used. On the other hand, if the secret is needed within a pipeline, then the REST API should be used.

AKV Linked Service in other Linked Services

During creation of Linked Services, the UI for many of them provides the option to use an existing AKV Linked Service to fetch secrets from, for properties like passwords or credentials.

Part of the SFTP Linked Service creation screen in ADF. The UI provides an option to use a secret in AKV as the password for the connection.

In other cases though, the UI doesn’t provide the option to use an AKV secret. Using the picture above as an example, the UI doesn’t allow using a secret as the username.

Setting a String as value for a property vs. Setting a secret as the value for a property

By editing the JSON definition of a Linked Service (it can be viewed by clicking on the curly braces icon in the list of Linked Services in ADF) directly, it is possible to use secrets from an existing AKV Linked Service as values for properties that the UI would normally not allow.

Edited JSON definition of an SFTP Linked Service. Note that the username is defined as a secret coming from a linked AKV. Normally, this is not possible to set up via the UI.

This can be combined with parameterization of the Linked Service too, if the name of the secret is unknown or if it will be determined only during runtime. In the above example, the name of the secret is parameterized.
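To make the idea concrete, here is a rough sketch in Python of what such an edited SFTP Linked Service definition could look like. The helper functions, the host, the secret name `sftp-password` & the parameter name `userNameSecret` are all hypothetical. The `AzureKeyVaultSecret` reference is the same structure ADF generates for the password when the AKV option is picked in the UI. Reusing it for the username is the manual edit described above.

```python
import json


def akv_secret(akv_linked_service_name, secret_name):
    """Reference to a secret stored in the AKV behind an AKV Linked Service."""
    return {
        "type": "AzureKeyVaultSecret",
        "store": {
            "referenceName": akv_linked_service_name,
            "type": "LinkedServiceReference",
        },
        "secretName": secret_name,
    }


def sftp_linked_service(name, host, akv_linked_service_name):
    """SFTP Linked Service whose username & password both come from AKV."""
    return {
        "name": name,
        "properties": {
            "type": "Sftp",
            # Linked Service parameter holding the name of the username secret.
            "parameters": {"userNameSecret": {"type": "String"}},
            "typeProperties": {
                "host": host,
                "port": 22,
                "authenticationType": "Basic",
                # Secret name resolved at runtime from the parameter above.
                "userName": akv_secret(
                    akv_linked_service_name,
                    "@{linkedService().userNameSecret}",
                ),
                # Fixed secret name, as the UI would generate it.
                "password": akv_secret(akv_linked_service_name, "sftp-password"),
            },
        },
    }


if __name__ == "__main__":
    definition = sftp_linked_service("SftpLS", "sftp.example.com", "AzureKeyVaultLS")
    print(json.dumps(definition, indent=2))
```

The printed JSON can be compared against the definition shown behind the curly braces icon before pasting anything into ADF.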

Using Secrets from AKV in Pipelines

If a secret from AKV is needed within a pipeline, it can be retrieved by making an HTTP request to the REST API of the AKV, using a Web Activity in ADF.

Setting up a Linked Service to the AKV is not a prerequisite for retrieving secrets via the REST API. The identity used by the Web Activity still needs permission to read the secrets in the AKV, though.

As an example, let’s say that within ADF an external API must be called, which requires an API key. This API key — following best practices — is stored in AKV.

To retrieve the API key in ADF, a simple Web Activity can be used:

JSON definition of the Web Activity retrieving secrets from AKV.

In this example, I’ve selected MSI / Managed Identity as the method of authentication within the Web Activity. A User-assigned Managed Identity could be used as well, if one is set up with the AKV.

An API version must also be specified, as a query parameter in the URL that’s called. In the above example, it’s 7.0 (?api-version=7.0). If an API version isn’t specified, the activity will fail.

Another important bit is to make sure that the Resource (to which authentication happens) is set to “https://vault.azure.net”.

As this activity handles secrets, it’s a good idea to check the “Secure Output” option to avoid revealing the secrets within the logs.
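Putting the points above together, a Web Activity definition along these lines could be sketched in Python as follows. The helper `get_secret_activity`, the activity name `GetApiKey`, the vault name `my-vault` & the secret name `external-api-key` are hypothetical. Only the settings discussed above are included, so a real activity will carry a few more properties.

```python
import json


def get_secret_activity(activity_name, vault_name, secret_name, api_version="7.0"):
    """Web Activity that reads one secret from AKV via its REST API."""
    url = (
        f"https://{vault_name}.vault.azure.net/secrets/{secret_name}"
        f"?api-version={api_version}"  # the call fails without an API version
    )
    return {
        "name": activity_name,
        "type": "WebActivity",
        # "Secure Output" keeps the secret out of the activity logs.
        "policy": {"secureOutput": True},
        "typeProperties": {
            "url": url,
            "method": "GET",
            "authentication": {
                "type": "MSI",  # Managed Identity of the ADF resource
                "resource": "https://vault.azure.net",
            },
        },
    }


if __name__ == "__main__":
    activity = get_secret_activity("GetApiKey", "my-vault", "external-api-key")
    print(json.dumps(activity, indent=2))
```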

The output of the Web Activity will contain the HTTP response from AKV, which will have the secret in its value field, if everything went well. This can be saved in a variable and used in other activities in the pipeline, or passed to other pipelines as a parameter.
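As a rough sketch of the "save it in a variable" step, the Python snippet below builds a Set Variable activity that reads the secret from the output of a preceding Web Activity. The helper `set_variable_from_secret`, the activity names `GetApiKey` & `SetApiKey` and the variable name `apiKey` are hypothetical, and the variable itself has to be declared on the pipeline beforehand.

```python
import json


def set_variable_from_secret(activity_name, source_activity, variable_name):
    """Set Variable activity that stores the secret returned by a Web Activity."""
    return {
        "name": activity_name,
        "type": "SetVariable",
        # Only run once the Web Activity has fetched the secret.
        "dependsOn": [
            {"activity": source_activity, "dependencyConditions": ["Succeeded"]}
        ],
        "typeProperties": {
            "variableName": variable_name,
            "value": {
                # AKV returns the secret in the "value" field of the response.
                "value": f"@activity('{source_activity}').output.value",
                "type": "Expression",
            },
        },
    }


if __name__ == "__main__":
    activity = set_variable_from_secret("SetApiKey", "GetApiKey", "apiKey")
    print(json.dumps(activity, indent=2))
```

The same expression, `@activity('GetApiKey').output.value`, can also be used directly in a later activity or as the value of a parameter passed to another pipeline, without going through a variable.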

Summary

Key learnings from this article:

  1. How to create an AKV Linked Service & which authentication method to use.
  2. AKV Linked Services can be used with other Linked Services. They can be used even if the UI doesn’t provide an option to do so.
  3. If a secret is needed within a pipeline, it can be fetched via a Web Activity sending an HTTP request to the REST API of AKV.

If you’d like to learn more interesting things, tricks & tips about Microsoft Azure, please follow me!!!

If you’ve learned something new, share this article to show what you’ve just learned & to make sure others will see it as well!!!
