Azure Databricks accessing Azure Data Lake Storage Gen2 via service principal

Posted: 2020-07-21 03:56:54

Question:

I want to access Azure Data Lake Storage Gen2 from an Azure Databricks cluster via a service principal, so I can get rid of the storage account access key. I followed https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/azure/azure-datalake-gen2#--mount-an-azure-data-lake-storage-gen2-account-using-a-service-principal-and-oauth-20 .. but it says the storage account access key is still used:

If the storage account access key is still required, then what is the purpose of the service principal? Main question: is it possible to get rid of the storage account access key entirely and use only the service principal?

Comments:

Answer 1:

This is a documentation error, which I am fixing right away.

It should read: dbutils.secrets.get(scope = "<scope-name>", key = "<key-name-for-service-credential>") retrieves your service credential, which has been stored as a secret in a secret scope.

Python: Mount an Azure Data Lake Storage Gen2 file system by passing the values directly

configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": "0xxxxxxxxxxxxxxxxxxxxxxxxxxf",  # <appId> = application (client) ID
           "fs.azure.account.oauth2.client.secret": "Arxxxxxxxxxxxxxxxxxxxxy7].vX7bMt]*",  # <password> = client secret created in AAD
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/72fxxxxxxxxxxxxxxxxxxxxxxxxb47/oauth2/token",  # <tenant> = tenant ID
           "fs.azure.createRemoteFileSystemDuringInitialization": "true"}

dbutils.fs.mount(
    source = "abfss://filesystem@chepragen2.dfs.core.windows.net/flightdata",  # <container-name> = filesystem, <storage-account-name> = chepragen2
    mount_point = "/mnt/flightdata",
    extra_configs = configs)
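For reference, the OAuth settings above can be assembled with a small helper so that only the three identifiers vary between workspaces. This is a minimal sketch; the function name and the placeholder values are illustrative, not part of the documented API:

```python
def build_oauth_configs(app_id, client_secret, tenant_id):
    """Build the Spark configs for ADLS Gen2 OAuth (client-credentials) access."""
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": app_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }

# Hypothetical placeholder values; in a real notebook the secret would come
# from dbutils.secrets.get rather than a literal string.
configs = build_oauth_configs("my-app-id", "my-client-secret", "my-tenant-id")
print(configs["fs.azure.account.oauth2.client.endpoint"])
```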

Python: Mount an Azure Data Lake Storage Gen2 file system by passing the service credential as a secret from a secret scope via dbutils.secrets

configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": "06xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx0ef",
           "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope = "chepra", key = "service-credential"),
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/72xxxxxxxxxxxxxxxxxxxx011db47/oauth2/token"}

dbutils.fs.mount(
    source = "abfss://filesystem@chepragen2.dfs.core.windows.net/flightdata",
    mount_point = "/mnt/flightdata",
    extra_configs = configs)
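The mount source in both snippets follows the pattern abfss://&lt;container&gt;@&lt;storage-account&gt;.dfs.core.windows.net/&lt;path&gt;. A small helper (the function name is my own, for illustration) makes that pattern explicit:

```python
def abfss_uri(container, storage_account, path=""):
    """Compose an abfss:// URI for an ADLS Gen2 container, optionally with a sub-path."""
    uri = f"abfss://{container}@{storage_account}.dfs.core.windows.net"
    return f"{uri}/{path.lstrip('/')}" if path else uri

# Recreates the mount source used above:
print(abfss_uri("filesystem", "chepragen2", "flightdata"))
# → abfss://filesystem@chepragen2.dfs.core.windows.net/flightdata
```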

Hope this helps. Do let us know if you have any further queries.

Comments:
