Azure Databricks accessing Azure Data Lake Storage Gen2 via Service Principal
Posted: 2020-07-21 03:56:54

Question: I want to access Azure Data Lake Storage Gen2 from an Azure Databricks cluster via a service principal, in order to get rid of the storage account access key. I followed https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/azure/azure-datalake-gen2#--mount-an-azure-data-lake-storage-gen2-account-using-a-service-principal-and-oauth-20 ... but it says the storage account access key is still in use:
If the storage account access key is still required, what is the purpose of the service principal? Main question: is it possible to get rid of the storage account access key entirely and use only a service principal?
Answer 1: This is a documentation bug, and I am getting it fixed right away.
It should read: dbutils.secrets.get(scope = "<scope-name>", key = "<key-name-for-service-credential>") retrieves your service credential that has been stored as a secret in a secret scope.
Python: mount an Azure Data Lake Storage Gen2 file system by passing the values directly
configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": "0xxxxxxxxxxxxxxxxxxxxxxxxxxf",  # <appId> = application (client) ID
           "fs.azure.account.oauth2.client.secret": "Arxxxxxxxxxxxxxxxxxxxxy7].vX7bMt]*",  # <password> = client secret created in AAD
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/72fxxxxxxxxxxxxxxxxxxxxxxxxb47/oauth2/token",  # <tenant> = tenant ID
           "fs.azure.createRemoteFileSystemDuringInitialization": "true"}

dbutils.fs.mount(
    source = "abfss://filesystem@chepragen2.dfs.core.windows.net/flightdata",  # <container-name> = filesystem, <storage-account-name> = chepragen2
    mount_point = "/mnt/flightdata",
    extra_configs = configs)
Python: mount an Azure Data Lake Storage Gen2 file system by passing the client secret via dbutils secrets, stored as a secret in a secret scope.
configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": "06xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx0ef",
           "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope = "chepra", key = "service-credential"),
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/72xxxxxxxxxxxxxxxxxxxx011db47/oauth2/token"}

dbutils.fs.mount(
    source = "abfss://filesystem@chepragen2.dfs.core.windows.net/flightdata",
    mount_point = "/mnt/flightdata",
    extra_configs = configs)
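The two variants above differ only in how the client secret is supplied, so the OAuth settings can be factored into a small helper. This is a minimal sketch: the function name and parameters are my own illustration, not part of the Databricks API, and only the returned dictionary (not the actual mount, which needs a Databricks cluster) is shown here.

```python
def build_adls_oauth_configs(client_id, client_secret, tenant_id,
                             create_filesystem=False):
    """Build the ABFS OAuth settings for mounting ADLS Gen2 with a
    service principal (hypothetical helper, not a Databricks API)."""
    configs = {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }
    if create_filesystem:
        # Optional: create the file system on first use.
        configs["fs.azure.createRemoteFileSystemDuringInitialization"] = "true"
    return configs

# In a Databricks notebook the secret would come from a secret scope, e.g.:
# secret = dbutils.secrets.get(scope="chepra", key="service-credential")
# dbutils.fs.mount(
#     source="abfss://filesystem@chepragen2.dfs.core.windows.net/flightdata",
#     mount_point="/mnt/flightdata",
#     extra_configs=build_adls_oauth_configs(app_id, secret, tenant_id))
```

Keeping the secret in a secret scope means no access key or client secret ever appears in notebook source.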
Hope this helps. Do let us know if you have any further queries.