How to run a BigQuery SQL query in a Python Jupyter notebook
【Title】: How to run a bigquery SQL query in python jupyter notebook
【Posted】: 2021-10-02 20:17:16
【Question】: I am trying to run a SQL query against Google BigQuery from a Jupyter notebook. I followed everything described at https://cloud.google.com/bigquery/docs/bigquery-storage-python-pandas#download_query_results_using_the_client_library. I set up a client account and downloaded its JSON file. Now I am trying to run this script:
from google.cloud import bigquery
bqclient = bigquery.Client('c://folder/client_account.json')
# Download query results.
query_string = """
SELECT * from `project.dataset.table`
"""
dataframe = (
    bqclient.query(query_string)
    .result()
    .to_dataframe(
        # Optionally, explicitly request to use the BigQuery Storage API. As of
        # google-cloud-bigquery version 1.26.0 and above, the BigQuery Storage
        # API is used by default.
        create_bqstorage_client=True,
    )
)
print(dataframe.head())
But I keep getting this error:
DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
I don't understand what I am doing wrong, since the JSON file looks fine and the path to the file is correct.
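【Answer 1】: The error means that the client library could not find the application credentials it needs in your environment. In the question's code, the key file path is passed as the first positional argument of bigquery.Client, which is the project, so the key file is never loaded as credentials.
To authenticate with a service account, use the following approach: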
from google.cloud import bigquery
from google.oauth2 import service_account
# TODO(developer): Set key_path to the path to the service account key
# file.
key_path = "path/to/service_account.json"
credentials = service_account.Credentials.from_service_account_file(
    key_path,
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
bqclient = bigquery.Client(credentials=credentials, project=credentials.project_id)
query_string = """
SELECT * from `project.dataset.table`
"""
dataframe = (
    bqclient.query(query_string)
    .result()
    .to_dataframe(
        # Optionally, explicitly request to use the BigQuery Storage API. As of
        # google-cloud-bigquery version 1.26.0 and above, the BigQuery Storage
        # API is used by default.
        create_bqstorage_client=True,
    )
)
print(dataframe.head())
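The error message also names a second route: setting GOOGLE_APPLICATION_CREDENTIALS so that the client library can find the key on its own. The sketch below is one way this might be done from inside a notebook, assuming the variable is set before any client is created. The helper name `use_service_account_key` and the sanity checks are illustrative assumptions, not part of the original answer or of the Google libraries.

```python
import json
import os
from pathlib import Path


def use_service_account_key(key_path):
    """Hypothetical helper: point GOOGLE_APPLICATION_CREDENTIALS at a key file.

    Checks that the file exists and looks like a service account key,
    sets the environment variable for the current process, and returns
    the project_id stored in the key file (or None if it is absent).
    """
    path = Path(key_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Key file not found: {path}")

    with path.open(encoding="utf-8") as fh:
        info = json.load(fh)

    # Service account key files carry "type": "service_account".
    if info.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")

    # Only affects this Python process (the running notebook kernel).
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(path)
    return info.get("project_id")


# Usage in a notebook cell, run before creating the client
# (the path is a placeholder, as in the answer above):
#
#   project_id = use_service_account_key("path/to/service_account.json")
#   from google.cloud import bigquery
#   bqclient = bigquery.Client(project=project_id)
```

Compared with passing `credentials=` explicitly, this keeps the client construction free of key handling, but the setting lasts only for the current kernel session and has to be repeated after a restart.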
【Comments】:
Hi, with your solution I get the error: name 'bqclient' is not defined
Hi @Galat, thanks. I have updated my code.
Thanks @Sakshi Gatyan, your solution works. I only had to change the final .to_dataframe(create_bqstorage_client=True,) to .to_dataframe().