有效地将Pandas数据帧写入Google BigQuery

Question

我正在尝试使用pandas.DataFrame函数记录pandas.DataFrame.to_gbq()将here上传到谷歌大查询。问题是to_gbq()需要2.3分钟才能直接上传到Google Cloud Storage GUI不到一分钟。我正计划上传一堆数据帧（~32），每个数据帧都有相似的大小，所以我想知道它是什么更快的选择。

这是我正在使用的脚本：

dataframe.to_gbq('my_dataset.my_table', 
                 'my_project_id',
                 chunksize=None, # i've tryed with several chunksizes, it runs faster when is one big chunk (at least for me)
                 if_exists='append',
                 verbose=False
                 )

dataframe.to_csv(str(month) + '_file.csv') # the file size its 37.3 MB, this takes almost 2 seconds 
# manually upload the file into GCS GUI
print(dataframe.shape)
(363364, 21)

我的问题是，什么更快？

使用Dataframe函数上传pandas.DataFrame.to_gbq()
将Dataframe保存为csv，然后使用Python API将其作为文件上传到BigQuery
将Dataframe保存为csv，然后使用this procedure将文件上传到Google Cloud Storage，然后从BigQuery中读取它

更新：

替代方案2，使用pd.DataFrame.to_csv()和load_data_from_file()似乎需要比替代1更长的时间（平均3个循环多17.9秒）：

def load_data_from_file(dataset_id, table_id, source_file_name):
    bigquery_client = bigquery.Client()
    dataset_ref = bigquery_client.dataset(dataset_id)
    table_ref = dataset_ref.table(table_id)

    with open(source_file_name, 'rb') as source_file:
        # This example uses CSV, but you can use other formats.
        # See https://cloud.google.com/bigquery/loading-data
        job_config = bigquery.LoadJobConfig()
        job_config.source_format = 'text/csv'
        job_config.autodetect=True
        job = bigquery_client.load_table_from_file(
            source_file, table_ref, job_config=job_config)

    job.result()  # Waits for job to complete

    print('Loaded {} rows into {}:{}.'.format(
        job.output_rows, dataset_id, table_id))

谢谢！