Pandas: combine multiple columns in a BQ table to generate a payload for the FB Conversions API

Posted: 2021-02-25 04:37:29

Question:

I am reading data from a BigQuery table to build a payload to upload to the FB Conversions API.

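The columns are cols=["payload","client_user_agent","event_source_url"]. I copied the column values below directly from the BQ table because I could not print the full dataframe output in the notebook.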

payload=""pageDetail":"pageName":"Confirmation","pageContentType":"cart","pageSiteSection":"cart","breadcrumbs":["title":"Home","url":"/en/home.html","title":"Cart","url":"/cart","title":"Confirmation","url":"/order-confirmation="],"pageCategory":"Home","pageCategory1":"Cart","pageCategory2":"Confirmation","proBtbGlobalHeader":false,"orderDetails":"hceid":"3b94a","orderConfirmed":true,"orderDate":"2021-01-15","orderId":"0123","unique":2,"pricingSummary":"total":54.01,"items":["productId":"0456","quantity":1,"shippingAddress":"postalCode":"V4N 3X3","promotion":"voucherCode":null,"clickToInstall":"eligible":false,"productId":"0789","quantity":1,"fulfillment":"fulfillmentCost":"","shippingAddress":"postalCode":"A4N 3Y3","promotion":"voucherCode":null,"clickToInstall":"eligible":false],"billingAddress":"postalCode":"M$X1A7","event":"type":"Load","page":"Confirmation","timestamp":1610706772998,"language":"English","url":"https://www""

client_user_agent="Mozilla/5.0"
event_source_url= "https://www.def.com="

I need the values email = ["orderDetails"]["hceid"] and value = ["orderDetails"]["pricingSummary"]["total"].

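Originally, everything I needed for the payload was in a single column, and I was able to perform the upload with the following code: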

import time
from facebook_business.adobjects.serverside.event import Event
from facebook_business.adobjects.serverside.event_request import EventRequest
from facebook_business.adobjects.serverside.user_data import UserData
from facebook_business.adobjects.serverside.custom_data import CustomData
from facebook_business.api import FacebookAdsApi
import pandas as pd
import json

FacebookAdsApi.init(access_token=access_token)
query='''SELECT  JSON_EXTRACT(payload, '$') AS payload FROM `project.dataset.events` WHERE eventType = 'Page Load' AND pagename = "Confirmation" limit 1'''
df = pd.read_gbq(query, project_id= project, dialect='standard')
payload = df.to_dict(orient="records")
for i in payload:
    #print(type(i["payload"]))
    k = json.loads(i["payload"])
    email = k["orderDetails"]["hcemuid"]
    user_data = UserData(email)
    value=k["orderDetails"]["pricingSummary"]["total"]
    order_id = k["orderDetails"]["orderId"]
    custom_data = CustomData(
        currency='CAD',
        value=value)
    event = Event(
        event_name='Purchase',
        event_time=int(time.time()),
        user_data=user_data,
        custom_data=custom_data,
        event_id = order_id,
        data_processing_options= [])
    events = [event]
    #print(events)
    event_request = EventRequest(
        events=events,
        test_event_code='TEST8609',
        pixel_id=pixel_id)
    #print(event_request)
    a=event_request.execute()
    print(a)

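Now there are additional values: client_user_agent needs to be part of the user data, and event_source_url needs to be part of the event in the code above. They appear as two separate columns in the GBQ table.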

I tried code similar to the above for multiple columns, but I get this error:

TypeError: Object of type Series is not JSON serializable

So I tried concatenating the columns and then creating a JSON-serializable object, but I could not get the upload to work.

Below is where I am stuck and lost. I do not know how to proceed, and any input would be appreciated.

import time
from facebook_business.adobjects.serverside.event import Event
from facebook_business.adobjects.serverside.event_request import EventRequest
from facebook_business.adobjects.serverside.user_data import UserData
from facebook_business.adobjects.serverside.custom_data import CustomData
from facebook_business.api import FacebookAdsApi
import pandas as pd
import json
FacebookAdsApi.init(access_token=access_token)
query='''SELECT  payload  AS payload,location.userAgent as client_user_agent,location.referrer as event_source_url FROM `project.Dataset.events` WHERE eventType = 'Page Load' AND pagename = "Confirmation" limit 1'''
df = pd.read_gbq(query, project_id= project, dialect='standard')
df.reset_index(drop=True, inplace=True)
payload = df.to_dict(orient="records")
print(payload)
## cols = ['payload', 'client_user_agent', 'event_source_url']
## df['combined'] = df[cols].apply(lambda row: ','.join(row.values.astype(str)), axis=1)
## del df["payload"]
## del df["client"]
## del df["source"]
## payload = df.to_dict(orient="records")
# Tried concatenating all columns in the dataframe, but could not create a valid JSON object for upload
columns = ['payload', 'client_user_agent', 'event_source_url']
df['payload'] = df['payload'].str.replace(r'"$', '', regex=True)  # strip a trailing quote; pandas 2.x needs regex=True for a pattern
payload = df[columns].to_dict(orient='records')
print(payload)
## df = df.drop(columns=columns)
## pd.options.display.max_rows = 4000
# #print(payload)
# for i in payload:
#     print(i["payload"])
#     k = json.loads(i["payload"])
#     email = k["orderDetails"]["hcemuid"]
#     print(email)
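
One way to sidestep the `TypeError` mentioned above is to keep working row by row: parse only the `payload` string with `json.loads` and read the other two columns as plain values from the same row dict. The sketch below assumes the rows look like the output of `df.to_dict(orient="records")`; the function name `extract_event_fields`, the `email_key` parameter, and the sample row are hypothetical, and the sample uses the `hceid` key from the pasted payload (the working code above reads `hcemuid`, so pass whichever key the real data contains).

```python
import json


def extract_event_fields(records, email_key="hceid"):
    """Flatten row dicts (e.g. from df.to_dict(orient="records")) into event fields.

    email_key is an assumption: the sample payload shows "hceid",
    while the original code reads "hcemuid".
    """
    events = []
    for row in records:
        # Only the payload column holds JSON; the other columns are plain strings.
        details = json.loads(row["payload"])["orderDetails"]
        events.append({
            "email": details[email_key],
            "value": details["pricingSummary"]["total"],
            "order_id": details["orderId"],
            "client_user_agent": row["client_user_agent"],
            "event_source_url": row["event_source_url"],
        })
    return events


# Hypothetical row shaped like one record of the three-column query result.
sample_rows = [{
    "payload": json.dumps({
        "orderDetails": {
            "hceid": "3b94a",
            "orderId": "0123",
            "pricingSummary": {"total": 54.01},
        }
    }),
    "client_user_agent": "Mozilla/5.0",
    "event_source_url": "https://www.example.com/order-confirmation",
}]

fields = extract_event_fields(sample_rows)
print(fields[0])
```

Each resulting dict holds scalars only, so the values can be handed to the SDK objects one row at a time, for example `UserData(email=..., client_user_agent=...)` and `Event(..., event_source_url=...)`, instead of passing whole dataframe columns.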

I am following the instructions on this page: https://developers.facebook.com/docs/marketing-api/conversions-api


Answer 1:

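I used BigQuery's json_extract_scalar function to extract the data from the nested column instead of doing it in pandas, which was a comparatively better solution for my scenario.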
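
The answer does not include the query itself, so the following is only a sketch of what such a query might look like, not the author's actual code. It assumes `payload` is a STRING column containing JSON, reuses the table, filter, and `location.*` column names from the question, and takes the JSON paths from the sample payload (`hceid` may need to be `hcemuid` in the real data). The helper `build_query` is a hypothetical name. `JSON_EXTRACT_SCALAR` returns a STRING, hence the cast for the numeric total.

```python
def build_query(table="project.dataset.events", limit=1):
    """Build a BigQuery standard SQL query that returns one flat row per event."""
    return f"""
SELECT
  JSON_EXTRACT_SCALAR(payload, '$.orderDetails.hceid') AS email,
  CAST(JSON_EXTRACT_SCALAR(payload, '$.orderDetails.pricingSummary.total') AS FLOAT64) AS value,
  JSON_EXTRACT_SCALAR(payload, '$.orderDetails.orderId') AS order_id,
  location.userAgent AS client_user_agent,
  location.referrer AS event_source_url
FROM `{table}`
WHERE eventType = 'Page Load' AND pagename = 'Confirmation'
LIMIT {int(limit)}
"""


query = build_query()
print(query)
```

With a query of this shape, `pd.read_gbq(query, project_id=project, dialect='standard')` would return scalar columns, so no `json.loads` step is needed in Python before building the events.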
