Join multiple tables of Amazon Redshift in a single one obtains error: column X is of type boolean but expression is of type character varying Hint:
【Posted】: 2021-08-30 21:38:41
【Question】: I am trying to join multiple Amazon Redshift tables into a single table. One of the initial tables is this:
create table order_customers(
id int,
email varchar(254),
phone varchar(50),
customer_id int,
order_id int NOT NULL,
ip text,
geoip_location varchar(1024),
logged_in boolean,
PRIMARY KEY (id),
FOREIGN KEY (order_id) REFERENCES orders (id)
);
I am inserting the data into the large table with this command:
INSERT INTO orders_large ( id, showid, created_at, status, status_enum, currency, tax_orders, shipping, discount_orders,
subtotal, total, store_id, payment_method_id, shipping_method_name, shipping_method_id, additional_information,
payment_information, locale, shipping_required_orders, payment_method_type, coupons, payment_notification_id,
recover_token, updated_at, external, shipping_tax, shipping_discount, shipping_discount_decimal,
completed_at, payment_name, shipping_service_id, app_id, fulfillment_status, date_traffic_sources,
landing_url, referral_url, referral_code, utm_campaign, utm_source, utm_term, utm_medium, utm_content,
user_agent, subscription_id_traffic_sources, email, phone, customer_id_order_customers,
ip, geoip_location, logged_in, name, surname, company, address, street_number, city, postal, country, region,
type, taxid, default_, region_format, municipality, latitude, longitude, subscription_id_addresses,
customer_id_addresses, pickup_point_id, taxid_type, sku, qty, price, product_id, weight,
product_option_property_id, discount_order_products, shipping_required_orders_products, brand,
tax_order_products, width, height, length, volume, diameter, package_format)
SELECT o.id, o.showid, o.created_at, o.status, o.status_enum, o.currency, o.tax, o.shipping, o.discount,
o.subtotal, o.total, o.store_id, o.payment_method_id, o.shipping_method_name, o.shipping_method_id, o.additional_information,
o.payment_information, o.locale, o.shipping_required, payment_method_type, coupons, payment_notification_id,
recover_token, o.updated_at, o.external, shipping_tax, o.shipping_discount, shipping_discount_decimal,
completed_at, payment_name, shipping_service_id, o.app_id, o.fulfillment_status, t.date,
t.landing_url, t.referral_url, t.referral_code, t.utm_campaign, t.utm_source, t.utm_term, t.utm_medium, t.utm_content,
t.user_agent, t.subscription_id, oc.email, oc.phone, oc.customer_id, oc.order_id, oc.ip,
oc.geoip_location, oc.logged_in, a.name, a.surname, a.company, a.address, a.street_number, a.city, a.postal,
a.country, a.region, a.type, a.taxid, a.default_, a.region_format, a.municipality, a.latitude, a.longitude, a.order_id,
a.subscription_id, a.customer_id, a.pickup_point_id, a.taxid_type, op.sku, op.qty, op.price, op.product_id,
op.order_id, op.weight, op.product_option_property_id, op.discount, op.shipping_required, op.brand,
op.tax, op.width, op.height, op.length, op.volume, op.diameter, op.package_format
FROM orders o
INNER JOIN traffic_sources t ON o.id = t.order_id
INNER JOIN order_customers oc ON o.id = oc.order_id
INNER JOIN addresses a ON o.id = a.order_id
INNER JOIN order_products op ON o.id = op.order_id;
I get this error message:
ERROR: column "logged_in" is of type boolean but expression is of type character varying Hint: You will need to rewrite or cast the expression.
I tried using DECODE(oc.logged_in, 'false', '0', 'true', '1')::varchar::bool for the oc.logged_in field, but got another error message:
ERROR: cannot cast type character varying to boolean
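【Comments】:
Why are you decoding to a string value and then trying to cast it to a boolean? I would expect DECODE(oc.logged_in, 'false', false, 'true', true, NULL) or DECODE(oc.logged_in, 'false', false, true) to be what you want.
Both options return "ERROR: column "logged_in" is of type boolean but expression is of type character varying Hint: You will need to rewrite or cast the expression."
I also tried DECODE(oc.logged_in, 'false', false, true)::Boolean, DECODE(oc.logged_in, 'false', false, 'true', true, NULL)::Boolean, and DECODE(oc.logged_in, 'true', true, false)::Bool, and the error is always the same: "ERROR: column "logged_in" is of type boolean but expression is of type character varying Hint: You will need to rewrite or cast the expression."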
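【Answer 1】:
The problem was the correspondence between the fields. It worked after removing the "order_id" fields. To cast in Redshift, there are 2 options:
CONVERT(type, expression)
CAST(expression AS type) or expression :: type
Source: https://docs.aws.amazon.com/redshift/latest/dg/r_CAST_function.html#convert-function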
【Answer 2】: As the message says, column "logged_in" is of type boolean.
So in your DECODE you need to compare it with a boolean value, not a string. Try:
DECODE(oc.logged_in, true, 'true', 'false')
The code above works for my understanding of your issue. Below is test SQL which runs fine on Redshift.
create table oc as (select 1=1 as logged_in union all select 1=0);
select * from oc;
select DECODE(oc.logged_in, true, 'true string', 'false string') as test from oc;
I now expect that the issue is not in using oc.logged_in but rather with orders_large.logged_in and what you are putting in it. What data type is logged_in defined as in orders_large? Boolean, I assume. Which should take a boolean value just fine w/o casting.
Looking at your SQL I see that the number of elements in the INSERT clause doesn't match the number of elements in the SELECT clause. This mismatch is causing your SQL to try to put a different (text) value into orders_large.logged_in. Here's a "diff" between the 2 lists (INSERT on the left / SELECT on the right):
id id
showid showid
created_at created_at
status status
status_enum status_enum
currency currency
tax_orders | tax
shipping shipping
discount_orders | discount
subtotal subtotal
total total
store_id store_id
payment_method_id payment_method_id
shipping_method_name shipping_method_name
shipping_method_id shipping_method_id
additional_information additional_information
payment_information payment_information
locale locale
shipping_required_orders | shipping_required
payment_method_type payment_method_type
coupons coupons
payment_notification_id payment_notification_id
recover_token recover_token
updated_at updated_at
external external
shipping_tax shipping_tax
shipping_discount shipping_discount
shipping_discount_decimal shipping_discount_decimal
completed_at completed_at
payment_name payment_name
shipping_service_id shipping_service_id
app_id app_id
fulfillment_status fulfillment_status
date_traffic_sources | date
landing_url landing_url
referral_url referral_url
referral_code referral_code
utm_campaign utm_campaign
utm_source utm_source
utm_term utm_term
utm_medium utm_medium
utm_content utm_content
user_agent user_agent
subscription_id_traffic_sources | subscription_id
email email
phone phone
customer_id_order_customers | customer_id
> order_id
ip ip
geoip_location geoip_location
logged_in logged_in
name name
surname surname
company company
address address
street_number street_number
city city
postal postal
country country
region region
type type
taxid taxid
default_ default_
region_format region_format
municipality municipality
latitude latitude
longitude longitude
subscription_id_addresses | order_id
customer_id_addresses | subscription_id
> customer_id
pickup_point_id pickup_point_id
taxid_type taxid_type
sku sku
qty qty
price price
product_id product_id
> order_id
weight weight
product_option_property_id product_option_property_id
discount_order_products | discount
shipping_required_orders_products | shipping_required
brand brand
tax_order_products | tax
width width
height height
length length
volume volume
diameter diameter
package_format package_format
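As you can see, there is an unmatched "order_id" in the SELECT list just a couple of columns before logged_in, and two more further down. Because values are matched to columns by position, every expression after the first extra "order_id" shifts by one, so orders_large.logged_in receives oc.geoip_location, which is a varchar. You need to get the column alignment fixed.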
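The positional pairing can also be checked mechanically. The following is a minimal Python sketch, assuming the two column lists have been copied out of the statement by hand; the lists are only an excerpt from the question around logged_in, and `pair_by_position` and `base_name` are hypothetical helper names. The prefix comparison is a rough heuristic that happens to fit this schema's naming (for example customer_id_order_customers is fed by customer_id), not a general rule.

```python
from itertools import zip_longest

# Excerpt of the INSERT target list from the question (around logged_in).
insert_columns = [
    "email", "phone", "customer_id_order_customers",
    "ip", "geoip_location", "logged_in", "name",
]

# Matching excerpt of the SELECT list, including the extra oc.order_id.
select_expressions = [
    "oc.email", "oc.phone", "oc.customer_id", "oc.order_id",
    "oc.ip", "oc.geoip_location", "oc.logged_in", "a.name",
]


def pair_by_position(targets, expressions):
    """Pair each target column with the expression at the same position."""
    return list(zip_longest(targets, expressions, fillvalue="<missing>"))


def base_name(expression):
    """Strip a table alias such as 'oc.' from a select expression."""
    return expression.split(".")[-1]


pairs = pair_by_position(insert_columns, select_expressions)
for target, expression in pairs:
    # Heuristic: a renamed target is expected to start with the source column name.
    flag = "" if target.startswith(base_name(expression)) else "  <-- check"
    print(f"{target:<30} <- {expression}{flag}")
```

With this excerpt, logged_in ends up paired with oc.geoip_location, and the first flagged row (ip paired with oc.order_id) points at the extra expression to remove.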
【Comments】:
I tried it and it returns the same error "ERROR: column "logged_in" is of type boolean but expression is of type character varying Hint: You will need to rewrite or cast the expression."
I also tried DECODE(oc.logged_in, true, 'true', 'false')::Bool and it returns this error "ERROR: cannot cast type text to boolean"
I am using it in SELECT ... DECODE(oc.logged_in, true, 'true', 'false') ...
Your second attempt will not work because you cannot cast a string to a boolean or a boolean to a string. You can cast a boolean to INT (it becomes 1 or 0). The code I posted works; I will update my answer with more details.