Cartesian / Cross Join in Google Sheets for comma separated values


【Title】:Cartesian / Cross Join in Google Sheets for comma separated values 【Posted】:2021-10-30 05:57:45 【Question】:

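I am building a product list for an online store. Each parent product contains comma-separated lists of its available attributes. A T-shirt, for example, would have the sizes xs,s,m,l,xl in one field and the colours b,r,g,y (blue, red, green, yellow) as a list in another field.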

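To generate the child inventory, we need a Cartesian join in Google Sheets across 6 fields. Five of them hold comma-separated lists that must be split and then combined again in the Cartesian join. Some rows have no value in some fields, so the formula has to account for that.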
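To pin down the intended result, the per-row logic can be sketched outside of Sheets. The Python below is only an illustration of the semantics, not a Sheets solution. The function names, the sample values, the `-` separator, and the choice to return the bare parent when every attribute field is empty are all assumptions; the linked sample sheet holds the actual expected output.

```python
from itertools import product


def split_field(cell: str) -> list[str]:
    """Split a comma-separated cell; an empty cell yields a single empty placeholder."""
    values = [v.strip() for v in cell.split(",") if v.strip()]
    return values or [""]


def child_skus(parent: str, attribute_cells: list[str], sep: str = "-") -> list[str]:
    """Cross join the attribute lists of ONE row and prefix each combination with the parent.

    Empty fields contribute a placeholder, so they do not wipe out the product;
    the placeholder is then dropped when the parts are joined.
    """
    options = [split_field(cell) for cell in attribute_cells]
    skus = []
    for combo in product(*options):
        parts = [parent] + [value for value in combo if value]
        skus.append(sep.join(parts))
    return skus


# Hypothetical rows: a parent value plus five attribute fields, some of them empty.
rows = [
    ("TSHIRT", ["xs,s", "b,r", "", "", ""]),
    ("MUG", ["", "", "", "", ""]),
]
for parent, cells in rows:
    print(child_skus(parent, cells))
```

The key point is that the join happens within each row: the values of one parent are never combined with the values of another.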

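I have tried this method and variations of it, without success. I can get it to work for a single row, but as soon as I use an entire column it falls over.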

Generate all possible combinations for Columns in Google SpreadSheets

I have included a sample sheet with test values, along with an example of the expected result for each row.

https://docs.google.com/spreadsheets/d/1pi1zjJuiWRJ2iPZo2Ve4dAekyVnGq5Tc0z3rgev_Ikw/edit?usp=sharing

I would like to do this without using Google Apps Script.

【Comments】:

【Answer 1】:
=ARRAYFORMULA(A2&TRANSPOSE(REGEXREPLACE(
SPLIT(FLATTEN(
SPLIT(FLATTEN(TRANSPOSE(SPLIT(B2,","))&IF(C2="",,"-"&SPLIT(C2,","))&"♦︎"),"♦︎")
&"-"&SPLIT(TEXTJOIN(",",TRUE,D2:F2),",")&"♦︎"),"♦︎")
,"-$",)))

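【Comments】:

Thanks. Do you know how I can get it to output a column of values instead of a row? I may have been a bit ambiguous about that.

You just need to remove the first TRANSPOSE. But your data is laid out in rows while the output spills across columns, which makes it impossible to apply the formula to the other rows.

【Answer 2】: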

See:

=ARRAYFORMULA(QUERY(SUBSTITUTE(TRIM(FLATTEN(QUERY(TRANSPOSE(SUBSTITUTE(QUERY(SPLIT(QUERY(UNIQUE(FLATTEN(FLATTEN(FLATTEN(FLATTEN(FLATTEN(
 FILTER(ROW(A2:A)&"×"&A2:A, A2:A<>IF(,,))&
 SPLIT(TEXTJOIN(",", 1, IF(INDIRECT("A2:A"&COUNTA(A2:A)+ROW(A2)-1)=IF(,,),,IF(B2:B=IF(,,), "×"&ROW(B2:B)&"×♦", "×"&ROW(B2:B)&"×-"&SPLIT(B2:B, ",")))), ","))&
 SPLIT(TEXTJOIN(",", 1, IF(INDIRECT("A2:A"&COUNTA(A2:A)+ROW(A2)-1)=IF(,,),,IF(C2:C=IF(,,), "×"&ROW(C2:C)&"×♦", "×"&ROW(C2:C)&"×-"&SPLIT(C2:C, ",")))), ","))&
 SPLIT(TEXTJOIN(",", 1, IF(INDIRECT("A2:A"&COUNTA(A2:A)+ROW(A2)-1)=IF(,,),,IF(D2:D=IF(,,), "×"&ROW(D2:D)&"×♦", "×"&ROW(D2:D)&"×-"&SPLIT(D2:D, ",")))), ","))&
 SPLIT(TEXTJOIN(",", 1, IF(INDIRECT("A2:A"&COUNTA(A2:A)+ROW(A2)-1)=IF(,,),,IF(E2:E=IF(,,), "×"&ROW(E2:E)&"×♦", "×"&ROW(E2:E)&"×-"&SPLIT(E2:E, ",")))), ","))&
 SPLIT(TEXTJOIN(",", 1, IF(INDIRECT("A2:A"&COUNTA(A2:A)+ROW(A2)-1)=IF(,,),,IF(F2:F=IF(,,), "×"&ROW(F2:F)&"×♦", "×"&ROW(F2:F)&"×-"&SPLIT(F2:F, ",")))), ","))), 
 "where Col1 contains '♦'"), "×"), 
 "select Col2,Col4,Col6,Col8,Col10,Col12  
  where Col1=Col3 and Col3=Col5 and Col5=Col7 and Col7=Col9 and Col9=Col11"), "♦", IF(,,))),,999^99))), " ", IF(,,)), 
 "where not Col1 contains '--'"))

demo spreadsheet
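As far as the formula above can be read, each `SPLIT(TEXTJOIN(...))` collects the tagged values of a column from every row, the nested `FLATTEN` calls cross all of those lists with each other, and only afterwards does the `QUERY` keep the combinations whose row tags match. Under that reading, the intermediate array grows much faster than the final result. The sketch below compares the two sizes; it is a rough model only, the function names and the uniform sample data are hypothetical, and it ignores the padding that `SPLIT` adds for rows of unequal length.

```python
from math import prod


def per_row_total(counts: list[list[int]]) -> int:
    """Combinations needed when each row is crossed only with its own values."""
    return sum(prod(row) for row in counts)


def cross_everything_total(counts: list[list[int]]) -> int:
    """Intermediate rows when every row is crossed with the values of ALL rows
    in every column, before any filtering on matching row tags."""
    n_rows = len(counts)
    column_totals = [sum(column) for column in zip(*counts)]
    return n_rows * prod(column_totals)


def uniform(n_rows: int, per_cell: int = 2, n_cols: int = 5) -> list[list[int]]:
    """Hypothetical sheet: every attribute cell holds `per_cell` values.
    An empty cell would count as 1, since it still contributes a placeholder."""
    return [[per_cell] * n_cols for _ in range(n_rows)]


for n in (2, 9):
    counts = uniform(n)
    print(n, per_row_total(counts), cross_everything_total(counts))
```

With two values per cell, 2 rows need 64 combinations but produce 2,048 intermediate rows in this model, and 9 rows need 288 but produce about 17 million. This suggests the cross-then-filter pattern scales with the number of rows raised to the number of columns, while a per-row join scales linearly.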

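【Comments】:

What a legend. This works perfectly. Thank you very much.

We tested the solution further and found that it does not work for anything over 8 or 9 rows; we get an "array too large" error. If we remove UNIQUE, the formula generates 3000+ results from 2 rows. The final formula needs to handle 6000 rows of data, not just 8 or 9.

【Answer 3】: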

Try:

=ARRAYFORMULA(SUBSTITUTE(QUERY(FLATTEN(FLATTEN(
 QUERY(FLATTEN(IF(IFERROR(SPLIT(B2:B, ","))=IF(,,),,
 A2:A&SPLIT(B2:B, ","))), "where Col1 is not null")&
 IFERROR(SPLIT(QUERY(FLATTEN(IF(IFERROR(SPLIT(B2:B, ","))=IF(,,),,
 IF(C2:C=IF(,,), IF(E2:E=IF(,,), CHAR(32),
 SUBSTITUTE("-"&E2:E, ",", ",-")), SUBSTITUTE("-"&C2:C, ",", ",-"))))&"",
 "where Col1 is not null limit "&COLUMNS(SPLIT(TEXTJOIN(",", 1, B2:B), ","))), ",")))&
 IF(IF(,,)=IFERROR(SPLIT(FLATTEN(
 IF(IF(,,)=IFERROR(SPLIT(QUERY(FLATTEN(IF(IFERROR(SPLIT(B2:B, ","))=IF(,,),,
 IF(C2:C=IF(,,), IF(E2:E=IF(,,), CHAR(32), E2:E), C2:C)))&"",
 "where Col1 is not null limit "&COLUMNS(SPLIT(TEXTJOIN(",", 1, B2:B), ","))), ",")),, 
 QUERY(FLATTEN(IF(IFERROR(SPLIT(B2:B, ","))=IF(,,),,
 IF(D2:D=IF(,,), IF(F2:F=IF(,,), CHAR(32), F2:F), D2:D)))&"",
 "where Col1 is not null limit "&COLUMNS(SPLIT(TEXTJOIN(",", 1, B2:B), ","))))), ",")), 0/0, 
 SPLIT(FLATTEN(IF(IF(,,)=IFERROR(SPLIT(QUERY(FLATTEN(IF(IFERROR(SPLIT(B2:B, ","))=IF(,,),,
 IF(C2:C=IF(,,), IF(E2:E=IF(,,), CHAR(32), E2:E), C2:C)))&"",
 "where Col1 is not null limit "&COLUMNS(SPLIT(TEXTJOIN(",", 1, B2:B), ","))), ",")),, 
 QUERY(FLATTEN(IF(IFERROR(SPLIT(B2:B, ","))=IF(,,),,
 IF(D2:D=IF(,,), IF(F2:F=IF(,,), CHAR(32),
 SUBSTITUTE("-"&F2:F, ",", ",-")), SUBSTITUTE("-"&D2:D, ",", ",-"))))&"",
 "where Col1 is not null limit "&COLUMNS(SPLIT(TEXTJOIN(",", 1, B2:B), ","))))), ","))),
 "where Col1 contains '-'"), CHAR(32), ))

demo spreadsheet

【Comments】:
