Could not infer COUNT function
【Posted】: 2012-03-22 16:19:18
【Question】: I'm trying to write a Pig Latin script to pull the count of a dataset that I've filtered.
Here's the script so far:
/* scans by title */
scans = LOAD '/hive/scans/*' USING PigStorage(',') AS (thetime:long,product_id:long,lat:double,lon:double,user:chararray,category:chararray,title:chararray);
productscans = FILTER scans BY (title MATCHES 'proactiv');
scancount = FOREACH productscans GENERATE COUNT($0);
DUMP scancount;
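For some reason, I get this error:
Could not infer the matching function for org.apache.pig.builtin.COUNT as multiple or none of them fit. Please use an explicit cast.
What am I doing wrong here? I assume it has something to do with the type of the field I'm passing in, but I can't seem to resolve it.
TIA, Jason
【Comments】:
【Answer 1】: Is this what you're looking for (group everything into a single bag, then count the items in it):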
scans = LOAD '/hive/scans/*' USING PigStorage(',') AS (thetime:long,product_id:long,lat:double,lon:double,user:chararray,category:chararray,title:chararray);
productscans = FILTER scans BY (title MATCHES 'proactiv');
grouped = GROUP productscans ALL;
count = FOREACH grouped GENERATE COUNT(productscans);
dump count;
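For context on why this works: COUNT takes a bag, whereas $0 in productscans is the scalar field thetime (a long), which is why the original call could not be matched. A minimal sketch, reusing the aliases from the script above and assuming the same input layout, that inspects what GROUP ... ALL produces before counting (the field name n is only illustrative):

```pig
-- Sketch only: same LOAD and FILTER as in the question.
scans = LOAD '/hive/scans/*' USING PigStorage(',') AS (thetime:long,product_id:long,lat:double,lon:double,user:chararray,category:chararray,title:chararray);
productscans = FILTER scans BY (title MATCHES 'proactiv');
grouped = GROUP productscans ALL;
-- Print the schema: a 'group' field plus a bag named 'productscans'.
DESCRIBE grouped;
-- COUNT receives the bag, not a scalar column such as $0 (thetime).
scancount = FOREACH grouped GENERATE COUNT(productscans) AS n;
DUMP scancount;
```

Note that COUNT skips tuples whose first field is null; if those rows should be counted as well, COUNT_STAR may be the better fit.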
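【Comments】:
That was it (minus the "FOREACH g", which should be "FOREACH grouped") - thanks Chris!
【Answer 2】: COUNT needs a preceding GROUP ALL statement for a global count, and a GROUP BY statement for per-group counts.
You can use either of the following: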
scans = LOAD '/hive/scans/*' USING PigStorage(',') AS (thetime:long,product_id:long,lat:double,lon:double,user:chararray,category:chararray,title:chararray);
productscans = FILTER scans BY (title MATCHES 'proactiv');
grouped = GROUP productscans ALL;
count = FOREACH grouped GENERATE COUNT(productscans);
DUMP count;
or
scans = LOAD '/hive/scans/*' USING PigStorage(',') AS (thetime:long,product_id:long,lat:double,lon:double,user:chararray,category:chararray,title:chararray);
productscans = FILTER scans BY (title MATCHES 'proactiv');
grouped = GROUP productscans ALL;
count = FOREACH grouped GENERATE COUNT($1);
DUMP count;
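Both variants above show the global count. For the per-group case mentioned at the start of this answer, a rough sketch might look like the following, assuming you want one count per category; the aliases by_category and category_counts and the field name n are made up for illustration:

```pig
-- Sketch only: same LOAD and FILTER as above, then one count per category.
scans = LOAD '/hive/scans/*' USING PigStorage(',') AS (thetime:long,product_id:long,lat:double,lon:double,user:chararray,category:chararray,title:chararray);
productscans = FILTER scans BY (title MATCHES 'proactiv');
-- GROUP BY yields one tuple per category: (group, bag of matching rows).
by_category = GROUP productscans BY category;
-- COUNT is applied to each group's bag.
category_counts = FOREACH by_category GENERATE group AS category, COUNT(productscans) AS n;
DUMP category_counts;
```

Here the bag inside each grouped tuple keeps the name of the relation that was grouped (productscans), which is why COUNT(productscans) and COUNT($1) are interchangeable.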
【Comments】:
【Answer 3】: Maybe
/* scans by title */
scans = LOAD '/hive/scans/*' USING PigStorage(',') AS (thetime:long,product_id:long,lat:double,lon:double,user:chararray,category:chararray,title:chararray);
productscans = FILTER scans BY (title MATCHES 'proactiv');
scancount = FOREACH productscans GENERATE COUNT(productscans);
DUMP scancount;
【Comments】:
Thanks Jake - unfortunately, no luck. That gives me: Invalid scalar projection: productscans : A column needs to be projected from a relation for it to be used as a scalar