简单的 TableAPI SQL 查询不适用于 Flink 1.10 和 Blink

Posted

技术标签:

【中文标题】简单的 TableAPI SQL 查询不适用于 Flink 1.10 和 Blink【英文标题】:Simple TableAPI SQL query doesn't work on Flink 1.10 and Blink 【发布时间】:2020-06-01 15:41:32 【问题描述】:

我想使用 TableAPI 定义 Kafka 连接器并在此类描述的表上运行 SQL(由 Kafka 支持)。不幸的是,Rowtime 的定义似乎没有按预期工作。

这是一个可重现的例子:

object DefineSource extends App 

  import org.apache.flink.streaming.api.scala._
  import org.apache.flink.table.api.scala._

  val env = StreamExecutionEnvironment.getExecutionEnvironment
  env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

  val config = EnvironmentSettings.newInstance().inStreamingMode().useBlinkPlanner().build()
  val tEnv = StreamTableEnvironment.create(env, config)

  val rowtime = new Rowtime().watermarksPeriodicBounded(5000)
  val schema = new Schema()
    .field("k", "string")
    .field("ts", "timestamp(3)").rowtime(rowtime)

  tEnv.connect(new Kafka()
    .topic("test")
    .version("universal"))
    .withSchema(schema)
    .withFormat(new Csv())
    .createTemporaryTable("InputTable")

  val output = tEnv.sqlQuery(
    """SELECT k, COUNT(*)
      |  FROM InputTable
      | GROUP BY k, TUMBLE(ts, INTERVAL '15' MINUTE)
      |""".stripMargin
  )

  tEnv.toAppendStream[(String, Long)](output).print()

  env.execute()

产生

org.apache.flink.table.api.TableException: Window aggregate can only be defined over a time attribute column, but TIMESTAMP(3) encountered.
    at org.apache.flink.table.planner.plan.rules.logical.StreamLogicalWindowAggregateRule.getInAggregateGroupExpression(StreamLogicalWindowAggregateRule.scala:51)
    at org.apache.flink.table.planner.plan.rules.logical.LogicalWindowAggregateRuleBase.onMatch(LogicalWindowAggregateRuleBase.scala:79)
    at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:319)
    at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:560)
    at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:419)
    at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:256)
    at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
    at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:215)
    at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:202)
    at org.apache.flink.table.planner.plan.optimize.program.FlinkHepProgram.optimize(FlinkHepProgram.scala:69)
    at org.apache.flink.table.planner.plan.optimize.program.FlinkHepRuleSetProgram.optimize(FlinkHepRuleSetProgram.scala:87)
    at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.$anonfun$optimize$1(FlinkChainedProgram.scala:62)
    at scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:160)
    at scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:160)
    at scala.collection.Iterator.foreach(Iterator.scala:941)
    at scala.collection.Iterator.foreach$(Iterator.scala:941)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
    at scala.collection.IterableLike.foreach(IterableLike.scala:74)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
    at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:160)
    at scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:158)
    at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:108)
    at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:58)
    at org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.optimizeTree(StreamCommonSubGraphBasedOptimizer.scala:170)
    at org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.doOptimize(StreamCommonSubGraphBasedOptimizer.scala:94)
    at org.apache.flink.table.planner.plan.optimize.CommonSubGraphBasedOptimizer.optimize(CommonSubGraphBasedOptimizer.scala:77)
    at org.apache.flink.table.planner.delegation.PlannerBase.optimize(PlannerBase.scala:248)
    at org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:151)
    at org.apache.flink.table.api.scala.internal.StreamTableEnvironmentImpl.toDataStream(StreamTableEnvironmentImpl.scala:210)
    at org.apache.flink.table.api.scala.internal.StreamTableEnvironmentImpl.toAppendStream(StreamTableEnvironmentImpl.scala:107)

我在 Flink 1.10.0.

【问题讨论】:

【参考方案1】:

这是一个错误并已修复 1.10.0+ https://issues.apache.org/jira/browse/FLINK-16160

【讨论】:

【参考方案2】:

不幸的是,这是 1.10 中的一个错误,正如@lijiayan 所说,应该在 1.11+ 中修复

作为 1.10 中的一种解决方法,您可以改用 DDL:

tEnv.sqlUpdate(
"CREATE TABLE InputTable (\n" +
"    k STRING,\n" +
"    ts TIMESTAMP(3),\n" +
"    WATERMARK FOR ts AS ts - INTERVAL '5' SECOND\n" +
") WITH (\n" + 
" 'connector.type' = 'kafka',\n" + 
" 'connector.version' = 'universal',\n" +
" 'connector.topic' = 'test',\n" +
" 'connector.properties.zookeeper.connect' = 'localhost:2181',\n" +
" 'connector.properties.bootstrap.servers' = 'localhost:9092',\n" +
" 'format.type' = 'csv'\n" +
")"
);

【讨论】:

以上是关于简单的 TableAPI SQL 查询不适用于 Flink 1.10 和 Blink的主要内容,如果未能解决你的问题,请参考以下文章

Flink Table API和SQL的简单实例

Flink SQL Query 语法(一)

flink笔记11 Flink Table API和SQL的简单实例

18-flink-1.10.1-Table API & Flink SQL

18-flink-1.10.1-Table API & Flink SQL

18-flink-1.10.1-Table API & Flink SQL