Apache Calcite进行SQL解析

Posted 追求无悔

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Apache Calcite进行SQL解析相关的知识,希望对你有一定的参考价值。

背景

当一个项目分了很多模块,很多个服务的时候,一些公共的配置就需要统一管理了,于是就有了元数据驱动!

简介

什么是Calcite?
是一款开源SQL解析工具, 可以将各种SQL语句解析成抽象语法树AST(Abstract Syntax Tree), 之后通过操作AST就可以把SQL中所要表达的算法与关系体现在具体代码之中。
Calcite能做啥?

  1. SQL 解析
  2. SQL 校验
  3. 查询优化
  4. SQL 生成器
  5. 数据连接

实例

今天主要是贴出一个java代码实例,实现了:解析SQL语句中的表名
上代码:
SQL语句转化:

public static SqlNode parseStatement(String sql) 
        SqlParser parser = SqlParser.create(sql, config.getParserConfig());
        try 
            return parser.parseQuery();
         catch (Exception e) 
            e.printStackTrace();
            throw new UnsupportedOperationException("operation not allowed");
        
    

解析Select中的表名:

private static Set<String> extractSourceTableInSelectSql(SqlNode sqlNode, boolean fromOrJoin) 
        if (sqlNode == null) 
            return new HashSet<>();
        
        final SqlKind sqlKind = sqlNode.getKind();
        if (SqlKind.SELECT.equals(sqlKind)) 
            SqlSelect selectNode = (SqlSelect) sqlNode;
            Set<String> selectList = new HashSet<>(extractSourceTableInSelectSql(selectNode.getFrom(), true));
            selectNode.getSelectList().getList().stream().filter(node -> node instanceof SqlCall)
                .forEach(node -> selectList.addAll(extractSourceTableInSelectSql(node, false)));
            selectList.addAll(extractSourceTableInSelectSql(selectNode.getWhere(), false));
            selectList.addAll(extractSourceTableInSelectSql(selectNode.getHaving(), false));
            return selectList;
        
        if (SqlKind.JOIN.equals(sqlKind)) 
            SqlJoin sqlJoin = (SqlJoin) sqlNode;
            Set<String> joinList = new HashSet<>();
            joinList.addAll(extractSourceTableInSelectSql(sqlJoin.getLeft(), true));
            joinList.addAll(extractSourceTableInSelectSql(sqlJoin.getRight(), true));
            return joinList;
        

        if (SqlKind.AS.equals(sqlKind)) 
            SqlCall sqlCall = (SqlCall) sqlNode;
            return extractSourceTableInSelectSql(sqlCall.getOperandList().get(0), fromOrJoin);
        

        if (SqlKind.IDENTIFIER.equals(sqlKind)) 
            Set<String> identifierList = new HashSet<>();
            if (fromOrJoin) 
                SqlIdentifier sqlIdentifier = (SqlIdentifier) sqlNode;
                identifierList.add(sqlIdentifier.toString());
            
            return identifierList;
        

        Set<String> defaultList = new HashSet<>();
        if (sqlNode instanceof SqlCall) 
            SqlCall call = (SqlCall) sqlNode;
            call.getOperandList()
                .forEach(node -> defaultList.addAll(extractSourceTableInSelectSql(node, false)));
        
        return defaultList;
    

解析Insert语句中的表名:

private static Set<String> extractSourceTableInInsertSql(SqlNode sqlNode, boolean fromOrJoin) 
        SqlInsert sqlInsert = (SqlInsert) sqlNode;
        Set<String> insertList = new HashSet<>(extractSourceTableInSelectSql(sqlInsert.getSource(), false));
        final SqlNode targetTable = sqlInsert.getTargetTable();
        if (targetTable instanceof SqlIdentifier) 
            insertList.add(((SqlIdentifier) targetTable).toString());
        
        return insertList;
    

执行效果

private static final String sql0 = "SELECT MIN(relation_id) FROM tableA JOIN TableB  GROUP BY account_instance_id, follow_account_instance_id HAVING COUNT(*)>1";

private static final String sql1 = "SELECT * FROM blog_user_relation a WHERE (a.account_instance_id,a.follow_account_instance_id) IN (SELECT account_instance_id,follow_account_instance_id FROM Blogs_info GROUP BY account_instance_id, follow_account_instance_id HAVING COUNT(*) > 1)";
private static final String sql2 = "select name from (select * from student)";
private static final String sql3 = "SELECT * FROM Student LEFT JOIN Grade ON Student.sID = Grade.gID\\n" +
            "UNION\\n" +
            "SELECT * FROM Student RIGHT JOIN Grade ON Student.sID = Grade.gID";
private static final String sql4 = "SELECT *\\n" +
            "FROM teacher\\n" +
            "WHERE birth = (SELECT MIN(birth)\\n" +
            "               FROM employee)";
private static final String sql5 = "SELECT sName\\n" +
            "FROM Student\\n" +
            "WHERE 450 NOT IN (SELECT courseID\\n" +
            "                    FROM Course\\n" +
            "                    WHERE sID = Student.sID)";

final SqlNode sqlNode0 = parseStatement(sql0);
System.out.println("sqlNode0: " + extractSourceTableInSelectSql(sqlNode0, false));

结果为:

结语

第一次使用这个组件,感觉还是很强大的,当前还没有深入了解,后续有机会继续分享。。。

以上是关于Apache Calcite进行SQL解析的主要内容,如果未能解决你的问题,请参考以下文章

解析和校验Flink SQL语句

org.apache.calcite.sql.parser.impl.ParseException: Encountered "create"

Apache Calcite 功能简析及在 Flink 的应用

Apache Calcite 学习

Flink SQL 之 Calcite Volcano优化器(源码解析)

95-910-330-源码-FlinkSQL-Calcite-Flink结合Calcite