Apache Calcite进行SQL解析

Posted 2022-01-22 追求无悔

tags:

篇首语：本文由小常识网(cha138.com)小编为大家整理，主要介绍了Apache Calcite进行SQL解析相关的知识，希望对你有一定的参考价值。

背景

当一个项目分了很多模块，很多个服务的时候，一些公共的配置就需要统一管理了，于是就有了元数据驱动！

简介

什么是Calcite？
是一款开源SQL解析工具, 可以将各种SQL语句解析成抽象语法树AST(Abstract Syntax Tree), 之后通过操作AST就可以把SQL中所要表达的算法与关系体现在具体代码之中。
Calcite能做啥？

SQL 解析
SQL 校验
查询优化
SQL 生成器
数据连接

实例

今天主要是贴出一个java代码实例，实现了：解析SQL语句中的表名
上代码：
SQL语句转化：

public static SqlNode parseStatement(String sql) 
        SqlParser parser = SqlParser.create(sql, config.getParserConfig());
        try 
            return parser.parseQuery();
         catch (Exception e) 
            e.printStackTrace();
            throw new UnsupportedOperationException("operation not allowed");

解析Select中的表名：

private static Set<String> extractSourceTableInSelectSql(SqlNode sqlNode, boolean fromOrJoin) 
        if (sqlNode == null) 
            return new HashSet<>();
        
        final SqlKind sqlKind = sqlNode.getKind();
        if (SqlKind.SELECT.equals(sqlKind)) 
            SqlSelect selectNode = (SqlSelect) sqlNode;
            Set<String> selectList = new HashSet<>(extractSourceTableInSelectSql(selectNode.getFrom(), true));
            selectNode.getSelectList().getList().stream().filter(node -> node instanceof SqlCall)
                .forEach(node -> selectList.addAll(extractSourceTableInSelectSql(node, false)));
            selectList.addAll(extractSourceTableInSelectSql(selectNode.getWhere(), false));
            selectList.addAll(extractSourceTableInSelectSql(selectNode.getHaving(), false));
            return selectList;
        
        if (SqlKind.JOIN.equals(sqlKind)) 
            SqlJoin sqlJoin = (SqlJoin) sqlNode;
            Set<String> joinList = new HashSet<>();
            joinList.addAll(extractSourceTableInSelectSql(sqlJoin.getLeft(), true));
            joinList.addAll(extractSourceTableInSelectSql(sqlJoin.getRight(), true));
            return joinList;
        

        if (SqlKind.AS.equals(sqlKind)) 
            SqlCall sqlCall = (SqlCall) sqlNode;
            return extractSourceTableInSelectSql(sqlCall.getOperandList().get(0), fromOrJoin);
        

        if (SqlKind.IDENTIFIER.equals(sqlKind)) 
            Set<String> identifierList = new HashSet<>();
            if (fromOrJoin) 
                SqlIdentifier sqlIdentifier = (SqlIdentifier) sqlNode;
                identifierList.add(sqlIdentifier.toString());
            
            return identifierList;
        

        Set<String> defaultList = new HashSet<>();
        if (sqlNode instanceof SqlCall) 
            SqlCall call = (SqlCall) sqlNode;
            call.getOperandList()
                .forEach(node -> defaultList.addAll(extractSourceTableInSelectSql(node, false)));
        
        return defaultList;

解析Insert语句中的表名：

private static Set<String> extractSourceTableInInsertSql(SqlNode sqlNode, boolean fromOrJoin) 
        SqlInsert sqlInsert = (SqlInsert) sqlNode;
        Set<String> insertList = new HashSet<>(extractSourceTableInSelectSql(sqlInsert.getSource(), false));
        final SqlNode targetTable = sqlInsert.getTargetTable();
        if (targetTable instanceof SqlIdentifier) 
            insertList.add(((SqlIdentifier) targetTable).toString());
        
        return insertList;

执行效果

private static final String sql0 = "SELECT MIN(relation_id) FROM tableA JOIN TableB  GROUP BY account_instance_id, follow_account_instance_id HAVING COUNT(*)>1";

private static final String sql1 = "SELECT * FROM blog_user_relation a WHERE (a.account_instance_id,a.follow_account_instance_id) IN (SELECT account_instance_id,follow_account_instance_id FROM Blogs_info GROUP BY account_instance_id, follow_account_instance_id HAVING COUNT(*) > 1)";
private static final String sql2 = "select name from (select * from student)";
private static final String sql3 = "SELECT * FROM Student LEFT JOIN Grade ON Student.sID = Grade.gID\\n" +
            "UNION\\n" +
            "SELECT * FROM Student RIGHT JOIN Grade ON Student.sID = Grade.gID";
private static final String sql4 = "SELECT *\\n" +
            "FROM teacher\\n" +
            "WHERE birth = (SELECT MIN(birth)\\n" +
            "               FROM employee)";
private static final String sql5 = "SELECT sName\\n" +
            "FROM Student\\n" +
            "WHERE 450 NOT IN (SELECT courseID\\n" +
            "                    FROM Course\\n" +
            "                    WHERE sID = Student.sID)";

final SqlNode sqlNode0 = parseStatement(sql0);
System.out.println("sqlNode0: " + extractSourceTableInSelectSql(sqlNode0, false));

结果为：

结语

第一次使用这个组件，感觉还是很强大的，当前还没有深入了解，后续有机会继续分享。。。

以上是关于Apache Calcite进行SQL解析的主要内容，如果未能解决你的问题，请参考以下文章