Notes on ANTLR 4

Author: 木戎 | Published: 2019-04-22 18:22

    overview

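    ANTLR 4 is a powerful parser generator that covers both lexical and syntactic analysis. It can be used to read, process, execute, or translate structured text. From a grammar, ANTLR generates a parser that can build and walk parse trees.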

    additional

    1. The original SqlBase grammar only accepts uppercase input; it now accepts both uppercase and lowercase letters
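
    One common way to get this behaviour without rewriting every keyword in the grammar is to upper-case characters only when the lexer looks ahead, while leaving the underlying text untouched. The sketch below illustrates the idea in plain Java with no ANTLR dependency. `UpperCaseLookahead` and all of its methods are hypothetical names; `la` only mirrors the role that `LA` plays on an ANTLR `CharStream`. The post does not say which approach was actually used, so treat this as an assumption.

```java
// Hypothetical sketch: upper-case only what the lexer "sees", keep the original text.
final class UpperCaseLookahead {
    static final int EOF = -1;

    private final String text;
    private int position = 0;

    UpperCaseLookahead(String text) {
        this.text = text;
    }

    // Look ahead i characters (1-based) without consuming; the result is upper-cased.
    int la(int i) {
        int index = position + i - 1;
        if (i < 1 || index >= text.length()) {
            return EOF;
        }
        return Character.toUpperCase(text.charAt(index));
    }

    // Advance by one character, stopping at the end of the input.
    void consume() {
        if (position < text.length()) {
            position++;
        }
    }

    // True if the upcoming characters spell the given upper-case keyword.
    boolean matchesKeyword(String keyword) {
        for (int i = 0; i < keyword.length(); i++) {
            if (la(i + 1) != keyword.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    // The original text comes back unchanged, so identifiers and literals keep their case.
    String originalText(int start, int end) {
        return text.substring(start, end);
    }
}
```

    With this shape, a keyword written as `select` is recognised as `SELECT`, while a column written as `Name` is still reported as `Name`.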

    stage

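    1. Take the raw SQL and convert it into a stream the lexer can read with CharStreams.fromString(sql), which returns a CharStream
    2. Construct the SqlBaseLexer lexer
    3. Construct the token stream
    4. Create the final SqlBaseParser object
    5. Walk the parse tree returned by parser.statement() with a ParseTreeWalker and a listener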
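    import org.antlr.v4.runtime.CharStreams;
    import org.antlr.v4.runtime.CommonTokenStream;
    import org.antlr.v4.runtime.tree.ParseTreeWalker;
    // Stages 1-2: wrap the raw SQL in a CharStream and hand it to the generated lexer
    SqlBaseLexer lexer = new SqlBaseLexer(CharStreams.fromString(sql));
    // Stages 3-4: buffer the tokens and build the generated parser
    CommonTokenStream tokenStream = new CommonTokenStream(lexer);
    SqlBaseParser parser = new SqlBaseParser(tokenStream);
    // Walk the parse tree from the statement rule, firing the listener callbacks
    ParseTreeWalker walker = new ParseTreeWalker();
    MySqlBaseBaseListener mySqlBaseBaseListener = new MySqlBaseBaseListener();
    walker.walk(mySqlBaseBaseListener, parser.statement());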
    

    code examples

    create table

    create table table1 (
    gender string comment 'gender',
    name string comment 'name',
    age int comment 'age',
    income double comment 'income'
    ) comment 'user info'
    
    [Figure: create table.png]

    For statements other than insert and select, the core rule is:

    statement
        : query                                                            #statementDefault
        | USE db=identifier                                                #use
        | CREATE DATABASE (IF NOT EXISTS)? identifier
            (COMMENT comment=STRING)? locationSpec?
            (WITH DBPROPERTIES tablePropertyList)?                         #createDatabase
        | ALTER DATABASE identifier SET DBPROPERTIES tablePropertyList     #setDatabaseProperties
        | DROP DATABASE (IF EXISTS)? identifier (RESTRICT | CASCADE)?      #dropDatabase
        | createTableHeader ('(' colTypeList ')')? tableProvider
            ((OPTIONS options=tablePropertyList) |
            (PARTITIONED BY partitionColumnNames=identifierList) |
            bucketSpec |
            locationSpec |
            (COMMENT comment=STRING) |
            (TBLPROPERTIES tableProps=tablePropertyList))*
            (AS? query)?                                                   #createTable
        | createTableHeader ('(' columns=colTypeList ')')?
    
    ......
    

    sub select

    select 
    name,
    age,
    sum(income) 
    from 
    (
    select 
    gender,
    name,
    age,
    income 
    from 
    table1 
    where 
    name = 'allen'
    ) table2
    group by 
    name,age
    
    [Figure: select tree.png]

    For queries, the core rule is:

    querySpecification
        : (((SELECT kind=TRANSFORM '(' namedExpressionSeq ')'
            | kind=MAP namedExpressionSeq
            | kind=REDUCE namedExpressionSeq))
           inRowFormat=rowFormat?
           (RECORDWRITER recordWriter=STRING)?
           USING script=STRING
           (AS (identifierSeq | colTypeList | ('(' (identifierSeq | colTypeList) ')')))?
           outRowFormat=rowFormat?
           (RECORDREADER recordReader=STRING)?
           fromClause?
           (WHERE where=booleanExpression)?)
        | ((kind=SELECT (hints+=hint)* setQuantifier? namedExpressionSeq fromClause?
           | fromClause (kind=SELECT setQuantifier? namedExpressionSeq)?)
           lateralView*
           (WHERE where=booleanExpression)?
           aggregation?
           (HAVING having=booleanExpression)?
           windows?)
        ;
    

    It also depends on further rules, for example those that handle sorting and aggregation.

    some important rule instructions

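    Parsing starts from the SQL entry rule in SqlBase.g4 and recursively traverses statement and every node beneath it. During this traversal, a TreeNode is constructed whenever a leaf node is reached.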

    When the singleTableIdentifier rule (the identifier of a single table) is matched, tableIdentifier is matched in turn:

    singleTableIdentifier
     : tableIdentifier EOF
     ;
    

    The traversal then recurses into tableIdentifier, whose definition is shown below. When tableIdentifier is matched, a TableIdentifier object is created directly; this object is a kind of TreeNode.

    tableIdentifier
        : (db=identifier '.')? table=identifier
        ;
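
    To make the mapping from rule to object concrete, here is a minimal sketch in plain Java with no ANTLR dependency. `TableIdentifierSketch`, `fromParts`, and `qualifiedName` are hypothetical names, not the author's classes. In a real visitor or listener the two arguments would presumably come from the rule's `db` and `table` labels.

```java
import java.util.Optional;

// Hypothetical stand-in for the object built when the tableIdentifier rule matches.
final class TableIdentifierSketch {
    final Optional<String> database; // the optional `db=identifier` part
    final String table;              // the mandatory `table=identifier` part

    private TableIdentifierSketch(Optional<String> database, String table) {
        this.database = database;
        this.table = table;
    }

    // db may be null, mirroring the optional group `(db=identifier '.')?`.
    static TableIdentifierSketch fromParts(String db, String table) {
        if (table == null || table.isEmpty()) {
            throw new IllegalArgumentException("table name is required");
        }
        return new TableIdentifierSketch(Optional.ofNullable(db), table);
    }

    // Renders the identifier back as db.table, or just table when no database was given.
    String qualifiedName() {
        return database.map(db -> db + "." + table).orElse(table);
    }
}
```

    For the sub select example above, `table1` would map to an identifier with no database part, while an input such as `db1.table1` would fill both fields.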
    

    antlr additional rule example

    singleStatement
        : statement EOF
        ;
    

    By default only a single SQL statement is parsed; the grammar can be extended as follows:

    multiStatement
        // one or more statements separated by SQL_SPLIT, optional trailing separator, then EOF
        : statement (SQL_SPLIT statement)* SQL_SPLIT? EOF
        ;
    
    SQL_SPLIT
        : ';'+ | ([\r\n]* ';'+ [\r\n]*)+
        ;
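
    For comparison, a similar effect can be approximated outside the grammar by splitting the script first and then parsing each piece with singleStatement. The sketch below is plain Java with no ANTLR dependency. `SqlScriptSplitter` is a hypothetical helper, and it assumes that only single-quoted string literals can contain a `;`. It does not handle comments, backslash escapes, or other quoting styles.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: split a SQL script on ';' outside single-quoted strings.
final class SqlScriptSplitter {
    static List<String> split(String script) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inString = false;
        for (int i = 0; i < script.length(); i++) {
            char c = script.charAt(i);
            if (c == '\'') {
                // A doubled '' toggles twice, so the quoted state stays correct.
                inString = !inString;
            }
            if (c == ';' && !inString) {
                addIfNotBlank(statements, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // The last statement may lack a trailing ';'.
        addIfNotBlank(statements, current);
        return statements;
    }

    // Repeated separators produce empty pieces, which are dropped here.
    private static void addIfNotBlank(List<String> out, StringBuilder sb) {
        String statement = sb.toString().trim();
        if (!statement.isEmpty()) {
            out.add(statement);
        }
    }
}
```

    Handling the separator in the grammar keeps everything in one parse tree, whereas splitting first keeps the grammar unchanged but duplicates some lexical knowledge in the splitter.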
    

    other

    How can column-level lineage be implemented?

    Article link: https://www.haomeiwen.com/subject/kbhugqtx.html