从零写一个编译器(十三):代码生成之遍历AST
Posted secoding
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了从零写一个编译器(十三):代码生成之遍历AST相关的知识,希望对你有一定的参考价值。
项目的完整代码在 C2j-Compiler
前言
在上一篇完成对JVM指令的生成,下面就可以真正进入代码生成部分了。通常现代编译器都是先把生成IR,再经过代码优化等等,最后才编译成目标平台代码。但是时间水平有限,我们没有IR也没有代码优化,就直接利用AST生成Java字节码
入口
进行代码生成的入口在CodeGen,和之前解释器一样:先获取main函数的头节点,从这个节点开始,先进入函数定义,再进入代码块
函数定义节点
在进入函数定义节点的时候,就要生成一个函数定义对应的Java字节码,即一个静态方法(因为我们对整个C语言文件生成为一个类,main方法为public static main,其它的则是对应的静态方法,结构体则是另外的类)
- 对于函数定义先从节点中拿到对应的函数命和参数
- emitArgs是用来处理参数的,根据参数生成相应的Java字节码
- 如果这个函数是main的话已经是交由前面处理了,逻辑差不多(具体在start.Start中)
case SyntaxProductionInit.NewName_LP_RP_TO_FunctDecl:
root.reverseChildren();
AstNode n = root.getChildren().get(0);
String name = (String) n.getAttribute(NodeKey.TEXT);
symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL);
generator.setCurrentFuncName(name);
if (name != null && !name.equals("main"))
String declaration = name + emitArgs(symbol);
generator.emitDirective(Directive.METHOD_PUBBLIC_STATIC, declaration);
generator.setNameAndDeclaration(name, declaration);
copyChild(root, root.getChildren().get(0));
break;
case SyntaxProductionInit.NewName_LP_VarList_RP_TO_FunctDecl:
n = root.getChildren().get(0);
name = (String) n.getAttribute(NodeKey.TEXT);
symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL);
generator.setCurrentFuncName(name);
if (name != null && !name.equals("main"))
String declaration = name + emitArgs(symbol);
generator.emitDirective(Directive.METHOD_PUBBLIC_STATIC, declaration);
generator.setNameAndDeclaration(name, declaration);
Symbol args = symbol.getArgList();
if (args == null || argsList == null || argsList.isEmpty())
System.err.println("generate function with arg list but arg list is null");
System.exit(1);
break;
创建结构体和数组
数组
创建结构体和数组的节点在DefGenerate里,可以看到在这里只处理了数组和普通变量,有关结构体的处理是在对结构体第一次使用的时候。顺便提一下代码生成对于赋初值操作是没有进行处理的。
- 如果是个数组,酒直接调用ProgramGenerator直接生成创建数组的指令
- 如果是个普通变量,就直接找到它并且赋值为0(这里变量在队列里的位置是根据符号表来计算的,具体可以看上一篇的getLocalVariableIndex方法)
public class DefGenerate extends BaseGenerate
@Override
public Object generate(AstNode root)
int production = (int) root.getAttribute(NodeKey.PRODUCTION);
ProgramGenerator generator = ProgramGenerator.getInstance();
Symbol symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL);
switch (production)
case SyntaxProductionInit.Specifiers_DeclList_Semi_TO_Def:
Declarator declarator = symbol.getDeclarator(Declarator.ARRAY);
if (declarator != null)
if (symbol.getSpecifierByType(Specifier.STRUCTURE) == null)
generator.createArray(symbol);
else
int i = generator.getLocalVariableIndex(symbol);
generator.emit(Instruction.SIPUSH, "" + 0);
generator.emit(Instruction.ISTORE, "" + i);
break;
default:
break;
return root;
结构体
处理结构体定义的代码在UnaryNodeGenerate,也就是只有在使用到结构体定义时才会进行定义
- 先拿到当前UNARY的符号,如果instanceof ArrayValueSetter就说明是一个结构体数组,就进入getStructSymbolFromStructArray方法创建一个结构体数组,并返回当前下标的结构体对象
- 设置当前结构体的作用域范围
- 对结构体作为类进行定义
- 然后对读取结构体的域
- 其实可以忽略指针部分,因为代码生成并没有对指针进行模拟
case SyntaxProductionInit.Unary_StructOP_Name_TO_Unary:
child = root.getChildren().get(0);
String fieldName = (String) root.getAttribute(NodeKey.TEXT);
Object object = child.getAttribute(NodeKey.SYMBOL);
boolean isStructArray = false;
if (object instanceof ArrayValueSetter)
symbol = getStructSymbolFromStructArray(object);
symbol.addValueSetter(object);
isStructArray = true;
else
symbol = (Symbol) child.getAttribute(NodeKey.SYMBOL);
if (isStructArray)
ArrayValueSetter vs = (ArrayValueSetter) object;
Symbol structArray = vs.getSymbol();
structArray.addScope(ProgramGenerator.getInstance().getCurrentFuncName());
else
symbol.addScope(ProgramGenerator.getInstance().getCurrentFuncName());
ProgramGenerator.getInstance().putStructToClassDeclaration(symbol);
if (isSymbolStructPointer(symbol))
copyBetweenStructAndMem(symbol, false);
Symbol args = symbol.getArgList();
while (args != null)
if (args.getName().equals(fieldName))
args.setStructParent(symbol);
break;
args = args.getNextSymbol();
if (args == null)
System.err.println("access a filed not in struct object!");
System.exit(1);
if (args.getValue() != null)
ProgramGenerator.getInstance().readValueFromStructMember(symbol, args);
root.setAttribute(NodeKey.SYMBOL, args);
root.setAttribute(NodeKey.VALUE, args.getValue());
if (isSymbolStructPointer(symbol))
checkValidPointer(symbol);
structObjSymbol = symbol;
monitorSymbol = args;
GenerateBrocasterImpl.getInstance().registerReceiverForAfterExe(this);
else
structObjSymbol = null;
break;
一元操作节点
这个节点和在解释器的有很多相同,除了有对结构体的操作,其它的也是有非常重要的作用
- 像数字、字符串或者是变量和之前的操作都是把信息传递到父节点,交由父节点处理
case SyntaxProductionInit.Number_TO_Unary:
text = (String) root.getAttribute(NodeKey.TEXT);
boolean isFloat = text.indexOf('.') != -1;
if (isFloat)
value = Float.valueOf(text);
root.setAttribute(NodeKey.VALUE, value);
else
value = Integer.valueOf(text);
root.setAttribute(NodeKey.VALUE, value);
break;
case SyntaxProductionInit.Name_TO_Unary:
symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL);
if (symbol != null)
root.setAttribute(NodeKey.VALUE, symbol.getValue());
root.setAttribute(NodeKey.TEXT, symbol.getName());
break;
case SyntaxProductionInit.String_TO_Unary:
text = (String) root.getAttribute(NodeKey.TEXT);
root.setAttribute(NodeKey.VALUE, text);
break;
case SyntaxProductionInit.Unary_LB_Expr_RB_TO_Unary:
child = root.getChildren().get(0);
symbol = (Symbol) child.getAttribute(NodeKey.SYMBOL);
child = root.getChildren().get(1);
int index = 0;
if (child.getAttribute(NodeKey.VALUE) != null)
index = (Integer) child.getAttribute(NodeKey.VALUE);
Object idxObj = child.getAttribute(NodeKey.SYMBOL);
try
Declarator declarator = symbol.getDeclarator(Declarator.ARRAY);
if (declarator != null)
Object val = declarator.getElement((int) index);
root.setAttribute(NodeKey.VALUE, val);
ArrayValueSetter setter;
if (idxObj == null)
setter = new ArrayValueSetter(symbol, index);
else
setter = new ArrayValueSetter(symbol, idxObj);
root.setAttribute(NodeKey.SYMBOL, setter);
root.setAttribute(NodeKey.TEXT, symbol.getName());
Declarator pointer = symbol.getDeclarator(Declarator.POINTER);
if (pointer != null)
setPointerValue(root, symbol, index);
PointerValueSetter pv = new PointerValueSetter(symbol, index);
root.setAttribute(NodeKey.SYMBOL, pv);
root.setAttribute(NodeKey.TEXT, symbol.getName());
catch (Exception e)
e.printStackTrace();
System.exit(1);
break;
赋值操作
- 如果当前是一个数组,先拿到它的符号和下标
- 如果不是结构体数组,那么拿到下标直接用readArrayElement生成读取数组元素的指令
- 如果是一个符号则用getLocalVariableIndex读取这个符号的值
- 如果是一个常数,则直接生成IPUSH指令
- 最后进行赋值操作,如果不是对结构体的域进行赋值就直接用getLocalVariableIndex拿到队列位置然后生成ISTORE
- 如果是对结构体数组的元素的域的赋值,就调用assignValueToStructMemberFromArray生成代码,如果只是结构体就直接调用assignValueToStructMember生成代码
ProgramGenerator generator = ProgramGenerator.getInstance();
if (BaseGenerate.resultOnStack)
this.value = obj;
BaseGenerate.resultOnStack = false;
else if (obj instanceof ArrayValueSetter)
ArrayValueSetter setter = (ArrayValueSetter) obj;
Symbol symbol = setter.getSymbol();
Object index = setter.getIndex();
if (symbol.getSpecifierByType(Specifier.STRUCTURE) == null)
if (index instanceof Symbol)
ProgramGenerator.getInstance().readArrayElement(symbol, index);
if (((Symbol) index).getValue() != null)
int i = (int) ((Symbol) index).getValue();
try
this.value = symbol.getDeclarator(Declarator.ARRAY).getElement(i);
catch (Exception e)
e.printStackTrace();
else
int i = (int) index;
try
this.value = symbol.getDeclarator(Declarator.ARRAY).getElement(i);
catch (Exception e)
e.printStackTrace();
ProgramGenerator.getInstance().readArrayElement(symbol, index);
else if (obj instanceof Symbol)
Symbol symbol = (Symbol) obj;
this.value = symbol.value;
int i = generator.getLocalVariableIndex(symbol);
generator.emit(Instruction.ILOAD, "" + i);
else if (obj instanceof Integer)
Integer val = (Integer) obj;
generator.emit(Instruction.SIPUSH, "" + val);
this.value = obj;
if (!this.isStructMember())
int idx = generator.getLocalVariableIndex(this);
if (!generator.isPassingArguments())
generator.emit(Instruction.ISTORE, "" + idx);
else
if (this.getStructSymbol().getValueSetter() != null)
generator.assignValueToStructMemberFromArray(this.getStructSymbol().getValueSetter(), this, this.value);
else
generator.assignValueToStructMember(this.getStructSymbol(), this, this.value);
最后
完成这部分后,对下面的代码
void quicksort(int A[10], int p, int r)
int x;
int i;
i = p - 1;
int j;
int t;
int v;
v = r - 1;
if (p < r)
x = A[r];
for (j = p; j <= v; j++)
if (A[j] <= x)
i++;
t = A[i];
A[i] = A[j];
A[j] = t;
v = i + 1;
t = A[v];
A[v] = A[r];
A[r] = t;
t = v - 1;
quicksort(A, p, t);
t = v + 1;
quicksort(A, t, r);
void main ()
int a[10];
int i;
int t;
printf("before quick sort:");
for(i = 0; i < 10; i++)
t = (10 - i);
a[i] = t;
printf("value of a[%d] is %d", i, a[i]);
quicksort(a, 0, 9);
printf("after quick sort:");
for (i = 0; i < 10; i++)
printf("value of a[%d] is %d", i, a[i]);
则会生成下面的Java字节码
.class public C2Bytecode
.super java/lang/Object
.method public static main([Ljava/lang/String;)V
sipush 10
newarray int
astore 0
sipush 0
istore 1
sipush 0
istore 2
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "before quick sort:"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
sipush 0
istore 1
loop0:
iload 1
sipush 10
if_icmpge branch0
sipush 10
iload 1
isub
istore 2
aload 0
iload 1
iload 2
iastore
aload 0
iload 1
iaload
istore 3
iload 1
istore 4
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "value of a["
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 4
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "] is "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 3
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
iload 1
sipush 1
iadd
istore 1
goto loop0
branch0:
aload 0
sipush 0
sipush 9
invokestatic C2Bytecode/quicksort([III)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "after quick sort:"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
sipush 0
istore 1
loop2:
iload 1
sipush 10
if_icmpge branch4
aload 0
iload 1
iaload
istore 3
iload 1
istore 4
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "value of a["
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 4
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "] is "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 3
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
iload 1
sipush 1
iadd
istore 1
goto loop2
branch4:
return
.end method
.method public static quicksort([III)V
sipush 2
newarray int
astore 6
sipush 0
istore 5
sipush 1
istore 5
aload 6
iload 5
sipush 1
iastore
aload 6
sipush 1
iaload
istore 10
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "before quick sort: "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 10
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
sipush 0
istore 9
sipush 0
istore 3
iload 1
sipush 1
isub
istore 3
sipush 0
istore 4
sipush 0
istore 7
sipush 0
istore 8
iload 2
sipush 1
isub
istore 8
iload 1
iload 2
if_icmpge branch1
aload 0
iload 2
iaload
istore 9
iload 1
istore 4
loop1:
iload 4
iload 8
if_icmpgt ibranch1
aload 0
iload 4
iaload
iload 9
if_icmpgt ibranch2
iload 3
sipush 1
iadd
istore 3
aload 0
iload 3
iaload
istore 7
aload 0
iload 3
aload 0
iload 4
iaload
iastore
aload 0
iload 4
iload 7
iastore
ibranch2:
iload 4
sipush 1
iadd
istore 4
goto loop1
ibranch1:
iload 3
sipush 1
iadd
istore 8
aload 0
iload 8
iaload
istore 7
aload 0
iload 8
aload 0
iload 2
iaload
iastore
aload 0
iload 2
iload 7
iastore
iload 8
sipush 1
isub
istore 7
aload 0
iload 1
iload 7
invokestatic C2Bytecode/quicksort([III)V
iload 8
sipush 1
iadd
istore 7
aload 0
iload 7
iload 2
invokestatic C2Bytecode/quicksort([III)V
branch1:
return
.end method
.end class
小结
这篇的代码生成和之前解释器的思路很相似,都是根据AST和对应的产生式来执行或者生成代码。
其实主要的思路是很清晰的,只是其中有太多细节容易让人太过纠结。这个系列算作是我自己的学习笔记,到这也有十三篇了,下一篇可能写写总结就正式结束了。
欢迎Star!
以上是关于从零写一个编译器(十三):代码生成之遍历AST的主要内容,如果未能解决你的问题,请参考以下文章