LLVM Study Notes (49)
Posted by wuhui_gdnt
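3.8. Automatic Generation of RegisterBank Code (v7.0)
3.8.1. Data Structures
The option "-gen-register-bank" generates code that supports a target's RegisterBank handling class (for X86, X86RegisterBankInfo). At present only a small part of the RegisterBank code is generated automatically; a considerable part of it is planned to be generated by TableGen in the future.
The following is taken from https://2pi.dk/llvm/global-isel
Many instruction set architectures have multiple register banks. X86 has three: integer, vector, and x87 floating-point registers (four if you count the MMX registers as a separate bank). Blackfin and m68k have separate pointer and data register banks. It is also common for the same operation to be available in more than one register bank. For example, most ISAs with a vector register bank support bitwise and/or/xor on both the integer and the vector register banks.
The global instruction selector assigns virtual registers to register banks explicitly. The set of register banks is usually small (2 to 3) and is defined by the target. Modeling register banks explicitly makes it possible to move operations between register banks in order to minimize cross-bank copies, which are usually expensive. SPARC even requires cross-bank copies to go through memory, as x86 does in some cases.
The register bank selection pass computes the optimal bank assignment and inserts copy instructions where a value has to cross banks. Sometimes it is beneficial to have the same value in two register banks at once, which can also be expressed with a cross-bank copy instruction. Bank selection is also affected by register pressure. For example, on x86-64 many i32 values can be moved into SSE registers, freeing up integer registers.
In general, cross-class copies within a bank are expected to be cheaper than copies across banks. The register coalescer can also coalesce them, whereas it cannot coalesce cross-bank copies.
Likewise, equivalent operations can be performed on different banks using different instructions. For example, X86 can be viewed as having three main banks: general-purpose registers, x87, and vector registers (which could be split further into banks for single-precision and double-precision instructions).
For instance, the definition of class X86GenRegisterBankInfo is currently not generated by TableGen (X86RegisterBankInfo.h):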
26 class X86GenRegisterBankInfo : public RegisterBankInfo {
27 protected:
28 #define GET_TARGET_REGBANK_CLASS
29 #include "X86GenRegisterBank.inc"
30 #define GET_TARGET_REGBANK_INFO_CLASS
31 #include "X86GenRegisterBankInfo.def"
32
33   static RegisterBankInfo::PartialMapping PartMappings[];
34   static RegisterBankInfo::ValueMapping ValMappings[];
35
36   static PartialMappingIdx getPartialMappingIdx(const LLT &Ty, bool isFP);
37   static const RegisterBankInfo::ValueMapping *
38   getValueMapping(PartialMappingIdx Idx, unsigned NumOperands);
39 };
The file X86GenRegisterBankInfo.def included on line 31 should also be generated by TableGen in the future. Its current content is:
#ifdef GET_TARGET_REGBANK_INFO_IMPL
RegisterBankInfo::PartialMapping X86GenRegisterBankInfo::PartMappings[]{
    /* StartIdx, Length, RegBank */
    // GPR value
    {0, 8, X86::GPRRegBank},    // :0
    {0, 16, X86::GPRRegBank},   // :1
    {0, 32, X86::GPRRegBank},   // :2
    {0, 64, X86::GPRRegBank},   // :3
    // FR32/64 , xmm registers
    {0, 32, X86::VECRRegBank},  // :4
    {0, 64, X86::VECRRegBank},  // :5
    // VR128/256/512
    {0, 128, X86::VECRRegBank}, // :6
    {0, 256, X86::VECRRegBank}, // :7
    {0, 512, X86::VECRRegBank}, // :8
};
#endif // GET_TARGET_REGBANK_INFO_IMPL

#ifdef GET_TARGET_REGBANK_INFO_CLASS
enum PartialMappingIdx {
  PMI_None = -1,
  PMI_GPR8,
  PMI_GPR16,
  PMI_GPR32,
  PMI_GPR64,
  PMI_FP32,
  PMI_FP64,
  PMI_VEC128,
  PMI_VEC256,
  PMI_VEC512
};
#endif // GET_TARGET_REGBANK_INFO_CLASS

#ifdef GET_TARGET_REGBANK_INFO_IMPL
#define INSTR_3OP(INFO) INFO, INFO, INFO,
#define BREAKDOWN(INDEX, NUM)                                                  \
  { &X86GenRegisterBankInfo::PartMappings[INDEX], NUM }
// ValueMappings.
RegisterBankInfo::ValueMapping X86GenRegisterBankInfo::ValMappings[]{
    /* BreakDown, NumBreakDowns */
    // 3-operands instructions (all binary operations should end up with one of
    // those mapping).
    INSTR_3OP(BREAKDOWN(PMI_GPR8, 1))   // 0: GPR_8
    INSTR_3OP(BREAKDOWN(PMI_GPR16, 1))  // 3: GPR_16
    INSTR_3OP(BREAKDOWN(PMI_GPR32, 1))  // 6: GPR_32
    INSTR_3OP(BREAKDOWN(PMI_GPR64, 1))  // 9: GPR_64
    INSTR_3OP(BREAKDOWN(PMI_FP32, 1))   // 12: Fp32
    INSTR_3OP(BREAKDOWN(PMI_FP64, 1))   // 15: Fp64
    INSTR_3OP(BREAKDOWN(PMI_VEC128, 1)) // 18: Vec128
    INSTR_3OP(BREAKDOWN(PMI_VEC256, 1)) // 21: Vec256
    INSTR_3OP(BREAKDOWN(PMI_VEC512, 1)) // 24: Vec512
};
#undef INSTR_3OP
#undef BREAKDOWN
#endif // GET_TARGET_REGBANK_INFO_IMPL

#ifdef GET_TARGET_REGBANK_INFO_CLASS
enum ValueMappingIdx {
  VMI_None = -1,
  VMI_3OpsGpr8Idx = PMI_GPR8 * 3,
  VMI_3OpsGpr16Idx = PMI_GPR16 * 3,
  VMI_3OpsGpr32Idx = PMI_GPR32 * 3,
  VMI_3OpsGpr64Idx = PMI_GPR64 * 3,
  VMI_3OpsFp32Idx = PMI_FP32 * 3,
  VMI_3OpsFp64Idx = PMI_FP64 * 3,
  VMI_3OpsVec128Idx = PMI_VEC128 * 3,
  VMI_3OpsVec256Idx = PMI_VEC256 * 3,
  VMI_3OpsVec512Idx = PMI_VEC512 * 3,
};
#undef GET_TARGET_REGBANK_INFO_CLASS
#endif // GET_TARGET_REGBANK_INFO_CLASS

#ifdef GET_TARGET_REGBANK_INFO_IMPL
#undef GET_TARGET_REGBANK_INFO_IMPL
const RegisterBankInfo::ValueMapping *
X86GenRegisterBankInfo::getValueMapping(PartialMappingIdx Idx,
                                        unsigned NumOperands) {

  // We can use VMI_3Ops Mapping for all the cases.
  if (NumOperands <= 3 && (Idx >= PMI_GPR8 && Idx <= PMI_VEC512))
    return &ValMappings[(unsigned)Idx * 3];

  llvm_unreachable("Unsupported PartialMappingIdx.");
}

#endif // GET_TARGET_REGBANK_INFO_IMPL
The base class RegisterBankInfo defines the following nested structs (here we only care about their data members); the X86GenRegisterBankInfo.def shown above initializes them:
struct PartialMapping {
  /// Number of bits at which this partial mapping starts in the
  /// original value. The bits are counted from less significant
  /// bits to most significant bits.
  unsigned StartIdx;

  /// Length of this mapping in bits. This is how many bits this
  /// partial mapping covers in the original value:
  /// from StartIdx to StartIdx + Length -1.
  unsigned Length;

  /// Register bank where the partial value lives.
  const RegisterBank *RegBank;
  // (member functions omitted)
};
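Clearly the PartialMappingIdx enumerators above are indices into the PartMappings array, which makes the meaning of the array plain. For example, PartMappings[0] corresponds to PMI_GPR8, so a value mapped this way covers bits 0 through 7 and lives in GPRRegBank.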
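The relationship between the enumerators and the table can be modeled in a few lines. This is only a sketch: ToyBankKind, ToyPartialMapping, ToyPartMappings and highBitIdx() are hypothetical stand-ins invented for illustration and are not the LLVM types; only the table contents are copied from the .def file above.

```cpp
#include <cassert>

// Hypothetical stand-in for a register bank (LLVM uses llvm::RegisterBank).
enum class ToyBankKind { GPR, VECR };

// Mirrors the three data members of RegisterBankInfo::PartialMapping.
struct ToyPartialMapping {
  unsigned StartIdx; // first covered bit, counted from the least significant bit
  unsigned Length;   // number of covered bits
  ToyBankKind Bank;  // bank where those bits live

  // Index of the last covered bit: StartIdx + Length - 1.
  constexpr unsigned highBitIdx() const { return StartIdx + Length - 1; }
};

// Same order as the PartialMappingIdx enumeration in the .def file.
enum ToyPartialMappingIdx {
  PMI_GPR8, PMI_GPR16, PMI_GPR32, PMI_GPR64,
  PMI_FP32, PMI_FP64, PMI_VEC128, PMI_VEC256, PMI_VEC512
};

// Same contents as PartMappings[], with the toy bank type.
constexpr ToyPartialMapping ToyPartMappings[] = {
    {0, 8, ToyBankKind::GPR},    {0, 16, ToyBankKind::GPR},
    {0, 32, ToyBankKind::GPR},   {0, 64, ToyBankKind::GPR},
    {0, 32, ToyBankKind::VECR},  {0, 64, ToyBankKind::VECR},
    {0, 128, ToyBankKind::VECR}, {0, 256, ToyBankKind::VECR},
    {0, 512, ToyBankKind::VECR},
};

// Self-check: the GPR8 entry covers bits 0..7.
inline void toyCheckGpr8() {
  assert(ToyPartMappings[PMI_GPR8].highBitIdx() == 7);
}
```

Because the enumerators start at 0 and follow the table order, an enumerator can be used directly as an array subscript, which is what the generated code relies on.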
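PartialMappings are tied together by another nested struct, ValueMapping, which describes how a value is mapped across different register banks. The source comment gives an example: suppose we have a 32-bit add and a <2 x 32-bit> vadd. The <2 x 32-bit> vector add can be expanded into two 32-bit adds. Currently, the TableGen-like file would look like this: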
PartialMapping[] = {
/*32-bit add*/ {0, 32, GPR},
/*2x32-bit add*/ {0, 32, GPR}, {0, 32, GPR}, // <-- Same entry 3x
/*<2x32-bit> vadd*/ {0, 64, VPR}
}; // PartialMapping duplicated.
ValueMapping[] {
/*plain 32-bit add*/ {&PartialMapping[0], 1},
/*expanded vadd on 2xadd*/ {&PartialMapping[1], 2},
/*plain <2x32-bit> vadd*/ {&PartialMapping[3], 1}
};
With an array of pointers, we would have:
PartialMapping[] = {
/*32-bit add*/ {0, 32, GPR},
/*<2x32-bit> vadd*/ {0, 64, VPR}
}; // No more duplication.
BreakDowns[] = {
/*AddBreakDown*/ &PartialMapping[0],
/*2xAddBreakDown*/ &PartialMapping[0], &PartialMapping[0],
/*VAddBreakDown*/ &PartialMapping[1]
}; // Addresses of PartialMapping duplicated (smaller).
ValueMapping[] {
/*plain 32-bit add*/ {&BreakDowns[0], 1},
/*expanded vadd on 2xadd*/ {&BreakDowns[1], 2},
/*plain <2x32-bit> vadd*/ {&BreakDowns[3], 1}
};
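To see that the pointer-array layout is expressible in ordinary C++, here is a minimal sketch of the second variant. All names (ToyBankId, ToyPart, ToyValue, Parts, BreakDowns, Values, toyTotalBits) are made up for illustration, and the table entries simply copy the pseudo-code above; this is not how LLVM currently stores its mappings.

```cpp
#include <cassert>

enum ToyBankId { GPR, VPR }; // hypothetical bank identifiers

struct ToyPart {
  unsigned StartIdx;
  unsigned Length;
  ToyBankId Bank;
};

// Each distinct partial mapping is stored exactly once.
const ToyPart Parts[] = {
    /*32-bit add*/ {0, 32, GPR},
    /*<2x32-bit> vadd*/ {0, 64, VPR},
};

// Only the (smaller) addresses are repeated.
const ToyPart *const BreakDowns[] = {
    /*AddBreakDown*/ &Parts[0],
    /*2xAddBreakDown*/ &Parts[0], &Parts[0],
    /*VAddBreakDown*/ &Parts[1],
};

// A value mapping points at a run of NumBreakDowns pointers.
struct ToyValue {
  const ToyPart *const *BreakDown;
  unsigned NumBreakDowns;
};

const ToyValue Values[] = {
    /*plain 32-bit add*/ {&BreakDowns[0], 1},
    /*expanded vadd on 2xadd*/ {&BreakDowns[1], 2},
    /*plain <2x32-bit> vadd*/ {&BreakDowns[3], 1},
};

// Sums the bit lengths of all partial mappings of one value mapping.
unsigned toyTotalBits(const ToyValue &V) {
  unsigned Bits = 0;
  for (unsigned I = 0; I < V.NumBreakDowns; ++I)
    Bits += V.BreakDown[I]->Length;
  return Bits;
}
```

The cost of this layout is one extra indirection per partial mapping; the benefit is that a repeated entry occupies one pointer instead of a full three-field record.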
The ValMappings array above has three entries for each register size in order to support three-operand instructions. ValueMapping has only two data members:
struct ValueMapping {
  /// How the value is broken down between the different register banks.
  const PartialMapping *BreakDown;

  /// Number of partial mapping to break down this value.
  unsigned NumBreakDowns;
  // (member functions omitted)
};
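The "three entries per size" layout and the index arithmetic in getValueMapping() can be checked with a small model. The names below (ToyValueMapping, ToyValMappings, TOY_3OP, toyGetValueMapping) are hypothetical; the sketch stores an index where LLVM stores a pointer, and it returns nullptr where the LLVM code calls llvm_unreachable.

```cpp
#include <cassert>

enum ToyPartialMappingIdx {
  PMI_None = -1,
  PMI_GPR8, PMI_GPR16, PMI_GPR32, PMI_GPR64,
  PMI_FP32, PMI_FP64, PMI_VEC128, PMI_VEC256, PMI_VEC512
};

struct ToyValueMapping {
  int BreakDownIdx;       // index of the partial mapping (a pointer in LLVM)
  unsigned NumBreakDowns; // always 1 in this table
};

// Three identical entries per partial mapping, in the spirit of INSTR_3OP.
#define TOY_3OP(IDX) {IDX, 1}, {IDX, 1}, {IDX, 1}
const ToyValueMapping ToyValMappings[] = {
    TOY_3OP(PMI_GPR8),   TOY_3OP(PMI_GPR16),  TOY_3OP(PMI_GPR32),
    TOY_3OP(PMI_GPR64),  TOY_3OP(PMI_FP32),   TOY_3OP(PMI_FP64),
    TOY_3OP(PMI_VEC128), TOY_3OP(PMI_VEC256), TOY_3OP(PMI_VEC512),
};
#undef TOY_3OP

// Returns the first of the three entries for Idx, or nullptr if unsupported.
const ToyValueMapping *toyGetValueMapping(ToyPartialMappingIdx Idx,
                                          unsigned NumOperands) {
  if (NumOperands <= 3 && Idx >= PMI_GPR8 && Idx <= PMI_VEC512)
    return &ToyValMappings[static_cast<unsigned>(Idx) * 3];
  return nullptr;
}
```

Since entry k of the partial-mapping table owns slots 3k, 3k+1 and 3k+2, the ValueMappingIdx enumerators (PMI_x * 3) and the lookup in getValueMapping() agree by construction.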
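RegisterBankInfo also defines two other nested classes, InstructionMapping and OperandsMapper, but they have little to do with the code generation discussed here, so we skip them for now.
3.8.2. Parsing the TableGen Definitions
X86 defines only the following definitions derived from RegisterBank: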
/// General Purpose Registers: RAX, RCX,...
def GPRRegBank : RegisterBank<"GPR", [GR64]>;

/// Floating Point/Vector Registers
def VECRRegBank : RegisterBank<"VECR", [VR512]>;
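The registers in class GR64 are RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, RBX, R14, R15, R12, R13, RBP, RSP, and RIP. Class VR512 contains the registers ZMM0 through ZMM31 and supports the value types v16f32, v8f64, v64i8, v32i16, v16i32, and v8i64.
EmitRegisterBank() calls the constructor of RegisterBankEmitter, whose member RegisterClassHierarchy has type CodeGenRegBank. We have already looked at the construction of that class (see the CodeGenRegBank section).
Once the RegisterBankEmitter has been constructed, its run() method is called: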
287 void RegisterBankEmitter::run(raw_ostream &OS) {
288   std::vector<Record*> Targets = Records.getAllDerivedDefinitions("Target");
289   if (Targets.size() != 1)
290     PrintFatalError("ERROR: Too many or too few subclasses of Target defined!");
291   StringRef TargetName = Targets[0]->getName();
292
293   std::vector<RegisterBank> Banks;
294   for (const auto &V : Records.getAllDerivedDefinitions("RegisterBank")) {
295     SmallPtrSet<const CodeGenRegisterClass *, 8> VisitedRCs;
296     RegisterBank Bank(*V);
297
298     for (const CodeGenRegisterClass *RC :
299          Bank.getExplictlySpecifiedRegisterClasses(RegisterClassHierarchy)) {
300       visitRegisterBankClasses(
301           RegisterClassHierarchy, RC, "explicit",
302           [&Bank](const CodeGenRegisterClass *RC, StringRef Kind) {
303             LLVM_DEBUG(dbgs()
304                        << "Added " << RC->getName() << "(" << Kind << ")\n");
305             Bank.addRegisterClass(RC);
306           },
307           VisitedRCs);
308     }
309
310     Banks.push_back(Bank);
311   }
312
313   // Warn about ambiguous MIR caused by register bank/class name clashes.
314   for (const auto &Class : Records.getAllDerivedDefinitions("RegisterClass")) {
315     for (const auto &Bank : Banks) {
316       if (Bank.getName().lower() == Class->getName().lower()) {
317         PrintWarning(Bank.getDef().getLoc(), "Register bank names should be "
318                                              "distinct from register classes "
319                                              "to avoid ambiguous MIR");
320         PrintNote(Bank.getDef().getLoc(), "RegisterBank was declared here");
321         PrintNote(Class->getLoc(), "RegisterClass was declared here");
322       }
323     }
324   }
325
326   emitSourceFileHeader("Register Bank Source Fragments", OS);
327   OS << "#ifdef GET_REGBANK_DECLARATIONS\n"
328      << "#undef GET_REGBANK_DECLARATIONS\n";
329   emitHeader(OS, TargetName, Banks);
330   OS << "#endif // GET_REGBANK_DECLARATIONS\n\n"
331      << "#ifdef GET_TARGET_REGBANK_CLASS\n"
332      << "#undef GET_TARGET_REGBANK_CLASS\n";
333   emitBaseClassDefinition(OS, TargetName, Banks);
334   OS << "#endif // GET_TARGET_REGBANK_CLASS\n\n"
335      << "#ifdef GET_TARGET_REGBANK_IMPL\n"
336      << "#undef GET_TARGET_REGBANK_IMPL\n";
337   emitBaseClassImplementation(OS, TargetName, Banks);
338   OS << "#endif // GET_TARGET_REGBANK_IMPL\n";
339 }
The class RegisterBank used on line 296 above has the following data members and constructor:
class RegisterBank {

  /// A vector of register classes that are included in the register bank.
  typedef std::vector<const CodeGenRegisterClass *> RegisterClassesTy;

private:
  const Record &TheDef;

  /// The register classes that are covered by the register bank.
  RegisterClassesTy RCs;

  /// The register class with the largest register size.
  const CodeGenRegisterClass *RCWithLargestRegsSize;

public:
  RegisterBank(const Record &TheDef)
      : TheDef(TheDef), RCs(), RCWithLargestRegsSize(nullptr) {}
  // (remaining members omitted)
};
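getExplictlySpecifiedRegisterClasses() on line 299 returns the CodeGenRegisterClass instances that correspond to the RegisterClass records listed in the RegisterClasses field of the RegisterBank's TD definition. The function below then adds these CodeGenRegisterClass instances to the corresponding RegisterBank object.
A register class belongs to the bank if and only if at least one of the following holds:
- It is explicitly specified.
- It is a subclass of a class that is a member of the bank.
- It contains subregisters of the registers of a class that is a member of the bank.
The parameter VisitFn on line 173 below is the lambda expression on lines 302~306 above.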
170 static void visitRegisterBankClasses(
171     CodeGenRegBank &RegisterClassHierarchy, const CodeGenRegisterClass *RC,
172     const Twine Kind,
173     std::function<void(const CodeGenRegisterClass *, StringRef)> VisitFn,
174     SmallPtrSetImpl<const CodeGenRegisterClass *> &VisitedRCs) {
175
176   // Make sure we only visit each class once to avoid infinite loops.
177   if (VisitedRCs.count(RC))
178     return;
179   VisitedRCs.insert(RC);
180
181   // Visit each explicitly named class.
182   VisitFn(RC, Kind.str());
183
184   for (const auto &PossibleSubclass : RegisterClassHierarchy.getRegClasses()) {
185     std::string TmpKind =
186         (Twine(Kind) + " (" + PossibleSubclass.getName() + ")").str();
187
188     // Visit each subclass of an explicitly named class.
189     if (RC != &PossibleSubclass && RC->hasSubClass(&PossibleSubclass))
190       visitRegisterBankClasses(RegisterClassHierarchy, &PossibleSubclass,
191                                TmpKind + " " + RC->getName() + " subclass",
192                                VisitFn, VisitedRCs);
193
194     // Visit each class that contains only subregisters of RC with a common
195     // subregister-index.
196     //
197     // More precisely, PossibleSubclass is a subreg-class iff Reg:SubIdx is in
198     // PossibleSubclass for all registers Reg from RC using any
199     // subregister-index SubReg
200     for (const auto &SubIdx : RegisterClassHierarchy.getSubRegIndices()) {
201       BitVector BV(RegisterClassHierarchy.getRegClasses().size());
202       PossibleSubclass.getSuperRegClasses(&SubIdx, BV);
203       if (BV.test(RC->EnumValue)) {
204         std::string TmpKind2 = (Twine(TmpKind) + " " + RC->getName() +
205                                 " class-with-subregs: " + RC->getName())
206                                    .str();
207         VisitFn(&PossibleSubclass, TmpKind2);
208       }
209     }
210   }
211 }
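getSuperRegClasses() on line 202 above collects the super-register classes of PossibleSubclass for the given subregister index. Recall that EnumValue on line 203 is the (integer) index of the CodeGenRegisterClass instance in the RegClasses container of CodeGenRegBank.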
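The core of the traversal, visiting a class once and then recursing into its subclasses, can be sketched on a toy hierarchy. ToyClass, toyVisit and toyCollectBank are hypothetical names; the sketch deliberately leaves out the subregister-class branch (lines 200~209) and only models the visited-set guard and the subclass recursion.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Toy register class: a name plus pointers to its subclasses (hypothetical).
struct ToyClass {
  std::string Name;
  std::vector<const ToyClass *> SubClasses;
};

// Visits RC and, recursively, every subclass; each class is reported once.
void toyVisit(const ToyClass *RC,
              const std::function<void(const ToyClass *)> &VisitFn,
              std::set<const ToyClass *> &Visited) {
  // insert() reports whether RC was new; bail out on a repeat visit.
  if (!Visited.insert(RC).second)
    return;
  VisitFn(RC);
  for (const ToyClass *Sub : RC->SubClasses)
    toyVisit(Sub, VisitFn, Visited);
}

// Collects the names of all classes reachable from one explicit class.
std::vector<std::string> toyCollectBank(const ToyClass &Explicit) {
  std::vector<std::string> Names;
  std::set<const ToyClass *> Visited;
  toyVisit(&Explicit,
           [&Names](const ToyClass *RC) { Names.push_back(RC->Name); },
           Visited);
  return Names;
}
```

For a hierarchy in which A has subclasses B and C, and B also has subclass C, toyCollectBank(A) reports A, B, C in that order: C is reached through B first and skipped when A's own list reaches it again.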
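The body of VisitFn() called on line 182 is essentially RegisterBank's addRegisterClass(). It records the CodeGenRegisterClass objects in the RCs container and keeps track of the one with the largest spill size.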
void addRegisterClass(const CodeGenRegisterClass *RC) {
  if (std::find_if(RCs.begin(), RCs.end(),
                   [&RC](const CodeGenRegisterClass *X) {
                     return X == RC;
                   }) != RCs.end())
    return;

  // FIXME? We really want the register size rather than the spill size
  // since the spill size may be bigger on some targets with
  // limited load/store instructions. However, we don't store the
  // register size anywhere (we could sum the sizes of the subregisters
  // but there may be additional bits too) and we can't derive it from
  // the VT's reliably due to Untyped.
  if (RCWithLargestRegsSize == nullptr)
    RCWithLargestRegsSize = RC;
  else if (RCWithLargestRegsSize->RSI.get(DefaultMode).SpillSize <
           RC->RSI.get(DefaultMode).SpillSize)
    RCWithLargestRegsSize = RC;
  assert(RCWithLargestRegsSize && "RC was nullptr?");

  RCs.emplace_back(RC);
}
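The bookkeeping done by addRegisterClass() (skip duplicates, remember the class with the largest spill size) is easy to reproduce in isolation. ToyRegClass and ToyRegisterBank below are hypothetical simplifications: the spill size is a plain field here, whereas LLVM reads it from RSI.get(DefaultMode).SpillSize.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy register class with a directly stored spill size in bits (hypothetical).
struct ToyRegClass {
  std::string Name;
  unsigned SpillSize;
};

class ToyRegisterBank {
  std::vector<const ToyRegClass *> RCs;
  const ToyRegClass *Largest = nullptr;

public:
  void addRegisterClass(const ToyRegClass *RC) {
    // Ignore classes that were already added.
    if (std::find(RCs.begin(), RCs.end(), RC) != RCs.end())
      return;
    // Keep the first class seen with the strictly largest spill size.
    if (Largest == nullptr || Largest->SpillSize < RC->SpillSize)
      Largest = RC;
    assert(Largest && "RC was nullptr?");
    RCs.push_back(RC);
  }

  std::size_t size() const { return RCs.size(); }
  const ToyRegClass *largest() const { return Largest; }
};
```

Note the strict comparison: when two classes have the same spill size, the one added first stays recorded as the largest, matching the `<` in the quoted source.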
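Back in RegisterBankEmitter::run(), the loop on line 314 checks whether a RegisterBank and a RegisterClass share the same name (the comparison ignores case); identical names would make MIR ambiguous.
3.8.3. Code Generation
RegisterBankEmitter::run() then starts emitting X86GenRegisterBank.inc.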
void RegisterBankEmitter::emitHeader(raw_ostream &OS,
                                     const StringRef TargetName,
                                     const std::vector<RegisterBank> &Banks) {
  // <Target>RegisterBankInfo.h
  OS << "namespace llvm {\n"
     << "namespace " << TargetName << " {\n"
     << "enum {\n";
  for (const auto &Bank : Banks)
    OS << "  " << Bank.getEnumeratorName() << ",\n";
  OS << "  NumRegisterBanks,\n"
     << "};\n"
     << "} // end namespace " << TargetName << "\n"
     << "} // end namespace llvm\n";
}
First, emitHeader() emits a set of enumerators:
namespace llvm {
namespace X86 {
enum {
  GPRRegBankID,
  VECRRegBankID,
  NumRegisterBanks,
};
} // end namespace X86
} // end namespace llvm
Note that this fragment is expanded at global scope in X86RegisterBankInfo.h, where it defines the enumeration shown above.
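The emitter is plain string streaming, so its behavior can be reproduced without TableGen. The sketch below is an assumption-laden stand-in: toyEmitHeader is a made-up function, it writes to a std::ostringstream instead of raw_ostream, and it takes the enumerator names as ready-made strings rather than obtaining them from getEnumeratorName().

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Writes the bank-ID enum for one target and returns it as a string.
std::string toyEmitHeader(const std::string &TargetName,
                          const std::vector<std::string> &Enumerators) {
  std::ostringstream OS;
  OS << "namespace llvm {\n"
     << "namespace " << TargetName << " {\n"
     << "enum {\n";
  // One enumerator per bank, in definition order.
  for (const std::string &Name : Enumerators)
    OS << "  " << Name << ",\n";
  // The trailing enumerator doubles as the number of banks.
  OS << "  NumRegisterBanks,\n"
     << "};\n"
     << "} // end namespace " << TargetName << "\n"
     << "} // end namespace llvm\n";
  return OS.str();
}
```

Calling toyEmitHeader("X86", {"GPRRegBankID", "VECRRegBankID"}) yields the same text as the generated fragment above. Because NumRegisterBanks comes last in an unscoped enum that starts at 0, its value equals the number of banks.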
The next generated fragment, in contrast, is expanded inside the definition of X86GenRegisterBankInfo and supplements that class definition.
void RegisterBankEmitter::emitBaseClassDefinition(
    raw_ostream &OS, const StringRef TargetName,
    const std::vector<RegisterBank> &Banks) {
  OS << "private:\n"
     << "  static RegisterBank *RegBanks[];\n\n"
     << "protected:\n"
     << "  " << TargetName << "GenRegisterBankInfo();\n"
     << "\n";
}
This method emits the following code fragment:
private:
static RegisterBank *RegBanks[];
protected:
X86GenRegisterBankInfo();
The code generated next is expanded in X86RegisterBankInfo.cpp and provides two global array definitions.
void RegisterBankEmitter::emitBaseClassImplementation(
    raw_ostream &OS, StringRef TargetName,
    std::vector<RegisterBank> &Banks) {

  OS << "namespace llvm {\n"
     << "namespace " << TargetName << " {\n";
  for (const auto &Bank : Banks) {
    std::vector<std::vector<const CodeGenRegisterClass *>> RCsGroupedByWord(