The V8 Bytecode Compilation Process
Posted by Jtag特工
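In previous articles we learned how to call the V8 API. This article walks through the main steps V8 takes to compile source code into bytecode.
The overall map of the compilation process runs through four stages, covered in order below: the API calls, the Compiler, the interpreter, and the bytecode generator.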
The API call layer
As we know, the entry point for compiling code in V8 is v8::Script::Compile:
v8::Local<v8::Script> script =
    v8::Script::Compile(context, source).ToLocalChecked();
This calls Compile in api.cc:
MaybeLocal<Script> Script::Compile(Local<Context> context, Local<String> source,
                                   ScriptOrigin* origin) {
  if (origin) {
    ScriptCompiler::Source script_source(source, *origin);
    return ScriptCompiler::Compile(context, &script_source);
  }
  ScriptCompiler::Source script_source(source);
  return ScriptCompiler::Compile(context, &script_source);
}
Script::Compile then calls ScriptCompiler::Compile, which also lives in api.cc. It first compiles an UnboundScript that is independent of any context, and then binds the result to the current context:
MaybeLocal<Script> ScriptCompiler::Compile(Local<Context> context,
                                           Source* source,
                                           CompileOptions options,
                                           NoCacheReason no_cache_reason) {
  Utils::ApiCheck(
      !source->GetResourceOptions().IsModule(), "v8::ScriptCompiler::Compile",
      "v8::ScriptCompiler::CompileModule must be used to compile modules");
  auto isolate = context->GetIsolate();
  MaybeLocal<UnboundScript> maybe =
      CompileUnboundInternal(isolate, source, options, no_cache_reason);
  Local<UnboundScript> result;
  if (!maybe.ToLocal(&result)) return MaybeLocal<Script>();
  v8::Context::Scope scope(context);
  return result->BindToCurrentContext();
}
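The two-step shape of ScriptCompiler::Compile (compile something context-independent, bail out with an empty result on failure, then bind to a context) can be sketched in plain C++. This is only a toy model under stated assumptions: ToyMaybe, ToyUnboundScript, ToyContext, ToyScript, ToyCompileUnbound and ToyCompile are hypothetical names invented for illustration and are not V8 types or functions.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for MaybeLocal<T>: either empty or holding a value.
template <typename T>
class ToyMaybe {
 public:
  ToyMaybe() : has_value_(false), value_() {}
  explicit ToyMaybe(const T& value) : has_value_(true), value_(value) {}

  // Same calling shape as ToLocal in the excerpt above: writes the value
  // to *out and returns true on success, returns false when empty.
  bool ToLocal(T* out) const {
    if (!has_value_) return false;
    *out = value_;
    return true;
  }

 private:
  bool has_value_;
  T value_;
};

// Hypothetical context-independent compile result.
struct ToyUnboundScript {
  std::string source;
};

struct ToyContext {
  std::string name;
};

// Hypothetical bound script: the unbound result plus the context it runs in.
struct ToyScript {
  ToyUnboundScript unbound;
  const ToyContext* context;
};

// Step 1: compile without any context. Empty source counts as a failure here.
ToyMaybe<ToyUnboundScript> ToyCompileUnbound(const std::string& source) {
  if (source.empty()) return ToyMaybe<ToyUnboundScript>();
  ToyUnboundScript unbound;
  unbound.source = source;
  return ToyMaybe<ToyUnboundScript>(unbound);
}

// Step 2: bind to a context, but only if step 1 produced a value.
ToyMaybe<ToyScript> ToyCompile(const ToyContext& context,
                               const std::string& source) {
  ToyMaybe<ToyUnboundScript> maybe = ToyCompileUnbound(source);
  ToyUnboundScript result;
  if (!maybe.ToLocal(&result)) return ToyMaybe<ToyScript>();
  ToyScript bound;
  bound.unbound = result;
  bound.context = &context;
  return ToyMaybe<ToyScript>(bound);
}
```

A caller would use ToLocal on the returned ToyMaybe<ToyScript> in the same way the excerpt does, treating a false return as a compile failure.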
Next comes ScriptCompiler's CompileUnboundInternal function. The code cache branch is elided here:
MaybeLocal<UnboundScript> ScriptCompiler::CompileUnboundInternal(
    Isolate* v8_isolate, Source* source, CompileOptions options,
    NoCacheReason no_cache_reason) {
  auto isolate = reinterpret_cast<i::Isolate*>(v8_isolate);
  TRACE_EVENT_CALL_STATS_SCOPED(isolate, "v8", "V8.ScriptCompiler");
  ENTER_V8_NO_SCRIPT(isolate, v8_isolate->GetCurrentContext(), ScriptCompiler,
                     CompileUnbound, MaybeLocal<UnboundScript>(),
                     InternalEscapableScope);
  i::Handle<i::String> str = Utils::OpenHandle(*(source->source_string));
  i::Handle<i::SharedFunctionInfo> result;
  TRACE_EVENT0(TRACE_DISABLED_BY_DEFAULT("v8.compile"), "V8.CompileScript");
  i::ScriptDetails script_details = GetScriptDetails(
      isolate, source->resource_name, source->resource_line_offset,
      source->resource_column_offset, source->source_map_url,
      source->host_defined_options, source->resource_options);
  i::MaybeHandle<i::SharedFunctionInfo> maybe_function_info;
  if (options == kConsumeCodeCache) {
    ...
  } else {
    // Compile without any cache.
    maybe_function_info = i::Compiler::GetSharedFunctionInfoForScript(
        isolate, str, script_details, options, no_cache_reason,
        i::NOT_NATIVES_CODE);
  }
  has_pending_exception = !maybe_function_info.ToHandle(&result);
  RETURN_ON_FAILED_EXECUTION(UnboundScript);
  RETURN_ESCAPED(ToApiHandle<UnboundScript>(result));
}
The Compiler layer
At this point internal::Compiler::GetSharedFunctionInfoForScript is called:
MaybeHandle<SharedFunctionInfo> Compiler::GetSharedFunctionInfoForScript(
    Isolate* isolate, Handle<String> source,
    const ScriptDetails& script_details,
    ScriptCompiler::CompileOptions compile_options,
    ScriptCompiler::NoCacheReason no_cache_reason, NativesFlag natives) {
  return GetSharedFunctionInfoForScriptImpl(
      isolate, source, script_details, nullptr, nullptr, nullptr,
      compile_options, no_cache_reason, natives);
}
This is only a thin wrapper. The call ends up in GetSharedFunctionInfoForScriptImpl:
MaybeHandle<SharedFunctionInfo> GetSharedFunctionInfoForScriptImpl(
    Isolate* isolate, Handle<String> source,
    const ScriptDetails& script_details, v8::Extension* extension,
    AlignedCachedData* cached_data, BackgroundDeserializeTask* deserialize_task,
    ScriptCompiler::CompileOptions compile_options,
    ScriptCompiler::NoCacheReason no_cache_reason, NativesFlag natives) {
  ...
  maybe_result =
      CompileScriptOnMainThread(flags, source, script_details, natives,
                                extension, isolate, &is_compiled_scope);
  ...
}
In most cases compilation happens on the main thread, so CompileScriptOnMainThread is called:
MaybeHandle<SharedFunctionInfo> CompileScriptOnMainThread(
    const UnoptimizedCompileFlags flags, Handle<String> source,
    const ScriptDetails& script_details, NativesFlag natives,
    v8::Extension* extension, Isolate* isolate,
    IsCompiledScope* is_compiled_scope) {
  UnoptimizedCompileState compile_state(isolate);
  ParseInfo parse_info(isolate, flags, &compile_state);
  parse_info.set_extension(extension);
  Handle<Script> script =
      NewScript(isolate, &parse_info, source, script_details, natives);
  DCHECK_IMPLIES(parse_info.flags().collect_type_profile(),
                 script->IsUserJavaScript());
  DCHECK_EQ(parse_info.flags().is_repl_mode(), script->is_repl_mode());
  return Compiler::CompileToplevel(&parse_info, script, isolate,
                                   is_compiled_scope);
}
From this free function, control returns to the Compiler class, in its CompileToplevel method:
MaybeHandle<SharedFunctionInfo> Compiler::CompileToplevel(
    ParseInfo* parse_info, Handle<Script> script, Isolate* isolate,
    IsCompiledScope* is_compiled_scope) {
  return v8::internal::CompileToplevel(parse_info, script, kNullMaybeHandle,
                                       isolate, is_compiled_scope);
}
The class method turns out to be yet another thin wrapper. The CompileToplevel that does the real work calls parsing::ParseProgram to parse the source into an AST, and then calls IterativelyExecuteAndFinalizeUnoptimizedCompilationJobs to compile it:
MaybeHandle<SharedFunctionInfo> CompileToplevel(
    ParseInfo* parse_info, Handle<Script> script,
    MaybeHandle<ScopeInfo> maybe_outer_scope_info, Isolate* isolate,
    IsCompiledScope* is_compiled_scope) {
  ...
  if (parse_info->literal() == nullptr &&
      !parsing::ParseProgram(parse_info, script, maybe_outer_scope_info,
                             isolate, parsing::ReportStatisticsMode::kYes)) {
    FailWithPendingException(isolate, script, parse_info,
                             Compiler::ClearExceptionFlag::KEEP_EXCEPTION);
    return MaybeHandle<SharedFunctionInfo>();
  }
  ...
  if (!IterativelyExecuteAndFinalizeUnoptimizedCompilationJobs(
          isolate, shared_info, script, parse_info, isolate->allocator(),
          is_compiled_scope, &finalize_unoptimized_compilation_data_list,
          nullptr)) {
    FailWithPendingException(isolate, script, parse_info,
                             Compiler::ClearExceptionFlag::KEEP_EXCEPTION);
    return MaybeHandle<SharedFunctionInfo>();
  }
  ...
}
Everything ultimately lands in IterativelyExecuteAndFinalizeUnoptimizedCompilationJobs. It keeps a worklist of function literals, starting with the top-level one, and compiles them one at a time until the list is empty:
template <typename IsolateT>
bool IterativelyExecuteAndFinalizeUnoptimizedCompilationJobs(
    IsolateT* isolate, Handle<SharedFunctionInfo> outer_shared_info,
    Handle<Script> script, ParseInfo* parse_info,
    AccountingAllocator* allocator, IsCompiledScope* is_compiled_scope,
    FinalizeUnoptimizedCompilationDataList*
        finalize_unoptimized_compilation_data_list,
    DeferredFinalizationJobDataList*
        jobs_to_retry_finalization_on_main_thread) {
  DeclarationScope::AllocateScopeInfos(parse_info, isolate);
  std::vector<FunctionLiteral*> functions_to_compile;
  functions_to_compile.push_back(parse_info->literal());
  while (!functions_to_compile.empty()) {
    FunctionLiteral* literal = functions_to_compile.back();
    functions_to_compile.pop_back();
    Handle<SharedFunctionInfo> shared_info =
        Compiler::GetSharedFunctionInfo(literal, script, isolate);
    if (shared_info->is_compiled()) continue;
    std::unique_ptr<UnoptimizedCompilationJob> job =
        ExecuteSingleUnoptimizedCompilationJob(parse_info, literal, allocator,
                                               &functions_to_compile,
                                               isolate->AsLocalIsolate());
    if (!job) return false;
    ...
  }
  ...
}
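The loop above is a worklist: pop one function literal, skip it if it is already compiled, compile it, and let the compile step push any eagerly compiled inner functions back onto the same list. A minimal standalone sketch of that pattern follows. ToyLiteral, ToyCompileOne and ToyCompileAll are hypothetical names made up for this illustration, and the "compilation" is just recording a name.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for FunctionLiteral: a name plus the inner
// functions that should be compiled eagerly along with it.
struct ToyLiteral {
  explicit ToyLiteral(std::string n, std::vector<ToyLiteral*> inner = {})
      : name(std::move(n)), eager_inner(std::move(inner)), compiled(false) {}

  std::string name;
  std::vector<ToyLiteral*> eager_inner;
  bool compiled;
};

// "Compiles" one literal: marks it compiled, records its name, and pushes
// its eager inner literals onto the shared worklist.
bool ToyCompileOne(ToyLiteral* literal, std::vector<ToyLiteral*>* worklist,
                   std::vector<std::string>* order) {
  literal->compiled = true;
  order->push_back(literal->name);
  for (ToyLiteral* inner : literal->eager_inner) worklist->push_back(inner);
  return true;
}

// Drains the worklist and returns the names in the order they were compiled.
std::vector<std::string> ToyCompileAll(ToyLiteral* top_level) {
  std::vector<std::string> order;
  std::vector<ToyLiteral*> worklist;
  worklist.push_back(top_level);
  while (!worklist.empty()) {
    ToyLiteral* literal = worklist.back();
    worklist.pop_back();
    if (literal->compiled) continue;  // already done, nothing to do
    if (!ToyCompileOne(literal, &worklist, &order)) break;
  }
  return order;
}
```

Because the vector is used as a stack, the inner function pushed last is compiled first. An explicit worklist like this also keeps deeply nested functions from turning into deep native recursion.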
For each function to be compiled, ExecuteSingleUnoptimizedCompilationJob is called:
std::unique_ptr<UnoptimizedCompilationJob>
ExecuteSingleUnoptimizedCompilationJob(
    ParseInfo* parse_info, FunctionLiteral* literal,
    AccountingAllocator* allocator,
    std::vector<FunctionLiteral*>* eager_inner_literals,
    LocalIsolate* local_isolate) {
  ...
  std::unique_ptr<UnoptimizedCompilationJob> job(
      interpreter::Interpreter::NewCompilationJob(
          parse_info, literal, allocator, eager_inner_literals, local_isolate));
  if (job->ExecuteJob() != CompilationJob::SUCCEEDED) {
    // Compilation failed, return null.
    return std::unique_ptr<UnoptimizedCompilationJob>();
  }
  return job;
}
Once the job has been created, its ExecuteJob function is called to do the actual work:
CompilationJob::Status UnoptimizedCompilationJob::ExecuteJob() {
  // Delegate to the underlying implementation.
  DCHECK_EQ(state(), State::kReadyToExecute);
  ScopedTimer t(&time_taken_to_execute_);
  return UpdateState(ExecuteJobImpl(), State::kReadyToFinalize);
}
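ExecuteJob follows a familiar pattern: a non-virtual entry point checks the job's state, delegates to a virtual ExecuteJobImpl, and then advances the state based on the result. The sketch below models that pattern in isolation. ToyJob, ToyInterpreterJob and ToyFailingJob are hypothetical classes, and the rule that a failure moves the job into a kFailed state is an assumption of this sketch, since the excerpt does not show UpdateState.

```cpp
#include <cassert>

class ToyJob {
 public:
  enum Status { SUCCEEDED, FAILED };
  enum class State { kReadyToExecute, kReadyToFinalize, kFailed };

  ToyJob() : state_(State::kReadyToExecute) {}
  virtual ~ToyJob() {}

  // Non-virtual entry point: check the state, delegate to the subclass,
  // then record the outcome.
  Status ExecuteJob() {
    assert(state_ == State::kReadyToExecute);
    return UpdateState(ExecuteJobImpl(), State::kReadyToFinalize);
  }

  State state() const { return state_; }

 protected:
  // Each kind of job supplies its own implementation.
  virtual Status ExecuteJobImpl() = 0;

 private:
  // Assumption of this sketch: success advances to next_state,
  // failure moves to kFailed.
  Status UpdateState(Status status, State next_state) {
    state_ = (status == SUCCEEDED) ? next_state : State::kFailed;
    return status;
  }

  State state_;
};

// Hypothetical job that always succeeds.
class ToyInterpreterJob : public ToyJob {
 protected:
  Status ExecuteJobImpl() override { return SUCCEEDED; }
};

// Hypothetical job that always fails.
class ToyFailingJob : public ToyJob {
 protected:
  Status ExecuteJobImpl() override { return FAILED; }
};
```

Keeping the state bookkeeping in the base class means every implementation gets the same checks without repeating them.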
The interpreter layer
ExecuteJobImpl is a virtual function. There are currently two implementations: one for asm.js and one for the interpreter.
Normally the interpreter implementation is the one in use:
InterpreterCompilationJob::Status InterpreterCompilationJob::ExecuteJobImpl() {
  RCS_SCOPE(parse_info()->runtime_call_stats(),
            RuntimeCallCounterId::kCompileIgnition,
            RuntimeCallStats::kThreadSpecific);
  ...
  base::Optional<ParkedScope> parked_scope;
  if (local_isolate_) parked_scope.emplace(local_isolate_);
  generator()->GenerateBytecode(stack_limit());
  if (generator()->HasStackOverflow()) {
    return FAILED;
  }
  return SUCCEEDED;
}
This ends up calling BytecodeGenerator's GenerateBytecode method to generate the bytecode:
void BytecodeGenerator::GenerateBytecode(uintptr_t stack_limit) {
  ...
  if (NeedsContextInitialization(closure_scope())) {
    // Push a new inner context scope for the function.
    BuildNewLocalActivationContext();
    ContextScope local_function_context(this, closure_scope());
    BuildLocalActivationContextInitialization();
    GenerateBytecodeBody();
  } else {
    GenerateBytecodeBody();
  }
  // Check that we are not falling off the end.
  DCHECK(builder()->RemainderOfBlockIsDead());
}
The core of bytecode generation is GenerateBytecodeBody. Its main job is to walk the AST, so most of the work happens in the Visit* functions:
void BytecodeGenerator::GenerateBytecodeBody() {
  // Build the arguments object if it is used.
  VisitArgumentsObject(closure_scope()->arguments());
  // Build rest arguments array if it is used.
  Variable* rest_parameter = closure_scope()->rest_parameter();
  VisitRestArgumentsArray(rest_parameter);
  // Build assignment to the function name or .this_function
  // variables if used.
  VisitThisFunctionVariable(closure_scope()->function_var());
  VisitThisFunctionVariable(closure_scope()->this_function_var());
  // Build assignment to new.target variable if it is used.
  VisitNewTargetVariable(closure_scope()->new_target_var());
  ...
  // Visit declarations within the function scope.
  if (closure_scope()->is_script_scope()) {
    VisitGlobalDeclarations(closure_scope()->declarations());
  } else if (closure_scope()->is_module_scope()) {
    VisitModuleDeclarations(closure_scope()->declarations());
  } else {
    VisitDeclarations(closure_scope()->declarations());
  }
  // Emit initializing assignments for module namespace imports (if any).
  VisitModuleNamespaceImports();
  ...
  // Visit statements in the function body.
  VisitStatements(literal->body());
  ...
}
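The bytecode generator
Besides walking the AST, the Visit* functions call the bytecode builder to emit the bytecode.
Let's look at a simple example, the implementation of the void operator:
void BytecodeGenerator::VisitVoid(UnaryOperation* expr) {
  VisitForEffect(expr->expression());
  builder()->LoadUndefined();
}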
The object returned by builder() is an instance of the BytecodeArrayBuilder class:
BytecodeArrayBuilder& BytecodeArrayBuilder::LoadUndefined() {
  OutputLdaUndefined();
  return *this;
}
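To make the division of labor concrete, here is a small standalone sketch of a visitor that walks a tiny AST and asks a builder to emit instructions. It is a toy under stated assumptions: ToyBuilder, ToyExpr and ToyGenerator are hypothetical, the "bytecode" is a list of mnemonic strings, and the toy always emits a load for the operand, so it does not model how VisitForEffect treats a value that is evaluated only for its side effects.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical builder: records mnemonic strings instead of real bytecodes.
// Each method returns *this, like LoadUndefined in the excerpt above.
class ToyBuilder {
 public:
  ToyBuilder& LoadLiteral(int value) {
    code_.push_back("LdaSmi " + std::to_string(value));
    return *this;
  }
  ToyBuilder& LoadUndefined() {
    code_.push_back("LdaUndefined");
    return *this;
  }
  const std::vector<std::string>& code() const { return code_; }

 private:
  std::vector<std::string> code_;
};

// Minimal AST: either a number literal or `void <operand>`.
struct ToyExpr {
  enum Kind { kNumber, kVoid };
  Kind kind;
  int number;              // used when kind == kNumber
  const ToyExpr* operand;  // used when kind == kVoid
};

// Hypothetical generator: dispatches on the node kind and drives the builder.
class ToyGenerator {
 public:
  void Visit(const ToyExpr& expr) {
    switch (expr.kind) {
      case ToyExpr::kNumber:
        builder_.LoadLiteral(expr.number);
        break;
      case ToyExpr::kVoid:
        VisitVoid(expr);
        break;
    }
  }
  ToyBuilder& builder() { return builder_; }

 private:
  // Evaluate the operand first, then load undefined as the result.
  void VisitVoid(const ToyExpr& expr) {
    Visit(*expr.operand);
    builder_.LoadUndefined();
  }

  ToyBuilder builder_;
};
```

The generator decides what to emit and in which order, while the builder only knows how to append one instruction at a time.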
However, if we go looking for the OutputLdaUndefined function, a search of the source code will not find it.
That is because the names of these Output functions are assembled by a macro:
#define DEFINE_BYTECODE_OUTPUT(name, ...)                             \
  template <typename... Operands>                                     \
  BytecodeNode BytecodeArrayBuilder::Create##name##Node(              \
      Operands... operands) {                                         \
    return BytecodeNodeBuilder<Bytecode::k##name, __VA_ARGS__>::Make( \
        this, operands...);                                           \
  }                                                                   \
                                                                      \
  template <typename... Operands>                                     \
  void BytecodeArrayBuilder::Output##name(Operands... operands) {     \
    BytecodeNode node(Create##name##Node(operands...));               \
    Write(&node);                                                     \
  }                                                                   \
                                                                      \
  template <typename... Operands>                                     \
  void BytecodeArrayBuilder::Output##name(BytecodeLabel* label,       \
                                          Operands... operands) {     \
    DCHECK(Bytecodes::IsForwardJump(Bytecode::k##name));              \
    BytecodeNode node(Create##name##Node(operands...));               \
    WriteJump(&node, label);                                          \
  }
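The trick is the ## token-pasting operator: the preprocessor glues the literal text Output onto each bytecode name, so an identifier such as OutputLdaUndefined never appears verbatim in the source. The sketch below reproduces the idea in a few lines. Everything in it is hypothetical (TOY_BYTECODE_LIST, ToyArrayBuilder and the three bytecode names were picked only for illustration), and it assumes the macro is applied once per entry of a list macro.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical list of bytecode names. V is applied once per entry.
#define TOY_BYTECODE_LIST(V) \
  V(LdaUndefined)            \
  V(LdaZero)                 \
  V(Return)

class ToyArrayBuilder {
 public:
  // Declares OutputLdaUndefined(), OutputLdaZero() and OutputReturn().
#define DECLARE_TOY_OUTPUT(name) void Output##name();
  TOY_BYTECODE_LIST(DECLARE_TOY_OUTPUT)
#undef DECLARE_TOY_OUTPUT

  // Hand-written wrapper that calls a macro-generated function.
  ToyArrayBuilder& LoadUndefined() {
    OutputLdaUndefined();
    return *this;
  }

  const std::vector<std::string>& code() const { return code_; }

 private:
  void Write(const std::string& name) { code_.push_back(name); }
  std::vector<std::string> code_;
};

// Defines each Output<name>() by pasting the name into the identifier.
// #name turns the same token into a string literal for recording.
#define DEFINE_TOY_OUTPUT(name) \
  void ToyArrayBuilder::Output##name() { Write(#name); }
TOY_BYTECODE_LIST(DEFINE_TOY_OUTPUT)
#undef DEFINE_TOY_OUTPUT
```

Searching this file for OutputLdaUndefined finds only the call site inside LoadUndefined, never a definition, which is the same surprise described above.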