Let’s Understand Chrome V8 — Chapter 8: V8 Interpreter Ignition

by 灰豆 (@huidou), September 2nd, 2022

Welcome to other chapters of Let’s Understand Chrome V8


In this paper, we’ll talk about the loading and execution of bytecode, as well as the core code, important data structures, and workflow of Ignition.

1. Builtin

We must understand builtins before stepping into Ignition, because most of Ignition’s functionality is implemented as builtins. Builtins are built-in functions: chunks of code that V8 can execute at runtime. Builtins can be implemented using four different methods. Below is an example of the TF_BUILTIN macro.

#define TF_BUILTIN(Name, AssemblerBase)                                     \
  class Name##Assembler : public AssemblerBase {                            \
   public:                                                                  \
    using Descriptor = Builtin_##Name##_InterfaceDescriptor;                \
                                                                            \
    explicit Name##Assembler(compiler::CodeAssemblerState* state)           \
        : AssemblerBase(state) {}                                           \
    void Generate##Name##Impl();                                            \
                                                                            \
    template <class T>                                                      \
    TNode<T> Parameter(                                                     \
        Descriptor::ParameterIndices index,                                 \
        cppgc::SourceLocation loc = cppgc::SourceLocation::Current()) {     \
      return CodeAssembler::Parameter<T>(static_cast<int>(index), loc);     \
    }                                                                       \
                                                                            \
    template <class T>                                                      \
    TNode<T> UncheckedParameter(Descriptor::ParameterIndices index) {       \
      return CodeAssembler::UncheckedParameter<T>(static_cast<int>(index)); \
    }                                                                       \
  };                                                                        \
  void Builtins::Generate_##Name(compiler::CodeAssemblerState* state) {     \
    Name##Assembler assembler(state);                                       \
    state->SetInitialDebugInformation(#Name, __FILE__, __LINE__);           \
    if (Builtins::KindOf(Builtin::k##Name) == Builtins::TFJ) {              \
      assembler.PerformStackCheck(assembler.GetJSContextParameter());       \
    }                                                                       \
    assembler.Generate##Name##Impl();                                       \
  }                                                                         \
  void Name##Assembler::Generate##Name##Impl()

In the above code, AssemblerBase is the parent class of the builtin. Different builtins have different parents. Below is an example.

TF_BUILTIN(CloneFastJSArrayFillingHoles, ArrayBuiltinsAssembler) {
  auto context = Parameter<Context>(Descriptor::kContext);
  auto array = Parameter<JSArray>(Descriptor::kSource);

  CSA_ASSERT(this,
             Word32Or(Word32BinaryNot(IsHoleyFastElementsKindForRead(
                          LoadElementsKind(array))),
                      Word32BinaryNot(IsNoElementsProtectorCellInvalid())));

  Return(CloneFastJSArray(context, array, base::nullopt,
                          HoleConversionMode::kConvertToUndefined));
}

In the above builtin, CloneFastJSArrayFillingHoles is the name and ArrayBuiltinsAssembler is the parent. The parent differs from builtin to builtin, but they all ultimately inherit from the same base class, CodeStubAssembler.
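
For concreteness, here is roughly what the TF_BUILTIN macro turns this builtin into. This is a hand expansion of the macro shown earlier, abbreviated for readability, not preprocessor output:

// Approximate hand expansion of
// TF_BUILTIN(CloneFastJSArrayFillingHoles, ArrayBuiltinsAssembler).
class CloneFastJSArrayFillingHolesAssembler : public ArrayBuiltinsAssembler {
 public:
  using Descriptor = Builtin_CloneFastJSArrayFillingHoles_InterfaceDescriptor;

  explicit CloneFastJSArrayFillingHolesAssembler(
      compiler::CodeAssemblerState* state)
      : ArrayBuiltinsAssembler(state) {}
  void GenerateCloneFastJSArrayFillingHolesImpl();

  // Parameter<T>() and UncheckedParameter<T>() helpers as in the macro above.
};

void Builtins::Generate_CloneFastJSArrayFillingHoles(
    compiler::CodeAssemblerState* state) {
  CloneFastJSArrayFillingHolesAssembler assembler(state);
  state->SetInitialDebugInformation("CloneFastJSArrayFillingHoles", __FILE__,
                                    __LINE__);
  if (Builtins::KindOf(Builtin::kCloneFastJSArrayFillingHoles) ==
      Builtins::TFJ) {
    assembler.PerformStackCheck(assembler.GetJSContextParameter());
  }
  assembler.GenerateCloneFastJSArrayFillingHolesImpl();
}

void CloneFastJSArrayFillingHolesAssembler::
    GenerateCloneFastJSArrayFillingHolesImpl() {
  // ...the body shown above: Parameter(...), CSA_ASSERT(...), Return(...)...
}

For reference, the CodeStubAssembler base class mentioned above is declared as follows.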

class V8_EXPORT_PRIVATE CodeStubAssembler
    : public compiler::CodeAssembler,
      public TorqueGeneratedExportedMacrosAssembler {
 public:
  using ScopedExceptionHandler = compiler::ScopedExceptionHandler;

  template <typename T>
  using LazyNode = std::function<TNode<T>()>;

  explicit CodeStubAssembler(compiler::CodeAssemblerState* state);

  enum AllocationFlag : uint8_t {
    kNone = 0,
    kDoubleAlignment = 1,
    kPretenured = 1 << 1,
    kAllowLargeObjectAllocation = 1 << 2,
  };

  enum SlackTrackingMode { kWithSlackTracking, kNoSlackTracking };

  using AllocationFlags = base::Flags<AllocationFlag>;

  TNode<IntPtrT> ParameterToIntPtr(TNode<Smi> value) { return SmiUntag(value); }
  TNode<IntPtrT> ParameterToIntPtr(TNode<IntPtrT> value) { return value; }
  TNode<IntPtrT> ParameterToIntPtr(TNode<UintPtrT> value) {
    return Signed(value);
  }

  enum InitializationMode {
    kUninitialized,
    kInitializeToZero,
    kInitializeToNull
  };
//........................
//omit
//........................

Below is the builtin list, which contains all the builtins. The macro parameters (CPP, TFJ, TFC, TFS, TFH, BCH, ASM) correspond to the different ways a builtin can be implemented: C++, several TurboFan/CodeStubAssembler variants, bytecode handlers, and platform-dependent assembly.

#define BUILTIN_LIST(CPP, TFJ, TFC, TFS, TFH, BCH, ASM)  \
  BUILTIN_LIST_BASE(CPP, TFJ, TFC, TFS, TFH, ASM)        \
  BUILTIN_LIST_FROM_TORQUE(CPP, TFJ, TFC, TFS, TFH, ASM) \
  BUILTIN_LIST_INTL(CPP, TFJ, TFS)                       \
  BUILTIN_LIST_BYTECODE_HANDLERS(BCH)

This paper does not cover the grammar of builtins; please reach out to me if you want to know more.


Unfortunately, we can only see assembly code when debugging builtins, because the implementations of builtins are separated from V8 and stored in the snapshot_blob.bin file, which does not include a symbol table for them.


InterpreterEntryTrampoline is the starting point for executing bytecode, so let’s talk about how to debug bytecode.

enum class Builtin : int32_t {
  kNoBuiltinId = -1,
#define DEF_ENUM(Name, ...) k##Name,
  BUILTIN_LIST(DEF_ENUM, DEF_ENUM, DEF_ENUM, DEF_ENUM, DEF_ENUM, DEF_ENUM,
               DEF_ENUM)
#undef DEF_ENUM
#define EXTRACT_NAME(Name, ...) k##Name,
  // Define kFirstBytecodeHandler,
  kFirstBytecodeHandler =
      FirstFromVarArgs(BUILTIN_LIST_BYTECODE_HANDLERS(EXTRACT_NAME) 0)
#undef EXTRACT_NAME
};

In the above enumeration class, each builtin has an index. From the expansion order of the BUILTIN_LIST macro template, we can calculate InterpreterEntryTrampoline’s index and find the corresponding element in the isolate->isolate_data_.builtins_ array, as shown in Figure 1.

This element holds the memory address of the InterpreterEntryTrampoline. With this address, we can use memory breakpoints for debugging.
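
As a minimal sketch (assuming we are building inside the V8 source tree so that the internal Builtin enum shown above is visible; the include path is an assumption), the index can be computed directly from the enum, because the enum order mirrors BUILTIN_LIST:

// index_sketch.cc -- minimal sketch; assumes it is compiled inside the V8
// source tree so the internal Builtin enum shown above is available.
#include <cstdio>

#include "src/builtins/builtins.h"  // assumption: header that defines Builtin

int main() {
  // The builtin's index is simply its position in the Builtin enum,
  // which follows the expansion order of BUILTIN_LIST.
  int index =
      static_cast<int>(v8::internal::Builtin::kInterpreterEntryTrampoline);
  // In a debugger, isolate->isolate_data_.builtins_[index] (see Figure 1)
  // holds the entry address of InterpreterEntryTrampoline, and a memory
  // breakpoint can be set on that address.
  std::printf("InterpreterEntryTrampoline index: %d\n", index);
  return 0;
}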


Another way is to debug from i::Execution::Call(), which also goes into InterpreterEntryTrampoline eventually.

2. Ignition

Ignition is the V8 interpreter, responsible for executing bytecode: its input is a bytecode array, and its output is the result of the program. Here are some important concepts:

  • bytecode handler: the routine that implements a bytecode; there is one handler for each bytecode.
  • bytecode array: the array of bytecodes produced by compiling a piece of JS (usually a JS function).
  • dispatch: the bytecode dispatcher, which plays a role similar to the EIP register.
  • dispatch table: the table that holds the addresses of the bytecode handlers (a simplified sketch of the idea follows this list).
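
Below is a simplified, self-contained illustration of the dispatch-table idea. This is not V8’s actual implementation, just a minimal sketch: each bytecode indexes into a table of handler addresses, and every handler ends by dispatching to the next bytecode.

// dispatch_sketch.cc -- a minimal, hypothetical model of bytecode dispatch
// (not V8 code): each bytecode indexes a table of handlers, and each handler
// ends by dispatching to the next bytecode.
#include <cstdint>
#include <cstdio>
#include <vector>

enum Bytecode : uint8_t { kLdaZero, kReturn };

struct Interpreter;
using Handler = void (*)(Interpreter&);

struct Interpreter {
  std::vector<uint8_t> bytecode_array;  // input: the bytecode array
  size_t offset = 0;                    // dispatch position, similar to an EIP
  int accumulator = -1;
  static Handler dispatch_table[2];     // dispatch table: handler addresses

  void Dispatch() {                     // fetch the next bytecode and jump
    uint8_t next = bytecode_array[offset++];
    dispatch_table[next](*this);
  }
};

void HandleLdaZero(Interpreter& in) {   // loads 0 into the accumulator
  in.accumulator = 0;
  in.Dispatch();
}
void HandleReturn(Interpreter&) {}      // stops dispatching

Handler Interpreter::dispatch_table[2] = {HandleLdaZero, HandleReturn};

int main() {
  Interpreter in{{kLdaZero, kReturn}};  // a "bytecode array" with two bytecodes
  in.Dispatch();
  std::printf("accumulator = %d\n", in.accumulator);  // prints 0
  return 0;
}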

Below is the source code of InterpreterEntryTrampoline.

1.  void Builtins::Generate_InterpreterEntryTrampoline(MacroAssembler* masm) {
2.    Register closure = rdi;
3.    Register feedback_vector = rbx;
4.    // Get the bytecode array from the function object and load it into
5.    // kInterpreterBytecodeArrayRegister.
6.    __ LoadTaggedPointerField(
7.        kScratchRegister,
8.        FieldOperand(closure, JSFunction::kSharedFunctionInfoOffset));
9.    __ LoadTaggedPointerField(
10.        kInterpreterBytecodeArrayRegister,
11.        FieldOperand(kScratchRegister, SharedFunctionInfo::kFunctionDataOffset));
12.    Label is_baseline;
13.    GetSharedFunctionInfoBytecodeOrBaseline(
14.        masm, kInterpreterBytecodeArrayRegister, kScratchRegister, &is_baseline);
15.    // The bytecode array could have been flushed from the shared function info,
16.    // if so, call into CompileLazy.
17.    Label compile_lazy;
18.    __ CmpObjectType(kInterpreterBytecodeArrayRegister, BYTECODE_ARRAY_TYPE,
19.                     kScratchRegister);
20.    __ j(not_equal, &compile_lazy);
21.    // Load the feedback vector from the closure.
22.    __ LoadTaggedPointerField(
23.        feedback_vector, FieldOperand(closure, JSFunction::kFeedbackCellOffset));
24.    __ LoadTaggedPointerField(feedback_vector,
25.                              FieldOperand(feedback_vector, Cell::kValueOffset));
26.  //omit
27.  }

The role of InterpreterEntryTrampoline includes building the stack frame, allocating local variables, and so on.


Figure 2 gives a detailed description.


On the 13th line, GetSharedFunctionInfoBytecodeOrBaseline takes out the bytecode array, and then the first bytecode of that array is called.


There are also some important functions in the Builtins class, as shown below.

1.  class Builtins {
2.  //.......omit.................
3.    static void Generate_CallOrConstructVarargs(MacroAssembler* masm,
4.                                                Handle<Code> code);
5.    static void Generate_CallOrConstructForwardVarargs(MacroAssembler* masm,
6.                                                       CallOrConstructMode mode,
7.                                                       Handle<Code> code);
8.    static void Generate_InterpreterPushArgsThenCallImpl(
9.        MacroAssembler* masm, ConvertReceiverMode receiver_mode,
10.        InterpreterPushArgsMode mode);
11.    static void Generate_InterpreterPushArgsThenConstructImpl(
12.        MacroAssembler* masm, InterpreterPushArgsMode mode);
13.    template <class Descriptor>
14.    static void Generate_DynamicCheckMapsTrampoline(MacroAssembler* masm,
15.                                                    Handle<Code> builtin_target);
16.  #define DECLARE_ASM(Name, ...) \
17.    static void Generate_##Name(MacroAssembler* masm);
18.  #define DECLARE_TF(Name, ...) \
19.    static void Generate_##Name(compiler::CodeAssemblerState* state);
20.    BUILTIN_LIST(IGNORE_BUILTIN, DECLARE_TF, DECLARE_TF, DECLARE_TF, DECLARE_TF,
21.                 IGNORE_BUILTIN, DECLARE_ASM)
22.  //......omit............

Lines 16 through 19: DECLARE_ASM and DECLARE_TF are the templates that declare the Generate_##Name functions for builtins.


Note: bytecode handlers are one kind of builtin, and V8 has many kinds of builtins. Every bytecode handler is a builtin, but not every builtin is a bytecode handler.


The bytecode handler generation template is below.

#define IGNITION_HANDLER(Name, BaseAssembler)                         \
  class Name##Assembler : public BaseAssembler {                      \
   public:                                                            \
    explicit Name##Assembler(compiler::CodeAssemblerState* state,     \
                             Bytecode bytecode, OperandScale scale)   \
        : BaseAssembler(state, bytecode, scale) {}                    \
    Name##Assembler(const Name##Assembler&) = delete;                 \
    Name##Assembler& operator=(const Name##Assembler&) = delete;      \
    static void Generate(compiler::CodeAssemblerState* state,         \
                         OperandScale scale);                         \
                                                                      \
   private:                                                           \
    void GenerateImpl();                                              \
  };                                                                  \
  void Name##Assembler::Generate(compiler::CodeAssemblerState* state, \
                                 OperandScale scale) {                \
    Name##Assembler assembler(state, Bytecode::k##Name, scale);       \
    state->SetInitialDebugInformation(#Name, __FILE__, __LINE__);     \
    assembler.GenerateImpl();                                         \
  }                                                                   \
  void Name##Assembler::GenerateImpl()

//====================separation=================================
// LdaZero
// Load literal '0' into the accumulator.
IGNITION_HANDLER(LdaZero, InterpreterAssembler) {
  TNode<Number> zero_value = NumberConstant(0.0);
  SetAccumulator(zero_value);
  Dispatch();
}

IGNITION_HANDLER is the macro template, Name is the bytecode name, and BaseAssembler is the parent class of the bytecode handler. IGNITION_HANDLER(LdaZero, InterpreterAssembler) generates LdaZero’s handler.
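
To make the macro concrete, here is roughly what IGNITION_HANDLER(LdaZero, InterpreterAssembler) expands to. This is a hand expansion of the macro above with whitespace adjusted, not preprocessor output:

// Approximate hand expansion of IGNITION_HANDLER(LdaZero, InterpreterAssembler).
class LdaZeroAssembler : public InterpreterAssembler {
 public:
  explicit LdaZeroAssembler(compiler::CodeAssemblerState* state,
                            Bytecode bytecode, OperandScale scale)
      : InterpreterAssembler(state, bytecode, scale) {}
  LdaZeroAssembler(const LdaZeroAssembler&) = delete;
  LdaZeroAssembler& operator=(const LdaZeroAssembler&) = delete;
  static void Generate(compiler::CodeAssemblerState* state, OperandScale scale);

 private:
  void GenerateImpl();
};

void LdaZeroAssembler::Generate(compiler::CodeAssemblerState* state,
                                OperandScale scale) {
  LdaZeroAssembler assembler(state, Bytecode::kLdaZero, scale);
  state->SetInitialDebugInformation("LdaZero", __FILE__, __LINE__);
  assembler.GenerateImpl();
}

void LdaZeroAssembler::GenerateImpl() {
  // Load literal '0' into the accumulator, then dispatch to the next bytecode.
  TNode<Number> zero_value = NumberConstant(0.0);
  SetAccumulator(zero_value);
  Dispatch();
}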


The source code of Dispatch() is below.

1.  void InterpreterAssembler::Dispatch() {
2.    Comment("========= Dispatch");
3.    DCHECK_IMPLIES(Bytecodes::MakesCallAlongCriticalPath(bytecode_), made_call_);
4.    TNode<IntPtrT> target_offset = Advance();
5.    TNode<WordT> target_bytecode = LoadBytecode(target_offset);
6.    DispatchToBytecodeWithOptionalStarLookahead(target_bytecode);
7.  }
8.  void InterpreterAssembler::DispatchToBytecodeWithOptionalStarLookahead(
9.      TNode<WordT> target_bytecode) {
10.    if (Bytecodes::IsStarLookahead(bytecode_, operand_scale_)) {
11.      StarDispatchLookahead(target_bytecode);
12.    }
13.    DispatchToBytecode(target_bytecode, BytecodeOffset());
14.  }

On line 5, LoadBytecode fetches the next bytecode. On line 8, DispatchToBytecodeWithOptionalStarLookahead jumps to that bytecode’s handler.


The entry for generating bytecode handler is below.

Handle<Code> GenerateBytecodeHandler(Isolate* isolate, const char* debug_name,
                                     Bytecode bytecode,
                                     OperandScale operand_scale,
                                     Builtin builtin,
                                     const AssemblerOptions& options) {
  Zone zone(isolate->allocator(), ZONE_NAME, kCompressGraphZone);
  compiler::CodeAssemblerState state(
      isolate, &zone, InterpreterDispatchDescriptor{},
      CodeKind::BYTECODE_HANDLER, debug_name,
      builtin);

  switch (bytecode) {
#define CALL_GENERATOR(Name, ...)                     \
  case Bytecode::k##Name:                             \
    Name##Assembler::Generate(&state, operand_scale); \
    break;
    BYTECODE_LIST_WITH_UNIQUE_HANDLERS(CALL_GENERATOR);
#undef CALL_GENERATOR
    case Bytecode::kIllegal:
      IllegalAssembler::Generate(&state, operand_scale);
      break;
    case Bytecode::kStar0:
      Star0Assembler::Generate(&state, operand_scale);
      break;
    default:
      UNREACHABLE();
  }
//............omit
}

GenerateBytecodeHandler is the entry point; it is responsible for calling the Generate functions produced by the IGNITION_HANDLER(XXX, YYY) macro template to generate all the bytecode handlers.
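
For a single bytecode, the CALL_GENERATOR macro in the switch above expands (by hand, for illustration) to:

// Approximate hand expansion of CALL_GENERATOR for one bytecode:
case Bytecode::kLdaZero:
  LdaZeroAssembler::Generate(&state, operand_scale);
  break;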


Okay, that wraps it up for this share. I’ll see you guys next time, take care!


Please reach out to me if you have any issues.


WeChat: qq9123013 Email: [email protected]


Also published here.