因优化而导致的溢出与CVE-2020-16040

Chrome < 87.0.4280.88
issue：1150649
本系列文章的项目仓库：

限于笔者个人水平，如果您发现了本文的描述中存在任何错误，都请与我联系，我会尽快修正。

正文

优化阶段

通过 turbolizer ，我们可以很容易的分析出不同阶段得到的结点海。而不同的阶段有不同的优化规则，主要包括如下几个阶段：

BytecodeGraphBuilder：将字节码转换为图
InliningPhase：内联函数展开
TyperPhase：确定结点类型与范围
TypedLoweringPhase：将 JavaScript Node转换成中间的Simplified Node或者是Common Node
LoadEliminationPhase：消除多余的 Load 内存读取操作
EscapeAnalysisPhase：逃逸分析，标记结点是否逃逸并进行修饰
SimplifiedLoweringPhase：将 Simplified Node 节点降级到更具体的 Machine Node
GenericLoweringPhase：将 JS Node 的运算符降低到Runtime，Builtins，IC调用级别

在新版本中还有更多的阶段，并且每个阶段名前添加了 TF 前缀，不过功能本身没有太大改变，并不影响我们研究其功能原理。上述的几个阶段就是比较主要的部分，而我们主要研究 TypedLoweringPhase 和 SimplifiedLoweringPhase 这两个会对结点进行降级的阶段。同时，我们的重点也会放在与 CVE-2020-16040 有关的 SimplifiedLoweringPhase 上。

不过由于本文主要涉及到了源代码层面的分析，笔者相信您在读完本文以后也具有对其他阶段源代码进行解读的能力了。

TypedLoweringPhase

通过调用 TypedLoweringPhase::run 来实现优化，我们先将其函数实现摘出。

您无需立刻开始阅读它，摘出代码的目的是用以对照文章的叙述。

         
       void Run(PipelineData
       * 
       data, Zone
       * 
       temp_zone) {
      
       GraphReducer graph_reducer(
      
       temp_zone, data
       -
       >graph(), &data
       -
       >info()
       -
       >tick_counter(), data
       -
       >broker(),
      
       data
       -
       >jsgraph()
       -
       >Dead(), data
       -
       >observe_node_manager());
      
       DeadCodeElimination dead_code_elimination(&graph_reducer, data
       -
       >graph(),
      
       data
       -
       >common(), temp_zone);
      
       JSCreateLowering create_lowering(&graph_reducer, data
       -
       >dependencies(),
      
       data
       -
       >jsgraph(), data
       -
       >broker(),
      
       temp_zone);
      
       JSTypedLowering typed_lowering(&graph_reducer, data
       -
       >jsgraph(),
      
       data
       -
       >broker(), temp_zone);
      
       ConstantFoldingReducer constant_folding_reducer(
      
       &graph_reducer, data
       -
       >jsgraph(), data
       -
       >broker());
      
       TypedOptimization typed_optimization(&graph_reducer, data
       -
       >dependencies(),
      
       data
       -
       >jsgraph(), data
       -
       >broker());
      
       SimplifiedOperatorReducer simple_reducer(
      
       &graph_reducer, data
       -
       >jsgraph(), data
       -
       >broker(), BranchSemantics::kJS);
      
       CheckpointElimination checkpoint_elimination(&graph_reducer);
      
       CommonOperatorReducer common_reducer(
      
       &graph_reducer, data
       -
       >graph(), data
       -
       >broker(), data
       -
       >common(),
      
       data
       -
       >machine(), temp_zone, BranchSemantics::kJS);
      
       AddReducer(data, &graph_reducer, &dead_code_elimination);
      
       AddReducer(data, &graph_reducer, &create_lowering);
      
       AddReducer(data, &graph_reducer, &constant_folding_reducer);
      
       AddReducer(data, &graph_reducer, &typed_lowering);
      
       AddReducer(data, &graph_reducer, &typed_optimization);
      
       AddReducer(data, &graph_reducer, &simple_reducer);
      
       AddReducer(data, &graph_reducer, &checkpoint_elimination);
      
       AddReducer(data, &graph_reducer, &common_reducer);
      
       /
       / 
       ConstantFoldingReducer, JSCreateLowering, JSTypedLowering, 
       and
      
       /
       / 
       TypedOptimization access the heap.
      
       UnparkedScopeIfNeeded scope(data
       -
       >broker());
      
       graph_reducer.ReduceGraph();
      
       }
      
       };

通过最后的 AddReducer 能够总结出该阶段总共经历了如下几个 Reducer：

DeadCodeElimination：消除死结点
JSCreateLowering：JSCreate类的Node节点展开成更低级的Node
ConstantFoldingReducer：常量折叠
JSTypedLowering：JavaScript Node降低为更低级的Node
TypedOptimization：TyperPhase阶段计算出来的类型对 Speculative 等类型的Node优化
SimplifiedOperatorReducer：SIMPLIFIED 优化成 Machine Node
CheckpointElimination：消除 CheckPoint 结点
CommonOperatorReducer：CONTROL_OP_LIST Node的节点优化

我们以 TypedOptimization 为例来分析一下该阶段的工作：

       Reduction TypedOptimization::
       Reduce
       (Node
       * 
       node) {
      
       switch (node
       -
       >opcode()) {
      
       case IrOpcode::kConvertReceiver:
      
       return 
       ReduceConvertReceiver(node);
      
       case IrOpcode::kMaybeGrowFastElements:
      
       return 
       ReduceMaybeGrowFastElements(node);
      
       case IrOpcode::kCheckHeapObject:
      
       return 
       ReduceCheckHeapObject(node);
      
       case IrOpcode::kCheckBounds:
      
       return 
       ReduceCheckBounds(node);
      
       case IrOpcode::kCheckNotTaggedHole:
      
       return 
       ReduceCheckNotTaggedHole(node);
      
       case IrOpcode::kCheckMaps:
      
       return 
       ReduceCheckMaps(node);
      
       case IrOpcode::kCheckNumber:
      
       return 
       ReduceCheckNumber(node);
      
       case IrOpcode::kCheckString:
      
       ······
      
       case IrOpcode::kSpeculativeNumberAdd:
      
       return 
       ReduceSpeculativeNumberAdd(node);
      
       ······
      
       }
      
       return 
       NoChange();
      
       }

我们以之前提到过的 kSpeculativeNumberAdd 为例跟踪一下其优化过程。

       Reduction TypedOptimization::ReduceSpeculativeNumberAdd(Node
       * 
       node) {
      
       Node
       * 
       const lhs 
       = 
       NodeProperties::GetValueInput(node, 
       0
       );
      
       Node
       * 
       const rhs 
       = 
       NodeProperties::GetValueInput(node, 
       1
       );
      
       Type 
       const lhs_type 
       = 
       NodeProperties::GetType(lhs);
      
       Type 
       const rhs_type 
       = 
       NodeProperties::GetType(rhs);
      
       NumberOperationHint hint 
       = 
       NumberOperationHintOf(node
       -
       >op());
      
       if 
       ((hint 
       =
       = 
       NumberOperationHint::kNumber ||
      
       hint 
       =
       = 
       NumberOperationHint::kNumberOrOddball) &&
      
       BothAre(lhs_type, rhs_type, 
       Type
       ::PlainPrimitive()) &&
      
       NeitherCanBe(lhs_type, rhs_type, 
       Type
       ::StringOrReceiver())) {
      
       /
       / 
       SpeculativeNumberAdd(x:
       -
       string, y:
       -
       string) 
       =
       >
      
       /
       /     
       NumberAdd(ToNumber(x), ToNumber(y))
      
       Node
       * 
       const toNum_lhs 
       = 
       ConvertPlainPrimitiveToNumber(lhs);
      
       Node
       * 
       const toNum_rhs 
       = 
       ConvertPlainPrimitiveToNumber(rhs);
      
       Node
       * 
       const value 
       =
      
       graph()
       -
       >NewNode(simplified()
       -
       >NumberAdd(), toNum_lhs, toNum_rhs);
      
       ReplaceWithValue(node, value);
      
       return 
       Replace(value);
      
       }
      
       return 
       NoChange();
      
       }

注释部分很好的说明了该结点所做的可能的优化，在符号的左右值都符合条件时，它会将 SpeculativeNumberAdd 以 NumberAdd 替换。

基本上每一个 Reducer 都在做相似的工作。它们获取结点的类型，然后通过 switch 跳转到对应的分支尝试优化；如果当前结点并不应该在这个 Reducer 进行优化，那么就会返回 NoChange() ，然后继续向下由其他 Reducer 来尝试优化。

SimplifiedLoweringPhase

本文的重点放在了 SimplifiedLoweringPhase 阶段，它的执行函数如下：

       void Run(SimplifiedLowering
       * 
       lowering) {
      
       GenerateTraversal();
      
       RunPropagatePhase();
      
       RunRetypePhase();
      
       RunLowerPhase(lowering);
      
       if 
       (verification_enabled()) {
      
       RunVerifyPhase(lowering
       -
       >info_);
      
       }
      
       }

GenerateTraversal：先序遍历结点树，收集结点
RunPropagatePhase：传播 truncation，并设置 restriction_type
RunRetypePhase：根据 restriction_type 为每个 Node 更新 feedback_type类型，最终去判断设置 Node 的 machine representation
RunLowerPhase：降级 Node 节点到 Machine Node

笔者必须在这里提及一下用于标记每个 Node 信息的 NodeInfo ：

       class 
       NodeInfo final {
      
       public:
      
       /
       / 
       Adds new use to the node. Returns true 
       if 
       something has changed
      
       /
       / 
       and 
       the 
      
       node has to be requeued.
      
       ······
      
       /
       / 
       Helpers 
       for 
       feedback typing.
      
       void set_feedback_type(
       Type 
       type
       ) { feedback_type_ 
       = 
       type
       ; }
      
       Type 
       feedback_type() const { 
       return 
       feedback_type_; }
      
       void set_weakened() { weakened_ 
       = 
       true; }
      
       bool 
       weakened() const { 
       return 
       weakened_; }
      
       void set_restriction_type(
       Type 
       type
       ) { restriction_type_ 
       = 
       type
       ; }
      
       Type 
       restriction_type() const { 
       return 
       restriction_type_; }
      
       private:
      
       /
       / 
       Fields are ordered to avoid mixing byte 
       and 
       word size fields to minimize
      
       /
       / 
       padding.
      
       enum State : uint8_t { kUnvisited, kPushed, kVisited, kQueued };
      
       State state_ 
       = 
       kUnvisited;
      
       MachineRepresentation representation_ 
       =
      
       MachineRepresentation::kNone;             
       /
       / 
       Output representation.
      
       Truncation truncation_ 
       = 
       Truncation::
       None
       ();  
       /
       / 
       Information about uses.
      
       bool 
       weakened_ 
       = 
       false;
      
       Type 
       restriction_type_ 
       = 
       Type
       ::
       Any
       ();
      
       Type 
       feedback_type_;
      
       };

笔者简略掉了大部分成员函数，我们主要关注那些成员变量。

truncation_：意为截断信息，一个简单的例子是：在返回值为 32bit 时，它需要将 64bit 的计算结果截断后返回给用户
restrictiontype ：用于在重新获取类型(RunRetypePhase)时设置 feedback_type
feedbacktype：用于计算每个 Node 最终的类型表示
representation_：每个 Node 最终的类型表示

由于 GenerateTraversal 并不是我们需要关心的内容，因此本文不对该函数的算法实现做分析，笔者打算从 RunPropagatePhase 开始。

RunPropagatePhase

       void RunPropagatePhase() {
      
       TRACE(
       "--{Propagate phase}--\n"
       );
      
       ResetNodeInfoState();
      
       DCHECK(revisit_queue_.empty());
      
       /
       / 
       Process nodes 
       in 
       reverse post order, with End as the root.
      
       for 
       (auto it 
       = 
       traversal_nodes_.crbegin(); it !
       = 
       traversal_nodes_.crend();
      
       +
       +
       it) {
      
       PropagateTruncation(
       *
       it);
      
       while 
       (!revisit_queue_.empty()) {
      
       Node
       * 
       node 
       = 
       revisit_queue_.front();
      
       revisit_queue_.pop();
      
       PropagateTruncation(node);
      
       }
      
       }
      
       }

函数体不是很大，它首先通过 ResetNodeInfoState 重置了每个结点的 NodeInfo 中的 state_ 标志位。

接下来，函数遍历每个节点，并对其调用 PropagateTruncation ，即传播 Truncation：

       void PropagateTruncation(Node
       * 
       node) {
      

         
       NodeInfo
       * 
       info 
       = 
       GetInfo(node);
      

         
       info
       -
       >set_visited();
      

         
       TRACE(
       " visit #%d: %s (trunc: %s)\n"
       , node
       -
       >
       id
       (), node
       -
       >op()
       -
       >mnemonic(),
      

               
       info
       -
       >truncation().description());
      

         
       VisitNode<PROPAGATE>(node, info
       -
       >truncation(), nullptr);
      

       }
      

接下来它会对当前结点调用 VisitNode<PROPAGATE> 模板函数，从而调用一系列的 Visit 函数：

       void VisitNode(Node
       * 
       node, Truncation truncation,
      

                      
       SimplifiedLowering
       * 
       lowering) {
      

         
       tick_counter_
       -
       >TickAndMaybeEnterSafepoint();
      

         
       ······
      

         
       switch (node
       -
       >opcode()) {
      

           
       /
       /
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
      

           
       /
       / 
       Common operators.
      

           
       /
       /
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
       -
      

           
       case IrOpcode::kStart:
      

             
       /
       / 
       We use Start as a terminator 
       for 
       the frame state chain, so even
      

             
       /
       / 
       tho Start doesn't really produce a value, we have to say Tagged
      

             
       /
       / 
       here, otherwise the 
       input 
       conversion will fail.
      

             
       return 
       VisitLeaf<T>(node, MachineRepresentation::kTagged);
      

           
       case IrOpcode::kParameter:
      

             
       return 
       VisitUnop<T>(node, UseInfo::
       None
       (),
      

                                 
       linkage()
      

                                     
       -
       >GetParameterType(ParameterIndexOf(node
       -
       >op()))
      

                                     
       .representation());
      

           
       case IrOpcode::kInt32Constant:
      

             
       return 
       VisitLeaf<T>(node, MachineRepresentation::kWord32);
      

           
       case IrOpcode::kInt64Constant:
      

             
       return 
       VisitLeaf<T>(node, MachineRepresentation::kWord64);
      

           
       case IrOpcode::kExternalConstant:
      

             
       return 
       VisitLeaf<T>(node, MachineType::PointerRepresentation());
      

           
       case IrOpcode::kNumberConstant: {
      

             
       double const value 
       = 
       OpParameter<double>(node
       -
       >op());
      

             
       int 
       value_as_int;
      

             
       if 
       (DoubleToSmiInteger(value, &value_as_int)) {
      

               
       VisitLeaf<T>(node, MachineRepresentation::kTaggedSigned);
      

               
       if 
       (lower<T>()) {
      

                 
       intptr_t smi 
       = 
       base::bit_cast<intptr_t>(Smi::FromInt(value_as_int));
      

                 
       Node
       * 
       constant 
       = 
       InsertTypeOverrideForVerifier(
      

                     
       NodeProperties::GetType(node),
      

                     
       lowering
       -
       >jsgraph()
       -
       >IntPtrConstant(smi));
      

                 
       DeferReplacement(node, constant);
      

               
       }
      

               
       return
       ;
      

             
       }
      

             
       VisitLeaf<T>(node, MachineRepresentation::kTagged);
      

             
       return
       ;
      

           
       }
      

           
       case IrOpcode::kHeapConstant:
      

           
       case IrOpcode::kDelayedStringConstant:
      

             
       return 
       VisitLeaf<T>(node, MachineRepresentation::kTaggedPointer);
      

         
       ······
      

以 kNumberConstant 为例，首先获取其对应的值，如果其为整数，则调用 VisitLeaf 将 restriction_type_ 置为 kTaggedSigned ，如果 T==LOWER ，则将值转为 intptr_t 类型指针，将原结点替换为 IntPtrConstant；否则置为 kTagged ：

具体实现通过调用 SetOutput 完成

       void RepresentationSelector::SetOutput<PROPAGATE>(
      
       Node
       * 
       node, MachineRepresentation representation, 
       Type 
       restriction_type) {
      
       NodeInfo
       * 
       const info 
       = 
       GetInfo(node);
      
       info
       -
       >set_restriction_type(restriction_type);
      
       }

常数类型会显得较为简单，笔者这里又选了一个稍微复杂一点的类型 kNumberEqual ：

       case IrOpcode::kNumberEqual: {
      
       Type 
       const lhs_type 
       = 
       TypeOf(node
       -
       >InputAt(
       0
       ));
      
       Type 
       const rhs_type 
       = 
       TypeOf(node
       -
       >InputAt(
       1
       ));
      
       if 
       ((lhs_type.Is(
       Type
       ::Unsigned32OrMinusZero()) &&
      
       rhs_type.Is(
       Type
       ::Unsigned32OrMinusZero())) ||
      
       (lhs_type.Is(
       Type
       ::Unsigned32OrMinusZeroOrNaN()) &&
      
       rhs_type.Is(
       Type
       ::Unsigned32OrMinusZeroOrNaN()) &&
      
       OneInputCannotBe(node, type_cache_
       -
       >kZeroish))) {
      
       /
       / 
       =
       > unsigned Int32Cmp
      
       VisitBinop<T>(node, UseInfo::TruncatingWord32(),
      
       MachineRepresentation::kBit);
      
       if 
       (lower<T>()) ChangeOp(node, Uint32Op(node));
      
       return
       ;
      
       }
      
       if 
       ((lhs_type.Is(
       Type
       ::Signed32OrMinusZero()) &&
      
       rhs_type.Is(
       Type
       ::Signed32OrMinusZero())) ||
      
       (lhs_type.Is(
       Type
       ::Signed32OrMinusZeroOrNaN()) &&
      
       rhs_type.Is(
       Type
       ::Signed32OrMinusZeroOrNaN()) &&
      
       OneInputCannotBe(node, type_cache_
       -
       >kZeroish))) {
      
       /
       / 
       =
       > signed Int32Cmp
      
       VisitBinop<T>(node, UseInfo::TruncatingWord32(),
      
       MachineRepresentation::kBit);
      
       if 
       (lower<T>()) ChangeOp(node, Int32Op(node));
      
       return
       ;
      
       }
      
       /
       / 
       =
       > Float64Cmp
      
       VisitBinop<T>(node, UseInfo::TruncatingFloat64(kIdentifyZeros),
      
       MachineRepresentation::kBit);
      
       if 
       (lower<T>()) ChangeOp(node, Float64Op(node));
      
       return
       ;
      
       }

它首先判断二值的比较类型，从而决定 UseInfo 参数的截断类型，然后调用 VisitBinop ：

       void VisitBinop(Node
       * 
       node, UseInfo left_use, UseInfo right_use,
      
       MachineRepresentation output,
      
       Type 
       restriction_type 
       = 
       Type
       ::
       Any
       ()) {
      
       DCHECK_EQ(
       2
       , node
       -
       >op()
       -
       >ValueInputCount());
      
       ProcessInput<T>(node, 
       0
       , left_use);
      
       ProcessInput<T>(node, 
       1
       , right_use);
      
       for 
       (
       int 
       i 
       = 
       2
       ; i < node
       -
       >InputCount(); i
       +
       +
       ) {
      
       EnqueueInput<T>(node, i);
      
       }
      
       SetOutput<T>(node, output, restriction_type);
      
       }

ProcessInput 是对其两个比较数结点进行处理，其最终会向下调用到 EnqueueInput 函数：

       template <>
      
       void RepresentationSelector::EnqueueInput<PROPAGATE>(Node
       * 
       use_node, 
       int 
       index,
      
       UseInfo use_info) {
      
       Node
       * 
       node 
       = 
       use_node
       -
       >InputAt(index);
      
       NodeInfo
       * 
       info 
       = 
       GetInfo(node);
      
       if 
       (info
       -
       >unvisited()) {
      
       info
       -
       >AddUse(use_info);
      
       TRACE(
       "  initial #%i: %s\n"
       , node
       -
       >
       id
       (), info
       -
       >truncation().description());
      
       return
       ;
      
       }
      
       TRACE(
       "   queue #%i?: %s\n"
       , node
       -
       >
       id
       (), info
       -
       >truncation().description());
      
       if 
       (info
       -
       >AddUse(use_info)) {
      
       /
       / 
       New usage information 
       for 
       the node 
       is 
       available.
      
       if 
       (!info
       -
       >queued()) {
      
       DCHECK(info
       -
       >visited());
      
       revisit_queue_.push(node);
      
       info
       -
       >set_queued();
      
       TRACE(
       "   added: %s\n"
       , info
       -
       >truncation().description());
      
       } 
       else 
       {
      
       TRACE(
       " inqueue: %s\n"
       , info
       -
       >truncation().description());
      
       }
      
       }
      
       }

通过调用成员函数 AddUse 修改其 truncation_ 为对应的类型。

在执行代码时，通过添加参数 --trace-representation 可以打印出该阶段的截断信息。

以下述示范代码为例：

       function foo(x)
      

       {
      

           
       var y
       =
       (x
       =
       =
       0
       )?
       5
       :
       10
       ;
      

           
       var z
       =
       y
       +
       1
       ;
      

           
       return 
       z;
      

       }
      

       %
       PrepareFunctionForOptimization(foo);
      

       foo(
       0
       );
      

       %
       OptimizeFunctionOnNextCall(foo);
      

       foo(
       1
       );
      

       $ .
       /
       d8 temp.js 
       -
       -
       allow
       -
       natives
       -
       syntax 
       -
       -
       trace
       -
       turbo 
       -
       -
       trace
       -
       representation
      
       -
       -
       {Propagate phase}
       -
       -
      
       visit 
       #32: Return (trunc: no-value-use)
      
       initial 
       #14: truncate-to-word32
      
       initial 
       #31: no-truncation (but distinguish zeros)
      
       initial 
       #31: no-truncation (but distinguish zeros)
      
       initial 
       #25: no-value-use
      
       visit 
       #31: SpeculativeSafeIntegerAdd (trunc: no-truncation (but distinguish zeros))
      
       initial 
       #27: truncate-to-word32
      
       initial 
       #30: truncate-to-word32
      
       initial 
       #17: no-value-use
      
       initial 
       #25: no-value-use
      
       visit 
       #30: NumberConstant (trunc: truncate-to-word32)
      
       ······

在开始阶段，通过 Log 我们可以看到，27 和 30 号结点它们的相关值被置为 truncate-to-word32 。
当遍历到 31 号结点时，优化器发现此时在进行两个 32bit 有符号数对比，因此调用

1	`VisitBinop<T>(node, UseInfo::TruncatingWord32(),MachineRepresentation::kBit);`

于是在 ProcessInput 函数中将 30 和 27 号结点标记为 truncate-to-word32 。其他几个也以此类推。

在完成优化以后的节点海如下：

可以发现，它们已经被优化为了 Int32Constant 类型。在上一篇中曾提到过，NumberConstant 存储的值为 64bit double 类型，而现在发生截断，使其被优化为 32bit int 类型。

对于较小的数来说，它会保存为 SMI 类型，这并不会用到 double
具体来说，SMI 类型只用于表示 31bit 的整数
不过这还只是从 RunPropagatePhase 得出的结论，似乎还并不能解决所有的问题。

RunRetypePhase

该阶段与上一个阶段相似，重置 state 后对每个节点调用 RetypeNode ：

       bool 
       RetypeNode(Node
       * 
       node) {
      

         
       NodeInfo
       * 
       info 
       = 
       GetInfo(node);
      

         
       info
       -
       >set_visited();
      

         
       bool 
       updated 
       = 
       UpdateFeedbackType(node);
      

         
       TRACE(
       " visit #%d: %s\n"
       , node
       -
       >
       id
       (), node
       -
       >op()
       -
       >mnemonic());
      

         
       VisitNode<RETYPE>(node, info
       -
       >truncation(), nullptr);
      

         
       TRACE(
       "  ==> output %s\n"
       , MachineReprToString(info
       -
       >representation()));
      

         
       return 
       updated;
      

       }
      

UpdateFeedbackType 函数会根据之前设定的 restriction_type_ 标记来更新 feedback_type_ ，代码实现仍为一个 switch，因此这里不展开细讲。

最后通过 SetOutput<RETYPE> 完成设置，类似这样：

       SetOutput<T>(
      
       node, MachineRepresentation::kWord64,
      
       is_asuintn ? 
       Type
       ::UnsignedBigInt64() : 
       Type
       ::SignedBigInt64());

       template <>
      
       void RepresentationSelector::SetOutput<RETYPE>(
      
       Node
       * 
       node, MachineRepresentation representation, 
       Type 
       restriction_type) {
      
       NodeInfo
       * 
       const info 
       = 
       GetInfo(node);
      
       DCHECK(restriction_type.Is(info
       -
       >restriction_type()));
      
       info
       -
       >set_output(representation);
      
       }

--trace-representation 参数也会打印 Retype phase 阶段得到的结果：

       -
       -
       {Retype phase}
       -
       -
      
       visit 
       #25: Merge
      
       =
       =
       > output kRepTagged
      
       #27:Phi[kRepTagged](#24:NumberConstant, #26:NumberConstant, #25:Merge)  [Static type: Range(5, 10)]
      
       visit 
       #27: Phi
      
       =
       =
       > output kRepWord32
      
       visit 
       #30: NumberConstant
      
       =
       =
       > output kRepTaggedSigned
      
       #31:SpeculativeSafeIntegerAdd[SignedSmall](#27:Phi, #30:NumberConstant, #17:SpeculativeNumberEqual, #25:Merge)  [Static type: Range(6, 11)]
      
       visit 
       #31: SpeculativeSafeIntegerAdd
      
       =
       =
       > output kRepWord32
      
       visit 
       #32: Return
      
       =
       =
       > output kRepTagged

RunLowerPhase

根据之前设定好的 feedback_type_ 决定每个结点应该如何优化。例如在 ReType phase 阶段，我们能够得到几行特殊的 log：

       #27:Phi[kRepTagged](#24:NumberConstant, #26:NumberConstant, #25:Merge)  [Static type: Range(5, 10)]
      
       visit 
       #30: NumberConstant
      
       =
       =
       > output kRepTaggedSigned
      
       #31:SpeculativeSafeIntegerAdd[SignedSmall](#27:Phi, #30:NumberConstant, #17:SpeculativeNumberEqual, #25:Merge)  [Static type: Range(6, 11)]

我们对照 Lower phase 打印出来的 Log，跟踪这几个被标记出来的结点：

       -
       -
       {Lower phase}
       -
       -
      
       visit 
       #27: Phi
      
       change: 
       #27:Phi(@0 #24:NumberConstant) from kRepTaggedSigned to kRepWord32:truncate-to-word32
      
       change: 
       #27:Phi(@1 #26:NumberConstant) from kRepTaggedSigned to kRepWord32:truncate-to-word32
      
       visit 
       #30: NumberConstant
      
       defer replacement 
       #30:NumberConstant with #49:Int64Constant
      
       visit 
       #31: SpeculativeSafeIntegerAdd
      
       change: 
       #31:SpeculativeSafeIntegerAdd(@1 #30:NumberConstant) from kRepTaggedSigned to kRepWord32:truncate-to-word32

27 号 Phi 结点标记出了 24 和 26 号结点，将 truncation 传播到这两个结点，使得其被优化为 Int32Constant ，而 kRepTaggedSigned 则让 30 号被替换为 Int64Constant ，31 号结点也是如此。

实例 CVE-2020-16040

大致过了一遍 TypedLoweringPhase 和 SimplifiedLoweringPhase 以后，接下来来看看具体是什么样的情况下有可能让这个优化过程出现漏洞。

首先给出 POC：

       function foo(a) {  
      
       var y 
       = 
       0x7fffffff
       ;  
      
       if 
       (a 
       =
       = 
       NaN) y 
       = 
       NaN;
      
       if 
       (a) y 
       = 
       -
       1
       ;
      
       let z 
       = 
       (y 
       + 
       1
       ) | 
       0
       ;  
      
       console.log(z.toString(
       16
       ));
      
       return 
       z < 
       0
       ;  
      
       }  
      
       %
       PrepareFunctionForOptimization(foo);  
      
       foo(true);  
      
       %
       OptimizeFunctionOnNextCall(foo);  
      
       print
       (foo(false));

只执行 true 分支的目的是防止其通过执行信息获取到 false 分支的实际情况，这样在预测 Range 时从有可能出现错误

如果直接执行，会打印出如下内容：

第二次打印出 -80000000 ，但最终却返回了一个 false ，这就是该漏洞的核心问题。

习惯了 C/C++ 的读者可能注意到 (y + 1) | 0 看起来像是一个无意义的操作，它实际上与 JavaScript 储存数据的方式有关。
先前在 “简单理解 V8 TurboFan” 中提到过，JavaScript 使用 double 类型来储存一般的 Constant，而令其与 0 进行或运算，就能够主动触发类型转换，使得浮点数表示被转为整数表示。
如果原操作数真的是浮点数，那么这个操作就会舍弃小数点以后的数，然后转为整数。
因此会有这样一个事实： 0x80000000 | 0 == -0x80000000
因为它被截断为 32bit int 后就是对应了 -0x80000000 的二进制表示。

接下来我们导出该 poc 的结点海分阶段进行分析：

#45 Word32Or 导出的结果应为 -0x80000000 ，而接下来 Uint32LessThan 是无符号数比较，因此它返回了错误的结果。

我们从 EscapeAnalysisPhase 开始，跟踪一下其优化的过程。此时的结点海如下：

通过 --trace-representation 把遍历过程打印出来，这里笔者筛出了几个代表性的结点：

       -
       -
       {Propagate phase}
       -
       -
      
       visit 
       #55: NumberLessThan (trunc: no-truncation (but distinguish zeros))
      
       initial 
       #45: truncate-to-word32
      
       initial 
       #44: truncate-to-word32
      
       visit 
       #45: SpeculativeNumberBitwiseOr (trunc: truncate-to-word32)
      
       initial 
       #43: truncate-to-word32
      
       initial 
       #44: truncate-to-word32
      
       initial 
       #43: truncate-to-word32
      
       initial 
       #36: no-value-use
      
       visit 
       #43: SpeculativeSafeIntegerAdd (trunc: truncate-to-word32)
      
       initial 
       #39: no-truncation (but identify zeros)
      
       initial 
       #42: no-truncation (but identify zeros)
      
       initial 
       #22: no-value-use
      
       initial 
       #36: no-value-use
      
       -
       -
       {Retype phase}
       -
       -
      
       #43:SpeculativeSafeIntegerAdd[SignedSmall](#39:Phi, #42:NumberConstant, #22:SpeculativeNumberEqual, #36:Merge)  [Static type: Range(0, 2147483648), Feedback type: Range(0, 2147483647)]
      
       visit 
       #43: SpeculativeSafeIntegerAdd
      
       =
       =
       > output kRepWord32
      
       #45:SpeculativeNumberBitwiseOr[SignedSmall](#43:SpeculativeSafeIntegerAdd, #44:NumberConstant, #43:SpeculativeSafeIntegerAdd, #36:Merge)  [Static type: Range(-2147483648, 2147483647), Feedback type: Range(0, 2147483647)]
      
       visit 
       #45: SpeculativeNumberBitwiseOr
      
       =
       =
       > output kRepWord32
      
       visit 
       #55: NumberLessThan
      
       =
       =
       > output kRepBit
      
       -
       -
       {Lower phase}
       -
       -
      
       visit 
       #43: SpeculativeSafeIntegerAdd
      
       change: 
       #43:SpeculativeSafeIntegerAdd(@0 #39:Phi) from kRepFloat64 to kRepWord32:no-truncation (but identify zeros)
      
       change: 
       #43:SpeculativeSafeIntegerAdd(@1 #42:NumberConstant) from kRepTaggedSigned to kRepWord32:no-truncation (but identify zeros)
      
       visit 
       #45: SpeculativeNumberBitwiseOr
      
       change: 
       #45:SpeculativeNumberBitwiseOr(@1 #44:NumberConstant) from kRepTaggedSigned to kRepWord32:truncate-to-word32
      
       visit 
       #55: NumberLessThan
      
       change: 
       #55:NumberLessThan(@1 #44:NumberConstant) from kRepTaggedSigned to kRepWord32:truncate-to-word32

结点海只会显示每个结点的 static type 而不会表示 feedback type

Retype phase

Propagate phase 只做了简单的遍历判断工作，这里并没有太多疑问，我们关注的主要是接下来的 Retype phase

前问提到过，该阶段通过调用 UpdateFeedbackType 来设定每个结点的 feedback_type_ ，具体来说，在它尝试优化时会调用如下内容：

           
       Type 
       input0_type;
      

           
       if 
       (node
       -
       >InputCount() > 
       0
       ) input0_type 
       = 
       FeedbackTypeOf(node
       -
       >InputAt(
       0
       ));
      

           
       Type 
       input1_type;
      

           
       if 
       (node
       -
       >InputCount() > 
       1
       ) input1_type 
       = 
       FeedbackTypeOf(node
       -
       >InputAt(
       1
       ));
      

       ······
      

       new_type 
       = 
       Type
       ::Intersect(op_typer_.Name(input0_type, input1_type), info
       -
       >restriction_type(), graph_zone()); 
      

       ······
      

       GetInfo(node)
       -
       >set_feedback_type(new_type);
      

       }
      

op_typer_.Name(input0_type, input1_type) 会重新计算该结点的两个输入结点得到的 Range，此时就会有我们看到的 static type，即 Range(0,0x80000000)

接下来它调用 info->restriction_type() 获得 restriction_type ，由于在 Propagate phase 时将其设定为 Type::Signed32() ：

1 2	`VisitBinop<T>(node, left_use, right_use, MachineRepresentation::kWord32,` `Type` `::Signed32());`

因此将会取 Range(0,0x80000000) 和 Range(-0x80000000,0x7fffffff) 的交集，于是得到最终的集合是 Range(0,0x7fffffff) 作为它的 feedback_type_ ，而该值会向下传播到 SpeculativeNumberBitwiseOr ，于是就有了上面打印出的 Log 的结果

Lower phase

最终在 RunLowerPhase -> VisitNode<LOWER> 处进行降级的时候：

       case IrOpcode::kNumberLessThan:
      
       case IrOpcode::kNumberLessThanOrEqual: {
      
       Type 
       const lhs_type 
       = 
       TypeOf(node
       -
       >InputAt(
       0
       ));
      
       Type 
       const rhs_type 
       = 
       TypeOf(node
       -
       >InputAt(
       1
       ));
      
       /
       / 
       Regular number comparisons 
       in 
       JavaScript generally identify zeros,
      
       /
       / 
       so we always 
       pass 
       kIdentifyZeros 
       for 
       the inputs, 
       and 
       in 
       addition
      
       /
       / 
       we can truncate 
       -
       0 
       to 
       0 
       for 
       otherwise Unsigned32 
       or 
       Signed32 inputs.
      
       if 
       (lhs_type.Is(
       Type
       ::Unsigned32OrMinusZero()) &&
      
       rhs_type.Is(
       Type
       ::Unsigned32OrMinusZero())) {
      
       /
       / 
       =
       > unsigned Int32Cmp
      
       VisitBinop<T>(node, UseInfo::TruncatingWord32(),
      
       MachineRepresentation::kBit);
      
       if 
       (lower<T>()) ChangeOp(node, Uint32Op(node));
      
       } 
       else 
       if 
       (lhs_type.Is(
       Type
       ::Signed32OrMinusZero()) &&
      
       rhs_type.Is(
       Type
       ::Signed32OrMinusZero())) {
      
       /
       / 
       =
       > signed Int32Cmp
      
       VisitBinop<T>(node, UseInfo::TruncatingWord32(),
      
       MachineRepresentation::kBit);
      
       if 
       (lower<T>()) ChangeOp(node, Int32Op(node));
      
       } 
       else 
       {
      
       /
       / 
       =
       > Float64Cmp
      
       VisitBinop<T>(node, UseInfo::TruncatingFloat64(kIdentifyZeros),
      
       MachineRepresentation::kBit);
      
       if 
       (lower<T>()) ChangeOp(node, Float64Op(node));
      
       }
      
       return
       ;
      
       }

由于 Range(0,0x7fffffff) 符合了 Type::Unsigned32OrMinusZero 的条件，因此调用如下代码将其降级为 Uint32 类型：

1	`if` `(lower<T>()) ChangeOp(node, Uint32Op(node));`

于是就造成了上述的问题，尽管计算值为 -0x80000000 ，但却经过的是 Uint32LessThan，于是返回了 False

总结

事后给出的补丁是这样的：

       @@ 
       -
       1453
       ,
       6 
       +
       1452
       ,
       13 
       @@
      

        
      

            
       Type 
       left_feedback_type 
       = 
       TypeOf(node
       -
       >InputAt(
       0
       ));
      

            
       Type 
       right_feedback_type 
       = 
       TypeOf(node
       -
       >InputAt(
       1
       ));
      

       +
      

       +    
       /
       / 
       Using Signed32 as restriction 
       type 
       amounts to promising there won't be
      

       +    
       /
       / 
       signed overflow. This 
       is 
       incompatible with relying on a Word32
      

       +    
       /
       / 
       truncation 
       in 
       order to skip the overflow check.
      

       +    
       Type 
       const restriction 
       =
      

       +        
       truncation.IsUsedAsWord32() ? 
       Type
       ::
       Any
       () : 
       Type
       ::Signed32();
      

       +
      

            
       /
       / 
       Handle the case when no int32 checks on inputs are necessary (but
      

            
       /
       / 
       an overflow check 
       is 
       needed on the output). Note that we do 
       not
      

            
       /
       / 
       have to do 
       any 
       check 
       if 
       at most one side can be minus zero. For
      

       @@ 
       -
       1466
       ,
       7 
       +
       1472
       ,
       7 
       @@
      

                
       right_upper.Is(
       Type
       ::Signed32OrMinusZero()) &&
      

                
       (left_upper.Is(
       Type
       ::Signed32()) || right_upper.Is(
       Type
       ::Signed32()))) {
      

              
       VisitBinop<T>(node, UseInfo::TruncatingWord32(),
      

       -                    
       MachineRepresentation::kWord32, 
       Type
       ::Signed32());
      

       +                    
       MachineRepresentation::kWord32, restriction);
      

            
       } 
       else 
       {
      

              
       /
       / 
       If the output's truncation 
       is 
       identify
       -
       zeros, we can 
       pass 
       it
      

              
       /
       / 
       along. Moreover, 
       if 
       the operation 
       is 
       addition 
       and 
       we know the
      

       @@ 
       -
       1486
       ,
       8 
       +
       1492
       ,
       9 
       @@
      

              
       UseInfo right_use 
       = 
       CheckedUseInfoAsWord32FromHint(hint, FeedbackSource(),
      

                                                                 
       kIdentifyZeros);
      

              
       VisitBinop<T>(node, left_use, right_use, MachineRepresentation::kWord32,
      

       -                    
       Type
       ::Signed32());
      

       +                    
       restriction);
      

            
       }
      

可以注意到，默认情况下的 restriction 不再是 Type::Signed32()

如果在之前判断截断为 truncate-to-word32 ，则会令参数为 Type::Any() ，其表示任何，和它取交集不会改变原本的范围，因此就不会因为上界超出 Signed32 的预设值导致这个漏洞的出现了。

因此，导致这个漏洞出现的核心原因总结如下：

首先定义整数并通过做简单运算使得其触发截断，将类型缩减为 32bit int ，但保证运算后下界大于 0

由于在 Retype phase 中会对两个范围取交集，因此上界必然小于等于 Signed32 的上界，得到较小的 feedback_type

由于下界为 0 的缘故，接下来其范围必然符合 Type::Unsigned32OrMinusZero 的条件，因此会以无符号数运算进行处理

如果此时运算的上界有可能超出 Signed32，则会因为交集的缘故而不被识别为其他类型，导致最终被认为成无符号运算

其实还有一个小问题没有解释：为什么在默认 restriction_tpye 为 Signed32 的情况下还要先判断 Unsigned32OrMinusZero？
如果原本的下界存在负数，那么 Unsigned32OrMinusZero 将不再适用；而如果原本的下界仅为正数，那么使用 Unsigned32OrMinusZero 就能得到更大的运算范围。
由于 JavaScript 是自动调整数据类型的，这么做能减少数据转换时的开销。

另外，在新的补丁里能找到这样的注释：“Using Signed32 as restriction type amounts to promising there won't be signed overflow.”
显然，此前似乎在这里承诺不会发生溢出，因此默认了 Signed32

关于漏洞利用

在 zer0con2021 上提到的该漏洞利用是通过 array.shift 实现的，但是时至今日，这个利用已经被补丁修复了，因此本文就不再详细赘述具体的利用方式，我们仅尝试分析一下漏洞利用原理。

梳理一下问题：

未经优化的执行不会导致 0x800000 被认为是有符号数
优化以后，在进行运算时有可能导致 0x800000 被识别为无符号数

通过调用 Math.sign() 就能让这个问题出现两个不同的情况：在未进行优化时，该函数返回 1；而经过优化以后，该函数返回 -1

假设一个变量为：

1	`var vul` `=` `0` `-` `Math.sign(x);`

在结点海的推断中，它会推断出变量 vul 的值为 Range(-1,0) ，但其真实值为 Range(0,1) ，因此在构建数组的时候，这个差异绕过了对数组长度的检查，造成了越界读写。

更多【因优化而导致的溢出与CVE-2020-16040】相关视频教程：www.yxfzedu.com

二进制漏洞-因优化而导致的溢出与CVE-2020-16040

因优化而导致的溢出与CVE-2020-16040

正文

优化阶段

TypedLoweringPhase

SimplifiedLoweringPhase

RunPropagatePhase

RunRetypePhase

RunLowerPhase

实例 CVE-2020-16040

Retype phase

Lower phase

总结

关于漏洞利用

相关文章推荐

Markdown Editor

友情链接

课程目录

技术交流QQ群