Java 方法直接调用 vs 单元素循环调用

Posted

技术标签:

【中文标题】Java 方法直接调用 vs 单元素循环调用【英文标题】:Java method direct invocation vs single element loop invocation 【发布时间】:2019-10-20 21:53:57 【问题描述】:

最近我想看看直接在对象上调用方法与在同一对象上调用相同方法的性能差异是什么,如果该对象被添加到单个元素 ArrayList 中并且我们尝试循环该元素 List .

老实说,我的假设是会有小的差异,因为我希望展开单个元素循环,因此调用模拟直接调用。

为了测试我创建了以下简单的 JMH 示例:https://gist.github.com/nikkatsa/69a58c63c1b79b245d700f6465bce7f7

令我惊讶的是,直接在元素上调用该方法似乎快了约 2.5 倍。这是一个很大的数字,因为我的期望完全不同:

Benchmark                                                     Mode  Cnt          Score         Error  Units
SingleElementLoopBenchmark.directInvocation                  thrpt   10  332576972.176 ± 2740616.755  ops/s
SingleElementLoopBenchmark.singleElementLoopInvocation       thrpt   10  130179329.978 ± 1303776.248  ops/s

我在推特上发了一点信息,希望得到一些指示,以了解为什么会发生这种情况。有人告诉我,运行perf 会给我一个线索。所以我确实使用-perf dtraceasm 运行,我得到了以下两个输出,第一个用于直接调用,第二个用于循环调用:

....[Hottest Region 1]..............................................................................
c2, com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub, version 115 (69 bytes)

            0x000000010b73c92c: mov    %rbp,%r9
            0x000000010b73c92f: movzbl 0x94(%r9),%r10d                ;*getfield isDone reexecute=0 rethrow=0 return_oop=0
                                                                      ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@27 (line 123)
                                                                      ; implicit exception: dispatches to 0x000000010b73ca5e
            0x000000010b73c937: test   %r10d,%r10d
            0x000000010b73c93a: jne    0x000000010b73c9da             ;*ifeq reexecute=0 rethrow=0 return_oop=0
                                                                      ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@30 (line 123)
            0x000000010b73c940: mov    $0x1,%ebp
            0x000000010b73c945: data16 data16 nopw 0x0(%rax,%rax,1)   ;*aload_1 reexecute=0 rethrow=0 return_oop=0
                                                                      ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@33 (line 124)
  9.56%  ↗  0x000000010b73c950: mov    0x40(%rsp),%r10
  1.00%  │  0x000000010b73c955: mov    0xc(%r10),%r10d                ;*getfield dispatcher reexecute=0 rethrow=0 return_oop=0
         │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::directInvocation@1 (line 23)
         │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@17 (line 121)
  0.17%  │  0x000000010b73c959: mov    0xc(%r12,%r10,8),%r11d         ; implicit exception: dispatches to 0x000000010b73ca12
 11.18%  │  0x000000010b73c95e: test   %r11d,%r11d
  0.00%  │  0x000000010b73c961: je     0x000000010b73c9c9             ;*invokevirtual performAction reexecute=0 rethrow=0 return_oop=0
         │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invoke@5 (line 40)
         │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::directInvocation@5 (line 23)
         │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@17 (line 121)
 10.69%  │  0x000000010b73c963: mov    %r9,(%rsp)
  0.65%  │  0x000000010b73c967: mov    0x38(%rsp),%rsi
  0.00%  │  0x000000010b73c96c: mov    $0x1,%edx
  0.14%  │  0x000000010b73c971: xchg   %ax,%ax
 10.08%  │  0x000000010b73c973: callq  0x000000010b6c2900             ; ImmutableOopMap[48]=Oop [56]=Oop [64]=Oop [0]=Oop 
         │                                                            ;*invokevirtual consume reexecute=0 rethrow=0 return_oop=0
         │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Listener::performAction@2 (line 53)
         │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invoke@5 (line 40)
         │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::directInvocation@5 (line 23)
         │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@17 (line 121)
         │                                                            ;   optimized virtual_call
  1.44%  │  0x000000010b73c978: mov    (%rsp),%r9
  0.19%  │  0x000000010b73c97c: movzbl 0x94(%r9),%r8d                 ;*ifeq reexecute=0 rethrow=0 return_oop=0
         │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@30 (line 123)
  9.77%  │  0x000000010b73c984: mov    0x108(%r15),%r10
  0.99%  │  0x000000010b73c98b: add    $0x1,%rbp                      ; ImmutableOopMapr9=Oop [48]=Oop [56]=Oop [64]=Oop 
         │                                                            ;*ifeq reexecute=1 rethrow=0 return_oop=0
         │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@30 (line 123)
  0.02%  │  0x000000010b73c98f: test   %eax,(%r10)                    ;   poll
  0.28%  │  0x000000010b73c992: test   %r8d,%r8d
  0.00%  ╰  0x000000010b73c995: je     0x000000010b73c950             ;*aload_1 reexecute=0 rethrow=0 return_oop=0
                                                                      ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@33 (line 124)
            0x000000010b73c997: movabs $0x10abd7d62,%r10
            0x000000010b73c9a1: callq  *%r10                          ;*invokestatic nanoTime reexecute=0 rethrow=0 return_oop=0
                                                                      ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@34 (line 124)
            0x000000010b73c9a4: mov    0x30(%rsp),%r10
            0x000000010b73c9a9: mov    %rbp,0x18(%r10)                ;*putfield measuredOps reexecute=0 rethrow=0 return_oop=0
                                                                      ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@49 (line 126)
            0x000000010b73c9ad: mov    %rax,0x30(%r10)                ;*putfield stopTime reexecute=0 rethrow=0 return_oop=0
                                                                      ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_directInvocation_jmhTest::directInvocation_thrpt_jmhStub@37 (line 124)
            0x000000010b73c9b1: movq   $0x0,0x20(%r10)                ;*invokevirtual directInvocation reexecute=0 rethrow=0 return_oop=0
....................................................................................................
....[Hottest Region 1]..............................................................................
c2, com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub, version 115 (279 bytes)

                                                                        ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@27 (line 123)
                                                                        ; implicit exception: dispatches to 0x000000011153fffe
              0x000000011153fa81: test   %r10d,%r10d
              0x000000011153fa84: jne    0x000000011153fbe4             ;*ifeq reexecute=0 rethrow=0 return_oop=0
                                                                        ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@30 (line 123)
              0x000000011153fa8a: mov    0x58(%rsp),%rdx
              0x000000011153fa8f: test   %rdx,%rdx
              0x000000011153fa92: je     0x000000011153fd5e
              0x000000011153fa98: mov    $0x1,%ebx
         ╭    0x000000011153fa9d: jmp    0x000000011153fad6
  0.19%  │ ↗  0x000000011153fa9f: mov    0x58(%rsp),%r13
  3.55%  │ │  0x000000011153faa4: mov    (%rsp),%rcx
  0.09%  │ │  0x000000011153faa8: mov    0x60(%rsp),%rdx
  0.22%  │ │  0x000000011153faad: mov    0x50(%rsp),%r11
  0.17%  │ │  0x000000011153fab2: mov    0x8(%rsp),%rbx                 ;*if_icmpge reexecute=0 rethrow=0 return_oop=0
         │ │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@12 (line 44)
         │ │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
         │ │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  3.55%  │↗│  0x000000011153fab7: movzbl 0x94(%r11),%r8d                ;*goto reexecute=0 rethrow=0 return_oop=0
         │││                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@35 (line 44)
         │││                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
         │││                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.16%  │││  0x000000011153fabf: mov    0x108(%r15),%r10
  0.28%  │││  0x000000011153fac6: add    $0x1,%rbx                      ; ImmutableOopMapr11=Oop rcx=Oop rdx=Oop r13=Oop 
         │││                                                            ;*ifeq reexecute=1 rethrow=0 return_oop=0
         │││                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@30 (line 123)
  0.19%  │││  0x000000011153faca: test   %eax,(%r10)                    ;   poll
  4.00%  │││  0x000000011153facd: test   %r8d,%r8d
         │││  0x000000011153fad0: jne    0x000000011153fbe9             ;*aload_1 reexecute=0 rethrow=0 return_oop=0
         │││                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@33 (line 124)
  0.07%  ↘││  0x000000011153fad6: mov    0xc(%rcx),%r8d                 ;*getfield dispatcher reexecute=0 rethrow=0 return_oop=0
          ││                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@1 (line 28)
          ││                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.22%   ││  0x000000011153fada: mov    0x10(%r12,%r8,8),%r10d         ;*getfield singleListenerList reexecute=0 rethrow=0 return_oop=0
          ││                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@4 (line 44)
          ││                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
          ││                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
          ││                                                            ; implicit exception: dispatches to 0x000000011153ff2a
  0.21%   ││  0x000000011153fadf: mov    0x8(%r12,%r10,8),%edi          ; implicit exception: dispatches to 0x000000011153ff3e
  4.39%   ││  0x000000011153fae4: cmp    $0x237565,%edi                 ;   metadata('com/nikoskatsanos/benchmarks/loops/SingleElementLoopBenchmark$Dispatcher$1')
          ││  0x000000011153faea: jne    0x000000011153fc92
  0.33%   ││  0x000000011153faf0: lea    (%r12,%r10,8),%r9              ;*invokeinterface size reexecute=0 rethrow=0 return_oop=0
          ││                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@7 (line 44)
          ││                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
          ││                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.09%   ││  0x000000011153faf4: mov    0x10(%r9),%r9d
  0.14%   ││  0x000000011153faf8: test   %r9d,%r9d
          ╰│  0x000000011153fafb: jle    0x000000011153fab7             ;*if_icmpge reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@12 (line 44)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  3.98%    │  0x000000011153fafd: lea    (%r12,%r8,8),%rdi              ;*getfield dispatcher reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@1 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.06%    │  0x000000011153fb01: xor    %r9d,%r9d                      ;*aload_0 reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@15 (line 45)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.09%    │  0x000000011153fb04: mov    0x8(%r12,%r10,8),%esi          ; implicit exception: dispatches to 0x000000011153ff4e
  0.06%    │  0x000000011153fb09: cmp    $0x237565,%esi                 ;   metadata('com/nikoskatsanos/benchmarks/loops/SingleElementLoopBenchmark$Dispatcher$1')
  0.00%    │  0x000000011153fb0f: jne    0x000000011153fcc2
  3.93%    │  0x000000011153fb15: lea    (%r12,%r10,8),%rax             ;*invokeinterface get reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@20 (line 45)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.06%    │  0x000000011153fb19: mov    0x10(%rax),%r10d               ;*getfield size reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - java.util.ArrayList::get@2 (line 458)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@20 (line 45)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.11%    │  0x000000011153fb1d: test   %r10d,%r10d
           │  0x000000011153fb20: jl     0x000000011153fcf6             ;*invokestatic checkIndex reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - java.util.Objects::checkIndex@3 (line 372)
           │                                                            ; - java.util.ArrayList::get@5 (line 458)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@20 (line 45)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.28%    │  0x000000011153fb26: cmp    %r10d,%r9d
  0.00%    │  0x000000011153fb29: jae    0x000000011153fc1c
  3.97%    │  0x000000011153fb2f: mov    0x14(%rax),%r10d               ;*getfield elementData reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - java.util.ArrayList::elementData@1 (line 442)
           │                                                            ; - java.util.ArrayList::get@11 (line 459)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@20 (line 45)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.05%    │  0x000000011153fb33: mov    %r9d,%ebp                      ;*invokestatic checkIndex reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - java.util.Objects::checkIndex@3 (line 372)
           │                                                            ; - java.util.ArrayList::get@5 (line 458)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@20 (line 45)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.08%    │  0x000000011153fb36: mov    0xc(%r12,%r10,8),%esi          ; implicit exception: dispatches to 0x000000011153ff62
  1.27%    │  0x000000011153fb3b: cmp    %esi,%ebp
           │  0x000000011153fb3d: jae    0x000000011153fc5a
  3.94%    │  0x000000011153fb43: shl    $0x3,%r10
  0.05%    │  0x000000011153fb47: mov    0x10(%r10,%rbp,4),%r9d         ;*aaload reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - java.util.ArrayList::elementData@5 (line 442)
           │                                                            ; - java.util.ArrayList::get@11 (line 459)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@20 (line 45)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  1.71%    │  0x000000011153fb4c: mov    0x8(%r12,%r9,8),%r10d          ; implicit exception: dispatches to 0x000000011153ff72
 17.85%    │  0x000000011153fb51: cmp    $0x237522,%r10d                ;   metadata('com/nikoskatsanos/benchmarks/loops/SingleElementLoopBenchmark$Listener')
  0.00%    │  0x000000011153fb58: jne    0x000000011153fef6             ;*checkcast reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@25 (line 45)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  3.79%    │  0x000000011153fb5e: mov    %rdi,0x18(%rsp)
  0.02%    │  0x000000011153fb63: mov    %r8d,0x10(%rsp)
  0.02%    │  0x000000011153fb68: mov    %rbx,0x8(%rsp)
  0.19%    │  0x000000011153fb6d: mov    %r11,0x50(%rsp)
  3.95%    │  0x000000011153fb72: mov    %rdx,0x60(%rsp)
  0.02%    │  0x000000011153fb77: mov    %rcx,(%rsp)
  0.03%    │  0x000000011153fb7b: mov    %r13,0x58(%rsp)
  0.36%    │  0x000000011153fb80: mov    %rdx,%rsi
  3.78%    │  0x000000011153fb83: mov    $0x1,%edx
  0.01%    │  0x000000011153fb88: vzeroupper
  4.05%    │  0x000000011153fb8b: callq  0x00000001114c2900             ; ImmutableOopMap[80]=Oop [88]=Oop [96]=Oop [0]=Oop [16]=NarrowOop [24]=Oop 
           │                                                            ;*invokevirtual consume reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Listener::performAction@2 (line 53)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@29 (line 45)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
           │                                                            ;   optimized virtual_call
  0.98%    │  0x000000011153fb90: mov    0x10(%rsp),%r8d
  3.61%    │  0x000000011153fb95: mov    0x10(%r12,%r8,8),%r10d         ;*getfield singleListenerList reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@4 (line 44)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.24%    │  0x000000011153fb9a: mov    0x8(%r12,%r10,8),%r9d          ; implicit exception: dispatches to 0x000000011153ff9e
  0.74%    │  0x000000011153fb9f: inc    %ebp                           ;*iinc reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@32 (line 44)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.04%    │  0x000000011153fba1: cmp    $0x237565,%r9d                 ;   metadata('com/nikoskatsanos/benchmarks/loops/SingleElementLoopBenchmark$Dispatcher$1')
  0.00%    │  0x000000011153fba8: jne    0x000000011153fd36
  3.60%    │  0x000000011153fbae: lea    (%r12,%r10,8),%r11             ;*invokeinterface size reexecute=0 rethrow=0 return_oop=0
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@7 (line 44)
           │                                                            ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
           │                                                            ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
  0.11%    │  0x000000011153fbb2: mov    0x10(%r11),%r9d
  0.35%    │  0x000000011153fbb6: cmp    %r9d,%ebp
           ╰  0x000000011153fbb9: jge    0x000000011153fa9f             ;*if_icmpge reexecute=0 rethrow=0 return_oop=0
                                                                        ; - com.benchmarks.loops.SingleElementLoopBenchmark$Dispatcher::invokeInLoop@12 (line 44)
                                                                        ; - com.benchmarks.loops.SingleElementLoopBenchmark::singleElementLoopInvocation@5 (line 28)
                                                                        ; - com.benchmarks.loops.generated.SingleElementLoopBenchmark_singleElementLoopInvocation_jmhTest::singleElementLoopInvocation_thrpt_jmhStub@17 (line 121)
              0x000000011153fbbf: mov    %ebp,%r9d
              0x000000011153fbc2: mov    0x58(%rsp),%r13
              0x000000011153fbc7: mov    (%rsp),%rcx
              0x000000011153fbcb: mov    0x60(%rsp),%rdx
              0x000000011153fbd0: mov    0x50(%rsp),%r11
              0x000000011153fbd5: mov    0x8(%rsp),%rbx
....................................................................................................

如果可能的话,我想要一些帮助来理解上述两个输出。从我很少了解的情况来看,似乎在循环时,由于列表泛型,程序要付出相当大的代价来检查数组列表的大小以执行循环,以及将对象转换为适当类型的成本.我无法真正理解循环是否展开,但我相信它会展开?

如果有人理解上述输出并能指出正在发生的事情,将不胜感激。

【问题讨论】:

vzeroupper 在这里疯了;循环内没有使用 YMM AVX 寄存器。 (它在 KNL 以外的 CPU 上并不慢,但这更多地表明 JIT 没有看穿很多东西)。显然这里有大量的优化缺失。 【参考方案1】:

您衡量一个几乎什么都不做的操作与一个涉及至少四个相关内存负载的操作:

ArrayList.size ArrayList.elementData elementData.length 元素数据[0]

加上至少一个边界检查。我说“至少”,因为这可能是最好的优化编译器可以做的。然而,HotSpot 执行两个绑定检查:一个在 ArrayList.get() 中,另一个用于 elementData 数组访问。

此外,ArrayList 包含一个对象数组,因此为了从列表中获取Listener 的实例,需要将对象强制转换为目标类型。这是额外的内存负载加上类型检查。

从另一个角度看基准分数:单个元素列表不是慢 2.5 倍,而是多 3 纳秒:

Benchmark                                               Mode  Cnt  Score   Error  Units
SingleElementLoopBenchmark.directInvocation             avgt    5  2,340 ± 0,095  ns/op
SingleElementLoopBenchmark.singleElementLoopInvocation  avgt    5  5,283 ± 0,074  ns/op

3 纳秒! 这只是 2 GHz 内核的 6 个 CPU 周期。光可以在 3 纳秒内传播不到 1 米。你真的期望 ArrayList 的开销比这少吗?开销再小,如果和完全没有开销相比,这个比例可能无限大。

【讨论】:

安德烈,首先感谢您的时间和详细的解释。我的主要目标是真正了解循环方法的开销是什么,您对此进行了彻底的解释。你说我可能从错误的角度看待结果是有道理的。我只是想权衡可能导致短期内调用过多的特定用例的利弊。再次感谢您。

以上是关于Java 方法直接调用 vs 单元素循环调用的主要内容,如果未能解决你的问题,请参考以下文章

Java多线程为什么使用while循环来调用wait方法

Java多线程为什么使用while循环来调用wait方法

Java多线程为什么使用while循环来调用wait方法

VS2008调用webservice 没有生成Reference.cs文件 所以不能调用方法

java递归

vs调用note转存文件