(Repost) A Java application profiling tool: async-profiler

Original article: https://www.jianshu.com/p/9364028cca4e

Environment setup

First, clone the source code from GitHub:

git clone https://github.com/jvm-profiling-tools/async-profiler

Then change into the downloaded project directory and build it:

cd async-profiler
make

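When the build finishes, a new build directory appears in the project; it contains what we need. Notably, async-profiler is one of the few tools I have built without running into any problems, which speaks to how easy it is to use. The build does require the following: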

JAVA_HOME
GCC
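
Before running make, it can help to confirm both prerequisites are in place. The following is only a sketch: PrereqCheck and findOnPath are hypothetical names, not part of async-profiler, and it assumes the compiler is installed as an executable named gcc somewhere on PATH.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of async-profiler: reports whether the
// two build prerequisites (JAVA_HOME and gcc) appear to be available.
final class PrereqCheck {

    // Looks for an executable file called `name` in each directory of a
    // PATH-style string; returns the first match, if any.
    static Optional<Path> findOnPath(String name, String pathVar) {
        if (pathVar == null || pathVar.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathVar.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, name);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String javaHome = System.getenv("JAVA_HOME");
        System.out.println("JAVA_HOME: " + (javaHome == null ? "not set" : javaHome));

        // Assumption: the compiler binary is named "gcc"
        Optional<Path> gcc = findOnPath("gcc", System.getenv("PATH"));
        System.out.println("gcc: " + gcc.map(Path::toString).orElse("not found on PATH"));
    }
}
```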

For what async-profiler can actually do, see the following description from the project:

async-profiler can trace the following kinds of events:

CPU cycles
Hardware and Software performance counters like cache misses, branch misses, page faults, context switches etc.
Allocations in Java Heap
Contended lock attempts of Java monitors

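I mainly care about CPU profiling, so this article focuses on that feature; the others are left for you to explore. For how async-profiler implements CPU profiling and why it is designed that way, see the README on GitHub; I won't repeat it here. Let's look at how to use the tool to profile a Java application.

The async-profiler project contains a script named "profiler.sh". Running it prints the following usage text: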

Usage: ./profiler.sh [action] [options] <pid>
Actions:
  start             start profiling and return immediately
  stop              stop profiling
  status            print profiling status
  list              list profiling events supported by the target JVM
  collect           collect profile for the specified period of time
                    and then stop (default action)
Options:
  -e event          profiling event: cpu|alloc|lock|cache-misses etc.
  -d duration       run profiling for <duration> seconds
  -f filename       dump output to <filename>
  -i interval       sampling interval in nanoseconds
  -b bufsize        frame buffer size
  -t                profile different threads separately
  -o fmt[,fmt...]   output format: summary|traces|flat|collapsed

<pid> is a numeric process ID of the target JVM
      or 'jps' keyword to find running JVM automatically using jps tool

Example: ./profiler.sh -d 30 -f profile.fg -o collapsed 3456
         ./profiler.sh start -i 999000 jps
         ./profiler.sh stop -o summary,flat jps

The most important actions and options are explained below:

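start: begins collecting profile data from the application. If no duration is set, collection continues until a stop command is issued.
stop: used together with start to stop collecting profile data.
status: reports the profiler's state, for example whether it is still active and how long it has been running.
list: prints the event types that can be profiled.
-d N: sets how long to collect profile data, in seconds.
-e event: selects the event type to collect, for example cpu.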
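
To make the shape of a command line concrete, here is a minimal sketch that assembles one from the action and the -e and -d options described above, following the "[action] [options] <pid>" order from the usage text. ProfilerCommand and build are hypothetical names introduced for illustration; they are not part of async-profiler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles a profiler.sh command line from the
// action and options listed in the usage text.
final class ProfilerCommand {

    // action:          start | stop | status | list | collect
    // event:           value for -e (for example "cpu"), or null to omit it
    // durationSeconds: value for -d, or 0 to omit it
    static List<String> build(String script, String action, String event,
                              int durationSeconds, long pid) {
        List<String> cmd = new ArrayList<>();
        cmd.add(script);
        cmd.add(action);
        if (event != null) {
            cmd.add("-e");
            cmd.add(event);
        }
        if (durationSeconds > 0) {
            cmd.add("-d");
            cmd.add(Integer.toString(durationSeconds));
        }
        cmd.add(Long.toString(pid));
        return cmd;
    }

    public static void main(String[] args) {
        // 3456 is the example PID used in the usage text
        System.out.println(String.join(" ",
                build("./profiler.sh", "collect", "cpu", 30, 3456)));
        System.out.println(String.join(" ",
                build("./profiler.sh", "start", null, 0, 3456)));
    }
}
```

The returned list could be handed to java.lang.ProcessBuilder to launch the script from Java, assuming the working directory is the async-profiler checkout.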

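For the remaining options, see the usage text and try them out to see their effect. Next, we use async-profiler to collect CPU profile data, then use the FlameGraph tool to generate a CPU flame graph and find the hot code in it. FlameGraph can be used as soon as it is downloaded: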

git clone https://github.com/brendangregg/FlameGraph

First, start a Java application. You can run the following code as a test:

import java.io.File;

class Target {
    private static volatile int value;

    private static void method1() {
        for (int i = 0; i < 1000000; ++i)
            ++value;
    }

    private static void method2() {
        for (int i = 0; i < 1000000; ++i)
            ++value;
    }

    private static void method3() throws Exception {
        for (int i = 0; i < 1000; ++i) {
            for (String s : new File("/tmp").list()) {
                value += s.hashCode();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        while (true) {
            method1();
            method2();
            method3();
        }
    }
}

Once it is running, use the jps command to find the PID of the Java application, then start collecting CPU profile data with:

./profiler.sh start $pid
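
As an alternative to looking up the PID externally, a test program can report its own process ID at startup. This is a small sketch that assumes Java 9 or later, where ProcessHandle is available; PidPrinter is a hypothetical class name.

```java
// Sketch (assumes Java 9+ for ProcessHandle): a JVM reporting its own
// process ID, which can then be passed to profiler.sh as the pid argument.
final class PidPrinter {

    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        System.out.println("pid=" + currentPid());
    }
}
```

Calling currentPid() at the top of the test program's main method and printing the result would serve the same purpose.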

After some time, say 30 seconds, stop the collection with:

./profiler.sh stop $pid

The tool then prints the profiling results.

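The output makes it clear that method3 consumes the most CPU time, 93.06%, followed by method2 and method1 at 2.93% and 2.77% respectively. method3 is therefore the performance bottleneck, the so-called hot code, and the place to start optimizing.

The commands used above are fairly basic. A more powerful invocation sets the collection duration and dumps the collected data to a file, which FlameGraph can then turn into a flame graph for visual analysis. As before, start the program and find its PID with jps, then collect data with: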

./profiler.sh -d 10 -o collapsed -f /tmp/collapsed.txt $pid

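This collects data for 10 seconds and dumps it in the collapsed format to /tmp/collapsed.txt. After 10 seconds the tool stops automatically and writes the CPU profile data to the given path in the given format. You can inspect /tmp/collapsed.txt, but the raw content is largely unreadable, so it needs to be processed with FlameGraph. Generate the flame graph with: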

~/github/FlameGraph/flamegraph.pl --colors=java /tmp/collapsed.txt > flamegraph.svg

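Replace the path with the location of your own FlameGraph checkout; the one above is mine. A flamegraph.svg file soon appears in the current directory. Open it in Chrome to view the flame graph (you can click a frame to zoom in).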

In the flame graph, method3 is the widest frame, meaning it accounts for the most CPU time. This is far easier to see than in the text output.
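
The width of a frame corresponds to the share of samples whose stack contains that frame. The sketch below computes that share directly from collapsed data, assuming the usual folded-stack layout that flamegraph.pl consumes: semicolon-separated frames, a space, then a sample count. CollapsedStacks and inclusiveShare are hypothetical names, and the sample lines in main are made up for illustration; they are not real profiler output.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper: for each frame, computes the percentage of samples
// whose stack contains it (the quantity a flame graph draws as width).
final class CollapsedStacks {

    static Map<String, Double> inclusiveShare(List<String> lines) {
        Map<String, Long> perFrame = new LinkedHashMap<>();
        long total = 0;
        for (String raw : lines) {
            String line = raw.trim();
            int sep = line.lastIndexOf(' ');
            if (sep < 0) {
                continue; // blank line or no sample count
            }
            long count = Long.parseLong(line.substring(sep + 1));
            total += count;
            // A set, so that a recursive frame is counted once per stack
            String[] frames = line.substring(0, sep).split(";");
            for (String frame : new LinkedHashSet<>(Arrays.asList(frames))) {
                perFrame.merge(frame, count, Long::sum);
            }
        }
        Map<String, Double> share = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : perFrame.entrySet()) {
            share.put(e.getKey(), total == 0 ? 0.0 : 100.0 * e.getValue() / total);
        }
        return share;
    }

    public static void main(String[] args) {
        // Made-up stacks and counts, for illustration only
        List<String> lines = Arrays.asList(
                "Target.main;Target.method1 5",
                "Target.main;Target.method2 5",
                "Target.main;Target.method3;java/io/File.list 90");
        inclusiveShare(lines).forEach((frame, pct) ->
                System.out.printf("%-20s %.1f%%%n", frame, pct));
    }
}
```

To run it against a real dump, the lines could be loaded with Files.readAllLines(Paths.get("/tmp/collapsed.txt")).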

Next, let's see how alloc data is collected and what it can tell us. Run the following code:

import java.util.concurrent.ThreadLocalRandom;

public class AllocatingTarget implements Runnable {
    public static volatile Object sink;

    public static void main(String[] args) {
        new Thread(new AllocatingTarget(), "AllocThread-1").start();
        new Thread(new AllocatingTarget(), "AllocThread-2").start();
    }

    @Override
    public void run() {
        while (true) {
            allocate();
        }
    }

    private static void allocate() {
        if (ThreadLocalRandom.current().nextBoolean()) {
            sink = new int[128 * 1000];
        } else {
            sink = new Integer[128 * 1000];
        }
    }
}

Then get the application's PID with jps and run:

./profiler.sh start -e alloc $pid

After some time, stop the collection with:

./profiler.sh stop -e alloc $pid

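The tool then prints the allocation profile.

It shows how much of each object type was allocated and the path that allocated it (a path being class -> method -> method -> ...). This is just one way to use the tool; more sophisticated and interesting uses are worth exploring against real applications.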
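
As a rough cross-check on what the alloc profile should contain, the sketch below is a bounded variant of the allocate() logic that counts how many arrays of each type it creates. AllocationSplit and run are hypothetical names. Because the choice depends on nextBoolean(), the two counts should come out roughly equal, though this says nothing about how the profiler itself samples or sizes allocations.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical bounded variant of AllocatingTarget.allocate(): performs a
// fixed number of allocations and counts how many of each array type it made.
final class AllocationSplit {
    static volatile Object sink;

    // Returns {int[] allocations, Integer[] allocations}
    static long[] run(int iterations) {
        long[] counts = new long[2];
        for (int i = 0; i < iterations; i++) {
            if (ThreadLocalRandom.current().nextBoolean()) {
                sink = new int[128 * 1000];
                counts[0]++;
            } else {
                sink = new Integer[128 * 1000];
                counts[1]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] counts = run(1000);
        System.out.println("int[] allocations:     " + counts[0]);
        System.out.println("Integer[] allocations: " + counts[1]);
    }
}
```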