Re: I made a flame graph renderer for git's trace2 output

On Fri, May 10 2019, Jeff King wrote:

> On Fri, May 10, 2019 at 05:09:58PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> As noted in TODOs in the script there's various stuff I'd like to do
>> better, and this also shows how we need a lot more trace regions to get
>> granular data.
>
> Hmm. My gut reaction was: doesn't "perf record -g make test" already
> give us that granular data? I know "perf" isn't available everywhere,
> but the idea of the FlameGraph repo is that it takes input from a lot of
> sources (though I don't know if it supports any Windows-specific formats
> yet, which is presumably a point of interest to trace2 authors).
>
> But having generated such a flamegraph, it's not all that helpful. It
> mainly tells us that we spend a lot of time on fork/exec. Which is no
> surprise, since the test suite is geared not towards heavy workloads,
> but lots of tiny functionality tests.
>
> TBH, I'm not sure that flame-graphing the test suite is going to be all
> that useful in the long run. It's going to be heavily weighted by the
> types of things the test suite does. Flamegraphs are good for
> understanding where your time is going for a particular workload, but
> the workload of the test suite is not that interesting.
>
> And once you do have a particular workload of interest that you can
> replay, then I think the granular "perf" results really can be helpful.
>
> I think the trace2 flamegraph would be most useful if you were
> collecting across a broad spectrum of workloads done by a user. You
> _can_ do that with perf or similar tools, but it can be a bit awkward.
> I do wonder how painful it would be to alias "git" to "perf record git"
> for a day or something.

Yeah, I should have mentioned that I'm linking to the test suite
rendering mainly as a demo.

My actual use-case for this is to see what production nodes are spending
their time on, similar to what Microsoft is doing with their use of this
facility.

The test suite is a really good test case for the output, and a good
stress test for my aggregation script, since it's pretty much guaranteed
to run all our commands and to cover a lot of unusual cases.

It also shows that we've got a long way to go in improving the trace2
facility, i.e. adding region enter/leave for some of the things we spend
the most time on.
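
To illustrate why region enter/leave coverage matters for the rendering,
here is a minimal sketch of turning trace2 region events into the "folded"
stack lines that FlameGraph's flamegraph.pl consumes. This is not the
script discussed above. It assumes the JSON event target (one object per
line, as produced with GIT_TRACE2_EVENT) and that "region_leave" events
carry "t_rel" as the seconds spent in the region. The function name
fold_regions and the "category:label" frame naming are made up for the
example.

```python
import json
from collections import defaultdict


def fold_regions(lines):
    """Aggregate trace2 region events into FlameGraph 'folded' stacks.

    Hypothetical helper: returns {"outer;inner": self_time_in_microseconds}.
    Assumes region_leave events carry "t_rel" (seconds spent in the region).
    """
    # One region stack per (session id, thread), so that concurrent
    # processes and threads do not get interleaved.
    stacks = defaultdict(list)  # key -> list of [frame name, child seconds]
    folded = defaultdict(int)   # "a;b;c" -> self time in microseconds

    for line in lines:
        line = line.strip()
        if not line:
            continue
        ev = json.loads(line)
        kind = ev.get("event")
        if kind not in ("region_enter", "region_leave"):
            continue  # time outside any region is invisible here

        stack = stacks[(ev.get("sid"), ev.get("thread"))]
        if kind == "region_enter":
            name = "{}:{}".format(ev.get("category", "?"), ev.get("label", "?"))
            stack.append([name, 0.0])
        elif stack:
            name, child_seconds = stack.pop()
            total = float(ev.get("t_rel", 0.0))
            path = ";".join([frame[0] for frame in stack] + [name])
            # Self time is the region's total minus time spent in nested regions.
            folded[path] += max(0, round((total - child_seconds) * 1e6))
            if stack:
                stack[-1][1] += total

    return dict(folded)


if __name__ == "__main__":
    import sys

    # Print "stack value" lines, the input format flamegraph.pl expects.
    for path, usec in sorted(fold_regions(sys.stdin).items()):
        print(path, usec)
```

Anything a command does outside a region never reaches the output of a
script like this, which is why the graph only gets more granular as more
regions are added to git itself.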