[PATCH v9 1/2] perf, uncore: Adding documentation for ThunderX2 pmu uncore driver
- Date: Wed, 5 Dec 2018 10:59:28 +0000
- From: "Kulkarni, Ganapatrao" <Ganapatrao.Kulkarni@xxxxxxxxxx>
- Subject: [PATCH v9 1/2] perf, uncore: Adding documentation for ThunderX2 pmu uncore driver
The SoC has PMU support in its L3 cache controller (L3C) and in the
DDR4 Memory Controller (DMC).
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@xxxxxxxxxx>
Documentation/perf/thunderx2-pmu.txt | 93 ++++++++++++++++++++++++++++
1 file changed, 93 insertions(+)
create mode 100644 Documentation/perf/thunderx2-pmu.txt
diff --git a/Documentation/perf/thunderx2-pmu.txt b/Documentation/perf/thunderx2-pmu.txt
new file mode 100644
@@ -0,0 +1,93 @@
+Cavium ThunderX2 SoC Performance Monitoring Unit (PMU UNCORE)
+The ThunderX2 SoC PMU consists of independent, system-wide, per-socket PMUs, such
+as the Level 3 Cache (L3C) and the DDR4 Memory Controller (DMC).
+The DMC has 8 interleaved channels and the L3C has 16 interleaved tiles. Events
+are counted for the default channel (i.e. channel 0) and prorated to the total
+number of channels/tiles.
+The DMC and L3C each support up to 4 counters. Counters are independently
+programmable and can be started and stopped individually. Each counter can be
+set to a different event. Counters are 32-bit and do not support an overflow
+interrupt; they are read every 2 seconds.
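A minimal sketch of the arithmetic implied by the paragraph above: 32-bit counters without an overflow interrupt have to be read periodically, and the difference between two reads must tolerate a wraparound. The count taken on the default channel is then scaled to all channels or tiles. The function names are hypothetical, and reading "prorated" as a plain multiplication by the channel/tile count is an assumption, not a statement about how the driver is written.

```python
# Illustrative only: names are hypothetical, not taken from the driver.
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1

DMC_CHANNELS = 8   # from the text: the DMC has 8 interleaved channels
L3C_TILES = 16     # from the text: the L3C has 16 interleaved tiles


def counter_delta(prev_raw, curr_raw):
    """Events since the previous read, tolerating one 32-bit wraparound."""
    return (curr_raw - prev_raw) & COUNTER_MASK


def prorate(delta, num_units):
    """Scale a count taken on channel 0 to all channels/tiles.

    Assumption: traffic is spread evenly by interleaving, so the total is
    approximated as delta * num_units.
    """
    return delta * num_units


if __name__ == "__main__":
    # Counter wrapped between two reads: 0xFFFFFFF0 -> 0x00000010.
    delta = counter_delta(0xFFFFFFF0, 0x00000010)
    print(delta, prorate(delta, DMC_CHANNELS))
```

The masked subtraction stays correct only if the counter wraps at most once between reads, which is why a short, fixed read interval matters.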
+PMU UNCORE (perf) driver:
+The thunderx2_pmu driver registers per-socket perf PMUs for the DMC and L3C devices.
+Each PMU can be used to count up to 4 events simultaneously. Each PMU provides a
+description of its available events and configuration options
+in sysfs, see /sys/devices/uncore_<l3c_S/dmc_S/>; S is the socket id.
+The driver does not support sampling, therefore "perf record" will
+not work. Per-task perf sessions are not supported.
+perf stat -a -e uncore_dmc_0/cnt_cycles/ sleep 1
+perf stat -a -e \
+uncore_dmc_0/write_txns/ sleep 1
+perf stat -a -e \
+uncore_l3c_0/inv_hit/ sleep 1
+ Number of Read requests received by the L3 Cache.
+ This includes Read as well as Read Exclusives.
+ Number of Read requests received by the L3 Cache that were a hit
+ in the L3 (data provided from the L3).
+ Number of Write Backs received by the L3 Cache. These are basically
+ the L2 Evicts and writes from the PCIe Write Cache.
+ Number of Invalidate and Write requests received by the L3 Cache.
+ This also counts Writes from IO that did not go through the PCIe Write Cache.
+ Number of Invalidate and Write requests received by the L3 Cache
+ that were a hit in the L3 Cache.
+ Number of Invalidate requests received by the L3 Cache.
+ Number of Invalidate requests received by the L3 Cache that were a
+ hit in the L3.
+ Number of Evicts that the L3 generated.
+1. The counter values of all these events have a granularity of one cache line (64 bytes).
+2. L3C hit ratio = (read_hit + inv_nwrite_hit + inv_hit) / (read_request + inv_nwrite_request + inv_request)
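The hit-ratio formula in note 2 can be sketched as a small helper. The event names are the ones used in the formula; the dictionary layout, the function name, and the sample numbers are made up for illustration. Since each PMU counts up to 4 events simultaneously, the six inputs cannot all be counted at once on one L3C PMU.

```python
# Illustrative helper; the dict-of-counts input format is an assumption.
HIT_EVENTS = ("read_hit", "inv_nwrite_hit", "inv_hit")
REQUEST_EVENTS = ("read_request", "inv_nwrite_request", "inv_request")


def l3c_hit_ratio(counts):
    """Return hits / requests per the formula above, or None if no requests."""
    hits = sum(counts[name] for name in HIT_EVENTS)
    requests = sum(counts[name] for name in REQUEST_EVENTS)
    if requests == 0:
        return None  # ratio is undefined without any requests
    return hits / requests


if __name__ == "__main__":
    # Made-up sample counts, not measured data.
    sample = {
        "read_request": 100, "read_hit": 60,
        "inv_nwrite_request": 30, "inv_nwrite_hit": 20,
        "inv_request": 20, "inv_hit": 10,
    }
    print(l3c_hit_ratio(sample))
```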
+ Count cycles (clocks at the DMC clock rate).
+ Number of 64-byte write transactions received by the DMC(s).
+ Number of 64-byte read transactions received by the DMC(s).
+ Number of 64-byte data transfers to or from DRAM.
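Because the DMC transaction and transfer events are counted in 64-byte units, a count can be turned into bytes and an average bandwidth over the measurement interval. This is a hedged sketch: the function names are hypothetical, and the elapsed time is assumed to come from the measurement window (for example the "sleep 1" in the perf stat examples).

```python
# Illustrative conversion; not part of the driver or of perf.
BYTES_PER_TRANSACTION = 64  # from the text: events count 64-byte units


def dmc_bytes(txn_count):
    """Bytes moved for a given number of 64-byte transactions."""
    return txn_count * BYTES_PER_TRANSACTION


def dmc_bandwidth(txn_count, elapsed_seconds):
    """Average bytes per second over the measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return dmc_bytes(txn_count) / elapsed_seconds


if __name__ == "__main__":
    # Made-up count: one million transactions observed over two seconds.
    print(dmc_bytes(1_000_000), dmc_bandwidth(1_000_000, 2.0))
```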