RE: [PATCH V2] perf/x86/intel/uncore: Querying number of CHAs from CAPID6 register




I've tested this patch on the same set of hubless (single-segment) and scalable (segment-per-socket) configurations that I used for Kan's version 1.

As far as we can tell, this will also work for Cascade Lake, but it will need revisiting for Ice Lake.

Thanks.
Gary

> -----Original Message-----
> From: kan.liang@xxxxxxxxxxxxxxx [mailto:kan.liang@xxxxxxxxxxxxxxx]
> Sent: Tuesday, March 13, 2018 1:52 PM
> To: mingo@xxxxxxxxxx; hpa@xxxxxxxxx; tglx@xxxxxxxxxxxxx;
> peterz@xxxxxxxxxxxxx; andy.shevchenko@xxxxxxxxx
> Cc: Kroening, Gary; Travis, Mike; Banman, Andrew; Sivanich, Dimitri;
> Anderson, Russ; x86@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Kan Liang
> Subject: [PATCH V2] perf/x86/intel/uncore: Querying number of CHAs from
> CAPID6 register
> 
> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> 
> The number of CHAs is miscalculated on multi PCI domain systems on
> Skylake server.
> 
> (From Kroening, Gary:
> 
> For systems with a single PCI segment, it is sufficient to look for the
> bus number to change in order to determine that all of the CHAs have
> been counted for a single socket.
> However, for multi PCI segment systems, each socket is given a new
> segment and the bus number does NOT change.  So looking only for the
> bus number to change ends up counting all of the CHAs on all sockets
> in the system.  This leads to writing CPU MSRs beyond a valid range and
> causes an error in ivbep_uncore_msr_init_box().)
> 
> To determine the number of CHAs, read bits 27:0 of the CAPID6
> register located at Device 30, Function 3, Offset 0x9C. These 28 bits
> form a bit vector of available LLC slices and the CHAs that manage those
> slices, so the number of set bits is the number of CHAs.
> 
> Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore support")
> Reported-by: Kroening, Gary <gary.kroening@xxxxxxx>
> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> ---
> 
> Changes since V1:
>  - add missed pci_dev_put()
>  - Drop ugly casting by using hweight32()
>  - Add comments for macros.
> 
>  arch/x86/events/intel/uncore_snbep.c | 31 +++++++++++++++++--------------
>  1 file changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 6d8044a..8970f71 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -3562,24 +3562,27 @@ static struct intel_uncore_type *skx_msr_uncores[] = {
>  	NULL,
>  };
> 
> +/*
> + * To determine the number of CHAs, it should read bits 27:0 in the CAPID6
> + * register which located at Device 30, Function 3, Offset 0x9C. PCI ID 0x2083.
> + */
> +#define SKX_CAPID6		0x9c
> +#define SKX_CHA_BIT_MASK	GENMASK(27, 0)
> +
>  static int skx_count_chabox(void)
>  {
> -	struct pci_dev *chabox_dev = NULL;
> -	int bus, count = 0;
> +	struct pci_dev *dev = NULL;
> +	u32 val = 0;
> 
> -	while (1) {
> -		chabox_dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x208d, chabox_dev);
> -		if (!chabox_dev)
> -			break;
> -		if (count == 0)
> -			bus = chabox_dev->bus->number;
> -		if (bus != chabox_dev->bus->number)
> -			break;
> -		count++;
> -	}
> +	dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x2083, dev);
> +	if (!dev)
> +		goto out;
> 
> -	pci_dev_put(chabox_dev);
> -	return count;
> +	pci_read_config_dword(dev, SKX_CAPID6, &val);
> +	val &= SKX_CHA_BIT_MASK;
> +out:
> +	pci_dev_put(dev);
> +	return hweight32(val);
>  }
> 
>  void skx_uncore_cpu_init(void)
> --
> 2.7.4
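
For readers who want to try the two counting strategies outside the kernel, here is a minimal user-space sketch in C. It is not the kernel code: struct fake_pci_loc, popcount32(), count_chas_from_capid6() and count_chas_by_bus() are hypothetical names made up for illustration, popcount32() only stands in for hweight32(), and the CAPID6 values and bus numbers you feed it are invented rather than read from hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bits 27:0 of CAPID6, mirroring SKX_CHA_BIT_MASK = GENMASK(27, 0). */
#define CHA_BIT_MASK 0x0FFFFFFFu

/* Hypothetical stand-in for a PCI device location; not a kernel type. */
struct fake_pci_loc {
	unsigned int segment;
	unsigned int bus;
};

/* Portable population count, a user-space stand-in for hweight32(). */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* New approach: each set bit in CAPID6[27:0] is one available CHA. */
static unsigned int count_chas_from_capid6(uint32_t capid6)
{
	return popcount32(capid6 & CHA_BIT_MASK);
}

/*
 * Old approach: walk the CHA devices in enumeration order and stop when
 * the bus number changes. The segment is ignored, which is the bug.
 */
static unsigned int count_chas_by_bus(const struct fake_pci_loc *devs,
				      size_t n)
{
	unsigned int count = 0;
	unsigned int bus = 0;
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (count == 0)
			bus = devs[i].bus;
		if (devs[i].bus != bus)
			break;
		count++;
	}
	return count;
}
```

Feeding count_chas_by_bus() a list in which two sockets share a bus number but sit in different segments returns the total for both sockets, which is the overcount described in the report. count_chas_from_capid6() depends only on the register value of one socket, so it is unaffected by how segments and buses are assigned.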