Re: Yet another KPTI regression with 4.14.x series in a VM




On Fri, 12 Jan 2018, Laura Abbott wrote:

Cc+ Andy

I'm almost crashed out by now. Andy might have an idea. I'll look again
tomorrow with brain awake.

> On 01/12/2018 10:51 AM, Thomas Gleixner wrote:
> > On Fri, 12 Jan 2018, Laura Abbott wrote:
> > > Fedora got a bug report on 4.14.11 of a panic when booting a
> > > Fedora guest in a CentOS 6 VM, not reproducible with nopti.
> > > The issue is still present as of 4.14.13 as well. The only
> > > report is a panic screenshot
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1532458
> > > 
> > > I've lost track of all the fixes that have been flying around,
> > > is this a new issue or has a fix not yet made it to stable?
> > 
> > Hmm. Looks kinda familiar, but that has been fixed I think even before
> > 4.14.11. Could you please ask the reporter to provide a full console log via
> > the VM "serial console"?
> > 
> > Thanks,
> > 
> > 	tglx
> > 
> 
> [    0.000000] Linux version 4.14.13-300.fc27.x86_64
> (mockbuild@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) (gcc version 7.2.1 20170915 (Red
> Hat 7.2.1-2) (GCC)) #1 SMP Thu Jan 11 04:00:01 UTC 2018
> [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.14.13-300.fc27.x86_64
> root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap
> rhgb LANG=en_US.UTF-8 console=tty0 console=ttyS0
> [    0.000000] Disabled fast string operations
> [    0.000000] x86/fpu: x87 FPU will use FXSAVE
> [    0.000000] e820: BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dbff] usable
> [    0.000000] BIOS-e820: [mem 0x000000000009dc00-0x000000000009ffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007fffcfff] usable
> [    0.000000] BIOS-e820: [mem 0x000000007fffd000-0x000000007fffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000fffbc000-0x00000000ffffffff] reserved
> [    0.000000] NX (Execute Disable) protection: active
> [    0.000000] random: fast init done
> [    0.000000] SMBIOS 2.4 present.
> [    0.000000] DMI: Red Hat KVM, BIOS 0.5.1 01/01/2007
> [    0.000000] Hypervisor detected: KVM
> [    0.000000] tsc: Using PIT calibration value
> [    0.000000] e820: last_pfn = 0x7fffd max_arch_pfn = 0x400000000
> [    0.000000] x86/PAT: PAT MSR is 0, disabled.
> [    0.000000] x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WB  WT  UC- UC
> [    0.000000] found SMP MP-table at [mem 0x000fda30-0x000fda3f] mapped at
> [ffffffffff200a30]
> [    0.000000] RAMDISK: [mem 0x356e3000-0x36b69fff]
> [    0.000000] ACPI: Early table checksum verification disabled
> [    0.000000] ACPI: RSDP 0x00000000000FD9E0 000014 (v00 BOCHS )
> [    0.000000] ACPI: RSDT 0x000000007FFFD5D0 000034 (v01 BOCHS	BXPCRSDT
> 00000001 BXPC 00000001)
> [    0.000000] ACPI: FACP 0x000000007FFFFE20 000074 (v01 BOCHS	BXPCFACP
> 00000001 BXPC 00000001)
> [    0.000000] ACPI: DSDT 0x000000007FFFD910 0024A2 (v01 BXPC	BXDSDT	
> 00000001 INTL 20090123)
> [    0.000000] ACPI: FACS 0x000000007FFFFDC0 000040
> [    0.000000] ACPI: SSDT 0x000000007FFFD810 0000FF (v01 BOCHS	BXPCSSDT
> 00000001 BXPC 00000001)
> [    0.000000] ACPI: APIC 0x000000007FFFD720 000080 (v01 BOCHS	BXPCAPIC
> 00000001 BXPC 00000001)
> [    0.000000] ACPI: SSDT 0x000000007FFFD610 00010F (v01 BXPC	BXSSDTPC
> 00000001 INTL 20090123)
> [    0.000000] No NUMA configuration found
> [    0.000000] Faking a node at [mem 0x0000000000000000-0x000000007fffcfff]
> [    0.000000] NODE_DATA(0) allocated [mem 0x7ffd2000-0x7fffcfff]
> [    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> [    0.000000] kvm-clock: cpu 0, msr 0:7ffc2001, primary cpu clock
> [    0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles:
> 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [    0.000000] Zone ranges:
> [    0.000000]	 DMA	  [mem 0x0000000000001000-0x0000000000ffffff]
> [    0.000000]	 DMA32	  [mem 0x0000000001000000-0x000000007fffcfff]
> [    0.000000]	 Normal   empty
> [    0.000000]	 Device   empty
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]	 node	0: [mem 0x0000000000001000-0x000000000009cfff]
> [    0.000000]	 node	0: [mem 0x0000000000100000-0x000000007fffcfff]
> [    0.000000] Initmem setup node 0 [mem
> 0x0000000000001000-0x000000007fffcfff]
> [    0.000000] ACPI: PM-Timer IO Port: 0xb008
> [    0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
> [    0.000000] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
> [    0.000000] Using ACPI (MADT) for SMP configuration information
> [    0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
> [    0.000000] PM: Registered nosave memory: [mem 0x00000000-0x00000fff]
> [    0.000000] PM: Registered nosave memory: [mem 0x0009d000-0x0009dfff]
> [    0.000000] PM: Registered nosave memory: [mem 0x0009e000-0x0009ffff]
> [    0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000effff]
> [    0.000000] PM: Registered nosave memory: [mem 0x000f0000-0x000fffff]
> [    0.000000] e820: [mem 0x80000000-0xfffbbfff] available for PCI devices
> [    0.000000] Booting paravirtualized kernel on KVM
> [    0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles:
> 0xffffffff, max_idle_ns: 1910969940391419 ns
> [    0.000000] setup_percpu: NR_CPUS:1024 nr_cpumask_bits:2 nr_cpu_ids:2
> nr_node_ids:1
> [    0.000000] percpu: Embedded 44 pages/cpu @ffff891f7fc00000 s139672 r8192
> d32360 u1048576
> [    0.000000] kvm-stealtime: cpu 0, msr 7fc16240
> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 515972
> [    0.000000] Policy zone: DMA32
> [    0.000000] Kernel command line:
> BOOT_IMAGE=/vmlinuz-4.14.13-300.fc27.x86_64
> root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap
> rhgb LANG=en_US.UTF-8 console=tty0 console=ttyS0
> [    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
> [    0.000000] Memory: 2018548K/2096740K available (12300K kernel code, 1546K
> rwdata, 3728K rodata, 2108K init, 1364K bss, 78192K reserved, 0K cma-reserved)
> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> [    0.000000] Kernel/User page tables isolation: enabled
> [    0.000000] ftrace: allocating 35499 entries in 139 pages
> [    0.001000] Hierarchical RCU implementation.
> [    0.001000]	   RCU restricting CPUs from NR_CPUS=1024 to nr_cpu_ids=2.
> [    0.001000]	   Tasks RCU enabled.
> [    0.001000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
> [    0.001000] NR_IRQS: 65792, nr_irqs: 440, preallocated irqs: 16
> [    0.001000]	   Offload RCU callbacks from CPUs: .
> [    0.001000] Console: colour VGA+ 80x25
> [    0.001000] console [tty0] enabled
> [    0.001000] console [ttyS0] enabled
> [    0.001029] tsc: Detected 3192.954 MHz processor
> [    0.003140] Calibrating delay loop (skipped) preset value.. 6385.90
> BogoMIPS (lpj=3192954)
> [    0.005015] pid_max: default: 32768 minimum: 301
> [    0.007065] ACPI: Core revision 20170728
> [    0.012257] ACPI: 3 ACPI AML tables successfully acquired and loaded
> [    0.014182] Security Framework initialized
> [    0.016033] Yama: becoming mindful.
> [    0.018023] SELinux:  Initializing.
> [    0.027871] Dentry cache hash table entries: 262144 (order: 9, 2097152
> bytes)
> [    0.033417] Inode-cache hash table entries: 131072 (order: 8, 1048576
> bytes)
> [    0.035181] Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
> [    0.037117] Mountpoint-cache hash table entries: 4096 (order: 3, 32768
> bytes)
> [    0.040551] Disabled fast string operations
> [    0.042121] mce: CPU supports 10 MCE banks
> [    0.044238] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
> [    0.046011] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
> [    0.048013] Spectre V2 mitigation: Vulnerable: Minimal generic ASM
> retpoline
> [    0.051181] Freeing SMP alternatives memory: 36K
> [    0.056600] smpboot: Max logical packages: 2
> [    0.062819] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> [    0.063000] smpboot: CPU0: Intel Common KVM processor (family: 0xf, model:
> 0x6, stepping: 0x1)
> [    0.065264] Performance Events: unsupported Netburst CPU model 6 no PMU
> driver, software events only.
> [    0.067283] Hierarchical SRCU implementation.
> [    0.071996] NMI watchdog: Perf event create on CPU 0 failed with -2
> [    0.073011] NMI watchdog: Perf NMI watchdog permanently disabled
> [    0.075272] smp: Bringing up secondary CPUs ...
> [    0.078795] x86: Booting SMP configuration:
> [    0.079021] .... node  #0, CPUs:	 #1
> [    0.001000] kvm-clock: cpu 1, msr 0:7ffc2041, secondary cpu clock
> [    0.001000] Disabled fast string operations
> [    0.111083] kvm-stealtime: cpu 1, msr 7fd16240
> [    0.117019] smp: Brought up 1 node, 2 CPUs
> [    0.118023] smpboot: Total of 2 processors activated (12771.81 BogoMIPS)
> [    0.125008] devtmpfs: initialized
> [    0.127172] x86/mm: Memory block size: 128MB
> [    0.130505] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff,
> max_idle_ns: 1911260446275000 ns
> [    0.132028] futex hash table entries: 512 (order: 3, 32768 bytes)
> [    0.134409] pinctrl core: initialized pinctrl subsystem
> [    0.137427] RTC time: 21:17:11, date: 01/12/18
> [    0.139903] NET: Registered protocol family 16
> [    0.143042] cpuidle: using governor menu
> [    0.147760] ACPI: bus type PCI registered
> [    0.149005] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
> [    0.152346] PCI: Using configuration type 1 for base access
> [    0.159292] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
> [    0.163359] ACPI: Added _OSI(Module Device)
> [    0.165008] ACPI: Added _OSI(Processor Device)
> [    0.166017] ACPI: Added _OSI(3.0 _SCP Extensions)
> [    0.168006] ACPI: Added _OSI(Processor Aggregator Device)
> [    0.175536] ACPI: Interpreter enabled
> [    0.177039] ACPI: (supports S0 S5)
> [    0.179008] ACPI: Using IOAPIC for interrupt routing
> [    0.180064] PCI: Using host bridge windows from ACPI; if necessary, use
> "pci=nocrs" and report a bug
> [    0.183000] ACPI: Enabled 16 GPEs in block 00 to 0F
> [    0.193781] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
> [    0.194025] acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI]
> [    0.195023] acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
> [    0.196033] acpi PNP0A03:00: fail to add MMCONFIG information, can't access
> extended PCI configuration space under this bridge.
> [    0.199739] acpiphp: Slot [1] registered
> [    0.200107] acpiphp: Slot [2] registered
> [    0.201078] acpiphp: Slot [3] registered
> [    0.202073] acpiphp: Slot [4] registered
> [    0.204013] acpiphp: Slot [5] registered
> [    0.205076] acpiphp: Slot [6] registered
> [    0.206100] acpiphp: Slot [7] registered
> [    0.207071] acpiphp: Slot [8] registered
> [    0.208091] acpiphp: Slot [9] registered
> [    0.210017] acpiphp: Slot [10] registered
> [    0.212048] acpiphp: Slot [11] registered
> [    0.213108] acpiphp: Slot [12] registered
> [    0.214000] acpiphp: Slot [13] registered
> [    0.214088] acpiphp: Slot [14] registered
> [    0.215092] acpiphp: Slot [15] registered
> [    0.216079] acpiphp: Slot [16] registered
> [    0.218037] acpiphp: Slot [17] registered
> [    0.219080] acpiphp: Slot [18] registered
> [    0.221066] acpiphp: Slot [19] registered
> [    0.222077] acpiphp: Slot [20] registered
> [    0.223073] acpiphp: Slot [21] registered
> [    0.224076] acpiphp: Slot [22] registered
> [    0.226014] acpiphp: Slot [23] registered
> [    0.227086] acpiphp: Slot [24] registered
> [    0.229032] acpiphp: Slot [25] registered
> [    0.231030] acpiphp: Slot [26] registered
> [    0.232128] acpiphp: Slot [27] registered
> [    0.233076] acpiphp: Slot [28] registered
> [    0.235139] acpiphp: Slot [29] registered
> [    0.236089] acpiphp: Slot [30] registered
> [    0.237071] acpiphp: Slot [31] registered
> [    0.239000] PCI host bridge to bus 0000:00
> [    0.239013] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
> [    0.240028] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
> [    0.242032] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff
> window]
> [    0.244010] pci_bus 0000:00: root bus resource [mem 0xe0000000-0xfebfffff
> window]
> [    0.245011] pci_bus 0000:00: root bus resource [bus 00-ff]
> [    0.253711] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io
> 0x01f0-0x01f7]
> [    0.254015] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io  0x03f6]
> [    0.255019] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io
> 0x0170-0x0177]
> [    0.256014] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io  0x0376]
> [    0.262301] pci 0000:00:01.3: quirk: [io  0xb000-0xb03f] claimed by PIIX4
> ACPI
> [    0.263059] pci 0000:00:01.3: quirk: [io  0xb100-0xb10f] claimed by PIIX4
> SMB
> [    0.326959] ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11)
> [    0.328155] ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
> [    0.330160] ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
> [    0.332163] ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11)
> [    0.334046] ACPI: PCI Interrupt Link [LNKS] (IRQs *9)
> [    0.339070] pci 0000:00:02.0: vgaarb: setting as boot VGA device
> [    0.340000] pci 0000:00:02.0: vgaarb: VGA device added:
> decodes=io+mem,owns=io+mem,locks=none
> [    0.340020] pci 0000:00:02.0: vgaarb: bridge control possible
> [    0.341006] vgaarb: loaded
> [    0.345077] SCSI subsystem initialized
> [    0.348291] ACPI: bus type USB registered
> [    0.350196] usbcore: registered new interface driver usbfs
> [    0.352048] usbcore: registered new interface driver hub
> [    0.355022] usbcore: registered new device driver usb
> [    0.358186] EDAC MC: Ver: 3.0.0
> [    0.362305] PCI: Using ACPI for IRQ routing
> [    0.367116] NetLabel: Initializing
> [    0.368008] NetLabel:  domain hash size = 128
> [    0.369010] NetLabel:  protocols = UNLABELED CIPSOv4 CALIPSO
> [    0.371093] NetLabel:  unlabeled traffic allowed by default
> [    0.374249] clocksource: Switched to clocksource kvm-clock
> [    0.441047] VFS: Disk quotas dquot_6.6.0
> [    0.446519] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
> [    0.454161] pnp: PnP ACPI init
> [    0.460204] pnp: PnP ACPI: found 5 devices
> [    0.487183] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff,
> max_idle_ns: 2085701024 ns
> [    0.500887] NET: Registered protocol family 2
> [    0.512619] TCP established hash table entries: 16384 (order: 5, 131072
> bytes)
> [    0.524218] TCP bind hash table entries: 16384 (order: 6, 262144 bytes)
> [    0.532472] TCP: Hash tables configured (established 16384 bind 16384)
> [    0.540296] UDP hash table entries: 1024 (order: 3, 32768 bytes)
> [    0.552314] UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes)
> [    0.561106] NET: Registered protocol family 1
> [    0.567117] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
> [    0.574096] pci 0000:00:01.0: PIIX3: Enabling Passive Release
> [    0.586198] pci 0000:00:01.0: Activating ISA DMA hang workarounds
> [    0.593935] pci 0000:00:02.0: Video device with shadowed ROM at [mem
> 0x000c0000-0x000dffff]
> [    0.606572] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
> [    0.622820] Unpacking initramfs...
> [    2.335287] Freeing initrd memory: 21020K
> [    2.344531] audit: initializing netlink subsys (disabled)
> [    2.351390] audit: type=2000 audit(1515791833.710:1): state=initialized
> audit_enabled=0 res=1
> [    2.352623] Initialise system trusted keyrings
> [    2.352714] Key type blacklist registered
> [    2.353238] workingset: timestamp_bits=36 max_order=19 bucket_order=0
> [    2.359953] zbud: loaded
> [    2.910025] PANIC: double fault, error_code: 0x0
> [    2.910025] CPU: 1 PID: 56 Comm: modprobe Not tainted
> 4.14.13-300.fc27.x86_64 #1
> [    2.910025] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
> [    2.910025] task: ffff891f78dc3c00 task.stack: ffffa6eac0594000
> [    2.910025] RIP: 0010:vprintk_default+0x5/0x30
> [    2.910025] RSP: 0000:fffffe000002e000 EFLAGS: 00010046
> [    2.910025] RAX: 0000000000000000 RBX: fffffe000002e118 RCX:
> 0000000000000001
> [    2.910025] RDX: 0000000000000000 RSI: fffffe000002e018 RDI:
> ffffffffbe0715a0
> [    2.910025] RBP: fffffe000002e008 R08: ffffffffbe0bb565 R09:
> ffffffffbe07159b
> [    2.910025] R10: fffffe000002e080 R11: 0000000000000000 R12:
> ffffffffbe070fdd
> [    2.910025] R13: 0000000000000000 R14: 0000000000000000 R15:
> 0000000000000000
> [    2.910025] FS:  0000000000000000(0000) GS:ffff891f7fd00000(0000)
> knlGS:0000000000000000
> [    2.910025] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    2.910025] CR2: fffffe000002dff8 CR3: 0000000078da6000 CR4:
> 00000000000006e0
> [    2.910025] Call Trace:
> [    2.910025]	<ENTRY_TRAMPOLINE>
> [    2.910025]	? vprintk_func+0x27/0x60
> [    2.910025]	printk+0x52/0x6e
> [    2.910025]	__die+0x6b/0xe0
> [    2.910025]	die+0x2f/0x50
> [    2.910025]	do_general_protection+0x149/0x160
> [    2.910025]	general_protection+0x2c/0x60
> [    2.910025] RIP: 0010:swapgs_restore_regs_and_return_to_usermode+0x6f/0x80
> [    2.910025] RSP: 0000:fffffe000002e1c8 EFLAGS: 00000006
> [    2.910025] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> 0000000000000000
> [    2.910025] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> 0000000078da7800
> [    2.910025] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 0000000000000000
> [    2.910025] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000000
> [    2.910025] R13: 0000000000000000 R14: 0000000000000000 R15:
> 0000000000000000
> [    2.910025]	</ENTRY_TRAMPOLINE>
> [    2.910025] Code: eb 01 e8 ef 9c 7a 00 e8 1a 23 06 00 e8 b5 19 06 00 83 fb
> ff 75 e4 8b 5d c8 e9 d2 fc ff ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <55>
> 49 89 f8 49 89 f1 31 c9 31 d2 be ff ff ff ff 48 89 e5 31 ff
> [    2.910025] Kernel panic - not syncing: Machine halted.
> [    2.910025] Kernel Offset: 0x3c000000 from 0xffffffff81000000 (relocation
> range: 0xffffffff80000000-0xffffffffbfffffff)
> [    2.910025] ---[ end Kernel panic - not syncing: Machine halted.
> 
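A note on reading the double fault register dump above. The byte marked <55> in the Code: line is the opcode for push %rbp, RSP sits exactly on a page boundary, and CR2 is 8 bytes below RSP. So the faulting access looks like that push landing in the page just under the current stack. The Python sketch below only checks that arithmetic. The helper names are made up for illustration, and it says nothing about why the stack pointer ended up there.

```python
PAGE_SIZE = 4096  # x86-64 base page size

# Values copied from the double fault register dump in the log.
RSP = 0xFFFFFE000002E000
CR2 = 0xFFFFFE000002DFF8


def push_target(rsp: int) -> int:
    """Address written by a 64-bit push: the stack pointer minus 8."""
    return rsp - 8


def page_base(addr: int) -> int:
    """Round an address down to the start of its 4 KiB page."""
    return addr & ~(PAGE_SIZE - 1)


if __name__ == "__main__":
    print("push would write to:", hex(push_target(RSP)))
    print("matches CR2:", push_target(RSP) == CR2)
    print("RSP page aligned:", RSP % PAGE_SIZE == 0)
    print("CR2 in page below RSP:", page_base(CR2) == RSP - PAGE_SIZE)
```

All three checks hold for the values in this report, which is consistent with the stack running off the bottom of its page while the #GP handler was already printing.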
> Configs and other patches are at
> https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/log/?h=f27
> 
> Note that we did bring in the retpoline patches for 4.14.13 but the
> report and panic were the same as with 4.14.11.
> 
> Thanks,
> Laura
>
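For anyone matching addresses from the trace against System.map or vmlinux: the panic output reports a Kernel Offset of 0x3c000000, so runtime addresses inside the relocation range need that offset subtracted first. A minimal Python sketch, assuming the usual KASLR convention that runtime address = link-time address + offset. The function names are hypothetical, and the two register values are just examples from the dump that fall inside the relocation range.

```python
# Values copied from the "Kernel Offset" line of the panic output.
KERNEL_OFFSET = 0x3C000000
RELOC_START = 0xFFFFFFFF80000000
RELOC_END = 0xFFFFFFFFBFFFFFFF


def in_relocation_range(addr: int) -> bool:
    """True if the address lies in the range the kernel image can occupy."""
    return RELOC_START <= addr <= RELOC_END


def unrelocated(addr: int, offset: int = KERNEL_OFFSET) -> int:
    """Undo the KASLR shift so the address can be looked up in System.map."""
    if not in_relocation_range(addr):
        raise ValueError(f"{addr:#x} is outside the kernel relocation range")
    return addr - offset


if __name__ == "__main__":
    # RDI and R12 from the first register dump of the double fault.
    for name, value in (("RDI", 0xFFFFFFFFBE0715A0), ("R12", 0xFFFFFFFFBE070FDD)):
        print(name, hex(value), "->", hex(unrelocated(value)))
```

Stack addresses such as fffffe000002e000 are not part of the kernel image, so the helper rejects them instead of shifting them.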