[PATCH] KVM: nVMX: Fix L2 guest hang if shadow page tables on EPT

From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>

The L2 guest hangs if shadow page tables on EPT are used. The trace on L1
shows L2 repeatedly hitting kvm_exit with reason EXCEPTION_NMI and a page fault:

qemu-system-x86-2821  [003] d..2    45.848814: kvm_entry: vcpu 0
qemu-system-x86-2821  [003] ...1    45.848827: kvm_exit: reason EXCEPTION_NMI rip 0xe05b info fe05b 80000b0e
qemu-system-x86-2821  [003] ...1    45.848827: kvm_page_fault: address fe05b error_code 14

Commit 7ca29de21362 (KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT)
prevents loading L2's PDPTRs by dereferencing L2's CR3, since CR3 is
uninitialized in real mode. Hyper-V as L1 emulates L2 real mode with PAE
paging and EPT enabled. However, during system boot there is a transition from
Protected mode, a sub-mode of Legacy mode, to Long mode, and the check
in nested_vmx_load_cr3() also prevents loading the PDPTRs while L2 is still in
Protected mode w/ PAE paging and nested EPT/shadow page tables on EPT. The
original commit was really only intended to prevent dereferencing L2's CR3
when the L1 hypervisor emulates L2's real mode through vm8086.
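
To make the difference concrete, the PDPTR-load condition before and after
the change can be sketched as two standalone predicates. This is only an
illustration: struct mode_state and both function names are hypothetical
stand-ins, not kernel code. In vmx.c the same inputs come from
is_long_mode(), is_pae(), is_paging(), the nested_ept argument and
to_vmx(vcpu)->rmode.vm86_active.

```c
#include <stdbool.h>

/* Hypothetical flattened view of the state the check inspects. */
struct mode_state {
	bool long_mode;   /* is_long_mode(vcpu) */
	bool pae;         /* is_pae(vcpu) */
	bool paging;      /* is_paging(vcpu) */
	bool nested_ept;  /* nested_ept argument of nested_vmx_load_cr3() */
	bool vm86_active; /* to_vmx(vcpu)->rmode.vm86_active */
};

/* PAE paging outside of long mode: the only case where PDPTRs matter. */
static bool pae_paging(const struct mode_state *s)
{
	return !s->long_mode && s->pae && s->paging;
}

/* Old condition: never load PDPTRs when nested EPT is in use. */
static bool should_load_pdptrs_old(const struct mode_state *s)
{
	return pae_paging(s) && !s->nested_ept;
}

/* New condition: skip the load only for nested EPT with vm86 active. */
static bool should_load_pdptrs_new(const struct mode_state *s)
{
	return pae_paging(s) && !(s->nested_ept && s->vm86_active);
}
```

The two predicates differ only when PAE paging and nested_ept are both set
while vm86_active is clear: the old one skips the load and the new one
performs it.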

This patch fixes it by allowing the PDPTRs to be loaded if PAE paging, EPT enabled and 

Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
Cc: Ladi Prosek <lprosek@xxxxxxxxxx>
Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
---
 arch/x86/kvm/vmx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c664365..2b2a05f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9933,7 +9933,7 @@ static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
 static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool nested_ept,
 			       u32 *entry_failure_code)
-	if (cr3 != kvm_read_cr3(vcpu) || (!nested_ept && pdptrs_changed(vcpu))) {
+	if (cr3 != kvm_read_cr3(vcpu) || pdptrs_changed(vcpu)) {
 		if (!nested_cr3_valid(vcpu, cr3)) {
 			*entry_failure_code = ENTRY_FAIL_DEFAULT;
 			return 1;
@@ -9944,7 +9944,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
 		 * must not be dereferenced.
 		if (!is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu) &&
-		    !nested_ept) {
+		    !(nested_ept && to_vmx(vcpu)->rmode.vm86_active)) {
 			if (!load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3)) {
 				*entry_failure_code = ENTRY_FAIL_PDPTE;
 				return 1;