Wednesday, July 20, 2011
VMX and SMM – Dual monitor mode
Thursday, January 6, 2011
VMX and System Management Mode - Part 1
1. Normal mode
2. Dual monitor mode
Normal Mode:
In normal mode, an SMI# assertion causes the processor to turn off VMX and enter SMM. On RSM, the processor automatically re-enables VMX if it was in either VMX-ROOT or VMX-GUEST prior to the SMI#. Because VMX is turned off while the processor is in SMM, CR4.VMXE is treated as a reserved bit and must be 0 in the CR4 value loaded by RSM.
Algorithmically,
if (smi) {
    if (vmx_root or vmx_guest) {
        save cr4.vmxe internally;          // kept as saved_vmxe
        if (vmx_root)  internal_state = vmx_root;
        if (vmx_guest) internal_state = vmx_guest;
        turn_off_vmx;                      // cr4.vmxe now reads 0
    }
    save cr4 to smm_ram;
}
During RSM:
if (rsm) {
    read cr4_val from smm_ram;
    if (cr4_val.vmxe == 1) jump_to_shutdown;
    retrieve saved_vmxe;                   // the internal copy of cr4.vmxe
    cr4 <- cr4_val | (saved_vmxe << 13);
    read internal_state;
    if (internal_state == vmx_root)  put_cpu_in_vmx_root;
    if (internal_state == vmx_guest) put_cpu_in_vmx_guest;
}
Notice the jump_to_shutdown during RSM. Since the processor saves CR4.VMXE internally on entry to SMM, the value saved in SMRAM for CR4.VMXE is always 0. During RSM, the CR4 value is first loaded from SMRAM and bit 13 is checked. It must be 0. If it is not, the cpu jumps to shutdown. The processor then retrieves the value of VMXE from an internal register and updates CR4 with this value. The state of the processor (whether it was in vmx-root, vmx-guest or normal ia32 operation) is also retrieved, and the cpu is put back in that state when RSM completes.
This process is the default treatment of SMIs with VMX.
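The pseudocode above can be turned into a small software model. The sketch below is only an illustration of that pseudocode, not of the full SDM behavior; the class and attribute names (`Cpu`, `saved_vmxe`, `smram`) are made up for this example.

```python
CR4_VMXE = 1 << 13  # CR4.VMXE is bit 13


class ShutdownError(Exception):
    """Models the processor entering shutdown."""


class Cpu:
    """Toy model of the default SMI/RSM treatment (hypothetical names)."""

    def __init__(self, cr4, vmx_state=None):
        self.cr4 = cr4              # architectural CR4 value
        self.vmx_state = vmx_state  # None, "vmx_root" or "vmx_guest"
        self.saved_vmxe = 0         # internal copy of CR4.VMXE
        self.saved_state = None     # internal copy of the VMX state
        self.smram = {}             # stand-in for the SMRAM save area

    def smi(self):
        """SMI# delivery: turn VMX off, then save CR4 to SMRAM."""
        if self.vmx_state is not None:
            self.saved_vmxe = (self.cr4 >> 13) & 1
            self.saved_state = self.vmx_state
            self.vmx_state = None       # turn_off_vmx
            self.cr4 &= ~CR4_VMXE
        self.smram["cr4"] = self.cr4

    def rsm(self):
        """RSM: check the CR4 image, then restore VMXE and the VMX state."""
        cr4_val = self.smram["cr4"]
        if cr4_val & CR4_VMXE:
            raise ShutdownError("CR4.VMXE is set in the SMRAM image")
        self.cr4 = cr4_val | (self.saved_vmxe << 13)
        self.vmx_state = self.saved_state
        self.saved_vmxe, self.saved_state = 0, None
```

An SMM handler that sets bit 13 of the CR4 image in `smram` before calling `rsm()` ends up in `ShutdownError`, which is the jump_to_shutdown path.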
Notes on System Management Mode [SMM]
SMM [System Management Mode] is an operating mode entered through the assertion of the SMI# pin. Upon detecting an SMI#, the processor saves its state in SMRAM [the base address of SMRAM is obtained from an internal SMBASE register; the reset value of the SMBASE register is 0x30000]. The processor saves several architectural values into SMRAM (such as the values of CR0, CR3 and CR4) when it enters SMM. To exit SMM, software executes the RSM (resume) instruction. During RSM, the processor reloads the architectural state from SMRAM and returns to the state it was in prior to the SMI#.
Here is a loosely defined algorithm for entering and exiting SMM:
1. Processor is executing a task (say T).
2. SMI# is detected by the processor.
3. Processor saves all information pertaining to task T in the SMRAM. It issues the SMI_ENTER_ACK bus cycle and enters SMM.
4. Processor executes code from the SMM space [starting at address 0x38000].
5. When it executes the RSM instruction, the processor reloads the prior architectural state from SMRAM, issues the SMI_EXIT_ACK bus cycle and exits SMM.
6. Processor resumes executing the task T.
During step 5, while the processor loads the architectural state, it performs a few checks on the state being loaded:
1. It checks the reserved bits of CR4.
2. It checks the CR0 register for illegal combinations, for example CR0.PG=1 with CR0.PE=0, or CR0.CD=0 with CR0.NW=1.
If any of the checks above fails, the processor enters shutdown.
[Note: there may be additional checks performed. The CR0 and CR4 values in SMRAM should be left untouched by the SMM handler. These checks exist to make sure that the handler does not modify the values in a way that puts the processor in an incompatible state after the execution of RSM.]
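The two checks listed above can be written down as a short predicate. This is a sketch, not the processor's actual check list; the function name and the `cr4_reserved_mask` parameter are assumptions for illustration, since the set of reserved CR4 bits depends on the processor model.

```python
CR0_PE = 1 << 0    # protection enable
CR0_NW = 1 << 29   # not write-through
CR0_CD = 1 << 30   # cache disable
CR0_PG = 1 << 31   # paging


def rsm_state_is_valid(cr0, cr4, cr4_reserved_mask):
    """Return True if the CR0/CR4 images pass the two checks listed above.

    cr4_reserved_mask is supplied by the caller (hypothetical parameter):
    it holds a 1 for every CR4 bit that is reserved on the modelled cpu.
    """
    if cr4 & cr4_reserved_mask:
        return False                  # a reserved CR4 bit is set
    if (cr0 & CR0_PG) and not (cr0 & CR0_PE):
        return False                  # PG=1 with PE=0
    if (cr0 & CR0_NW) and not (cr0 & CR0_CD):
        return False                  # NW=1 with CD=0
    return True
```

A result of False corresponds to the processor entering shutdown at RSM.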
Monday, October 4, 2010
Software injection into V86 guest with interrupt redirection - What must be the IDT VECTOR INFO?
a. GUEST_RFLAGS.VM = 1 (indicating the guest is in V86 mode).
b. CR4.VME=1 (enables interrupt redirection provided the redirection bitmap says so in TSS).
c. The exception_bitmap in the guest is configured to vmexit on a #PF.
At the end of vmlaunch, the software interrupt is injected. The guest is in V86 mode and has CR4.VME=1. The cpu consults the TSS to read the interrupt redirection bitmap. The TSS page is not present and the cpu takes a #PF. The guest is configured to vmexit on #PF. After the vmexit, use vmread to read the following vmcs fields:
a. Exit reason (reads 0)
b. Exit Interruption Info (0x80000B0E - indicates a #PF)
c. IDT Vector Info (reads 0)
d. Exit Qualification (0x
The interesting part of these results is the value of idt-vector-info. Since the vmexit was encountered in the process of injecting an event, idt-vector-info should have read 0x800004vv (vv=vector) rather than 0. This behavior appears to violate what is stated in vol3b.
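To make the values above easier to read, here is a small decoder for the common bit layout of these fields (vector in bits 7:0, type in bits 10:8, error-code valid in bit 11, valid in bit 31). It is a sketch for illustration; the names `EventInfo` and `decode_event_info` are made up for this example.

```python
from collections import namedtuple

# Hypothetical container for the decoded fields.
EventInfo = namedtuple("EventInfo", "valid vector event_type error_code_valid")


def decode_event_info(value):
    """Split a 32-bit interruption-info style value into its fields."""
    return EventInfo(
        valid=bool((value >> 31) & 1),             # bit 31
        vector=value & 0xFF,                       # bits 7:0
        event_type=(value >> 8) & 0x7,             # bits 10:8
        error_code_valid=bool((value >> 11) & 1),  # bit 11
    )
```

Decoding 0x80000B0E gives a valid entry with vector 14 (#PF), type 3 (hardware exception) and an error code. Decoding the observed idt-vector-info value of 0 gives valid = False, whereas the expected 0x800004vv would decode to a valid type 4 (software interrupt) entry.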
Monday, April 26, 2010
Injecting software interrupt into a V86 guest
1. Bits 7:0 of this field represent the vector (0x0D, which is vector 13).
2. Bits 10:8 indicate the type (in this case type = 0x3, which is a hardware exception).
3. Bit 11 is the error-code valid bit, which is set in the example above.
4. Bit 31 is the valid bit for the ENTRY_INTERRUPTION_INFO field.
To inject a software interrupt (say vector 0x8), the hypervisor programs the entry_interruption_info field as 0x80000408 (type=0x4 and vector=0x8). If the guest is in V86 mode (GUEST_RFLAGS[VM]=1), the processor behaves according to Table 15.2, Intel SDM, vol 3A.
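The field layout described above can be assembled with a few shifts. The helper below is a minimal sketch; its name and parameters are assumptions made for this example, not part of any hypervisor API.

```python
TYPE_HW_EXCEPTION = 3   # hardware exception
TYPE_SW_INTERRUPT = 4   # software interrupt


def make_entry_interruption_info(vector, event_type, deliver_error_code=False):
    """Build a valid ENTRY_INTERRUPTION_INFO value from its fields."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector must fit in bits 7:0")
    if not 0 <= event_type <= 0x7:
        raise ValueError("type must fit in bits 10:8")
    return (
        (1 << 31)                           # bit 31: valid
        | (int(deliver_error_code) << 11)   # bit 11: error-code valid
        | (event_type << 8)                 # bits 10:8: type
        | vector                            # bits 7:0: vector
    )
```

With these helpers, `make_entry_interruption_info(0x0D, TYPE_HW_EXCEPTION, True)` yields 0x80000B0D and `make_entry_interruption_info(0x08, TYPE_SW_INTERRUPT)` yields 0x80000408, the two values used in this post.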
Given below is a summary of the processor behavior during normal software-interrupt execution in V86 and during an event injection into a V86 guest:
1. EFLAGS.VM = 1, CR4.VME = 1, EFLAGS.IOPL = 3
=> In this case the bit for the vector in the redirection bitmap of the TSS is consulted.
=> If the bit in the redirection bitmap = 0, the software interrupt is redirected to the 8086-style handler.
=> If the bit in the redirection bitmap = 1, the software interrupt is directed to the protected-mode handler.
2. EFLAGS.VM = 1, CR4.VME = 1, EFLAGS.IOPL < 3
=> In this case the bit for the vector in the redirection bitmap of the TSS is consulted.
=> If the bit in the redirection bitmap = 0, the software interrupt is redirected to the 8086-style handler. Notice that this is the same behavior as with EFLAGS.IOPL = 3. The difference is in the value of eflags pushed on the stack: here the IOPL of the eflags image is forced to 3 and the value of VIF is copied to IF.
=> Normal behavior: if the bit in the redirection bitmap = 1, the interrupt is directed to the #GP handler.
=> During VMX event injection: if the bit in the redirection bitmap = 1, the processor will *NOT* #GP due to IOPL < CPL.
3. EFLAGS.VM = 1, CR4.VME = 0, EFLAGS.IOPL = 3
=> Normal behavior: the interrupt is directed to the protected-mode handler (no #GP).
=> During event injection: same as above.
4. EFLAGS.VM = 1, CR4.VME = 0, EFLAGS.IOPL < 3
=> Normal behavior: the interrupt is directed to the #GP handler.
=> During event injection: no #GP can occur due to IOPL < CPL. The behavior is the same as with IOPL = 3.
Summary:
The conclusion from the above discussion is that there will be no #GP due to IOPL < CPL during the injection of a software interrupt into a V86 guest. If the hypervisor wants this #GP to occur, it needs to inject a #GP directly into the guest instead of a software interrupt. This can be achieved by programming the entry_interruption_info field to 0x80000B0D.
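The four cases can be condensed into one decision function. This is a sketch of the summary in this post only. The function name and return strings are made up, and for case 2 with the redirection bit set it assumes that an injected interrupt behaves as it would with IOPL=3 (the post states only that no #GP occurs).

```python
def v86_soft_int_outcome(vme, iopl, redirect_bit, injected):
    """Where a software interrupt goes when EFLAGS.VM = 1.

    vme          -- CR4.VME (bool)
    iopl         -- EFLAGS.IOPL, 0..3
    redirect_bit -- bit for this vector in the TSS redirection bitmap
                    (ignored when vme is False)
    injected     -- True for VMX event injection, False for a normal INT n
    Returns "8086_handler", "protected_mode_handler" or "#GP".
    """
    if vme and redirect_bit == 0:
        # Cases 1 and 2, bit = 0: redirected regardless of IOPL.
        return "8086_handler"
    # Remaining paths: VME=0, or VME=1 with the redirection bit set.
    if iopl == 3 or injected:
        # Assumption: injection with IOPL < 3 behaves as with IOPL = 3.
        return "protected_mode_handler"
    return "#GP"
```

The only inputs that produce "#GP" have `injected=False`, which is the point of the summary: a hypervisor that wants the #GP has to inject it explicitly.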
Monday, October 19, 2009
VMEXIT on INVLPG
(a) The virtual-machine is configured to vmexit on INVLPG(bit 9 of the PROCESSOR_EXECUTION_CONTROLS is 1).
(b) The virtual-machine has GS BASE = 0xFFFF8000_00000000
(c) Virtual machine executes: invlpg [gs:0-1]
(d) Execution of invlpg causes vmexit.
(e) The linear address of the INVLPG operand is recorded in exit-qualification. Upon a vmread of EXIT_QUALIFICATION the value obtained is:
=> FFFF7FFF_FFFFFFFF
Notice that the value recorded is a non-canonical address, i.e. bits 63:48 are not copies of bit 47. This is the only case I have encountered where a non-canonical address shows up in the exit-qualification.
The only explanation I can come up with for this behavior is that INVLPG, unlike other instructions, does not fault in 64-bit mode on a non-canonical operand. According to the instruction spec, INVLPG morphs into a NOP in such cases.
When a vmexit handler for INVLPG is written, this case must be taken into consideration (i.e. a non-canonical address might show up in the exit-qualification field).
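The arithmetic behind the recorded value, and the check a handler might apply to it, can be sketched as follows. The function names are made up for this example, and the canonical check assumes 48-bit linear addresses, matching the address[63:48] / address[47] rule quoted above.

```python
MASK64 = (1 << 64) - 1


def linear_address(seg_base, offset):
    """Segment base plus offset, wrapped to 64 bits."""
    return (seg_base + offset) & MASK64


def is_canonical(addr):
    """True if bits 63:48 are all copies of bit 47 (48-bit addressing)."""
    upper = addr >> 47          # bits 63:47, i.e. 17 bits
    return upper == 0 or upper == 0x1FFFF


# The case from this post: GS base 0xFFFF8000_00000000, offset 0-1.
gs_base = 0xFFFF800000000000
exit_qualification = linear_address(gs_base, -1 & MASK64)
```

Here `exit_qualification` evaluates to 0xFFFF7FFFFFFFFFFF and `is_canonical(exit_qualification)` is False, while the GS base itself is canonical. A handler could use such a check to skip emulation work for an address that INVLPG would treat as a NOP.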
Saturday, July 25, 2009
A full blown initialization of VMCS - Assembly code
Prior to looking at the assembly code, here is a step-by-step description of what is being done:
The reader must know that:
A) this code will run only in ring0.
B) paging is already enabled in CR0 (bit 31).
(1) First enable VMXE (bit 13) in CR4. Make sure that the processor supports VMX by executing CPUID (leaf 1, ecx[5]).
(2) Initialize the revision-id (msr 0x480, 31:0) in the vmxon region and in the guest-vmcs region.
(3) Execute VMXON with the pointer to the vmxon region. If the BIOS has not set bits 0 and 2 of FEATURE_CONTROL_MSR (msr 0x3a), this will fail.
(4) Execute VMCLEAR with the pointer to the guest-vmcs region.
(5) Execute VMPTRLD with the pointer to the guest-vmcs region.
(6) Now initialize the guest-vmcs:
(a) First initialize the vmx controls. These include the following controls:
1. PIN_BASED
2. PROC_BASED
3. ENTRY_CONTROLS
4. EXIT_CONTROLS
(b) Next initialize the host-state and guest-state.
(c) Now do vmlaunch. If VMLAUNCH is successful, then the processor will start executing code from the GUEST_CS:GUEST_RIP value specified in the VMCS.
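The preconditions mentioned in steps (1) and (3) can be expressed as a check on two raw register values. This is a sketch only: the function name is made up, and the caller is assumed to have obtained ECX from CPUID leaf 1 and the 64-bit value of msr 0x3a by whatever means its environment provides.

```python
CPUID_1_ECX_VMX = 1 << 5                  # CPUID leaf 1, ecx[5]
FEATURE_CONTROL_LOCK = 1 << 0             # msr 0x3a, bit 0
FEATURE_CONTROL_VMXON_OUTSIDE_SMX = 1 << 2  # msr 0x3a, bit 2


def vmxon_prerequisites_met(cpuid_1_ecx, feature_control):
    """Return True if VMX is reported and bits 0 and 2 of msr 0x3a are set."""
    if not (cpuid_1_ecx & CPUID_1_ECX_VMX):
        return False                      # processor does not report VMX
    required = FEATURE_CONTROL_LOCK | FEATURE_CONTROL_VMXON_OUTSIDE_SMX
    return (feature_control & required) == required
```

For example, a FEATURE_CONTROL value of 0x5 passes, while 0x1 (locked with bit 2 clear) does not.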
Here comes the code:
////////////////////////////////////////////////////
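; (1) set CR4.VMXE (bit 13)
mov eax, cr4
bts eax, 13
mov cr4, eax
; (2) read the revision-id from msr 0x480 into eax
mov ecx, 0x480
rdmsr
; store it at the start of both regions. [vmxon-ptr] and [guest-ptr]
; hold the region addresses (identity-mapped memory is assumed here)
mov edx, [vmxon-ptr]
mov [edx], eax
mov edx, [guest-ptr]
mov [edx], eax
; (3)-(5) a VMX instruction reports failure with CF=1 or ZF=1,
; so jbe catches both. The fail handler is not shown.
VMXON [vmxon-ptr]
jbe fail
vmclear [guest-ptr]
jbe fail
vmptrld [guest-ptr]
jbe fail
; (6) fill in the vmcs, then launch
call initialize_vmx_controls
call initialize_vmx_host_guest_state
call do_vmlaunch
;ideally a hypervisor would read the VMX-MSRS
; to determine what values to write.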
initialize_vmx_controls:
mov ebx, ENTRY_CONTROLS ;0x4012
mov eax, 0x11ff
vmwrite ebx, eax
mov ebx, PIN_CONTROLS; 0x4000
mov eax, 0x1f
vmwrite ebx, eax
mov ebx, PROC_CONTROLS ; 0x4002
mov eax, 0x0401E9F2
vmwrite ebx, eax
mov ebx, EXIT_CONTROLS ; 0x400C
mov eax, 0x36dff
vmwrite ebx, eax
ret
initialize_vmx_host_guest_state:
mov eax, cr3
mov ebx, HOST_CR3 ;0x6C02
mov edx, GUEST_CR3 ;0x6802
VMWRITE EBX,EAX
mov eax, pdebase_guest
VMWRITE EDX,EAX
mov ebx, HOST_RSP ;0x6c14
mov eax, tos ;top-of-stack
vmwrite ebx, eax
mov ebx, HOST_CR0 ; 0x6C00
mov eax, cr0
vmwrite ebx, eax
mov ebx, GUEST_CR0 ;0x6800
vmwrite ebx, eax
mov ebx, HOST_CR4 ; 0x6C04
mov eax, cr4
vmwrite ebx, eax
mov ebx, GUEST_CR4; 0x6804
vmwrite ebx, eax
mov ebx, HOST_CS_SEL ; 0x0c02
mov eax, cs
vmwrite ebx, eax
mov ebx,HOST_DS_SEL ; 0x0c06
mov eax, ds
vmwrite ebx, eax
mov ebx, HOST_SS_SEL ; 0x00000c04
mov eax, 0x18
vmwrite ebx, eax
mov ebx, HOST_TR_SEL; 0x00000c0c
mov eax, 0x18
vmwrite ebx, eax
mov ebx, GUEST_TR_SEL ;0x0000080e
mov eax, 0x18
vmwrite ebx, eax
mov ebx, GUEST_TR_ATTR ;0x00004822
mov eax, 0x8b
vmwrite ebx, eax
mov ebx, GUEST_TR_LIMIT ;0x0000480e
mov eax, 0xff
vmwrite ebx, eax
mov ebx, GUEST_LDTR_ATTR ;0x00004820
mov eax, 0x00010000
vmwrite ebx, eax
mov ebx, GUEST_SS_ATTR ;0x00004818
mov eax, 0xc093
vmwrite ebx, eax
mov ebx, GUEST_DS_ATTR ;0x0000481a
mov eax, 0xc093
vmwrite ebx, eax
mov ebx, GUEST_ES_ATTR ;0x00004814
mov eax, 0xc093
vmwrite ebx, eax
mov ebx, GUEST_FS_ATTR ;0x0000481c
mov eax, 0xc093
vmwrite ebx, eax
mov ebx, GUEST_GS_ATTR ;0x0000481e
mov eax, 0xc093
vmwrite ebx, eax
mov ebx, GUEST_SS_LIMIT ;0x00004804
mov eax, 0xffffffff
vmwrite ebx, eax
mov ebx, GUEST_DS_LIMIT ;0x00004806
vmwrite ebx, eax
mov ebx, GUEST_ES_LIMIT ;0x00004800
vmwrite ebx, eax
mov ebx, GUEST_FS_LIMIT ;0x00004808
vmwrite ebx, eax
mov ebx, GUEST_GS_LIMIT ;0x0000480a
vmwrite ebx, eax
mov ebx, LINK_PTR_FULL ;0x00002800
vmwrite ebx, eax
mov ebx, VMS_LINK_PTR_HIGH ;0x00002801
vmwrite ebx, eax
mov ebx, GUEST_GDTR_BASE ;0x00006816
mov eax, gdt32t
vmwrite ebx, eax
mov ebx, HOST_GDTR_BASE ;0x00006c0c
vmwrite ebx, eax
mov ebx, GUEST_CS_LIMIT ;0x00004802
mov eax, 0xffffffff
vmwrite ebx, eax
mov ebx, GUEST_CS_ATTR ;0x00004816
mov eax, 0xc09b
vmwrite ebx, eax
mov ebx, GUEST_RSP ;0x0000681c
mov eax, tos
vmwrite ebx, eax
mov ebx, GUEST_IDTR_BASE ;0x00006818
mov eax, idt32t
vmwrite ebx, eax
mov ebx, HOST_IDTR_BASE ;0x00006c0e
vmwrite ebx, eax
mov ebx, GUEST_CS_SEL ;0x00000802
mov eax, guest_sel
vmwrite ebx, eax
mov ebx, GUEST_CS_BASE ;0x00006808
mov eax, guest_base
vmwrite ebx, eax
mov ebx, GUEST_RIP ;0x0000681e
mov eax, 0
vmwrite ebx, eax
mov ebx, HOST_RIP ;0x00006c16
mov eax, after_vmexit
vmwrite ebx, eax
mov ebx, GUEST_RFLAGS ;0x00006820
mov eax, 2
vmwrite ebx, eax
mov ebx, EXCEPTION_BITMAP ;0x4004
mov eax,0xdeadfeef
vmwrite ebx, eax
ret
do_vmlaunch:
VMLAUNCH
jbe fail ;reached only if VMLAUNCH itself failed (CF=1 or ZF=1)
after_vmexit:
;read EXIT_REASON and figure out what caused the vmexit.
///////////////////////////////////////////////////////////////
The HOST_RIP is where control is transferred after a vmexit. The hypervisor can determine the appropriate course of action by reading the vmexit fields from the vmcs.
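The comment in the code says that a hypervisor would ideally read the VMX MSRs to decide what to write into the control fields. A minimal sketch of that adjustment is shown below, assuming the usual layout of the control capability MSRs (0x481 pin-based, 0x482 processor-based, 0x483 exit, 0x484 entry): the low 32 bits report bits that must be 1 and the high 32 bits report bits that may be 1. The function name is made up for this example.

```python
def adjust_vmx_control(desired, capability_msr):
    """Fit a desired 32-bit control value to what the processor allows.

    capability_msr is the 64-bit value read from the matching VMX
    capability MSR (the caller is assumed to have read it with rdmsr).
    """
    must_be_one = capability_msr & 0xFFFFFFFF         # allowed 0-settings
    may_be_one = (capability_msr >> 32) & 0xFFFFFFFF  # allowed 1-settings
    # Force the mandatory bits on, then drop anything not supported.
    return (desired | must_be_one) & may_be_one
```

With a made-up capability value of 0x0000007F_00000016, a desired value of 0x1f is kept as 0x1f, while a desired value of 0x100 is reduced to 0x16 because bit 8 is not supported and bits 1, 2 and 4 are mandatory. Hard-coded values such as the ones in initialize_vmx_controls can make VM entry fail on a processor whose capability MSRs disagree with them.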
