Wednesday, May 6, 2009

VMX EXECUTION_CONTROLS

Two types of execution controls are defined:
a) PIN_BASED execution controls
b) PROC_BASED execution controls



PIN_BASED Controls:
The vmcs encoding for this field is 0x4000. Two bits in this 32-bit field are of interest here:
Bit 0 – External Interrupt Exiting
Bit 3 – NMI Exiting
If bit 0 is 1, an external interrupt that arrives while the vmx-guest is running causes a vmexit.
Bit 3 controls how the processor responds to an NMI while running as a vmx-guest. If bit 3 is 1, an NMI received in the guest causes a vmexit.
The remaining bits are reserved or not covered here. The required settings of the reserved bits (0 or 1) are obtained by reading msr 0x481: the low 32 bits report the bits that must be 1 and the high 32 bits report the bits that may be 1. To initialize this field in the vmcs:


mov ecx, 0x481
rdmsr ; eax = bits that must be 1, edx = bits that may be 1
bts eax, 0 ; set bit0 to vmexit due to interrupts
bts eax, 3 ; nmi exiting bit = 1
and eax, edx ; drop any bit the processor does not allow to be 1
mov ebx, 0x4000 ; encoding for pin-based execution controls
vmwrite rbx, rax
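
The same adjustment can be written out in C to make the rule explicit. This is only a sketch: the helper names are made up for illustration, and the MSR value is passed in as a plain 64-bit number rather than read with rdmsr.

```c
#include <stdint.h>

/* Hypothetical helper: fold the desired control bits into the limits
 * reported by a VMX capability MSR (for example msr 0x481).
 * Low 32 bits of the MSR:  bits that must be 1.
 * High 32 bits of the MSR: bits that may be 1. */
static uint32_t vmx_adjust_controls(uint32_t desired, uint64_t cap_msr)
{
    uint32_t must_be_one = (uint32_t)cap_msr;
    uint32_t may_be_one  = (uint32_t)(cap_msr >> 32);

    return (desired | must_be_one) & may_be_one;
}

/* Returns nonzero if every desired bit survived the adjustment,
 * i.e. the processor allows all of them to be 1. */
static int vmx_controls_supported(uint32_t desired, uint64_t cap_msr)
{
    return (vmx_adjust_controls(desired, cap_msr) & desired) == desired;
}
```

Checking vmx_controls_supported() before the vmwrite lets the hypervisor notice that a control it depends on was silently dropped.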



PROC_BASED Controls:
The vmcs encoding for this field is 0x4002. It is a 32-bit field that determines the behavior of the processor when certain instructions are executed in the vmx-guest.
For example:
Bit 7 of this field controls the processor behavior upon execution of the HLT instruction in the vmx-guest. If 1, execution of HLT causes a vmexit. If 0, the instruction executes normally without any vmexit.

Similarly, bit 9 controls the behavior of the processor on INVLPG, bit 19 controls the behavior on mov-to-cr8, bit 20 controls the behavior on mov-from-cr8, and so on.

The bit positions are described below:

Control      Bit
INTRWINDOW   2
TSCOFFSET    3
HLT          7
INVLPG       9
MWAIT        10
RDPMC        11
RDTSC        12
CR8LOAD      19
CR8STORE     20
TPRSHADOW    21
MOVDR        23
IOUNCOND     24
IOBITMAP     25
MSRBITMAP    28
MONITOR      29
PAUSE        30



MSR 0x482 indicates the allowed-0 and allowed-1 settings of these controls.
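
The bit positions in the table can be turned into a mask with a few lines of C. This is a rough sketch: the enum and function names are invented for illustration, and the resulting mask would still have to be adjusted against msr 0x482 before being written to the vmcs.

```c
#include <stdint.h>

/* Bit positions taken from the table above. */
enum procbased_bit {
    PROC_INTRWINDOW = 2,  PROC_TSCOFFSET = 3,  PROC_HLT = 7,
    PROC_INVLPG = 9,      PROC_MWAIT = 10,     PROC_RDPMC = 11,
    PROC_RDTSC = 12,      PROC_CR8LOAD = 19,   PROC_CR8STORE = 20,
    PROC_TPRSHADOW = 21,  PROC_MOVDR = 23,     PROC_IOUNCOND = 24,
    PROC_IOBITMAP = 25,   PROC_MSRBITMAP = 28, PROC_MONITOR = 29,
    PROC_PAUSE = 30
};

/* Build a 32-bit control value with the listed bits set. */
static uint32_t procbased_mask(const enum procbased_bit *bits, unsigned count)
{
    uint32_t mask = 0;

    for (unsigned i = 0; i < count; i++)
        mask |= UINT32_C(1) << bits[i];
    return mask;
}

/* Returns nonzero if the given control bit is set in 'controls'. */
static int procbased_is_set(uint32_t controls, enum procbased_bit bit)
{
    return (int)((controls >> bit) & 1u);
}
```

For example, asking for PROC_HLT and PROC_MSRBITMAP yields a value with only bits 7 and 28 set.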

Note: Newer processors have additional bits defined for these controls. For more details see PRM Vol 3b.

Monday, May 4, 2009

VMX Exit Control fields

EXIT_CONTROLS:
This field is used by the processor during a vmexit. This is a 32-bit field (just like the entry controls); two of its bits are of interest here:
Bit 9 – Host address space size – This value is loaded into EFER.LME, EFER.LMA and CS.L on a vmexit, so it must be 1 for a 64-bit host.
Bit 15 – Acknowledge Interrupt on Exit – If there is a vmexit due to an external interrupt, this bit determines whether the processor acknowledges the interrupt. If it does, the interrupt vector is recorded in the VM-exit interruption-information field of the vmcs.
The remaining bits are reserved or not covered here. Reserved bits must be 0 or 1 as determined by the EXIT_CTLS_MSR (msr 0x483).


EXIT_CONTROL FOR MSR:
This works the same way as ENTRY_CONTROL FOR MSR. The only difference is in the vmcs encodings, which are listed below:

EXIT_MSR_STORE_ADDR EQU 0x2006
EXIT_MSR_STORE_COUNT EQU 0x400E
The guest MSRs are saved in the MSR-store area during a vmexit. If the vmentry MSR-load address points at the same area, these MSRs are loaded back from it on a subsequent vmentry.


EXIT_MSR_LOAD_ADDR EQU 0x2008
EXIT_MSR_LOAD_COUNT EQU 0x4010
The host MSRs are loaded from the physical address specified in EXIT_MSR_LOAD_ADDR.
The format of the msr-load/msr-store areas is the same as that of the msr-load area used for vmentry.

VMX Entry Control fields

VMX Control fields

Control fields are of 3 types:
a) Entry Control fields
b) Exit Control fields
c) Execution Control fields.

Entry Control fields:
Used during vmentry (vmentry is the process by which the CPU transitions from the host state to the guest state).


VMENTRY_CONTROLS:
This is a 32-bit field that sets up some critical information that is used by the processor during vmentry. Most of the bits in this field are reserved.
Among the bits that are defined, the following 3 are interesting:
bit 9 - Guest is in long mode
bit 10 - Guest is in SMM
bit 11 - Deactivate Dual monitor treatment
For normal vmentries, bit 10 and bit 11 are always 0. Bit 9 can be 0 or 1 depending on whether the guest is in long-mode or protected mode.

Note:

(A) If a guest will be in compatibility-mode, bit 9 must be set to 1. When the processor loads state during vmentry, if the GUEST_CS.L bit is 0 and bit 9 of entry_control is 1, then the guest will be in compatibility-mode after vmentry.

(B) During vmentry the value of bit 9 is copied into EFER.LMA. Since CR0.PG is fixed to 1, the value is also copied into EFER.LME.
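
Note (A) amounts to a small decision table, sketched below in C. The enum and function names are assumptions made for illustration, and the sketch ignores the other consistency checks the processor performs on guest state during vmentry.

```c
#include <stdint.h>

enum guest_mode {
    GUEST_PROTECTED_MODE,     /* bit 9 of the entry controls is 0 */
    GUEST_COMPATIBILITY_MODE, /* bit 9 is 1, GUEST_CS.L is 0      */
    GUEST_64BIT_MODE          /* bit 9 is 1, GUEST_CS.L is 1      */
};

/* Bit 9 of the entry controls: guest is in long mode. */
#define ENTRY_CTL_LONG_MODE_GUEST (UINT32_C(1) << 9)

/* Hypothetical helper: which mode the guest runs in after vmentry,
 * given the entry controls and the L bit of the guest CS. */
static enum guest_mode guest_mode_after_entry(uint32_t entry_controls,
                                              int guest_cs_l)
{
    if (!(entry_controls & ENTRY_CTL_LONG_MODE_GUEST))
        return GUEST_PROTECTED_MODE;
    return guest_cs_l ? GUEST_64BIT_MODE : GUEST_COMPATIBILITY_MODE;
}
```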

Sample code to set up entry controls:

To set up this field, software should read msr 0x484 and extract the allowed-0 and allowed-1 settings of this field.
mov ecx, 0x484
rdmsr ; eax = bits that must be 1, edx = bits that may be 1
bts eax, 9 ; guest is in long mode (leave bit 9 clear for a protected-mode guest)
btr eax, 10 ; clear the SMM bit
btr eax, 11 ; clear the deactivate dual monitor bit
and eax, edx ; drop any bit the processor does not allow to be 1
mov rbx, 0x4012 ; encoding for entry controls
vmwrite rbx, rax



VMENTRY_CONTROL_MSR:
This field is used when MSRs are to be loaded as part of vmentry. This is sometimes required so that the hypervisor can present the guest with an MSR value different from the host value.


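Sample code:
MSR_LOAD_ADDR EQU 0x200a
MSR_LOAD_COUNT EQU 0x4014
mov rax, my_msr_address ; physical address of the msr-load area
mov rbx, MSR_LOAD_ADDR
vmwrite rbx, rax
mov rax, 1 ; one entry in the area
mov rbx, MSR_LOAD_COUNT
vmwrite rbx, rax
align 16
my_msr_address:
dd msr_index ; bits 31:0 - index of the MSR to load (placeholder name)
dd 0 ; bits 63:32 - reserved
dd msr_data_lo ; bits 95:64 - low half of the value
dd msr_data_hi ; bits 127:96 - high half of the value
Note:
my_msr_address is the Physical Address of the msr-load area in memory.
Each entry in the area is 16 bytes and must follow the layout shown above. my_msr_address must be 16B aligned.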
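
For a hypervisor written in C, the 16-byte entry could be described with a struct along these lines. This is a sketch under assumptions: the type and helper names are invented here, and the alignment check works on a physical address that the caller has obtained by its own means.

```c
#include <stdint.h>
#include <stddef.h>

/* One 16-byte entry of an msr-load or msr-store area. */
struct vmx_msr_entry {
    uint32_t index;    /* bits 31:0   - MSR index      */
    uint32_t reserved; /* bits 63:32  - must be 0      */
    uint64_t data;     /* bits 127:64 - MSR value      */
};

/* Catch layout surprises at compile time. */
_Static_assert(sizeof(struct vmx_msr_entry) == 16, "entry must be 16 bytes");
_Static_assert(offsetof(struct vmx_msr_entry, data) == 8, "data at byte 8");

/* Hypothetical helper: fill in one entry of the area. */
static void vmx_msr_entry_set(struct vmx_msr_entry *e,
                              uint32_t index, uint64_t data)
{
    e->index = index;
    e->reserved = 0;
    e->data = data;
}

/* The physical address of the area must be 16-byte aligned. */
static int vmx_msr_area_aligned(uint64_t phys_addr)
{
    return (phys_addr & 0xF) == 0;
}
```

The number of entries filled in this way is what goes into MSR_LOAD_COUNT.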



VMENTRY_CONTROL_EVENT_INJECTION:
This field is used when delivering an event/exception to the guest during vmentry. For example, if the hypervisor wants control to be transferred to the guest's #GP handler, it would do the following:


mov rax, 0x4016 ; vmcs encoding
mov rbx, 0x80000B0D ; bit 31 = valid, bit 11 = deliver error code, bits 10:8 = 3 -> HW exception, bits 7:0 = 0x0d (vector 13)
vmwrite rax, rbx



Because bit 11 is set, the error code to deliver must also be written to the VM-entry exception error code field (encoding 0x4018). Vol 3b has more details on this vmcs field. The hypervisor might use this technique to handle a vmexit from the guest due to an exception.
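
The value written to the event-injection field can be assembled from its parts, as in the C sketch below. The function and constant names are hypothetical; the bit layout is the one used in the example above (bit 31 valid, bit 11 deliver error code, bits 10:8 type, bits 7:0 vector).

```c
#include <stdint.h>

/* Event types for bits 10:8 of the field. */
enum vmentry_event_type {
    EVT_EXTERNAL_INTERRUPT = 0,
    EVT_NMI                = 2,
    EVT_HW_EXCEPTION       = 3,
    EVT_SW_INTERRUPT       = 4
};

/* Hypothetical helper: build the 32-bit event-injection value. */
static uint32_t vmentry_event_info(uint8_t vector,
                                   enum vmentry_event_type type,
                                   int deliver_error_code)
{
    uint32_t info = UINT32_C(1) << 31;         /* valid bit           */

    if (deliver_error_code)
        info |= UINT32_C(1) << 11;             /* deliver error code  */
    info |= ((uint32_t)type & 7u) << 8;        /* event type          */
    info |= vector;                            /* vector in bits 7:0  */
    return info;
}
```

Injecting #GP (vector 13) as a hardware exception with an error code gives 0x80000B0D, the constant used in the assembly snippet.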

Initializing the VMCS

Software initializes the vmcs by using the vmwrite instruction. It can read the value from the vmcs using the vmread instruction. The VMCS is divided into four areas:

(a) Host Area
(b) Guest Area
(c) VMX Control fields
(d) VMX Exit Information fields


Each VMCS field is identified by an encoding which is used by the processor to write into the appropriate place in the vmcs.
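
The encodings are not arbitrary numbers. As described in Vol 3b, bits 14:13 of an encoding give the field width, bits 11:10 give the field type and bits 9:1 give an index within that group. The C sketch below decodes these parts; the enum and function names are made up for illustration.

```c
#include <stdint.h>

enum vmcs_width {
    VMCS_WIDTH_16      = 0,
    VMCS_WIDTH_64      = 1,
    VMCS_WIDTH_32      = 2,
    VMCS_WIDTH_NATURAL = 3
};

enum vmcs_type {
    VMCS_TYPE_CONTROL   = 0,
    VMCS_TYPE_EXIT_INFO = 1,
    VMCS_TYPE_GUEST     = 2,
    VMCS_TYPE_HOST      = 3
};

/* Bits 14:13 - width of the field. */
static enum vmcs_width vmcs_field_width(uint32_t encoding)
{
    return (enum vmcs_width)((encoding >> 13) & 3u);
}

/* Bits 11:10 - which area of the vmcs the field belongs to. */
static enum vmcs_type vmcs_field_type(uint32_t encoding)
{
    return (enum vmcs_type)((encoding >> 10) & 3u);
}

/* Bits 9:1 - index of the field within its width/type group. */
static unsigned vmcs_field_index(uint32_t encoding)
{
    return (encoding >> 1) & 0x1FFu;
}
```

Decoding 0x0C0C this way gives a 16-bit host-state field with index 6, which is the Host TR selector.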

Host Area:

Host selector fields:
------------------------
Host ES selector 0xC00
Host CS selector 0xC02
Host SS selector 0xC04
Host DS selector 0xC06
Host FS selector 0xC08
Host GS selector 0xC0A
Host TR selector 0xC0C
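As an example, say the hypervisor wants to initialize the Task register selector with a value of 0x18:
mov rax, 0x0C0C ; encoding for Host TR selector
mov rbx, 0x18 ; selector value
vmwrite rax, rbx ; first operand is the encoding, second is the value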


To read a value from the vmcs, vmread is used:
mov rax, 0x0C0C
vmread rcx, rax ; Read from Host TR selector


Other Host state fields:
Host CR0 0x6C00
Host CR3 0x6C02
Host CR4 0x6C04
Host FS base 0x6C06
Host GS base 0x6C08
Host TR base 0x6C0A
Host GDTR base 0x6C0C
Host IDTR base 0x6C0E
Host IA32_SYSENTER_ESP 0x6C10
Host IA32_SYSENTER_EIP 0x6C12
Host RSP 0x6C14
Host RIP 0x6C16


As an example, to write to host_cr0 in the vmcs, the following code snippet may be used:
mov rbx, cr0
mov rax, 0x6c00 ; encoding for host CR0
vmwrite rax,rbx



Similarly, the other host state fields are to be initialized. For a complete list of the vmcs fields see Intel PRM Vol 3b.


Guest Area
The technique to initialize the guest-state area is the same as for the host-state area. Hypervisors use the vmwrite instruction to initialize the guest-state area. The encodings used as operands to the vmwrite instruction are the guest-state encodings. Here are a few examples:


Guest CR0 0x6800
Guest CR3 0x6802
Guest CR4 0x6804
Guest ES base 0x6806
Guest CS base 0x6808


Follow the same approach as before to write to these vmcs fields. For example, to write a guest CR4 value that has PAE=1, PGE=1, OSFXSR=1 and OSXMMEXCPT=1, do the following:

mov rbx, 0x6A0 ; required value in cr4
mov rax, 0x6804 ; GUEST_CR4 encoding
vmwrite rax, rbx
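
The constant 0x6A0 is simply the OR of the four CR4 bits named above, as the C sketch below shows. The sketch also folds in the fixed bits that the processor requires in CR4 during VMX operation (reported by msr 0x488 and msr 0x489, which typically force VMXE, bit 13, to 1). The macro and function names are assumptions, and the fixed-bit values are passed in as plain numbers rather than read from the MSRs.

```c
#include <stdint.h>

#define CR4_PAE        (UINT64_C(1) << 5)
#define CR4_PGE        (UINT64_C(1) << 7)
#define CR4_OSFXSR     (UINT64_C(1) << 9)
#define CR4_OSXMMEXCPT (UINT64_C(1) << 10)
#define CR4_VMXE       (UINT64_C(1) << 13)

/* The value used in the snippet above: 0x6A0. */
#define GUEST_CR4_WANTED \
    (CR4_PAE | CR4_PGE | CR4_OSFXSR | CR4_OSXMMEXCPT)

/* Hypothetical helper: force the bits that must be 1 (fixed0) and
 * clear the bits that must be 0 (those that are 0 in fixed1). */
static uint64_t cr4_apply_fixed(uint64_t cr4, uint64_t fixed0,
                                uint64_t fixed1)
{
    return (cr4 | fixed0) & fixed1;
}
```

With fixed0 = 0x2000, the adjusted value becomes 0x26A0; a hypervisor that does this would normally hide the extra VMXE bit from the guest.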

A similar approach is adopted for initializing other GUEST_STATE fields.