Thursday, April 23, 2009

VMPTRLD - Load VMCS pointer

vmptrld loads the pointer to the vmcs of the virtual machine that is about to be launched. VMCS stands for Virtual Machine Control Structure: a region in memory that holds all the data for that virtual machine. The instruction is used much like vmxon:

vmptrld [vmcs_ptr]
vmcs_ptr dq vmcs_region

vmcs_region:
rev_id dd 0

As with vmxon, write the revision-id reported by the processor (contained in msr 0x480) into the rev_id field of the vmcs_region before executing vmptrld. Also as with vmxon, the vmcs_region must be located on a 4K boundary.
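
The two requirements above (revision id in the first 4 bytes, region on a 4K boundary) can be sketched in C. This is only an illustration: vmx_region_init, vmx_is_4k_aligned and VMX_REGION_SIZE are hypothetical names, the region is assumed to be one 4K page, and the check is done on the pointer it is given, which matches the physical address only when the low 12 bits are the same (identity mapping or ordinary page mapping).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VMX_REGION_SIZE 4096u  /* assumption: one 4K page per region */

/* Returns true if addr sits on a 4K boundary. */
static bool vmx_is_4k_aligned(uint64_t addr)
{
    return (addr & (VMX_REGION_SIZE - 1u)) == 0;
}

/*
 * Zero a vmcs (or vmxon) region and write the revision id into its
 * first 4 bytes. Returns false and leaves the region untouched if the
 * region is NULL or not 4K aligned.
 */
static bool vmx_region_init(void *region, uint32_t rev_id)
{
    if (region == NULL || !vmx_is_4k_aligned((uint64_t)(uintptr_t)region))
        return false;
    memset(region, 0, VMX_REGION_SIZE);
    memcpy(region, &rev_id, sizeof rev_id);
    return true;
}
```

The rev_id argument would be the value read from msr 0x480; the helper itself never touches the msr.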


The only other thing worth mentioning is that vmptrld fails if you pass it the vmxon pointer as its operand. A code sequence like the one shown below is guaranteed to fail at vmptrld:
vmxon [vmxon_ptr]
jbe vmxon_failed
vmptrld [vmxon_ptr]
jbe vmptrld_failed

When the processor executes vmptrld, it detects that the operand points to the same region that was given to vmxon. This causes vmptrld to fail.
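
A hypervisor could catch this mistake in software before it ever reaches the instruction. The following C sketch is a hypothetical pre-check (vmptrld_operand_ok is not an Intel-defined name) covering only the two failure causes mentioned in this post: a misaligned region and a pointer equal to the vmxon pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical pre-check for a vmptrld operand.
 * vmcs_pa:  physical address about to be handed to vmptrld
 * vmxon_pa: physical address that was handed to vmxon
 */
static bool vmptrld_operand_ok(uint64_t vmcs_pa, uint64_t vmxon_pa)
{
    if ((vmcs_pa & 0xFFFu) != 0)   /* not on a 4K boundary */
        return false;
    if (vmcs_pa == vmxon_pa)       /* same region that vmxon is using */
        return false;
    return true;
}
```

Passing this check does not guarantee that vmptrld succeeds; the flags must still be tested after the instruction.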

It is also good practice to execute vmclear on the vmcs before loading its pointer with vmptrld. So the hypervisor may want to do this:
vmclear [vmcs_ptr]
jbe vmclear_failed
vmptrld [vmcs_ptr]
jbe vmptrld_failed

At this point we have executed vmxon, entered VMX_ROOT mode, initialized the virtual machine's vmcs with vmclear, and loaded its pointer into the processor by executing vmptrld. The next step is to initialize the vmcs with the data of the virtual machine (henceforth referred to as the guest) and then launch the guest.

Wednesday, April 22, 2009

More on VMXON

vmxon takes as its operand a pointer to the vmxon region.

The code may look like this:

vmxon [vmxon_ptr]


vmxon_ptr dq vmxon_region_begin
vmxon_region_begin: vmxon_rev_id dd 0

Key things to note in the above snippet:

a. The operand to vmxon is vmxon_ptr, a 64-bit memory location that holds the address of the vmxon_region. Note: that address is a physical address.

b. The vmxon_region begins with a 4-byte revision-id field (vmxon_rev_id in the snippet). The hypervisor is expected to write the revision-id reported by the processor into this field.

How does the hypervisor determine the revision-id?

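On Intel processors, the revision-id is contained in VMX_BASIC_MSR (0x480). Bits 30:0 of this MSR contain the revision-id of the processor; bit 31 always reads as 0, so the low 32 bits returned in eax can be written to the region as they are.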
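
For readers who prefer C, here is a small sketch of pulling the revision id out of the value that rdmsr returns. The function names are made up for illustration. The mask keeps bits 30:0 because, per the Intel manual, bit 31 of this MSR reads as 0; the sketch only decodes a value that has already been read and does not execute rdmsr itself.

```c
#include <stdint.h>

/* rdmsr returns the MSR in edx:eax; join the two halves into one value. */
static uint64_t msr_join(uint32_t eax, uint32_t edx)
{
    return ((uint64_t)edx << 32) | eax;
}

/* Extract the revision id (bits 30:0) from a VMX_BASIC_MSR value. */
static uint32_t vmx_basic_rev_id(uint64_t vmx_basic)
{
    return (uint32_t)(vmx_basic & 0x7FFFFFFFu);
}
```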

So, for the example above, the hypervisor may want to do:

///////////////////////////////////////////////////
xor eax, eax
xor edx, edx
mov ecx, 0x480
rdmsr ; after rdmsr eax has the revision id
mov dword [vmxon_rev_id], eax ; write the rev-id into vmxon region
vmxon [vmxon_ptr]
////////////////////////////////////////////////

Note: The Intel PRM also specifies that the vmxon_region must be aligned on a 4K boundary. If it is not 4K aligned, VMXON is guaranteed to fail.

It is worth repeating that the operand to vmxon is a pointer to the vmxon_region, and that this pointer is a physical address. Hence vmxon regions should reside in memory that is never paged out.

Successful completion of vmxon will cause the processor to enter the VMX_ROOT operation.

Hypervisors must also check whether the execution of vmxon was successful. That can be done by checking the state of eflags.ZF and eflags.CF. If eflags.ZF=0 and eflags.CF=0, then vmxon was successful.

Continuing our previous code:

///////////////////////////////////////////////////
xor eax, eax
xor edx, edx
mov ecx, 0x480
rdmsr
mov dword [vmxon_rev_id], eax
vmxon [vmxon_ptr]

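jbe vmxon_failed ; taken if eflags.CF=1 or eflags.ZF=1

vmxon_pass:
; if i am here then eflags.ZF=0 and eflags.CF=0. So vmxon was successful.
; (real code must not fall through from here into vmxon_failed)

vmxon_failed:
; either eflags.ZF=1 or eflags.CF=1
; handle failed code here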
//////////////////////////////////////////////////
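
The flag test that jbe performs can also be written out in C, which makes the three possible outcomes explicit. The Intel manual calls them VMsucceed (CF=0 and ZF=0), VMfailInvalid (CF=1) and VMfailValid (ZF=1). The enum and function names below are hypothetical, and the sketch works on a saved copy of eflags rather than the live register.

```c
#include <stdint.h>

#define EFLAGS_CF (1u << 0)  /* carry flag is bit 0 of eflags */
#define EFLAGS_ZF (1u << 6)  /* zero flag is bit 6 of eflags */

enum vmx_result {
    VMX_SUCCEED,       /* CF=0 and ZF=0 */
    VMX_FAIL_INVALID,  /* CF=1 */
    VMX_FAIL_VALID     /* ZF=1 */
};

/* Classify the outcome of a VMX instruction from the saved eflags. */
static enum vmx_result vmx_classify(uint32_t eflags)
{
    if (eflags & EFLAGS_CF)
        return VMX_FAIL_INVALID;
    if (eflags & EFLAGS_ZF)
        return VMX_FAIL_VALID;
    return VMX_SUCCEED;
}
```

Anything other than VMX_SUCCEED corresponds to the vmxon_failed branch in the assembly above.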

First look at VMXON

Hypervisors begin by executing the vmxon instruction. VMXON enables vmx operation. Execution of VMXON puts the processor in VMX_ROOT mode. There are a few things that the hypervisor must ensure before executing vmxon:

1. Hypervisor must turn on the CR4.VMXE bit. The VMX enable (VMXE) bit is bit 13 of CR4. A typical code sequence would be:

mov eax, cr4
or eax, 0x2000
mov cr4, eax

Executing VMXON without CR4.VMXE=1 will cause the processor to generate a #UD (invalid opcode) exception.

2. Hypervisor must set the fixed bits of CR0. CR0.NE, CR0.PG and CR0.PE are all fixed bits in vmx operation and they must remain 1 for as long as the processor is in VMX_ROOT operation.

Any attempt to clear the fixed bits of CR0 after executing vmxon will cause the processor to generate a #GP exception.

3. A20M: A20M# must be off prior to the execution of vmxon. Violating this will result in a #GP.

4. The hypervisor must ensure that, prior to the execution of vmxon, the processor is not in V86 mode (eflags.vm must be 0) and not in compatibility mode (efer.lma && !cs.l must be false).

All 4 of the above conditions must be satisfied for vmxon to execute without faulting. (For it to complete successfully, a few other things need to be done.)
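
The four conditions can be collected into one check. The C sketch below is an illustration, not a complete validation: struct cpu_state and vmxon_preconditions_ok are hypothetical, the caller is assumed to have captured the register values already, and the A20M state is passed in as a flag because how it is obtained depends on the platform.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE    (1ull << 0)   /* protection enable */
#define CR0_NE    (1ull << 5)   /* numeric error */
#define CR0_PG    (1ull << 31)  /* paging */
#define CR4_VMXE  (1ull << 13)  /* vmx enable */
#define EFLAGS_VM (1ull << 17)  /* virtual-8086 mode */
#define EFER_LMA  (1ull << 10)  /* long mode active */

/* Hypothetical snapshot of the state that matters for vmxon. */
struct cpu_state {
    uint64_t cr0, cr4, rflags, efer;
    bool cs_l;   /* L bit of the current code segment */
    bool a20m;   /* true if A20M# is asserted */
};

static bool vmxon_preconditions_ok(const struct cpu_state *s)
{
    const uint64_t cr0_fixed = CR0_PE | CR0_NE | CR0_PG;
    const bool compat_mode = (s->efer & EFER_LMA) != 0 && !s->cs_l;

    return (s->cr4 & CR4_VMXE) != 0              /* condition 1 */
        && (s->cr0 & cr0_fixed) == cr0_fixed     /* condition 2 */
        && !s->a20m                              /* condition 3 */
        && (s->rflags & EFLAGS_VM) == 0          /* condition 4: not V86 */
        && !compat_mode;                         /* condition 4: not compat */
}
```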

Note that:
Assertions of INIT# will not be recognized by the cpu after the execution of vmxon. I think INIT# just stays pending until it gets unblocked.

VMX instructions in x86

Note: Intel PRM Vol 3b has a lot of details on VMX. If you want a quick snapshot, read this blog first and then Vol 3b will seem tractable.


To enable virtual machine architecture, Intel provides new instructions as part of its Virtual Machine Extensions (abbreviated VMX) instruction set. This instruction set is different from the one that AMD provides for SVM.

Here is a quick look at the instructions:

(a) VMXON - enter vmx operation

(b) VMXOFF - leave vmx operation

(c) VMREAD - read from the vmcs (vmcs will be discussed later)

(d) VMWRITE - write to the vmcs

(e) VMPTRLD - load vmcs pointer

(f) VMPTRST - store vmcs pointer

(g) VMLAUNCH/VMRESUME - launch or resume virtual machine

(h) VMCALL - call to the hypervisor

(i) VMCLEAR - clear (initialize) a vmcs

Processor/Firmware settings for VMX:

1. To make sure your processor supports VMX, execute CPUID with eax=1 (leaf 1) and check bit 5 of ecx. If the bit is set, the CPU supports VMX; otherwise it does not.

2. In addition to the above, the BIOS must enable VMX by a write to the FEATURE_CONTROL_MSR (address 0x3a). If the msr value is initialized to 0x5 (bit0=1 and bit2=1), then vmx is enabled.
Bit 0 of the msr is the lock bit. Once it is set, the msr is protected: the processor will throw a #GP exception on any wrmsr to it. Bit 2 is the VMXON_ENABLE bit. Executing VMXON without bit 2 set, or with the lock bit still clear, will cause the processor to generate a #GP exception.
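
Both checks come down to testing a couple of bits. The following C sketch shows the bit tests only, under the assumption that the caller has already obtained ecx from CPUID leaf 1 and the value of msr 0x3a; the function and macro names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_VMX        (1u << 5)    /* CPUID leaf 1, ecx bit 5 */
#define FEATURE_CONTROL_LOCK  (1ull << 0)  /* msr 0x3a, bit 0 */
#define FEATURE_CONTROL_VMXON (1ull << 2)  /* msr 0x3a, bit 2 */

/* True if the ecx value returned by CPUID leaf 1 reports VMX support. */
static bool cpu_has_vmx(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & CPUID1_ECX_VMX) != 0;
}

/* True if the firmware both enabled vmxon and locked the msr (e.g. 0x5). */
static bool firmware_enabled_vmx(uint64_t feature_control)
{
    const uint64_t need = FEATURE_CONTROL_LOCK | FEATURE_CONTROL_VMXON;
    return (feature_control & need) == need;
}
```

A hypervisor would typically refuse to continue unless both functions return true, since vmxon would otherwise fault.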