Thursday, November 20, 2008

VMware does a performance study on AMD's RVI

Nice read, this doc.

In a native system the operating system maintains a mapping of logical page numbers (LPNs) to physical page numbers (PPNs) in page table structures. When a logical address is accessed, the hardware walks these page tables to determine the corresponding physical address. For faster memory access the x86 hardware caches the most recently used LPN->PPN mappings in its translation lookaside buffer (TLB).
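To make the native case concrete, here is a rough Python sketch of an LPN->PPN lookup with a small TLB in front of it. It is a toy model, not how x86 hardware is actually organized: the single flat `page_table` dict, the 4 KiB `PAGE_SIZE`, the `Tlb` class and its LRU policy are all assumptions made for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # 4 KiB pages, assumed for illustration


class Tlb:
    """Tiny LRU cache of LPN -> PPN translations (hypothetical model)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, lpn):
        if lpn in self.entries:
            self.entries.move_to_end(lpn)  # mark as most recently used
            self.hits += 1
            return self.entries[lpn]
        self.misses += 1
        return None

    def fill(self, lpn, ppn):
        self.entries[lpn] = ppn
        self.entries.move_to_end(lpn)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


def translate(logical_addr, page_table, tlb):
    """Translate a logical address, consulting the TLB before the page table."""
    lpn, offset = divmod(logical_addr, PAGE_SIZE)
    ppn = tlb.lookup(lpn)
    if ppn is None:
        ppn = page_table[lpn]  # stands in for the hardware page walk
        tlb.fill(lpn, ppn)
    return ppn * PAGE_SIZE + offset


page_table = {0: 7, 1: 3, 2: 9}  # made-up LPN -> PPN mappings
tlb = Tlb()
first = translate(1 * PAGE_SIZE + 0x10, page_table, tlb)   # TLB miss, walks the table
second = translate(1 * PAGE_SIZE + 0x20, page_table, tlb)  # TLB hit, same page
```

The second access lands on the same page as the first, so it is served from the TLB without touching the page table.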

In a virtualized system the guest operating system maintains page tables just like in a native system, but the VMM maintains an additional mapping of PPNs to machine page numbers (MPNs). In shadow paging the VMM maintains PPN->MPN mappings in its internal data structures and stores LPN->MPN mappings in shadow page tables that are exposed to the hardware. The most recently used LPN->MPN translations are cached in the hardware TLB. The VMM keeps these shadow page tables synchronized to the guest page tables. This synchronization introduces virtualization overhead when the guest updates its page tables.
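A small sketch of the shadow paging idea, again as a toy model: the `ShadowPager` class and its method names are hypothetical, and a real VMM would typically notice guest page table updates by trapping them rather than through explicit calls like these. The point is only that every guest update forces the VMM to recompute the matching LPN->MPN entry.

```python
class ShadowPager:
    """Toy model of shadow paging (names are illustrative, not VMware's)."""

    def __init__(self, ppn_to_mpn):
        self.ppn_to_mpn = dict(ppn_to_mpn)  # VMM-internal PPN -> MPN map
        self.guest = {}    # guest page table: LPN -> PPN
        self.shadow = {}   # shadow page table: LPN -> MPN, seen by hardware
        self.sync_count = 0

    def guest_map(self, lpn, ppn):
        """Guest installs or changes a mapping; the VMM must resync."""
        self.guest[lpn] = ppn
        self._sync(lpn)

    def guest_unmap(self, lpn):
        """Guest removes a mapping; the VMM must resync."""
        self.guest.pop(lpn, None)
        self._sync(lpn)

    def _sync(self, lpn):
        # This work is the virtualization overhead of shadow paging.
        self.sync_count += 1
        ppn = self.guest.get(lpn)
        if ppn is None:
            self.shadow.pop(lpn, None)
        else:
            self.shadow[lpn] = self.ppn_to_mpn[ppn]


pager = ShadowPager({0: 100, 1: 101})  # made-up PPN -> MPN assignments
pager.guest_map(5, 1)   # shadow now maps LPN 5 to MPN 101
pager.guest_map(5, 0)   # guest remaps; shadow must follow to MPN 100
pager.guest_unmap(5)    # guest unmaps; shadow entry is dropped
```

Three guest updates cost three synchronizations here, which is why workloads that modify their page tables often pay the most under shadow paging.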

Using RVI, the guest operating system continues to maintain LPN->PPN mappings in the guest page tables, but the VMM maintains PPN->MPN mappings in a second level of page tables, called nested page tables. When a logical address is accessed, the hardware walks the guest page tables as in the case of native execution, but for every PPN accessed during the guest page table walk, the hardware also walks the nested page tables to determine the corresponding MPN. This composite translation eliminates the need to maintain shadow page tables and synchronize them with the guest page tables. However the extra operation also increases the cost of a page walk, thereby impacting the performance of applications that stress the TLB. This cost can be reduced by using large pages, thus reducing the stress on the TLB for applications with good spatial locality.
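A back-of-the-envelope way to see why the nested walk is more expensive is to count memory references on a TLB miss. The sketch below is my own illustration, not taken from the whitepaper: it assumes the worst case with no page walk caching, 4-level tables for 4 KiB pages on x86-64, and one level fewer when 2 MiB large pages are used. The function names are made up.

```python
def native_walk_refs(levels):
    """Memory references for a native page walk: one per table level."""
    return levels


def nested_walk_refs(guest_levels, nested_levels):
    """Worst-case memory references for a two-dimensional (nested) walk.

    Each guest table pointer is a guest-physical address, so reading one
    guest entry costs a full nested walk plus the read itself. The final
    guest-physical address needs one more nested walk.
    """
    per_guest_level = nested_levels + 1
    return guest_levels * per_guest_level + nested_levels


native = native_walk_refs(4)            # 4 KiB pages, native
small_small = nested_walk_refs(4, 4)    # 4 KiB guest pages, 4 KiB nested pages
small_large = nested_walk_refs(4, 3)    # 4 KiB guest pages, 2 MiB nested pages
large_large = nested_walk_refs(3, 3)    # 2 MiB pages at both levels

print(native, small_small, small_large, large_large)  # prints "4 24 19 15"
```

Large pages help twice under these assumptions: each miss walks fewer levels, and each TLB entry covers 2 MiB instead of 4 KiB, so misses are rarer for applications with good spatial locality.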

For optimal performance the ESX VMM and VMkernel aggressively try to use large pages for their own memory when RVI is used.

Get the Whitepaper here
