Today, more and more workloads are running in virtual machines (VMs), including workloads that require significantly more I/O in the guest operating system. In a VM on VMware vSphere, all virtual disks (VMDKs) are attached to the LSI Logical SAS SCSI adapter in the default configuration. This adapter is recognized by all operating systems without installing additional drivers, but it does not always provide the best performance, especially when an SSD RAID or NVMe storage is used. In this article we compare the virtual storage controllers LSI Logical SAS, VMware Paravirtual, and NVMe.
Controller models
The standard controller in almost every VM is the LSI Logical SAS SCSI controller. This controller is recognized and supported by every guest operating system without additional drivers. It is suitable for almost any workload that does not have large I/O requirements. It is also necessary for the configuration of Microsoft Server Cluster Service (MSCS).
Starting with ESXi 4.0 and virtual hardware version 7, the VMware Paravirtual controller is available. This controller was developed for high-performance storage systems: it can handle much higher I/O rates while reducing CPU load. The VMware Tools must be installed for the guest operating system to be able to use this controller.
Starting with ESXi 6.5 and virtual hardware version 13, an NVMe controller can also be added to the VM. This controller further optimizes performance for SSD RAIDs, NVMe and PMEM storage, and it is the default controller for Windows VMs in vSphere 7.0.
The choice of the right controller depends on the applications within the VM. For an office VM, for example, relatively little storage performance is required and the standard LSI Logical SAS SCSI controller is sufficient. If more storage performance is required within the VM and the underlying storage system can deliver it, the VMware Paravirtual controller is usually the better fit. For absolute high-end performance, when an SSD RAID, NVMe or PMEM storage is used and the VM has very high performance requirements, the NVMe controller is the best choice.
Performance test
We have conducted various performance tests for different scenarios. The test scenarios are only examples; the individual values should be adapted to your own workload to obtain realistic results. Details of the test system used:
Hardware / Software:
- Supermicro Mainboard X11DPi-NT
- 2x Intel Xeon Gold 5222 (3.80GHz, 4-Core, 16.5MB)
- 256GB ECC Registered (RDIMM) DDR4 2666 RAM 4 Rank
- 3.2 TB Samsung SSD NVMe PCI-E 3.0 (PM1725b)
- ESXi 6.7.0 Update 2 (Build 13981272)
Test VM
- Windows 10 Pro (18362)
- 2 CPU sockets
- 8 vCPUs
- 8GB RAM
- Storage controllers tested: LSI Logical SAS, VMware Paravirtual, NVMe Controller
- Thick-Provisioned eager-zeroed VMDK
[Screenshots: VM configuration with the LSI Logical SAS, VMware Paravirtual, and NVMe controllers]
Performance Comparison
Database Server
| Database Server (8K Random; 70% Read; 8 Threads; 16 Outstanding IO) | IOPS | MByte/s | Latency (ms) | CPU (%) |
|---|---|---|---|---|
| LSI Logical SAS | 78210.16 | 611.02 | 1.633 | 24.81 |
| VMware Paravirtual | 153723.45 | 1200.96 | 0.832 | 31.27 |
| NVMe Controller | 203612.54 | 1590.72 | 0.628 | 48.03 |
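The article does not state which load generator produced these numbers. As an illustration only, a comparable Database Server profile (8K blocks, 70% read, 8 worker threads, 16 outstanding I/Os per thread) could be generated inside the Windows 10 test VM with Microsoft's DiskSpd; the test file path, file size and runtime below are arbitrary choices.

```
:: 8K random I/O, 30% writes / 70% reads, 8 threads, 16 outstanding I/Os per thread (-o is per thread),
:: 120 s runtime, 20 GiB test file, software/hardware caching disabled, latency statistics enabled
diskspd.exe -b8K -r -w30 -t8 -o16 -d120 -Sh -L -c20G D:\iotest.dat
```

DiskSpd reports IOPS, throughput, and latency per thread and in total, which maps onto the columns of the tables in this section.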
E-Mail Server
| E-Mail Server (4K Random; 60% Read; 8 Threads; 16 Outstanding IO) | IOPS | MByte/s | Latency (ms) | CPU (%) |
|---|---|---|---|---|
| LSI Logical SAS | 83403.47 | 325.79 | 1.506 | 23.52 |
| VMware Paravirtual | 157624.97 | 615.72 | 0.811 | 31.46 |
| NVMe Controller | 236622.59 | 924.31 | 0.540 | 52.11 |
File Server
| File Server (64K Sequential; 90% Read; 8 Threads; 16 Outstanding IO) | IOPS | MByte/s | Latency (ms) | CPU (%) |
|---|---|---|---|---|
| LSI Logical SAS | 44739.43 | 2796.21 | 2.860 | 12.29 |
| VMware Paravirtual | 53717.26 | 3357.33 | 2.382 | 16.87 |
| NVMe Controller | 48929.05 | 3058.07 | 2.615 | 14.14 |
Streaming Server
| Streaming Server (5120K Random; 80% Read; 8 Threads; 16 Outstanding IO) | IOPS | MByte/s | Latency (ms) | CPU (%) |
|---|---|---|---|---|
| LSI Logical SAS | 458.16 | 2290.81 | 279.607 | 2.18 |
| VMware Paravirtual | 504.22 | 2521.10 | 253.949 | 12.26 |
| NVMe Controller | 505.14 | 2525.68 | 253.659 | 1.56 |
VDI Workload
| VDI Workload (4K Random; 20% Read; 8 Threads; 8 Outstanding IO) | IOPS | MByte/s | Latency (ms) | CPU (%) |
|---|---|---|---|---|
| LSI Logical SAS | 140155.89 | 547.48 | 0.456 | 35.69 |
| VMware Paravirtual | 163073.26 | 637.00 | 0.392 | 37.98 |
| NVMe Controller | 203464.89 | 794.78 | 0.314 | 49.55 |
Author: Sebastian Köbke
The Linux SCSI Target Wiki
| vStorage APIs for Array Integration | |
|---|---|
| Original author(s) | Nicholas Bellinger |
| Developer(s) | Datera, Inc. |
| Development status | Production |
| Written in | C |
| Operating system | Linux |
| Type | T10 SCSI feature |
| License | GNU General Public License |
| Website | datera.io |
- See Target for a complete overview of all fabric modules.
The VMware vStorage APIs for Array Integration (VAAI) enable seamless offload of locking and block operations onto the storage array.
Overview
VMware introduced the vStorage APIs for Array Integration (VAAI) in vSphere 4.1 with a plugin, and provided native VAAI support with vSphere 5. VAAI significantly enhances the integration of storage and servers by enabling seamless offload of locking and block operations onto the storage array. The LinuxIO provides native VAAI support for vSphere 5.
Features
LIO supports the following VAAI functions:
| Name | Primitive | Description | Block | NFS | LIO |
|---|---|---|---|---|---|
| Atomic Test & Set (ATS) | Hardware Assisted Locking (`COMPARE_AND_WRITE`) | Enables granular locking of block storage devices, accelerating performance. | Yes | N/A | Yes |
| Zero | Block Zeroing (`WRITE_SAME`) | Communication mechanism for thin provisioning arrays. Used when creating VMDKs. | Yes | N/A | Yes |
| Clone | Full Copy, XCopy (`EXTENDED_COPY`) | Commands the array to duplicate data in a LUN. Used for Clone and VMotion operations. | Yes | N/A | Yes |
| Delete | Space Reclamation (`UNMAP`) | Allows thin provisioned arrays to clear unused VMFS space. | Yes | Yes | Yes (disabled by default) |
The presence of VAAI and its features can be verified from the VMware ESX 5 CLI as follows:
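A minimal sketch, assuming an ESXi 5.x host with the standard esxcli namespaces, is to query the per-device VAAI status (ATS, Clone, Zero, Delete):

```
# Show hardware acceleration (VAAI) support per storage device
esxcli storage core device vaai status get
```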
Primitives
ATS
ATS is arguably one of the most valuable storage technologies to come out of VMware. It enables locking of block storage devices at much finer granularity than with the preceding T10 Persistent Reservations, which can only operate on full LUNs. Hence, ATS allows more concurrency and thus significantly higher performance for shared LUNs.
For instance, Hewlett-Packard reported that it can support six times more VMs per LUN with VAAI than without it.
ATS uses the T10 `COMPARE_AND_WRITE` command to allow comparing and writing SCSI blocks in one atomic operation.
NFS doesn’t need ATS, as locking is a non-issue and VM files aren’t shared the same way LUNs are.
Feature presence can be verified from the VMware ESX 5 CLI:
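One way to check, as a sketch under the assumption that ATS is governed by the VMFS3/HardwareAcceleratedLocking advanced setting:

```
# An Int Value of 1 means ATS (hardware assisted locking) is enabled on this host
esxcli system settings advanced list --option /VMFS3/HardwareAcceleratedLocking
```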
VMware actually uses ATS depending on the underlying filesystem type and history:
| On VAAI Hardware | New VMFS-5 | Upgraded VMFS-5 | VMFS-3 |
|---|---|---|---|
| Single-extent datastore reservations | ATS only[ATS 1] | ATS but fall back to SCSI-2 reservations | ATS but fall back to SCSI-2 reservations |
| Multi-extent datastore when locks on non-head | Only allow spanning on ATS hardware[ATS 2] | ATS except when locks on non-head | ATS except when locks on non-head |

- [ATS 1] If a new VMFS-5 is created on a non-ATS storage device, SCSI-2 reservations will be used.
- [ATS 2] When creating a multi-extent datastore where ATS is used, the vCenter Server will filter out non-ATS devices, so that only devices that support the ATS primitive can be used.
Zero
Thin provisioning is difficult to get right because storage arrays don't know what’s going on in the hosts. VAAI includes a generic interface for communicating free space, thus allowing large ranges of blocks to be zeroed out at once.
Zero uses the T10 `WRITE_SAME` command, and defaults to a 1 MB block size. Zeroing only works for capacity inside a VMDK. vSphere 5 can use `WRITE_SAME` in conjunction with the T10 `UNMAP` command.
Feature presence can be verified from the VMware ESX 5 CLI:
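A sketch, assuming the Zero primitive is controlled by the DataMover/HardwareAcceleratedInit advanced setting:

```
# An Int Value of 1 means Block Zeroing (WRITE_SAME) offload is enabled
esxcli system settings advanced list --option /DataMover/HardwareAcceleratedInit
```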
To disable Zero from the ESX 5 CLI:
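For example, by setting the same advanced option to 0:

```
# Disable Block Zeroing offload on this host
esxcli system settings advanced set --int-value 0 --option /DataMover/HardwareAcceleratedInit
```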
This change takes immediate effect, without requiring a 'Rescan All' from VMware.
Clone
This is the signature VAAI command. Instead of reading each block of data from the array and then writing it back, the ESX hypervisor can command the array to duplicate a range of data on its behalf. If Clone is supported and enabled, VMware operations like VM cloning and VM vMotion can become very fast. Speed-ups of a factor of ten or more are achievable, particularly on fast flash-based backstores over slow network links, such as 1 GbE.
Clone uses the T10 `EXTENDED_COPY` command, and defaults to a 4 MB block size.
Feature presence can be verified from the VMware ESX 5 CLI:
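A sketch, assuming the Clone primitive is controlled by the DataMover/HardwareAcceleratedMove advanced setting:

```
# An Int Value of 1 means Full Copy (EXTENDED_COPY) offload is enabled
esxcli system settings advanced list --option /DataMover/HardwareAcceleratedMove
```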
To disable Clone from the ESX 5 CLI:
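For example:

```
# Disable Full Copy offload on this host
esxcli system settings advanced set --int-value 0 --option /DataMover/HardwareAcceleratedMove
```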
This change takes immediate effect, without requiring a 'Rescan All' from VMware.
Delete
VMFS operations like cloning and vMotion didn’t include any hints to the storage array to clear unused VMFS space. Hence, some of the biggest storage operations couldn't be accelerated or 'thinned out'.
Delete uses the T10 `UNMAP` command to allow thin-capable arrays to offload clearing unused VMFS space. However, vCenter 5 doesn't correctly handle waiting for the storage array to return the `UNMAP` command status, so Delete is disabled by default in vSphere 5.
Feature presence can be verified from the VMware ESX 5 CLI (the default value is '0'):
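A sketch, assuming the Delete primitive is controlled by the VMFS3/EnableBlockDelete advanced setting:

```
# An Int Value of 0 (the default) means Delete (UNMAP) offload is disabled
esxcli system settings advanced list --option /VMFS3/EnableBlockDelete
```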
To enable Delete from the ESX 5 CLI:[1]
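For example:

```
# Enable Delete (UNMAP) offload on this host
esxcli system settings advanced set --int-value 1 --option /VMFS3/EnableBlockDelete
```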
Many SATA SSDs have issues handling `UNMAP` properly, so it is disabled by default in LIO. To enable `UNMAP`, start the targetcli shell, enter the context of the respective backstore device, and set the emulate_tpu attribute:
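A minimal targetcli sketch, assuming an iblock backstore named block1 (substitute the actual backstore type and name configured on the target):

```
# targetcli
/> cd /backstores/iblock/block1
/backstores/iblock/block1> set attribute emulate_tpu=1
```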
Reboot the ESX host or re-login into the backstore in order to get Delete as recognized, then verify its presence from the VMware ESX 5 CLI:
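For example, by re-running the per-device status query and checking that Delete is now reported for the LIO LUN:

```
# Re-check per-device VAAI status; the Delete status for the LIO LUN should now show as supported
esxcli storage core device vaai status get
```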
Performance
Performance improvements offered by VAAI can be grouped into three categories:
- Reduced time to complete VM cloning and Block Zeroing operations.
- Reduced use of server compute and storage network resources.
- Improved scalability of VMFS datastores in terms of the number of VMs per datastore and the number of ESX servers attached to a datastore.
The actual improvement seen in any given environment depends on a number of factors, discussed in the following section. In some environments, improvement may be small.
Cloning, migrating and zeroing VMs
The biggest factor for Full Copy and Block Zeroing operations is whether the bottleneck sits on the front end or the back end of the storage controller. If the throughput of the storage network is lower than the backstore can handle, offloading the bulk work of reading and writing virtual disks for cloning and migration, and of writing zeroes for virtual disk initialization, can help immensely.
One example where substantial improvement is likely is when the ESX servers use 1 GbE iSCSI to connect to an LIO storage system with flash memory. The front end at 1 Gbps doesn't support enough throughput to saturate the back end. When cloning or zeroing is offloaded, however, only small commands with small payload go across the front, while the actual I/O is completed by the storage controller itself directly to disk.
VMFS datastore scalability
Documentation from various sources, including VMware professional services best practices, has traditionally recommended 20 to 30 VMs per VMFS datastore, and sometimes even fewer. Documents for VMware Lab Manager suggest limiting the number of ESX servers in a cluster to eight. These recommended limits are due in part to the effect of SCSI reservations on performance and reliability. Extensive use of some features, such as VMware snapshots and linked clones, can trigger large numbers of VMFS metadata updates, which require locking. Before vSphere 4.1, reliable locks on smaller objects were obtained by briefly locking the entire LUN with a SCSI Persistent Reservation. Any other server trying to access the LUN during the reservation would fail, then wait and retry up to 80 times by default. This waiting and retrying added to perceived latency and reduced throughput in VMs. In extreme cases, if the other server exceeded the number of retries, errors were logged in the VMkernel logs and I/Os could be returned as failures to the VM.
When all ESX servers sharing a datastore support VAAI, ATS can eliminate SCSI Persistent Reservations, at least reservations due to obtaining smaller locks. The result is that datastores can be scaled to more VMs and attached servers than previously.
Datera has tested up to 128 VMs in a single VMFS datastore on LIO. The number of VMs was limited in testing to 128 because the maximum addressable LUN size in ESX is 2 TB, which means that each VM can occupy a maximum of 16 GB, including virtual disk, virtual swap, and any other files. Virtual disks much smaller than this generally do not allow enough space to be practical for an OS and any application.
Load was generated and measured on the VMs by using iometer. For some tests, all VMs had load. In others, such as when sets of VMs were started, stopped, or suspended, load was placed only on VMs that stayed running.
Tests such as starting, stopping, and suspending numbers of VMs were run with iometer workloads running on other VMs that weren't being started, stopped, or suspended. Additional tests were run with all VMs running iometer, and VMware snapshots were created and deleted as quickly as possible on all or some large subset of the VMs.
The results of these tests demonstrated that performance impact measured before or without VAAI was either eliminated or substantially reduced when using VAAI, to the point that datastores could reliably be scaled to 128 VMs in a single LUN.
Statistics
The VMware esxtop command in ESX 5 has two new sets of counters for VAAI operations available under the disk device view. Both sets of counters include the three VAAI key primitives. To view VAAI statistics using esxtop, follow these steps from the ESX 5 CLI:
- Press 'u' to change to the disk device stats view.
- Press 'f' to select fields, or 'o' to change their order. Note: This selects sets of counters, not individual counters.
- Press 'o' to select VAAI Stats and/or 'p' to select VAAI Latency Stats.
- Optionally, deselect Queue Stats, I/O Stats, and Overall Latency Stats by pressing 'f', 'g', and 'i' respectively in order to simplify the display.
- To see the whole LUN field, widen it by pressing 'L' (capital) then entering a number ('36' is wide enough to see a full NAA ID of a LUN).
The output of esxtop looks similar to the following:
The VAAI counters in esxtop are:
| Counter Name | Description |
|---|---|
| DEVICE | Devices that support VAAI (LUNs on a supported storage system) are listed by their NAA ID. You can get the NAA ID for a datastore from the datastore properties in vCenter, the Storage Details - SAN view in Virtual Storage Console, or using the vmkfstools -P /vmfs/volumes/<datastore> command. LIO LUNs start with naa.6001405. Note: Devices or datastores other than LUNs on an external storage system, such as CD-ROM drives, internal disks (which may be physical disks or LUNs on internal RAID controllers), and NFS datastores, are listed but have all zeroes for VAAI counters. |
| CLONE_RD | Number of Full Copy reads from this LUN. |
| CLONE_WR | Number of Full Copy writes to this LUN. |
| CLONE_F | Number of failed Full Copy commands on this LUN. |
| MBC_RD/s | Effective throughput of Full Copy command reads from this LUN in megabytes per second. |
| MBC_WR/s | Effective throughput of Full Copy command writes to this LUN in megabytes per second. |
| ATS | Number of successful lock commands on this LUN. |
| ATSF | Number of failed lock commands on this LUN. |
| ZERO | Number of successful Block Zeroing commands on this LUN. |
| ZERO_F | Number of failed Block Zeroing commands on this LUN. |
| MBZERO/s | Effective throughput of Block Zeroing commands on this LUN in megabytes per second. |
Counters that count operations do not return to zero unless the server is rebooted. Throughput counters are zero when no commands of the corresponding primitive are in progress.
Clones between VMFS datastores and Storage VMotion operations that use VAAI increment clone read for one LUN and clone write for another LUN. In any case, the total for clone read and clone write columns should be equal.
See also
- LinuxIO, targetcli
- FCoE, Fibre Channel, iSCSI, iSER, SRP, vHost
- Persistent Reservations, ALUA
References
- [1] VMware (2011-11-10). 'Disable Space Reclamation'. ESXi and vCenter Server 5 Documentation. Palo Alto: vmware.com.
External links
- RTSlib Reference Guide [HTML][PDF]
- RTS OS VAAI video. YouTube.
- Stephen Foskett (2011-11-10). 'A Complete List of VMware VAAI Primitives'. blog.fosketts.net.
- Jason Langer (2011-12-06). 'VAAI, Is This Thing On??'. www.virtuallanger.com.
- Peter Learmonth (November 2010). Understanding and Using vStorage APIs for Array Integration and NetApp Storage. TR-3886. Sunnyvale: NetApp.
- Archie Hendryx (2011-09-04). 'vSphere 5, VAAI and the Death of the Traditional Storage Array'. Sys-Con.
- 'vStorage APIs for Array Integration FAQ'. Palo Alto: VMware. 2012-06-18.