Monday, 28 December 2020

Virtual HBA in Oracle VM Server for SPARC

For reference, here is the raw command history from my own vHBA test session on this host:

---------------
ldm list-hba -l primary
fcinfo hba-port | grep online
fcinfo hba-port
cfgadm -alv
format
ldm list
ldm list -l ldm-test-db
ldm list-hba
ldm list-io
ldm list
ldm list -l | more
ldm list-hba -l primary
fcinfo hba-port
ldm add-vsan /SYS/MB/PCIE1/HBA0/PORT0,0 pci1_port0 primary
ldm list-hba -l primary
ldm add-vhba p0_pci1_port0 pci1_port0 stdisk
ldm list
ldm add-vhba id=0 timeout=120 p0_pci1_port0 pci1_port0 ldm-test-db
ldm list-hba -d primary
ldm list
telnet loghost 5000
ldm add-vhba id=1 timeout=120 p1_pci1_port0 pci1_port0 ldm-test-db-2
ldm list-domain -l
ldm list
ldm list -l ldm-test-db
ldm add-vhba id=1 timeout=120 p1_pci1_port0 pci1_port0 ldm-test-db-2
ldm add-vsan /SYS/MB/PCIE1/HBA0/PORT0,0 pci1_port1 primary
ldm add-vhba p0_pci1_port1 pci1_port1 stdisk1
ldm add-vhba id=1 timeout=120 p1_pci1_port1 pci1_port1 ldm-test-db-2
ldm list
telnet loghost 5001
ldm list
ldm list
ldm list -l ldm-test-db
--------------------------

Oracle VM Server for SPARC 3.3 was released on October 26 during Oracle OpenWorld. This release added an important new feature, virtual HBA (vHBA), which adds flexibility and relieves prior limitations of virtual I/O without sacrificing performance.

Important note: a lot of functionality has been added to vHBA support in Solaris 11.3 updates, so it is important to be on a recent Solaris 11.3 SRU to get the best results.

Background

Oracle VM Server for SPARC supports both virtual I/O and physical I/O. Physical I/O, described in the Administration Guide chapter on I/O domains, gives domains direct ownership of and access to I/O devices via PCIe buses, PCIe endpoint devices, or PCIe SR-IOV virtual functions. This is ideal when native performance or full access to device features is needed, but it comes with restrictions: it is limited by the number of physical devices that can be assigned to domains, and it is incompatible with live migration.

Virtual I/O is more commonly used, as it provides more operational flexibility for virtual networks and virtual disks. Virtual I/O supports live migration and can provide near-native performance when correctly configured. However, virtual I/O also has restrictions. It supports network and disk devices, but not tape or other device types. Virtual disks have good performance, but are limited to active-passive multipathing using mpgroups, which cannot be used in conjunction with SCSI reservation. Ideally, a guest domain would be a "full participant" in the SAN infrastructure used by enterprise customers.

Introducing vHBA

Virtual HBA (vHBA) is a new capability of Oracle VM Server for SPARC that lets guest domains have virtual SCSI HBAs. In the vHBA model, a physical host bus adapter is mapped onto a virtual SAN (vsan) in the service domain, and a virtual HBA in the guest domain is associated with that vsan. The SCSA (Sun Common SCSI Architecture) protocol is used to communicate between the physical HBA, the virtual SAN, and the virtual HBA. The physical LUNs addressed by the physical HBA are visible within the guest as virtual LUNs, and are managed within the guest domain using the regular Solaris device drivers, just as in a non-virtual environment.
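Before getting into the details, it helps to see how little the provisioning model actually involves: a vsan that wraps a physical HBA port in the service domain, and a vhba in the guest that points at that vsan. A minimal sketch of the pattern (the port alias and the vsan, vhba, and guest names here are placeholders, not taken from a real configuration):

primary# ldm list-hba primary                            # discover the physical HBA port aliases
primary# ldm add-vsan <hba-port-alias> my-vsan primary   # wrap one port in a virtual SAN in the service domain
primary# ldm add-vhba my-vhba my-vsan my-guest           # give the guest a virtual HBA bound to that vsan

The walkthrough later in this post runs exactly this sequence against a real HBA port.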
vHBA provides multiple benefits:

Device functionality - Solaris in the guest domain uses its native device drivers, with full functionality for SCSI reservation and for MPxIO path management.

Device generality - Any device Solaris supports on a physical HBA is supported on the virtual HBA. For example, the guest domain will be able to use tape devices for backup.

Command scalability - Instead of requiring two commands per virtual disk (an ldm add-vdsdev and an ldm add-vdisk), all of the devices on the physical HBA are presented to the guest by the same pair of commands. This reduces the effort needed by the domain administrator.

Logical Domain Channel scalability - Normally, a virtual device consumes an LDC in the guest and in the service domain, and since there are limits on the number of LDCs per system and per domain, this can constrain the number of devices that domains can have. With vHBA, the entire HBA is represented by a single LDC in the guest and in the service domain, regardless of the number of LUNs.

Implementing vHBA

Let's show an example of vHBA. It requires Oracle VM Server for SPARC 3.3, which is delivered with Oracle Solaris 11.3 in the control domain. Guest domains using vHBA must run Solaris 11, as Solaris 10 does not have the SCSA interface described above. There are no special hardware requirements other than having a physical HBA with LUNs, and it runs on any supported SPARC server - in this example, a SPARC T2.

First, let's display the physical HBAs that are available on the server:

primary# ldm ls-hba primary        # alias is ldm list-hba
NAME                               VSAN
----                               ----
MB/SASHBA/HBA0
MB/RISER2/PCIE2/HBA0/PORT0,0
MB/RISER2/PCIE2/HBA0,1/PORT0,0

primary# ldm ls-hba -p primary
HBA
|alias=MB/SASHBA/HBA0|dev=/pci@0/pci@0/pci@2/scsi@0|disks=2
|alias=MB/RISER2/PCIE2/HBA0/PORT0,0|dev=/pci@0/pci@0/pci@9/SUNW,emlxs@0/fp@0,0|disks=4
|alias=MB/RISER2/PCIE2/HBA0,1/PORT0,0|dev=/pci@0/pci@0/pci@9/SUNW,emlxs@0,1/fp@0,0|disks=4

primary# ldm ls-hba -t primary
NAME                               VSAN
----                               ----
MB/SASHBA/HBA0
    init-port 50800200005ab000
    Transport Protocol SAS
MB/RISER2/PCIE2/HBA0/PORT0,0
    init-port 10000000c9b09b3c
    Transport Protocol FABRIC
MB/RISER2/PCIE2/HBA0,1/PORT0,0
    init-port 10000000c9b09b3d
    Transport Protocol FABRIC

We have a physical adapter, and can even see the LUNs under it. Let's show some output formats for details:

# ldm ls-hba -d primary
NAME                               VSAN
----                               ----
MB/SASHBA/HBA0
    c3t0d0s0
    c3t1d0s0
MB/RISER2/PCIE2/HBA0/PORT0,0
    c4t21000024FF2D4C83d2s0
    c4t21000024FF2D4C83d0s0
    c4t21000024FF2D4C82d2s0
    c4t21000024FF2D4C82d0s0
MB/RISER2/PCIE2/HBA0,1/PORT0,0
    c5t21000024FF2D4C83d2s0
    c5t21000024FF2D4C83d0s0
    c5t21000024FF2D4C82d2s0
    c5t21000024FF2D4C82d0s0

# ldm ls-hba -l primary
NAME                               VSAN
----                               ----
MB/SASHBA/HBA0
    [/pci@0/pci@0/pci@2/scsi@0]
MB/RISER2/PCIE2/HBA0/PORT0,0
    [/pci@0/pci@0/pci@9/SUNW,emlxs@0/fp@0,0]
MB/RISER2/PCIE2/HBA0,1/PORT0,0
    [/pci@0/pci@0/pci@9/SUNW,emlxs@0,1/fp@0,0]

Now that I've identified the physical resource, I'll create the vsan against it, and the vHBA for my guest domain:

primary# ldm add-vsan MB/RISER2/PCIE2/HBA0,1/PORT0,0 jeff-vsan primary
MB/RISER2/PCIE2/HBA0,1/PORT0,0 resolved to device: /pci@0/pci@0/pci@9/SUNW,emlxs@0,1/fp@0,0
primary# ldm add-vhba jeff-vhba jeff-vsan ldom3

That was really all there was to it. The guest now has the vHBA and sees its LUNs; I'll show that in a bit.
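As the session history at the top of this post shows, ldm add-vhba also accepts id= and timeout= options. My reading, by analogy with other ldm virtual-device options, is that id= pins the device instance number the guest sees and timeout= limits, in seconds, how long the guest waits for the virtual SAN connection. For example, from that earlier session (different port, vsan, and domain names than this walkthrough):

primary# ldm add-vsan /SYS/MB/PCIE1/HBA0/PORT0,0 pci1_port0 primary
primary# ldm add-vhba id=0 timeout=120 p0_pci1_port0 pci1_port0 ldm-test-db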
First, I'll show the virtual services created in the control domain:

primary# ldm list-services
... snip ...
VSAN
    NAME         LDOM         TYPE   DEVICE  IPORT
    jeff-vsan    primary      VSAN   vsan@0  [/pci@0/pci@0/pci@9/SUNW,emlxs@0,1/fp@0,0]
VHBA
    NAME         VSAN         DEVICE  TOUT  SERVER
    jeff-vhba    jeff-vsan    vhba@0  0     primary

We now have a vsan and a vHBA I've creatively named for myself. I can inspect the configuration using the same commands I used in the beginning:

primary# ldm ls-hba -d primary
NAME                               VSAN
----                               ----
MB/SASHBA/HBA0
    c3t0d0s0
    c3t1d0s0
MB/RISER2/PCIE2/HBA0/PORT0,0
    c4t21000024FF2D4C83d2s0
    c4t21000024FF2D4C83d0s0
    c4t21000024FF2D4C82d2s0
    c4t21000024FF2D4C82d0s0
MB/RISER2/PCIE2/HBA0,1/PORT0,0     jeff-vsan
    c5t21000024FF2D4C83d2s0        jeff-vsan
    c5t21000024FF2D4C83d0s0        jeff-vsan
    c5t21000024FF2D4C82d2s0        jeff-vsan
    c5t21000024FF2D4C82d0s0        jeff-vsan

primary# ldm ls-hba -l primary
NAME                               VSAN
----                               ----
MB/SASHBA/HBA0
    [/pci@0/pci@0/pci@2/scsi@0]
MB/RISER2/PCIE2/HBA0/PORT0,0
    [/pci@0/pci@0/pci@9/SUNW,emlxs@0/fp@0,0]
MB/RISER2/PCIE2/HBA0,1/PORT0,0     jeff-vsan
    [/pci@0/pci@0/pci@9/SUNW,emlxs@0,1/fp@0,0]

primary# ldm ls-hba -t primary
NAME                               VSAN
----                               ----
MB/SASHBA/HBA0
    init-port 50800200005ab000
    Transport Protocol SAS
MB/RISER2/PCIE2/HBA0/PORT0,0
    init-port 10000000c9b09b3c
    Transport Protocol FABRIC
MB/RISER2/PCIE2/HBA0,1/PORT0,0     jeff-vsan
    init-port 10000000c9b09b3d
    Transport Protocol FABRIC

primary# ldm ls -o san,hba
NAME
primary

VSAN
    NAME         TYPE   DEVICE  IPORT
    jeff-vsan    VSAN   vsan@0  [/pci@0/pci@0/pci@9/SUNW,emlxs@0,1/fp@0,0]

... snip ...

NAME
ldom3

VHBA
    NAME         VSAN         DEVICE  TOUT  SERVER
    jeff-vhba    jeff-vsan    vhba@0  0     primary

Not counting the commands to list the environment, it only took two commands in the control domain (ldm add-vsan and ldm add-vhba) to do the actual work.
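For completeness, backing the configuration out should be just as simple, in the reverse order: remove the virtual HBA from the guest, then remove the virtual SAN. I didn't exercise that here, so treat the following as a sketch based on the usual ldm add-/remove- subcommand pairing rather than tested output:

primary# ldm remove-vhba jeff-vhba ldom3    # detach the virtual HBA from the guest domain
primary# ldm remove-vsan jeff-vsan          # then delete the virtual SAN in the service domain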
vHBA devices viewed from the guest domain

The guest domain was running while the above commands were issued (showing that this works with guest domain dynamic reconfiguration). I thought it would be interesting to see what dmesg reported for the dynamic reconfiguration events, so I tailed it and saw the following interesting events:

root@ldom3:/# dmesg|tail
Jul 20 16:40:54 ldom3 scsi: [ID 583861 kern.info] sd10 at scsi_vhci0: unit-address g600144f0ede50676000055a815160019: f_tpgs
Jul 20 16:40:54 ldom3 genunix: [ID 936769 kern.info] sd10 is /scsi_vhci/disk@g600144f0ede50676000055a815160019
Jul 20 16:40:54 ldom3 cmlb: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g600144f0ede50676000055a815160019 (sd10):
Jul 20 16:40:54 ldom3   Corrupt label; wrong magic number
Jul 20 16:40:54 ldom3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f0ede50676000055a815160019 (sd10) online
Jul 20 16:40:54 ldom3 genunix: [ID 483743 kern.info] /scsi_vhci/disk@g600144f0ede50676000055a815160019 (sd10) multipath status: degraded: path 1 vhba1/disk@w21000024ff2d4c83,0 is online
Jul 20 16:40:54 ldom3 genunix: [ID 530209 kern.info] /scsi_vhci/disk@g600144f0ede50676000055a815160019 (sd10) multipath status: optimal: path 2 vhba1/disk@w21000024ff2d4c82,0 is online: Load balancing: round-robin
Jul 20 16:40:54 ldom3 genunix: [ID 408114 kern.info] /virtual-devices@100/channel-devices@200/scsi@0/iport@0/probe@w21000024ff2d4c83,2 (nulldriver1) online
Jul 20 16:40:55 ldom3 scsi: [ID 583861 kern.info] sd11 at scsi_vhci0: unit-address g600144f0ede50676000055a81bb2001a: f_tpgs
Jul 20 16:40:55 ldom3 genunix: [ID 936769 kern.info] sd11 is /scsi_vhci/disk@g600144f0ede50676000055a81bb2001a
Jul 20 16:40:55 ldom3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f0ede50676000055a81bb2001a (sd11) online
Jul 20 16:40:55 ldom3 genunix: [ID 483743 kern.info] /scsi_vhci/disk@g600144f0ede50676000055a81bb2001a (sd11) multipath status: degraded: path 3 vhba1/disk@w21000024ff2d4c83,2 is online
Jul 20 16:40:55 ldom3 genunix: [ID 530209 kern.info] /scsi_vhci/disk@g600144f0ede50676000055a81bb2001a (sd11) multipath status: optimal: path 4 vhba1/disk@w21000024ff2d4c82,2 is online: Load balancing: round-robin

Next, I used format to show the disk devices:

root@ldom3:/# format
Searching for disks...done

c0t600144F0EDE50676000055A815160019d0: configured with capacity of 23.93GB

AVAILABLE DISK SELECTIONS:
       0. c0t600144F0EDE50676000055A81BB2001Ad0
          /scsi_vhci/disk@g600144f0ede50676000055a81bb2001a
       1. c0t600144F0EDE50676000055A815160019d0
          /scsi_vhci/disk@g600144f0ede50676000055a815160019
       2. c1d0
          /virtual-devices@100/channel-devices@200/disk@0
Specify disk (enter its number): ^C

Note the long device names for the LUNs coming from a ZFS storage appliance - those are the ones I've just picked up. You can see that they use the native device driver, instead of the 'virtual-devices' driver used with a standard vdisk. I had even created a ZFS pool on one of the LUNs from another host accessing the physical SAN, so I can now import it:

root@ldom3:/# zpool import
  pool: aux36
    id: 10749927192920141180
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

        aux36                                    ONLINE
          c0t600144F0EDE50676000055A81BB2001Ad0  ONLINE
root@ldom3:/# zpool import aux36
root@ldom3:/# zpool status -v
  pool: aux36
 state: ONLINE
  scan: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        aux36                                    ONLINE       0     0     0
          c0t600144F0EDE50676000055A81BB2001Ad0  ONLINE       0     0     0

errors: No known data errors

At this point, if I had more devices I could use them too, and I could use all the device features Solaris supports on bare metal, like SCSI reservation or MPxIO.
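Since the dmesg output above shows scsi_vhci bringing both paths online for each LUN, the standard MPxIO tooling in the guest is the natural way to confirm path health. I didn't capture that output here, but the check would look along these lines (the logical-unit name is whatever mpathadm list lu reports, normally the c0t600144F0... device from format plus a slice suffix):

root@ldom3:/# mpathadm list lu        # one entry per multipathed LUN, with total and operational path counts
root@ldom3:/# mpathadm show lu /dev/rdsk/c0t600144F0EDE50676000055A81BB2001Ad0s2    # per-path details for one LUN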
Now, before you ask: I did some trivial performance tests with I/O workload tools, and vHBA seemed to perform as well as regular vdisks backed by LUNs.
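If you want to repeat even a rough check like that, a sequential read of the raw multipathed device is the simplest starting point before reaching for a real workload generator. The device below is the LUN from the format output above; the slice, block size, and count are arbitrary choices for illustration:

root@ldom3:/# dd if=/dev/rdsk/c0t600144F0EDE50676000055A81BB2001Ad0s0 of=/dev/null bs=1024k count=1024    # read 1 GB sequentially through the vHBA path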
