VCDX
VCDX, VMworld2009, RoR … and a Fractured Sternum
Jun 25th
Given my recent inactivity here and on Twitter, I feel the need to post some updates. So, where have I been and what have I been doing for the past few weeks? I’ll start with the most recent drama, which should give you a good laugh.
Fractured Sternum
A few months ago, my wife and I decided to buy our four-year-old son an outdoor playset. Our local Costco had the “Rainbow All-American Double Decker Playset” (pictured left) on sale, so we decided to buy it. That was back in March. Nearing the end of June, do you know where the playset is? Still boxed up in our garage. Some Dad I am.
Anyway, on Monday I decided I was sick and tired of all the clutter in our garage. Plus I was determined to get that playset built before the end of June. So I decided to rearrange the garage and get the playset boxes ready to be moved to the back yard.
Before I continue, let’s take a quick look at the weights and dimensions of these boxes …
Shipping Box Dimensions:
Box-1: 14 1/2” L x 11 1/4” W x 9” H: Approximately 40-lbs
Box-2: 22 1/2” L x 11 1/4” W x 9” H: Approximately 40-lbs
Box-3: 106” L x 24” W x 7” H: Approximately 210-lbs
Box-4: 106” L x 24” W x 7” H: Approximately 240-lbs
Box-5: 106” L x 24” W x 7” H: Approximately 195-lbs
Box-6: 106” L x 24” W x 7” H: Approximately 200-lbs
Slide: 115 1/2” L x 24 3/4” W x 16 3/4” H: Approximately 40-lbs
Well, as I was trying to be the big, bad, super dad and move Box #4 on my own … I had a little accident. That’s right, I have a fractured sternum because a playset box fell on my chest! How embarrassing. My friends affectionately now call me “crash” and my wife will no longer allow me to go into the garage without first showing her my helmet is securely fastened.
The good doctor from the ER gave me some Vicodin for the pain, which has been very helpful. But the side effect is that Vicodin makes me loopy, making it difficult to write.
VCDX
If you’re still reading then you’re probably questioning my level of intelligence (and I wouldn’t entirely blame you).
So I figured I would try to redeem myself with an update on my VCDX progress. Even though I have not yet posted all of my VCDX study notes, I actually took the VCDX admin exam a few weeks ago. And I just found out last week that I passed! Woooohoooo! Now I’ve got to start preparing for the next test, the VCDX Design Exam.
VMworld2009
Looks like I’ll be speaking at VMworld2009. So if you’re planning to attend this year’s event, be sure to say “hello.” Just look for my shaved noggin’ wandering the halls (or the guy with the shiny helmet, hehehe). Even better, as I’ll be the speaker of session DV3567, you can certainly find me at the following breakout session …
Session ID: DV3567
Title: Don’t throw that PC away! How to convert old PCs to Thin Clients using a thin Linux OS and VMware View Open Client.
Abstract:
More and more, companies are looking for additional ways to cut costs through virtualization. And it isn’t long before IT teams start exploring the possibility of a Virtual Desktop Infrastructure. But with desktops outnumbering servers by a factor of 10:1 (or more), converting users to a virtual desktop can be technically challenging and a significant upfront expense. A potential solution to this problem is to convert existing PCs into Thin Clients, extending the life of the hardware and easing the transition into a VDI. This session will show IT professionals various ways to convert older PCs into Thin Clients, capable of connecting to a VMware VM hosted on ESX via the VMware View Manager.
RoR (Ruby on Rails) and other next generation frameworks
I like to think of myself as an amateur developer (though, even amateur developers might have a thing or two to say about that!!).
I began programming in Perl about 10 years ago and since then I’ve dabbled in a number of different languages, like C++, Java and Ruby.
About two years ago I was introduced to Ruby on Rails and since then, most of my development work has been with RoR. Thus far, however, I haven’t posted anything on this blog about RoR. Why? Two reasons. First, the apps I’ve written to date have absolutely nothing to do with VMware. Second, like I said, I’m an amateur. Anyone looking for RoR help and advice can probably find better info on actual RoR blogs.
But I’ve decided that this is about to change. Most recently I’ve been working on a little RoR front end that will “drive” vSphere via SOAP. So I certainly find that work relevant here. Plus, if you think about it, Rails provides a level of abstraction and therefore, by definition, can be called a type of virtualization.
So if you’re an RoR developer (or any other kind of next generation framework, for that matter), please let me know. I’m interested in reading your blog, checking out your applications, sharing code, chatting about issues / concerns / challenges, etc. Just post a comment or email me at asweemer [at] gmail [dot] com.
Scripted ESX Installation: Reconfiguring COS Networking with Kickstart
May 27th
Frequently customers have specific NICs (like onboard NICs) that they’d like assigned to the COS, leaving the other NICs for VM traffic. This is difficult, however, when using our automated kickstart deployment scripts as there is no way to explicitly define the vmnic assigned to the COS. And to make matters worse, the VMkernel is not yet available to us during the %post section of the kickstart script, which makes COS networking configuration difficult! Recently I had a customer who was getting frustrated because …
- They would “rack and stack” a physical server and wire up their NICs accordingly (i.e. onboard NICs on the management VLAN, remaining NICs on production VLANs)
- PXE boot the server
- When kickstart completed, they’d lose connection to the COS.
This happens because during installation, ESX just assigns vmnic0 to the NIC with the lowest PCI address, and then assigns vmnic0 to the COS. And this is often not the NIC the admin wants used for their COS. Of course, they could go back after the fact and reconfigure the COS networking, but this kind of defeats the purpose of a completely hands-free, automated deployment.
Here is one possible solution to the problem. Below is a script I wrote to append to the %post section of a kickstart file. Obviously, you’ll need to make modifications for your environment.
## This script should be appended to the %post section of an ESX kickstart file.
%post
cat > /tmp/esx_post_install.sh << EOF
## If your kickstart file has vmportgroup=1, you *might* want to uncomment the
/usr/sbin/esxcfg-vswitch -A "VMkernel" vSwitch0
## You'll need to find which physical NICs you want assigned to your COS. From
/usr/sbin/esxcfg-nics -l | awk '\$0 ~ /search term/ {print \$1}' | xargs -n 1 /usr/sbin/esxcfg-vswitch vSwitch0 -L
## Note: if you want to test the line above from the command-line, you'll need
## Replace the x.x.x.x after -i with the IP address and after -n with the
## Replace the x.x.x.x after -i with the IP address and after -n with the subnet
## Replace x.x.x.x with the default gateway for the COS in both of the next two lines.
mv /etc/rc.d/rc.local.save /etc/rc.d/rc.local
chmod +x /tmp/esx_post_install.sh
cat >> /etc/rc.d/rc.local << EOF
As an example, in my environment I have a server with 4 NICs and by default, ESX assigns vmnic0, which is mapped to PCI 02:00.00, to the service console. However, what is actually physically wired to my management network is vmnic3, which is mapped to PCI 02:03.00. In the script above, I simply searched for the number 3 (i.e. replaced search term with 3) and now my scripted ESX installation works properly.
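To see what that awk filter is really doing, here is a minimal Python sketch that mimics it against a made-up listing. The sample output, its column layout (name first, PCI address second) and both function names are my assumptions for illustration, not captured from a real host.

```python
import re

# Hypothetical sample of `esxcfg-nics -l` output. The column layout
# (Name, PCI, Driver, ...) and every value below are assumptions.
SAMPLE_NICS = """\
Name    PCI      Driver Link Speed    Duplex Description
vmnic0  02:00.00 e1000  Up   1000Mbps Full   Example NIC
vmnic1  02:01.00 e1000  Up   1000Mbps Full   Example NIC
vmnic2  02:02.00 e1000  Up   1000Mbps Full   Example NIC
vmnic3  02:03.00 e1000  Up   1000Mbps Full   Example NIC
"""


def nics_matching(output, search_term):
    """Mimic awk '$0 ~ /search term/ {print $1}': first field of each matching line."""
    pattern = re.compile(search_term)
    return [line.split()[0] for line in output.splitlines()
            if line.strip() and pattern.search(line)]


def nic_for_pci(output, pci_address):
    """Stricter lookup: compare only the second (PCI) column, skipping the header."""
    for line in output.splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 2 and fields[1] == pci_address:
            return fields[0]
    return None


print(nics_matching(SAMPLE_NICS, "3"))
print(nic_for_pci(SAMPLE_NICS, "02:03.00"))
```

Because the awk pattern is tested against the whole line, a short search term can match more NICs than intended (in this sample, searching for 0 matches every NIC). Searching for the full PCI address is the safer habit.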
Below is the configuration of my server before I redeployed with kickstart. The line in red is the NIC I want assigned to the COS. The lines in black are what ESX assigns the COS by default.
BEFORE (without %post section)
PortGroup Name VLAN ID Used Ports Uplinks
Now, here is the same output after I redeployed the server with my modifications to the %post section of the kickstart file. The scripted deployment of ESX now properly assigns vmnic3 to my service console.
AFTER (with %post section)
[root@vesx7 root]# esxcfg-nics -l
[root@vesx7 root]# esxcfg-vswitch -l
PortGroup Name VLAN ID Used Ports Uplinks
I hope this was helpful. Let me know if you have any questions.
Well, I’d better sign off and start packing because I leave for Omaha, NE in a few hours.
VCDX Admin Exam Notes – Section 1.4
May 19th
I’m trying to get these out faster. But I’m finding it’s a pain to compile, tweak and reformat my notes so they make sense and look right on the blog. Here’s the next in the series.
Objective 1.4 – Implement and manage Storage VMotion.
Knowledge
Describe Storage VMotion operation
I like to think of Storage VMotion as the “inverse” of VMotion. Instead of moving (live) the front end (i.e. CPU, memory, and network) from one physical device to another, Storage VMotion moves (live) the back end (i.e. disk) from one physical device to another.
The following was taken directly from the VMware.com website and accurately describes how a Storage VMotion works.
- Before moving disk files, Storage VMotion creates a new virtual machine home directory for the virtual machine in the destination datastore.
- Next, a new instance of the virtual machine is created. Its configuration is kept in the new datastore.
- Storage VMotion then creates a child disk for each virtual machine disk that is being moved to capture a copy of write activity, while the parent disk is in read only mode.
- The original parent disk is copied to the new storage location.
- The child disk is re-parented to the newly copied parent disk in the new location.
- When the transfer to the new copy of the virtual machine is completed, the original instance is shut down. Then, the original virtual machine home is deleted from VMware vStorage VMFS at the source location.
Explain implementation process for Storage VMotion
For ESX 3.5, Storage VMotion is a command-line only implementation. There are third party GUI plugins to the VI client that I have used and work really well, but they are of course not supported by VMware. And for the purposes of this post, I would imagine the VCDX exam will stick to VMware only supported implementations.
To execute a Storage VMotion, you’ll need the RCLI (Remote Command Line Interface) from VMware.com. There is an RCLI for both Windows and Linux, and there’s an RCLI virtual appliance too. Take your pick, and then be sure to review the Remote Command-Line Interface Installation and Reference Guide for more info on RCLI. But specifically for Storage VMotion, the RCLI command comes in two flavors (from the guide):
To use the command in interactive mode, type svmotion --interactive. You are prompted for all the information necessary to complete the storage migration. When you invoke the command in interactive mode, all other parameters are ignored.
In noninteractive mode, the svmotion command uses the following syntax:
svmotion [standard Remote CLI options] --datacenter=<datacenter name>
--vm <VM config datastore path>:<new datastore>
[--disks <virtual disk datastore path>:<new datastore>,
<virtual disk datastore path>:<new datastore>]
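If you end up scripting a batch of migrations, it can help to assemble the noninteractive command from its parts. Below is a minimal Python sketch that only builds the argument list following the syntax quoted above; it does not run anything. The function name and the way connection options are passed in are my own assumptions.

```python
import shlex


def svmotion_args(datacenter, vm_config_path, new_datastore,
                  disks=None, connection_options=()):
    """Build an svmotion argument list in noninteractive form.

    disks is an optional list of (virtual disk datastore path, new datastore)
    pairs; connection_options holds any standard Remote CLI options.
    """
    args = ["svmotion", *connection_options,
            f"--datacenter={datacenter}",
            "--vm", f"{vm_config_path}:{new_datastore}"]
    if disks:
        # Individual disk placements are joined with commas, per the quoted syntax.
        args += ["--disks", ",".join(f"{path}:{ds}" for path, ds in disks)]
    return args


args = svmotion_args("cincylab", "[vol1] aaron-corp-xp/aaron-corp-xp.vmx", "vol2")
print(shlex.join(args))  # shell-quoted form, ready to paste
```

Keeping the arguments as a list (rather than one big string) avoids quoting trouble with the space inside the datastore path.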
Identify Storage VMotion use cases
There are a number of reasons that I can think of where you’d want to use Storage VMotion. Some of the more obvious reasons (at least to me, anyway) would be:
- Array maintenance
- Migrating to newer or different (i.e. FC, iSCSI, NAS) hardware
- Adding new storage
- To achieve optimal distribution of storage consumption across LUNs
- To resolve storage bottlenecks and other performance issues
I’m sure if I thought about it some more, I could come up with a few more. But I think this is a decent list.
Understand performance implications for Storage VMotion
There are a couple things that occur during a Storage VMotion that could affect performance, and therefore need to be considered when planning to move storage.
- During the Storage VMotion, the VM actually does a “Self VMotion,” meaning the VM is VMotioned to the same host it’s already running on (review step #2 in the graphic above). And therefore during this time, there is temporarily twice the amount of memory consumed.
- During the move, extra disk space is required on the source volume while all disk writes are redirected to the snapshot disk.
- All disk I/O for the copy (i.e. read from the source volume, write to the snapshot disk and write to the destination volume) is going through the VMkernel of the host.
Skills and Abilities
Use Remote CLI to perform Storage VMotion operations
- Interactive mode
This is pretty easy. Just enter svmotion --interactive at the command line and then follow the prompts.
- Non-interactive mode
In my environment, the command to move all the virtual disks associated with aaron-corp-xp (the name of an actual VM I use) from vol1 to vol2, would look like this …
svmotion --url=https://10.10.8.60/sdk --username=aaron --password=<yeah right> --datacenter=cincylab --vm='[vol1] aaron-corp-xp/aaron-corp-xp.vmx:vol2'
VCDX Admin Exam Notes – Section 1.3
May 18th
Ugh. My brain hurts. I’ve spent the past few hours reviewing scripted ESX installations and working on a PowerShell script for a customer that will reorder vmnics after a scripted installation is complete (because I can’t find any other way to force their order during the install). It’s been a few months since I’ve done a scripted installation, so I definitely needed a refresher. Plus, according to the VMware Enterprise Administration Exam Blueprint v3.5, section 8.1 is all about automating ESX deployments. The good news is that section 8.1 is the last section of the blueprint, so I believe I’m almost done preparing for the VCDX Admin Exam, which I’m scheduled to take in a few days.
Anyway, going back to the beginning of the Blueprint, and continuing from where I left off, here is the next section of my study notes.
Objective 1.3 – Troubleshoot Virtual Infrastructure storage components.
Knowledge
Identify storage related events and log entries. Analyze storage events to determine related issues.
All storage related events will be recorded in the /var/log/vmkernel log file. Most of the messages in this log file are fairly cryptic and can be difficult to interpret. Furthermore, this log file contains all messages from the vmkernel, not just storage related messages, so you’ll have to filter through it. An easy way to do this is simply to search for SCSI. For example, the command cat /var/log/vmkernel | grep SCSI on one of my servers produces the following output (only showing the last 10 lines) …
[root@cincylab-esx3 root]# cat /var/log/vmkernel | grep SCSI | tail -10
May 18 12:57:47 cincylab-esx3 vmkernel: 2:16:01:55.748 cpu2:1069)iSCSI: login phase for session 0x8603f90 (rx 1071, tx 1070) timed out at 23051576, timeout was set for 23051576
May 18 12:57:47 cincylab-esx3 vmkernel: 2:16:01:55.748 cpu2:1071)iSCSI: session 0x8603f90 connect timed out at 23051576
May 18 12:57:47 cincylab-esx3 vmkernel: 2:16:01:55.748 cpu2:1071)<5>iSCSI: session 0x8603f90 iSCSI: session 0x8603f90 retrying all the portals again, since the portal list got exhausted
May 18 12:57:47 cincylab-esx3 vmkernel: 2:16:01:55.748 cpu2:1071)iSCSI: session 0x8603f90 to iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1 waiting 1 seconds before next login attempt
May 18 12:57:48 cincylab-esx3 vmkernel: 2:16:01:56.748 cpu2:1071)iSCSI: bus 0 target 0 trying to establish session 0x8603f90 to portal 0, address 10.10.8.200 port 3260 group 1
May 18 12:58:00 cincylab-esx3 vmkernel: 2:16:02:09.355 cpu2:1071)iSCSI: bus 0 target 0 established session 0x8603f90 #3, portal 0, address 10.10.8.200 port 3260 group 1
May 18 12:58:01 cincylab-esx3 vmkernel: VMWARE SCSI Id: Supported VPD pages for vmhba35:C0:T0:L0 : 0x0 0x80 0x83
May 18 12:58:01 cincylab-esx3 vmkernel: VMWARE SCSI Id: Device id info for vmhba35:C0:T0:L0: 0x1 0x1 0x0 0x18 0x42 0x55 0x46 0x46 0x41 0x4c 0x4f 0x0 0x0 0x0 0x0 0x0 0x1 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x2 0x0 0x0 0x0
May 18 12:58:01 cincylab-esx3 vmkernel: VMWARE SCSI Id: Id for vmhba35:C0:T0:L0 0x20 0x20 0x20 0x20 0x56 0x49 0x52 0x54 0x55 0x41
[root@cincylab-esx3 root]#
If you look closely, I clearly had some issues with my iSCSI appliance a few hours ago. I decided to make some configuration changes to the switch and then, all of a sudden, the ESX server lost connectivity to its storage. Weird!
Anyway, what does all this mean? There’s a really good VMworld Europe 2008 presentation (which you can get from www.vmworld.com) titled VI3 Advanced Log Analysis, which goes into detail about how to interpret VMware log files. From that presentation, I found this diagram which describes the components of a message in the vmkernel log file.
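As a rough companion, here is a minimal Python sketch that splits one of the lines from my log into its parts. The field names are my own reading of the layout (syslog timestamp, host, vmkernel uptime, then the CPU number and what I take to be the world ID before the closing parenthesis), so treat them as assumptions rather than official terminology.

```python
import re

# Layout assumed: "<syslog time> <host> vmkernel: <uptime> cpu<N>:<world>)<message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\svmkernel:\s"
    r"(?P<uptime>\d+:\d{2}:\d{2}:\d{2}\.\d{3})\s"
    r"cpu(?P<cpu>\d+):(?P<world>\d+)\)"
    r"(?P<message>.*)$"
)


def parse_vmkernel_line(line):
    """Return a dict of fields, or None when the line has no uptime/cpu prefix."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["cpu"] = int(fields["cpu"])
    fields["world"] = int(fields["world"])
    return fields


sample = ("May 18 12:57:47 cincylab-esx3 vmkernel: 2:16:01:55.748 "
          "cpu2:1071)iSCSI: session 0x8603f90 connect timed out at 23051576")
print(parse_vmkernel_line(sample))
```

Lines such as the "VMWARE SCSI Id" entries carry no uptime or CPU prefix, so this sketch simply returns None for them.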
Skills and Abilities
Verify storage configuration and troubleshoot storage connection issues using CLI, VI Client and logs
- Rescan events
A rescan event can be initiated with the esxcfg-rescan command at the command line. The output should look like the following …
[root@cincylab-esx3 root]# esxcfg-rescan vmhba32
Rescanning vmhba32 …
On scsi1, removing: 0:0.
On scsi1, adding: 0:0.
Done.
[root@cincylab-esx3 root]# cat /var/log/vmkernel | grep SCSI | tail -3
May 18 19:13:09 cincylab-esx3 vmkernel: VMWARE SCSI Id: Supported VPD pages for vmhba32:C0:T0:L0 : 0x0 0x80 0x83
May 18 19:13:10 cincylab-esx3 vmkernel: VMWARE SCSI Id: Device id info for vmhba32:C0:T0:L0: 0x2 0x0 0x0 0x18 0x4c 0x69 0x6e 0x75 0x78 0x20 0x41 0x54 0x41 0x2d 0x53 0x43 0x53 0x49 0x20 0x73 0x69 0x6d 0x75 0x
May 18 19:13:10 cincylab-esx3 vmkernel: VMWARE SCSI Id: Id for vmhba32:C0:T0:L0 0x36 0x52 0x58 0x36 0x4a 0x39 0x39 0x58 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x47 0x42 0x30 0x31 0x36 0x
[root@cincylab-esx3 root]#
- Failover events
I don’t have redundant paths in my lab to simulate this. So, again from the VMworld presentation, VI3 Advanced Log Analysis, here is a screen shot from the slide that covers this topic.
There will obviously be a lot of different types of error and event messages in /var/log/vmkernel. And I’m certainly not going to try and list every possible combination here. I highly suggest you download the VMworld preso because it does a great job of explaining how to further decipher the log files (like defining SCSI error codes).
Well, that’s about it for this section. Back to PowerShell scripting for another hour or so.
VCDX Admin Exam Notes – Section 1.2
May 11th
Last week I was in San Francisco with most (maybe all) of the VMware field technical folks for a three day technical summit. One of the evenings we had an awards ceremony and dinner. And guess what? The first eight VCDX certifications ever to be awarded were announced.
Now, VMware is a pretty big company, so I didn’t recognize seven of the eight names. But I definitely recognized one of them. Well, that is, I should say I recognized his name. Having never officially met him face to face, I couldn’t pick him out of a crowd of two. You might know him as the rock star blogger from Yellow Bricks. Congratulations Duncan Epping! I believe he said he is VCDX number seven and the first VCDX in Europe. Very cool. And well deserved, for sure.
I’m a little behind Duncan. I’m scheduled to take the Admin Exam later this month, which is the first of two exams. Then I’ll have to present and defend a successful design and deployment before a jury … er, I mean, panel of my peers.
Anyway, here are my notes for section 1.2 of the VMware Enterprise Administration Exam Blueprint v3.5. (Section 1.1 can be found here). Everything in Blue is a direct cut and paste from the exam blueprint.
Objective 1.2 – Implement and manage complex data security and replication configurations.
Knowledge
Describe methods to secure access to virtual disks and related storage devices
- Distributed Lock Handling

In the graphic below, notice how each ESX server sees and has access to the same LUN? This is achieved via VMFS, a clustered file system which leverages distributed file locking to allow multiple hosts to access the same storage. When a Virtual Machine is powered on, VMFS places a lock on its files, ensuring no other ESX server can access them.
Identify tools and steps necessary to manage replicated VMFS volumes
- Resignaturing
First, there’s a really good article on VMFS resignaturing by Duncan (go figure). Also, Chad Sakac over at Virtual Geek has a great article too. I’m not going to reinvent the wheel, so make sure you read their posts. You’ll need to understand this. For the exam, you’ll certainly need to know the following …
The following is from the Fibre Channel SAN Configuration Guide:
EnableResignature=0, DisallowSnapshotLUN=1 (default)
In this state:
- You cannot bring snapshots or replicas of VMFS volumes by the array into the ESX Server host regardless of whether or not the ESX Server has access to the original LUN.
- LUNs formatted with VMFS must have the same ID for each ESX Server host.
EnableResignature=1, (DisallowSnapshotLUN is not relevant)
In this state, you can safely bring snapshots or replicas of VMFS volumes into the same servers as the original and they are automatically resignatured.
EnableResignature=0, DisallowSnapshotLUN=0
This is similar to ESX 2.x behavior. In this state, the ESX Server assumes that it sees only one replica or snapshot of a given LUN and never tries to resignature. This is ideal in a DR scenario where you are bringing a replica of a LUN to a new cluster of ESX Servers, possibly on another site that does not have access to the source LUN. In such a case, the ESX Server uses the replica as if it is the original.
Do not use this setting if you are bringing snapshots or replicas of a LUN into a server with access to the original LUN. This can have destructive results including:
- If you create snapshots of a VMFS volume one or more times and dynamically bring one or more of those snapshots into an ESX Server, only the first copy is usable. The usable copy is most likely the primary copy. After reboot, it is impossible to determine which volume (the source or one of the snapshots) is usable. This nondeterministic behavior is dangerous.
- If you create a snapshot of a spanned VMFS volume, an ESX Server host might reassemble the volume from fragments that belong to different snapshots. This can corrupt your file system.
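To help memorize the three states, here is a small Python sketch that encodes them as a lookup. It is just my summary of the guide's text above; the function name and the short labels it returns are made up for illustration.

```python
def snapshot_lun_behavior(enable_resignature, disallow_snapshot_lun):
    """Summarize how a host treats a snapshot/replica LUN for a pair of LVM settings."""
    if enable_resignature == 1:
        # DisallowSnapshotLUN is not relevant once resignaturing is enabled.
        return "resignature"        # copies are brought in and automatically resignatured
    if disallow_snapshot_lun == 1:
        return "reject"             # default: snapshots and replicas are not brought in
    return "use-as-original"        # ESX 2.x-like; only safe without access to the source LUN


for enable in (0, 1):
    for disallow in (0, 1):
        print(enable, disallow, snapshot_lun_behavior(enable, disallow))
```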
Skills and Abilities
Configure storage network segmentation
- FC Zoning
Zoning delivers access control in the SAN, restricting visibility to devices in the zone solely to other members of that zone. It is a common technique used to do things like group ESX servers into production/test/dev, increase security and decrease traffic, among other things.
- iSCSI/NFS VLAN
Storage segmentation for IP storage can be accomplished in one of two ways: VLANs or physical segmentation (i.e. separate layer 2 switches for storage).
Configure LUN masking
The Disk.MaskLUNs parameter should be used when you’re trying to mask specific LUNs from your ESX host. This is a useful option when you don’t want your ESX server to access a particular LUN, but are unwilling (or unable) to configure your FC switch.
To configure LUN masking in the VI Client go to Configuration –> Advanced Settings for the host you want to configure. You’ll find the Disk.MaskLUNs parameter under the section Disk. It looks like this in my VI Client.
Enter a value in the following format … <adapter>:<target>:<comma separated LUN range list>. Be sure to rescan when you’re done and verify the Mask has been properly applied.
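To make that format concrete, here is a minimal Python sketch that expands a single entry into the set of LUN numbers it covers. The example value vmhba1:0:1-3,7 is hypothetical, and the sketch only handles one adapter:target entry.

```python
def parse_mask_entry(entry):
    """Split '<adapter>:<target>:<comma separated LUN range list>' into its parts."""
    adapter, target, lun_list = entry.split(":", 2)
    luns = set()
    for part in lun_list.split(","):
        first, _, last = part.partition("-")
        # A bare number such as "7" is a range of one.
        luns.update(range(int(first), int(last or first) + 1))
    return adapter, int(target), luns


print(parse_mask_entry("vmhba1:0:1-3,7"))  # hypothetical value
```

Expanding the ranges like this is a handy sanity check before you rescan, since a typo in a range can mask far more LUNs than you meant to.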
Use esxcfg-advcfg
This one’s easy. Just use the man page (type “man esxcfg-advcfg” at the command prompt). It’ll tell you everything you need to know.
Set Resignaturing and Snapshot LUN options
So, following along with the man page above, here is a cut and paste from my server …
[asweemer@cincylab-esx3 config]$ su -
Password:
[root@cincylab-esx3 root]# esxcfg-advcfg -s 0 /LVM/EnableResignature
Value of EnableResignature is 0
[root@cincylab-esx3 root]# esxcfg-advcfg -s 1 /LVM/EnableResignature
Value of EnableResignature is 1
[root@cincylab-esx3 root]#
[root@cincylab-esx3 root]# esxcfg-advcfg -s 0 /LVM/DisallowSnapshotLun
Value of DisallowSnapshotLun is 0
[root@cincylab-esx3 root]# esxcfg-advcfg -s 1 /LVM/DisallowSnapshotLun
Value of DisallowSnapshotLun is 1
[root@cincylab-esx3 root]#
Manage RDMs in a replicated environment
RDMs can be created via the CLI with the following command …
vmkfstools -r /vmfs/devices/disks/vmhbaX:Y:Z:0 my-vm.vmdk
By default, the RDM will be created in Virtual Compatibility Mode. But should you need and/or prefer Physical Compatibility Mode, you can change this by editing the VMDK file and changing the createType value to vmfsPassthroughRawDeviceMap.
Use proc nodes to identify driver configuration and options
The proc filesystem is a pseudo filesystem, it’s not “real.” It consumes no storage space and is used to access process information from the kernel. You’ll find quite a bit of valuable data and configuration options in the many subdirectories of /proc/vmware/config. Here’s a quick example from my ESX server …
[asweemer@cincylab-esx3 LVM]$ pwd
/proc/vmware/config/LVM
[asweemer@cincylab-esx3 LVM]$ ls
DisallowSnapshotLun EnableResignature
[asweemer@cincylab-esx3 LVM]$ cat EnableResignature
EnableResignature (Enable Volume Resignaturing) [0-1: default = 0]: 0
[asweemer@cincylab-esx3 LVM]$
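If you ever want to collect these values from a script, the one-line format of a config node is easy to pick apart. Here is a minimal Python sketch based purely on the single line shown above; I am assuming other nodes follow the same "name (description) [min-max: default = n]: value" shape, which I have not verified for every option.

```python
import re

NODE_RE = re.compile(
    r"^(?P<name>\S+) \((?P<description>.*)\) "
    r"\[(?P<min>-?\d+)-(?P<max>-?\d+): default = (?P<default>-?\d+)\]: "
    r"(?P<value>-?\d+)$"
)


def parse_config_node(text):
    """Parse one /proc/vmware/config style line into a dict, or return None."""
    match = NODE_RE.match(text.strip())
    if match is None:
        return None
    node = match.groupdict()
    for key in ("min", "max", "default", "value"):
        node[key] = int(node[key])
    return node


line = "EnableResignature (Enable Volume Resignaturing) [0-1: default = 0]: 0"
print(parse_config_node(line))
```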
Use esxcfg-module
Just like esxcfg-advcfg, use the man page.
VCDX Admin Exam Notes — Section 1.1
Apr 27th
I finally got a chance to sit down and reformat some of my notes for the VCDX Admin Exam. Below are my notes for Section 1.1 of the VMware Enterprise Administration Exam Blueprint v3.5. Everything in Blue is a direct cut and paste from the exam blueprint.
Oh, and thanks to VirtualizationTeam (Blog) for the Disqus comment letting me know that Peter van den Bosch has a more recent version of his VMware Enterprise Administration Exam Study Guide 3.5.
Section 1 – Storage
Objective 1.1 – Create and Administer VMFS datastores using advanced techniques.
Knowledge
Describe how to identify iSCSI, Fibre channel, SATA and NFS configurations using CLI commands and log entries
Here are a few command line examples that I believe would work well …
1) esxcfg-mpath -l
This command produces the following output on my server:
[root@cincylab-esx3 root]# esxcfg-mpath -l
Disk vmhba0:0:0 /dev/sdb (152627MB) has 1 paths and policy of Fixed
Local 0:31.2 vmhba0:0:0 On active preferred
Disk vmhba32:0:0 /dev/sda (152627MB) has 1 paths and policy of Fixed
Local 0:31.2 vmhba32:0:0 On active preferred
Disk vmhba35:0:0 /dev/sdc (923172MB) has 1 paths and policy of Fixed
iScsi sw iqn.1998-01.com.vmware:cincylab-esx3-1d029e5f<->iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1 vmhba35:0:0 On active preferred
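For a quick way to tell local disks from iSCSI ones in that output, here is a minimal Python sketch that parses the listing from my server. The regular expression is based only on the lines shown above, and treating the first word of each path line as the transport type is my assumption.

```python
import re

MPATH_OUTPUT = """\
Disk vmhba0:0:0 /dev/sdb (152627MB) has 1 paths and policy of Fixed
 Local 0:31.2 vmhba0:0:0 On active preferred
Disk vmhba32:0:0 /dev/sda (152627MB) has 1 paths and policy of Fixed
 Local 0:31.2 vmhba32:0:0 On active preferred
Disk vmhba35:0:0 /dev/sdc (923172MB) has 1 paths and policy of Fixed
 iScsi sw iqn.1998-01.com.vmware:cincylab-esx3-1d029e5f<->iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1 vmhba35:0:0 On active preferred
"""

DISK_RE = re.compile(
    r"^Disk (?P<lun>\S+) (?P<device>\S+) \((?P<size_mb>\d+)MB\) "
    r"has (?P<paths>\d+) paths and policy of (?P<policy>.+)$"
)


def parse_mpath(output):
    """Group path lines under the 'Disk ...' line that precedes them."""
    disks = []
    for raw in output.splitlines():
        line = raw.strip()
        match = DISK_RE.match(line)
        if match:
            disk = match.groupdict()
            disk["size_mb"] = int(disk["size_mb"])
            disk["transports"] = []
            disks.append(disk)
        elif line and disks:
            # First word of a path line, e.g. "Local" or "iScsi".
            disks[-1]["transports"].append(line.split()[0])
    return disks


for disk in parse_mpath(MPATH_OUTPUT):
    print(disk["lun"], disk["device"], disk["transports"])
```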
2) esxcfg-info -s
The -s flag will narrow the scope of the output to just storage and disk related info. But even with the narrowed scope, this command produces way too much output to be displayed here. You’ll likely want to pipe the output into grep, or at a minimum to more or less, to get what you’re looking for.
3) cat /var/log/vmkernel | grep vmhba | tail -10
This will search the vmkernel log file and display the last 10 lines containing the text vmhba. If you want more (or fewer) lines, change the -10 to whatever suits your needs.
I found this one particularly useful when you’ve enabled the software iSCSI initiator at the command line, but don’t yet know what number has been assigned to the vmhba (e.g. vmhba35).
4) esxcfg-vmhbadevs -m and ls -lah /vmfs/volumes
The command esxcfg-vmhbadevs -m will show the mapping between vmhba numbers, device files and their UUIDs. If you’d like a quick and easy way to see what UUIDs are mapped to their human readable name, you can follow that up with a ls -lah /vmfs/volumes. The two commands back to back produce the following output on my server:
[root@cincylab-esx3 root]# esxcfg-vmhbadevs -m
vmhba35:0:0:1 /dev/sdc1 4986310d-6525e5e6-ebbd-00237d0681e7
vmhba0:0:0:3 /dev/sdb3 49e115fb-3e22358c-c10a-00237d0681e7
vmhba32:0:0:1 /dev/sda1 4985c53e-e7b1904f-5042-00237d0681e7
[root@cincylab-esx3 root]# ls -lah /vmfs/volumes/
total 10M
drwxr-xr-x 1 root root 512 Apr 20 23:07 .
drwxrwxrwt 1 root root 512 Apr 11 18:12 ..
drwxr-xr-t 1 root root 1.2K Feb 1 21:34 4985c53e-e7b1904f-5042-00237d0681e7
drwxr-xr-t 1 root root 3.7K Apr 14 14:49 4986310d-6525e5e6-ebbd-00237d0681e7
drwxr-xr-t 1 root root 980 Apr 11 18:13 49e115fb-3e22358c-c10a-00237d0681e7
lrwxr-xr-x 1 root root 35 Apr 20 23:07 cincylab-esx3:storage1 -> 4985c53e-e7b1904f-5042-00237d0681e7
lrwxr-xr-x 1 root root 35 Apr 20 23:07 cincylab-esx3:storage2 -> 49e115fb-3e22358c-c10a-00237d0681e7
lrwxr-xr-x 1 root root 35 Apr 20 23:07 vol1 -> 4986310d-6525e5e6-ebbd-00237d0681e7
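Matching UUIDs by eye gets old fast, so here is a minimal Python sketch that joins the two outputs above into one vmhba-to-name map. It assumes volume names contain no spaces, and the function names are mine.

```python
VMHBADEVS = """\
vmhba35:0:0:1 /dev/sdc1 4986310d-6525e5e6-ebbd-00237d0681e7
vmhba0:0:0:3 /dev/sdb3 49e115fb-3e22358c-c10a-00237d0681e7
vmhba32:0:0:1 /dev/sda1 4985c53e-e7b1904f-5042-00237d0681e7
"""

LS_VOLUMES = """\
lrwxr-xr-x 1 root root 35 Apr 20 23:07 cincylab-esx3:storage1 -> 4985c53e-e7b1904f-5042-00237d0681e7
lrwxr-xr-x 1 root root 35 Apr 20 23:07 cincylab-esx3:storage2 -> 49e115fb-3e22358c-c10a-00237d0681e7
lrwxr-xr-x 1 root root 35 Apr 20 23:07 vol1 -> 4986310d-6525e5e6-ebbd-00237d0681e7
"""


def labels_by_uuid(ls_output):
    """Map each UUID to the symlink name that points at it."""
    labels = {}
    for line in ls_output.splitlines():
        if " -> " in line:
            left, uuid = line.rsplit(" -> ", 1)
            labels[uuid.strip()] = left.split()[-1]
    return labels


def datastores_by_vmhba(vmhbadevs_output, ls_output):
    """Join the vmhba/device/UUID listing with the human readable names."""
    labels = labels_by_uuid(ls_output)
    result = {}
    for line in vmhbadevs_output.splitlines():
        fields = line.split()
        if len(fields) == 3:
            vmhba, device, uuid = fields
            result[vmhba] = (device, labels.get(uuid))
    return result


for vmhba, (device, name) in datastores_by_vmhba(VMHBADEVS, LS_VOLUMES).items():
    print(vmhba, device, name)
```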
5) vmkiscsi-ls
This one only applies to iSCSI storage, of course, and produces the following output on my server:
[root@cincylab-esx3 root]# vmkiscsi-ls
*************************************************************
SFNet iSCSI Driver Version … 3.6.3 (27-Jun-2005 )
*************************************************************
TARGET NAME : iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1
TARGET ALIAS :
HOST NO : 4
BUS NO : 0
TARGET ID : 0
TARGET ADDRESS : 10.10.8.200:3260
SESSION STATUS : ESTABLISHED AT Sun Apr 12 11:35:09 2009
NO. OF PORTALS : 1
PORTAL ADDRESS 1 : 10.10.8.200:3260,1
SESSION ID : ISID 00023d000001 TSIH 1400
*************************************************************
Describe the VMFS file system
There are many subsections here and before digging into each one, check out the following three links …
- Advanced VMFS Configuration and Troubleshooting.
- Really advanced, but really good: Understanding VMFS Volumes
- An oldie but goodie: VMware Virtual Machine File System: Technical Overview and Best Practices
Metadata
The simple definition of Metadata is “data about data.” All file systems handle metadata differently. VMFS uses metadata, stored in a special area of each volume, to manage all the files, directories (in VMFS-3 only), and attributes about the volume. VMFS is a clustered file system, meaning more than one ESX server can access the same file system at the same time. Therefore an update to the metadata requires locking of the LUN using a SCSI reservation.
Multi-access and locking
The following was taken from Advanced VMFS Configuration and Troubleshooting.
Distributed Lock handling by VMFS3
- Done in-band
- Hosts mount a VMFS3 volume
- Hosts’ ids posted to heartbeat region
- Heartbeat records are updated at regular intervals by hosts
- Host X locks a file, the lock is associated with its ID
- If host X dies or loses access to volume the file lock is stale
- Host Z attempts to lock the same file which is locked
- Host Z checks the heartbeat record of Host X (~5 times)
- If host X heartbeat record is not updated, Host Z will age the lock
- All other hosts yield to host Z and do not attempt to lock the file
- Lock is broken and Host Z acquires the lock
- Journal is replayed by Host Z
Extents
Extents are logical extensions of a file system. They are typically used to grow a volume beyond the VMFS size limitations. Essentially, an extent is the “joining” of two or more volumes into a single, logical VMFS volume.
Tree structure and files
The vmfs partition is mounted to the directory with the corresponding UUID found in /vmfs/volumes. The human readable name of the volume is merely a symbolic link to that directory. By default, all VMs are given a directory at the root of the partition. So, for example, a VM with the name of AaronSweemer would have the directory /vmfs/volumes/UUID/AaronSweemer. In this directory you will find all files specific and relevant to that VM. This is the default behavior as some (not all) of these files can be configured to reside elsewhere.
Here is a table of common files found on the VMFS file system.
| Extension | Usage |
| .dsk | VM disk file |
| .vmdk | VM disk file |
| .hlog | VMotion log file |
| .vswp | Virtual swap file |
| .vmss | VM suspend file |
| .vmtd | VM template disk file |
| .vmtx | VM Template configuration file |
| .REDO | Files used when VM is in REDO mode |
| .vmx | VM configuration file |
| .log | VM log file |
| .nvram | Nonvolatile RAM |
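The table above translates naturally into a lookup. Here is a small Python sketch that labels a file by its extension; the VMFS_FILE_TYPES and describe names are mine, and the matching is case-insensitive purely as a convenience.

```python
from pathlib import PurePosixPath

# Extension -> usage, copied from the table above (keys lower-cased).
VMFS_FILE_TYPES = {
    ".dsk": "VM disk file",
    ".vmdk": "VM disk file",
    ".hlog": "VMotion log file",
    ".vswp": "Virtual swap file",
    ".vmss": "VM suspend file",
    ".vmtd": "VM template disk file",
    ".vmtx": "VM Template configuration file",
    ".redo": "Files used when VM is in REDO mode",
    ".vmx": "VM configuration file",
    ".log": "VM log file",
    ".nvram": "Nonvolatile RAM",
}


def describe(filename):
    """Return the usage for a file name, or 'unknown' if it is not in the table."""
    suffix = PurePosixPath(filename).suffix.lower()
    return VMFS_FILE_TYPES.get(suffix, "unknown")


vmx_usage = describe("/vmfs/volumes/UUID/AaronSweemer/AaronSweemer.vmx")
redo_usage = describe("AaronSweemer.vmdk.REDO")
other_usage = describe("notes.txt")
```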
Journaling
From Wikipedia …
A journaling file system is a file system that logs changes to a journal (usually a circular log in a dedicated area) before committing them to the main file system. Such file systems are less likely to become corrupted in the event of power failure or system crash.
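Here is a minimal Python sketch of the journaling idea described above: log the change first, commit it second, and replay whatever is left in the journal after a crash. It is a generic illustration, not how VMFS lays out its journal, and the JournaledStore class and the crash_before_commit flag are hypothetical.

```python
class JournaledStore:
    """Toy key/value store that logs every change before applying it."""

    def __init__(self):
        self.journal = []  # changes logged but not yet committed
        self.data = {}     # the "main file system"

    def write(self, key, value, crash_before_commit=False):
        self.journal.append((key, value))  # 1. log the change to the journal
        if crash_before_commit:
            return  # simulate a power failure between logging and committing
        self._commit()  # 2. commit the change to the main store

    def _commit(self):
        for key, value in self.journal:
            self.data[key] = value
        self.journal.clear()

    def recover(self):
        # After a crash, replaying the journal brings the store up to date.
        self._commit()


store = JournaledStore()
store.write("a", 1)
store.write("b", 2, crash_before_commit=True)
missing_before_recovery = "b" not in store.data
store.recover()
```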
Explain the process used to align VMFS partitions
The following procedure was found in VMware Enterprise Administration Exam study guide 3.5 (page 5) and Advanced VMFS Configuration and Troubleshooting (slide 36).
Aligned partitions start at 128. If the Start value is 63 (the default), the partition is not aligned. If you choose not to use the VI Client and create partitions with vmkfstools, or if you want to align the default installation partition before use, take the following steps to use fdisk to align a partition manually from the ESX Server service console:
1. Enter fdisk /dev/sd<x> where <x> is the device suffix.
2. Determine if any VMware VMFS partitions already exist. VMware VMFS partitions are identified by a partition system ID of fb. Type d to delete these partitions.
Note: This destroys all data currently residing on the VMware VMFS partitions you delete. Ensure you back up this data first if you need it.
3. Type n to create a new partition.
4. Type p to create a primary partition.
5. Type 1 to create partition No. 1.
6. Select the defaults to use the complete disk.
7. Type t to set the partition’s system ID.
8. Type fb to set the partition system ID to fb (VMware VMFS volume).
9. Type x to go into expert mode.
10. Type b to adjust the starting block number.
11. Type 1 to choose partition 1.
12. Type 128 to set it to 128 (the array’s stripe element size).
13. Type w to write label and partition information to disk.
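The arithmetic behind the 63 versus 128 distinction is easy to check. This Python sketch assumes 512-byte sectors and a 64 KB stripe element (128 sectors x 512 bytes); both constants are my assumptions and your array may differ.

```python
SECTOR_BYTES = 512                # assumption: 512-byte sectors
STRIPE_ELEMENT_BYTES = 64 * 1024  # assumption: 64 KB stripe element (128 sectors)


def is_aligned(start_sector, boundary_bytes=STRIPE_ELEMENT_BYTES):
    """True if the partition's starting byte offset falls on the boundary."""
    return (start_sector * SECTOR_BYTES) % boundary_bytes == 0


default_start_aligned = is_aligned(63)   # 63 * 512 = 32256 bytes
fixed_start_aligned = is_aligned(128)    # 128 * 512 = 65536 bytes
```

A start of 63 puts the partition 32256 bytes into the disk, which is not a multiple of 64 KB, while a start of 128 lands exactly on the boundary.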
Explain the use cases for round-robin load balancing
Multipathing is typically used for failover, meaning that if one storage path becomes unavailable, the host can fail over to an alternate path. However, multipathing can also be used in a round-robin fashion to balance load across paths and achieve better utilization of the HBAs. There are a few configurable options that specify when, and to which path, an ESX server switches. From the Round-Robin Load Balancing technical note …
When to switch – Specify that the ESX Server host should attempt a path switch after a specified number of I/O blocks have been issued on a path or after a specified number of read or write commands have been issued on a path. If another path exists that meets the specified path policy for the target, the active path to the target is switched to the new path. The --custom-max-commands and --custom-max-blocks options specify when to switch.
Which target to use – Specify that the next path should be on the preferred target, the most recently used target, or any target. The --custom-target-policy option specifies which target to use.
Which HBA to use – Specify that the next path should be on the preferred HBA, the most recently used HBA, the HBA with the minimum outstanding I/O requests, or any HBA. The --custom-HBA-policy option specifies which HBA to use.
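Here is a rough Python sketch of the "when to switch" logic: move to the next path once either the command limit or the block limit is reached. The RoundRobinSelector class and the path names are hypothetical, and the target and HBA policies are not modelled at all; it simply rotates through the paths in order.

```python
class RoundRobinSelector:
    """Toy path selector that switches after N commands or M blocks."""

    def __init__(self, paths, max_commands=None, max_blocks=None):
        self.paths = list(paths)
        self.max_commands = max_commands
        self.max_blocks = max_blocks
        self.index = 0     # position of the active path
        self.commands = 0  # commands issued on the active path
        self.blocks = 0    # blocks issued on the active path

    def issue(self, blocks):
        """Issue one I/O of `blocks` blocks and return the path it used."""
        path = self.paths[self.index]
        self.commands += 1
        self.blocks += blocks
        hit_commands = (self.max_commands is not None
                        and self.commands >= self.max_commands)
        hit_blocks = (self.max_blocks is not None
                      and self.blocks >= self.max_blocks)
        if hit_commands or hit_blocks:
            # Whichever limit is reached first triggers the switch.
            self.index = (self.index + 1) % len(self.paths)
            self.commands = 0
            self.blocks = 0
        return path


# Hypothetical path names, for illustration only.
selector = RoundRobinSelector(["vmhba1:0:1", "vmhba2:0:1"],
                              max_commands=2, max_blocks=100)
small_io_paths = [selector.issue(8) for _ in range(4)]
large_io_path = selector.issue(200)  # exceeds the block limit in one command
after_large_io = selector.issue(8)
```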
Skills and Abilities
Perform advanced multi-pathing configuration
- Configure multi-pathing policy
- Configure round-robin behavior using command-line tools
- Manage active and inactive paths
Again, from the Round-Robin Load Balancing technical note …
Setting the Path Switching Policy
You can set the path-switching policy for failover and for load balancing by using the esxcfg-mpath command. You can set the path-switching policy on a per-LUN basis by using the esxcfg-mpath command’s --policy custom option. If you specify --policy custom, you must also specify one of the custom policy options. Because the path-switching policy is set on a per-LUN basis, you must always specify the LUN using the --lun option.
…
Notes
If you set both the custom-max-blocks and custom-max-commands options, the system attempts to switch paths as soon as one of the limits is reached.
If you set the target or the HBA policy to preferred, the system chooses the target or the HBA of the preferred path when possible. If a preferred policy is set on an active/passive SAN array, and the preferred target is not on the active SP (Storage Processor), the system does not select the preferred target but a target on the active SP.
Path switching is not performed if an outstanding SCSI reservation is on the target, or if a path failover is underway. Path switching is delayed until an I/O request is performed when no reservations or path failovers are pending.
Configure and use NPIV HBAs
<<I don’t have NPIV in my lab. Need to revisit this section>>
Manage VMFS file systems using command-line tools
The command line tool you’ll use for managing VMFS file systems is vmkfstools. It’s a very powerful tool and there are many options available, so I suggest you read the man page. The following examples (taken from the online documentation) are certainly not exhaustive, just a quick sample of what the tool can do.
Example for Creating a VMFS File System
vmkfstools -C vmfs3 -b 1m -S my_vmfs /vmfs/devices/disks/vmhba1:3:0:1
Example for Extending a VMFS-3 Volume
vmkfstools -Z /vmfs/devices/disks/vmhba0:1:2:1 /vmfs/devices/disks/vmhba1:3:0:1
Upgrading a VMFS-2 to VMFS-3
-T --tovmfs3 -x --upgradetype [zeroedthick|eagerzeroedthick|thin]
Example for Creating a Virtual Disk
vmkfstools -c 2048m /vmfs/volumes/myVMFS/rh6.2.vmdk
Example for Cloning a Virtual Disk
vmkfstools -i /vmfs/volumes/templates/gold-master.vmdk /vmfs/volumes/myVMFS/myOS.vmdk
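The device paths in these examples follow the vmhba<adapter>:<target>:<LUN>:<partition> convention, which is easy to misread at a glance. Here is a small Python sketch that splits one apart; the VmhbaDevice and parse_device names are my own, and it assumes the path always ends in that four-part form.

```python
from typing import NamedTuple


class VmhbaDevice(NamedTuple):
    adapter: str
    target: int
    lun: int
    partition: int


def parse_device(path):
    """Split a path ending in vmhbaA:T:L:P into its four components."""
    name = path.rsplit("/", 1)[-1]  # drop the /vmfs/devices/disks/ prefix
    adapter, target, lun, partition = name.split(":")
    return VmhbaDevice(adapter, int(target), int(lun), int(partition))


device = parse_device("/vmfs/devices/disks/vmhba1:3:0:1")
```

So the device used in the create example is adapter vmhba1, target 3, LUN 0, partition 1.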
Configure NFS datastores using command-line tools
Assuming your NAS is configured properly, this is pretty easy. The following command will mount an NFS datastore on an ESX host …
esxcfg-nas -a -o 10.10.8.25 -s /nfs/share NAS
In this example, -a adds a new NFS datastore, -o gives the IP address of the NFS server, -s gives the share exported by that server, and the final argument (NAS) is the label for the datastore. Upon successfully adding the datastore, the NFS mount will be found at /vmfs/volumes/NAS
The following command will remove the datastore …
esxcfg-nas -d -o 10.10.8.25 NAS
Configure iSCSI hardware and software initiators using command-line tools
I don’t know if I’ve seen an official, formal example of how to do this (though I’m sure it exists somewhere). So, here’s how I do it …
Step 1: Add the portgroup to vSwitch0
esxcfg-vswitch --add-pg=VMkernel vSwitch0
Step 2: Add the IP to the VMkernel portgroup
esxcfg-vmknic -a -i 10.10.8.202 -n 255.255.255.0 VMkernel
Step 4: Enable iSCSI
esxcfg-swiscsi -e
Step 5: Add the target
vmkiscsi-tool -D -a 10.10.8.200 vmhba34
Step 6: Rescan the HBA
esxcfg-rescan vmhba34
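If you repeat these steps on several hosts, it can help to generate the command list rather than retype it. Here is a dry-run Python sketch that only prints the commands shown above; it does not execute anything, the iscsi_setup_commands function and its parameter names are hypothetical, and the IP addresses and HBA name are just the ones from my lab.

```python
import shlex


def iscsi_setup_commands(vmk_ip, netmask, target_ip, hba,
                         portgroup="VMkernel", vswitch="vSwitch0"):
    """Return the service console commands from the steps above, in order."""
    return [
        ["esxcfg-vswitch", f"--add-pg={portgroup}", vswitch],             # add the portgroup
        ["esxcfg-vmknic", "-a", "-i", vmk_ip, "-n", netmask, portgroup],  # add the VMkernel IP
        ["esxcfg-swiscsi", "-e"],                                         # enable iSCSI
        ["vmkiscsi-tool", "-D", "-a", target_ip, hba],                    # add the target
        ["esxcfg-rescan", hba],                                           # rescan the HBA
    ]


commands = iscsi_setup_commands("10.10.8.202", "255.255.255.0",
                                "10.10.8.200", "vmhba34")
for argv in commands:
    print(shlex.join(argv))  # dry run: print only, nothing is executed
```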
That’s it for section 1.1 … time to go reformat my notes for section 1.2!
Are you studying for the VCDX Enterprise Admin Exam?
Apr 13th
Me too. Actually, I’ve been studying for a few weeks now. A while back, a friend of mine and fellow VMware SE (and SRM super freak), Michael White, turned me on to Evernote, an awesome tool for capturing my web research and lab notes (among other things). So, as I’ve been studying for the exam, I’ve been saving everything in Evernote, which I have installed on my iPhone and my XP desktop. (Now, if only they made a version for Linux, I’d be all set. *sigh* Yet another company with a great app that chooses to ignore the Linux community).
Anyway, now I just need to go back and do some simple formatting and VOILA!, I’ve got quite a few ready-made blog posts. Which is handy, given my recent commitment to a post a day.
So if you too are studying for the exam (or planning to) and looking for study material, feel free to check back here from time to time over the next few weeks. In addition, you might want to check out the following list, which contains everything I’m using to prepare.
- Hands down, the best “cheat sheet” for VI3 is the vmreference vi3 card, by Forbes Guthrie
- Duncan Epping over at Yellow Bricks has put together a great list of documents in his blog post VCDX Design Exam, how to prepare?
- Looks like Jon Owings is also studying for the exam and has a number of good posts on his blog, 2 VCP’s and a Truck
- The best, and most comprehensive study guide I’ve found is the VMware Enterprise Administration Exam study guide 3.5 by Peter van den Bosch.
- Oh, and don’t forget the official VMware Enterprise Administration Exam Blueprint v3.5
Also, send me an email if you’d like to be part of a weekly online VCDX study group I’m trying to put together.