ESX
UPDATE: Connections and Ports in ESX and ESXi
Sep 22nd
Mr. Dudley Smith has updated his PDF diagram with some minor corrections and additions. Get the latest, most up-to-date version here (click the graphic) …
He also updated “the brain”, which can be found at its new home http://webbrain.com/brainpage/brain/89EFA582-2C35-F6A2-9ED1-7AD4810266C2/. Make sure you update your bookmark accordingly.
Notes from VMware (aka, Mr. Michael White’s Newsletter)
Aug 26th
I wish I could take credit for the following work, but everything below is brought to you by Michael White. Michael is a co-worker of mine, an SE out of Canada whom we often refer to as the “SRM King.” He continually impresses me with his ability to crank out a weekly newsletter loaded with great content. Well, last night he happened to mention I could republish his work on my blog. Shoot, you don’t have to ask me twice!
Keep in mind as you’re reading, everything is a direct cut and paste. So anything written in the first person (e.g. “I have found …” or “I have decided”) would be referring to him, not me. I certainly don’t want to take credit for all his hard work!
If you have any questions or comments for Michael, feel free to leave a message for him.
Notes from VMware:
Cluster BP, FT and Issue, HA Issue, vDS Cheat Sheet, vDR Issue, YAPOTAV, vSphere Reference Card, View Design BP, SRM FAQ, and really a LOT more!
vSphere Cluster – ESX or ESXi or Mixed – suggestion / recommended best practice
We say that one day ESX will not exist, and that ESX and ESXi are the same, or almost the same. However, I have found in Host Profiles and FT there is very good reason not to mix ESX and ESXi in the same cluster. As soon as VMworld is over, I am redoing my mixed cluster as all ESXi. First, we all know of the problem I reported some time ago that the 8/6/09 patches for vSphere would break FT in a mixed ESX / ESXi cluster. There is no short term solution for that. The workaround is to have a cluster that is all ESX or all ESXi. Second, host profiles have a problem dealing with service console / management network ports. In theory you can manage that by using a reference server that is ESX and it will translate as necessary for ESXi. It doesn’t do so well at that. So using Host Profiles to do a push of a distributed virtual switch (only) ends up causing issues in ESXi consoles. I ended up doing the ESXi hosts manually. The real solution to the FT and Host Profiles issues is to have a cluster that is all ESX or all ESXi. And I am voting for ESXi in my lab. Make no mistake, if you don’t listen to this you will have some issues that are not pleasant.
Using ESXi and ESX and FT in same cluster? And FT broke with the 8/6/09 patches?
The only solution to this at this time is to separate your ESXi and ESX servers into their own clusters, or convert one type to the other, meaning all ESXi or all ESX, and your problem should go away. If you have not installed the 8/6/09 patches yet, and you are using FT, and you have ESXi and ESX in your cluster, then first change your cluster to be all ESXi or all ESX and then install the patches. Not installing the patches until we fix this is NOT an option. I have decided, as mentioned elsewhere in this newsletter, to redo my cluster as all ESXi. It won’t take much time. Some background on this issue can be found at http://communities.vmware.com/message/1335428#1335428.
Update on odd issue with HA not working if the vSphere ESX console was using certain IP addresses
I hope everyone has already heard that the vSphere bug described in http://kb.vmware.com/kb/1013013, which I think I mentioned in my last newsletter, now has a patch. This is the bug where, when a very specific IP address scheme is in use on management ports / service console with no other IP schemes in use and a host crashes, the VMs that should have been started by HA would in fact not be started at all. I have not tested the fix, as I am wrestling with SRM and trying to get ready for VMworld. To avoid this bug, only one of the addresses on your service console or management ports needs to be using something outside of the ‘special’ scheme.
vDS Implementation Cheat Sheet
I have worked with the distributed switches in the past in a lab sense, but recently, for my future SRM testing, I got them going for real in my lab. It was hard, confusing, and not intuitive at all, so I wrote a cheat sheet so you would not have to suffer. It is attached. I have used it a few times and am happy with it, so hopefully it will make things quicker and easier. Let me know if you need improvements or changes in it. http://www.virtualinsanity.com/wp-content/uploads/vDS-Implementation-Cheat-Sheet-b.pdf
Data Recovery Issue – which stops backups from happening
If you ever have an issue with writing to your destination when doing backups, you may see the restore point in red with a (Damaged) beside it. This can cause your backup to not work again. The events part of the Reports will show file access errors – 3902. The solution to this is not in the documentation for vDR but it is here. Expand the display of restore points to be bigger than the default 5. I used 25 when I had this issue. Now click all of the restore points that show as damaged. Then select the Mark for Deletion button in the top right of the screen. Now change to the Configuration \ Destinations screen and select the destination that is associated with your backup, and use the Integrity Check option near the top right of the screen. It will take a while. Once it is complete with no errors – check the Events view of Reports – you need to restart the appliance. Now your backups should work!
YAPOTAV – Yet another post on why to attend VMworld
Find this at http://blogs.vmware.com/vmtn/2009/08/yapowtav-yet-another-post-on-why-to-attend-vmworld.html.
New vSphere document reference card
Forbes Guthrie has done a wonderful job on a reference card for vSphere documentation stuff. It pulls stuff out of the documentation and highlights it as a result. Very handy and well done. Find it at http://www.vreference.com/public/vsphere4-notes1.0.pdf
View Design Best Practices training
Would you like to learn more about designing a View infrastructure? The more people you have that depend on it the more important training and experience becomes. Get some ideas on design at http://mylearn.vmware.com/descriptions/EDU_DATASHEET_ViewDesignBestPractices_V3.pdf
SRM FAQ online now thanks to Duncan at Yellow-Bricks
This is from information I have shared with Duncan but it is great information and I appreciate him sharing with everyone. Find it at http://www.yellow-bricks.com/srm-faq/. Duncan’s web site is one of the few you should read frequently. He is a PSO guy in Europe and is very smart, and knows what to communicate – does it real well and I appreciate it.
vSphere and VM snapshots and block size
This is something else that Duncan has done. There is a behavior difference between 3.5x and 4.0 that could catch someone. Find out more from Duncan at http://www.yellow-bricks.com/2009/08/24/vsphere-vm-snapshots-and-block-size/.
VMware View Cheat Sheet
I have had some help to update my VMware View Cheat Sheet and it has gone very well. Our next update of this will have a lot more but this is a good document to get you going with View. www.virtualinsanity.com/wp-content/uploads/VMware-View-Cheat-Sheet-a.pdf
Important patch for Celerra when using NFS with VMware
You can find more information about this at Virtual Geek, but it is important to understand that you need to upgrade your Celerra DART OS before you enable NFS datastores with VMware. Find out more at http://virtualgeek.typepad.com/virtual_geek/2009/08/important-patch-for-celerranfsvmware.html
Lab Manager 4 Upgrade issue
During an upgrade, the LM4 installer assumes all the default roles are present and unmodified. If the customer has removed or changed any of them, the upgrade installer will fail.
FT – Architecture and Performance
Do you know how to determine how many FT enabled VM’s your vSphere server can support? Do you know how to design your FT environment for the best performance? In fact, do you know what the performance overhead for FT is? All of this and more is answered in http://www.vmware.com/resources/techresources/10058.
How can I determine the exact build number for my ESX 4.0.x hosts?
You can find out the way to determine the build numbers for components of ESX 4.0 hosts at http://kb.vmware.com/kb/1012514
VMware Data Recovery Evaluator’s guide
This is a very nice document for someone who needs some guidance for testing VDR. It is a quick way to get started. http://www.vmware.com/resources/techresources/10055. My preso on VDR at VMworld is a combination of install / config / best practices and it will be very useful. Look for the session, or the preso after VMworld. It will fit with this eval guide nicely and is known as BC2142.
AppSpeed and Maintenance Mode
Currently AppSpeed has no way to listen to the ESX host it is running on, so when the host tries to enter Maintenance mode it will not be able to, since the AppSpeed sensor VM will not respond and will not VMotion off the host. This is a very high priority for us to fix. You will need to manually turn off this sensor before trying to enter maintenance mode.
Need some help searching the VMware KB? Find it at http://xtravirt.com/xd10112 – some interesting info.
NFS Storage Configuration Help
Do you need some help configuring NFS support for your ESX servers? There is some help at http://communities.vmware.com/docs/DOC-7900. This link has only a little info but it does include some troubleshooting info.
VUM and Cisco – conflict message
I got a conflict message from VUM when I tried to patch recently. It was a conflict with the Cisco Nexus stuff, which I do not have installed. It turns out that I could just ignore it, but it was a little bothersome. We are going to change that message in the near future to be more informative. That way if you know you don’t have Cisco (or whatever) installed you can just install with no issues. The issue is that we download all the metadata for ESX patches without any granularity, so the Cisco patches come down too. More info can be found at http://kb.vmware.com/kb/1013068.
Suggested VMware Employee Sessions at VMworld
This is a list that one of my co-workers put together. It might give you some ideas of what to look for.
- Michael White – BC2142 – Data Recovery intro and best practices
- Tiffany To – DV1790 – View TCO-ROI expert
- Mahesh Ramachandran – VM1724 – Capacity IQ Tech Preview
- Chris Rimer – EA2342 – Oracle sessions (especially around questions of Support and Licensing)
- Richard McDougall – TA3438 – vSphere Performance Guru
- Jacob Jensen – TA2103 – Virtual Networking guru (especially around the Cisco v1000)
- Andy Banta – TA3264 – iSCSI Best Practices (THE iSCSI Engineer/Expert at VMware!)
- Kaushik Banerjee – TA2942 – Performance Best Practices (This guy is a genius in performance and on the Perf. core team!)
- Paul Manning – VM3566 – Storage Best Practices (Many of you have been on calls with Paul for storage related topics!)
- Brian CS, Charu Charubal, and Rob Randell – VM2847, TA2544, DV2626 – Security Team extraordinaire
- Mostafa Khalil – TA2509 – Storage Best Practices (Mostafa is one of the first VCDX members!)
- Amir Sharif – TA3195, V13226 – ESXi PM – ESXi sessions
- Monica Sharma – VM2408 – ConfigControl Tech Preview
- Bill Call – VM2657 – LifeCycle Manager Uber-Guru!
- Dean Flaming and Travis Sales – DV2478 – ThinApp (These are some of the best sessions I have ever seen historically from these guys!)
- Gaetan Castelein – EA3605, EA3606 – Virtualizing Tier 1 applications
- Srinivas Krishnamurti – VM2280 – Managing VI from your mobile phone!
- Duncan Epping – TA2259 – Expert VI Design (Duncan runs the #1 Virtualization blog “Yellow-Bricks”)
- Dean Yao – BC3369 – FT Real World design
- Howie Xu – TA3521 – vNetwork Troubleshooting (Howie invented the vSwitch! – and wrote one of our TCP/IP stacks)
- Banjot Chanana – BC3425 – High Availability Futures
- Nicholas Jacques – PA4694 – AppSpeed PM
- Eric Horschmann – TA3880 – vSphere vs Hyper-V/XenServer
- Warren Ponder – DV2697 – View /VDI PM
- Mike DiPetrillo – TA3326 – Cloud (Mike is another uber-rock star and talks all things Cloud!)
- Rahul Ravulur – VM4380 – vCenter PM covering future of vCenter
- Naeem Malik – VM3609 – Capacity Planner expert
- Aaron Sweemer – DV3567 – How to convert old PCs to Thin Clients using a thin Linux OS and VMware View Open Client.
**** Reminders ******
- vSphere Upgrade Center – http://www.vmware.com/products/vsphere/upgrade-center/
- vSphere Support Center – http://www.vmware.com/support/vsphere.html
- vSphere-land documentation links – http://vsphere-land.com/vsphere-links/documentation-links-2.html
- Connections and Ports in ESX / ESXi – http://www.virtualinsanity.com/esx-connections-and-ports/
- vSphere-land download links – http://vsphere-land.com/vsphere-links/download-links-2.html
- vSphere and other VMware products compatibility matrix – http://www.vmware.com/resources/compatibility/docs/vSphere_Comp_Matrix.pdf
- VMware TCO / ROI Calculator (for VI 3 consolidation, virtual lab automation, and VDI savings) – http://www.vmware.com/products/vi/calculator.html
- VMware Cost per application calculator – http://www.vmware.com/technology/calculator/costperapp.html
- Why choose VMware? – http://www.vmware.com/technology/whyvmware/
- Server Virtualization Validation Program – you can check the status of servers or virtualization technology approved by Windows in the SVVP program – http://windowsservercatalog.com/svvp.aspx?svvppage=svvp.htm
- Support policy – http://support.microsoft.com/kb/957006
- VMware Visio shapes at http://www.veeam.com/vmware-esx-stencils.html and http://engineering.xtravirt.com/products/vi3-visio-action-pack.html
- VMware graphics for PowerPoint – http://viops.vmware.com/home/docs/DOC-1338
- License issues – call 877-486-9273 and select option 1 – or vi-hotline@vmware.com
- Interesting SRM links – http://tendam.wordpress.com/srm-links/
- Practical DR (before SRM) – http://www.vmware.com/files/pdf/practical_guide_bcdr_vmb.pdf
- VMware Security Center – blogs, RSS feeds, and more – http://www.vmware.com/security
- Enhanced VMotion Compatibility (EVC) processor support – http://kb.vmware.com/kb/1003212
- Patch Notice (via email) is available at http://www.vmware.com/support
- Reporting bugs to VMware – http://www.vmware.com/support/policies/defect.html
- Requesting a specific feature – http://www.vmware.com/support/policies/feature.html
- License File checker – http://www.vmware.com/checklicense/ – doesn’t work with SRM license codes yet!
- Dell and VMware alliance – includes white papers and VMotion compatibility info – http://www.dell.com/content/topics/global.aspx/alliances/en/vmware_resources?c=us&cs=555&l=en&s=biz&~section=004
- HP Certification matrix for VMware on Proliant – including VMotion compatibility info – http://h18004.www1.hp.com/products/servers/software/vmware/hpvmotion-compatibility-matrix.html
- Emma Explains Microsoft Licensing in Depth – A well done UK site that actually seems to make sense when it comes to MS license info – http://ladylicensing.spaces.live.com/default.aspx
Connections and Ports in ESX & ESXi
Aug 25th
I got an email from Dudley Smith (a VMware TAM and the author of Troubleshooting ESX and Connections & Ports in VI3.5) informing me that he had recently updated one of his documents. Wow, he sure did. Check this puppy out (click the graphic to download) …
Pretty slick, eh? Well it gets even better. He also created a version using The Brain in HTML … http://www.virtualinsanity.com/esx-connections-and-ports/. Nice! This is definitely a bookmark I’ll be keeping handy and I’d recommend you do the same.
Good work Dudley! Thanks for making it available for everyone. If you agree, be sure to leave a “Thank You” comment for Dudley Smith.
Scripted ESX Installation: Reconfiguring COS Networking with Kickstart
May 27th
Frequently customers have specific NICs (like onboard NICs) that they’d like assigned to the COS, leaving the other NICs for VM traffic. This is difficult, however, when using our automated kickstart deployment scripts as there is no way to explicitly define the vmnic assigned to the COS. And to make matters worse, the VMkernel is not yet available to us during the %post section of the kickstart script, which makes COS networking configuration difficult! Recently I had a customer who was getting frustrated because …
- They would “rack and stack” a physical server and wire up their NICs accordingly (i.e. onboard NICs on the management VLAN, remaining NICs on production VLANs)
- PXE boot the server
- When kickstart completed, they’d lose connection to the COS.
This happens because during installation, ESX just assigns vmnic0 to the lowest PCI number, and then assigns vmnic0 to the COS. And this is often not the NIC the admin wants used for their COS. Of course, they could go back after the fact and reconfigure the COS networking, but this kind of defeats the purpose of a completely hands-free, automated deployment.
Here is one possible solution to the problem. Below is a script I wrote to append to the %post section of a kickstart file. Obviously, you’ll need to make modifications for your environment.
## This script should be appended to the %post section of an ESX kickstart file.
%post
cat > /tmp/esx_post_install.sh << EOF
## If your kickstart file has vmportgroup=1, you *might* want to uncomment the
/usr/sbin/esxcfg-vswitch -A "VMkernel" vSwitch0
## You’ll need to find which physical NICs you want assigned to your COS. From
/usr/sbin/esxcfg-nics -l | awk '\$0 ~ /search term/ {print \$1}' | xargs -n 1 /usr/sbin/esxcfg-vswitch vSwitch0 -L
## Note: if you want to test the line above from the command-line, you’ll need
## Replace the x.x.x.x after -i with the IP address and after -n with the
## Replace the x.x.x.x after -i with the IP address and after -n with the subnet
## Replace x.x.x.x with the default gateway for the COS in both of the next two lines.
mv /etc/rc.d/rc.local.save /etc/rc.d/rc.local
chmod +x /tmp/esx_post_install.sh
cat >> /etc/rc.d/rc.local << EOF
As an example, in my environment I have a server with 4 NICs and, by default, ESX assigns vmnic0, which is mapped to PCI 02:00.00, to the service console. However, what is actually physically wired to my management network is vmnic3, which is mapped to PCI 02:03.00. In the script above, I simply searched for the number 3 (i.e. replaced search term with 3) and now my scripted ESX installation works properly.
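If you want to try the filter before baking it into kickstart, here is a minimal sketch you can run anywhere. The NIC listing is made up: only the vmnic names and PCI addresses come from my example above, the other columns are placeholders, and pick_nics is just a helper name I invented. Note that at the command line the \$ escaping is dropped; the backslashes are only needed inside the unquoted heredoc so the shell doesn't expand $0 and $1 while writing the file.

```shell
# Made-up "esxcfg-nics -l" style listing. Only the Name and PCI columns follow
# the example in this post; the remaining columns are placeholders.
sample_nics='Name    PCI      Driver Link Speed    Duplex Description
vmnic0  02:00.00 e1000  Up   1000Mbps Full   placeholder NIC
vmnic1  02:01.00 e1000  Up   1000Mbps Full   placeholder NIC
vmnic2  02:02.00 e1000  Up   1000Mbps Full   placeholder NIC
vmnic3  02:03.00 e1000  Up   1000Mbps Full   placeholder NIC'

# Equivalent of the awk filter in the kickstart line, without the heredoc
# escaping: print the first column of every line matching the search term.
pick_nics() {
  awk -v term="$1" '$0 ~ term { print $1 }'
}

printf '%s\n' "$sample_nics" | pick_nics 3            # loose match, as in the post
printf '%s\n' "$sample_nics" | pick_nics '^vmnic3 '   # stricter: match the name column only
```

The loose search term matches a 3 anywhere on the line, so on a host where another column happens to contain that digit you would pick up extra NICs. Anchoring on the name column is a safer habit.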
Below is the configuration of my server before I redeployed with kickstart. The line in red is the NIC I want assigned to the COS. The lines in black are what ESX assigns the COS by default.
BEFORE (without %post section)
PortGroup Name VLAN ID Used Ports Uplinks
Now, here is the same output after I redeployed the server with my modifications to the %post section of the kickstart file. The scripted deployment of ESX now properly assigns vmnic3 to my service console.
AFTER (with %post section)
[root@vesx7 root]# esxcfg-nics -l
[root@vesx7 root]# esxcfg-vswitch -l
PortGroup Name VLAN ID Used Ports Uplinks
I hope this was helpful. Let me know if you have any questions.
Well, I’d better sign off and start packing because I leave for Omaha, NE in a few hours.
VCDX Admin Exam Notes — Section 1.1
Apr 27th
I finally got a chance to sit down and reformat some of my notes for the VCDX Admin Exam. Below are my notes for Section 1.1 of the VMware Enterprise Administration Exam Blueprint v3.5. Everything in Blue is a direct cut and paste from the exam blueprint.
Oh, and thanks to the Disqus comment from VirtualizationTeam (Blog), letting me know that Peter van den Bosch has a more recent version of his VMware Enterprise Administration Exam Study Guide 3.5.
Section 1 – Storage
Objective 1.1 – Create and Administer VMFS datastores using advanced techniques.
Knowledge
Describe how to identify iSCSI, Fibre channel, SATA and NFS configurations using CLI commands and log entries
Here are a few command line examples that I believe would work well …
1) esxcfg-mpath -l
This command produces the following output on my server:
[root@cincylab-esx3 root]# esxcfg-mpath -l
Disk vmhba0:0:0 /dev/sdb (152627MB) has 1 paths and policy of Fixed
Local 0:31.2 vmhba0:0:0 On active preferred
Disk vmhba32:0:0 /dev/sda (152627MB) has 1 paths and policy of Fixed
Local 0:31.2 vmhba32:0:0 On active preferred
Disk vmhba35:0:0 /dev/sdc (923172MB) has 1 paths and policy of Fixed
iScsi sw iqn.1998-01.com.vmware:cincylab-esx3-1d029e5f<->iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1 vmhba35:0:0 On active preferred
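With more than a handful of LUNs that output gets long, so I like to boil it down to one line per disk. Here is a minimal sketch that assumes the output format shown above; summarize_paths is a helper name I made up, and it only reads text on stdin. On a real host you would pipe esxcfg-mpath -l into it instead of the sample text.

```shell
# Reduce "esxcfg-mpath -l" style output to: disk, device file, transport.
# The transport is taken from the first word of the line after each Disk line.
summarize_paths() {
  awk '
    /^Disk / { disk = $2; dev = $3; want = 1; next }   # remember the current disk
    want     { print disk, dev, $1; want = 0 }         # first path line names the transport
  '
}

# Sample input copied from the output above.
sample_mpath='Disk vmhba0:0:0 /dev/sdb (152627MB) has 1 paths and policy of Fixed
Local 0:31.2 vmhba0:0:0 On active preferred
Disk vmhba32:0:0 /dev/sda (152627MB) has 1 paths and policy of Fixed
Local 0:31.2 vmhba32:0:0 On active preferred
Disk vmhba35:0:0 /dev/sdc (923172MB) has 1 paths and policy of Fixed
iScsi sw iqn.1998-01.com.vmware:cincylab-esx3-1d029e5f<->iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1 vmhba35:0:0 On active preferred'

printf '%s\n' "$sample_mpath" | summarize_paths
```

Only the first path per disk is reported, which is enough to tell local, iSCSI and other transports apart at a glance.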
2) esxcfg-info -s
The -s flag will narrow the scope of the output to just storage and disk related info. But even with the narrowed scope, this command produces way too much output to be displayed here. You’ll likely want to pipe the output into grep, or at a minimum into more or less, to find what you’re looking for.
3) cat /var/log/vmkernel | grep vmhba | tail -10
This will search the vmkernel log file and display the last 10 lines containing the text vmhba. If you want more (or fewer) lines, change the -10 to whatever suits your needs.
I found this one particularly useful when you’ve enabled the software iSCSI initiator at the command line but don’t yet know what number has been assigned to the vmhba (e.g. vmhba35).
4) esxcfg-vmhbadevs -m and ls -lah /vmfs/volumes
The command esxcfg-vmhbadevs -m will show the mapping between vmhba numbers, device files and their UUIDs. If you’d like a quick and easy way to see which UUIDs are mapped to which human readable names, you can follow that up with an ls -lah /vmfs/volumes. The two commands back to back produce the following output on my server:
[root@cincylab-esx3 root]# esxcfg-vmhbadevs -m
vmhba35:0:0:1 /dev/sdc1 4986310d-6525e5e6-ebbd-00237d0681e7
vmhba0:0:0:3 /dev/sdb3 49e115fb-3e22358c-c10a-00237d0681e7
vmhba32:0:0:1 /dev/sda1 4985c53e-e7b1904f-5042-00237d0681e7
[root@cincylab-esx3 root]# ls -lah /vmfs/volumes/
total 10M
drwxr-xr-x 1 root root 512 Apr 20 23:07 .
drwxrwxrwt 1 root root 512 Apr 11 18:12 ..
drwxr-xr-t 1 root root 1.2K Feb 1 21:34 4985c53e-e7b1904f-5042-00237d0681e7
drwxr-xr-t 1 root root 3.7K Apr 14 14:49 4986310d-6525e5e6-ebbd-00237d0681e7
drwxr-xr-t 1 root root 980 Apr 11 18:13 49e115fb-3e22358c-c10a-00237d0681e7
lrwxr-xr-x 1 root root 35 Apr 20 23:07 cincylab-esx3:storage1 -> 4985c53e-e7b1904f-5042-00237d0681e7
lrwxr-xr-x 1 root root 35 Apr 20 23:07 cincylab-esx3:storage2 -> 49e115fb-3e22358c-c10a-00237d0681e7
lrwxr-xr-x 1 root root 35 Apr 20 23:07 vol1 -> 4986310d-6525e5e6-ebbd-00237d0681e7
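Matching UUIDs by eye gets old fast, so here is a rough sketch that joins the two outputs for you. It assumes the formats shown above and datastore labels without spaces; label_for_uuid is a name I made up. The sample text is copied from my server's output, and on a real host you would feed in the live output of the two commands instead.

```shell
# Given "ls -l /vmfs/volumes" text on stdin and a UUID as $1, print the
# symlink name (datastore label) that points at that UUID.
label_for_uuid() {
  awk -v uuid="$1" '$(NF-1) == "->" && $NF == uuid { print $(NF-2) }'
}

sample_devs='vmhba35:0:0:1 /dev/sdc1 4986310d-6525e5e6-ebbd-00237d0681e7
vmhba0:0:0:3 /dev/sdb3 49e115fb-3e22358c-c10a-00237d0681e7
vmhba32:0:0:1 /dev/sda1 4985c53e-e7b1904f-5042-00237d0681e7'

sample_ls='lrwxr-xr-x 1 root root 35 Apr 20 23:07 cincylab-esx3:storage1 -> 4985c53e-e7b1904f-5042-00237d0681e7
lrwxr-xr-x 1 root root 35 Apr 20 23:07 cincylab-esx3:storage2 -> 49e115fb-3e22358c-c10a-00237d0681e7
lrwxr-xr-x 1 root root 35 Apr 20 23:07 vol1 -> 4986310d-6525e5e6-ebbd-00237d0681e7'

# Print one line per VMFS partition: vmhba name, device file, datastore label.
printf '%s\n' "$sample_devs" | while read -r hba dev uuid; do
  label=$(printf '%s\n' "$sample_ls" | label_for_uuid "$uuid")
  echo "$hba $dev $label"
done
```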
5) vmkiscsi-ls
This one only applies to iSCSI storage, of course, and produces the following output on my server:
[root@cincylab-esx3 root]# vmkiscsi-ls
*************************************************************
SFNet iSCSI Driver Version … 3.6.3 (27-Jun-2005 )
*************************************************************
TARGET NAME : iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1
TARGET ALIAS :
HOST NO : 4
BUS NO : 0
TARGET ID : 0
TARGET ADDRESS : 10.10.8.200:3260
SESSION STATUS : ESTABLISHED AT Sun Apr 12 11:35:09 2009
NO. OF PORTALS : 1
PORTAL ADDRESS 1 : 10.10.8.200:3260,1
SESSION ID : ISID 00023d000001 TSIH 1400
*************************************************************
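If you only care about one or two of those fields (say, in a health check script), a small filter does the job. This is a sketch that assumes the "LABEL : value" layout shown above; iscsi_field is a helper name I invented, and the sample is a trimmed copy of my output.

```shell
# Print the value of one field from "vmkiscsi-ls" style output on stdin.
# $1 is the field label exactly as printed, e.g. "TARGET ADDRESS".
iscsi_field() {
  awk -F ' : ' -v key="$1" '
    { k = $1; sub(/ +$/, "", k) }   # tolerate padding before the colon
    k == key { print $2 }
  '
}

sample_iscsi='TARGET NAME : iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1
TARGET ALIAS :
HOST NO : 4
TARGET ADDRESS : 10.10.8.200:3260
SESSION STATUS : ESTABLISHED AT Sun Apr 12 11:35:09 2009
NO. OF PORTALS : 1'

printf '%s\n' "$sample_iscsi" | iscsi_field 'TARGET NAME'
printf '%s\n' "$sample_iscsi" | iscsi_field 'TARGET ADDRESS'
printf '%s\n' "$sample_iscsi" | iscsi_field 'SESSION STATUS'
```

Splitting on space-colon-space rather than a bare colon keeps the IQN, the port number and the timestamp intact.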
Describe the VMFS file system
There are many subsections here and before digging into each one, check out the following three links …
- Advanced VMFS Configuration and Troubleshooting.
- Really advanced, but really good: Understanding VMFS Volumes
- An oldie but goodie: VMware Virtual Machine File System: Technical Overview and Best Practices
Metadata
The simple definition of Metadata is “data about data.” All file systems handle metadata differently. VMFS uses metadata, stored in a special area of each volume, to manage all the files, directories (in VMFS-3 only), and attributes about the volume. VMFS is a clustered file system, meaning more than one ESX server can access the same file system at the same time. Therefore an update to the metadata requires locking of the LUN using a SCSI reservation.
Multi-access and locking
The following was taken from Advanced VMFS Configuration and Troubleshooting.
Distributed Lock handling by VMFS3
- Done in-band
- Hosts mount a VMFS3 volume
- Hosts’ ids posted to heartbeat region
- Heartbeat records are updated at regular intervals by hosts
- Host X locks a file, the lock is associated with its ID
- If host X dies or loses access to volume the file lock is stale
- Host Z attempts to lock the same file which is locked
- Host Z checks the heartbeat record of Host X (~5 times)
- If host X heartbeat record is not updated, Host Z will age the lock
- All other hosts yield to host Z and do not attempt to lock the file
- Lock is broken and Host Z acquires the lock
- Journal is replayed by Host Z
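To make the stale-lock decision a bit more concrete, here is a toy model of it. This is not VMFS code, just my reading of the steps above: host Z samples host X's heartbeat value a handful of times and only treats the lock as stale if the value never moves. The function name and the heartbeat numbers are invented for illustration.

```shell
# lock_is_stale HB1 HB2 ...
# Arguments are the heartbeat values host Z observed for host X on successive
# checks. Succeeds (stale) only if the value never changed.
lock_is_stale() {
  first=$1
  shift
  for hb in "$@"; do
    [ "$hb" = "$first" ] || return 1   # heartbeat moved: owner is still alive
  done
  return 0                             # no change across all checks: age the lock
}

if lock_is_stale 41 41 41 41 41; then
  echo "host X looks dead: age the lock, break it, replay the journal"
fi
if ! lock_is_stale 41 42 43 44 45; then
  echo "host X is alive: leave the lock alone"
fi
```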
Extents
Extents are logical extensions of a file system. They are typically used to grow a volume beyond the VMFS size limitations. Essentially, an extent is the “joining” of two or more volumes into a single, logical VMFS volume.
Tree structure and files
The VMFS partition is mounted to the directory with the corresponding UUID found in /vmfs/volumes. The human readable name of the volume is merely a symbolic link to that directory. By default, all VMs are given a directory at the root of the partition. So, for example, a VM with the name of AaronSweemer would have the directory /vmfs/volumes/UUID/AaronSweemer. In this directory you will find all files specific and relevant to that VM. This is the default behavior, as some (not all) of these files can be configured to reside elsewhere.
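If the UUID and symlink relationship is hard to picture, you can mimic it in a scratch directory on any Linux box. This sketch builds a throwaway layout under a temp directory (not a real /vmfs/volumes), reusing a UUID and label from my listing earlier in these notes.

```shell
# Build a throwaway directory that mimics the /vmfs/volumes layout.
root=$(mktemp -d)
uuid=4986310d-6525e5e6-ebbd-00237d0681e7
mkdir -p "$root/$uuid/AaronSweemer"   # VM directory at the root of the volume
ln -s "$uuid" "$root/vol1"            # human readable name -> UUID directory

readlink "$root/vol1"                 # prints the UUID the label points at
ls -d "$root/vol1/AaronSweemer"       # the VM directory is reachable via the label
```

Either path gets you to the same files. Remove the scratch directory with rm -rf "$root" when you are done.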
Here is a table of common files found on the VMFS file system.
| Extension | Usage |
| .dsk | VM disk file |
| .vmdk | VM disk file |
| .hlog | VMotion log file |
| .vswp | Virtual swap file |
| .vmss | VM suspend file |
| .vmtd | VM template disk file |
| .vmtx | VM Template configuration file |
| .REDO | Files used when VM is in REDO mode |
| .vmx | VM configuration file |
| .log | VM log file |
| .nvram | Nonvolatile RAM |
Journaling
From Wikipedia …
A journaling file system is a file system that logs changes to a journal (usually a circular log in a dedicated area) before committing them to the main file system. Such file systems are less likely to become corrupted in the event of power failure or system crash.
Explain the process used to align VMFS partitions
The following procedure was found in VMware Enterprise Administration Exam study guide 3.5 (page 5) and Advanced VMFS Configuration and Troubleshooting (slide 36).
Aligned partitions start at 128. If the Start value is 63 (the default), the partition is
not aligned. If you choose not to use the VI Client and create partitions with
vmkfstools, or if you want to align the default installation partition before use, take
the following steps to use fdisk to align a partition manually from the ESX Server
service console:
1. Enter fdisk /dev/sd<x> where <x> is the device suffix.
2. Determine if any VMware VMFS partitions already exist. VMware VMFS
partitions are identified by a partition system ID of fb. Type d to delete
these partitions.
Note: This destroys all data currently residing on the VMware VMFS partitions you
delete. Ensure you back up this data first if you need it.
3. Type n to create a new partition.
4. Type p to create a primary partition.
5. Type 1 to create partition No. 1.
6. Select the defaults to use the complete disk.
7. Type t to set the partition’s system ID.
8. Type fb to set the partition system ID to fb (VMware VMFS volume).
9. Type x to go into expert mode.
10. Type b to adjust the starting block number.
11. Type 1 to choose partition 1.
12. Type 128 to set it to 128 (the array’s stripe element size).
13. Type w to write label and partition information to disk.
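Why 128? With 512-byte sectors, a start of 128 lands exactly on a 64 KB boundary (128 x 512 = 65536), while the default start of 63 does not. Here is a quick sketch of that arithmetic; is_aligned is a name I made up, and the 512-byte sector size and 64 KB boundary are assumptions you should adjust for your array.

```shell
# is_aligned START_SECTOR [BOUNDARY_KB]
# Succeeds when the partition start falls exactly on the boundary.
# Assumes 512-byte sectors; the 64 KB default corresponds to a start of 128.
is_aligned() {
  start=$1
  boundary_kb=${2:-64}
  [ $(( (start * 512) % (boundary_kb * 1024) )) -eq 0 ]
}

for s in 63 128; do
  if is_aligned "$s"; then
    echo "start $s: aligned"
  else
    echo "start $s: not aligned"
  fi
done
```

On the service console, fdisk -lu /dev/sd<x> lists partition start values in sectors, which is the number to feed in here.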
Explain the use cases for round-robin load balancing
Multipathing is typically used for failover, meaning that if one storage path becomes unavailable the host can fail over to an alternate path. However, multipathing can also be used in a round-robin fashion to achieve load balancing and better utilization of the HBAs. There are a few configurable options that specify when an ESX server switches paths. From the Round-Robin Load Balancing technical note …
When to switch – Specify that the ESX Server host should attempt a path switch after a specified number of I/O blocks have been issued on a path or after a specified number of read or write commands have been issued on a path. If another path exists that meets the specified path policy for the target, the active path to the target is switched to the new path. The --custom-max-commands and --custom-max-blocks options specify when to switch.
Which target to use – Specify that the next path should be on the preferred target, the most recently used target, or any target. The --custom-target-policy option specifies which target to use.
Which HBA to use – Specify that the next path should be on the preferred HBA, the most recently used HBA, the HBA with the minimum outstanding I/O requests, or any HBA. The --custom-HBA-policy option specifies which HBA to use.
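To get a feel for the "when to switch" setting, here is a toy simulation of switching paths after a fixed number of commands. It is only a model of the idea, not how the VMkernel implements it: it ignores the block limit and the target and HBA policies, and the path names are placeholders. simulate_rr is a name I made up.

```shell
# simulate_rr MAX_COMMANDS TOTAL_IOS PATH...
# Issue TOTAL_IOS commands, moving to the next path each time MAX_COMMANDS
# have been sent down the current one. Prints "path count" for every path.
simulate_rr() {
  max=$1
  total=$2
  shift 2
  printf '%s\n' "$@" | awk -v max="$max" -v total="$total" '
    { path[NR] = $0 }                       # one path name per input line
    END {
      for (i = 0; i < total; i++)
        count[int(i / max) % NR + 1]++      # which path serves command i
      for (p = 1; p <= NR; p++)
        print path[p], count[p] + 0
    }'
}

simulate_rr 50 200 vmhba1:0:1 vmhba2:0:1
```

With two paths and a switch every 50 commands, the 200 commands split evenly. A larger limit means fewer switches but lumpier distribution over short bursts.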
Skills and Abilities
Perform advanced multi-pathing configuration
- Configure multi-pathing policy
- Configure round-robin behavior using command-line tools
- Manage active and inactive paths
- Again, from the Round-Robin Load Balancing technical note …
Setting the Path Switching Policy
You can set the path-switching policy for failover and for load balancing by using the esxcfg-mpath command. You can set the path-switching policy on a per-LUN basis by using the esxcfg-mpath command’s --policy custom option. If you specify --policy custom, you must also specify one of the custom policy options. Because the path-switching policy is set on a per-LUN basis, you must always specify the LUN using the --lun option.
…
Notes
If you set the custom-max-blocks and custom-max-commands options, the system attempts to switch paths as soon as one of the limits is reached.
If you set the target or the HBA policy to preferred, the system chooses the target or the HBA of the preferred path when possible. If a preferred policy is set on an active/passive SAN array, and the preferred target is not on the active SP (Storage Processor), the system does not select the preferred target but a target on the active SP.
Path switching is not performed if an outstanding SCSI reservation is on the target, or if a path failover is underway. Path switching is delayed until an I/O request is performed when no reservations or path failovers are pending.
Configure and use NPIV HBAs
<<I don’t have NPIV in my lab. Need to revisit this section>>
Manage VMFS file systems using command-line tools
The command line tool you’ll use for managing VMFS file systems is vmkfstools. It’s a very powerful tool and there are many options available, so I suggest you read the man page. The following examples (taken from the online documentation) are certainly not exhaustive, just a quick sample of what the tool can do.
Example for Creating a VMFS File System
vmkfstools -C vmfs3 -b 1m -S my_vmfs /vmfs/devices/disks/vmhba1:3:0:1
Example for Extending a VMFS-3 Volume
vmkfstools -Z /vmfs/devices/disks/vmhba0:1:2:1 /vmfs/devices/disks/vmhba1:3:0:1
Upgrading a VMFS-2 to VMFS-3
-T --tovmfs3 -x --upgradetype [zeroedthick|eagerzeroedthick|thin]
Example for Creating a Virtual Disk
vmkfstools -c 2048m /vmfs/volumes/myVMFS/rh6.2.vmdk
Example for Cloning a Virtual Disk
vmkfstools -i /vmfs/volumes/templates/gold-master.vmdk /vmfs/volumes/myVMFS/myOS.vmdk
Configure NFS datastores using command-line tools
Assuming your NAS is configured properly, this is pretty easy. The following command will mount an NFS datastore on an ESX host …
esxcfg-nas -a -o 10.10.8.25 -s /nfs/share NAS
In this example, -a adds a new NFS datastore, -o specifies the NAS host (here by IP address), -s specifies the share exported by that host, and NAS is the label given to the datastore. Once the datastore has been added, the NFS mount will be found at /vmfs/volumes/NAS
The following command will remove the datastore …
esxcfg-nas -d -o 10.10.8.25 NAS
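If you mount the same share on several hosts, it can help to wrap the two commands above in a small script. Here is a minimal sketch of that idea. The helper names (run, nfs_mount, nfs_unmount) and the DRY_RUN variable are my own inventions for illustration, not part of any VMware tool. With DRY_RUN left at 1 the script only prints the esxcfg-nas commands it would run, so you can check them before touching a host.

```shell
#!/bin/sh
# Sketch only: run, nfs_mount, nfs_unmount and DRY_RUN are hypothetical helpers.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

nfs_mount() {
    # $1 = NAS host, $2 = exported share, $3 = datastore label
    run esxcfg-nas -a -o "$1" -s "$2" "$3"
}

nfs_unmount() {
    # $1 = NAS host, $2 = datastore label (same form as the removal command above)
    run esxcfg-nas -d -o "$1" "$2"
}

# Example using the values from this section
nfs_mount 10.10.8.25 /nfs/share NAS
nfs_unmount 10.10.8.25 NAS
```

Run it with DRY_RUN=0 in the service console once the printed commands look right.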
Configure iSCSI hardware and software initiators using command-line tools
I don’t know if I’ve seen an official, formal example of how to do this (though I’m sure it exists somewhere). So, here’s how I do it …
Step 1: Add the portgroup to vSwitch0
esxcfg-vswitch --add-pg=VMkernel vSwitch0
Step 2: Add the IP to the VMkernel portgroup
esxcfg-vmknic -a -i 10.10.8.202 -n 255.255.255.0 VMkernel
Step 4: Enable iSCSI
esxcfg-swiscsi -e
Step 5: Add the target
vmkiscsi-tool -D -a 10.10.8.200 vmhba34
Step 6: Rescan the HBA
esxcfg-rescan vmhba34
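If you do this on more than one host, the commands listed above can be chained into a single function. This is a rough sketch, assuming the same vSwitch, portgroup, addresses and HBA name used in my example. The run and configure_sw_iscsi helpers and the DRY_RUN variable are hypothetical names I made up for illustration. By default it only prints the commands in order, and each command runs only if the previous one succeeded.

```shell
#!/bin/sh
# Sketch only: run, configure_sw_iscsi and DRY_RUN are hypothetical helpers.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

configure_sw_iscsi() {
    # $1 = vSwitch, $2 = portgroup, $3 = VMkernel IP, $4 = netmask,
    # $5 = iSCSI target IP, $6 = software iSCSI HBA
    run esxcfg-vswitch --add-pg="$2" "$1" &&      # add the portgroup
    run esxcfg-vmknic -a -i "$3" -n "$4" "$2" &&  # give the portgroup an IP
    run esxcfg-swiscsi -e &&                      # enable software iSCSI
    run vmkiscsi-tool -D -a "$5" "$6" &&          # add the discovery target
    run esxcfg-rescan "$6"                        # rescan the HBA
}

# Example using the values from the steps above
configure_sw_iscsi vSwitch0 VMkernel 10.10.8.202 255.255.255.0 10.10.8.200 vmhba34
```

Set DRY_RUN=0 to actually execute the commands. The HBA name (vmhba34 here) can differ from host to host, so check yours first.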
That’s it for section 1.1 … time to go reformat my notes for section 1.2!
Troubleshooting ESX
Apr 21st
I was at the Louisville VMUG on Friday talking about Troubleshooting ESX. In my preparation for the event, I was looking for a good PowerPoint presentation I could reuse and I stumbled across a sweet little gem of a document. Dudley Smith, a VMware Technical Account Manager (TAM) out of Virginia, created a cool one page Mind Map for Troubleshooting ESX. Does it address every potential issue you’ll come across? No, of course not. But it’s a heck of a good place to start. One look at his Mind Map and I thought to myself, “that would be a great thing to have printed out and hanging over every VMware admin’s desk.”
Well, long story short, I snagged it and threw it up on the big screen behind me as I was presenting. During the presentation (and many times since the presentation) I had many requests to post the PDF for download.
But since I couldn’t just start passing out someone else’s work as my own, I sent Dudley a quick email asking for permission to distribute. He responded by saying, “Sure, publish away! You might enjoy this too… ” Attached was another one page document that visually shows the TCP/UDP ports leveraged in VI3.5. Nice! Again, another great document to have printed out and hanging over your desk, IMHO.
So, courtesy of the author, Dudley Smith, here are two documents that I would recommend you add to your tool belt. (click the images to download the PDFs)
If you like them, leave a comment for Dudley.
Bring on the 10Gig Ethernet!
Nov 17th
VMware recently updated its networking performance tests to see if the ESX hypervisor could efficiently leverage the ever-expanding bandwidth available at the Ethernet level. In short, it sure can! A single VM can effectively saturate a 10Gbps link when jumbo frames are enabled. It performs just as well with multiple virtual machines: throughput scaled nicely and was shared equitably across all VMs. This type of scalable performance is reassuring as customers continue to raise consolidation ratios within their datacenters and virtualize the largest of workloads.
To save you some reading, here is the summary from the whitepaper, which can be found at: http://www.vmware.com/pdf/10GigE_performance.pdf
Conclusion: The results presented in the previous sections show that virtual machines running on ESX 3.5 Update 1 can efficiently share and saturate 10Gbps Ethernet links. A single uniprocessor virtual machine can push as much as 8Gbps of traffic with frames that use the standard MTU size and can saturate a 10Gbps link when using jumbo frames. Jumbo frames can also boost receive throughput by up to 40 percent, allowing a single virtual machine to receive traffic at rates up to 5.7Gbps.
Our detailed scaling tests show that ESX scales very well with increasing load on the system and fairly allocates bandwidth to all the booted virtual machines. Two virtual machines can easily saturate a 10Gbps link (the practical limit is 9.3Gbps for packets that use the standard MTU size because of protocol overheads), and the throughput remains constant as we add more virtual machines. Scaling on the receive path is similar, with throughput increasing linearly until we achieve line rate and then gracefully decreasing as system load and resource contention increase.
Thus, ESX 3.5 Update 1 supports the latest generation of 10Gbps NICs with minimal overheads and allows high virtual machine consolidation ratios while being fair to all virtual machines sharing the NICs and maintaining 10Gbps line rates.



