Archive for the ‘Aaron Sweemer’ Category

I’ve been here at VMworld2009 in San Francisco since Sunday.  Monday was Partner Day and marked the unofficial first day of the event.  Yesterday, however, was the actual first day, open to all attendees.  There is much coverage of the event by numerous bloggers, so I won’t reinvent the wheel and bore you with duplicate content.  Instead, here are a few of my favorite things, so far (we’ve got two more days).  Oh, and this is by no means a complete list.  There are a LOT of cool things happening here and I don’t have the time and/or energy to write about all of them.

John Troyer Streaming Live from the Solutions Exchange

First, I often find myself watching John Troyer’s live coverage from the Solutions Exchange.  Which is weird because I could literally walk there in about 2 minutes.  But when I’m in my room in between meetings, it’s nice to have it on in the background so I can listen in on all the stuff I’m missing.  And John has been interviewing some very cool people.  

Check it out here … http://www.ustream.tv/channel/vmworld

 

vCloud Express

The vCloud Express is …

The VMware vCloud™ Express service delivers the ability to provision infrastructure on-demand, via credit card, and pay for use by the hour. As a VMware Virtualized ™ service, it ensures compatibility with other VMware environments both internally and with external services.

(Taken from http://www.boche.net/blog/index.php/2009/09/01/vmware-announces-vcloud-express/)

VMware actually demoed vCloud Express with Terremark, one of the service providers in the program.  It was pretty slick to see them simply add some user and credit card information and then spin up a VM quickly and easily on stage. 

Now that I’m having serious power problems in my house because of my home lab (hence the reason this blog keeps going up and down), I really think I’ll be using vCloud Express very soon.

 

SpringSource

VMware’s acquisition of SpringSource was actually announced weeks ago, but this was the first time there was really any lengthy discussion about it.  Frankly, the SpringSource acquisition is probably the thing that I am most excited about.  And I personally believe it will play a significant role in VMware’s future.  There are actually a lot of things I’d like to say about this, but I’ll save them for a later post.

 

Running VMware View / RDP sessions on your iPhone with the Wyse Pocket Cloud client

Given the fact that I “eat the dog food” and actually run my VMware corporate desktop as a VMware View image, and I am also an iPhone user, I think this is super slick and something I know I will use …

 

A Shameless Plug

The folks over at Virtual Strategy Magazine have asked me to do a video blog of the event.  Our first recording was last night and I would guess they’ll have it posted sometime today.  When it’s up on their site, you can find it here … http://www.virtual-strategy.com/VMworld-2009.html

Also, if you’re here at VMworld and undecided about your Thursday schedule, why not come to my session?  :)   I’ll be presenting at 10AM in room 135.  The topic?  How to convert old PCs to thin clients using a Linux OS and the VMware View Open Client.  Hope to see you there!

The editor of Virtual Strategy Magazine recently asked me if I would be interested in committing to a regular monthly column for them.  After thinking about it for a few days, I agreed and my first article in the series was published on Monday.  The title of the column is going to be “Confessions of a Virtualization-aholic” where I’ll talk about real world experiences, with plenty of exaggeration and embellishment for effect. :)

You can follow my new column by visiting Virtual Strategy Magazine and this month’s article is here http://www.virtual-strategy.com/Features/Sweemer-20090825.html.  I’ve also republished the article below.

 

image

The data center lights are brighter than the sun. The air is particularly stale and warm today. Beads of sweat are forming on my forehead. One bead grows too large and trickles down my face into my right eye. The salty liquid forces me to squint and temporarily blurs my vision. I hear Jamiroquai’s "Virtual Insanity" playing in the background. Sweet.

My vision returns and I notice a lone leaf of paper, like tumbleweed, dancing across the raised tile floor. It’s caught in the airstream of the temporary fans brought in to combat the blistering heat pouring off the mountains of servers. Off in the distance, I see an Oompa-Loompa doing a perfect pirouette. That’s weird. He’s kinda freakin’ me out. But nevermind. Back to the heat. It’s bad and it’s getting worse.

There are just too many servers. Some renegade, old-school wahoo added more hardware when the existing hardware is way underutilized. Brainwashed products of an era dominated by inefficient operating systems and incompatible applications, you can’t blame them anymore than you can blame a dog for being a dog. But placing blame aside, they have an incredible knack for making a horrible mess of things. I need to put an end to this.

I can feel the temperature rising. The data center technicians have all stripped down to their knickers. Long hair and pot bellies abound. It ain’t pretty. Except for Megan Fox. She’s hot. Evidently, in between filming scenes for her next Transformers movie, she moonlights as a server admin. Who knew? She turns to me and says "Aaron, you have to help us!" There’s an air of desperation in her voice.

"Don’t worry, miss…everything’s going to be alright."

In super slow motion, she flicks back her hair, gives me a sexy smile and a wink. I can’t disappoint Megan! Now I’m on a mission, the heat must come down. Servers must be eliminated. Everything must be virtualized!

It’s show time. Without hesitation I start to P2V everything in sight. The growing crowd starts to cheer and I feed off their excitement. Web, database, middleware, you name it … nothing is safe, nothing is sacred. I’m a P2V monkey, but instead of flinging poo, I’m flinging servers. One by one, each server meets its timely end. The heat is retreating. Trees are being saved! Energy execs are sobbing as their profits diminish with every dead server!

The pile of lifeless hardware is growing quite large. Each server bears an official death certificate which reads "Virtualized by VMware." Most of us cheer. But a few of the old timers hover over the steaming pile of scrap metal screaming "NOOOOOOOOOOOOO!" One of the poor pathetic souls shakes his fist at me and calls me a murderer. I can’t help but snicker.

Just then, Megan breaks through the crowd and comes running toward me. Oddly enough, her knickers are gone and she’s covered by nothing more than a recent issue of Virtual Strategy Magazine. Carryl Roy, Editor in Chief of VSM, yells "We’re only digital, not print!" To which I exclaim, "I really think you’re missing the point!" And at that very moment, Megan leaps toward me and lands with her lips in perfect alignment with mine.

"My hero!" says Megan.

"Sweemer’s the name, virtualization’s my game."

She moves in for a long, wet kiss. But before our lips touch, she pauses, gives me a funny look and says, "BEEP." Huh? That was weird. Well, it’s Megan Fox, she can say whatever she wants ‘cause she’s so hot! But let’s try that again.

"BEEP"

Okay, Megan sweetheart, you really need to stop that. It’s distracting.

"BEEP BEEP BEEEEEEEEEEEEEEEEEEEEEEEEEEP"

My eyelids open and I’m staring at the rear end of my 10-year-old boxer, Lucy, who apparently managed to crawl into bed in the middle of the night. I reach over and grab my alarm clock and mid BEEP, I throw it against the wall. I’m pissed. I lost Megan and woke up to old stinky dog butt. But then I roll over and see my amazing, beautiful wife. Megan’s got nothing on her! And life is good again.

My wife opens her eyes and says "You were dreaming about Megan Fox again, weren’t you?"

"Why do you say that?"

"Because you woke me up 15 times last night screaming, ‘P2V me, Megan! P2V me, baby!’"

I try to conceal my embarrassment as my wife just giggles. But she does a great job of comforting me when she says "Don’t worry, baby. I dream about Megan Fox, too." Sweeeeeet. I love my wife.

Well, time to get up and get ready, I’ve got a long day, which begins with my Virtualization-aholic’s Anonymous meeting. It’s been rumored that Simon Crosby and Steve Ballmer will be there. Care to join us? We meet right here, once a month, at Virtual Strategy Magazine. You’ll laugh, you’ll cry, you may even hurl, as you hear weird, wacky and sometimes seriously disturbing Confessions of a Virtualization-aholic.

I wish I could take credit for the following work, but everything below is brought to you by Michael White.  Michael is a co-worker of mine, an SE out of Canada who we often refer to as the “SRM King.”  He continually impresses me with his ability to crank out a weekly newsletter loaded full of great content.  Well last night, he happened to mention I could republish his work on my blog.  Shoot, you don’t have to ask me twice!

Keep in mind as you’re reading, everything is a direct cut and paste.  So anything written in the first person (e.g. “I have found …” or “I have decided”) would be referring to him, not me.  I certainly don’t want to take credit for all his hard work! :)  

If you have any questions or comments for Michael, feel free to leave a message for him.

 

Notes from VMware:

Cluster BP, FT and Issue, HA Issue, vDS Cheat Sheet, vDR Issue, YAPOTAV, vSphere Reference Card, View Design BP, SRM FAQ, and really a LOT more!

 

vSphere Cluster – ESX or ESXi or Mixed – suggestion / recommended best practice

We say that one day ESX will not exist, and that ESX and ESXi are the same.  Or almost the same.  However, I have found with Host Profiles and FT that there is very good reason not to mix ESX and ESXi in the same cluster.  As soon as VMworld is over, I am redoing my mixed cluster as all ESXi.  First, we all know of the problem I reported some time ago that the 8/6/09 patches for vSphere would break FT in a mixed ESX / ESXi cluster.  There is no short term solution for that. The workaround is to have a cluster that is all ESX or all ESXi.  Second, host profiles have a problem dealing with service console / management network ports.  In theory you can manage that by using a reference server that is ESX and it will translate as necessary for ESXi.  It doesn’t do so well at that.  So using Host Profiles to do a push of a distributed virtual switch (only) ends up causing issues in ESXi consoles.  I ended up doing the ESXi hosts manually.  The real solution to the FT and Host Profiles type issues is to have a cluster that is all ESX or all ESXi.  And I am voting for ESXi in my lab.  Make no mistake, if you don’t listen to this you will have some issues that are not pleasant.

 

Using ESXi and ESX and FT in same cluster?  And FT broke with the 8/6/09 patches?

The only solution to this at this time is to separate your ESXi and ESX servers into their own clusters, or upgrade one or the other to be the same as the other – meaning all ESXi or all ESX – and your problem should go away.  If you have not installed the 8/6/09 patches yet, and you are using FT, and you have ESXi and ESX in your cluster, then change your cluster to be all ESXi or all ESX, and then install the patches.  Not installing the patches until we fix this is NOT an option.  I have decided, as mentioned somewhere else in here, to redo my cluster as all ESXi.  It won’t take much time.  Some background on this issue can be found at http://communities.vmware.com/message/1335428#1335428.

Update on odd issue with HA not working if the vSphere ESX console was using certain IP addresses

I hope everyone has already heard that the vSphere bug described in http://kb.vmware.com/kb/1013013, which I think I mentioned in my last newsletter, now has a patch.  This is the bug where, when a very specific IP address scheme is in use on the management ports / service console with no other IP schemes in use and a host crashes, the VMs that should have been restarted by HA would in fact not be started at all.  I have not tested the fix, as I am wrestling with SRM and trying to get ready for VMworld.  To avoid this bug, only one of the addresses on your service console or management ports needs to be using something outside of the ‘special’ scheme.

vDS Implementation Cheat Sheet

I worked with the distributed switches in the past in a lab sense, but recently, for my future SRM testing, I got them going for real in my lab.  And it was hard, confusing, and not intuitive at all.  So I wrote a cheat sheet so you would not have to suffer.  It is attached.  I have used it a few times and am happy with it, so hopefully it will make things quicker and easier.  Let me know if you need improvements or changes in it.  http://www.virtualinsanity.com/wp-content/uploads/vDS-Implementation-Cheat-Sheet-b.pdf

Data Recovery Issue – which stops backups from happening

If you ever have an issue with writing to your destination when doing backups, you may see the restore point in red with a (Damaged) beside it.  This can cause your backup to not work again.   The events part of the Reports will show file access errors – 3902.  The solution to this is not in the documentation for vDR but it is here. Expand the display of restore points to be bigger than the default 5.  I used 25 when I had this issue.  Now click all of the restore points that show as damaged.  Then select the Mark for Deletion button in the top right of the screen.  Now change to the Configuration \ Destinations screen and select the destination that is associated with your backup, and use the Integrity Check option near the top right of the screen.  It will take a while.  Once it is complete with no errors – check the Events view of Reports – you need to restart the appliance.  Now your backups should work!

YAPOTAV – Yet another post on why to attend VMworld

Find this at http://blogs.vmware.com/vmtn/2009/08/yapowtav-yet-another-post-on-why-to-attend-vmworld.html.

New vSphere document reference card

Forbes Guthrie has done a wonderful job on a reference card for vSphere documentation stuff.  It pulls stuff out of the documentation and highlights it as a result.  Very handy and well done.  Find it at http://www.vreference.com/public/vsphere4-notes1.0.pdf

View Design Best Practices training

Would you like to learn more about designing a View infrastructure?  The more people you have that depend on it the more important training and experience becomes.  Get some ideas on design at http://mylearn.vmware.com/descriptions/EDU_DATASHEET_ViewDesignBestPractices_V3.pdf

SRM FAQ online now thanks to Duncan at Yellow-Bricks

This is from information I have shared with Duncan but it is great information and I appreciate him sharing with everyone.  Find it at http://www.yellow-bricks.com/srm-faq/.  Duncan’s web site is one of the few you should read frequently. He is a PSO guy in Europe and is very smart, and knows what to communicate – does it real well and I appreciate it.

 

vSphere and VM snapshots and block size

This is something else that Duncan has done.  There is a behavior difference between 3.5x and 4.0 that could catch someone.  Find out more from Duncan at http://www.yellow-bricks.com/2009/08/24/vsphere-vm-snapshots-and-block-size/.

VMware View Cheat Sheet

I have had some help to update my VMware View Cheat Sheet and it has gone very well.  Our next update of this will have a lot more but this is a good document to get you going with View.  www.virtualinsanity.com/wp-content/uploads/VMware-View-Cheat-Sheet-a.pdf

 

Important patch for Celerra when using NFS with VMware

You can find more information about this at Virtual Geek, but it is important to understand that you need to upgrade your Celerra DART OS before you enable NFS datastores with VMware.  Find out more at http://virtualgeek.typepad.com/virtual_geek/2009/08/important-patch-for-celerranfsvmware.html

Lab Manager 4 Upgrade issue

The installer during an upgrade of LM4 assumes all the default roles are present and unmodified.  If the customer removes or changes any of them, the upgrade installer will fail.

FT – Architecture and Performance

Do you know how to determine how many FT enabled VM’s your vSphere server can support?  Do you know how to design your FT environment for the best performance?  In fact, do you know what the performance overhead for FT is?  All of this and more is answered in http://www.vmware.com/resources/techresources/10058.

How can I determine the exact build number for my ESX 4.0.x hosts?

You can find out the way to determine the build numbers for components of ESX 4.0 hosts at http://kb.vmware.com/kb/1012514
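
As a quick illustration only (it is not taken from the KB article, which covers the individual components), one common way to see a host's build from the service console is to run vmware -v, which prints a single line that includes a build-NNNNNN tag.  The sketch below pulls the number out of such a line.  The helper name extract_build and the sample string are assumptions made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the digits that follow "build-" in a version string.
# Prints nothing if the string contains no build tag.
extract_build() {
    printf '%s\n' "$1" | sed -n 's/.*build-\([0-9][0-9]*\).*/\1/p'
}

# Sample line in the format printed by `vmware -v` (illustrative value only)
sample="VMware ESX 4.0.0 build-164009"
extract_build "$sample"
```

On a real host you would feed it the live output instead of the sample, for example extract_build "$(vmware -v)".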

VMware Data Recovery Evaluator’s guide

This is a very nice document for someone who needs some guidance for testing VDR.  It is a quick way to get started.  http://www.vmware.com/resources/techresources/10055.  My preso on VDR at VMworld is a combination of install / config / best practices and it will be very useful.  Look for the session, or the preso after VMworld.  It will fit with this eval guide nicely and is known as BC2142.

 

AppSpeed and Maintenance Mode

Currently AppSpeed has no way to listen to the ESX host it is working on, so when the host tries to enter Maintenance mode it will not be able to, since the AppSpeed sensor VM will not listen to it and will not VMotion off the host.  This is a very high priority for us to fix. You will need to manually turn off this sensor before trying to enter maintenance mode.

Need some help searching the VMware KB?  Find it at http://xtravirt.com/xd10112 – some interesting info.

NFS Storage Configuration Help

Do you need some help configuring NFS support for your ESX servers?  There is some help at http://communities.vmware.com/docs/DOC-7900.  This link has only a little info but it does include some troubleshooting info.

VUM and Cisco – conflict message

I got a conflict message from VUM when I tried to patch recently.  It was a conflict with the Cisco Nexus stuff which I do not have installed.  It turns out that I could just ignore it but it was a little bothersome.  We are going to change that message in the near future to be more informative.  That way if you know you don’t have Cisco (or whatever) installed you can just install with no issues.  The issue is we download all the metadata or patches for ESX without any granularity. So the Cisco patches come down too.  More info can be found at http://kb.vmware.com/kb/1013068.

Suggested VMware Employee Sessions at VMworld

This is a list that one of my co-workers put together. It might give you some ideas of what to look for. 

  • Michael White – BC2142 – Data Recovery intro and best practices
  • Tiffany To – DV1790 – View TCO-ROI expert
  • Mahesh Ramachandran – VM1724 – Capacity IQ Tech Preview
  • Chris Rimer – EA2342 – Oracle sessions (especially around questions of Support and Licensing)
  • Richard McDougall – TA3438 – vSphere Performance Guru
  • Jacob Jensen – TA2103 – Virtual Networking guru (especially around the Cisco v1000)
  • Andy Banta – TA3264 – iSCSI Best Practices (THE iSCSI Engineer/Expert at VMware!)
  • Kaushik Banerjee – TA2942 – Performance Best Practices (This guy is a genius in performance and on the Perf. core team!)
  • Paul Manning – VM3566 – Storage Best Practices (Many of you have been on calls with Paul for storage related topics!)
  • Brian CS, Charu Charubal, and Rob Randell – VM2847, TA2544, DV2626, – Security Team extraordinaire
  • Mostafa Khalil – TA2509 – Storage Best Practices (Mostafa is one of the first VCDX members!)
  • Amir Sharif – TA3195, V13226 – ESXi PM – ESXi sessions
  • Monica Sharma – VM2408 – ConfigControl Tech Preview
  • Bill Call – VM2657 – LifeCycle Manager Uber-Guru!
  • Dean Flaming and Travis Sales – DV2478 – ThinApp (These are some of the best sessions I have ever seen historically from these guys!)
  • Gaetan Castelein – EA3605, EA3606 – Virtualizing Tier 1 applications
  • Srinivas Krishnamurti – VM2280 – Managing VI from your mobile phone! :)
  • Duncan Epping – TA2259 – Expert VI Design (Duncan runs the #1 Virtualization blog “Yellow-Bricks”)
  • Dean Yao – BC3369 – FT Real World design
  • Howie Xu – TA3521 – vNetwork Troubleshooting (Howie invented the vSwitch! – and wrote one of our TCP/IP stacks)
  • Banjot Chanana – BC3425 – High Availability Futures
  • Nicholas Jacques – PA4694 – AppSpeed PM
  • Eric Horschmann – TA3880 – vSphere vs Hyper-V/XenServer
  • Warren Ponder – DV2697 – View /VDI PM
  • Mike DiPetrillo – TA3326 – Cloud (Mike is another uber-rock star and talks all things Cloud!)
  • Rahul Ravulur – VM4380 – vCenter PM covering future of vCenter
  • Naeem Malik – VM3609 – Capacity Planner expert
  • Aaron Sweemer – DV3567 – How to convert old PCs to Thin Clients using a thin Linux OS and VMware View Open Client.

**** Reminders ******

I got an email from Dudley Smith (a VMware TAM and the author of Troubleshooting ESX and Connections & Ports in VI3.5) informing me that he had recently updated one of his documents.  Wow, he sure did.  Check this puppy out (click the graphic to download) … 

 

image

Pretty slick, eh?  Well it gets even better.  He also created a version using The Brain in HTML … http://www.virtualinsanity.com/esx-connections-and-ports/.  Nice!  This is definitely a bookmark I’ll be keeping handy and I’d recommend you do the same.

Good work Dudley!  Thanks for making it available for everyone.  If you agree, be sure to leave a “Thank You” comment for Dudley Smith.

image

I’m not sure if anyone noticed (or cared), but the blog was down over the weekend.  The reason is that I run my blog server out of my house and on Saturday we moved to a new neighborhood in Cincinnati.  And while virtualization can do wonders for server availability, it can’t do much when your servers are sitting in the back of a moving truck! :)  

Man, what an ordeal.  Moving is such a pain in the butt!!  Now, I’ve moved plenty of times in my life, but it’s been as a bachelor and I didn’t have a lot of stuff.  Now that I have a wife, son and two dogs … holy cow!!  I think I said the phrase “how in the world did we get so much crap?!” about 400 times over the past three or four days.  But now we’re all settled in to our new home in Hyde Park, I got my Time Warner Internet set up and I got my servers unpacked and powered on in the basement.

My wife and I are excited about this new area.  Hyde Park is a very cool place to be.  The picture above is of Hyde Park square and was taken in 1901.  Here’s a quick blurb from a Hyde Park website …

Hyde Park was named after Hyde Park in New York and is one of the 52 neighborhoods that make up Cincinnati located on the east side of the city accessible via I-71. It was once a suburb before being annexed by Cincinnati and is the wealthiest neighborhood in the city. It is home to stately, well-maintained homes with manicured lawns and tree-lined streets. The business district is called Hyde Park Square and offers an array of upscale restaurants, boutiques and galleries.

This morning I walked down to Hyde Park square, which is about three blocks away from our new house, and got some coffee for my wife and me.  Last night, we walked to Beluga, my wife’s favorite sushi restaurant.  This was NOT possible in our old house because we were way out in the country.  Of course, this kind of convenience could be dangerous for the waistline, and the wallet! :)

Anyway, the blog is back online.  BUT, if you’ve been trying to email me at my sweemer.com or virtualinsanity.com email address, then I have not been getting your messages.  See, I *thought* I had ordered business grade Internet service from Time Warner which would allow me to run a mail server.  But evidently I got the consumer grade service, which blocks port 25.  I corrected the problem yesterday, but it will take a few days for the service to get upgraded.  So for the next few days, email still won’t work.  Until then, I can be reached at asweemer [at] gmail [dot] com.

 

Given my recent inactivity here and on Twitter, I feel the need to post some updates.  So, where have I been and what have I been doing for the past few weeks?  I’ll start with the most recent drama, which should give you a good laugh.

Fractured Sternum

A few months ago, my wife and I decided to buy our four year old son an outdoor playset.  Our local Costco had the “Rainbow All-American Double Decker Playset” (pictured left) on sale, so we decided to buy it.  That was back in March.  Nearing the end of June, do you know where the playset is?  Still boxed up in our garage.  Some Dad I am. 

Anyway, on Monday I decided I was sick and tired of all the clutter in our garage.  Plus I was determined to get that playset built before the end of June.  So I decided to rearrange the garage and get the playset boxes ready to be moved to the back yard.

Before I continue, let’s take a quick look at the weight and dimension of these boxes …

Shipping Box Dimensions:

  • Box-1: 14 1/2” L x 11 1/4” W x 9” H: Approximately 40-lbs

  • Box-2: 22 1/2” L x 11 1/4” W x 9” H: Approximately 40-lbs

  • Box-3: 106” L x 24” W x 7” H: Approximately 210-lbs

  • Box-4: 106” L x 24” W x 7” H: Approximately 240-lbs

  • Box-5: 106” L x 24” W x 7” H: Approximately 195-lbs

  • Box-6: 106” L x 24” W x 7” H: Approximately 200-lbs

  • Slide: 115 1/2” L x 24 3/4” W x 16 3/4” H: Approximately 40-lbs

 

Well, as I was trying to be the big, bad, super dad and move Box #4 on my own … I had a little accident.  That’s right, I have a fractured sternum because a playset box fell on my chest!  How embarrassing.  My friends affectionately now call me “crash” and my wife will no longer allow me to go into the garage without first showing her my helmet is securely fastened.

The good doctor from the ER gave me some Vicodin for the pain, which has been very helpful.  But the side effect is that Vicodin makes me loopy, making it difficult to write.

VCDX

If you’re still reading then you’re probably questioning my level of intelligence (and I wouldn’t entirely blame you :)   So I figured I would try to redeem myself with an update on my VCDX progress.  Even though I have not yet posted all of my VCDX study notes, I actually took the VCDX admin exam a few weeks ago.  And I just found out last week that I passed!  Woooohoooo!  Now I’ve got to start preparing for the next test, the VCDX Design Exam.

 

VMworld2009

Looks like I’ll be speaking at VMworld2009.  So if you’re planning to attend this year’s event, be sure to say “hello.”  Just look for my shaved noggin’ wandering the halls (or the guy with the shiny helmet, hehehe).  Even better, as I’ll be the speaker of session DV3567, you can certainly find me at the following breakout session …

Session ID:
DV3567

Title:
Don’t throw that PC away! How to convert old PCs to Thin Clients using a thin Linux OS and VMware View Open Client.

Abstract:
More and more, companies are looking for additional ways to cut costs through virtualization. And it isn’t long before IT teams start exploring the possibility of a Virtual Desktop Infrastructure. But with desktops outnumbering servers by a factor of 10:1 (or more), converting users to a virtual desktop can be technically challenging and a significant upfront expense. A potential solution to this problem is to convert existing PCs into Thin Clients, extending the life of the hardware and easing the transition into a VDI. This session will show IT professionals various ways to convert older PCs into Thin Clients, capable of connecting to a VMware VM hosted on ESX via the VMware View Manager.

RoR (Ruby on Rails) and other next generation frameworks

I like to think of myself as an amateur developer (though, even amateur developers might have a thing or two to say about that!! :)   I began programming in Perl about 10 years ago and since then I’ve dabbled in a number of different languages, like C++, Java and Ruby.  

About two years ago I was introduced to Ruby on Rails and since then, most of my development work has been with RoR.  Thus far, however, I haven’t posted anything on this blog about RoR.  Why?  Two reasons.  The apps I’ve written to date have absolutely nothing to do with VMware.  And second, like I said, I’m an amateur.  Anyone looking for RoR help and advice can probably find better info on actual RoR blogs.

But I’ve decided that this is about to change.  Most recently I’ve been working on a little RoR front end that will “drive” vSphere via SOAP.  So I certainly find that work relevant here.  Plus, if you think about it, Rails provides a level of abstraction and therefore, by definition, can be called a type of virtualization. 

So if you’re an RoR developer (or any other kind of next generation framework, for that matter), please let me know.  I’m interested in reading your blog, checking out your applications, sharing code, chatting about issues / concerns / challenges, etc.  Just post a comment or email me at asweemer [at] gmail [dot] com.

    OK, we are *almost* done getting our network set up properly for VDI, but we’ve got a few more things to do.  Specifically, we need to address:

  1. Handling our external, Internet facing, dynamic IP address.
  2. DHCP and DNS
  3. External VPN access

Dynamic External IP

Most ISP’s (not all) provide a single, dynamic IP address for consumer grade service (i.e. home use).  But when we’re trying to connect to our virtual desktops from somewhere out on the public Internet, how do we know which IP address to connect to?

Generally speaking, connecting to an IP address is bad practice because it’s inflexible.  Instead, we should connect to a Fully Qualified Domain Name (FQDN).  OK fine, so we’ll set up a DNS entry and use a FQDN to connect to our desktops.  But what happens when our dynamic IP address changes and the DNS entry is still mapped to the old IP address?

What we need is an external Dynamic DNS (aka DDNS) service which will allow us to programmatically update our IP address whenever it changes.  There are a number of both free and paid-for DNS providers out there that can deliver DDNS services.  Personally, I use EditDNS (www.editdns.net).  They have a ton of functionality and they’ve been rock solid for the past few years I’ve been using their services, so I’m quite happy with them.

Now, many home use routers these days have the capability to update a DDNS provider.  But in my experience, the functionality is somewhat limited.  What if, for example, I want aaron.sweemer.com and desktop.sweemer.com to be dynamic entries and www.sweemer.com to be a static entry, pointing to my blog server hosted somewhere else?  In reality, I’ve got about 20 FQDN’s that I need to be dynamically updated and about 100 that I want static.  So instead, I created a script that will:

  1. Query my external IP address (check this out, a free tool from Whatismyip.com)
  2. Compare the result of the query with the IP obtained from the previous query
  3. If the IP is the same, or the response contains something other than an IP (e.g. HTTP error), the script exits.
  4. If the IP is different, the script updates my DDNS entries via the EditDNS API, then updates a log file documenting the change, and finally adds the new IP to the last line of a file called previous_ips.

If you’d like to use the script I wrote, you’ll first need to do the following:

  1. If you don’t already have one, set up an account with EditDNS and make sure you have properly configured the domain name(s) you own.
  2. Verify your linux distro has lynx (a command line, text only, www client)
  3. Verify your linux distro has curl (a tool to transfer data using HTTP)
  4. Create a directory (anywhere you have rwx access is fine) for the script and its files to live
  5. In this directory, create a text file called editdns.sh.  Paste the content (below) into it.
  6. Replace XXXXXXXX with your EditDNS password.
  7. Make editdns.sh executable (chmod +x /path/to/editdns/editdns.sh)
  8. Create another text file called records and, one per line, enter the FQDN’s of the DDNS entries you want updated (e.g. mydesktop.mydomain.com)
  9. Add the editdns.sh script to your crontab to run at regular intervals (e.g. mine runs every five minutes and the entry in cron looks like this:    */5 * * * * cd /usr/local/editdns; ./editdns.sh)

And here’s a copy of the actual script …

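#!/bin/bash

EDITDNSPASS="XXXXXXXX"
LYNX=`which lynx`
TIME=`date`
# Current external IP; awk drops anything that is not a dotted-quad address
CIP=`curl -s http://www.whatismyip.com/automation/n09230945.asp | awk --re-interval '$1 ~ /^([0-9]{1,3}\.){3}[0-9]{1,3}$/ {print}'`
# IP recorded by the previous run (last line of previous_ips)
PIP=`tail -1 ./previous_ips`

# Only update when we got a valid IP back and it differs from the last one
if [ "$CIP" != "$PIP" ] && [ -n "$CIP" ]; then
    # Update every FQDN listed (one per line) in the records file
    cat ./records | while read FQDN; do
        $LYNX -source "http://DynDNS.EditDNS.net/api/dynLinux.php?p=$EDITDNSPASS&r=$FQDN"
    done
    echo "IP Change!  New IP is $CIP.  Editdns.net was updated at $TIME." >> ./editdns.log
    echo $CIP >> ./previous_ips
else
    exit 0
fi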

If you have any issues or questions, feel free to email me.  Also, keep in mind, this is a quick and dirty script that accomplishes what I want it to accomplish.  Feel free to make it more robust (e.g. error handling or better logging) to suit your needs.
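If you do want to make the script a bit more robust, here is a minimal sketch of the "is this really an IP, and did it change?" check from steps 2 through 4, pulled out into two small shell functions.  The names is_ipv4 and ip_changed are hypothetical (they are not part of the script above or of EditDNS), and the addresses in the usage lines are documentation placeholders.

```shell
# Sketch only: is_ipv4 and ip_changed are made-up helper names.

# is_ipv4 VALUE: succeed only if VALUE is four dot-separated octets, each 0-255.
is_ipv4() {
    # Reject empty input and anything containing characters other than digits and dots.
    case $1 in
        ''|*[!0-9.]*) return 1 ;;
    esac
    # Split on dots; require exactly four non-empty fields no larger than 255.
    echo "$1" | awk -F. 'NF != 4 { exit 1 }
        { for (i = 1; i <= 4; i++) if ($i == "" || $i > 255) exit 1 }'
}

# ip_changed CURRENT PREVIOUS: succeed only if CURRENT is a valid IP
# and is different from PREVIOUS.
ip_changed() {
    is_ipv4 "$1" && [ "$1" != "$2" ]
}

# Example calls with placeholder addresses:
ip_changed "203.0.113.7" "203.0.113.6" && echo "update needed"
ip_changed "HTTP error" "203.0.113.6" || echo "nothing to do"
```

Unlike the awk pattern in the script, this also rejects out-of-range octets such as 999.1.1.1, which is about the only thing the regular expression lets through by mistake.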

DNS and DHCP

It’s quite possible you’ll want to skip this section and opt for setting up DHCP and DNS via Microsoft’s built-in DHCP and DNS services that come out of the box with their server products.  To properly set up VMware View, we’ll need to set up Active Directory anyway, and quite frankly, it’s far easier to set up a Microsoft server with DHCP and DNS than it is to set up a Linux server.  So feel free to skip this section and leverage Microsoft for these services.  If, however, you’re a glutton for punishment then by all means, read on.

Let’s first start with DNS.  Here too we need Dynamic DNS because as we’re handing out IP addresses via DHCP, we want our DNS server to properly reflect current information as IP addresses change.  So, if you don’t already have bind9 (the DNS server), go ahead and install it (sudo apt-get install bind9 should work on Ubuntu / Debian distros).

The default configuration for bind9 is to act as a caching server, so the first thing we need to do is configure our DNS to forward all unknown DNS requests to another DNS server.  These should be provided to you from your ISP.  Edit the forwarders {} section of your named.conf.options file (usually located in /etc/bind/) to look like this …

asweemer@cincylab-rtr1:/etc/bind$ more named.conf.options
options {
directory "/var/cache/bind";

forwarders {
1.2.3.4;
5.6.7.8;
};

auth-nxdomain no;    # conform to RFC1035
listen-on-v6 { any; };
};

Obviously, you’ll need to change 1.2.3.4 and 5.6.7.8 to the IP addresses given to you by your ISP.

Next, we need to modify our master named.conf to allow dynamic updates to DNS.  Add the following entry to the bottom of your named.conf file.

controls {
inet 127.0.0.1 allow {127.0.0.1; 192.168.9.25; 10.10.7.1; 10.10.7.2; } keys {"rndc-key";};
};

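This tells the DNS server which IP addresses (the ones located between the {}) may connect to its control channel with rndc, provided they present the rndc-key.  Strictly speaking, the dynamic updates themselves are authorized per zone by the allow-update directive in named.conf.local (shown further down), which uses the same key.  Notice the first three IP addresses are local IP addresses.  The fourth IP address is a slave DNS server, which I have yet to set up.  The rndc-key is the default key generated during installation of bind9 and, in this setup, it’s also used to authorize the updating of DNS records.  If you’re using Ubuntu, then you’ll likely find the key in the file /etc/bind/rndc.key …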

asweemer@cincylab-rtr1:/etc/bind$ sudo cat rndc.key
key "rndc-key" {
algorithm hmac-md5;
secret "QZ5jOmcr/OW3nzksR5q0Hw==";
};
asweemer@cincylab-rtr1:/etc/bind$

Note the file is a text file named rndc.key, and the actual key is called rndc-key located within the text file.

OK, next we need to define our zones in the named.conf.local file.  For each domain you’re using (probably just one), you’ll need two entries:  one for the domain and one for the reverse lookup of the domain.  I have two domains I’ll be updating, so my named.conf.local file looks like this …

asweemer@cincylab-rtr1:/etc/bind$ cat named.conf.local
//
// Do any local configuration here
//

include "/etc/bind/rndc.key";

zone "mydomain.com" {
type master;
file "/etc/bind/zones/mydomain.com.db";
allow-update { key "rndc-key"; };
allow-transfer {10.10.7/24; };
};

zone "7.10.10.in-addr.arpa" {
type master;
file "/etc/bind/zones/rev.7.10.10.in-addr.arpa";
allow-update { key "rndc-key"; };
allow-transfer {10.10.7/24; };
};

zone "dmz.mydomain.com" {
type master;
file "/etc/bind/zones/dmz.mydomain.com.db";
allow-update { key "rndc-key"; };
allow-transfer {192.168.9/24; };
};

zone "9.168.192.in-addr.arpa" {
type master;
file "/etc/bind/zones/rev.9.168.192.in-addr.arpa";
allow-update { key "rndc-key"; };
allow-transfer {192.168.9/24; };
};

A couple points to note here:

  • I created a subdirectory called “zones” under /etc/bind/ where I put all my zone files.  This isn’t the default location, and in addition, this isn’t necessary as the zone files can be located anywhere you’d like.  But be aware the configuration file above reflects the location of my files.
  • Notice the include “/etc/bind/rndc.key” on the first line and the allow-update directive within each zone definition?  This should be self-explanatory at this point.
  • The allow-transfer directive within each zone definition explicitly limits zone transfers (copy) to the IP(s) defined.  This is an important security feature since, by default, DNS allows transfers to anyone, and the info contained within a DNS zone file can really give hackers visibility into your network.
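The reverse lookup zone names in the file above follow a simple rule: take the first three octets of the /24 network, reverse their order, and append in-addr.arpa.  Here is a small sketch of that rule as a shell function.  The name reverse_zone is hypothetical, and it assumes a /24 network like the ones in my lab.

```shell
# reverse_zone NETWORK: print the in-addr.arpa zone name for a /24 network.
# Hypothetical helper; only the first three octets of NETWORK are used.
reverse_zone() {
    echo "$1" | awk -F. '{ printf "%s.%s.%s.in-addr.arpa\n", $3, $2, $1 }'
}

reverse_zone 10.10.7.0      # prints 7.10.10.in-addr.arpa
reverse_zone 192.168.9.0    # prints 9.168.192.in-addr.arpa
```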

Now we need to create the zone files we just defined above, which will contain our actual DNS records.  Here is the zone file for our dmz.mydomain.com …

asweemer@cincylab-rtr1:/etc/bind/zones$ cat dmz.mydomain.com.db
$TTL 3600       ; 1 hour
dmz.mydomain.com.       IN SOA  master.dmz.mydomain.com. root.master.dmz.mydomain.com. (
                                2009060514 ; serial
                                86400      ; refresh (1 day)
                                86400      ; retry (1 day)
                                2419200    ; expire (4 weeks)
                                3600       ; minimum (1 hour)
                                )
                        NS      master.dmz.mydomain.com.
                        NS      slave.dmz.mydomain.com.
                        A       192.168.9.25
                        MX      10 mail.dmz.mydomain.com.
                        MX      20 mail-spool.dmz.mydomain.com.
computer-1              A       192.168.9.247
                        TXT     "317bf41a2c5b70fd9ca4e283d364dcddd5"
computer-2              A       192.168.9.250
                        TXT     "00cf6242f693ebbf1d545159548e44ab81"
computer-3              A       192.168.9.243
                        TXT     "31a0cb7e096a96c63dc998d2db3be6e450"
mail                    A       192.168.9.25
mail-spool              A       192.168.9.26
master                  A       192.168.9.25
www                     CNAME   master

A couple important things to point out here:

  • The entries for computer-1, 2 and 3 are dynamic entries that were generated by the DHCP server.  The TXT record that follows each of these entries is a unique identifier which is also generated by the DHCP server and is used to ensure it won’t overwrite existing DNS records that were generated by another process/server.
  • You’ll obviously need to change the domain names and IP addresses to match your environment.
  • If you haven’t worked with bind9 before, this file probably looks pretty cryptic to you.  If so, I would recommend taking a look at http://www.zytrax.com/books/dns/ch8/soa.html, which gives a pretty good overview of the SOA (defined in the first part of the file).  The balance of the file (i.e. the record definitions) is pretty straightforward.

The reverse zone should look like this …

asweemer@cincylab-rtr1:/etc/bind/zones$ more rev.9.168.192.in-addr.arpa
$ORIGIN 9.168.192.IN-ADDR.ARPA.
$TTL    1h;
@       IN      SOA     master.dmz.mydomain.com. root.master.dmz.mydomain.com. (
                        2009060501      ; serial
                        1d              ; refresh
                        1d              ; retry
                        4w              ; expire
                        1h              ; minimum
                        )
        IN      NS      master.dmz.mydomain.com.
        IN      NS      slave.dmz.mydomain.com.
25      IN      PTR     master.dmz.mydomain.com.
26      IN      PTR     slave.dmz.mydomain.com.

I mixed it up just a bit in this file to point out a few different ways to configure a zone file.  In this file, notice the following differences:

  • The $ORIGIN directive sets the domain name to be appended to any unqualified records.  If the $ORIGIN directive doesn’t exist (as it doesn’t in the first config file), then it is implicitly defined by the zone name.
  • The time variables can be defined with d (day), w (week), h (hour), etc.
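To convince yourself that the two notations really are equivalent, here is a quick sketch that converts the shorthand into seconds, so you can compare the values in the reverse zone against the plain numbers in the first zone file.  The function name to_seconds is hypothetical, and it only handles a single number followed by one optional unit letter.

```shell
# to_seconds VALUE: convert a BIND-style time such as 1h, 1d or 4w to seconds.
# Hypothetical helper; combined forms such as 1h30m are not handled.
to_seconds() {
    num=${1%[smhdwSMHDW]}    # strip one trailing unit letter, if present
    case $1 in
        *[wW]) echo $((num * 604800)) ;;
        *[dD]) echo $((num * 86400)) ;;
        *[hH]) echo $((num * 3600)) ;;
        *[mM]) echo $((num * 60)) ;;
        *)     echo "$num" ;;    # plain seconds, with or without a trailing s
    esac
}

to_seconds 1h    # prints 3600
to_seconds 1d    # prints 86400
to_seconds 4w    # prints 2419200
```

Those three results match the minimum, refresh/retry and expire values in the dmz.mydomain.com zone file above.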

That’s about it for DNS.  Once you’ve got your bind9 server configured, restart your bind9 server (sudo /etc/init.d/bind9 restart).  And of course, be sure to test your configurations by using the standard DNS tools (e.g. dig, nslookup).  If you get errors, pay careful attention to your local syslog file (probably located at /var/log/syslog) as that’s where DNS and DHCP errors typically write their error messages.
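If you’d like to test the dynamic update piece before the DHCP server is in the picture, one option is to push a throwaway record with nsupdate.  The sketch below only prints the nsupdate commands so you can review them first.  The function name build_update, the host name test-host and the address 192.168.9.99 are all made up for illustration, and whether nsupdate -k accepts the named.conf-style /etc/bind/rndc.key file directly depends on your BIND version, so treat that part as an assumption.

```shell
# build_update ZONE HOST IP: print nsupdate commands that add an A record.
# Hypothetical helper; nothing is sent to the DNS server by this function.
build_update() {
    printf 'server 127.0.0.1\n'
    printf 'zone %s\n' "$1"
    printf 'update add %s.%s. 3600 A %s\n' "$2" "$1" "$3"
    printf 'send\n'
}

# Print the commands for review.
build_update dmz.mydomain.com test-host 192.168.9.99

# To apply and verify on the DNS server (assumes your nsupdate can read
# the key file in this format):
#   build_update dmz.mydomain.com test-host 192.168.9.99 | sudo nsupdate -k /etc/bind/rndc.key
#   dig @127.0.0.1 test-host.dmz.mydomain.com +short
```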

OK, next up is configuring our DHCP server.  And once again, this post is starting to get way too long, so it looks like I’ll need a fourth and (hopefully) final post to this section.

I got up early to get some work done before driving down to KY to work with a customer on their VDI pilot.  As I was preparing for the meeting, I thought to myself, “wow, with my studies for the VCDX admin exam, and the recent launch of vSphere, I haven’t done a whole lot of VDI recently.”  And then BAM!  It hit me like a ton of bricks.  I haven’t completed the E.T.D.F series I started almost 6 months ago!  If this blog has done anything for me, it has made me painfully aware of my numerous character flaws. <sniffle><tear><sniffle>

Anyway, not that anyone is following along anymore, and purely in the interest of self improvement, I’m determined to finish what I’ve started (both for this series and my VCDX study notes series).  So without further delay, here is the second part of the networking section (here is part one).  And for your convenience, here again is the Visio diagram of my lab.

Router / Firewall (cincylab-rtr1)

In my environment, the vast majority of all relevant network configurations are on cincylab-rtr1, which is really an old Gateway PC that I had lying around with a single 2.2GHz processor and 1Gig of RAM and a single 100Mbps NIC.  I installed Ubuntu server 8.04.1 (kernel 2.6.24-19-server) on it and made it the gateway between my lab and the DMZ (aka, my home network).

The first thing I needed to do was get the basic networking on the server set up.  I have three networks in my house …

  1. An external DMZ, VLAN 192 (aka, my home network)
  2. An internal “production” network, VLAN 10.  I put the word production in bunny ears because nothing is *really* production … it’s all just a lab.  But I try to protect this network a little more than the next.
  3. An internal “lab” network.  This is where I can really have fun!

Configuring the Interface(s)
The server only has one NIC and I was too lazy (and cheap) to go buy a new one.  But it’s a 1GigE card, which is plenty for my environment.  And it’s a snap to configure …

root@cincylab-rtr1:/etc/network# more interfaces

auto lo
iface lo inet loopback

auto vlan10
auto vlan192
auto vlan10:1

iface vlan192 inet static
address 192.168.9.25
netmask 255.255.255.0
gateway 192.168.9.1
mtu 1500
vlan_raw_device eth1

iface vlan10 inet static
address 10.10.7.1
netmask 255.255.255.0
broadcast 10.10.7.255
network 10.10.7.0
mtu 1500
vlan_raw_device eth1

iface vlan10:1 inet static
address 10.0.1.1
netmask 255.255.255.0
broadcast 10.0.1.255
network 10.0.1.0
mtu 1500
vlan_raw_device eth1
root@cincylab-rtr1:/etc/network#

Turn on Routing

Now that the interfaces are configured, we need to turn on routing.  In Linux, this can be accomplished a couple different ways.  The easiest, IMHO, is to simply edit the /etc/sysctl.conf file and set net.ipv4.ip_forward=1.  You could also add echo 1 > /proc/sys/net/ipv4/ip_forward to your /etc/rc.local file.  Either way should turn on IPv4 routing on your server.
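If you’d rather script the sysctl.conf edit than do it by hand, here is a minimal sketch.  The function name enable_forwarding is hypothetical.  It rewrites an existing active net.ipv4.ip_forward line if there is one, and otherwise appends the setting (for example, when the stock line is still commented out).

```shell
# enable_forwarding FILE: make sure FILE has an active net.ipv4.ip_forward=1 line.
# Hypothetical helper; pass /etc/sysctl.conf (as root) on a real router.
enable_forwarding() {
    conf=$1
    key='net\.ipv4\.ip_forward'
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$conf"; then
        # An active setting exists: rewrite it via a temp file.
        sed "s/^[[:space:]]*${key}[[:space:]]*=.*/net.ipv4.ip_forward=1/" "$conf" > "$conf.tmp" &&
            mv "$conf.tmp" "$conf"
    else
        # No active setting (it may be commented out): append one.
        echo 'net.ipv4.ip_forward=1' >> "$conf"
    fi
}

# Usage on the router (needs root):
#   enable_forwarding /etc/sysctl.conf
#   sysctl -p                            # reload settings from /etc/sysctl.conf
#   cat /proc/sys/net/ipv4/ip_forward    # should now print 1
```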

Configure NAT and PAT (Port Address Translation)

Once routing is turned on, we need to set up Network Address Translation and Port Address Translation.  This needs to be done for two reasons.

  1. My lab networks need outside access to the Internet and they have private IP addresses.
  2. My server has a single IP address in the DMZ, which needs to serve as the gateway IP for multiple internal IP’s and TCP/UDP ports.  As an example, I want all traffic arriving on 192.168.9.25, TCP port 8080 to be forwarded to the internal IP 10.10.7.51 port 80.  And more specifically, here’s what I want available to the outside world …
    • 192.168.9.25:8080 -> 10.10.7.51:80
    • 192.168.9.25:8181 -> 10.10.7.51:443
    • 192.168.9.25:8282 -> 10.10.7.50:80
    • 192.168.9.25:8383 -> 10.10.7.50:443

OK, so how do we do this?   We need to configure iptables.  An iptables tutorial is out of scope for this post, but if you’d like to learn more about Linux IP tables, I personally like this one:  http://www.yolinux.com/TUTORIALS/LinuxTutorialIptablesNetworkGateway.html.

To set up iptables, I’ve created a file called fw_rules in /usr/local/bin and made it executable (chmod +x /usr/local/bin/fw_rules).  Here is what the file looks like.

root@cincylab-rtr1:/usr/local/bin# cat fw_rules
#!/bin/bash
iptables -t nat -F
iptables -t filter -F

iptables -t nat -A PREROUTING -p tcp -i vlan192 -d 192.168.9.25 --dport 8080 -j DNAT --to 10.10.7.51:80
iptables -t nat -A PREROUTING -p tcp -i vlan192 -d 192.168.9.25 --dport 8181 -j DNAT --to 10.10.7.51:443
iptables -t nat -A PREROUTING -p tcp -i vlan192 -d 192.168.9.25 --dport 8282 -j DNAT --to 10.10.7.50:80
iptables -t nat -A PREROUTING -p tcp -i vlan192 -d 192.168.9.25 --dport 8383 -j DNAT --to 10.10.7.50:443

iptables -t filter -P INPUT ACCEPT
iptables -t filter -P OUTPUT ACCEPT
iptables -t filter -P FORWARD ACCEPT

root@cincylab-rtr1:/usr/local/bin#

To make these changes persistent across reboots, you’ll need to add /usr/local/bin/fw_rules to your /etc/rc.local file.
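If the list of forwarded ports grows, it can be easier to keep the mappings in a small table and generate the DNAT rules from it.  Here is a rough sketch of that idea.  The function name dnat_rules is hypothetical, the interface and IP are the ones from my fw_rules file, and the function only prints the iptables commands (a dry run) rather than applying them.  I’ve used --to-destination, which is the spelled-out form of the DNAT target option.

```shell
# dnat_rules: read "external_port internal_ip internal_port" lines on stdin
# and print one iptables DNAT command per mapping.  Nothing is applied.
# Hypothetical helper; interface and IP are taken from the fw_rules file.
dnat_rules() {
    ext_if=vlan192
    ext_ip=192.168.9.25
    while read -r port dest_ip dest_port; do
        [ -n "$port" ] || continue    # skip blank lines
        echo "iptables -t nat -A PREROUTING -p tcp -i $ext_if -d $ext_ip" \
             "--dport $port -j DNAT --to-destination $dest_ip:$dest_port"
    done
}

dnat_rules <<'EOF'
8080 10.10.7.51 80
8181 10.10.7.51 443
8282 10.10.7.50 80
8383 10.10.7.50 443
EOF
```

Once the printed rules look right, the output could be piped to sh as root to apply them.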

Now, for all you linux experts out there looking at this file, you’re probably saying “uh, that’s a pretty insecure firewall you got there!”  And you’d be right :)   Remember, this is merely an internal firewall/router which is protected by a much more secure, Internet-facing Cisco ASA (thanks again to the local Cisco team!!).  It’s also this Cisco that forwards outside connections on specific ports (not 8080, 8181, and 8282, for additional security) to IP 192.168.9.25.  And because of this, my goal for this server isn’t to protect, but to separate my lab networks from my home network, and proxy the connections between them.

What’s Next?
We’ve configured our networks, turned on routing and configured NAT / PAT on our server.  What next?  Three things:

  1. Because my external IP is dynamic, we need to set up a script that will periodically check to see if our external IP has changed and, if so, update our dynamic DNS service.
  2. Configure DHCP and DNS.
  3. Set up external VPN access.

Step three is actually optional because when we’re done, we’ll be able to tunnel via SSL to our desktop.  And from our desktop, we’ll have full access to the local LAN.  But sometimes full remote access via VPN is nice without being forced to first “hop” to another desktop.  So, I’ll include my VPN configuration as well.

But for now, it looks like I’m going to need to have a part three of this “setting up the network and dedicated remote access” section, because I need to get on the road down to KY.  But if you’re interested, look for part three later today.  I’m almost done with it and will try to finish it during my lunch break.

I haven’t had much exposure to KVM yet, so over the weekend I decided to check it out on my laptop.  After playing around with it a bit, I needed to power on an instance of VMware Workstation, and the following error popped up on my screen …

The virtualization capability of your processor is already in use.  Disable any other running hypervisors before running VMware Workstation.

Well that makes sense.  So I uninstalled KVM and the issue was resolved … or so I thought.  This morning as I powered on another instance of VMware Workstation, I got the same error again.  Hmmmm.  That was a bit more confusing because, to my knowledge, KVM was completely removed from my system.  Again, so I thought.  But a quick look at the currently loaded kernel modules revealed both the kvm_intel and the kvm modules.

As it turns out, when you remove KVM via apt-get (meaning, this *could* be a debian / ubuntu issue, not sure if other package managers do the same thing), it doesn’t actually completely remove itself.  The kvm and kvm_intel modules not only remain, but they continue to get loaded upon startup.  When I removed the modules, my VMware Workstation powered on without issue.

So then, that voice inside of my head — the one I should NEVER listen to — said “I wonder what happens when you load the KVM modules after you’ve powered on your VM?”  I *knew* it could only lead to bad things, but I couldn’t help myself.  Guess what?  Not only did VMware Workstation completely freeze, but now I can’t power on my VM, no matter what I try.   Grrrr.  I swear, someday I’ll be a news headline that reads … An eyewitness confirms his last words were, “I wonder what this button does?”

Anyway, if you get this error, simply check for the kvm modules (lsmod | grep kvm should do the trick).  Simply removing the modules will fix the issue.
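For what it’s worth, here is a small sketch of that check.  The function name kvm_modules is hypothetical.  It reads lsmod output and prints any KVM modules it finds, listing the vendor module (kvm_intel, or kvm_amd on AMD hardware) before the base kvm module, since the vendor module depends on kvm and has to be unloaded first.

```shell
# kvm_modules: read lsmod output on stdin and print the loaded KVM modules,
# vendor module(s) first, base kvm module last.  Hypothetical helper.
kvm_modules() {
    awk '$1 == "kvm_intel" || $1 == "kvm_amd" { vendor = vendor $1 "\n" }
         $1 == "kvm" { base = $1 "\n" }
         END { printf "%s%s", vendor, base }'
}

# Usage (needs root); remember not to do this while a VM is running:
#   for m in $(lsmod | kvm_modules); do sudo rmmod "$m"; done
```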

Frequently customers have specific NICs (like onboard NICs) that they’d like assigned to the COS, leaving the other NICs for VM traffic.  This is difficult, however, when using our automated kickstart deployment scripts as there is no way to explicitly define the vmnic assigned to the COS.  And to make matters worse, the VMkernel is not yet available to us during the %post section of the kickstart script, which makes COS networking configuration difficult! Recently I had a customer who was getting frustrated because …

  1. They would “rack and stack” a physical server and wire up their NICs accordingly (i.e. onboard NICs on the management VLAN, remaining NICs on production VLANs)
  2. PXE boot the server
  3. When kickstart completed, they’d lose connection to the COS.

This happens because during installation, ESX just assigns vmnic0 to the lowest PCI number, and then assigns vmnic0 to the COS. And this is often not the NIC the admin wants used for their COS. Of course, they could go back after the fact and reconfigure the COS networking, but this kind of defeats the purpose of a completely hands-free, automated deployment.

Here is one possible solution to the problem.  Below is a script I wrote to append to the %post section of a kickstart file.  Obviously, you’ll need to make modifications for your environment.

## This script should be appended to the %post section of an ESX kickstart file.
## For more info on kickstart and scripted ESX installations, see Appendix B of
## http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_installation_guide.pdf

##
## Essentially, this is a "script that creates a script." Because the VMkernel is
## not yet available to us during the %post section of the scripted install, we use
## %post to generate a script called /tmp/esx_post_install.sh that will launch via
## rc.local upon first boot (and only first boot).
##
## The esx_post_install.sh will first make a backup copy of esx.conf and then
## reconfigure the COS networking.  Please see the in-line comments below for
## tweaking esx_post_install.sh for your environment.
##
## If you have any questions, please email aaron [at] sweemer [dot] com.

%post

cat > /tmp/esx_post_install.sh << EOF
#!/bin/bash
cp /etc/vmware/esx.conf /etc/vmware/esx.conf.backup
/usr/sbin/esxcfg-vswitch -U vmnic0 vSwitch0
/usr/sbin/esxcfg-vswif -d vswif0

## If your kickstart file has vmportgroup=1, you *might* want to uncomment the
## next line

## /usr/sbin/esxcfg-vswitch -D "VM Network"

/usr/sbin/esxcfg-vswitch -A "VMkernel" vSwitch0

## You’ll need to find which physical NICs you want assigned to your COS.  From
## the command line of an already installed ESX server, execute
## “/usr/sbin/esxcfg-nics -l” as root and look for something unique about the
## NICs.  For example, this could be the word “Broadcom” or it could be the
## actual PCI number.  In the next line, replace “search term” with this
## text.

/usr/sbin/esxcfg-nics -l | awk '\$0 ~ /search term/ {print \$1}' | xargs -n 1 /usr/sbin/esxcfg-vswitch vSwitch0 -L

## Note: if you want to test the line above from the command-line, you’ll need
## to remove the leading “\” in front of $0 and $1. The \’s need to be here so
## the esx_post_install.sh script gets properly written by kickstart. But when
## executing directly on a command line, the \’s need to be removed.

## Replace the x.x.x.x after -i with the IP address and after -n with the
## subnet mask for your COS.

/usr/sbin/esxcfg-vswif -a vswif0 -p "Service Console" -i x.x.x.x  -n x.x.x.x

## Replace the x.x.x.x after -i with the IP address and after -n with the subnet
## mask for your VMkernel port group.

/usr/sbin/esxcfg-vmknic -a -i x.x.x.x -n x.x.x.x VMkernel

## Replace x.x.x.x with the default gateway for the COS in both of the next two lines.
route add default gw x.x.x.x
echo "GATEWAY=x.x.x.x" >> /etc/sysconfig/network

mv /etc/rc.d/rc.local.save /etc/rc.d/rc.local
EOF

chmod +x /tmp/esx_post_install.sh
cp /etc/rc.d/rc.local /etc/rc.d/rc.local.save

cat >> /etc/rc.d/rc.local << EOF
cd /tmp/
/tmp/esx_post_install.sh
EOF

As an example, in my environment I have a server with 4 NICs and by default, ESX assigns vmnic0, which is mapped to PCI 02:00.00, to the service console. However, what is actually physically wired to my management network is vmnic3, which is mapped to PCI 02:03.00.  In the script above, I simply searched for the number 3 (i.e. replaced search term with 3) and now my scripted ESX installation works properly.
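Before baking a search term into the kickstart file, it’s worth checking exactly which NICs it will match.  Here is a sketch of a dry run against captured esxcfg-nics -l output (the sample below is from my server).  The function name nics_matching is hypothetical, and unlike the awk line in the kickstart script it does a plain substring match and skips the header row.

```shell
# nics_matching TERM: read `esxcfg-nics -l` style output on stdin and print
# the name (first column) of every NIC whose line contains TERM.
# Hypothetical helper; the header row is skipped.
nics_matching() {
    awk -v term="$1" 'NR > 1 && index($0, term) { print $1 }'
}

# Sample output captured from an installed host; this prints vmnic3.
nics_matching '02:03.00' <<'EOF'
Name    PCI      Driver      Link Speed    Duplex MTU    Description
vmnic1  02:01.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM
vmnic2  02:02.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM
vmnic3  02:03.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM
vmnic0  02:00.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM
EOF
```

A short term like 3 happens to work on this hardware, but the full PCI address is less likely to match a stray digit in the speed, MTU or description columns on other servers.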

Below is the configuration of my server before I redeployed with kickstart.  The NIC I want assigned to the COS is vmnic3 (PCI 02:03.00).  What ESX assigns to the COS by default is vmnic0, as shown in the vSwitch0 uplinks.

BEFORE (without %post section)


[root@vesx7 root]# esxcfg-nics -l
Name    PCI      Driver      Link Speed    Duplex MTU    Description
vmnic1  02:01.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM

vmnic2  02:02.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM
vmnic3  02:03.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM
vmnic0  02:00.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM


[root@vesx7 root]# esxcfg-vswitch -l
Switch Name    Num Ports   Used Ports  Configured Ports  MTU     Uplinks

vSwitch0       64          4           64                1500    vmnic0

PortGroup Name      VLAN ID  Used Ports  Uplinks
VM Network          0        0           vmnic0

Service Console     0        1           vmnic0

Now, here is the same output after I redeployed the server with my modifications to the %post section of the kickstart file. The scripted deployment of ESX now properly assigns vmnic3 to my service console.

AFTER (with %post section)

[root@vesx7 root]# esxcfg-nics -l
Name    PCI      Driver      Link Speed    Duplex MTU    Description
vmnic1  02:01.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM
vmnic2  02:02.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM
vmnic0  02:00.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM
vmnic3  02:03.00 e1000       Up   1000Mbps Full   1500   Intel Corporation 82545EM

[root@vesx7 root]# esxcfg-vswitch -l
Switch Name    Num Ports   Used Ports  Configured Ports  MTU     Uplinks

vSwitch0       64          5           64                1500    vmnic3

PortGroup Name      VLAN ID  Used Ports  Uplinks
Production          0        0           vmnic3

Service Console     0        1           vmnic3

I hope this was helpful.  Let me know if you have any questions.

Well, I’d better sign off and start packing because I leave for Omaha, NE in a few hours.



I’m trying to get these out faster.  But I’m finding it’s a pain to compile, tweak and reformat my notes so they make sense and look right on the blog.  Here’s the next in the series.

Objective 1.4 – Implement and manage Storage VMotion.

Knowledge

Describe Storage VMotion operation

I like to think of Storage VMotion as the “inverse” of VMotion.  Instead of moving (live) the front end (i.e. CPU, memory, and network) from one physical device to another, Storage VMotion moves (live) the back end (i.e. disk) from one physical device to another.

The following was taken directly from the VMware.com website and accurately describes how Storage VMotion works.

 

image

  1. Before moving disk files, Storage VMotion creates a new virtual machine home directory for the virtual machine in the destination datastore.

  2. Next, a new instance of the virtual machine is created. Its configuration is kept in the new datastore.
  3. Storage VMotion then creates a child disk for each virtual machine disk that is being moved to capture a copy of write activity, while the parent disk is in read only mode.
  4. The original parent disk is copied to the new storage location.
  5. The child disk is re-parented to the newly copied parent disk in the new location.
  6. When the transfer to the new copy of the virtual machine is completed, the original instance is shut down. Then, the original virtual machine home is deleted from VMware vStorage VMFS at the source location.

Explain implementation process for Storage VMotion

For ESX 3.5, Storage VMotion is a command-line only implementation.  There are third party GUI plugins to the VI client that I have used and work really well, but they are of course not supported by VMware.  And for the purposes of this post, I would imagine the VCDX exam will stick to VMware only supported implementations.

To execute a Storage VMotion, you’ll need the RCLI (Remote Command Line Interface) from VMware.com.  There is an RCLI for both Windows and Linux, and there’s an RCLI virtual appliance too.  Take your pick, and then be sure to review the Remote Command-Line Interface Installation and Reference Guide for more info on RCLI.  But specifically for Storage VMotion, the RCLI command comes in two flavors (from the guide):

 

To use the command in interactive mode, type svmotion --interactive. You are
prompted for all the information necessary to complete the storage migration.
When you invoke the command in interactive mode, all other parameters are
ignored.

In noninteractive mode, the svmotion command uses the following syntax:

svmotion [standard Remote CLI options] --datacenter=<datacenter name>
--vm <VM config datastore path>:<new datastore>
[--disks <virtual disk datastore path>:<new datastore>,
<virtual disk datastore path>:<new datastore>]

 

Identify Storage VMotion use cases

There are a number of reasons that I can think of where  you’d want to use Storage VMotion.  Some of the more obvious reasons (at least to me, anyway) would be:

  1. Array maintenance
  2. Migrating to newer or different (i.e. FC, iSCSI, NAS) hardware
  3. Adding new storage
  4. To achieve optimal distribution of storage consumption across LUNs
  5. To resolve storage bottlenecks and other performance issues

I’m sure if I thought about it some more, I could come up with a few more.  But I think this is a decent list.

Understand performance implications for Storage VMotion

There are a couple things that occur during a Storage VMotion that could affect performance, and therefore need to be considered when planning to move storage. 

  1. During the Storage VMotion, the VM actually does a “Self VMotion,” meaning the VM is VMotioned to the same host it’s already running on (review step #2 in the graphic above).  And therefore during this time, there is temporarily twice the amount of memory consumed.
  2. During the move, extra disk space is required on the source volume while all disk writes are redirected to the snapshot disk.
  3. All disk I/O for the copy (i.e. read from the source volume, write to the snapshot disk and write to the destination volume) is going through the VMkernel of the host.

 

Skills and Abilities

Use Remote CLI to perform Storage VMotion operations

  • Interactive mode
    This is pretty easy.  Just enter svmotion --interactive at the command line and then follow the prompts.
  • Non-interactive mode
    In my environment, the command to move all the virtual disks associated with aaron-corp-xp (the name of an actual VM I use) from vol1 to vol2 would look like this …

svmotion --url=https://10.10.8.60/sdk --username=aaron --password=<yeah right> --datacenter=cincylab --vm='[vol1] aaron-corp-xp/aaron-corp-xp.vmx:vol2'
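The fiddly part of the non-interactive form is the --vm argument, with its brackets, space and colon.  If you have a handful of VMs to move, a tiny helper like the sketch below can build that argument for you.  The function name svmotion_vm_arg is hypothetical, and it assumes the VM’s directory and .vmx file are both named after the VM, as they are for aaron-corp-xp.

```shell
# svmotion_vm_arg SRC_DATASTORE VM_NAME DST_DATASTORE
# Print the --vm argument for a non-interactive svmotion call.
# Hypothetical helper; assumes the layout "[datastore] name/name.vmx".
svmotion_vm_arg() {
    printf '%s=[%s] %s/%s.vmx:%s\n' '--vm' "$1" "$2" "$2" "$3"
}

svmotion_vm_arg vol1 aaron-corp-xp vol2
# prints --vm=[vol1] aaron-corp-xp/aaron-corp-xp.vmx:vol2

# Usage with the RCLI (quote the substitution so the space survives):
#   svmotion --url=https://10.10.8.60/sdk --username=aaron \
#       --datacenter=cincylab "$(svmotion_vm_arg vol1 aaron-corp-xp vol2)"
```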

Ugh.  My brain hurts.  I’ve spent the past few hours reviewing scripted ESX installations and working on a PowerShell script for a customer that will reorder vmnics after a scripted installation is complete (because I can’t find any other way to force their order during the install).  It’s been a few months since I’ve done a scripted installation, so I definitely needed a refresher.  Plus, according to the VMware Enterprise Administration Exam Blueprint v3.5, section 8.1 is all about automating ESX deployments.  The good news is that section 8.1 is the last section of the blueprint, so I believe I’m almost done preparing for the VCDX Admin Exam, which I’m scheduled to take in a few days. 

Anyway, going back to the beginning of the Blueprint, and continuing from where I left off, here is the next section of my study notes.

Objective 1.3 – Troubleshoot Virtual Infrastructure storage components.

Knowledge

Identify storage related events and log entries.  Analyze storage events to determine related issues.

All storage related events will be recorded in the /var/log/vmkernel log file.  Most of the messages in this log file are fairly cryptic and can be difficult to interpret.  Furthermore, this log file contains all messages from the vmkernel, not just storage related messages, so you’ll have to filter through it.  An easy way to do this is simply to search for SCSI.  For example, the command cat /var/log/vmkernel | grep SCSI on one of my servers produces the following output (only showing the last 10 lines) …

[root@cincylab-esx3 root]# cat /var/log/vmkernel | grep SCSI | tail -10
May 18 12:57:47 cincylab-esx3 vmkernel: 2:16:01:55.748 cpu2:1069)iSCSI: login phase for session 0x8603f90 (rx 1071, tx 1070) timed out at 23051576, timeout was set for 23051576
May 18 12:57:47 cincylab-esx3 vmkernel: 2:16:01:55.748 cpu2:1071)iSCSI: session 0x8603f90 connect timed out at 23051576
May 18 12:57:47 cincylab-esx3 vmkernel: 2:16:01:55.748 cpu2:1071)<5>iSCSI: session 0x8603f90 iSCSI: session 0x8603f90 retrying all the portals again, since the portal list got exhausted
May 18 12:57:47 cincylab-esx3 vmkernel: 2:16:01:55.748 cpu2:1071)iSCSI: session 0x8603f90 to iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1 waiting 1 seconds before next login attempt
May 18 12:57:48 cincylab-esx3 vmkernel: 2:16:01:56.748 cpu2:1071)iSCSI: bus 0 target 0 trying to establish session 0x8603f90 to portal 0, address 10.10.8.200 port 3260 group 1
May 18 12:58:00 cincylab-esx3 vmkernel: 2:16:02:09.355 cpu2:1071)iSCSI: bus 0 target 0 established session 0x8603f90 #3, portal 0, address 10.10.8.200 port 3260 group 1
May 18 12:58:01 cincylab-esx3 vmkernel: VMWARE SCSI Id: Supported VPD pages for vmhba35:C0:T0:L0 : 0x0 0x80 0x83
May 18 12:58:01 cincylab-esx3 vmkernel: VMWARE SCSI Id: Device id info for vmhba35:C0:T0:L0: 0x1 0x1 0x0 0x18 0x42 0x55 0x46 0x46 0x41 0x4c 0x4f 0x0 0x0 0x0 0x0 0x0 0x1 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x2 0x0 0x0 0x0
May 18 12:58:01 cincylab-esx3 vmkernel: VMWARE SCSI Id: Id for vmhba35:C0:T0:L0 0x20 0x20 0x20 0x20 0x56 0x49 0x52 0x54 0x55 0x41
[root@cincylab-esx3 root]#

If you look closely, I clearly had some issues with my iSCSI appliance a few hours ago.  I decided to make some configuration changes to the switch and then, all of a sudden, the ESX server lost connectivity to its storage.  Weird! :)

Anyway, what does all this mean?  There’s a really good VMworld Europe 2008 presentation (which you can get from www.vmworld.com) titled VI3 Advanced Log Analysis, which goes into detail about how to interpret VMware log files.  From that presentation, I found this diagram which describes the components of a message in the vmkernel log file.

image
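If you’d rather not eyeball these lines, here is a minimal Python sketch that splits a vmkernel log line into its parts.  The layout is inferred from the sample lines above, and reading the second timestamp as host uptime and the number after cpuN as the world ID is my interpretation, so treat the field names as assumptions rather than an official format definition.

```python
import re

# Layout assumed from the sample lines in this post:
#   <syslog timestamp> <host> vmkernel: [<uptime> cpu<N>:<world>)]<message>
# The uptime/cpu/world part is optional because some lines omit it.
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\svmkernel:\s"
    r"(?:(?P<uptime>\d+:\d{2}:\d{2}:\d{2}\.\d{3})\s"
    r"cpu(?P<cpu>\d+):(?P<world>\d+)\))?"
    r"(?P<message>.*)$"
)


def parse_vmkernel_line(line):
    """Split one vmkernel log line into named fields, or return None."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


sample = (
    "May 18 12:57:47 cincylab-esx3 vmkernel: 2:16:01:55.748 "
    "cpu2:1071)iSCSI: session 0x8603f90 connect timed out at 23051576"
)
fields = parse_vmkernel_line(sample)
print(fields["uptime"], fields["cpu"], fields["world"], fields["message"])
```

Lines without the uptime and cpu prefix (the VMWARE SCSI Id lines, for example) still parse; those fields simply come back as None.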

 

Skills and Abilities

Verify storage configuration and troubleshoot storage connection issues using CLI , VI Client and logs

  • Rescan events
    A rescan event can be initiated with the esxcfg-rescan command at the command line.  The output should look like the following …

    [root@cincylab-esx3 root]# esxcfg-rescan vmhba32
    Rescanning vmhba32 …
    On scsi1, removing: 0:0.
    On scsi1, adding: 0:0.
    Done.
    [root@cincylab-esx3 root]# cat /var/log/vmkernel | grep SCSI | tail -3
    May 18 19:13:09 cincylab-esx3 vmkernel: VMWARE SCSI Id: Supported VPD pages for vmhba32:C0:T0:L0 : 0x0 0x80 0x83
    May 18 19:13:10 cincylab-esx3 vmkernel: VMWARE SCSI Id: Device id info for vmhba32:C0:T0:L0: 0x2 0x0 0x0 0x18 0x4c 0x69 0x6e 0x75 0x78 0x20 0x41 0x54 0x41 0x2d 0x53 0x43 0x53 0x49 0x20 0x73 0x69 0x6d 0x75 0x
    May 18 19:13:10 cincylab-esx3 vmkernel: VMWARE SCSI Id: Id for vmhba32:C0:T0:L0 0x36 0x52 0x58 0x36 0x4a 0x39 0x39 0x58 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x20 0x47 0x42 0x30 0x31 0x36 0x
    [root@cincylab-esx3 root]#


  • Failover events
    I don’t have redundant paths in my lab to simulate this.  So, again from the VMworld presentation, VI3 Advanced Log Analysis, here is a screen shot from the slide that covers this topic.

image

There will obviously be a lot of different types of error and event messages in /var/log/vmkernel.  And I’m certainly not going to try and list every possible combination here.  I highly suggest you download the VMworld preso because it does a great job of explaining how to further decipher the log files (like defining SCSI error codes). 

Well, that’s about it for this section.  Back to PowerShell scripting for another hour or so.

Last week I was in San Francisco with most (maybe all) of the VMware field technical folks for a three day technical summit.  One of the evenings we had an awards ceremony and dinner.  And guess what?  The first eight VCDX certifications ever to be awarded were announced.

Now, VMware is a pretty big company, so I didn’t recognize seven of the eight names.  But I definitely recognized one of them.  Well, that is, I should say I recognized his name.  Having never officially met him face to face, I couldn’t pick him out of a crowd of two.  You might know him as the rock star blogger from Yellow Bricks.  Congratulations Duncan Epping!  I believe he said he is VCDX number seven and the first VCDX in Europe.  Very cool.  And well deserved, for sure.  :)

I’m a little behind Duncan.  I’m scheduled to take the Admin Exam later this month, which is the first of two exams.  Then I’ll have to present and defend a successful design and deployment before a jury … er, I mean, panel of my peers.

Anyway, here are my notes for section 1.2 of the VMware Enterprise Administration Exam Blueprint v3.5. (Section 1.1 can be found here).  Everything in Blue is a direct cut and paste from the exam blueprint.

Objective 1.2 – Implement and manage complex data security and replication configurations.

Knowledge

Describe methods to secure access to virtual disks and related storage devices

  • Distributed Lock Handling

    vmfs_dfl
    In the graphic below, notice how each ESX server sees and has access to the same LUN? This is achieved via VMFS, a clustered file system which leverages distributed file locking to allow multiple hosts to access the same storage.  When a Virtual Machine is powered on, VMFS places a lock on its files, ensuring no other ESX server can access them.

Identify tools and steps necessary to manage replicated VMFS volumes

  • Resignaturing
    First, there’s a really good article on VMFS resignaturing by Duncan (go figure).  Also, Chad Sakac over at Virtual Geek has a great article too.  I’m not going to reinvent the wheel, so make sure you read their posts.  You’ll need to understand this.  For the exam, you’ll certainly need to know the following …

    The following is from the Fibre Channel SAN Configuration Guide:

EnableResignature=0, DisallowSnapshotLUN=1 (default)
In this state:

  • You cannot bring snapshots or replicas of VMFS volumes by the array into the ESX Server host regardless of whether or not the ESX Server has access to the original LUN.
  • LUNs formatted with VMFS must have the same ID for each ESX Server host.

EnableResignature=1, (DisallowSnapshotLUN is not relevant)
In this state, you can safely bring snapshots or replicas of VMFS volumes into the same servers as the original and they are automatically resignatured.

    EnableResignature=0, DisallowSnapshotLUN=0
    This is similar to ESX 2.x behavior.  In this state, the ESX Server assumes that it sees only one replica or snapshot of a given LUN and never tries to resignature. This is ideal in a DR scenario where you are bringing a replica of a LUN to a new cluster of ESX Servers, possibly on another site that does not have access to the source LUN. In such a case, the ESX Server uses the replica as if it is the original.

Do not use this setting if you are bringing snapshots or replicas of a LUN into a server
with access to the original LUN. This can have destructive results including:

  • If you create snapshots of a VMFS volume one or more times and dynamically
    bring one or more of those snapshots into an ESX Server, only the first copy is
    usable. The usable copy is most likely the primary copy. After reboot, it is
    impossible to determine which volume (the source or one of the snapshots) is
    usable. This nondeterministic behavior is dangerous.
  • If you create a snapshot of a spanned VMFS volume, an ESX Server host might
    reassemble the volume from fragments that belong to different snapshots. This can
    corrupt your file system.
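To keep the three combinations straight for the exam, I find it helps to write them down as a tiny decision function.  This is only a study aid that restates the guide text above in Python; the function name and the three return labels are my own and not anything ESX exposes.

```python
def lvm_snapshot_behavior(enable_resignature, disallow_snapshot_lun):
    """Summarize how the host treats snapshot/replica LUNs for a setting pair.

    Return labels (hypothetical, for study purposes only):
      "resignature"       - copies are resignatured and can sit beside the original
      "reject"            - snapshots/replicas cannot be brought into the host
      "mount-as-original" - the copy is used as if it were the original LUN
    """
    if enable_resignature == 1:
        # DisallowSnapshotLUN is not relevant in this state.
        return "resignature"
    if disallow_snapshot_lun == 1:
        # EnableResignature=0, DisallowSnapshotLUN=1 is the default.
        return "reject"
    # EnableResignature=0, DisallowSnapshotLUN=0: only safe when the host
    # cannot see the source LUN (the DR scenario described above).
    return "mount-as-original"


for pair in [(0, 1), (1, 0), (1, 1), (0, 0)]:
    print(pair, lvm_snapshot_behavior(*pair))
```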

Skills and Abilities

Configure storage network segmentation

  • FC Zoning
    Zoning delivers access control in the SAN, restricting visibility to devices in the zone solely to other members of that zone.  It is a common technique used to do things like group ESX servers into production/test/dev, increase security and decrease traffic, among other things.
  • iSCSI/NFS VLAN
    Storage segmentation for IP storage can be accomplished in one of two ways:  VLANs or physical segmentation (i.e. separate layer 2 switches for storage).

Configure LUN masking

The Disk.MaskLUNs parameter should be used when you’re trying to hide specific LUNs from your ESX host.  This is a useful option when you don’t want your ESX server to access a particular LUN, but are unwilling (or unable) to configure your FC switch.

To configure LUN masking in the VI Client go to Configuration –> Advanced Settings for the host you want to configure. You’ll find the Disk.MaskLUNs parameter under the section Disk.  It looks like this in my VI Client.

disk.maskluns

Enter a value in the following format … <adapter>:<target>:<comma separated LUN range list>. Be sure to rescan when you’re done and verify the mask has been properly applied.
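Typing the LUN list by hand is an easy place to make a mistake, so here is a small Python sketch that builds one entry in that format from a list of LUN numbers.  It assumes consecutive LUNs are written as a start-end range (for example 6-8); the helper name and the sample adapter are purely illustrative.

```python
def mask_luns_entry(adapter, target, luns):
    """Build one '<adapter>:<target>:<LUN range list>' entry from LUN numbers."""
    ordered = sorted(set(luns))
    if not ordered:
        raise ValueError("at least one LUN is required")

    # Collapse consecutive LUN numbers into (start, end) ranges.
    ranges = []
    start = prev = ordered[0]
    for lun in ordered[1:]:
        if lun == prev + 1:
            prev = lun
            continue
        ranges.append((start, prev))
        start = prev = lun
    ranges.append((start, prev))

    # Assumption: a range is written "start-end", a single LUN as just its number.
    parts = [str(a) if a == b else f"{a}-{b}" for a, b in ranges]
    return f"{adapter}:{target}:{','.join(parts)}"


print(mask_luns_entry("vmhba1", 0, [4, 6, 7, 8]))
```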

Use esxcfg-advcfg
This one’s easy.  Just use the man page (type “man esxcfg-advcfg” at the command prompt).  It’ll tell you everything you need to know :)

Set Resignaturing and Snapshot LUN options
So, following along with the man page above, here is a cut and paste from my server …


[asweemer@cincylab-esx3 config]$ su -
Password:
[root@cincylab-esx3 root]# esxcfg-advcfg -s 0 /LVM/EnableResignature
Value of EnableResignature is 0
[root@cincylab-esx3 root]# esxcfg-advcfg -s 1 /LVM/EnableResignature
Value of EnableResignature is 1
[root@cincylab-esx3 root]#
[root@cincylab-esx3 root]# esxcfg-advcfg -s 0 /LVM/DisallowSnapshotLun
Value of DisallowSnapshotLun is 0
[root@cincylab-esx3 root]# esxcfg-advcfg -s 1 /LVM/DisallowSnapshotLun
Value of DisallowSnapshotLun is 1
[root@cincylab-esx3 root]#


Manage RDMs in a replicated environment
RDMs can be created via the CLI with the following command …

vmkfstools -r /vmfs/devices/disks/vmhbaX:Y:Z:0 my-vm.vmdk

With the -r option, the RDM is created in Virtual Compatibility Mode.  Should you need and/or prefer Physical Compatibility Mode, you can create the mapping with vmkfstools -z instead, or edit the VMDK descriptor file and change the createType value to vmfsPassthroughRawDeviceMap.

Use proc nodes to identify driver configuration and options
The proc filesystem is a pseudo filesystem; it’s not “real.”  It consumes no storage space and is used to access process information from the kernel.  You’ll find quite a bit of valuable data and configuration options in the many subdirectories of /proc/vmware/config.  Here’s a quick example from my ESX server …


[asweemer@cincylab-esx3 LVM]$ pwd
/proc/vmware/config/LVM
[asweemer@cincylab-esx3 LVM]$ ls
DisallowSnapshotLun  EnableResignature
[asweemer@cincylab-esx3 LVM]$ cat EnableResignature
EnableResignature (Enable Volume Resignaturing) [0-1: default = 0]: 0
[asweemer@cincylab-esx3 LVM]$
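If you want to pull these values into a script, the single line each node prints is easy to pick apart.  The sketch below is a minimal Python parser based only on the one line shown above, so it assumes integer-valued nodes in exactly that layout; other nodes may well look different.

```python
import re

# Layout assumed from the sample above:
#   <Name> (<description>) [<min>-<max>: default = <default>]: <value>
NODE_RE = re.compile(
    r"^(?P<name>\S+) \((?P<description>.*)\) "
    r"\[(?P<min>-?\d+)-(?P<max>-?\d+): default = (?P<default>-?\d+)\]: "
    r"(?P<value>-?\d+)$"
)


def parse_config_node(text):
    """Parse the contents of one /proc/vmware/config node into a dict."""
    match = NODE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized config node: {text!r}")
    fields = match.groupdict()
    for key in ("min", "max", "default", "value"):
        fields[key] = int(fields[key])
    return fields


node = parse_config_node(
    "EnableResignature (Enable Volume Resignaturing) [0-1: default = 0]: 0"
)
print(node["name"], node["value"], "(default %d)" % node["default"])
```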


Use esxcfg-module

Just like esxcfg-advcfg, use the man page.


Since I was in a meeting during the launch of vSphere 4 on April 22nd, and since I found myself wide awake at 3AM, I decided to watch the recording of the webcast early this morning. And as I was watching, I heard Steve Herrod (VMware CTO) make the following statement …

… So if you’re an existing customer today and you have a 100 host deployment using our vi3.5 product, simply upgrading the software will save you $2 Million dollars a year …

Wow, that’s pretty powerful.  In an age when words like costly, frustration, and BSOD’s (Blue Screen of Death) are often associated with software upgrades, it’s no wonder many companies are taking an “if it ain’t broke, don’t fix it” approach.  But here’s a software upgrade that, if for no other reason, should be considered purely for economic reasons.

How can VMware make such a bold claim?  Steve’s statement was based on the following efficiencies you’ll achieve with vSphere4 (over and above what you’re already seeing with VI3):

30% Greater Consolidation

Most people know that you’ll get the greatest VM density with VMware due to superior technologies like memory over commitment and Distributed Resource Scheduling.  But did you know that VM density is a critical metric when determining TCO?  There’s a great blog post over at VMware:  Virtual Reality which goes into detail.  But the following graphic sums it up pretty well …

cost_per_vm

Simply put, with VMware you’ll have fewer physical servers to buy, fewer network and storage connections, less floor space and less power and cooling to support your virtual infrastructure … AND you’ll have superior functionality like VMotion, DRS, HA, etc.

If you’re an existing VMware customer, then you are already benefiting from the efficiencies afforded to you by VI3.  And upgrading to vSphere4 is going to give you even greater efficiencies, allowing you to achieve an even greater VM density, as well as capture a greater number of high I/O applications that were previously considered non VM candidates.  The following table taken from the webcast summarizes these performance improvements in vSphere 4.

vsphere_performance

50% Storage Savings

There are over 150 new features in vSphere 4.  One of the more exciting features is Thin Provisioning.  This feature is already included in VMware’s virtual desktop offering, VMware View, and I have a blog post about storage savings with View if you’re looking for more technical detail.  But for this post, know that the technology has been applied to vSphere 4 and allows for significant storage savings.

Basically, Thin Provisioning allows for the VM to consume no more space than the data requires.  So, for example, if you have a VM with a 100G virtual drive but only 20G of data within the virtual drive, then only 20G will actually be consumed.  When applied across all your VMs, you’ll achieve economies of scale and you’ll likely see a 50% reduction in storage, if not more.
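The arithmetic is simple enough to check yourself.  Here is a quick Python sketch; the second set of VM sizes is made up purely for illustration, and the actual savings depend entirely on how full your virtual disks are.

```python
def thin_savings(vms):
    """Given a list of (provisioned_gb, used_gb) pairs, return
    (total provisioned, total used, percent saved by thin provisioning)."""
    provisioned = sum(p for p, _ in vms)
    used = sum(u for _, u in vms)
    return provisioned, used, 100.0 * (provisioned - used) / provisioned


# The single VM from the example above: 100G drive, 20G of data.
print(thin_savings([(100, 20)]))

# A hypothetical mix of a mostly empty VM and a mostly full one.
print(thin_savings([(100, 20), (100, 80)]))
```

The first call reports an 80% saving for that one VM, while the mixed pair comes out at 50%.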

20% Power Savings with Distributed Power Management

What is Distributed Power Management (DPM)?  Steve Herrod calls it “VM Tetris” or “Server Defrag,” which I thought was clever.  During low server utilization, DPM will intelligently VMotion workloads down to the smallest number of acceptable physical servers and then power off the unused servers.  As traffic increases during peak hours, DPM will power on the servers and again redistribute the workloads with VMotion.

Distributed Power Management isn’t new as it was introduced over a year ago in VI3.  However, up until vSphere 4, this feature was only experimentally supported.  And with the lack of full support in VI3, I don’t believe many customers actually used DPM.  But VMware supports DPM in vSphere 4, assuming your hardware has IPMI, WOL or iLO.  And it can deliver significant savings in your power and cooling costs.  Plus you get the added bonus of doing your part to save the environment.

Here is an awesome video some of the VMware engineers created showing DPM in action …

At this point, you’re probably asking the following questions …

  • Sounds great, but how much is it going to cost me?  Nothing.  Your software maintenance covers like-for-like upgrades.  So, if you have VI3 Enterprise, then you can upgrade to vSphere 4 Enterprise at no additional cost.
  • Is it a difficult process to upgrade?  Will it require massive configuration changes?  Nope.  The upgrade is actually rather simple.  I upgraded my three lab VI3 servers to vSphere 4 in under an hour with no downtime of any of my VMs.  Basically, Update Manager handled just about everything for me.
  • Do I get anything else with vSphere?  Heck yeah!  Remember, there are 150 new features in vSphere 4, which I’m sure I’ll address in future posts.  I only addressed the ones that will save you money.

So let’s see if I can summarize this properly … a zero cost, easy upgrade = 30% Greater Consolidation + 50% Storage Savings + 20% Power Savings.  To me, that’s a no brainer.  What other software company in the world offers that kind of value?

I finally got a chance to sit down and reformat some of my notes for the VCDX Admin Exam.  Below are my notes for Section 1.1 of the VMware Enterprise Administration Exam Blueprint v3.5.  Everything in Blue is a direct cut and paste from the exam blueprint.

Oh, and thanks to VirtualizationTeam (Blog) for the Disqus comment letting me know that Peter van den Bosch has a more recent version of his VMware Enterprise Administration Exam Study Guide 3.5.

 

Section 1 – Storage

Objective 1.1 – Create and Administer VMFS datastores using advanced techniques.

Knowledge

Describe how to identify iSCSI, Fibre channel, SATA and NFS configurations using CLI commands and log entries

Here are a few command line examples that I believe would work well …

1)  esxcfg-mpath -l
This command produces the following output on my server:

 

[root@cincylab-esx3 root]# esxcfg-mpath -l

Disk vmhba0:0:0 /dev/sdb (152627MB) has 1 paths and policy of Fixed

Local 0:31.2 vmhba0:0:0 On active preferred

Disk vmhba32:0:0 /dev/sda (152627MB) has 1 paths and policy of Fixed

Local 0:31.2 vmhba32:0:0 On active preferred

Disk vmhba35:0:0 /dev/sdc (923172MB) has 1 paths and policy of Fixed

iScsi sw iqn.1998-01.com.vmware:cincylab-esx3-1d029e5f<->iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1 vmhba35:0:0 On active preferred
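On a host with a lot of LUNs this listing gets long, so here is a minimal Python sketch that turns it into a list of dictionaries.  The pattern is based only on the output above (ESX 3.5 on my server), the path detail lines are kept as raw text, and the key names are my own.

```python
import re

# One "Disk ..." header per LUN, as shown in the output above.
DISK_RE = re.compile(
    r"^Disk (?P<lun>\S+) (?P<device>\S+) \((?P<size_mb>\d+)MB\) "
    r"has (?P<paths>\d+) paths and policy of (?P<policy>.+)$"
)


def parse_mpath(output):
    """Parse 'esxcfg-mpath -l' text into a list of per-disk dicts."""
    disks = []
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = DISK_RE.match(line)
        if match:
            disk = match.groupdict()
            disk["size_mb"] = int(disk["size_mb"])
            disk["paths"] = int(disk["paths"])
            disk["path_lines"] = []  # raw path lines belonging to this disk
            disks.append(disk)
        elif disks:
            disks[-1]["path_lines"].append(line)
    return disks


SAMPLE = """\
Disk vmhba0:0:0 /dev/sdb (152627MB) has 1 paths and policy of Fixed
 Local 0:31.2 vmhba0:0:0 On active preferred
Disk vmhba35:0:0 /dev/sdc (923172MB) has 1 paths and policy of Fixed
 iScsi sw iqn.1998-01.com.vmware:cincylab-esx3-1d029e5f<->iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1 vmhba35:0:0 On active preferred
"""

for disk in parse_mpath(SAMPLE):
    print(disk["lun"], disk["device"], disk["size_mb"], disk["policy"])
```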


2)  esxcfg-info -s

The -s flag will narrow the scope of the output to just storage and disk related info.  But even with the narrowed scope, this command produces way too much output to be displayed here.  You’ll likely want to pipe the output into grep, or at a minimum into more or less, to get what you’re looking for.


3)  cat /var/log/vmkernel | grep vmhba | tail -10

This will search the vmkernel log file and display the last 10 lines containing the text vmhba.  If you want more (or fewer) lines, change the -10 to whatever suits your needs.

I found this one particularly useful when you’ve enabled the software iSCSI initiator at the command line, but don’t yet know what number has been assigned to the vmhba (e.g. vmhba35).

4)  esxcfg-vmhbadevs -m  and  ls -lah /vmfs/volumes

The command esxcfg-vmhbadevs -m will show the mapping between vmhba numbers, device files and their UUIDs.  If you’d like a quick and easy way to see which UUIDs are mapped to which human readable names, you can follow that up with an ls -lah /vmfs/volumes.  The two commands back to back produce the following output on my server:

[root@cincylab-esx3 root]# esxcfg-vmhbadevs -m
vmhba35:0:0:1   /dev/sdc1                        4986310d-6525e5e6-ebbd-00237d0681e7
vmhba0:0:0:3    /dev/sdb3                        49e115fb-3e22358c-c10a-00237d0681e7
vmhba32:0:0:1   /dev/sda1                        4985c53e-e7b1904f-5042-00237d0681e7

 

[root@cincylab-esx3 root]# ls -lah /vmfs/volumes/
total 10M
drwxr-xr-x    1 root     root          512 Apr 20 23:07 .
drwxrwxrwt    1 root     root          512 Apr 11 18:12 ..
drwxr-xr-t    1 root     root         1.2K Feb  1 21:34 4985c53e-e7b1904f-5042-00237d0681e7
drwxr-xr-t    1 root     root         3.7K Apr 14 14:49 4986310d-6525e5e6-ebbd-00237d0681e7
drwxr-xr-t    1 root     root          980 Apr 11 18:13 49e115fb-3e22358c-c10a-00237d0681e7
lrwxr-xr-x    1 root     root           35 Apr 20 23:07 cincylab-esx3:storage1 -> 4985c53e-e7b1904f-5042-00237d0681e7
lrwxr-xr-x    1 root     root           35 Apr 20 23:07 cincylab-esx3:storage2 -> 49e115fb-3e22358c-c10a-00237d0681e7
lrwxr-xr-x    1 root     root           35 Apr 20 23:07 vol1 -> 4986310d-6525e5e6-ebbd-00237d0681e7
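Matching UUIDs by eye gets old fast, so here is a small Python sketch that joins the two listings into a label to device mapping.  It works on pasted text rather than running the commands, and it assumes datastore labels contain no spaces, which holds on my server but may not on yours.

```python
def parse_vmhbadevs(output):
    """Return {uuid: (vmhba_name, device_file)} from 'esxcfg-vmhbadevs -m' text."""
    mapping = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3:
            vmhba, device, uuid = parts
            mapping[uuid] = (vmhba, device)
    return mapping


def parse_volume_links(ls_output):
    """Return {uuid: label} from the symlink lines of 'ls -lah /vmfs/volumes'."""
    labels = {}
    for line in ls_output.splitlines():
        if line.startswith("l") and " -> " in line:
            left, uuid = line.rsplit(" -> ", 1)
            labels[uuid.strip()] = left.split()[-1]
    return labels


def label_devices(vmhbadevs_output, ls_output):
    """Return {label: (vmhba_name, device_file)}, falling back to the UUID."""
    labels = parse_volume_links(ls_output)
    devices = parse_vmhbadevs(vmhbadevs_output)
    return {labels.get(uuid, uuid): dev for uuid, dev in devices.items()}


VMHBADEVS = """\
vmhba35:0:0:1   /dev/sdc1                        4986310d-6525e5e6-ebbd-00237d0681e7
vmhba0:0:0:3    /dev/sdb3                        49e115fb-3e22358c-c10a-00237d0681e7
vmhba32:0:0:1   /dev/sda1                        4985c53e-e7b1904f-5042-00237d0681e7
"""

LS_OUTPUT = """\
drwxr-xr-t    1 root     root         1.2K Feb  1 21:34 4985c53e-e7b1904f-5042-00237d0681e7
lrwxr-xr-x    1 root     root           35 Apr 20 23:07 cincylab-esx3:storage1 -> 4985c53e-e7b1904f-5042-00237d0681e7
lrwxr-xr-x    1 root     root           35 Apr 20 23:07 cincylab-esx3:storage2 -> 49e115fb-3e22358c-c10a-00237d0681e7
lrwxr-xr-x    1 root     root           35 Apr 20 23:07 vol1 -> 4986310d-6525e5e6-ebbd-00237d0681e7
"""

for label, (vmhba, device) in sorted(label_devices(VMHBADEVS, LS_OUTPUT).items()):
    print(label, vmhba, device)
```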

5)  vmkiscsi-ls

This one only applies to iSCSI storage, of course, and produces the following output on my server:

[root@cincylab-esx3 root]# vmkiscsi-ls

*************************************************************
        SFNet iSCSI Driver Version … 3.6.3 (27-Jun-2005 )
*************************************************************
TARGET NAME             : iqn.2004-08.jp.buffalo:TS-IGLA68-001D7315AA68:vol1
TARGET ALIAS            :
HOST NO                 : 4
BUS NO                  : 0
TARGET ID               : 0
TARGET ADDRESS          : 10.10.8.200:3260
SESSION STATUS          : ESTABLISHED AT Sun Apr 12 11:35:09 2009
NO. OF PORTALS          : 1
PORTAL ADDRESS 1        : 10.10.8.200:3260,1
SESSION ID              : ISID 00023d000001 TSIH 1400
*************************************************************


Describe the VMFS file system

There are many subsections here and before digging into each one, check out the following three links …

Metadata 
The simple definition of Metadata is “data about data.”  All file systems handle metadata differently.  VMFS uses metadata, stored in a special area of each volume, to manage all the files, directories (in VMFS-3 only), and attributes about the volume.  VMFS is a clustered file system, meaning more than one ESX server can access the same file system at the same time.  Therefore an update to the metadata requires locking of the LUN using a SCSI reservation.

Multi-access and locking
The following was taken from Advanced VMFS Configuration and Troubleshooting.

  Distributed Lock handling by VMFS3

  • Done in-band
  • Hosts mount a VMFS3 volume
  • Hosts’ ids posted to heartbeat region
  • Heartbeat records are updated at regular intervals by hosts
  • Host X locks a file, the lock is associated with its ID
  • If host X dies or loses access to volume the file lock is stale
  • Host Z attempts to lock the same file which is locked
  • Host Z check the heartbeat record of Host X (~5 times)
  • If host X heartbeat record is not updated, Host Z will age the lock
  • All other hosts yield to host Z and not attempt to lock the file
  • Lock is broken and Host Z acquires the lock
  • Journal is replayed by Host Z
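To convince myself I understood that sequence, I sketched it as a toy simulation in Python.  This is emphatically not how VMFS stores anything on disk: the Volume class, the counters and the function names are all made up, the wait between heartbeat checks is replaced by a callback, and the journal replay step is left out.  It only models the "check the owner’s heartbeat about 5 times, then break the stale lock" logic from the list above.

```python
class Volume:
    """Toy stand-in for the shared state hosts see on a VMFS3 volume."""

    def __init__(self):
        self.heartbeats = {}  # host id -> heartbeat counter
        self.locks = {}       # file name -> host id holding the lock

    def mount(self, host):
        # Mounting posts the host's id to the heartbeat region.
        self.heartbeats[host] = 0

    def beat(self, host):
        # A live host updates its heartbeat record at regular intervals.
        self.heartbeats[host] += 1


def try_lock(volume, host, filename, checks=5, between_checks=None):
    """Return True if `host` ends up holding the lock on `filename`."""
    owner = volume.locks.get(filename)
    if owner is None or owner == host:
        volume.locks[filename] = host
        return True

    seen = volume.heartbeats[owner]
    for _ in range(checks):
        if between_checks is not None:
            between_checks()  # stands in for waiting one heartbeat interval
        if volume.heartbeats[owner] != seen:
            return False      # owner is alive, so the lock is still valid

    # The heartbeat never moved: treat the lock as stale and break it.
    volume.locks[filename] = host
    return True


vol = Volume()
vol.mount("X")
vol.mount("Z")
try_lock(vol, "X", "vm.vmdk")

# Host X is alive and keeps updating its heartbeat, so Z is refused.
print(try_lock(vol, "Z", "vm.vmdk", between_checks=lambda: vol.beat("X")))

# Host X has died (no more heartbeats), so Z breaks the stale lock.
print(try_lock(vol, "Z", "vm.vmdk"))
print(vol.locks["vm.vmdk"])
```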

 Extents
Extents are logical extensions of a file system.  They are typically used to grow a volume beyond the VMFS size limitations.  Essentially, an extent is the “joining” of two or more volumes into a single, logical VMFS volume.

Tree structure and files
The vmfs partition is mounted to the directory with the corresponding UUID found in /vmfs/volumes.  The human readable name of the volume is merely a symbolic link to that directory.  By default, all VMs are given a directory at the root of the partition.  So, for example, a VM with the name of AaronSweemer would have the directory /vmfs/volumes/UUID/AaronSweemer.  In this directory you will find all files specific and relevant to that VM.  This is the default behavior as some (not all) of these files can be configured to reside elsewhere.

Here is a table of common files found on the VMFS file system. 

Extension   Usage
.dsk        VM disk file
.vmdk       VM disk file
.hlog       VMotion log file
.vswp       Virtual swap file
.vmss       VM suspend file
.vmtd       VM template disk file
.vmtx       VM template configuration file
.REDO       Files used when VM is in REDO mode
.vmx        VM configuration file
.log        VM log file
.nvram      Nonvolatile RAM

Journaling
From Wikipedia …

A journaling file system is a file system that logs changes to a journal (usually a circular log in a dedicated area) before committing them to the main file system. Such file systems are less likely to become corrupted in the event of power failure or system crash.

Explain the process used to align VMFS partitions 

The following procedure was found in VMware Enterprise Administration Exam study guide 3.5 (page 5) and Advanced VMFS Configuration and Troubleshooting (slide 36).

Aligned partitions start at 128. If the Start value is 63 (the default), the partition is
not aligned. If you choose not to use the VI Client and create partitions with
vmkfstools, or if you want to align the default installation partition before use, take
the following steps to use fdisk to align a partition manually from the ESX Server
service console:
1. Enter fdisk /dev/sd<x> where <x> is the device suffix.
2. Determine if any VMware VMFS partitions already exist. VMware VMFS
partitions are identified by a partition system ID of fb. Type d to delete
these partitions.
Note: This destroys all data currently residing on the VMware VMFS partitions you
delete.
3. Ensure you back up this data first if you need it.
4. Type n to create a new partition.
5. Type p to create a primary partition.
6. Type 1 to create partition No. 1.
Select the defaults to use the complete disk.
7. Type t to set the partition’s system ID.
8. Type fb to set the partition system ID to fb (VMware VMFS volume).
9. Type x to go into expert mode.
10. Type b to adjust the starting block number.
11. Type 1 to choose partition 1.
12. Type 128 to set it to 128 (the array’s stripe element size).
13. Type w to write label and partition information to disk.
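Why 128 and not 63?  The quick Python check below shows the arithmetic.  It assumes 512-byte sectors and a 64KB boundary, which is what a starting block of 128 works out to; if your array uses a different stripe element size, plug that in instead.

```python
SECTOR_BYTES = 512  # assumption: classic 512-byte sectors


def is_aligned(start_sector, boundary_bytes=64 * 1024):
    """True if the partition's starting byte offset falls on the boundary."""
    return (start_sector * SECTOR_BYTES) % boundary_bytes == 0


for start in (63, 128):
    print(start, start * SECTOR_BYTES, is_aligned(start))
```

A start of 63 puts the partition at byte 32256, in the middle of a 64KB chunk, while 128 lands at byte 65536, exactly on the boundary.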

Explain the use cases for round-robin load balancing

Multipathing is typically used for failover.  Meaning, if one storage path becomes unavailable, the host can fail over to an alternate path.  However, multipathing can also be used in a round-robin fashion to balance load and achieve better utilization of the HBAs.  There are a couple of different configurable options that specify when an ESX server switches paths.  From the Round-Robin Load Balancing technical note …

When to switch – Specify that the ESX Server host should attempt a path switch after a specified number of I/O blocks have been issued on a path or after a specified number of read or write commands have been issued on a path. If another path exists that meets the specified path policy for the target, the active path to the target is switched to the new path. The --custom-max-commands and --custom-max-blocks options specify when to switch.

Which target to use – Specify that the next path should be on the preferred target, the most recently used target, or any target. The --custom-target-policy option specifies which target to use.

Which HBA to use – Specify that the next path should be on the preferred HBA, the most recently used HBA, the HBA with the minimum outstanding I/O requests, or any HBA. The --custom-HBA-policy option specifies which HBA to use.

Skills and Abilities

Perform advanced multi-pathing configuration

  • Configure multi-pathing policy
  • Configure round-robin behavior using command-line tools
  • Manage active and inactive paths

Setting the Path Switching Policy
You can set the path-switching policy for failover and for load balancing by using the esxcfg-mpath command.

You can set the path switching policy on a per-LUN basis by using the esxcfg-mpath command’s --policy custom option. If you specify --policy custom, you must also specify one of the custom policy options. Because the path switching policy is set on a per-LUN basis, you must always specify the LUN using the --lun option.

Notes

If you set both the custom-max-blocks and custom-max-commands options, the system attempts to switch paths as soon as one of the limits is reached.

If you set the target or the HBA policy to preferred, the system chooses the target or the HBA of the preferred path when possible. If a preferred policy is set on an active/passive SAN array, and the preferred target is not on the active SP (Storage Processor), the system does not select the preferred target but a target on the active SP.

Path switching is not performed if an outstanding SCSI reservation is on the target, or if a path failover is underway. Path switching is delayed until an I/O request is performed when no reservations or path failovers are pending.
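The "switch as soon as one of the limits is reached" rule is easier to see with a little simulation.  This Python sketch is only a model of that one rule: it rotates through paths in a fixed order and ignores the target and HBA policies, SCSI reservations and failover entirely.  The path names and I/O sizes are invented for the example.

```python
from itertools import cycle


def assign_paths(io_sizes, paths, max_commands=None, max_blocks=None):
    """Return the path used for each I/O, switching when a limit is reached.

    io_sizes     -- size of each I/O in blocks, in issue order
    max_commands -- switch after this many commands on a path (None = no limit)
    max_blocks   -- switch after this many blocks on a path (None = no limit)
    """
    rotation = cycle(paths)
    current = next(rotation)
    commands = blocks = 0
    used = []
    for size in io_sizes:
        used.append(current)
        commands += 1
        blocks += size
        hit_commands = max_commands is not None and commands >= max_commands
        hit_blocks = max_blocks is not None and blocks >= max_blocks
        if hit_commands or hit_blocks:
            # Whichever limit trips first triggers the switch.
            current = next(rotation)
            commands = blocks = 0
    return used


# One big I/O trips the block limit; three small ones trip the command limit.
print(assign_paths([32, 4, 4, 4], ["pathA", "pathB"], max_commands=3, max_blocks=16))
```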

 

 Configure and use NPIV HBAs

<<I don’t have NPIV in my lab.  Need to revisit this section>>

 
Manage VMFS file systems using command-line tools

The command line tool you’ll use for managing VMFS file systems is vmkfstools.  It’s a very powerful tool and there are many options available, so I suggest you read the man page.  The following examples (taken from the online documentation) are certainly not exhaustive, just a quick sample of what the tool can do.

Example for Creating a VMFS File System
vmkfstools -C vmfs3 -b 1m -S my_vmfs /vmfs/devices/disks/vmhba1:3:0:1

Example for Extending a VMFS-3 Volume
vmkfstools -Z /vmfs/devices/disks/vmhba0:1:2:1 /vmfs/devices/disks/vmhba1:3:0:1

Upgrading a VMFS-2 to VMFS-3
-T --tovmfs3 -x --upgradetype [zeroedthick|eagerzeroedthick|thin]

Example for Creating a Virtual Disk
vmkfstools -c 2048m /vmfs/volumes/myVMFS/rh6.2.vmdk

Example for Cloning a Virtual Disk
vmkfstools -i /vmfs/volumes/templates/gold-master.vmdk /vmfs/volumes/myVMFS/myOS.vmdk

 

 Configure NFS datastores using command-line tools

Assuming your NAS is configured properly, this is pretty easy.  The following command will mount an NFS datastore on an ESX host …

esxcfg-nas -a -o 10.10.8.25 -s /nfs/share NAS

In this example, the -a flag adds a datastore, the -o flag gives the IP address of the NAS host, the -s flag gives the share exported by that host, and the final argument (NAS) is the label for the new datastore.  Upon successfully adding the datastore, the NFS mount will be found at /vmfs/volumes/NAS.

The following command will remove the datastore

esxcfg-nas -d -o 10.10.8.25 NAS

 
Configure iSCSI hardware and software initiators using command-line tools

I don’t know if I’ve seen an official, formal example of how to do this (though I’m sure it exists somewhere).  So, here’s how I do it …

Step 1:  Add the portgroup to vSwitch0 
esxcfg-vswitch --add-pg=VMkernel vSwitch0

Step 2:  Add the IP to the VMkernel portgroup
esxcfg-vmknic -a -i 10.10.8.202 -n 255.255.255.0 VMkernel

Step 4:  Enable iSCSI
esxcfg-swiscsi -e

Step 5:  Add the target
vmkiscsi-tool -D -a 10.10.8.200 vmhba34

Step 6:  Rescan the HBA
esxcfg-rescan vmhba34
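If you do this on more than one host, it can help to generate the commands rather than retype them.  The Python sketch below only builds the command strings listed above from a few parameters and prints them; it does not run anything, and the function and parameter names are my own.

```python
def software_iscsi_commands(vswitch, portgroup, vmk_ip, netmask, target_ip, hba):
    """Return the service console commands from the steps above, in order."""
    return [
        f"esxcfg-vswitch --add-pg={portgroup} {vswitch}",           # add the portgroup
        f"esxcfg-vmknic -a -i {vmk_ip} -n {netmask} {portgroup}",   # add the VMkernel IP
        "esxcfg-swiscsi -e",                                        # enable software iSCSI
        f"vmkiscsi-tool -D -a {target_ip} {hba}",                   # add the target
        f"esxcfg-rescan {hba}",                                     # rescan the HBA
    ]


for command in software_iscsi_commands(
    vswitch="vSwitch0",
    portgroup="VMkernel",
    vmk_ip="10.10.8.202",
    netmask="255.255.255.0",
    target_ip="10.10.8.200",
    hba="vmhba34",
):
    print(command)
```

Review the printed commands before pasting them into a console, since the vmhba number in particular can differ from host to host.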

 

That’s it for section 1.1 … time to go reformat my notes for section 1.2!