Welcome to a Danish Virtualization blog! Thoughts, comments and tips and tricks on Virtualization topics are provided to you by Heino Skov and Nicolai Sandager.
The Virtual Troll
A virtualization blog!
On this blog we will post comments, thoughts, ideas, tips and tricks around virtualization topics. We may also discuss other topics and we hope you will enjoy it and feel free to leave a comment.
VMware vCenter and SNMP monitoring support with Sysorb
I recently had a case at a customer site, where they wanted to use Sysorb to monitor their VMware vSphere environment. I wrote this article, as I couldn’t find any useful information on Sysorb and VMware vSphere together.
SNMP was configured on the ESX servers and the VMware vSphere MIBs were imported into the Sysorb tool.
However, trying the same for the vCenter server didn’t give the expected results. I configured some vCenter alarms as a test to verify that SNMP traps were sent out. The Alarms and Event tab in vCenter stated that the SNMP trap was sent, but none were received in Sysorb.
I then installed Trap Receiver from trapreceiver.com and verified through this tool that the alarms were actually sent out by the vCenter server and received.
I wasn’t familiar with how Sysorb worked, so I had to read up on how it uses SNMP to monitor servers and network devices. I found that Sysorb does NOT support SNMP traps; it only works through the SNMP GET function. Basically, this means that Sysorb polls the target for status and compares the result against thresholds/triggers preconfigured in Sysorb.
Since vCenter itself monitors its configured alarms/thresholds/triggers and sends out an SNMP trap when one fires, the two systems would not work properly together.
I looked in the VMware vSphere documentation for SNMP GET support, and it stated that SNMP GET is not available on vCenter. VMware vCenter only supports SNMP traps for sending out alarms, which also makes perfect sense to me.
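The mismatch can be sketched with a toy model in Python. This is not real SNMP and uses no network; the class names, the `get` method and the placeholder OID string are all hypothetical, chosen only to show why a poll-only monitor never sees alarms from a sender that only pushes traps.

```python
class TrapOnlyAgent:
    """Stand-in for vCenter as described in the post: pushes alarms, answers no GETs."""

    def __init__(self):
        self.receivers = []

    def register_receiver(self, callback):
        # A trap receiver subscribes by handing over a callback.
        self.receivers.append(callback)

    def raise_alarm(self, message):
        # Push model: the agent decides when to notify.
        for callback in self.receivers:
            callback(message)

    def get(self, oid):
        # Pull model is not supported, so every GET comes back empty.
        return None


class TrapReceiver:
    """Stand-in for a trap receiver tool: it only listens."""

    def __init__(self):
        self.received = []

    def on_trap(self, message):
        self.received.append(message)


class PollingMonitor:
    """Stand-in for a GET-only monitor: it only asks."""

    def __init__(self, agent):
        self.agent = agent
        self.seen = []

    def poll(self, oid):
        value = self.agent.get(oid)
        if value is not None:
            self.seen.append(value)
        return value


if __name__ == "__main__":
    agent = TrapOnlyAgent()
    receiver = TrapReceiver()
    monitor = PollingMonitor(agent)
    agent.register_receiver(receiver.on_trap)

    agent.raise_alarm("test alarm")
    monitor.poll("hypothetical.oid")  # placeholder, not a real OID

    print("receiver got:", receiver.received)
    print("poller got:", monitor.seen)
```

The receiver ends up with the alarm while the poller stays empty, which mirrors what the trap receiver tool and Sysorb showed in this case.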
If anyone has built a workaround to get this to work, please let me know.
Fix: HP Virtual Connect Flex-10 - ESX 4.0 U1 in an Active/Active Configuration does Not Failover Using SmartLink
I’ve been troubleshooting an ESX implementation on a solution based on an HP c7000 blade enclosure, which had two HP Virtual Connect interconnect modules built in. The blade servers used were HP ProLiant BL460c G6. I checked the VMware HCL and noticed some requirements to get this to work on vSphere 4.0 Update 1.
Output from VMware HCL:
Notice that ESX 4.0 U1 is supported but there are a couple notes. One is to install a specific driver - esx40-net-bnx2x_400.1.48.107-1.0.4. I downloaded and installed the driver on all the ESX servers.
Now I wanted to test the failover. I had already set up a continuous ping. In the HP Virtual Connect Manager I disabled the Shared Uplink Set for Bay 2 and verified that I still had connectivity to both the service console and the VMs running on the hosts through interconnect bay 1. The test was successful.
Then I switched around: I enabled Shared Uplink Set 2 and disabled Shared Uplink Set 1 for Bay 1. This time I lost connectivity to both the service console and the ESX hosts, and even though I waited a couple of minutes, it never came back up. I had one other blade server in that enclosure running Microsoft Windows, and I didn’t have any problems connecting to it.
So I suspected that the ESX configuration was the cause.
After I had verified all settings on the ESX hosts, HP Virtual Connect and the physical switches, which were all configured identically for both interconnect bays, I decided to call HP Support about this issue. I was referred to a public advisory stating that HP Virtual Connect Flex-10 - ESX 4.0 U1 in an Active/Active Configuration does Not Failover Using SmartLink.
The solution is the following three action points:
- Verify that the firmware on HP Virtual Connect is version 2.30 as a minimum. This setup was running 2.32 (the newest version).
- Verify that the NIC driver is Broadcom NetXtreme II Ethernet Network Controller driver 1.52.12.v40.3 (minimum) for ESX/ESXi 4.0. This was different from what the VMware HCL stated.
- Verify that the NC532i/m bootcode is version 5.0.11 (minimum). The bootcode on the NC532 was NOT up to date on the blades.
I updated both the NIC driver in ESX and the NIC bootcode with the HP Firmware Maintenance CD and after a reboot, failover was working just as expected. It is recommended by HP to update the bootcode after the NIC driver is installed on the ESX server.
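The minimum-version checks above can be sketched in Python. This is only an illustration under assumptions: the helper names are hypothetical, only the leading dotted-number part of a version string is compared (so `1.52.12.v40.3` is treated as 1.52.12), and the installed driver version 1.48.107 is read from the package name mentioned earlier in this post.

```python
import re


def numeric_prefix(version):
    """Return the leading dotted-integer part of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if not match:
        raise ValueError("no leading version number in %r" % version)
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    a = numeric_prefix(installed)
    b = numeric_prefix(minimum)
    # Pad the shorter tuple with zeros so 2.3 and 2.3.0 compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    # Virtual Connect firmware: 2.32 installed, 2.30 required.
    print("firmware ok:", meets_minimum("2.32", "2.30"))
    # NIC driver: HCL-listed 1.48.107 installed, 1.52.12.v40.3 required.
    print("driver ok:", meets_minimum("1.48.107", "1.52.12.v40.3"))
```

The bootcode check would work the same way against the 5.0.11 minimum once the installed version has been read from each blade.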
I have NOT been able to find the public advisory on the HP website as it relates to VMware vSphere, hence this article.
Error while trying to virtualise an ESX server as a virtual machine on a physical ESX
Recently there have been numerous blog posts about how to virtualise an ESX server as a virtual machine on top of a physical ESX host. This is particularly useful for testing purposes.
Check some of these blogposts out:
VMware ESX 4 can even virtualize itself by Eric Gray at www.vcritical.com:
http://www.vcritical.com/2009/05/vmware-esx-4-can-even-virtualize-itself/
VMware vSphere in a box by Hany Michael at www.hypervizor.com:
http://www.hypervizor.com/2009/07/vsphere-in-a-box-a-virtual-private-cloud-blueprint/
http://www.hypervizor.com/2009/07/vsphere-in-a-box-part2-putting-the-pieces-all-together/
http://www.hypervizor.com/2009/07/vsphere-in-a-box-part-3-the-lab-manager-40-automation/
The last link above even shows how to utilize VMware Lab Manager to deploy multiple vSphere in a box LABs. That is exactly what I’m trying to achieve in this environment.
Moving forward, I ran into problems with virtualising the ESX servers. I couldn’t even install ESX or ESXi in the VM. I got the following error: "Kernel panic - not syncing: No supported microcode levels for this stepping of AMD Family…."
I was quite sure my physical ESX hosts were running a newer processor, that can be used to run a virtual ESX. My setup is like this:
2x HP DL385G5 Servers with AMD Barcelona processors
2x HP DL385G5p Servers with AMD Barcelona processors
I verified that AMD-V was enabled in the BIOS settings. I tried several of the .vmx tweaks used in the past for VMware Workstation or ESX 3.5. None of them allowed me to install ESX in the VM.
I am running different ESX hosts in my cluster, so I had enabled EVC on the cluster to allow VMotion between the different CPU models. I tried disabling EVC and bingo, now it worked like a charm.
Installing vSphere 4 Update 1 works without any adjustments in the vmx file. Next step is to see if a VM can be run on top of the virtual ESX server.
Quite a few updates from VMware incl. vSphere 4.0 Update 1 and View 4.0
View 4 available for download among other updates…
The long-awaited View 4 is available for download. The earlier version did not support vSphere 4.0; this new version does. It requires VMware vSphere 4.0 Update 1, which is now also available.
So for everyone who was running View 3.1 in a proof of concept, which required VI 3.5, it is now possible to upgrade the VMware platform to vSphere 4.0 Update 1, which was also released this week.
So there have been quite a few updates from VMware this week:
VMware vSphere 4.0 Update 1 available for both ESX and vCenter
http://www.vmware.com/support/vsphere4/doc/vsp_vc40_u1_rel_notes.html
http://www.vmware.com/support/vsphere4/doc/vsp_esx40_u1_rel_notes.html
http://www.vmware.com/support/vsphere4/doc/vsp_esxi40_u1_rel_notes.html
VMware Data Recovery 1.1 available:
http://www.vmware.com/support/vdr/doc/vdr_110_releasenotes.html
VMware vSphere PowerCLI 4.0 Update 1 available:
http://www.vmware.com/support/developer/windowstoolkit/wintk40u1/windowstoolkit40U1-200911-releasenotes.html
VMware vSphere CLI 4.0 Update 1 available:
http://www.vmware.com/support/developer/vcli/vcli401/vcli_401_relnotes.html
VMware View 4.0 available:
http://www.vmware.com/support/view40/doc/releasenotes_viewmanager40.html
VMware ThinApp 4.0.4 available:
http://www.vmware.com/support/thinapp4/doc/releasenotes_thinapp404.html
I linked to the release notes because it is important to READ them before using the software.
VMs NIC disconnects after a VMotion
I had a customer call me the other day about a weird problem: some VMs disconnected their NIC after a VMotion.
It was not possible to connect the NIC again.
The customer was in the process of upgrading to vSphere and used DRS/VMotion to put ESX hosts into maintenance mode so they could be upgraded.
After looking in log files, we saw the following errors:
————Edited/output ————
6 10:19:21 esx1 vmkernel: 0:20:07:45.197 cpu3:6256)Net: 1317: can't connect device: LAN: Out of resources
6 10:24:03 esx1 vmkernel: 0:20:12:27.267 cpu3:6256)Net: 1317: can't connect device: LAN: Out of resources
——————————————-
This led us to check for how many ports the vSwitch was configured to use:
esxcfg-vswitch -l
————Edited/output ————
[root@esx1 ~]# esxcfg-vswitch -l
Switch Name    Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch0       32          32          32                1500    vmnic0
  PortGroup Name      VLAN ID  Used Ports  Uplinks
  LAN                 0        29          vmnic0
  Service Console     0        1           vmnic0
——————————————-
As we can see, they have a LAN port group on vSwitch0 together with the Service Console. The default vSwitch is configured with 32 ports, and because one ESX host was in maintenance mode, the vSwitch on this host didn’t have enough available ports for the VMs that were migrated to it.
As a result, the NIC in the VM was disconnected and we couldn’t reconnect it.
Any additionally created vSwitch is created with 64 ports by default.
So this leads me to the point. First, this is a design issue and should never happen in a properly designed environment. However, I do think that DRS/VMotion should raise a warning if a vSwitch is full, instead of migrating the VM and disconnecting its NIC.
This setup would also cause problems in a VMware HA scenario: if no ports are free, the restarted VMs would also end up with disconnected NICs.
If you use vSwitch0 for VM connectivity, make sure to raise the number of ports from the default of 32 to whatever is needed.
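A check for full vSwitches could be scripted against the `esxcfg-vswitch -l` output. The Python sketch below is an illustration, not a VMware tool: the function names are hypothetical, and it assumes the output layout shown in this post, with a "Switch Name" header line followed by one row per vSwitch that lists at least one uplink.

```python
SAMPLE = """\
Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch0 32 32 32 1500 vmnic0
PortGroup Name VLAN ID Used Ports Uplinks
LAN 0 29 vmnic0
Service Console 0 1 vmnic0
"""


def parse_vswitches(text):
    """Extract name, total ports and used ports for each vSwitch row."""
    switches = []
    expect_switch_row = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Switch Name"):
            # The next non-empty line is the data row for this vSwitch.
            expect_switch_row = True
            continue
        if expect_switch_row:
            # Split from the right so a name containing spaces stays intact.
            name, num, used, _configured, _mtu, _uplinks = line.rsplit(None, 5)
            switches.append(
                {"name": name, "num_ports": int(num), "used_ports": int(used)}
            )
            expect_switch_row = False
    return switches


def full_switches(switches, min_free=1):
    """Return the names of vSwitches with fewer than min_free free ports."""
    return [
        s["name"]
        for s in switches
        if s["num_ports"] - s["used_ports"] < min_free
    ]


if __name__ == "__main__":
    for name in full_switches(parse_vswitches(SAMPLE)):
        print("no free ports on", name)
```

Raising `min_free` to the number of VMs that could land on a host during maintenance mode or an HA restart turns this into a simple headroom check.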
Hypervisor Competitive Differences – VMware in the lead
One of the sessions at VMworld that I missed was Chris Wolf’s DC15 - Hypervisor Competitive Differences: Beyond the Data Sheet. Looking back, this is one of the sessions I really would have loved to watch.
Chris Wolf is an analyst at Burton Group who, together with a couple of other analysts, has spent three months developing the industry’s first hypervisor evaluation criteria and vendor scorecards.
The Burton Group announcement can be read here.
They break the evaluation criteria up into three categories:
- Required
- Preferred
- Optional
A short 15-minute overview of the required features that Burton Group thinks should be available within the hypervisor product or ecosystem is available here.
You have to register to watch it, and I would recommend doing so.
During VMworld a couple of other articles were posted commenting on Chris’ VMworld presentation and the results of the evaluation criteria and scorecards. These can be found here:
VMware outshines Hyper-V, et al. in hypervisor comparison
Hypervisors Compared: VMware Stands Alone
Now Chris Wolf has responded to the two articles with a new blog post addressing some of the comments posted and the feedback he got.
Hypervisor Competitive Differences - The Aftermath
I can’t wait to read the full report, coming the first week of April, with the complete evaluation criteria and the justification for each. Burton Group will follow this report up by releasing product profiles on Citrix XenServer, Microsoft Windows Server 2008 Hyper-V, Virtual Iron, and VMware vSphere 4.0.
I just want to thank Chris and Burton Group for this work. It will be really nice to have a full, objective comparison that is vendor neutral. Of course, in each virtualization case you will still have to go over the requirements of that particular case.
Updated: VMware ESX/VC 3.5 Update 2 - Bug or Feature within EVC / CPU Masking
Last week I wrote about this bug-or-new-feature observation after Update 2. To read my first post about this, click here.
I have been in touch with VMware to find out if this is a bug or a new feature. They confirmed there has been a change to VMotion CPU constraints.
Based on my experience, VMware is a little more flexible on CPU constraints than earlier. To sum this up:
If you have two almost identical CPUs and you power up VMs on the “oldest” CPUs, these VMs can migrate to the new CPU and back. However, if you power up a VM on the newer CPU (enabling the new features), that VM cannot VMotion to the older hosts.
One thing to keep in mind, though: even though you can avoid the manual work of configuring CPU masking on every VM, DRS and/or VMware HA would over time power on VMs on all hosts, and those powered on on the new hosts would then have problems doing VMotion.
The solution to this is to use the new EVC (Enhanced VMotion Compatibility) feature. This knowledge base article lists the CPUs that are supported for EVC: Enhanced VMotion Compatibility (EVC) processor support
VMware ESX/VC 3.5 Update 2 - Bug or Feature within EVC / CPU Masking?
A customer of mine had issues with CPU masking after setting up a new ESX host. They had specifically ordered the hardware to match the other 3 hosts in that cluster, so it was the same model both server and CPU wise.
VMotion worked just fine between all hosts as long as the VM was powered on, on one of the three oldest hosts.
But when they powered on a VM on the new host - this VM could not be migrated with VMotion to the other 3 hosts. It came up with the usual Host CPU is incompatible error.
I analyzed the /proc/cpuinfo file in the service console to determine the differences between the hosts. The new host was CPU stepping 11 compared to the older ones, which were stepping 6, and the new host had the NX flag enabled. The results were:
/proc/cpuinfo file | Older hosts | New host
Processor | 0 | 0
Vendor_id | GenuineIntel | GenuineIntel
CPU Family | 6 | 6
Model | 15 | 15
Model Name | Intel(R) Xeon(R) CPU 5150 @ 2.66GHz | Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
Stepping | 6 | 11
CPU MHz | 2660.059 | 2660.063
Cache Size | 4096 KB | 4096 KB
Fdiv_bug | No | No
Hlt_bug | No | No
F00f_bug | No | No
Coma_bug | No | No
Fpu | Yes | Yes
Fpu_exception | Yes | Yes
Cpuid level | 10 | 10
Wp | Yes | Yes
Flags (older hosts) | fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss tm lm
Flags (new host) | fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss tm nx lm
Bogomips | 5308.41 | 5308.41
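The two flag lists in the table above are easy to misread by eye. A minimal Python sketch of the comparison follows; the function name is hypothetical, and the flag strings are copied from the table.

```python
OLD_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat "
    "pse36 clflush dts acpi mmx fxsr sse sse2 ss tm lm"
)
NEW_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat "
    "pse36 clflush dts acpi mmx fxsr sse sse2 ss tm nx lm"
)


def flag_diff(old, new):
    """Return (flags only on the new host, flags only on the old host)."""
    old_set = set(old.split())
    new_set = set(new.split())
    return sorted(new_set - old_set), sorted(old_set - new_set)


if __name__ == "__main__":
    only_new, only_old = flag_diff(OLD_FLAGS, NEW_FLAGS)
    print("only on new host:", only_new)
    print("only on old hosts:", only_old)
```

For real hosts, the same function can be fed the `flags` line from each host's /proc/cpuinfo.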
Then I checked to see if the new Enhanced VMotion Compatibility (EVC) was enabled. But this option was grayed out (disabled) – because the three oldest hosts were not compatible with EVC.
Before Update 2, when you had VMotion CPU compatibility issues, VMotion didn’t work in either direction. So something changed in Update 2. In the past you had to manually enter the correct CPU bits into each VM’s configuration file for VMotion to work.
But in this case I could VMotion all VMs between all hosts, back and forth, as long as the VM was started on one of the three older hosts.
I noticed that for the VMs powered on on the oldest hosts, CPU masking bits had been entered into the vmx file. This was done automatically by the system!
This was added to the vmx file of VMs powered on on the older hosts:
cpuid.1.eax = "xxxx------------xx--------------"
cpuid.1.ecx = "R----R--R-RRRR-0-----------H-R--"
cpuid.1.edx = "---------------------------T----"
cpuid.80000001.eax.amd = "xxxx------------xx--------------"
cpuid.80000001.ecx.amd = "------------------RR-RR-RRR-0---"
cpuid.80000001.edx = "----R---------------H-----------"
cpuid.80000001.edx.amd = "-----R--------------H------T----"
cpuid.1.ecx.amd = "R-------R-------------------R---"
hostCPUID.0 = "0000000a756e65476c65746e49656e69"
guestCPUID.0 = "0000000a756e65476c65746e49656e69"
userCPUID.0 = "0000000a756e65476c65746e49656e69"
hostCPUID.1 = "000006f6000208000004e3bdbfebfbff"
guestCPUID.1 = "000006f800010800000022110febbbff"
userCPUID.1 = "000006f6000208000004e3bdbfebfbff"
hostCPUID.80000001 = "00000000000000000000000120000000"
guestCPUID.80000001 = "00000000000000000000000120000000"
userCPUID.80000001 = "00000000000000000000000120000000"
evcCompatibilityMode = "FALSE"
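The hostCPUID values can be decoded to cross-check them against /proc/cpuinfo. The Python sketch below assumes that each 32-digit hex string is the four CPUID registers EAX, EBX, ECX and EDX concatenated in that order. That layout is my reading and not something VMware documents here, although it does reproduce the vendor string, cpuid level, family, model and stepping of the older hosts. The function names are hypothetical; the bit positions for NX (bit 20) and LM (bit 29) in EDX of leaf 0x80000001 are the standard x86 ones.

```python
# Values copied from the vmx file above, split into 8-digit groups for readability.
HOST_CPUID_0 = "0000000a" "756e6547" "6c65746e" "49656e69"
HOST_CPUID_1 = "000006f6" "00020800" "0004e3bd" "bfebfbff"
HOST_CPUID_80000001 = "00000000" "00000000" "00000001" "20000000"


def split_registers(hex128):
    """Split a 32-hex-digit string into (eax, ebx, ecx, edx). Order is an assumption."""
    if len(hex128) != 32:
        raise ValueError("expected 32 hex digits")
    return tuple(int(hex128[i:i + 8], 16) for i in range(0, 32, 8))


def vendor_string(leaf0):
    """CPUID leaf 0 stores the vendor name in EBX, EDX, ECX (in that order)."""
    _eax, ebx, ecx, edx = split_registers(leaf0)
    raw = b"".join(reg.to_bytes(4, "little") for reg in (ebx, edx, ecx))
    return raw.decode("ascii")


def family_model_stepping(leaf1):
    """Decode the display family, model and stepping from EAX of CPUID leaf 1."""
    eax = split_registers(leaf1)[0]
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    if family in (6, 15):
        model += ((eax >> 16) & 0xF) << 4  # extended model
    if family == 15:
        family += (eax >> 20) & 0xFF  # extended family
    return family, model, stepping


def has_edx_bit(leaf, bit):
    """True if the given bit is set in the EDX register of the leaf."""
    edx = split_registers(leaf)[3]
    return bool(edx & (1 << bit))


if __name__ == "__main__":
    print("cpuid level:", split_registers(HOST_CPUID_0)[0])
    print("vendor:", vendor_string(HOST_CPUID_0))
    print("family/model/stepping:", family_model_stepping(HOST_CPUID_1))
    print("NX:", has_edx_bit(HOST_CPUID_80000001, 20))
    print("LM:", has_edx_bit(HOST_CPUID_80000001, 29))
```

Under that assumption the host values describe a family 6, model 15, stepping 6 Intel CPU without NX, which matches the older hosts in the table.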
So - I’m just wondering if this is a new feature in Update 2 VI 3.5. It is not described anywhere (at least I have not found any documentation on it).
After some thought I concluded that this is indeed a nice feature to have (if you don’t have the possibility to turn on EVC). Just make sure to power on your VMs on the old servers, and the CPU bits needed to VMotion to the new host are entered automatically. It is still a little weird, since EVC was disabled, but it removes all the work of manually entering the CPU bits on every VM.
My concern though is that manually editing the CPU bits hasn’t been supported by VMware… so is this supported?
So is this by design or is it a bug (given that EVC was disabled)?
VMware ESX and ESXi Comparison
VMware has just released an updated VMware ESX and ESXi Comparison:
Capability: Service Console
VMware ESX: Service Console is a standard Linux environment through which a user has privileged access to the VMware ESX kernel. This Linux-based privileged access allows you to highly customize your environment by installing agents and drivers and executing scripts and other Linux-environment code.
VMware ESXi: VMware ESXi is designed to make the server a computing appliance. Accordingly, VMware ESXi behaves more like firmware than traditional software. To provide hardware-like security and reliability, VMware ESXi does not support a privileged access environment like the Service Console of VMware ESX. To enable interaction with agents, VMware has provisioned CIM Providers through which monitoring and management tasks – traditionally done through Service Console agents – can be performed. VMware has provisioned RCLI to allow the execution of scripts.
Capability: Remote CLI
VMware ESX: VMware ESX Service Console has a host CLI command through which VMware ESX can be configured. ESX 3.5 Update 2 supports RCLI.
VMware ESXi: VMware ESX Service Console CLI has been ported to a Remote CLI (RCLI) for VMware ESXi. RCLI is a virtual appliance that interacts with VMware ESXi hosts to enable host configuration through scripts or specific commands.
Note: RCLI is limited to read-only access for the free version of VMware ESXi. To enable full functionality of RCLI on a VMware ESXi host, the host must be licensed with VI Foundation, VI Standard, or VI Enterprise.
The following Service Console CLI commands have not been implemented in RCLI:
Capability: Scriptable Installation
VMware ESX: VMware ESX supports scriptable installations through utilities like KickStart.
VMware ESXi: VMware ESXi Installable does not support scriptable installations in the manner ESX does, at this time. VMware ESXi does provide support for post installation configuration script using RCLI-based configuration scripts.
Capability: Serial Cable Connectivity
VMware ESX: VMware ESX supports interaction through direct-attached serial cable to the VMware ESX host.
VMware ESXi: VMware ESXi does not support interaction through direct-attached serial cable to the VMware ESXi host at this time.
Capability: SNMP
VMware ESX: VMware ESX supports SNMP.
VMware ESXi: VMware ESXi supports SNMP when licensed to a VI Foundation, VI Standard, or VI Enterprise edition. The free version of VMware ESXi does not support SNMP.
Capability: Active Directory Integration
VMware ESX: VMware ESX supports Active Directory integration through third-party agents installed on the Service Console.
VMware ESXi: VMware ESXi with a Virtual Infrastructure license and in conjunction with VirtualCenter allows users to be authenticated via Active Directory. In this configuration, users can log in directly to an ESXi host and authenticate using a local username and password.
The free version of VMware ESXi does not support Active Directory integration at this time.
Capability: HW Instrumentation
VMware ESX: Service Console agents provide a range of HW instrumentation on VMware ESX.
VMware ESXi: VMware ESXi provides HW instrumentation through CIM Providers. Standards-based CIM Providers are distributed with all versions of VMware ESXi. VMware partners may inject their own proprietary CIM Providers in customized versions of VMware ESXi. To obtain a customized version of VMware ESXi, you typically have to purchase a server with embedded VMware ESXi through a server vendor.
At this time, HP also offers its customized VMware ESXi Installable on www.vmware.com. Dell and IBM will soon offer their customized version of VMware ESXi on www.vmware.com.
Remote console applications like Dell DRAC, HP iLO, and IBM RSA are supported with ESXi.
Note: COS agents have a longer lineage than CIM Providers and are therefore more mature. VMware is actively working with its 250+ partners to close the CIM Provider–Service Console agent gap.
Capability: Software Patches and Updates
VMware ESX: VMware ESX software patches and upgrades behave like traditional Linux based patches and upgrades. The installation of a software patch or upgrade may require multiple system boots as the patch or upgrade may have dependencies on previous patches or upgrades.
VMware ESXi: VMware ESXi patches and updates behave like firmware patches and updates. Any given patch or update is all-inclusive of previous patches and updates. That is, installing patch version “n” includes all updates included in patch versions n-1, n-2, and so forth.
Capability: VI Web Access
VMware ESX: VMware ESX supports managing your virtual machines through VI Web Access. You can use the VI Web Access to connect directly to the ESX host or to the VMware Infrastructure Client.
VMware ESXi: VMware ESXi does not support web access at this time.
Capability: Licensing
VMware ESX: VMware ESX hosts can be licensed as part of a VMware Infrastructure 3 Foundation, Standard, or Enterprise suite.
VMware ESXi: VMware ESXi hosts can be individually licensed (for free) or licensed as part of a VMware Infrastructure 3 Foundation, Standard, or Enterprise suite.
Individually licensed ESXi hosts offer a subset of management capabilities (see SNMP and Remote CLI).
Feature | ESXi (ESX not available without VI) | VI Foundation (with ESX or ESXi) | VI Standard (with ESX or ESXi) | VI Enterprise (with ESX or ESXi)
Core hypervisor functionality | Yes | Yes | Yes | Yes
Virtual SMP | Yes | Yes | Yes | Yes
VMFS | Yes | Yes | Yes | Yes
VirtualCenter Agent | - | Yes | Yes | Yes
Update Manager | - | Yes | Yes | Yes
Consolidated Backup | - | Yes | Yes | Yes
High Availability | - | - | Yes | Yes
VMotion | - | - | - | Yes
Storage VMotion | - | - | - | Yes
DRS | - | - | - | Yes
DPM | - | - | - | Yes
VMware and VSS - Application Backup and Recovery whitepaper from Veeam Software
Veeam Software has just released an interesting whitepaper about application backup and recovery using different disaster recovery solutions. One of the new features in VMware Consolidated Backup 1.5 (from the VI 3.5 Update 2) is support for Microsoft Volume Shadow Copy Service (VSS).
Using VSS along with an image‐level backup of virtual machines (VM) running supported applications allows you to create a transactionally consistent backup image. With such a backup image, you can successfully recover both the VM, and any supported application installed on the VM.
This is a very interesting and long-awaited feature. One important thing to note is that to enable VSS, VMware Tools also needs to be updated, and VSS support has to be enabled in VMware Tools on each VM:
- In the VMware Tools installer, select Modify > Drivers > VSS.
- Complete the installation process.
- Restart the virtual machine to make sure VSS components are installed and running.
To read the full whitepaper from Veeam Software click here
Veeam Software has also posted a video testimonial that supports this paper. It’s available at their corporate blog
Remember to test the implementation thoroughly in a lab environment, if possible, any time you implement new features.
Feel free to leave a comment. Thanks in advance. Regards Heino.
