How do I monitor ESXi hardware?

The problem

Out of the box, WhatsUp® Gold does not monitor ESXi hardware in any capacity, even with the Virtualization monitoring add-on. However, custom monitors in WhatsUp® Gold give you several ways to monitor ESXi hardware effectively. Think about the ‘Health Status’ tab in your VMware environment. Wouldn’t it be great if you could have all of those things monitored in WhatsUp® Gold as well?

The solutions

Well, as I mentioned earlier, there are many potential solutions to this. One of my preferred ways is to use a PowerShell script that connects to a CIM session on the host itself. WhatsUp® Gold can use PowerShell scripts as active or performance monitors; this particular one is an active monitor. The beauty of this monitor is that it aggregates EVERY sensor into a single monitor. This includes vendor-specific sensors such as disk drives, power supplies, etc. Obviously, that piece requires that you install the vendor-specific ESXi image or load the vendor's management agents.
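To make the "every sensor into a single monitor" idea concrete, here is a minimal sketch of the roll-up logic on its own, with no ESXi host involved. The function name Get-SensorRollup, the Name property, and the sample readings are all made up for illustration; the numeric codes are the same HealthState values the script further down translates.

```powershell
# Lookup table for the HealthState codes used by the monitor script.
$HealthStateText = @{
    0  = 'Unknown'
    5  = 'OK'
    10 = 'Degraded/Warning'
    15 = 'Minor failure'
    20 = 'Major failure'
    25 = 'Critical failure'
    30 = 'Non-recoverable error'
}

# Hypothetical helper: rolls a list of sensor readings up into one result.
function Get-SensorRollup {
    param(
        [Parameter(Mandatory)] [object[]] $Sensors
    )
    # Build one "name: state" line per sensor.
    $lines = foreach ($s in $Sensors) {
        $code = [int]$s.HealthState
        $text = if ($HealthStateText.ContainsKey($code)) { $HealthStateText[$code] } else { "Unrecognized ($code)" }
        '{0}: {1}' -f $s.Name, $text
    }
    # Any sensor that is not 5 (OK) marks the whole monitor as down.
    $notOk = @($Sensors | Where-Object { [int]$_.HealthState -ne 5 })
    [pscustomobject]@{
        IsDown  = ($notOk.Count -gt 0)
        Message = ($lines -join "`r`n")
    }
}

# Made-up sample readings standing in for Get-CimInstance output.
$sample = @(
    [pscustomobject]@{ Name = 'CPU1'; HealthState = 5 }
    [pscustomobject]@{ Name = 'Power Supply 2'; HealthState = 25 }
)
$result = Get-SensorRollup -Sensors $sample
```

With these sample readings, $result.IsDown is true because one sensor reports 25, and $result.Message lists both sensors with their translated states. A lookup table like this also avoids maintaining one string replacement per code.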

Another solution I have used with success is to configure the ESXi hosts to send SNMP traps to the WhatsUp® Gold server. This method has a couple of different approaches. You can either create a specific SNMP trap passive monitor for each potential trap you’d like to alert on, or you could simply listen for Any SNMP trap from the host.
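For the host side of the trap approach, one option is to enable the ESXi SNMP agent with VMware PowerCLI. The following is only a rough sketch and rests on several assumptions: PowerCLI is installed, you connect directly to the ESXi host rather than through vCenter, and the host name, WhatsUp® Gold address, and community string are placeholders you would replace with your own.

```powershell
# Assumption: VMware PowerCLI is installed on the machine running this.
Import-Module VMware.VimAutomation.Core

$esxiHost  = 'esxi01.example.com'   # placeholder ESXi host name
$wugServer = '192.0.2.20'           # placeholder WhatsUp Gold server address
$community = 'example-community'    # placeholder SNMP community string

# Connect directly to the ESXi host (prompts for credentials).
Connect-VIServer -Server $esxiHost -Credential (Get-Credential)

# Enable the host SNMP agent and add the WhatsUp Gold server as a trap target.
Get-VMHostSnmp |
    Set-VMHostSnmp -Enabled:$true -ReadOnlyCommunity $community `
                   -AddTarget -TargetHost $wugServer -TargetCommunity $community -TargetPort 162

Disconnect-VIServer -Server $esxiHost -Confirm:$false
```

The community string configured here has to match what the passive monitor in WhatsUp® Gold expects, and the trap port (UDP 162 by default) must be reachable from the host.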

The script

To use the script, simply create a PowerShell active monitor and copy/paste the code below. This script will use whichever VMware credential is applied to the device in WUG.

import-module CimCmdlets
# Get the credentials
$VMuser = $Context.GetProperty("CredVMware:Username");
$VMpass = $Context.GetProperty("CredVMware:Password");
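# Use $securePass rather than $pwd, which is PowerShell's automatic current-directory variable
$securePass = ConvertTo-SecureString $VMpass -AsPlainText -Force
$cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $VMuser, $securePass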
# Get server
$esxi = $Context.GetProperty("Address");

#Health State value translations
$HealthState0 = "Unknown"
$HealthState5 = "OK"
$HealthState10 = "Degraded/Warning"
$HealthState15 = "Minor failure"
$HealthState20 = "Major failure"
$HealthState25 = "Critical failure"
$HealthState30 = "Non-recoverable error"

#Set Session Options
$CIOpt = New-CimSessionOption -SkipCACheck -SkipCNCheck -SkipRevocationCheck -Encoding Utf8 -UseSsl
$Session = New-CimSession -Authentication Basic -Credential $cred -ComputerName $esxi -port 443 -SessionOption $CIOpt
#print device info to log
#$Chassis = Get-CimInstance -CimSession $Session -ClassName CIM_Chassis
#$sMessage = $sMessage + "`r`nModel:" + $Chassis.Manufacturer + " " + $Chassis.Model + "`r`n"
#$sMessage = $sMessage + "Serial:" + $Chassis.SerialNumber + "`r`n"; 
#Find sensors not in normal state
$bDown = 0
#Processors
$sensors = Get-CimInstance -CimSession $Session -ClassName CIM_Processor | Where {$_.HealthState -ge 0} | Select Caption, HealthState
foreach ($sensor in $sensors)
{
 If ($sensor.HealthState -ne 5) {
  $sensor.HealthState = $sensor.HealthState -replace "10", $HealthState10
  $sensor.HealthState = $sensor.HealthState -replace "15", $HealthState15
  $sensor.HealthState = $sensor.HealthState -replace "20", $HealthState20
  $sensor.HealthState = $sensor.HealthState -replace "25", $HealthState25
  $sensor.HealthState = $sensor.HealthState -replace "30", $HealthState30
  $sensor.HealthState = $sensor.HealthState -replace "0", $HealthState0
  $sDownMessage = $sDownMessage + $sensor.Caption + ": " +  $sensor.HealthState + "`r`n"
  $sMessage = $sMessage + $sensor.Caption + ": " +  $sensor.HealthState + "`r`n"
  $bDown = 1 }
 Else {  
  $sMessage = $sMessage + $sensor.Caption + ": " + $HealthState5 + "`r`n" }
}

#Physical Memory
$sensors = Get-CimInstance -CimSession $Session -ClassName CIM_Memory | Where {$_.HealthState -ge 0 -and $_.ElementName -notlike '*Cache*'} | Select ElementName, HealthState
foreach ($sensor in $sensors)
{
 If ($sensor.HealthState -ne 5) {
  $sensor.HealthState = $sensor.HealthState -replace "10", $HealthState10
  $sensor.HealthState = $sensor.HealthState -replace "15", $HealthState15
  $sensor.HealthState = $sensor.HealthState -replace "20", $HealthState20
  $sensor.HealthState = $sensor.HealthState -replace "25", $HealthState25
  $sensor.HealthState = $sensor.HealthState -replace "30", $HealthState30
  $sensor.HealthState = $sensor.HealthState -replace "0", $HealthState0
  $sDownMessage = $sDownMessage + $sensor.ElementName + ": " +  $sensor.HealthState + "`r`n"
  $sMessage = $sMessage + $sensor.ElementName + ": " +  $sensor.HealthState + "`r`n"
  $bDown = 1 }
 Else {
  $sMessage = $sMessage + $sensor.ElementName + ": " + $HealthState5 + "`r`n" }
}

#All vendor specific sensors
$sensors = Get-CimInstance -CimSession $Session -ClassName CIM_Sensor | Where {$_.HealthState -ge 0} | Select Caption, HealthState
foreach ($sensor in $sensors)
{
 If ($sensor.HealthState -ne 5) {
  $sensor.HealthState = $sensor.HealthState -replace "10", $HealthState10
  $sensor.HealthState = $sensor.HealthState -replace "15", $HealthState15
  $sensor.HealthState = $sensor.HealthState -replace "20", $HealthState20
  $sensor.HealthState = $sensor.HealthState -replace "25", $HealthState25
  $sensor.HealthState = $sensor.HealthState -replace "30", $HealthState30
  $sensor.HealthState = $sensor.HealthState -replace "0", $HealthState0
  $sDownMessage = $sDownMessage + $sensor.Caption + ": " +  $sensor.HealthState + "`r`n"
  $sMessage = $sMessage + $sensor.Caption + ": " +  $sensor.HealthState + "`r`n"
  $bDown = 1 }
 Else {
  $sMessage = $sMessage + $sensor.Caption + ": " + $HealthState5 + "`r`n" }
}

#If down flag thrown, set down else set up
$sUpMessage = "All sensors were found to be in the 'OK' state."
If ($bDown -eq 1) {
 $Context.SetResult(1, "Down! One or more sensors were found not to be in the 'OK' state`r`n" + $sDownMessage); }
Else {
 $Context.SetResult(0, $sMessage + "`r`nUP! " + $sUpMessage);
}

#Remove the CIMSession
$Remove = Remove-CimSession -CimSession $Session
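
If you want to try the script in an ordinary PowerShell console before pasting it into WhatsUp® Gold, you need something to stand in for the $Context object that WUG normally provides. Below is a minimal stub, assuming the script only calls GetProperty and SetResult as shown above. The address and credentials are placeholders, and the ResultCode and ResultMessage properties are my own invention for inspecting the outcome; they are not part of WhatsUp® Gold.

```powershell
# Hypothetical stand-in for the $Context object that WhatsUp Gold injects.
# All values below are placeholders for illustration.
$fakeProperties = @{
    'Address'             = '192.0.2.10'
    'CredVMware:Username' = 'root'
    'CredVMware:Password' = 'example-password'
}

$Context = New-Object psobject -Property @{
    Properties    = $fakeProperties
    ResultCode    = $null
    ResultMessage = $null
}

# GetProperty returns the placeholder value for the requested name.
Add-Member -InputObject $Context -MemberType ScriptMethod -Name GetProperty -Value {
    param([string] $Name)
    $this.Properties[$Name]
}

# SetResult just records what the script reported (0 = up, 1 = down in the script above).
Add-Member -InputObject $Context -MemberType ScriptMethod -Name SetResult -Value {
    param([int] $Code, [string] $Message)
    $this.ResultCode = $Code
    $this.ResultMessage = $Message
}

# Exercise the stub the same way the monitor script does.
$esxi = $Context.GetProperty('Address')
$Context.SetResult(0, "UP! Stub check for $esxi")
```

Define the stub first, then run the monitor script in the same session and read $Context.ResultCode and $Context.ResultMessage to see what WhatsUp® Gold would have received.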
