Just a very brief blog here – I wanted to mind-dump something I explained to one of my colleagues today on how to use WMI performance counters with Nagios using NRPE.

For those not in the know, WMI is basically the Windows equivalent of SNMP – you can query it remotely to find out information, much as you would with an OID in SNMP.

You can then monitor this metric, be it “database size”, “number of transactions per second”, etc., in your Opsview or Nagios system.

So, let's get cracking.

Step 1: Log in to your Windows server using mstsc (Remote Desktop), fire up a cmd window and run the command:

TypePerf.exe -q

This will output a very long list of every possible metric in your system (be warned!). I chose to run:

TypePerf.exe -q > wmilist.txt

…Given that Windows amazingly supports redirects at the command line, it's so novel to be able to do what you want sometimes! Here's a snapshot of some of the metrics I got on my Windows server:

\SQLServer:Databases(*)\Data File(s) Size (KB)
\SQLServer:Databases(*)\Log File(s) Size (KB)
\SQLServer:Databases(*)\Log File(s) Used Size (KB)
\SQLServer:Databases(*)\Percent Log Used
\SQLServer:Databases(*)\Active Transactions
\SQLServer:Databases(*)\Transactions/sec
\SQLServer:Databases(*)\Repl. Pending Xacts
\SQLServer:Databases(*)\Repl. Trans. Rate
\SQLServer:Databases(*)\Log Cache Reads/sec
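
Since the full list is very long, it can help to filter it down to a single counter object before picking metrics. Below is a minimal sketch, assuming you have copied wmilist.txt to a machine with a POSIX shell (your Nagios box, say). The function name counters_for_object is my own invention and not part of any tool:

```shell
# Hypothetical helper: print only the counters that belong to one object.
# Assumes the file was produced by "TypePerf.exe -q > wmilist.txt".
counters_for_object() {
    object="$1"   # e.g. SQLServer:Databases
    file="$2"     # e.g. wmilist.txt
    # Strip Windows carriage returns, then match the literal prefix
    # "\Object(" (counters with instances) or "\Object\" (without).
    tr -d '\r' < "$file" | grep -F -e "\\${object}(" -e "\\${object}\\"
}
```

For example, counters_for_object 'SQLServer:Databases' wmilist.txt should print just the SQLServer:Databases lines from the snapshot above.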

Step 2: Now, after picking which metrics we want to monitor, we need to install the NSClient/Opsview agent on the box (simply done, just give it a Google). This provides NRPE functionality, which we will need in a second – namely the “nsc_checkcounter” option that the agents bring.

Step 3: After installing the NSClient/Opsview agent, fire up a terminal on your Opsview/Nagios box – su to nagios, cd to where check_nrpe lives (/usr/local/nagios/libexec on Opsview) and run a command similar to the one below:

$ /usr/local/nagios/libexec/check_nrpe -H 192.168.19.10 -c nsc_checkcounter -a '"\SQLServer:Databases(*)\Transactions/sec" MaxWarn=2000 MaxCrit=3000 ShowAll'
MONITORED BY: Slave2
RETURN CODE: 0 (OK)
OUTPUT:
OK: \SQLServer:Databases(*)\Transactions/sec: 0|'\SQLServer:Databases(*)\Transactions/sec'=0;2000;3000;

Where the “-H …” address is the address of the Windows server, and the bit between the double quotes is one of the counters we saw in Step 1. As we can see, we are getting data back from the host showing we have a sum total of 0 transactions per second – this is a busy demo system…
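
The part of the output after the “|” is standard Nagios performance data, in the form 'label'=value;warn;crit;. If you ever want those numbers in a script, something like the sketch below would pull them out. It assumes a single metric per line, as in the output above, and the function name perfdata_fields is just something I made up for illustration:

```shell
# Hypothetical helper: extract value, warning and critical thresholds
# from the performance data that follows the "|" in a check output line.
# Assumes exactly one metric on the line.
perfdata_fields() {
    line="$1"
    perf="${line#*"|"}"       # everything after the first "|"
    metrics="${perf##*=}"     # everything after the last "=", e.g. 0;2000;3000;
    value="${metrics%%;*}"    # first field: current value
    rest="${metrics#*;}"
    warn="${rest%%;*}"        # second field: warning threshold
    rest="${rest#*;}"
    crit="${rest%%;*}"        # third field: critical threshold
    printf 'value=%s warn=%s crit=%s\n' "$value" "$warn" "$crit"
}
```

Fed the “OK: …” line from the output above, this prints value=0 warn=2000 crit=3000.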

Step 4: Now we know it's working, simply navigate to the Opsview GUI to create your checks – or do it via the .cfg files / command line in Nagios if that's your distro of choice, and you will now be able to monitor any WMI performance counter you desire – see “Example” and below in the link here: http://www.everybodyhertz.co.uk/host-group-availability/
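
If you end up checking several counters, the Step 3 command can be wrapped so that only the host, counter and thresholds change. This is a rough sketch rather than anything official: the check_nrpe path and the nsc_checkcounter command name come from this post, while the function names and the CHECK_NRPE variable are my own assumptions:

```shell
# Hypothetical wrapper around the check_nrpe command from Step 3.
# Override CHECK_NRPE if check_nrpe lives somewhere else on your box.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

# Build the single argument string that is passed after "-a".
counter_args() {
    counter="$1"; warn="$2"; crit="$3"
    printf '"%s" MaxWarn=%s MaxCrit=%s ShowAll' "$counter" "$warn" "$crit"
}

# Run the check against one host; the exit status is whatever
# check_nrpe returns.
check_counter() {
    host="$1"; counter="$2"; warn="$3"; crit="$4"
    "$CHECK_NRPE" -H "$host" -c nsc_checkcounter \
        -a "$(counter_args "$counter" "$warn" "$crit")"
}
```

Running check_counter 192.168.19.10 '\SQLServer:Databases(*)\Transactions/sec' 2000 3000 should then be equivalent to the command in Step 3.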

Hope this helps!