Thursday, May 1, 2008

Nagios with NSClient++ Character Flaws

Arg. It can be frustrating to pass special characters to check_nt arguments!

First, the dreaded ampersand (&):

Unfortunately, it appears as though the ampersand is the field delimiter used by NSClient++, so passing an ampersand to check_nt is absolutely not going to work. Take a look at the following code snippit from check_nt.c :

  249  case CHECK_PROCSTATE:
251 if (value_list==NULL)
252 output_message = strdup (_("No service/process specified"));
253 else {
254 preparelist(value_list); /* replace , between services with & to send the request */
255 asprintf(&send_buffer,"%s&%u&%s&%s", req_password,(vars_to_check==CHECK_SERVICESTATE)?5:6,
256 (show_all==TRUE) ? "ShowAll" : "ShowFail",value_list);
257 fetch_data (server_address, server_port, send_buffer);
258 return_code=atoi(strtok(recv_buffer,"&"));
259 temp_string=strtok(NULL,"&");
260 output_message = strdup (temp_string);
261 }
262 break
  620 void preparelist(char *string) {
621 /* Replace all , with & which is the delimiter for the request */
622 int i;
624 for (i = 0; (size_t)i < strlen(string); i++)
625 if (string[i] == ',') {
626 string[i]='&';
627 }
628 }

As you can see, the ampersand is hardwired into the request to the NSClient++ server, so any fix will require changes to both the check_nt plugin and NSClient++.

There is no workaround for this, except to avoid using the ampersand (escaping the ampersand with a backslash ( \ ) does not work). If, for example, you are trying to check on the status of the "Backup Exec Device & Media Service", use the service name instead of the display name -- NSClient++ can use either. In this case, the service name is "BackupExecDeviceMediaService", which you can find in the service properties.

P.S. - if you're unfortunate enough to be using Backup Exec, I feel for you.

Next, the dollar sign ($):

The dollar sign is a goofy one too, and can't be escaped with the backslash character ( \ ). Instead, you have to double it and put quotes around it (like so: "$$"). Neat, eh?

An example: let's say that you're trying to monitor the service MSSQL$BKUPEXEC. Unfortunately, this is both the display name, and the service name, so the last trick we used won't work. No worries, though, thanks to our friend the double-dollar-sign-enclosed-in-quotes. Your check_command will look like this:

check_command check_nt!SERVICESTATE!-l "MSSQL"$$"BKUPEXEC"

So awesome! Yay for annoying things!

Note: you might think you're clever and use single quotes instead of double around the entire service name. Unfortunately, that does not work reliably. It does seem to work, however, if you're only checking on one service name in the command. Anyhow, don't bother.

Next up, the backslash ( \ ):

This one is pretty easy, you just double it up ( like so: \\ ). Thus checking on a performance counter will look something like this:

check_command check_nt!COUNTER!-l "\\Network Interface(Intel[R] 82546EB Based Dual Port Network Connection - Packet Scheduler Miniport)\\Bytes Total/sec"

Phew, that counter name is a mouthfull, which is actually why I chose it. Don't try to manually type in your performance counters; copy and paste them. From a terminal to the Windows machine you're monitoring, open up Performance Monitor. Add the counter you're looking for to the graph, select the counter from the legend at the bottom of the window, and then click on the Copy Properties button (it's one of the buttons at the top of the graph). Now open up notepad or your favorite text editor and paste the performance counter data into it. Somewhere in there you should see a .path attribute that contains the entire counter reference which you can copy and paste into the specific Nagios configuration file we're working with (remove the server name and double all of the back slashes). Thankfully we can copy and paste from a terminal in Windows to local windows, if we're running Windows.

Note: I believe that much of the confusion over passing arguments to the check_nt command in Nagios has to do with this double back slash which looks like we're escaping the back slash. We're not...well, not really. Don't expect to simply use regular Bash shell notation in your arguments. Single quotes don't necessarily behave the way you'd like. Escaping doesn't work the way you might expect it to. Just don't bother trying to out-think the system, follow its conventions, make no assumptions, and you'll be fine.