TECHZEN Zenoss User Community ARCHIVE  

Monitoring Template not working

Subject: Monitoring Template not working
Author: Tim Meads
Posted: 2019-08-14 00:16

I've been rebuilding my system since I nuked it by accident, and I've found that I can't get any of my monitoring templates to work. 

When I test the OIDs I get values back. I create a MinMaxThreshold for that OID and state that when the min threshold is met it should throw a critical...which should then send an email notification. 

I get nothing from it...even though I know the threshold is not being met. So I create a graph and add the data points, including the MinMaxThreshold, and the graph shows all N/As.....when I test it, it works fine, but the system isn't grabbing the data from the devices. 

Any ideas?!

------------------------------
Tim Meads
NOC Supervisor
Mountain West Technologies Network
------------------------------


Subject: RE: Monitoring Template not working
Author: Tim Meads
Posted: 2019-08-14 11:30

I spent several hours digging and I cannot find a single reason why my system isn't pulling any custom SNMP data.....It's not getting anything from monitoring templates.

I adjusted the iptables rules to allow all SNMP.....but I'm not 100% convinced that this is working correctly....it acts like either the modeler is not able to crunch the data or the data isn't coming in. Maybe it's a permissions issue for the file system? I can't see anywhere in any logs where it's stopping writing or anything though....



------------------------------
Tim Meads
NOC Supervisor
Mountain West Technologies Network
------------------------------


Subject: RE: Monitoring Template not working
Author: Jane Curry
Posted: 2019-08-15 04:23

Sooo......
Have you tried snmpwalk tests from a command line?  You don't say what version of Zenoss, but if it is a dockerised version then try the snmpwalk tests from the zenperfsnmp container.  That will at least prove whether it is a communications issue external to Zenoss. It will also give you an easy way to play around with the community name and SNMP version so you can then check them against the zProperties that Zenoss uses; something like:

snmpwalk -v 2c -c public zen42.class.example.org system

to use a community name of public with SNMP v2c against the zen42.class.example.org device and just get the system table from MIB-2.  When you get some variant of this to work, then also test without the "system" on the end to ensure you get access to the rest of the MIB values (some RedHat systems, by default, will give answers for system but no access to other MIB variables).
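
A minimal sketch of that two-step check, assuming the net-snmp snmpwalk command is on the PATH. count_walk_rows is a hypothetical helper name, and the host and community in the usage comments are just the example values above. It only counts the rows that come back, so if the full walk returns no more rows than the system walk, the agent is probably restricting what that community can see.

```shell
#!/bin/sh
# Hypothetical helper: count the rows an snmpwalk returns (0 means nothing came back).
# Arguments: SNMP version, community, host, optional subtree.
count_walk_rows() {
    version=$1; community=$2; host=$3; subtree=${4:-}
    # $subtree is left unquoted on purpose so an empty value adds no argument.
    snmpwalk -v "$version" -c "$community" "$host" $subtree 2>/dev/null |
        awk 'END { print NR }'
}

# Example usage (run by hand against a reachable device):
#   count_walk_rows 2c public zen42.class.example.org system
#   count_walk_rows 2c public zen42.class.example.org
```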

The next thing I would try is to model a test device in debug mode.  Do this from the zenmodeler container if you are dockerised, i.e.:
serviced service attach zenmodeler su zenoss -l

To model zen42.class.example.org with full debugging, sending all output to /tmp/zen42model.out, try:
zenmodeler run -v 10 -d zen42.class.example.org > /tmp/zen42model.out 2>&1

If you want to just run a specific modeler plugin, like the InterfaceMap, you can add the --collect parameter:
zenmodeler run -v 10 -d zen42.class.example.org --collect InterfaceMap > /tmp/zen42model.out 2>&1

Inspecting your output file should then show what data you are / are not getting, and hopefully why.
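
With -v 10 the output file can get long, so a quick first pass is to pull out only the suspicious lines. This is a rough sketch: scan_model_log is a hypothetical helper, and the search words are an assumption about what a failure might look like, not a list of actual zenmodeler messages.

```shell
#!/bin/sh
# Hypothetical helper: print numbered lines from a debug log that look like problems.
# The pattern list is a guess; adjust it to whatever your output actually contains.
scan_model_log() {
    grep -n -i -E 'error|warning|traceback|timeout|timed out' "$1"
}

# Example usage:
#   scan_model_log /tmp/zen42model.out
```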

Cheers,
Jane

------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------


Subject: RE: Monitoring Template not working
Author: Tim Meads
Posted: 2019-08-15 11:45

I suppose I should have left more info. 

Zenoss version 6.2.1.

I can run the desired snmpwalk command from the CLI without issue. When I go into the template and go to the spot where the OID is, I can put the address in of the device and hit test and it comes back with the proper polled snmp entry. 

For example, or to make it clearer, what I'm trying to do is monitor the station count on PtMP wireless radios. Before I had the crash, I had it set up so that if one of the radios had 0 registered clients on it, it would send me an error letting me know that the radio has no clients on it. 

To achieve this, I'd go into monitoring templates, set up a template and bind it to the device class the radio is in. I would create the data point, enter the OID info, and set a MinMax threshold. If I wanted to verify the data I would set up a graph to show me the number of clients etc. 

I have completely reconstructed this setup and I'm not getting any data in the graphs being generated by the monitoring template. On the other hand, I can get graphs related to interfaces and such automatically being pulled when the device is being modeled. 

Attempting from inside the container, 

root@zenoss [~] : serviced service attach zenperfsnmp/0

[root@dfcb77279e20 /]# snmpwalk -On -cmWtCorp -v1 10.58.21.9  .1.3.6.1.4.1.41112

.1.3.6.1.4.1.41112.1.4.5.1.15.1 = Gauge32: 6

[root@dfcb77279e20 /]#

I ran the command you wanted me to test and captured the output. I've attached it to the thread.....I don't really see anything that screams to me as an issue. I've also attached screenshots showing the issue. As I said before, the data for the interfaces comes through without issue, but in the output file there is no mention of the OID that I'm trying to pull from the monitoring template, which is .1.3.6.1.4.1.41112.1.4.5.1.15




------------------------------
Tim Meads
NOC Supervisor
Mountain West Technologies Network
------------------------------

Attachments:

10.57.130.18.txt

Screen_Shot_2019-08-15_at_9.42.16_AM.png

Screen_Shot_2019-08-15_at_9.42.43_AM.png

Screen_Shot_2019-08-15_at_9.42.48_AM.png

Screen_Shot_2019-08-15_at_9.42.54_AM.png



Subject: RE: Monitoring Template not working
Author: Jane Curry
Posted: 2019-08-15 12:16

So is the OID that you are trying to get a scalar or a table (in SNMP-speak)?  Or another way of looking at it is are you running a query against the device itself or a component of the device?  Sorry - not quite sure from your description above?  I am thinking it is probably a scalar value run against the device itself??

If so, the trick is that you need to add a ".0" to the end of your OID in the template to denote that it is a single value for the device, not a table of values to fulfil a bunch of components.  The Test button in the template GUI, strictly, does an snmpwalk, so it will work if your OID is .1.3.6.1.4.1.41112.1.4.5.1.15. The zenperfsnmp daemon, however, actually does an snmpget, so the OID must be exactly correct.
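
To see the difference from the command line, something along these lines tries an exact snmpget against a few candidate instance suffixes. It is only a sketch, assuming the net-snmp snmpget command is available; instance_oid and probe_instances are hypothetical helper names, and the community and host in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: append an instance suffix to a base OID.
instance_oid() {
    printf '%s.%s\n' "${1%.}" "$2"
}

# Hypothetical helper: report which instance suffixes answer an exact snmpget.
# Arguments: SNMP version, community, host, base OID, then one or more suffixes.
probe_instances() {
    version=$1; community=$2; host=$3; base=$4; shift 4
    for suffix in "$@"; do
        oid=$(instance_oid "$base" "$suffix")
        # SNMP v1 fails outright on a missing instance; v2c replies "No Such ...".
        out=$(snmpget -v "$version" -c "$community" -On "$host" "$oid" 2>/dev/null) || out=
        case $out in
            ''|*'No Such'*) echo "$oid: no answer" ;;
            *) echo "$oid: $out" ;;
        esac
    done
}

# Example usage (placeholders for community and host):
#   probe_instances 2c public zen42.class.example.org .1.3.6.1.4.1.41112.1.4.5.1.15 0 1
```

Whichever suffix answers is the full OID to put in the template data source.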

Hope that helps,
Jane


------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------


Subject: RE: Monitoring Template not working
Author: Tim Meads
Posted: 2019-08-15 18:31

You are an animal! 

So I ran them with a .0 at the end, which didn't return a result, so I ran them with a .1 in case that would be the first result under that table, and it worked!

tmeads@MacBookTouch2 [~] : snmpwalk -cmWtCorp -v1 -On 10.57.130.18 .1.3.6.1.4.1.41112.1.4.5.1.15

.1.3.6.1.4.1.41112.1.4.5.1.15.1 = Gauge32: 0

tmeads@MacBookTouch2 [~] : snmpwalk -cmWtCorp -v1 -On 10.57.130.18 .1.3.6.1.4.1.41112.1.4.5.1.15.0

tmeads@MacBookTouch2 [~] : snmpwalk -cmWtCorp -v1 -On 10.57.130.18 .1.3.6.1.4.1.41112.1.4.5.1.15.1

.1.3.6.1.4.1.41112.1.4.5.1.15.1 = Gauge32: 0

I now go into the monitoring template graphs and they have the data and I'm getting my alerts. Now if I can just remember all of this stuff the next time the system decides to take a crap ;)

One more question that you may know about....I'm getting utilization errors on interfaces for a radio type that's new. I never had to deal with these before and they keep coming back. I would assume I can put an RRD string somewhere or something similar to make it legible, but how does it determine whether the interface is at or nearing capacity before throwing the error? These radios do far more than 100 meg and I'd assume that's what it's erroring on. 



------------------------------
Tim Meads
NOC Supervisor
Mountain West Technologies Network
------------------------------

Attachments:

Screen_Shot_2019-08-15_at_4.02.38_PM.png


