Subject: |
RE: Zenoss 6.2.1 monitoring templates only running on event close |
Author: |
Jane Curry |
Posted: |
2018-12-16 11:56 |
Are all these devices down (as far as ping is concerned)? If so, I think you will find that the Zenoss logic says that if zPingMonitorIgnore=False and the device doesn't respond to ping, then all other template monitoring is suspended until the status returns to "Up". This would fit with getting one event after the Device Down event has been closed.
Cheers,
Jane
------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------
Subject: |
RE: Zenoss 6.2.1 monitoring templates only running on event close |
Author: |
Jad Baz |
Posted: |
2018-12-17 03:44 |
That makes sense but I've already set zPingMonitorIgnore=True
------------------------------
Jad
------------------------------
Subject: |
RE: Zenoss 6.2.1 monitoring templates only running on event close |
Author: |
Jane Curry |
Posted: |
2018-12-17 05:18 |
You say that you have zPingMonitorIgnore=True. Not sure, but I suspect this only affects monitoring by the zenping daemon. From your screenshot, it looks like you are maybe doing your own ping monitoring with a Command template and generating a /Status/Ping event if your monitoring fails?
I am guessing here but I wonder if a Critical severity /Status/Ping event stops other template monitoring, even if it isn't generated by zenping.
I would try a couple of tests. Try reducing the severity of your Heartbeat monitor to, say, Info. If my hypothesis is right, that shouldn't trigger the internal logic that says "stop running all other template monitoring". Alternatively, change the event class that you generate to something like /App/Test.
I see that you are trying to reimplement in 6 what you had in Zenoss 4, but the underlying logic may well have changed slightly over the intervening years.
Cheers,
Jane
------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------
Subject: |
RE: Zenoss 6.2.1 monitoring templates only running on event close |
Author: |
Jad Baz |
Posted: |
2018-12-17 09:08 |
OK, so I think we've gotten to the bottom of this.
I've disabled PING by default so the state of my device won't change unless I change it.
Since I've set my monitoring templates to the event class /Status/Ping, the first monitoring template to fail will set the device as down. Afterwards, since the device is down (i.e. not reachable in any way), Zenoss won't bother running any monitoring templates.
This theory would validate what I'm seeing.
Do you think this is correct Jane?
------------------------------
Jad
------------------------------
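[Editor's sketch] The theory above can be illustrated with a toy simulation. This is not Zenoss code: the function name, the polling-cycle model, and the assumption that the custom check fails on every run are all hypothetical, chosen only to show why templates would appear to run once each time the down event is closed.

```python
# Toy model only: none of these names come from Zenoss itself.
CRITICAL = 5  # Zenoss severity scale: 5 = Critical ... 0 = Clear


def simulate(cycles, close_at):
    """Return a per-cycle log of whether monitoring templates ran.

    cycles   -- number of polling cycles to simulate
    close_at -- set of cycle numbers at which an operator closes the event
    """
    open_events = []  # open (event_class, severity) pairs for one device
    log = []
    for cycle in range(cycles):
        if cycle in close_at:
            open_events.clear()  # operator closes the "device down" event
        device_down = any(
            cls.startswith("/Status/Ping") and sev == CRITICAL
            for cls, sev in open_events
        )
        if device_down:
            log.append("skipped")  # device considered down: templates do not run
            continue
        # Templates run; the (assumed always failing) custom check raises a
        # critical /Status/Ping event, which marks the device down again.
        open_events.append(("/Status/Ping", CRITICAL))
        log.append("ran")
    return log


print(simulate(6, close_at={3}))
# ['ran', 'skipped', 'skipped', 'ran', 'skipped', 'skipped']
```

In this model the templates run exactly once after each close, which matches the "only running on event close" symptom in the thread title.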
Subject: |
RE: Zenoss 6.2.1 monitoring templates only running on event close |
Author: |
Jason Olson |
Posted: |
2018-12-17 11:04 |
I think what's happening, based on that screenshot, is that your command is returning a value that Zenoss can't handle. It throws the exception, and the rest of the templates don't run as there's an unhandled error condition. I've seen that occur with other commands. What value is your command supposed to return?
------------------------------
Jason Olson
------------------------------
Subject: |
RE: Zenoss 6.2.1 monitoring templates only running on event close |
Author: |
Jad Baz |
Posted: |
2018-12-18 07:06 |
The monitoring templates are good, I've tested them extensively.
------------------------------
Jad
------------------------------
Subject: |
RE: Zenoss 6.2.1 monitoring templates only running on event close |
Author: |
Jason Olson |
Posted: |
2018-12-18 10:22 |
The commands, you mean? They probably are, but Zenoss is pretty particular about the values it gets returned from a command or a query, and it looks from the screenshot above that it's Zenoss that can't handle whatever value the command returns.
------------------------------
Jason Olson
------------------------------
Subject: |
RE: Zenoss 6.2.1 monitoring templates only running on event close |
Author: |
Jane Curry |
Posted: |
2018-12-17 14:12 |
That's my theory. But as Jason says, it does look like you have an issue with at least one of your command monitors, in addition.
Cheers,
Jane
------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------
Subject: |
RE: Zenoss 6.2.1 monitoring templates only running on event close |
Author: |
Jad Baz |
Posted: |
2018-12-19 11:58 |
So my first question: is there a way to set the device as down and still run monitoring templates?
Another question: is there a way, in zendmd maybe, to set the device as up/down?
------------------------------
Jad
------------------------------
Subject: |
RE: Zenoss 6.2.1 monitoring templates only running on event close |
Author: |
Ryan Matte |
Posted: |
2018-12-26 12:47 |
So to clarify how this currently works: if a critical event is present in /Status, the device will be marked as down and all monitoring except ping monitoring will cease on it. Previously this was only the case if a critical event was present in /Status/Ping, but this got changed by the ZenPackLib code. There is currently an effort to re-architect the way this works, but it'll probably be a while before that's done. So to answer your questions...
Is there a way to be able to set the device as down and still run monitoring templates? -> No, there is not. If a device is marked as down due to the presence of a critical event in /Status, all monitoring currently ceases on that device except for zenping monitoring. Zenoss are looking into changing the way this works a bit, but I'm not sure exactly what the end result of that is going to look like.
Is there a way, in zendmd maybe, to set the device as up/down? -> No, there is not. The down status of a device is currently based purely on whether or not there is a critical-level event present anywhere under the /Status event class. As such, I would recommend avoiding the /Status event class for any custom thresholds/scripts, so that you don't accidentally trigger a device showing as down and stop monitoring from occurring when you don't intend to.
Cheers.
------------------------------
Ryan Matte
------------------------------
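[Editor's sketch] The rule described in the last post can be written down as a small predicate. This is a minimal illustration, not the Zenoss implementation: the function names and the `root` parameter are hypothetical, and events are reduced to (event class, severity) pairs. The `root` argument is there only to contrast the current behaviour (anything under /Status) with the earlier behaviour (only /Status/Ping) that the post mentions.

```python
# Illustration only: these helpers are hypothetical, not part of Zenoss.
CRITICAL = 5  # Zenoss severity scale: 5 = Critical, 2 = Info, 0 = Clear
INFO = 2


def in_class_tree(event_class, root):
    """True if event_class is root itself or sits anywhere beneath it."""
    return event_class == root or event_class.startswith(root + "/")


def device_is_down(open_events, root="/Status"):
    """Down if any open critical event lives under the given class tree."""
    return any(
        sev == CRITICAL and in_class_tree(cls, root)
        for cls, sev in open_events
    )


# A custom check that raises a critical event under /Status marks the device down.
print(device_is_down([("/Status/Heartbeat", CRITICAL)]))                       # True
# Under the older rule (only /Status/Ping counted) the same event would not.
print(device_is_down([("/Status/Heartbeat", CRITICAL)], root="/Status/Ping"))  # False
# The two workarounds suggested earlier in the thread: lower severity, or another class.
print(device_is_down([("/Status/Ping", INFO)]))                                # False
print(device_is_down([("/App/Test", CRITICAL)]))                               # False
```

Under these assumptions, both of the tests suggested earlier in the thread (dropping the severity to Info, or moving the event to a class such as /App/Test) avoid the down state.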