Subject: RE: Dependent Triggers and Notifications - Zenoss 4.2.5
Author: Jane Curry
Posted: 2017-10-05 05:06
First thought is that if you lose an edge router and it is the only route to the site behind it, then your 55 events (presumably /Status/Ping Node down events?) ought to be suppressed anyway. Zenoss is supposed to be able to build its own internal topology map (using nmap as part of zenping) such that, if it can see a single point of failure occur, then events from behind that point are given Suppressed status. Then your trigger just has to check for Status=New and you shouldn't get your access point notifications.
That said, there are a number of reasons why this internal topology may not get built. It depends on whether there is an IP route through your network; if all your forwarding is done at Layer 2 or lower, then this doesn't work automatically. I have worked with people in the past to add pseudo IP interfaces and routes to restore this internal topology feature.
Other than that, you are into classic dependency-style event logic. The chargeable version of Zenoss can do this but Core doesn't. It depends a bit on how big your network is, how many dependencies there are, and how badly you want to solve the problem.
You might have a transform on /Status/Ping for your access points that searches the event status database for the relevant router-down event, but then you need a way to specify the "relevant router". A custom property (cProperty) might be usable if each access point has a single router point-of-failure. Non-trivial but not impossible.
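To make the idea concrete, here is a minimal sketch of the decision such a transform would have to make. It is only a model: plain Python dictionaries stand in for events, and the parent_router_of mapping stands in for the per-device custom property. None of the names below are Zenoss APIs, and treating severity 0 as a clear is an assumption of the sketch.

```python
# Illustrative model only: plain dicts stand in for Zenoss events, and none of
# these names are Zenoss APIs.

def is_masked_by_router(event, open_events, parent_router_of):
    """Return True if `event` is a ping-down event for a device whose
    designated upstream router also has an open ping-down event."""
    if event["eventClass"] != "/Status/Ping":
        return False
    # Hypothetical stand-in for a per-device custom property naming the router.
    router = parent_router_of.get(event["device"])
    if router is None:
        return False
    return any(
        e["device"] == router
        and e["eventClass"] == "/Status/Ping"
        and e["severity"] > 0  # assumption: severity 0 means a clear event
        for e in open_events
    )


# Hypothetical example data: two access points behind one edge router.
parent_router_of = {"ap-01": "edge-rtr-1", "ap-02": "edge-rtr-1"}
open_events = [
    {"device": "edge-rtr-1", "eventClass": "/Status/Ping", "severity": 5},
]
ap_event = {"device": "ap-01", "eventClass": "/Status/Ping", "severity": 5}

masked = is_masked_by_router(ap_event, open_events, parent_router_of)
unmasked = is_masked_by_router(ap_event, [], parent_router_of)
```

With the router-down event open, masked is True and the access point event is a candidate for suppression; with no router event open, unmasked is False and the access point event would notify as normal. In a real transform the lookup of open events is the hard part, which is why this is non-trivial.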
Cheers,
Jane
------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------
Subject: RE: Dependent Triggers and Notifications - Zenoss 4.2.5
Author: Garrett Michael Hayes
Posted: 2017-10-06 12:18
Thanks. I wasn't aware of the topology aspect of Zenoss, but that sounds as if it directly addresses the sort of logical relationships I am looking to reflect. Can you point me to an overview of the topology concept and how it gets built in Zenoss Core?
------------------------------
Garrett Michael Hayes
Systems Administrator
Belnick, Inc.
Canton GA
------------------------------
Subject: RE: Dependent Triggers and Notifications - Zenoss 4.2.5
Author: Jane Curry
Posted: 2017-10-08 06:32
I wish I could! It has always been one of Zenoss's best-kept secrets! I am pretty sure it is done by the zenping daemon, which actually uses nmap by default these days. I am also pretty sure that, by default, every 5th nmap ping-poll has the trace flag set, which is what collects the routing data (very similar to using traceroute). What I have never found is any description of the logic that then uses that traceroute information to build the internal topology and apply it to incoming events, automatically setting the status to "Suppressed" if the event is from behind a known network single-point-of-failure.
Can anyone else (Zenoss?) elaborate on this?
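For what it's worth, the general idea can be sketched in a few lines. This is emphatically not Zenoss's actual logic, which as far as I know is undocumented; it is just an illustration assuming each device has exactly one known traceroute-style path from the collector. The function name, device names and paths are all made up for the example.

```python
def events_to_suppress(paths, down_devices):
    """Return the down devices that sit behind another down device.

    paths maps a device name to the ordered list of hops from the collector
    to that device, ending with the device itself. With a single path per
    device, any down hop upstream is a single point of failure for it.
    """
    down = set(down_devices)
    suppressed = set()
    for device in down:
        hops = paths.get(device, [])
        # Drop the final hop when it is the device itself.
        upstream = hops[:-1] if hops and hops[-1] == device else hops
        if any(hop in down for hop in upstream):
            suppressed.add(device)
    return suppressed


# Hypothetical topology: two access points behind one edge router.
paths = {
    "edge-rtr-1": ["core-sw", "edge-rtr-1"],
    "ap-01": ["core-sw", "edge-rtr-1", "ap-01"],
    "ap-02": ["core-sw", "edge-rtr-1", "ap-02"],
    "srv-01": ["core-sw", "srv-01"],
}
result = events_to_suppress(paths, ["edge-rtr-1", "ap-01", "ap-02"])
```

Here result contains ap-01 and ap-02 but not edge-rtr-1, so only the router's own down event would stay at status New and fire a trigger.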
Cheers,
Jane
------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------
Subject: RE: Dependent Triggers and Notifications - Zenoss 4.2.5
Author: Garrett Michael Hayes
Posted: 2017-10-09 10:37
Thanks for the at least partial description. It's nice to know that sometimes even the cognoscenti are stymied. <grin>
In the interim, I found and installed the Layer2 ZenPack, which seems to give me at least some of what I am looking for. While I'm not sure of the underlying "magic", I now see what appears to be path-related information in the overview of a given device, and the network map that Zenoss generates is a great deal more comprehensible.
For example, I can look at the overview for a given server and I now see a notation of the two network switches through which data passes between the Zenoss server and the target server. And looking at a switch, I see all of its neighbor switches.
So I think I'm on the path to a solution. Probably my next step will be to simulate a link outage and verify whether the downstream events get suppressed.
Thanks again, and I'll update this thread when I have some results!
------------------------------
Garrett Michael Hayes
Systems Administrator
Belnick, Inc.
Canton GA
------------------------------