Subject: | Best practices for monitoring clusters in Zenoss |
Author: | [Not Specified] |
Posted: | 2015-08-27 07:50 |
Hello folks,
Hoping to pick the collective brains regarding best practices for monitoring clusters in Zenoss. Specifically, I'm interested in Microsoft Windows clusters for SQL Server, but this probably applies to other cluster types, such as Red Hat, as well.
Our environment: Zenoss 4.2.5 Core + ZenUPS; all SNMP monitoring; Windows servers have SNMP Informant installed
Take a Windows cluster of two nodes, ServerA and ServerB, both modeled in Zenoss. We are monitoring our Windows servers using SNMP at present.
ServerA is the active node and Components shows C: drive, OS Process Sqlservr.exe and supporting filesystems D: E: and F:
ServerB is the passive node and Components shows only its C: drive
When a cluster failover occurs, ServerB becomes active, and Sqlserver and the filesystems migrate from ServerA to ServerB. We receive proper notification of Sqlserver being down on ServerA, but at that point Sqlserver and the filesystems are NOT YET MODELED on ServerB.
So I wrote a trigger to remodel the two nodes when Sqlserver reports down on either node, thereby moving the infrastructure from ServerA to ServerB. That works great, except there's no way to view the historical performance graphs on the passive node as the components are no longer modeled there.
To solve that, I know you can lock a component from being deleted by remodel. However, as the components actually are no longer present on the passive node, we need to suppress events for these components on the passive node. Zenoss Corp. suggested putting the passive server in Maintenance mode to suppress the alerts. But that will suppress ALL alerts - if it goes down or its C: drive fills up or whatever, we still need to be notified even if it is the passive node.
Another idea would be to change the Monitored status of each component on and off as the servers change roles. I've never done that using the DMD or the JSON API but it should be easy enough. However, I don't like this idea for two reasons: It puts the burden on me to accurately determine which node is active and which is passive and to not screw it up when making modifications; and as far as I can tell, it would require a unique script for each cluster we have, as the components of course have different names. An ideal solution would require nothing unique to each cluster be maintained outside of Zenoss.
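For reference, a minimal Python sketch of what flipping the Monitored flag through the JSON API might look like. It assumes the Zenoss 4.x device router is reachable at `/zport/dmd/device_router`, that its `setComponentsMonitored` method accepts the fields shown, and that HTTP basic auth is enabled; check these against your own install before relying on them. The function names and all UIDs are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: Zenoss 4.x exposes the device router at this path.
ROUTER_PATH = "/zport/dmd/device_router"


def build_monitor_request(device_uid, component_uids, monitor, tid=1):
    """Build the JSON-RPC body for a DeviceRouter.setComponentsMonitored call."""
    return {
        "action": "DeviceRouter",
        "method": "setComponentsMonitored",
        "data": [{
            "uid": device_uid,              # device that owns the components
            "uids": list(component_uids),   # components whose flag is changed
            "hashcheck": None,              # assumption: null is accepted here
            "monitor": bool(monitor),       # True = monitored, False = not
        }],
        "type": "rpc",
        "tid": tid,
    }


def send(base_url, user, password, body):
    """POST one request body to the router and return the decoded JSON reply."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        base_url.rstrip("/") + ROUTER_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical UIDs for the passive node and one of its locked components.
DEVICE = "/zport/dmd/Devices/Server/Windows/devices/ServerA"
body = build_monitor_request(DEVICE, [DEVICE + "/os/filesystems/D"], monitor=False)
```

Calling `send("http://zenoss.example.com:8080", "admin", "secret", body)` (hypothetical host and credentials) would then submit the change. Nothing in the script is tied to a particular cluster, since the device and component UIDs are passed in as arguments.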
Does WinRM/Microsoft Windows ZenPack handle this differently/better?
(Why aren't we using WinRM? I'm a Unix guy tasked with monitoring Windows boxes, so it's hard for me to articulate the benefits well enough to overcome pushback from the Windows team on the grounds of security and other concerns, and I'm not sure what the implications for collector performance would be if we switched from SNMP to WinRM at this time.)
Any input is appreciated.
Thanks
Dave
Subject: | Dave, |
Author: | Andrew Kirch |
Posted: | 2015-08-27 14:07 |
Dave,
Microsoft has ended support for SNMP and for WMI, so I'd point out that your Windows team can gripe as much as they like, but after that they can go google 'fait accompli'. Frankly, they don't have a choice but to switch to WinRM, and neither do we.
SNMP is neither secure nor robust. It uses UDP, and unless you set up v3 (which is a pain), there is no auth. WinRM uses Kerberos domain authentication and encryption to connect.
I'd suggest using a PowerShell script to determine which is active/passive and react accordingly, but this will require WinRS (part of WinRM).
Andrew Kirch
akirch@gvit.com
Need Zenoss support, consulting or custom development? Look no further. Email or PM me!
Ready for Distributed Topology (collectors) for Zenoss 5? Coming May 1st from GoVanguard
Subject: | All points taken. |
Author: | [Not Specified] |
Posted: | 2015-08-28 07:55 |
All points taken.
What do you suggest this PowerShell script actually do to the devices in Zenoss, though?
Dave
Subject: | Determine which is the active |
Author: | Andrew Kirch |
Posted: | 2015-08-31 12:42 |
Determine which is the active node, then call back via the JSON API and flip the "monitored" bits.
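The decision half of that suggestion can be sketched in Python as below. This is only an illustration under stated assumptions: the active node's name is already known (for example, reported by the PowerShell script), each node's components have been fetched from Zenoss as a name-to-UID mapping, and node-local components such as `C:` are listed in `local_only` so they stay monitored on both nodes. The function name, the `local_only` parameter, and every UID are hypothetical.

```python
def plan_monitor_flips(active_node, components_by_node, local_only=("C:",)):
    """Return {component_uid: monitored} for every component on every node.

    Node-local components stay monitored everywhere; everything else is
    monitored only on the node that is currently active.
    """
    if active_node not in components_by_node:
        raise ValueError("unknown active node: %r" % (active_node,))
    plan = {}
    for node, components in components_by_node.items():
        for name, uid in components.items():
            if name in local_only:
                plan[uid] = True                   # always watch local parts
            else:
                plan[uid] = (node == active_node)  # clustered resources
    return plan


# Hypothetical component inventory after a failover to ServerB, with the
# clustered components locked on both nodes so their graphs are retained.
BASE = "/zport/dmd/Devices/Server/Windows/devices/"
inventory = {
    "ServerA": {
        "C:": BASE + "ServerA/os/filesystems/C",
        "D:": BASE + "ServerA/os/filesystems/D",
        "sqlservr.exe": BASE + "ServerA/os/processes/sqlservr",
    },
    "ServerB": {
        "C:": BASE + "ServerB/os/filesystems/C",
        "D:": BASE + "ServerB/os/filesystems/D",
        "sqlservr.exe": BASE + "ServerB/os/processes/sqlservr",
    },
}

plan = plan_monitor_flips("ServerB", inventory)
```

Each entry in `plan` would then be turned into a JSON API call that sets the Monitored flag for that component. Because the clustered components are derived from whatever is modeled on the nodes, the only per-site setting is the short list of node-local names.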
Andrew Kirch