Zenoss ZenTech Community

I am working on an issue that I haven't been able to resolve. Thought I would reach out to see if anyone has any recommendations.

We are currently setting up a new environment based on HP Blade hardware running Unix VMs for the application. We have set up 2 Admin Server VMs in two different data centers. We recently installed Cacti hosts on two of them to monitor traffic and create reporting data for TPS and performance tuning/capacity planning (we currently use Cacti on our legacy system, so management wants it used in the new environment).

The problem I am now seeing is that the two servers running Cacti constantly show "Unable to read process on device" and network connection errors. The Zenoss site shows the nodes bouncing up/down, but the boxes are stable and functioning properly. The issue can be reproduced from the command line. It appears to be a networking problem on the TOS01 node in both data centers. When I query it manually, it starts returning data, then slows down, then times out. Subsequent retries showed 100% timeout, connect with timeout

My hypothesis is that Cacti and Zenoss are competing for the same snmpd resource(s). This is backed up by the current count in Zenoss showing only 2 failures.
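For anyone wanting to reproduce the "returns data, slows down, then times out" pattern from the command line, here is a minimal sketch that times repeated SNMP requests and counts failures. It assumes the net-snmp `snmpget` tool is installed and that the node answers SNMP v2c; the community string `public` and the function names are placeholders for illustration, not anything taken from the setup above. The OID queried is sysUpTime.0.

```python
import subprocess
import sys
import time


def snmpget_cmd(host, community, timeout_s):
    """Build an snmpget command for sysUpTime.0 with no retries."""
    return ["snmpget", "-v2c", "-c", community,
            "-t", str(timeout_s), "-r", "0",
            host, "1.3.6.1.2.1.1.3.0"]


def probe(cmd, timeout_s):
    """Run one command; return (succeeded, elapsed_seconds)."""
    start = time.monotonic()
    try:
        done = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL, timeout=timeout_s)
        ok = done.returncode == 0
    except subprocess.TimeoutExpired:
        # The command hung past our own deadline; count it as a failure.
        ok = False
    return ok, time.monotonic() - start


def run_probes(cmd, count, timeout_s, pause_s=1.0):
    """Run the same command `count` times, pausing between attempts."""
    results = []
    for _ in range(count):
        results.append(probe(cmd, timeout_s))
        time.sleep(pause_s)
    return results


def summarize(results):
    """Summarize a list of (succeeded, elapsed_seconds) tuples."""
    total = len(results)
    failures = sum(1 for ok, _ in results if not ok)
    ok_times = [t for ok, t in results if ok]
    return {
        "total": total,
        "failures": failures,
        "failure_pct": 100.0 * failures / total if total else 0.0,
        "slowest_ok_s": max(ok_times) if ok_times else None,
    }
```

A run such as `summarize(run_probes(snmpget_cmd("TOS01", "public", 2), count=20, timeout_s=3))`, done once with Cacti polling enabled and once with it paused, would give a rough indication of whether the failure rate changes when only one poller is talking to snmpd.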

Subject:	Not heard of similar problems
Author:	Jane Curry
Posted:	2016-11-03 07:39

I have not heard of similar problems before. I don't really understand your "TOS" acronym and architecture. Are these devices that might have a LOT of SNMP data to send? For example, if you have a massive router with hundreds, maybe thousands, of interfaces, then gathering the standard bytes in/out and packets in/out on thousands of interfaces may cause timeouts and retries. Other than that sort of scenario, it would be very unusual to see some SNMP responses that then tail off into timeouts.
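To put rough numbers on the large-router scenario, the following sketch estimates how many OIDs and SNMP requests one polling cycle would need. The figures of 4 counters per interface (bytes in/out, packets in/out) and 40 OIDs per request are assumptions chosen for illustration; substitute whatever your pollers actually use.

```python
import math


def requests_per_cycle(interfaces, counters_per_interface=4, oids_per_request=40):
    """Return (total_oids, requests) needed to poll every interface once."""
    total_oids = interfaces * counters_per_interface
    requests = math.ceil(total_oids / oids_per_request)
    return total_oids, requests


# 2000 interfaces at 4 counters each is 8000 OIDs, or 200 requests per cycle
print(requests_per_cycle(2000))
```

If two pollers (say Zenoss and Cacti) each collect the same counters, the agent sees roughly double that request load per cycle.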

One thing you might try on these "offending devices" that are managed by Zenoss would be to make a note of the modeler plugins currently used for them, and then either remove all of them or leave just zenoss.snmp.NewDeviceMap and zenoss.snmp.DeviceMap. Those two only query at the device level for basic SNMP system table data, thus proving SNMP contact (the Overview page should be populated with some data) but NOT placing SNMP stress on the device. Depending on the number of these devices, you might either change the modeler plugin set just for those devices (from the left-hand Modeler Plugins menu) or, if there are lots of them, create a device subclass and change the Modeler Plugins for the class (use the DETAILS link at the top of the left-hand menu for a device class).

Cheers,

Jane

Email: jane.curry@skills-1st.co.uk Web: https://www.skills-1st.co.uk

Subject:	ZenOss and Cacti Issues
Author:	[Not Specified]
Posted:	2016-11-02 14:36
