Zenoss ZenTech Community

Has anyone else had issues with services randomly restarting in Control Center (CC)? Lots of services seem to be restarting quite often, interrupting sessions within the GUI, impacting modeling, etc.

The localhost collector seems to be a big offender: it does not seem to last longer than 5 minutes without some kind of child process restarting.

This is a big issue for me. I can't batchload a list of servers due to so many restarts during the modeling, or do much of any modeling for that matter. I also cannot stand having to reload the GUI every 5 minutes.

Has anyone seen this? I assume I need to increase the RAM allocated to each service (particularly ZenHub), but I am not sure what an ideal value would be. Everything is currently set to the default 256M.

I am running v. 5.0.9 as a VM with 8 CPUs and 96 GB of memory allocated, with a little over 1200 device endpoints.

Subject:	It looks like it may have to
Author:	[Not Specified]
Posted:	2016-01-11 15:25

It looks like it may have to do with health checks failing. Every time a service goes down, in the log of the service I usually see 'Health check "listening" failed.'

Any idea on how to resolve this? It is heavily impacting what I am able to do on my Zenoss machine. There's no way this can be put into production with services failing health checks and restarting every few seconds.

Any help would be appreciated.
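
For anyone wanting to see which health checks fail most often, here is a minimal Python sketch that counts failures in saved log text. It assumes the failure message always has the same wording as the one quoted above; the function name count_failed_health_checks is made up for illustration and is not part of Zenoss or Control Center.

```python
import re
from collections import Counter

# Assumed message wording, copied from the log line quoted in this post:
#   Health check "listening" failed.
HEALTH_CHECK_FAILED = re.compile(r'Health check "([^"]+)" failed')


def count_failed_health_checks(log_text):
    """Return a Counter mapping each health check name to its failure count."""
    return Counter(HEALTH_CHECK_FAILED.findall(log_text))
```

Feeding it the text of a service log and printing the result's most_common() should show whether "listening" is the only check failing or just the most visible one.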

Subject:	When the services restart, I
Author:	[Not Specified]
Posted:	2016-01-29 11:28

When the services restart, I notice this line appears in journalctl:
Failed to connect to 127.0.0.1:2181: dial tcp 127.0.0.1:2181: i/o timeout

In /etc/default/serviced, it looks like 2181 is for the serviced zookeeper:

# Set the the zookeeper ensemble, multiple masters should be comma separated
# SERVICED_ZK=$SERVICED_MASTER_IP:2181

Why is ZooKeeper timing out so much?
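
One way to check whether ZooKeeper itself is responsive on that port is to send it the "ruok" four-letter command, which a healthy server answers with "imok". Below is a rough Python sketch of that probe. The helper name zk_ruok and the 5-second timeout are my own choices, and it assumes four-letter commands are enabled on the bundled ZooKeeper (newer ZooKeeper releases require them to be whitelisted).

```python
import socket


def zk_ruok(host="127.0.0.1", port=2181, timeout=5.0):
    """Send ZooKeeper's 'ruok' command and return the reply text.

    Returns None if the connection fails or times out.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            chunks = []
            # ZooKeeper closes the connection after answering, so read until EOF.
            while True:
                data = sock.recv(64)
                if not data:
                    break
                chunks.append(data)
            return b"".join(chunks).decode("ascii", "replace")
    except OSError:
        # Covers refused connections and timeouts (socket.timeout is an OSError).
        return None
```

Calling zk_ruok() in a loop every few seconds and logging the times it returns None could show whether the timeouts line up with the service restarts.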

Subject:	Service Restart Issues
Author:	[Not Specified]
Posted:	2016-01-11 12:24
