TECHZEN Zenoss User Community ARCHIVE  

Datasources being disabled - ESXiMonitorPython / Python Collector

Subject: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Pheripheral Pheripheral
Posted: 2018-02-05 06:07

We're using ZenPacks.community.VMwareESXiMonitorPython 3.0.3 (Jane's update to the Perl SDK version of the ZenPack) to monitor various ESXi hosts.

We're occasionally seeing the python collector event being raised whereby it disables the datasource as it thinks the datasource is blocking for too long – as described in a couple of old threads:

Quirky disabling behaviour with PythonCollector ZenPack 1.7.3

and

Quirky disabling behaviour with PythonCollector ZenPack 1.7.3

 

When we see the event raised, it is usually when hosts are being shut down or started up (although not every time).

We've been experimenting with either raising blockingtimeout to a large value (100+ seconds) or setting blockingtimeout to 0, which prevents the blocking watchdog from being started in PythonCollector at all.

We were wondering whether anyone else has noticed the datasources from this ZenPack being blocked, or has any advice on how to avoid it?

Thanks for any advice!



------------------------------
Pheripheral
------------------------------


Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Jane Curry
Posted: 2018-02-06 03:53

Yup - I have also seen this, particularly in the circumstances you describe.  For what it is worth, I have my zenpython.conf with blockingwarning at 3 seconds and blockingtimeout at 10.

I have a feeling that later versions of the PythonCollector ZenPack may also help - I am on  1.7.3 which is quite old now.

Cheers,
Jane

------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------
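
Editor's sketch of the settings Jane mentions. This is a minimal helper, assuming zenpython.conf uses the usual Zenoss daemon layout of one "option value" pair per line; the function name set_conf_option and the sample "monitor localhost" line are illustrative only. It works on the file's text, so it makes no assumption about where the file lives or how your Zenoss version wants it edited.

```python
def set_conf_option(conf_text, name, value):
    """Return conf_text with 'name value' set exactly once.

    Assumes one whitespace-separated 'option value' pair per line.
    """
    new_line = "%s %s" % (name, value)
    out = []
    replaced = False
    for line in conf_text.splitlines():
        parts = line.split()
        if parts and parts[0] == name:
            # Replace the first occurrence, drop any later duplicates.
            if not replaced:
                out.append(new_line)
                replaced = True
            continue
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Sample input only; the values 3 and 10 are the ones quoted in the post above.
conf = "monitor localhost\nblockingtimeout 30\n"
conf = set_conf_option(conf, "blockingwarning", 3)
conf = set_conf_option(conf, "blockingtimeout", 10)
print(conf)
```

The zenpython daemon has to be restarted before changed values take effect.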


Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Pheripheral Pheripheral
Posted: 2018-02-07 10:28

OK, good to know others have seen it.

We're now on Python Collector 1.10.1 but still seeing it with default timeouts.
Currently assessing outcome of upping timeouts over a few weeks and seeing if the event reappears.

Thanks
Dafydd


Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Paul Giordano
Posted: 2018-12-05 12:52

This one reared its ugly head at me. I added the parameters Jane mentioned, and now I am getting the message "Process set contains 0 running processes: zenpython". Backing out the zenpython.conf changes doesn't seem to help. Any troubleshooting ideas?

------------------------------
Paul Giordano
Senior Systems Engineer
Zethcon Corporation
------------------------------


Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Jane Curry
Posted: 2018-12-05 15:07

I think the "Process set contains 0 running processes: zenpython" message is from zenprocess rather than zenpython. There were certainly some versions of Zenoss that created these erroneously - what version are you at?

It is basically saying that you have process monitoring configured for a process and that process isn't running - but it's not exactly the most user-friendly event on the planet ;)

Cheers,
Jane

------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------


Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Paul Giordano
Posted: 2018-12-05 15:57

Thanks Jane. We're running 6.1.2. How would I troubleshoot the original problem that this ticket mentioned? This is strange, it just started happening last night. Been running fine before that.

------------------------------
Paul Giordano
Senior Systems Engineer
Zethcon Corporation
------------------------------



Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Paul Giordano
Posted: 2018-12-07 09:44

So, the zenpython logs show the datasources disabled. How do I re-enable them?

------------------------------
Paul Giordano
Senior Systems Engineer
Zethcon Corporation
------------------------------


Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Paul Giordano
Posted: 2018-12-07 09:49

Answered my own question: go to Advanced -> Monitoring Templates, expand ESXiHost, select /Devices/VMWare/ESXiHost, double-click the datasource, then disable it and enable it again. This doesn't fix the original problem, but it resets the disabled datasources.

Interesting, when I restart zenpython I get the following:
2018-12-07 14:57:53,178 INFO zen.python: plugins disabled by watchdog: ['ZenPacks.community.VMwareESXiMonitorPython.datasources.VMwareDataSource.VMwareDataSourcePlugin']
2018-12-07 14:57:53,178 INFO zen.python: starting watchdog with 100.0s timeout
2018-12-07 14:57:53,216 INFO zen.zenpython: Connecting to localhost:8789
2018-12-07 14:57:53,237 INFO zen.zenpython: Connected to the zenhub/0 instance


Still getting the original messages, even after doing the above. I changed blockingwarning to 30 seconds and blockingtimeout to 100, and I am still getting the messages. Any help or pointers appreciated.

------------------------------
Paul Giordano
Senior Systems Engineer
Zethcon Corporation
------------------------------
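
Editor's sketch for spotting which plugins the watchdog has disabled. It relies only on the "plugins disabled by watchdog: [...]" line format shown in the log excerpt above; the function name disabled_plugins is illustrative, and other PythonCollector versions may word the message differently.

```python
import ast
import re

# Matches the bracketed list at the end of the startup message.
WATCHDOG_RE = re.compile(r"plugins disabled by watchdog: (\[.*\])")


def disabled_plugins(log_text):
    """Return plugin names listed in watchdog startup lines, in order, without repeats."""
    plugins = []
    for line in log_text.splitlines():
        match = WATCHDOG_RE.search(line)
        if not match:
            continue
        # The list is printed as a Python literal, so literal_eval can read it safely.
        for name in ast.literal_eval(match.group(1)):
            if name not in plugins:
                plugins.append(name)
    return plugins


sample = (
    "2018-12-07 14:57:53,178 INFO zen.python: plugins disabled by watchdog: "
    "['ZenPacks.community.VMwareESXiMonitorPython.datasources."
    "VMwareDataSource.VMwareDataSourcePlugin']\n"
    "2018-12-07 14:57:53,178 INFO zen.python: starting watchdog with 100.0s timeout\n"
)
print(disabled_plugins(sample))
```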



Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Jason Olson
Posted: 2018-12-07 11:07

Paul, I know you mentioned in another thread you were running version 6.1.2; if you can, I'd highly recommend moving to 6.2.1, as version 6.1.x had some problems with memory leaks, exceptions being thrown, zombie threads running at 100% utilization and other weirdness like graphing just....stopping for hours at a time until the next scheduled device remodel. Version 6.2.1 has been pretty solid, with caveats.

------------------------------
Jason Olson
------------------------------


Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Paul Giordano
Posted: 2018-12-07 14:16

Sigh. OK, I'll try to upgrade this weekend, but I'm thinking it's a bigger job than just installing the new code... We'll see.

------------------------------
Paul Giordano
Senior Systems Engineer
Zethcon Corporation
------------------------------


Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Pheripheral Pheripheral
Posted: 2018-12-10 05:44

Hi,

To unblock the blocked datasources, remove the name of the blocked datasource from /var/zenoss/zenpython.blocked on Zenoss 5, or from /opt/zenoss/var/zenpython.blocked on Zenoss 4, and then restart the zenpython daemon.

Full info / discussion on this on the python collector page in the comments:
https://www.zenoss.com/product/zenpacks/pythoncollector

Hope this helps!

------------------------------
Pheripheral Pheripheral
------------------------------
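
Editor's sketch of the manual unblock step described above. It assumes zenpython.blocked is a plain text file with one plugin name per line, which the thread does not confirm, so check the file by hand first. The function name unblock_plugin is illustrative, and the path is passed in rather than hard-coded because it differs between Zenoss versions.

```python
def unblock_plugin(blocked_path, plugin_name):
    """Remove plugin_name from the blocked file; return how many lines were removed.

    Assumes one plugin name per line (an assumption, not confirmed in the thread).
    """
    with open(blocked_path) as handle:
        lines = handle.read().splitlines()
    kept = [line for line in lines if line.strip() != plugin_name]
    with open(blocked_path, "w") as handle:
        handle.write("".join(line + "\n" for line in kept))
    return len(lines) - len(kept)
```

A call such as unblock_plugin("/var/zenoss/zenpython.blocked", "ZenPacks.community.VMwareESXiMonitorPython.datasources.VMwareDataSource.VMwareDataSourcePlugin") would then be followed by a zenpython restart, as the post says.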


Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Jane Curry
Posted: 2018-12-11 05:56

Removing or editing this file is definitely the way to get the datasources collecting again - but it doesn't get to the root cause of the problem, which is that devices are not responding to zenpython fast enough and are then "blocking". The real problem (or it certainly was back when I developed this) is that once a datasource blocks for any device, the datasource is blocked for ALL devices, i.e. it is put in the zenpython.blocked file.

In a perfect world, I need to rewrite some of this code to ensure that it never blocks....

Absolutely no promises - but how many people are affected by this??  Please report here.

Cheers,
Jane

------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------
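
Editor's toy model of the behaviour Jane describes, to make the "one slow device blocks everyone" effect concrete. This is not PythonCollector's code: the class ToyWatchdog, its method names, and the device names are invented for illustration, and the real watchdog works differently internally. It captures only two points made in this thread: blocking is recorded per plugin rather than per device, and a timeout of 0 means the watchdog is not used.

```python
class ToyWatchdog(object):
    """Toy model only: tracks blocked plugins globally, not per device."""

    def __init__(self, blocking_timeout):
        # A timeout of 0 stands for "watchdog disabled".
        self.blocking_timeout = blocking_timeout
        self.blocked = set()

    def record(self, plugin, device, elapsed):
        """Note how long one collection took for one device."""
        if self.blocking_timeout and elapsed > self.blocking_timeout:
            # The device is not stored: the plugin is blocked everywhere.
            self.blocked.add(plugin)

    def should_collect(self, plugin, device):
        return plugin not in self.blocked


watchdog = ToyWatchdog(blocking_timeout=10)
watchdog.record("VMwareDataSourcePlugin", "esxi-rebooting", elapsed=25)
# A healthy host is now skipped too, because the block is per plugin.
print(watchdog.should_collect("VMwareDataSourcePlugin", "esxi-healthy"))
```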


Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Pheripheral Pheripheral
Posted: 2018-12-11 06:32

Hi,

We're certainly affected by this.

Pleasingly, we have not seen it for a while, but that is because we have currently set the blocking timeout to 0, i.e. the blocking watchdog is not used. We can't risk a situation where nobody capable of unblocking the datasource has access to Zenoss for some time.

Thanks
Dafydd



------------------------------
Pheripheral Pheripheral
------------------------------

Subject: RE: Datasources being disabled - ESXiMonitorPython / Python Collector
Author: Austin Culbertson
Posted: 2018-12-12 12:03

This has been something of a pain for myself, as well - I have a task on my end to try and figure out what is causing us to run long, and I suspect it might have something to do with when we upgrade our ZenPacks and update our remote collectors. I have no conclusive evidence, but it is kind of a pain to have to try and track down the exact cause of it (i.e. the point that took too long to run - perhaps it's there and I'm just missing it in my cursory searching, however).

------------------------------
Austin Culbertson
NOC Monitoring Engineer
------------------------------

