Subject: RE: Scan stopped - collection time exceeded interval
Author: Jane Curry
Posted: 2017-07-14 06:33
Sounds like you are fundamentally asking too much of a single collector. If you have 550 switches - and they may all have LOTS of interfaces - then this is a heavy load.
What version of Zenoss are you on? Presumably your traffic is largely SNMP?
Look at the Collector / Control Center performance graphs. On Zenoss 5: ADVANCED -> Control Center, select localhost, and change the drop-down to Graphs. On Zenoss 4: ADVANCED -> Collectors -> localhost -> Performance. See what zenperfsnmp is doing there.
I am a bit surprised that raising the polling interval didn't help. How long did you leave it with the interval at 900 seconds? It can take quite a while before the config change reaches all devices. Also, I suspect your performance templates still have the default 300s poll, and that is what will be causing the load: 550 devices, all with lots of interfaces. (If you are on Zenoss 4, don't change the polling interval in the performance templates or you will stop data being collected successfully; this is a limitation of RRD.)
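As a rough back-of-envelope sketch of why the interval matters so much: the interface and datapoint counts below are made-up assumptions for illustration, not figures from your system, and the function names are hypothetical. Plug in your own numbers.

```python
def total_datapoints(devices, interfaces_per_device, datapoints_per_interface):
    """Number of values the collector has to fetch in one polling cycle."""
    return devices * interfaces_per_device * datapoints_per_interface


def required_rate(devices, interfaces_per_device, datapoints_per_interface, interval_s):
    """Average datapoints per second needed to finish a cycle within the interval."""
    total = total_datapoints(devices, interfaces_per_device, datapoints_per_interface)
    return total / interval_s


if __name__ == "__main__":
    DEVICES = 550                 # from the thread
    INTERFACES_PER_DEVICE = 48    # assumption: a typical 48-port switch
    DATAPOINTS_PER_INTERFACE = 8  # assumption: depends on your interface template

    for interval in (300, 900):
        rate = required_rate(DEVICES, INTERFACES_PER_DEVICE,
                             DATAPOINTS_PER_INTERFACE, interval)
        print("interval %ds -> about %.0f datapoints/second" % (interval, rate))
```

If the rate the collector can actually sustain (see the zenperfsnmp graphs) is below the figure for your interval, the cycle will overrun however long you wait.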
Is it an option to put half your devices into the Decommissioned production state to stop the performance polling? (That is the only default state that will actually stop performance polling.) That would at least prove whether it is simply the number of devices / interfaces that is killing you, or whether there is something more subtle going on.
Cheers,
Jane
------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------