TECHZEN Zenoss User Community ARCHIVE  

"Discarded event - queue overflow" messages in zenpython log

Subject: "Discarded event - queue overflow" messages in zenpython log
Author: Larry
Posted: 2020-05-22 01:30

We are running Zenoss 6.2.1 (Community). Recently we started getting a lot of these messages in the zenpython service log:

WARNING zen.zenpython: Discarded event - queue overflow: .....[text cut]

The queue length in MetricConsumer is at 0, and the queue length in rediscollector seems low (average of 50?).

The rabbitmq log has a lot of "shutdown_error" messages:
SUPERVISOR REPORT==== 22-May-2020::05:01:50 ===
     Supervisor: {<0.8363.23>,rabbit_channel_sup_sup}
     Context:    shutdown_error
     Reason:     shutdown
     Offender:   [{nb_children,1},
                  {name,channel_sup},
                  {mfargs,{rabbit_channel_sup,start_link,[]}},
                  {restart_type,temporary},
                  {shutdown,infinity},
                  {child_type,supervisor}]


This might be the root cause (or at least related to the problem).

Any ideas?

Thanks in advance,



------------------------------
Larry
------------------------------


Subject: RE: "Discarded event - queue overflow" messages in zenpython log
Author: Michael Rogers
Posted: 2020-06-02 16:35

Larry,

MetricConsumer and CollectorRedis are both used to pass performance metrics from their source collector service back to OpenTSDB and shouldn't have any effect on event data.

After a collector service (zenpython, in this case) generates an event, it sends it to zenhub for validation and entry into the event processing pipeline (zeneventd, zeneventserver, the zenoss_zep database in MariaDB, and zenactiond should a trigger match).  If zenhub is not available to receive an incoming event, the collector service temporarily caches the event in an internal event queue.  By default, this queue holds up to 5000 events; once it is full, events are ejected in FIFO fashion, so each new event pushes out the oldest one still waiting.
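
To make the overflow behaviour concrete, here is a minimal sketch of a bounded FIFO cache of the kind described above. It is not the zenpython implementation: the class name BoundedEventQueue and its methods are hypothetical, and only the 5000 default and the FIFO ejection come from the description.

```python
from collections import deque


class BoundedEventQueue:
    """Hypothetical stand-in for a collector's internal event cache."""

    def __init__(self, maxlen=5000):
        # 5000 mirrors the default mentioned above; everything else is assumed.
        self._events = deque()
        self._maxlen = maxlen
        self.discarded = 0  # running count of events lost to overflow

    def append(self, event):
        """Queue an event. Return the ejected event if the queue was full, else None."""
        ejected = None
        if len(self._events) >= self._maxlen:
            # Queue is full: drop the oldest event (FIFO) to make room.
            ejected = self._events.popleft()
            self.discarded += 1
        self._events.append(event)
        return ejected

    def drain(self):
        """Return all cached events, oldest first, and empty the queue."""
        events = list(self._events)
        self._events.clear()
        return events

    def __len__(self):
        return len(self._events)
```

In this model, a steady stream of overflow warnings means events are arriving faster than they can be handed off for long enough to fill the whole cache, which points at the receiving side (zenhub) rather than at the metric queues.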

If you're seeing channel shutdown errors in RabbitMQ, it's likely the result of a Rabbit queue consumer disconnecting.  Whether that's through an error or the result of that service restarting is hard to tell from here.

I would double-check your zenhub log to see if it has any complaints around the same time as the queue overflow messages.
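
If the logs are large, a short script can do the cross-check. The sketch below assumes each log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp and contains its level name as plain text; verify that against your own logs first. The function names and the two-minute window are arbitrary choices for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed line prefix, e.g. "2020-05-22 05:01:50,123 WARNING ..."
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    match = TS_RE.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")


def overflow_times(zenpython_lines):
    """Collect timestamps of the 'queue overflow' warnings."""
    times = []
    for line in zenpython_lines:
        if "queue overflow" in line:
            ts = parse_ts(line)
            if ts is not None:
                times.append(ts)
    return times


def lines_near(zenhub_lines, times, window=timedelta(minutes=2),
               levels=("WARNING", "ERROR", "CRITICAL")):
    """Return zenhub lines at the given levels within `window` of any overflow time."""
    hits = []
    for line in zenhub_lines:
        ts = parse_ts(line)
        if ts is None or not any(level in line for level in levels):
            continue
        if any(abs(ts - t) <= window for t in times):
            hits.append(line)
    return hits
```

Feed it the lines of the two logs (for example via open(path).readlines() on exported copies) and read whatever comes back from lines_near first.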

If you'd like to know more about event processing, we've got a handy video that covers the event pipeline here:


Let us know if it helps.



------------------------------
Michael J. Rogers
Senior Instructor - Zenoss
Austin TX
------------------------------

