TECHZEN Zenoss User Community ARCHIVE  

Zenoss Core 4 Out of Resources Issue

Subject: Zenoss Core 4 Out of Resources Issue
Author: Ken Jenkins
Posted: 2015-07-23 10:09

I reported in another topic that there was an issue with Zenoss Core 4 where the zenoss login reports the following:

2015-07-23 00:49:31,394 INFO zen.ZenHub: Worker (11858) reports /opt/zenoss/bin/zenfunctions: fork: retry: Resource temporarily unavailable

I am opening up a new topic because I seem to be blocked on the old one. :(

Before Zenoss crashed, this was my report of files in use and process / thread counts.

NOTE: Python stands out on thread count at 866.

Thu Jul 23 04:35:02 UTC 2015
---------------------
Checking /opt/zenoss/log/zenhub.log for an issue with zenoss login - Resource temporarily unavailable ..

2015-07-23 00:49:31,394 INFO zen.ZenHub: Worker (11858) reports /opt/zenoss/bin/zenfunctions: fork: retry: Resource temporarily unavailable
Zenoss Processes Running = 18

Zenoss is running ...

---------------------
Free memory check ...

total used free shared buffers cached
Mem: 16334352 4803996 11530356 212 184964 732244
-/+ buffers/cache: 3886788 12447564
Swap: 835580 0 835580
---------------------
Zenoss processes running check ...

32
---------------------
Zenoss files in use check ...

By User ID ...
--------------
USER = dbus Count = 4 ...
USER = memcached Count = 4 ...
USER = mysql Count = 61 ...
USER = ntp Count = 13 ...
USER = rabbitmq Count = 20 ...
USER = root Count = 496 ...
USER = rpc Count = 12 ...
USER = rpcuser Count = 4 ...
USER = smmsp Count = 4 ...
USER = zenoss Count = 4066 ...

By User COMMAND ...
--------------
COMMAND = agetty Count = 4 ...
COMMAND = aio/0 Count = 4 ...
COMMAND = aio/1 Count = 4 ...
COMMAND = aio/2 Count = 4 ...
COMMAND = aio/3 Count = 4 ...
COMMAND = async/mgr Count = 4 ...
COMMAND = ata_aux Count = 4 ...
COMMAND = ata_sff/0 Count = 4 ...
COMMAND = ata_sff/1 Count = 4 ...
COMMAND = ata_sff/2 Count = 4 ...
COMMAND = ata_sff/3 Count = 4 ...
COMMAND = auditd Count = 8 ...
COMMAND = automount Count = 4 ...
COMMAND = awk Count = 0 ...
COMMAND = bdi-defau Count = 4 ...
COMMAND = beam.smp Count = 4 ...
COMMAND = certmonge Count = 4 ...
COMMAND = cgroup Count = 4 ...
COMMAND = crond Count = 8 ...
COMMAND = crypto/0 Count = 4 ...
COMMAND = crypto/1 Count = 4 ...
COMMAND = crypto/2 Count = 4 ...
COMMAND = crypto/3 Count = 4 ...
COMMAND = dbus-daem Count = 4 ...
COMMAND = deferwq Count = 4 ...
COMMAND = dhclient Count = 4 ...
COMMAND = egrep Count = 0 ...
COMMAND = epmd Count = 4 ...
COMMAND = events/0 Count = 4 ...
COMMAND = events/1 Count = 4 ...
COMMAND = events/2 Count = 4 ...
COMMAND = events/3 Count = 4 ...
COMMAND = ext4-dio- Count = 8 ...
COMMAND = flush-253 Count = 4 ...
COMMAND = inet_geth Count = 8 ...
COMMAND = init Count = 25 ...
COMMAND = java Count = 329 ...
COMMAND = jbd2/dm-0 Count = 4 ...
COMMAND = jbd2/vda1 Count = 4 ...
COMMAND = kacpid Count = 4 ...
COMMAND = kacpi_hot Count = 4 ...
COMMAND = kacpi_not Count = 4 ...
COMMAND = kauditd Count = 4 ...
COMMAND = kblockd/0 Count = 4 ...
COMMAND = kblockd/1 Count = 4 ...
COMMAND = kblockd/2 Count = 4 ...
COMMAND = kblockd/3 Count = 4 ...
COMMAND = kdmflush Count = 8 ...
COMMAND = kdmremove Count = 4 ...
COMMAND = khelper Count = 4 ...
COMMAND = khubd Count = 4 ...
COMMAND = khugepage Count = 4 ...
COMMAND = khungtask Count = 4 ...
COMMAND = kintegrit Count = 16 ...
COMMAND = kpsmoused Count = 4 ...
COMMAND = kseriod Count = 4 ...
COMMAND = ksmd Count = 4 ...
COMMAND = ksoftirqd Count = 16 ...
COMMAND = kstriped Count = 4 ...
COMMAND = ksuspend_ Count = 4 ...
COMMAND = kswapd0 Count = 4 ...
COMMAND = kthreadd Count = 4 ...
COMMAND = kthrotld/ Count = 16 ...
COMMAND = linkwatch Count = 4 ...
COMMAND = lsof Count = 26 ...
COMMAND = md/0 Count = 4 ...
COMMAND = md/1 Count = 4 ...
COMMAND = md/2 Count = 4 ...
COMMAND = md/3 Count = 4 ...
COMMAND = md_misc/0 Count = 4 ...
COMMAND = md_misc/1 Count = 4 ...
COMMAND = md_misc/2 Count = 4 ...
COMMAND = md_misc/3 Count = 4 ...
COMMAND = memcached Count = 4 ...
COMMAND = migration Count = 16 ...
COMMAND = mingetty Count = 24 ...
COMMAND = mysqld Count = 8 ...
COMMAND = mysqld_sa Count = 4 ...
COMMAND = netns Count = 4 ...
COMMAND = ntpd Count = 4 ...
COMMAND = oddjobd Count = 4 ...
COMMAND = pm Count = 8 ...
COMMAND = python Count = 3487 ...
COMMAND = rabbitmq- Count = 4 ...
COMMAND = redis-ser Count = 13 ...
COMMAND = rpcbind Count = 4 ...
COMMAND = rpc.statd Count = 4 ...
COMMAND = rrdcached Count = 48 ...
COMMAND = rsyslogd Count = 4 ...
COMMAND = runuser Count = 4 ...
COMMAND = runzope Count = 183 ...
COMMAND = scsi_eh_0 Count = 4 ...
COMMAND = scsi_eh_1 Count = 4 ...
COMMAND = sendmail Count = 8 ...
COMMAND = sh Count = 102 ...
COMMAND = snmpd Count = 4 ...
COMMAND = sort Count = 0 ...
COMMAND = sshd Count = 4 ...
COMMAND = sssd Count = 28 ...
COMMAND = sssd_be Count = 4 ...
COMMAND = sssd_nss Count = 4 ...
COMMAND = sssd_pac Count = 4 ...
COMMAND = sssd_pam Count = 4 ...
COMMAND = sssd_ssh Count = 4 ...
COMMAND = sssd_sudo Count = 4 ...
COMMAND = stopper/0 Count = 4 ...
COMMAND = stopper/1 Count = 4 ...
COMMAND = stopper/2 Count = 4 ...
COMMAND = stopper/3 Count = 4 ...
COMMAND = sync_supe Count = 4 ...
COMMAND = udevd Count = 12 ...
COMMAND = uniq Count = 0 ...
COMMAND = usbhid_re Count = 4 ...
COMMAND = vballoon Count = 4 ...
COMMAND = virtio-bl Count = 4 ...
COMMAND = virtio-ne Count = 4 ...
COMMAND = watchdog/ Count = 16 ...
COMMAND = zenoss_pe Count = 29 ...

---------------------
Zenoss threads in use ...
13290 java -server -XX:+HeapDumpO 54
13677 /opt/zenoss/bin/python /opt 2
13679 /opt/zenoss/bin/python /opt 5
13685 /usr/bin/rrdcached -b /opt/ 16
13842 /opt/zenoss/bin/python /opt 1
13991 /opt/zenoss/bin/python /opt 5
14003 /opt/zenoss/bin/python /opt 1
14004 /opt/zenoss/bin/python /opt 1
14129 /opt/zenoss/bin/python /opt 1
14144 /opt/zenoss/bin/python /opt 1
14145 /opt/zenoss/bin/python /opt 1
14279 /opt/zenoss/bin/python /opt 1
14536 /opt/zenoss/bin/python /opt 1
14564 /opt/zenoss/bin/python /opt 11
14645 /opt/zenoss/bin/python /opt 1
14872 /opt/zenoss/bin/python /opt 1
14892 /opt/zenoss/bin/python /opt 1
14998 /opt/zenoss/bin/python /opt 1
15064 /opt/zenoss/bin/python /opt 1
15093 /opt/zenoss/bin/python /opt 1
15098 /usr/sbin/redis-server /opt 3
15137 /opt/zenoss/bin/python /opt 1
15141 java -server -Xmx512m -cp . 22
15209 /opt/zenoss/bin/python /opt 868
25987 /opt/zenoss/bin/python /opt 1
30214 /opt/zenoss/bin/python /opt 1
30796 CROND 1
30797 /bin/sh -c /home/zenoss/bin 1
30798 /bin/bash /home/zenoss/bin/ 1
---------------------
Root mysql threads in use ...
1284 /bin/sh /usr/bin/mysqld_saf 1
---------------------
Mysql login mysql threads in use ...
1421 /usr/sbin/mysqld --basedir= 54
---------------------
Zenoss rabbitmq threads in use ...
1504 /bin/sh /usr/sbin/rabbitmq- 1
---------------------
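
The per-process thread counts above can also be gathered directly from /proc. On Linux the nproc limit counts threads, not just processes, for a user, so a single process with hundreds of threads is one common way to hit "fork: retry: Resource temporarily unavailable". The following is only a minimal sketch, assuming a Linux host; the function names parse_status and threads_for_user are made up for illustration and are not part of Zenoss.

```python
import os
import pwd


def parse_status(text):
    """Return (name, uid, threads) from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    name = fields.get("Name", "?")
    uid = int(fields["Uid"].split()[0])  # first entry is the real UID
    threads = int(fields.get("Threads", "1"))
    return name, uid, threads


def threads_for_user(username, proc_root="/proc"):
    """Return [(threads, pid, name)] for the user's processes, largest first."""
    uid = pwd.getpwnam(username).pw_uid
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                name, owner, threads = parse_status(handle.read())
        except (OSError, KeyError, ValueError):
            continue  # the process exited or the file was unreadable
        if owner == uid:
            rows.append((threads, int(entry), name))
    return sorted(rows, reverse=True)
```

Calling threads_for_user("zenoss") on the server and summing the first column would give the total that counts against the zenoss user's nproc limit. The first row would be the process to look at, as PID 15209 is in the listing above.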



Subject: As noted in the other thread,
Author: Andrew Kirch
Posted: 2015-07-23 14:17

As noted in the other thread, update to the latest 4.2.5 RPS and let's see if these issues clear up.

Andrew Kirch

akirch@gvit.com

Need Zenoss support, consulting or custom development? Look no further. Email or PM me!

Ready for Distributed Topology (collectors) for Zenoss 5? Coming May 1st from GoVanguard



Subject: 203 isn't latest, http://wiki
Author: Andrew Kirch
Posted: 2015-07-23 14:25

Revision 203 isn't the latest; see http://wiki.zenoss.org/ZenUp




Subject: Some issue with the ZUP install
Author: Ken Jenkins
Posted: 2015-07-24 10:33

After a successful dry run I see this when I install.

....
Starting zeneventserver...
starting...
Waiting for zeneventserver to start...
2015-07-24 15:27:11: waiting a maximum of 15 seconds for ZEP to be available...
...............
2015-07-24 15:27:29: is ZEP available: False

Error while running command install
Traceback (most recent call last):
File "/opt/zenup/python/zenup/zenupcli.py", line 51, in __init__
run()
File "/opt/zenup/python/zenup/zenupcli.py", line 449, in _install
force=self.args.force)
File "/opt/zenup/python/zenup/zenupapi.py", line 308, in install
result = product.install(zup, force)
File "/opt/zenup/python/zenup/zupproduct.py", line 625, in install
self._run_scripts(zup, "post", auditer)
File "/opt/zenup/python/zenup/zupproduct.py", line 687, in _run_scripts
raise e
ArchiveError: Script "['/tmp/tmpCR6EBv/zenoss_core-4.2.5-SP457-zenup11.zup_HQdDV3/post/run_migrates', '4.2.5']" returned a non-zero exit code 1.
[zenoss@ ~]$

Any suggestions?



Subject: I ran zenup status and it
Author: Ken Jenkins
Posted: 2015-07-24 11:09

I ran zenup status and it looks like the migrate function failed ...

[zenoss@ ~]$ zenup status

Product: zenoss-core-4.2.5 (id = zenoss-core-4.2.5)
Home: /opt/zenoss
Revision: 203
Upgrading: 457
Last Attempted Step: post/run_migrates 4.2.5
Minimum: 203
Updated On: Wed Apr 29 03:28:53 2015
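
The status output above shows a half-applied upgrade: Revision is still 203 while Upgrading is 457 and a Last Attempted Step is recorded. A rough sketch of checking for that state from a script, assuming the output keeps the "Key: value" layout shown in this thread (the helper names are hypothetical, not part of ZenUp):

```python
def parse_zenup_status(text):
    """Turn 'Key: value' lines from `zenup status` output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info


def upgrade_in_progress(info):
    """True when the status still names a target revision being applied."""
    return info.get("Upgrading", "None") != "None"


# Sample text copied from the status output in this thread.
sample = """Product: zenoss-core-4.2.5 (id = zenoss-core-4.2.5)
Home: /opt/zenoss
Revision: 203
Upgrading: 457
Last Attempted Step: post/run_migrates 4.2.5
Minimum: 203
Updated On: Wed Apr 29 03:28:53 2015"""

info = parse_zenup_status(sample)
if upgrade_in_progress(info):
    print("Stuck at:", info.get("Last Attempted Step"))
```

On a cleanly upgraded system the Upgrading field reads None, so upgrade_in_progress would return False.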



Subject: I checked zenoss and it is
Author: Ken Jenkins
Posted: 2015-07-24 11:22

I checked zenoss and it is not started ... I ran another dry-run and see this ...

[zenoss@ ~]$ zenoss status
Daemon: zeneventserver program running; pid=19221
Daemon: zopectl not running
Daemon: zenrrdcached not running
Daemon: zenhub not running
Daemon: zenjobs not running
Daemon: zeneventd not running
Daemon: zenping not running
Daemon: zensyslog not running
Daemon: zenstatus not running
Daemon: zenactiond not running
Daemon: zentrap not running
Daemon: zenmodeler not running
Daemon: zenperfsnmp not running
Daemon: zencommand not running
Daemon: zenprocess not running
Daemon: zredis not running
Daemon: zenjmx not running
Daemon: zenpython not running
[zenoss@ ~]$ zenup install /tmp/zenoss_core-4.2.5-SP457-zenup11.zup --dry-run
WARNING: You will not be able to downgrade after installing this zup. Press ENTER to continue or to quit.

Install Steps:
1) Non-Destructive Pre-Install Steps:
[c] A) Some general sanity checks (check/sanity_checks)
[c] B) Check nginx configuration (check/check_nginx_config)
[c] C) Check database permissions (check/database_perms)
[c] D) Check to see if mibs needs to be updated (check/check_mibs)
[c] E) Check to see if zenmigrate will need to run (check/check_migrate)
[c] F) Check for edge case of new files added after an RPM upgrade (check/check_new_files)
[c] G) Check to see that the patch utility exists (check/check_patch_exists)
[c] H) Check to see if zenpacks need to be updated (check/check_zenpack_updates)
2) Staging Steps:
[c] A) Set variables & stage things for consumption by the post lifecycle scripts (pre/pre)
3) File Patching:
[c] A) Lay down changes (File Patching)
4) Post-Install Steps:
[c] A) Sanity checks (post/sanity)
[c] B) Apply patches that need to be outside the normal ZenUp patching process (post/patch_post_install_files)
[c] C) Install binaries into webapps (post/install_changed_custom_blobs webapps)
[c] D) Remove binary files that are no longer needed (post/custom_file_remover)
[c] E) Install binaries into protocols (post/install_changed_custom_blobs lib/python/zenoss/protocols)
[c] F) If necessary, install recompiled javascript bundle (post/install_rebuilt_js)
[c] G) Install external depended-upon eggs (post/external_eggs)
[c] H) Update zenpacks (post/update_zenpacks $ZENPACK_UPDATE_LIST)
[c] I) If necessary, rebuild new javascript (post/build_javascript)
[i] J) Run zenmigrate (post/run_migrates 4.2.5)
[ ] K) If necessary, run zenmib (post/install_mibs)
[ ] L) Wrap-up text to console (post/finish)
[ ] M) Truncate connection_info table (post/run_zencheckzends_truncate)
(c = complete, i = incomplete)
NOTE: Pre-install & staging steps (1 + 2) are executed on every install attempt

In-progress update detected - will replay pre-install & staging steps, then pick up at the last previously attempted step (post/run_migrates 4.2.5). Press ENTER to continue or to quit
No changes have been applied to your system
Error while running command install
Traceback (most recent call last):
File "/opt/zenup/python/zenup/zenupcli.py", line 51, in __init__
run()
File "/opt/zenup/python/zenup/zenupcli.py", line 449, in _install
force=self.args.force)
File "/opt/zenup/python/zenup/zenupapi.py", line 300, in install
product.config.upgrading)
TypeError: not all arguments converted during string formatting
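
For context on the last line of that traceback: in Python this TypeError is raised when the % operator is given more values than the format string has placeholders. The source line in zenupapi.py is not shown in the thread, so the template below is purely hypothetical and only illustrates the general mechanism.

```python
def format_message(template, values):
    """Apply %-formatting and report a TypeError instead of raising it."""
    try:
        return template % values
    except TypeError as err:
        return "TypeError: %s" % err


# Hypothetical template with one placeholder but two values supplied.
broken = format_message("Resuming upgrade to revision %s",
                        ("457", "post/run_migrates 4.2.5"))

# Same values with a matching number of placeholders.
fixed = format_message("Resuming upgrade to revision %s at step %s",
                       ("457", "post/run_migrates 4.2.5"))

print(broken)
print(fixed)
```

In other words, the error here looks like a bug in how the message is built during a resumed dry run, rather than a sign of a new problem with the upgrade itself, though that is an inference from the traceback alone.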



Subject: I decided to start up zenoss
Author: Ken Jenkins
Posted: 2015-07-24 11:31

I decided to start up zenoss to prevent it from being down.

Zenoss starts but apparently the portal is not coming back up.

Any assistance to finish the post migration steps is appreciated.

Thank you,
Ken



Subject: Ken,
Author: Andrew Kirch
Posted: 2015-07-24 11:37

Ken,

Shut it down, rerun zenup, and paste the output.




Subject: It works now! Thank you!
Author: Ken Jenkins
Posted: 2015-07-24 12:25

Thank you! It worked.

After a few runs, for some reason, zenup did not catch file permission issues. I resolved them.

Starting zeneventserver...
starting...
Waiting for zeneventserver to start....
2015-07-24 16:51:43: waiting a maximum of 15 seconds for ZEP to be available...

2015-07-24 16:51:51: is ZEP available: True

Running zenmigrate...
INFO:zen.migrate:Will execute these steps: AddMissedRunsMonitorGraph
INFO:zen.migrate:Installing AddMissedRunsMonitorGraph (4.2.5)
INFO:zen.migrate:Committing changes
INFO:zen.migrate:Migration successful
INFO:zen.migrate:Will execute these steps: AddSnmpV3EngineIdPlugin
INFO:zen.migrate:Installing AddSnmpV3EngineIdPlugin (4.2.5)
INFO:zen.migrate:Committing changes
INFO:zen.migrate:Migration successful
INFO:zen.migrate:Will execute these steps: EventFlapping
INFO:zen.migrate:Installing EventFlapping (4.2.5)
INFO:zen.migrate:Committing changes
INFO:zen.migrate:Migration successful
INFO:zen.migrate:Will execute these steps: EventsAppZenoss
INFO:zen.migrate:Installing EventsAppZenoss (4.2.5)
INFO:zen.migrate:Committing changes
INFO:zen.migrate:Migration successful
INFO:zen.migrate:Will execute these steps: ProcessCountThreshold
INFO:zen.migrate:Installing ProcessCountThreshold (4.2.5)
INFO:zen.migrate:Committing changes
INFO:zen.migrate:Migration successful
/zport/dmd/Devices/rrdTemplates/OSProcess/thresholds/count ['count_count']
/zport/dmd/Devices/Server/Microsoft/rrdTemplates/OSProcess/thresholds/count ['process_count']
INFO:zen.migrate:Will execute these steps: RegenerateComponentSearchCatalog
INFO:zen.migrate:Installing RegenerateComponentSearchCatalog (4.2.5)
INFO:zen.migrate:Committing changes
INFO:zen.migrate:Migration successful
INFO:zen.migrate:Will execute these steps: RemoveIgnoreParametersFromOsProcessClass
INFO:zen.migrate:Installing RemoveIgnoreParametersFromOsProcessClass (4.2.5)
INFO:zen.migrate:Removing ignoreParameters and ignoreParametersWhenModeling from all OSProcessClass objects
INFO:zen.migrate:Committing changes
INFO:zen.migrate:Migration successful
INFO:zen.migrate:Will execute these steps: UpdateMW
INFO:zen.migrate:Installing UpdateMW (4.2.5)
INFO:zen.migrate:Committing changes
INFO:zen.migrate:Migration successful
INFO:zen.migrate:Will execute these steps: addLockingToProcesses
INFO:zen.migrate:Installing addLockingToProcesses (4.2.5)
INFO:zen.migrate:Committing changes
INFO:zen.migrate:Migration successful
INFO:zen.migrate:Will execute these steps: addzSnmpContext
INFO:zen.migrate:Installing addzSnmpContext (4.2.5)
INFO:zen.migrate:Committing changes
INFO:zen.migrate:Migration successful
INFO:zen.migrate:Will execute these steps: zSnmpDiscoveryPorts
INFO:zen.migrate:Installing zSnmpDiscoveryPorts (4.2.5)
INFO:zen.migrate:Committing changes
INFO:zen.migrate:Migration successful

---------------------------------------------------------------------------
UPGRADE SUCCESSFUL!

2.) Start zenoss
zenoss start

3.) Please update all distributed hubs and collectors.
---------------------------------------------------------------------------

Running 'zencheckzends truncate'

Updating product information
Success!

zenup status

Product: zenoss-core-4.2.5 (id = zenoss-core-4.2.5)
Home: /opt/zenoss
Revision: 457
Upgrading: None
Minimum: 457
Updated On: Fri Jul 24 16:53:19 2015



Subject: Status
Author: Ken Jenkins
Posted: 2015-07-27 10:25

Zenoss crashed again. After the Zenoss Core 4 patching, Zenoss seems to have stayed up a bit longer than last time, but we still had a crash (out of resources).

Any suggestions for isolating this resource issue are appreciated.

Thank you,
Ken



Subject: what I did
Author: Ken Jenkins
Posted: 2015-07-27 10:27

I bumped nproc from 2048 to 4096 for the zenoss user, bumped the default nproc from 1024 to 2048, and rebooted.

I will monitor Zenoss to see how long it stays up this time with this change.
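
For anyone checking a similar change, here is a small sketch that reads limits.conf-style text and reports which nproc value would apply to a user. The thread does not say which file was edited or whether soft or hard limits were changed, so the sample lines below are an assumption that only reuses the numbers mentioned above. The sketch also simplifies the real rules: it handles plain user names and the "*" wildcard only, with a user-specific line taking priority over the wildcard.

```python
def parse_limits(text):
    """Return (domain, limit_type, item, value) tuples from limits.conf-style text."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        parts = line.split()
        if len(parts) == 4:
            entries.append(tuple(parts))
    return entries


def nproc_for(user, entries, limit_type="soft"):
    """Pick the nproc value for a user; a user line wins over the '*' default."""
    user_value = None
    default_value = None
    for domain, kind, item, value in entries:
        if item != "nproc" or kind not in (limit_type, "-"):
            continue
        if domain == user:
            user_value = value
        elif domain == "*":
            default_value = value
    return user_value if user_value is not None else default_value


# Assumed example lines; the values match the change described above.
sample = """
# default for all users
*       soft  nproc  2048
zenoss  soft  nproc  4096
"""

entries = parse_limits(sample)
print("zenoss:", nproc_for("zenoss", entries))
print("mysql:", nproc_for("mysql", entries))
```

The limit a running daemon actually has can differ from what the files say, for example if it was started before the change, so it is worth confirming with `ulimit -u` in a fresh zenoss shell after the reboot.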



Subject: Keep me in the loop.
Author: Andrew Kirch
Posted: 2015-07-29 08:53

Keep me in the loop.




Subject: unless there is a serious bug
Author: Andrew Kirch
Posted: 2015-07-31 11:20

Unless there is a serious bug in a ZenPack, we will not ship an update for it via a ZUP. The latest ZenPacks are always on the wiki ZenPack Catalog: http://wiki.zenoss.org/ZenPack_Catalog




Subject: Thanks for the information.
Author: Ken Jenkins
Posted: 2015-07-31 11:28

The thread pooling issue crashing Zenoss sure seemed serious enough as it relates to the MySQL Monitoring ZenPack, but maybe the seriousness of this fix was not reported adequately by the community when the bug was fixed. =)



Subject: We depend on feedback to
Author: Andrew Kirch
Posted: 2015-08-03 09:12

We depend on feedback to determine the severity of a bug, and as you rightly point out, accurate bug reporting is critical. :)



