TECHZEN Zenoss User Community ARCHIVE  

Processes being monitored always show status "up" even if down?

Subject: Processes being monitored always show status "up" even if down?
Author: [Not Specified]
Posted: 2014-02-08 12:02

Hi guys,

Okay, so I configured Zenoss to monitor a specific process: I added the correct regex, saved the process, and restarted zenprocess. Then I went to my device and remodeled it. It detected the running process and added it to the "OS Processes" list. Then I killed that process, but Zenoss still shows it as "up". I even waited several hours to rule out a polling delay, but the process is still shown as "up".

In fact, the only way I am able to remove that process from the list is to remodel the device while the process is not running. And that obviously removes the process entry altogether rather than showing it as "down".

How do I get Zenoss to report the accurate status of the process? It just shows "up" no matter what the actual state of the process is. (I have even tried restarting Zenoss, and locking the device after modeling, but to no avail.)

BTW, I am running Zenoss on an Ubuntu Server machine, and my test client/device is an Ubuntu desktop. (Everything else is working: disk usage, up/down status, memory usage, and so on.)

Thanks in advance for the help!



Subject: Please share your regex and
Author: [Not Specified]
Posted: 2014-02-10 11:34

Please share your regex. Also, does the process you are monitoring use a specific TCP/UDP port? It might be easier to just monitor that instead.
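For anyone who wants to sanity-check the port idea by hand first, here is a minimal Python sketch of a TCP reachability test. It is only an illustration of the concept, not how Zenoss performs its own port checks; the function name `tcp_port_open` and the default timeout are assumptions made up for this example.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only; not part of Zenoss.
    """
    try:
        # create_connection performs the TCP handshake; a refused or
        # timed-out connection raises an OSError subclass.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling something like `tcp_port_open("my-device", 8080)` (hostname and port are placeholders) before and after killing the process should flip from True to False if the process really owns that port. This only works for TCP; a UDP service would need a protocol-specific probe.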

Hydruid



Subject: Same problem here
Author: [Not Specified]
Posted: 2014-03-07 01:55

Did you ever find a solution for this? I've got a new Zenoss 4.2.4 install with SP405 applied. I can add a new OS process and, as Paradox says, modeling will detect the process; however, killing the process does not generate an event. I've toggled the monitoring flag for the component on and off with no change.

regex: agent_manager
example: /usr/local/rvm/gems/ruby-1.9.3-p484@cloud_manager/bin/agent_manager
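A quick way to rule out the pattern itself is to test it offline. The sketch below assumes the pattern is applied as an unanchored search against the full command line (an assumption about matching behaviour, not something confirmed in this thread); the `mongod` command line is a made-up negative example.

```python
import re

# Pattern and example command line as posted above.
PATTERN = "agent_manager"
CMDLINE = "/usr/local/rvm/gems/ruby-1.9.3-p484@cloud_manager/bin/agent_manager"


def matches(pattern, command_line):
    """Return True if pattern is found anywhere in command_line.

    Assumes an unanchored search; this is an illustration, not Zenoss code.
    """
    return re.search(pattern, command_line) is not None


if __name__ == "__main__":
    print(matches(PATTERN, CMDLINE))
    # Hypothetical unrelated process, used only as a negative check.
    print(matches(PATTERN, "/usr/bin/mongod --config /etc/mongod.conf"))
```

If the pattern matches the real command line and rejects unrelated ones here, the regex is unlikely to be why the status never changes.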

Any ideas? This is a pretty big roadblock. Oh, I should also mention that I'm using SSH, not SNMP.

Thanks in advance



Subject: On the custom OS process,
Author: [Not Specified]
Posted: 2014-03-07 12:22

On the custom OS process, what did you set for the following values:

Enable Monitoring (zMonitor)
Failure Event Severity (zFailSeverity)

I would set them as follows:
zMonitor = Local Value Yes
zFailSeverity = Local Value Error

Hydruid



Subject: Are you sure that your regex
Author: [Not Specified]
Posted: 2014-03-10 09:15

Are you sure that your regex matches?

Try this regex tester out: http://www.myregextester.com/

Hydruid



Subject: Yeah, the regexes match
Author: [Not Specified]
Posted: 2014-03-10 15:44

That's how the modeler detected them in the first place. I don't think I have a regex matching issue as far as I can tell. They are pretty basic. Zenoss just isn't detecting when a process goes away. On that note, it does appear to be at least occasionally detecting when this process restarts (I have restart notifications enabled).

Not sure how to detect whether or not it's doing a periodic check with the correct regex though...that might help me troubleshoot the issue.



Subject: I'm guessing that something
Author: [Not Specified]
Posted: 2014-03-11 10:35

I'm guessing that something with the regex is incorrectly detecting that the process is still running. Let me do some testing, since I'm not a regex master, and see if I can list some examples. I know the simplest option is to just put in the process's name (e.g. java).

Hydruid



Subject: Well, one of them is simply...
Author: [Not Specified]
Posted: 2014-03-11 13:20

agent_manager. The other one I had to distinguish because it was also catching mongod :) The other, simpler one behaves exactly the same. Do you know how to set the logging so that it will show all detection attempts, along with the matches it does and doesn't find?



Subject: Update
Author: [Not Specified]
Posted: 2014-03-11 14:37

I'm not the pro in this area of Zenoss, so I set up a simple test. I created a dummy process that I could stop and start easily. Below are the details and my results.

1. Created a new process called pingtest in the Zenoss Web UI. Set the values as:
zMonitor=yes
zAlertOnRestart=Yes
zFailSeverity=Error
Pattern=ping
2. Logged into a Linux server and ran "ping 4.2.2.2"
3. Remodeled the server and boom got a new OS Process
4. Stopped the ping
5. The OS process never errored out.......very interesting

It turns out that if you're using the SSH template, OS Processes are discovered but aren't actually monitored. When using the SNMP template, OS Processes are discovered and are properly monitored.
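To confirm independently what the target host is actually running, a small Python sketch like the one below can be run on the monitored Linux box. It is not how zenprocess works internally; it only shows the kind of up/down decision a monitor would be expected to make. The helper names are made up for this example, and `running_command_lines` assumes a `ps` that accepts `-eo args=` (as procps on Ubuntu does).

```python
import re
import subprocess


def process_status(pattern, command_lines):
    """Return ("up", n) if n > 0 command lines match pattern, else ("down", 0).

    Uses an unanchored regex search; illustration only, not Zenoss code.
    """
    regex = re.compile(pattern)
    count = sum(1 for line in command_lines if regex.search(line))
    return ("up" if count else "down"), count


def running_command_lines():
    """Collect the full command line of every process via `ps -eo args=`."""
    out = subprocess.run(
        ["ps", "-eo", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    # "ping" is the test pattern from the steps above.
    print(process_status("ping", running_command_lines()))
```

Running it while the ping is active and again after stopping it shows the state the device should be reporting. Keep in mind that a short pattern such as `ping` also matches any other command line containing that substring.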

Which template are you using to monitor the servers?

Hydruid



Subject: Additional Notes
Author: [Not Specified]
Posted: 2014-03-11 14:38

The SSH template in Core requires modification to monitor many things, including interface utilization. By default, it only gives you a few basics.

Hydruid



Subject: You really can't go wrong
Author: [Not Specified]
Posted: 2014-03-12 15:16

You really can't go wrong with SNMP ;)

Hydruid


