Subject: | Zenoss 5 Ping Uptime Graph |
Author: | [Not Specified] |
Posted: | 2015-02-27 01:20 |
Hi,
I am having the following issue with the new Zenoss 5 graphs.
Currently most of my devices are ping devices, and I use the ping data source for graphing information. Our requirements are to view average latency and uptime through a graph. Average latency isn't a problem, as it is built into the ping data source; uptime is a different story. (Still weird that Zenoss doesn't have that.)
To get the uptime graph (in v4.2.0), I use the rtt_min data point and manipulate the graph using a custom graph definition:
CDEF:uptime=rtt_min-raw,0,GT,1,0,IF
AREA:uptime#00cc00ff:uptime
Now with Zenoss 5 this option is gone. I completely understand this, as Zenoss 5 doesn't use RRDs anymore.
So now, these are my options:
1. I was thinking of maybe inverting the rtt_losspct datapoint into a graph, so that with 0% packet loss it would graph 1 and with 100% packet loss it would graph 0, but I am not sure how to achieve this.
2. Second option is to create a command data source with the following:
/bin/sh -c "if [ ${dev/getPingStatus} = 0 ]; then echo 'OK|ping=1'; else echo 'OK|ping=0'; fi"
Problem is, command data sources tend to be resource intensive and I have about 6000 ping devices, so I want to stay away from that option.
3. Lastly, maybe that option is just hidden and someone can point the graph manipulation out to me.
Sorry for all the blabber, hope someone can help me with this :)
Stefan
Subject: | You can get uptime using SNMP |
Author: | [Not Specified] |
Posted: | 2015-03-03 23:18 |
You can get uptime using SNMP. Enable SNMP on your devices and create a data source for 1.3.6.1.2.1.1.3.0 (sysUpTime).
Subject: | I thought about using SNMP, |
Author: | [Not Specified] |
Posted: | 2015-03-04 01:55 |
I thought about using SNMP, but about 80% of my devices don't have SNMP on them (they are very basic machine-to-machine communication devices).
So I am forced to use ping. Do you have any other ideas for me?
Subject: | You could modify Zenoss's |
Author: | [Not Specified] |
Posted: | 2015-03-04 09:49 |
You could modify Zenoss's ping task to calculate that info for you.
https://github.com/zenoss/zenoss-prodbin/blob/develop/Products/ZenStatus...
That method is called every time an IP is pinged. You can change the method to:
1. look up the first ping up; if it's not found, store the current time (if the device is up)
2. calculate uptime by subtracting #1 from the current time (or set it to 0 if the device is down)
3. add the result to the existing datapoints
For it to show up on new datasource instances, you'll also have to add the datasource to:
https://github.com/zenoss/zenoss-prodbin/blob/develop/Products/ZenModel/...
Zenoss core includes redis, IIRC. You can use that to store your first ping time.
Alternatively you can write your own collection mechanism.
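The three steps above could be sketched roughly like this. It is only an illustration, not the actual Zenoss ping task code: the names `first_up` and `record_ping` are made up, and a plain dict stands in for redis as the place to keep the first ping time.

```python
import time

# Hypothetical in-memory store standing in for redis:
# device IP -> timestamp of the first successful ping.
first_up = {}


def record_ping(ip, is_up, now=None):
    """Return seconds of continuous uptime for `ip` after this ping result."""
    now = time.time() if now is None else now
    if not is_up:
        # Device is down: forget the start time and report zero uptime.
        first_up.pop(ip, None)
        return 0.0
    # Step 1: remember the first time the device was seen up.
    start = first_up.setdefault(ip, now)
    # Step 2: uptime is the current time minus that first-up time.
    return now - start


if __name__ == "__main__":
    # Step 3 would add this value to the datapoints the task already emits.
    print(record_ping("10.0.0.1", True))
```

The value resets to 0 on any failed ping, so on a graph a down period shows up as a drop back to zero followed by a fresh ramp.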
Subject: | Thanks for the response |
Author: | [Not Specified] |
Posted: | 2015-03-09 08:57 |
Thanks for the response Dgarcia,
I will have a chat with a developer sitting across from me to see if we can get this right.
It's just sad to see that the only way to achieve this is by modifying source code.
Subject: | Dgarcia, one more question |
Author: | [Not Specified] |
Posted: | 2015-03-19 02:53 |
Dgarcia, one more question about this. With Zenoss 5, the Docker ID changes every time we restart the Zenoss services to check whether our changes worked. Currently there are 50+ Docker IDs.
Is there one central place where we can modify zenping and pingtask, so that the change applies to all of them?
Subject: | See as with me being stupid |
Author: | [Not Specified] |
Posted: | 2015-03-19 05:55 |
Seems I was being stupid again; I kind of got it working.
I just used the current RPN data that we are using in Zenoss 4.
I edited the graph data point (using rtt_min as the data point) and entered "raw,0,GT,1,0,IF" as the RPN. I now get a 1 when the device is up and a 0 when the device is down. The only issue is that Zenoss 5 now draws a line between the last data entry and the newest data entry. See image: http://imgur.com/Ua9sI4j
Is there a way to disable this for a certain graph? I want it to be clear when the device was down.
Subject: | Got it working with the |
Author: | [Not Specified] |
Posted: | 2015-03-19 08:21 |
Got it working with the following RPN, happy days :D
100.0,-,ABS,100,/
Linkage: http://imgur.com/ZvdlnbH
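For anyone wondering what that RPN actually computes, here is a minimal Python sketch. It assumes RRDtool-style RPN semantics with the datapoint value pushed onto the stack first, and it assumes the expression is applied to rtt_losspct (the inversion idea from the first post; the datapoint is not named here). The `eval_rpn` helper is hypothetical and only handles the operators needed for this expression.

```python
def eval_rpn(value, expr):
    """Evaluate a comma-separated RPN expression with `value` pushed first."""
    stack = [float(value)]
    for token in expr.split(","):
        if token == "ABS":
            stack.append(abs(stack.pop()))
        elif token in ("+", "-", "*", "/"):
            # Binary operators pop the right operand first, then the left.
            b = stack.pop()
            a = stack.pop()
            if token == "+":
                stack.append(a + b)
            elif token == "-":
                stack.append(a - b)
            elif token == "*":
                stack.append(a * b)
            else:
                stack.append(a / b)
        else:
            # Anything else is treated as a numeric literal.
            stack.append(float(token))
    return stack.pop()


if __name__ == "__main__":
    for loss_pct in (0, 25, 100):
        print(loss_pct, eval_rpn(loss_pct, "100.0,-,ABS,100,/"))
```

Under those assumptions the expression works out to |loss - 100| / 100, so 0% loss graphs as 1 and 100% loss graphs as 0.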
Hope this helps someone in the future.
Subject: | I'm glad you were able to |
Author: | [Not Specified] |
Posted: | 2015-03-19 16:30 |
I'm glad you were able to find a solution that works for you. The graph you linked, however, does not plot "uptime" (at least not how I thought of it, e.g. http://uptime.netcraft.com/images/uptime.demon.net.png ).
Patching Zenoss would involve using the serviced service shell command. This will let you create a container (based on the current Zenoss image), patch it, save the resulting image, retag it as the current image, and push it into the local registry. Then on the next restart of your services, all the containers will have the modified source code.
Your original approach for determining uptime with ping data could be useful. You should consider opening a ticket at jira.zenoss.com for an enhancement request, with references to other products that do this.
Subject: | Ah, yeah I see what you mean, |
Author: | [Not Specified] |
Posted: | 2015-03-24 03:35 |
Ah, yeah I see what you mean, actual uptime reporting like on unix servers.
It's something nice to have, but for our support center, it's easy for them to spot when a device was down say last week. You can easily spot a gap in the uptime graph. It's like the availability report, but in a graph.
Thanks for your help! :)
Subject: | You want an availability graph? You've got one. |
Author: | Kent Erickson |
Posted: | 2016-09-29 13:47 |
1) Create a graph on the sysuptime datapoint
2) Set the graph y axis limits to 0 and 100, and the legend to "%"
3) In the sysuptime datapoint, type 0,GT,100,* in the RPN field. You might need to play with this a bit. The units of sysuptime are in 1/100 of a second, so maybe you want the 0 to be 29999, indicating that sometime in the last 300 seconds there was a reset, and hence downtime.
4) I set the graph type to Area and the color to 00FF00, but this is an artistic choice.
The graph will show 100 if there is an uptime value, and 0 if there is not. Since the graph control aggregates data points, this may not be perfect, but it's not too bad.
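To make the threshold arithmetic in step 3 concrete, here is a small Python sketch of what `<threshold>,GT,100,*` is meant to produce. The function name and the 300-second polling interval are assumptions for illustration; this is plain Python, not anything Zenoss itself runs.

```python
TICKS_PER_SECOND = 100   # sysUpTime counts hundredths of a second
POLL_INTERVAL_S = 300    # assumed 5-minute collection cycle


def availability(sysuptime_ticks, threshold_ticks=0):
    """Mimic `<threshold>,GT,100,*`: 100 if uptime exceeds the threshold, else 0."""
    return 100 if sysuptime_ticks > threshold_ticks else 0


# Uptime below one full polling interval means a reset happened since the last poll.
threshold = POLL_INTERVAL_S * TICKS_PER_SECOND - 1   # 29999

if __name__ == "__main__":
    print(availability(500000, threshold))   # up for well over 5 minutes
    print(availability(12000, threshold))    # rebooted 2 minutes ago
```

With a threshold of 0, any non-zero uptime counts as available; with 29999, a device that rebooted within the last polling interval is plotted as 0 for that sample.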
Respond and let me know if this helps, please!