Subject: | Setting up thresholds for hosts |
Author: | [Not Specified] |
Posted: | 2015-12-08 04:55 |
Hi All,
I'm writing in the hope of getting some help from you in setting up CPU/disk/memory utilization thresholds for Linux hosts. Below is my test setup.
zenoss server - 192.168.0.200||centos6
zenosslab- 192.168.0.201 || centos6
What I want is to set 2 thresholds called 70_max and 90_max. With 70_max I should receive a warning alert when CPU/memory/disk utilization reaches 70%. With 90_max I'd like a "CRITICAL" alert when utilization reaches 90%.
Before you all blast me for not doing my homework, let me tell you that I have, but I could not come across a solid guide.
I'd really appreciate your input here, gentlemen, because if I get these thresholds straightened out I'm going to deploy 4.2.5 on a 1000+ production environment.
Million thanks in advance.
/BIndo
Subject: | Re: [Zenoss] Setting up multiple thresholds for cpu memory and |
Author: | [Not Specified] |
Posted: | 2015-12-08 08:00 |
Subject: | Have a look at this wiki tip |
Author: | Jane Curry |
Posted: | 2015-12-08 14:11 |
Have a look at this wiki tip - http://wiki.zenoss.org/MultipleThresholds . I agree that the thresholds sometimes seem non-obvious, but I think this example gives you what you want and ensures that you only have one event open at once, i.e. you go through a warning threshold and get an event; you then go through a critical threshold, the warning event is closed and the critical event is opened.
Cheers,
Jane
Email: jane.curry@skills-1st.co.uk Web: https://www.skills-1st.co.uk
Subject: | Hi Guys, |
Author: | [Not Specified] |
Posted: | 2015-12-13 21:49 |
Hi Guys,
Just wanted to tell you that what Bindo has suggested worked perfectly :) :) :). I really have no words to express my gratitude. Thank you so much.
Please can you walk me through how I should do the same for memory and CPU as well? Really appreciate it guys.
Many thanks in advance.
/Bindo
Subject: | Sorry I missed to have |
Author: | [Not Specified] |
Posted: | 2015-12-14 04:37 |
Sorry, I forgot to mention this in my previous reply. I'd like the memory figures to be shown as percentages. For instance, when memory is utilized beyond the assigned thresholds I'd like to receive an alert in the form of "high memory utilization exceeded, current value is 70%" or something similar. Please guys, I'm very close to presenting this to management and simply can't do without you.
/Bindo
Subject: | If you want to convert values |
Author: | Jane Curry |
Posted: | 2015-12-14 06:28 |
If you want to convert values to percentages in graphs, have a look at the Device template for /Server/Linux - check the Memory Utilisation graph. You need to select the graph definition, then Manage Graphpoints from the dropdown, then select the MemAvailReal graphpoint and click the "gear icon". That shows you the detail of the point. The "RPN" field is Reverse Polish Notation, which is how you manipulate a raw data point with arithmetic functions and other values that exist on the object.
1024,*,${here/hw/totalMemory},/,1,-,-100,*
Basically this multiplies the datapoint value by 1024, divides it by ${here/hw/totalMemory}, which is the total amount of memory field on the device object, and then turns it into a percentage: subtracting 1 and multiplying by -100 converts the fraction available into the percentage used. The RPN is rather funny stuff but there are plenty of references around.
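To make the stack arithmetic concrete, here is a minimal Python sketch that evaluates the RPN string above by hand. It is not how Zenoss or RRDtool evaluates RPN internally; the function name eval_rpn, the substitutions dictionary and the sample numbers (a 1 GiB host with 256 MB available) are assumptions made purely for illustration.

```python
def eval_rpn(expr, value, substitutions=None):
    """Evaluate a comma-separated RPN string against one raw datapoint value.

    Only the four arithmetic operators used in this thread are supported.
    """
    substitutions = substitutions or {}
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = [float(value)]  # the raw datapoint starts on the stack
    for token in expr.split(","):
        # Replace template variables such as ${here/hw/totalMemory}
        token = substitutions.get(token, token)
        if token in ops:
            b = stack.pop()
            a = stack.pop()
            stack.append(ops[token](a, b))
        else:
            stack.append(float(token))
    return stack.pop()


# Hypothetical host: 1 GiB total (bytes), 256 MB available (KBytes)
total_bytes = 1024 * 1024 * 1024
mem_avail_kb = 256 * 1024
rpn = "1024,*,${here/hw/totalMemory},/,1,-,-100,*"
percent_used = eval_rpn(rpn, mem_avail_kb, {"${here/hw/totalMemory}": total_bytes})
print("%.1f%% used" % percent_used)
```

With a quarter of the memory available, the expression works out to (0.25 - 1) * -100, i.e. 75% used.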
If you want to do similar functions on thresholds then the mechanism is rather easier. You might look at the monitoring template for FileSystem for /Server/Linux:
here.getTotalBlocks() * .9
This MAX value simply looks up the object's method getTotalBlocks() and sets the MAX to .9 (90%) of that value. No RPN here - more like "human" arithmetic.
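As a rough sketch of how the same idea could give the two levels asked for at the top of the thread, the snippet below computes a 70% and a 90% maximum from a block count. The function name filesystem_thresholds and the sample block count are hypothetical; in the template itself the equivalent maximum values would presumably be expressions along the lines of here.getTotalBlocks() * .7 and here.getTotalBlocks() * .9, which is an extrapolation from the example above rather than something confirmed in this thread.

```python
def filesystem_thresholds(total_blocks, warn_fraction=0.7, crit_fraction=0.9):
    """Return the MAX values (in blocks) for the warning and critical thresholds."""
    return {
        "70_max": total_blocks * warn_fraction,  # warning level
        "90_max": total_blocks * crit_fraction,  # critical level
    }


# Hypothetical filesystem with 1000 blocks
limits = filesystem_thresholds(1000)
print("warning above %.0f blocks, critical above %.0f blocks"
      % (limits["70_max"], limits["90_max"]))
```

The point is only that the threshold value is ordinary arithmetic on a value looked up from the object, so changing the percentage means changing one multiplier.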
Cheers,
Jane
Email: jane.curry@skills-1st.co.uk Web: https://www.skills-1st.co.uk
Subject: | functi |
Author: | [Not Specified] |
Posted: | 2015-12-14 14:56 |
Thank you James,
I'm a little confused here. OK, as you know, thanks to your help I got the disk space thresholds working perfectly. So, simply put, can I use the "getTotalblocks" function when setting up memory thresholds as well as CPU? Please kindly advise, sir.
Subject: | You haven't shown us your |
Author: | Jane Curry |
Posted: | 2015-12-15 06:42 |
You haven't shown us your memory threshold.
An easy datapoint you have access to is memAvailReal. You want a warning threshold at 60% used, so consider this as a MINIMUM threshold at 40% available. You could then have a Crit threshold with a MINIMUM of less than 30%.
This is a function of ${here/hw/totalMemory}. So try a warn threshold of Min here.hw.totalMemory /1024 * .4
You need to divide the here.hw.totalMemory by 1024 as that value is in Bytes and your memAvailReal is delivered in KBytes.
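The arithmetic behind that minimum value can be sketched in Python as follows. The function name min_available_kb and the 1 GiB sample host are assumptions for illustration; the only facts taken from the thread are that totalMemory is in Bytes, memAvailReal is in KBytes, and a "60% used" warning corresponds to "40% available".

```python
def min_available_kb(total_memory_bytes, max_used_fraction):
    """Minimum memAvailReal (in KBytes) that keeps usage at or below max_used_fraction."""
    total_kb = total_memory_bytes / 1024.0  # Bytes -> KBytes, to match memAvailReal
    return total_kb * (1.0 - max_used_fraction)


# Hypothetical host with 1 GiB of memory
total_bytes = 1024 * 1024 * 1024
warn_min = min_available_kb(total_bytes, 0.60)  # warn when less than 40% is available
crit_min = min_available_kb(total_bytes, 0.70)  # critical when less than 30% is available
print("warning MIN: %.1f KB, critical MIN: %.1f KB" % (warn_min, crit_min))
```

Because the thresholds are minimums on available memory, the critical value is the smaller of the two numbers.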
Cheers,
Jane
Email: jane.curry@skills-1st.co.uk Web: https://www.skills-1st.co.uk
Subject: | Thank you very much James for |
Author: | [Not Specified] |
Posted: | 2015-12-15 21:32 |
Thank you very much James for the excellent explanation. Now everything makes sense. I'm being pulled onto something else for the whole day today, so I'm going to have to try this tomorrow. In the interim, please can you tell me what the equivalent of "here.hw.totalMemory" is for CPU, because I need the same threshold implementation on CPU as well. Please stay tuned good people, I will get back to you with my results soon. Thank you loads.
/Bindo
Subject: | Thank you very much for |
Author: | [Not Specified] |
Posted: | 2015-12-21 19:39 |
Thank you very much for trying to help out. I was on leave and will be trying this out tomorrow. Will let you know how it went. Thanks again.
/Bindo
Subject: | Hi Jane/James |
Author: | [Not Specified] |
Posted: | 2015-12-21 19:42 |
Hi Jane/James
I tried but I'm still at a loss. I will give you a breakdown of my scenario so that you won't miss anything.
Scenario: I want to set up a memory utilization threshold in order to generate a warning alert when physical memory utilization reaches 60%.
My implementation:
1. Created a threshold called "high memory" under monitoring templates > device > /server/linux. Below are the properties of this threshold.
datapoints - memAvailReal_memAvailReal
Severity - warning
Maximum value - 60
event class - /perf/memory
escalate count - 0
2. On graph definitions, I used the default "Memory utilization" graph and included the "high memory" threshold I created, plus the "memAvailReal" and "memAvailSwap" data points. Below are the properties of the "memAvailReal" data point (only the ones that I think are relevant).
Format -%5.2lf%s
limit -1
Consolidation - AVERAGE
Legend -${graphPoint/id}
3. Below are the properties of the data source "memAvailReal.memAvailReal":
- Alias:ID/FORMULA: memoryAvailable__bytes || default value is - 1024,*
a) Please see the attached screenshot "graphs.png" to see how zenoss has created the graph with assigned 60 threshold
b) Please see the attached screenshot "Alert.png" to see how zenoss is alerting upon corresponding threshold breach.
Special Notes:
As you can see, the screenshot says "threshold of high memory exceeded:current value 900636.00000", which is not what I want. There are 3 major problems here.
1. My test server never hit 60% memory utilization. This is a VirtualBox CentOS 6 guest with 1024MB of memory. Utilization was well under 10% during the period of my test.
2. The second problem is of course the way this alert shows the memory utilization value: 900636.00000
3. Last but not least is the way the graph has been produced. I think it is weird.
Also it is important to note that I want the threshold to be shown in graphs (i.e. a straight line across the 60 mark). I know this is long and nasty, but please can you guys help me out. I feel that I'm so close and can't wait to get this right.
Subject: | Hi guys, Hope you've had a |
Author: | [Not Specified] |
Posted: | 2015-12-21 19:46 |
Hi guys, Hope you've had a good weekend. I tried what you've suggested and I'm so happy to say that I have progressed.
1. I set "here.hw.totalMemory /1024 * .4" as the minimum value on the threshold I created. However, it was still not showing the values as percentages. I found the below transform script online and placed it under "Events > Perf > Memory" and it did the trick finally :)
# Converts memory events into percentage with raw values
import time, re, logging
match = re.search('threshold of .*(swap|memory).* (exceeded|restored|not met): current value ([\d\.]+)', evt.message, re.I)
if match and device:
    available = float(match.groups()[2])
    total = device.hw.totalMemory
    evt.component = "Memory"
    if match.groups()[0].lower() == "swap":
        total = device.os.totalSwap
        evt.component = "Swap"
    evt.memoryavailable = available
    evt.total = total
    if total:
        percent_free = (available / total) * 100
        percent_used = ((total - available) / total) * 100
        evt.summary = "High Memory Utilization: Currently %3.0f%% used (%3.0f%% free)" % (percent_used, percent_free)
        evt.message = evt.summary
    else:
        evt.summary = 'High Memory Utilization: Currently: %s' %(convToUnits(available))
        evt.message = evt.summary
sum=evt.summary
if (sum.find("cbsModuleFreePageAvailableNorm") >= 0):
    sum=re.sub ("cbsModuleFreePageAvailableNorm", "cbsModuleFreePageAvailableHigh", sum)
    evt.summary=sum
    evt.message = evt.summary
    evt._action="history"
2. However, the alert still says the below. Don't mind the pipe signs.
"localhost | Memory | /Perf/Memory | High Memory Utilization: Currently 100% used ( 0% free)"
which is not right. My test box has 3774MB of memory, and at the time of the alert only 2498MB had been used and about 1270MB was left. I have attached screenshots of the corresponding settings at the link below. Please be good enough to take a look.
https://drive.google.com/openid=0B1eaxXnvgyIXaHpSdXFhY0Z3MEU
I think that I'm very close to seeing the light on this. Please can you take a look and help me out. I'm really counting on it.
Million thanks in advance.
/Bndo
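One possible explanation for the "100% used ( 0% free)" figure, offered only as a hypothesis, is the unit mismatch noted earlier in the thread: memAvailReal is delivered in KBytes while hw.totalMemory is in Bytes, so dividing one by the other gives a free fraction roughly 1024 times too small. The sketch below reproduces that arithmetic with the numbers from the post (3774MB total, about 1270MB free). The function name memory_percentages and its convert_units flag are hypothetical and are not part of the transform.

```python
def memory_percentages(available_kb, total_bytes, convert_units=True):
    """Return (percent_used, percent_free) from memAvailReal (KB) and totalMemory (Bytes)."""
    # With convert_units=False the KB value is divided by a byte count,
    # which is what the transform appears to do.
    available = available_kb * 1024.0 if convert_units else float(available_kb)
    percent_free = available / total_bytes * 100
    percent_used = (total_bytes - available) / total_bytes * 100
    return percent_used, percent_free


total_bytes = 3774 * 1024 * 1024   # 3774MB expressed in Bytes
available_kb = 1270 * 1024         # about 1270MB free, expressed in KBytes
summary = "High Memory Utilization: Currently %3.0f%% used (%3.0f%% free)"

# Mixed units: rounds to 100% used, 0% free, matching the alert in the post
print(summary % memory_percentages(available_kb, total_bytes, convert_units=False))
# Same units: roughly 66% used, 34% free
print(summary % memory_percentages(available_kb, total_bytes))
```

If this is indeed the cause, multiplying the parsed value by 1024 in the memory branch of the transform (for example, available = float(match.groups()[2]) * 1024) would be one way to line the units up. That assumes the value in the event message is the raw KByte datapoint, and the swap branch would need the same check against the units of device.os.totalSwap.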