TECHZEN Zenoss User Community ARCHIVE  

List Metrics Collected for a Server

Subject: List Metrics Collected for a Server
Author: Gary Samek
Posted: 2018-07-02 09:39

Hello,

I was wondering if someone has found a way to pull the active metrics for a server? 

Thanks.

- Gary

------------------------------
Gary
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Jay Stanley
Posted: 2018-07-02 10:17

Just performance metrics or all things monitored?

------------------------------
jstanley
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Gary Samek
Posted: 2018-07-03 07:50

The major focus is performance metrics.
But if there is a solution for all metrics, that would be very interesting to have as well.

------------------------------
Gary
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Jay Stanley
Posted: 2018-07-05 08:46

Metrics is usually defined as performance metrics. Other items monitored would be labeled statuses (up/down) or events/alarms (events or alarms from a system - think event log or vSphere alarms).

Now, are you looking to pull the actual metrics collected or just the template information (snmp <datasource> using <OID> writing to <datapoint>)?

What version of Zenoss are you using?

------------------------------
jstanley
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Gary Samek
Posted: 2018-07-06 09:21

Thank you for the clarification.  I am after performance metrics.  The version is Zenoss 6.1.0.

------------------------------
Gary
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Jay Stanley
Posted: 2018-07-06 09:41

Are you looking to pull the actual metrics collected or just the template information (snmp <datasource> using <OID> writing to <datapoint>)?

------------------------------
jstanley
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Gary Samek
Posted: 2018-07-19 15:12

I am familiar with pulling the performance metrics, I have done that in the past.
What I am needing to do here is be able to get a list of the metrics that are being collected for a server and a server object (like a filesystem or a network interface).

The need is to be able to make sure that a server is being monitored on a set of key metrics, and to see whether any new metrics become available.

So my request is how would I query for a server to get a list of the metrics that are being collected and are current?  Also how to do the same for each drive/filesystem and network interface.  I need to be able to prove that a server is being monitored by a minimum standard across many installs of zenoss.
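A minimal sketch of the comparison described here, assuming the list of collected metric names has already been pulled from Zenoss by some other means. The function name and the metric names are hypothetical and only illustrate the idea of checking a server against a minimum standard.

```python
def compare_metrics(required, collected):
    """Compare the metrics a server must have with the ones it actually has."""
    required, collected = set(required), set(collected)
    return {
        # Required by the standard but not collected for this server
        "missing": sorted(required - collected),
        # Collected for this server but not yet part of the standard
        "new": sorted(collected - required),
    }


# Hypothetical metric names, for illustration only
report = compare_metrics(
    required=["cpu_idle", "mem_used", "disk_used"],
    collected=["cpu_idle", "disk_used", "swap_used"],
)
```

The same function could be run once per device and once per component (filesystem, interface) with a different required list for each.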

Thanks, I was out of town for a bit and just caught up on my forum reading today.

------------------------------
Gary
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Craig Massey
Posted: 2018-07-19 17:06

I did something like this a while ago using python 3 against the JSON API.

In my case it was to export all of the thresholds for all devices for one client on a multi-tenanted platform.
If you only have a single client, there is a reporting Zenpack that might be useful. It didn't work for us but it might for you, and save you some coding.

https://www.zenoss.com/product/zenpacks/installed-templates-report

If my code can be useful I can provide extracts from it to help you. The whole thing had a specialised purpose and was only for internal use, but I can share that too if you want.



------------------------------
Craig Massey
Dimension Data
Auckland
New Zealand
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Gary Samek
Posted: 2018-07-20 09:28

Very interesting.
We also have a multi-tenant environment.
I would very much like to see how you got the threshold information as I have a report that does that for the other monitoring tools and would very much like to add Zenoss to that report, and that should satisfy my immediate need for the active metrics reporting.

------------------------------
Gary
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Jay Stanley
Posted: 2018-07-22 02:30

You can use the report or build something that uses the Zenoss API

Would be something like:

getDevices
for each device:
  getTemplates(id=deviceId)
  getComponents(uid=deviceId)
  for each component:
    getTemplates(id=componentId)

Then, depending on your needs..

for each template:
  getDataSources(uid=templateId)
  for each datasource:
    getDataSourceDetails(uid=datasourceId)
  getThresholds(uid=templateId)
  for each threshold:
    getThresholdDetails(uid=thresholdId)

This would account for the performance metric monitoring you are doing. It would not account for all things monitored, as events can be created outside of metrics monitoring. Think vSphere alarms or CiscoUCS faults: each of those is pulled into Zenoss as an event. Some daemons and datasources also have built-in alerting, for example the Windows ZenPack when Kerberos auth fails.
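As a rough Python sketch of the walk above: `call` is a hypothetical helper, not part of Zenoss, standing in for whatever function posts one request to a JSON API router and returns the parsed `result` object. The method names are the ones used in this thread; the response keys `devices` and `data`, and the `uid` field on each datasource, are assumptions that should be checked against your Zenoss version.

```python
def list_datasources(call, organizer_uid="/zport/dmd/Devices"):
    """Map each device and component uid to the datasource uids on its templates.

    `call(router, method, data)` is a hypothetical helper that performs one
    JSON API request and returns the parsed `result` dict.
    """
    collected = {}
    devices = call("DeviceRouter", "getDevices",
                   {"uid": organizer_uid, "keys": ["uid"]}).get("devices", [])
    for device in devices:
        # Walk the device itself plus every component under it
        targets = [device["uid"]]
        components = call("DeviceRouter", "getComponents",
                          {"uid": device["uid"], "keys": ["uid"]}).get("data", [])
        targets += [component["uid"] for component in components]
        for target in targets:
            sources = []
            templates = call("TemplateRouter", "getObjTemplates",
                             {"uid": target}).get("data", [])
            for template in templates:
                found = call("TemplateRouter", "getDataSources",
                             {"uid": template["uid"]}).get("data", [])
                sources += [source["uid"] for source in found]
            collected[target] = sources
    return collected
```

Passing `call` in as an argument keeps the traversal separate from authentication and HTTP details, so it can be tried against canned responses before pointing it at a live system.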

Zenoss API documentation:
Zenoss JSON API 6.2.0

And Zenoss API script helper which also includes examples:
https://github.com/zenoss/zenoss-RM-api

------------------------------
jstanley
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Craig Massey
Posted: 2018-07-22 18:46

That's almost exactly the logic I walked through, which I can't take any credit for, I was given pretty much the same pseudo-code.
The exception being that I didn't extract the DataSources.
I was pointed at them, but when I looked at what they contained there weren't any thresholds, which is what I was worried about and I suspect, what you want too.


I will warn you that if you have a large or complex config, this is going to be slow via the JSON API.

There's nothing wrong with the API, it just takes a few seconds to return each item, and when you have to walk a tree of configuration for a large list of devices, it adds up to hours. I added primitive resume support by dumping the device list to JSON and working off that. Then when I was in a hurry I copied the JSON file several times, trimmed each to contain a subset of the device list and ran several instances in parallel.

A modified version of the reporting ZenPack is the logical way to do this in a more reasonable time. It's on my ToDo list.
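A minimal sketch of that split-and-run-in-parallel idea, assuming the device list is already in memory as a Python list. The function names and the `devices_part` file prefix are hypothetical; this is not the script described in the post.

```python
import json


def split_device_list(devices, parts):
    """Split a device list into `parts` roughly equal slices."""
    return [devices[index::parts] for index in range(parts)]


def save_chunks(devices, parts, prefix="devices_part"):
    """Write each slice to its own JSON file and return the file names."""
    paths = []
    for index, chunk in enumerate(split_device_list(devices, parts)):
        path = "%s_%d.json" % (prefix, index)
        with open(path, "w") as handle:
            json.dump(chunk, handle)
        paths.append(path)
    return paths
```

Each script instance would then load one of the files and work through only that slice, and a rerun can start from the saved file instead of calling getDevices again.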


Using Python 3 (because I didn't know that Zenoss was built on 2.7 and I just picked the latest, um, because), the main functions in my script are:

import sys
import os
import simplejson # only used to manage a config file
import requests
import re


def get_devices(UID):
    '''
    Get a list of all devices
    Optionally, specify an object organiser UID, eg device class, group, system, or location
    If an organiser is not specified, all devices are returned
    '''
    print('Listing all devices')
    api_endpoint = url + '/device_router'
    router = 'DeviceRouter'
    method = 'getDevices'
    payload = '{"action": "' + router + '", "tid": 1, "method": "' + method + '", "data":[{"uid":"' + UID + '","limit":200000}]}'
    # print(payload)
    response = requests.post(api_endpoint, payload, headers=headers, auth=auth, verify=False)
    if response.status_code == 200:
        if response.json()['result']['success']:
            # print(response.text)
            resp_data = response.json()['result']['devices']
            return resp_data
        else:
            print('No devices found')
    else:
        print('HTTP Status: %s' % (response.status_code))
    # On any failure, return an empty list so callers can still iterate
    return []


def get_templates(uid):
    '''
    Devices have templates attached, which can have thresholds set
    Get the templates associated with this device
    '''
    print('Getting device templates for ' + uid)
    api_endpoint = url + '/template_router'
    router = 'TemplateRouter'
    method = 'getObjTemplates'
    payload = '{"action": "' + router + '", "tid": 1, "method": "' + method + '", "data":[{"uid":"' + uid + '"}]}'
    response = requests.post(api_endpoint, payload, headers=headers, auth=auth, verify=False)
    if response.status_code == 200:
        if response.json()['result']['success']:
            # print(response.text)
            # Index the templates by uid
            resp_data = {}
            for template in response.json()['result']['data']:
                # print(template['uid'])
                resp_data[template['uid']] = template
            return resp_data
        else:
            print('No templates found')
    else:
        print('HTTP Status: %s' % (response.status_code))
    # On any failure, return an empty dict so callers can still iterate
    return {}


def get_device_components(uid):
    '''
    Devices can have associated components, which can also have templates attached, which can have thresholds set
    Get the Components associated with this device
    '''
    print('Getting device components for ' + uid)
    api_endpoint = url + '/device_router'
    router = 'DeviceRouter'
    method = 'getComponents'
    payload = '{"action": "' + router + '", "tid": 1, "method": "' + method + '", "data":[{"uid":"' + uid + '","limit":200000}]}'
    response = requests.post(api_endpoint, payload, headers=headers, auth=auth, verify=False)
    if response.status_code == 200:
        if response.json()['result']['success']:
            # print(response.text)
            resp_data = response.json()['result']['data']
            return resp_data
        else:
            print('No components found')
    else:
        print('HTTP Status: %s' % (response.status_code))
    # On any failure, return an empty list so callers can still iterate
    return []


def get_thresholds(UID):
    '''
    Get the thresholds contained in this template
    '''
    print('Getting thresholds for ' + UID)
    api_endpoint = url + '/template_router'
    router = 'TemplateRouter'
    method = 'getThresholds'
    payload = '{"action": "' + router + '", "tid": 1, "method": "' + method + '", "data":[{"uid":"' + UID + '"}]}'
    response = requests.post(api_endpoint, payload, headers=headers, auth=auth, verify=False)
    if response.status_code == 200:
        if response.json()['result']['success']:
            # Index the thresholds by uid
            resp_data = {}
            # print(response.text)
            for threshold in response.json()['result']['data']:
                resp_data[threshold['uid']] = threshold
            return resp_data
        else:
            print('No thresholds found')
    else:
        print('HTTP Status: %s' % (response.status_code))
    # On any failure, return an empty dict so callers can still iterate
    return {}

Then you use this to do what Jay says.

In my approach I added items to the device list dictionary to build a dictionary of dictionaries a few levels deep, eg.

# for each template
# get the thresholds
# store in a dictionary with template uid as the key
# add to a list in the device
# print the output line for each threshold
# for each component_template
# get the thresholds
# if thresholds are defined
# for each threshold
# see if it is already in the dictionary
# if not
# get them
# add them
# print the output line for each threshold

Then, when I had gathered all of the data for a device, I walked my dictionary for that device, appending the data to a tab-delimited text file so I would have something to look at if the script hit an error.
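A minimal sketch of that per-device export step using the standard `csv` module. The function name, the column choice and the `name` field on each threshold are assumptions for illustration; the nested dictionaries mirror the shape returned by the `get_thresholds` style of function shown earlier (threshold uid as key).

```python
import csv
import io


def write_threshold_rows(handle, device_uid, thresholds_by_template):
    """Append one tab-delimited row per threshold to an open text handle."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for template_uid, thresholds in thresholds_by_template.items():
        for threshold_uid, threshold in thresholds.items():
            # 'name' is an assumed field; fall back to an empty column
            writer.writerow([device_uid, template_uid, threshold_uid,
                             threshold.get("name", "")])


# Demonstration against an in-memory buffer instead of a real file
buffer = io.StringIO()
write_threshold_rows(buffer, "/dev/a", {"/t1": {"/t1/high": {"name": "high"}}})
```

Opening the real export file in append mode (`open(path, "a", newline="")`) and writing after each device means a crash partway through still leaves the rows gathered so far.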

Kicked the whole thing off with:

if __name__ == '__main__':
    auth = ('username', 'password')
    headers = {'Content-Type': 'application/json'}
    url = 'https://zenoss_url/zport/dmd'

    main()


That main() was a bit of a hairy monster and I'm not proud of it, it should have been broken down a lot smaller but I ran out of time and it worked so ....

But it's where the real hard work was and where I had to accommodate some interesting quirks. What I've posted is really just examples of translating the JSON API to Python. I'm sure better Python could be written.


Oh, I was also pointed at Postman to initially test the JSON API and get my head around the data that was being returned.

Given the functions above you probably don't need that, you can use Python debugging to see the data structures.



------------------------------
Craig Massey
Dimension Data
Auckland
New Zealand
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Luke Lofgren
Posted: 2018-07-23 06:47

Thank you for sharing the code; it would be much faster if you used the "keys" parameter to avoid the "recursive" event counts that are generally part of device and component requests by default. Much less load on your Zope as well. I like how cleanly the code is put together and while I generally use the zenoss.py module I may try and update what you have to see how much use of "keys" would help.

------------------------------
Luke
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Jay Stanley
Posted: 2018-07-23 08:37

I can confirm, use keys wherever you can. It speeds things up greatly in some spots.

Two good places would be getDevices and getComponents, since you only really need the Uid returned.

You could add "keys": ["name", "uid"] or just uid.
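A minimal sketch of building such a request body with `json.dumps` rather than string concatenation, so that adding `keys` is just another dictionary entry. The helper name `build_payload` is hypothetical, and the limit of 200 is an arbitrary example value; the `action`, `method`, `data` and `tid` fields follow the payloads shown earlier in this thread.

```python
import json


def build_payload(router, method, data, tid=1):
    """Serialize one JSON API call in the shape used in this thread."""
    return json.dumps({
        "action": router,
        "method": method,
        "data": [data],
        "tid": tid,
    })


# Ask getDevices for only the name and uid of each device
payload = build_payload(
    "DeviceRouter",
    "getDevices",
    {"uid": "/zport/dmd/Devices", "limit": 200, "keys": ["name", "uid"]},
)
```

The resulting string can be posted the same way as the hand-built payloads, and letting `json.dumps` do the quoting avoids broken JSON if a uid ever contains a quote character.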

------------------------------
jstanley
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Craig Massey
Posted: 2018-07-23 16:58

Looking up the API: "keys"? What does that do?
Oh, OK, good point.

I made so much use of the full object structure to explore what I could get at that I didn't think to limit what I accessed.
I do write a few other parameters out to my export file, but it's easy to include them too.

Doing that is a LOT faster!

So that makes:

def get_devices(UID):
    '''
    Get a list of all devices
    Optionally, specify an object organiser UID, eg device class, group, system, or location
    If an organiser is not specified, all devices are returned
    '''
    print('Listing all devices')
    api_endpoint = url + '/device_router'
    router = 'DeviceRouter'
    method = 'getDevices'
    # "keys" limits the fields returned for each device
    payload = '{"action": "' + router + '", "tid": 1, "method": "' + method + '", "data":[{"uid":"' + UID + '","limit":200000,"keys": ["name", "uid","productionState"]}]}'
    # print(payload)
    response = requests.post(api_endpoint, payload, headers=headers, auth=auth, verify=False)
    if response.status_code == 200:
        if response.json()['result']['success']:
            # print(response.text)
            resp_data = response.json()['result']['devices']
            return resp_data
        else:
            print('No devices found')
    else:
        print('HTTP Status: %s' % (response.status_code))
    # On any failure, return an empty list so callers can still iterate
    return []

def get_device_components(uid):
    '''
    Devices can have associated components, which can also have templates attached, which can have thresholds set
    Get the Components associated with this device
    '''
    print('Getting device components for ' + uid)
    api_endpoint = url + '/device_router'
    router = 'DeviceRouter'
    method = 'getComponents'
    # "keys" limits the fields returned for each component
    payload = '{"action": "' + router + '", "tid": 1, "method": "' + method + '", "data":[{"uid":"' + uid + '","limit":200000,"keys": ["name", "uid"]}]}'
    response = requests.post(api_endpoint, payload, headers=headers, auth=auth, verify=False)
    if response.status_code == 200:
        if response.json()['result']['success']:
            # print(response.text)
            resp_data = response.json()['result']['data']
            return resp_data
        else:
            print('No components found')
    else:
        print('HTTP Status: %s' % (response.status_code))
    # On any failure, return an empty list so callers can still iterate
    return []



As I recall zenoss.py is Python 2.7?

That's really the only reason that I didn't use it and if I'd realised that so much of Zenoss was 2.7 I would have gone that way.
Also I guess I was at the bottom of my learning curve with the API when I was looking at the options for existing tooling and I struggled at the time to translate what I was doing in Postman to that module.
Now it would be trivial; hindsight is a wonderful thing.



------------------------------
Craig Massey
Dimension Data
Auckland
New Zealand
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Jay Stanley
Posted: 2018-07-24 08:29

Yeah, live and learn. As you use it more you will learn more.

I think I might post my Postman setup + examples. Think it might help get people started

------------------------------
jstanley
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Craig Massey
Posted: 2018-07-24 18:20

As you've mentioned it and in case it's useful.
When I was working on this I was guided by Michael De Simone of Zenoss. I asked an enormous number of questions at the start and he patiently helped me until I got my head around the API and the relationship between the object types. He was the one who drew up something like the pseudo-code you posted, Jay.

He has a Postman collection that he pointed me at to get me started and I've checked and he has said it's OK to share it:

https://www.getpostman.com/collections/75b6b0f1712aade35d1b



------------------------------
Craig Massey
Dimension Data
Auckland
New Zealand
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Gary Samek
Posted: 2018-07-26 16:14

Wow!

This is very helpful, I have some homework time to take care of now :)

Can you share what the postman collection is actually about?

Thanks.

------------------------------
Gary
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: Craig Massey
Posted: 2018-07-26 17:04

Yes, well, I'm a very pragmatic Postman user, i.e. I know how to do what I needed to do. There is a lot more to it, and my explanation, while accurate as far as it goes, may not be complete and almost certainly will be phrased in terms that will make a Postman purist cringe.

For my, probably "our" purposes, Postman is a tool that can be used to make JSON API calls.
A collection is a sort of library of saved API calls.

You import the collection, edit some of the parameters, initially auth and base URL and the rest of the API calls will "just work".
Postman can also run a whole script of chained web API calls to simulate transactions, or probably more often to test an API.
And you can create a web API in it, which is high art akin to sorcery as far as I'm concerned at this point in my use of it.

What I did was use Michael's collection to carry out each step of the pseudo-code that Jay posted so that I knew what the API expected and what it returned.

So:

  • Figure out how to get a list of devices.
  • Figure out how to get the templates for one device in the list
  • How to get the components for a device

etc.

Then I tried to use a feature of Postman that translates the API call into one of several languages, Python 3 in my case.
That didn't work so well, but it got me started.

After getting a couple of API calls working in Python I found myself using Postman less, and instead I would use the PyCharm debugger to inspect the data that the API gave me. It was easier to navigate, and the debugger showed clearly what each data type in the structure was and how to access it using Python. With Postman there was a bit of guesswork (and not a small amount of swearing) to translate what Postman showed to Python data structures.

Given the examples I have posted, you very likely don't need Postman. Those are all of the API calls that I think you will need to build what you want, and once you get any one of them working, the rest should too, and you can use the data returned to build whatever you want.

There are actually several Python modules that do a lot of the heavy lifting of interacting with the Zenoss JSON API and expose far more functionality than you will need. They let you use the API to set configuration, for example. You may prefer to use one of them, specifically the one that Jay linked; many of the others are no longer maintained.



------------------------------
Craig Massey
Dimension Data
Auckland
New Zealand
------------------------------


Subject: RE: List Metrics Collected for a Server
Author: newlife N
Posted: 2020-06-07 14:26

Hello Gary,
I am new to Zenoss. I have been exploring the Zenoss product documentation in order to extract performance metrics data from a Zenoss server, but could not find a concrete way to extract performance metrics. I am happy to see you have already cracked the problem of pulling the performance data from the Zenoss time series database. Could you please let me know the solution?
Thanks in advance.

------------------------------
newlife N
------------------------------

