Zenoss ZenTech Community

Subject:	RE: Blank graphs since repair and restart
Author:	Arthur
Posted:	2018-01-24 16:43

Hi Gregg

The performance metrics are stored in HBase through OpenTSDB. From the Zenoss documentation, "Performance monitoring":

Zenoss Core stores device and daemon performance metrics directly in OpenTSDB, a time series database that runs on top of an HBase instance. Writing directly to OpenTSDB eliminates the need for RRD files to be stored on the collectors.

The first step is to check whether all services are up and running, using the command:

# serviced service status

Cheers Arthur

------------------------------
Arthur
------------------------------

Subject:	RE: Blank graphs since repair and restart
Author:	Gregg Hughes
Posted:	2018-01-25 17:31

Good afternoon, Arthur!

Services are running per attached, except zenmail and zenpop3.

Thanks!

Gregg

------------------------------
Gregg Hughes
Sr Systems Engineer
ISC International Limited
Milwaukee WI
4147210301
------------------------------

Attachments:

servstatus.txt

Subject:	RE: Blank graphs since repair and restart
Author:	Arthur
Posted:	2018-01-27 05:21

Hi Gregg

Correct, zenmail and zenpop3 have no impact on this issue. MetricShipper, on the other hand, does!

Both MetricShippers and zenhub are showing HC Fail (Health Check Fail) in your output.
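For a long status listing, the failing entries can be picked out with a small filter. The following is a minimal sketch, assuming only that each affected line of the output contains the text "HC Fail" as mentioned above; the column layout of the output is not assumed, and the function names are hypothetical.

```python
import subprocess


def failing_health_checks(status_text, marker="HC Fail"):
    """Return the lines of a status listing that contain the failure marker."""
    return [line.strip() for line in status_text.splitlines() if marker in line]


def current_status():
    """Run the status command from this thread on the master and return its output."""
    result = subprocess.run(
        ["serviced", "service", "status"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On the Control Center master, `failing_health_checks(current_status())` would then list only the services that need attention. The same filter works on a saved copy such as servstatus.txt.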

In Control Center, try the following, in this order:

1st: stop/start the MetricShipper under Metrics

2nd: stop/start zenhub

3rd: stop/start the MetricShipper under localhost
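The same order can also be driven from the command line with `serviced service stop` and `serviced service start`. The sketch below only builds and prints the commands. The service identifiers are placeholders and an assumption on my part: the real names or IDs have to be looked up with `serviced service list` first.

```python
# Restart order from this post. The identifiers are placeholders (assumptions),
# not real service names; replace them with values from `serviced service list`.
RESTART_ORDER = [
    "<MetricShipper under Metrics>",
    "<zenhub>",
    "<MetricShipper under localhost>",
]


def restart_commands(services):
    """Build stop/start command lines, one service at a time, in the given order."""
    commands = []
    for service in services:
        commands.append(["serviced", "service", "stop", service])
        commands.append(["serviced", "service", "start", service])
    return commands


# Print the commands for review instead of executing them.
for command in restart_commands(RESTART_ORDER):
    print(" ".join(command))
```

Printing rather than executing is deliberate: each service should be confirmed as stopped, and later as passing its health checks, before moving on to the next one.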

A good staged startup procedure is also given in the following article

https://support.zenoss.com/hc/en-us/articles/211783563-Zenoss-Master-Staged-Startup-and-Shutdown-Best-Practices-for-Maintenance-

It describes the startup for the Zenoss ResourceManager (RM) but it is also valid for Zenoss Core.

Just replace Zenoss.resmgr with Zenoss.core in step one.

Cheers

------------------------------
Arthur
------------------------------

Subject:	RE: Blank graphs since repair and restart
Author:	Gregg Hughes
Posted:	2018-02-01 11:45

Hello, Arthur!

Apologies, I've been offline working on a VMware conversion project.

Tried all three solutions - when I restart MetricShipper under either itself or under localhost I get this in the instance log:

I0201 16:38:47.942441 00067 pool.go:40] Unable to connect to consumer ws://localhost:8080/ws/metrics/store
	February 1st 2018, 10:38:14.439	I0201 16:38:04.807583 00067 pool.go:40] Unable to connect to consumer ws://localhost:8080/ws/metrics/store
	February 1st 2018, 10:37:09.438	I0201 16:37:03.494342 00067 pool.go:40] Unable to connect to consumer ws://localhost:8080/ws/metrics/store

Plenty more examples of the same entry.
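To confirm that the instance log really contains nothing but this one entry, the lines can be parsed and counted. This is a minimal sketch, assuming the entries follow the glog layout seen above (severity letter, month and day, time, thread id, source file and line, message). The function names are hypothetical, and the sample is copied from the excerpt above.

```python
import re
from collections import Counter

# glog-style entry: severity letter, MMDD, time, thread id, file:line], message
GLOG_LINE = re.compile(
    r"(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<thread>\d+) "
    r"(?P<source>[^\s\]]+)\] (?P<message>.*)"
)

SAMPLE = """I0201 16:38:47.942441 00067 pool.go:40] Unable to connect to consumer ws://localhost:8080/ws/metrics/store
February 1st 2018, 10:38:14.439\tI0201 16:38:04.807583 00067 pool.go:40] Unable to connect to consumer ws://localhost:8080/ws/metrics/store
February 1st 2018, 10:37:09.438\tI0201 16:37:03.494342 00067 pool.go:40] Unable to connect to consumer ws://localhost:8080/ws/metrics/store
"""


def parse_glog(text):
    """Return one dict per glog-style entry; any text before the entry is ignored."""
    matches = (GLOG_LINE.search(line) for line in text.splitlines())
    return [match.groupdict() for match in matches if match]


def count_messages(text):
    """Count how often each distinct message occurs."""
    return Counter(entry["message"] for entry in parse_glog(text))


for message, count in count_messages(SAMPLE).items():
    print(count, message)
```

Using `search` instead of `match` lets the parser skip the local timestamp that the log viewer puts in front of some lines.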

Any thoughts? At this point I'm not sure whether a fresh installation of an updated Core would be a quicker solution. I will, of course, be guided by your experience.

Thanks very much!

Gregg

------------------------------
Gregg Hughes
Sr Systems Engineer
ISC International Limited
Milwaukee WI
4147210301
------------------------------

Subject:	RE: Blank graphs since repair and restart
Author:	Gregg Hughes
Posted:	2018-02-01 12:36

Hello, Arthur!

You may have heard this one before - I REBOOTED THREE TIMES AND THIS TIME IT'S WORKING!! I SWEAR I CHANGED NOTHING!

I worked from the Staged Shutdown and Startup document you referred me to. When I restarted serviced, I got a boatload of this:

time="2018-02-01T17:10:35Z" level=warning msg="Unable to obtain authentication token. Retrying in 10s" error="No such entity {kind:keyregistry, id:180fedeb}" location="token.go:155" logger=auth
time="2018-02-01T17:10:45Z" level=warning msg="Unable to obtain authentication token. Retrying in 10s" error="No such entity {kind:keyregistry, id:180fedeb}" location="token.go:155" logger=auth

E0201 17:11:32.397417 02359 docker.go:34] Error checking Docker Hub login: config.json is not populated

So I stopped serviced again and exited the CLI - and this happened:
[root@ucspm ~]# systemctl stop serviced
[root@ucspm ~]# exit
exit
PolicyKit daemon disconnected from the bus.
We are no longer a registered authentication agent.

Never saw that one before.....

So I rebooted the appliance again, got into the CC and started the Zenoss.core instance. After some fooling around, the instance showed running with a check mark. I drilled down and found that the MetricShipper, which had failed all previous attempts to start, was also running!

I'm now seeing graphed traffic. I'll watch it and exercise it for a few days before I call it good, but these are promising signs!

Thanks very much for your assistance!

Gregg

------------------------------
Gregg Hughes
Sr Systems Engineer
ISC International Limited
Milwaukee WI
4147210301
------------------------------

Subject:	RE: Blank graphs since repair and restart
Author:	Arthur
Posted:	2018-02-06 06:09

Hi Gregg

Great to hear that it is working again. Please excuse my late reply; I was rather busy over the last few days.

Yep, I’ve experienced this myself. Personally, I would not say that nothing changed; perhaps I was just not able to find what changed.

My impression is that Zenoss 5 demands a lot of resources (CPU/storage), especially during startup. I saw such startup issues in my environment when the CPU Ready value was high.

---

level=warning msg="Unable to obtain authentication token. Retrying in 10s" error="No such entity {kind:keyregistry, id:180fedeb}" location="token.go:155" logger=auth

It looks like the master was not able to get an authentication token from the delegate with the ID 180fedeb. The reason is hard to determine from this snippet of the logs. You can find the hostname matching the ID from the CLI or the CC:

# serviced host list
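If the warning appears many times or for several IDs, the ID can be extracted from the log line and then searched for in the host listing. This is a minimal sketch, assuming the key=value layout of the warning quoted above; the function names are hypothetical, and the column layout of the `serviced host list` output is not assumed (lines are matched by substring only).

```python
import re

# key=value pairs where the value is either "quoted" or a bare token
PAIR = re.compile(r'(\w+)=(?:"((?:[^"\\]|\\.)*)"|(\S+))')
ENTITY_ID = re.compile(r"\bid:(\w+)")

LINE = (
    'time="2018-02-01T17:10:35Z" level=warning '
    'msg="Unable to obtain authentication token. Retrying in 10s" '
    'error="No such entity {kind:keyregistry, id:180fedeb}" '
    'location="token.go:155" logger=auth'
)


def parse_fields(line):
    """Split a key=value log line into a dict of its fields."""
    return {key: quoted or bare for key, quoted, bare in PAIR.findall(line)}


def missing_entity_id(line):
    """Return the id named in the error field, or None if there is none."""
    match = ENTITY_ID.search(parse_fields(line).get("error", ""))
    return match.group(1) if match else None


def rows_mentioning(host_list_text, entity_id):
    """Return the lines of a host listing that contain the given id."""
    return [row for row in host_list_text.splitlines() if entity_id in row]


print(missing_entity_id(LINE))
```

The extracted ID can then be passed to `rows_mentioning` together with the saved output of `serviced host list`.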

---

E0201 17:11:32.397417 02359 docker.go:34] Error checking Docker Hub login: config.json is not populated

I see this too. I suppose it is a leftover from earlier times, when installations and updates were done through Docker Hub. The config.json file contains the login credentials for Docker Hub. From my point of view, it is nothing to worry about.
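Whether a given config.json holds any login at all can be checked by looking at its contents. This is a minimal sketch, assuming the standard Docker client config layout, in which logins are stored under "auths" or delegated to a helper named in "credsStore". What exactly serviced inspects is not known to me, so this only approximates its check; the function name is hypothetical.

```python
import json


def has_registry_credentials(config_text):
    """Return True if a Docker client config holds at least one login entry."""
    try:
        config = json.loads(config_text)
    except ValueError:
        # Empty or malformed file: treat as not populated.
        return False
    if not isinstance(config, dict):
        return False
    # "auths" maps registries to stored logins; "credsStore" names an external helper.
    return bool(config.get("auths")) or bool(config.get("credsStore"))
```

The Docker client keeps this file at ~/.docker/config.json by default, so the function would be called with the text of that file for the user that runs serviced.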

---

PolicyKit daemon disconnected from the bus.
We are no longer a registered authentication agent.

Google is my friend

https://www.centos.org/forums/viewtopic.php?t=58119

https://bugzilla.redhat.com/show_bug.cgi?id=1249627

Cheers and have a G-Day

------------------------------
Arthur
------------------------------

Subject:	Blank graphs since repair and restart
Author:	Gregg Hughes
Posted:	2018-01-23 11:19

Blank graphs since repair and restart