TECHZEN Zenoss User Community ARCHIVE  

Blank graphs since repair and restart

Subject: Blank graphs since repair and restart
Author: Gregg Hughes
Posted: 2018-01-23 11:19

Zenoss Core 5.2.1 and I'm getting blank graphs.

Good morning, again!

I just finished repairing the Docker problems and I have Control Center and Zenoss Core again - almost.

All my devices are listed and I can go back and see information from the time I fired up the appliance in October 2017 until the crash in December. However, all graphs show No Data Available for any interface.  I have some information from my old 4.2 installation but I have no clue on how to fix this or where to look in the 5.2 stream.

I tried to delete and re-add a device but I still get No Data Available in the graph.

Where do I start looking for this?

Thanks to all for checking!

Gregg

------------------------------
Gregg Hughes
Sr Systems Engineer
ISC International Limited
Milwaukee WI
4147210301
------------------------------


Subject: RE: Blank graphs since repair and restart
Author: Arthur
Posted: 2018-01-24 16:43

Hi Gregg

The performance metrics are stored in HBase through OpenTSDB.
From the Zenoss documentation page "Performance monitoring":
Zenoss Core stores device and daemon performance metrics directly in OpenTSDB, a time series database that runs on top of an HBase instance. Writing directly to OpenTSDB eliminates the need for RRD files to be stored on the collectors.


The first step is to check whether all services are up and running, using the command

# serviced service status

Cheers Arthur




------------------------------
Arthur
------------------------------


Subject: RE: Blank graphs since repair and restart
Author: Gregg Hughes
Posted: 2018-01-25 17:31

Good afternoon, Arthur!

Services are running per attached, except zenmail and zenpop3. 

Thanks!

Gregg

------------------------------
Gregg Hughes
Sr Systems Engineer
ISC International Limited
Milwaukee WI
4147210301
------------------------------

Attachments:

servstatus.txt



Subject: RE: Blank graphs since repair and restart
Author: Arthur
Posted: 2018-01-27 05:21

Hi Gregg

Correct, zenmail and zenpop3 have no impact on this issue. MetricShipper, on the other hand, does!

Both MetricShippers and zenhub are showing HC Fail (Health Check Fail) in your output.

In Control Center, try to stop/start the services in this order:

1st  stop/start the MetricShipper under Metrics

2nd  stop/start zenhub

3rd  stop/start the MetricShipper under localhost.

A good staged startup procedure is also given in the following article

https://support.zenoss.com/hc/en-us/articles/211783563-Zenoss-Master-Staged-Startup-and-Shutdown-Best-Practices-for-Maintenance-

It describes the startup for Zenoss Resource Manager (RM), but it is also valid for Zenoss Core.

Just replace Zenoss.resmgr with Zenoss.core in step one.
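
For anyone who prefers the CLI to the Control Center UI, the restart order above could be written down as a small dry-run script. This is only a sketch: it prints `serviced service stop` and `serviced service start` commands and does not run them. The service IDs are hypothetical placeholders (both MetricShipper services share a name, so look up the real IDs with `serviced service list`), and the script makes no attempt to wait for a service to finish stopping before the next command.

```python
import shlex

# Hypothetical placeholders: replace with real IDs from `serviced service list`.
RESTART_ORDER = [
    ("MetricShipper under Metrics", "<METRICS_METRICSHIPPER_ID>"),
    ("zenhub", "<ZENHUB_ID>"),
    ("MetricShipper under localhost", "<LOCALHOST_METRICSHIPPER_ID>"),
]


def build_restart_plan(order):
    """Return (label, command) pairs: a stop then a start for each service, in order."""
    plan = []
    for label, service_id in order:
        plan.append((label, ["serviced", "service", "stop", service_id]))
        plan.append((label, ["serviced", "service", "start", service_id]))
    return plan


# Dry run: print the commands for review instead of executing them.
for label, command in build_restart_plan(RESTART_ORDER):
    print(f"# {label}")
    print(" ".join(shlex.quote(part) for part in command))
```

Check that each service has actually stopped (for example with `serviced service status`) before issuing the matching start.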

Cheers



------------------------------
Arthur
------------------------------


Subject: RE: Blank graphs since repair and restart
Author: Gregg Hughes
Posted: 2018-02-01 11:45

Hello, Arthur!

Apologies, I've been offline working on a VMware conversion project.

Tried all three solutions - when I restart MetricShipper under either itself or under localhost I get this in the instance log:

I0201 16:38:47.942441 00067 pool.go:40] Unable to connect to consumer ws://localhost:8080/ws/metrics/store
February 1st 2018, 10:38:14.439
I0201 16:38:04.807583 00067 pool.go:40] Unable to connect to consumer ws://localhost:8080/ws/metrics/store
February 1st 2018, 10:37:09.438
I0201 16:37:03.494342 00067 pool.go:40] Unable to connect to consumer ws://localhost:8080/ws/metrics/store
Plenty more examples of the same entry.
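
With this many repeats, a short script can confirm whether every entry in a saved log points at the same consumer endpoint. The sketch below assumes the lines follow the layout shown above (level letter, month and day, time, thread ID, `file:line]`, message). The function name is made up for illustration, and the UI timestamp lines between entries are simply skipped.

```python
import re
from collections import Counter

# Layout assumed from the pasted entries:
# <level letter><MMDD> <HH:MM:SS.micros> <thread> <file:line>] <message>
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<thread>\d+) "
    r"(?P<source>[^\]]+)\] (?P<message>.*)$"
)
CONSUMER_ERROR = re.compile(r"Unable to connect to consumer (?P<url>\S+)")


def count_consumer_failures(lines):
    """Count 'Unable to connect to consumer' entries per consumer URL."""
    failures = Counter()
    for line in lines:
        parsed = GLOG_LINE.match(line.strip())
        if parsed is None:
            continue  # not a log entry, e.g. a UI timestamp line
        hit = CONSUMER_ERROR.search(parsed.group("message"))
        if hit:
            failures[hit.group("url")] += 1
    return failures


sample = [
    "I0201 16:38:47.942441 00067 pool.go:40] Unable to connect to consumer ws://localhost:8080/ws/metrics/store",
    "February 1st 2018, 10:38:14.439",
    "I0201 16:38:04.807583 00067 pool.go:40] Unable to connect to consumer ws://localhost:8080/ws/metrics/store",
]
failures = count_consumer_failures(sample)
print(failures)
```

If only one URL shows up, the problem is likely on the consumer side of that endpoint rather than in MetricShipper itself.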

Any thoughts? At this point I'm not sure whether a fresh installation of an updated Core would be a quicker solution. I will, of course, be guided by your experience.

Thanks very much!

Gregg

------------------------------
Gregg Hughes
Sr Systems Engineer
ISC International Limited
Milwaukee WI
4147210301
------------------------------


Subject: RE: Blank graphs since repair and restart
Author: Gregg Hughes
Posted: 2018-02-01 12:36

Hello, Arthur!

You may have heard this one before - I REBOOTED THREE TIMES AND THIS TIME IT'S WORKING!! I SWEAR I CHANGED NOTHING!

I worked from the Staged Shutdown and Startup document you referred me to. When I restarted serviced, I got a boatload of this:

time="2018-02-01T17:10:35Z" level=warning msg="Unable to obtain authentication token. Retrying in 10s" error="No such entity {kind:keyregistry, id:180fedeb}" location="token.go:155" logger=auth
time="2018-02-01T17:10:45Z" level=warning msg="Unable to obtain authentication token. Retrying in 10s" error="No such entity {kind:keyregistry, id:180fedeb}" location="token.go:155" logger=auth


E0201 17:11:32.397417 02359 docker.go:34] Error checking Docker Hub login: config.json is not populated

So I stopped serviced again and exited the CLI - and this happened:
[root@ucspm ~]# systemctl stop serviced
[root@ucspm ~]# exit
exit
PolicyKit daemon disconnected from the bus.
We are no longer a registered authentication agent.

Never saw that one before.....

So I rebooted the appliance again, got into the CC and started the Zenoss.core instance. After some fooling around, the instance showed running with a check mark. I drilled down and found that the MetricShipper, which had failed all previous attempts to start, was also running!

I'm now seeing graphed traffic.  I'll watch it and exercise it for a few days before I call it good, but these are promising signs!

Thanks very much for your assistance!

Gregg


------------------------------
Gregg Hughes
Sr Systems Engineer
ISC International Limited
Milwaukee WI
4147210301
------------------------------


Subject: RE: Blank graphs since repair and restart
Author: Arthur
Posted: 2018-02-06 06:09

Hi Gregg

Great to hear that it is working again. Apologies for the late reply; I was a bit busy during the last few days.

Yep, I've experienced this myself. Personally, I would not say that nothing changed; perhaps I was just not able to find what changed.

My impression is that Zenoss 5 demands a lot of resources (CPU/Storage), especially during startup. I saw such startup issues in my environment when the CPU Ready value was high.

---

level=warning msg="Unable to obtain authentication token. Retrying in 10s" error="No such entity {kind:keyregistry, id:180fedeb}" location="token.go:155" logger=auth

It looks like the master was not able to get an authentication token from the delegate with the ID 180fedeb. The reason is hard to tell from this snippet of the logs. You can find the hostname matching the ID from the CLI or in Control Center.

# serviced host list
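
To pull that ID out of many warning lines at once, a small parser can help. This is a minimal sketch, assuming the lines are key=value pairs with double-quoted values as in the excerpt above. The function names are illustrative, and a message with unbalanced quotes would make `shlex.split` raise an error.

```python
import re
import shlex

ENTITY_ID = re.compile(r"\bid:(?P<id>[0-9a-f]+)")


def parse_logfmt(line):
    """Split a key=value log line into a dict, honoring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def missing_entity_id(line):
    """Return the id named in the 'error' field, or None if there is none."""
    fields = parse_logfmt(line)
    hit = ENTITY_ID.search(fields.get("error", ""))
    return hit.group("id") if hit else None


sample = (
    'time="2018-02-01T17:10:35Z" level=warning '
    'msg="Unable to obtain authentication token. Retrying in 10s" '
    'error="No such entity {kind:keyregistry, id:180fedeb}" '
    'location="token.go:155" logger=auth'
)
print(missing_entity_id(sample))
```

The extracted value can then be compared by hand against the IDs shown by `serviced host list`.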

---

E0201 17:11:32.397417 02359 docker.go:34] Error checking Docker Hub login: config.json is not populated

I see this too. I suppose it is a leftover from older times, when installations and updates were done through Docker Hub. The config.json file contains the login credentials for Docker Hub. From my point of view, it is nothing to worry about.

---

PolicyKit daemon disconnected from the bus.
We are no longer a registered authentication agent.

Google is my friend

https://www.centos.org/forums/viewtopic.php?t=58119

https://bugzilla.redhat.com/show_bug.cgi?id=1249627

 

Cheers and have a G-Day



------------------------------
Arthur
------------------------------