TECHZEN Zenoss User Community ARCHIVE  

Health Checks for opentsdb & Central Query failing

Subject: Health Checks for opentsdb & Central Query failing
Author: Louis Henn
Posted: 2017-08-21 12:18

Hi Zenoss Community,

I was hoping someone could assist here.

We have an instance of Zenoss Core 5.2.6 that has been running without issues. All services checked out on day one of the deployment a couple of months ago.

Today I decided to do some routine checks and maintenance on the health of the actual server and found that opentsdb and Central Query are failing their health checks: each shows a red circle with an exclamation mark. Hovering over it lists "Answering" as the reason. Zenoss itself works, and everything is accessible and monitoring as it should.

Any ideas what could've gone wrong here? Is it even important?

Here are the top 10 lines on the Central Query log file:

August 21st 2017, 18:05:55.230 0
INFO [2017-08-21 16:05:48,655] org.apache.http.impl.client.DefaultHttpClient: I/O exception (java.net.SocketException) caught when processing request: Connection reset
August 21st 2017, 18:05:55.230 0
INFO [2017-08-21 16:05:48,655] org.apache.http.impl.client.DefaultHttpClient: Retrying request
August 21st 2017, 18:04:50.227 0
INFO [2017-08-21 16:04:43,401] org.apache.http.impl.client.DefaultHttpClient: Retrying request
August 21st 2017, 18:04:50.227 0
INFO [2017-08-21 16:04:43,401] org.apache.http.impl.client.DefaultHttpClient: I/O exception (java.net.SocketException) caught when processing request: Connection reset
August 21st 2017, 18:03:45.225 0
INFO [2017-08-21 16:03:38,165] org.apache.http.impl.client.DefaultHttpClient: I/O exception (java.net.SocketException) caught when processing request: Connection reset
August 21st 2017, 18:03:45.225 0
INFO [2017-08-21 16:03:38,165] org.apache.http.impl.client.DefaultHttpClient: Retrying request
August 21st 2017, 18:01:30.880 0
INFO [2017-08-21 16:01:29,827] org.apache.http.impl.client.DefaultHttpClient: I/O exception (java.net.SocketException) caught when processing request: Connection reset
August 21st 2017, 18:01:30.880 0
INFO [2017-08-21 16:01:29,827] org.apache.http.impl.client.DefaultHttpClient: Retrying request
August 21st 2017, 18:00:35.878 0
INFO [2017-08-21 16:00:29,259] org.apache.http.impl.client.DefaultHttpClient: Retrying request
August 21st 2017, 18:00:35.878 0
INFO [2017-08-21 16:00:29,259] org.apache.http.impl.client.DefaultHttpClient: I/O exception (java.net.SocketException) caught when processing request: Connection reset
August 21st 2017, 17:59:30.876 0
INFO [2017-08-21 15:59:29,037] org.apache.http.impl.client.DefaultHttpClient: Retrying request


opentsdb Log file:


Starting opentsdb with ZK_QUORUM=localhost:2181
2017-08-21 16:12:58,960 CRIT Supervisor running as root (no user in config file)
2017-08-21 16:12:59,060 INFO RPC interface 'supervisor' initialized
2017-08-21 16:12:59,061 CRIT Server 'inet_http_server' running without any HTTP authentication checking
2017-08-21 16:12:59,062 INFO supervisord started with pid 38
2017-08-21 16:13:00,110 INFO spawned: 'tsdbwatchdog' with pid 41
2017-08-21 16:13:00,114 INFO spawned: 'opentsdb' with pid 42
2017-08-21 16:13:01,354 INFO success: tsdbwatchdog entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-08-21 16:13:03,339 INFO exited: opentsdb (exit status 1; not expected)
2017-08-21 16:13:04,343 INFO spawned: 'opentsdb' with pid 123
2017-08-21 16:13:06,104 INFO exited: opentsdb (exit status 1; not expected)
2017-08-21 16:13:08,111 INFO spawned: 'opentsdb' with pid 208
2017-08-21 16:13:09,573 INFO exited: opentsdb (exit status 1; not expected)
2017/08/21 16:13:11 200 37.461288ms POST /api/metrics/store
2017-08-21 16:13:12,581 INFO spawned: 'opentsdb' with pid 293
2017-08-21 16:13:14,282 INFO exited: opentsdb (exit status 1; not expected)
2017-08-21 16:13:15,284 INFO gave up: opentsdb entered FATAL state, too many start retries too quickly
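
Note for readers of this archive: the supervisord excerpt above shows opentsdb being spawned and exiting with status 1 several times before supervisord gives up. A minimal Python sketch for tallying those events from a saved copy of such a log could look like the following. The function name summarize and the regular expression are illustrative assumptions based only on the line format shown above; they are not part of Zenoss or supervisord.

```python
import re

# Assumed line format, taken from the excerpt above, e.g.:
#   2017-08-21 16:13:03,339 INFO exited: opentsdb (exit status 1; not expected)
EVENT_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+ \w+ "
    r"(?P<event>spawned|exited|success|gave up): '?(?P<proc>[\w-]+)'?"
)


def summarize(log_text):
    """Tally supervisord events per process (hypothetical helper)."""
    summary = {}
    for line in log_text.splitlines():
        match = EVENT_RE.match(line.strip())
        if match is None:
            continue  # banner lines, CRIT notices, HTTP access lines, etc.
        stats = summary.setdefault(
            match["proc"],
            {"spawned": 0, "exited": 0, "running": False, "fatal": False},
        )
        event = match["event"]
        if event == "spawned":
            stats["spawned"] += 1
        elif event == "exited":
            stats["exited"] += 1
        elif event == "success":
            stats["running"] = True   # process reached RUNNING state
        elif event == "gave up":
            stats["fatal"] = True     # supervisord stopped retrying
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py < opentsdb.log
    for proc, stats in summarize(sys.stdin.read()).items():
        print(proc, stats)
```

A process that ends with "fatal" set and never reached "running" was never successfully started by supervisord, which is what the opentsdb lines above indicate. The sketch only counts events; the reason for exit status 1 would have to come from opentsdb's own output.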

Really appreciate any assistance.

Regards,



------------------------------
Louis Henn
Senior IT Administrator
Cape Town
------------------------------


Subject: RE: Health Checks for opentsdb & Central Query failing
Author: Mohammed Irshad
Posted: 2017-10-13 18:03

Hi,

It is really important that opentsdb and Central Query pass all health checks. You can try restarting them, and if that does not fix it, please restart all the Zenoss services.

Let me know.

------------------------------
Irshad.
------------------------------

Subject: RE: Health Checks for opentsdb & Central Query failing
Author: Laurent Hemeryck
Posted: 2018-09-05 04:51

Hello, 

I currently have a similar issue where the health checks for opentsdb (reader and writer) and CentralQuery are failing. 
Were you able to solve your issue?

Kind regards, 

Laurent

------------------------------
Laurent Hemeryck
Monitoring Engineer
FedNot
------------------------------

