UPDATE: In case this helps anyone, I was able to get rid of the "referenced context has expired" log entries and get consistent data collection for 3 days straight by setting up a cron job to renew the Kerberos ticket every 4 hours.
There is probably a bigger underlying issue here, but I have not been able to get to the root cause yet.
Subject: RE: Zenpython stops collecting after Kerberos ticket is renewed
Author: ge kr
Posted: 2019-01-08 03:47
Daniel,
We see similar issues on Zenoss 5 through 6.2.1 (with the latest ZenPacks) in our environment (~200 Windows servers).
Several attempts, including an engagement with Zenoss Professional Services, led neither to a solution nor to any clue about the root cause.
That's why we find your workaround quite interesting.
Could you please share some further details about the cron job you mentioned, especially how you renew the Kerberos ticket?
Thanks in advance and Best Regards,
------------------------------
Georg
------------------------------
2018-11-17 06:15:56,028 INFO zen.zenpython: 44 devices processed (50580 datapoints)
2018-11-17 06:15:56,037 INFO zen.collector.scheduler: Tasks: 135 Successful_Runs: 6146 Failed_Runs: 0 Missed_Runs: 0 Queued_Tasks: 0 Running_Tasks: 0
2018-11-17 06:19:50,710 ERROR zen.MicrosoftWindows: WindowsServiceLog: failed collection - (('The referenced context has expired', 786432), ('Success', 100001)) EXAMPLE-SERVER-01
2018-11-17 06:19:53,333 ERROR zen.MicrosoftWindows: WindowsServiceLog: failed collection - (('The referenced context has expired', 786432), ('Success', 100001)) EXAMPLE-SERVER-01
2018-11-17 06:20:56,080 INFO zen.maintenance: Performing periodic maintenance
2018-11-17 06:20:56,080 INFO zen.zenpython: Counter eventCount, value 961826735
I understand that this means the Kerberos ticket has expired, but this log entry is only generated for the servers that stop collecting data. This happens to about 2-3 out of 50 servers on each zenpython daemon throughout the day.
Looking at the Kerberos credentials cache file, I notice most, if not all, tickets are set to expire at the same time. This would suggest that they are also all getting renewed around the same time. My theory is that too many servers are set to get their tickets renewed at the same time and some servers are falling through the cracks.
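One way to check the observation above is to count how many tickets share each expiry time. The following is only a rough sketch: the function name count_expiries is made up for illustration, and it assumes MIT-style klist output where each ticket line starts with a "valid starting" date and time followed by the expiry date and time. Other Kerberos implementations or locales may print dates differently.

```shell
#!/bin/bash
# Count tickets per expiry time from `klist` output read on stdin.
# ASSUMPTION: ticket lines look like
#   MM/DD/YY HH:MM:SS  MM/DD/YY HH:MM:SS  service/principal@REALM
# count_expiries is a hypothetical helper name, not part of Zenoss or Kerberos.
count_expiries() {
    awk '$1 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ && $3 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ {
             count[$3 " " $4]++      # key on "expiry-date expiry-time"
         }
         END { for (t in count) print count[t], t }' | sort -k2,3
}

# Usage (the cache file name is the placeholder used in this thread):
# klist -c /opt/zenoss/var/krb5cc/<your-credentials-cache-file> | count_expiries
```

Each output line is a ticket count followed by the expiry time those tickets share, so a single large count would support the "everything expires at once" theory.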
Is there any way to spread the Kerberos ticket renewals? I am not 100% sure this is the problem, but it's my only theory after troubleshooting and researching for days.
Thanks,
Dan
------------------------------
Daniel
------------------------------
Subject: RE: Zenpython stops collecting after Kerberos ticket is renewed
Author: Daniel
Posted: 2019-01-08 09:18
Hi Georg,
You can use the klist command to view the existing Kerberos tickets and their expiration times. You need to specify your Kerberos cache file, which on Zenoss Core 4 is found by default inside /opt/zenoss/var/krb5cc:
klist -c /opt/zenoss/var/krb5cc/<your-credentials-cache-file>
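If you only need a yes/no answer rather than the full listing, something like the sketch below may help. It assumes an MIT Kerberos klist, whose -s option runs silently and exits non-zero when the cache cannot be read or has expired. The cache_is_valid function and the KLIST override variable are hypothetical names introduced here for illustration.

```shell
#!/bin/bash
# KLIST is a hypothetical override knob (handy for testing); defaults to klist on PATH.
KLIST=${KLIST:-klist}

# Return 0 if the credentials cache in $1 is readable and not expired.
# ASSUMPTION: MIT Kerberos klist, where -s means "silent, report via exit status".
cache_is_valid() {
    "$KLIST" -s -c "$1"
}

# Usage (the cache file name is the placeholder used in this thread):
# if cache_is_valid /opt/zenoss/var/krb5cc/<your-credentials-cache-file>; then
#     echo "cache still valid"
# else
#     echo "cache expired or unreadable"
# fi
```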
You can manually renew the Kerberos tickets using the kinit command, like so:
kinit -R -c /opt/zenoss/var/krb5cc/<your-credentials-cache-file>
This command on its own would not run properly inside cron, so I set up a shell script called kerberos_renewal.sh with the following contents:
#!/bin/bash
# location of the shell script that initializes the zenoss environment
ZENOSS_ENV=~zenoss/.bashrc
# print the error message passed to stderr and exit with a return code of 1 (error)
fail() {
    echo "$*" >&2
    exit 1
}
#
# main script starts here
#
# set up the environment
test -f "${ZENOSS_ENV}" || fail "Source environment not found"
. "${ZENOSS_ENV}"
# renew the ticket-granting ticket in the credentials cache
/usr/bin/kinit -R -c /opt/zenoss/var/krb5cc/<your-credentials-cache-file>
My crontab entry looks like this:
# Renew Kerberos tickets every 4 hours
0 */4 * * * /opt/zenoss/bin/kerberos_renewal.sh
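Because cron runs unattended, it can be useful to record whether each renewal actually succeeded. The following is a minimal sketch of that idea and not part of the original script: renew_and_log and the KINIT override variable are hypothetical names, and the log file location is up to you.

```shell
#!/bin/bash
# KINIT is a hypothetical override knob (handy for testing); defaults to /usr/bin/kinit.
KINIT=${KINIT:-/usr/bin/kinit}

# Renew the ticket in cache $1 and append a timestamped result line to log file $2.
# Returns the exit status of kinit so callers (or cron) can react to failures.
renew_and_log() {
    local cache=$1 log=$2 status
    "$KINIT" -R -c "$cache" >>"$log" 2>&1
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "$(date '+%F %T') renewed $cache" >>"$log"
    else
        echo "$(date '+%F %T') renewal FAILED for $cache (exit $status)" >>"$log"
    fi
    return "$status"
}

# Usage (cache file name is the thread's placeholder; the log path is an assumption):
# renew_and_log /opt/zenoss/var/krb5cc/<your-credentials-cache-file> /tmp/kerberos_renewal.log
```

A failed renewal then leaves a line in the log with the kinit exit status, which may help narrow down when and why tickets expire.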
Hope this helps.
Daniel
------------------------------
Daniel
------------------------------
Subject: RE: Zenpython stops collecting after Kerberos ticket is renewed
Author: Jane Curry
Posted: 2019-01-09 04:11
Great information! Many thanks, Daniel.
Cheers,
Jane
------------------------------
Jane Curry
Skills 1st United Kingdom
jane.curry@skills-1st.co.uk
------------------------------
Subject: RE: Zenpython stops collecting after Kerberos ticket is renewed
Author: Shane Quinsey
Posted: 2019-01-09 17:54
Thanks Daniel. Great info.
We are about to go to RM 6.2.1 and start monitoring Windows servers, so this will save me tearing my hair out over it :)
------------------------------
Shane, Australia
ZenN00b
------------------------------