TECHZEN Zenoss User Community ARCHIVE  

ZenOSS 6 thin pool storage filling up quickly

Subject: ZenOSS 6 thin pool storage filling up quickly
Author: David Whiteside
Posted: 2019-03-01 15:19

It fills up by about 30GB per day; I attached a screenshot showing the thin pool storage growth.  Is there a way to determine which Control Center container is filling up the thin pool?

It's a fresh install of Zenoss 6; below is the version info for Control Center.


Version: 1.5.1
GoVersion: go1.7.4
Date: Fri_May_18_18:35:47_UTC_2018
Gitcommit: 9ccf1f2-dirty
Gitbranch: HEAD
Buildtag: jenkins-ControlCenter-support-1.5.x-merge-rpm-build-7
Release: 1.5.1-1

- David

Attachments:

Screen_Shot_2019-03-01_at_1.12.13_PM.png



Subject: RE: ZenOSS 6 thin pool storage filling up quickly
Author: Arthur
Posted: 2019-03-01 18:30

I have seen this when there are snapshots left over. Check with:

# serviced snapshot list
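
If any snapshots show up there, removing them should release space in the thin pool. A rough sketch, assuming the remove subcommand on your serviced version (confirm with "serviced snapshot --help"):

# serviced snapshot list
# serviced snapshot remove <SNAPSHOT_ID>      (repeat for each leftover snapshot)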



------------------------------
Arthur
------------------------------


Subject: RE: ZenOSS 6 thin pool storage filling up quickly
Author: Arthur
Posted: 2019-03-02 03:46

This could help also:
Display the amount of disk space each container is using:

for d in `docker ps | awk '{print $1}' | tail -n +2`; do
    d_name=`docker inspect -f {{.Name}} $d`
    echo "========================================================="
    echo "$d_name ($d) container size:"
    sudo du --max-depth 2 -h /var/lib/docker/devicemapper | grep `docker inspect -f "{{.Id}}" $d`
    echo "$d_name ($d) volumes:"
    for mount in `docker inspect -f "{{range .Mounts}} {{.Source}}:{{.Destination}} {{end}}" $d`; do
        size=`echo $mount | cut -d':' -f1 | sudo xargs du -d 0 -h`
        mnt=`echo $mount | cut -d':' -f2`
        echo "$size mounted on $mnt"
    done
done

https://github.com/control-center/serviced/wiki/Control-Center-Tips-and-Tricks
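
If the Docker release bundled with your Control Center is 1.13 or newer, the built-in disk usage report is a quicker (if less detailed) alternative to the script above; it lists per-container and per-volume sizes:

# docker system df -v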




Subject: RE: ZenOSS 6 thin pool storage filling up quickly
Author: David Whiteside
Posted: 2019-03-07 15:05

I tried @Arthur's bash snippet to show each container's usage, and it doesn't look like any of them are using much.  The thin pool is showing 89GB used.

=========================================================
/serviced-isvcs_logstash (b7ce42e58129) container size:
/serviced-isvcs_logstash (b7ce42e58129) volumes:
54M /var/log/serviced mounted on /var/log/serviced
16M /opt/serviced/isvcs/resources mounted on /usr/local/serviced/resources
=========================================================
/serviced-isvcs_kibana (4dd8b50e8ff7) container size:
/serviced-isvcs_kibana (4dd8b50e8ff7) volumes:
16M /opt/serviced/isvcs/resources mounted on /usr/local/serviced/resources
=========================================================
/serviced-isvcs_opentsdb (f9ace2ec0df8) container size:
/serviced-isvcs_opentsdb (f9ace2ec0df8) volumes:
2.4G /opt/serviced/var/isvcs/opentsdb/hbase mounted on /opt/zenoss/var/hbase
16M /opt/serviced/isvcs/resources mounted on /usr/local/serviced/resources
=========================================================
/serviced-isvcs_elasticsearch-logstash (d034ad354b89) container size:
/serviced-isvcs_elasticsearch-logstash (d034ad354b89) volumes:
1.8G /opt/serviced/var/isvcs/elasticsearch-logstash/data mounted on /opt/elasticsearch-logstash/data
16M /opt/serviced/isvcs/resources mounted on /usr/local/serviced/resources
=========================================================
/serviced-isvcs_docker-registry (a4f2ada3dc2e) container size:
/serviced-isvcs_docker-registry (a4f2ada3dc2e) volumes:
975M /opt/serviced/var/isvcs/docker-registry/v2 mounted on /tmp/registry-dev
16M /opt/serviced/isvcs/resources mounted on /usr/local/serviced/resources
=========================================================
/serviced-isvcs_elasticsearch-serviced (8f241689c8ea) container size:
/serviced-isvcs_elasticsearch-serviced (8f241689c8ea) volumes:
5.5M /opt/serviced/var/isvcs/elasticsearch-serviced/data mounted on /opt/elasticsearch-serviced/data
16M /opt/serviced/isvcs/resources mounted on /usr/local/serviced/resources
=========================================================
/serviced-isvcs_zookeeper (075190ebbe36) container size:
/serviced-isvcs_zookeeper (075190ebbe36) volumes:
3.1M /opt/serviced/var/isvcs/zookeeper/data mounted on /var/zookeeper
16M /opt/serviced/isvcs/resources mounted on /usr/local/serviced/resources

There were two volumes in /opt/serviced/var/volumes/.  I tried fstrim on both; the output is shown below.  It brought my usage down to 7GB, so maybe a cron job will fix my issue.

# /sbin/fstrim /opt/serviced/var/volumes/4hoc5vjm0doqdvzvapq13vi9g
fstrim: /opt/serviced/var/volumes/4hoc5vjm0doqdvzvapq13vi9g: the discard operation is not supported
# /sbin/fstrim /opt/serviced/var/volumes/d5ig9lx0u0hnap5fuuxxq62ei

Thanks for your help.  I will report back if this issue continues, but I think fstrim may resolve it.


Subject: RE: ZenOSS 6 thin pool storage filling up quickly
Author: David Whiteside
Posted: 2019-03-07 14:57

No snapshots.

# serviced snapshot list
no snapshots found


Subject: RE: ZenOSS 6 thin pool storage filling up quickly
Author: Arthur
Posted: 2019-03-07 15:57

Hi David

I have attached the /etc/cron.weekly/serviced-fstrim from Zenoss for reference.

------------------------------
Arthur
------------------------------

Attachments:

serviced-fstrim.txt
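
For reference, a minimal sketch of a weekly fstrim job along those lines (the path glob and the weekly schedule are assumptions; adjust to your install):

#!/bin/sh
# Sketch: trim every Control Center tenant volume once a week
# (e.g. drop this in /etc/cron.weekly). Volumes that do not
# support discard just print an error and are skipped.
for vol in /opt/serviced/var/volumes/*; do
    [ -d "$vol" ] && /sbin/fstrim "$vol"
done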



Subject: RE: ZenOSS 6 thin pool storage filling up quickly
Author: Ryan Matte
Posted: 2019-03-03 13:37

This will happen over time with a thin pool.  What you need to do is set up a cron job to run an fstrim on the volume periodically.  For instance, on my lab box I ran crontab -e as root to edit the crontab and added the following line:

*/15 * * * * /sbin/fstrim /opt/serviced/var/volumes/6zbnssjd86cva5rtoa5ifuhsa

You would need to replace 6zbnssjd86cva5rtoa5ifuhsa with your instance id.  That runs an fstrim against the volume every 15 minutes.  When something is deleted from a filesystem on a thin-provisioned volume, the underlying blocks stay allocated in the thin pool; an fstrim is needed to discard them and bring the utilization back down.  You can try running the fstrim command by hand as root first to see if it causes the utilization to drop back down.

If it doesn't drop back down, then it's likely legitimate usage, at which point you need to cd into /opt/serviced/var/volumes/<instance id> and run "du -sh *" to figure out which of the directories is taking up the most space (see the sketch below).  Try the fstrim as described above and then follow up to let us know whether that resolves it.
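
As a concrete one-off check (using the example id above; substitute your own directory name from /opt/serviced/var/volumes):

# ls /opt/serviced/var/volumes
6zbnssjd86cva5rtoa5ifuhsa
# cd /opt/serviced/var/volumes/6zbnssjd86cva5rtoa5ifuhsa
# du -sh *
# /sbin/fstrim /opt/serviced/var/volumes/6zbnssjd86cva5rtoa5ifuhsa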



------------------------------
Ryan Matte
------------------------------


Subject: RE: ZenOSS 6 thin pool storage filling up quickly
Author: krishna reddy
Posted: 2021-05-04 16:32

Hi Ryan,

I see a similar issue on 6.2.3. In the hbase-master folder, oldWALs is around 333GB. How can I get rid of that cleanly? Below is the data I see:

3.3G archive
4.0K corrupt
221G data
4.0K hbase.id
4.0K hbase.version
4.0K MasterProcWALs
333G oldWALs
832M WALs


------------------------------
krishna reddy
cox
------------------------------


Subject: RE: ZenOSS 6 thin pool storage filling up quickly
Author: Ryan Matte
Posted: 2021-05-04 16:47

Stop then start the hbase services. That should cause it to clear up.
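
That's typically done from the Control Center UI, or from the CLI along these lines (the service name here is an assumption; use whatever your deployment's service tree calls it, as shown by "serviced service list"):

# serviced service stop HBase
# serviced service start HBase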

------------------------------
Ryan Matte
------------------------------


Subject: RE: ZenOSS 6 thin pool storage filling up quickly
Author: krishna reddy
Posted: 2021-05-04 17:09

Thanks Ryan. If for some reason that doesn't clean them up, is it fine to manually delete those files, since they seem to be log files from replication?

see this error:

2021-05-04 16:40:01,811 ERROR [localhost,60000,1618593951858_ChoreService_1] zookeeper.ZooKeeperWatcher:
replicationLogCleaner-0x178dbb7a27b0003, quorum=zk1:2181, baseZNode=/hbase
Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/replication/rs
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:835)
    at org.apache.hadoop.hbase.replication.ReplicationQueuesClientZKImpl.getQueuesZNodeCversion(ReplicationQueuesClientZKImpl.java:80)
    at org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.loadWALsFromQueues(ReplicationLogCleaner.java:99)
    at org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.getDeletableFiles(ReplicationLogCleaner.java:70)
    at org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteFiles(CleanerChore.java:233)
    at org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteEntries(CleanerChore.java:157)
    at org.apache.hadoop.hbase.master.cleaner.CleanerChore.chore(CleanerChore.java:124)
    at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:185)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

------------------------------
krishna reddy
cox
------------------------------


Subject: RE: ZenOSS 6 thin pool storage filling up quickly
Author: krishna reddy
Posted: 2021-05-04 17:19

Thanks Ryan. Restart fixed it...

------------------------------
krishna reddy
cox
------------------------------

