@rmyers
Last active December 28, 2015 11:19
Cassandra Backup
from trove.guestagent.strategies.backup import base
from eventlet.green import subprocess


class CassandraDump(base.BackupRunner):
    """Implementation of Backup Strategy for CassandraDump."""

    __strategy_name__ = 'cassandradump'

    # The '-' makes tar write the archive to stdout so it can be streamed.
    cmd = ('sudo tar cpfP - '
           '$(sudo find /var/lib/cassandra/data/ -type d -name %(filename)s)')

    def pre_cmd(self):
        # TODO: Need to add pre_command to base class
        # TODO: Add kwargs to base class
        # Take a snapshot tagged with the backup filename before archiving.
        pre_cmd = 'nodetool snapshot -t %(filename)s' % self.kwargs
        subprocess.check_call(pre_cmd, shell=True)

    def run(self):
        self.pre_cmd()
        super(CassandraDump, self).run()
        self.post_cmd()

    def post_cmd(self):
        # see https://review.openstack.org/#/c/55311/
        # TODO: Do some post processing to see if nodetool worked?
        subprocess.check_call('nodetool clearsnapshot', shell=True)
        return True

    def check_process(self):
        # check for error logs?
        pass

    @property
    def filename(self):
        return '%s.tar' % self.base_filename
@denismakogon

It could work, but in check_process it would be better to use utils.execute_with_timeout, which would let you capture the command output and then check whether it contains "all keyspaces".
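A minimal sketch of the output check suggested above. It uses the standard library's subprocess with a timeout as a stand-in for utils.execute_with_timeout, so the helper name command_output_contains and its parameters are hypothetical, and the "all keyspaces" marker is taken from the comment rather than verified against a particular nodetool version.

```python
import subprocess
import sys


def command_output_contains(args, expected, timeout=60):
    """Run args and report whether it exited 0 and printed `expected`.

    Hypothetical helper: plain subprocess stands in for trove's
    utils.execute_with_timeout here.
    """
    try:
        result = subprocess.run(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode output to str
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing binary or a hung command both count as a failed check.
        return False
    return result.returncode == 0 and expected in result.stdout


if __name__ == "__main__":
    # Stand-in command so the sketch runs without Cassandra installed.
    demo = [sys.executable, "-c", "print('snapshot(s) for [all keyspaces]')"]
    print(command_output_contains(demo, "all keyspaces"))
```

In check_process this could be called as command_output_contains(['nodetool', 'clearsnapshot'], 'all keyspaces'), assuming nodetool is on the PATH and prints that marker on success.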

@denismakogon

About "-h 127.0.0.1": it is not necessary, because nodetool can connect to Cassandra through several hosts (lo, eth0, etc.).

@denismakogon

About data_dir: since the data directory is configurable, it would be better to read it from the config file.
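A rough sketch of reading the data directory from configuration, assuming the config file in question is cassandra.yaml and that the paths live under its data_file_directories key. The function name read_data_dirs is hypothetical, and the line-based parsing only keeps the example dependency-free; real code would more likely use a YAML parser.

```python
def read_data_dirs(yaml_text, default=("/var/lib/cassandra/data",)):
    """Return the paths listed under data_file_directories.

    Falls back to `default` (the path hard-coded in the gist) when the
    key is absent or commented out.
    """
    dirs = []
    in_block = False
    for raw in yaml_text.splitlines():
        # Drop comments and trailing whitespace; skip blank lines.
        line = raw.split("#", 1)[0].rstrip()
        if not line.strip():
            continue
        if line.startswith("data_file_directories:"):
            in_block = True
            continue
        if in_block:
            item = line.strip()
            if item.startswith("- "):
                dirs.append(item[2:].strip().strip("'\""))
            else:
                break  # next key reached, the list is over
    return dirs or list(default)
```

The result could then replace the hard-coded /var/lib/cassandra/data/ in the find command, for example by reading the file once when the strategy is set up.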

@rmyers
Author

rmyers commented Feb 20, 2014

This is just a rough idea; I didn't completely test it. I think check_process should be the post_dump command I was referring to.
