@khellan
Created September 21, 2018 11:15
Batchwise deletion of malformed HBase row keys (here, keys containing a tab character). The script does not stop on its own once all matching keys are deleted, so it needs monitoring.
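import happybase

# HBASE_MASTER_IP, TABLE_NAME and COLUMN_NAMES are placeholders; define them before running.
connection = happybase.Connection(HBASE_MASTER_IP)
table = connection.table(TABLE_NAME)
while True:
    batch = table.batch()
    # Scan up to 10000 rows whose key contains a tab character (\x09).
    for key, _ in table.scan(columns=[COLUMN_NAMES], filter="RowFilter(=, 'regexstring:.*\x09.*')", limit=10000):
        batch.delete(key)
    batch.send()
    # Progress marker: the most recent key queued for deletion.
    print(key)
# Never reached: the loop above has no exit condition.
connection.close()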
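The script above loops forever, which is why it needs monitoring. Below is a minimal sketch of a variant that returns once a scan finds no more matching keys. It assumes the table object behaves like a happybase `Table` (providing `scan()` and `batch()`). The function name `delete_matching_rows` and its parameters are hypothetical and not part of the original gist.

```python
def delete_matching_rows(table, row_filter, columns=None, batch_size=10000):
    """Delete rows matching row_filter in batches and return the number deleted.

    Assumption: `table` behaves like a happybase Table (scan() and batch()).
    The loop ends as soon as a scan yields no matching rows.
    """
    total = 0
    while True:
        batch = table.batch()
        count = 0
        for key, _ in table.scan(columns=columns, filter=row_filter, limit=batch_size):
            batch.delete(key)
            count += 1
        if count == 0:
            # Nothing left to delete, so stop instead of looping forever.
            return total
        batch.send()
        total += count
        print(f"deleted {total} rows so far, last key: {key!r}")


# Hypothetical usage, reusing the placeholders from the script above:
#   import happybase
#   connection = happybase.Connection(HBASE_MASTER_IP)
#   delete_matching_rows(connection.table(TABLE_NAME),
#                        "RowFilter(=, 'regexstring:.*\x09.*')",
#                        columns=[COLUMN_NAMES])
#   connection.close()
```

Note that the loop still relies on the deletes taking effect before the next scan. If a delete silently fails, the same keys are returned again and the loop does not terminate.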