@marcosgz
Last active August 23, 2022 15:26
Prune orphaned documents from an esse index that uses the active_record plugin
# Remove orphaned items (documents without an equivalent database record) from an Elasticsearch/OpenSearch index
index = AccountsIndex # Target index
# Scroll through every document, skipping the document body (_source: false)
index.search(query: { match_all: {} }, _source: false).scroll_hits do |hits|
  print(".")
  es_ids = hits.map { |hit| hit.fetch("_id").to_i }
  # Find which IDs from this batch still exist in the database
  db_ids = index.repo.dataset.except(:includes, :preload).where(id: es_ids).pluck(:id)
  prune_ids = es_ids - db_ids
  next if prune_ids.empty?

  puts "Pruning #{prune_ids.size} #{index} from Elasticsearch"
  index.cluster.client.delete_by_query(
    index: index.index_name,
    body: {
      query: {
        ids: {
          values: prune_ids
        }
      }
    }
  )
end
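
The core of each batch is a set difference: IDs found in the index minus IDs found in the database. Below is a minimal, self-contained sketch of that step using in-memory stand-ins instead of a real cluster or database. The helper name `orphaned_ids` and the sample data are hypothetical and not part of esse. It assumes, as the script above does, that document `_id` values are integer primary keys serialized as strings.

```ruby
require "set"

# Hypothetical helper (not part of esse): returns the IDs present in the
# search index batch but missing from the database, preserving batch order.
def orphaned_ids(es_ids, db_ids)
  known = db_ids.to_set
  es_ids.reject { |id| known.include?(id) }
end

# In-memory stand-ins for one scroll batch and the matching database rows.
es_batch = %w[1 2 3 4].map(&:to_i) # document _ids arrive as strings
db_rows  = [1, 3]                  # IDs that still exist in the database

orphans = orphaned_ids(es_batch, db_rows)
puts "Would prune #{orphans.size} document(s): #{orphans.inspect}"
```

Printing the orphaned IDs like this before calling `delete_by_query` can serve as a dry run. Note that `String#to_i` returns `0` for non-numeric strings, so this approach probably only suits indices whose `_id` values are numeric.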