This script lets you run SQL GROUP BY-like aggregations over multiple fields in an Elasticsearch index.
Performance will likely be poor on large data sets, since the script is evaluated for every matching document.
Groovy script, saved as <elasticsearch_dir>/config/scripts/join-param-list.groovy:
return fields.collect { doc[it].value }.join(delimiter);
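To make the script's behavior concrete, here is a minimal Python sketch of the same idea: look up each listed field on a document and join the values into one composite key. The function name join_param_list and the sample document are made up for illustration, and a plain dict stands in for the Elasticsearch doc object.

```python
def join_param_list(doc, fields, delimiter):
    """Emulate the Groovy script: join the listed field values into one key."""
    # str() is an assumption here so that non-string values can be joined too.
    return delimiter.join(str(doc[name]) for name in fields)


# Hypothetical document, made up for illustration (not real index data).
sample_doc = {"firstname": "abbott", "lastname": "smith", "employer": "acme"}

key = join_param_list(sample_doc, ["firstname", "lastname", "employer"], "|")
print(key)
```

Documents that share the same values in every listed field produce the same key, which is what lets a single terms aggregation behave like a multi-column GROUP BY. If a field value can itself contain the delimiter, two different combinations may produce the same key, so pick a delimiter that does not occur in the data.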
A representative query that does a "GROUP BY" to count identical first-name / last-name / employer combinations:
{
  "query": {
    "term": {"_type": "account"}
  },
  "size": 1,
  "aggs": {
    "agg1": {
      "terms": {
        "script": {
          "file": "join-param-list",
          "lang": "groovy",
          "params": {"fields": ["firstname", "lastname", "employer"], "delimiter": "|"}
        }
      }
    }
  }
}
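If you want to group by different field lists without hand-editing the JSON, the request body can be generated. The following is only a sketch: build_group_by_query is a hypothetical helper, and its default values simply mirror the query above. It builds the body and prints it; sending it to Elasticsearch is left out.

```python
import json


def build_group_by_query(fields, delimiter="|", doc_type="account", agg_name="agg1"):
    """Build the request body shown above for an arbitrary list of fields."""
    return {
        "query": {"term": {"_type": doc_type}},
        "size": 1,
        "aggs": {
            agg_name: {
                "terms": {
                    "script": {
                        # Name of the saved file script, without the .groovy extension.
                        "file": "join-param-list",
                        "lang": "groovy",
                        "params": {"fields": list(fields), "delimiter": delimiter},
                    }
                }
            }
        },
    }


body = build_group_by_query(["firstname", "lastname", "employer"])
print(json.dumps(body, indent=2))
```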
Sample aggregation output:
"aggregations": {
  "agg1": {
    "doc_count_error_upper_bound": 5,
    "sum_other_doc_count": 990,
    "buckets": [
      {
        "key": "abbott|smith|acme",
        "doc_count": 1
      },
etc
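Each bucket key is the joined string, so a client usually has to split it back into per-field values. A minimal sketch, assuming no field value contains the delimiter; split_bucket_keys is a hypothetical helper, and the input is the single bucket from the sample output above.

```python
def split_bucket_keys(buckets, fields, delimiter="|"):
    """Turn composite bucket keys back into one row of field values per bucket."""
    rows = []
    for bucket in buckets:
        # Assumes the delimiter never occurs inside a field value.
        values = bucket["key"].split(delimiter)
        row = dict(zip(fields, values))
        row["doc_count"] = bucket["doc_count"]
        rows.append(row)
    return rows


# The one bucket shown in the sample output above.
buckets = [{"key": "abbott|smith|acme", "doc_count": 1}]

rows = split_bucket_keys(buckets, ["firstname", "lastname", "employer"])
print(rows)
```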