Combining Docker and Elasticsearch makes many things in life easier. Depending on the scale, there are three scenarios for deploying Elasticsearch:

- Single ES instance on a single machine
- Multiple ES instances on a single machine
- Multiple ES instances on multiple machines

Docker has a mechanism corresponding to each of these scenarios:
- Docker commands
- Docker Compose
- Docker Swarm
I have played around with Docker and Elasticsearch for a while and learned quite a few hard lessons. I summarize some of them below for future reference.
- Choose the right Docker base image
  - One great thing about using Docker is that Elasticsearch can be upgraded painlessly.
  - It may be better to use the Alpine versions from the official Docker Hub repo: no extra extensions and a smaller size.
  - Commit customized images to a repository, which scales to the cluster or to other clusters.
- Persistence of three things
  - The configuration of Elasticsearch lives in `elasticsearch.yml`, and logging is controlled by `log4j2.properties`.
  - The three things (configuration, data and logging) should be mounted as volumes, as in the following Docker Compose file.
```yaml
version: '3'
services:
  elasticsearch:
    image: your-repo/elasticsearch:5.4.0
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms8g -Xmx8g"
      - node.name=main01
      - network.publish_host=${MAIN01IP}
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ${YOUR_PATH}/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - ${YOUR_PATH}/data:/usr/share/elasticsearch/data
      - ${YOUR_PATH}/logs:/usr/share/elasticsearch/logs
      - ${YOUR_PATH}/backup:/usr/share/elasticsearch/backup
    ports:
      - 9200:9200
      - 9300:9300
```
- If the snapshot mechanism is desired, the backup volume can also be mounted.
- Three levels of backup:
  - Snapshot important indexes
  - Version control
  - Delete by alias
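The snapshot level of backup goes through Elasticsearch's snapshot API. A minimal sketch, assuming the mounted `backup` volume is registered as a filesystem repository (the names `my_backup` and `snapshot_1` are illustrative):

```
PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": { "location": "/usr/share/elasticsearch/backup" }
}

PUT /_snapshot/my_backup/snapshot_1?wait_for_completion=true
{
  "indices": "blogs"
}
```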
- Three levels of isolation
  - It is always the first challenge to decide the entry point for a data source; a wrong decision means re-indexing. Elasticsearch has three levels of isolation: cluster, index and type.
  - Cluster level: multiple clusters can be formed for various purposes, even though they may coexist on the same physical machines. The data on different clusters are completely isolated, unless one cluster exhausts resources that other clusters also rely on.
  - Index level: data separated into different indexes live on different shards. It is still possible to search and aggregate across them, e.g. `GET /idxA,idxB/type1,type2/_search?q=Q1`, but the indices are logically distant and any cross-index operation is costly. The most important parameters for an index are `number_of_shards` and `number_of_replicas`.
```python
# Python representation
from elasticsearch_dsl import Index

blogs = Index('blogs')
blogs.settings(
    number_of_shards=1,
    number_of_replicas=0
)
```
  - Type level: unlike a table in a relational database, a `type` itself works only as a filter. But it is the minimal unit that holds a `mapping`, which rules how documents are searched. As in the upcoming Elasticsearch 6, the `_all` field should always be disabled; doing so can save up to 50% of disk usage (if all fields are text fields).
```python
# Python representation
from elasticsearch_dsl import DocType, Date, Integer, Keyword, Text, MetaField

@blogs.doc_type
class Article(DocType):
    title = Text(analyzer='standard', fields={'raw': Keyword()})
    body = Text(analyzer='snowball')
    tags = Keyword()
    published_from = Date()
    lines = Integer()

    class Meta:
        all = MetaField(enabled=False)
```
- Shard management
  - Try to index once and for all, since reindexing is painful. Elasticsearch first searches the documents within a shard. According to the formula `shard = hash(routing) % number_of_primary_shards`, the routing string can be used to specify the shard on which a particular document lies.
  - Set `cluster.routing.allocation.same_shard.host=true`.
  - Disable `cluster.routing.allocation.allow_rebalance` at rush hours.
  - A common alternative is time-based indices, while the disadvantage is unbalanced shards (one may hold 20GB while others only hold a couple of MB). The Rollover and Shrink APIs help with this.
  - The advantages of customized routing are faster queries and the ease of phasing out the cold data later on. Cold data migration:
```
POST /_cluster/reroute
{
  "commands" : [
    {
      "move" : {
        "index" : "test", "shard" : 0,
        "from_node" : "node1", "to_node" : "node2"
      }
    }
  ]
}
```
- I think the simplest management tool is the Chrome version of Elasticsearch Head.
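The Rollover API mentioned above caps a time-based index by age or size so that shards stay balanced. A sketch, assuming a hypothetical write alias `logs_write`:

```
POST /logs_write/_rollover
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 10000000
  }
}
```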
- Use Docker to squeeze hardware
  - In my experience, Elasticsearch is mostly memory bound, since every segment costs expensive space. Two rules should be kept in mind: give the JVM heap no more than half of the available RAM (the rest goes to the filesystem cache), and keep the heap below about 32GB so the JVM can use compressed object pointers.
  - There are some hardware beasts such as the Dell 730xd, which are good candidates for running multiple instances on a single machine. I have a 6-node docker-compose demo file here.
- Use Elasticsearch to monitor Elasticsearch
  - There are a lot of options to monitor an Elasticsearch cluster, such as Graphite and cAdvisor. But Elasticsearch and Kibana themselves are probably the easiest.
  - Sometimes a simple Python script can realize node-level monitoring:
```python
import json
import time
from datetime import datetime

import requests
from elasticsearch import Elasticsearch

# Index the node stats of one cluster into another (or the same) cluster
es = Elasticsearch(["IP-ADDRESS-01"])

while True:
    r = requests.get('http://IP-ADDRESS-02:9200/_nodes/stats?pretty')
    current = json.loads(r.text)
    current.update({'timestamp': datetime.utcnow()})
    es.index(index='es-stats', doc_type='data', body=current)
    time.sleep(1)
```
- Avoid split brain
  - The cluster name is very important: nodes find other nodes with the same cluster name to form a cluster. Node names should be unique.
  - Add the masters' IPs to `discovery.zen.ping.unicast.hosts`. The node will then scan the ports from 9300 to 9305.
  - Set `discovery.zen.minimum_master_nodes` by the equation `(number of masters / 2) + 1`. If there are 3 masters, then `discovery.zen.minimum_master_nodes: 2`.
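Put together, the split-brain settings in `elasticsearch.yml` might look like this sketch for a 3-master setup (the IPs and node name are illustrative):

```yaml
# elasticsearch.yml sketch; IPs and names are illustrative
cluster.name: docker-cluster
node.name: main01                       # must be unique per node
discovery.zen.ping.unicast.hosts: ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
discovery.zen.minimum_master_nodes: 2   # (3 masters / 2) + 1
```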
- Docker is not always idempotent
- Development environments use Docker Engine.
- Bootstrap checks (production environment)
  - Bootstrap checks were mentioned above; they are one of the differences from older versions. Earlier versions emitted the same warnings, but they were sometimes ignored, which caused unstable services. Since version 5.0, these checks are strictly enforced at startup to guarantee stability in production.
  - The checks mainly cover memory, thread counts, file handles, and so on.
  - JVM heap: it is recommended to set the minimum and maximum heap to the same size, e.g. `ES_JAVA_OPTS=-Xms512m -Xmx512m`. When `bootstrap.memory_lock` is set, the memory is locked at startup.
  - Lock the memory and forbid swapping between memory and disk: `bootstrap.memory_lock: true`
  - Lift the open-file limit: `ulimit -n unlimited`
  - Lift the locked-memory limit (note that `-l` controls locked memory, which `bootstrap.memory_lock` needs): `ulimit -l unlimited`
- Use Nginx as the Docker cluster's load balancer
  - Indexing in Elasticsearch takes quite a lot of computation power, because it needs an extra tokenization step while ingesting data. On the application side, the indexer is usually not the rate limiter, since we can easily do multi-processing. In other words, indexing is CPU bound on the ES cluster.
  - The usual way to speed up this process is to build a streaming pipeline that performs bulk indexing on top of a queue service such as Kafka.
  - A simple alternative is to use Nginx as a cheap load balancer so that every CPU is fully exposed. I have seen an ES cluster increase its throughput from 2k documents per second (DPS) to 5k DPS after applying this approach.
```
http {
    upstream docker_cluster {
        server docker_ip_1;
        server docker_ip_2;
        server docker_ip_3;
    }
}
```
- Or, furthermore, adjust the servers with `weight` based on their hardware limits.
```
http {
    upstream docker_cluster {
        server docker_ip_1 weight=5;
        server docker_ip_2 weight=3;
        server docker_ip_3 weight=2;
    }
}
```
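For the queue-based bulk indexing mentioned earlier, the payload sent to the `_bulk` endpoint is NDJSON: one action line followed by one document line per document. A minimal sketch of building such a body in plain Python (in practice `elasticsearch.helpers.bulk` constructs and sends this for you):

```python
# Build an NDJSON body for the _bulk endpoint: for each document, an action
# line followed by the document itself, ending with a mandatory newline.
import json

def bulk_body(index, doc_type, docs):
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # _bulk requires the body to end with \n

body = bulk_body("blogs", "article", [{"title": "a"}, {"title": "b"}])
```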