The following settings should be configured on the cluster to achieve high availability and stability.
Data and logs path
By default, the plugins, logs, and data are placed in the installation path. This can lead to unfortunate accidents in which the installation directory is overwritten by a new installation of Elasticsearch. The best approach is to relocate the data, log, and plugin directories outside the installation location as follows:
# Path to store data
path.data: /path/to/data1,/path/to/data2
# Path to store log files
path.logs: /path/to/logs
# Path to where plugins are installed
path.plugins: /path/to/plugins
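As a quick sanity check before an upgrade, the idea of keeping these directories outside the installation location can be sketched in Python. This is only an illustration: the function name `inside_install_dir` and the installation directory `/opt/elasticsearch` are assumptions made for the example and are not part of Elasticsearch.

```python
from pathlib import PurePosixPath


def inside_install_dir(install_dir, path):
    """Return True if `path` is the installation directory or lies below it."""
    install = PurePosixPath(install_dir)
    candidate = PurePosixPath(path)
    # Compare whole path components, so "/opt/elasticsearch-data" is not
    # mistaken for a child of "/opt/elasticsearch".
    return candidate == install or install in candidate.parents


# Hypothetical installation directory, used only for this example.
INSTALL_DIR = "/opt/elasticsearch"

for configured in ("/path/to/data1", "/path/to/logs", "/opt/elasticsearch/data"):
    status = "INSIDE install dir" if inside_install_dir(INSTALL_DIR, configured) else "ok"
    print(configured, "->", status)
```

Any path reported as inside the installation directory is a candidate for relocation.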
Minimum master nodes
The minimum master nodes setting prevents split brain, which occurs when two masters are present in a single cluster. With two masters, data integrity is at risk, since two nodes each think they are in charge. If split brain happens, data can be lost.
This setting should always be configured to a quorum (majority) of your master-eligible nodes. A quorum is (number of master-eligible nodes / 2) + 1, with the division rounded down. Our cluster currently has 3 nodes, so the minimum number of master nodes is 2. This setting can be configured in the config file as follows:
discovery.zen.minimum_master_nodes: 2
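The quorum arithmetic is easy to get wrong when the cluster grows, so here is a minimal Python sketch of the formula above. The function name `minimum_master_nodes` is just a label chosen for this example.

```python
def minimum_master_nodes(master_eligible):
    """Return the quorum (strict majority) of master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    # Integer division rounds down, e.g. 3 // 2 + 1 == 2.
    return master_eligible // 2 + 1


for nodes in (1, 2, 3, 5):
    print(nodes, "master-eligible ->", minimum_master_nodes(nodes))
```

Note that with only 2 master-eligible nodes the quorum is also 2, so losing either node leaves the cluster without a master.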
Cluster settings can also be configured dynamically through an API call as follows:
PUT /_cluster/settings
{
  "persistent" : {
    "discovery.zen.minimum_master_nodes" : 2
  }
}
This setting will become a persistent setting that takes precedence over whatever is in the static configuration.
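If you prefer to script this call, the following Python sketch builds the same request with the standard library only. The helper name `cluster_settings_request` is made up for this example, and the base URL assumes a node listening on localhost:9200. The request is only constructed here, not sent.

```python
import json
import urllib.request


def cluster_settings_request(base_url, settings):
    """Build a PUT request that stores `settings` as persistent cluster settings."""
    body = json.dumps({"persistent": settings}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = cluster_settings_request(
    "http://localhost:9200", {"discovery.zen.minimum_master_nodes": 2}
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it to a running node:
# urllib.request.urlopen(request)
```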
Default Ping timeout
Node detection is controlled by the discovery.zen.fd.ping_timeout property. In Elasticsearch v1.5.2, a ping between nodes times out after 3 seconds. This setting can be increased if the network is slow.
discovery.zen.fd.ping_timeout: 30s
Delete all indices settings
The delete index operation can be applied to more than one index, or to all indices in the cluster, by using wildcards. To guard against accidental deletion, we can disable deleting indices by wildcard by setting the action.destructive_requires_name property to true.
PUT /_cluster/settings
{
  "persistent": {
    "action.destructive_requires_name": true
  }
}
Using Unicast over Multicast
Multicast is excellent in the development phase, since nodes on the network automatically join and form a cluster. For production, however, Elasticsearch recommends using unicast over multicast. We should provide a list of nodes that need to be present in the cluster. Unicast can be enabled with the following settings in the configuration file:
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["HOST1", "HOST2", "HOST3"]
Recovery settings
To avoid shard shuffling during recovery when the cluster restarts, we need to customize the recovery settings. When the cluster restarts, Elasticsearch does not know how many nodes will eventually be in the cluster, so it tries to balance all known primary shards and replicas across the machines it knows about. When another node suddenly joins the cluster, this whole allocation changes. To avoid the resulting traffic and overhead, Elasticsearch can be configured with a minimum number of nodes it should expect before starting the recovery phase.
The following setting prevents Elasticsearch from starting recovery until at least the specified number of nodes are present.
gateway.recover_after_nodes: 2
The following settings tell Elasticsearch how many nodes should be in the cluster and how long to wait for all of those nodes:
gateway.expected_nodes: 2
gateway.recover_after_time: 5m
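How the three gateway settings interact can be sketched as a small decision function. This is a simplified model of my reading of the settings, not Elasticsearch's actual implementation. The function name and the `seconds_waited` parameter (time elapsed since the minimum node count was reached) are assumptions made for the example.

```python
def should_start_recovery(nodes_present, seconds_waited,
                          recover_after_nodes=2, expected_nodes=2,
                          recover_after_time=300):
    """Simplified model of when recovery may begin (time values in seconds)."""
    # Below the hard minimum, recovery never starts.
    if nodes_present < recover_after_nodes:
        return False
    # All expected nodes have joined: no reason to wait any longer.
    if nodes_present >= expected_nodes:
        return True
    # Otherwise keep waiting until the timeout has elapsed.
    return seconds_waited >= recover_after_time


# With the values from this post (2 and 2), recovery starts as soon as 2 nodes join.
print(should_start_recovery(nodes_present=2, seconds_waited=0))
# Hypothetical larger cluster: 3 nodes expected, only 2 present so far.
print(should_start_recovery(2, 60, expected_nodes=3))
print(should_start_recovery(2, 300, expected_nodes=3))
```

The timeout only matters when gateway.expected_nodes is larger than gateway.recover_after_nodes. With both set to 2, as in this post, recovery begins immediately once the second node joins.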
Memory Settings
The ES_HEAP_SIZE environment variable sets the heap memory allocated to the Elasticsearch Java process. I already made a detailed post about setting the heap size in Windows.
Set the environment variable as follows:
ES_HEAP_SIZE 4g
mlockall
To lock the process address space into RAM and prevent any Elasticsearch memory from being swapped out, configure the following:
bootstrap.mlockall: true
We can check whether the setting has been applied to the nodes by using either of the following commands:
curl http://localhost:9200/_nodes/process?pretty
GET /_nodes/process?pretty
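On a cluster with many nodes, reading that output by eye gets tedious. The Python sketch below lists the nodes that did not report mlockall as enabled. It assumes the response has a top-level "nodes" object whose entries carry a "name" and a "process" object with an "mlockall" flag. The sample data is made up for illustration and is not real output.

```python
def nodes_without_mlockall(nodes_process_response):
    """Return the names of nodes that did not report mlockall as enabled."""
    unlocked = []
    for node_id, info in nodes_process_response.get("nodes", {}).items():
        # A missing flag is treated the same as false.
        if not info.get("process", {}).get("mlockall", False):
            unlocked.append(info.get("name", node_id))
    return sorted(unlocked)


# Hypothetical, trimmed-down response used only to exercise the function.
sample = {
    "nodes": {
        "node-a": {"name": "es-1", "process": {"mlockall": True}},
        "node-b": {"name": "es-2", "process": {"mlockall": False}},
    }
}
print(nodes_without_mlockall(sample))
```

In practice the dictionary would come from parsing the JSON returned by the commands above, for example with json.loads.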
Note that the environment configuration options available during the installation are copied and will be used during the service lifecycle.
This means any changes made to them after the installation will not be picked up unless the service is reinstalled.