Elasticsearch Cluster Deployment on DigitalOcean
Elasticsearch is deployed on three droplets hosted on DigitalOcean.
The Elasticsearch cluster consists of several nodes that communicate with each other over transport port 9300. Internal traffic between cluster nodes runs over the private network 10.10.1.0/24.
Request load balancing is performed using a DigitalOcean Internal Load Balancer, accessible via the domain name elasticsearch-lb.prod.rockengroup.com on port 9200.
Port forwarding is also configured on the load balancer:
HTTPS on port 9200 -> HTTP on port 9200
The same port is used for droplet health checks (health checks are configured on port 9200).
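Reachability through the load balancer can be checked with a plain HTTP request. A minimal sketch, assuming the built-in elastic user (the password is in 1Password):

```bash
# Query cluster health through the load balancer.
# <password> is the elastic password from 1Password.
curl -u "elastic:<password>" \
  "https://elasticsearch-lb.prod.rockengroup.com:9200/_cluster/health?pretty"
# A healthy cluster reports "status" : "green".
```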
List of Elasticsearch Droplets:
| Droplet Name | Domain | IP Address |
|---|---|---|
| elasticsearch-01-prod | elasticsearch-01.prod.rockengroup.com | 167.172.97.179 |
| elasticsearch-02-prod | elasticsearch-02.prod.rockengroup.com | 178.128.207.86 |
| elasticsearch-03-prod | elasticsearch-03.prod.rockengroup.com | 46.101.168.210 |
Access to the cluster is restricted with a login and password (password can be found in 1Password).
The cluster is deployed using Docker Compose files (GitLab URL).
Directory Structure and Description:
- certs – folder with the certificates used for internal communication between cluster nodes
- docker-compose.yml – Docker Compose file with the configuration
- .env – file with the variables used in the Docker Compose file
The connection between nodes is established using SSL.
Adding new nodes is possible by modifying discovery.seed_hosts and creating a new Docker Compose file for each new node.
Example of the Docker Compose file for elasticsearch-01-prod:
```yaml
version: "3.8"

services:
  elasticsearch-node:
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    container_name: elasticsearch-docker-node-1
    volumes:
      - elastic-data1:/usr/share/elasticsearch/data
      - elastic-config:/usr/share/elasticsearch/config
    ports:
      - ${ES_PORT}:9200
      - ${ES_PORT_TRANSFER}:9300
    environment:
      - node.name=elasticsearch-01
      - cluster.name=elasticsearch-cluster
      - network.host=0.0.0.0
      - network.publish_host=10.10.1.26
      - xpack.security.enabled=true
      - xpack.security.transport.ssl.enabled=true
      - xpack.security.transport.ssl.verification_mode=certificate
      - xpack.security.transport.ssl.keystore.path=/usr/share/elasticsearch/config/certs/elastic-certificates.p12
      - xpack.security.transport.ssl.keystore.password=${CERT_STORE_PASSWORD}
      - xpack.security.transport.ssl.truststore.path=/usr/share/elasticsearch/config/certs/elastic-certificates.p12
      - xpack.security.transport.ssl.truststore.password=${CERT_STORE_PASSWORD}
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - bootstrap.memory_lock=true
      - discovery.seed_hosts=10.10.1.27:9300,10.10.1.28:9300
      # - cluster.initial_master_nodes=elasticsearch-01,elasticsearch-02,elasticsearch-03
    ulimits:
      memlock:
        soft: -1
        hard: -1
    networks:
      elastic_network:
        ipv4_address: 172.20.1.11

networks:
  elastic_network:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.1.0/24

volumes:
  elastic-data1:
    driver: local
  elastic-config:
    driver: local
```
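The variables referenced in the compose file come from the .env file alongside it. A representative sketch; the values below are illustrative placeholders, not the production values:

```bash
# .env – all values here are placeholders
STACK_VERSION=8.11.1          # Elasticsearch image tag (placeholder)
ES_PORT=9200                  # published HTTP port
ES_PORT_TRANSFER=9300         # published transport port
CERT_STORE_PASSWORD=changeme  # password for elastic-certificates.p12 (placeholder)
ELASTIC_PASSWORD=changeme     # password for the built-in elastic user (placeholder)
```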
To add a new node to an existing cluster, copy the docker-compose.yml, the certs folder, and the .env file from one of the existing nodes, then update node.name, the IP addresses, and other variables as needed (see the sketch below). The cluster will automatically replicate all indices and settings to the new node.
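A minimal sketch of the values that change for a hypothetical fourth node; the node name elasticsearch-04 and the private IP 10.10.1.29 are assumptions for illustration:

```yaml
    environment:
      - node.name=elasticsearch-04                  # hypothetical new node name
      - network.publish_host=10.10.1.29             # assumed private IP of the new droplet
      # seed all three existing nodes so the new node can join the cluster
      - discovery.seed_hosts=10.10.1.26:9300,10.10.1.27:9300,10.10.1.28:9300
```

cluster.initial_master_nodes stays commented out: it is only used when bootstrapping a brand-new cluster, never when joining an existing one.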
Key Parameters Explained:
- image – docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION} – the image used to start the container. The ${STACK_VERSION} variable is stored in the .env file.
- volumes – local storage for persistent data and certificates: elastic-data1:/usr/share/elasticsearch/data and elastic-config:/usr/share/elasticsearch/config. The certs folder with the self-signed certificates must be located in the elastic-config volume.
- xpack.security.enabled – enables X-Pack Security to support SSL and password authentication in Elasticsearch. The other options (xpack.security.transport.ssl.enabled, xpack.security.transport.ssl.verification_mode, etc.) configure additional transport security settings.
- ELASTIC_PASSWORD – password for accessing Elasticsearch.
- bootstrap.memory_lock=true – prevents Elasticsearch from swapping its memory to disk (a verification sketch follows this list).
- discovery.seed_hosts – list of IP addresses of the other cluster nodes.
- cluster.initial_master_nodes – used only to initialize the cluster during its first deployment; it remains commented out afterwards.
- ulimits – lifts the memlock limits (soft and hard set to -1, i.e. unlimited) so that memory locking can work.
- networks – creates a network using the bridge driver.
- volumes – declares the local volumes for Elasticsearch data and configuration.
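Whether memory locking actually took effect can be verified through the nodes API; every node should report true:

```bash
# Ask each node whether mlockall is active.
curl -u "elastic:<password>" \
  "http://localhost:9200/_nodes?filter_path=**.mlockall&pretty"
```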
Elasticsearch and mmapfs:
Elasticsearch uses the mmapfs directory by default to store its indices. The default operating-system limit on mmap counts (vm.max_map_count) is likely to be too low, so the Elasticsearch documentation recommends increasing it to 262144.
To do this, add the parameter vm.max_map_count=262144 to the /etc/sysctl.conf file and execute the command sysctl -p to apply the changes.
You can also change this value temporarily using the command sysctl -w vm.max_map_count=262144, but this will only be applied until the next system reboot.
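Putting this together on each droplet:

```bash
# Persist the setting across reboots
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

# Or apply it temporarily (reverts on the next reboot)
sudo sysctl -w vm.max_map_count=262144

# Verify the current value
sysctl vm.max_map_count
```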
Checking Cluster Status:
The cluster status can be viewed via the following URL:
https://elasticsearch-01.prod.rockengroup.com:9200/_cat/nodes
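The same endpoint can be queried from the command line with the cluster credentials:

```bash
# List all nodes; the entry marked "*" in the master column is the elected master.
curl -u "elastic:<password>" \
  "https://elasticsearch-01.prod.rockengroup.com:9200/_cat/nodes?v"
```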
Elasticsearch Cluster Setup
- Master Node Elections: Elasticsearch uses a master node election process to manage the cluster state. With 3 master-eligible nodes, a majority quorum (more than half, i.e. 2 of 3) can be reached for decisions, so the cluster keeps functioning even if one master node fails.
- If 2 out of 3 nodes go offline, the remaining node cannot form a quorum, so the cluster becomes unavailable until enough nodes return to re-establish a majority.
- Once a second node comes back online, a master can be elected again and Elasticsearch will begin restoring the cluster state (see the check below).
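A quick way to confirm that a master has been elected:

```bash
# Prints the ID, host, IP and name of the currently elected master node;
# an error response means no master could be elected.
curl -u "elastic:<password>" "http://localhost:9200/_cat/master?v"
```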
Elasticsearch Backup Strategy
Snapshot Repository Setup
- Repository Type: DigitalOcean Spaces (S3-compatible)
- Repository Name: backup-elasticsearch-prod
- Configured Location: DigitalOcean Spaces bucket backup-elasticsearch-prod, with the base path es-snapshots
- Access: credentials (access key and secret key) are in 1Password (DO Space: backup-elasticsearch-prod)
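For reference, a minimal sketch of how such a repository is registered. DigitalOcean Spaces is S3-compatible, so Elasticsearch's s3 repository type is used (older STACK_VERSIONs may require the repository-s3 plugin); the Spaces endpoint region (fra1) is an assumption for illustration:

```bash
# Store the Spaces keys in the Elasticsearch keystore inside the container
# (repeat on every node so the repository can be verified cluster-wide)
docker exec -it elasticsearch-docker-node-1 \
  bin/elasticsearch-keystore add s3.client.default.access_key
docker exec -it elasticsearch-docker-node-1 \
  bin/elasticsearch-keystore add s3.client.default.secret_key

# Pick up the new secure settings without a full restart
curl -u "elastic:<password>" -X POST "http://localhost:9200/_nodes/reload_secure_settings"

# Register the repository (endpoint region fra1 is assumed)
curl -u "elastic:<password>" -X PUT "http://localhost:9200/_snapshot/backup-elasticsearch-prod" \
  -H 'Content-Type: application/json' -d'
{
  "type": "s3",
  "settings": {
    "bucket": "backup-elasticsearch-prod",
    "base_path": "es-snapshots",
    "endpoint": "fra1.digitaloceanspaces.com"
  }
}'
```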
Automation for Backups
Scripts for backups are configured on the elasticsearch-01-prod node.
- Backup Script (/root/elasticsearch/es_backup.sh):

```bash
#!/bin/bash
DATE=$(date +%Y%m%d%H%M%S)
ES_DIR="/root/elasticsearch"
ES_HOST="http://localhost:9200"
REPO_NAME="backup-elasticsearch-prod"
SNAPSHOT_NAME="snapshot_$DATE"
ES_PASSWORD=$(grep "^ELASTIC_PASSWORD=" $ES_DIR/.env | cut -d '=' -f2)
LOG_NAME="$ES_DIR/es_snapshot.log"

# create snapshot
echo "$(date)" | tee -a $LOG_NAME
echo "Creating a snapshot with name $SNAPSHOT_NAME" | tee -a $LOG_NAME

# Perform snapshot request
time curl -u "elastic:$ES_PASSWORD" -X PUT "${ES_HOST}/_snapshot/${REPO_NAME}/${SNAPSHOT_NAME}?wait_for_completion=true&pretty" \
  -H 'Content-Type: application/json' -d'
{
  "indices": "*",
  "ignore_unavailable": true,
  "include_global_state": true
}
' | tee -a $LOG_NAME

sleep 5

# check if snapshot is available
if curl -u "elastic:$ES_PASSWORD" -X GET "${ES_HOST}/_cat/snapshots/${REPO_NAME}?v&s=id&pretty" | grep "$SNAPSHOT_NAME"
then
  echo "$(date) - Snapshot ${SNAPSHOT_NAME} created" | tee -a $LOG_NAME
else
  echo "$(date) - !!! Snapshot was not found. Exit !!!"
  exit 1
fi

echo "" >> $LOG_NAME
```
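The script can also be run by hand to take an ad-hoc snapshot:

```bash
/root/elasticsearch/es_backup.sh
tail -f /root/elasticsearch/es_snapshot.log   # follow progress in the log
```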
- Execution Schedule: the script runs automatically every night at 1:00 AM UTC via a cron job:

```
0 1 * * * /root/elasticsearch/es_backup.sh
```
- Verification: snapshots can be verified using the following command:

```bash
curl -u "user:password" -X GET "http://localhost:9200/_cat/snapshots/backup-elasticsearch-prod?v&s=id&pretty"
```

Logs from the backup script are stored in /root/elasticsearch/es_snapshot.log.
Retention Policy
To manage storage effectively and prevent the repository from filling up with excessive snapshots, a retention policy is implemented to keep only the last 7 backups. Older backups are automatically deleted.
- Cleanup Script (/root/elasticsearch/es_backup_cleanup.sh):

```bash
#!/bin/bash
ES_DIR="/root/elasticsearch"
ES_HOST="http://localhost:9200"
REPO_NAME="backup-elasticsearch-prod"
ES_PASSWORD=$(grep "^ELASTIC_PASSWORD=" $ES_DIR/.env | cut -d '=' -f2)
LOG_NAME="$ES_DIR/es_snapshot_cleanup.log"
KEEP_COUNT=7 # Number of snapshots to keep

echo "$(date)" | tee -a $LOG_NAME
echo "Retrieving snapshot list..." | tee -a $LOG_NAME

# Snapshot IDs, sorted oldest first
SNAPSHOT_LIST=$(curl -u "elastic:$ES_PASSWORD" -X GET "${ES_HOST}/_cat/snapshots/${REPO_NAME}?h=id&s=start_epoch&pretty" | awk '{print $1}')
TOTAL_SNAPSHOTS=$(echo "$SNAPSHOT_LIST" | tr ' ' '\n' | wc -l)

# Check if cleanup is needed
if [ "$TOTAL_SNAPSHOTS" -gt "$KEEP_COUNT" ]; then
  echo "Cleaning up old snapshots. Total: $TOTAL_SNAPSHOTS, Keeping: $KEEP_COUNT" | tee -a $LOG_NAME
  DELETE_COUNT=$((TOTAL_SNAPSHOTS - KEEP_COUNT))
  DELETE_SNAPSHOTS=$(echo "$SNAPSHOT_LIST" | tr ' ' '\n' | head -n "$DELETE_COUNT")
  for SNAPSHOT in $DELETE_SNAPSHOTS; do
    echo "Deleting snapshot: $SNAPSHOT" | tee -a $LOG_NAME
    # Snapshots are deleted via the _snapshot API (not _cat/snapshots)
    curl -u "elastic:$ES_PASSWORD" -X DELETE "${ES_HOST}/_snapshot/${REPO_NAME}/$SNAPSHOT?pretty"
  done
else
  echo "No cleanup needed. Total snapshots: $TOTAL_SNAPSHOTS" | tee -a $LOG_NAME
fi

echo "" >> $LOG_NAME
```
- Execution Schedule: the script runs automatically every night at 2:00 AM UTC via a cron job:

```
0 2 * * * /root/elasticsearch/es_backup_cleanup.sh
```
- Verification: snapshots can be checked with the following command; verify that only the last 7 snapshots are retained:

```bash
curl -u "user:password" -X GET "http://localhost:9200/_cat/snapshots/backup-elasticsearch-prod?v&s=id&pretty"
```

Logs from the cleanup script are stored in /root/elasticsearch/es_snapshot_cleanup.log.
